In the previous discussion no attempt was made to define “about”; the expression “is about” was merely a synonym for “cover”. That is, “what a document is about” was used to mean the same as “what a document covers”. These expressions may not be very precise and the terms “about” and “cover” are not easily defined. Nevertheless, they are expressions that seem acceptable to most people and to be understood by them. It is not my intention to enter into a philosophical discussion on the meaning of “about” or “aboutness”. A number of authors have already done so. In so doing, they have failed to clarify the situation, at least as far as the task of subject indexing is concerned. Beghtol (1986) and Hutchins (1978) both draw upon text linguistics in discussing the subject, Maron (1977) adopts a probabilistic approach, and Swift et al. (1978) are careful to point out that aboutness in indexing may not coincide with the aboutness that searchers for information are concerned with. Wilson (1968) goes so far as to imply that subject indexing faces “intractable” problems because it is so difficult to decide what a document is about.
Moen et al. (1999) take the position that a text does have an intrinsic “aboutness” but that it also has different “meanings” in accordance with “the particular use that a person can make of the aboutness at a given time.”
Layne (2002) makes a distinction between “of-ness” and “aboutness” in the case of art images:
Less obvious than the of-ness of a work of art, but often more intriguing, is what the work of art is about… Sometimes the about-ness of a work of art is relatively clear as in Georg Pencz’s Allegory of Justice… This image is of a naked woman holding a sword and scales, but the title tells us that the image is an allegorical figure representing justice or, in other words, that the image is about the abstract concept “justice.” In Goya’s drawing Contemptuous of the Insults… the aboutness is slightly less obvious, but it is still clear that this work of art has some meaning beyond simply what it is of. Indeed a description of what it is of -- a man, perhaps Goya himself, gesturing toward two dwarfs wearing uniforms -- is not really sufficient to make sense of this image; it symbolizes something else, it is about something else; the relationship between Spain and France at the beginning of the nineteenth century or, more specifically, Goya’s personal attitude toward the French occupation of Spain. (Page 4)
She believes this distinction is a valuable one and that, in retrieval, it should be possible to separate the two:
…it makes it possible to retrieve, for example, just those images that are of “death” and to exclude those images that are about “death.” It also permits the subdivision of large sets of retrieved images based on these distinctions. For example, a search on “death” as a subject could result in a retrieval of images subdivided into groups based on whether the image explicitly depicts “death” or is about the theme of “death” (Page 13)
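The separation Layne describes can be pictured as two distinct index fields for each image. What follows is only a minimal sketch under stated assumptions, not a description of any existing system: the record structure, the field names `of_terms` and `about_terms`, and the two records labelled hypothetical are invented for illustration. Only the terms for the Pencz image are taken from the passage quoted above.

```python
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    """Hypothetical index record that keeps of-ness and aboutness apart."""
    title: str
    of_terms: set = field(default_factory=set)     # what the image depicts
    about_terms: set = field(default_factory=set)  # what the image signifies


def search(records, term):
    """Return matching titles, subdivided by how the term applies."""
    groups = {"of": [], "about": [], "both": []}
    for record in records:
        is_of = term in record.of_terms
        is_about = term in record.about_terms
        if is_of and is_about:
            groups["both"].append(record.title)
        elif is_of:
            groups["of"].append(record.title)
        elif is_about:
            groups["about"].append(record.title)
    return groups


records = [
    # Terms drawn from Layne's description of the Pencz image.
    ImageRecord("Allegory of Justice",
                of_terms={"woman", "sword", "scales"},
                about_terms={"justice"}),
    # The next two records are hypothetical.
    ImageRecord("Image A", of_terms={"death", "skeleton"}),
    ImageRecord("Image B", of_terms={"flowers"}, about_terms={"death"}),
]

death_groups = search(records, "death")
justice_groups = search(records, "justice")
```

With the fields kept apart, a searcher could ask for the `"of"` group alone (images that depict death) or take all three groups as a subdivided result set, which is the capability the quotation argues for.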
Bruza et al. (2000) deal with aboutness from a logical perspective. They “attempt to formalize logical relevance by formalizing commonsense properties describing the aboutness relation.” They also deal with “nonaboutness” and the interaction between aboutness and nonaboutness. In the information retrieval context, nonaboutness is actually a simpler situation because the great majority of items in any database clearly bear no possible relationship to any particular query or information need (i.e., they are clearly “nonabout” items).
The subject of aboutness is very much related to that of “relevance” -- i.e., the relationship between a document and an information need or between a document and a statement of information need (a query). The subject of relevance/pertinence has generated a great deal of debate and literature. A very complete overview can be found in Mizzaro (1998). Hjørland (2000) points out that relevance is dependent on the theoretical assumptions that guide the behavior of the person seeking information.
As Harter (1992) has pointed out, however, a document can be relevant to some information need without being “about” that information need. For example, if I am writing on the subject of barriers to communication, a history of Latin may have some relevance, especially if it deals with the present use of Latin in the Catholic Church and with those organizations that are now trying to promote its wider use. Nevertheless, although I might be able to draw upon this source in my article, few people would claim that it is “about” international communication and it is unlikely to be indexed in this way unless the author explicitly makes reference to the international communication aspect.
Wong et al. (2001) treat “aboutness” as more or less synonymous with “relevance”:
…if a given document D is about the request Q, then there is a high likelihood that D will be relevant with respect to the associated information need. Thus the information retrieval problem is reduced to deciding the aboutness relation between documents and requests. (Page 33)
They relate aboutness directly to recall and precision measures.
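For reference, recall is conventionally the proportion of relevant items that are retrieved, and precision the proportion of retrieved items that are relevant. The sketch below uses these standard definitions and does not reproduce Wong et al.’s own formalism; the function name and the document identifiers are illustrative assumptions. If aboutness is taken to stand in for relevance, the relevant set is simply the set of documents judged to be about the request.

```python
def recall_precision(retrieved, relevant):
    """Compute (recall, precision) for one request.

    retrieved: identifiers of the documents the system returned
    relevant:  identifiers of the documents judged relevant to
               (or, on the aboutness view, about) the request
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # documents both retrieved and relevant
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical request: four documents retrieved, three judged relevant,
# two of which appear among those retrieved.
recall, precision = recall_precision(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d1", "d2", "d5"],
)
```

Here recall is 2/3 (two of the three relevant documents were found) and precision is 1/2 (two of the four retrieved documents were relevant).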
Articles on aboutness continue to appear in the literature. Hjørland (2001) and Bruza et al. (2000) are examples. While these may have some academic interest (Hjørland goes to great lengths to try to distinguish such terms as “subject,” “topic,” “theme,” “domain,” “field,” and “content”), they have no practical value to the indexer, who would do well to ignore such semantic differences and simply give an item the labels that will make it usefully retrievable by members of a target community.
Put differently, do we really need to understand “aboutness” in order to index effectively? Is it not enough to be able to recognize that a document is of interest to a particular community because it contributes to our understanding of topics X, Y, and Z? The recognition that it does contribute in this way exemplifies the process we have called “conceptual analysis,” while the process of “translation” involves a decision on which of the available labels best represent X, Y, and Z. “Concept” is another word that some writers like to philosophize around (see, for example, Dahlberg (1979)). In this book I use it to refer to a topic discussed by an author or represented in some other way (e.g., in a photograph or other image). “Conceptual analysis,” then, means nothing more than identifying the topics discussed or otherwise represented in a document. Preschel (1972) has a very practical approach. She takes “concept” to mean “indexable matter” and defines “conceptual analysis” as “indexer perception of indexable matter.” Also practical is Tinker (1966):
By assigning a descriptor [i.e., an index term] to a document, the indexer asserts that the descriptor has a high degree of relevance to the content of the document; that is, he asserts that the meaning of the descriptor is strongly associated with a concept embodied in the document, and that it is appropriate for the subject area of the document. (Page 97)
Wooster (1964) is even more pragmatic. He refers to indexing as assigning terms “presumably related in some fashion to the intellectual content of the original document, to help you find it when you want to.”
I find nothing wrong with these pragmatic definitions or descriptions of subject indexing. Purists will no doubt quibble with them on the grounds that such expressions as “indexable matter,” “relevance,” “meaning,” “associated with,” “concept,” “appropriate for,” “related to,” and “intellectual content” are not precisely defined to everyone’s satisfaction. However, if one must reach agreement on the precise definition of terms before pursuing any task, one is unlikely to accomplish much -- in indexing or any other activity.
Weinberg (1988) hypothesizes that indexing fails the researcher because it deals only in a general way with what a document is “about” and does not focus on what it provides that is “new” concerning the topic. She maintains that this distinction is reflected in the difference between “aboutness” and “aspect,” between “topic” and “comment,” or between “theme” and “rheme.” She fails to convince that these distinctions are really useful in the context of indexing or that it might be possible for indexers to maintain such distinctions.
Swift et al. (1978) discuss the limitations of an aboutness approach to indexing in the social sciences. They recommend indexing documents according to the “problems” to which they seem to relate. It is difficult to see how the distinction they make differs from the distinction, made earlier in this chapter, between what an item deals with and why a particular user or group of users might be interested in it. Crowe (1986) maintains that the indexer should address the “subjective viewpoint” of the author. One of her examples deals with the topic of depression, which can be discussed in books or articles from several different viewpoints (e.g., treatment through psychotherapy, through drug therapy, and so on). Again, it is difficult to see how this differs from normal indexing practice as exemplified by the National Library of Medicine’s use of subheadings.
Breton (1981) claims that engineers make little use of databases because indexers label items with the names of materials or devices while engineers are more likely to want to search for their attributes or the functions they perform. In other words, they would like to locate a material or device that satisfies some current requirement (for strength, conductivity, corrosion resistance, or whatever) without being able to name it. This is not a condemnation of subject indexing per se but of the indexing policies adopted by the majority of database producers. If a new material or alloy described in a