© Copyright 2006 Texas Digital Library. All Rights Reserved.
The paper describes a project to add value to controlled vocabularies by making inter-vocabulary associations. A methodology for mapping terms from one vocabulary to another is presented in the form of a case study applying the approach to the Educational Resources Information Center (ERIC) Thesaurus and the Library of Congress Subject Headings (LCSH). Our approach to mapping involves encoding vocabularies according to Machine-Readable Cataloging (MARC) standards, machine matching of vocabulary terms, and categorizing candidate mappings by likelihood of valid mapping. Mapping data is then stored as machine links. Vocabularies with associations to other schemes will be a key component of Web-based terminology services. The paper briefly describes how the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is used to provide access to a vocabulary with mappings.
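The matching and categorizing steps described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the normalization rule, the category names ("exact", "normalized", "unmatched"), the function names, and the sample terms are all assumptions made for the example, and the MARC encoding step is omitted.

```python
import re

def normalize(term: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace (an assumed rule)."""
    cleaned = re.sub(r"[^\w\s]", " ", term.lower())
    return " ".join(cleaned.split())

def candidate_mappings(source_terms, target_terms):
    """Pair each source term with a target term and a likelihood category."""
    exact = set(target_terms)
    by_normal = {}
    for target in target_terms:
        # Keep the first target term seen for each normalized form.
        by_normal.setdefault(normalize(target), target)
    results = []
    for term in source_terms:
        if term in exact:
            # Identical strings: the most likely valid mapping.
            results.append((term, term, "exact"))
        elif normalize(term) in by_normal:
            # Match only after normalization: a candidate for human review.
            results.append((term, by_normal[normalize(term)], "normalized"))
        else:
            results.append((term, None, "unmatched"))
    return results

# Illustrative terms only, not taken from ERIC or LCSH.
source = ["Adult education", "Reading (Primary)", "Robotics clubs"]
target = ["Adult education", "reading, primary"]
mappings = candidate_mappings(source, target)
for source_term, target_term, category in mappings:
    print(f"{source_term!r} -> {target_term!r} [{category}]")
```

Sorting candidates into categories like these lets reviewers accept the high-likelihood matches in bulk and spend their time on the uncertain ones.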
View full text: HTML
Today's semantic Web deals with meaning in a very restricted sense and offers static solutions. This is adequate for many scientific and technical purposes and for business transactions requiring machine-to-machine communication, but it does not answer the needs of culture. Science, technology and business are concerned primarily with the latest findings and the state of the art, i.e. the paradigm or dominant world-view of the day. In this context, history is considered non-essential because it deals with things that are out of date. By contrast, culture faces a much larger challenge, namely, to re-present changes in ways of knowing: changing meanings in different places at a given time (synchronically) and over time (diachronically). Culture is about both objects and the commentaries on them; about a cumulative body of knowledge; about collective memory and heritage. Here, history plays a central role, and older does not mean less important or less relevant. Hence, a Leonardo painting that is 400 years old, or a Greek statue that is 2500 years old, typically has richer commentaries and is often more valuable than its contemporary equivalents. In this context, the science of meaning (semantics) is necessarily much more complex than semantic primitives. A semantic Web in the cultural domain must enable us to trace how meaning and knowledge organisation have evolved historically in different cultures. The paper examines five issues to address this challenge: 1. different world-views (i.e. a shift from substance to function and from ontology to multiple ontologies); 2. developments in definitions and meaning; 3. distinctions between words and concepts; 4. new classes of relations; 5. dynamic models of knowledge organisation. These issues reveal that historical dimensions of cultural diversity in knowledge organisation are also central to classification of biological diversity.
New ways are proposed of visualizing knowledge using a time/space horizon to distinguish between universals and particulars. It is suggested that new visualization methods make possible a history of questions as well as of answers, thus enabling dynamic access to cultural and historical dimensions of knowledge. Unlike earlier media, which were limited to recording factual dimensions of collective memory, digital media enable us to explore theories, ways of perceiving, ways of knowing; to enter into other mindsets and world-views and thus to attain novel insights and new levels of tolerance. Some practical consequences are outlined.
View full text: PDF
Existing classification schemes and thesauri lack well-defined semantics and structural consistency. Empowering end users to search ever-larger collections with performance far exceeding that of plain free-text searching (as used in many Web search engines), and developing systems that not only find information but also process it for action, requires far more powerful and complex knowledge organization systems (KOSs). The paper presents a conceptual structure and transition procedure to support the shift from a traditional KOS towards a fully-fledged and semantically rich KOS. The proposed structure also complies with other interoperability approaches, such as RDFS and XML, in the Web environment. AGROVOC, a traditional thesaurus developed and maintained by the Food and Agriculture Organization (FAO) of the United Nations, serves as a case study for exploring the reengineering of a traditional thesaurus into a fully-fledged ontology. We start the process of developing an inventory of specific relationship types with well-defined semantics for the agricultural domain and explore the rules-as-you-go approach to streamlining the reengineering process.
View full text: HTML
Applying conventional principles of knowledge organization and representation, together with other semantic tools, we have constructed a model for scientific concepts and employed knowledge bases and visualization tools to represent knowledge concerning scientific concepts. Strongly structured models, such as the integration of a taxonomy (or thesaurus) with metadata (or attribute-value pairs) and domain-specific markup languages, as well as specialized models for learning scientific concepts, focus on such attributes as objective representations, operational semantics, use, and interrelationships of concepts. All of these play important roles in constructing representations of knowledge in most domains of science. Instructional activities for undergraduate teaching and learning are greatly facilitated by the use of such integrated semantic tools.
View full text: HTML
The lack of standardised access and interchange formats for knowledge organisation systems (KOS) is a barrier to their interoperability and wider use in automated Web and retrieval applications. Programmatic access to thesaurus (and other types of KOS) resources requires a commonly agreed distributed service protocol, building on lower-level standards, such as Web services. This paper reflects on our experiences in building a Web demonstrator of some novel thesaurus browsing and search tools, developed as part of a research project on the role of the thesaurus in controlled vocabulary retrieval applications. The Web system provides dynamically generated interface components for finding terms and browsing the thesaurus, building a query and returning ranked results using term expansion from a collections database. We designed a custom application programming interface of lower-level thesaurus functions to support the various user interface requirements of the application demonstrator. Based on our experience with developing the system, we review the literature on protocols for distributed access to thesauri and offer suggestions for further development of thesaurus service protocols. The FACET project and its semantic expansion and ranked-result, multi-concept matching capabilities are briefly outlined. We provide a detailed description of key elements of the Web demonstrator and their rationale, together with a discussion of the data elements required by the different interface components. Existing proposals (Ceres, Zthes and ADL) for thesaurus service protocols are reviewed. The paper concludes by reflecting on lessons from constructing the Web demonstrator and implications for separating the service protocol from the interface. We argue that basing distributed protocol services on the atomic elements of thesaurus data structures and standard relationships is not necessarily the best approach.
Client interfaces with components similar to those of the Web demonstrator require a service-oriented approach, with base services that group primitive KOS data elements (via their relationships) into composites. This leads to a proposal for a novel, unified semantic expansion service, which can be used both for specifying composite display formats and for query expansion services. Thesaurus (KOS) representations and service protocols for retrieval are closely linked. A service protocol should be explicitly expressed in terms of a well-defined but extensible set of KOS data elements and relationships.
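To make the idea of a composite-returning expansion service concrete, here is a small sketch. It is not the FACET algorithm or any of the reviewed protocols: the miniature thesaurus, the `expand` function, its `max_steps` parameter, and the shape of the returned record are all hypothetical, and distance is simply counted in relationship links with every link type treated alike.

```python
from collections import deque

# Hypothetical miniature thesaurus: term -> {relationship type: [terms]}.
# BT = broader term, NT = narrower term, RT = related term.
THESAURUS = {
    "chairs": {"BT": ["seating"], "NT": ["armchairs"], "RT": ["tables"]},
    "seating": {"BT": [], "NT": ["chairs"], "RT": []},
    "armchairs": {"BT": ["chairs"], "NT": [], "RT": []},
    "tables": {"BT": [], "NT": [], "RT": ["chairs"]},
}

def expand(term, max_steps=1):
    """Return a composite: the term plus every term within max_steps links."""
    distance = {term: 0}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        if distance[current] == max_steps:
            continue  # Do not follow links beyond the requested radius.
        for related_terms in THESAURUS.get(current, {}).values():
            for neighbour in related_terms:
                if neighbour not in distance:
                    distance[neighbour] = distance[current] + 1
                    queue.append(neighbour)
    # Nearest terms first, alphabetical within the same distance.
    ranked = sorted(distance.items(), key=lambda item: (item[1], item[0]))
    return {"term": term, "expansion": ranked}

result = expand("chairs")
print(result)
```

A single call of this kind could feed both uses mentioned above: an interface could render the composite as a display of a term's neighbourhood, and a retrieval component could use the same ranked list as an expanded query.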
View full text: HTML