Center for the Study of Digital Libraries, Texas A&M University: Lab Reports: JoDI

Center for the Study of Digital Libraries, Texas A&M University



The Center for the Study of Digital Libraries (CSDL) was established in 1995 by The Texas A&M University System Board of Regents and builds upon research developed in the Hypermedia Research Laboratory established in 1987. A member of the global digital library research community, the Center provides a focal point for digital libraries research and technology for the State of Texas. Its mission is to foster pioneering research on the theory and application of digital libraries and to create flexible and efficient new technologies for their use.

The Center provides expertise and experience to help transfer collections of all types -- from books and journals to biological specimens and museum pieces -- into useful digital libraries. Center staff includes experts in key new technologies required for digital libraries: electronic document modeling and publication, hyperbase systems, process-based and spatial hypermedia systems, collaborative systems, and computer-human interaction.

The Center's program of research responds to the U.S. government's National Challenge program for research in information infrastructure technology. It takes a leadership role in the development and application of on-line, world-wide access to digital library services. Development of this technology yields valuable fundamental research and supports the broader goal of research and education through improved means for collaboration and distance learning. The Center is not limited to one discipline; rather, the development of digital libraries may be viewed as a fundamental contribution to research in all disciplines.

The Center organized the first two Conferences on the Theory and Practice of Digital Libraries. The proceedings of these conferences helped lay the foundation for this important new research field. The first ACM-sponsored international conference on digital libraries was held in the spring of 1996 and the second will be held in the summer of 1997. The ACM conferences continue this series and now provide the premier research venue for the field of digital libraries.

Research context

Information in the future will be produced, transmitted, and consumed in electronic form. The printed book will be replaced by new electronic forms and today's static, paper-based library with its fixed indexing schemes will give way to dynamic digital libraries with flexible and efficient mechanisms for locating, organizing, and personalizing vast amounts of multimedia information. We will no longer be bound by the physical affordances of shelves, floors, and buildings, nor the single conceptual library structure mapping all information by "call numbers" onto the physical library building. Instead, we will use collaborative hypermedia library systems allowing multiple conceptual mappings, personalization of library resources, and sharing of digital library information spaces. Collections of all manner and type will be digitized and made widely available through high capacity networking.

Increasingly, scholarly work involves a collaboration of geographically dispersed researchers, teachers, and students. Scholarly work in the digital library of the future will be mediated through coordinated access to shared information spaces. Patrons will organize their own private digital libraries, collaborate with colleagues through shared digital libraries, and have access to huge amounts of multimedia information in global, public digital libraries. A multitude of new media and new data types, together with common access to high-speed computer networks, will revolutionize our conceptions of books, libraries, scientific research, scholarship, learning, commerce, and ownership.

Within the past decade the number and types of digital information sources have proliferated. Computing system advances and the continuing networking and communications revolution have resulted in a remarkable expansion in the ability to generate, process, and disseminate digital information. Together, these developments have made new forms of knowledge repositories and information delivery mechanisms feasible. Before these sources can be combined into realistic, full-scale digital libraries, fundamental research must be performed in areas such as information representation, presentation, and retrieval; human-computer interaction; hypermedia and hyperbase systems; computer-supported collaborative work; distributed multimedia systems; and broadband networking. Answering the question of how best to take advantage of these promising technologies requires significant theoretical and empirical results from well-designed studies and experimental prototypes set in the context of solving real problems for patrons of experimental digital library testbeds.

Research

The research of the Center for the Study of Digital Libraries falls into two broad areas. Digital library projects are concerned with building digital libraries of significant collections that are used by practicing scholars and the public. Computing infrastructure projects are concerned with providing the technology (networking, databases, software, user interfaces) necessary to deliver the digital library projects in an efficient and timely manner. The following gives a brief description of selected research being conducted in the CSDL. Links are provided to project homepages for more detail.

Digital library projects

  • The George Bush Digital Library is being developed in collaboration with the George Bush Presidential Materials Project, a unit of the National Archives and Records Administration. The Bush Library materials comprise over 36 million pages of documents, 1.5 million photographs, 6,000 hours of audio/video, and 40,000 museum artifacts. The Center is working with the Bush Presidential Library archivists to make these materials widely available on the Internet. Center researchers will also develop computer-based tools for use by the archivists as well as by scholars studying the Bush presidency and administration.
  • The CSDL is working with Professor Eduardo Urbina of the Department of Modern and Classical Languages on two digital library projects. The Cervantes International Bibliography On-line and the Anuario Bibliografico Cervantino, compiled with the assistance of an international team of collaborators, attempt to solve the problems of currency, thoroughness, and accessibility that now hamper research on Cervantes by publishing a comprehensive record of all significant books, articles, dissertations, reviews, and other scholarly materials related to his works and life.
  • The TAMU Herbaria Project is being developed in conjunction with Professors Hugh Wilson and Jim Manhart of the Biology Department and Professor Steven Hatch of the Rangeland Ecology and Management Department. Over 250,000 dried plant specimens are housed in two herbaria on the Texas A&M Campus and provide a focus for study by the Bioinformatics Working Group, consisting of representatives of the CSDL and the herbaria. Current projects include the generation and networked dissemination of a unified database of herbarium specimen data, support for an extensive image gallery, examination of graphical map-based visualizations of plant distributions, and work, in conjunction with the BONAP project, towards computerization of a consistent national taxonomy.
  • The CSDL is a participating member in the Flora of Texas Consortium. The goal of this project is to create a digital library containing approximately 6,000 taxa of native and naturalized vascular plants of Texas accessible via the Internet. These materials will be widely used in support of floristics, plant community studies, regional biotic histories and synonymies, and distribution maps, and will provide access to illustrations and images of the flora of Texas. This project is being developed in conjunction with Professor Hugh Wilson of the Biology Department and Professor Steven Hatch of the Rangeland Ecology and Management Department.
  • A digital image library is being developed in collaboration with the USDA Food Safety and Inspection Service to deliver training information, course materials and research on the Internet. This library will contain more than 20,000 high-resolution color images and will be used to train meat inspectors.

Computing infrastructure projects

  • The Walden's Paths project involves developing tools that allow K-12 educators to organize World-Wide Web material for their students' use. The Internet provides students with a wealth of multimedia materials that must be tailored before they can be used in the classroom. Walden's Paths allows teachers to make use of these materials by creating directed paths that students can follow to obtain a cohesive view of the collected material, browse off the path freely, and then return to the path.
  • VIKI is a spatial hypertext system that supports the use of spatial and visual cues such as proximity, alignment, and graphical similarity to express relationships while interpreting information. This allows users to take advantage of their perceptual system, spatial and geographic memory, and more generally, spatial intelligence. VIKI users create hierarchies of two-dimensional spaces to collect, arrange, and author visual symbols. These symbols may represent information stored within the VIKI space or they may point to information external to VIKI, such as information available via the World-Wide Web. VIKI facilitates the use of common implicit structures through access to recognized stacks, lists, and composites of visual symbols. End users may use VIKI to organize and personalize the massive amounts of information available on the Internet.
  • The Trellis project investigates the structure and semantics of human-computer interaction, specifically in the context of hypertext/hypermedia systems. As with all hypertext systems, Trellis permits the identification of information objects (e.g., nodes) and the definition of their relationships (e.g., links). Beyond this, the Trellis information structure, called a "hyperprogram," also directs the way in which the reader uses the information (i.e., the reader's browsing semantics). In other words, a Trellis hyperprogram integrates task with information. The design work, which has been ongoing since 1988, is based on Petri nets. This representation provides a usable compromise between fully-programmable but non-analyzable representations, such as those used in general programming languages, and fully-analyzable but non-programmable representations, such as the one provided by pure HTML as used on the World-Wide Web. Results from the Trellis project are being used in the Walden's Paths project to improve the authoring of educational content on the Internet.
  • HOSS (Hypermedia Operating System Services) is an attempt to move hypermedia functionality into the core operating environment of the computer itself. By making the operating system of the computer hypermedia-aware, we hope to derive greater convenience for both user and programmer, as well as greater efficiency of operation. Three aspects of operating system level hypermedia-awareness that we are currently investigating are: using a hyperbase as a file system; modifying the swapper to take advantage of semantic relationships among data (this allows the pre-fetching of information via recognition of link structures); and distinguishing different types of processes (for example, navigational processes versus application processes) at the operating system level.
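
To illustrate the Petri-net idea behind the Trellis item above, the following is a minimal sketch and not the Trellis implementation. It assumes a simple mapping in which places stand for content nodes, transitions stand for links, and a token on a place means that content is currently displayed. The class names (Transition, HyperProgram) and the sample node names (intro, text, figure, summary) are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transition:
    """A link (hypothetical model): enabled when every input place holds a token."""
    name: str
    inputs: frozenset
    outputs: frozenset


@dataclass
class HyperProgram:
    """Toy place/transition net: places are content nodes, transitions are links."""
    transitions: list
    marking: dict = field(default_factory=dict)  # place name -> token count

    def visible(self):
        """Places holding at least one token: the content shown to the reader."""
        return sorted(p for p, n in self.marking.items() if n > 0)

    def enabled(self):
        """Names of the links the reader may follow in the current marking."""
        return [t.name for t in self.transitions
                if all(self.marking.get(p, 0) > 0 for p in t.inputs)]

    def fire(self, name):
        """Follow a link: take one token from each input, add one to each output."""
        if name not in self.enabled():
            raise ValueError(f"link {name!r} is not enabled")
        t = next(t for t in self.transitions if t.name == name)
        for p in t.inputs:
            self.marking[p] -= 1
        for p in t.outputs:
            self.marking[p] = self.marking.get(p, 0) + 1


# Assumed example document: an introduction that opens a text and a figure
# side by side, followed by a summary once both have been shown.
net = HyperProgram(
    transitions=[
        Transition("start", frozenset({"intro"}), frozenset({"text", "figure"})),
        Transition("finish", frozenset({"text", "figure"}), frozenset({"summary"})),
    ],
    marking={"intro": 1},
)

print(net.visible(), net.enabled())  # ['intro'] ['start']
net.fire("start")
print(net.visible(), net.enabled())  # ['figure', 'text'] ['finish']
```

In this sketch the marking, rather than a single "current page," determines what the reader sees, so one link can open several nodes at once and another can require several to be open before it becomes available. Because the structure is an ordinary finite net, the set of reachable markings can in principle be enumerated and checked, which is the kind of analyzability the Trellis description refers to.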

Facilities

The Center occupies 12 offices and two laboratories on the main campus of Texas A&M University in College Station, Texas, as well as an office at the Institute of Biosciences and Technology in Houston, Texas. The Center is currently renovating its space to include a conference room equipped with ATM networking for distance education applications.

Research facilities include a local area network of Sun SparcStation 20s and 10s, PowerMacs and Pentium PCs with multimedia capabilities; SparcStation 1000 Server; over 400 gigabytes of mass storage; Hewlett Packard K220 Server, HP Disk Array, HP 600fx Jukebox and subsystem; LiveWorks LiveBoard; scanners, digital cameras and virtual reality equipment.


Acknowledgements

The Center for the Study of Digital Libraries gratefully acknowledges the corporate support of the Hewlett-Packard Company; Knowledge Systems, Inc.; and Informix Software, Inc.

Contacts

Center for the Study of Digital Libraries,
Texas A&M University,
College Station,
Texas 77843-3112,
USA
Tel: 01-409-862-3217
Fax: 01-409-847-8578
Email: csdl@csdl.tamu.edu
Web: www.csdl.tamu.edu