Vol 4, No 3 (2004)

Articles

The Connectivity Sonar: Detecting Site Functionality by Structural Patterns

Ronny Lempel, Einat Amitay, David Carmel, Adam Darlow, Aya Soffer

Web sites today serve many different functions: corporate sites, search engines, e-stores, and so forth. As sites are created for different purposes, their structure and connectivity characteristics vary. However, this research argues that sites with similar roles exhibit similar structural patterns, as the functionality of a site naturally induces a typical hyperlinked structure and typical connectivity patterns to and from the rest of the Web. Thus, the functionality of Web sites is reflected in a set of structural and connectivity-based features that form a typical signature. In this paper, we automatically categorize sites into eight distinct functional classes, and highlight several search-engine-related applications that could make immediate use of such technology. We purposely limit our categorization algorithms to connectivity and structural data alone, making no use of any content analysis whatsoever. When applied to a set of 202 sites drawn from the eight defined functional categories, two classification algorithms correctly classified between 54.5% and 59% of the sites. On some categories, the precision of the classification exceeded 85%. An additional result of this work indicates that the structural signature can be used to detect spam rings and mirror sites, by clustering sites with almost identical signatures.

Implementation Challenges Associated with Developing a Web-based E-notebook - Addendum on Related Work

Yolanda Jacobs Reimer, Sarah A. Douglas

The addendum provides a brief history of NetNotes development and discusses relevant research not included in the original paper, in response to comments from a JoDI editor that the paper may have missed some related work in the hypermedia systems field.

Implementation Challenges Associated with Developing a Web-based E-notebook

Yolanda Jacobs Reimer, Sarah A. Douglas

As people increasingly turn to the World Wide Web to help them manage their daily tasks, they engage in the process of information assimilation (IA). IA refers to the gathering, editing, annotating, organizing, and saving of Web information, as well as the tracking of ongoing Web work processes. Although evidence suggests that IA is a critical process for Web users, it is currently not well supported by existing browsers and other software applications. The lack of adequate software support for IA may be attributed to implementation difficulties associated with developing general Web-based applications. In addition, usability must be a major priority in the development of interactive systems to support IA. The NetNotes prototype, a Web-based e-notebook, represents a limited solution to the problem of developing software to support IA. NetNotes works in conjunction with a specific Web domain, deals with a limited number of Web components, and requires minor server-side modifications. Despite these limitations, the NetNotes implementation exposes some of the key technical problems associated with implementing Web-based software, successfully incorporates a number of critical IA requirements, and is robust enough to be used in future experimental evaluations.

Merging Metadata and Content-Based Retrieval

Dave Deniman, Tamara Sumner, Lynne Davis, Sonal Bhushan, Jackson Fox

Educational digital libraries employ resource discovery systems that aim to provide educators and learners with curriculum materials to support learning in both formal and informal settings. The article describes a "hybrid" educational resource discovery system, which combines metadata and content-based retrieval methods. This hybrid system was implemented and evaluated in the context of the Digital Library for Earth System Education (DLESE). A pilot study compared this hybrid system with an existing metadata-based system to determine whether the hybrid system helps educators locate relevant resources with less effort. The results of the study suggest that the hybrid system decreased the variability in the number of user actions required to locate learning resources. The hybrid system interface featured embedded links pointing to inner pages within a larger compound learning resource; study participants made use of these embedded links to locate individual learning objects.
