Implementation of a Web-based E-notebook: Reimer and Douglas: JoDI

Implementation Challenges Associated with Developing a Web-based E-notebook

Yolanda Jacobs Reimer and Sarah A. Douglas*
Department of Computer Science,
University of Montana, Missoula, MT 59812
Email: reimer@cs.umt.edu; Web: www.cs.umt.edu/u/reimer/
*Department of Computer and Information Science,
University of Oregon, Eugene, OR 97403
Email: douglas@cs.uoregon.edu; Web: http://www.cs.uoregon.edu/~douglas/

Abstract

As people increasingly turn to the World Wide Web to help them manage their daily tasks, they engage in the process of information assimilation (IA). IA refers to the gathering, editing, annotating, organizing, and saving of Web information, as well as the tracking of ongoing Web work processes. Although evidence suggests that IA is a critical process for Web users, it is currently not well supported by existing browsers and other software applications. The lack of adequate software support for IA may be attributed to implementation difficulties associated with developing general Web-based applications. In addition, usability must be a major priority in the development of interactive systems to support IA. The NetNotes prototype, a Web-based e-notebook, represents a limited solution to the problem of developing software to support IA. NetNotes works in conjunction with a specific Web domain, deals with a limited number of Web components, and requires minor server-side modifications. Despite these limitations, however, the NetNotes implementation exposes some of the key technical problems associated with implementing Web-based software, it successfully incorporates a number of critical IA requirements, and it is robust enough to be used in future experimental evaluations.

1 Introduction

When people use the Web, they often engage in a process referred to as information assimilation (IA). IA is defined as the gathering, editing, annotating, organizing, and saving of Web information, as well as tracking ongoing Web work processes. Usability must be a major priority in the development of interactive systems to support IA. Evidence that suggests IA is a critical process for many Web users--and scientists in particular--comes from a number of background studies (including an ethnography and a review of the process of traditional notetaking) which are described by Reimer (2001) and by Reimer and Douglas (2003). This paper begins by looking at the process of IA in more detail, and evaluates how well two of the most commonly used Web browsers (Netscape and Internet Explorer) support critical IA tasks. For the most part, these Web browsers fail to support the process of IA in an integrated and useful fashion. This result necessitates the development of other more useful software applications which are focused on end-users and their critical tasks, despite the technical challenges that are involved. The second half of the paper describes the implementation of NetNotes, a Web-based e-notebook designed specifically to support critical IA tasks. Some of the technical difficulties from an early version of the NetNotes prototype have been overcome. While NetNotes still represents a limited solution to the problem, it is robust enough to be used in future experimental evaluations.

2 Support for IA

2.1 Web browsers

To determine the current state of software support for IA, we performed a heuristic evaluation (Nielsen and Molich 1990) of recent versions of the two most popular Web browsers--Netscape Navigator 4.7 and Microsoft's Internet Explorer (IE) 5. In particular, we reviewed how well these browsers allow users to perform the following tasks:

  1. Gather Web information (i.e. text, images, lists, tables and hyperlinks) by copying and pasting from multiple Web pages into an e-notebook; collect archival data pertaining to when and where original Web information was published
  2. Edit original Web elements as stored in an e-notebook
  3. Annotate e-notebook contents (i.e. add/delete text, highlight information, create cross-references)
  4. Organize e-notebook contents (i.e. control the spatial layout, re-structure, combine similar information together, etc.)
  5. Save the contents of an e-notebook
  6. Track (represent) and save ongoing work processes.

Some of these tasks are grouped together to facilitate the review.

2.1.1 IA Tasks 1 and 5: Gather and Save Web Information

Even the most basic of IA requirements--gathering and saving formatted text, images, lists, tables and hyperlinks from the Web--is currently difficult to achieve using standard browsing applications. While the copy and paste commands can be used to copy selected information from a Web page into another application, such as Microsoft Word or Windows Notepad, many of the formatted objects are typically lost in the transfer. Even those applications that correctly handle the copying and pasting of some formatted objects, like text and images, fail to do so consistently.

Users might also opt to save an entire Web page in HTML format, but this does not allow for the selection of certain portions of the page only, and the information is then only accessible for future use in applications that can interpret HTML code. Furthermore, images are often lost altogether when an entire Web page is saved as HTML, which may prompt users to save images separately. However, in this case the user must re-integrate the images with their related text, which requires significant effort and the involvement of additional applications. Users can also print out Web pages as a way of saving information. This option, too, is problematic because it means that users can't readily combine the information with other electronic notes or annotations, it generates too much paper that must be further organized and stored, and it assumes that users have access to a printer. Lastly, users might choose to use a Web authoring tool, like Netscape Composer, to create a new Web page from pieces of existing pages. However, these tools are designed for the generation and publication of new information, not the long-term storage of existing data.

In addition to saving information from the Web, users may also need to prove that the information they are gathering existed at a particular URL, date and time. Currently, the only way users can keep verifiable records as to the state of the Web is to print out and retain hard copies of entire pages. Once again, though, printing hard copies of entire Web pages requires additional effort to organize, store and retrieve, especially when only specific portions of the pages are actually needed. Web browsing software should provide better support for users who need to maintain records validating the state of the Web.

2.1.2 IA Tasks 2, 3 and 4: Edit, Annotate and Organize Notes

Since standard Web browsers lack any sort of accompanying notebook, the ability of users to perform the key IA activities of editing, annotating and organizing their Web notes depends entirely on the functionality of other software applications. In the most recent GVU WWW survey (Kehoe and Pitkow 1998), 27.6 percent of cases reported not being able to organize their gathered information efficiently as among the biggest problems in using the Web. Persistent users might be able to use word processing software to edit certain types of Web information--like plain text--and then document management systems to organize and retrieve it, but this assumes that users have the necessary access to a variety of desktop applications and the technical knowledge to use these applications in an integrated fashion.

2.1.3 IA Task 6: Track Ongoing Work Processes

Standard Web browsing software is also significantly impaired by its inability to support protracted and/or fragmented work processes. While Web users may complete work in one continuous, uninterrupted session, it is equally likely that their work will span a longer period of time and multiple browsing sessions. In this latter case, it is essential that users have the ability to recall and rejoin the work of a previous Web session quickly and easily.

The study by Abrams et al. (1998) of bookmarking behavior reveals that many Web users use bookmarks ("Favorites" in IE) to represent their inter-session history because no other suitable functionality exists. That these users will adapt a tool designed for something very different to compensate for missing functionality suggests the obvious need to support ongoing work activity. Further evidence that tracking previous work processes is a critical function for many Web users can be found in another notable study: Tauscher and Greenberg (1997) analyzed six weeks of Web usage data from 23 users and discovered a high number of page revisits per individual (58% recurrence rate). Perhaps one of the reasons users return to previously visited Web pages so frequently is because those pages are part of a longer term work process; this, in turn, suggests that these same users would benefit from tools that help them track, remember and rejoin their ongoing work processes.

It is not surprising, then, that the limitations Web users find with bookmarks also render them inadequate for representing long-term work processes. For example, users from the study by Abrams et al. (1998, p. 47) say that "bookmarks aren't descriptive enough" and that they "aren't great describers of the actual content". If current bookmark functionality is considered insufficient for identifying and describing single Web pages, then it is surely unsatisfactory as a tool for representing more complex, ongoing work processes. Furthermore, one can imagine that a crucial component of depicting longer-term activities is being able to organize and arrange representations in a spatially/visually meaningful way. Again, trying to use bookmarks in this capacity is problematic as users complain that long lists of bookmarks are hard to maintain, visualize, browse and categorize (Abrams et al. 1998). In fact, organizing bookmarks is one of the top three Web problems with bookmarks as reported by 4770 respondents in the 10th GVU WWW survey (Kehoe and Pitkow 1998).

Using the bookmarking tool to represent protracted work processes has other limitations as well. Users are unable to identify specific portions of Web pages that are of particular interest since bookmarks flag entire Web pages only; bookmarks can only mark dynamic Web page content and cannot be used to keep track of the information on a particular Web page at a particular moment; and users may wish to have more flexible ways of identifying the various parts of their work process other than simply by page titles (e.g. an icon). The findings of both the Abrams et al. (1998) study and the 10th GVU WWW survey, along with the other limitations pointed out here, make it evident that bookmarking is problematic and seriously inadequate as a tool for representing long-term work activities.

The detailed history list found in most Web browsers provides an alternative to bookmarks that users might consider to help them recall and rejoin a previous work session. For example, a user could copy items from the detailed history list, paste them into a text file, and then save and reuse them to piece together previous work at a later time. In Netscape, the detailed history list displays the title, location (i.e. URL), first and last visited date/time, expiration date, and visit counts for a Web page. However, this option is also unsuitable for a number of reasons: the items can only be copied-and-pasted one at a time (in Netscape at least), making it both tedious and time-consuming for a user to copy the browsing history for an entire session; duplicate items are displayed in the list; there is no graphic representation depicting how the user browsed the listed Web pages or how the pages relate to one another; a user would have to retype the URLs to load the pages in the subsequent browsing session; and the title and URL location may not be sufficient information for the user to recall a previous work process.

2.2 Related Work

It should be noted that a number of other Web-based e-notebook prototypes have been developed. While some of these systems contain individual functions that arguably support IA, none of them can be said to support the process in a complete and integrated fashion. This is not surprising, since these applications were developed to satisfy a different set of requirements rather than to support IA. The three most notable Web-based e-notebook prototypes are Nabbit (Manber 1997), the Internet Scrapbook (Sugiura and Koseki 1998), and WebBook (Card et al. 1996). Other systems that create graphical representations of a user’s Web browsing history, and thereby could be used to track an ongoing Web work process (which is a key IA task), have also been researched. These systems include MosaicG (Ayers and Stasko 1995), Pad++ (Bederson et al. 1998), PadPrints (Hightower et al. 1998), Webmap (Domel 1994), and WebNet (Cockburn and Jones 1996). Unfortunately, these systems are not particularly well suited for the personalized tracking of long-term work processes because in some instances the browsing history cannot be saved between sessions, and in other instances users cannot directly manipulate the automatically generated graphical views (i.e. delete nodes, select which pages to include and which to exclude, include annotations, etc.).

3 NetNotes: A Web-based E-notebook

3.1 Overview

The NetNotes prototype was developed using user-centered design (UCD) methodologies (Gould 1988, Whiteside et al. 1988) and in direct response to the lack of adequate software support for IA. There are, however, significant technical difficulties involved with implementing a general Web-based notebook (i.e. one that is usable in conjunction with any Web site for all Web users, and that supports all key IA tasks). These difficulties include dealing with application security, the wide variety of Web objects that users may wish to save in a notebook (e.g. text, images, animations, programs, forms, etc.), and a diverse and distributed user population. Consequently, the NetNotes program represents a limited solution to the problems highlighted thus far. In particular, NetNotes works in connection with a specific Web domain, the Zebrafish Information Network (ZFIN), which is a multimedia repository and relational database of genetics information related to the zebrafish species. It was chosen as the initial Web domain for NetNotes because

  1. it is a technically complex, real system;
  2. we had access to a test environment where we could make necessary modifications to ZFIN's implementation;
  3. previous research has indicated that ZFIN's user group might have strong demands for personalized Web-based e-notebooks;
  4. the ZFIN biologists at the University of Oregon were an immediately accessible user group.

In addition, NetNotes:

  • provides for a subset of the highest priority IA requirements
  • deals with a limited number of static, dynamic and linked Web components (no programmed elements)
  • is implemented on the client-side
  • requires minor server-side modifications
  • has been tested by a group of biologists resident at the University of Oregon.

Despite these limitations, NetNotes successfully exposes some of the technical problems associated with implementing Web-based software; it represents a marked improvement over an earlier prototyping effort (Reimer and Douglas 2001), and thus another loop in the UCD iterative design process; it incorporates a number of key IA requirements; and it is robust enough to be used in future experimental evaluations.

3.2 Functional Requirements

Table 1 shows the high-priority IA functional requirements that were successfully implemented in NetNotes.


Table 1. IA Functional Requirements Implemented in NetNotes
Functional Requirements (Users should be able to ...) | NetNotes
Gather:
 1.  copy and paste text (both plain and formatted) statically from multiple, disparate Web pages into an e-notebook while retaining formatting | +
 2.  copy and paste images statically from Web pages into an e-notebook page while retaining formatting | + [a]
 3.  copy and paste lists and tables statically from Web pages into an e-notebook while retaining formatting | +
 4.  copy and paste hyperlinks from the Web into an e-notebook while retaining formatting and functionality (i.e. hyperlinks should remain "active" in e-notebook) | +
 5.  archive Web information by having the URL, date and time of the original source information automatically included in their e-notes. Users should not be able to modify the source or the authentication stamp for such archived information. | +
Edit:
 6.  delete any content from their e-notebooks, including original Web elements | +
 7.  modify (change text, format text, etc.) any content in their e-notebooks (except images), including original Web elements. | +
Annotate:
 8.  add text to or delete text from their e-notebooks | +
 9.  emphasize or differentiate text in their e-notebooks by choosing between different font styles (e.g. bold, italic, underline) and sizes | +
 10.  create automatic cross-references (i.e. links) from their e-notes to any Web page. | +
Organize:
 11.  have multiple pages in their e-notebooks and should be able to copy Web information into any page they choose | +
 12.  create separations between groups of notes | +
 13.  name, insert and delete e-notebook pages. | + [b]
Save:
 14.  save text (plain and formatted) in their e-notebook while retaining formatting | +
 15.  save images in their e-notebook | + [a]
 16.  save lists and tables in their e-notebook while retaining formatting | +
 17.  save hyperlinks in their e-notebook while retaining formatting and functionality | +
 18.  save archived Web information in their e-notebook | +
Track ongoing work:
 19.  track an ongoing Web work process in their e-notebooks so that they can easily remember the work they were doing at a later time | +
 20.  track their current progress in an ongoing Web work process (i.e. users should be able to see how much of their initial work goals they have completed, and they should be able to gauge how much work is outstanding) | +
 21.  annotate an ongoing work process | +
 22.  edit an ongoing work process (e.g. delete some portion of it, insert text into it, etc.) | +
 23.  restart and rejoin an ongoing Web work process from within their e-notebooks with minimal repeated work (i.e. users should not have to relocate Web pages of importance). | +
[a] Images can only be saved dynamically in NetNotes
[b] Notebook pages can be deleted, but only by system file management applications

3.3 Early Prototyping Efforts

Prior to the design and implementation of NetNotes, some early system prototyping and usability testing was conducted on a similar system called CAJIN (Computer Assisted Journal and Integrated Notebook). Details of CAJIN can be found in Reimer and Douglas (2001). It is important to note that during the implementation of CAJIN, a number of technical limitations were uncovered that have subsequently been fixed in the NetNotes implementation. In particular,

  • The CAJIN prototype allowed only one person to copy Web elements from ZFIN and paste them into the e-notebook at a time. The NetNotes system architecture has been modified accordingly so that multiple users can copy and paste without interfering with one another.
  • When information was copied and pasted from ZFIN into CAJIN, some elements failed to transfer correctly, and at times the original formatting and alignment were not preserved accurately. This functionality has been made more robust in NetNotes.
  • The NetNotes prototype includes more critical IA requirements than CAJIN did, including the ability of users to save images (dynamically), to archive Web pages, and to track their ongoing work processes.

3.4 System Architecture

A guiding principle used in determining NetNotes' system architecture--in particular how it would interact seamlessly with ZFIN--was to push as many system and programming components onto the client machine as possible. This heuristic was used to ensure minimal changes to both the server and the ZFIN Web site, and thereby to increase the generality of the prototype solution.

It was also determined that the NetNotes application should be kept separate from the Web browser that provides access to the ZFIN site. This distinction increases NetNotes' flexibility by allowing it to work in conjunction with any Web browser, and it also allows users the option to use their notebooks even when they do not wish to interact with the Web. It should be noted that for this implementation, NetNotes works only with the Netscape Web browser simply because, at the time of development, ZFIN was only accessible via Netscape. However, there is no inherent reason why NetNotes could not also be made to work with other Web browsers such as IE. The decision to separate NetNotes from the Web browser also meant that a way had to be devised for the two applications to communicate. This inter-application communication, which was essential for the copy/pasting of ZFIN items into NetNotes and for the tracking of ongoing work processes, posed perhaps the most difficult and interesting technical challenge in the development of NetNotes, and thus is the focus throughout the remainder of this paper.

A final consideration central to NetNotes' system architecture design was that the NetNotes application had to support interaction with ZFIN for multiple users simultaneously. This is one example where the earlier CAJIN prototype failed; the CAJIN/ZFIN inter-process communication relied on critical information (i.e. the URL of the current ZFIN page) being stored in a server-side file, so only one user could copy and paste from ZFIN into CAJIN at a time, which proved to be a serious limitation of the system.

The final system architecture for NetNotes based on all these considerations is shown in Figure 1. Both the ZFIN database and its front-end Web system are implemented on the server, while the Netscape browser that loads the ZFIN Web site resides on the client machine along with the NetNotes application. A client-side cookie file and the system clipboard are also displayed in Figure 1 because they are critical to the NetNotes/ZFIN inter-process communication detailed in the next sections.

Figure 1. NetNotes and ZFIN System Architecture

3.5 Server-Side Modifications

While the NetNotes implementation involved a number of key server-side modifications to the ZFIN Web site, no changes to the underlying relational database were necessary. All server-side modifications were required for the NetNotes/ZFIN inter-process communication (i.e. when a ZFIN selection is copied and pasted into a NetNotes page, and when a NetNotes user tracks his or her ongoing work processes). The following list identifies and describes the implementation details behind each of the necessary server-side modifications.

  1. Javascript functions were added to ZFIN Web pages so that the current Web page URL is sent to the client as a cookie. Every time select ZFIN Web pages are loaded in Netscape, a set of Javascript functions executes and the current URL is sent to the client machine as a cookie. While we opted to implement this functionality using Javascript, other programming options, such as CGI/Perl scripts, are also feasible. This modification also allows NetNotes to handle a common problem with framed sites such as ZFIN: the URL in the browser location field stays the same regardless of which page is being viewed, instead of reflecting the actual location of the underlying HTML source code. It was only necessary to add the cookie-related Javascript functions to one common security file that is executed by many ZFIN pages.
  2. Special HTML breakpoint tags were added to ZFIN pages to delineate copy/paste selections. To add consistency and accuracy to selections that a user copies from a ZFIN page and pastes into a NetNotes page, special breakpoint tags were added to select ZFIN source code (i.e. HTML) in the form of <A NAME="NetNotes breakpoint">. These breakpoint tags are used in the NetNotes copy/paste algorithm to determine exactly what Web page content should be copied into a NetNotes page. Based on the information displayed in the particular Web pages where these tags were added, we decided where to situate the breakpoints and how frequently they appear. In general, the more breakpoints there are, the more accurate the copy/paste function will be. The outermost breakpoints should bound the entire HTML page, but comments and Javascript code should be excluded as they will likely be undesirable when they appear in the NetNotes page. It should be noted that there are other more robust ways of handling this issue. For example, the Document Object Model (DOM) could be used to parse the entire HTML document (i.e. ZFIN page) into a tree structure, and the HTML tags related to a selection of text could then be determined. For a long term technical solution, the DOM technique is preferable. However, in this case, our resolution is simpler to implement, and it does not detract from the functionality of the NetNotes prototype for purposes of user testing.
  3. BASE tags were added to HTML source code to resolve relative references in copied ZFIN selections. ZFIN pages are dynamically generated and have no HTML BASE tag in the source code, so it was necessary to add a BASE tag to the HTML to handle relative references for hyperlinks and images. The BASE tag that we added took the special form of <A NAME="BASE http://edison.cs.uoregon.edu"> instead of the normal <BASE href="http://edison.cs.uoregon.edu"> tag simply because of a bug in the JDK 1.3 which seems to ignore the normal form of the tag. This server-side addition is only necessary when there are relative references in the HTML source code.
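To make server-side modification 3 concrete, the following is a minimal Java sketch of how a client might read the special BASE anchor and use it to resolve relative references in a copied fragment. It is not the NetNotes source: the class name RelativeReferenceResolver and its method names are hypothetical, and the sketch assumes that relative references appear exactly as <A HREF="/ and <IMG SRC="/ with the attribute immediately after the tag name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; class and method names are hypothetical, not NetNotes code. */
final class RelativeReferenceResolver {

    // Special anchor form from server-side modification 3: <A NAME="BASE http://host">
    private static final Pattern BASE_ANCHOR = Pattern.compile(
            "<A\\s+NAME=\"BASE\\s+([^\"\\s]+)\"\\s*>", Pattern.CASE_INSENSITIVE);

    // HREF/SRC values that start with a single "/" (root-relative references).
    private static final Pattern ROOT_RELATIVE = Pattern.compile(
            "(<(?:A\\s+HREF|IMG\\s+SRC)=\")/(?!/)", Pattern.CASE_INSENSITIVE);

    /** Returns the URL carried by the special BASE anchor, or null if the page has none. */
    static String findBaseUrl(String pageSource) {
        Matcher m = BASE_ANCHOR.matcher(pageSource);
        return m.find() ? m.group(1) : null;
    }

    /** Rewrites root-relative references in the copied fragment as absolute URLs. */
    static String resolve(String fragment, String baseUrl) {
        if (baseUrl == null) {
            return fragment; // nothing to resolve against; leave the fragment untouched
        }
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1) : baseUrl;
        // "$1" re-emits the matched tag prefix; the base URL itself is quoted literally.
        String replacement = "$1" + Matcher.quoteReplacement(base + "/");
        return ROOT_RELATIVE.matcher(fragment).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String page = "<A NAME=\"BASE http://edison.cs.uoregon.edu\"></A>"
                + "<A HREF=\"/cgi-bin/webdriver?...\">record</A>";
        String fragment = "<A HREF=\"/cgi-bin/webdriver?...\">record</A>";
        System.out.println(resolve(fragment, findBaseUrl(page)));
    }
}
```

Only the copied fragment is rewritten, so each pasted selection carries its own absolute URLs. References that are already absolute do not match the pattern and are left unchanged.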

The server-side modifications just described proved to be relatively minor and easy to implement, particularly for Web sites that, like ZFIN, employ dynamically generated pages. Since the generation of most of the ZFIN Web pages involves the execution of common script files, by adding code to only a small number of files we were able to affect a large number of ZFIN Web pages.

3.6 Client-Side Modifications

NetNotes was implemented in Java on a Dell 8100 PC running the Windows Millennium (Me) operating system. The JDK 1.3 software development environment was used for programming and its integrated set of Swing classes was used to represent the graphical user interface (GUI). In addition to the server-side modifications necessary for the ZFIN to NetNotes interaction, the following related client-side modifications were also made:

  • The client-side cookie file location must match the location listed in the NetNotes program code. This is critical so that NetNotes will be able to find and read the cookie file, which is necessary for obtaining the URL of the current ZFIN page.
  • The location and command to launch the client Netscape program must match the NetNotes program code. This is critical so that when a user selects an active hyperlink in NetNotes, the Netscape browser will be launched and will load the appropriate Web page (i.e. the page referred to by the link).
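The two client-side settings above can be illustrated with a short Java sketch. It makes several assumptions that are not stated in this paper: that the cookie file uses the tab-separated Netscape cookies.txt layout (domain, flag, path, secure, expiration, name, value), that the URL is stored unencoded, and that the cookie is named NetNotesURL. That name, the class name CookieUrlReader and its methods are all hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; names and the cookie layout are assumptions, not NetNotes code. */
final class CookieUrlReader {

    static final String COOKIE_NAME = "NetNotesURL"; // hypothetical cookie name

    /** Returns the value of the last cookie with the given name, or null if none is present. */
    static String findUrl(List<String> cookieLines, String cookieName) {
        String found = null;
        for (String line : cookieLines) {
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comment lines
            }
            String[] fields = line.split("\t", -1);
            // Assumed layout: domain, flag, path, secure, expiration, name, value
            if (fields.length >= 7 && fields[5].equals(cookieName)) {
                found = fields[6]; // later entries override earlier ones
            }
        }
        return found;
    }

    /** Reads the cookie file from the configured location and extracts the URL. */
    static String readUrl(Path cookieFile, String cookieName) throws IOException {
        return findUrl(Files.readAllLines(cookieFile, StandardCharsets.ISO_8859_1), cookieName);
    }

    /** Builds (but does not start) the command that opens a URL in the configured browser. */
    static ProcessBuilder browserCommand(String browserPath, String url) {
        return new ProcessBuilder(browserPath, url);
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "# Netscape HTTP Cookie File",
                "edison.cs.uoregon.edu\tFALSE\t/\tFALSE\t0\t" + COOKIE_NAME
                        + "\thttp://edison.cs.uoregon.edu/cgi-bin/webdriver?...");
        System.out.println(findUrl(sample, COOKIE_NAME));
    }
}
```

Keeping the cookie file path and the browser path as parameters, rather than constants in the program code, would remove the need for the two locations to match values compiled into the application.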

3.7 NetNotes/ZFIN Inter-process Communication

Perhaps the most interesting technical aspects of the NetNotes implementation have to do with its interaction with ZFIN. One of the motivating factors behind the development of NetNotes was to provide users with the ability to copy and paste information from ZFIN into their notebooks. From a user interface perspective, this process is quite straightforward:

  1. Users select the elements in ZFIN they want to copy by clicking and dragging the mouse over the selection;
  2. With the selection highlighted, users choose the Netscape copy command;
  3. With the mouse positioned in the appropriate spot in the notebook, users select the NetNotes paste command.

Figure 2 illustrates an example of this copy and paste interaction. The top screenshot shows a number of non-contiguous ZFIN selections as viewed in Netscape, while the bottom screenshot shows how these selections appear in NetNotes after they have been copied and pasted (each selection is copied and pasted individually). Note that NetNotes successfully handles the copy/pasting of text (plain and formatted), images, lists, tables, and active hyperlinks.

Figure 2. Copy and paste from ZFIN into NetNotes

While the steps that a user must perform to copy and paste elements from ZFIN to NetNotes are quite simple and intuitive, as just described, the underlying program details are considerably more complex. The following algorithm describes these details, and its numbering scheme also coincides with the numbers displayed in the systems architecture diagram in Figure 1.

3.7.1 ZFIN to NetNotes Copy/Paste Algorithm

  1. When a user loads a ZFIN page in Netscape, the Javascript functions added to the page (see server-side modification 1) send the page URL to the client machine as a cookie.
  2. The user selects some portion of a ZFIN page, and then chooses the Netscape copy command. The text portion of the selection gets sent to the client system clipboard.
  3. When the user selects the NetNotes paste command, the NetNotes program performs the following steps:
    a. Reads the cookie file and locates the ZFIN URL of the current Web page.
    b. Uses the URL to read the HTML source code of the current ZFIN page.
    c. Strips out all HTML tags and blank spaces from the ZFIN page source code while keeping track of how the stripped source code maps back to the original HTML source code. This is necessary for step e below.
    d. Retrieves the clipboard text from the client system and removes all blank spaces.
    e. Tries to match the clipboard text with the stripped source code.
    If a match is found then (e1):
      • The matched string is compared to the original HTML source code.
      • The nearest breakpoint tags are located, forming the new HTML copy string.
      • Relative references that occur in the newly built HTML copy string are resolved. The original HTML source code is searched for the special BASE tag (see server-side modification 3) and its URL portion is extracted (e.g. http://edison.cs.uoregon.edu). All <A HREF="/ and <IMG SRC="/ strings are located in the newly built HTML copy string and the relative reference is replaced with an absolute URL. For example

        <A HREF="/cgi-bin/webdriver?..."> becomes
        <A HREF="http://edison.cs.uoregon.edu/cgi-bin/webdriver?...">
        and
        <IMG SRC="/somepict.jpg"> becomes
        <IMG SRC="http://edison.cs.uoregon.edu/somepict.jpg">

        The replacement of relative references with absolute references works on only the copied portion of HTML--as opposed to the entire original source page--to improve the efficiency of the algorithm. This approach also correctly handles the situation where selections from different Web sites are copied and pasted into the same NetNotes page. The alternative approach--inserting one BASE tag in the HEAD section of the underlying HTML code for the NetNotes page--results in conflicting BASE URLs when there is more than one originating Web page.
      • The newly built HTML copy string is pasted into the NetNotes page where the HTML is interpreted and correctly displayed.
    If a match is not found then (e2):
      • The plain clipboard text is pasted into the NetNotes page.
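The matching steps of the algorithm (c, d and e) can be sketched in Java as follows. This is an illustration under stated assumptions rather than the NetNotes implementation. The class name CopyPasteMatcher and its methods are hypothetical. Breakpoints are matched as the exact string <A NAME="NetNotes breakpoint">, and the first occurrence of the clipboard text is taken as the match. The returned fragment runs from the start of the preceding breakpoint to the start of the following one, which is an assumption about where the breakpoints bound a selection.

```java
/** Illustrative sketch only; names are hypothetical, not NetNotes code. */
final class CopyPasteMatcher {

    static final String BREAKPOINT = "<A NAME=\"NetNotes breakpoint\">";

    /** Returns the HTML between the breakpoints around the selection, or null if no match. */
    static String htmlForSelection(String html, String clipboardText) {
        // Step c: drop tags and whitespace, recording where each kept character came from.
        StringBuilder stripped = new StringBuilder();
        int[] origin = new int[html.length()];
        boolean inTag = false;
        for (int i = 0; i < html.length(); i++) {
            char c = html.charAt(i);
            if (c == '<') {
                inTag = true;
            } else if (c == '>' && inTag) {
                inTag = false;
            } else if (!inTag && !Character.isWhitespace(c)) {
                origin[stripped.length()] = i;
                stripped.append(c);
            }
        }

        // Step d: remove whitespace from the clipboard text as well.
        String needle = clipboardText.replaceAll("\\s+", "");
        if (needle.isEmpty()) {
            return null;
        }

        // Step e: match, then map the match back to positions in the original HTML.
        int start = stripped.indexOf(needle);
        if (start < 0) {
            return null;
        }
        int first = origin[start];
        int last = origin[start + needle.length() - 1];

        // Expand outward to the nearest breakpoints (or the page bounds if none exist).
        int begin = html.lastIndexOf(BREAKPOINT, first);
        int end = html.indexOf(BREAKPOINT, last);
        return html.substring(begin < 0 ? 0 : begin, end < 0 ? html.length() : end);
    }

    /** Branches e1 and e2: paste the HTML fragment if matched, else the plain clipboard text. */
    static String pasteContent(String html, String clipboardText) {
        String fragment = htmlForSelection(html, clipboardText);
        return fragment != null ? fragment : clipboardText;
    }

    public static void main(String[] args) {
        String html = BREAKPOINT + "</A><P>Expressed in <B>fin</B> tissue</P>" + BREAKPOINT + "</A>";
        System.out.println(pasteContent(html, "Expressed in fin tissue"));
    }
}
```

The origin array is what allows a match in the stripped text to be traced back to the original source, which is the bookkeeping that step c calls for.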

The other ZFIN to NetNotes interaction that occurs when NetNotes users track an ongoing Web work process is considerably simpler. In this case, the NetNotes program only needs access to the current ZFIN page URL, which it gets from the client-side cookie file.

3.8 Known Bugs and Limitations

One of the purposes of developing the NetNotes prototype was to explore some of the challenges associated with implementing a Web-based e-notebook. A number of limitations were in fact discovered during the implementation of NetNotes, and they are discussed below. It would be necessary to fix most of these problems before NetNotes ever became publicly available, but for the purposes of this research and future experimental studies, these limitations were generally surmountable.

  • Changing the layout of information copied from the Web into a NetNotes notes page is problematic. When specially formatted information--such as lists and tables--is copied from a ZFIN Web page and pasted into NetNotes, the underlying HTML for that information is also transferred to NetNotes. Because it is invisible to users, this underlying HTML can affect subsequent changes to the layout of those notes in undesirable ways, which can confuse users and make formatting the layout of notes difficult. In future versions of the software, stricter parsing decisions should be made to alleviate this problem. For example, if a user copies and pastes a list from the Web into the e-notebook, all extraneous tags (such as the table that the list resides within) should be stripped out. This should give users more control over formatting elements directly before and after the transferred data. Using a DOM-based approach as previously mentioned should also help with this issue.
  • The ZFIN-to-NetNotes copy/paste procedure does not work correctly when there are special characters in the source Web page. For example, since the '&' (ampersand) symbol is represented in HTML source code as &amp;, when a ZFIN copy/paste selection happens to contain an '&', a mismatch occurs between the system clipboard text and the underlying HTML source code that is matched in the NetNotes parsing algorithm. The clipboard text will contain the '&', but the HTML source code contains &amp; instead. Possible solutions to this problem include resolving these special characters individually in the NetNotes algorithm or using a DOM-based approach.
  • Images are only saved dynamically in a NetNotes page, not statically. Whenever a NetNotes page containing an image is saved, only the reference to the image is stored locally, not the actual image itself. When the notes page is re-accessed in NetNotes, the image URL is referenced to display the image. This means that if an image stored in a NetNotes page moves or changes from its originating Web location, there will be a dead link in the notes and no image will appear. Ideally, users should be able to save images both dynamically and statically in their notebooks. To store an image statically, the prototype software would have to download the image directly to the client machine and maintain a pointer to the locally stored data from within the notebook.
  • The Java Swing classes JEditorPane and HTMLEditorKit have difficulty in correctly handling font sizes. In a NetNotes notes page, multiple-sized fonts are supported in what appears to be an appropriate manner. For example, if a user selects some text and changes its font size to 18 point, the change seems appropriately reflected in the notes page. However, when that text is subsequently saved (as HTML), the stored HTML code becomes <FONT SIZE="18">. The next time this page is loaded in NetNotes, the font size of that text is huge (much bigger than normal font size 18), because the HTML SIZE attribute expects a value from 1 to 7 or a relative value such as +1, not a point size. To get around this problem, whenever a notes page is saved, all font sizes are translated as follows:
    <FONT SIZE="10"> becomes <FONT SIZE="-1">
    <FONT SIZE="12"> becomes <FONT SIZE="+0">
    <FONT SIZE="18"> becomes <FONT SIZE="+1">
    <FONT SIZE="24"> becomes <FONT SIZE="+2">
    For this implementation of NetNotes, the above fix works fine, and when saved pages are re-accessed in NetNotes, the font sizes appear normal. For future versions of the software, it is assumed that Java Swing will have fixed the classes that handle font sizes erroneously.
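
Two of the workarounds in this list are simple enough to sketch in Java. The following is a hedged illustration rather than the NetNotes code: the class and method names are hypothetical, the font-size table contains only the four mappings listed above (any other size is assumed to be left unchanged), and the entity decoder covers only a handful of common named entities as an example of "resolving these special characters individually".

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; all names here are hypothetical, not taken from NetNotes. */
final class SaveFixups {

    // Point sizes written on save, mapped to the relative sizes listed above.
    static final Map<String, String> FONT_SIZES =
            Map.of("10", "-1", "12", "+0", "18", "+1", "24", "+2");

    // Captures the unsigned number in <FONT SIZE="n">; already-relative values
    // such as "+1" or "-1" do not match and are therefore left alone.
    static final Pattern FONT_SIZE = Pattern.compile(
            "(<FONT\\s+SIZE=\")(\\d+)(\")", Pattern.CASE_INSENSITIVE);

    /** Rewrites known point sizes to relative sizes before a page is saved. */
    static String translateFontSizes(String html) {
        Matcher m = FONT_SIZE.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String relative = FONT_SIZES.get(m.group(2));
            String size = (relative != null) ? relative : m.group(2); // unknown: keep
            m.appendReplacement(out,
                    Matcher.quoteReplacement(m.group(1) + size + m.group(3)));
        }
        m.appendTail(out);
        return out.toString();
    }

    /**
     * Decodes a few common named entities so that source text can be compared
     * with clipboard text. "&amp;" is decoded last so that a sequence such as
     * "&amp;lt;" yields the literal text "&lt;" rather than "<".
     */
    static String decodeBasicEntities(String text) {
        return text.replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        System.out.println(translateFontSizes("<FONT SIZE=\"18\">Heading</FONT>"));
        System.out.println(decodeBasicEntities("fgf8 &amp; shh"));
    }
}
```

Decoding entities changes the length of the source text, so a matching routine that uses this approach would also need to track how decoded offsets map back to the original HTML; a DOM-based approach avoids that bookkeeping.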

4 Conclusion

The process of IA is critical to many Web users, but unfortunately it is currently not well supported by commonly used Web browsers or other software applications. Developing usable yet general Web-based software applications is particularly challenging from a technical perspective. Issues such as security, the increasing diversity of accessible Web objects (including animations, programs, forms, etc.), and a diverse and distributed user population often stand in the way. This paper described the NetNotes prototype, a limited solution to many of these problems. Although NetNotes works with a specific Web domain, handles only a select number of Web components, and requires minor server-side modifications, it can serve as a model prototype for future, more robust versions of similar software applications, and it successfully incorporates enough key IA functions to be used in experimental evaluations.

Acknowledgements

We would like to thank Monte Westerfield and Dave Clements, and to acknowledge funds provided by the NIH (RR/HD12546), for technical and other support during the development of the NetNotes prototype.

References

Abrams, D., Baecker, R. and Chignell, M. (1998) "Information archiving with bookmarks: Personal web space construction and organization". In Proceedings of the ACM CHI ’98 Conference on Human Factors in Computing Systems (New York: ACM Press), pp. 41-48

Ayers, E. and Stasko, J. (1995) "Using graphic history in browsing the World Wide Web". Proc. 4th Intl. WWW Conference, Boston, MA
http://www.w3j.com/1/ayers.270/paper/270.html

Bederson, B., Hollan, J., Stewart, J., Rogers, D. and Vick, D. (1998) "A zooming Web browser". Human Factors and Web Development (Lawrence Erlbaum Associates: Mahwah, NJ), pp. 255-266

Card, S., Robertson, G. and York, W. (1996) "The WebBook and the Web Forager: An information workspace for the World-Wide Web". Proc. ACM SIGCHI ’96, pp. 111-117

Cockburn, A. and Jones, S. (1996) "Which way now? Analysing and easing inadequacies in WWW navigation". Int. J. Human-Computer Studies, 45 (1), 105-129

Domel, P. (1994) "WebMap--A graphical hypertext navigation tool". Proc. 2nd Intl. WWW Conference, Chicago
http://archive.ncsa.uiuc.edu/SDG/IT94/Proceedings/Searching/doemel/www-fall94.html

Gould, J. (1988) "How to design usable systems". In Handbook of Human-Computer Interaction, edited by M. Helander (North-Holland)

Hightower, R., Ring, L., Helfman, J., Bederson, B. and Hollan, J. (1998) "Graphical multiscale Web histories: a study of padprints". Proc. Hypertext ’98, pp. 58-65

Kehoe, C. M. and Pitkow, J. E. (1998) "Graphic, Visualization, & Usability Center's (GVU's) 10th WWW User Survey"
http://www.gvu.gatech.edu/user_surveys/survey-1998-10

Manber, U. (1997) "Creating a personal Web notebook". Proc. USENIX Symposium, Monterey, CA, December

Nielsen, J. and Molich, R. (1990) "Heuristic evaluation of user interfaces". ACM SIGCHI ’90, pp. 249-256

Reimer, Y. J. (2001) "Information Assimilation in the Digital Age: Developing Support for Web-based Notetaking Tasks". Doctoral Dissertation, University of Oregon

Reimer, Y. J. and Douglas, S. A. (2001) "Capturing volatile information: Server-side solutions for a WWW notebook". WebNet Journal: Internet Technologies, Applications, & Issues, 3 (1), 36-44

Reimer, Y. J. and Douglas, S.A. (2003) "Ethnography, Task Analysis, and Other Background User Studies Inform the Design of a Web-based E-notebook". Submitted for publication

Sugiura, A. and Koseki, Y. (1998) "Internet scrapbook: Automating Web browsing tasks by demonstration". Proc. UIST ’98, pp. 9-18

Tauscher, L. and Greenberg, S. (1997) "Revisitation patterns in world wide web navigation". ACM SIGCHI '97, pp. 399-406

Whiteside, J., Bennett, J. and Holtzblatt, K. (1988) "Usability engineering: Our experience and evolution". In Handbook of Human-Computer Interaction, edited by M. Helander (North-Holland), pp. 791-817

Addendum

An addendum was provided for this paper (2004-03-24)