Making the Semantic Web usable: interface principles to empower the layperson

Stephen Davies
University of Mary Washington

Chris Donaher
University of Mary Washington

Jesse Hatfield
University of Mary Washington

Jessica Zeitz
University of Mary Washington

Abstract

Before the overall volume of Semantic Web data can ever approach the order of magnitude of the original Web's, tools must be available that allow non-technical laypeople to readily contribute. Both the concepts and the surface syntax of RDF are daunting to newcomers, and this threatens to prevent nonprofessionals from having an appreciable impact. We discuss the key features of a tool designed specifically to help novices generate semantic information, with a primary focus on instance data. This paradigm of interaction enables users to make valid RDF assertions while shielding them from many of the complexities of syntax and resource lookup. We also present the results of a focused empirical study of the behavior of novice users as they created data with the tool. This study sheds light on the usability of specific features, and illuminates some surprising behavioral trends in Semantic Web authoring that should help guide the design of the next generation of user applications.

1. Introduction

The Semantic Web (first trumpeted with much fanfare in Berners-Lee et al. 2001) is an emerging development effort whose focus is on the generation and re-use of machine-processible information. It is intended to be a complement to, rather than a replacement for, the free-text-based "Web 2.0." The vision is that in addition to the natural language content available to information consumers on the original World Wide Web, a richly interlinked graph of concepts will capture its meaning in an unambiguous modeling language like RDF. Automated reasoners and semantic query engines will then be able to leverage this information in powerful new ways. Applications, for example, could use RDF assertions to draw inferences about truths not explicitly stated, thereby enriching users' ability to explore and understand their content. And enhanced search interfaces would allow users to pose high-precision queries that sift the meaning inherent in the underlying documents, rather than just the presence of certain words in the text.

Semantic Web initiatives are already having encouraging successes in key, focused domains. The global phenomenon of rank-and-file end users generating large amounts of data on their own, however, has not yet occurred. This is surely for at least two reasons. First, most users feel no incentive to do so, since they cannot foresee any significant benefits coming from their labors. Second, most users are ill-equipped to do so, since today's tools mostly still assume a high degree of comfort and proficiency with Semantic Web concepts.

In this paper, we concentrate on the second of these two problems. In order to empower end users to create Semantic Web data, two things must occur. First, tools must be designed with novice users in mind, so that those without extensive training or specialized skills can become productive contributors of semantic information. Second, user behavior in working with such tools must be studied, so as to discover precisely what aspects of the data creation process are difficult for novices, and how these can be overcome.

The contribution of this paper is twofold. First, we present OKM1, an experimental Semantic Web authoring interface that incorporates several novel design features in an attempt to reduce the barrier for novices. Second, we analyze the behavior of a small number of users who were studied as they used the interface to perform a set of scripted but realistic tasks. The result is enhanced insight into the behavior of end users and a set of promising directions in the future design of interfaces for them.

We urge the reader to consider the uninitiated user throughout this paper. The user we have in mind is one for whom the realm of RDF triples, global URIs, domain and range assertions, ontologies, and automated inferences is a complete mystery. This user has no training in formal logic or in modeling languages, and they are quite content never to venture into those realms. Yet what they do comprehend, and care about, is meaning. The "semantics" of the Semantic Web are nothing more than precisely phrased statements about the meaning of various aspects of the real world: meaning which, for the most part, machines are powerless to discern with any reliable accuracy unless they are told. We do believe that non-technical users will have significant things to "say" semantically, just as they have contributed significant amounts of free text in Web 2.0: the fact that the mode of expression is the assertion rather than the sentence does not change the fact that users have knowledge and opinions. What we need for the Semantic Web to truly take off on a global scale is to put the power in the hands of the users. The volume of meaning that the multitudes perceive is the volume of data the Semantic Web needs in order to reach its tremendous potential.

2. Terminology

For readers unfamiliar with Semantic Web principles and terms, we here present a brief summary so that our exposition is clear:

For our purposes, all Semantic Web data is representable in RDF (Resource Description Framework; see Manola and Miller 2004), of which the triple is the foundational building block. A triple (also referred to as an assertion) is a basic statement of fact, and consists of three parts: the subject, predicate, and object. These correspond to the common noun-verb-noun structure of many natural language sentences, such as "Joe works for NASA," in which "Joe" is the subject, "works-for" the predicate, and "NASA" the object.

The subjects and predicates of triples are known as resources, each of which has a globally-unique URI (Uniform Resource Identifier; see Berners-Lee 2005) that defines it. URIs typically look very much like URLs, with an "http://" prefix, and follow other URL naming conventions. (What we referred to as "Joe" in the previous paragraph might actually have a URI like "http://somebusiness.com/employees/empId458972".) The distinction between URIs and URLs, however, is that a URI is only a name that identifies a concept (like "Joe" or "works-for"), not an online document that can be accessed over the Web.

The object of a triple can also be a resource (as "NASA" would certainly be, in the above example), but objects are sometimes literals instead, which are non-unique, unaddressable character strings. The distinction between resources and literals is not unlike that between objects and primitive types in a programming language like C++ or Java.

Somewhat inconsistently, predicates are also sometimes known as properties. If the object of a triple that has a certain predicate is expected to be a resource, that predicate is termed an object property. On the other hand, if that object is expected to be a literal, the predicate is called a datatype property. For example, a predicate such as "works-for" would probably be an object property, whereas "first-name" would probably be a datatype property.
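
To make these terms concrete, the following is a minimal sketch that creates the example triples using the Java-based Apache Jena library (the same toolkit that underlies OKM's implementation; see Section 4.1). The "vocab" namespace and the NASA URI are invented for illustration:

import org.apache.jena.rdf.model.*;

public class TripleBasics {
    public static void main(String[] args) {
        Model m = ModelFactory.createDefaultModel();
        String NS = "http://example.org/vocab#";  // hypothetical namespace

        Resource joe  = m.createResource("http://somebusiness.com/employees/empId458972");
        Resource nasa = m.createResource("http://example.org/org/NASA");  // hypothetical URI

        Property worksFor  = m.createProperty(NS, "worksFor");   // an object property
        Property firstName = m.createProperty(NS, "firstName");  // a datatype property

        joe.addProperty(worksFor, nasa);    // object is a resource
        joe.addProperty(firstName, "Joe");  // object is a literal

        m.write(System.out, "RDF/XML");     // serialize the two triples
    }
}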

In the Semantic Web, instance data is often distinguished from ontology. Instance data consists of those triples which assert individual, concrete facts (like "Joe works for NASA"), whereas ontology consists of those triples which assert generalities to which instance data is expected to conform (such as "Employee subClassOf Person".) The W3C languages RDF Schema and OWL (see McGuinness and van Harmelen 2004) are layered on top of basic RDF and were created to accommodate ontology triples. (Since our work focuses on instance data creation, we will say little more about ontology.)

Domain and range assertions are two types of ontology triples that can be made about a predicate. Stating the triples "works-for domain Person" and "works-for range Organization" indicates that whatever appears as the subject and object of a "works-for" triple should be assumed to be a Person and an Organization, respectively. Much of the promise of the Semantic Web rests on the ability to draw inferences such as these. If domain and range assertions exist for a given predicate, then an end user need only say "Nancy works-for IBM" in order for a Semantic Web knowledge base to infer that Nancy is a Person and IBM is an Organization. Hence domain and range assertions not only make today's data richer, but also promise to automatically enhance tomorrow's data.
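
As a sketch of how such an inference plays out mechanically (again using Jena, with invented URIs), attaching a stock RDFS reasoner to a model containing the two schema triples and the single assertion about Nancy is enough to materialize the inferred types:

import org.apache.jena.rdf.model.*;
import org.apache.jena.vocabulary.*;

public class DomainRangeInference {
    public static void main(String[] args) {
        Model m = ModelFactory.createDefaultModel();
        String NS = "http://example.org/vocab#";  // hypothetical namespace

        Property worksFor = m.createProperty(NS, "worksFor");
        Resource person = m.createResource(NS + "Person");
        Resource organization = m.createResource(NS + "Organization");

        worksFor.addProperty(RDFS.domain, person);        // "works-for domain Person"
        worksFor.addProperty(RDFS.range, organization);   // "works-for range Organization"

        Resource nancy = m.createResource(NS + "Nancy");
        Resource ibm = m.createResource(NS + "IBM");
        nancy.addProperty(worksFor, ibm);                 // "Nancy works-for IBM"

        InfModel inf = ModelFactory.createRDFSModel(m);   // wrap the data with an RDFS reasoner
        System.out.println(inf.contains(nancy, RDF.type, person));       // true (inferred)
        System.out.println(inf.contains(ibm, RDF.type, organization));   // true (inferred)
    }
}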

Finally, note that RDF triples are conceptual rather than tangible; in order for them to be electronically recorded and exchanged, they must be expressed in some kind of Unicode-based syntax. RDF/XML (see Beckett and McBride 2004) is one of several such syntaxes, and is the only official standard for the Semantic Web. RDF/XML is infamously verbose, cluttered, and difficult to read, which is one of many reasons why laypeople cannot be expected to compose it.

3. Related Work

In this section, we describe current research and development efforts for both designing data authoring tools and studying the usability of such tools. Section 3.1 provides a taxonomy of the major categories of applications, in order to give the reader an idea of what kinds of approaches have been tried. At the end of each grouping, we compare OKM to the tools in that class, so as to distinguish our approach from these others. Section 3.2 then presents the (small) body of usability studies in the Semantic Web field, again contrasting them with our work.

3.1 Instance Data Authoring Tools

Short of hand-crafting RDF triples in a text editor, there are primarily three groups of tools available to end users for authoring instance data: RDF editors, ontology editors, and semantic wikis. Each is geared towards a different audience and a different set of design activities, with varying advantages and disadvantages.

3.1.1 RDF editors

Tools that explicitly call themselves "RDF editors" vary considerably in functionality. Some are simply forms-based interfaces that help a user create a single RDF file - they relieve the user from having to type the syntax from scratch, and offer some simple autocomplete capabilities (Grove 2009; Punin et al. 2009). Other tools (e.g., Palmer and Naeve 2005; Pietriga 2001; Steer 2001) present a visual interface that permits the user to see the underlying graph laid out spatially.

Tabulator (Berners-Lee 2007) is unique in this class in that it operates in distributed fashion. Rather than creating a single, local RDF file, Tabulator lets users view - and edit - the distributed Semantic Web "in place." RDF assertions drawn from multiple sources are materialized to the user as though they were a single, coherent graph. Editing that graph may result in Tabulator propagating changes to more than one RDF store.

Comparison with OKM: The target audience for these tools appears to be users who are already savvy with Semantic Web concepts, and desire aids to facilitate interacting with the data. Our work differs from these approaches in that we attempt to hide many of the complexities of the Semantic Web from the user, rather than expose them. We target novice users who may not want to think in terms of "triples" or "RDF documents" at all, but simply in terms of abstract knowledge.

3.1.2 Ontology editors

Applications like Protege (Noy 2001; Tudorache and Noy 2007) and Swoop (Kalyanpur 2006b) also allow users to collaboratively build knowledge, but their emphasis is on the schema level rather than the data level: ontologies rather than instances. They support a comprehensive toolset for specifying sophisticated relationships and constraints among types, and typically require a higher level of mastery of Semantic Web concepts to use effectively.

Comparison with OKM: Though such tools can in principle be used to create significant volumes of instance data, in practice they are not often used that way, and the skill set they presume makes them prohibitive for many less sophisticated users. Moreover, these tools conceptually divide the tasks of ontology construction and instance creation: users can edit schema, and then separately author instance data that conforms to it. OKM, by contrast, seeks to unite these two activities into a single fluid scheme whereby the implicit ontology will emerge in response to data creation.

3.1.3 Semantic wikis

Semantic wikis combine features of wikis with semantic assertions. When editing a page, the user can add semantics that make RDF-compatible assertions about the information therein. This can be done either through an extension to the markup (Krotzsch et al. 2006; Souzis 2005) or through a separate interface mechanism (Campanini et al. 2004; Schaffert 2006). The selling point of semantic wikis is that the benefits of traditional wikis - simplicity of use, low barrier to entry, ease of undoing changes - are carried over into the semantic domain. The presumption of such tools is that the user is working primarily in a natural language context, and wants to augment this unstructured text with semantic data.

We include Loomp (Luczak-Rosch 2009) and SMORE (Kalyanpur 2006a) in this category, even though these projects do not self-identify as "semantic wikis," because they are also based on the concept of adding semantics to Web-based free text. Both tools were designed specifically to appeal to non-technical users. Loomp is like a semantic wiki, but substitutes point-and-click annotation for markup. Making semantic assertions has the same familiar feel as formatting text in a word processor. For instance, a user can create free-text content (as in a normal Web page) and then highlight certain fragments of that content to add Semantic Web triples to the page. The process is similar to highlighting a phrase in a word processor and pressing a "boldface" or "italics" button. The user can choose from a menu of predicates, and at the click of a button, a triple will be created that uses that predicate and the highlighted text as its object. In this way, the Loomp designers attempt to merge the free-text creation process with the Semantic Web authoring process, and give semantic annotation a familiar feel.

In a similar vein, SMORE's design goal is the "seamless integration of content creation and annotation." It is an integrated environment for creating Web pages, email, and other online content, and for including semantic annotations within them. SMORE facilitates the deferral of commitment to specific ontologies, so that users can begin with less focused ideas and gradually converge on precision. It also provides numerous tools for automatic or semi-automatic extraction of semantics from common data sources (like e-mails, images, or tables).

Comparison with OKM: OKM shares with semantic wikis a commitment to ease of use, reversibility of changes, and the ability for knowledge to grow spontaneously in non-preplanned directions. It differs in that it is not based on free text. Where semantic wikis permit assertions to be made in the context of a Web document, OKM users are creating semantically encoded knowledge for its own sake. (Most immediately, so that it can be more flexibly navigated and queried, but also to serve as fuel for automated reasoners later on.)

3.1.4 OntoWiki

Finally, we mention OntoWiki (Auer 2006), which is the project closest in spirit to OKM. Though often classified as a semantic wiki, OntoWiki is actually strikingly different from most such tools, and in our opinion should not be labeled as such. Like OKM, in OntoWiki the focus is not on free text that can be semantically annotated, but on the semantic assertions themselves. Like an RDF editor, it presents a portal into an RDF knowledge base. OntoWiki offers many social collaboration features so that the knowledge creation workflow can take place in a distributed setting. It is an excellent example of a practical tool that makes semantic data creation by novices a real possibility.

Comparison with OKM: The principal differences between OKM and OntoWiki are the design approaches described in section 4 of this paper, namely: allowing the user to deal with human-readable names rather than full URIs; the grouping of properties by role; channeling the user towards consistency by means of intelligent autocomplete functionality; and providing a forms-based alternative to the SPARQL Semantic Web query language. Through these innovations, we attempt to further reduce the barrier to the noninitiate, making Semantic Web authoring more possible for the masses.

3.2 User interface studies

Although Semantic Web researchers have produced this plethora of tools for RDF creation, usability studies are rare. Very few compelling studies have evaluated the level of effectiveness of such tools, or what aspects in particular make them effective (or ineffective.) The great majority have provided no user studies at all; a few (e.g., Auer 2006, Bollacker 2008, Krotzsch 2006) point to user communities as evidence of effectiveness; occasionally (e.g., Stojanovic 2001) a case study is performed illustrating use in a limited setting, often by Semantic Web experts. Tabulator's designers (Berners-Lee 2007) explain that their project is in "exploration mode" rather than "analysis mode" and make only the general observation that "much opportunity exists for improvement." It appears that focused study on what specific barriers end users face when using Semantic Web interfaces is not being given much attention.

The most helpful study in this regard was by Noy, et al. (Noy et al. 2000), who directed military domain experts to use a version of Protege-2000 with domain-specific extensions in order to perform specific knowledge acquisition tasks. The study's conclusion was a happy one: domain experts, with 1-2 hours of training but no computer science background, are in fact able to effectively use a large KB concerning a domain with which they are intimately familiar. This applies to both viewing and editing tasks.

Our work differs from Noy et al.'s in several ways. First, we are focusing on laypeople (not domain experts) who are tasked with formulating generalized, open-ended knowledge. In the Protege-2000 study, the structure of the KB given to participants was very detailed, and contained a precisely specified class hierarchy of concepts (e.g., types of combat units) that subjects used on a daily basis. There was no question about the precise definition of each of these concepts, nor about the manner in which they were related. Hence subjects could easily navigate (and extend, where necessary) the taxonomy and instances in a fairly straightforward way. Second, the application was equipped with features designed specifically for the domain, further channeling users towards known success paths. Third, the subjects received 1-2 hours of training with the system, indicating a higher level of investment (and presumably, proficiency) than in our experiments. And fourth, the aim of the study was to determine the level of overall proficiency subjects obtained with the tool, not the viability of specific features (other than the domain-specific extension to Protege.)

One other study worth mentioning involved the OntoAnnotate semantic annotation tool (Staab et al. 2001). An in-depth analysis was conducted of the behavior of nine experimental subjects (industrial engineering students) who used the application to add machine-processible metadata to web pages. The focus of the study was the level of inter-annotator agreement; that is, the degree to which different subjects independently annotated a page in the same way. The conclusion, roughly speaking, was that novices to the Semantic Web, operating in a general domain (where they are not experts), will not in general produce high-quality structured knowledge, or at least not knowledge that agrees from one annotator to the next. Specific UI aspects of the OntoAnnotate tool and the ways users interacted with them were not studied, however, which distinguishes our approach from theirs.

To our knowledge, this list represents the sum total of evaluation efforts of Semantic Web authoring tools in the decade since the Semantic Web was first announced to the world at large (in Berners-Lee et al. 2001). It is not a long list. Given that empowering novice users to contribute is imperative to the Semantic Web movement, studies like the one presented in this paper, which shed light on how to empower them, are vital to the success of the whole project.

4. OKM's Design

4.1 The Basic Paradigm

OKM maintains a local repository that contains information about the instances in its knowledge base. The implementation uses a Jena/SDB RDF store (McBride 2002), a store specifically designed to persist and retrieve Semantic Web triples using a relational database. (In our case, Jena is configured to read and write triples from a MySQL database.) The OKM interface uses the terms "fields" and "relationships" to refer to the two kinds of triples that can be made concerning a resource, depending on whether the object is a literal or another resource. An example of a field would be "James height 1.9meters", while an example of a relationship might be "James worksFor LansingCorporation." These correspond to "datatype properties" and "object properties," respectively; however, we use the terms "field" and "relationship" instead because they seem more intuitive to users.
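
The field/relationship split can be read directly off the RDF: a statement is a "field" exactly when its object is a literal. The following is a minimal sketch of how an OKM-style page might partition a resource's outgoing statements (using Jena; the method and variable names are our own, not OKM's actual internals):

import org.apache.jena.rdf.model.*;
import java.util.*;

public class FieldsAndRelationships {
    // Partition a resource's outgoing triples into fields (literal objects)
    // and relationships (resource objects).
    static void partition(Model m, Resource subject,
                          List<Statement> fields, List<Statement> relationships) {
        StmtIterator it = m.listStatements(subject, null, (RDFNode) null);
        while (it.hasNext()) {
            Statement stmt = it.next();
            if (stmt.getObject().isLiteral()) fields.add(stmt);
            else relationships.add(stmt);
        }
    }
}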

As in many other tools (e.g., Auer 2006; Campanini 2004; Souzis 2005), each resource is materialized in the display as a page, featuring triples relating to that resource. All RDF triples in which the resource appears as a subject or object are shown, with fields (somewhat arbitrarily) appearing first, then relationships. Users can navigate the site by traversing hyperlinks to related resources.

Users can create new resources by naming them with a nickname, or label, hereafter referred to as a human-readable name (HRN.) At creation time, OKM auto-generates a globally-unique URI for that resource (scoped to the domain name of the OKM server.) Users can then add properties to the resource. If a property value is another resource (as opposed to a primitive data type), the user can specify an existing resource in the system as the object, at which point the new resource is effectively "stitched in" to the rest of the graph. Users can also search the system for resources by typing in a search box that autocompletes based on HRNs, or any portion thereof (e.g., typing "lin" will match a resource whose HRN is Abraham Lincoln.)
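
We do not detail OKM's internal storage of HRNs here; the sketch below assumes the conventional rdfs:label property for the HRN and a UUID-based URI minted under the server's domain, which is one plausible realization of the behavior just described:

import org.apache.jena.rdf.model.*;
import org.apache.jena.vocabulary.RDFS;
import java.util.UUID;

public class HrnExample {
    public static void main(String[] args) {
        Model m = ModelFactory.createDefaultModel();

        // Create a resource: the user supplies only the HRN; the URI is auto-generated.
        String uri = "http://okm.example.org/resource/" + UUID.randomUUID();  // hypothetical scheme
        Resource lincoln = m.createResource(uri);
        lincoln.addProperty(RDFS.label, "Abraham Lincoln");

        // Substring search over HRNs: typing "lin" matches "Abraham Lincoln".
        ResIterator it = m.listSubjectsWithProperty(RDFS.label);
        while (it.hasNext()) {
            Resource r = it.next();
            String hrn = r.getProperty(RDFS.label).getString();
            if (hrn.toLowerCase().contains("lin"))
                System.out.println(hrn + "  <" + r.getURI() + ">");
        }
    }
}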

Note that OKM's focus is on browsing and editing the data in its own local repository. Any new resources or added/changed assertions about existing resources are added to this repository. The repository can be exported to any standard RDF serialization, and external data can be imported to it, but while the user is interfacing with OKM it is the contents of the repository only that are in view. The user currently cannot navigate or search outside its bounds to the larger Semantic Web.

We now describe several fundamental aspects of the OKM interface that are designed to lower the barrier to entry for novice users.

4.2 HRNs and Local Namespaces

Although it is important to have a globally-unique identifier for every resource on the Semantic Web (so that it can be unambiguously identified), human beings normally function within some context. This allows the full specification of concepts to be truncated to locally-scoped names, offering convenience and brevity. This is why we have pronouns in natural language, and why we refer to our brother as "John" in a conversation rather than as "John Smith III who was born in 1982 and lives at 123 Main Street in Cedar Rapids, South Dakota" every time we mention him. The burden of communication, and even of thought, becomes insurmountable if all references to entities must be fully specified.

For this reason, we contend that a Semantic Web editor for the novice should present the user almost exclusively with HRNs, while backing each of those entities with a global URI that only appears when necessary. The RDF generated, of course, will be anchored in the global uniqueness paradigm, but that is quite a different matter from what the user works with in the trenches.

This stance is similar to that taken by Tabulator (Berners-Lee 2007), but for a different reason. Tabulator's designers wanted to steer users towards thinking about the "graph of things" rather than the "web of documents," and so it displays nicknames to avoid confusing URIs (of things) with URLs (of pages.) We, on the other hand, simply want to encourage the user to "think locally." Knowledge modeling is a difficult task that requires great attention, and letting users quickly summon the common names for things they are already familiar with makes the process more agile.

In OKM, the HRN that a user gives each newly created resource is the primary way of referring to it. Whether displaying a resource's name, searching for it by typing in the search box, or selecting it as the object of a triple on some other resource's page, it is always the resource's HRN that is displayed or matched. In the event that more than one resource in the OKM repository has the same HRN, the URI is appended to the display so as to disambiguate. But since this is relatively rare, it gives the user the pleasant experience of working in a world of familiar names for things, with the harmless illusion that the names are all unique. Thus the user can work in their local context and yet still make assertions that are globally meaningful in the Semantic Web, since each resource and predicate is backed by a (non-displayed) URI.

The effect of all this is to "quiet down" the user interface by making it less cognitively overwhelming. A large percentage of the visual display is occupied by names and concepts that the user has defined him/herself, with a minimum of syntactic "noise" that might make it intimidating and difficult to navigate.

4.3 Grouping Properties by Role

Consider the following triples in a very small, toy knowledge base (presented, for simplicity, as plain subject/predicate/object triples rather than in an RDF serialization syntax):


Thatcher gender female
Thatcher ratified FixedLinkTreaty
Thatcher wrote ThePathToPower
Thatcher bornIn 1925
Thatcher elected 1979
Thatcher memberOfParty ConservativeParty
elected domain PrimeMinister
memberOfParty domain PrimeMinister
ratified domain PrimeMinister
gender domain Person
bornIn domain Person
wrote domain Author

A set of RDF triples has no order. Each triple is a standalone statement that must be independently meaningful, on the same "level" as all the others. But in the user's mind, clearly there is organization present. Here, the domain assertions insist that the elected, memberOfParty, and ratified properties have something to do with "Thatcher as a PrimeMinister," while gender and bornIn have to do with "Thatcher as a Person" and wrote with "Thatcher as an Author." Clearly, the resource known as Thatcher is complex and multifaceted. One might say that the entity it represents plays multiple roles, and that its various properties make sense only in the context of one of those roles. This is reflective of the ontological theory of role-concepts and role-holders delineated by Kozaki et al. (2002): "PrimeMinister" is a role-concept, and "Thatcher as a PrimeMinister" is a role-holder.

We contend that presenting the user with a flat list of properties for a resource is likely to be cognitively overwhelming. In such a disorganized list, users will inevitably have difficulty finding the information they're searching for. A successful Semantic Web will contain a large amount of rich data for the entities it describes, yet a user would quickly become disoriented if forced to scroll through pages of unordered assertions. It seems that in some cases, an implied organization exists in the form of the different contexts or roles a resource participates in. These roles may be inferred from rdf:type assertions about the resource, or, as in the above example, from the domains (and ranges) of properties.

To take advantage of this implicit organization, OKM's display for a resource is grouped by role. (Note that a role is equivalent to the Semantic Web notion of a type.) Each property that OKM can associate with a role is displayed in a box labeled with and devoted to that role (see Figure 1.) Similarly, when the user creates new assertions (as described below), they do so from within the context of one of the roles. The user does not add a property "in general" to a resource page, but rather adds a property to one of the role boxes (as illustrated in Figure 2.) OKM then creates appropriate domain and/or range triples - which the user does not see explicitly - so that the new property is thereafter associated with that role. The result of all this is that coherence emerges from the display, since properties that are semantically related are spatially grouped together.

The basic OKM interface

Figure 1. The basic OKM interface. Each resource in the system is manifested as a page. The triples involving that resource (whether as subject or object) are grouped by the role (type) to which they pertain. (For instance, the triple "Margaret Thatcher ratified Fixed Link Treaty" appears on the Margaret Thatcher page, in the box for the "PrimeMinister" role, since it involves Margaret Thatcher (as subject) and pertains to her as a prime minister rather than as a person or author.) This method of display reduces the cognitive overload of an unorganized, flat list of properties. The user can traverse the knowledge base by clicking on hyperlinks of related resources, or globally search for resources by any part of their human-readable name (in the upper-right search box.)

Note that this approach goes beyond the low-level grouping that tools like OntoWiki (Auer 2006) and Platypus (Campanini 2004) provide, which simply omit duplicate subjects and predicates from the display when they are repeated in more than one triple. (This is reminiscent of the Turtle syntax (Beckett 2004), which allows groups of triples to be similarly compressed.) By contrast, OKM is grouping at a semantic, rather than a syntactic, level. It uses information about domains and ranges in order to intelligently cluster the properties visually by role.
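
A sketch of the grouping logic, under the simplifying assumption that each displayed predicate carries a single rdfs:domain assertion (properties with no declared domain fall into a catch-all bucket; the helper names are ours, not OKM's):

import org.apache.jena.rdf.model.*;
import org.apache.jena.vocabulary.RDFS;
import java.util.*;

public class GroupByRole {
    // Cluster a resource's outgoing triples into role boxes keyed by the
    // rdfs:domain of each triple's predicate (e.g., Thatcher's "ratified"
    // triple lands in the PrimeMinister box, "wrote" in the Author box).
    static Map<Resource, List<Statement>> group(Model m, Resource subject) {
        Map<Resource, List<Statement>> byRole = new HashMap<>();
        StmtIterator it = m.listStatements(subject, null, (RDFNode) null);
        while (it.hasNext()) {
            Statement stmt = it.next();
            Statement domain = stmt.getPredicate().getProperty(RDFS.domain);
            Resource role = (domain != null) ? domain.getResource()
                                             : RDFS.Resource;  // catch-all box
            byRole.computeIfAbsent(role, k -> new ArrayList<>()).add(stmt);
        }
        return byRole;
    }
}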

Some interesting cases arise surrounding multi-role properties. What if a property in some imported data set has more than one domain or range specified for it? For example, if the predicate winsTripleCrownRace has both the domain Horse and the domain Competitor declared for it, it seems that the property is relevant to both of those roles. Hence, OKM's "Secretariat" page displays any winsTripleCrownRace triples in both role boxes. (Editing one automatically updates the other's display.) Conversely, what if a user adds a property to role box A of resource X, and then later adds the same property to a different role box B of resource Y? Are they in fact asserting that Y is also of type A, and X of type B? Or are they merely saying that one possible domain of the property is type A, and another possible domain is B? (If a user added a veto property to a particular resource's Governor box, they would think it odd indeed if suddenly that resource also acquired a President box with the same property!) This situation seems insoluble without further clarification from the user. OKM currently chooses to disallow it, simply by forcing the user to associate a property with only one domain (and range) when creating properties through the interface. (In the case above, a separate veto property with a distinct URI would have to be created for governors.)

4.4 Form-based, not Markup-based

Applications like Dreamweaver (Adobe Systems 1997) and FrontPage (Microsoft Corporation 1997) made WWW authoring accessible to a whole new generation. Essentially, these tools allowed users to create HTML without knowing they were doing it. Hidden was not only the syntax, but also the very fact that the underlying representation of the page was HTML at all. To the user, the artifact in question was a visual collage of text, fonts, colors, and images, and the job of the tool was to facilitate design of this artifact while masking the messy details. More generally, the obvious successes of graphical widget-based interfaces for the masses cannot be ignored. Although command-line interfaces offer unmatched power and flexibility in expression, it is certainly no accident that the Macintosh interface design revolutionized the computing world, or that Windows replaced DOS, or that even Ubuntu Linux now features GUIs front-and-center. Many non-technical users simply prefer graphical, point-and-click interfaces.

When we probe the reasons for this phenomenon, we find at least three. First, the functions provided by a markup language are not easily discoverable. The user cannot effortlessly answer, "what are all the things I could possibly express?" and hence they can't readily perceive the boundaries of the tool. Second, a language-based interface does not aggressively guide the user towards probable success paths. The user cannot easily answer, "so what am I supposed to do?" without finding what seems to be a reasonable example and following its style. And thirdly, of course, language-based interfaces are not forgiving syntactically. A noninitiate has a much better chance of properly typing things into predefined fields than of articulating something correctly in a markup syntax.

At some level, requiring the user to enter free text, no matter how simple the syntax is made, will suffer from these liabilities. The inevitable result, we believe, is that a (probably large) class of users will simply not contribute Semantic Web data.

Rank-and-file users need to be able to choose, not merely to state. For this reason, OKM eschews a Wiki-like markup syntax as in (Krotzsch et al. 2006; Souzis 2005) in favor of a flexible forms-based interface. (See Figure 2.) This provides a ready organization to the material, suggests properties and values that make sense for the viewed entity (see below), obviates the need to remember and hand-craft language syntax, and eliminates syntax errors. The price paid for all this, as with all point-and-click approaches, is a less flexible form of expression than a language-based interface and a more cumbersome method of entry. We believe, however, that this tradeoff is worthwhile for a large body of the population.

Editing in OKM

Figure 2. Editing in OKM. When the user clicks the "Edit" link on a page (see Figure 1), fields pop up that allow each property and value to be edited. Properties known to be associated with the resource's roles but which do not (yet) have values for that resource ("height," in the example figure) appear in grey to suggest that they might be meaningfully filled in. New roles are added to a resource through the "+Role" button, and properties are added by clicking the "+Data" button for a particular role. An aggressive autocomplete feature (described in Section 4.5) encourages, but does not force, users to create data that conforms to the schema that already exists. In the figure, the user is specifying which resource should be the object of the "ratified" predicate, and has typed the letters "t-r-e" into the value box. The system first offers "tre" matches only for resources that have the Treaty role, since OKM has learned from other data that Prime Ministers can ratify treaties. Other matches appear lower in the box, and in grey, since they are of types that are not (yet) known to be "ratifiable."

4.5 Channeling towards consistency

A forms-based interface seems to imply, however, that the names, types, and meanings of the expected inputs are known in advance and can be used to prompt the user. How is this possible in the Semantic Web, where the paradigm is that "anyone can say anything about anything" (Klyne and Carroll 2002)?

OKM's fundamental principle here is that although schema and data will undoubtedly evolve as a knowledge base grows, schema is nevertheless more stable than data. Moreover, as data is added to a system, it implies a schema of sorts, since the particulars that are given are presumably examples of a more general phenomenon.

For example, suppose that in the absence of any other information, you are told that Betty Smith is 1.3 meters tall. You have learned one particular fact, but you can probably also infer a more general one: a person has a height. Similarly, if you are told that "President Roosevelt vetoed the Blackfeet allotment bill" you also learn that presidents can veto things and the things they veto can be bills. To be sure, it may also be true that people have properties other than height, that height is not always numeric, that persons other than presidents can veto things, that presidents can veto things other than bills, and so on. None of these things are outlawed, and OKM does not prevent them from being asserted. However, upon learning a general from a particular, OKM gently steers the user towards similar particulars, since these are presumably consistent with the implied schema. The system assumes that new kinds of facts will be encountered more rarely than will new facts that are consistent with the already existing kinds.

At this point one might object. The original Web did not require any "steering." In fact, it exploded for just that reason: anyone could write anything on a Web page, and the human race proved to be very prolific in doing so. But this comparison is misleading. It worked for the original Web because the medium was natural language, the best (and practically only) mechanism in which the masses would ever express themselves. They were quite comfortable with that.

But the Semantic Web is different. It's foreign territory for the newcomer - the realm of formal logic rather than emotive expression - and the possible future uses of what is asserted are mysterious. Users have something to say, but they want to say things that make sense, and they want their authoring tools to help them make sense. They want to be guided so they are consistent with the conventions of others and themselves. OKM's modus operandi is to subtly guide the user towards assertions that are schematically compatible with the assertions that are already known. This is done almost exclusively through intelligent autocomplete boxes. Every time the user attempts to "cross the schema boundary," they are alerted with a confirmation box and informed of the ramifications of this. If the user confirms, OKM has learned something new, and adjusts its knowledge of the schema accordingly.

Concrete examples make this idea clearer. Suppose an OKM repository already has instances of type Author in it, and a wrote property has been added to several of the "Author" boxes for these instances. If a user now attempts to add a property to another resource's Author box, and begins to type "w-r-o", the autocomplete function will suggest and offer to complete the word "wrote." It would not, however, aggressively offer to complete the word "wronged" if that were a predicate associated with a different domain (say, Criminal.) The word "wronged" still appears as an autocomplete choice, but it is relegated to a secondary status (i.e., it appears lower in the drop-down list and in a grey font, indicating that it is not a preferred match.) If the user does select "wronged," then they receive a confirmation box asking them to clarify whether it is true that, in general, an instance acting in an Author role can indeed "wrong" things. If this is confirmed, this new meta-fact is now added to the implicit schema, and from that point on, OKM will aggressively match "wronged" for both Authors and Criminals.

A similar example is illustrated in figure 2. The interface aggressively offers to match objects whose types are known to make sense for the assertion being created. Others are given secondary status, since adding one of them would "cross the schema boundary" and therefore extend it. Thus the system achieves a happy medium between flexibility and constraint. Based on what it knows, it channels users towards compatible facts, while giving the user a warning any time they stray from that compatibility. Each time the user does cross the boundary, and confirms the action, OKM has learned a new schema-related fact, and incorporates that into future prompting. In this way, data and schema grow together without the user having to plan ahead for it.
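
A sketch of the ranking rule behind this behavior: candidate predicates whose declared domain matches the role box being edited are preferred, while prefix matches associated with other domains are demoted rather than hidden. (The method name and the label-based matching are our own assumptions, not OKM's actual implementation.)

import org.apache.jena.rdf.model.*;
import org.apache.jena.vocabulary.RDFS;
import java.util.*;

public class AutocompleteRanking {
    // Suggest predicates matching the typed prefix; those whose rdfs:domain
    // matches the current role box come first, the rest are demoted (greyed out).
    static List<Resource> suggest(Model m, String prefix, Resource currentRole) {
        List<Resource> preferred = new ArrayList<>(), secondary = new ArrayList<>();
        ResIterator preds = m.listSubjectsWithProperty(RDFS.domain);
        while (preds.hasNext()) {
            Resource p = preds.next();
            String name = p.hasProperty(RDFS.label)
                    ? p.getProperty(RDFS.label).getString()
                    : p.getLocalName();
            if (!name.toLowerCase().startsWith(prefix.toLowerCase())) continue;
            if (m.contains(p, RDFS.domain, currentRole)) preferred.add(p);
            else secondary.add(p);  // choosing one of these "crosses the schema boundary"
        }
        preferred.addAll(secondary);
        return preferred;
    }
}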

This type of intelligent suggestion goes beyond that of other current tools. OntoWiki (Auer 2006) and Tabulator (Berners-Lee 2007) have an auto-suggest feature, and Protege (Noy 2001; Tudorache and Noy 2007) lets the user choose from known instances and predicates, but they do not make use of type/domain/range information to suggest intelligently. SMORE (Kalyanpur 2006a) makes inferences (called "manifested inferences") about types from domain/range information, which the user can see and optionally override. But this is not used to provide the kind of guided auto-suggestion described here. Freebase (Bollacker 2008) provides type-based autocompletion, but only objects whose types fall within the known schema can be added that way. The instance-based editing interface cannot be used to implicitly expand the known schema; schema changes must be made through a separate, explicit interface.

4.6 Semantic Query Interface

Finally, OKM allows the user to pose focused queries on all this knowledge through a forms-based alternative to SPARQL, the W3C query language for RDF (Prud'Hommeaux and Seaborne 2006), and to similar query languages (e.g., Krotzsch et al. 2006). SPARQL (a recursive acronym for SPARQL Protocol And RDF Query Language) is fast emerging as the standard declarative query language for Semantic Web knowledge bases. As SQL is used as a standard interface for querying relational databases, so SPARQL is used to query a set of RDF triples. A programmer can specify a set of "triple patterns" whose structure is to be matched against the knowledge base, and receive results indicating which sets of triples fit the pattern described.

Again, we draw an analogy to the original Web. SQL is without doubt an expressive language that can plumb the depths of any relational database. And in fact, any forms-based query interface - one that lets users pose boolean queries against a library's online catalog, say - will of course use SQL in its underlying implementation. But that does not mean that the average user will write SQL queries to answer everyday questions. Most end users interact with a scripted, forms-based interface that provides more guidance (and more restriction) than the SQL language itself does. Similarly, for end users to take advantage of the precision inherent in the assertions they have created, they need a veneer on top of SPARQL to pose semantic queries.

The challenge is to allow expressivity and flexibility without making the interface incomprehensible. This is a delicate line to walk. The more the interface "helps" novice users by channeling them into sensible query formulas, the more difficult it is for the user to break out of this predefined mold and ask something intricate and unexpected. Conversely, giving too much open-ended freedom risks making it difficult for newcomers to understand what to do. Undoubtedly the answer is to provide different interfaces for different kinds of users; here, we simply steer towards a middle ground.

OKM gives a user forms and autocomplete widgets for building up and naming simple queries, and lets them compose more complex queries out of those smaller building blocks. Each simple query is a conjunction (boolean "and") of conditions. Each condition is one of the following three types:

  1. membership in a role (e.g., "resources that have the WarDeclaration type")
  2. field value (e.g., "resources that have a yearAppointed field in their DistrictJudge role," or "resources whose yearAppointed value is less than 2005")
  3. relationship (e.g., "resources that have any relationship to TonyBlair," or "resources whose Party role has a nominates relationship to TonyBlair.")

The user can build a query combining any of these conditions, and then name it. The named query then becomes a sort of "virtual resource" that can be used for further queries. For example, one could build a query combining a condition of type 1 ("resources that have the PrimeMinister role") and one of type 2 ("resources whose gender field in the Person role is female") and name that query "Female Prime Ministers." Then, a second query could be created with a type 3 condition ("resources whose Party role has a nominates relationship to Female Prime Ministers") to find all known political parties who have nominated female Prime Ministers. All of this and more is possible in SPARQL, of course. But this forms-based approach with interim naming lets users who are likely to be daunted by query syntax and variable placeholders nevertheless build up sophisticated queries.
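
For the reader curious about what lies beneath, the "parties that nominated female Prime Ministers" example might expand to a SPARQL query along the following lines (the ex: namespace and property URIs are hypothetical; the query is executed here via Jena's query API, not necessarily as OKM does):

import org.apache.jena.query.*;
import org.apache.jena.rdf.model.Model;

public class QueryVeneer {
    static void run(Model model) {
        String q =
            "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
            "PREFIX ex:  <http://okm.example.org/vocab#> " +      // hypothetical namespace
            "SELECT ?party WHERE { " +
            "  ?party ex:nominates ?pm . " +                      // condition type 3 (relationship)
            "  ?pm rdf:type ex:PrimeMinister . " +                // condition type 1 (role)
            "  ?pm ex:gender \"female\" . " +                     // condition type 2 (field value)
            "}";
        try (QueryExecution qe = QueryExecutionFactory.create(QueryFactory.create(q), model)) {
            ResultSet results = qe.execSelect();
            while (results.hasNext())
                System.out.println(results.next().getResource("party"));
        }
    }
}

Note how the saved "Female Prime Ministers" subquery simply contributes its two conditions (the ?pm patterns) to the larger query; the forms-based interface spares the user the variable placeholders that make such queries daunting.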

Figure 3 illustrates the process. In the top image, the user has just created a semantic query and given it two conditions to match: instances that have a partOf relationship to the Europe2 instance, and are of type (role) Country. (Observe that interim search results appear in the white box to the right, and are updated immediately after every change to a query.) By typing into the "Save Query" text box the user now permanently saves this query under the name "European countries" for future use.

In the middle image, the user is in the midst of creating a second query, which they will shortly name "European treaties." It consists of all instances that are of type Treaty, and that have an involved relationship to some "European country" (which is precisely the query the user named and saved previously.) (Note that the interim search results contain all treaties at the instant the snapshot was taken, but when the user presses "save" to add the second condition, the British-US treaty will disappear from the list since it does not meet the second criterion.)

Finally, this second named query is used as a component of a third, which is the user's final goal: to find all known Prime Ministers born after 1850 who have ratified European treaties. The user names this query "Recent European treaty ratifiers." Each of the user's named queries shows up on the "Existing Queries" page, as shown in the bottom image of Figure 3. The user can click on any query and see its definition expanded, and its updated search results in the right-hand pane.

Note that it can still be challenging for a user to build up compound queries in this way. At some point, however, it must be recognized that the very question the user is trying to ask is complex, and requires some degree of thought to crystallize. Our aim with this interface is to eliminate the syntax of query languages as a barrier.

Semantic Query interface

Figure 3. The Semantic Query interface. A user builds up a complex semantic query in piecemeal fashion. (See main text for a description of each image.)

OKM's semantic search is similar in spirit to /facet's "multi-type queries" (Hildebrand 2006), in that it allows properties of one resource type (for instance, Treaties) to affect the search results of another (for instance, Prime Ministers.) A single-type facet browser (e.g., Auer 2006; Butler et al. 2009) can only support boolean conditions involving facets of a single object type: for instance, "all hardback books that cost under $10 and that have either Rowling or Tolkien as the author." OKM (and /facet) also permit second-order queries: "all hardback books that cost under $10 whose authors were born before 1910." With OKM it is perhaps slightly more difficult to compose such a query, since it involves multiple steps with named subqueries. However, OKM queries can be extended to any number of levels ("all books whose authors worked for publishers whose gross profits exceeded $2 billion last year") and therefore come closer to matching the full power of SPARQL expressivity.

5. Usability Evaluation

5.1 Focused Empirical Testing

Empirical testing is an important aspect of validating any system's effectiveness, as well as of uncovering flaws in the interface that designers may have missed. As previously observed (Nielsen and Landauer 1993; Nielsen 2000), productive usability testing can be accomplished with a relatively small number of testers, as long as they are given a well-designed procedure describing how to use the system. Nielsen (2000) recommends that only five users be consulted, this number normally being sufficient to reveal as much about the interface as can reasonably be discovered.

In addition to validating certain OKM-specific features, a more general goal of our empirical tests was to witness the activity of novices working with Semantic Web data in general, and to draw conclusions about how this activity might be aided. As mentioned in section 3.2, above, empirical studies of novice users generating Semantic Web data are few and far between. The similarity of several aspects of OKM's basic paradigm (described in section 4.1, above) to numerous other popular tools gives hope that by studying novices using OKM, we will glean important information about how they are likely to experience Semantic Web authoring in general.

Our empirical tests involved eight subjects drawn from the general student body at the University of Mary Washington. (None were technology or computer science majors, and all but two were freshmen.) These subjects were given a 10-minute tutorial on how to use OKM, and then asked to find, edit, and add specific data using the program, starting with a KB we had constructed previously. The purpose of the testing was open-ended discovery, not validation of a specific hypothesis. We wanted to get an idea of how novices would interact with the system, and what barriers they might face. The test was composed of three sections: locating data already in the KB, editing and adding data, and an interview. This testing did not focus on semantic search.

The first section presented the subject with ten questions, the content of which was strategically planned in order to test certain aspects of the program. We included simple questions for which the subject needed only to search for a resource and skim its page to find the information (for example, "how tall is Jason Thompson?") More difficult questions required them to search for a resource and then traverse one or more links from that resource to find the requisite information (for example, "how many seats are in the arena that Daniel Gibson plays in?") These questions tested whether subjects could find information by browsing links and following connected pages.

In the second section, the subject was given 18 sentences and asked to add the data reflected in those sentences to the KB. Some resources referred to in the sentences already existed in the KB but needed to be updated in some way; others did not yet exist and required the subject to create them. A simple item like "Jason Thompson now weighs 186 pounds" required only modification of already existing data found on a page. An item like "Eric Clapton auditioned for the Rolling Stones" was more difficult: the only already existing data was a resource page for Eric Clapton. The subject had to, among other things, create a new resource for the Rolling Stones and a new predicate representing "auditioning." The former could be done in one of two ways: either by first creating the resource explicitly, and then adding a relationship connecting it to Eric Clapton; or else implicitly, by creating a relationship from the Eric Clapton page to the (as yet non-existent) Rolling Stones resource, thereby automatically creating it. Throughout this section, we wanted to discover trends relating to how subjects added data to the KB when certain pieces - such as roles, resources, predicates, and fields - were missing.

Throughout both sections we encouraged the subjects to talk out loud about what they were doing. This let us know how the subject was reacting to the interface, and what might be unclear. We also took screencasts of the sessions, recording both what the subjects were doing on the screen and what they were saying as they used the tool.

Finally, we interviewed each subject informally to find out what they believed was difficult about the interface. As we watched the subjects complete the test, we noted what they seemed to have trouble with and then asked them about that during the interview.

5.2 Findings

By studying how well the users fulfilled their tasks, we discovered several key insights into how average users function in this kind of knowledge environment:

Notes

  1. OKM is a recursive acronym for "OKM Knowledge Management," and is pronounced the same as the name "Occam."
  2. Throughout this example, "Europe" is taken to mean "Continental Europe"; that is, excluding the British Isles.

References