Emerging Tools for Evaluating Digital Library Services: Heath et al.: JoDI

Abstract

The paper describes ways to examine how digital libraries are valued by their users, and explores ways of directing resources to areas of user-identified need. Pertinent models from marketing, economics, and library assessment and evaluation are reviewed, focussing on the application of the LibQUAL+™ and CAPM methodologies. Each methodology, developed independently, provides a useful framework for evaluating digital library services. The paper discusses the benefits of a combined methodology that would provide even greater potential for evaluation of digital library services.

1 Introduction

The level of interest in digital libraries has grown steadily as a greater number of institutions, including archives and museums, consider the possible implications of digital libraries. While there are important, unresolved digital library research and development issues, there is also a concurrent desire to develop strategies for systematic digital library programs built upon the results of digital library projects. Digital library programs generally include both digital collections and services that facilitate access, retrieval and analysis of the collections. This interest reflects growing expectations from patrons and end users. In an ideal world with unlimited resources, it would be possible to provide a full range of digital library services to all users. In reality, resource constraints require a consideration of priorities. Consequently, it would be useful to evaluate the potential benefits of digital library services, as determined by patrons and end users. Even without considering digital library services, Saracevic and Kantor (1997) and Kyrillidou (1998) provided compelling reasons for evaluating libraries based on user feedback. Choudhury et al. (2002) review evaluation studies for library services, emphasizing those that incorporate user feedback. Since that review, Morse (2002), Bollen and Luce (2002) and Montgomery and King (2002) have published relevant papers.

How are digital libraries valued by their users? Answers to this question can be drawn from a variety of areas, including marketing work on e-services and electronic service quality (e-SQ), developments in library evaluation and assessment, the economics of digital libraries, and research on how organizations manage innovation. Digital library service evaluation received relatively little attention in the early phases of the Internet revolution, as developments tended to be technology driven. Even though most early DL projects had a user evaluation component, the emphasis was primarily on formative evaluation and technology development. However, it is becoming clear that the customer-service facet of Internet-based interactions is critical not only from the perspective of e-commerce, but also from the perspective of delivering effective digital library services.

Pioneering work by Zeithaml et al. (2000) explored e-SQ from the e-commerce perspective. This work has clear parallels with some of the insights we are gaining from library users as they are translating their virtual Web experiences into perceptions of physical reality (Lincoln 2002). Work on e-SQ offers many valuable lessons for digital library evaluation. Coupled with insights from the library service quality evaluation literature, this means that the possibility of applying pioneering and innovative tools for assessing digital libraries is within our grasp.

These studies demonstrate an increasing emphasis on both inter- and intra-institutional measures, outcomes rather than inputs, a user-centric perspective, evaluation of digital libraries, and adaptation of evaluation tools from various disciplines. This paper brings together two such pioneering and innovative tools: the LibQUAL+™ methodology (http://www.libqual.org), which measures gaps in perceptions of service quality, and the Comprehensive Access to Printed Materials (CAPM) methodology (http://dkc.mse.jhu.edu/CAPM/), based on a multi-attribute, stated-preference economic model that was utilized to evaluate the CAPM project at Johns Hopkins University (Choudhury et al. 2001, Suthakorn et al. 2002).

1.1 Introduction to LibQUAL+ and CAPM

The LibQUAL+ methodology is a total market survey of user perceptions measured against minimum and desired expectations; the tool is grounded in the research library environment and the methodology provides a framework for identifying gaps in service delivery. The CAPM methodology provides a framework for prioritizing the development of digital library services, based on users' preferences. Although that methodology was applied to a specific project, the framework can be used generally for evaluating users' preferences for digital library services.

This paper proposes a mixed-tools model with three steps: applying LibQUAL+, identifying the service quality gaps it reveals, and then applying a multi-attribute, stated-preference economic model (a CAPM-type model) in which users are asked to prioritize solutions that would help close those gaps. In an ideal environment, a library would implement the solutions and then repeat the cycle: running LibQUAL+ again, identifying priorities through a modified CAPM methodology, and so engaging in a continuous improvement effort. Used in this complementary and iterative fashion, both tools can help build better digital libraries.

2 SERVQUAL

LibQUAL+ is a modification of SERVQUAL as it has been tested in the research library environment. SERVQUAL (for SERVice QUALity) was developed for the for-profit sector in the 1980s by the marketing research group of Parasuraman, Zeithaml, and Berry (Parasuraman et al. 1985, Parasuraman et al. 1988, Parasuraman et al. 1991, Parasuraman et al. 1994). Grounded in the Gap Theory of Service Quality, the singular precept of SERVQUAL is that "only customers judge quality; all other judgments are essentially irrelevant" (Zeithaml et al. 1990, p. 16). To derive the gaps essential for measuring perceptions of service quality, respondents are asked to establish their judgments across three scales for each question: the desired level of service they would like to receive, the minimum they are willing to accept, and the actual level of service they perceive to have been rendered. The desired scores and the minimum scores establish the boundaries of a zone of tolerance within which the perceived scores should desirably float.

The original SERVQUAL design asked 22 questions across five survey dimensions. For each question, the user is asked for impressions of service quality according to (1) minimum service levels, (2) desired service levels, and (3) perceived performance. For each question, two gap scores are calculated: one between the perceived and minimum scores, and one between the perceived and desired scores. The zone of tolerance is the range between the minimum and desired scores. Optimally, perceived performance assessments should fall comfortably within that zone. Administrators should be concerned by perceived scores that fall below the zone and by decreasing trajectories over time. Excellence in service might have been achieved for attributes where the perception of actual service delivery has a higher score than the desired expectation. The perceived score minus the minimum score is called the Service Adequacy (SA) score, and the perceived score minus the desired score is called the Service Superiority (SU) score.
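
The gap scores described above can be stated compactly. As a sketch, write $M_i$, $D_i$ and $P_i$ for the minimum, desired and perceived scores on question $i$; these symbols are introduced here for illustration and are not SERVQUAL notation. The sign convention is chosen so that a positive adequacy score means perceived service exceeds the minimum, which matches the positive gaps reported in Table 1.

```latex
\begin{align*}
\mathrm{SA}_i &= P_i - M_i && \text{(service adequacy)} \\
\mathrm{SU}_i &= P_i - D_i && \text{(service superiority)} \\
\mathrm{ZT}_i &= D_i - M_i && \text{(width of the zone of tolerance)}
\end{align*}
```

Under these definitions, perceived service lies within the zone of tolerance when $M_i \le P_i \le D_i$, that is, when $\mathrm{SA}_i \ge 0$ and $\mathrm{SU}_i \le 0$.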

The SERVQUAL protocol measures the gap between customer expectations and perceptions across five dimensions:

reliability, i.e. ability to perform the promised service dependably and accurately;
assurance, i.e. knowledge and courtesy of employees and their ability to inspire trust and confidence;
empathy, i.e. the caring, individualized attention the library provides to its users;
responsiveness, i.e. willingness to help users and provide prompt service;
tangibles, i.e. appearance of physical facilities, equipment, personnel, and communications materials.

3 LibQUAL+

LibQUAL+, a joint research and development project of Texas A&M and ARL, has emerged as both a process and a tool that enables institutions to address gaps between user expectations and perceived service delivery, and so to better support student and faculty research, teaching, and learning. LibQUAL+ has been gradually and carefully applied to a variety of post-secondary library environments, including the health sciences library context and statewide contexts such as OhioLINK. Furthermore, LibQUAL+ aspires to push the frontiers of service quality assessment theory and pioneer the use of large-scale, Web-based survey applications in a digital library environment.

3.1 LibQUAL+ from Texas A&M

User perceptions are critical in the context of a public research library that asks the user community to support the delivery of services through a substantial fee system. The student body has approved a series of fees that contribute heavily to the library's operating budget, positively impacting the library's ability to support the university's mission of instruction and research. Because the administration seeks student approval of these fees, its desire to respond to issues the students perceive as important is heightened (Snyder 2002).

Texas A&M had a six-year history of regrounding the SERVQUAL instrument to library purposes before it emerged as one component of the ARL-sponsored New Measures initiative (http://www.arl.org/stats/newmeas/newmeas.html, DeWitt 2001). Over three years of applying LibQUAL+ in the post-secondary library environment, the project has benefited from the increasing number of interested and participating libraries. So, while the project started as a small, self-funded pilot among 12 ARL libraries in 2000, there were 43 libraries participating in 2001, and 164 libraries in 2002. More than 70,000 users submitted complete and valid data in 2002 across these 164 different settings. The application of a scalable Web-based protocol has helped the project emerge as one of the most promising tools for evaluating not only traditional libraries, but their digital aspects as well. LibQUAL+ is quickly becoming an ever-growing digital library of evaluation data of user perceptions of library service quality.

3.2 LibQUAL+ Dimensions

After SERVQUAL was rigorously re-grounded for academic libraries through a meticulous qualitative phase (Cook and Heath 2001), the LibQUAL+ instrument of 25 questions emerged to evaluate the construct of service quality in a library environment. The LibQUAL+ questions measure customer perceptions of library service across four dimensions:

Affect of Service - (nine items) the human side of the enterprise, encompassing traits of empathy, accessibility, and personal competence (e.g. "willingness to help users")
Personal Control - (six items) the extent to which users are able to navigate and control the information universe that is provided (e.g. "Web site enabling me to locate information on my own")
Access to Information - (five items) an assessment of the adequacy of the collections themselves and the ability to access needed information on a timely basis regardless of the location of the user or the medium of the resource in question (e.g. "comprehensive collections", and "convenient business hours")
Library as Place - (five items) comprising variously, according to the perspective of the user, utilitarian space for study and collaboration, a sanctuary for contemplation and reflection, or an affirmation of the primacy of the life of the mind in university priorities (e.g. "a haven for quiet and solitude") (Cook et al. 2003).

The 70,445 responses with complete data on all of the 25 items were retained for the summary statistics, a completion rate of about 53% of those who initially opened the Web survey. Figure 1 illustrates the aggregate service adequacy scores for each dimension in the LibQUAL+ instrument from the spring 2002 implementation. Table 1 summarizes the data across the dimensions for the spring 2002 administration.

Figure 1. Aggregate service adequacy scores for each dimension in the LibQUAL+ instrument (source: Webster and Heath (2002) LibQUAL+ Spring 2002 Aggregate Survey Results, Association of Research Libraries, Washington, D.C.)

Table 1. Aggregate dimension means (n=70,445)

Dimension	Minimum	Desired	Perceived	Service Adequacy (SA) Gap
Access to Information	6.57	7.93	6.82	0.25
Affect of Service	6.51	7.90	7.11	0.60
Library as Place	5.98	7.41	6.62	0.64
Personal Control	6.74	8.15	7.07	0.33

Note. Source: Webster and Heath (2002), LibQUAL+ Spring 2002 Aggregate Survey Results, Vol. 1, p. 24.
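
The adequacy gaps in Table 1 can be recomputed from the reported means. The following is a minimal sketch, not part of the LibQUAL+ tooling: the function and variable names are hypothetical, and the superiority and zone-width figures it derives are simple arithmetic on the table rather than values reported by the survey.

```python
# Aggregate dimension means copied from Table 1: (minimum, desired, perceived).
TABLE_1 = {
    "Access to Information": (6.57, 7.93, 6.82),
    "Affect of Service": (6.51, 7.90, 7.11),
    "Library as Place": (5.98, 7.41, 6.62),
    "Personal Control": (6.74, 8.15, 7.07),
}


def gap_scores(minimum, desired, perceived):
    """Return (service adequacy, service superiority, zone-of-tolerance width)."""
    adequacy = round(perceived - minimum, 2)     # SA: perceived minus minimum
    superiority = round(perceived - desired, 2)  # SU: perceived minus desired
    zone_width = round(desired - minimum, 2)     # width of the zone of tolerance
    return adequacy, superiority, zone_width


gaps = {name: gap_scores(*means) for name, means in TABLE_1.items()}

# Dimensions ordered from the smallest to the largest adequacy gap.
by_adequacy = sorted(gaps, key=lambda name: gaps[name][0])

for name in by_adequacy:
    adequacy, superiority, zone_width = gaps[name]
    print(f"{name}: SA={adequacy:.2f}, SU={superiority:.2f}, zone width={zone_width:.2f}")
```

Sorting by adequacy gap places Access to Information first and Library as Place last. Every superiority score is negative, so in the aggregate perceived service sits inside the zone of tolerance for all four dimensions.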

"The personal control dimension is a challenging construct and seems to be concerned primarily with how users want to interact with the modern library. The emphasis is on a felt need for personal control of the information universe in general and aspects of Web navigation in particular. In that Web-based interactions with the libraries are steadily increasing in number, while face-to-face interactions are declining, this dimension seems to be integral to a service quality construct in the modern research library. An element of reliability again suffuses the questions in this factor. It is noteworthy that the items in this dimension imply electronic rather than physical access to the library in the minds of the users." (Snyder 2002, 5)

Of the four dimensions defined by LibQUAL+ as comprising library service quality, Personal Control was found to be the most important among the respondents to the 2002 survey as defined by having the highest mean desired score. "As a group North America's libraries were most successful in providing physical library environments that met their users' needs (Library as Place SA: 0.64) and in providing the trained and caring staff who assisted users in their quest for information (Affect of Service SA: 0.60). With users and libraries alike confronting rapidly changing technological scenes and with the latter continually battling the rising costs of commercial information sources, it comes as little surprise that libraries in the aggregate were least successful in facilitating access to information (Access to Information SA: 0.25)." (Cook et al. 2003)

3.3 Limitations

Little can be done with the aggregate data, however. The real power of LibQUAL+ is the community of people it brings together to share in learning how to use evaluation to create better libraries. A real challenge, which is gradually being overcome, lies in moving from the aggregate summary and individual library notebooks to interpretive frameworks such as gap score analysis and/or normative perspectives. For a compilation of perspectives from the spring 2001 participants, see the special issue of Performance Measurement and Metrics (Cook 2001). For normative data from the spring 2001 and spring 2002 implementations, see <http://www.coe.tamu.edu/~bthompson/libq2002.htm>.

Even more challenging is using the results to implement real innovations. Gap score analysis is inherently limited in that it provides no way to prioritize the gaps or to identify which improvements would be most beneficial to the user of a digital library. Listening to customers also has its limits: customers have a limited frame of reference and tend to offer incremental rather than bold suggestions. Ultimately, innovation is the responsibility of staff, the developers and builders of digital libraries (Ulwick 2002).

4 Multi-Attribute Stated-Preference Methods

As the results from a LibQUAL+ analysis identify gaps in services, it might be useful to consider a framework to prioritize actions to address these gaps. Multi-attribute stated-preference techniques offer such a framework. Researchers from Johns Hopkins and Colorado developed a multi-attribute stated-preference analysis to evaluate benefits associated with a CAPM implementation (Choudhury et al. 2002). While the project team evaluated the specific costs and benefits associated with CAPM implementation, they also considered the appropriateness of multi-attribute stated-preference methods for evaluating digital library services in general.

Multi-attribute stated-preference methods use choice experiments to gather data for modeling user preferences. In the choice experiments, often administered as surveys, subjects state which alternatives (services or features) they most prefer; the alternatives are distinguished by the levels of their multiple attributes. In designing the choice experiments, it is important to develop credible choices so that subjects (users) have appropriate information to make meaningful choices between alternatives. The format for making choices must also be meaningful for subjects. Multi-attribute stated-preference experiments provide data that are then used to estimate the marginal benefit of each attribute. These experiments are based on the idea that utility from services is specified and measured through random utility models.
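
To illustrate how attribute-level utilities translate into stated choices, the sketch below computes choice probabilities for one choice set. It assumes a conditional logit form with linear utility, which is a common specification for choice experiments but is not stated in this paper as the model used for CAPM. The attribute names and coefficient values are hypothetical and purely illustrative.

```python
import math

# Hypothetical marginal utilities per attribute (illustrative values only,
# not estimates from the CAPM study).
COEFFICIENTS = {"delivery_days": -0.4, "full_text_search": 1.2, "fee_dollars": -0.1}


def utility(alternative, coefficients=COEFFICIENTS):
    """Linear utility: each attribute level times its marginal utility, summed."""
    return sum(coefficients[name] * level for name, level in alternative.items())


def choice_probabilities(alternatives, coefficients=COEFFICIENTS):
    """Conditional logit probabilities for the alternatives in one choice set."""
    utilities = [utility(alt, coefficients) for alt in alternatives]
    highest = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - highest) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


choice_set = [
    {"delivery_days": 3, "full_text_search": 0, "fee_dollars": 0},  # current service
    {"delivery_days": 1, "full_text_search": 1, "fee_dollars": 5},  # enhanced service
]
probabilities = choice_probabilities(choice_set)
print([round(p, 3) for p in probabilities])
```

In an actual study the coefficients would be estimated from the observed survey choices rather than fixed in advance; each estimated coefficient is then read as the marginal benefit of its attribute.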

The multi-attribute stated-preference technique has been used extensively in marketing research to help predict demand for new products. In the past ten years, this approach has been used increasingly for cost-benefit analysis of public projects as well as natural resource damage assessments. It is important to note that library services are not offered through private markets, a characteristic shared by public projects or natural resources. Adamowicz et al. (1998) provide an overview of the multi-attribute stated-preference methodology; Kamakura and Russell (1989) describe an application of the methodology for marketing studies, while the Wisconsin Department of Natural Resource Services (1999) offers an example application for natural resources.

One criticism of multi-attribute stated-preference techniques is that they rely on choices made by users within hypothetical experiments. However, these methodologies offer decision-makers additional insight into user preferences, which should be weighed alongside information from other sources. Ultimately, decision-makers have to make decisions, and multi-attribute stated-preference techniques provide an additional tool for this purpose.

5 Combining the Methodologies

The LibQUAL+ methodology results in an identified set of gaps in library services. This information is useful for library administrators to understand the set of services that might require additional attention or resources. For the 25 institutions that participated in LibQUAL+ over two consecutive years (spring 2001 and spring 2002 implementation), a comparison of their results shows that perceived service quality scores improved approximately 0.2 points from one year to the next. This can be the result either of actual service improvements or an indirect effect of improved staff performance on the measured variables. Measuring user perceptions of service quality signifies that this is an important concept that is worth paying attention to, so the observed improvements are to be expected from a design perspective (Cook et al. 2003).

At Texas A&M University libraries, the annual surveys identified a persistent problem among the faculty in the "Personal Control" dimension. Further inquiry showed faculty dissatisfaction with the libraries' ability to support their information-seeking behavior adequately. Initial efforts to redress the situation include Web-initiated electronic document delivery to the users' desktops for all members of the academic community. Likewise, the library's Web site is being completely rebuilt around powerful enterprise-level software with the goal of making its content-management resource the community's portal of choice.

The results of LibQUAL+ have proved useful for libraries considering improvement in services. There may be more to gain if a LibQUAL+ analysis is combined with a multi-attribute, stated-preference technique analysis, such as the one developed for the CAPM system. Fundamentally, the "outputs" from a LibQUAL+ analysis can provide the "inputs" for a multi-attribute stated-preference analysis, which acknowledges the need for tradeoffs when making decisions regarding resource allocation.

This combined approach would identify gaps in digital library services, along with a means to prioritize actions to address those gaps. Additionally, the multi-attribute stated-preference technique provides a mechanism for assigning (hypothetical) monetary values, as determined by end users and patrons, to digital library services. While it is important to consider this "willingness to pay" metric carefully, it is a measure that is widely understood, especially by university administrators.
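
As a rough illustration of the "willingness to pay" metric, the sketch below assumes a linear utility model that includes a cost attribute, in which case the implied monetary value of one unit of a service attribute is the negative ratio of its coefficient to the cost coefficient. This is a standard stated-preference calculation rather than the specific computation reported for CAPM, and the function name and coefficient values are hypothetical.

```python
def willingness_to_pay(attribute_coefficient, cost_coefficient):
    """Implied monetary value of one unit of an attribute under linear utility."""
    if cost_coefficient >= 0:
        # A higher cost should lower utility, so the coefficient must be negative.
        raise ValueError("cost coefficient is expected to be negative")
    return -attribute_coefficient / cost_coefficient


# Hypothetical estimates: utility gained from a feature, and utility lost per dollar of fee.
wtp_feature = willingness_to_pay(attribute_coefficient=1.2, cost_coefficient=-0.1)
print(f"Implied willingness to pay: ${wtp_feature:.2f}")
```

Because the result is a ratio of two estimated coefficients, it inherits the uncertainty of both, which is one reason to treat the metric carefully.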

6 Summary

The 2002 LibQUAL+ experience shows that the survey administration is scalable and that a robust digital evaluation tool of user perceptions is a reality. "Project technology can handle large numbers of institutions and survey respondents simultaneously. Indeed, load tests have indicated that the configurations are sufficiently robust to allow participation in LibQUAL+ by the entire potential populations of the institutions of higher education in the United States. The experience has also shown how preliminary analyses can be turned around quickly and distributed to participating institutions, even with large numbers of libraries participating. One of the distinct benefits of the program is that participation requires limited expertise and time commitment on the part of local sites." (Cook et al. 2003). As LibQUAL+ enters its third and final year of development under a FIPSE grant and its second year of an NSF/NSDL grant, the boundaries of the existing protocol and the emerging digital LibQUAL+ protocol will have to be more clearly defined.

The use of Web-based surveys for the LibQUAL+ and CAPM methodologies facilitates any potential integration. Both research teams are eager to explore an analysis using the combined methodologies, especially within institutions that have already participated in LibQUAL+ studies. Such an analysis would allow the research teams to build the combined framework and explore its potential for improving library services through a comprehensive analysis based on a rich database of user feedback.

Emerging Tools for Evaluating Digital Library Services: Conceptual Adaptations of LibQUAL+ and CAPM