Macro Approaches to Digital Searching and Secondary Research: Levitt: JoDI

Macro Approaches to Digital Searching and Secondary Research

Jonathan Mendel Levitt
Goldsmiths College, London University, and The Open University, UK
Email: JL794@tutor.open.ac.uk

Abstract

The use of digital information can be approached from more than one angle. The main emphasis over the past decade has been on making a few basic tools (for instance, browsers and search mechanisms) powerful, versatile and easy to use. Coleman and Oxnam (2002) suggest both improving the usability of current tools and developing new ones. This article suggests focusing on the use of tools for searching digital information. The power of a search mechanism depends not only on how it is constructed but also on how it is used. Coleman and Oxnam ask: "How can interactional digital libraries enhance and augment human capabilities?" I ask a related question: "How can we use current tools such as search mechanisms more effectively?" Coleman and Oxnam wrote their article in the form of a challenge to JoDI readers, authors and researchers in the realm of interactional digital libraries. In a similar spirit, this article can be considered an initial investigation of this question.

1 Introduction

Electronic journals offer a wealth of searchable empirical experiences and insights. The Internet offers searchable access to the views and experiences of millions of people. Electronic discussion groups give access to the searchable views of hundreds of thousands of people.

These digital media collectively provide searchable access to a wealth of experiences and insights, the quantity and diversity of which seems likely to increase substantially. In addition, the search mechanisms seem likely to become much more versatile and powerful.

Digital media have resulted in rapid growth in the range and quantity of published material. Search facilities offer the capacity to extract and build on the insights and empirical experiences in this knowledge-explosion.

The efficient extraction of knowledge from digital media is of particular interest to print-disabled people, but the efficient use of digital media is not a problem confined to the dyslexic or visually impaired. The impact of digital media depends critically on how effectively we can extract and build on the wealth of insights presented in digital format.

Described here are approaches to both digital searching and secondary research. Secondary research makes use of data that is already available (for instance, archival data). This contrasts with primary research, which obtains and uses new data (for instance, social surveys).

Much has been published on digital search strategies. EBSCO's Academic Search Elite database of journals contains over 200 articles with both 'Internet' and 'search' in the title. The overriding focus of these articles is on the details of the implementation of searches at the micro level, as opposed to macro or higher level approaches to searching.

Much has also been published on using digital media in research. The Elite database contains more than 150 articles with both 'Internet' and 'research' in the title. The overriding focus of these articles is on conducting primary research. Secondary research seems likely to benefit particularly from the knowledge-explosion associated with digital media. Yet the Elite database does not have any articles with the phrase 'secondary research' in the title, and only three with the phrase in the abstract.

The paper quotes past thinkers, illustrating that many of these approaches are far from new. This is because the emphasis here is not on identifying new approaches, but on describing some that seem suited to utilizing the power of searchable digital media. Although these approaches are described in the context of digital searching, they can be used irrespective of whether one is conducting a digital search.

There are several types of digital search mechanism. Some, such as catalogues, directories and citation indexes, have paper-based equivalents. Some, such as digital search engines, are relatively new. The focus of this article is on this latter type of digital search mechanism.

Sections 2 to 5 describe macro approaches. The focus of this article then becomes more practical, examining some search mechanisms currently in use.

2 Single-faceted and multi-faceted approaches

The concept of a multi-faceted approach can be introduced through what I term 'a single-faceted approach'. An example of a single-faceted approach is the concept that natural science should model itself on mechanics.

Einstein (1949: 21) described this paradigm in his autobiography: "All physicists of the last century (i.e. the 19th century) saw in classical mechanics a firm and final foundation for all physics, yes, indeed, for all natural science ... It was Ernst Mach who, in his History of Mechanics (now called The Science of Mechanics), shook this dogmatic faith". Mach (1883: 596-7) wrote: "The view that makes mechanics the basis of the remaining branches of physics, and explains all physical phenomena by mechanical ideas, is in our judgment a prejudice".

Peirce (1891: 315) was highly critical of the single-faceted approach. Referring to "perhaps the larger number" of past philosophical systems, Peirce wrote: "Just as if a man, being seized with the conviction that paper was a good material to make things of, were to go to work to build a papier mâché house, with roof of roofing paper, foundations of pasteboard, windows of paraffined paper, chimneys, bath tubs, locks, etc., all of different forms of paper, his experiment would probably afford valuable lessons to builders, while it would certainly make a detestable house, so those one-idea'd philosophies are exceedingly interesting and instructive, and yet are quite unsound".

Peirce added: "Every person who wishes to form an opinion concerning fundamental problems should.... make a systematic study of the conceptions out of which a philosophical theory may be built, in order to ascertain what place each conception may fitly occupy in such a theory, and to what uses it is adapted".

What Peirce suggested in these remarks is a type of multi-faceted approach. Peirce's approach involves assessing the appropriate use of each component of a philosophical theory. Peirce suggested examining components, but there are other types of multi-faceted approach.

The view that Einstein described, in which mechanics is the basis of all natural science, can be contrasted with a multi-faceted approach in which each discipline seeks to learn from a diversity of disciplines.

The concept of multi-faceted can be applied to opinions, for instance to those held in a diversity of countries or in more than one period of time. According to Russell (1945: 7-8), philosophers are "effects of their social circumstances and of the politics and institutions of their time". Digital media provide wider access to views expressed in a diversity of countries.

Until the advent of searchable digital media, it was often difficult to implement a multi-faceted approach. For instance, it was time-consuming to identify the strengths and weaknesses of a diversity of disciplines. But searchable digital media can substantially reduce the time taken to identify these strengths and weaknesses.

3 Magpie approach

Another approach that seems much easier to implement in the electronic era is what I call a 'magpie approach', named after this enterprising bird. When a magpie seeks to build a nest it uses components that are already available. In this paradigm I envisage adapting components that have been used elsewhere rather than seeking to build new components.

One advantage is that components that have already been used elsewhere seem less likely to be flawed than those developed afresh. Another advantage is that it is generally quicker to identify and modify strong components than to build them.

While an approach can be both magpie and multi-faceted, the two concepts are different. One can use a single-faceted magpie approach, in which one seeks to build on the concepts from a single discipline, or a non-magpie multi-faceted approach, in which one constructs afresh but also seeks to learn from a diversity of sources. The rapid searchable access to insights and experiences provided by digital media can make the implementation of one or both of these paradigms much less time-consuming.

4 Learning from experience

In 1916, the year in which he published his General Theory of Relativity, Einstein wrote: "Science is nothing else but the comparing and ordering of our observations according to methods and angles which we learn by trial and error" (Frank, 271). Popper (1957: 67) recommended that we solve social problems through what he termed 'piecemeal engineering'. He added: "The piecemeal engineer knows ... that we can learn only from our mistakes".

Learning from trial and error and learning from our mistakes are examples of learning from experience, but the concept can be applied more generally. A surgeon learns from the experiences of others, rather than from trial and error or from his/her mistakes. We can learn from the mistakes of others. In addition we can learn from our own successes and from those of other people.

Searchable digital media provide views, insights and experiences from which to learn, and therefore provide much more opportunity to learn from experience.

5 Other approaches

5.1 Developmental approach

The developmental approach envisages that each discipline examines its own development. The concept that subjects are shaped by their past goes back at least as far as Comte (1830-42: 28), who wrote: "There is no science which, having attained the positive stage, does not bear marks of having passed through the others". By 'the others' Comte was referring to the theological and metaphysical stages.

Mach (1883: 596-7) wrote: "Knowledge which is historically first, is not necessarily the foundation of all that is subsequently gained". Mach examined the development of mechanics, and according to Einstein shook the belief that mechanics is the foundation of science. The impact of Mach's investigation indicates that it might be worth examining how other disciplines developed.

Searchable digital media make it increasingly easy to find information outside our special interests, and would make it easier to carry out investigations similar to that conducted by Mach.

5.2 Frequent scrutiny and rational re-alignment

Mach's remark indicates that our perspectives might be biased towards ideas that have been present for longer. This suggests using an approach in which our perspectives are subjected to frequent scrutiny, with particular attention being focused on our longer-established perceptions. I call this form of paradigm analysis 'rational re-alignment'. Searchable digital media can facilitate rational re-alignment by providing rapid access to a wide range of perceptions.

5.3 Comparative approach

One form of comparison is to describe similarities. For instance, Comte (1830-42: 27) described some traits shared by diverse branches of knowledge: "Each branch of our knowledge passes successively through three different theoretical conditions: the Theological, or fictitious; the Metaphysical, or abstract; and the Scientific, or positive".

There are many other types of comparative approach. The focus can be on differences rather than on similarities. The comparison can be between diverse disciplines, diverse locations and diverse periods in time. Compared items can be arranged according to a classification scheme. Searchable digital media seem likely to facilitate the cross-fertilization of ideas and the making of comparisons.

6 Electronic searching for academic purposes

The extent to which a user is likely to find these macro approaches useful depends critically on his or her interests and objectives. This section describes some search mechanisms and their relevance to academic research.

There are numerous interesting general articles on searching (e.g. Liddy 2001, Price 2001, Price 2002, Tenopir et al. 2002). Coleman and Oxnam (2002) suggested developing new tools for searching, and Liu et al. (2002) describe approaches to keyword searching.

6.1 Internet search engines

The function of Internet search engines is to locate Web pages that satisfy criteria specified by the user. Commonly used search engines include Google, AltaVista and AllTheWeb. Material on the Internet is not subject to peer review. While this results in the expression of a wide range of viewpoints, the user needs to be particularly careful when evaluating the content.

One mechanism for focusing an Internet search on academic resources is to limit the search to university Web sites. To implement this in Google, go to the advanced search page and type '.edu' in the box labeled 'results from the site or domain'. Another mechanism for locating more academic content is to restrict a search to files in '.pdf' format, the format commonly used to present serious articles and documents on the Internet. To implement this in Google, go to the advanced search page and select 'Adobe Acrobat PDF (.pdf)' from the drop-down menu labeled 'return results of the file format'. These two mechanisms can be used together or on their own.
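The two restrictions can also be written directly into the query text. The following is a minimal sketch, assuming that the search engine accepts the query operators 'site:', 'filetype:' and 'intitle:' as equivalents of the advanced search boxes; the function names build_query and build_url are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlencode


def build_query(phrase, site=None, filetype=None, in_title=False):
    """Compose a query string from a phrase and optional restrictions."""
    quoted = f'"{phrase}"'  # quotation marks request an exact phrase
    # Assumption: 'intitle:' asks that the phrase appear in the page title.
    parts = [f"intitle:{quoted}" if in_title else quoted]
    if site:
        # Assumption: 'site:' limits results to a site or domain, e.g. '.edu'.
        parts.append(f"site:{site}")
    if filetype:
        # Assumption: 'filetype:' limits results to one file format, e.g. 'pdf'.
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


def build_url(query, base="https://www.google.com/search"):
    """Encode the query as the 'q' parameter of a search URL."""
    return f"{base}?{urlencode({'q': query})}"


query = build_query("search engine", site=".edu", filetype="pdf", in_title=True)
print(query)
print(build_url(query))
```

Each restriction is optional, so the sketch mirrors the point made above: the two mechanisms can be used together or on their own.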

6.2 Example of a search using Google

Using Google to search for the phrase 'search engine' yielded approximately 4,220,000 matching pages. Restricting this search to items in English with 'search engine' in the title yielded approximately 1,190,000 matching pages. Further restricting this search to .edu sites and .pdf documents yielded only 79 results. Here are the first 20:

  • A Search Engine for 3D Models
  • HyPursuit: A Hierarchical Network Search Engine that Exploits Content-Link Hypertext
  • Web Search Engine FAQs
  • Automatic summarization of search engine hit lists
  • The Anatomy of a Search Engine
  • An Interactive WWW Search Engine for User-Defined Collections
  • Measuring Search Engine Quality and Query Difficulty: Ranking with Target and Freestyle
  • Search Engine Training
  • Search Engine Math
  • The TEK Search Engine
  • Agora: A Search Engine for Software Components
  • Search Engine Knowledge
  • PETTT TECHNICAL REPORT PETTT-01-AS-03 Improving Search Engine Position of Internet Educational Materials: Design Heuristics and Indexing Methods Aaron.
  • Search Engine Report
  • A Structure-Based Search Engine for Phylogenetic Databases
  • Ultraseek Search Engine
  • Search Engine Specific Query Transformations for Question Answering Eugene
  • A Conceptual Search Engine
  • Search Engine Features
  • Locality in Search Engine Queries and Its Implications
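The effect of each restriction in this example can be expressed as the share of results retained from the previous step. The sketch below uses only the three counts reported at the start of this section; the function name narrowing is hypothetical.

```python
def narrowing(steps):
    """Return (label, count, fraction kept from the previous step) per step."""
    rows = []
    previous = None
    for label, count in steps:
        # The first step has no predecessor, so no fraction is reported.
        kept = None if previous is None else count / previous
        rows.append((label, count, kept))
        previous = count
    return rows


# Counts as reported for the Google search in this section.
google_steps = [
    ("phrase 'search engine'", 4_220_000),
    ("English, phrase in title", 1_190_000),
    (".edu sites and .pdf documents", 79),
]

for label, count, kept in narrowing(google_steps):
    share = "" if kept is None else f"  ({kept:.4%} of previous step)"
    print(f"{count:>9,}  {label}{share}")
```

On these figures the title restriction retains roughly 28% of the pages, while the combined .edu and .pdf restriction retains roughly one page in fifteen thousand.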

6.3 Tools for searching journals

There are several tools for searching journals. Zetoc provides e-mail alerts of new articles published in over 20,000 journals. A researcher can specify keywords of interest, and at regular intervals receive an e-mail list of new articles containing these keywords. Sociological Abstracts enables a researcher to electronically search databases of journal abstracts. EBSCO's Academic Search Elite enables a researcher to locate full-text articles in over 1,800 journals. Abstract and full-text searches are particularly useful as they allow material of interest to be quickly located and saved in Word documents. In this way each of these tools enables a researcher to become more aware of work published in areas outside his/her main interests.
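The idea behind a keyword alert can be illustrated with a short sketch. It is not a description of how Zetoc or any other service is implemented; it assumes a simple case-insensitive substring match, and the function name matching_articles is hypothetical. The sample titles are taken from the Google results listed in section 6.2.

```python
def matching_articles(titles, keywords):
    """Return the titles that contain at least one keyword, ignoring case."""
    wanted = [keyword.lower() for keyword in keywords]
    return [
        title
        for title in titles
        if any(keyword in title.lower() for keyword in wanted)
    ]


# Illustrative input only: titles reused from the list in section 6.2.
new_titles = [
    "A Search Engine for 3D Models",
    "Agora: A Search Engine for Software Components",
    "Locality in Search Engine Queries and Its Implications",
]

alerts = matching_articles(new_titles, ["software", "3d"])
for title in alerts:
    print(title)
```

A researcher who registers keywords from outside his/her main interests would, on this model, receive only the titles that mention them, which is what makes such alerts a low-cost way of monitoring other disciplines.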

6.4 Example of a search using Academic Search Elite

A search on Academic Search Elite yielded 3,323 articles containing the phrase 'search engine'. Restricting the search to the articles where the title contains 'search engine' yielded 436 articles. Further restricting this search to articles in which the full-text is displayed yielded 290. Here are the first 20 results:

  • Search Engines.; Online, Nov/Dec2002, Vol. 26 Issue 6, p16, 2/3p
  • Internet SEARCH ENGINE Update.; Online, Nov/Dec2002, Vol. 26 Issue 6, p18, 1p
  • Elsevier, FAST Update Scirus Search Engine.; Information Today, Oct2002, Vol. 19 Issue 9, p31, 1/4p, 1c
  • Specialized Search Engine FAQs: More Questions, Answers, and Issues.; By: Price, Gary D.., Searcher, Oct2002, Vol. 10 Issue 9, p42, 5p
  • Search Engines Handbook (Book).; By: Dobson, Stephanie., Book Report, Sep/Oct2002, Vol. 21 Issue 2, p72, 2p
  • SEARCH ENGINES AND METADATA--CURRENT SITUATION.; Library Technology Reports, Sep/Oct2002, Vol. 38 Issue 5, p71, 2p
  • Search Engines.; Online, Sep/Oct2002, Vol. 26 Issue 5, p12, 1/3p
  • Internet SEARCH ENGINE Update.; By: Notess, Greg R.., Online, Sep/Oct2002, Vol. 26 Issue 5, p18, 1p
  • A New Era of Search Engines: Not Just Web Pages Anymore. (cover story); By: Hock, Ran., Online, Sep/Oct2002, Vol. 26 Issue 5, p20, 7p, 1c
  • Web Search Engines: Search Syntax and Features.; By: Ojala, Marydee; Perez, Ernest., Online, Sep/Oct2002, Vol. 26 Issue 5, p28, 8p, 3c
  • URL Inclusion Programs: New Revenue Generator for Search Engines.; By: Calishain, Tara., Searcher, Sep2002, Vol. 10 Issue 8, p70, 3p
  • E-Retailers Seek Improved Search Engines.; By: Sliwa, Carol., Computerworld, 8/12/2002, Vol. 36 Issue 33, p7, 2/3p
  • Search Engines Break the Sound Barrier.; By: Mitchell, Robert L.., Computerworld, 8/5/2002, Vol. 36 Issue 32, p34, 1p, 1 diagram
  • Researchers, Quit Your Search Engines.; By: Goldsborough, Reid., Community College Week, 7/8/2002, Vol. 14 Issue 24, p19, 3/5p, 1c
  • Search Engines.; Online, Jul/Aug2002, Vol. 26 Issue 4, p8, 1/3p
  • Internet SEARCH ENGINE Update.; By: Notess, Greg R.., Online, Jul/Aug2002, Vol. 26 Issue 4, p18, 1p
  • the straight story on search engines.; By: McLaughlin, Laurianne; Spring, Tom., PC World, Jul2002, Vol. 20 Issue 7, p115, 8p, 4c
  • Extreme searcher's guide to Web search engines (Book).; By: Hock, Randolph., Teacher Librarian, Jun2002, Vol. 29 Issue 5, p41, 1/9p
  • 2002 Search Engine Meeting.; By: Hawkins, Donald T.., Information Today, Jun2002, Vol. 19 Issue 6, p26, 4p, 1 graph, 1bw
  • Job-Search Engines.; By: C.E.., PC Magazine, 5/21/2002, Vol. 21 Issue 10, p149, 1/4p

7 Conclusion

This article describes approaches to utilizing the huge amounts of searchable material made available through electronic media. The quotations illustrate that in general these approaches are not new. Approaches such as learning from experience and making comparisons are currently used extensively, but the objective of this article is to identify approaches that seem likely to benefit from digital media, rather than to identify new or uncommon approaches. In practice I would recommend using more than one approach or even using a hybrid approach that combines characteristics from two or more approaches.

Some of the approaches described are similar to methods currently used in primary research. For instance, learning from experience is a major component of learning from focus groups, interviews and case studies. The emphasis on using a diversity of primary research methods is an example of a multi-faceted approach. Reflexivity is a form of paradigm analysis.

The focus of this article has been on secondary research rather than on primary research and, in order to maintain this distinction, the similarities have not been described in the main body of the article. In practice the distinction between the two processes may be far from clear-cut. The percentage of researchers undertaking secondary research is higher in natural science than in social science. It seems likely that one major impact of searchable digital media will be to enhance the role of secondary research in social science.

This article has examined some approaches that may be suited to building on the wealth of insights and empirical experiences offered by diverse electronic media. These approaches have been described in the context of digital searching, but they can be applied to other contexts. Thinking in macro terms can be productive irrespective of whether one is conducting a digital search.

Suggesting approaches that seem to have potential can encourage an exploration of approaches, but it is only after substantial exploration that we will be in a position to quantify the likely potential of each approach. Although digital searching can provide interesting leads, each user needs to evaluate the benefit of the leads in relation to the time spent. I consider this article to be an initial investigation.

References

Coleman, A. and Oxnam, M. (2002) "Editorial: Interactional Digital Libraries: Introduction to the Special Issue on Interactivity in Digital Libraries". Journal of Digital Information, 2 (4), May
http://hdl.handle.net/2249.2/jodi-56

Comte, A. (1830-42) "Cours de Philosophie Positive", translated by Harriet Martineau. In The Positive Philosophy of Auguste Comte (Thoemmes)
http://www.socsci.mcmaster.ca/~econ/ugcm/3ll3/comte/Philosophy1.pdf

Einstein, A. (1949) "Einstein's Autobiography". In Albert Einstein, Philosopher-Scientist, edited by P. A. Schilpp (New York, Evanston and London: Harper & Row)
Excerpt: http://www.marlboro.edu/~jharker/Other/Classes/einst1.html

Frank, P.G. "Einstein, Mach, and Logical Positivism". In Albert Einstein, Philosopher-Scientist, edited by P. A. Schilpp (New York, Evanston and London: Harper & Row)

Liddy, E. (2001) "How a Search Engine Works". Searcher, 9 (5), May, 38-43
http://www.infotoday.com/searcher/may01/liddy.htm

Liu, X., Maly, K., Zubair, M., Hong, Q., Nelson, M., Knudson, F., and Holtkamp, I. (2002) "Federated Searching Interface Techniques for Heterogeneous OAI Repositories". Journal of Digital Information, 2 (4), May
http://hdl.handle.net/2249.2/jodi-55

Mach, E. (1883) Science of Mechanics (Open Court)

Peirce, C. S. (1891) "The Architecture of Theories". In Philosophical Writings of Peirce, edited by J. Buchler (New York: Dover)

Popper, K. R. (1957) The Poverty of Historicism (London: Routledge)

Price, G. (2001) "Web Search Engine FAQs: Questions, Answers, and Issues". Searcher, 9 (9), October, 38-51
http://www.infotoday.com/searcher/oct01/price.htm

Price, G. (2002) "Specialized Search Engine FAQs: More Questions, Answers, and Issues". Searcher, 10 (9), October, 42-46
http://www.infotoday.com/searcher/oct02/price.htm

Russell, B. (1945) The History of Western Philosophy (London: Unwin Hyman)

Tenopir, C., Baker, G. and Robinson, W. (2002) "The Database Universe". Library Journal, 127 (9), May, 42-49