Extending Domain-Specific Resources to Enable Semantic Access to Cultural Heritage Data

Paul D Clough, Neil Ireson, Jennifer Marlow

Abstract


Cultural heritage material often contains rich semantic information, which can be utilised for alternative forms of information access beyond keyword searching and browsing by subject category. To provide such functionality, it is desirable to annotate all the material in a collection with named entities and their relationships, so that the entire collection is available for semantic search. In this paper, we examine the issues involved in the automatic semantic annotation of information about artists from Tate Online using a pre-existing domain-specific structured resource (ULAN). In particular, we focus on extending ULAN's coverage of artists and their associated semantic properties (e.g. birth/death date, birth/death location) by applying focused crawling and automatic information extraction techniques to semi-structured sources of information. This enables collections to be cross-referenced against a range of information sources, thereby improving their visibility and end-user information access.
