Putting Hybrid Cultural Data on the Semantic Web

Kate Byrne


A prerequisite for joining the rapidly growing Semantic Web is to expose data as RDF triples. In the cultural heritage world the data in question is very often a mixture of structured database fields and associated textual documents. Transforming relational database (RDB) content to RDF is not altogether straightforward, and we examine the issues as a preliminary to the much more difficult step of augmenting the RDB content with structured RDF triples extracted directly from natural language text, using a specially designed txt2rdf process. This opens the way to a true integration of the hybrid data so common in heritage management. Finally, we present experimental results showing structured SPARQL queries that cannot be answered from the RDB material alone but can be satisfied against the augmented graph. In this domain there are potentially vast amounts of textual material available for linking to structured records, so the techniques described have considerable scope for future application.

