Sheer Curation of Experiments: Data, Process, Provenance

Mark Hedges, Tobias Blanke, Stella Fabiane, Gareth Knight, Eric Liao

Abstract


This paper describes an environment for the “sheer curation” of the experimental data of a group of researchers in the fields of biophysics and structural biology. The approach involves embedding data capture and interpretation within researchers’ working practices, so that it is automatic and invisible to the researcher. The environment captures not just the individual datasets generated by an experiment, but the entire workflow that represents the “story” of the experiment, including intermediate files and provenance metadata, so as to support the verification and reproduction of published results. As the curation environment is decoupled from the researchers’ processing environment, the provenance is inferred from a variety of domain-specific contextual information, using software that implements the knowledge and expertise of the researchers. We also present an approach to publishing the data files and their provenance according to linked data principles, using OAI-ORE (Open Archives Initiative Object Reuse and Exchange) and OPMV (Open Provenance Model Vocabulary).

Keywords


sheer curation, provenance, data repositories, experimental data, OAI-ORE, linked data, Fedora, OPM, OPMV
