Original Articles

A Metadata Best Practice for a Scientific Data Repository

Pages 194-212
Published online: 10 Dec 2009
 

Digital data repositories ought to support both immediate operational needs and long-term project goals. This paper presents the Dryad repository's metadata best practice, which balances these two needs. The paper reviews background work exploring the meaning of science, characterizing data, and highlighting the metadata challenges of data curation. The Dryad repository is introduced, and the initiative's metadata best practice and its underlying rationales are described. Dryad's metadata approach has two prongs: one addresses the long-term goal of aligning with the Semantic Web via a metadata application profile; the other addresses the immediate need to make content available in DSpace via an Extensible Markup Language (XML) schema. The conclusion summarizes the limitations and advantages of the two prongs underlying Dryad's metadata effort.

ACKNOWLEDGMENT

This work is supported by National Science Foundation Grant # EF-0423641. We would like to acknowledge contributions by the Dryad team members Hilmar Lapp and Todd Vision of NESCent; and Michael Whitlock, University of British Columbia. We would also like to thank Stuart Weibel, OCLC, for his thoughtful comments and support of this work.

Notes

1. DOE (Department of Energy) Data Explorer (DDE): http://www.osti.gov/dataexplorer/

2. Knowledge Network for Biocomplexity Data (KNB): http://knb.ecoinformatics.org/

3. The Dublin Core comprises both the 15 core properties of the Dublin Core Metadata Element Set (DCMES), Version 1.1: Reference Description (http://dublincore.org/documents/2004/12/20/dces/) and a set of additional properties registered in the DCMI (Dublin Core Metadata Initiative) Metadata Terms namespace (http://dublincore.org/documents/dcmi-terms/).

4. Dublin Core Abstract Model (DCAM): http://dublincore.org/documents/abstract-model/

5. Dublin Core Application Profile Guidelines: http://dublincore.org/usage/documents/profile-guidelines/

6. Dryad repository: http://www.datadryad.org/repo/

7. Dryad repository Partners: http://www.datadryad.org/repo/themes/Dryad/pages/partners.html

8. Joint Data Archiving Policy: http://www.datadryad.org/repo/

9. Interoperability Levels for Dublin Core Metadata: http://dublincore.org/documents/interoperability-levels/

10. Dryad Workshop: https://www.datadryad.org/wiki/Dec_5_Workshop_Minutes

11. Collectively, the DCMES (http://dublincore.org/documents/2004/12/20/dces/) and the DCMI Metadata Terms (http://dublincore.org/documents/dcmi-terms/), as explained in note 3.

12. Darwin Core (DwC), Version 1.3: http://digir.sourceforge.net/schema/conceptual/darwin/core/2.0/darwincoreWithDiGIRv1.3.xsd; Version 1.4 being reviewed, see: http://wiki.tdwg.org/twiki/bin/view/DarwinCore/DarwinCoreVersions

13. Publishing Requirements for Industry Standard Metadata (PRISM): http://www.prismstandard.org/specifications/

14. Journal Publishing Tag Set Tag Library, Version 3.0, November 2008: http://dtd.nlm.nih.gov/publishing/tag-library/

15. Data Documentation Initiative (DDI): http://webapp.icpsr.umich.edu/cocoon/DDI-LIBRARY/Version2-1.xsd?section=all

16. Ecological Metadata Language (EML): http://knb.ecoinformatics.org/software/eml/eml-2.0.1/index.html

17. PREMIS Editorial Committee. PREMIS Data Dictionary for Preservation Metadata Version 2.0, 2008: http://www.loc.gov/standards/premis/v2/premis-2-0.pdf

18. Status Element—Dryad: http://www.purl.org/dryad/terms/status

19. Dryad Domain: http://www.purl.org/dryad

20. Text Encoding Initiative (TEI) Header, Chapter 2 (P5: Guidelines for Electronic Text Encoding and Interchange): http://www.tei-c.org/release/doc/tei-p5-doc/en/html/HD.html

21. Tim Berners-Lee on the next Web (TED Conferences, LLC): http://www.ted.com/index.php/talks/tim_berners_lee_on_the_next_web.html

22. GenBank database: http://www.psc.edu/general/software/packages/genbank/genbank.php

23. TreeBASE: http://www.treebase.org

24. Long Term Ecological Research (LTER) Network's Metacat data catalog: http://metacat.lternet.edu/knb

25. Gleaning Resource Descriptions from Dialects of Languages (GRDDL): http://www.w3.org/TR/grddl-primer/
