Modeling community standards for metadata as templates makes data FAIR.

Mark A Musen, Martin J O'Connor, Erik Schultes, Marcos Martínez-Romero, Josef Hardi, John Graybeal
Published in: Scientific Data (2022)
It is challenging to determine whether datasets are findable, accessible, interoperable, and reusable (FAIR) because the FAIR Guiding Principles refer to highly idiosyncratic criteria regarding the metadata used to annotate datasets. Specifically, the FAIR principles require metadata to be "rich" and to adhere to "domain-relevant" community standards. Scientific communities should be able to define their own machine-actionable templates for metadata that encode these "rich," discipline-specific elements. We have explored this template-based approach in the context of two software systems. One system is the CEDAR Workbench, which investigators use to author new metadata. The other is the FAIRware Workbench, which evaluates the metadata of archived datasets for their adherence to community standards. Benefits accrue when templates for metadata become central elements in an ecosystem of tools to manage online datasets, both because the templates serve as a community reference for what constitutes FAIR data, and because they embody that perspective in a form that can be distributed among a variety of software applications to assist with data stewardship and data sharing.
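To make the idea of a machine-actionable template concrete, the following is a minimal sketch in Python. It does not reproduce the template model or API of the CEDAR Workbench or the FAIRware Workbench; `FieldSpec`, `SAMPLE_TEMPLATE`, `check_metadata`, and the example fields are all hypothetical names chosen for illustration. The sketch assumes a template is simply a list of field specifications, and that checking adherence means reporting missing required fields, values outside a controlled list, and fields the template does not define.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    """One field of a hypothetical community metadata template."""
    name: str
    required: bool = True
    # None means any non-empty value is accepted (free text).
    allowed_values: Optional[frozenset] = None


# Hypothetical template; the fields and permitted values are illustrative only.
SAMPLE_TEMPLATE = [
    FieldSpec("title"),
    FieldSpec("organism",
              allowed_values=frozenset({"Homo sapiens", "Mus musculus"})),
    FieldSpec("tissue", required=False),
]


def check_metadata(record, template):
    """Return a list of human-readable problems; an empty list means the
    record adheres to the template under this sketch's simple rules."""
    problems = []
    for spec in template:
        value = record.get(spec.name)
        # Treat absent and blank values the same way.
        if value is None or str(value).strip() == "":
            if spec.required:
                problems.append(f"missing required field: {spec.name}")
            continue
        if spec.allowed_values is not None and value not in spec.allowed_values:
            problems.append(f"value not allowed for {spec.name}: {value!r}")
    # Report fields that the template does not define.
    known = {spec.name for spec in template}
    for key in record:
        if key not in known:
            problems.append(f"field not in template: {key}")
    return problems


if __name__ == "__main__":
    record = {"organism": "human", "extra": 1}
    for problem in check_metadata(record, SAMPLE_TEMPLATE):
        print(problem)
```

Because the template is ordinary data, the same definition could in principle drive both an authoring form and an after-the-fact evaluation of archived records, which is the kind of reuse across tools that the abstract describes.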
Keyphrases
  • electronic health record
  • mental health
  • healthcare
  • big data
  • data analysis
  • social media
  • deep learning
  • human health