
SKIOME Project: a curated collection of skin microbiome datasets enriched with study-related metadata.

Giulia Agostinetto, Davide Bozzi, Danilo Porro, Maurizio Casiraghi, Massimo Labra, Antonia Bruno
Published in: Database : the journal of biological databases and curation (2022)
Large amounts of data from microbiome-related studies have been (and are currently being) deposited in international public databases. These datasets represent a valuable resource for the microbiome research community and could serve future researchers interested in integrating multiple datasets into powerful meta-analyses. However, this huge amount of data lacks harmonization and is far from being exploited to its full potential to build a foundation that places microbiome research at the nexus of many subdisciplines within and beyond biology. There is thus an urgent need for data accessibility and reusability, according to the findable, accessible, interoperable and reusable (FAIR) principles, as supported by the National Microbiome Data Collaborative and FAIR Microbiome. To tackle the challenge of accelerating discovery and advances in skin microbiome research, we collected, integrated and organized existing microbiome data resources from human skin 16S rRNA amplicon-sequencing experiments. We generated a comprehensive collection of datasets, enriched with metadata, and organized this information into data frames ready to be integrated into microbiome research projects and advanced post-processing analyses, such as data science applications (e.g. machine learning). Furthermore, we created a data retrieval and curation framework built on three stages to maximize the retrieval of datasets and their associated metadata. Lastly, we highlighted some caveats regarding metadata retrieval and suggested ways to improve future metadata submissions. Overall, our work resulted in a curated collection of skin microbiome datasets accompanied by a state-of-the-art analysis of the last 10 years of the skin microbiome field. Database URL: https://github.com/giuliaago/SKIOMEMetadataRetrieval.
Keyphrases
  • electronic health record
  • big data
  • machine learning
  • systematic review
  • quality improvement
  • rna seq
  • randomized controlled trial
  • public health
  • risk assessment
  • artificial intelligence
  • meta analyses
  • social media