Best practices for genetic and genomic data archiving.

Deborah M Leigh, Amy G Vandergast, Margaret E Hunter, Eric D Crandall, W Chris Funk, Colin J Garroway, Sean Hoban, Sara J Oyler-McCance, Christian Rellstab, Gernot Segelbacher, Chloé Schmidt, Ella Vázquez-Domínguez, Ivan Paz-Vinas
Published in: Nature Ecology & Evolution (2024)
Genetic and genomic data are collected for a vast array of scientific and applied purposes. Despite mandates for public archiving, data are typically used only by the generating authors. The reuse of genetic and genomic datasets remains uncommon because it is difficult, if not impossible, due to non-standard archiving practices and lack of contextual metadata. But as the new field of macrogenetics is demonstrating, if genetic data and their metadata were more accessible and FAIR (findable, accessible, interoperable and reusable) compliant, they could be reused for many additional purposes. We discuss the main challenges with existing genetic and genomic data archives, and suggest best practices for archiving genetic and genomic data. Recognizing that this is a longstanding issue due to little formal data management training within the fields of ecology and evolution, we highlight steps that research institutions and publishers could take to improve data archiving.
Keyphrases
  • electronic health record
  • copy number
  • big data
  • genome wide
  • healthcare
  • primary care
  • emergency department
  • gene expression
  • mental health
  • high throughput
  • artificial intelligence
  • mass spectrometry
  • deep learning
  • rna seq