Shortcomings of SARS-CoV-2 genomic metadata.

Landen Gozashti, Russell Corbett-Detig
Published in: BMC Research Notes (2021)
Our analysis reveals a startling prevalence of spelling errors and inconsistent naming conventions, which together occur in an estimated 9.8% of "originating lab" and 11.6% of "submitting lab" GISAID metadata entries. We also find numerous ambiguous entries that provide very little information about the actual source of a sample and could plausibly match multiple sources worldwide. Importantly, all of these issues can undermine the feasibility and accuracy of association studies: a group of samples that all come from one source can appear to come from several, or samples from several sources can appear to come from one.
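To make the failure mode concrete, the following is a minimal sketch of how misspelled or inconsistently formatted lab names might be grouped back together. It is not the authors' method: the lab names are invented for illustration, the helper names `normalize` and `group_similar` are hypothetical, and the 0.9 similarity threshold is an arbitrary assumption. It relies only on `difflib.SequenceMatcher` from the Python standard library.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(kept.split())


def group_similar(names, threshold=0.9):
    """Greedily group names whose normalized forms are near-identical.

    Each name joins the first group whose representative (its first member)
    has a similarity ratio >= threshold; otherwise it starts a new group.
    The threshold is an assumption, not a value taken from the paper.
    """
    groups = []
    for name in names:
        key = normalize(name)
        for group in groups:
            if SequenceMatcher(None, key, normalize(group[0])).ratio() >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical "originating lab" entries, invented for illustration.
labs = [
    "Department of Virology, City Hospital",
    "Departmnet of Virology, City Hospital",  # spelling error
    "department of virology city hospital",   # inconsistent formatting
    "National Reference Laboratory",
]

groups = group_similar(labs)
for group in groups:
    print(len(group), group[0])
```

Without such grouping, the first three entries would count as three distinct sources in an association study. The threshold also illustrates the "vice versa" case: set too low, it would merge labs that are genuinely different.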
Keyphrases
  • sars cov
  • drinking water
  • risk factors
  • respiratory syndrome coronavirus
  • patient safety
  • copy number
  • health information
  • adverse drug
  • social media
  • quality improvement