Extending TCGA queries to automatically identify analogous genomic data from dbGaP.

Erin K Wagner, Satyajeet Raje, Liz Amos, Jessica Kurata, Abhijit S Badve, Yingquan Li, Ben Busby
Published in: F1000Research (2017)
Data sharing is critical to advancing genomic research: reusing and combining existing data reduces the need to collect new data and promotes reproducible research. The Cancer Genome Atlas (TCGA) is a popular resource for individual-level genotype-phenotype cancer-related data. The Database of Genotypes and Phenotypes (dbGaP) contains many datasets similar to those in TCGA. We have created a software pipeline that allows researchers to discover relevant genomic data in dbGaP by matching TCGA metadata. The resulting pipeline provides an easy-to-use way to connect these two data sources.
Keyphrases
  • electronic health record
  • big data
  • healthcare
  • emergency department
  • gene expression
  • copy number
  • squamous cell carcinoma
  • machine learning
  • young adults
  • social media
  • genome wide
  • rna seq
  • single cell
  • deep learning