NCI's Proteomic Data Commons: A Cloud-Based Proteomics Repository Empowering Comprehensive Cancer Analysis Through Cross-Referencing with Genomic and Imaging Data.

Ratna R Thangudu, Michael Holck, Deepak Singhal, Alexander Pilozzi, Nathan J Edwards, Paul A Rudnick, Marcin J Domagalski, Padmini Chilappagari, Lei Ma, Yi Xin, Toan Le, Kristen Nyce, Rekha Chaudhary, Karen A Ketchum, Aaron Maurais, Brian Connolly, Michael Riffle, Matthew C Chambers, Brendan X MacLean, Michael J MacCoss, Peter B McGarvey, Anand Basu, John Otridge, Esmeralda Casas-Silva, Sudha Venkatachari, Henry Rodriguez, Xu Zhang
Published in: Cancer research communications (2024)
Proteomics has emerged as a powerful tool for studying cancer biology and for developing diagnostics and therapies. With the continuous improvement and widespread availability of high-throughput proteomic technologies, the generation of large-scale proteomic data has become more common in cancer research, and there is a growing need for resources that support the sharing and integration of multi-omics datasets. Such datasets require extensive metadata, including clinical, biospecimen, experimental, and workflow annotations, that are crucial for data interpretation and reanalysis. The need to integrate, analyze, and share these data has led to the development of the National Cancer Institute's (NCI) Proteomic Data Commons (PDC), accessible at https://pdc.cancer.gov. As a specialized repository within the NCI Cancer Research Data Commons (CRDC), PDC enables researchers to locate and analyze proteomic data from various cancer types and to connect them with genomic and imaging data available for the same samples in other CRDC nodes. Presently, PDC houses annotated data from nearly 140 datasets across 19 cancer types, generated by several large-scale cancer research programs with cohort sizes exceeding 100 samples (tumor and associated normal when available). In this paper, we review the current state of PDC in cancer research, discuss the opportunities and challenges associated with data sharing in proteomics, and propose future directions for the resource.
Keyphrases
  • papillary thyroid
  • electronic health record
  • squamous cell
  • big data
  • lymph node metastasis
  • healthcare
  • public health
  • gene expression
  • machine learning
  • palliative care
  • childhood cancer
  • single cell
  • locally advanced