Proteogenomic characterization of 2002 human cancers reveals pan-cancer molecular subtypes and associated pathways.
Yiqun Zhang, Fengju Chen, Darshan S Chandrashekar, Sooryanarayana Varambally, Chad J Creighton
Published in: Nature Communications (2022)
Mass-spectrometry-based proteomic data on human tumors, combined with corresponding multi-omics data, present opportunities for systematic and pan-cancer proteogenomic analyses. Here, we assemble a compendium dataset of proteomics data of 2002 primary tumors from 14 cancer types and 17 studies. Protein expression of genes broadly correlates with corresponding mRNA levels or copy number alterations (CNAs) across tumors, but with notable exceptions. Based on unsupervised clustering, tumors separate into 11 distinct proteome-based subtypes spanning multiple tissue-based cancer types. Two subtypes are enriched for brain tumors, one subtype associating with MYC, Wnt, and Hippo pathways and high CNA burden, and another subtype associating with metabolic pathways and low CNA burden. Somatic alteration of genes in a pathway associates with higher pathway activity as inferred by proteome or transcriptome data. A substantial fraction of cancers shows high MYC pathway activity without MYC copy gain but with mutations in genes with noncanonical roles in MYC. Our proteogenomics survey reveals the interplay between genome and proteome across tumor lineages.
Keyphrases
- papillary thyroid
- genome wide
- copy number
- mass spectrometry
- electronic health record
- endothelial cells
- squamous cell
- transcription factor
- big data
- stem cells
- mitochondrial dna
- lymph node metastasis
- single cell
- dna methylation
- machine learning
- cell proliferation
- rna seq
- cross sectional
- data analysis
- squamous cell carcinoma
- induced pluripotent stem cells
- genome wide identification
- young adults
- bioinformatics analysis
- simultaneous determination
- single molecule
- gas chromatography