Evolutionary Analysis of TCGA Data Using Over- and Under- Mutated Genes Identify Key Molecular Pathways and Cellular Functions in Lung Cancer Subtypes.
Audrey R Freischel, Jamie K Teer, Kimberly Luddy, Jessica Cunningham, Yael Artzy-Randrup, Tamir Epstein, Kenneth Y Tsai, Anders E Berglund, John L Cleveland, Robert James Gillies, Joel S Brown, Robert A Gatenby
Published in: Cancers (2022)
We identify critical conserved and mutated genes through a theoretical model linking a gene’s fitness contribution to its observed mutational frequency in a clinical cohort. “Passenger” gene mutations do not alter fitness, so their mutational frequencies are determined by gene size and the mutation rate. Driver mutations, which increase fitness (and proliferation), are observed more frequently than expected. Non-synonymous mutations in essential genes reduce fitness and are eliminated by natural selection, resulting in lower prevalence than expected. We apply this “evolutionary triage” principle to TCGA data from EGFR-mutant, KRAS-mutant, and NEK (non-EGFR/KRAS) lung adenocarcinomas (LUAD). We find frequent overlap of evolutionarily selected non-synonymous gene mutations among the subtypes, suggesting enrichment for adaptations to common local tissue selection forces. Overlap of conserved genes among the LUAD subtypes is rare, suggesting that negative evolutionary selection depends strongly on the initiating mutational events during carcinogenesis. Highly expressed genes are more likely to be conserved, and significant changes in expression (>20% increase or decrease) are common in genes with evolutionarily selected mutations but not in conserved genes. EGFR-mut cancers have fewer mutations on average (89) than KRAS-mut (228) and NEK (313) cancers. Subtype-specific variation in conserved and mutated genes identifies critical molecular components in cell signaling, extracellular matrix remodeling, and membrane transporters. These findings demonstrate subtype-specific patterns of co-adaptation between the defining driver mutation and somatically conserved genes, and they offer novel insights into epigenetic versus genetic contributions to cancer evolution.
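The triage logic described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it assumes a simple neutral expectation in which the mutation count for a gene across a cohort is Poisson-distributed, with mean equal to a per-base mutation rate times gene length times the number of tumors. The function names, the significance threshold `alpha`, and all numeric inputs below are hypothetical and chosen only for illustration.

```python
import math


def poisson_cdf(k, lam):
    """Return P(X <= k) for X ~ Poisson(lam), summing terms directly."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)  # P(X = 0); fine for the modest lam used in this sketch
    total = term
    for i in range(1, k + 1):
        term *= lam / i  # P(X = i) from P(X = i - 1)
        total += term
    return min(total, 1.0)


def classify_gene(observed, gene_length_bp, rate_per_bp, n_tumors, alpha=0.01):
    """Label a gene relative to a neutral (passenger) expectation.

    Assumption: expected count = rate_per_bp * gene_length_bp * n_tumors.
    Returns (label, expected_count).
    """
    expected = rate_per_bp * gene_length_bp * n_tumors
    p_over = 1.0 - poisson_cdf(observed - 1, expected)  # P(X >= observed)
    p_under = poisson_cdf(observed, expected)           # P(X <= observed)
    if p_over < alpha:
        label = "over-mutated"    # more mutations than neutrality predicts
    elif p_under < alpha:
        label = "under-mutated"   # fewer than predicted, i.e. conserved
    else:
        label = "neutral"         # consistent with passenger behavior
    return label, expected


if __name__ == "__main__":
    # Hypothetical genes: (name, observed mutations, gene length in bp)
    genes = [("GENE_A", 20, 50_000), ("GENE_B", 5, 50_000), ("GENE_C", 5, 200_000)]
    for name, observed, length in genes:
        label, expected = classify_gene(observed, length, 1e-6, 100)
        print(f"{name}: observed={observed}, expected={expected:.1f}, {label}")
```

A real analysis would need a calibrated background rate, a distinction between synonymous and non-synonymous sites, and correction for multiple testing across genes; the sketch only shows how gene size and mutation rate set the baseline against which over- and under-mutation are judged.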
Keyphrases
- genome wide
- genome wide identification
- dna methylation
- transcription factor
- bioinformatics analysis
- small cell lung cancer
- genome wide analysis
- physical activity
- wild type
- extracellular matrix
- epidermal growth factor receptor
- emergency department
- tyrosine kinase
- gene expression
- squamous cell carcinoma
- big data
- machine learning
- risk factors
- young adults
- single cell
- single molecule
- binding protein
- lymph node metastasis