Bayesian inference of negative and positive selection in human cancers.
Shamil R Sunyaev
Published in: Nature genetics (2017)
Cancer genomics efforts have identified genes and regulatory elements driving cancer development and neoplastic progression. From a microevolution standpoint, these are subject to positive selection. Although elusive in current studies, genes whose wild-type coding sequences are needed for tumor growth are also of key interest. They are expected to experience negative selection and stay intact under pressure of incessant mutation. The detection of significantly mutated (or undermutated) genes is completely confounded by the genomic heterogeneity of cancer mutation. Here we present a hierarchical framework that allows modeling of coding point mutations. Application of the model to sequencing data from 17 cancer types demonstrates an increased power to detect known cancer driver genes and identifies new significantly mutated genes with highly plausible biological functions. The signal of negative selection is very subtle, but is detectable in several cancer types and in a pan-cancer data set. It is enriched in cell-essential genes identified in a CRISPR screen, as well as in genes with reported roles in cancer.