Correlated allele frequency changes reveal clonal structure and selection in temporal genetic data.
Yunxiao Li, John P Barton
Published in: Molecular Biology and Evolution (2024)
In evolving populations where the rate of beneficial mutations is large, subpopulations of individuals with competing beneficial mutations can be maintained over long times. Evolution with this kind of clonal structure is commonly observed in a wide range of microbial and viral populations. However, it can be difficult to completely resolve clonal dynamics in data. This is due to limited read lengths in high-throughput sequencing methods, which are often insufficient to directly measure linkage disequilibrium or determine clonal structure. Here, we develop a method to infer clonal structure using correlated allele frequency changes in time-series sequence data. Simulations show that our method recovers true, underlying clonal structures when they are known and accurately estimates linkage disequilibrium. This information can then be combined with other inference methods to improve estimates of the fitness effects of individual mutations. Applications to data suggest novel clonal structures in an E. coli long-term evolution experiment, and yield improved predictions of the effects of mutations on bacterial fitness and antibiotic resistance. Moreover, our method is computationally efficient, requiring orders of magnitude less run time for large data sets than existing methods. Overall, our method provides a powerful tool to infer clonal structures from data sets where only allele frequencies are available, which can also improve downstream analyses.
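The core idea above, that mutations carried by the same clone should change in frequency together, can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a simplified stand-in that assumes noise-free frequencies, uses plain Pearson correlation of frequency changes, and groups mutations by a hypothetical fixed threshold. The function names, the threshold value, and the toy trajectories are all assumptions made for illustration.

```python
import numpy as np


def frequency_change_correlations(freqs):
    """Correlate allele frequency changes between all pairs of mutations.

    freqs: array of shape (T, L) holding the frequency of each of L
    mutations at T successive time points (toy input, assumed noise-free).
    Returns an (L, L) Pearson correlation matrix of the per-step changes.
    """
    deltas = np.diff(freqs, axis=0)  # shape (T - 1, L): change per time step
    return np.corrcoef(deltas, rowvar=False)


def group_by_correlation(corr, threshold=0.9):
    """Group mutations whose frequency changes are strongly correlated.

    Illustrative heuristic only: mutations are linked when their correlation
    exceeds `threshold` (a hypothetical cutoff), and groups are the connected
    components of the resulting graph. NaN correlations (for example from a
    constant trajectory) compare as False and therefore never link mutations.
    """
    n = corr.shape[0]
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and corr[i, j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy data: mutations 0 and 1 ride on one clone, mutation 2 on a competitor.
clone_a = [0.1, 0.3, 0.5, 0.4, 0.2]
clone_b = [0.1, 0.1, 0.2, 0.4, 0.7]
freqs = np.array([clone_a, clone_a, clone_b]).T  # shape (5, 3)

corr = frequency_change_correlations(freqs)
labels = group_by_correlation(corr)
print(labels)
```

In this toy case the two mutations that share a trajectory fall into one group and the competing mutation into another. Real data would need handling of sampling noise, nested clones, and fixed or lost mutations, none of which this sketch attempts.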
Keyphrases
- electronic health record
- big data
- genome wide
- high resolution
- physical activity
- gene expression
- healthcare
- microbial community
- machine learning
- single cell
- mass spectrometry
- human immunodeficiency virus
- dna methylation
- social media
- artificial intelligence
- men who have sex with men
- hiv testing
- hepatitis c virus
- deep learning
- high throughput sequencing