Pangenome-spanning epistasis and coselection analysis via de Bruijn graphs.
Juri Kuronen, Samuel T Horsfield, Anna Kaarina Pöntinen, Sudaraka Mallawaarachchi, Sergio Arredondo-Alonso, Harry A Thorpe, Rebecca A Gladstone, Rob J L Willems, Stephen D Bentley, Nicholas J Croucher, Johan Pensar, John A Lees, Gerry Q Tonkin-Hill, Jukka Corander
Published in: Genome Research (2024)
Studies of bacterial adaptation and evolution are hampered by the difficulty of measuring traits such as virulence, drug resistance, and transmissibility in large populations. In contrast, it is now feasible to obtain high-quality complete assemblies of many bacterial genomes thanks to scalable high-accuracy long-read sequencing technologies. To exploit this opportunity, we introduce a phenotype- and alignment-free method for discovering coselected and epistatically interacting genomic variation from genome assemblies covering both core and accessory parts of genomes. Our approach uses a compact colored de Bruijn graph to approximate the intragenome distances between pairs of loci for a collection of bacterial genomes to account for the impacts of linkage disequilibrium (LD). We demonstrate the versatility of our approach to efficiently identify associations between loci linked with drug resistance and adaptation to the hospital niche in the major human bacterial pathogens Streptococcus pneumoniae and Enterococcus faecalis.
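The idea of approximating intragenome distances on a colored de Bruijn graph can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a plain, uncompacted k-mer graph (no unitigs, no reverse complements), treats edges as undirected, and averages per-genome shortest-path lengths as one plausible way to summarise distance across a collection. The function names and the toy sequences are hypothetical.

```python
from collections import defaultdict, deque


def build_colored_dbg(genomes, k):
    """Build an undirected k-mer graph; each edge stores the genomes (colors) using it."""
    edges = defaultdict(lambda: defaultdict(set))
    for name, seq in genomes.items():
        for i in range(len(seq) - k):
            u, v = seq[i:i + k], seq[i + 1:i + k + 1]
            edges[u][v].add(name)
            edges[v][u].add(name)
    return edges


def distance_in_genome(edges, name, src, dst):
    """Breadth-first search restricted to edges colored by `name`.

    Returns the number of k-mer steps from src to dst, or None if unreachable.
    """
    if src == dst:
        return 0
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt, colors in edges.get(node, {}).items():
            if name in colors and nxt not in seen:
                if nxt == dst:
                    return dist + 1
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def mean_distance(edges, genome_names, src, dst):
    """Average the per-genome distances over genomes where both loci are connected."""
    found = [
        d for d in (distance_in_genome(edges, g, src, dst) for g in genome_names)
        if d is not None
    ]
    return sum(found) / len(found) if found else None


if __name__ == "__main__":
    # Toy data (hypothetical): genome B carries a 4-base insertion between the loci.
    genomes = {
        "A": "ACGTTGCAAGGCTTAC",
        "B": "ACGTTGCAATCGAGGCTTAC",
    }
    graph = build_colored_dbg(genomes, k=4)
    for g in genomes:
        print(g, distance_in_genome(graph, g, "ACGT", "GGCT"))
    print("mean", mean_distance(graph, genomes, "ACGT", "GGCT"))
```

Pairs of loci that are close on the graph are expected to be in strong LD simply because of physical proximity, so a distance of this kind lets an association between distant loci be distinguished from one that proximity alone would explain.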