Machine learning and structural analysis of Mycobacterium tuberculosis pan-genome identifies genetic signatures of antibiotic resistance.
Erol S Kavvas, Edward Catoiu, Nathan Mih, Yara Seif, Nicholas Dillon, David Heckmann, Amitesh Anand, Laurence Yang, Victor Nizet, Jonathan M Monk, Bernhard O Palsson
Published in: Nature Communications (2018)
Mycobacterium tuberculosis is a serious human pathogen threat exhibiting complex evolution of antimicrobial resistance (AMR). Accordingly, the many publicly available datasets describing its AMR characteristics demand disparate data-type analyses. Here, we develop a reference strain-agnostic computational platform that uses machine learning approaches, complemented by both genetic interaction analysis and 3D structural mutation-mapping, to identify signatures of AMR evolution to 13 antibiotics. This platform is applied to 1595 sequenced strains to yield four key results. First, a pan-genome analysis shows that M. tuberculosis is highly conserved with sequenced variation concentrated in PE/PPE/PGRS genes. Second, the platform corroborates 33 genes known to confer resistance and identifies 24 new genetic signatures of AMR. Third, 97 epistatic interactions across 10 resistance classes are revealed. Fourth, detailed structural analysis of these genes yields mechanistic bases for their selection. The platform can be used to study other human pathogens.
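The idea of scoring pan-genome variants against resistance phenotypes can be illustrated with a minimal sketch. It is not the authors' platform: a plain mutual-information score stands in for the machine learning step, and the strain names, allele identifiers, and phenotype labels are all hypothetical toy data. It assumes each strain is described by the set of gene alleles it carries and a binary label (1 = resistant, 0 = susceptible) for a single antibiotic.

```python
from math import log2


def mutual_information(presence, phenotype):
    """Mutual information (in bits) between two equal-length binary sequences."""
    n = len(presence)
    mi = 0.0
    for a in (0, 1):
        for r in (0, 1):
            # Joint frequency of (allele state a, phenotype r) across strains.
            joint = sum(1 for p, y in zip(presence, phenotype) if p == a and y == r) / n
            if joint == 0:
                continue  # empty cells contribute nothing to the sum
            p_a = sum(1 for p in presence if p == a) / n
            p_r = sum(1 for y in phenotype if y == r) / n
            mi += joint * log2(joint / (p_a * p_r))
    return mi


def rank_alleles(strain_alleles, phenotypes):
    """Score every allele against the phenotype and sort by descending score."""
    strains = sorted(strain_alleles)
    alleles = sorted({a for carried in strain_alleles.values() for a in carried})
    y = [phenotypes[s] for s in strains]
    scores = {}
    for allele in alleles:
        # Presence/absence column of this allele across all strains.
        x = [1 if allele in strain_alleles[s] else 0 for s in strains]
        scores[allele] = mutual_information(x, y)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: four strains, two genes with two alleles each.
strain_alleles = {
    "strain_1": {"geneA_allele1", "geneB_allele1"},
    "strain_2": {"geneA_allele2", "geneB_allele1"},
    "strain_3": {"geneA_allele2", "geneB_allele2"},
    "strain_4": {"geneA_allele1", "geneB_allele2"},
}
phenotypes = {"strain_1": 0, "strain_2": 1, "strain_3": 1, "strain_4": 0}

ranking = rank_alleles(strain_alleles, phenotypes)
for allele, score in ranking:
    print(f"{allele}\t{score:.3f}")
```

In this toy set the two geneA alleles track the phenotype exactly while the geneB alleles are independent of it, so geneA ranks first. Because the features are alleles observed across strains rather than differences from one reference genome, the sketch is loosely in the spirit of a reference strain-agnostic analysis. A real study would also need to account for population structure and for interactions between alleles.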
Keyphrases
- genome wide
- mycobacterium tuberculosis
- antimicrobial resistance
- dna methylation
- machine learning
- copy number
- endothelial cells
- high throughput
- pulmonary tuberculosis
- induced pluripotent stem cells
- big data
- gene expression
- escherichia coli
- high resolution
- emergency department
- single cell
- rna seq
- mass spectrometry
- drug induced
- antiretroviral therapy
- genome wide identification
- bioinformatics analysis