inStrain profiles population microdiversity from metagenomic data and sensitively detects shared microbial strains.
Matthew R Olm, Alexander Crits-Christoph, Keith Bouma-Gregson, Brian A Firek, Michael J Morowitz, Jillian F Banfield
Published in: Nature Biotechnology (2021)
Coexisting microbial cells of the same species often exhibit genetic variation that can affect phenotypes ranging from nutrient preference to pathogenicity. Here we present inStrain, a program that uses metagenomic paired reads to profile intra-population genetic diversity (microdiversity) across whole genomes and compares microbial populations in a microdiversity-aware manner, greatly increasing the accuracy of genomic comparisons when benchmarked against existing methods. We use inStrain to profile >1,000 fecal metagenomes from newborn premature infants and find that siblings share significantly more strains than unrelated infants, although identical twins share no more strains than fraternal siblings. Infants born by cesarean section harbor Klebsiella with significantly higher nucleotide diversity than infants delivered vaginally, potentially reflecting acquisition from hospital rather than maternal microbiomes. Genomic loci that show diversity in individual infants include variants found between other infants, possibly reflecting inoculation from diverse hospital-associated sources. inStrain can be applied to any metagenomic dataset for microdiversity analysis and rigorous strain comparison.
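To make the notion of nucleotide diversity used in the abstract concrete, the following is a minimal sketch of one common per-position definition: one minus the sum of squared base frequencies among the reads covering a site, averaged over sufficiently covered sites. It is an illustration only and not inStrain's code; the function names, the `min_depth` threshold, and the dictionary-based pileup format are all assumptions introduced here.

```python
from typing import Mapping, Sequence

BASES = ("A", "C", "G", "T")


def position_diversity(counts: Mapping[str, int]) -> float:
    """Diversity at one site: 1 - sum of squared base frequencies.

    `counts` maps each base to the number of reads supporting it
    (hypothetical input format chosen for this sketch).
    """
    depth = sum(counts.get(b, 0) for b in BASES)
    if depth == 0:
        raise ValueError("no coverage at this position")
    return 1.0 - sum((counts.get(b, 0) / depth) ** 2 for b in BASES)


def mean_diversity(pileup: Sequence[Mapping[str, int]], min_depth: int = 5) -> float:
    """Average per-site diversity over sites with at least `min_depth` reads.

    `min_depth` is an assumed filter for this example, not a documented default.
    """
    values = [
        position_diversity(c)
        for c in pileup
        if sum(c.get(b, 0) for b in BASES) >= min_depth
    ]
    if not values:
        raise ValueError("no positions pass the depth filter")
    return sum(values) / len(values)


if __name__ == "__main__":
    # One fixed site, one site split evenly between two bases,
    # and one low-coverage site that the depth filter skips.
    example = [{"A": 10}, {"A": 5, "C": 5}, {"G": 2}]
    print(mean_diversity(example))
```

Under this definition a site where every read carries the same base scores 0, and a site split evenly between two bases scores 0.5, so a population with more segregating sites yields a higher genome-wide average.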
Keyphrases
- genetic diversity
- microbial community
- Escherichia coli
- copy number
- healthcare
- antibiotic resistance genes
- electronic health record
- gene expression
- induced apoptosis
- quality improvement
- DNA methylation
- data analysis
- signaling pathway
- gestational age
- Pseudomonas aeruginosa
- machine learning
- weight loss
- endoplasmic reticulum stress
- preterm infants
- cord blood
- physical activity
- preterm birth
- clinical evaluation