A comprehensive evaluation of binning methods to recover human gut microbial species from a non-redundant reference gene catalog.
Marianne Borderes, Cyrielle Gasc, Emmanuel Prestat, Mariana Galvão Ferrarini, Susana Vinga, Lilia Boucinha, Marie-France Sagot
Published in: NAR Genomics and Bioinformatics (2021)
The human gut microbiota performs functions that are essential for maintaining host physiology. However, characterizing the functioning of microbial communities in relation to the host remains challenging in reference-based metagenomic analyses. Indeed, because taxonomic and functional analyses are performed independently, the link between genes and species remains unclear. Although a first set of species-level bins was built by clustering co-abundant genes, no reference bin set has been established for the most widely used gut microbiota catalog, the Integrated Gene Catalog (IGC). To identify the most suitable method for grouping the IGC genes, we benchmarked nine taxonomy-independent binners implementing abundance-based, hybrid and integrative approaches. To this end, we designed a simulated non-redundant gene catalog (SGC) and computed adapted assessment metrics. Overall, the best trade-off between the main metrics is reached by an integrative binner. For each approach, we then compared the results of the best-performing binner with our expected community structures and applied the method to the IGC. The three approaches are distinguished by specific advantages and by inherent or scalability limitations. Hybrid and integrative binners show promising and potentially complementary results, but require improvements before they can be used on the IGC to recover human gut microbial species.
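To make the idea of abundance-based binning and its assessment more concrete, the following is a minimal Python sketch. It is not one of the nine benchmarked binners and does not reproduce the paper's adapted metrics. It assumes a toy abundance table (genes by samples), groups genes whose abundance profiles are strongly correlated, and scores each bin against known species labels with a simple majority-species precision and recall. All names (`pearson`, `bin_by_coabundance`, `bin_precision_recall`), the 0.9 threshold and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length abundance profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def bin_by_coabundance(profiles, threshold=0.9):
    """Single-linkage grouping: two genes share a bin if their
    abundance profiles correlate at or above the (assumed) threshold."""
    genes = list(profiles)
    parent = {g: g for g in genes}

    def find(g):
        # Union-find root lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                parent[find(a)] = find(b)

    bins = defaultdict(set)
    for g in genes:
        bins[find(g)].add(g)
    return list(bins.values())


def bin_precision_recall(bin_genes, truth):
    """Score one bin against its majority species (toy metric)."""
    counts = Counter(truth[g] for g in bin_genes)
    species, hits = counts.most_common(1)[0]
    species_size = sum(1 for s in truth.values() if s == species)
    return species, hits / len(bin_genes), hits / species_size


# Toy data: abundance of five genes across four samples (hypothetical).
profiles = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [1.1, 2.0, 3.2, 3.9],
    "g4": [4, 3, 2, 1],
    "g5": [8, 6, 4, 2],
}
# Known species of origin, as would be available in a simulated catalog.
truth = {"g1": "A", "g2": "A", "g3": "A", "g4": "B", "g5": "B"}

bins = bin_by_coabundance(profiles)
scores = [bin_precision_recall(b, truth) for b in bins]
for b, (species, precision, recall) in zip(bins, scores):
    print(sorted(b), species, precision, recall)
```

The pairwise comparison is quadratic in the number of genes, so a sketch like this only illustrates the principle. It also hints at why scalability matters for a catalog the size of the IGC.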
Keyphrases
- genome wide
- genome wide identification
- endothelial cells
- induced pluripotent stem cells
- copy number
- microbial community
- dna methylation
- mental health
- transcription factor
- single cell
- magnetic resonance imaging
- network analysis
- antibiotic resistance genes
- high resolution
- wastewater treatment
- quality improvement
- genetic diversity
- magnetic resonance
- diffusion weighted imaging
- mass spectrometry