Fast and accurate bootstrap confidence limits on genome-scale phylogenies using little bootstraps.

Sudip Sharma, Sudhir Kumar
Published in: Nature Computational Science (2021)
Felsenstein's bootstrap approach is widely used to assess confidence in species relationships inferred from multiple sequence alignments. It resamples sites randomly with replacement to build alignment replicates of the same size as the original alignment and infers a phylogeny from each replicate dataset. The proportion of phylogenies recovering the same grouping of species is the bootstrap confidence limit for that grouping. However, the standard bootstrap imposes a high computational burden in applications involving long sequence alignments. Here, we introduce the bag of little bootstraps approach to phylogenetics, bootstrapping only a few little samples, each containing a small subset of sites. We report that the median bagging of bootstrap confidence limits from little samples produces confidence in inferred species relationships similar to the standard bootstrap but in a fraction of the computational time and memory. Therefore, the little bootstraps approach can potentially enhance the rigor, efficiency, and parallelization of big data phylogenomic analyses.
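The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Tree inference is replaced by a toy stand-in, `infer_split`, which picks the four-taxon grouping supported by the most sites. The function names and the parameters `n_samples`, `g`, and `n_reps` are hypothetical. Sizing each little sample as the alignment length raised to the power `g`, and upsampling each replicate back to the full alignment length, follow the general bag of little bootstraps recipe and are assumptions not stated in the abstract.

```python
import random
from statistics import median

def infer_split(columns):
    """Toy stand-in for tree inference on four taxa (A, B, C, D).

    Each column is a 4-character string, one character per taxon.
    Returns the grouping supported by the most sites; ties go to the
    first grouping in dictionary order.
    """
    counts = {"AB|CD": 0, "AC|BD": 0, "AD|BC": 0}
    for a, b, c, d in columns:
        if a == b and c == d and a != c:
            counts["AB|CD"] += 1
        elif a == c and b == d and a != b:
            counts["AC|BD"] += 1
        elif a == d and b == c and a != b:
            counts["AD|BC"] += 1
    return max(counts, key=counts.get)

def bootstrap_support(columns, split, n_reps, out_len, rng):
    """Fraction of replicates whose inferred grouping equals `split`."""
    hits = 0
    for _ in range(n_reps):
        # Resample sites with replacement to build a replicate of out_len sites.
        replicate = rng.choices(columns, k=out_len)
        if infer_split(replicate) == split:
            hits += 1
    return hits / n_reps

def little_bootstrap_support(columns, split, n_samples=5, g=0.7,
                             n_reps=50, seed=0):
    """Median support across a few little samples (all defaults are assumed)."""
    rng = random.Random(seed)
    total = len(columns)
    little_len = max(1, int(total ** g))  # assumed size rule for a little sample
    supports = []
    for _ in range(n_samples):
        # A little sample: a small subset of sites drawn without replacement.
        little = rng.sample(columns, little_len)
        # Bootstrap within the little sample, upsampling to the full length.
        supports.append(bootstrap_support(little, split, n_reps, total, rng))
    return median(supports)

# Made-up alignment columns in which most informative sites group A with B.
columns = ["AACC"] * 600 + ["ACAC"] * 250 + ["ACCA"] * 150
print(little_bootstrap_support(columns, "AB|CD"))
```

In this sketch each little sample is processed independently, so the loop over little samples could be distributed across workers, which is in line with the parallelization benefit the abstract mentions.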
Keyphrases
  • big data
  • artificial intelligence
  • machine learning
  • working memory
  • gene expression