Scalable Reconstruction of SARS-CoV-2 Phylogeny with Recurrent Mutations.

Daniel Novikov, Sergey Knyazev, Mark Grinshpon, Pelin Icer, Pavel Skums, Alexander Zelikovsky
Published in: Journal of Computational Biology: A Journal of Computational Molecular Cell Biology (2021)
This article presents SPHERE (Scalable PHylogEny with REcurrent mutations), a novel scalable character-based phylogeny algorithm for dense viral sequencing data. The algorithm is based on an evolutionary model in which recurrent mutations are allowed but backward mutations are prohibited. It creates rooted character-based phylogenetic trees in which all leaves and internal nodes are labeled by observed taxa. We show that the SPHERE phylogeny is more stable than Nextstrain's and that it accurately infers known transmission links from the early pandemic. SPHERE is fast: it can process >200,000 sequences in <2 hours, and it offers a compact phylogenetic visualization of Global Initiative on Sharing All Influenza Data (GISAID) data.
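The evolutionary model in the abstract can be illustrated with a small sketch. This is not the SPHERE implementation, and the abstract does not specify how SPHERE selects parents; the code below is a simplified greedy stand-in that only demonstrates the two stated constraints. It assumes each taxon is represented as a set of mutations relative to a reference, that the root is an observed taxon with no mutations, and that forbidding backward mutations means a parent's mutation set must be a subset of its child's. The function name `build_parent_map` and the mutation labels `m1`, `m2`, `m3` are hypothetical.

```python
from typing import Dict, FrozenSet, Optional


def build_parent_map(
    taxa: Dict[str, FrozenSet[str]], root: str
) -> Dict[str, Optional[str]]:
    """Greedy sketch: attach each taxon to an already-placed observed taxon.

    Assumptions (not taken from the paper):
      * taxa[name] is the set of mutations carried by that taxon,
      * taxa[root] is a subset of every other taxon's mutation set,
      * the chosen parent is the placed taxon with the largest mutation
        set that is contained in the child's set.
    """
    parents: Dict[str, Optional[str]] = {root: None}

    # Place taxa with fewer mutations first so potential ancestors exist
    # in the tree before their descendants are attached.
    order = sorted(
        (name for name in taxa if name != root),
        key=lambda name: (len(taxa[name]), name),
    )

    for name in order:
        mutations = taxa[name]
        best = root
        for candidate in parents:
            # No backward mutations: a parent may not carry a mutation
            # that the child lacks, so its set must be a subset.
            is_ancestor_like = taxa[candidate] <= mutations
            if is_ancestor_like and len(taxa[candidate]) > len(taxa[best]):
                best = candidate
        parents[name] = best

    return parents


# Hypothetical data: mutation m3 arises independently under A and under B,
# which is a recurrent mutation and is permitted by the model.
example_taxa = {
    "ref": frozenset(),
    "A": frozenset({"m1"}),
    "B": frozenset({"m2"}),
    "C": frozenset({"m1", "m3"}),
    "D": frozenset({"m2", "m3"}),
}
example_parents = build_parent_map(example_taxa, root="ref")
```

In this toy input every node of the resulting tree, internal or leaf, is one of the observed taxa, matching the labeling property described in the abstract. The quadratic parent search is for readability only and says nothing about how SPHERE achieves its reported running time.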
Keyphrases
  • sars cov
  • machine learning
  • deep learning
  • electronic health record
  • big data
  • neural network
  • respiratory syndrome coronavirus
  • gene expression
  • artificial intelligence
  • radiation therapy
  • dna methylation