Deep distributed computing to reconstruct extremely large lineage trees.
Naoki Konno, Yusuke Kijima, Keito Watano, Soh Ishiguro, Keiichiro Ono, Mamoru Tanaka, Hideto Mori, Nanami Masuyama, Dexter Pratt, Trey Ideker, Wataru Iwasaki, Nozomu Yachie
Published in: Nature Biotechnology (2022)
Phylogeny estimation (the reconstruction of evolutionary trees) has recently been applied to CRISPR-based cell lineage tracing, allowing the developmental history of an individual tissue or organism to be inferred from a large number of mutated sequences in somatic cells. However, current computational methods are not able to construct phylogenetic trees from extremely large numbers of input sequences. Here, we present a deep distributed computing framework to comprehensively trace accurate large lineages (FRACTAL) that substantially enhances the scalability of current lineage estimation software tools. FRACTAL first reconstructs only an upstream lineage of the input sequences and recursively iterates the same procedure for its downstream lineages using independent computing nodes. We demonstrate the utility of FRACTAL by reconstructing lineages from >235 million simulated sequences and from >16 million cells from a simulated experiment with a CRISPR system that accumulates mutations during cell proliferation. We also successfully applied FRACTAL to evolutionary tree reconstructions and to an experiment using error-prone PCR (EP-PCR) for large-scale sequence diversification.
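
The recursive idea described in the abstract (resolve an upstream lineage first, then repeat the same procedure on each downstream group) can be illustrated with a toy sketch. This is not the authors' implementation. It makes several simplifying assumptions: sequences are equal-length strings compared by Hamming distance, the upstream tree is built from a small random sample by naive single-linkage clustering, and the remaining sequences are assigned to their nearest sampled sequence as a stand-in for a real placement step. All function names (`build_tree`, `fractal_sketch`, `leaves`) and the `sample_size` and `threshold` parameters are hypothetical. The recursion runs sequentially here, whereas the abstract describes dispatching downstream lineages to independent computing nodes.

```python
import random


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def build_tree(names, seqs):
    """Naive single-linkage clustering; returns a tree of nested tuples.

    A stand-in for a real lineage estimation tool (an assumption).
    """
    clusters = [(n, [n]) for n in names]  # (subtree, member names)
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(hamming(seqs[a], seqs[b])
                        for a in clusters[i][1] for b in clusters[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


def graft(tree, subtrees):
    """Replace each leaf of the upstream tree with its downstream subtree."""
    if isinstance(tree, tuple):
        return tuple(graft(t, subtrees) for t in tree)
    return subtrees[tree]


def fractal_sketch(names, seqs, sample_size=4, threshold=8, rng=None):
    """Toy recursion: upstream tree from a sample, then recurse per group.

    Assumes 2 <= sample_size <= threshold so every group shrinks.
    """
    rng = rng or random.Random(0)
    if len(names) <= threshold:
        return build_tree(names, seqs)  # small enough to solve directly
    sample = rng.sample(names, sample_size)
    upstream = build_tree(sample, seqs)
    groups = {s: [s] for s in sample}
    sampled = set(sample)
    for n in names:
        if n not in sampled:
            nearest = min(sample, key=lambda s: hamming(seqs[n], seqs[s]))
            groups[nearest].append(n)
    # In a distributed setting, each group could be sent to its own node.
    subtrees = {s: fractal_sketch(g, seqs, sample_size, threshold, rng)
                for s, g in groups.items()}
    return graft(upstream, subtrees)


def leaves(tree):
    """Flatten a nested-tuple tree into its list of leaf names."""
    if isinstance(tree, tuple):
        return [leaf for t in tree for leaf in leaves(t)]
    return [tree]
```

Because each sampled sequence seeds its own group, every group is strictly smaller than its parent input, so the recursion terminates. Calling `fractal_sketch(names, seqs)` with a list of string names and a name-to-sequence dictionary returns a nested-tuple tree containing every input name exactly once.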
Keyphrases
- single cell
- genome wide
- cell proliferation
- crispr cas
- genome editing
- induced apoptosis
- cell fate
- dna methylation
- genetic diversity
- copy number
- cell cycle arrest
- cell therapy
- radiation therapy
- squamous cell carcinoma
- heavy metals
- magnetic resonance imaging
- oxidative stress
- cell death
- risk assessment
- endoplasmic reticulum stress
- signaling pathway
- early stage
- neural network
- rectal cancer