NGSEP 4: Efficient and accurate identification of orthogroups and whole-genome alignment.

Daniel Tello, Laura Natalia Gonzalez-Garcia, Jorge Gomez, Juan Camilo Zuluaga-Monares, Rogelio Garcia, Ricardo Angel, Daniel Mahecha, Erick Duarte, Maria Del Rosario Leon, Fernando Reyes, Camilo Escobar-Velásquez, Mario Linares-Vásquez, Nicolas Cardozo, Jorge Duitama
Published in: Molecular Ecology Resources (2022)
Whole-genome alignment allows researchers to understand the genomic structure and variation among genomes. Approaches based on direct pairwise comparisons of DNA sequences require large computational capacity. As a consequence, pipelines combining tools for orthologous gene identification and synteny analysis have been developed. In this manuscript, we present the latest functionalities implemented in NGSEP 4 to identify orthogroups and perform whole-genome alignments. NGSEP implements functionalities for identification of clusters of homologous genes, synteny analysis and whole-genome alignment. Our results showed that the NGSEP algorithm for orthogroup identification has competitive accuracy and efficiency in comparison to commonly used tools. The implementation also includes a visualization of the whole-genome alignment based on synteny of the identified orthogroups, and a reconstruction of the pangenome based on the frequencies of the orthogroups among the genomes. NGSEP 4 also includes a new graphical user interface based on JavaFX. We expect that these new developments will be useful for studies in evolutionary biology and population genomics.
Keyphrases
  • bioinformatics analysis
  • genome wide
  • healthcare
  • copy number
  • machine learning
  • genome wide identification
  • single molecule
  • neural network