A workflow with R: Phylogenetic analyses and visualizations using mitochondrial cytochrome b gene sequences.

Emine Toparslan, Kemal Karabag, Ugur Bilge
Published in: PloS one (2020)
Phylogenetic analyses can provide a wealth of information about the past demography of a population and the level of genetic diversity within and between species. Computer programs developed in recent years have produced large amounts of data in molecular genetics, and powerful, computationally intensive methods for analyzing these data have been implemented in various software packages. However, these programs each have their own input and output formats, so users must prepare a different input file for almost every program. R is an open-source software environment whose libraries are open to contribution and modification, and it makes it possible to perform several analyses from a single input file format. In this article, starting from a multi-sequence FASTA file (.fas extension), we demonstrate and share a workflow for extracting haplotypes and performing phylogenetic analyses and visualizations in R. As an example dataset, we used 120 Bombus terrestris dalmatinus mitochondrial cytochrome b gene (cyt b) sequences (373 bp) collected from eight different beehives in Antalya. This article presents a short guide on how to perform phylogenetic analyses using R and RStudio.
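To make the haplotype-extraction step concrete, the following is a minimal, language-neutral sketch of the idea: read a multi-sequence FASTA file and collapse identical sequences into haplotypes with their sample lists. It is written in Python purely for illustration and is not the authors' R code. The function names (`parse_fasta`, `extract_haplotypes`), the sample names, and the toy 6 bp sequences are all hypothetical, and the sketch assumes the sequences are already aligned to equal length and compared by exact string identity.

```python
def parse_fasta(text):
    """Parse multi-sequence FASTA text into a {name: sequence} dict.

    Hypothetical helper: takes the first word of each '>' header as the
    sample name and joins wrapped sequence lines, upper-casing the bases.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            # Fall back to a positional name if the header is empty.
            name = header[0] if header else "seq%d" % (len(records) + 1)
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def extract_haplotypes(records):
    """Group sample names by identical sequence.

    Returns a list of (label, sequence, sample_names) tuples, most frequent
    haplotype first; ties keep the order of first appearance.
    """
    groups = {}
    for name, seq in records.items():
        groups.setdefault(seq, []).append(name)
    ordered = sorted(groups.items(), key=lambda kv: -len(kv[1]))
    return [("H%d" % i, seq, names)
            for i, (seq, names) in enumerate(ordered, start=1)]


# Toy input (invented data, not the article's cyt b sequences).
EXAMPLE_FASTA = """>hive1_a
ATGACC
>hive1_b
ATGACC
>hive2_a
ATGATC
>hive2_b
atgacc
"""

if __name__ == "__main__":
    for label, seq, names in extract_haplotypes(parse_fasta(EXAMPLE_FASTA)):
        print(label, seq, len(names), ",".join(names))
```

Exact string matching is the simplest possible definition of a haplotype; real data with ambiguous bases or gaps would need a more careful comparison, which this sketch does not attempt.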
Keyphrases
  • genetic diversity
  • electronic health record
  • oxidative stress
  • public health
  • genome wide
  • copy number
  • data analysis
  • machine learning
  • dna methylation
  • genome wide identification
  • deep learning