PVTree: A Sequential Pattern Mining Method for Alignment Independent Phylogeny Reconstruction.

Yongyong Kang, Xiaofei Yang, Jiadong Lin, Kai Ye
Published in: Genes (2019)
A phylogenetic tree is essential for understanding evolution and is usually constructed through multiple sequence alignment, which carries a heavy computational burden and requires sophisticated parameter tuning. Recently, alignment-free methods based on k-mer profiles or common substrings have provided alternative ways to construct phylogenetic trees. However, most of these methods ignore global similarities between sequences or specific valuable features, e.g., patterns that are frequent across the whole dataset. To improve on this, we propose an alignment-free algorithm based on sequential pattern mining, in which each sequence is converted into a binary representation of the sequential patterns found among the sequences. The phylogenetic tree is then constructed by clustering a distance matrix calculated from the pattern vectors. To increase accuracy for highly divergent sequences, we weight patterns and filter out redundant sub-patterns. Results on both simulated and real data demonstrate that our method outperforms other alignment-free methods, especially for large sequence sets with low similarity.
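The pipeline outlined in the abstract (mine patterns, encode each sequence as a binary pattern vector, compute a distance matrix, cluster it into a tree) can be illustrated with a minimal Python sketch. This is not the PVTree implementation. Several pieces are assumptions made only for illustration: contiguous k-mers shared by at least `min_support` sequences stand in for true sequential pattern mining, Jaccard distance stands in for the unspecified distance measure, and UPGMA (average linkage) stands in for the unspecified clustering step. Pattern weighting and redundant sub-pattern filtering are omitted. All function names are hypothetical.

```python
from itertools import combinations


def mine_patterns(seqs, k=3, min_support=2):
    """Stand-in for pattern mining: contiguous k-mers that occur
    in at least `min_support` of the input sequences."""
    support = {}
    for s in seqs:
        # Count each k-mer once per sequence.
        for p in {s[i:i + k] for i in range(len(s) - k + 1)}:
            support[p] = support.get(p, 0) + 1
    return sorted(p for p, c in support.items() if c >= min_support)


def pattern_vector(seq, patterns):
    """Binary vector: 1 if the sequence contains the pattern, else 0."""
    return [1 if p in seq else 0 for p in patterns]


def jaccard_distance(u, v):
    """Assumed distance: 1 - |shared patterns| / |patterns in either|."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return 0.0 if either == 0 else 1.0 - both / either


def distance_matrix(vectors):
    """Symmetric matrix of pairwise distances between pattern vectors."""
    n = len(vectors)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = jaccard_distance(vectors[i], vectors[j])
    return d


def upgma(names, d):
    """Assumed clustering: average linkage. Returns the tree topology
    as nested tuples of leaf names (branch lengths are not tracked)."""
    clusters = {i: (names[i], 1) for i in range(len(names))}  # id -> (tree, size)
    dist = {(i, j): d[i][j] for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        a, b = min(dist, key=dist.get)  # closest pair of clusters
        (ta, na), (tb, nb) = clusters.pop(a), clusters.pop(b)
        merged = {}
        for c in clusters:
            dac = dist[(min(a, c), max(a, c))]
            dbc = dist[(min(b, c), max(b, c))]
            # Size-weighted average distance to the merged cluster.
            merged[(c, next_id)] = (na * dac + nb * dbc) / (na + nb)
        dist = {key: v for key, v in dist.items() if a not in key and b not in key}
        dist.update(merged)
        clusters[next_id] = ((ta, tb), na + nb)
        next_id += 1
    return next(iter(clusters.values()))[0]
```

To use the sketch, pass the sequences to `mine_patterns`, build one `pattern_vector` per sequence, and hand the resulting `distance_matrix` together with the sequence names to `upgma`. A real sequential pattern miner would also allow gapped patterns, which contiguous k-mers do not capture.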