Density Peak clustering of protein sequences associated to a Pfam clan reveals clear similarities and interesting differences with respect to manual family annotation.

Elena Tea Russo, Alessandro Laio, Marco Punta
Published in: BMC Bioinformatics (2021)
The clustering procedure described in this work takes advantage of the information contained in a large set of pairwise alignments and successfully identifies a set of putative families and family architectures in an unsupervised manner. Comparison with the Pfam classification highlights significant overlap and points to interesting differences, suggesting that our new algorithm could have potential in applications related to automatic protein classification. Testing this hypothesis, however, will require further experiments on large and diverse sequence datasets.
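For readers unfamiliar with the technique named in the title, the following is a minimal sketch of generic Density Peak clustering (the approach introduced by Rodriguez and Laio in 2014) applied to a precomputed distance matrix. It is not the pipeline used in this paper: how distances are derived from pairwise alignments is not described in this summary, so the toy distances below, the function name `density_peaks`, and the parameters `d_c` (density cutoff) and `n_clusters` are all assumptions made for illustration.

```python
def density_peaks(dist, d_c, n_clusters):
    """Generic Density Peak clustering on a symmetric distance matrix.

    dist       -- list of lists, dist[i][j] is the distance between i and j
    d_c        -- cutoff radius used to count neighbours (hypothetical value)
    n_clusters -- number of cluster centres to keep (hypothetical parameter)
    """
    n = len(dist)
    # Local density: number of other points closer than the cutoff.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    # Rank points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    delta = [0.0] * n   # distance to the nearest point ranked denser
    parent = [-1] * n   # index of that nearest denser point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbour; by convention
            # its delta is its largest distance to any other point.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Centres are the points with the largest rho * delta.
    centers = sorted(range(n), key=lambda i: (-rho[i] * delta[i], i))[:n_clusters]
    labels = [-1] * n
    for label, c in enumerate(centers):
        labels[c] = label
    # Every other point inherits the label of its nearest denser point,
    # visiting points in order of decreasing density.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers


# Toy example: six points on a line forming two well-separated groups.
points = [0, 1, 2, 50, 51, 52]
toy_dist = [[abs(a - b) for b in points] for a in points]
toy_labels, toy_centers = density_peaks(toy_dist, d_c=1.5, n_clusters=2)
```

In this toy run the middle point of each group is selected as a centre, and the remaining points join the group of their nearest denser neighbour. In practice the centres are usually chosen by inspecting the density and distance values rather than by fixing their number in advance; the fixed `n_clusters` here only keeps the sketch short.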
Keyphrases
  • machine learning
  • deep learning
  • rna seq
  • single cell
  • amino acid
  • protein protein
  • binding protein
  • minimally invasive
  • healthcare
  • gene expression
  • drug induced