Alignment-free clustering of large data sets of unannotated protein conserved regions using minhashing.

Armen Abnousi, Shira L Broschat, Ananth Kalyanaraman
Published in: BMC Bioinformatics (2018)
The new clustering algorithm can be used to generate meaningful clusters of conserved regions. It is a scalable method that, when paired with NADDA, our prior work for detecting conserved regions, provides a complete end-to-end pipeline for annotating protein sequences.
Keyphrases
  • transcription factor
  • single cell
  • protein-protein
  • machine learning
  • RNA-seq
  • amino acid
  • electronic health record
  • binding protein
  • deep learning
  • big data