nail: software for high-speed, high-sensitivity protein sequence annotation.

Jack W Roddy, David H Rich, Travis J Wheeler
Published in: bioRxiv : the preprint server for biology (2024)
Here, we introduce a new tool that bridges the gap between advances in these two directions, reaching speeds comparable to fast annotation methods such as MMseqs2 while retaining most of the sensitivity offered by profile hidden Markov models (pHMMs). The tool, called nail, implements a heuristic approximation of the pHMM Forward/Backward (FB) algorithm by identifying a sparse subset of the cells in the FB dynamic programming matrix that contains most of the probability mass. The method produces an accurate approximation of pHMM scores and E-values with high speed and small memory requirements. On a protein benchmark, nail recovers the majority of the recall difference between MMseqs2 and HMMER, with a run time ~26x faster than HMMER3 (only ~2.4x slower than MMseqs2's sensitive variant). nail is released under the open BSD-3-clause license and is available for download at https://github.com/TravisWheelerLab/nail.
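The idea that most of the Forward/Backward probability mass sits in a small subset of matrix cells can be illustrated with a toy example. The sketch below is not nail's algorithm or code: it runs a dense Forward/Backward pass on a small, made-up left-to-right HMM (not a full profile HMM), computes the per-cell posterior, and then counts how few cells are needed to cover a chosen fraction of that mass. The model parameters, the function names, and the `keep` threshold are all assumptions chosen for illustration. A real sparse method would have to find such cells without first filling the whole matrix, and would work in log or scaled space to avoid underflow on long sequences.

```python
# Toy illustration only: dense Forward/Backward on a small HMM, followed by
# a count of how many cells hold most of the posterior probability mass.
# All parameters and names here are hypothetical, not taken from nail.

def forward(init, trans, emit, obs):
    """Forward matrix f[t][s]: probability of obs[0..t], ending in state s."""
    n = len(init)
    f = [[init[s] * emit[s][obs[0]] for s in range(n)]]
    for t in range(1, len(obs)):
        prev = f[-1]
        f.append([emit[s][obs[t]] * sum(prev[r] * trans[r][s] for r in range(n))
                  for s in range(n)])
    return f

def backward(trans, emit, obs):
    """Backward matrix b[t][s]: probability of obs[t+1..] given state s at t."""
    n = len(trans)
    b = [[1.0] * n]
    for t in range(len(obs) - 2, -1, -1):
        nxt = b[0]
        b.insert(0, [sum(trans[s][r] * emit[r][obs[t + 1]] * nxt[r] for r in range(n))
                     for s in range(n)])
    return b

def posterior(init, trans, emit, obs):
    """Per-cell posterior; each row (sequence position) sums to 1."""
    f = forward(init, trans, emit, obs)
    b = backward(trans, emit, obs)
    total = sum(f[-1])  # probability of the whole observation sequence
    return [[f[t][s] * b[t][s] / total for s in range(len(init))]
            for t in range(len(obs))]

def sparse_cells(post, keep=0.99):
    """Greedily pick the highest-posterior cells until `keep` of the mass is covered."""
    cells = sorted(((p, t, s) for t, row in enumerate(post) for s, p in enumerate(row)),
                   reverse=True)
    target = keep * len(post)  # total posterior mass is one unit per row
    kept, mass = set(), 0.0
    for p, t, s in cells:
        if mass >= target:
            break
        kept.add((t, s))
        mass += p
    return kept, mass

# Hypothetical 4-state left-to-right model over a 2-symbol alphabet.
INIT = [1.0, 0.0, 0.0, 0.0]
TRANS = [[0.5, 0.5, 0.0, 0.0],
         [0.0, 0.5, 0.5, 0.0],
         [0.0, 0.0, 0.5, 0.5],
         [0.0, 0.0, 0.0, 1.0]]
EMIT = [[0.9, 0.1], [0.1, 0.9], [0.9, 0.1], [0.1, 0.9]]
OBS = [0, 0, 1, 1, 0, 0, 1, 1]

if __name__ == "__main__":
    post = posterior(INIT, TRANS, EMIT, OBS)
    kept, mass = sparse_cells(post, keep=0.99)
    n_cells = len(post) * len(post[0])
    print(f"kept {len(kept)} of {n_cells} cells covering {mass / len(post):.4f} of the mass")
```

In this toy setting the left-to-right structure already forces many cells to zero, so the retained set is strictly smaller than the full matrix. How much smaller it is for real protein pHMMs is a question the paper's benchmark addresses, not this sketch.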