
DomainMapper: Accurate domain structure annotation including those with non-contiguous topologies.

Edgar Manriquez-Sandoval, Stephen D Fried
Published in: Protein science : a publication of the Protein Society (2022)
Automated domain annotation is an important tool for structural informatics. These pipelines typically involve searching query sequences against hidden Markov model (HMM) profiles, yielding matches to profiles for various domains. However, domain annotation can be ambiguous or inaccurate when proteins contain domains with non-contiguous residue ranges, and especially when insertional domains are hosted within them. Here, we present DomainMapper, an algorithm that accurately assigns a unique domain structure annotation to a query sequence, including sequences with complex topologies. We validate our domain assignments using the AlphaFold database and confirm that non-contiguity is pervasive (10.74% of all domains in yeast and 4.52% in human). Using this resource, we find that certain folds have strong propensities to be non-contiguous or insertional across the Tree of Life. DomainMapper is freely available and can be run as a single command-line function.
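To make the terms "non-contiguous" and "insertional" concrete, the following is a minimal Python sketch of how HMM hits might be grouped into such domains. It is not DomainMapper's actual algorithm or interface: the `Hit` and `Domain` classes, the functions `merge_split_hits` and `insertional_domains`, and the `max_hmm_overlap` tolerance are all hypothetical names and values chosen for illustration. The assumption is that each hit reports both its residue range on the query and the range of the profile it covers, so that two hits to the same profile covering successive parts of the model can be joined into one domain.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Hit:
    """One HMM match (hypothetical record; coordinates are 1-based, inclusive)."""
    profile: str    # name of the matched HMM profile
    q_start: int    # first matched residue on the query
    q_end: int      # last matched residue on the query
    hmm_start: int  # first matched position in the profile
    hmm_end: int    # last matched position in the profile


@dataclass
class Domain:
    """A domain made of one or more query segments."""
    profile: str
    segments: List[Tuple[int, int]]

    @property
    def is_non_contiguous(self) -> bool:
        return len(self.segments) > 1


def merge_split_hits(hits: List[Hit], max_hmm_overlap: int = 10) -> List[Domain]:
    """Join hits to the same profile when a later hit continues the model
    roughly where the earlier one stopped (illustrative heuristic only)."""
    domains: List[Domain] = []
    latest = {}  # profile -> (domain, hmm_end of its most recent segment)
    for hit in sorted(hits, key=lambda h: h.q_start):
        prev = latest.get(hit.profile)
        if prev is not None:
            dom, hmm_end = prev
            continues_model = hit.hmm_start >= hmm_end - max_hmm_overlap
            after_previous = hit.q_start > dom.segments[-1][1]
            if continues_model and after_previous:
                dom.segments.append((hit.q_start, hit.q_end))
                latest[hit.profile] = (dom, hit.hmm_end)
                continue
        dom = Domain(hit.profile, [(hit.q_start, hit.q_end)])
        domains.append(dom)
        latest[hit.profile] = (dom, hit.hmm_end)
    return domains


def insertional_domains(domains: List[Domain]) -> List[Tuple[str, str]]:
    """Return (guest, host) profile pairs where the guest lies entirely
    inside a gap between two consecutive segments of the host."""
    pairs = []
    for host in domains:
        gaps = [(a_end, b_start)
                for (_, a_end), (b_start, _) in zip(host.segments, host.segments[1:])]
        for guest in domains:
            if guest is host:
                continue
            g_start, g_end = guest.segments[0][0], guest.segments[-1][1]
            if any(lo < g_start and g_end < hi for lo, hi in gaps):
                pairs.append((guest.profile, host.profile))
    return pairs


# Made-up example: profile A is split in two by an inserted domain B.
example_hits = [
    Hit("A", 1, 100, 1, 95),
    Hit("B", 110, 200, 1, 90),
    Hit("A", 210, 300, 96, 190),
]
example_domains = merge_split_hits(example_hits)
example_pairs = insertional_domains(example_domains)
```

In this made-up example, the two hits to profile A cover successive halves of the model and are merged into one non-contiguous domain, while B falls in the gap between them and is reported as insertional. Two hits that each cover the start of the same profile, as in a tandem repeat, would stay separate domains under this heuristic.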
Keyphrases
  • rna seq
  • machine learning
  • deep learning
  • high resolution
  • artificial intelligence
  • single cell
  • induced pluripotent stem cells
  • genetic diversity