GPSFun: geometry-aware protein sequence function predictions with language models.

Qianmu Yuan, Chong Tian, Yidong Song, Peihua Ou, Mingming Zhu, Huiying Zhao, Yuedong Yang
Published in: Nucleic Acids Research (2024)
Knowledge of protein function is essential for elucidating disease mechanisms and discovering new drug targets. However, there is a widening gap between the exponential growth of protein sequences and their limited function annotations. In our prior studies, we have developed a series of methods including GraphPPIS, GraphSite, LMetalSite and SPROF-GO for protein function annotations at residue or protein level. To further enhance their applicability and performance, we now present GPSFun, a versatile web server for Geometry-aware Protein Sequence Function annotations, which equips our previous tools with language models and geometric deep learning. Specifically, GPSFun employs large language models to efficiently predict 3D conformations of the input protein sequences and extract informative sequence embeddings. Subsequently, geometric graph neural networks are utilized to capture the sequence and structure patterns in the protein graphs, facilitating various downstream predictions including protein-ligand binding sites, gene ontologies, subcellular locations and protein solubility. Notably, GPSFun achieves superior performance to state-of-the-art methods across diverse tasks without requiring multiple sequence alignments or experimental protein structures. GPSFun is freely available to all users at https://bio-web1.nscc-gz.cn/app/GPSFun with user-friendly interfaces and rich visualizations.
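The abstract describes protein graphs built from predicted 3D conformations but does not say how the graphs are constructed. As a minimal sketch of one common approach, the example below links residues whose C-alpha atoms lie within a distance cutoff. The function name `build_residue_graph`, the 15 Å cutoff and the toy coordinates are illustrative assumptions, not details taken from GPSFun.

```python
import numpy as np


def build_residue_graph(ca_coords, cutoff=15.0):
    """Build a residue-level radius graph from C-alpha coordinates.

    Hypothetical helper for illustration only; the cutoff (in angstroms)
    is an assumed value, not one reported for GPSFun.
    Returns (edge_index, edge_dist): a (2, n_edges) array of directed
    residue index pairs and the matching distances.
    """
    coords = np.asarray(ca_coords, dtype=float)
    if coords.ndim != 2 or coords.shape[1] != 3:
        raise ValueError("ca_coords must have shape (n_residues, 3)")

    # Pairwise Euclidean distances between all residues.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # Keep pairs within the cutoff and drop self-loops.
    mask = (dist <= cutoff) & ~np.eye(len(coords), dtype=bool)
    src, dst = np.nonzero(mask)
    edge_index = np.stack([src, dst])
    return edge_index, dist[src, dst]


# Toy input: four residues on a line, spaced 10 angstroms apart.
toy_coords = [[0, 0, 0], [10, 0, 0], [20, 0, 0], [30, 0, 0]]
edge_index, edge_dist = build_residue_graph(toy_coords, cutoff=15.0)
```

In a pipeline of the kind the abstract outlines, the nodes of such a graph would carry per-residue language-model embeddings, and a geometric graph neural network would operate on the nodes and edges together.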
Keyphrases
  • amino acid
  • protein protein
  • binding protein
  • emergency department
  • gene expression
  • autism spectrum disorder
  • squamous cell carcinoma
  • small molecule
  • machine learning
  • transcription factor
  • lymph node metastasis