AggBERT: Best in Class Prediction of Hexapeptide Amyloidogenesis with a Semi-Supervised ProtBERT Model.

Ryann Perez, Xinning Li, Sam G Giannakoulias, E James Petersson
Published in: Journal of Chemical Information and Modeling (2023)
The prediction of peptide amyloidogenesis is a challenging problem in the field of protein folding. Large language models, such as the ProtBERT model, have recently emerged as powerful tools for analyzing protein sequences in applications such as predicting protein structure and function. In this article, we describe the use of a semisupervised, fine-tuned ProtBERT model to predict peptide amyloidogenesis from sequence alone. Our approach, which we call AggBERT, achieved state-of-the-art performance, demonstrating the potential for large language models to improve the accuracy and speed of amyloid fibril prediction over simple heuristics or structure-based approaches. This work highlights the transformative potential of machine learning and large language models in the fields of chemical biology and biomedicine.
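
To make "from sequence alone" concrete, the sketch below shows how a hexapeptide might be prepared as input for a ProtBERT-style model. It assumes the input convention of the public ProtBERT checkpoints (uppercase residues separated by spaces, with the rare residues U, Z, O and B mapped to X); whether AggBERT uses exactly this preprocessing is not stated in the abstract. The function names `format_for_protbert` and `is_hexapeptide` are hypothetical, and VQIVYK is used only as an example input.

```python
import re

HEXAPEPTIDE_LENGTH = 6
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 canonical amino acids


def format_for_protbert(sequence: str) -> str:
    """Return the sequence as space-separated uppercase residues.

    Assumption: follows the usual ProtBERT convention of mapping the
    rare residues U, Z, O and B to X. Not confirmed for AggBERT.
    """
    cleaned = sequence.strip().upper()
    if not cleaned.isalpha():
        raise ValueError(f"not a plain amino-acid string: {sequence!r}")
    cleaned = re.sub(r"[UZOB]", "X", cleaned)
    return " ".join(cleaned)


def is_hexapeptide(sequence: str) -> bool:
    """True if the sequence has exactly six canonical residues."""
    cleaned = sequence.strip().upper()
    return len(cleaned) == HEXAPEPTIDE_LENGTH and set(cleaned) <= STANDARD_RESIDUES


if __name__ == "__main__":
    peptide = "vqivyk"
    if is_hexapeptide(peptide):
        print(format_for_protbert(peptide))
```

The formatted string is what would be passed to a tokenizer before classification; the fine-tuning and semisupervised training steps themselves are not sketched here because the abstract does not describe them.
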
Keyphrases
  • machine learning
  • autism spectrum disorder
  • protein protein
  • binding protein
  • human health
  • risk assessment
  • molecular dynamics simulations
  • climate change