A Perspective on the Prospective Use of AI in Protein Structure Prediction.
Raphaelle Versini, Sujith Sritharan, Burcu Aykac Fas, Thibault Tubiana, Sana Zineb Aimeur, Julien Henri, Marie Erard, Oliver Nüße, Jessica Andreani, Marc Baaden, Patrick Fuchs, Tatiana Galochkina, Alexios Chatzigoulas, Zoe Cournia, Hubert Santuz, Sophie Sacquin-Mora, Antoine Taly
Published in: Journal of Chemical Information and Modeling (2023)
AlphaFold2 (AF2) and RoseTTaFold (RF) have revolutionized structural biology, serving as highly reliable and effective methods for predicting protein structures. This article explores their impact and limitations, focusing on their integration into experimental pipelines and their application in diverse protein classes, including membrane proteins, intrinsically disordered proteins (IDPs), and oligomers. In experimental pipelines, AF2 models help X-ray crystallography in resolving the phase problem, while complementarity with mass spectrometry and NMR data enhances structure determination and protein flexibility prediction. Predicting the structure of membrane proteins remains challenging for both AF2 and RF due to difficulties in capturing conformational ensembles and interactions with the membrane. Improvements in incorporating membrane-specific features and predicting the structural effect of mutations are crucial. For intrinsically disordered proteins, AF2's confidence score (pLDDT) serves as a competitive disorder predictor, but integrative approaches including molecular dynamics (MD) simulations or hydrophobic cluster analyses are advocated for accurate dynamics representation. AF2 and RF show promising results for oligomeric models, outperforming traditional docking methods, with AlphaFold-Multimer showing improved performance. However, some caveats remain in particular for membrane proteins. Real-life examples demonstrate AF2's predictive capabilities in unknown protein structures, but models should be evaluated for their agreement with experimental data. Furthermore, AF2 models can be used complementarily with MD simulations. In this Perspective, we propose a "wish list" for improving deep-learning-based protein folding prediction models, including using experimental data as constraints and modifying models with binding partners or post-translational modifications. 
Additionally, we suggest a meta-tool for ranking models and proposing composite models, which could drive future advances in this rapidly evolving field.
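
The abstract's point about pLDDT acting as a disorder predictor can be illustrated with a minimal sketch. It assumes a model in PDB format with the per-residue pLDDT stored in the B-factor column, as in AlphaFold-produced PDB files. The cutoff of 50 is an assumption chosen for illustration, not a value taken from the article, and `format_atom` is a hypothetical helper that only builds synthetic records for the example. This is not the authors' method.

```python
def plddt_per_residue(pdb_text):
    """Map (chain, residue number) to pLDDT, read from the B-factor column of CA atoms."""
    scores = {}
    for line in pdb_text.splitlines():
        # PDB is a fixed-column format: atom name in columns 13-16,
        # chain ID in column 22, residue number in 23-26, B-factor in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            resseq = int(line[22:26])
            scores[(chain, resseq)] = float(line[60:66])
    return scores


def low_confidence_residues(scores, cutoff=50.0):
    """Return residues whose pLDDT is below the cutoff (assumed value, tune as needed)."""
    return sorted(key for key, value in scores.items() if value < cutoff)


def format_atom(serial, name, resname, chain, resseq, bfactor):
    """Hypothetical helper: build one synthetic PDB ATOM record at the origin."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, name, resname, chain, resseq, 0.0, 0.0, 0.0, 1.0, bfactor
    )


if __name__ == "__main__":
    # Synthetic three-residue model; the pLDDT values are made up for the demo.
    model = "\n".join([
        format_atom(1, "CA", "MET", "A", 1, 35.2),
        format_atom(2, "CA", "GLY", "A", 2, 91.5),
        format_atom(3, "CA", "SER", "A", 3, 48.0),
    ])
    print(low_confidence_residues(plddt_per_residue(model)))
```

Residues flagged this way are only candidates for disorder. As the abstract stresses, they should be checked against experimental data or complementary approaches such as MD simulations.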
Keyphrases
- molecular dynamics
- protein protein
- high resolution
- mass spectrometry
- binding protein
- deep learning
- amino acid
- small molecule
- magnetic resonance
- big data
- machine learning
- single molecule
- liquid chromatography
- convolutional neural network
- data analysis
- tandem mass spectrometry