AlphaFold2 fails to predict protein fold switching.

Devlina Chakravarty, Lauren L Porter
Published in: Protein Science: A Publication of the Protein Society (2022)
AlphaFold2 has revolutionized protein structure prediction by leveraging sequence information to rapidly model protein folds with atomic-level accuracy. Nevertheless, previous work has shown that these predictions tend to be inaccurate for structurally heterogeneous proteins. To systematically assess factors that contribute to this inaccuracy, we tested AlphaFold2's performance on 98 fold-switching proteins, which assume at least two distinct-yet-stable secondary and tertiary structures. Topological similarities were quantified between five predicted and two experimentally determined structures of each fold-switching protein. Overall, 94% of AlphaFold2 predictions captured one experimentally determined conformation but not the other. Despite these biased results, AlphaFold2's estimated confidences were moderate-to-high for 74% of fold-switching residues, a result that contrasts with overall low confidences for intrinsically disordered proteins, which are also structurally heterogeneous. To investigate factors contributing to this disparity, we quantified sequence variation within the multiple sequence alignments used to generate AlphaFold2's predictions of fold-switching and intrinsically disordered proteins. Unlike intrinsically disordered regions, whose sequence alignments show low conservation, fold-switching regions had conservation rates statistically similar to canonical single-fold proteins. Furthermore, intrinsically disordered regions had systematically lower prediction confidences than either fold-switching or single-fold proteins, regardless of sequence conservation. AlphaFold2's high prediction confidences for fold switchers indicate that it uses sophisticated pattern recognition to search for one most probable conformer rather than protein biophysics to model a protein's structural ensemble. Thus, it is not surprising that its predictions often fail for proteins whose properties are not fully apparent from solved protein structures.
Our results emphasize the need to look at protein structure as an ensemble and suggest that systematic examination of fold-switching sequences may reveal propensities for multiple stable secondary and tertiary structures.
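The abstract refers to quantifying sequence variation within multiple sequence alignments but does not state which measure was used. As an illustration only, the sketch below scores per-column conservation as one minus normalized Shannon entropy, which is one common choice and not necessarily the authors' method. The function name `column_conservation`, the gap character, and the 20-letter normalization are all assumptions made for this example.

```python
import math
from collections import Counter


def column_conservation(msa, gap="-"):
    """Return one conservation score in [0, 1] per alignment column.

    Illustrative assumption: conservation = 1 - H / log2(20), where H is the
    Shannon entropy of the non-gap residues in the column. This is not
    claimed to be the measure used in the paper.
    """
    if not msa:
        return []
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")

    max_entropy = math.log2(20)  # 20 standard amino acids
    scores = []
    for col in range(length):
        # Ignore gaps when counting residues in this column.
        residues = [seq[col] for seq in msa if seq[col] != gap]
        if not residues:
            scores.append(0.0)  # all-gap column: treat as unconserved
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        # Clamp in case non-standard letters push entropy above log2(20).
        scores.append(max(0.0, 1.0 - entropy / max_entropy))
    return scores


if __name__ == "__main__":
    toy_msa = ["ACD", "ACE", "AC-"]  # hypothetical toy alignment
    print(column_conservation(toy_msa))
```

Averaging these scores over the residues of a fold-switching or disordered region would give a single per-region value that could be compared across protein classes, under the same assumptions.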
Keyphrases
  • amino acid
  • protein protein
  • binding protein
  • high resolution
  • healthcare
  • magnetic resonance
  • gene expression
  • genome wide
  • dna methylation
  • deep learning
  • single cell
  • genetic diversity