
SAMPL7 protein-ligand challenge: A community-wide evaluation of computational methods against fragment screening and pose-prediction.

Harold Grosjean, Mehtap Işık, Anthony Aimon, David L Mobley, John D Chodera, Frank von Delft, Philip Charles Biggin
Published in: Journal of computer-aided molecular design (2022)
A novel crystallographic fragment screening data set was generated and used in the SAMPL7 protein-ligand challenge. The SAMPL challenges prospectively assess the predictive power of methods involved in computer-aided drug design. The application of various computational methods to fragment molecules is now widely used in the search for new drugs. However, there is little in the way of systematic validation specifically for fragment-based approaches. We have performed a large crystallographic high-throughput fragment screen against the therapeutically relevant second bromodomain of the Pleckstrin-homology domain interacting protein (PHIP2) that revealed 52 different fragments bound across 4 distinct sites, 47 of which were bound to the pharmacologically relevant acetylated lysine (Kac) binding site. These data were used to assess computational screening, binding pose prediction and follow-up enumeration. For screening, all submissions performed no better than random. Pose prediction success rates (defined as less than 2 Å root-mean-square deviation against heavy-atom crystal positions) ranged between 0 and 25%, and very few follow-up compounds were deemed viable candidates from a medicinal-chemistry perspective based on an analysis of common molecular descriptors. The tight deadlines imposed during the challenge led to a small number of submissions, suggesting that the accuracy of rapidly responsive workflows remains limited. In addition, the application of these methods to reproduce crystallographic fragment data still appears to be very challenging. The results show that there is room for improvement in the development of computational tools, particularly when applied to fragment-based drug design.
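The pose-prediction criterion in the abstract (heavy-atom RMSD below 2 Å against the crystal pose) can be illustrated with a minimal Python sketch. It assumes that predicted and crystallographic heavy atoms are already matched by index and share the same coordinate frame, so no alignment or symmetry correction is applied; the abstract does not say how the challenge handled either. The function names `heavy_atom_rmsd` and `success_rate` and the toy coordinates are hypothetical and are not taken from the challenge data.

```python
import math


def heavy_atom_rmsd(predicted, reference):
    """RMSD in angstroms between two index-matched lists of (x, y, z) heavy-atom coordinates.

    Assumption: both poses are in the same frame, so no superposition is done.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    squared_sum = sum(
        (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
        for (px, py, pz), (rx, ry, rz) in zip(predicted, reference)
    )
    return math.sqrt(squared_sum / len(reference))


def success_rate(rmsds, cutoff=2.0):
    """Fraction of poses with RMSD strictly below the cutoff (2 A in the abstract)."""
    if not rmsds:
        raise ValueError("at least one RMSD value is required")
    return sum(1 for value in rmsds if value < cutoff) / len(rmsds)


# Toy three-atom "fragment" (illustrative values only).
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
close_pose = [(x + 0.5, y, z) for x, y, z in reference]  # shifted 0.5 A along x
far_pose = [(x, y, z + 3.0) for x, y, z in reference]    # shifted 3.0 A along z

rmsds = [heavy_atom_rmsd(pose, reference) for pose in (close_pose, far_pose)]
print(rmsds, success_rate(rmsds))
```

In this toy case the first pose counts as a success and the second does not, giving a success rate of one in two. Real fragments with symmetric groups would need a symmetry-aware atom matching before the RMSD is computed.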
Keyphrases
  • high throughput
  • electronic health record
  • big data
  • binding protein
  • protein protein
  • healthcare
  • single cell
  • drug delivery
  • transcription factor
  • data analysis
  • dna binding