
Elucidating polymorphs of crystal structures by intensity-based hierarchical clustering analysis of multiple diffraction data sets.

Hiroaki Matsuura, Naoki Sakai, Sachiko Toma-Fukai, Norifumi Muraki, Koki Hayama, Hironari Kamikubo, Shigetoshi Aono, Yoshiaki Kawano, Masaki Yamamoto, Kunio Hirata
Published in: Acta crystallographica. Section D, Structural biology (2023)
In macromolecular structure determination using X-ray diffraction from multiple crystals, the presence of different structures (structural polymorphs) necessitates the classification of the diffraction data for appropriate structural analysis. Hierarchical clustering analysis (HCA) is a promising technique that has so far been used to extract isomorphous data, mainly for single-structure determination. Although in principle the use of HCA can be extended to detect polymorphs, the absence of a reference to define the threshold used to group the isomorphous data sets (the 'isomorphic threshold') poses a challenge. Here, unit-cell-based and intensity-based HCAs have been applied to data sets for apo trypsin and inhibitor-bound trypsin that were mixed post data acquisition to investigate the efficacy of HCA in classifying polymorphous data sets. Single-step intensity-based HCA successfully classified polymorphs with a certain 'isomorphic threshold'. In data sets for several samples containing an unknown degree of structural heterogeneity, polymorphs could be identified by intensity-based HCA using the suggested 'isomorphic threshold'. Polymorphs were also detected in single crystals from data collected with the continuous helical scheme. These findings are expected to facilitate the determination of multiple structural snapshots by exploiting automated data collection and analysis.
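
As a rough illustration of the idea of intensity-based HCA with a cut-off, the following Python sketch clusters data sets by the pairwise correlation of their reflection intensities and cuts the dendrogram at a fixed height. It is not the authors' implementation: the function name `intensity_hca`, the distance d = sqrt(1 - CC^2), the Ward linkage, the synthetic data and the threshold value of 0.5 are all assumptions chosen for the example, and the abstract does not state the actual 'isomorphic threshold'.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def intensity_hca(intensities, threshold):
    """Cluster data sets by pairwise intensity correlation (illustrative only).

    intensities: array of shape (n_datasets, n_reflections) on a common
                 reflection index; NaN marks an unmeasured reflection.
    threshold:   hypothetical dendrogram cut height (an assumed value,
                 not the threshold reported in the paper).
    """
    n = intensities.shape[0]
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            # Use only reflections measured in both data sets.
            common = ~np.isnan(intensities[i]) & ~np.isnan(intensities[j])
            cc = np.corrcoef(intensities[i, common], intensities[j, common])[0, 1]
            # Assumed distance: small when intensities correlate strongly.
            dist[i, j] = dist[j, i] = np.sqrt(max(0.0, 1.0 - cc**2))
    # Ward linkage is an assumption; it is strictly defined for Euclidean
    # distances, so this is a heuristic use on a correlation-based distance.
    tree = linkage(squareform(dist, checks=False), method="ward")
    labels = fcluster(tree, t=threshold, criterion="distance")
    return tree, labels


# Synthetic stand-in for two polymorphs: five noisy copies of each of two
# unrelated intensity sets (no real diffraction data is used here).
rng = np.random.default_rng(0)
form_a = rng.exponential(100.0, 500)
form_b = rng.exponential(100.0, 500)
data = np.vstack(
    [form_a + rng.normal(0.0, 5.0, 500) for _ in range(5)]
    + [form_b + rng.normal(0.0, 5.0, 500) for _ in range(5)]
)
tree, labels = intensity_hca(data, threshold=0.5)
print(labels)
```

In this toy case the two groups are far apart, so almost any cut height separates them. With real data the choice of threshold is the difficult part, which is the problem the abstract addresses.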
Keyphrases
  • electronic health record
  • big data
  • machine learning
  • single cell
  • data analysis
  • oxidative stress
  • stem cells
  • magnetic resonance imaging
  • mass spectrometry
  • contrast enhanced
  • dual energy