Comment on the Optimal Parameters to Derive Intrinsically Disordered Protein Conformational Ensembles from Small-Angle X-ray Scattering Data Using the Ensemble Optimization Method.
Amin Sagar, Cy M Jeffries, Maxim V Petoukhov, Dmitri I Svergun, Pau Bernadó
Published in: Journal of Chemical Theory and Computation (2021)
The Ensemble Optimization Method (EOM) is a popular approach to describe small-angle X-ray scattering (SAXS) data from highly disordered proteins. The EOM algorithm selects subensembles of coexisting states from large pools of randomized conformations to fit the SAXS data. Based on the unphysical bimodal radius of gyration (Rg) distribution of conformations resulting from the EOM analysis, a recent article (Fagerberg et al. J. Chem. Theory Comput. 2019, 15 (12), 6968-6983) concluded that this approach inadequately described the SAXS data measured for human Histatin 5 (Hst5), a peptide with antifungal properties. Using extensive experimental and synthetic data, we explored the origin of this observation. We found that the one-bead-per-residue coarse-grained representation with averaged scattering form factors (provided in the EOM as an add-on to represent disordered missing loops or domains) may not be appropriate for EOM analyses of scattering data from short proteins or peptides (fewer than 50 residues). The method of choice for these proteins is to employ atomistic models (e.g., from molecular dynamics simulations) to sample the protein conformational landscape. As a convenient alternative, we have also improved the coarse-grained approach by introducing amino-acid-specific form factors in the calculations. We also found that, for small proteins, the search for relatively large subensembles of 20-50 conformers (as implemented in the original EOM version) describes the conformational space sampled in solution more adequately than the procedures that optimize the ensemble size. Our observations have been added as recommendations to the information for EOM users to promote the proper use of the program for ensemble-based modeling of SAXS data for all types of disordered systems.
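The selection step described in the abstract (picking a subensemble from a large pool so that its averaged scattering curve fits the data) can be illustrated with a minimal sketch. This is not the EOM implementation: EOM uses a genetic algorithm, whereas the sketch below swaps in a simple random-replacement search to keep the example short. All function names are hypothetical, the Guinier approximation I(q) = exp(-q²Rg²/3) is used only as a toy stand-in for properly computed conformer scattering curves, allowing repeated conformers is an assumption of the sketch, and the reduced chi-square convention (division by the number of points minus one) is likewise an assumption.

```python
import math
import random


def guinier_curve(rg, q_values):
    """Toy scattering curve from the Guinier approximation (I0 = 1).
    A stand-in for curves computed from real conformers."""
    return [math.exp(-((q * rg) ** 2) / 3.0) for q in q_values]


def ensemble_average(pool, members):
    """Average the curves of the selected pool members, point by point."""
    n_q = len(pool[0])
    return [sum(pool[m][j] for m in members) / len(members) for j in range(n_q)]


def chi2(avg, i_exp, sigma):
    """Reduced chi-square between an averaged curve and the data,
    after applying the least-squares optimal scale factor."""
    num = sum(a * e / s ** 2 for a, e, s in zip(avg, i_exp, sigma))
    den = sum(a * a / s ** 2 for a, s in zip(avg, sigma))
    c = num / den  # scale factor minimizing the weighted residuals
    resid = sum(((e - c * a) / s) ** 2 for a, e, s in zip(avg, i_exp, sigma))
    return resid / (len(i_exp) - 1)


def select_subensemble(pool, i_exp, sigma, size=20, n_steps=2000, seed=0):
    """Random-replacement search (NOT EOM's genetic algorithm):
    swap one member at a time and keep the swap if the fit does not worsen.
    Repeated conformers are allowed in this sketch."""
    rng = random.Random(seed)
    members = [rng.randrange(len(pool)) for _ in range(size)]
    best = chi2(ensemble_average(pool, members), i_exp, sigma)
    for _ in range(n_steps):
        trial = list(members)
        trial[rng.randrange(size)] = rng.randrange(len(pool))
        score = chi2(ensemble_average(pool, trial), i_exp, sigma)
        if score <= best:
            members, best = trial, score
    return members, best
```

With a fixed `size` of 20 to 50, the sketch mirrors the fixed-size search that the abstract recommends for small proteins; the Rg distribution of the selected members could then be compared with that of the whole pool.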
Keyphrases
- molecular dynamics simulations
- molecular dynamics
- electronic health record
- amino acid
- big data
- molecular docking
- high resolution
- healthcare
- neural network
- open label
- clinical trial
- deep learning
- magnetic resonance imaging
- endothelial cells
- density functional theory
- decision making
- single cell
- Candida albicans
- protein protein