On the relationship between cumulative correlation coefficients and the quality of crystallographic data sets.

Jimin Wang, Gary W Brudvig, Victor S Batista, Peter B Moore
Published in: Protein Science: a publication of the Protein Society (2017)
In 2012, Karplus and Diederichs demonstrated that the Pearson correlation coefficient CC1/2 is a far better indicator of the quality and resolution of crystallographic data sets than more traditional measures like merging R-factor or signal-to-noise ratio. More specifically, they proposed that CC1/2 be computed for data sets in thin shells of increasing resolution so that the resolution dependence of that quantity can be examined. Recently, however, the CC1/2 values of entire data sets, i.e., cumulative correlation coefficients, have been used as a measure of data quality. Here, we show that the difference in cumulative CC1/2 value between a data set that has been accurately measured and a data set that has not is likely to be small. Furthermore, structures obtained by molecular replacement from poorly measured data sets are likely to suffer from extreme model bias.
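The distinction between shell-wise and cumulative CC1/2 can be illustrated with a small simulation. The sketch below is not the authors' procedure: it is a toy model that assumes exponentially distributed true intensities with a Wilson-like falloff (a B-factor of 40 Å², chosen arbitrarily), reflections spread uniformly in 1/d² out to 2 Å, and Gaussian measurement errors of fixed size. Each half-data-set value is modeled as a single noisy estimate rather than as the mean of a random half of the observations. All function names and parameters are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_half_sets(sigma, n=5000, b_factor=40.0, s2_max=0.25, seed=0):
    """Return (1/d^2, half-set 1, half-set 2) for n synthetic reflections.

    Assumptions (illustrative only): uniform sampling in 1/d^2, mean
    intensity 100 * exp(-B / (2 d^2)), exponential intensity distribution,
    and Gaussian errors with standard deviation sigma in each half set.
    """
    rng = random.Random(seed)
    s2, half1, half2 = [], [], []
    for _ in range(n):
        x = rng.uniform(0.0, s2_max)
        mean_i = 100.0 * math.exp(-b_factor * x / 2.0)
        true_i = rng.expovariate(1.0 / mean_i)
        s2.append(x)
        half1.append(true_i + rng.gauss(0.0, sigma))
        half2.append(true_i + rng.gauss(0.0, sigma))
    return s2, half1, half2


def cc_half_by_shell(s2, half1, half2, n_shells=10):
    """CC1/2 in equal-count shells ordered from low to high resolution."""
    order = sorted(range(len(s2)), key=lambda i: s2[i])
    size = len(order) // n_shells
    values = []
    for k in range(n_shells):
        # The last shell absorbs any reflections left over by the division.
        stop = (k + 1) * size if k < n_shells - 1 else len(order)
        idx = order[k * size:stop]
        values.append(pearson([half1[i] for i in idx],
                              [half2[i] for i in idx]))
    return values


def cc_half_cumulative(half1, half2):
    """CC1/2 over the entire data set, ignoring resolution."""
    return pearson(half1, half2)


if __name__ == "__main__":
    for label, sigma in (("well measured", 1.0), ("poorly measured", 5.0)):
        s2, h1, h2 = simulate_half_sets(sigma)
        shells = cc_half_by_shell(s2, h1, h2)
        print(label, "cumulative CC1/2 = %.3f" % cc_half_cumulative(h1, h2))
        print("  per shell:", " ".join("%.2f" % v for v in shells))
```

Under these assumptions the cumulative values for the two noise levels should lie close together, because the strong low-resolution reflections dominate the overall variance, while the outer-shell values should differ markedly. That is the behavior the abstract describes, although the numbers produced by this toy model say nothing about any real data set.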