The Impact of Diagnostic Code Misclassification on Optimizing the Experimental Design of Genetic Association Studies.

Steven J Schrodi
Published in: Journal of Healthcare Engineering (2017)
Diagnostic codes within electronic health record systems can vary widely in accuracy. It has been noted that the accuracy of disease phenotype classification increases monotonically with the number of instances of a particular diagnostic code recorded for an individual. As a growing number of health system databases become linked with genomic data, it is critically important to understand the effect of diagnostic code misclassification on the power of genetic association studies. Here, I investigate that effect, with the aim of better informing experimental designs that use health informatics data. Requiring more instances of a diagnostic code per individual creates a trade-off: (i) misclassification rates fall, but (ii) fewer individuals qualify, so the sample size shrinks. This trade-off is explored, and general rules are presented to improve experimental designs.
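The trade-off described above can be illustrated with a minimal sketch. It is not the model used in the paper. It assumes a simple allelic case-control test, perfectly classified controls, and that individuals wrongly labeled as cases carry the control allele frequency. The function name `allelic_test_power`, the allele frequencies, and the `designs` table (case counts and positive predictive values at each code-instance threshold) are all hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def allelic_test_power(n_cases, n_controls, p_case, p_control, ppv, alpha=5e-8):
    """Approximate power of a two-sided two-proportion z-test on allele counts.

    Assumption: a fraction `ppv` of labeled cases are true cases; the rest
    are unaffected and carry the control allele frequency.
    """
    # Allele frequency observed among labeled cases is a mixture
    p_obs = ppv * p_case + (1.0 - ppv) * p_control
    # Each individual contributes two alleles
    alleles_cases, alleles_controls = 2 * n_cases, 2 * n_controls
    se = sqrt(p_obs * (1 - p_obs) / alleles_cases
              + p_control * (1 - p_control) / alleles_controls)
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    shift = abs(p_obs - p_control) / se
    # Probability that the test statistic falls in either rejection tail
    return (1 - nd.cdf(z_crit - shift)) + nd.cdf(-z_crit - shift)


# Hypothetical designs: minimum code instances -> (labeled cases, assumed PPV)
designs = {1: (5000, 0.60), 2: (3500, 0.80), 3: (2000, 0.90), 4: (1000, 0.95)}

for min_instances, (n_cases, ppv) in designs.items():
    power = allelic_test_power(n_cases, n_controls=5000,
                               p_case=0.25, p_control=0.20, ppv=ppv)
    print(f"min instances={min_instances} cases={n_cases} "
          f"ppv={ppv:.2f} power={power:.3f}")
```

Under these assumptions, the threshold that maximizes power depends on how quickly the assumed positive predictive value rises relative to how quickly the number of qualifying cases falls, which is the balance the abstract describes.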
Keyphrases
  • electronic health record
  • copy number
  • genome wide
  • big data
  • healthcare
  • public health
  • clinical decision support
  • deep learning
  • health information
  • adverse drug
  • risk assessment
  • human health
  • finite element analysis