A tool for translating polygenic scores onto the absolute scale using summary statistics.

Oliver Pain, Alexandra C Gillett, Jehannine C Austin, Lasse Folkersen, Cathryn M Lewis
Published in: European journal of human genetics : EJHG (2022)
There is growing interest in the clinical application of polygenic scores as their predictive utility increases for a range of health-related phenotypes. However, providing polygenic score predictions on the absolute scale is an important step for their safe interpretation. We have developed a method to convert polygenic scores to the absolute scale for binary and normally distributed phenotypes. This method uses summary statistics, requiring only the area under the ROC curve (AUC) or variance explained (R²) by the polygenic score, and the prevalence of binary phenotypes, or the mean and standard deviation of normally distributed phenotypes. Polygenic scores are converted using normal distribution theory. We also evaluate methods for estimating polygenic score AUC/R² from genome-wide association study (GWAS) summary statistics alone. We validate the absolute risk conversion and AUC/R² estimation using data for eight binary and three continuous phenotypes in the UK Biobank sample. When the AUC/R² of the polygenic score is known, the observed and estimated absolute values are highly concordant. Estimates of AUC/R² from the lassosum pseudovalidation method were most similar to the observed AUC/R² values, though estimated values deviated substantially from the observed values for autoimmune disorders. This study enables accurate interpretation of polygenic scores using only summary statistics, providing a useful tool for educational and clinical purposes. Furthermore, we have created interactive webtools implementing the conversion to the absolute scale ( https://opain.github.io/GenoPred/PRS_to_Abs_tool.html ). Several further barriers must be addressed before clinical implementation of polygenic scores, such as ensuring target individuals are well represented by the GWAS sample.
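To illustrate the kind of conversion the abstract describes, the following is a minimal Python sketch based on textbook normal distribution theory. It is not the authors' implementation, and it may differ in detail from the published method and webtool. It assumes the polygenic score is standardised to mean 0 and SD 1 in the population. For a normally distributed phenotype it assumes the score and phenotype are bivariate normal. For a binary phenotype it assumes scores are normal with equal variance in cases and controls, so that the AUC maps to a standardised mean difference d = sqrt(2) * Φ⁻¹(AUC), and then applies Bayes' theorem. All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and SD 1


def continuous_prediction(z, r2, mean, sd):
    """Expected phenotype and residual SD for a score z (population z-score).

    Assumes score and phenotype are bivariate normal with squared
    correlation r2 (an assumption of this sketch).
    """
    expected = mean + sd * sqrt(r2) * z
    residual_sd = sd * sqrt(1.0 - r2)
    return expected, residual_sd


def auc_to_cohens_d(auc):
    """Case-control mean difference implied by the AUC, in units of the
    within-group SD, under the equal-variance normal assumption."""
    return sqrt(2.0) * STD_NORMAL.inv_cdf(auc)


def binary_risk(z, auc, prevalence):
    """Probability of being a case for a score z (population z-score).

    Controls are modelled as N(0, 1) and cases as N(d, 1); the population
    is a mixture of the two weighted by the prevalence.
    """
    d = auc_to_cohens_d(auc)
    k = prevalence
    # Convert the population z-score to the within-group scale.
    mix_mean = k * d
    mix_sd = sqrt(1.0 + k * (1.0 - k) * d * d)
    x = mix_mean + z * mix_sd
    # Bayes' theorem with the two normal densities.
    case_density = k * NormalDist(d, 1.0).pdf(x)
    control_density = (1.0 - k) * STD_NORMAL.pdf(x)
    return case_density / (case_density + control_density)


if __name__ == "__main__":
    # Illustrative inputs only; these are not values from the study.
    print(continuous_prediction(z=1.0, r2=0.25, mean=170.0, sd=10.0))
    print(binary_risk(z=2.0, auc=0.65, prevalence=0.05))
```

Under these assumptions, a score with AUC 0.5 carries no information and the function returns the prevalence for every individual, while for AUC above 0.5 the estimated risk rises monotonically with the score. Reporting the residual SD alongside the expected value for continuous phenotypes makes the remaining uncertainty explicit.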
Keyphrases
  • healthcare
  • genome wide association study
  • multiple sclerosis
  • machine learning
  • quality improvement
  • risk factors
  • high resolution
  • drug induced