Predictive modeling of schizophrenia from genomic data: Comparison of polygenic risk score with kernel support vector machines approach.

Timothy Vivian-Griffiths, Emily Baker, Karl M Schmidt, Matthew Bracher-Smith, James Walters, Andreas Artemiou, Peter Holmans, Michael C O'Donovan, Michael J Owen, Andrew Pocklington, Valentina Escott-Price
Published in: American journal of medical genetics. Part B, Neuropsychiatric genetics: the official publication of the International Society of Psychiatric Genetics (2018)
A major controversy in psychiatric genetics is whether nonadditive genetic interaction effects contribute to the risk of highly polygenic disorders. We applied a support vector machines (SVMs) approach, which can build both linear and nonlinear models using kernel methods, to distinguish cases from controls in a large schizophrenia case-control sample of 11,853 subjects (5,554 cases and 6,299 controls), and compared its prediction accuracy with that of the polygenic risk score (PRS) approach. We also investigated whether SVMs are a suitable approach to detecting nonlinear genetic effects, that is, interactions. We found that PRS provided more accurate case/control classification than either linear or nonlinear SVMs, and we offer a tentative explanation of why PRS outperforms both multivariate regression and linear kernel SVMs. In addition, we observed that nonlinear kernel SVMs showed higher classification accuracy than linear SVMs when a large number of SNPs were entered into the model. We conclude that SVMs are a potential tool for assessing the presence of interactions, prior to searching for them explicitly.
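
As a rough illustration of the PRS side of this comparison, the sketch below computes a score as a weighted sum of risk-allele counts and measures how well it separates cases from controls. It follows the general textbook form of a polygenic score and is not the authors' pipeline: the function names, the per-SNP weights, the genotypes, and the labels are all hypothetical toy values, and the area under the ROC curve is used here only as one convenient measure of classification performance.

```python
from typing import Sequence


def polygenic_risk_score(genotypes: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted sum of risk-allele counts (0, 1 or 2 per SNP).

    `weights` stands in for per-SNP effect sizes; the values used
    below are invented for illustration.
    """
    if len(genotypes) != len(weights):
        raise ValueError("one weight per SNP is required")
    return sum(w * g for w, g in zip(weights, genotypes))


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of case/control pairs in which the case scores higher.

    Ties count as half a win. Labels: 1 = case, 0 = control.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: 4 subjects, 3 SNPs.
weights = [0.10, -0.05, 0.20]
genotypes = [[2, 0, 1], [1, 1, 2], [0, 2, 0], [0, 1, 1]]
labels = [1, 1, 0, 0]

scores = [polygenic_risk_score(g, weights) for g in genotypes]
print(scores, auc(scores, labels))
```

Because the score is a purely additive function of the genotypes, it cannot capture interactions between SNPs, which is why a nonlinear kernel SVM is a natural point of comparison.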
Keyphrases
  • case control
  • bipolar disorder
  • genome wide
  • machine learning
  • deep learning
  • copy number
  • mental health
  • electronic health record
  • neural network
  • mass spectrometry
  • climate change
  • big data