Structure-based Method for Predicting Deleterious Missense SNPs.

Boshen Wang, Wei Tian, Xue Lei, Alan Perez-Rathke, Yan Yuan Tseng, Jie Liang
Published in: ... IEEE-EMBS International Conference on Biomedical and Health Informatics (2019)
Missense SNPs are key factors contributing to many Mendelian disorders and complex diseases. Identifying whether a single amino acid substitution will lead to pathological effects is important for interpreting personal genomes and for precision medicine. In this study, we describe a novel method for predicting whether a missense SNP is likely to bring about pathological effects. Our approach integrates sequence information, biophysical properties, and topological properties of protein structures. On our test dataset, consisting of 500 deleterious variants and 500 neutral variants, our method achieves an accuracy of 0.823. The ROC curve of the model has an AUC of 0.910. Our method outperforms two well-known methods and is comparable with the widely used PolyPhen-2 method, while requiring a much smaller amount (approximately 25%) of training data. Our method can be used to aid in distinguishing driver and passenger mutations in cancer and in assessing missense mutations associated with rare diseases. It can also be used to identify mutations in rare diseases where only limited patient exome data exist.
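For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how accuracy and ROC AUC can be computed on a balanced set of 500 deleterious and 500 neutral variants. It is not the authors' code: the function names, the 0.5 decision threshold, and the randomly generated scores are all assumptions made for illustration, and the printed numbers bear no relation to the results in the abstract.

```python
import random


def accuracy(labels, scores, threshold=0.5):
    """Fraction of variants whose thresholded score matches the label.

    labels: 1 = deleterious, 0 = neutral. The threshold is an assumption.
    """
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


def roc_auc(labels, scores):
    """ROC AUC as the probability that a randomly chosen deleterious variant
    scores higher than a randomly chosen neutral one (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Synthetic stand-in for a classifier's output (hypothetical data only).
rng = random.Random(0)
labels = [1] * 500 + [0] * 500
scores = [min(1.0, max(0.0, rng.gauss(0.7 if y else 0.3, 0.2))) for y in labels]

print("accuracy:", round(accuracy(labels, scores), 3))
print("ROC AUC:", round(roc_auc(labels, scores), 3))
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two figures are usually reported together.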
Keyphrases
  • amino acid
  • intellectual disability
  • genome wide
  • copy number
  • electronic health record
  • big data
  • gene expression
  • machine learning
  • high resolution
  • binding protein