Pathogenicity and functional impact of non-frameshifting insertion/deletion variation in the human genome.

Kymberleigh A Pagel, Danny Antaki, AoJie Lian, Matthew Mort, David N Cooper, Jonathan Sebat, Lilia M Iakoucheva, Sean D Mooney, Predrag Radivojac
Published in: PLoS Computational Biology (2019)
Differentiation between phenotypically neutral and disease-causing genetic variation remains an open and relevant problem. Among different types of variation, non-frameshifting insertions and deletions (indels) represent an understudied group with widespread phenotypic consequences. To address this challenge, we present a machine learning method, MutPred-Indel, that predicts pathogenicity and identifies types of functional residues impacted by non-frameshifting insertion/deletion variation. The model shows good predictive performance as well as the ability to identify impacted structural and functional residues including secondary structure, intrinsic disorder, metal and macromolecular binding, post-translational modifications, allosteric sites, and catalytic residues. We identify structural and functional mechanisms impacted preferentially by germline variation from the Human Gene Mutation Database, recurrent somatic variation from COSMIC in the context of different cancers, as well as de novo variants from families with autism spectrum disorder. Further, the distributions of pathogenicity prediction scores generated by MutPred-Indel are shown to differentiate highly recurrent from non-recurrent somatic variation. Collectively, we present a framework to facilitate the interrogation of both pathogenicity and the functional effects of non-frameshifting insertion/deletion variants. The MutPred-Indel webserver is available at http://mutpred.mutdb.org/.
Keyphrases
  • machine learning
  • endothelial cells
  • copy number
  • genome wide
  • small molecule
  • emergency department
  • gene expression
  • induced pluripotent stem cells
  • staphylococcus aureus
  • dna methylation
  • young adults