Selecting variants of unknown significance through network-based gene-association significantly improves risk prediction for disease-control cohorts.
Anastasis Oulas, George Minadakis, Margarita Zachariou, George M Spyrou
Published in: Scientific Reports (2019)
Variants of unknown/uncertain significance (VUS) pose a major dilemma for current genetic variation screening methods and for genetic counselling. Driven by next generation sequencing (NGS) methods such as whole exome sequencing (WES), a plethora of VUS are being detected in research laboratories as well as in the health sector. Motivated by this overabundance of VUS, we propose a novel computational methodology, termed VariantClassifier (VarClass), which utilizes gene-association networks and polygenic risk prediction models to shed light on this grey area of genetic variation in association with disease. VarClass has been evaluated through numerous validation steps and proves very successful in assigning significance to VUS in association with specific diseases of interest. Notably, using VUS deemed significant by VarClass, we improved risk prediction accuracy in four large case studies involving disease-control cohorts from GWAS as well as WES, compared to traditional odds ratio analysis. Biological interpretation of selected high-scoring VUS revealed interesting biological themes relevant to the diseases under investigation. VarClass is available as a standalone tool for large-scale data analyses, as well as a web server with additional functionalities through a user-friendly graphical interface.
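
For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of a per-variant odds ratio computed from a 2x2 table of allele counts in a disease-control cohort. It illustrates the traditional odds ratio analysis only in its generic textbook form; it is not the VarClass method, and the function name `odds_ratio`, its parameters, and the 0.5 continuity correction for empty cells are assumptions made for illustration rather than details taken from the paper.

```python
import math

def odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref, correction=0.5):
    """Odds ratio and approximate 95% CI from a 2x2 allele-count table.

    Hypothetical helper for illustration; not part of VarClass.
    """
    cells = [case_alt, case_ref, ctrl_alt, ctrl_ref]
    # Assumption: add a continuity correction to every cell when any
    # cell is zero, so the ratio and its log stay finite.
    if any(c == 0 for c in cells):
        cells = [c + correction for c in cells]
    a, b, c, d = cells
    ratio = (a * d) / (b * c)
    # Standard error of log(OR) (Woolf's method).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ratio) - 1.96 * se)
    upper = math.exp(math.log(ratio) + 1.96 * se)
    return ratio, lower, upper

# Made-up counts: alternate allele seen 30 times in cases, 10 in controls.
ratio, lower, upper = odds_ratio(30, 70, 10, 90)
print(f"OR = {ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An odds ratio above 1 whose interval excludes 1 suggests the allele is enriched in cases. A single-variant test of this kind considers each variant in isolation, which is the limitation that network-based selection of VUS is intended to address.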