An exploratory phenome wide association study linking asthma and liver disease genetic variants to electronic health records from the Estonian Biobank.
Glen James, Sulev Reisberg, Kaido Lepik, Nicholas Galwey, Paul Avillach, Liis Kolberg, Reedik Mägi, Tõnu Esko, Myriam Alexander, Dawn Waterworth, A Katrina Loomis, Jaak Vilo
Published in: PloS one (2019)
The Estonian Biobank (Biobank), governed by the Institute of Genomics at the University of Tartu, has stored genetic material (DNA) and continuously collected data since 2002 on a total of 52,274 individuals, representing ~5% of the Estonian adult population, and the cohort is still growing. To explore the utility of the data available in the Biobank, we conducted a phenome-wide association study (PheWAS) in two areas of interest to healthcare researchers: asthma and liver disease. We used 11 asthma-associated and 13 liver disease-associated single nucleotide polymorphisms (SNPs), identified from published genome-wide association studies, to test our ability to detect established associations. We confirmed 2 asthma-associated and 5 liver disease-associated variants at nominal significance, with effect directions consistent with published results. We found 2 associations in the opposite direction to previously published results (rs4374383:AA increases risk of NASH/NAFLD; rs11597086 increases ALT level). Three SNP-diagnosis pairs passed the phenome-wide significance threshold: rs9273349 and E06 (thyroiditis, p = 5.50 × 10⁻⁸); rs9273349 and E10 (type 1 diabetes, p = 2.60 × 10⁻⁷); and rs2281135 and K76 (non-alcoholic liver diseases, including NAFLD, p = 4.10 × 10⁻⁷). We have validated our approach and confirmed the quality of the data for these conditions. Importantly, we demonstrate that the extensive genetic and medical information in the Estonian Biobank can be successfully used for scientific research.
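The idea of a phenome-wide significance threshold can be illustrated with a minimal sketch. The abstract does not state how the threshold was derived, so the Bonferroni correction and the number of diagnosis codes tested (`n_phenotypes`) below are assumptions made purely for illustration; only the SNP counts and the three p-values come from the text above.

```python
# Illustrative sketch only: a Bonferroni-style phenome-wide threshold.
# The abstract does not say which correction was used; Bonferroni is an assumption.

def bonferroni_threshold(n_snps, n_phenotypes, alpha=0.05):
    """Return the per-test significance threshold for n_snps * n_phenotypes tests."""
    return alpha / (n_snps * n_phenotypes)

# SNP-diagnosis pairs and p-values as reported in the abstract.
reported = {
    ("rs9273349", "E06"): 5.50e-8,  # thyroiditis
    ("rs9273349", "E10"): 2.60e-7,  # type 1 diabetes
    ("rs2281135", "K76"): 4.10e-7,  # non-alcoholic liver diseases, incl. NAFLD
}

n_snps = 11 + 13      # asthma + liver disease SNPs, from the abstract
n_phenotypes = 1000   # HYPOTHETICAL number of diagnosis codes tested

threshold = bonferroni_threshold(n_snps, n_phenotypes)

# Keep the pairs whose p-value falls below the corrected threshold.
significant = [pair for pair, p in reported.items() if p < threshold]

print(f"threshold = {threshold:.2e}")
for snp, code in significant:
    print(snp, code, reported[(snp, code)])
```

Under these assumed numbers, all three reported pairs fall below the corrected threshold; a different number of tested codes or a different correction method would shift the cut-off.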
Keyphrases
- electronic health record
- chronic obstructive pulmonary disease
- healthcare
- lung function
- genome wide
- type diabetes
- allergic rhinitis
- genome wide association
- copy number
- clinical decision support
- big data
- cardiovascular disease
- air pollution
- machine learning
- quality improvement
- health information
- systematic review
- circulating tumor
- dna methylation
- gene expression
- artificial intelligence
- ionic liquid
- cell free