Haplotype function score improves biological interpretation and cross-ancestry polygenic prediction of human complex traits.
Weicheng SongYongyong ShiGuan Ning LinPublished in: eLife (2024)
We propose a new framework for human genetic association studies: at each locus, a deep learning model (in this study, Sei) is used to calculate the functional genomic activity score for two haplotypes per individual. This score, defined as the Haplotype Function Score (HFS), replaces the original genotype in association studies. Applying the HFS framework to 14 complex traits in the UK Biobank, we identified 3619 independent HFS-trait associations with a significance of p < 5 × 10 -8 . Fine-mapping revealed 2699 causal associations, corresponding to a median increase of 63 causal findings per trait compared with single-nucleotide polymorphism (SNP)-based analysis. HFS-based enrichment analysis uncovered 727 pathway-trait associations and 153 tissue-trait associations with strong biological interpretability, including 'circadian pathway-chronotype' and 'arachidonic acid-intelligence'. Lastly, we applied least absolute shrinkage and selection operator (LASSO) regression to integrate HFS prediction score with SNP-based polygenic risk scores, which showed an improvement of 16.1-39.8% in cross-ancestry polygenic prediction. We concluded that HFS is a promising strategy for understanding the genetic basis of human complex traits.