Unsupervised ensemble-based phenotyping enhances discoverability of genes related to left-ventricular morphology.
Rodrigo Bonazzola, Enzo Ferrante, Nishant Ravikumar, Yan Xia, Bernard D Keavney, Sven Plein, Tanveer Syeda-Mahmood, Alejandro F Frangi
Published in: Nature Machine Intelligence (2024)
Recent genome-wide association studies have successfully identified associations between genetic variants and simple cardiac morphological parameters derived from cardiac magnetic resonance images. However, the emergence of large databases that link genetic data to cardiac magnetic resonance images facilitates the investigation of more nuanced patterns of cardiac shape variability than those studied so far. Here we propose a framework for gene discovery termed unsupervised phenotype ensembles. The unsupervised phenotype ensemble builds a redundant yet highly expressive representation by pooling a set of phenotypes learnt in an unsupervised manner, using deep learning models trained with different hyperparameters. These phenotypes are then analysed via genome-wide association studies, retaining only highly confident and stable associations across the ensemble. We applied our approach to the UK Biobank database to extract geometric features of the left ventricle from image-derived three-dimensional meshes. We demonstrate that our approach greatly improves the discoverability of genes that influence left-ventricular shape, identifying 49 loci with study-wide significance and 25 with suggestive significance. We argue that our approach would enable more extensive discovery of gene associations with image-derived phenotypes for other organs or image modalities.
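The step of retaining only confident and stable associations across the ensemble can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `classify_loci`, the `min_runs` stability rule, the use of the conventional 5e-8 genome-wide threshold for the suggestive tier, and the Bonferroni division by the number of pooled phenotypes for the study-wide tier are all assumptions made for illustration, and the p-values are invented toy numbers.

```python
from collections import Counter

# Conventional genome-wide significance threshold (assumed here, not taken from the abstract).
GENOME_WIDE_P = 5e-8


def classify_loci(runs, n_phenotypes, min_runs=2):
    """Keep loci that recur across ensemble runs and label their significance tier.

    runs: list of dicts, one per trained model, mapping a locus name to the
          smallest GWAS p-value observed for any phenotype from that model.
    n_phenotypes: total number of pooled phenotypes, used for a hypothetical
          Bonferroni-corrected study-wide threshold.
    min_runs: hypothetical stability rule; a locus must pass the genome-wide
          threshold in at least this many runs to be retained.
    """
    study_wide_p = GENOME_WIDE_P / n_phenotypes
    support = Counter()  # number of runs in which each locus is genome-wide significant
    best_p = {}          # smallest p-value seen for each locus across all runs

    for hits in runs:
        for locus, p in hits.items():
            if p < GENOME_WIDE_P:
                support[locus] += 1
            best_p[locus] = min(p, best_p.get(locus, 1.0))

    labels = {}
    for locus, n_supporting in support.items():
        if n_supporting < min_runs:
            continue  # unstable across the ensemble: discard
        tier = "study-wide" if best_p[locus] < study_wide_p else "suggestive"
        labels[locus] = tier
    return labels


# Toy example with three ensemble members and invented p-values.
toy_runs = [
    {"locusA": 1e-12, "locusB": 2e-8, "locusC": 1e-9},
    {"locusA": 3e-11, "locusB": 4e-8},
    {"locusA": 1e-9, "locusC": 1e-3},
]
toy_labels = classify_loci(toy_runs, n_phenotypes=10)
```

In this toy setting, a locus that is significant in only one run is dropped, which mirrors the idea that redundancy across models trained with different hyperparameters is used to filter out unstable associations.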
Keyphrases
- deep learning
- genome wide association
- left ventricular
- genome wide
- magnetic resonance
- machine learning
- convolutional neural network
- genome wide identification
- mitral valve
- artificial intelligence
- high throughput
- copy number
- big data
- small molecule
- pulmonary hypertension
- dna methylation
- hypertrophic cardiomyopathy
- acute myocardial infarction
- left atrial
- cardiac resynchronization therapy
- genome wide analysis
- magnetic resonance imaging
- computed tomography
- contrast enhanced
- pulmonary artery
- neural network
- transcription factor
- coronary artery
- percutaneous coronary intervention
- anti inflammatory
- cross sectional