Interpretable artificial intelligence model for accurate identification of medical conditions using immune repertoire.
Yu Zhao, Bing He, Zhimeng Xu, Yidan Zhang, Xuan Zhao, Zhi-An Huang, Fan Yang, Liang Wang, Lei Duan, Jiangning Song, Jianhua Yao
Published in: Briefings in Bioinformatics (2022)
Underlying medical conditions, such as cancer, kidney disease and heart failure, are associated with a higher risk of severe COVID-19. Accurate classification of COVID-19 patients with underlying medical conditions is critical for personalized treatment decisions and prognosis estimation. In this study, we propose an interpretable artificial intelligence model, termed VDJMiner, to mine the underlying medical conditions and predict the prognosis of COVID-19 patients from their immune repertoires. In a cohort of more than 1400 COVID-19 patients, VDJMiner accurately identifies multiple underlying medical conditions, including cancers, chronic kidney disease, autoimmune disease, diabetes, congestive heart failure, coronary artery disease, asthma and chronic obstructive pulmonary disease, with an average area under the receiver operating characteristic curve (AUC) of 0.961. In the same cohort, VDJMiner achieves an AUC of 0.922 in predicting severe COVID-19. Moreover, VDJMiner achieves an accuracy of 0.857 in predicting the response of COVID-19 patients to tocilizumab treatment in a leave-one-out test. Additionally, VDJMiner mines and scores, in an interpretable manner, the V(D)J gene segments of T-cell receptors that are associated with the disease. The identified associations between single-cell V(D)J gene segments and COVID-19 are highly consistent with previous studies. The source code of VDJMiner is publicly accessible at https://github.com/TencentAILabHealthcare/VDJMiner. The web server of VDJMiner is available at https://gene.ai.tencent.com/VDJMiner/.
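The two kinds of metric reported above, an AUC averaged over several conditions and a leave-one-out accuracy, can be illustrated with a minimal sketch. This is not the VDJMiner implementation: the function names, the toy scores and the 1-nearest-neighbour classifier are all assumptions made for illustration, and the numbers are synthetic rather than taken from the study.

```python
from statistics import mean


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_auc(per_condition):
    """Unweighted mean of the per-condition AUCs."""
    return mean(roc_auc(y, s) for y, s in per_condition.values())


def leave_one_out_accuracy(features, labels, fit_predict):
    """Hold out each sample once, train on the rest, and report the
    fraction of held-out samples predicted correctly."""
    hits = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        hits += fit_predict(train_x, train_y, features[i]) == labels[i]
    return hits / len(features)


def nearest_neighbor(train_x, train_y, x):
    """Hypothetical stand-in classifier (1-NN), used only so the
    leave-one-out loop has something to call."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_x]
    return train_y[dists.index(min(dists))]


# Synthetic example data (assumed, not from the paper).
toy_scores = {
    "condition_a": ([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]),
    "condition_b": ([1, 0, 1, 0], [0.8, 0.2, 0.7, 0.3]),
}
toy_features = [[0.0], [0.1], [1.0], [1.1]]
toy_labels = [0, 0, 1, 1]

print(average_auc(toy_scores))  # 0.875
print(leave_one_out_accuracy(toy_features, toy_labels, nearest_neighbor))  # 1.0
```

Averaging per-condition AUCs without weighting treats rare and common conditions equally; whether the reported 0.961 is weighted is not stated in the abstract.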
Keyphrases
- artificial intelligence
- sars cov
- coronavirus disease
- machine learning
- heart failure
- deep learning
- chronic obstructive pulmonary disease
- healthcare
- big data
- coronary artery disease
- chronic kidney disease
- genome wide
- respiratory syndrome coronavirus
- single cell
- copy number
- cardiovascular disease
- high resolution
- type diabetes
- genome wide identification
- dna methylation
- rheumatoid arthritis
- left ventricular
- squamous cell carcinoma
- multiple sclerosis
- rna seq
- juvenile idiopathic arthritis
- atrial fibrillation
- small molecule
- cardiovascular events
- skeletal muscle
- adipose tissue
- transcription factor
- peritoneal dialysis
- lymph node metastasis