Human Genome Polymorphisms and Computational Intelligence Approach Revealed a Complex Genomic Signature for COVID-19 Severity in Brazilian Patients.
André Filipe Pastor, Cássia Docena, Antonio Mauro Rezende, Flávio Rosendo da Silva Oliveira, Marília de Albuquerque Sena, Clarice Neuenschwander Lins de Morais, Cristiane Campello Bresani-Salvi, Luydson Richardson Silva Vasconcelos, Kennya Danielle Campelo Valença, Carolline de Araújo Mariz, Carlos Brito, Cláudio Duarte Fonseca, Maria Cynthia Braga, Christian Robson de Souza Reis, Ernesto Torres de Azevedo Marques, Bartolomeu Acioli-Santos
Published in: Viruses (2023)
We present a genome polymorphism and machine learning approach for severe COVID-19 prognosis. Ninety-six Brazilian severe COVID-19 patients and controls were genotyped for 296 innate immunity loci. Our model used a feature selection algorithm, namely recursive feature elimination coupled with a support vector machine (SVM-RFE), to find the optimal subset of loci for classification, followed by a support vector machine with a linear kernel (SVM-LK) to classify patients into the severe COVID-19 group. The best features selected by the SVM-RFE method comprised 12 SNPs in 12 genes: PD-L1, PD-L2, IL10RA, JAK2, STAT1, IFIT1, IFIH1, DC-SIGNR, IFNB1, IRAK4, IRF1, and IL10. In the COVID-19 prognosis step, SVM-LK achieved 85% accuracy, 80% sensitivity, and 90% specificity. In comparison, univariate analysis of the 12 selected SNPs highlighted individual variant alleles associated with risk (PD-L1 and IFIT1) or protection (JAK2 and IFIH1). Variant genotypes carrying risk effects were found in the PD-L2 and IFIT1 genes. The proposed complex classification method can be used to identify individuals at high risk of developing severe COVID-19 outcomes even before infection, which is a disruptive concept in COVID-19 prognosis. Our results suggest that the genetic context is an important factor in the development of severe COVID-19.
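As an illustration of how the three reported metrics relate to one another, the following minimal sketch computes accuracy, sensitivity, and specificity from confusion-matrix counts. The function name `classification_metrics` and the counts are hypothetical: the abstract does not state the size or composition of the evaluation set, so a balanced set of 10 severe cases and 10 controls is assumed here purely because it is one configuration consistent with the reported percentages.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp: severe cases predicted severe      fn: severe cases predicted non-severe
    tn: controls predicted non-severe      fp: controls predicted severe
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")
    accuracy = (tp + tn) / (positives + negatives)
    sensitivity = tp / positives   # true positive rate
    specificity = tn / negatives   # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts (NOT taken from the paper): an assumed balanced
# evaluation set of 10 severe patients and 10 controls.
acc, sens, spec = classification_metrics(tp=8, fn=2, tn=9, fp=1)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

With balanced classes, accuracy is the mean of sensitivity and specificity, which matches the reported 85%, 80%, and 90%; an unbalanced evaluation set would generally not give that relationship.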
Keyphrases
- coronavirus disease
- sars cov
- machine learning
- genome wide
- deep learning
- end stage renal disease
- early onset
- ejection fraction
- respiratory syndrome coronavirus
- newly diagnosed
- dna methylation
- chronic kidney disease
- artificial intelligence
- prognostic factors
- copy number
- type diabetes
- peritoneal dialysis
- endothelial cells
- gene expression
- insulin resistance
- skeletal muscle
- immune response
- metabolic syndrome
- adipose tissue
- big data
- systemic lupus erythematosus
- ankylosing spondylitis
- induced pluripotent stem cells
- hiv infected
- single cell
- glycemic control
- genome wide association