Identification of key gene expression associated with quality of life after recovery from COVID-19.
JingXin Ren, Qian Gao, XianChao Zhou, Lei Chen, Wei Guo, KaiYan Feng, Tao Huang, Yu-Dong Cai
Published in: Medical & biological engineering & computing (2023)
Post-acute sequelae of COVID-19 (PASC) is a persistent complication of severe acute respiratory syndrome coronavirus 2 infection whose symptoms include fatigue, cognitive impairment, and respiratory distress. These symptoms severely affect patients' quality of life after recovery from COVID-19. In this study, a set of machine learning algorithms was applied to whole-blood RNA-seq data from patients with different PASC levels in order to identify gene markers associated with PASC and the expression patterns specific to each PASC level. Samples in the dataset were divided into three groups, "Better," "The Same," and "Worse," by comparing each patient's quality of life after the acute phase of COVID-19 with that before the disease. Each patient was represented by the expression levels of 58,929 genes. The machine learning-based workflow comprised six feature-ranking algorithms, incremental feature selection (IFS), and four classification algorithms. The feature-ranking algorithms assessed feature importance, whereas IFS combined with the classification algorithms was used to extract essential genes and to construct efficient classifiers and classification rules. The expression of the top-ranked genes was associated with the immune response to viral infection, a finding supported by the published literature. For example, patients with low CCDC18 expression and high CPED1 expression had good quality of life, whereas those with low CDC16 expression had poor quality of life.
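The ranking-plus-IFS workflow described in the abstract can be illustrated with a minimal sketch. The abstract does not name the six ranking algorithms or the four classifiers, so everything concrete here is an assumption made for illustration: the variance-ratio ranking, the nearest-centroid classifier, the leave-one-out evaluation, the synthetic data, and all function names. It shows only the general IFS idea of growing a feature subset along a ranked list and keeping the best-scoring prefix.

```python
import random
from statistics import mean


def rank_features(X, y):
    """Rank features by between-class / within-class variance (a stand-in ranker)."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        overall = mean(col)
        between = within = 0.0
        for c in classes:
            vals = [v for v, label in zip(col, y) if label == c]
            m = mean(vals)
            between += len(vals) * (m - overall) ** 2
            within += sum((v - m) ** 2 for v in vals)
        scores.append(between / (within + 1e-12))
    # Feature indices, most discriminative first.
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def nearest_centroid_predict(train_X, train_y, x):
    """Predict the class whose mean vector is closest to x (a stand-in classifier)."""
    centroids = {}
    for c in sorted(set(train_y)):
        rows = [r for r, label in zip(train_X, train_y) if label == c]
        centroids[c] = [mean(col) for col in zip(*rows)]
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))


def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy using only the feature indices in feats."""
    sub = [[row[j] for j in feats] for row in X]
    hits = 0
    for i in range(len(sub)):
        train_X = sub[:i] + sub[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += nearest_centroid_predict(train_X, train_y, sub[i]) == y[i]
    return hits / len(sub)


def incremental_feature_selection(X, y, max_k=None):
    """Evaluate the top-1, top-2, ... ranked features; keep the best prefix."""
    ranking = rank_features(X, y)
    max_k = max_k or len(ranking)
    best_k, best_acc = 0, -1.0
    for k in range(1, max_k + 1):
        acc = loo_accuracy(X, y, ranking[:k])
        if acc > best_acc:  # strict '>' prefers the smaller subset on ties
            best_k, best_acc = k, acc
    return ranking[:best_k], best_acc


def make_toy_data(seed=0, n_per_class=10, n_features=8):
    """Synthetic stand-in for expression data: only feature 0 carries signal."""
    rng = random.Random(seed)
    X, y = [], []
    for idx, label in enumerate(["Better", "The Same", "Worse"]):
        for _ in range(n_per_class):
            row = [rng.gauss(0, 1) for _ in range(n_features)]
            row[0] += 5 * idx
            X.append(row)
            y.append(label)
    return X, y


X, y = make_toy_data()
selected, acc = incremental_feature_selection(X, y)
print("selected feature indices:", selected, "LOO accuracy:", round(acc, 3))
```

For brevity the sketch ranks features on the full dataset before cross-validating; in a real analysis the ranking would normally be repeated inside each training fold to avoid optimistic accuracy estimates.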
Keyphrases
- machine learning
- deep learning
- poor prognosis
- coronavirus disease
- SARS-CoV
- artificial intelligence
- big data
- respiratory syndrome coronavirus
- gene expression
- RNA-seq
- genome-wide
- end stage renal disease
- newly diagnosed
- binding protein
- long non-coding RNA
- ejection fraction
- systematic review
- randomized controlled trial
- cognitive impairment
- chronic kidney disease
- DNA methylation
- genome-wide identification
- case report
- liver failure
- cell proliferation
- single cell
- transcription factor
- bioinformatics analysis
- hepatitis B virus
- drug induced
- respiratory failure
- copy number
- genome-wide analysis