Risk Factors and Predictive Modeling for Post-Acute Sequelae of SARS-CoV-2 Infection: Findings from EHR Cohorts of the RECOVER Initiative.
Chengxi Zang, Yu Hou, Edward Schenck, Zhenxing Xu, Yongkang Zhang, Jie Xu, Jiang Bian, Dmitry Morozyuk, Dhruv Khullar, Anna Nordvig, Elizabeth Shenkman, Russel Rothman, Jason Block, Kristin Lyman, Yiye Zhang, Jay Varma, Mark Weiner, Thomas Carton, Fei Wang, Rainu Kaushal, The RECOVER Consortium
Published in: Research Square (2023)
Background Patients infected with SARS-CoV-2 can develop newly incident conditions in the post-acute infection period. These conditions, termed post-acute sequelae of SARS-CoV-2 infection (PASC), are highly heterogeneous and involve a diverse set of organ systems. Few studies have investigated the predictability of these conditions and their associated risk factors.
Method In this retrospective cohort study, we used two large-scale PCORnet clinical research networks, INSIGHT and OneFlorida+, covering 11 million patients in the New York City area and 16.8 million patients in Florida, to develop machine learning prediction models for patients at risk of newly incident PASC and to identify factors associated with newly incident PASC conditions. Adult patients (aged ≥20 years) with SARS-CoV-2 infection and those without recorded infection between March 1st, 2020, and November 30th, 2021, were used to identify factors associated with incident PASC after removing background associations. The predictive models were developed on infected adults.
Results We found that several incident PASC conditions, e.g., malnutrition, COPD, dementia, and acute kidney failure, were associated with severe acute SARS-CoV-2 infection, defined by hospitalization and ICU stay. Older age and extremes of weight were also associated with these incident conditions. These conditions were better predicted (C-index >0.8). Moderately predictable conditions included diabetes and thromboembolic disease (C-index 0.7-0.8); these were associated with a wider variety of baseline conditions. Less predictable conditions included fatigue, anxiety, sleep disorders, and depression (C-index around 0.6).
Conclusions This observational study suggests that a set of likely risk factors for different PASC conditions were identifiable from EHRs, predictability of different PASC conditions was heterogeneous, and using machine learning-based predictive models might help in identifying patients who were at risk of developing incident PASC.
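For readers unfamiliar with the C-index used in the Results to grade predictability, the following is a minimal sketch of Harrell's concordance index for right-censored outcomes. The abstract does not say which estimator or software the authors used, so the function name `concordance_index`, its arguments, and the handling of ties (pairs with tied times are skipped, tied risk scores count as half) are illustrative assumptions, not the study's implementation.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index (illustrative sketch, not the paper's code).

    times       -- observed follow-up time per patient
    events      -- 1 if the condition was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed
        # event; pairs with tied times are skipped in this simplification.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration): four patients, one censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [0.9, 0.1, 0.4, 0.7]))
```

Under this definition a value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why conditions with a C-index around 0.6 are described as less predictable than those above 0.8.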
Keyphrases
- cardiovascular disease
- machine learning
- end stage renal disease
- liver failure
- sars cov
- type diabetes
- chronic kidney disease
- ejection fraction
- physical activity
- respiratory failure
- drug induced
- intensive care unit
- respiratory syndrome coronavirus
- peritoneal dialysis
- sleep quality
- prognostic factors
- body mass index
- mild cognitive impairment
- artificial intelligence
- coronavirus disease
- patient reported outcomes
- cognitive impairment
- single molecule
- electronic health record
- aortic dissection
- acute respiratory distress syndrome
- quality improvement
- body weight
- mass spectrometry
- middle aged