A 2-Gene Host Signature for Improved Accuracy of COVID-19 Diagnosis Agnostic to Viral Variants.
Jack Albright, Eran Mick, Estella Sanchez-Guerrero, Jack Kamm, Anthea Mitchell, Angela M Detweiler, Norma Neff, Alexandra Tsitsiklis, Paula Hayakawa Serpa, Kalani Ratnasiri, Diane Havlir, Amy Kistler, Joseph L DeRisi, Angela Oliveira Pisco, Charles R Langelier
Published in: mSystems (2022)
The continued emergence of SARS-CoV-2 variants is one of several factors that may cause false-negative viral PCR test results. Such tests are also susceptible to false-positive results due to trace contamination from high viral titer samples. Host immune response markers provide an orthogonal indication of infection that can mitigate these concerns when combined with direct viral detection. Here, we leverage nasopharyngeal swab RNA-seq data from patients with COVID-19, other viral acute respiratory illnesses, and nonviral conditions (n = 318) to develop support vector machine classifiers that rely on a parsimonious 2-gene host signature to diagnose COVID-19. We find that optimal classifiers include an interferon-stimulated gene that is strongly induced in COVID-19 compared with nonviral conditions, such as IFI6, and a second immune-response gene that is more strongly induced in other viral infections, such as GBP5. The IFI6 + GBP5 classifier achieves an area under the receiver operating characteristic curve (AUC) greater than 0.9 when evaluated on an independent RNA-seq cohort (n = 553). We further provide proof-of-concept demonstration that the classifier can be implemented in a clinically relevant RT-qPCR assay. Finally, we show that its performance is robust across common SARS-CoV-2 variants and is unaffected by cross-contamination, demonstrating its utility for improved accuracy of COVID-19 diagnostics.
IMPORTANCE In this work, we study upper respiratory tract gene expression to develop and validate a 2-gene host-based COVID-19 diagnostic classifier and then demonstrate its implementation in a clinically practical qPCR assay. We find that the host classifier has utility for mitigating false-negative results, for example due to SARS-CoV-2 variants harboring mutations at primer target sites, and for mitigating false-positive viral PCR results due to laboratory cross-contamination. Both types of error carry serious consequences of either unrecognized viral transmission or unnecessary isolation and contact tracing. This work is directly relevant to the ongoing COVID-19 pandemic given the continued emergence of viral variants and the continued challenges of false-positive PCR assays. It also suggests the feasibility of pan-respiratory virus host-based diagnostics that would have value in congregate settings, such as hospitals and nursing homes, where unrecognized respiratory viral transmission is of particular concern.
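The 2-gene classifier idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the expression values are synthetic draws, not study data; the group means in GROUP_MEANS are hypothetical numbers chosen only to mimic the qualitative pattern the abstract describes (one gene induced in COVID-19 and in other viral infections, a second gene induced more strongly in other viral infections); and a linear support vector machine trained by plain subgradient descent stands in for whatever kernel, library, and preprocessing the authors actually used. The AUC is computed on a second synthetic cohort to mirror the idea of independent validation.

```python
import random

# Hypothetical mean log-expression per group: (IFI6-like gene, GBP5-like gene).
# These numbers are illustrative assumptions, not values from the study.
GROUP_MEANS = {
    "covid": (3.0, 0.5),        # first gene strongly induced, second weakly
    "other_viral": (3.0, 3.0),  # both genes induced
    "nonviral": (0.0, 0.0),     # neither gene induced
}

def make_cohort(rng, n_per_group, sd=0.5):
    """Draw a synthetic cohort; label +1 for COVID-19 and -1 otherwise."""
    X, y = [], []
    for group, (mu1, mu2) in GROUP_MEANS.items():
        label = 1 if group == "covid" else -1
        for _ in range(n_per_group):
            X.append((rng.gauss(mu1, sd), rng.gauss(mu2, sd)))
            y.append(label)
    return X, y

def train_linear_svm(X, y, lam=0.01, lr=0.02, epochs=1000):
    """Minimize (lam/2)*|w|^2 + mean hinge loss by full-batch subgradient descent."""
    w1 = w2 = b = 0.0
    n = len(X)
    for _ in range(epochs):
        g1, g2, gb = lam * w1, lam * w2, 0.0
        for (x1, x2), yi in zip(X, y):
            # Only samples inside the margin contribute to the hinge subgradient.
            if yi * (w1 * x1 + w2 * x2 + b) < 1.0:
                g1 -= yi * x1 / n
                g2 -= yi * x2 / n
                gb -= yi / n
        w1, w2, b = w1 - lr * g1, w2 - lr * g2, b - lr * gb
    return (w1, w2), b

def decision_scores(X, w, b):
    """Signed distance-like score; larger means more COVID-19-like."""
    return [w[0] * x1 + w[1] * x2 + b for x1, x2 in X]

def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == -1]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

rng = random.Random(0)
X_train, y_train = make_cohort(rng, 100)  # synthetic derivation cohort
X_test, y_test = make_cohort(rng, 100)    # synthetic held-out cohort
w, b = train_linear_svm(X_train, y_train)
test_auc = roc_auc(decision_scores(X_test, w, b), y_test)
print(f"weights={w}, bias={b:.2f}, held-out AUC={test_auc:.3f}")
```

The sketch is kept to the standard library so it runs anywhere. With real expression data, an established implementation such as scikit-learn's SVC together with roc_auc_score would be the more natural choice. The point of the toy geometry is that neither gene alone separates COVID-19 from both comparison groups, while the pair does.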
Keyphrases
- sars cov
- copy number
- rna seq
- respiratory syndrome coronavirus
- immune response
- gene expression
- genome wide
- single cell
- respiratory tract
- risk assessment
- coronavirus disease
- healthcare
- high throughput
- dna methylation
- machine learning
- primary care
- dendritic cells
- drinking water
- artificial intelligence
- liver failure
- electronic health record
- deep learning
- diabetic rats
- intensive care unit
- stress induced
- label free