Application of machine learning for multi-community COVID-19 outbreak predictions with wastewater surveillance.
Yuehan AiFan HeEmma LancasterJiyoung LeePublished in: PloS one (2022)
The potential of wastewater-based epidemiology (WBE) as a surveillance and early warning tool for the COVID-19 outbreak has been demonstrated. For areas with limited testing capacity, wastewater surveillance can provide information on the disease dynamic at a community level. A predictive model is a key to generating quantitative estimates of the infected population. Modeling longitudinal wastewater data can be challenging as biomarkers in wastewater are susceptible to variations caused by multiple factors associated with the wastewater matrix and the sewersheds characteristics. As WBE is an emerging trend, the model should be able to address the uncertainties of wastewater from different sewersheds. We proposed exploiting machine learning and deep learning techniques, which are supported by the growing WBE data. In this article, we reviewed the existing predictive models, among which the emerging machine learning/deep learning models showed great potential. However, most models are built for individual sewersheds with few features extracted from the wastewater. To fulfill the research gap, we compared different time-series and non-time-series models for their short-term predictive performance of COVID-19 cases in 9 diverse sewersheds. The time-series models, long short-term memory (LSTM) and Prophet, outcompeted the non-time-series models. Besides viral (SARS-CoV-2) loads and location identity, domain-specific features like biochemical parameters of wastewater, geographical parameters of the sewersheds, and some socioeconomic parameters of the communities can contribute to the models. With proper feature engineering and hyperparameter tuning, we believe machine learning models like LSTM can be a feasible solution for the COVID-19 trend prediction via WBE. Overall, this is a proof-of-concept study on the application of machine learning in COVID-19 WBE. Future studies are needed to deploy and maintain the model in more real-world applications.