Drivers and forecasts of multiple waves of the coronavirus disease 2019 pandemic: A systematic analysis based on an interpretable machine learning framework.

Zicheng Cao, Zekai Qiu, Feng Tang, Shiwen Liang, Yinghan Wang, Haoyu Long, Cai Chen, Bing Zhang, Chi Zhang, Yaqi Wang, Kang Tang, Jing Tang, Junhong Chen, Chunhui Yang, Yuzhe Xu, Yulin Yang, Shenglan Xiao, Dechao Tian, Guozhi Jiang, Xiangjun Du
Published in: Transboundary and Emerging Diseases (2022)
Coronavirus disease 2019 (COVID-19) has become a global pandemic and continues to prevail with multiple rebound waves in many countries. The driving factors for the spread of COVID-19 and their quantitative contributions, especially to rebound waves, are not well studied. Multidimensional time-series data, including policy, travel, medical, socioeconomic, environmental, mutant and vaccine-related data, were collected from 39 countries up to 30 June 2021, and an interpretable machine learning framework (an XGBoost model interpreted with SHapley Additive exPlanations, SHAP) was used to systematically analyze the effect of multiple factors on the spread of COVID-19, using the daily effective reproduction number as an indicator. Based on a model of the pre-vaccine era, policy-related factors were shown to be the main drivers of the spread of COVID-19, with a contribution of 60.81%. In the post-vaccine era, the contribution of policy-related factors decreased to 28.34%, accompanied by an increase in the contribution of travel-related factors, such as domestic flights, and contributions emerged for mutant-related (16.49%) and vaccine-related (7.06%) factors. For single-peak countries, policy-related factors were dominant during both the rising and fading stages, with overall contributions of 33.7% and 37.7%, respectively. For double-peak countries, factors from the rebound stage contributed 45.8%, and policy-related factors showed the greatest contribution in both the rebound (32.6%) and fading (25.0%) stages. For multiple-peak countries, the Delta variant, domestic flights (current month) and the daily vaccination population were the three greatest contributors (8.12%, 7.59% and 7.26%, respectively). Forecasting models to predict the rebound risk were built based on these findings, with accuracies of 0.78 and 0.81 for the pre- and post-vaccine eras, respectively.
These findings quantitatively demonstrate the systematic drivers of the spread of COVID-19, and the framework proposed in this study will facilitate the targeted prevention and control of the ongoing COVID-19 pandemic.
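The percentage contributions quoted above come from Shapley-value attributions aggregated over groups of factors. The abstract does not spell out the aggregation rule, so the following is only a minimal sketch under stated assumptions. It computes exact Shapley values for a tiny hypothetical linear model in pure Python (standing in for XGBoost with SHAP), then assumes that each group's contribution is its share of the total mean absolute attribution. The model weights, feature names and group labels are all invented for illustration and are not taken from the study.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Features absent from a coalition are replaced by their baseline value.
    Cost grows exponentially with the number of features, so this is only
    practical for toy examples.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def group_contributions(attributions, feature_groups):
    """Percentage share of mean absolute attribution per feature group.

    ASSUMPTION: this normalisation is one common convention, not
    necessarily the one used in the paper.
    """
    n_samples = len(attributions)
    mean_abs = [sum(abs(row[j]) for row in attributions) / n_samples
                for j in range(len(feature_groups))]
    total = sum(mean_abs)
    shares = {}
    for j, group in enumerate(feature_groups):
        share = 100.0 * mean_abs[j] / total if total else 0.0
        shares[group] = shares.get(group, 0.0) + share
    return shares


# Hypothetical toy model of the effective reproduction number:
# stricter policy lowers it, more travel and colder weather raise it.
def toy_predict(features):
    policy, travel, cold = features
    return 1.0 - 0.6 * policy + 0.3 * travel + 0.1 * cold


feature_groups = ["policy", "travel", "environment"]  # hypothetical labels
baseline = [0.0, 0.0, 0.0]
samples = [[1.0, 1.0, 1.0], [0.5, 0.5, 0.5]]

attributions = [shapley_values(toy_predict, s, baseline) for s in samples]
shares = group_contributions(attributions, feature_groups)

for group, share in shares.items():
    print(f"{group}: {share:.1f}%")
```

For a real tree ensemble, exact enumeration over feature coalitions is infeasible. A SHAP tree explainer would supply the attribution matrix, and only the grouping step would carry over.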
Keyphrases
  • coronavirus disease
  • healthcare
  • public health
  • machine learning
  • sars cov
  • respiratory syndrome coronavirus
  • mental health
  • big data
  • physical activity
  • artificial intelligence
  • cancer therapy
  • drug delivery
  • data analysis