Divide-and-conquer: machine-learning integrates mammalian and viral traits with network features to predict virus-mammal associations.

Maya Wardeh, Marcus S C Blagrove, Kieran J Sharkey, Matthew Baylis
Published in: Nature Communications (2021)
Our knowledge of viral host ranges remains limited. Completing this picture by identifying unknown hosts of known viruses is an important research aim that can help identify and mitigate zoonotic and animal-disease risks, such as spill-over from animal reservoirs into human populations. To address this knowledge gap, we apply a divide-and-conquer approach that separates viral, mammalian and network features into three unique perspectives, each predicting associations independently to enhance predictive power. Our approach predicts over 20,000 unknown associations between known viruses and susceptible mammalian species, suggesting that current knowledge underestimates the number of associations in wild and semi-domesticated mammals by a factor of 4.3, and the average potential mammalian host-range of viruses by a factor of 3.2. In particular, our results highlight a significant knowledge gap in the wild reservoirs of important zoonotic and domesticated mammals' viruses: specifically, lyssaviruses, bornaviruses and rotaviruses.
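The divide-and-conquer idea can be illustrated with a minimal sketch in which three perspectives each score a candidate virus-mammal pair independently and the scores are then combined. The abstract does not say how the perspectives are combined, so the simple averaging, the 0.5 cutoff, and the name `combine_perspectives` are assumptions made for illustration only. The probabilities in the usage lines are made-up values, not results from the paper.

```python
from statistics import mean


def combine_perspectives(viral_p, mammalian_p, network_p, threshold=0.5):
    """Combine three independently produced probabilities for one pair.

    Assumption: plain averaging followed by a fixed cutoff. The published
    method may combine its perspectives differently.
    """
    combined = mean([viral_p, mammalian_p, network_p])
    # Flag the pair as a predicted association when the average clears the cutoff.
    return combined, combined >= threshold


# Illustrative (made-up) scores for two hypothetical virus-mammal pairs.
likely_score, likely_flag = combine_perspectives(0.9, 0.6, 0.6)
unlikely_score, unlikely_flag = combine_perspectives(0.2, 0.1, 0.3)
```

Keeping the three scores separate until the final step means a pair can still be flagged when one perspective has little information about it, provided the other two agree strongly.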
Keyphrases
  • genetic diversity
  • healthcare
  • machine learning
  • sars cov
  • human health
  • gene expression
  • network analysis