Environmental Impacts of Machine Learning Applications in Protein Science.
Loïc Lannelongue, Michael Inouye
Published in: Cold Spring Harbor Perspectives in Biology (2023)
Computing tools and machine learning models play an increasingly important role in biology and are now an essential part of discoveries in protein science. The growing energy needs of modern algorithms have raised concerns in the computational science community in light of the climate emergency. In this work, we summarize the different ways in which protein science can negatively impact the environment and we present the carbon footprint of some popular protein algorithms: molecular simulations, inference of protein-protein interactions, and protein structure prediction. We show that large deep learning models such as AlphaFold and ESMFold can have carbon footprints reaching over 100 tonnes of CO₂e in some cases. The magnitude of these impacts highlights the importance of monitoring and mitigating them, and we list actions scientists can take to achieve more sustainable protein computational science.