The machine learning life cycle and the cloud: implications for drug discovery.

Ola Spjuth, Jens Frid, Andreas Hellander
Published in: Expert opinion on drug discovery (2021)
Introduction: Artificial intelligence (AI) and machine learning (ML) are increasingly used in many aspects of drug discovery. Larger data sizes and methods such as Deep Neural Networks contribute to challenges in data management, the required software stack, and computational infrastructure. There is an increasing need in drug discovery to continuously re-train models and make them available in production environments.

Areas covered: This article describes how cloud computing can aid the ML life cycle in drug discovery. The authors discuss opportunities with containerization and scientific workflows, introduce the concept of MLOps, and describe how it can facilitate reproducible and robust ML modeling in drug discovery organizations. They also discuss ML on private, sensitive, and regulated data.

Expert opinion: Cloud computing offers a compelling suite of building blocks to sustain the ML life cycle integrated in iterative drug discovery. Containerization and platforms such as Kubernetes, together with scientific workflows, can enable reproducible and resilient analysis pipelines, and the elasticity and flexibility of cloud infrastructures enable scalable and efficient access to compute resources. Drug discovery commonly involves working with sensitive or private data, and cloud computing and federated learning can contribute toward enabling collaborative drug discovery within and between organizations.

Abbreviations: AI = Artificial Intelligence; DL = Deep Learning; GPU = Graphics Processing Unit; IaaS = Infrastructure as a Service; K8S = Kubernetes; ML = Machine Learning; MLOps = Machine Learning and Operations; PaaS = Platform as a Service; QC = Quality Control; SaaS = Software as a Service.
Keyphrases