Learning to automate cryo-electron microscopy data collection with Ptolemy.
Paul T KimAlex J NobleAnchi ChengTristan BeplerPublished in: IUCrJ (2023)
Over the past decade, cryo-electron microscopy (cryoEM) has emerged as an important method for determining near-native, near-atomic resolution 3D structures of biological macromolecules. To meet the increasing demand for cryoEM, automated methods that improve throughput and efficiency of microscope operation are needed. Currently, the targeting algorithms provided by most data-collection software require time-consuming manual tuning of parameters for each grid, and, in some cases, operators must select targets completely manually. However, the development of fully automated targeting algorithms is non-trivial, because images often have low signal-to-noise ratios and optimal targeting strategies depend on a range of experimental parameters and macromolecule behaviors that vary between projects and collection sessions. To address this, Ptolemy provides a pipeline to automate low- and medium-magnification targeting using a suite of purpose-built computer vision and machine-learning algorithms, including mixture models, convolutional neural networks and U-Nets. Learned models in this pipeline are trained on a large set of images from real-world cryoEM data-collection sessions, labeled with locations selected by human operators. These models accurately detect and classify regions of interest in low- and medium-magnification images, and generalize to unseen sessions, as well as to images collected on different microscopes at another facility. This open-source, modular pipeline can be integrated with existing microscope control software to enable automation of cryoEM data collection and can serve as a foundation for future cryoEM automation software.
Keyphrases
- deep learning
- electron microscopy
- convolutional neural network
- machine learning
- big data
- artificial intelligence
- electronic health record
- data analysis
- cancer therapy
- high resolution
- optical coherence tomography
- current status
- endothelial cells
- air pollution
- mass spectrometry
- quality improvement
- resistance training