Deep clustering of small molecules at large-scale via variational autoencoder embedding and K-means.

Hamid Hadipour, Chengyou Liu, Rebecca Davis, Silvia T Cardona, Pingzhao Hu
Published in: BMC Bioinformatics (2022)
This study developed a novel analytical framework that comprises a feature engineering scheme for molecule-specific atomic and bonding features and a deep learning-based embedding strategy for different molecular features. We show that the resulting embeddings are useful for clustering a large molecule dataset. The algorithms can be applied to any virtual library of chemical compounds with diverse molecular structures. These tools therefore have the potential to streamline drug discovery by reducing the number of compounds that must be screened in a drug screening campaign.
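
To make the clustering step concrete, here is a minimal sketch of K-means (Lloyd's algorithm) applied to embedding vectors. It is an illustration under stated assumptions, not the authors' implementation: the variational autoencoder is not shown, the two-dimensional `embeddings` are made-up stand-ins for encoder outputs, and the function names and the deterministic farthest-point initialization are hypothetical choices made only to keep the example reproducible.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k):
    """Farthest-point initialization (an assumption, chosen for determinism):
    start from the first point, then repeatedly add the point that is
    farthest from its nearest already-chosen centroid."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update
    until the cluster labels stop changing."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy stand-ins for learned molecular embeddings (hypothetical values):
# three vectors near the origin and three near (5, 5).
embeddings = [
    [0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
    [5.0, 5.1], [5.1, 5.0], [4.95, 5.05],
]

labels, centroids = kmeans(embeddings, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In a screening setting, one representative compound per cluster (for example, the member closest to each centroid) could then be selected for testing, which is how clustering can shrink the number of compounds to screen.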