Method for Systematic Analogue Search Using the Mega SAR Matrix Database.

Atsushi Yoshimori, Yuichi Horita, Toru Tanoue, Jürgen Bajorath
Published in: Journal of chemical information and modeling (2019)
Analogue searching is a typical requirement in hit expansion, hit-to-lead, and lead optimization projects. A new computational methodology is introduced to search for existing and virtual analogues of active compounds. The approach is based upon the SAR matrix (SARM) data structure that was originally developed for the systematic identification and structural organization of analogue series. The SARM-based analogue search algorithm further extends the capacity of current substructure-based methods by (i) simultaneously considering existing and virtual analogues that populate chemical space around query compounds, (ii) permitting not only R-group replacements but also well-defined chemical modifications in core structures to further expand the analogue space, and (iii) automatically extracting all possible analogues from large pools. In addition, as a basis for analogue searching following the SARM concept, the Mega-SARM database is introduced. Mega-SARM is derived from nearly 3.7 million compounds and contains ∼250 000 matrices with structurally related analogue series and more than 1.5 million virtual candidate compounds.
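The matrix idea behind the search can be illustrated with a toy sketch. This is not the authors' algorithm or the Mega-SARM implementation: it assumes compounds have already been split into a (core, R-group) pair, uses plain string labels instead of real structures, and treats every unfilled core/R-group combination in one matrix as a virtual candidate. The names `build_matrix` and `search_analogues` are hypothetical.

```python
from itertools import product


def build_matrix(compounds):
    """Collect the rows (cores), columns (R-groups), and filled cells.

    `compounds` is an iterable of (core, r_group) label pairs that are
    assumed to belong to one family of structurally related series.
    """
    existing = set(compounds)
    cores = sorted({core for core, _ in existing})
    r_groups = sorted({r for _, r in existing})
    return cores, r_groups, existing


def search_analogues(query, compounds):
    """Return existing and virtual analogues of `query` in one matrix."""
    cores, r_groups, existing = build_matrix(compounds)
    q_core, q_r = query
    found = {"existing": [], "virtual": []}
    for core, r in product(cores, r_groups):
        if (core, r) == query:
            continue
        # Keep cells that share the query's core (R-group replacement)
        # or the query's R-group (core modification); skip the rest.
        if core != q_core and r != q_r:
            continue
        kind = "existing" if (core, r) in existing else "virtual"
        found[kind].append((core, r))
    return found


# Hypothetical labels standing in for real core and substituent fragments.
compounds = [("coreA", "F"), ("coreA", "Cl"), ("coreB", "F"), ("coreB", "OMe")]
result = search_analogues(("coreA", "F"), compounds)
print(result)
```

Here the empty cell (coreA, OMe) is reported as a virtual analogue, while (coreA, Cl) and (coreB, F) are returned as existing ones. A real workflow would derive cores and substituents by fragmenting actual structures with a cheminformatics toolkit rather than from hand-written labels.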