Utility of the Morgan Fingerprint in Structure-Based Virtual Ligand Screening.

Published in: The Journal of Physical Chemistry B (2024)

In modern drug discovery, virtual ligand screening (VLS) is frequently applied to identify possible hits before experimental testing and refinement because it is cost-effective for large compound libraries. For decades, efforts have been devoted to developing VLS methods with high accuracy. These include the state-of-the-art FINDSITE suite of approaches developed in our lab: FINDSITE^comb2.0, FRAGSITE, FRAGSITE2, and the meta version FRAGSITE^comb. These methods combine ligand homology modeling (LHM), traditional ligand similarity methods, and, more recently, machine learning approaches to rank ligands, and they have proven superior to most recent deep learning and large language model-based approaches. Here, we describe further improvements to our previous best methods, obtained by combining the Morgan fingerprint (MF) with the originally used PubChem and FP2 fingerprints. We then benchmarked FINDSITE^comb2.0M, FRAGSITE^M, FRAGSITE2^M, and the composite meta-approach FRAGSITE^combM. On the 102-target DUD-E set, the 1% enrichment factor (EF1%) and area under the precision-recall curve (AUPR) of FRAGSITE^comb increased from 42.0/0.59 to 47.6/0.72. This AUPR of 0.72 is significantly better than the 0.443 achieved by the state-of-the-art deep learning-based method DenseFS. An independent test on the 81-target DEKOIS2.0 set shows that EF1%/AUPR increases from 18.3/0.520 to 23.1/0.683. An ablation study shows that the MF accounts for most of the improvement in all four approaches. Thus, the MF is a useful addition to structure-based VLS.
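To make the two reported metrics concrete, the following is a minimal Python sketch of how an enrichment factor at a given fraction and an AUPR estimate can be computed from a ranked screening list. It is not the authors' evaluation code. The function names, the rounding used for the top-fraction cutoff, and the use of average precision as the AUPR estimator are all assumptions made for illustration; the paper may use different conventions.

```python
def rank_by_score(scores, labels):
    """Return the labels reordered by descending score (ties keep input order)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return [labels[i] for i in order]


def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor at the top `fraction` of the ranked library.

    EF = (actives in top fraction / size of top fraction)
         / (total actives / library size).
    Assumption: the cutoff is round(N * fraction), with a minimum of 1.
    """
    ranked = rank_by_score(scores, labels)
    n_total = len(ranked)
    n_actives = sum(ranked)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty library with at least one active")
    n_top = max(1, round(n_total * fraction))
    hits_top = sum(ranked[:n_top])
    return (hits_top / n_top) / (n_actives / n_total)


def average_precision(scores, labels):
    """Average precision, used here as one common estimator of AUPR.

    It averages the precision measured at the rank of each active compound.
    """
    ranked = rank_by_score(scores, labels)
    n_actives = sum(ranked)
    if n_actives == 0:
        raise ValueError("need at least one active")
    hits = 0
    precision_sum = 0.0
    for rank, is_active in enumerate(ranked, start=1):
        if is_active:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / n_actives


if __name__ == "__main__":
    # Hypothetical toy library: 200 compounds, 4 actives (label 1).
    toy_scores = [200 - i for i in range(200)]
    toy_labels = [0] * 200
    for position in (0, 1, 50, 100):
        toy_labels[position] = 1
    print("EF1%:", enrichment_factor(toy_scores, toy_labels, 0.01))
    print("AP  :", average_precision(toy_scores, toy_labels))
```

In this toy library the top 1% consists of 2 compounds, both active, while actives make up 4 of 200 compounds overall, so EF1% is (2/2)/(4/200) = 50. A random ranking would give an EF1% near 1 and an AUPR near the fraction of actives, which is why both metrics reward placing actives at the very top of the list.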

Keyphrases