Drug vector representation: a tool for drug similarity analysis.
Liping Lin, Luoyao Wan, Huaqin He, Liping Lin
Published in: Molecular genetics and genomics : MGG (2020)
DrugMatrix is a valuable toxicogenomic dataset that provides in vivo transcriptome data for hundreds of chemical drugs. However, the relationships between these drugs, and how the drugs affect biological processes, are still unknown. The high dimensionality of the microarray data hinders its application. The aims of this study are to (1) represent the transcriptome data by lower-dimensional vectors, (2) compare drug similarity, (3) represent drug combinations by adding vectors and (4) infer drug mechanism of action (MoA) and genotoxicity features. We borrowed the latent semantic analysis (LSA) technique from natural language processing to represent treatments (drugs at multiple concentrations and time points) as dense vectors, each dimension of which is an orthogonal biological feature. The gProfiler enrichment tool was used to annotate the features of the 100-dimensional vectors. The similarity between treatment vectors was calculated with the cosine function. Adding vectors may represent drug combinations, treatment times or treatment doses that are not present in the original data. Drug-drug interaction pairs had a higher similarity than random drug pairs in the hepatocyte data. The vector features helped to reveal the MoA. Differential feature expression was also implicated for genotoxic and non-genotoxic carcinogens. An easy-to-use Web tool was developed with the Shiny Web application framework for the exploration of treatment similarities and drug combinations (https://bioinformatics.fafu.edu.cn/drugmatrix/). We represented treatments by vectors and provided a tool that is useful for hypothesis generation in toxicogenomics, such as drug similarity, drug repurposing, combination therapy and MoA.
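The vector workflow described in the abstract (LSA embedding, cosine similarity, vector addition) can be illustrated with a minimal sketch. This is not the authors' code: it assumes LSA is realized as a plain truncated singular value decomposition of a treatment-by-gene matrix, uses random numbers in place of DrugMatrix expression values, and keeps 3 dimensions instead of the 100 used in the study. The function names `lsa_embed` and `cosine_similarity` are hypothetical, and any preprocessing or weighting the authors applied is not stated in the abstract and is omitted here.

```python
import numpy as np


def lsa_embed(expr, k):
    """Embed each row of `expr` (treatments x genes) in k dimensions.

    Assumption: LSA is taken to be a truncated SVD; the k retained
    dimensions are mutually orthogonal by construction.
    """
    u, s, _ = np.linalg.svd(expr, full_matrices=False)
    return u[:, :k] * s[:k]


def cosine_similarity(a, b):
    """Cosine of the angle between two treatment vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic stand-in data: 6 hypothetical treatments x 50 hypothetical genes.
rng = np.random.default_rng(0)
expr = rng.normal(size=(6, 50))

vectors = lsa_embed(expr, k=3)

# Similarity between two treatments.
sim = cosine_similarity(vectors[0], vectors[1])

# A combination of treatments 0 and 1, represented by adding their vectors,
# compared against a third treatment.
combo = vectors[0] + vectors[1]
combo_sim = cosine_similarity(combo, vectors[2])
```

With real data, each row of `expr` would hold the expression profile of one treatment (one drug at one dose and time point), and `combo` could be compared against every treatment vector to rank the most similar ones.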