DLRAPom: a hybrid pipeline of Optimized XGBoost-guided integrative multiomics analysis for identifying targetable disease-related lncRNA-miRNA-mRNA regulatory axes.
Chen ShenHuiyu LiMiao LiYu NiuJing LiuLi ZhuHongsheng GuiWei HanHuiying WangWenpei ZhangXiaochen WangXiao LuoYu SunJiangwei YanFanglin GuanPublished in: Briefings in bioinformatics (2022)
The lack of a reliable and easy-to-operate screening pipeline for disease-related noncoding RNA regulatory axis is a problem that needs to be solved urgently. To address this, we designed a hybrid pipeline, disease-related lncRNA-miRNA-mRNA regulatory axis prediction from multiomics (DLRAPom), to identify risk biomarkers and disease-related lncRNA-miRNA-mRNA regulatory axes by adding a novel machine learning model on the basis of conventional analysis and combining experimental validation. The pipeline consists of four parts, including selecting hub biomarkers by conventional bioinformatics analysis, discovering the most essential protein-coding biomarkers by a novel machine learning model, extracting the key lncRNA-miRNA-mRNA axis and validating experimentally. Our study is the first one to propose a new pipeline predicting the interactions between lncRNA and miRNA and mRNA by combining WGCNA and XGBoost. Compared with the methods reported previously, we developed an Optimized XGBoost model to reduce the degree of overfitting in multiomics data, thereby improving the generalization ability of the overall model for the integrated analysis of multiomics data. With applications to gestational diabetes mellitus (GDM), we predicted nine risk protein-coding biomarkers and some potential lncRNA-miRNA-mRNA regulatory axes, which all correlated with GDM. In those regulatory axes, the MALAT1/hsa-miR-144-3p/IRS1 axis was predicted to be the key axis and was identified as being associated with GDM for the first time. In short, as a flexible pipeline, DLRAPom can contribute to molecular pathogenesis research of diseases, effectively predicting potential disease-related noncoding RNA regulatory networks and providing promising candidates for functional research on disease pathogenesis.