Identifying Antitubercular Peptides via Deep Forest Architecture with Effective Feature Representation.
Lantian Yao, Jiahui Guan, Wenshuo Li, Chia-Ru Chung, Junyang Deng, Ying-Chih Chiang, Tzong-Yi Lee
Published in: Analytical Chemistry (2024)
Tuberculosis (TB) is a severe disease caused by Mycobacterium tuberculosis that poses a significant threat to human health. The emergence of drug-resistant strains has made the global fight against TB even more challenging. Antituberculosis peptides (ATPs) have shown promising results as a potential treatment for TB. However, conventional wet-lab approaches to ATP discovery are time-consuming and costly, and they often fail to yield peptides with the desired properties. To address these challenges, we propose ATPfinder, a novel machine learning-based framework that can significantly accelerate the discovery of ATPs. Our approach integrates several efficient peptide descriptors and uses the deep forest algorithm to construct the model. This neural network-like cascading structure can effectively process and mine features without complex hyperparameter tuning. Our experimental results show that ATPfinder outperforms existing ATP prediction tools, achieving state-of-the-art performance with an accuracy of 89.3% and a Matthews correlation coefficient (MCC) of 0.70. Moreover, our framework is more robust than baseline algorithms commonly used for other sequence analysis tasks. Additionally, the excellent interpretability of our model can help researchers understand the critical features of ATPs. Finally, we developed a downloadable desktop application to make the framework easier for researchers to use. ATPfinder can therefore facilitate the discovery of peptide drugs and provide potential solutions for TB treatment. Our framework is freely available at https://github.com/lantianyao/ATPfinder/ (data sets and code) and https://awi.cuhk.edu.cn/dbAMP/ATPfinder.html (software).
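As a rough illustration of two ingredients mentioned in the abstract, the sketch below computes one simple peptide descriptor and the two reported evaluation metrics. It is not ATPfinder's code: the abstract does not name the descriptors it uses, so amino acid composition is assumed here only as a common example, and the function names and the peptide sequence in the usage note are hypothetical.

```python
import math
from collections import Counter

# The 20 standard amino acids, in a fixed order so every peptide
# maps to a feature vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in the peptide (20 values).

    Assumption: this descriptor is chosen for illustration only; it is not
    confirmed to be one of the descriptors used by ATPfinder.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def accuracy(tp, tn, fp, fn):
    """Fraction of predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

For example, `amino_acid_composition("KKLLA")` (a made-up sequence) yields a 20-element vector whose entries sum to 1. Vectors of this kind could be concatenated with other descriptors and passed to a classifier. MCC ranges from -1 to 1 and, unlike accuracy, accounts for all four cells of the confusion matrix, which is why it is often reported alongside accuracy for binary prediction tasks.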
Keyphrases
- mycobacterium tuberculosis
- machine learning
- human health
- drug resistant
- neural network
- climate change
- small molecule
- risk assessment
- deep learning
- pulmonary tuberculosis
- multidrug resistant
- high throughput
- big data
- amino acid
- artificial intelligence
- acinetobacter baumannii
- escherichia coli
- squamous cell carcinoma
- working memory
- early onset
- antiretroviral therapy
- hiv aids
- replacement therapy
- pseudomonas aeruginosa
- electronic health record
- hiv infected
- human immunodeficiency virus
- smoking cessation