Towards Accurate Identification of Antibiotic-Resistant Pathogens through the Ensemble of Multiple Preprocessing Methods Based on MALDI-TOF Spectra.
Chia-Ru Chung, Hsin-Yao Wang, Po-Han Chou, Li-Ching Wu, Jang-Jih Lu, Jorng-Tzong Horng, Tzong-Yi Lee
Published in: International Journal of Molecular Sciences (2023)
Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS) has been used to identify microorganisms and predict antibiotic resistance. Preprocessing of the MS spectrum is key to extracting critical information from complicated spectral data. Different preprocessing methods yield different data, and the optimal approach is unclear. In this study, we adopted an ensemble of multiple preprocessing methods (FlexAnalysis, MALDIquant, and continuous wavelet transform-based methods) to detect peaks and build machine learning classifiers, including logistic regression, naïve Bayes, random forest, and support vector machine models. The aim was to identify antibiotic resistance in Acinetobacter baumannii, Acinetobacter nosocomialis, Enterococcus faecium, and Group B Streptococci (GBS) based on MALDI-TOF MS spectra collected from two branches of a referral tertiary medical center. The ensemble method was compared with each individual preprocessing method. Random forest models built on data preprocessed by the ensemble method outperformed those built on individually preprocessed data and achieved the highest accuracy, with values of 84.37% (A. baumannii), 90.96% (A. nosocomialis), 78.54% (E. faecium), and 70.12% (GBS) on independent testing datasets. Through feature selection, important peaks related to antibiotic resistance could be detected from the integrated information. The prediction model can offer clinicians a supporting opinion, and the discriminative peaks that enable better prediction performance can serve as a reference for further investigation of resistance mechanisms.
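The abstract does not say how the peak lists from the three preprocessing methods are merged, so the following is only a minimal sketch of one plausible scheme: each method's peaks are mapped onto fixed-width m/z bins and the per-method vectors are concatenated into one feature vector for a classifier. The function names, the m/z range, the bin width, and the choice of concatenation are all illustrative assumptions, not the authors' procedure.

```python
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity); hypothetical representation


def bin_peaks(peaks: List[Peak], mz_min: float, mz_max: float,
              bin_width: float) -> List[float]:
    """Map a peak list onto fixed-width m/z bins, keeping the max intensity per bin."""
    n_bins = int((mz_max - mz_min) // bin_width)
    features = [0.0] * n_bins
    for mz, intensity in peaks:
        if mz_min <= mz < mz_max:
            idx = int((mz - mz_min) // bin_width)
            if idx < n_bins:
                features[idx] = max(features[idx], intensity)
    return features


def ensemble_features(peaks_by_method: Dict[str, List[Peak]],
                      mz_min: float = 2000.0, mz_max: float = 20000.0,
                      bin_width: float = 10.0) -> List[float]:
    """Concatenate the binned vectors of all methods, in sorted method-name order.

    Assumption: concatenation is used here only for illustration; the source
    does not specify how the ensemble combines the preprocessing outputs.
    """
    combined: List[float] = []
    for method in sorted(peaks_by_method):
        combined.extend(bin_peaks(peaks_by_method[method], mz_min, mz_max, bin_width))
    return combined


# Toy peak lists standing in for the output of the three preprocessing tools.
spectrum = {
    "flexanalysis": [(2005.0, 1.0)],
    "maldiquant": [(2004.0, 0.8), (2015.0, 0.5)],
    "cwt": [],
}
vector = ensemble_features(spectrum, mz_min=2000.0, mz_max=2030.0, bin_width=10.0)
```

A vector built this way could then be passed to any of the classifiers named in the abstract. Keeping each method's bins separate, rather than merging them, would let a later feature-selection step indicate which preprocessing method contributed a given discriminative peak.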
Keyphrases
- mass spectrometry
- liquid chromatography
- acinetobacter baumannii
- machine learning
- high resolution
- high performance liquid chromatography
- ms ms
- gas chromatography
- capillary electrophoresis
- neural network
- big data
- convolutional neural network
- electronic health record
- multidrug resistant
- multiple sclerosis
- drug resistant
- climate change
- deep learning
- pseudomonas aeruginosa
- magnetic resonance imaging
- staphylococcus aureus
- healthcare
- density functional theory
- health information
- social media
- cystic fibrosis
- magnetic resonance
- bioinformatics analysis