Structural annotation of unknown molecules in a miniaturized mass spectrometer based on a transformer enabled fragment tree method.
Yiming Yang, Shuang Sun, Shuyuan Yang, Qin Yang, Xinqiong Lu, Xiaohao Wang, Quan Yu, Xinming Huo, Xiang Qian. Published in: Communications Chemistry (2024)
Structural annotation of small molecules from tandem mass spectra remains a central challenge in mass spectrometry, especially when a miniaturized mass spectrometer is used for on-site testing. Here, we propose the Transformer enabled Fragment Tree (TeFT) method, which combines several types of fragmentation tree models with a deep learning Transformer module. It aims to generate the specific structure of a molecule de novo from mass spectra alone. Evaluation on different open-source databases showed that the model successfully recognized the molecular structures of the majority of the test compounds. TeFT was also validated on a miniaturized mass spectrometer using low-resolution spectra of 16 flavonoid alcohols, achieving complete structure prediction for 8 of them. Finally, TeFT confirmed the structure of the compound contained in a Chinese medicine preparation, the Anweiyang capsule. These results indicate that the TeFT method is suitable for annotating fragmentation peaks that follow clear fragmentation rules, particularly in on-site mass spectrometry with lower mass resolution.
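To illustrate the general idea behind a fragmentation tree (fragment peaks as nodes, neutral losses as edges), the following is a minimal sketch and not the TeFT implementation. It assumes singly charged ions, so that differences in m/z equal differences in mass. The function name `annotate_losses`, the short loss table, the 0.3 Da tolerance chosen to mimic low mass resolution, and the example peak list are all hypothetical.

```python
from itertools import combinations

# Monoisotopic masses (Da) of a few common neutral losses.
# Illustrative subset only; a real fragmentation tree model uses a far larger set.
NEUTRAL_LOSSES = {
    "CH3": 15.023475,
    "H2O": 18.010565,
    "CO": 27.994915,
    "CO2": 43.989830,
}


def annotate_losses(peaks, tol=0.3):
    """Return (parent_mz, child_mz, loss_name) edges between peak pairs.

    An edge is reported when the m/z difference between two peaks matches
    a known neutral loss within `tol` Da (assumes singly charged ions).
    """
    edges = []
    ordered = sorted(peaks, reverse=True)  # heaviest fragment first
    for parent, child in combinations(ordered, 2):
        delta = parent - child
        for name, mass in NEUTRAL_LOSSES.items():
            if abs(delta - mass) <= tol:
                edges.append((parent, child, name))
    return edges


# Hypothetical low-resolution peak list (m/z values), for illustration only.
example_peaks = [241.1, 287.1, 269.1]
for parent, child, loss in annotate_losses(example_peaks):
    print(f"{parent} -> {child}: loss of {loss}")
```

A wide tolerance makes such matches more ambiguous, which is one reason low-resolution spectra are harder to annotate than high-resolution ones.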
Keyphrases
- high resolution
- mass spectrometry
- liquid chromatography
- tandem mass spectrometry
- gas chromatography
- high performance liquid chromatography
- ultra high performance liquid chromatography
- high resolution mass spectrometry
- deep learning
- simultaneous determination
- single molecule
- capillary electrophoresis
- solid phase extraction
- density functional theory
- machine learning
- molecular dynamics
- artificial intelligence
- convolutional neural network