Machine Learning Analysis of Raman Spectra To Quantify the Organic Constituents in Complex Organic-Mineral Mixtures.
Mahsa Zarei, Natalia V Solomatova, Hoda Aghaei, Austin Rothwell, Jeffrey Wiens, Luke Melo, Travis G Good, Sadegh Shokatian, Edward R Grant
Published in: Analytical Chemistry (2023)
Important decisions in local agricultural policy and practice often hinge on the soil's chemical composition. Raman spectroscopy offers a rapid, noninvasive means to quantify the constituents of complex organic systems. However, applying Raman spectroscopy to soils presents a multifaceted challenge due to organic/mineral compositional complexity and spectral interference arising from overwhelming fluorescence. The present work compares methodologies with the capacity to help overcome common obstacles that arise in the analysis of soils. We created conditions representative of these challenges by combining varying proportions of six amino acids commonly found in soils with fluorescent bentonite clay and coarse mineral components. Referring to an extensive data set of Raman spectra, we compare the performance of convolutional neural network (CNN) and partial least-squares regression (PLSR) multivariate models in predicting amino acid composition. Strategies employing volume-averaged spectral sampling and data preprocessing algorithms improve the predictive power of these models. Our average test R² for PLSR models exceeds 0.89 and approaches 0.98, depending on the complexity of the matrix, whereas CNN yields an R² range from 0.91 to 0.97. Classic PLSR and CNN thus perform comparably, except in cases where the signal-to-noise ratio of the organic component is very low, where CNN models outperform PLSR. By artificially isolating two of the most prevalent obstacles in evaluating the Raman spectra of soils, we have characterized the effect of each obstacle on the performance of machine learning models in the absence of other complexities. These results highlight important considerations and modeling strategies necessary to improve the Raman analysis of organic compounds in complex mixtures in the presence of mineral spectral components and significant fluorescence.
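For readers unfamiliar with the metric, the test R² figures quoted above are coefficients of determination. The following is a minimal sketch of how an average test R² over several analytes could be computed, using only the Python standard library. The abstract does not state how the authors averaged across amino acids or data splits, so the unweighted mean used here is an assumption, and the function names `r2_score` and `average_r2` and the analyte labels are hypothetical.

```python
from statistics import fmean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    # Residual sum of squares: squared error of the model predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


def average_r2(true_by_analyte, pred_by_analyte):
    """Unweighted mean of per-analyte R^2 (an assumed averaging scheme)."""
    scores = {
        name: r2_score(true_by_analyte[name], pred_by_analyte[name])
        for name in true_by_analyte
    }
    return fmean(scores.values()), scores


if __name__ == "__main__":
    # Hypothetical toy values, not data from the paper.
    true_fractions = {"analyte_a": [0.1, 0.2, 0.3], "analyte_b": [0.4, 0.3, 0.2]}
    pred_fractions = {"analyte_a": [0.1, 0.2, 0.3], "analyte_b": [0.3, 0.3, 0.3]}
    mean_score, per_analyte = average_r2(true_fractions, pred_fractions)
    print(f"mean R^2 = {mean_score:.3f}", per_analyte)
```

An R² of 1 means the predictions match the reference values exactly, while 0 means the model does no better than predicting the mean of the reference values, which is why values between 0.89 and 0.98 indicate strong predictive performance.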
Keyphrases
- raman spectroscopy
- convolutional neural network
- machine learning
- heavy metals
- amino acid
- deep learning
- water soluble
- optical coherence tomography
- big data
- human health
- healthcare
- primary care
- electronic health record
- ionic liquid
- public health
- quantum dots
- climate change
- single molecule
- label free
- density functional theory
- molecular dynamics simulations
- molecular dynamics
- mental health
- computed tomography
- sensitive detection
- living cells
- fluorescent probe