Distinguishing Exposure to Secondhand and Thirdhand Tobacco Smoke among U.S. Children Using Machine Learning: NHANES 2013-2016.
Ashley L Merianos, E Melinda Mahabee-Gittens, Timothy M Stone, Roman A Jandarov, Lanqing Wang, Deepak Bhandari, Benjamin C Blount, Georg E Matt
Published in: Environmental Science & Technology (2023)
Although thirdhand smoke (THS) residue from tobacco smoke has been recognized as a distinct public health hazard, there are currently no gold-standard biomarkers to differentiate THS from secondhand smoke (SHS) exposure. This study used machine learning algorithms to assess which combinations of biomarkers and reported tobacco smoke exposure measures best differentiate children into three groups: no/minimal tobacco smoke exposure (NEG); predominant THS exposure (TEG); and mixed SHS and THS exposure (MEG). Participants were 4485 nonsmoking 3- to 17-year-olds from the National Health and Nutrition Examination Survey 2013-2016. We fitted and tested random forest models. The majority (76%) of children were classified in NEG, 16% in TEG, and 8% in MEG. The final classification model, based on reported exposure, biomarker, and biomarker ratio variables, had an overall prediction accuracy of 95%, with group-specific accuracies of 100% for NEG, 88% for TEG, and 71% for MEG. The most important predictors were the reported number of household smokers, serum cotinine, serum hydroxycotinine, and urinary 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanol (NNAL). In the absence of validated biomarkers specific to THS, comprehensive biomarker and questionnaire data for tobacco smoke exposure can distinguish children exposed to SHS and THS with high accuracy.
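The group shares and group-specific accuracies reported above can be cross-checked against the overall accuracy with a share-weighted average. The sketch below is only an illustrative consistency check, not the authors' analysis. It assumes that the reported group shares (76%, 16%, 8%) also describe the data on which accuracy was evaluated, and that each group-specific accuracy is the fraction of that group's children classified correctly. The names `group_share`, `group_accuracy`, and `weighted_accuracy` are hypothetical.

```python
# Group shares and group-specific accuracies as reported in the abstract.
# Assumption: the same shares apply to the data used to evaluate the model.
group_share = {"NEG": 0.76, "TEG": 0.16, "MEG": 0.08}
group_accuracy = {"NEG": 1.00, "TEG": 0.88, "MEG": 0.71}


def weighted_accuracy(share, accuracy):
    """Return the share-weighted mean of per-group accuracies."""
    total = sum(share.values())
    return sum(share[g] * accuracy[g] for g in share) / total


implied = weighted_accuracy(group_share, group_accuracy)
print(f"Implied overall accuracy: {implied:.3f}")
```

Under these assumptions the implied value is about 0.958, which is close to the reported 95%. A small gap of this kind could plausibly come from rounding of the published percentages or from a test set whose group mix differs slightly from the full sample.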