Supervised machine learning to predict smoking lapses from Ecological Momentary Assessments and sensor data: Implications for just-in-time adaptive intervention development.
Olga Perski, Dimitra Kale, Corinna Leppin, Tosan Okpako, David Simons, Stephanie P Goldstein, Eric Hekler, Jamie Brown
Published in: PLOS Digital Health (2024)
Specific moments of lapse among smokers attempting to quit often lead to full relapse, which highlights a need for interventions that target lapses before they might occur, such as just-in-time adaptive interventions (JITAIs). To inform the decision points and tailoring variables of a lapse prevention JITAI, we trained and tested supervised machine learning algorithms that use Ecological Momentary Assessments (EMAs) and wearable sensor data of potential lapse triggers and lapse incidence. We aimed to identify a best-performing and feasible algorithm to take forwards in a JITAI. For 10 days, adult smokers attempting to quit were asked to complete 16 hourly EMAs/day assessing cravings, mood, activity, social context, physical context, and lapse incidence, and to wear a Fitbit Charge 4 during waking hours to passively collect data on steps and heart rate. A series of group-level supervised machine learning algorithms (e.g., Random Forest, XGBoost) were trained and tested, without and with the sensor data. Their ability to predict lapses for out-of-sample (i) observations and (ii) individuals was evaluated. Next, a series of individual-level and hybrid (i.e., group- and individual-level) algorithms were trained and tested. Participants (N = 38) responded to 6,124 EMAs (with 6.9% of responses reporting a lapse). Without sensor data, the best-performing group-level algorithm had an area under the receiver operating characteristic curve (AUC) of 0.899 (95% CI = 0.871-0.928). Its ability to classify lapses for out-of-sample individuals ranged from poor to excellent (per-person AUC = 0.524-0.994; median AUC = 0.639). 15/38 participants had adequate data for individual-level algorithms to be constructed, with a median AUC of 0.855 (range: 0.451-1.000). Hybrid algorithms could be constructed for 25/38 participants, with a median AUC of 0.692 (range: 0.523-0.998). With sensor data, the best-performing group-level algorithm had an AUC of 0.952 (95% CI = 0.933-0.970). Its ability to classify lapses for out-of-sample individuals ranged from poor to excellent (per-person AUC = 0.494-0.979; median AUC = 0.745). 11/30 participants had adequate data for individual-level algorithms to be constructed, with a median AUC of 0.983 (range: 0.549-1.000). Hybrid algorithms could be constructed for 20/30 participants, with a median AUC of 0.772 (range: 0.444-0.968). In conclusion, high-performing group-level lapse prediction algorithms without and with sensor data had variable performance when applied to out-of-sample individuals. Individual-level and hybrid algorithms could be constructed for a limited number of individuals but had improved performance, particularly when incorporating sensor data for participants with sufficient wear time. Feasibility constraints and the need to balance multiple success criteria in the JITAI development and implementation process are discussed.
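The per-person AUC ranges and medians reported above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`auc`, `per_person_auc`), the record layout, and the toy scores are all hypothetical, and it assumes predicted lapse probabilities are already available from some group-level model. AUC is computed here as the probability that a randomly chosen lapse report receives a higher score than a randomly chosen non-lapse report, which is undefined for a participant whose reports contain only one class.

```python
from collections import defaultdict
from statistics import median


def auc(labels, scores):
    """AUC as the chance a random lapse outranks a random non-lapse (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined when only one class is present
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def per_person_auc(records):
    """records: iterable of (participant_id, lapse_label, predicted_score)."""
    by_person = defaultdict(lambda: ([], []))
    for pid, y, s in records:
        by_person[pid][0].append(y)
        by_person[pid][1].append(s)
    result = {}
    for pid, (ys, ss) in by_person.items():
        value = auc(ys, ss)
        if value is not None:  # skip participants with a single class
            result[pid] = value
    return result


# Made-up toy data, NOT study data: (participant, lapse reported?, model score)
records = [
    ("p1", 1, 0.9), ("p1", 0, 0.2), ("p1", 0, 0.4), ("p1", 1, 0.7),
    ("p2", 1, 0.3), ("p2", 0, 0.6), ("p2", 0, 0.1), ("p2", 1, 0.8),
    ("p3", 0, 0.5), ("p3", 0, 0.2),  # no lapses, so AUC is undefined
]

aucs = per_person_auc(records)
summary = (min(aucs.values()), median(aucs.values()), max(aucs.values()))
print(aucs)     # per-person AUC for participants with both classes
print(summary)  # (lowest, median, highest) per-person AUC
```

Summarising per-person values by range and median, rather than pooling all observations into one AUC, is what exposes the variability across individuals that a single group-level figure can hide.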
Keyphrases
- machine learning
- big data
- artificial intelligence
- electronic health record
- deep learning
- heart rate
- wastewater treatment
- healthcare
- randomized controlled trial
- smoking cessation
- emergency department
- blood pressure
- mental health
- young adults
- bipolar disorder
- risk factors
- risk assessment
- resistance training
- decision making
- sleep quality