Computational models of exploration and exploitation characterise onset and efficacy of treatment in methamphetamine use disorder.
Alex H Robinson, Trevor T-J Chong, Antonio Verdejo-Garcia
Published in: Addiction Biology (2022)
People with Methamphetamine Use Disorder (PwMUD) spend substantial time and resources on substance use, which hinders their ability to explore alternative reinforcers. Gold-standard behavioural treatments attempt to remedy this by encouraging action towards non-drug reinforcers, but substance use often persists. We aimed to unravel the mechanistic drivers of this behaviour by applying a computational model of explore/exploit behaviour to decision-making data (Iowa Gambling Task) from 106 PwMUD and 48 controls. We then examined the longitudinal link between explore/exploit mechanisms and changes in methamphetamine use 6 weeks later. Exploitation parameters included reinforcement sensitivity and inverse decay (i.e., number of past outcomes used to guide choices). Exploration parameters included maximum directed exploration value (i.e., value of trying novel actions). The Timeline Follow Back measured changes in methamphetamine use. Compared to controls, PwMUD showed deficits in exploitative decision-making, characterised by reduced reinforcement sensitivity, U = 3065, p = 0.009, and less use of previous choice outcomes, U = 3062, p = 0.010. This was accompanied by a behavioural pattern of frequent shifting between choices, which appeared consistent with random exploration. Furthermore, PwMUD with greater reductions of methamphetamine use at 6 weeks had increased directed exploration (β = 0.22, p = 0.045), greater use of past choice outcomes (β = -0.39, p = 0.002) and greater choice consistency (β = -0.39, p = 0.002). Therefore, limited computational exploitation and increased behavioural exploration characterise PwMUD's presentation to treatment, while increased directed exploration, use of past choice outcomes and choice consistency predict greater reductions of methamphetamine use.
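To make the roles of the exploitation and exploration parameters concrete, the following is a minimal illustrative sketch of one trial-by-trial update in a generic value-plus-exploration model. The abstract does not give the model equations, so the functional forms and all names here (`utility`, `update`, `choice_probs`, `theta`, `decay`, `alpha`, `phi`, `consistency`) are assumptions made for illustration and should not be read as the authors' implementation.

```python
import math

def utility(gain, loss, theta):
    """Subjective outcome value; theta is an assumed reinforcement-sensitivity exponent."""
    return gain ** theta - loss ** theta

def update(exploit, explore, chosen, gain, loss, theta, decay, alpha, phi):
    """Return updated (exploit, explore) weights for all decks after one trial.

    decay: fraction of past exploitation value retained (closer to 1 means
           more past outcomes guide choice); assumed form.
    phi:   assumed maximum directed exploration value that unchosen decks
           drift towards at rate alpha.
    """
    v = utility(gain, loss, theta)
    new_exploit, new_explore = [], []
    for d in range(len(exploit)):
        if d == chosen:
            new_exploit.append(decay * exploit[d] + v)  # decayed history plus new outcome
            new_explore.append(0.0)                     # a just-sampled deck is no longer novel
        else:
            new_exploit.append(decay * exploit[d])      # history decays without new information
            new_explore.append(explore[d] + alpha * (phi - explore[d]))
    return new_exploit, new_explore

def choice_probs(exploit, explore, consistency):
    """Softmax over summed weights; higher consistency gives more deterministic choices."""
    w = [consistency * (a + b) for a, b in zip(exploit, explore)]
    m = max(w)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in w]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical single trial on a four-deck task: deck 0 is chosen and pays 100.
exploit, explore = [0.0] * 4, [0.0] * 4
exploit, explore = update(exploit, explore, chosen=0, gain=100, loss=0,
                          theta=0.5, decay=0.8, alpha=0.5, phi=1.0)
probs = choice_probs(exploit, explore, consistency=0.5)
```

In this sketch, lower `theta` or `decay` weakens the influence of past outcomes on choice (reduced exploitation), while a larger `phi` raises the pull towards decks that have not been sampled recently (directed exploration).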