Network computations underlying learning from symbolic gains and losses.
Hua Tang, Ramon Bartolo-Orozco, Bruno B Averbeck
Published in: bioRxiv : the preprint server for biology (2024)
Reinforcement learning (RL) engages a network of areas, including the orbitofrontal cortex (OFC), ventral striatum (VS), amygdala (AMY), and mediodorsal thalamus (MDt). This study examined RL mediated by gains and losses of symbolic reinforcers across this network. Monkeys learned to select options that led to gaining tokens and to avoid options that led to losing tokens. Tokens were periodically cashed out for juice rewards. We found that task-relevant information was distributed across the network. However, the way in which this information was encoded differed across areas, with VS showing increased responses to appetitive outcomes, OFC differentiating primary and symbolic reinforcers, and AMY responding to the salience of outcomes. In addition, analysis of network activity showed that symbolic reinforcement was calculated by temporal differentiation of accumulated tokens. This process was mediated by dynamics within the OFC-MDt-VS circuit. Thus, we provide a neurocomputational account of learning from symbolic gains and losses.
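As a rough illustration of what "temporal differentiation of accumulated tokens" could mean computationally, the sketch below derives a per-trial reinforcement signal as the change in a running token count and feeds it into a simple delta-rule value update. The function names, the learning rate, and the delta rule itself are illustrative assumptions and are not taken from the study; the sketch also ignores cash-out events, where the token count would reset.

```python
from typing import List


def token_changes(token_counts: List[int]) -> List[int]:
    """Discrete temporal difference of a running token count.

    Each entry is the number of tokens gained (positive) or lost
    (negative) between two consecutive observations of the count.
    """
    return [curr - prev for prev, curr in zip(token_counts, token_counts[1:])]


def update_value(value: float, outcome: float, learning_rate: float = 0.1) -> float:
    """Hypothetical delta-rule update (an assumption, not the study's model).

    Moves the option's value estimate toward the observed outcome by a
    fraction given by learning_rate.
    """
    return value + learning_rate * (outcome - value)


# Hypothetical token counts observed after each of four trials,
# starting from zero tokens.
counts = [0, 2, 1, 1, 3]
changes = token_changes(counts)  # [2, -1, 0, 2]

# Feed each change in tokens into the value estimate of one option.
value = 0.0
for change in changes:
    value = update_value(value, change)
```

Under these assumptions, the learner never needs a separate signal for each token outcome: the gain or loss on a trial is recovered from the difference between successive token totals.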