The neurocognitive role of working memory load when Pavlovian motivational control affects instrumental learning

Park, H., Doh, H., Lee, E., Park, H., & Ahn, W.-Y. (2023). PLOS Computational Biology.

Abstract

Humans and animals learn optimal behaviors by interacting with the environment. Research suggests that a fast, capacity-limited working memory (WM) system and a slow, incremental reinforcement learning (RL) system jointly contribute to instrumental learning. Situations that strain WM resources alter several decision-making processes and shift the balance between multiple decision-making systems: under WM load, learning becomes slow and incremental, while reward prediction error (RPE) signals become stronger; reliance on computationally efficient learning increases as WM demands are balanced against computationally costly strategies; and action selection becomes more random. Meanwhile, instrumental learning is known to interact with Pavlovian learning, a hard-wired system that motivates approach to reward and avoidance of punishment, and conflict between the two systems sometimes leads to suboptimal behavior. However, the neurocognitive role of WM load during instrumental learning under Pavlovian influence remains unknown. Thus, we conducted a functional magnetic resonance imaging (fMRI) study (N = 49) in which participants completed an instrumental learning task with Pavlovian–instrumental conflict (the orthogonalized go/no-go task); WM load was manipulated with dual-task conditions. Behavioral and computational modeling analyses revealed that WM load compromised learning by reducing the learning rate and increasing random choice, without affecting Pavlovian bias. Model-based fMRI analysis revealed that WM load strengthened RPE signaling in the striatum. Moreover, under WM load, the striatum showed weakened connectivity with the ventromedial and dorsolateral prefrontal cortex when computing reward expectations. These results suggest that limiting cognitive resources through WM load slows instrumental learning by weakening the cooperation between WM and RL; such limitation also makes action selection more random, but it does not directly affect the balance between the instrumental and Pavlovian systems.
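The abstract names three model quantities (learning rate, random choice, and Pavlovian bias) without giving equations. As a rough illustration only, the sketch below shows how such parameters are often combined in RL models of go/no-go tasks: a prediction-error update scaled by a learning rate, a Pavlovian term that adds state value to the "go" action weight, and a noise term that mixes the choice probability with chance. The function names, parameter names, and the exact functional form are assumptions made for this example and are not taken from the paper.

```python
import math


def go_probability(q_go, q_nogo, v_state, go_bias, pav_bias, noise):
    """Probability of a 'go' response (hypothetical parameterization).

    q_go, q_nogo : instrumental action values for the current cue
    v_state      : Pavlovian state value of the cue (positive for reward cues)
    go_bias      : constant tendency to respond 'go'
    pav_bias     : weight of the Pavlovian value on the 'go' action
    noise        : share of choices made at random, between 0 and 1
    """
    # Pavlovian influence: positive state value promotes 'go',
    # negative state value suppresses it.
    w_go = q_go + go_bias + pav_bias * v_state
    w_nogo = q_nogo
    # Two-option softmax, written as a logistic function.
    p_softmax = 1.0 / (1.0 + math.exp(-(w_go - w_nogo)))
    # Random choice pulls the probability toward chance (0.5).
    return (1.0 - noise) * p_softmax + noise / 2.0


def update_value(value, outcome, learning_rate):
    """Prediction-error update; returns (new_value, rpe)."""
    rpe = outcome - value  # reward prediction error
    return value + learning_rate * rpe, rpe
```

In this parameterization, a lower `learning_rate` makes values change more slowly after each outcome, and a higher `noise` moves choices toward chance, while `pav_bias` can stay fixed. That mirrors the pattern of results described above, but the values would have to come from fitting a model to data.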