Compulsive behavior is enacted under a belief that a specific act controls the likelihood of an undesired future event. Compulsive behaviors are widespread in the general population despite having no causal relationship with events they aspire to influence. In the current study, we tested whether there is an increased tendency to assign value to aspects of a task that do not predict an outcome (i.e., outcome-irrelevant learning) among individuals with compulsive tendencies. We studied 514 healthy individuals who completed self-report compulsivity, anxiety, depression, and schizotypal measurements, and a well-established reinforcement-learning task (i.e., the two-step task). As expected, we found a positive relationship between compulsivity and outcome-irrelevant learning. Specifically, individuals who reported having stronger compulsive tendencies (e.g., washing, checking, grooming) also tended to assign value to response keys and stimuli locations that did not predict an outcome. Controlling for overall goal-directed abilities and the co-occurrence of anxious, depressive, or schizotypal tendencies did not impact these associations. These findings indicate that outcome-irrelevant learning processes may contribute to the expression of compulsivity in a general population setting. We highlight the need for future research on the formation of non-veridical action−outcome associations as a factor related to the occurrence and maintenance of compulsive behavior.

A list of authors and their affiliations appears at the end of the paper.

The original online version of this article was revised due to mistakes in the names of a few first authors in the references (refs 11, 13, 16, 19, 20, 39, 44, 45, 46) and a typo in Figure 1.

corrected publication 2021

To say that a reinforcement is contingent upon a response, may mean nothing more than that it follows the response. B.F. Skinner (1948) [

Compulsive, ritualistic behaviors are enacted to influence the likelihood that a certain event will occur [

Outcome-irrelevant learning can be defined as a tendency to assign credit to actions that do not hold any causal association to an outcome [

Recently, we observed outcome-irrelevant learning in human subjects that manifests as a tendency to press a response key that was previously followed by a monetary gain, and a tendency to avoid it when it was followed by a loss, despite there being no actual causal relationship between the response key and an outcome [

Previous reinforcement-learning studies that have examined an association between value-based learning and compulsivity focused on goal-directed reasoning strategies (i.e., model-based control) [

In the current study, we tested for an association between outcome-irrelevant learning and compulsive behavior in a healthy, general population sample, and assessed whether this association exists over and above other associated factors previously reported in the literature. We analyzed data from 514 individuals from a community-based longitudinal sample, comprising adolescent and young adult volunteers, living in Cambridgeshire and London, UK (Neuroscience in Psychiatry Network [

We obtained data from a community-based longitudinal sample of adolescent and young adult volunteers living in Cambridgeshire and London, UK (Neuroscience in Psychiatry Network [

Self-report ratings regarding symptoms of compulsive, obsessive, anxious, depressive, and schizotypal tendencies were obtained by asking participants to complete the following scales:

Overall, the six questionnaires (i.e., OCI-R, PI-WSUR, LOI, MFQ, RCMAS, and SPQ) resulted in 25 subscales (see Supplementary Table

To obtain individual measures of outcome-irrelevant learning and model-based control, participants completed a two-step reinforcement-learning task [

We operationalized outcome-irrelevant learning as a disposition to assign value to a task representation that is not predictive of an outcome (see Fig. _{outcome-irrelevant}; see Supplementary Information, Eqs. 6 and 7 and Table _{outcome-irrelevant}; see Supplementary Information, Eqs. 3 and 4 and Table _{outcome-irrelevant} and _{outcome-irrelevant}, we further estimated three independent sequential trial scores previously found to be closely related to these two outcome-irrelevant computational parameters [

The figure illustrates two sequential trial analyses (previously reported in Shahar et al.), demonstrating outcome-irrelevant value learning. These analyses examined a tendency to repeat a response key selection from trial
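The logic of these sequential trial analyses can be illustrated with a minimal sketch. It assumes a simplified score: the difference between the proportion of response-key repetitions after rewarded and after unrewarded trials. The reported scores are unstandardized regression coefficients and may condition on further task features, so the function name `key_stay_effect` and the difference-of-proportions form are illustrative assumptions, not the analysis used in the study.

```python
from typing import Sequence


def key_stay_effect(keys: Sequence[int], rewards: Sequence[int]) -> float:
    """Simplified, hypothetical outcome-irrelevant score.

    Returns P(repeat key | previous trial rewarded)
          - P(repeat key | previous trial unrewarded).
    keys[t] is the response key pressed on trial t (e.g., 0 = left, 1 = right);
    rewards[t] is 1 if trial t was rewarded and 0 otherwise.
    """
    if len(keys) != len(rewards):
        raise ValueError("keys and rewards must have the same length")

    # Collect stay indicators separately for each previous outcome.
    stay_by_prev_reward = {0: [], 1: []}
    for t in range(1, len(keys)):
        stayed = int(keys[t] == keys[t - 1])
        stay_by_prev_reward[rewards[t - 1]].append(stayed)

    if not stay_by_prev_reward[0] or not stay_by_prev_reward[1]:
        raise ValueError("need both rewarded and unrewarded previous trials")

    p_after_reward = sum(stay_by_prev_reward[1]) / len(stay_by_prev_reward[1])
    p_after_no_reward = sum(stay_by_prev_reward[0]) / len(stay_by_prev_reward[0])
    return p_after_reward - p_after_no_reward


# Toy data: the key tends to be repeated after reward and switched otherwise.
example_keys = [0, 0, 1, 1, 0, 1]
example_rewards = [1, 0, 1, 0, 1, 0]
example_score = key_stay_effect(example_keys, example_rewards)
print(example_score)
```

A positive score indicates that the response key is repeated more often after a reward, even though the key itself carries no information about the outcome. A score near zero indicates no such tendency.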

Sample characteristics and descriptive data per time point.

| | Baseline | Follow-up 1 | Follow-up 2 | Across time points |
|---|---|---|---|---|
| Sample characteristics | | | | |
| N | 514 | 48 | 514 | |
| Gender (m/f) | 259/255 | 24/24 | 259/255 | |
| Age | 18.81 (2.96) | 19.30 (2.87) | 20.27 (2.98) | |
| Outcome-irrelevant learning | | | | |
| Response-key value weight (parameter) | | | | 0.23 (0.09) |
| Response-key learning rate (parameter) | | | | 0.29 (0.23) |
| First-stage score | 0.03 (0.12) | 0.04 (0.11) | 0.04 (0.10) | |
| Second-stage score I | 0.04 (0.15) | 0.04 (0.14) | 0.03 (0.12) | |
| Second-stage score II | 0.15 (0.27) | 0.12 (0.24) | 0.16 (0.22) | |
| Model-based control | | | | |
| Model-based weight (parameter) | | | | 0.38 (0.20) |
| First-stage score | 0.10 (0.25) | 0.08 (0.23) | 0.12 (0.20) | |
| Second-stage score (ms) | 120 (110) | 110 (90) | 130 (100) | |

Outcome-irrelevant learning: _{outcome-irrelevant} reflects the weight of the response key cached value on the individual’s trial-by-trial decisions (units are arbitrary and should be interpreted in terms of being negative, zero, or positive; see Supplementary Information, Eqs. 6 and 7), estimated using computational modeling across all three time points. _{outcome-irrelevant} is the learning rate for the response key cached value (range is between 0 and 1; see Supplementary Information, Eqs. 3 and 4), estimated across all three time points. First-stage and second-stage score I estimates are depicted as unstandardized regression coefficients, representing the effect of outcome in the previous trial (rewarded vs. unrewarded) on the probability of making the same response key choice (see Fig. _{model-based}) reflects the weight of model-based strategies on an individual’s first-stage choices (units are arbitrary and should be interpreted in terms of being negative, zero, or positive; see Supplementary Information, Eq. 6). The first-stage score shows the unstandardized regression coefficients of the previous reward × previous transition interaction effect on the probability that individuals will repeat their first-stage fractal choice (see Supplementary Fig.
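To make the roles of the two outcome-irrelevant parameters concrete, the following is a minimal sketch under stated assumptions. The Supplementary equations are not reproduced in this text, so the sketch assumes a standard delta-rule update for the response-key cached value (learning rate `alpha`, between 0 and 1) and a logistic choice rule in which a weight `w` scales the difference between the two key values. The function names and the two-key setup are hypothetical and chosen for illustration only.

```python
import math
from typing import List, Sequence


def update_key_value(q_keys: Sequence[float], key: int,
                     reward: float, alpha: float) -> List[float]:
    """Assumed delta-rule update of the cached value of the pressed key.

    alpha is the learning rate (0 = no learning, 1 = value replaced by the
    latest outcome). Only the pressed key's value changes.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    updated = list(q_keys)
    updated[key] += alpha * (reward - updated[key])
    return updated


def prob_press_key0(q_keys: Sequence[float], w: float) -> float:
    """Assumed logistic choice rule: w weights the key-value difference.

    w = 0 means key values have no influence on choice (probability 0.5);
    w > 0 means a bias toward the key with the higher cached value.
    """
    return 1.0 / (1.0 + math.exp(-w * (q_keys[0] - q_keys[1])))


# Illustrative values only (not the estimates reported in the table).
q = update_key_value([0.0, 0.0], key=0, reward=1.0, alpha=0.3)
p_no_influence = prob_press_key0(q, w=0.0)
p_with_influence = prob_press_key0(q, w=1.0)
print(q, p_no_influence, p_with_influence)
```

Under these assumptions, a positive weight means that a key recently followed by reward is more likely to be pressed again, which is the behavioral signature the parameter is meant to capture.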

Summary statistics for the five outcome-irrelevant learning estimates can be found in Table _{outcome-irrelevant} with first-stage and second-stage I and II scores were 0.12, 0.11, and 0.15, respectively (_{outcome-irrelevant} with first-stage and second-stage I and II scores were 0.35, 0.45, and 0.41, respectively (

A number of issues regarding outcome-irrelevant learning estimation need consideration. First, this type of learning was observed despite extensive task experience [

Model-based strategies are an expression of goal-directed control, which utilize explicit knowledge about the transition structure of the environment in order to inform the best option choices [_{model-based}; see Supplementary Information, Eq. 6 and Table _{model-based} parameter [

First-stage score (see Supplementary Fig.

Second-stage score (see Supplementary Fig.
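The first-stage model-based score is described in the table note as a previous reward × previous transition interaction on the probability of repeating the first-stage choice. A simplified sketch of that interaction is given below. It assumes a difference-in-differences of stay proportions rather than the regression coefficients reported in the study, and the function name `stay_interaction` and the 0/1 coding are hypothetical.

```python
from statistics import mean
from typing import Sequence


def stay_interaction(choices: Sequence[int], rewards: Sequence[int],
                     common: Sequence[int]) -> float:
    """Simplified reward x transition interaction on first-stage stays.

    choices[t]: first-stage choice on trial t (0 or 1)
    rewards[t]: 1 if trial t was rewarded, else 0
    common[t]:  1 if trial t used the common transition, 0 if rare
    Returns (P(stay | rewarded, common) - P(stay | unrewarded, common))
          - (P(stay | rewarded, rare)   - P(stay | unrewarded, rare)).
    """
    if not (len(choices) == len(rewards) == len(common)):
        raise ValueError("all sequences must have the same length")

    # One list of stay indicators per (previous reward, previous transition).
    cells = {(r, c): [] for r in (0, 1) for c in (0, 1)}
    for t in range(1, len(choices)):
        stayed = int(choices[t] == choices[t - 1])
        cells[(rewards[t - 1], common[t - 1])].append(stayed)

    if any(len(v) == 0 for v in cells.values()):
        raise ValueError("every reward x transition cell needs at least one trial")

    p = {cell: mean(v) for cell, v in cells.items()}
    return (p[(1, 1)] - p[(0, 1)]) - (p[(1, 0)] - p[(0, 0)])


# Toy agent that stays after rewarded-common and unrewarded-rare trials only.
mb_choices = [0, 0, 1, 0, 0]
toy_rewards = [1, 0, 1, 0, 0]
toy_common = [1, 1, 0, 0, 1]
mb_score = stay_interaction(mb_choices, toy_rewards, toy_common)

# Toy agent that simply repeats rewarded choices, ignoring the transition.
mf_choices = [0, 0, 1, 1, 0]
mf_score = stay_interaction(mf_choices, toy_rewards, toy_common)
print(mb_score, mf_score)
```

In this simplified form, an agent that uses the transition structure produces a positive interaction, whereas an agent that only repeats rewarded choices produces an interaction of zero.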

Summary statistics for the three model-based estimates of interest can be found in Table _{1} parameter and the first-stage score, and 0.37 between the _{model-based} parameter and the second-stage score (all

A few caveats regarding model-based control estimates need to be acknowledged. First, as studies have raised concerns regarding estimates derived from first-stage scores [

Our main question was whether outcome-irrelevant learning is associated with compulsivity. Thus, we examined the correlation between latent compulsivity factor scores and outcome-irrelevant learning estimates (see Fig. _{95%}: 0.08–0.25, BF_{10} = 140.47 in support of H1, see Fig.

Next, we replicated a finding reported in previous studies, namely a negative correlation between compulsivity and model-based abilities [_{95%}: −0.26 to −0.09, BF_{10} = 272.74 in support of H1, see Fig. _{95%}: −0.37 to −0.21, BF_{10} = 8.84 × 10^{8}; see Supplementary Fig.

To examine an association between outcome-irrelevant learning and compulsivity, while controlling for model-based abilities, we next conducted a multiple Bayesian linear regression. In this analysis, we tested the effects of outcome-irrelevant learning and model-based abilities, as well as their interaction, on compulsivity. Following recent guidelines for Bayesian linear regression [^{2} = 4.5%). The results indicated that the data are 1435.92 times more likely under the winning model compared to the null model, 6.15 times more likely compared to a model with only model-based control as a predictor, 11.87 times more likely compared to a model with only outcome-irrelevant learning as a predictor of compulsivity, and 2.69 times more likely compared to a model with outcome-irrelevant learning, model-based control, and their interaction as predictors of compulsivity. Examining the posterior parameter distributions for the winning model showed that higher outcome-irrelevant learning (coefficient posterior mean = 0.12, CI_{95%} = 0.04 to 0.21) and lower model-based abilities (coefficient posterior mean = −0.13, CI_{95%} = −0.22 to −0.06) predicted higher compulsivity estimates (see Fig.
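The model comparison above can be re-expressed relative to the null model with simple arithmetic, because Bayes factors computed on the same data are transitive: the Bayes factor of a competing model against the null equals the Bayes factor of the winning model against the null divided by the Bayes factor of the winning model against that competitor. The sketch below applies this identity to the four values reported above. The dictionary labels and function name are ours, and the derived values are not reported in the study.

```python
def implied_bf_vs_null(bf_winner_vs_null: float,
                       bf_winner_vs_model: float) -> float:
    """BF(model vs. null) = BF(winner vs. null) / BF(winner vs. model)."""
    if bf_winner_vs_model <= 0:
        raise ValueError("Bayes factors must be positive")
    return bf_winner_vs_null / bf_winner_vs_model


# Values reported in the text: the winning model against each alternative.
BF_WINNER_VS_NULL = 1435.92
bf_winner_vs = {
    "model-based control only": 6.15,
    "outcome-irrelevant learning only": 11.87,
    "both predictors plus interaction": 2.69,
}

# Derived evidence for each competing model against the null model.
implied = {
    name: implied_bf_vs_null(BF_WINNER_VS_NULL, bf)
    for name, bf in bf_winner_vs.items()
}

for name, bf in implied.items():
    print(f"{name}: BF vs. null = {bf:.2f}")
```

Read this way, every model containing at least one of the two predictors is itself strongly favored over the null, and the winning model is preferred over each of them by the more modest factors given in the text.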

Our latent factor of compulsivity was obtained using a non-orthogonal rotation, as this allowed us to deal with factor indeterminacy and provided us with an easier way to interpret factor estimates. However, it also meant that clinical factors were expected to be correlated (see Supplementary Fig.

Finally, one concern in our analysis comes from the use of individual random effect coefficients for subsequent analyses, a procedure that can underestimate variances and overestimate the covariance [

Thus, our main finding is a positive association between outcome-irrelevant learning and compulsivity. Despite the small effect size (~3% explained variance), this association remained significant even after controlling for model-based control and the co-occurrence of obsessive, anxious, depressive, and schizotypal tendencies.

Compulsive rituals are often performed under the belief that they alter the probability of the occurrence of some future event [

A remarkable element of outcome-irrelevant learning estimates is that they are expressed across outcome-relevant features of the task (i.e., fractals, states, and stages) [

A prominent observation in the reinforcement-learning literature regarding compulsive behavior is that individuals with high compulsive tendencies show reduced model-based control [

Another related issue is that the current study did not address possible theoretical reasons as to why model-based control was negatively associated with outcome-irrelevant learning, and we suggest this as a useful focus for future investigation. Interestingly, Moran et al. argued that a cognitive map (or model) guides credit assignment [

Our results have relevance for the interpretation of findings from value-based neuroimaging studies on compulsive individuals. Specifically, a blunted neural response to a reward has been reported in compulsive individuals, with areas such as the nucleus accumbens showing reduced reward anticipation encoding [

One limitation to the current study is that we cannot determine a direction of causality using regression analysis alone [

Another potential limitation relates to a suggestion that the deployment of model-based strategies, such as in Daw et al.’s task version, does not necessarily lead to higher gains. This means model-based estimates such as ours might underestimate an individual’s true ability, as participants might not have been motivated to deploy model-based strategies [

To conclude, we demonstrate a positive relationship between outcome-irrelevant learning and compulsive behavior in a healthy volunteer sample. We suggest that attributing value to task representations regardless of their outcome relevance may be one contributory component to the emergence of compulsive behaviors.

We thank Dr. Gita Prabhu in NSPN data management and Dr. Matilde Vaghi for their help with data processing. This work was funded by a Wellcome Trust Strategic Award 095844/Z/11/Z (NSPN) and a Wellcome Trust Investigator Award 098362/Z/12/Z (RJD). The UCL Max Planck Centre for Computational Psychiatry is jointly funded by UCL and the Max Planck Society (MPS). TUH is supported by a Wellcome Sir Henry Dale Fellowship (211155/Z/18/Z), a grant from the Jacobs Foundation (2017-1261-04), the Medical Research Foundation, and a 2018 NARSAD Young Investigator Grant (27023) from the Brain & Behavior Research Foundation. NS has received funding from the Israeli Science Foundation (grant no. 2536/20). ETB is an NIHR Senior Investigator (RNAG/356).

NS analyzed the data and wrote the manuscript. TUH and RM contributed to data analysis and took part in writing and revising the manuscript. MM supervised data collection, reviewed data analysis, and took part in reviewing and revising the manuscript. ETB supervised the study design and data collection. RJD supervised data collection and data analysis, and reviewed and revised the manuscript.

The authors declare no competing interests.

Supplemental Information

The online version contains supplementary material available at