Table of Contents
Associative representations underlying instrumental behavior
Detecting shifts towards habits
Stress Affects the Balance Between Goal-directed and Habitual Control of Behavior
Design and Procedure
Subjective stress ratings
Checking of Stress Manipulation, Instrumental Learning, and Mindfulness
Slips of action
Stress-induced Shift towards Habits
Differential Stress Responses between more and less Mindful Individuals
Moderation of Mindfulness on the Stress-induced Shift
A Mediation Model
Individuals differ in their tendency to shift towards habitual behavior after stress. Mindfulness is an individual characteristic that influences not only the stress response but also cognitive processes. Therefore, effects of trait-mindfulness on the balance between goal-directed and habitual behavior after a stress induction were investigated. In this study, forty-seven healthy male and female participants, mainly without meditation experience, were exposed to either an acute laboratory stress induction or a control procedure before they performed a two-stage instrumental learning paradigm and completed a mindfulness questionnaire. Participants learned the discriminations and, as predicted, a congruence effect was found. Moreover, explicit knowledge was highly indicative of the level of goal-directedness. The stress manipulation was successful. However, stressed individuals did not shift more towards habits. Mindfulness did not predict the level of goal-directedness, nor did it affect stress responses. Future studies might incorporate a mindfulness intervention or compare meditators with different levels of experience. This could help to establish mindfulness as a potential intervention to reduce cognitive shifts towards habits, which play a major role in the development of pathological behavior.
Keywords: mindfulness, stress, instrumental learning, goal-directedness, habits
After a stressful day at work on which your boss shouted at you, you might engage in thoughts about why he did so and what you could have done differently to prevent the verbal attack, and emotions like anger, rage, or sadness might arise. You might thereby lose touch with the present moment, in which you are in a safe environment, simply having thoughts and feelings. When seeing the fridge, you might then automatically open it and grab a snack, despite your new goal of eating less.
To experience a situation as stressful, the individual must appraise the stressor as threatening and its own resources for dealing with it as insufficient (Lazarus & Folkman, 1984). The perception of stress is a major factor in determining behavioral and physiological stress responses (McEwen, 1998). The immediate response to a stressor is the rapid activation of the autonomic nervous system (ANS), resulting in the release of catecholamines, specifically noradrenaline and adrenaline, from the adrenal medulla (de Kloet, Joëls, & Holsboer, 2005). Elevated levels of these hormones increase heart rate and blood pressure and enhance attention (Ulrich-Lai & Herman, 2009). The increased activation of the slower hypothalamic-pituitary-adrenal (HPA) axis leads to elevations in the glucocorticoid hormone cortisol, which is released from the adrenal cortex (de Kloet et al., 2005; McEwen, 1998; Ulrich-Lai & Herman, 2009). Glucocorticoids affect the brain in multiple ways, for instance by suppressing cognitive processes unrelated to the stressor (van Stegeren, Roozendaal, Kindt, Wolf, & Joëls, 2010). Under stress, behavioral control can thereby shift from goal-directed towards habitual (Schwabe & Wolf, 2009, 2013). In the first instance, this is an adaptive process, because it enables the individual to save cognitive resources and to avoid delay and hesitation when faced with a stressor (Schwabe & Wolf, 2013). However, such habitual behavior is less flexible, because it does not take changes in a goal’s value into account (e.g., Adams, 1982). Not all individuals revert to habitual behavior when stressed (Schwabe & Wolf, 2009). To reconsider the example from above: the individual might have shown an intense bodily response to the stressors at work, which then rendered behavioral control habitual at the expense of the new goal of eating less. If this individual had effective strategies for reducing the experience of stress and bodily responses, shifting towards habits could be less likely. 
One such strategy could be mindfulness, as it is associated with reduced levels of experienced stress (e.g., Brown & Ryan, 2003), as well as lower blood pressure reactivity (Kemeny et al., 2012) and a lower salivary cortisol response to a stressor (Brown, Weinstein, & Creswell, 2012).
Behavior is called instrumental when it has been learned through, and is controlled by, a relationship between a response (R) and its outcome (O) in the presence of some discriminative stimulus (S) (Balleine & Dickinson, 1998; Colwill & Rescorla, 1988). To return to the example: if one saw the fridge (S) and left it closed (R) in order to stick to one’s goal of eating less (O), this behavior would be an instrumentally learned goal-directed action. There are multiple ways in which associations between these components can be formed.
Associative representations underlying instrumental behavior.
Two-process theories (e.g., Asratyan, 1974) suggest that instrumental behavior results from both Pavlovian S–O and instrumental R–O learning. The product of the two learning processes, S–O and R–O, is called an S → O → R associative chain (Bray, Rangel, Shimojo, Balleine, & O'Doherty, 2008; de Wit & Dickinson, 2009; Dickinson & de Wit, 2003). If one saw the fridge (S), one would first consider the goal, which might be to allay one’s hunger or, as in the previous case, to eat less (O), and then either open the fridge to eat or refrain from doing so (R). This association is mediated by the representation of the common outcome (de Wit & Dickinson, 2009), because a purely Pavlovian stimulus S can prime performance of an instrumental response R that was separately paired with the same outcome O as the Pavlovian stimulus (Colwill & Rescorla, 1988; Corbit, Janak, & Balleine, 2007).
The same behavior can be either goal-directed or habitual in animals (Adams, 1982; Balleine & Dickinson, 1998; Dickinson, 1985) and in humans (Tricomi, Balleine, & O'Doherty, 2009). A goal-directed action is purposeful behavior that is controlled by knowledge about its consequences (Adams & Dickinson, 1981a; de Wit & Dickinson, 2009; Dickinson, 1985). Behavior is therefore controlled more flexibly, as it can be constantly adjusted to the current desirability of goals (Adams & Dickinson, 1981a; Dickinson, 1985). In contrast, habits are simply triggered by accompanying stimuli, regardless of the current value of the consequences (Adams & Dickinson, 1981a; Dickinson, 1985). This allows fast and appropriate responses in stable environments (de Wit & Dickinson, 2009), like being able to change gears while concentrating on the traffic. However, it can make behavior less flexible. The two systems can also be differentiated in terms of their neural substrates. For instance, fMRI studies in humans have found that the ventromedial prefrontal cortex (de Wit, Watson et al., 2012; Tanaka, Balleine, & O'Doherty, 2008; Valentin, Dickinson, & O'Doherty, 2007) as well as one of its target areas in the dorsomedial striatum, the anterior caudate nucleus, are involved in goal-directed actions (de Wit, Watson et al., 2012; Tanaka et al., 2008). Conversely, the dorsolateral posterior striatum corresponds functionally to habitual behavior (de Wit, Watson et al., 2012; Tricomi et al., 2009).
A common technique to identify the form of behavioral control is to devalue the outcome. Goal-directed actions and habits can be distinguished by testing for knowledge about the outcome in the associative structure (Adams & Dickinson, 1981b; Colwill & Rescorla, 1985; Dickinson, 1985). As an example, rats were trained to perform a certain action that was reinforced with food. The food was subsequently devalued either by conditioning an aversion to it or by feeding to satiety. In a subsequent extinction test, a reduction in the behavior that was previously reinforced with the now devalued outcome provides evidence that the animal has learned to perform the behavior in order to obtain the outcome, that is, an R → O association (Colwill & Rescorla, 1986). Behavior can change from a state in which it is sensitive to outcome devaluation into one in which it is not (Adams & Dickinson, 1981a). With extended training, the rats’ responding after outcome devaluation remained at a high level compared to that of rats that had been reinforced for a smaller number of responses (Adams, 1982). This failure to integrate the outcome’s current value suggests a shift towards habits; that is, an instrumental behavior that used to be driven by knowledge about the outcome becomes, with repeated practice, a response independent of the current value of the outcome (Dickinson, 1985). Such shifts also happen as a result of response conflict within an instrumental learning paradigm (de Wit, Niry, Wariyar, Aitken, & Dickinson, 2007; de Wit, Standing et al., 2012; de Wit, Watson et al., 2012; Gillan et al., 2011), and after experience of an acute stressor (Schwabe & Wolf, 2009, 2011, 2013).
Detecting shifts towards habits.
The present study investigated shifts towards habitual behavior using an instrumental learning paradigm (de Wit et al., 2007; de Wit, Standing et al., 2012; de Wit, Watson et al., 2012; Gillan et al., 2011). Within this paradigm, three different biconditional discriminations are employed in a learning stage: congruent, incongruent, and standard. In the congruent discrimination, the outcome and stimulus assigned to an action are the same event, allowing the discrimination to be learned by two simple S/O → R associations. In contrast, in the incongruent discrimination, the stimulus assigned to one response is the same as the outcome of another response. This would lead to response conflict if the discrimination was learned with integration of the outcome, that is, by two S → O → R chains. As a result of this conflict, participants in previous studies learned the congruent discrimination more quickly than the incongruent discrimination (de Wit et al., 2007; de Wit, Standing et al., 2012; de Wit, Watson et al., 2012; Gillan et al., 2011). Good performance on the incongruent discrimination despite slower learning therefore indicates that learning is mediated through S → R associations, which are insensitive to outcome devaluation (de Wit et al., 2007; de Wit, Standing et al., 2012; de Wit, Watson et al., 2012; Gillan et al., 2011). In the critical standard discrimination, stimuli and outcomes are different sets of events, so that the discrimination can be learned by either S → R or S → O → R associations without raising conflict. In a subsequent slips-of-action test (Gillan et al., 2011), to the extent that individuals are insensitive to outcome devaluation, that is, commit slips of action, their behavior is controlled by S-R habits.
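The three discrimination types described above can be expressed as a small data structure. The following is only an illustrative sketch: the class name `Trial`, the function `classify`, the response labels, and the fruit assignments are hypothetical placeholders and do not reproduce the actual stimulus mapping of the task.

```python
from typing import NamedTuple, Tuple


class Trial(NamedTuple):
    """One stimulus-response-outcome (S, R, O) contingency."""
    stimulus: str
    response: str
    outcome: str


def classify(pair: Tuple[Trial, Trial]) -> str:
    """Classify a biconditional discrimination made up of two S-R-O trials."""
    a, b = pair
    # Congruent: stimulus and outcome of each action are the same event.
    if a.stimulus == a.outcome and b.stimulus == b.outcome:
        return "congruent"
    # Incongruent: the stimulus for one response is the outcome of the other.
    if a.stimulus == b.outcome and b.stimulus == a.outcome:
        return "incongruent"
    # Standard: stimuli and outcomes are entirely different sets of events.
    if {a.stimulus, b.stimulus}.isdisjoint({a.outcome, b.outcome}):
        return "standard"
    return "unclassified"


# Placeholder assignments (hypothetical, for illustration only).
congruent = (Trial("apple", "left", "apple"), Trial("pear", "right", "pear"))
incongruent = (Trial("banana", "left", "cherry"), Trial("cherry", "right", "banana"))
standard = (Trial("melon", "left", "orange"), Trial("grapes", "right", "pineapple"))

for name, pair in [("congruent", congruent), ("incongruent", incongruent), ("standard", standard)]:
    print(name, "->", classify(pair))
```

In the incongruent pair, each fruit is associated with both responses (once as stimulus, once as outcome), which is the source of the response conflict under S → O → R learning.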
Stress Affects the Balance Between Goal-directed and Habitual Control of Behavior
Stress has been shown to affect learning. For example, rodents and humans favored dorsal striatum-dependent S-R learning over hippocampus-dependent spatial (cognitive) learning after stress (Kim, Lee, Han, & Packard, 2001; Schwabe et al., 2007). Furthermore, when rats were exposed to chronic stress, their decision-making was biased towards habitual strategies, and stress also caused opposing structural changes, with atrophy in the medial prefrontal cortex, the area underlying goal-directed learning, and hypertrophy in the dorsolateral striatum, which facilitates habit learning (Dias-Ferreira et al., 2009). In humans, stress before learning an instrumental task rendered behavior insensitive to outcome devaluation and therefore habitual at the expense of goal-directed performance (Schwabe & Wolf, 2009). This may indicate a stress-induced shift from prefrontal cortex- and dorsomedial striatum-based goal-directed learning to dorsolateral striatum-based habit learning (Schwabe & Wolf, 2013), with reduced activity in the prefrontal cortex (van Stegeren et al., 2010) and facilitated S-R memory processes (Schwabe et al., 2007). Glucocorticoid stress hormones (mainly cortisol in humans) have been suggested to act as a switch between ‘cognitive’ and ‘habit’ learning systems (Schwabe, Schächinger, de Kloet, & Oitzl, 2010). Indeed, the high density of glucocorticoid receptors in the prefrontal cortex (McEwen, de Kloet, & Rostene, 1986) suggests that this area is highly sensitive to stress. However, only concurrent glucocorticoid and noradrenergic activity impairs goal-directed action (Schwabe, Tegenthoff, Höffken, & Wolf, 2012). In the present study, exposing participants to an acute stressor was expected to raise both catecholamine and glucocorticoid concentrations, which should in turn elicit shifts towards habits.
Until now, it has not been examined how characteristics that affect an individual’s reaction to stress might affect stress-induced shifts towards habits. If an individual had strategies to reduce both the experience of stress and bodily responses, shifting towards habitual control, such as eating habits, could be less likely. One such strategy could be mindfulness.
People can engage in states of mind in which their attention is persistently focused elsewhere. This can include excessive thinking about how and why things happened or will happen and judging one’s perceptions, cognitions, emotions, or sensations as good, bad, pleasant, unpleasant, et cetera (Marlatt & Kristeller, 2003), in this way behaving automatically and without awareness (Brown & Ryan, 2003). Others are much more mindful, that is, they are in a cognitive state that “can be considered as an enhanced attention to and awareness of current experience or present reality” (Brown & Ryan, 2003, p. 822), or is “the nonjudgmental observation of the ongoing stream of internal and external stimuli as they arise” (Baer, 2003, p. 125). The construct is ambiguous, as it may describe a set of skills that can be developed through meditation or other exercises (Baer, 2004). It can also be a state or a trait (Brown & Ryan, 2004), because mindfulness has been suggested to be a naturally occurring capacity that varies within persons (Brown & Ryan, 2003; Kabat-Zinn, 2003). Mindfulness may enhance psychological well-being by serving as an exposure procedure in which sustained observation of aversive thoughts and feelings reduces reactivity to emotional stimuli (Kabat-Zinn, 1982; Linehan, 1993). The exposure and nonjudgmental attending may cause a process of desensitization through which distressing sensations, thoughts, and emotions that would otherwise be avoided become less distressing (Keng, Smoski, & Robins, 2011). Mindfulness can be cultivated in interventions like Mindfulness-Based Stress Reduction (MBSR), which is designed to enable individuals to be less reactive and judgmental and to refrain from habitual and maladaptive patterns of thinking and behavior (Keng et al., 2011). 
The program reduces self-reported levels of general psychological distress, including perceived stress (Brown & Ryan, 2003; Bränström, Kvillemo, Brandberg, & Moskowitz, 2010; Oman, Shapiro, Thoresen, Plante, & Flinders, 2008). Moreover, short-term meditation has been shown to result in lower salivary cortisol responses after an acute stressor (Tang et al., 2007) and attenuated blood pressure reactivity during the Trier Social Stress Test (TSST; Kirschbaum, Pirke, & Hellhammer, 1993) in individuals who practiced meditation more (Kemeny et al., 2012). Meditation practice is associated with higher cognitive flexibility, a greater ability to focus and sustain attention (Moore & Malinowski, 2009), and improved resolution of mental conflicts (Tang et al., 2007).
However, carrying out an intervention is time-consuming and costly and was not feasible within the scope of the present study. Mindfulness training increases trait mindfulness (Brown & Ryan, 2003; Shapiro, Oman, Thoresen, Plante, & Flinders, 2008), which in turn mediates the effects of the intervention on clinical outcomes, for instance perceived stress (Shapiro et al., 2008). Several instruments have been developed to assess trait-mindfulness. One example is the Kentucky Inventory of Mindfulness Skills (KIMS; Baer, 2004), which was used in the present study to assess mindfulness as a multi-faceted construct with four scales that correspond to four mindfulness skills as conceptualized in Dialectical Behavior Therapy (Linehan, 1993). Brown et al. (2012) provided the first evidence that trait-mindfulness moderates the neuroendocrine response to an acute stressor: higher trait-mindfulness predicted a lower salivary cortisol response to the TSST. Correlational research has found multiple correlates of trait mindfulness, for instance fewer stress symptoms (Brown & Ryan, 2003), higher levels of satisfaction (Brown & Ryan, 2003), fewer cognitive failures (Herndon, 2008), less rumination (Brown & Ryan, 2003; Raes & Williams, 2010), reduced negative thought frequency, and a higher ability to let go of negative thoughts (Frewen, Evans, Maraj, Dozois, & Partridge, 2007). The ability to let go of negative thinking may lead to less overthinking of experienced stress, thus enabling a quicker recovery of adaptive cognitive control, so that one can focus on what is important right now, that is, being oriented towards current goals. Therefore, one would assume that individuals high in trait-mindfulness show lower stress perception and a lower neuroendocrine response when faced with an acute stressor. 
For individuals high in trait mindfulness, this buffer effect on stress responding, as well as higher cognitive control and flexibility, may reduce cognitive shifts towards habits.
To summarize, under stress, behavior may shift from goal-directed actions to habits. In the first instance, this is an adaptive process, but it may impede flexible behavioral changes, and an overreliance on habits may set the stage for pathological behavior such as addiction or depression (Schwabe, Dickinson, & Wolf, 2011; Schwabe & Wolf, 2013). Stress is a major risk factor for these disorders (Sinha, 2008), which are characterized by reduced cognitive flexibility or dysfunctional behavioral routines (Everitt et al., 2008; Gotlib & Joormann, 2010). Since both stress and mindfulness have been found to influence cognitive behavioral control and neuroendocrine processes, the emerging question is whether the stress-induced shift to habitual behavioral control is moderated by mindfulness.
Therefore, the following hypotheses are proposed: (1) In the learning phase, performance is reduced in the incongruent discrimination as compared to the standard and congruent discriminations (congruence effect). Performance on incongruent trials is furthermore insensitive to outcome devaluation, as these discriminations could only have been learned through S → R associations. (2) Stress shifts behavioral control towards habits. In the initial phase, learning performance does not differ between the stress and control groups, because in the standard discrimination both goal-directed and habitual learning lead to success. However, the stress group commits more slips of action and shows reduced outcome learning as compared to the control group. (3) Individuals higher in trait mindfulness show lower physiological stress responses and lower stress perception than less mindful individuals. (4) Stressed individuals higher in trait-mindfulness are less likely to shift towards habits than individuals who are less mindful (Fig. 1).
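Hypothesis (4) is a moderation claim. One conventional way to formalize such a claim is a regression with an interaction term; it is offered here purely as an illustration and is not necessarily the model used in the analysis. All symbols are assumptions introduced for this sketch: Y_i is a goal-directedness score for participant i (coded so that higher values mean more goal-directed behavior), S_i is a group indicator (1 = stress, 0 = control), and M_i is the mean-centered trait-mindfulness score.

```latex
Y_i = \beta_0 + \beta_1 S_i + \beta_2 M_i + \beta_3 \,(S_i \times M_i) + \varepsilon_i
```

In this formulation, the effect of stress on goal-directedness at a given level of mindfulness is \beta_1 + \beta_3 M_i. Hypothesis (2) would correspond to \beta_1 < 0, and hypothesis (4) to \beta_3 > 0, meaning that the stress-related reduction in goal-directedness becomes smaller as trait mindfulness increases.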
Forty-seven participants (25 females and 22 males) with a mean age of M = 24.68 years (SE = 0.44 years, range: 19-31 years) were recruited by online announcement and posted notices at the University of Hamburg. Participants were therefore mainly students; however, psychology students were not permitted to participate. All participants were physically and mentally healthy, were non-smokers, and did not drink more than one glass of beer or wine on a daily basis or take any drugs. Participation was limited to individuals with a normal body mass index (range: 18.5 to 26.5), which is calculated by dividing body weight by the square of body height. To avoid confounding in the salivary cortisol response to the stressor, women were not tested during their follicular phase of menses or if they used hormonal contraception (Kirschbaum, Kudielka, Gaab, Schommer, & Hellhammer, 1999). On the test day, participants had to refrain from drinking caffeine-containing beverages, eating, and engaging in excessive physical activity for two hours prior to testing. All participants provided written informed consent. Participants received a monetary compensation of 32 Euros for their participation. The study was approved by the local ethics committee (Ethics commission Faculty of Psychology and Movement Sciences, University of Hamburg).
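The body mass index criterion can be sketched as follows. This is a minimal illustration, assuming weight in kilograms and height in meters (the standard BMI units) and treating both limits of the 18.5 to 26.5 range as inclusive; the function names are hypothetical and not taken from the study materials.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Body weight divided by the square of body height (kg / m^2)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_eligible(weight_kg: float, height_m: float,
                low: float = 18.5, high: float = 26.5) -> bool:
    """Check the BMI inclusion range (inclusive limits are an assumption)."""
    return low <= body_mass_index(weight_kg, height_m) <= high


# Example: 70 kg at 1.75 m gives a BMI of about 22.9, inside the range.
print(round(body_mass_index(70, 1.75), 1), is_eligible(70, 1.75))
```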
Design and Procedure
The performance of a stress group and a control group was compared on an instrumental learning paradigm (de Wit et al., 2007). To this end, participants were randomly assigned to either a stress group (n = 23, MAge = 24.74, SEAge = 0.50) or a control group (n = 24, MAge = 24.63, SEAge = 0.74). Testing took place between 12 noon and 6 p.m. at the Psychological Department of the University of Hamburg. After their arrival, participants signed the informed consent form, filled out questionnaires that were unrelated to this study, and were set up with electroencephalographic (EEG) equipment (reported elsewhere). In addition, baseline blood pressure measurements (t baseline) were obtained. The participant was then brought to a different room, where either the stress or the control procedure was carried out. Prior to the procedure, cortisol and blood pressure were measured (t pre-stress), and the participant again completed unrelated questionnaires. The cortisol analysis was not part of the present study. At the end of the stress or control procedure, cortisol and blood pressure measurements (t +0), subjective stress ratings, and other questionnaires were obtained. The participants were then brought back to the first experimental room, where they were instructed about the upcoming instrumental learning task. They had to provide another saliva sample as well as blood pressure measures (t +10). The first phase of the instrumental learning paradigm started 10 minutes after the end of the stress or control protocol. Afterwards, participants again provided a saliva sample and blood pressure measures (t +25). Then, they were instructed about the slips-of-action phase and completed this second phase 25 minutes after the stressor. Afterwards, participants filled out explicit knowledge questionnaires, some unrelated questionnaires, and the mindfulness questionnaire. 
The procedure continued with another saliva sample and blood pressure measurement (t +90), a task unrelated to the present study, some unrelated questionnaires, and the debriefing.
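The sampling schedule described in this procedure can be summarized in a compact form. The sketch below only restates which measures were taken at which labeled time point; the names `SamplePoint`, `SCHEDULE`, and `labels_with` are hypothetical and introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplePoint:
    """One measurement occasion; time labels are relative to the end of the stressor."""
    label: str
    saliva: bool
    blood_pressure: bool
    stress_rating: bool = False


SCHEDULE = [
    SamplePoint("t baseline", saliva=False, blood_pressure=True),
    SamplePoint("t pre-stress", saliva=True, blood_pressure=True),
    SamplePoint("t +0", saliva=True, blood_pressure=True, stress_rating=True),
    SamplePoint("t +10", saliva=True, blood_pressure=True),   # before learning phase
    SamplePoint("t +25", saliva=True, blood_pressure=True),   # before slips-of-action phase
    SamplePoint("t +90", saliva=True, blood_pressure=True),
]


def labels_with(measure: str) -> list:
    """Return the time-point labels at which the given measure was collected."""
    return [point.label for point in SCHEDULE if getattr(point, measure)]


print(labels_with("blood_pressure"))
print(labels_with("saliva"))
```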
In this study, the Maastricht Acute Stress Test (MAST; Smeets et al., 2012) was used to elicit the stress response. The MAST is a combination of the Socially Evaluated Cold-Pressor Test (SECPT; Schwabe, Haddad, & Schächinger, 2008) and the TSST. It elicits an autonomic (i.e. blood pressure increase), a glucocorticoid (i.e. salivary cortisol increase), and a subjective stress response (Smeets et al., 2012). The MAST merges the painful aspect, social-evaluative threat, and uncontrollability of the SECPT and TSST into a new procedure, as follows. The experimenter brought the participant to the stress room, which was used exclusively for the stress phase, telling the participant that the task would be performed with a research colleague. That second experimenter, wearing a white coat, behaved in a reserved and distant manner towards the participant. He placed the participant in front of a computer screen and subsequently took the baseline cortisol and blood pressure measurements. During this preparation period, participants were told that they would have to immerse their right hand in ice-cold (2 °C) water in multiple trials and that the duration of the trials and the time between them would be randomly selected by the computer. This was said to increase participants’ feelings of unpredictability. In fact, the duration of all four trials was the same and fixed. They were also told that the overall task would take 12 minutes and would include a break. The overall duration was in fact 10 minutes and did not involve any break. This false information was given to ensure that participants would not experience relief prior to the measurement of stress-related cortisol and blood pressure and the subjective stress ratings. During the breaks between trials, participants were instructed to perform mental arithmetic tasks as fast as possible. To induce feelings of social evaluation and uncontrollability, they were given only negative feedback and had to restart the task each time they made a mistake. 
When they performed well and without difficulties, they were told to count faster. Participants were also told that they would be videotaped throughout the task and had to provide written consent for this. In fact, no recordings were made, but the camera signal was broadcast live to a large TV screen in front of the participant, who had to look at it during the whole procedure. This was done to increase feelings of unpleasantness and social-evaluative threat. Participants were constantly watched by the experimenter throughout the whole procedure and had the right to abort the task at any time.
Participants in the control group went through a no-stress version of the MAST, in which they had to immerse their hand in warm water (35-37 °C) and had to perform simple counting from 1 to 25. They performed the task without videotaping or negative feedback.
Subjective stress ratings
Immediately after the last hand immersion trial, participants indicated their subjective experience of stress, difficulty, unpleasantness, and painfulness on 100 mm visual analogue scales (VAS; administered using Qualtrics, 2016). A higher score indicates a stronger experience on the respective dimension.
Blood pressure was measured at the beginning of the experiment (i.e. t baseline), five minutes before the MAST or the control condition (i.e. t pre-stress), and four times afterwards (i.e. t +0, t +10, t +25 and t +90 minutes with reference to the end of the stressor or MAST-control). The measurements were obtained using the Dinamap system (Critikon, Florida) with the cuff placed on the left upper arm. Systolic and diastolic blood pressure as well as pulse were recorded.
An adapted version of the instrumental learning task (de Wit et al., 2007) was programmed in MATLAB (2015), using the Psychophysics Toolbox extensions (Brainard, 1997; Kleiner et al., 2007; Pelli, 1997), and was presented on a standard computer screen. The paradigm was divided into three stages: an instrumental learning phase, a slips-of-action test (Gillan et al., 2011), and paper-and-pencil questionnaires. Instructions about the task were provided on paper and had to be summarized by the participants to ensure that they understood the task sufficiently. To ensure that participants made an effort (i.e. collected as many points as possible), they were told that they would be able to receive a monetary bonus for successful performance. Every participant received two extra Euros at the end. Responses were recorded on a standard keyboard. Eight different fruits served as stimuli: pineapple, apple, banana, pear, cherry, melon, orange, and grapes.