Moving thoughts: emotion concepts from the perspective of context dependent embodied simulation

ABSTRACT This review article presents our perspective on psychological and physiological mechanisms underlying concepts from the domain of affect, emotion, and motivation. We suggest that these concepts are linked to sensorimotor and interoceptive systems, and as such represent a paradigmatic example of embodied conceptual processing. In view of recent debates about the scope of embodiment, however, we argue that the use of grounded resources in emotion concepts is flexible and context dependent. The degree to which embodied resources are engaged during conceptual processing depends upon multiple factors, including an individual's task, goals, resources, as well as constraints both temporal and situational. In addition, we highlight the extent to which conceptual understanding of emotion, and its specific embodiment, is shaped by social and cultural influences. Accordingly, we call for research that more fully incorporates higher-order psychological factors into the study of the physiological and neural mechanisms that underpin emotion concepts.

Congenital analgesia is a rare condition in which a person cannot feel (and has never felt) physical pain (Manfredi et al., 1981). Less rare (an estimated 1.1% of the British population) are adults who have never experienced sexual desire (Bogaert, 2004). If a concept is a structure in semantic memory that supports categorisation, meaning, and inference, do these individuals understand the meaning of concepts such as PAIN or LUST? Are these cases of understanding affect- and motivation-related concepts similar to that of colourblind individuals who may (or may not) be missing some part of the meaning of RED? Do our concepts originate, as Hume (1740/1973) argued, in the rearrangement of sensory data? Can careful observation of other people, or reading an extensive list of novels, compensate for the lack of first-person contact with these experiential phenomena? Perhaps there is no issue, as in the opinion of one of our colleagues who dismissively said that an economist can win a Nobel prize for understanding the operation of factories, without ever visiting one.
While of interest and importance today, research in the cognitive sciences has historically treated such questions as largely irrelevant. On these traditional accounts, understanding is seen as resulting from the human ability to form a rich, distributed network of abstract, symbolic representations (Collins & Loftus, 1975). Concepts are distinct from percepts because percepts are driven by interaction with the external world, while concepts are arbitrary symbols subject to offline manipulation and logical operations. Consequently, concepts are meaningful in virtue of their role in a larger compositional system governed by truth-preserving operations (Fodor, 1975; see Quilty-Dunn et al., 2022 for a recent incarnation of this approach). Though all concepts get their meaning from their role in a larger system, some concepts are "concrete" and refer to objects (e.g. BIRD) and actions (FLY) that can be directly perceived, whereas others are "abstract" and refer to unperceivable entities, like ideas (DEMOCRACY, ODD NUMBER) or, as is our focus here, feelings, motivations, and emotions (PAIN, LUST, ANGER).
According to symbolic accounts, knowledge and understanding develop gradually via a process of moving beyond individual experiences with referents, e.g. different instances of birds, or different instances of anger, to derive abstract conceptual cores used in language and reasoning (Mahon & Caramazza, 2008). In the end, understanding "anger" involves the apprehension of its essential abstract features, just as understanding the essence of "odd number" transcends whether the number is 3 or 287, is displayed in Roman or Arabic numerals, is written in green or yellow, or spoken in English or Polish. This understanding also involves knowing the relation of ANGER to other concepts (e.g. that ANGER is an EMOTION, that it can arise from INJUSTICE, can lead to a FIGHT and is different from FEAR). This conceptual meaning can (but need not) be indexed by our vocabulary. For example, the concept of ANGER is indexed by the English word "anger" which points to a context-invariant set of features that constitute a semantic core. These cores are then meaningfully connected with other concepts (ANGER - EMOTION - FEELING - EXPERIENCE, etc.). Of course, the specific semantic network varies greatly as a function of culture, and this shapes the ultimate emotional understanding (Jackson et al., 2019).
In contrast, the idea of embodied or grounded cognition was originally proposed as a way of solving the symbol grounding problem: how do we get a meaning into the conceptual system if all it has is references to other ungrounded symbols (Harnad, 1990; Searle, 1980)? The solution to this problem lies in having concepts be intrinsically linked to the recruitment of specific sensorimotor resources involved in the actual experience with a real world example of the concept (Barsalou, 1999). When the real world is not available, the perceiver can then simulate (reinstate) select aspects of their perceptual experience of it. This approach views concepts as perceptual symbols (or more broadly modal symbols) and suggests the meaning of "anger" includes the ability to construct a selective, temporary, dynamic interpretation of (say) ANGER, focusing on features or dimensions relevant to current representational needs (Barsalou, 2003).
The basic claim of grounded cognition is that concepts are supported by brain systems for perception and action (see Barsalou, 1999 or Prinz, 2002 for a more thorough comparison of symbolic and grounded approaches). Because words and concepts are learned through sensorimotor experience, conceptual retrieval involves a simulation process that recruits a subset of the brain areas linked to learning those concepts. One of the most basic claims of grounded theories of meaning is that language comprehension involves the activation of brain systems for action and perception. When applied to emotion, grounded approaches suggest that emotional language prompts affect-related responses in the body and the brain, and these responses play a functional role in its comprehension.
Accordingly, we will use the term grounded cognition in the larger sense that implies not only the use of sensory and motor resources, but also includes the brain's representations of the actual physical body including the role of peripheral inputs and outputs (Kiefer & Barsalou, 2013). Because these bodily components are important for emotion concepts, we will also use the terms embodiment and embodied concepts. Although the term "embodied cognition" is occasionally used to describe a more radical idea that the body and its coupling with the environment are constitutive of cognition, we remain agnostic on this thesis (for a comprehensive presentation of various radical notions of embodiment, see Newen et al., 2018).

Overview and main thesis
In this review article, we present our own perspective on emotion concepts and argue for the importance of their links to sensorimotor resources. Our particular theoretical perspective is inspired by the grounded cognition approach originally formulated and later developed by Barsalou (Barsalou, 1999, 2008; Barsalou et al., 2018). Our approach is generally compatible with what is known as Multiple Representation Views, which see abstract concepts as activating and recruiting the sensorimotor, affective, interoceptive, and introspective components (Borghi & Barsalou, 2021; Kiefer & Harpaintner, 2020; Vigliocco et al., 2014). There are, of course, important distinctions between various specific views, which we will highlight later. But for now, it is worth noting that some views emphasise more external grounding (object affordances, actions on the world; Borghi & Barsalou, 2021; Harpaintner et al., 2018) and links to actual speaking acts, such as mouth movements (Mazzuca et al., 2018), whereas others emphasise more internal grounding in the brain's affect system (Vigliocco et al., 2014) or introspective experience of one's own mental states and mentalising about social interactions with others (Kiefer & Harpaintner, 2020). However, they all agree on the general importance of grounding for both concrete and abstract concepts.
The structure of our review is as follows. We begin with a discussion of how emotion concepts are grounded in somatosensory processes, including in development. Next we describe research that supports a role for grounded emotion concepts and consider whether the data are most consistent with strong or weak accounts of grounded cognition. We provide a brief overview of our CODES model, outlining its motivation in research addressing how context changes the involvement of sensorimotor information in emotion processing. Finally, we suggest that a Multiple Representation account can best accommodate the role of high level contextual factors such as metaphor and cultural variation in emotion concepts. Throughout, we highlight potential physiological and neurological mechanisms underpinning the processing of emotion concepts. Importantly, we do not attempt a comprehensive review of grounded approaches to emotion concepts. Rather, we emphasise how our own theoretical and empirical work investigates embodied emotion concepts and their contextual nature, and how it fits with the existing literature.

Getting off the ground
In recent years, the grounded approach to concepts has been fruitfully applied to affective, emotional, motivational, and social concepts (Dreyer & Pulvermüller, 2018; Niedenthal et al., 2005; Wilson-Mendenhall et al., 2013). On this view, understanding the concept of ANGER is (at least in part) facilitated by the recruitment of specific sensorimotor resources involved in the actual experience of anger. When people think about the meaning of ANGER they may simulate a relevant experience of it, either from memory or constructively using currently relevant resources. Importantly, our particular perspective emphasises that the activation of sensorimotor content varies as a function of contextual factors, which we will explain later when we elaborate on our Context Dependent Embodied Simulation, i.e. CODES, model (Winkielman et al., 2018).
Emotion concepts are of particular interest to embodied theories of meaning because they simultaneously have concrete sensorimotor features and abstract relational ones. These sensorimotor features include external bodily changes (e.g. action tendencies, body movements, and facial expressions), internal bodily changes (e.g. changes in heart rate and breathing), and brain state changes (e.g. dopaminergic release), all of which may contribute to a phenomenal component such as the experience of feelings of anger, sadness, or desire (Barrett & Lindquist, 2008; Niedenthal, 2007). However, emotion concepts also have abstract, relational features (Ortony et al., 1988). Emotions are intentional: they are about things, properties, and states of affairs. For example, a feeling of anger usually comes with a strong conviction that one has been unjustly thwarted by someone or something. It is a feeling that is directed at someone or something, and is closely associated with thoughts of retaliation or revenge. When participants are asked to rate emotion concepts along the dimensions of abstractness, imageability, and context availability (that is, how easy it is to think of a context in which they occur), they rate them as significantly different from their abstract and concrete counterparts, falling in between the two (Mazzuca et al., 2018).
Emotion concepts thus present the language learner with a challenge. Whereas concrete concepts such as CAR or WALK can be grounded in relatively similar sensorimotor experiences, abstract emotional concepts such as SADNESS or JOY cannot. Part of the challenge is that the relational component of an emotion varies greatly in terms of its features; that is, many disparate situations can induce a feeling of sadness. Moreover, like many abstract concepts, the referent of emotion concepts cannot be directly observed. When a mother tells her child that she feels "sad", the child cannot directly experience her mother's sadness. However, because the child can observe the mother's actions and vocalisations, these observable features may help bridge the gap between consciously perceived internal states, the concepts that organise them, and the language that indexes them (Pulvermüller, 2018).
Further, while emotions can be elicited by a variety of different situational stimuli, there are somewhat coherent patterns in their phenomenological and physiological aspects (Matsumoto et al., 2007). This family resemblance of the internal feeling elicited by different triggers of, say, SADNESS, may offer an early hook for creating a concept. Importantly, we are not saying that different emotions are determined by distinct and stable patterns of autonomic or central activity. It has been known for years that similar autonomic and central states can underpin different emotions (Barrett, 2019; Dutton & Aron, 1974; Kragel & LaBar, 2013; Schachter & Singer, 1962; Siegel et al., 2018). What we are saying is that some pattern of internal physiological experience (e.g. feelings of low energy in SADNESS) can help mediate the integration of different instances of SADNESS into a single concept.
In fact, despite their abstract characteristics, many emotion concepts can be understood early in development. For example, infants begin to grasp key elements of SADNESS as early as 18 months (Chiarella & Poulin-Dubois, 2018). While young children lack sophisticated conceptualisation abilities, they nonetheless understand both that emotions are mental states, and that the same emotion can arise from perceptually dissimilar causes (Harris, 2008). Importantly, young children's understanding of emotion concepts starts unidimensionally, distinguished primarily in terms of valence, and only becomes multidimensional and categorical in adolescence (Nook et al., 2020). This suggests that many emotion concepts have their roots in cultural socialisation (Hoemann et al., 2020; Lindquist et al., 2015, 2022; Shablack & Lindquist, 2019).
Vigliocco and colleagues argue that internal experiences, especially those marked with valence, underlie the grounding of many abstract concepts (Vigliocco et al., 2009). This is because abstract words (by definition) do not have concrete referents that are experienced through bodily interaction with the environment. Rather they are learned by noticing similarities in internal reactions to the situations in which we learn those concepts. In keeping with Vigliocco's position, a meta-analysis suggests that in addition to the traditional five senses, interoception (the perception of one's own bodily state) makes a unique contribution to conceptual grounding (Connell et al., 2018). Connell and colleagues found that participants associated "sensations in the body" with concepts to a similar degree as the five traditional sensory modalities. Measured in this way, interoceptive grounding drove perceptual strength more strongly for abstract concepts than concrete ones and was particularly relevant for emotion concepts (Connell et al., 2018). Relatedly, Villani and colleagues (2021) present multiple studies showing that an interoceptive load condition (monitoring the heart rate) interferes selectively with the comprehension of emotion-related concepts, while a manual interference condition (squeezing a ball) hinders understanding of more concrete concepts.
Indeed, similarities in the affective response to a given situation may not only mediate the acquisition of emotion concepts, but abstract concepts more generally (Ponari et al., 2018). Evidence for this comes from studies showing a positive statistical correlation between ratings of the valence and the abstractness of words, as well as a processing advantage for valenced abstract words over more neutral ones (Kousta et al., 2011; cf. Winter, 2023). Affectively loaded abstract words are acquired earlier than abstract words that are less affectively loaded (Ponari et al., 2018). Perhaps most directly, the processing of abstract words is known to recruit the brain's affective systems, including a network of structures connected to rostral ACC, a part of the anterior cingulate that is associated with emotion processing and is highly interconnected with limbic structures (Vigliocco et al., 2014). Stressing the diversity of abstract concepts, however, other scholars have argued that affective experience is important for the grounding of some abstract concepts, like ARGUMENT, but not others, like THEORY or CALCULUS (Borghi et al., 2018; Kiefer & Harpaintner, 2020; Winter, 2023).

Feelings, interoception, and concepts
The internal experiences (feelings) underlying emotional concepts can be traced to more fundamental neural substrates supporting the representation of emotion. The exact mechanisms underlying specific emotions, feelings, and their consciousness are still under intense debate. This is because emotions are complex, multifaceted processes incorporating a range of components, which can operate with and without conscious feelings (Paul et al., 2020). Still, it appears that one source of positive feelings, desire, and approach motivations lies in neural structures linking the cortical and limbic systems, such as the orbitofrontal cortex, nucleus accumbens (NAc) and ventral pallidum (Berridge & Kringelbach, 2013). This network, along with its links to brain areas implicated in sensory experiences such as taste, provides neural grounding to concepts related to more specific desires and motivations, such as concepts of FOOD and HUNGER (Papies & Barsalou, 2015; Simmons et al., 2005). General processing of arousal and valence (both positive and negative) is also clearly linked to the network involving the amygdala (Herbert et al., 2009), perhaps because of its role in detecting salient, affectively-relevant events (Kissler, 2013).
Also important are neural structures that map and monitor bodily states, such as the insula which underpins many interoceptive experiences (Craig, 2008), including emotional ones (Critchley & Garfinkel, 2017). Interesting insights about the connection between the conceptual and experiential/bodily realm come from work on interoceptive accuracy, or the notion that individuals differ in their ability to notice and discriminate their bodily states (e.g. variations in their heartbeat, their respiratory load, etc.). Notwithstanding disputes about whether interoceptive accuracy is a single skill or multiple, modality-dependent interoceptive skills (i.e. sensitivity to one's own heart rate, ability to breathe, fullness of one's stomach or bladder), there are at least some correlations across measures of participants' metacognitive insight into interoception (Garfinkel et al., 2016). Moreover, interoceptive accuracy (as measured by individual difference measures) appears to determine how good participants are at differentiating the contributions of different sources of arousal to their mental representations. One study found that participants rated highly arousing images as more familiar when their bodily arousal was enhanced by a simple exercise manipulation (Kever et al., 2021). However, participants who scored high on interoceptive accuracy were better at differentiating the source of their arousal (that is, whether their arousal was due to the image or to the prior exercise) and were thus less influenced by the exercise manipulation when judging image familiarity (Kever et al., 2021).
Research also points to a relationship between interoceptive accuracy and refinement in the use of emotion concepts, as measured by alexithymia (Brewer et al., 2016; Trevisan et al., 2019). Accordingly, higher interoceptive accuracy is associated with feeling emotions more strongly (Barrett et al., 2004). Further, recent theoretical proposals argue many key socio-emotional concepts such as LONELINESS, TRUST, and EMPATHY are linked to interoception (Arnold et al., 2019). In line with this idea, participants use their interoceptive signals, such as cardiac contractions, to judge the trustworthiness of novel faces (Azevedo et al., 2022). Even higher order emotion concepts such as BEAUTY involve interoceptive feelings associated with contemplation, wonderment, and the motivation to approach the object we find beautiful (Fingerhut & Prinz, 2018; Freedberg & Gallese, 2007).

Emotion concepts and bodily action
Another path for grounding concepts is action, intended or realised (Glenberg & Robertson, 2000). One key link between emotion and action is the planned motor activity. For example, fear is associated with preparation for fleeing, freezing, or fighting (Frijda, 1986). There is also the actual motor activity associated with emotional expressions of the face and body (Darwin, 1872; Ekman & Friesen, 1971). For example, facial movements associated with expressions of disgust, such as nose wrinkling, reduce the acquisition of sensory information, while facial movements associated with expressions of fear, such as eyes widening, enhance sensory intake (Susskind et al., 2008). These non-arbitrary movement patterns can be noticed in facial and bodily expressions of very young children, and even in individuals who are congenitally blind and so not subject to cultural inputs via the visual modality, such as observing others' facial expressions to an object of disgust (Matsumoto & Willingham, 2009). There is a lively debate about the specificity of these motor patterns when produced, and the extent to which they need to be interpreted when perceived (Barrett et al., 2007). Furthermore, conceptual input clearly plays a role in interpretation and embodiment of facial expressions (e.g. Halberstadt et al., 2009). Still, some non-arbitrary motor profiles could influence how we engage with the concrete referents of disgust and fear. As such, the motivated relationship between these facial movements and their eliciting conditions might provide a scaffold for grounding their meaning (see Perniss & Vigliocco, 2014 for a general review of iconicity and motivated meaning). Consistent with these ideas, emotion terms like FEAR activate the primary motor cortex, presumably because of its role in postural, gestural and facial expression of emotion (Dreyer & Pulvermüller, 2018).
Another relevant phenomenon is facial mimicry, in which an observer spontaneously reproduces the emotional expression of another person (Dimberg, 1982; Palagi et al., 2020). In some ways, this mimicry is quite automatic and can occur to expression-like stimuli (e.g. smiles) that are presented very briefly (Bornemann et al., 2012; Dimberg et al., 2000), and even to "expressions" presented by non-human agents (Hofree et al., 2014). Still, these bodily reactions are consequential and influence the extent to which observing a facial expression will impact social judgments and decisions (Foroni & Semin, 2011; Winkielman et al., 2022).
Importantly, emotional mimicry is subject to modulation by social and other contextual factors (for reviews, see Arnold & Winkielman, 2019; Hess & Fischer, 2013). It is well known in this literature that facial mimicry is more common among social actors in cooperative situations than competitive ones (Hofree et al., 2018; Lanzetta & Englis, 1989). Mimicry also differs as a function of tasks and goals. For example, Hess and Kafetsios (2022) showed that emotional mimicry is more pronounced when participants are asked to rate emotions on a continuous dimension (how happy is this person?) than when simply asked to categorise the expression (is this person happy or sad?). In keeping with proposals that context determines what is bodily simulated and when, recent work shows that mimicry can have both cognitive and social roles. For example, participants presented with partially occluded facial expressions that are free of social context (standard lab stimuli with artificial occlusions) mimic only the muscles they can see; by contrast, a partially occluded happy expression presented in a social context elicits a mimicry response of the entire face, including the invisible parts (Davis et al., 2022).
Observers' propensity to mimic the emotional expressions of others also creates the possibility of emotional contagion (Hatfield et al., 1993). Under the right conditions, there is a relationship between facial mimicry and emotional experience (Olszanowski et al., 2020). Empathising with another person's pain activates neural circuits that are involved in the first-person experience of pain (Cheng et al., 2010). Note that for the grounding problem, it is not essential whether these connections exploit a unique neural mechanism (Iacoboni, 2009), innate human predispositions (Warneken & Tomasello, 2006), or are learned entirely via the perception-action system (Heyes, 2011). The critical point is that these mechanisms allow us to bridge the external actions of others to our own internal experiences.
In helping to establish common ground, this bridge is important both for making inferences about others' internal states and for talking about them in a meaningful way. The relationship between action and emotion means that observing others' actions may help us predict their emotions, at least within a culture, and to some extent across cultures in similar social contexts (Cowen et al., 2021). Body postures (Aviezer et al., 2012), facial expressions (Ekman & Friesen, 1971), vocal prosody (Scherer et al., 2001), and subtle motor activity around the eyes (Baron-Cohen et al., 2001) all provide information that can help an observer identify what another individual is feeling. Accordingly, the mere observation of facial expressions activates a rich network of neural structures that include motor areas of the brain. Further, somatosensory and motor resources in the brain can be used to construct partial simulations or "as if" loops even without any peripheral engagement (Adolphs, 2002; Damasio, 1999).

Embodied emotion concepts
We have suggested that the development of emotion concepts is mediated in part by interoceptive mechanisms that give rise to the subjective experience of emotion, and in part by the bodily actions used to express our emotions to others. Although the situations that elicit a particular emotion are highly variable, the internal and external responses they provoke are less so. Consequently, their co-occurrence with particular word forms provides a basis for aggregating across their shared semantic features (see Pulvermüller, 2018). Together this provides a means for grounding emotion concepts in embodied experiences. In this section, we describe empirical research that supports the claim that emotion concepts are grounded in embodied states with an emphasis on whether the data best support strong or weak accounts of embodiment.
One method for testing whether emotion concepts are embodied is electromyography, or EMG, which involves placing electrodes on various muscle sites to evaluate subtle facial or bodily expressions. Studies using EMG have found that participants smile to positive stimuli and frown to negative ones, though the effect is weaker for words than it is for pictures (Larsen et al., 2003). It is possible that pictures are more likely than words to "move" participants because pictures are more concretely connected to their emotional referents than are the words (Winkielman & Gogolushko, 2018). Similarly, under proper task conditions, concrete verbs associated with specific emotional expressions (e.g. "smile" and "frown") elicit corresponding EMG responses (that is, smiles and frowns, respectively), while abstract adjectives (e.g. "funny") elicit weaker, affectively congruent responses (Foroni & Semin, 2009).
Taboo words and verbal reprimands are emotionally charged and elicit greater facial responses (Foroni, 2015) and increased skin conductance relative to control words (Harris et al., 2003). These effects are stronger for words in participants' native than nonnative language, as the affective element of the concepts is arguably more strongly represented in the mother tongue (Baumeister et al., 2017; Harris et al., 2003). Further, neuroimaging studies consistently find that affectively charged words activate brain regions associated with the actual experience of affect and emotion (Citron, 2012; Kensinger & Schacter, 2006; Kuhnke et al., 2022).

Varieties of grounded cognition: from weak to strong
These studies show a connection between emotion concepts and their associated embodied responses. However, there are multiple reasons why such responses could occur. Consistent with the grounded cognition perspective, it is possible that embodied responses are partially constitutive of emotion concepts, viz. that they play some representational role. From a "strong" position, this is because the conceptual and sensorimotor systems are one and the same (as reviewed by Leshinskaya & Caramazza, 2016). From a "weak" position, conceptual representations are embodied at different levels of abstraction and the extent to which a concept activates sensorimotor systems at any given time depends upon conceptual familiarity, contextual support, type of concept, and the current demand for sensorimotor information (see Binder & Desai, 2011; Desai, 2022; Kiefer & Harpaintner, 2020 for review).
Alternatively, grounding in sensorimotor resources might be functionally relevant for conceptual processing, but distinct from the conceptual representations. For instance, the physiological activity might be the result of elaboration after the concept has been retrieved. Finally, the physiological responses might be completely epiphenomenal, reliably accompanying conceptual activity but playing no functional role (Mahon & Caramazza, 2008). On such an account, amodally represented concepts might spread activation to, say, motor circuits, but these side effects play no causal role in our understanding and have no consequences for conceptual reasoning processes (Mahon, 2015).

Emotional words and emotional faces
Compelling evidence in favour of the hypothesis that emotion concepts draw on neural resources involved in action and perception comes from research on subjects who have impaired motor function. For example, individuals with Motor Neuron Disease and Parkinson's Disease have motor deficits and these deficits are associated with impaired action-word processing (Bak & Chandran, 2012; García & Ibáñez, 2014). Individuals on the autism spectrum whose motor deficits impair their emotional expressions also display deficits in the processing of emotional words, and the extent of these two impairments is correlated (Moseley & Pulvermüller, 2018). One report suggests that a patient with a lesion in the left supplementary motor area was selectively worse in processing abstract emotional words (Dreyer et al., 2015). Similarly, the difficulties autistic individuals experience with emotional content might be related to their well-known deficits in the spontaneous mimicry of facial expressions (Clark et al., 2008; McIntosh et al., 2006; Oberman et al., 2009).
Complementing the correlational research above are studies that involve experimental manipulation of motor activity in neurotypical subjects in order to measure its impact on conceptual processing. As previously mentioned, emotions involve different patterns of muscle activity in the face (Ekman & Friesen, 1971). A smile, for example, involves the use of the Zygomaticus major muscle to pull back the corners of the lips. Unsurprisingly, many researchers have attempted to explore the consequences of manipulating facial activity on emotional feelings and on the processing of emotional concepts.
Because of recent debates about the replicability of some of these findings, it is worth highlighting a few distinctions. First, the replication debate primarily concerns how manipulating facial feedback influences ratings of feelings and affect-laden stimuli. Notably, Strack et al. (1988) asked participants for ratings of cartoon funniness while holding a pen in their mouth in a way that either facilitates smiling (lightly between teeth) or prevents smiling and increases pouting (strongly between lips). Wagenmakers and colleagues (2016) could not replicate the original report of participants finding the cartoons funnier after the induced smiling manipulation. This led to an active debate about the relative strength and potential limits of facial feedback effects (e.g. Noah et al., 2018).
The most recent conclusion is that such effects are reliable, but also small, sensitive to the specific facial manipulation, and most importantly, dependent on context (Coles et al., 2019, 2022). In fact, one key variable is what the specific facial feedback manipulation (e.g. pen in mouth) does. It can facilitate smiling, by making it easier, or even forcing participants to raise the corners of the mouth (as in Strack et al., 1988).
Alternatively, it can prevent smiling (as in Niedenthal et al., 2001), by essentially freezing the Zygomaticus muscle in one fixed position (thereby not allowing any dynamic changes in response to positive stimuli). In the research presented next, we use the latter strategy, thus avoiding any ambiguities and replicability issues associated with the use of the pen-in-mouth procedure as a way to facilitate a smiling action.
For example, to prevent smiling, participants can be asked to continuously bite on a pen held horizontally between their teeth. EMG indicates that this pen manipulation generates tonic Zygomaticus activity, injecting noise into the system while preventing movement mimicry at the periphery (Davis et al., 2015, 2017; Oberman et al., 2007). Disrupting the motor system in this way impairs the recognition and categorisation of subtle expressions of happiness that rely on motor activity in the mouth, but not subtle expressions of anger and sadness that rely heavily on motor activity at the brow (Oberman et al., 2007). Further, impairing smiling mimicry slows the detection and recognition of facial expressions that gradually change between happiness and sadness (Niedenthal et al., 2001).
These sorts of interference studies reveal a systematic relationship between the targeted muscles and the emotional expressions those muscles mediate. In a study that manipulated tonic motor activity either at the brow or at the mouth, interference at the brow impaired the recognition of expressions, such as anger, that rely on the upper half of the face, while interference at the mouth impaired the recognition of expressions, such as happiness, that rely more on the lower half of the face (Ponari et al., 2012). The claim that different halves of the face are more diagnostic for different emotional expressions has been validated both by facial EMG (Oberman et al., 2007) and by a recognition task that involved composite images that were half emotionally expressive and half neutral (Ponari et al., 2012).
Interfering with the production of facial expressions can also impair the processing of emotional language. In an emotion classification task in which participants quickly sorted words into categories associated with different emotions, interfering with motor activity on the lower half of the face impaired the categorisation of words associated with HAPPINESS and DISGUST relative to a control condition, but not those associated with ANGER or NEUTRAL (Niedenthal et al., 2009). Expressions of happiness and disgust both rely heavily on lower face muscles, for smiling and wrinkling the nose, respectively, while anger does not. Another way in which motor activity has been manipulated is through subcutaneous injections of Botox, a neurotoxin that induces temporary muscular denervation. Botox injections at the Corrugator supercilii muscle site, a brow muscle active during frowning and expressions of anger, slowed the comprehension of sentences about sad and angry situations but not happy ones (Havas et al., 2010).
These data are compelling both because they use experimental methods and because the observed impairments are confined to specific, predictable emotions. The selectivity of the findings rules out the possibility that the facial posture manipulations are simply awkward and impair conceptual processing in general. They also argue against accounts that propose the embodied activity is a downstream epiphenomenal consequence of conceptual processing, as such accounts struggle to explain why the disruption of downstream consequences (i.e. facial expressions) impairs the comprehension of emotional language. Of course, because the studies reviewed above utilised behavioural measures that conflate comprehension and decision-making processes, it remains possible that the motor disruption impaired cognitive processes that were not semantic in nature, but instead involved in decision making or elaboration.

ERP studies
To differentiate the impact of motor disruption on semantic and decision-making processes requires a measure with high temporal resolution, such as event-related brain potentials (ERP). ERP measures are particularly useful when there is widespread agreement regarding the link between a particular ERP component and an associated cognitive processing event (Luck, 2005). The N400 ERP component is a negative-going deflection evident in the brainwaves 250-500 ms after the presentation of a written word and has been associated with semantic retrieval (Lau et al., 2008). Although different stimulus modalities (e.g. language and pictures) influence the scalp topography of the component, a larger (more negative) N400 occurs in response to stimuli that induce greater semantic retrieval demands (Wu & Coulson, 2011). Additionally, the N400 dissociates from other cognitive processes such as those involved in elaboration and decision making (Kutas & Federmeier, 2011).
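As a rough illustration of how an N400 effect of this kind is commonly quantified, the following minimal sketch computes the mean amplitude of a waveform in the 250-500 ms window and compares two conditions. Everything in it is an assumption made for illustration: the waveforms are synthetic, and the function names (`mean_amplitude`, `synthetic_erp`), the 4 ms sampling step, and the peak amplitudes are hypothetical rather than taken from any study reviewed here.

```python
import math


def mean_amplitude(waveform, times_ms, start_ms, end_ms):
    """Mean voltage of the samples whose time lies in [start_ms, end_ms]."""
    window = [v for v, t in zip(waveform, times_ms) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples fall inside the requested window")
    return sum(window) / len(window)


def synthetic_erp(times_ms, peak_uv):
    """Toy waveform (assumption): a Gaussian deflection centred at 400 ms."""
    return [peak_uv * math.exp(-((t - 400) ** 2) / (2 * 60 ** 2)) for t in times_ms]


# Hypothetical epoch from -100 ms to 800 ms, one sample every 4 ms.
times_ms = list(range(-100, 801, 4))

# Invented peak amplitudes (microvolts); more negative means a larger N400.
control = synthetic_erp(times_ms, peak_uv=-2.0)
interference = synthetic_erp(times_ms, peak_uv=-3.5)

n400_control = mean_amplitude(control, times_ms, 250, 500)
n400_interference = mean_amplitude(interference, times_ms, 250, 500)

# A negative difference indicates a larger N400 in the second condition.
n400_effect = n400_interference - n400_control
print(f"N400 effect (interference - control): {n400_effect:.2f} microvolts")
```

In practice the window, electrode sites, and baseline correction would follow the conventions of the particular study; the sketch only shows the windowed-mean logic.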
To evaluate whether interfering with embodied resources influences semantic retrieval, we conducted an N400 ERP study in which we interfered with the smiling muscle using the aforementioned "pen" manipulation (although we actually used a wooden chopstick) as participants categorised emotional facial expressions along a dimension of valence (i.e. expressing a very good to a very bad feeling). In the control condition, participants loosely held the chopstick horizontally between their lips. EMG measurements at the cheek and brow indicated that while smiling mimicry occurred in the control condition, it did not occur in the interference condition. Rather, the interference manipulation led to tonic noise at the cheek, and not at the brow. Relative to the control condition, interfering with smiling increased the N400 when participants categorised expressions of low intensity happiness, but not for expressions of anger (Davis et al., 2017). This suggests that embodied motor resources play a causal role in the semantic processes indexed by the N400. However, while disrupting smiling mimicry affected a neural indicator of semantic retrieval, it did not influence participants' ratings of emotional valence. The impact of embodied responses to emotional stimuli is thus extremely subtle.
Davis et al. (2015) used a similar method to show that the disruption of smiling mimicry affects the amplitude of the N400 elicited by emotional language, namely sentences about positive and negative events. The sentences in this study were constructed in positive and negative pairs, such that their valence depended on an affectively charged word, and that word was the third to last in the sentence, e.g. "She reached into the pocket of her coat from last winter and found some (cash/bugs) inside it". This allowed us to evaluate whether any embodiment effects occurred during lexical retrieval (e.g. cash or bugs) and/or at a higher level of conceptual processing, during the construction of a situation model, which tends to occur at the end of phrases and sentences. Strong grounding models predict smiling interference would impact processing at the lexical level (e.g. cash/bugs), before the end of the sentence, and should impact valence ratings, while weak grounding models predict interference effects would most likely be manifested at the end of the sentence, as situation models are hypothesised to involve mental simulations (Zwaan, 2009).
In keeping with a weak grounding position, we found N400 effects of smiling interference on the sentence-final words of the positive but not negative sentences (Davis et al., 2015). Further, we found no effect of the interference manipulation on participants' overt ratings of the sentences. If the conceptual and sensorimotor systems were one and the same (strong grounding), one would expect N400 effects at the lexical level at the very least, and plausibly at the behavioural level as well. Instead, the effects were small and confined to the end of sentences about positive events.
Another indication that embodiment effects are nuanced and subtle comes from a repetitive transcranial magnetic stimulation (rTMS) emotion detection experiment in which rTMS was applied over right primary motor cortex (M1), right primary somatosensory cortex (S1), or the vertex in the control condition (Korb et al., 2015). Participants viewed videos of facial expressions changing either from neutral to happy or from angry to happy. Their task was to identify when the expression changed. Although the rTMS manipulation had no effects in the males tested, among females, rTMS over M1 and S1 delayed both mimicry and the detection of smiles. These findings suggest a causal connection between activity in motor and somatosensory cortex and the recognition of happiness, but only in a subset of the participants.
Taken together, these studies support the hypothesis that neural resources involved in action and perception play a functional role in semantic processing of emotion concepts. Processing emotional words and faces can provoke embodied responses in an emotion specific manner. Persons with motor processing abnormalities show deficits in understanding language about action and emotion. Moreover, interfering with people's embodied responses to emotional stimuli impacts semantic retrieval in an emotion specific manner, and thus such studies complement correlational studies which show early modal activations to emotion concepts (Kiefer et al., 2022). However, these studies also show embodiment effects to be rather tenuous. We suggest that this is because embodied physiological responses contribute to a diverse array of functions, including accessing conceptual representations, elaborative inferences, and emotional reactions, whose relevance for cognition varies greatly across tasks. In the next section, we focus on the context dependent nature of embodiment in conceptual processing.

CODES: the context dependent nature of embodied emotion concepts
In the last two decades, researchers in grounded cognition have progressively emphasised the idea that concepts are flexible and shaped by context. The original suggestions came from behavioural studies showing the contextual flexibility of concepts (Barsalou, 1982). Later studies have shown that concepts (concrete and abstract) dynamically recruit different somatosensory and motor resources depending on the task requirements (Hoenig et al., 2008; Kemmerer, 2015; Kuhnke et al., 2020; Oosterwijk et al., 2015; Popp et al., 2019; Van Dam et al., 2012). Following this trend, we proposed the CODES (COntext Dependent Embodied Simulation) model several years ago to describe how embodied resources are flexibly used to ground the construction of simulations in emotion understanding (Winkielman et al., 2018). A key tenet of the CODES model is that the embodied resources involved in any given simulation depend on the context specific cognitive needs of the individual.
Embodied information is most useful in situations that require relatively deep semantic processing and inferential elaboration. For emotion concepts, this is most common in situations that involve attempting to understand or predict the behaviours of others or oneself. This is similar to hypotheses that embodied simulations can be used to create as-needed predictions of interoceptive states (Barrett & Simmons, 2015) and the anticipation of emotional consequences (Baumeister et al., 2007). Our model, however, emphasises the flexible way in which embodied resources are recruited during these simulations. For instance, when the goal is to cultivate a deep empathic understanding of our child's feelings, sensorimotor recruitment may be quite extensive. In other situations, the recruitment might be quite minimal, akin to sensorimotor satisficing.
One example of how task demands influence embodied recruitment comes from research on the processing of emotion words in a shallow or deep manner (Niedenthal et al., 2009). In these studies, participants viewed words that referred to emotional states (e.g. "foul" or "joyful"), concepts associated with emotional states (e.g. "slug" or "sun"), and neutral control words (e.g. "table" or "cube"). In the shallow processing task, participants were asked to judge a superficial feature of the words, namely whether the word appeared in upper or lower case. In the deeper processing task, participants had to judge whether or not the words were associated with emotions. In each of these tasks, facial EMG was recorded from muscle sites associated with the expression of positive or negative emotions. Consistent with the cognitive demand aspect of the CODES model, participants displayed affectively congruent emotional expressions when processing the words for meaning, but not when deciding whether they were printed in upper or lower case (Niedenthal et al., 2009). Interestingly, these results argue against the suggestion that embodied responses to words reflect automatic affective reactions to stimuli. Indeed, if embodied responses were reflexive, they should have been evident in the shallow processing task as well as the deep one.
Of course, it could be argued that the shallow task was so shallow that participants did not even read the words. To address this concern, Niedenthal et al. (2009) conducted an additional experiment in which participants were presented with emotion words (e.g. "frustration") and told to list properties of those words while facial EMG was recorded. Critically, participants were asked either to produce properties for an audience interested in "hot" features of the concepts (such as a good friend that could be told anything), or for one interested in "cold" features (such as a supervisor with whom they have a formal relationship). Both conditions involved deep conceptual processing, and both led to the production of normatively appropriate emotion features. However, the "hot" emotion condition led to greater activation of valence consistent motor responses. As simulating an emotional experience is more relevant for processing "hot" emotional features than for experientially detached "cold" ones, these data support the context dependent aspect of the CODES model and suggest there are multiple routes of representation during conceptual processing.
Another example of emotion cognition without "hot" embodied content is emotion recognition in patients with Möbius Syndrome, a congenital form of facial paralysis. Although these patients cannot produce (or mimic) emotional facial expressions, they can still recognise them on par with neurotypical controls (Rives Bogart & Matsumoto, 2010). Such findings undermine strong embodiment views that suggest the lack of relevant sensorimotor experiences and production capacities would lead to deficient emotion concepts. As advocates of the CODES model, we suggest that while these patients lack experience with mimicry, they do have extensive experience decoding emotional expressions via visual resources. As such, their concepts of emotions may be quite different from those of individuals who have a lifetime of facial mimicry. Moreover, data suggest that when asked to draw fine-grained distinctions among emotional expressions, some patients with Möbius Syndrome do perform worse than controls (Calder et al., 2000).
To recap, experimental data reveal much variability in the extent of sensorimotor recruitment for emotion concepts. Bodily responses, such as facial mimicry, are not reflexively elicited in all situations, but rather occur more readily for semantic processing of emotional language and are especially pronounced when people consider "hot" features of these concepts. Because emotional concepts have many dimensions, sensorimotor recruitment is not necessary to understand all aspects of them. This message is reinforced by the next section on the role of culture, metaphor, and multiple representation accounts of emotion concepts.

Breaking new ground: culture, metaphor, and multiple representation accounts
So far we have emphasised the importance of grounding emotion concepts in internal sensorimotor experiences and core networks underlying emotions. However, no account of emotion concepts can ignore the role of culture. After all, there are cultures with terms for some emotions (e.g. Amae in Japan; Gheirat in Persian culture) that are largely without counterparts in the English language (Niiya et al., 2006; Razavi et al., 2023). Cross-linguistic comparisons are especially important for the cognitive neuroscience of concepts (Kemmerer, 2019). Even basic emotion terms like "anger" and "fear" vary across languages in terms of their semantic similarity, raising the question of whether this semantic diversity implies a parallel diversity in the experience of (supposedly) basic emotions (Jackson et al., 2019).
In light of the general difficulty of finding universal aspects of emotion in experience and expressions, some authors argue for a contextual constructivist approach that prioritises cognitive learning and dynamical, on-line construal of emotional concepts, albeit from grounded elements (Lindquist et al., 2015, 2022). Empirical evidence consistent with this view shows that neural representations (as studied by fMRI) of basic emotion concepts such as FEAR and ANGER can quickly become very different even as a function of relatively simple learning. For example, in one study participants learned to think of anger (or fear) in a physical context or in a social one. Later, during test trials, when reproducing fear and anger states, the two learning groups activated nearly non-overlapping brain regions, even though both included activity in somatosensory and limbic areas. This shows that even a short learning episode can create new, separate "emotions" linked by the same linguistic term out of different mixtures of grounded ingredients (Lebois et al., 2020).
Likewise, neuroimaging reveals how understanding concepts like ANGER, FEAR, or JOY activates very different neural resources, either related to interoception or to motor planning, depending on whether the task focuses on internal experiences or external actions of the "same" emotion (Oosterwijk et al., 2015). Finally, recent work suggests that dynamic understanding of emotion concepts can also involve a conjoint activation of somatosensory resources with the mentalising network (Ulrich et al., 2022). This suggests a potential mechanism for how the brain incorporates information about goals and intentions that are essential for understanding what aspects of emotions are relevant for the current situation. More importantly, these links could support understanding the intentional aspects of emotion that are key for differentiating, for example, guilt from shame (guilt has a self-blaming component, Ortony et al., 1988).

Emotion metaphors
The mechanistic investigations suggesting flexibility (yet groundedness) of the neural basis of emotion concepts go well with insights into emotion concepts that come from a very different level of analysis: the work on cultural similarities and differences in metaphors. Examining the range of idiomatic expressions for talking about anger in English ("She got all steamed up", "He was bursting with anger", etc.), Lakoff and Kövecses (1987) argued that in using these expressions, English speakers deploy a cultural ("folk") model of anger in which an angry person is metaphorically construed as a container filled with a heated fluid.¹ The cause of the anger is expressed as the source of heat, the person's body is the container, and the anger is the heated fluid. This cultural model allows speakers to articulate the intensity of anger in terms of either the level of fluid in the container ("filled with anger"), or its temperature ("red hot"); control over anger is expressed in terms of the fluid's location inside the container ("He could barely contain his anger"), and the lack of control is expressed as the fluid's forceful emergence from it ("She was given to sudden outbursts of anger", and "He exploded").
The construal of an angry person as a fluid-filled container is not unique to English, however, as linguists have noted parallel metaphors in Hungarian (Kövecses, 1990), Japanese (Matsuki, 1995), and Chinese (Yu, 1995). Noting these commonalities in languages from disparate language families (Indo-European, Uralic, Japonic, and Sino-Tibetan), Kövecses (2000) suggests their common origin may lie in the physiology of anger and its experiential association with body heat, a feeling of internal pressure, and the appearance of redness in the face and neck. Even then, specific instantiations of the metaphor differ from language to language, in part because the concept of ANGER is embedded in a larger system of cultural beliefs. Japanese, for example, contains numerous phrases for control over anger as the fluid rises from the hara (stomach) to the mune (chest) to the atama (head), and the experiencer gradually loses their ability to hide and control their anger (Matsuki, 1995). The prevalence of expressions for the control over anger presumably reflects Japanese cultural values regarding the overt expression of this emotion (Kövecses, 2003). Rather than a liquid which is heated, Chinese anger metaphors depict a gas that may be related to qi, a concept from traditional Chinese medicine of an energy that flows through the body (King, 1989). Likewise, the heated liquid in English metaphors may have its origins in long abandoned ideas about the four humours (Geeraerts & Grondelaers, 1995).
Linguists have often remarked on the similarity of emotion metaphors in unrelated languages (Kövecses, 2003). In a cross-linguistic study of emotion metaphors, Zlatev et al. (2012) found a striking correspondence in the use of motion verbs to describe changes in emotional state. However, arguing against a universalist position, they found that there were language-specific emotion metaphors in each of the languages they examined; moreover, the closer the languages were geographically and genealogically, the more overlap there was (Zlatev et al., 2012).
Metaphors that describe positive emotions in terms of upwards movement and negative emotions as downwards movement have been observed in so many languages that the conceptual metaphor HAPPY IS UP has been suggested as a potential universal metaphor based on the subjective associations between an upright posture with happiness (and other positive states), and between a drooping posture with sadness and negative states (Kövecses, 2003; Zlatev et al., 2012). Some research suggests that this association is so automatised that it can be triggered even by rudimentary changes in vertical position (Meier & Robinson, 2004). Implicit associations between valence and bodily position have also been documented in research showing that various stimuli (words, sentences, sounds) presented behind a participant are automatically assigned a more negative meaning (Frankowska et al., 2019). Presumably, this BAD IS BEHIND association is again grounded in the subjective association of negative states (fear) and direction of sensory input relative to the perceiver's body. Critically, the nature of these associations cannot be entirely explained by co-occurrence in language, as shown by research on left-handers who associate positive valence with the left side, despite the prevalence of GOOD IS RIGHT associations in their linguistic experience (Casasanto, 2009).

Grounding and metaphor
In fact, it is these kinds of experiential correlations (that is, pairings between subjective experience and an abstract domain) that lie at the basis of the claim that abstract concepts are grounded via metaphor (Lakoff & Johnson, 1980). Lakoff and Johnson (1999, p. 463) write: "Our common capacity for metaphorical thought arises from neural projections from the sensory and motor parts of our brain to higher cortical regions responsible for abstract thought. Whatever universals of metaphor there are arise because our experience in the world regularly makes certain conceptual domains coactive in our brain, allowing for the establishment of connections between them."
For example, the association between being upright and being happy leads to a link between the two states (presumably via simple Hebbian learning) so that the abstract concept of HAPPINESS is grounded in part by its association with a particular bodily state. This in turn suggests that thinking about happiness should recruit sensorimotor areas relevant for the metaphoric source domain. In the case of the metaphor HAPPY IS UP, the source domain is the state of being upright, suggesting the concept HAPPY might trigger sensorimotor activations related to perceiving objects in the upper half of vertical space or movements toward that region.
Accordingly, to test whether words associated with spatial attributes reactivate relevant traces in sensorimotor cortex, Bardolph and Coulson (2014) recorded EEG as healthy adults read words while performing a concurrent motor task that involved either upwards- or downwards-directed movements. As in Casasanto (2008), a marble-moving task was employed in which participants were directed to move marbles from a red tray to a green one located above, or from the green tray to the red tray located below it as they silently read words presented on a computer monitor. The marble movements were described in terms of the coloured trays so as to avoid overt mention of the vertical dimension that the task highlighted. The words participants read were either literally related to lower versus upper regions of space (as in "descend" and "ascend", "floor" and "ceiling", "fall" and "leap") or metaphorically related (as in "defeat" and "victory", "poverty" and "power", "agony" and "delight").
The rationale for the paradigm was that moving the marbles either upwards or downwards would impact sensorimotor resources putatively recruited to understand the words. If so, we would expect to observe movement congruity effects, that is, differences between ERPs to low words when movements were downwards-directed (congruent) and when they were upwards-directed (incongruent), and vice versa for high words. The temporal resolution of ERPs also affords insight into the timing of sensorimotor recruitment, as any observed movement congruity effects would necessarily occur either at the same time as that recruitment or afterwards as a downstream consequence of it. Language ERP researchers generally agree that meaning activation is indexed in the first 500 ms of the brain response with later effects indexing more strategic processes (Kutas & Federmeier, 2011; Lau et al., 2008). Some ERP researchers have suggested, however, that conceptual access occurs within 300 ms of processing (Kiefer & Pulvermüller, 2012). By either criterion, movement congruity effects were evident both early (200-300 ms after word onset) and late (700-1000 ms) for the words in the literal verticality condition, but only late for the words in the metaphorical verticality condition (Bardolph & Coulson, 2014). The early movement congruity effect for the literal words resembled an ERP effect previously reported for differences between action verbs and concrete nouns with a suspected generator in either motor or premotor cortex (Hauk & Pulvermüller, 2004).
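The logic of a movement congruity effect can be sketched as follows, purely as an illustration. The sketch assumes a hypothetical trial format (a word coded "low" or "high", a movement coded "down" or "up", and a short list of ERP samples), and all amplitude values are invented; only the early (200-300 ms) and late (700-1000 ms) windows come from the text above. It is not the analysis pipeline of the study described.

```python
from statistics import mean

# Hypothetical coding scheme: "low"/"high" words, "down"/"up" marble movements.
CONGRUENT_PAIRS = {("low", "down"), ("high", "up")}


def is_congruent(word_height, movement):
    """True when the movement direction matches the word's vertical association."""
    return (word_height, movement) in CONGRUENT_PAIRS


def window_mean(erp, times_ms, start_ms, end_ms):
    """Mean amplitude of the samples falling inside [start_ms, end_ms]."""
    return mean(v for v, t in zip(erp, times_ms) if start_ms <= t <= end_ms)


def congruity_effect(trials, times_ms, start_ms, end_ms):
    """Incongruent minus congruent mean amplitude within one time window."""
    congruent, incongruent = [], []
    for trial in trials:
        amplitude = window_mean(trial["erp"], times_ms, start_ms, end_ms)
        if is_congruent(trial["word"], trial["movement"]):
            congruent.append(amplitude)
        else:
            incongruent.append(amplitude)
    return mean(incongruent) - mean(congruent)


# Toy data (assumption): one sample in the early window, one in the late window.
times_ms = [250, 850]
trials = [
    {"word": "low", "movement": "down", "erp": [1.0, 2.0]},
    {"word": "low", "movement": "up", "erp": [0.0, 0.5]},
    {"word": "high", "movement": "up", "erp": [1.0, 2.0]},
    {"word": "high", "movement": "down", "erp": [0.0, 0.5]},
]

early_effect = congruity_effect(trials, times_ms, 200, 300)
late_effect = congruity_effect(trials, times_ms, 700, 1000)
print("early congruity effect:", early_effect)
print("late congruity effect:", late_effect)
```

Computing the effect separately per window is what allows an early effect to be distinguished from a late one, as in the contrast between literal and metaphorical words.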
Results reported by Bardolph and Coulson (2014) were in keeping with numerous behavioural studies showing spatial compatibility effects for words related to the verticality dimension that suggest concrete concepts have a perceptuo-motor basis and recruit brain structures involved in perception and action (Lachmair et al., 2011; Thornton et al., 2013). Likewise, the absence of early movement congruity effects for the metaphorically related words argues against the rapid activation of sensorimotor cortex as part of their comprehension. The late movement congruity effects for both literal and metaphoric verticality are in line with weak embodiment. Indeed, this result fits well with behavioural studies of language-space associations that suggest automatic sensorimotor activations for emotion words are confined to words such as "happy" and "melancholic" that have a direct association with body postures (Dudschig et al., 2015). In keeping with the CODES model, automatic sensorimotor activations can occur for words whose vertical associations are rooted in specific bodily experiences, but otherwise require task demands for their elicitation (see Dudschig et al., 2015 for review).

Sensorimotor career of metaphor
Whereas the first decade of the twenty-first century provided ample evidence that sensorimotor areas are often activated during language and memory tasks in a manner consistent with the predictions of grounded theories of meaning, since then, it has become clear that the role of these sensorimotor activations is robust, but more in line with weak embodiment accounts (see Desai, 2022; Meteyard et al., 2012 for reviews). Desai and colleagues (2011) provide a particularly appealing account they dub the Sensorimotor Career of Metaphor. The account is based on neuroimaging studies of people reading sentences with action verbs used literally ("The daughter grasped the flowers"), metaphorically ("The jury grasped the concept"), or with an abstract equivalent of the metaphoric verb ("The jury understood the concept"). They found that relative to the abstract sentences, both literal and metaphoric sentences activated the left anterior inferior parietal lobule, an area involved in action planning, suggesting that both literal and metaphoric uses of the verbs activated sensorimotor areas involved in the actions denoted by the verbs. Moreover, the familiarity of the metaphors correlated negatively with the extent of activation in the primary somatosensory cortex (S1). That is, the more familiar people were with the metaphoric meaning, the less likely they were to show activation in S1. Desai and colleagues (2011) suggested that perhaps novel metaphors involve detailed simulations in motor and somatosensory areas while more familiar metaphors recruit abstract representations in higher level motor planning areas. The name "sensorimotor career of metaphor" alludes to an earlier suggestion that people use analogical reasoning to understand novel metaphors, but once they become familiar with the metaphor, they simply retrieve the abstract target domain meaning (Bowdle & Gentner, 2005). The account by Desai and colleagues, however, differs somewhat from the original suggestion, in that rather than positing two distinct ways of processing metaphor (Gentner et al., 2001), the sensorimotor career of metaphor implies a continuum from vivid simulation to the retrieval of a more abstract meaning.
Desai (2022) highlights how neural activations during metaphor comprehension typically include both sensorimotor modal regions relevant for the source domain and amodal regions common to the abstract target domain meaning. His account thus resonates with the CODES model both in the key role of sensorimotor simulations for learning metaphors and their varying importance for understanding particular metaphors in context.

Convergence zones and multiple representations
Beyond issues with the grounding of metaphors and other abstract concepts, strong embodiment models have also been challenged by neuroimaging studies that highlight the importance of supra-modal brain areas for language processing. In addition to providing clear evidence for sensorimotor recruitment, neuroimaging studies show conceptual tasks also recruit a network of brain areas whose function is neither sensory nor motor (Binder & Desai, 2011). For example, semantic tasks consistently reveal activity in frontal and prefrontal regions thought to control the top-down activation and selection of information, as well as temporal and parietal lobe regions that are not tied to a single sensory modality. These supra-modal processing regions are likely to be convergence zones, that is, brain regions that integrate input from a range of unimodal input streams and are hypothesised to be important for the formation of more abstract concepts (Meyer & Damasio, 2009).
A given convergence zone receives input from one or more perceptual areas, and sends feedback to them via re-entrant projections. Moreover, a convergence zone also sends feed-forward signals along to the next level in the hierarchy, and receives return projections from these higher-level convergence zones. For example, a low-level convergence zone might link neural codes for the colour and shape of an apple, receiving input from parts of the visual system coding colour and shape, and sending signals along to a higher-level convergence zone that might link codes for the apple's colour, shape, taste, and feel. The conjunctive neurons that make these cross-modal linkages possible can reactivate the distributed traces in the sensorimotor cortices in a simulation, or fire independently as stand-alone abstract representations (Simmons & Barsalou, 2003). An embodied account of emotion concepts might involve activations in a hierarchically organised network of convergence zones with cell assemblies at the bottom and concepts at the top (Barsalou et al., 2003).
Recent neuroimaging research is compatible with this suggestion, indicating a hierarchy of cortical regions whose representations vary in their degree of modal content, from unimodal representations at different levels of abstraction, through bimodal, trimodal, and multimodal regions, all the way to supra- or amodal regions (Kiefer & Harpaintner, 2020). Further, because some convergence zones thought to be supramodal in fact maintain modal representational content, it is important to distinguish between heteromodal convergence zones that are amodal (in which modality-specific input has been abstracted away) and those that are multimodal and thus maintain some modality-specific information (Kuhnke et al., 2020). Contrasting sound and action concepts, Kuhnke and colleagues found that modality-specific activations were enhanced when the task explicitly highlighted their acoustic versus action-related features (Kuhnke et al., 2020). They found that multimodal regions in posterior parietal cortex showed increased functional coupling with primary motor and somatosensory cortices during action feature retrieval, but increased coupling with auditory association cortex during sound retrieval (Kuhnke et al., 2021). The profile of activity in posterior parietal cortex was thus exactly what one might expect of a convergence zone reactivating distributed traces in modality-specific regions.
This sort of architecture could accommodate both evidence for modality-specific activations for concrete concepts and more exclusively supra-modal activations for different varieties of abstract ones. Moreover, it is potentially compatible with multiple representation accounts that have been proposed to integrate experiential and distributional approaches to semantics. Distributional semantics is based on the idea that words mean what they do because of how they are distributed in language (see Lenci, 2018 for a review). For instance, we might learn that "helicopter" and "drone" mean similar things because they tend to be found in similar linguistic contexts. Natural language processing systems that employ distributional semantics are remarkably good at predicting human responses to language (see e.g. Michaelov et al., 2022, 2023), leading some researchers to argue that they are a plausible model of human language comprehension (e.g. Jones et al., 2015). A potential problem for this account of word meaning, though, is that these kinds of representations are not grounded in the world.
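The intuition behind distributional similarity can be illustrated with a minimal sketch. Everything in the Python example below is an assumption introduced for illustration: the six-sentence toy corpus, the two-word context window, and the function names. Raw co-occurrence counts compared by cosine similarity stand in here for the far richer models cited above.

```python
from collections import Counter, defaultdict
import math

# Toy corpus, invented purely for illustration (not from any cited study).
CORPUS = [
    "the helicopter flew over the city",
    "the drone hovered over the field",
    "the pilot landed the helicopter safely",
    "the pilot landed the drone safely",
    "she ate the banana for breakfast",
    "he peeled the banana and ate it",
]


def cooccurrence_vectors(sentences, window=2):
    """Count, for every word, the words occurring within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            stop = min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


VECTORS = cooccurrence_vectors(CORPUS)
SIM_RELATED = cosine(VECTORS["helicopter"], VECTORS["drone"])
SIM_UNRELATED = cosine(VECTORS["helicopter"], VECTORS["banana"])
print(f"helicopter vs drone:  {SIM_RELATED:.3f}")
print(f"helicopter vs banana: {SIM_UNRELATED:.3f}")
```

In this toy corpus "helicopter" and "drone" share most of their neighbouring words and therefore receive a higher similarity score than "helicopter" and "banana", even though nothing in the representation refers to flying objects or fruit.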
Elaborating on the Chinese Room thought experiment (Searle, 1980), Harnad (1990) invites us to imagine the task of learning Mandarin entirely from a Mandarin-Mandarin dictionary. Although we might be able to learn how the foreign symbols relate to one another, we would always lack an understanding of how those Chinese characters relate to the world around us. Yet this is exactly the plight of language models such as ChatGPT (https://openai.com/blog/chatgpt/) that "learn" about word meaning by being trained to predict the conditional probability of words in a language from the presence of other words in the context (see Jurafsky & Martin, 2008 for a review). Because these systems have no access to the actual truth, and do not operate with an internal, structured model of the world, they sometimes fail spectacularly and produce incoherent nonsense (Sher, 2023).
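What it means to predict the conditional probability of a word from its context can be sketched with the simplest possible case, a bigram model that conditions on only the single preceding word. This is a deliberately reduced stand-in, not a description of how ChatGPT or any other neural language model is implemented; the three training sentences and all function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical training sentences, invented for illustration.
SENTENCES = [
    "he used his shirt to dry his feet",
    "he used his towel to dry his hands",
    "he cleaned his glasses with his shirt",
]


def train_bigram(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Estimate P(next word | previous word) by relative frequency."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in followers.items()}


COUNTS = train_bigram(SENTENCES)
PROBS = next_word_probs(COUNTS, "his")
for word, p in sorted(PROBS.items(), key=lambda item: -item[1]):
    print(f"P({word} | his) = {p:.3f}")
```

The estimates come entirely from counts over the training text. Nothing in the model represents what a shirt or a pair of glasses can physically do, which is the sense in which such representations are ungrounded.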
In fact, the compatibility of embodiment and distributional semantics has been an issue from the early days of grounded approaches to meaning. In a seminal study, Glenberg and Robertson (2000) constructed sentences whose critical words were equally likely based on distributional information but differed in terms of their physical affordances as understood by humans. The examples all involved a person using an object to solve a specific problem, such as, "After wading barefoot in the lake, Erik needed something to get dry. He used his shirt/glasses to dry his feet". Whereas human participants rated the afforded condition (shirt) as more plausible than the non-afforded condition (glasses), Latent Semantic Analysis, a then state-of-the-art approach to distributional semantics, failed to reveal any differences in the semantic distance between the context and the words in the two conditions. Glenberg and Robertson (2000) argued that the insensitivity of distributional semantic representations to the affordances of objects in novel situations reveals a fundamental limitation of this approach and suggested that humans draw on their embodied experience of the world to simulate the events described in the experimental stimuli.
Jones and colleagues tested whether modern language models are more sensitive to affordances than those available at the turn of the century (Jones et al., 2022). While two otherwise highly effective language models (BERT and RoBERTa) failed to distinguish between the afforded and the non-afforded conditions in Glenberg and Robertson's (2000) materials, one (GPT-3) was sensitive to the affordedness distinction, assigning greater probabilities to words in the afforded than the non-afforded condition. In contrast to Glenberg and Robertson (2000), Jones and colleagues' result suggests that sufficiently powerful distributional models may be able to learn knowledge that would seem to rely on embodied experience with the world. However, by conducting a replication of the plausibility judgment task from the original study, Jones and colleagues found that the language model consistently underestimates the sensibility of the afforded scenarios and overestimates the sensibility of the non-afforded ones, indicating that humans do indeed use information that is unavailable to neural language models.
While experiential and distributional data have historically been considered somewhat at odds with one another, Andrews et al. (2014) suggest they can be fruitfully combined. Experiential and distributional data constitute distinct (that is, non-redundant) information sources, and computational modelling suggests the most empirically adequate account of word meaning is learned by treating both sources of information as a single joint distribution (Andrews et al., 2009). A similar account can be found in the symbol interdependency hypothesis that language is both embodied and symbolic: embodied because words are linked to perceptual representations, and symbolic because of the complex web of dependencies between linguistic representations (Louwerse, 2007, 2011). Studies contrasting experiential and distributional accounts of the semantic similarity structure encoded in fMRI activity have found support for both embodied and symbolic accounts (Carota et al., 2017; cf. Fernandino et al., 2022).
Because language encodes information about the world, we can learn world knowledge by learning about these intra-linguistic relationships. In fact, research comparing colour concepts in sighted and congenitally blind participants suggests semantic associates of colour terms lead to similar colour concepts in these two groups (Saysani et al., 2018). These investigators asked participants to rate the similarity of different pairs of colour terms and used multidimensional scaling to produce semantic maps of colour space. Remarkably, only minor differences were found in the colour maps of sighted and blind participants, despite the obvious differences in the ability to use colour information in actual behaviour (Saysani et al., 2018; see also Kim et al., 2021).
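For readers unfamiliar with multidimensional scaling, the following sketch shows how pairwise dissimilarities can be turned into a spatial map. It implements classical (Torgerson) scaling in plain Python; whether this variant matches the procedure used in the cited studies is not stated in this review, so that choice is an assumption. The dissimilarities are generated from an idealised colour circle and are invented numbers, not participant ratings, and the function name `classical_mds` is hypothetical.

```python
import math
import random


def classical_mds(dist, dims=2, iters=200, seed=0):
    """Embed items in `dims` dimensions from a symmetric dissimilarity matrix.

    Assumes the dissimilarities are (close to) Euclidean, so that the
    double-centred matrix has no large negative eigenvalues.
    """
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double centring: B = -1/2 * J * D^2 * J (row means equal column means).
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        # Power iteration finds the leading eigenvector of the current matrix.
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        vv = sum(x * x for x in v)
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * bv[i] for i in range(n)) / vv  # Rayleigh quotient
        scale = math.sqrt(max(lam, 0.0) / vv)
        for i in range(n):
            coords[i][d] = scale * v[i]
        # Deflation removes the component just found before the next pass.
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j] / vv
    return coords


# Hypothetical colour terms placed at equal steps around a hue circle.
TERMS = ["red", "orange", "yellow", "green", "blue", "purple"]
ANGLES = [2 * math.pi * k / len(TERMS) for k in range(len(TERMS))]
# Chord length between two points on a unit circle: 2 * |sin(half the angle)|.
DIST = [[2 * abs(math.sin((a - b) / 2)) for b in ANGLES] for a in ANGLES]

COORDS = classical_mds(DIST)
for term, (x, y) in zip(TERMS, COORDS):
    print(f"{term:>6}: ({x:+.2f}, {y:+.2f})")
```

The recovered map is only defined up to rotation and reflection, so what matters is the pattern of distances between terms rather than the particular coordinates. Comparing two groups then amounts to comparing the maps produced from each group's similarity ratings.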

Moving forward
Early articulations of the embodiment movement in psychology and neuroscience (e.g. Barsalou, 1999; Glenberg, 1997) pointed to cognitive linguistics as a source of inspiration and evidence for the approach. Usage-based approaches to meaning view language as an embodied social behaviour that recruits domain-general cognitive processes (Bybee, 2006). On this approach, words do not denote a single, context-invariant meaning. Rather, speakers use words in communicative acts to prompt their listeners to activate contextually relevant portions of background knowledge (Coulson, 2001). As in traditional accounts, meaning emerges gradually as a function of experience. However, individual experiences with "birds" or "anger" lead not to conceptual cores, but to gradient networks of related meanings (Langacker, 1988). Noting that cognitive linguists reject formal semantics in favour of grounded theories of meaning, many scholars have suggested that abstract concepts pose a problem for these approaches (Mahon, 2015). However, this is not because advocates of usage-based approaches eschew the notion of abstraction. Indeed, such approaches rely crucially on the human capacity to abstract meaning from a diverse array of experiences (Croft & Cruse, 2004).
Rejection of the traditional approach to semantics (that is, functions that map linguistic expressions onto a set of truth conditions) was motivated in part by its complete disregard for the human beings who produce and comprehend those expressions. However, in our zeal to embrace a functional role for experience in meaning, we grounded cognition theorists may have unwittingly adopted an alternative version of the traditional account of language as a system for formulating propositions about the world, albeit with propositions populated by sensorimotor simulations rather than abstract symbolic elements. While language certainly can be used to describe the world, and people do sometimes recruit sensorimotor simulations to do so, language is fundamentally a system for social interaction. Besides sharing knowledge, people use language to convey expectations, offer opinions, forge social relationships, express their emotions, and more generally to participate in their sociocultural world.
Sinha writes, "Meaning is a mapping relationship between a linguistically conceptualized referential situation, and a conceptually motivated expression, enabling the hearer to understand in the context of the universe of discourse, the communicative act intended by the speaker" (Sinha, 1999, p. 238). As such, meaning relies upon two kinds of grounding: embodied grounding, which involves the perceptual and cognitive mechanisms of the speaker to apprehend the local ecology, and discursive grounding, which involves the capacity for inter-subjective situated awareness. Zlatev and Blomberg (2016) suggest a synthetic alternative of embodied intersubjectivity that relies on an integrated physical and social experience. This integrated physical and social experience is a crucial element of word learning for the toddler hearing "Look at the cat!" as she and her mother jointly attend to a cat approaching their stroller (Tomasello, 1995). Embodied intersubjectivity is also what enables the anthropology professor to teach her student what "change in slope" means as they delineate different segments of dirt at an archaeological dig (Goodwin, 1994).
Usage-based approaches to language are premised on the observation that the use of words and the meanings they evoke in context can differ greatly from culture to culture, from person to person within a culture, and even from occasion to occasion, as each occasion introduces a different perspective and has different representational needs (Barsalou, 2003; Barsalou & Wiemer-Hastings, 2005). Such views readily accommodate the different neural activations of hockey players and novices to sentences about hockey (Lyons et al., 2010), professional musicians and novices for concepts of musical instruments (Hoenig et al., 2011), and even the different brain areas activated for scientific concepts such as "operant conditioning" among professional psychologists versus undergraduates (Ulrich et al., 2022). Beyond their shared linguistic knowledge, speakers and listeners have recourse to considerable social and cultural resources for interaction, such as shared knowledge of situational context, non-verbal signals, and shared background knowledge (Clark, 1996). Goodwin's (1994) study of archaeologists provides an excellent example of how meaning emerges from situated interactions that are at once perceptual and social, as the scientists' discussion of the abstract concept SLOPE occurs in the context of an activity at the dig site that involves classifying the colour of the dirt, measuring the dirt, and drawing a diagram of their findings. Accordingly, recent findings suggest that academic training may actually increase grounding of scientific concepts in experiential brain systems (Ulrich et al., 2022).
Because much of our experience as humans consists of linguistically mediated interaction with other people, the words in these utterances provide a rich source of information regarding meanings that are ultimately grounded in our embodied intersubjective experience. As for other concepts, hybrid accounts of emotion concepts that combine embodiment and distributional semantics appear most satisfactory (Borghi, 2020; Zwaan, 2014). Importantly, accepting hybrid accounts does not imply a capitulation of grounded accounts, but a realisation that demands, goals, and sensorimotor learning processes all influence the recruitment of grounded representations in a particular situation for a particular individual. We suggest that the field of emotion research has long moved past simplistic models in which emotions and their cognitive representations are inflexible packages of somatic and motor reactions, or in which embodiment is always necessary for understanding emotion concepts. Society and culture also provide key inputs that structure emotion concepts, and interact with rich experiences, shaping them as well as being shaped by them, all the while maintaining a connection to interoceptive and sensorimotor resources (Barrett, 2019).

Overall conclusion
To conclude, we have reviewed research suggesting that many sensorimotor resources are involved in the processing of emotional concepts. These findings argue against a purely amodal account which assigns, at most, a secondary role to perceptual, interoceptive, and motor processes. We suggest that these sensorimotor processes not only help emotion concepts get off the ground, but are actively used in the construction of emotional meaning. However, direct sensorimotor experience with concrete referents is not the sole contributor to conceptual meaning. We learn about emotion via linguistic interaction that can organise and calibrate meaning in situated contexts constrained to varying degrees by cultural norms (Borghi, 2020; Lindquist et al., 2022). Returning to the questions with which we started this review, we suggest that just as an economist might understand how factories operate without ever having seen a factory, there is a real sense in which a person with congenital analgesia can understand PAIN, an asexual individual can understand LUST, and a congenitally blind individual can understand RED. Similarly, an impressive degree of understanding of PAIN, DESIRE, LUST, and LOVE may be possible without first-person experience. This understanding may falter, however, when the task demands the generation of actual internal experiences (e.g. "does shame feel different from embarrassment?") or reporting on their unique behavioural consequences (e.g. "does desire motivate different actions than lust?"). We hope that future research will specifically tackle the psychological and neural mechanisms of the interaction between concepts and experiences and thus lead to more precise models of emotion understanding.