Measures of functional, real-world communication for aphasia: a critical review

Aims: The aim of this article is to identify which existing instrument of functional communication from the aphasia literature best fits with a theoretically founded definition of real-world communication.
Background: Aphasia is a language impairment caused by acquired brain damage such as stroke. For successful rehabilitation, a thorough understanding of naturalistic, real-world communication is imperative, as this is the behaviour speech and language therapy (SLT) ultimately aims to improve. In the field of aphasiology, there currently is a lack of consensus about the way in which communication should be measured. Underlying this is a fundamental lack of agreement over what real-world communication entails and how it should be defined.
Methods & procedures: In this critical review, we review the instruments that are currently used to quantify functional, real-world communication in people with aphasia (PWA). Each measure is checked against a newly proposed, comprehensive, theoretical framework of situated language use, which defines communication as (1) interactive, (2) multimodal, and (3) based on context (common ground).
Outcomes & results: The instrument that best fits the theoretical definition of situated language use and allows for the quantification of communicative ability is the Scenario Test.
Conclusions: This article provides a start in a more systematic and theoretically founded approach to the study and measurement of functional, real-world communication in aphasia. More work is needed to develop an instrument that can quantify communicative ability across different aphasia types and severities.
ARTICLE HISTORY: Received 16 September 2019; Accepted 6 December 2019


Introduction
One of the most important goals of speech and language therapy (SLT) is for people with aphasia (PWA) to communicate as effectively as possible in their everyday lives, i.e., to see improvements at the level of functional communication (Thompson & Worrall, 2008; Wallace et al., 2016). Traditionally, aphasia is diagnosed by administering pen-and-paper batteries such as the Western Aphasia Battery-Revised (Kertesz, 2007). In these tests, language production and language comprehension tasks are presented to the client on an item-by-item basis, for example, picture naming or word-to-picture matching. The client is often given ample time to respond in a one-to-one setting, where all possible forms of distraction are removed. These

Current approaches to measuring communication in aphasiology
The CADL-2 has been criticised for focusing on the transmission of a message by the PWA, without taking into account the interactive aspect of communication (Ramsberger & Rende, 2002; van der Meulen et al., 2010). The ANELT has been criticised for only measuring verbal exchanges and not taking into account the non-verbal aspects of interaction (van der Meulen et al., 2010). The use of role play, or simulating situated, more context-specific communication tasks in a clinical setting, has been suggested to impose additional cognitive demands that are often not required in real-life situations, such as pretending to be somewhere you are not (Ramsberger, 1994; Wirz, Skinner, & Dean, 1990). While the ANELT uses physical props to support the role-play (e.g., a shirt with a hole in it at the dry cleaners), the Scenario Test and CADL-2 use illustrations or pictures of a scene that are initially shown and then taken away when PWA are asked to respond. Finally, a criticism of the Scenario Test is that many PWA with some verbal ability can perform at ceiling, as a full score can be achieved with a response of a few single words. As the test is currently structured, it is not informative across the full range of aphasia severities (this is unsurprising, as it was originally designed as a test of multimodal communication for people with severe aphasia).

Non-standardised measures of communicative success
The Assessment of Communicative Effectiveness in Severe Aphasia (ACESA; Cunningham, Farrow, Davies, & Lincoln, 1995) is a measure designed to assess the communicative effectiveness of people with severe aphasia. This measure includes a structured conversation, in which the assessor asks the PWA a number of questions about familiar topics, initially allowing for yes/no answers and working towards more open-ended questions (e.g., "Is your husband/wife/carer alright?" and "Tell me about where you live, about your home"). The second part of the measure requires the PWA to convey the meaning of common items, shown in objects and pictures. Communicative effectiveness is defined on a scale of recognisability of the attempt, ranging from "easily and quickly recognisable" to "completely unrecognisable, no response, recurrent gesture or vague gross movement" (Cunningham et al., 1995). Ramsberger and Rende's (2002) measure of transactional success consists of a semi-spontaneous story re-telling task: PWA are asked to watch an "I Love Lucy" video and re-tell the storyline from the video to a conversation partner. Transactional success is defined by the number of main ideas expressed by the conversation partner of the PWA, when retelling the story as told to them by the PWA.

Type of instrument: Name of test

Standardized tests
• CADL-2 (Holland et al., 1999)
• Amsterdam-Nijmegen Everyday Language Test (ANELT; Blomert et al., 1994)
• Scenario Test (van der Meulen et al., 2010)

Non-standardized tests
• ACESA (Cunningham et al., 1995)
• Transactional success (Ramsberger & Rende, 2002)

Observational profiles (clinician rated)
• Functional Communication Profile (FCP; Sarno, 1969)
• Revised Edinburgh Functional Communication Profile (R-EFCP; Wirz et al., 1990)
• American-Speech-Language-Hearing Association Function Assessment for Communicative Skills in Adults (ASHA FACS; Frattali et al., 1995)
• Therapy Outcome Measure, Activity Scale (TOM; Enderby et al., 2006)

Observational profiles (client or proxy rated)
• Communicative Effectiveness Index (CETI; Lomas et al., 1989)
• Assessment of Communicative Effectiveness in Severe Aphasia (ACESA; Cunningham et al., 1995)
• Functional Outcome Questionnaire for Aphasia (FOQ-A; Ketterson et al., 2008)
• Communicative Activity Log (CAL; Pulvermüller & Berthier, 2008)
• Communication Outcome after Stroke, client and carer version (COAST and carer COAST; Long et al., 2008; Long et al., 2009)
• Aphasia Communication Outcome Measure (ACOM; Hula et al., 2015)

Linguistic analysis of connected speech
• Correct Information Unit Analysis (CIU; Nicholas & Brookshire, 1993)
• Information Units (IU; McNeil et al., 2001)
• Pragmatic Protocol (PPL; Prutting & Kirchner, 1987)

Sociological analysis of interaction
• Conversation Analysis (CA; Beeke et al., 2007)

Observational profiles (clinician rated)
There are a number of instruments that quantify functional communication by relying on observations made by the clinician. With the Functional Communication Profile (FCP), Sarno (1969) was the first to develop such an instrument. Sarno compiled a list of communicative behaviours across different categories (reading, understanding, speaking, gesturing, etc.), such as "reading street signs" or "speaking on the telephone", that could be ticked off as executed by the PWA or not, including a judgement of how effectively this was done. A number of observational profiles have been published since, including the Revised Edinburgh Functional Communication Profile (R-EFCP; Wirz et al., 1990), the American-Speech-Language-Hearing Association Function Assessment for Communicative Skills in Adults (ASHA FACS; Frattali, Thompson, Holland, Wohl, & Ferketic, 1995) and the Therapy Outcome Measure Activity Scale (TOM; Enderby, John, & Petheram, 2006). Functional communication is quantified in these observational profiles as an overall score of ability, effectiveness or independence on a number of communicative activities, such as "expresses feelings", "tells time", or "participates in conversations", or as a description of pragmatic skills that the PWA exhibits or not (e.g., "responding to open questions", "greeting" and "initiating a new topic"), as well as an indication of the modalities used during communication. The rationale for using an observational instrument is that it is based on naturalistic, spontaneous behaviour and administration is feasible in a clinical setting. However, the FCP has been criticised for measuring functioning in relation to pre-morbid levels (Ramsberger, 1994) and for being linguistically biased, with no measure of non-verbal communication (Cunningham et al., 1995). Glueckauf et al. (2003) criticised the ASHA FACS for measuring the degree of independence in communication (i.e., can someone perform a task without help), but not including a measure of communicative success (i.e., how effective is communication). The observational nature of the instruments is considered by some to be subjective and has been argued to result in an indirect measure of functional communication (Blomert, Koster, Mier, & Kean, 1987; Glueckauf et al., 2003; van der Meulen et al., 2010), as well as being ill-suited to capture how real-time communication unfolds for PWA (Barnes & Bloch, 2018). For the FCP (Sarno, 1969) and the ASHA FACS (Frattali et al., 1995), communication is judged on the basis of indirect observation (i.e., memory of multiple conversations that have previously been observed), rather than by directly observing and scoring behaviour.

Observational profiles (client or proxy rated)
The third category includes observational profiles that are rated by the client or a proxy (e.g., a partner or carer) rather than a clinician. These instruments are built on the assumption that the clinician only has limited opportunity to observe the client in everyday situations typical for them (Lomas et al., 1989), while the proxy has a much better sense of the level of functioning of the client in day-to-day life. In addition, Davidson and Worrall (2000) suggested that clinicians may focus more on the potential of the client than on actual performance in their judgements of functional communication. On a larger scale, healthcare providers have become more person-centred, meaning that a high value is given to the client's perspective in therapy goal setting (Worrall, 2006) and to their judgement of what represents meaningful therapy outcomes (Wallace et al., 2016). The inclusion of the client perspective in therapy outcome measures has thus become a key part of health-care policymaking (Frattali et al., 1995; Irwin, 2012; Rudd, 2016). As such, patient-reported outcome measures (PROMs) have become increasingly valuable, including observational measures of communication as judged by PWA themselves. Observational profiles that aim to measure functional communication in aphasia include the Communicative Effectiveness Index (CETI; Lomas et al., 1989), the Assessment of Communicative Effectiveness in Severe Aphasia (ACESA; Cunningham et al., 1995), the Functional Outcome Questionnaire for Aphasia (FOQ-A; Ketterson et al., 2008), the Communicative Activity Log (CAL; Pulvermüller & Berthier, 2008), the Communication Outcome after Stroke, client and carer version (COAST and carer COAST; Long, Hesketh, & Bowen, 2009), and the Aphasia Communication Outcome Measure (ACOM; Hula et al., 2015).
Measures such as the COAST have expanded their definition of functional communication outcome to include measures of the impact of the communication impairment on the client's life (similar examples are the Aphasia Impact Questionnaire-21, Swinburn et al., 2018; Swinburn & Byng, 2006). The criticisms of observational profiles discussed in the previous section also apply here: they are considered to be subjective and indirect measures of functional communication (Blomert et al., 1987; van der Meulen et al., 2010), not least because communication is judged on the basis of indirect observation (i.e., memory of multiple conversations that have previously been observed). In addition, it has been suggested that the observations by a proxy can be biased by factors relating to the relationship with the PWA and by the proxy's emotional well-being (Glueckauf et al., 2003). Furthermore, it is difficult to control what the client or proxy base their answers on when filling out the observational profile. For example, Fucetola and Connor (2015) showed that the CETI score was primarily influenced by expressive abilities of the PWA, not receptive communication skills, resulting in an unintentional one-sided view of a person's communicative performance in everyday life. Functional communication is quantified in a similar fashion as for the clinician-rated observational profiles: as an overall score of ability, effectiveness, impact or independence on a number of communicative behaviours.

Linguistic analysis of connected speech
There is a group of instruments that are based on the linguistic analysis of connected speech. As interest grew in what PWA could communicate at a conversational level, knowledge from studies on pragmatics and discourse has been applied to the analysis of conversation in aphasia. Both fields study language above the sentence level and are thus, in theory, relevant to the discussion of functional communication in aphasia. A number of these measures explicitly claim to measure "functional communication" in PWA and are therefore included here. Other pragmatic or discourse measures are relevant to the study of communication, but do not claim to measure communication comprehensively: instead, these instruments assess a sub-component of communication (such as story grammar and topic coherence, see Pritchard, Hilari, Cocks, & Dipper, 2018 for a review) and are therefore not included in the current discussion. Examples of instruments based on a linguistic analysis of communication are the Correct Information Unit Analysis (CIU; Nicholas & Brookshire, 1993) and the Information Units approach (IU; McNeil, Doyle, Fossett, Park, & Goda, 2001), which aim to assess the informativeness of connected speech by identifying phrases (or units) that represent crucial, relevant information for a specific story. The informativeness of a story that is retold is defined by the number or percentage of units that are expressed correctly and intelligibly. It is difficult to achieve high inter-rater reliability on these measures (Oelschlaeger & Thorne, 1999; Ramsberger & Rende, 2002), though other measures of discourse with PWA such as Story Grammar, Topic Coherence, Reference Chains and Predicate Argument Structure have been shown to be psychometrically robust (Pritchard et al., 2018). Another instrument that is based on linguistic analysis of functional communication is the Pragmatic Protocol (PPL; Prutting & Kirchner, 1987).
The PPL is an observational tool but is discussed in this category because of its linguistic origins. The tool can be used to indicate whether a set of pragmatic aspects of language are observed or not in conversation, such as "turn taking interruption/overlap", "physical proximity" and "vocal intensity" (Prutting & Kirchner, 1987). The pragmatic aspects of behaviour, if observed, are also judged on whether they are applied appropriately or inappropriately (i.e., whether or not they facilitate or neutrally influence communication). The aim of the PPL is thus to identify a pattern of pragmatic behaviour impairments, based on the observation of 15 minutes of spontaneous conversation. Because the PPL is an observational tool, the same criticisms apply as for the observational profiles discussed in the previous two sections.

Analysis of interaction
Conversation Analysis (CA) surfaced in aphasiology around the turn of the 21st century, emphasizing the importance of studying spontaneous, natural conversation (Beeke, Maxim, & Wilkinson, 2007) and of taking into account the interactive nature of conversation. Though originally applied to audio recordings, CA can also include the study of non-verbal behaviour during conversation (i.e., using video materials). CA is based on the assumption that conversations are products of a structured interaction in which the sequential order of turns represents an important organizational feature of the conversation. The overall aim of applying CA to the study of aphasia is to analyse what causes problems and disruptions to the organization of conversation and to identify adaptive strategies to overcome these problems. To do this, it typically focuses on how conversation unfolds between a PWA and a specific communication partner (a dyad). This methodology has provided useful information for the assessment of natural conversation and it lends itself well to training programmes for PWA and their conversation partners (Beeke et al., 2007; Wilkinson, 2015). Due to the observational nature of the methodology, it remains difficult to synthesize findings from CA and to describe behaviour at the group level, though a number of attempts have been made (Perkins, Crisp, & Walshaw, 1999; for a brief discussion, see Prins & Bastiaanse, 2004).

Interim summary
A wide range of instruments have been created to measure functional communication, each with a different purpose: to determine treatment effectiveness, to assess the generalization of therapeutic interventions, to support therapy planning, or to develop our theoretical knowledge of functional communication in aphasia. The conceptualizations and operationalisations of functional communication in the literature show overlap, as all aim to capture language or communication in conversation or everyday life. In a theoretical and methodological sense, however, they are quite different (Irwin, Wertz, & Avent, 2002; Linnik, Bastiaanse, & Hohle, 2016), often focusing on the particular component of communication that is of most interest for the measure created, such as verbal output (ANELT; Blomert et al., 1994), the patterns of interaction that structure conversation (CA; Beeke et al., 2007) or impact as part of a measure of communication (COAST and carer COAST; Long et al., 2009). The variety across these instruments reflects the challenging nature of capturing the complex, multifactorial phenomenon of communication, as well as the lack of fundamental agreement on what real-world communication is. A theoretically founded definition of communication that is comprehensive and does not emphasize one element over another is therefore imperative and would enable researchers to scrutinize the validity of the abovementioned measures, as well as to make suggestions for the improvement of the instruments.

A definition of situated language use
Over the past decades, much research has been done on the topic of communication with healthy adults in the fields of communication science, psychology, linguistics, neuroscience, psycholinguistics and sociology. Much of this work has yet to be translated into aphasiology. This body of research provides important clues on what components influence a person's ability to use language in a real-world setting and can inform the endeavours in aphasiology to develop a theoretically founded definition of functional, real-world communication (Simmons-Mackie et al., 2014; Webster, Whitworth, & Morris, 2015).
From as early as the 1940s, box-and-arrow models of the communication process have been published in the literature. Initial models were very much focused on information transfer, often describing communication as a linear, one-way process in which a sender transmits a message through a channel to a receiver (Shannon & Weaver, 1949). Later models added components such as the interpretation of meaning of a message by the sender and receiver (Schramm, 1954), the influence of feedback during communication as well as the use of multiple modalities (Westley & MacLean, 1957). Berlo (1960) further built on this to include contextual factors such as communication skills, attitudes and the influence of social support on the communication process. A number of research fields have focused specifically on a particular component of communication, such as non-verbal communication (i.e., gesture, facial expression and body movement; Goodwin, 1995; Kendon, 1980; McNeill, 1992), the patterns of interaction that structure conversation (Conversation Analysis: Barnes & Bloch, 2018; Beeke et al., 2007) or the purpose of communication (interactional or transactional; Simmons-Mackie & Damico, 1997). Although useful, these do not provide a comprehensive model or description of communication; rather, they describe a particular component of the process. There are many different ways in which to approach and describe the process of real-world communication, depending on the focus of the model, the scope, the theoretical underpinnings and its explanatory purpose. For the purpose of the current paper, a model or framework that attempts to describe communication comprehensively rather than focusing on one element of the process is required. To be useful for practical application, it should help delineate both the individual (cognitive skills) and the situational (contextual) factors that are important for communication.
It is precisely because "functional communication" has this complexity, spanning levels of the ICF and incorporating more than just an individual's abilities, that its definition and measurement have been so problematic.
From our reading (Doedens & Meteyard, 2018, preprint), we propose that Clark (1996) provides such a description, with sufficient descriptive detail to take stock of existing instruments. A thorough review of this topic is beyond the scope of this article, but see Doedens and Meteyard (2018, preprint) for an extended review. Clark (1996) outlines three core characteristics of communication as "situated language use". It is always (1) interactive, (2) multimodal and (3) reliant on common ground (see Table 3). Within those three characteristics, sub-components are listed to further break down exactly which variables play a role. The relatively simple structure of the framework means it can function as a starting point for the discussion of communication (situated language use) in aphasiology.

Face-to-face communication
Communication in everyday life varies across settings, modalities and ways of communicating (speaking with a sibling at home, listening to an audio book in the car, performing for an audience in the theatre, writing a letter to a friend, etc.). A person's ability to communicate, as well as the way in which people communicate, varies across these different settings. To evaluate the principles that govern situated language use, researchers have started by studying the most basic form: face-to-face communication (Barnes & Bloch, 2018; Bavelas & Chovil, 2000; Clark, 1996; McDermott & Tylbor, 1983; Pickering & Garrod, 2004), as it is the most commonly used and pervasive form of communication, it is universal to all human societies, it is the basis for typical language acquisition in children and it does not require education or special skills (Bavelas & Chovil, 2000; Clark, 1996). Indeed, Davidson, Worrall, and Hickson (2008) showed that face-to-face conversation is the most frequently occurring communicative activity in daily life for PWA. The reasoning is that once the principles that govern face-to-face communication are teased out, language use in other communicative situations, such as speaking on the telephone, can be derived from the basic face-to-face exchange (Clark, 1996).

Table 3. The key components that characterise situated language use (based on Clark, 1996).

Component: Interactive
Definition: Joint activity between two people. Actions of one person depend on those of the other.

Component: Multimodal
Definition: Multiple interdependent channels of communication are available and integrate into a single composite message. Different channels replace, supplement, complement and emphasize speech.

Component: Contextual (relies on common ground)
Definition: Common ground provides interlocutors with context that allows them to assume a degree of "givenness" of information, or directly use physical referents during communication. This relieves the communicative burden.
Sub-components:
Pre-existing:
• Communal common ground
• Personal common ground
Discourse representation:
• Situational context
• Communicative context

Language use is interactive
Many researchers agree that face-to-face communication is a joint activity (Clark, 1996; Schegloff, 1982). This means that language use is achieved by two or more people who coordinate their actions to achieve a common goal. Every decision made during a conversation will depend on the actions of the other. Face-to-face communication is therefore an inherently interactive process, in which two or more participants work together and coordinate their actions to create meaning. The whole, as well as the individual actions of each participant, can be studied within that process. This means that when language production and comprehension are studied outside of the interactive process (i.e., in isolation or based on the behaviour of one person), they will be tapping into inherently different processes and task demands as compared to language when it is used for communication. It is worth noting here that this may be a critical reason why a number of impairment-based therapies for aphasia (e.g., picture naming therapies for word finding) do not show reliable generalisation to functional, real-world communication (Webster et al., 2015). The interactive nature of communication is therefore a core component of face-to-face communication that should be taken into account when assessing language performance in a real-world setting (Barnes & Bloch, 2018; Clark, 1996; Schegloff, 1982).

Language use is multimodal
Face-to-face communication is a fundamentally multimodal phenomenon (Bavelas & Chovil, 2000; Clark, 1996; Kendon, 1980; McNeill, 1992). A number of different modalities or channels of expression are used during communication, such as facial expressions, gesture, prosody, speech and body movements. These channels interact and are interdependent: they integrate into a single composite message. Channels are combined to replace, supplement, complement and emphasize speech, as well as to express emotion (Kendon, 2004; McNeill, 1992). By studying language in isolation, the complexity and interdependence of the different channels are ignored (Vigliocco, Perniss, & Vinson, 2014), and a wealth of information that is relevant for communication is missed. When people communicate with each other in the real world, they use all channels to express meaning, as well as to monitor and understand what the other participant is communicating (Clark & Krych, 2004). Therapeutic approaches that support, encourage or train "total communication" (i.e., not just focusing on verbal input and output) are common in aphasia rehabilitation (Nykanen, Nyrkko, Nykanen, Brunou, & Rautakoski, 2013; Pound, Parr, Lindsay, & Woolf, 2000; Rautakoski, 2011), highlighting the importance of having a measure that captures multimodality in communication.

Language use is based on common ground
Finally, face-to-face communication allows interlocutors to rely on context during the exchange (Clark, 1996). Clark (1996) refers to context for face-to-face communication as common ground: the set of shared knowledge, beliefs and assumptions that exists between two speakers. There are different types of common ground, as described in Table 4. There will be a degree of common ground that exists even before two interlocutors start their conversation (pre-existing common ground), and there is common ground that builds up during conversation (discourse representation). A key premise is that whatever is part of common ground will require less effort (time and/or energy) to refer to during face-to-face communication (Boyle, Anderson, & Newlands, 1994; Horton & Gerrig, 2005), meaning that the more common ground two interlocutors share, the greater the ease with which they can communicate. In some cases, the existence of more common ground can allow the interlocutors to rely less on (complex) linguistic processing for the exchange of information, by relying on the "givenness" of information in dialogue and producing shorter and less "complete" utterances. A simple example is using pronouns ("he" or "she") instead of proper names, or two friends who use the same slang terms. For comprehension, interlocutors can rely on context to restrict the number of possible interpretations for a sentence they have heard (Skipper, 2014).
When measuring a person's ability to communicate in the real world, their ability to rely on common ground should be taken into account. Knowing if and how a person can use common ground to support conversation can provide greater insight into how, and to what degree, that person can compensate for their linguistic difficulties in conversation.

A theoretically founded measure of communication in aphasia
The framework described above identifies three components that define functional communication, namely that it is (1) interactive, (2) multimodal and (3) based on common ground, including (3a) shared knowledge between speakers and the variation in this across different speakers, (3b) the physical environment, and (3c) the communicative environment. In this section of the paper, the existing instruments reviewed above will be checked against the proposed theoretical framework. In addition, we will evaluate whether the instruments provide information on how these components influence communication for PWA. This evaluation is summarised in Table 5.

Table 4. The different sub-types of common ground, as described by Clark (1996).

Type: Pre-existing
• Communal common ground: Communal common ground refers to shared beliefs and knowledge based on a shared nationality or religion. Customs that are specific to a certain country or culture will be shared and readily understood between people from that culture.
• Personal common ground: Personal common ground reflects the number of shared experiences two participants have had together, also referred to as the level of acquaintedness or personal familiarity.

Type: Discourse representation
• Situational context: The situational context includes what is physically present in the perceptual environment.
• Communicative context: The communicative context is an accumulation of what has been referred to earlier in conversation (through any modality).

Standardized tests
The CADL-2 (Holland et al., 1999) is administered by the clinician who asks the PWA questions, requiring the PWA to respond to a given situation, without receiving any form of (structured) feedback from the clinician. Thus, although there is another person present, the CADL-2 does not fully take into account the interactive aspect of communication. The CADL-2 does take note of the use of different modalities in communication, allowing verbal and non-verbal responses on the items. Finally, the CADL-2 takes into account some elements of common ground: it does not explore the PWA's communicative abilities across different speakers, but it attempts to re-create different situations and environments in which someone might need to communicate (e.g., a doctor's office), assessing the PWA's ability to communicate in different settings. Different images are used to provide information on the setting and to situate the question that is posed to the PWA. Since the test does not place PWA in the actual, physical environment, the use of physical context by the PWA to support communication is not explored optimally. In addition, the type of questions posed to the client (test questions rather than conversational questions, e.g., "What should you wear or use on a day like this?") means no substantive communicative context is created between the interlocutors. Exploration of the reliance of PWA on the communicative context is therefore not possible.
The ANELT (Blomert et al., 1994) is set up in a similar fashion to the CADL-2. The test is set up as a role-play, but essentially elicits a monologue from the PWA, with no interaction or feedback exchanged between the clinician and the client. As the ANELT only scores verbal responses, it does not include the multimodal component of communication. Common ground is partially taken into account: different settings in which PWA might find themselves in everyday life are assessed and physical props are used to support the role-play. This allows the clinician to further explore the ability of the PWA to use the physical environment to their advantage. The test does not assess the ability of the PWA to communicate with different conversation partners. It also does not fully assess the influence of the communicative context: PWA are asked one question per scenario in order to avoid negative effects of potential stroke-induced short-term verbal memory problems (Blomert et al., 1994), meaning very little communicative context is built.
The Scenario Test (van der Meulen et al., 2010) assesses multimodality and interactivity in a face-to-face setting, as the test requires the administrator to interact with the client and provide different levels of feedback and help throughout the scenarios of the test. All forms of communication, be it verbal, gestural, written, drawn or use of a communication aid are recorded and contribute to the final score on the test. Although the interaction remains artificial, efforts have been made to structure the feedback as it would be given in a natural setting. The influence of common ground is partially assessed: the ability of the PWA to communicate across a number of different situations is assessed. Similarly to the CADL-2, the test uses illustrations to "set the scene" for the scenario, to which the PWA is asked to respond. The lack of physical objects or props, however, means the use of the physical environment by the PWA to communicate is not assessed. Each scenario in the test includes three different questions, with structured feedback (i.e., a brief interaction), for each question. This means a small amount of communicative context is built for each scenario and therefore theoretically allows the clinician to explore whether the PWA uses the communicative context (i.e., earlier references) to their advantage. However, exploration of this aspect of communication is not part of the official scoring guidelines. Finally, the test does not assess the ability of the PWA to communicate with different conversation partners.

Table 5 (fragment recovered from the page layout): Observational profiles (clinician rated): FCP (Sarno, 1969). Abbreviations: Face Validity; ContV = Content validity. Some measures also reported sensitivity (to discriminate between groups or detect change); we have not reported those here for reasons of space. Psychometric properties are left blank when we have been unable to find any published data. Cut-offs are based on Streiner et al. (2015) and Pritchard et al. (2018). Reliability scores were labelled as follows: <0.7 = low, 0.7-0.9 = moderate, >0.9 = high. Validity scores were labelled: ≥0.3 = good. For the values themselves, see Table 6 in the Appendix.

Non-standardised measures of communicative success
In Ramsberger and Rende's (2002) measure of transactional success, the authors have created a fully interactive task where interlocutors can communicate and provide feedback in a natural manner. As the interlocutors provide feedback spontaneously, systematically assessing the ability of PWA to use different kinds of feedback during communication is not straightforward. The test itself therefore is interactive, but it does not measure how the PWA relies on the conversation partner during communication. Similarly, the measure takes into account all modalities of communication, but the scoring of the test does not report on the use of different modalities by the PWA. This also applies to the use of the physical environment. Common ground is taken into account partially: the ability of the PWA to communicate in different settings is not assessed, but the measure does allow for the assessment of communicative abilities across different conversation partners. Finally, although it does not report on this explicitly in the outcome of the measure, it does take into account the communicative context, as the PWA and their conversation partner speak for an extended period of time about the same topic. The measure of transactional success is defined by the ability of the conversation partner of the PWA to re-tell the story as they have understood it from the PWA. This therefore is an indirect measure of the communicative abilities of the PWA through the interpretation of the conversation partner.
The ACESA, like the ANELT and CADL-2, is not an interactive test. The examiner poses the questions to the PWA, but no further interaction takes place. The measure is partially multimodal, as it does not allow for the use of writing or drawing during communication. The ACESA does not take into account the influence of common ground: it does not assess the influence of different communication partners, different settings in which one can communicate, or the use of the physical environment. The measure partially takes into account the influence of the communicative environment, as the structured conversation could be seen as building up a communicative context that can be used by the PWA.

Observational profiles (clinician, client or proxy rated)
Observational profiles are based on the observation of naturalistic communication and therefore implicitly take into account, to some degree, the three components of communication. The profiles vary considerably in the way and the degree to which these components are explicitly assessed, however. For ease of exposition, we will walk through each component (interactivity, multimodality and common ground) and directly compare profiles, rather than dealing with each profile in turn.
Many profiles include a mix of interactive and non-interactive items. The FCP includes a number of behaviours that are explicitly interactive (e.g., "understanding a simple conversation with one person"), while the majority of the items are focused on non-interactive, linguistic skills (e.g., "saying long sentences", "understanding television"). The CETI (Lomas et al., 1989), on the other hand, focuses heavily on interaction: 15 out of 16 items refer to interactive communicative behaviours. The FOQ-A incorporates the interactive component of communication by assessing communicative acts (e.g., "the person can make routine verbal requests") and by assessing the ability of the PWA to monitor conversation ("this person can recognize mistakes in his or her speech when he or she makes routine verbal requests"). The majority of items on the CAL (Pulvermüller & Berthier, 2008), the ACOM (Hula et al., 2015), the (carer) COAST (Long et al., 2009; Long, Hesketh, Paszek, Booth, & Bowen, 2008), the ASHA-FACS (Frattali et al., 1995) and the TOM Activity Scale (Enderby et al., 2006) implicitly assess the interactive aspects of communication by referring to "communicating" or "conversation" (e.g., "how well could you have a chat with someone you know well?" on the COAST, "participates in conversation" on the ASHA-FACS and "talk about your day with family or friends" on the ACOM).
The degree to which the use of multimodal communication is explicitly assessed varies across profiles. The R-EFCP (Wirz et al., 1990) is most explicit, as it specifically aims to describe the modality in which the speech acts are performed. On the ASHA-FACS most items are indirectly multimodal ("requests information"), while a few are more explicit ("understands facial expression/tone of voice"). The FOQ-A explicitly assesses verbal and non-verbal communication ("this person can answer 'who, what where, when and why' questions correctly either verbally or with gestures"). On the CETI, CAL, COAST, ACOM and FCP, most items are implicitly multimodal (e.g., "communicating his/her emotions", "having a one-to-one conversation with you" on the CETI; "talk about your day with family or friends" on the ACOM) with a small number of questions explicitly assessing multimodal communication (e.g., "how well can you use other ways to help you communicate?" on the COAST, "responding to or communicating anything (including yes or no) without words" on the CETI and "use of gestures" on the FCP). In addition to this, some of the profiles only explore specific modalities used for communicating, such as the FOQ-A which only assesses verbal information and gestures. The more implicit questions about communication potentially allow for different interpretations of the question (i.e., some questions might be interpreted as just being about verbal abilities).
Finally, the observational profiles vary in the extent to which they assess the influence of common ground. Common ground is not explicitly assessed on the FOQ-A and R-EFCP, and only minimally on the FCP and ASHA-FACS. The latter two profiles focus mostly on different settings for conversation (e.g., "understanding conversation with one person" vs. "more than two people", "speaking on the phone", "understand conversation in noisy surroundings" and "following directions"). In these profiles, the influence of communicating with different people (familiar or unfamiliar) and the physical and communicative environment are not assessed. The TOM Activity Scale only explicitly mentions the influence of different environments on communication. The CETI, the CAL, the (carer) COAST and the ACOM dedicate a few items to the effect of different conversation partners (in number and type, e.g., "having coffee-time visits and conversations with friends and neighbours" on the CETI, "join a conversation with a group of people" on the COAST, "have a conversation with strangers" on the ACOM) and settings (e.g., "how does the patient communicate on the telephone" on the CAL, "explain your health concerns to your doctor" on the ACOM), but the use of the physical and communicative environment is not explored.
Crucially, all the profiles lack specificity on how each component affects communication for each individual PWA. The observational profiles are thus useful in getting a general sense of the communicative abilities of a PWA but do not provide detail on what specific communicative behaviours are underlying these scores.

Linguistic analysis of connected speech
More often than not, the CU (Yorkston & Beukelman, 1980), the CIU (Nicholas & Brookshire, 1993) and the IU (McNeil et al., 2001) are administered in a non-interactive setting, i.e., without the presence of a conversation partner (e.g., picture description or story re-tell task), resulting in a monologue type of output. Furthermore, these measures only assess verbal output (speech). Therefore, these measures are non-interactive and not multimodal. Common ground is partially taken into account: the use of the physical environment is not taken into account but PWA can use the communicative context if they are telling a story, though the use of this context is not explored explicitly in the scores.
The Pragmatic Protocol (Prutting & Kirchner, 1987) takes into account most of the model's components in a face-to-face communicative setting. It observes spontaneous conversation, which is inherently interactive, in which the use of all modalities of communication is allowed. Aspects of interactive behaviour (turn taking, providing feedback, etc.) as well as multimodal behaviour (eye gaze, gestures, facial expressions, etc.) are all coded in the protocol. Furthermore, the communicative context is taken into account by looking at verbal aspects such as "specificity/accuracy", which relates to making appropriate lexical choices to convey information (e.g., not under- or over-specifying referents). Use of the physical environment is not taken into account explicitly. Overall, this measure is set up to judge the appropriateness of specific pragmatic characteristics in conversation (i.e., does the PWA show this behaviour and does it facilitate or impede communication), rather than to describe how communication is achieved.

Analysis of interaction
Conversation analysis (CA) focuses on directly observed face-to-face communication and explicitly takes into account the interactive, joint responsibility of communication. CA can take into account the multimodal ways in which interlocutors communicate, though speech is often used as the principal base measure (Ten Have, 2007). CA emphasises the importance of taking into account the communicative context in which a statement is made, thereby partly addressing common ground. It is possible to take into account the physical environment with this methodology, for example, by coding how conversation partners use or refer to objects in their environment. At present there is no standardised measure from the CA approach with norms that can be used in clinic to assess effective communicative ability of PWA, although treatment protocols based on CA principles exist, such as SPPARC (Lock et al., 2001) and Better Conversations (Beeke et al., 2013). More standardized and simplified CA approaches that can be easily applied in clinic may well come in the future (Barnes & Bloch, 2018).

Interim summary
Both the Scenario Test (van der Meulen et al., 2010) and Conversation Analysis (CA) adhere best to the definition of communication as outlined in the theoretical framework, i.e., as an interactive, multimodal and contextualised phenomenon. Both of these methods have been fruitful in generating more knowledge on communication in aphasia, as well as informing therapeutic approaches (Beeke et al., 2013; van der Meulen et al., 2010).
Crucially, the analytic purposes of the Scenario Test and CA are very different. CA is aimed at describing one aspect of communication, namely how interaction is organised between two people. This is done by looking at processes such as turn-taking, sequencing, and repairs. The purpose of CA is to describe how interaction is organised and how this might be atypical, not to explain why people show a particular kind of behaviour (Ten Have, 2007), or to explain the underlying (cognitive) causes of the (a)typical interaction. Observations made through CA are inherently specific to the dyad being studied and do not describe behaviour that can be separated from that particular conversation partner or environment. The detailed analysis that can be obtained through CA was, therefore, not designed to describe or identify general patterns and relationships between variables at the group level (Ragin, 1994, quoted by Ten Have, 2007). As was stated before, there is currently no standardised instrument based on CA that could be used in clinic. Although CA provides very rich, detailed information about interaction between two people, in its current form it does not allow for an analysis of communicative ability that can be easily generalised.
The Scenario Test aims to quantify the effectiveness of communicative attempts by PWA. It provides a score for the communicative ability of the individual (in an interactive setting), while also describing the way in which communication is achieved (through which modalities and the degree of reliance on the conversation partner). The communicative behaviour as measured by the Scenario Test can then be related to other (cognitive or behavioural) measures for that individual, through comparison or further analysis of scores. This makes it possible to attempt to explain the why of the communicative difficulties experienced by the PWA in conversation. The Scenario Test has been standardized, meaning its outcome can be generalized and compared across larger groups of people. It has been shown to be a valid measure of the ability to convey information in simple communicative situations and communicative creativity (van der Meulen et al., 2010).
Thus, from the instruments we have considered, the Scenario Test (van der Meulen et al., 2010) incorporates most of the components from the theoretical framework of communication, while also providing information about how these components influence the communicative ability of the PWA. The Scenario Test is a standardized, objective measure of communication, which allows for the exploration of causal links between cognitive skills and communicative behaviours. In addition to this, the test currently exists in a format that is usable in the clinical as well as the research setting.

Psychometric properties
Our principal aim has been to consider the content validity of the instruments used to measure functional communication in aphasia rehabilitation, i.e., to evaluate whether an instrument samples all the relevant domains of a concept (Streiner, Norman, & Cairney, 2015). To do so we have selected situated language use (Clark, 1996) as a frame for understanding functional communication. Content validity is not the only property that needs to be considered for a measure to be suitable for clinical use. The instruments should also be evaluated on other psychometric properties, such as the consistency of items included in the instrument (internal consistency), the agreement in scoring between raters (inter-rater reliability), and the stability of test scores for the same person over time (test-retest reliability). To that end, we have provided a summary of reliability and validity values for each instrument in Table 5, and the originally reported reliability and validity values in Table 6 in the Appendix. It is worth noting that, when reported, the vast majority of instruments show moderate to high reliability across raters and timepoints, and good validity when correlated with other measures.
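As an illustration only, the labelling cut-offs given in the note to Table 5 (reliability: <0.7 = low, 0.7-0.9 = moderate, >0.9 = high; validity: ≥0.3 = good) can be written as a short Python sketch. The function names and the example values are hypothetical and are not taken from Table 6; the note gives no label for validity values below 0.3, so the sketch returns no label in that case rather than inventing one.

```python
from typing import Optional


def label_reliability(value: float) -> str:
    """Label a reliability coefficient using the cut-offs in the Table 5 note."""
    if value < 0.7:
        return "low"
    if value <= 0.9:  # 0.7-0.9 inclusive is treated as moderate
        return "moderate"
    return "high"


def label_validity(value: float) -> Optional[str]:
    """Label a validity coefficient; the note only defines >=0.3 as 'good'."""
    # Assumption: values below 0.3 are left unlabelled (None),
    # because the source does not name a category for them.
    return "good" if value >= 0.3 else None


if __name__ == "__main__":
    # Hypothetical coefficients, for illustration only.
    for r in (0.65, 0.85, 0.95):
        print(r, label_reliability(r))
    for v in (0.25, 0.45):
        print(v, label_validity(v))
```

Treating the boundaries 0.7 and 0.9 as "moderate" follows the ranges as printed in the note; a different reading of the boundary cases would only require changing the comparison operators.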

Conclusion
In this paper, we have reviewed current assessments of functional, real-world communication against a theoretically founded framework of communication, defined as situated language use. Conversation Analysis and the Scenario Test came closest to the theoretical framework described in this paper. Out of these two, the Scenario Test (van der Meulen et al., 2010) was selected as the best fit for capturing real-life communicative ability in PWA in an objective, standardized manner, in a clinical setting.
In its present form, the Scenario Test has a number of limitations. As it uses role-play, there are cognitive demands placed on PWA by asking them to pretend to be in a situation. It is difficult to consider how this could be altered, but an increased use of props or materials may help reduce some of this burden. The lack of physical referents also limits the test since it removes contextual support that the PWA would have in the real-life equivalent of the situation. The Scenario Test is also prone to ceiling effects for those with mild to moderate verbal impairments. More complex scenarios or a change in scoring may help to capture a broader range of abilities, and to make it suitable for use with PWA with any level of aphasia severity.
The authors hope that the framework presented in this paper will encourage researchers to apply more scrutiny to the concept of functional communication and the instruments used to measure it. It is of crucial importance for the development of effective interventions to have a thorough understanding of real-world communication and of how to capture it.

Note
1. The ability to communicate in the real world or in one's own everyday life will be referred to as functional communication or simply as communication. This refers to skills, including language skills, required to communicate in various situations one might come across in one's day-to-day life. Communication or functional communication is defined in contrast to "language" or "linguistic" skills, which represent the ability to process language in isolation, as demonstrated in decontextualised tasks in the clinic.