Treatment of sentence comprehension and production in aphasia: is there cross-modal generalisation?

ABSTRACT Exploring generalisation following treatment of language deficits in aphasia can provide insights into the functional relation of the cognitive processing systems involved. In the present study, we first review treatment outcomes of interventions targeting sentence processing deficits and, second, report a treatment study examining the occurrence of practice effects and generalisation in sentence comprehension and production. In order to explore the potential linkage between processing systems involved in comprehending and producing sentences, we investigated whether improvements generalise within (i.e., uni-modal generalisation in comprehension or in production) and/or across modalities (i.e., cross-modal generalisation from comprehension to production or vice versa). Two individuals with aphasia displaying co-occurring deficits in sentence comprehension and production were trained on complex, non-canonical sentences in both modalities. Two evidence-based treatment protocols were applied in a crossover intervention study, with the sequence of treatment phases randomly allocated. Both participants benefited significantly from treatment, leading to uni-modal generalisation in both comprehension and production. However, cross-modal generalisation did not occur. The magnitude of uni-modal generalisation in sentence production was related to participants’ sentence comprehension performance prior to treatment. These findings support the assumption of modality-specific sub-systems for sentence comprehension and production that are linked uni-directionally from comprehension to production.

Treatment studies that target deficits in sentence comprehension and production form a promising methodology, which can add to the discussion on single versus distinct syntactic processing systems (Nickels, Rapp, & Kohnen, 2015). In particular, the observation of cross-modal generalisation can provide insights into the functional relation between sentence processing sub-systems (Mitchum, Haendiges, & Berndt, 1995; Nickels, Kohnen, & Biedermann, 2010). Thus, improvements in sentence processing in the non-treated modality would favour the assumption of a single syntactic processing system, upon which both modalities draw, whereas a lack of cross-modal generalisation would point to distinct syntactic processing systems.
The architecture of the sentence processing system is also studied in psycholinguistic research involving language-unimpaired participants. For example, the finding of cross-modal syntactic priming effects (from comprehension to production and vice versa) is taken as evidence for a single syntactic processing system (Bock, Dell, Chang, & Onishi, 2007; Branigan, Pickering, & McLean, 2005) and the same neuronal networks seem to be involved in uni- as well as cross-modal syntactic priming (Segaert, Menenti, Weber, Petersson, & Hagoort, 2012). Yet, in psycholinguistic research, sentence comprehension and production have mostly been studied separately (e.g., Bock & Levelt, 1994; Levelt, 1989; MacDonald, Pearlmutter, & Seidenberg, 1994) and models of sentence processing are not explicit about whether syntactic comprehension and production rely on a single or distinct mechanisms.
For example, in the model of sentence processing proposed by Garrett (1980, 1995), comprehension and production systems are assumed to be "distinct but intricately intertwined" (Garrett, 1995, p. 881) and concurrently active. However, in this model both modalities draw upon a shared mental lexicon, rendering them mutually dependent on each other. Schröder et al. (2015) come to the conclusion that the abilities necessary to produce and comprehend sentences are mostly subserved by modality-specific processes. On the basis of their data, the authors conclude that good comprehension abilities prior to production treatment enhance generalisation to untrained sentences when sentence production is trained. Similarly, Dickey and Yoo (2010) report that pre-treatment auditory comprehension abilities as measured in the Western Aphasia Battery (WAB, auditory comprehension score; Kertesz, 1982) are a significant predictor of the size of improvements in producing trained sentence types, although in their meta-analysis comprehension abilities were not related to generalisation to untrained sentence types. Additionally, Dickey and Yoo (2010) report that scores for the comprehension of complex sentences as assessed with the Northwestern Assessment of Verbs and Sentences (NAVS; Thompson, 2012) or the Philadelphia Comprehension Battery for Aphasia (Saffran, Schwartz, Linebarger, Martin, & Bochetto, 1988) were neither predictive of improved production of trained sentence structures nor of the generalisation to untrained sentence structures. Taken together, the aforementioned findings can be taken as evidence for distinct, modality-specific processing systems involved in sentence comprehension and production, linked via a uni-directional connection from comprehension to production.
The aim of the present study is two-fold. First, we provide a review of studies exploring uni- and cross-modal treatment outcomes in sentence comprehension and production in aphasia. Second, we present results of a treatment study, which we conducted in order to investigate whether treatment of sentence processing administered in a single modality generalises to the other non-treated modality, and whether comprehension is related to production via a uni-directional connection as proposed by Schröder et al. (2015).
MT is based on the Mapping Hypothesis (e.g., Schwartz et al., 1994; Schwartz, Saffran, & Marin, 1980), which attributes sentence processing deficits to impairments in mapping grammatical roles of nouns onto their thematic roles and vice versa. Following this, MT aims at improving the ability to connect syntactic and semantic/thematic structures. In one study on treatment of sentence comprehension (Rochon & Reichman, 2004), MT was combined with an additional acting-out task using figurines (MT-act). According to Kiran et al. (2012), acting-out may enhance treatment-induced improvements, as it targets intermittent reductions in working memory capacities necessary for the interpretation of syntactic structures and assignment of thematic roles. Such reductions are assumed to be present in IWA with sentence processing deficits (e.g., Caplan, DeDe, & Michaud, 2006; Caplan, Waters, Dede, Michaud, & Reddy, 2007).
The TUF approach, as for example proposed by Jacobs and Thompson (2000), focuses on visualising the abstract grammatical properties of sentences and the syntactic movements required to form non-canonical sentences. It also targets identification of thematic roles in a sentence. Table 1 summarises treatment effects described in 17 treatment studies using either MT or TUF investigating not only uni-modal (e.g., improvements in comprehension or production of sentences) but also cross-modal generalisation in sentence processing. It is important to note that, although some studies refer to MT, the application of particular tasks and procedures as well as generated theoretical assumptions are heterogeneous to some extent. We review treatment results for 39 cases with associated impairments in sentence comprehension and production, for which treatment was administered either solely in comprehension or in production in the respective treatment phase(s) and present evidence of how successful they have been.
Given the debate on whether treatment effects in sentence processing generalise across modalities, we distinguish between practice effects (i.e., improvements in trained sentences, which are, importantly, restricted to the treated modality) and several types of generalisation: Uni-modal generalisation refers to improvements in the treated modality only. These can occur either for untrained exemplars of the trained sentence type or for untrained sentence types. Cross-modal generalisation indicates improvements in the untreated modality, which have been investigated with a variety of tasks (e.g., picture description, storytelling, sentence elicitation in production; and sentence-picture matching in comprehension) using various measures (e.g., number of correctly produced verb inflections, complexity of verb-argument structures produced, mean length of utterance, noun/verb-ratio, number of utterances classified as sentence structures in production; and sentence-picture matching accuracy in comprehension). Treatment effects reported in the respective studies were reviewed with respect to the occurrence/absence of improvements and the application of statistics (Yes S /No S = tested for statistical significance, Yes/No = numerical change without statistical analyses). Moreover, Table 1 provides information on whether a control task unrelated to the linguistic activity targeted during treatment or pre-treatment baseline assessments has been administered, in order to evaluate whether outcomes are treatment specific. In the next section, we summarise outcomes of the treatment studies targeting sentence comprehension followed by a section considering results of treatment studies focusing on sentence production.
Significant practice effects (i.e., uni-modal improvements on trained exemplars of the trained sentence type) were observed in 7/13 participants (Byng, 1988; Mitchum et al., 1995; Schwartz et al., 1994). Practice effects are also reported for another three participants in Rochon and Reichman (2004) and Jacobs and Thompson (2000), although no statistical analyses are provided. No information is given regarding practice effects for the two participants in Jones (1986) and Nickels et al. (1991).
Significant uni-modal generalisation to untrained exemplars of the trained sentence type (usually reversible active sentences) was found for 8/13 IWA (Byng, 1988; Mitchum et al., 1995; Schwartz et al., 1994; albeit for one participant in the absence of a practice effect). Three other studies involving four participants also reported uni-modal generalisation to trained sentence types, although without statistical computations (Jacobs & Thompson, 2000; Jones, 1986; Rochon & Reichman, 2004). Uni-modal generalisation to untrained structures was found for 7/13 participants and was significant for six of those (Byng, 1988; Schwartz et al., 1994; albeit for one participant in the absence of a practice effect).
Cross-modal generalisation, i.e., improvements in sentence production after treatment of sentence comprehension, occurred in all but one study (Mitchum et al., 1995) for altogether 10/13 IWA, although the effect was significant for three participants only (Byng, 1988; Nickels et al., 1991). As mentioned above, across studies, a variety of constrained or narrative production tasks and measures have been used to assess cross-modal generalisation. Only two studies used a sentence-elicitation task (Jacobs & Thompson, 2000; Rochon & Reichman, 2004), allowing assessment of production of exactly the same sentence types as trained in comprehension. This makes it difficult to determine whether effects of cross-modal generalisation are comparable across the different studies. In addition, treatment protocols were not always limited to sentence comprehension, but required the participants to produce the trained sentences as well (Byng, 1988; Schwartz et al., 1994). This might have confounded the finding of cross-modal generalisation. However, in Jones (1986) and Rochon and Reichman (2004), IWA were not required to produce trained sentences during comprehension treatment and cross-modal generalisation did occur (albeit without statistical confirmation).
Regarding the issue of whether changes after treatment can be ascribed to the particular treatment protocol, results for seven IWA were accompanied by stable performance in a control task or pre-treatment assessments (Byng, 1988; Jacobs & Thompson, 2000; Mitchum et al., 1995; Nickels et al., 1991; Rochon & Reichman, 2004), although evidence was not always accompanied by statistical computation. In two studies, information about control tasks was lacking (Jones, 1986; Schwartz et al., 1994).
Overall, studies on sentence comprehension treatment in IWA with associated impairments in sentence comprehension and production point to the conclusion that MT, MT-act, and TUF evoke uni-modal treatment effects: considerable improvements in trained sentences and generalisation to untrained exemplars of the trained sentence type(s); while generalisation to untrained sentence types occurred less frequently. Concerning cross-modal generalisation to sentence production, several studies reported improvements, although the administered tasks and measures varied enormously and changes were often only demonstrated numerically or descriptively in the respective measures.
Significant uni-modal generalisation to untrained exemplars of the trained sentence type was found in 13 IWA (Harris et al., 2012; Marshall et al., 1993; Rochon et al., 2005; Rochon & Reichman, 2003; despite the absence of a practice effect; Stadie et al., 2008). Two other studies involving three participants reported a numerical increase in production accuracy of untrained items of the trained sentence type (Jacobs & Thompson, 2000; Thompson, 1998).
In contrast, cross-modal generalisation to sentence comprehension following treatment of production occurred in only 4/26 IWA (Byng et al., 1994; Harris et al., 2012; Thompson, 1998; Weinrich et al., 2001). However, for the two participants reported in Byng et al. (1994) and Thompson (1998), only descriptive data were provided. For the two participants reported in Weinrich et al. (2001) and Harris et al. (2012), although the generalisation has been confirmed statistically, causality with respect to the applied treatment remains ambiguous, as no stable performance prior to treatment or in a control task was reported. In fact, across all studies included in the review, a causal relation between applied treatments and the occurrence or lack of treatment effects, established by the use of a control task, can only be assumed for 7/26 IWA (Nickels et al., 1991; Stadie et al., 2008). Note that in contrast to the various tasks used to detect cross-modal generalisation after comprehension treatment, improvements in sentence comprehension following production treatment were assessed with the same type of task across studies: sentence-picture matching.
In sum, treatment of sentence production based on MT or TUF generally results in practice effects and generalisation within the same modality. Generalisation has been observed to untrained exemplars of the trained syntactic structure and to untrained sentence types. In contrast, evidence of cross-modal generalisation following sentence production treatment is mostly lacking.

Summary
As cross-modal generalisation is largely absent from production to comprehension, it can be assumed that the processing sub-systems for comprehension and production of sentences are modality-specific. However, the review of treatment studies reveals that cross-modal generalisation seems to occur from sentence comprehension to production, pointing to the conclusion that there is a uni-directional link between both modalities, that is, from sentence comprehension to production (but not from production to comprehension), as suggested in Schröder et al. (2015).
However, findings of cross-modal generalisation from comprehension to production should be regarded with caution because the evaluation of generalisation is often methodologically limited. This is because the assessment of generalisation mostly relied on narrative tasks applying various measures (see Table 1; Byng, 1988; Jones, 1986; Mitchum et al., 1995; Nickels et al., 1991; Schwartz et al., 1994). Since performance varies in tasks tapping narrative as opposed to constrained language production (e.g., Glosser, Wiener, & Kaplan, 1988), therapy outcomes should be investigated in constrained tasks, such as spoken sentence elicitation.

Aims
The present study examined the efficacy of sentence comprehension and production treatment in two German-speaking IWA with associated deficits in sentence comprehension and production prior to intervention. Treatment of sentence comprehension involved MT-act (based on Kiran et al., 2012); in sentence production, the German adaptation of TUF (e.g., Stadie et al., 2008; Thompson, 2001) was applied. Object relative clauses (ORC) were trained in two subsequent treatment phases, each targeting a single modality, and several treatment outcomes (practice and generalisation effects within and across modalities) were investigated using a crossover design (Coltheart, 1991). With respect to these treatment effects, we addressed the following questions and hypotheses:
(1) Uni-modal treatment effects: Do practice effects and generalisation occur within the treated modality? Generalisation is assessed for (a) untrained exemplars of the trained sentence type (ORC) and (b) untrained sentence types (object who-questions, whoQ, and subject relative clauses, SRC).
(2) Cross-modal treatment effects: Is there evidence of generalisation to the untrained modality, i.e., does performance in comprehending ORC (and other sentence types) improve after treating ORC in production and vice versa?
Regarding the occurrence of uni-modal treatment effects, following sentence production treatment using TUF (Thompson, 2001), we expect practice effects and generalisation to untrained items of the trained sentence type (ORC) to occur. In addition, following the Complexity Account of Treatment Efficacy (CATE; e.g., Stadie et al., 2008; Thompson, Shapiro, Kiran, & Sobecks, 2003), we expect generalisation to occur from trained complex to untrained simpler but linguistically related sentence types (whoQ). Related sentence types share similar grammatical properties in terms of the syntactic movement operations involved, i.e., they rely on comparable syntactic processes, but they are less complex than trained structures. Again based on CATE, no generalisation should occur to SRC, as they involve another type of syntactic movement operation. Following sentence comprehension treatment using MT-act, we expect improved comprehension performance for trained and untrained exemplars of ORC based on the results of Kiran et al. (2012). Moreover, Kiran and colleagues propose that syntactic deficits are associated with reductions in resources subserving sentence comprehension, and that MT-act addresses the ability to utilise these resources. Thus, uni-modal generalisation to untrained sentence types might be observable for untrained sentences irrespective of the movement type involved (i.e., for whoQ and SRC).
An additional objective of the present study was to add to the debate on single versus distinct processing systems responsible for sentence comprehension and production. In particular, we aimed at finding support for the uni-directional linkage from comprehension to production. As proposed by Schröder et al. (2015), good comprehension performance seems to assist re-learning processes in sentence production. Thus, the following research question arises:
(3) Are uni-modal generalisation effects in production related to the pre-treatment performance in sentence comprehension?
If good sentence comprehension supports re-learning in sentence production due to a uni-directional connection, then comprehension performance (prior to production treatment) should significantly predict the occurrence of generalisation effects within the treated modality, whereas the pre-treatment production performance should not be predictive of the occurrence of generalisation following comprehension treatment.

Participants
Two monolingual German-speaking individuals with aphasia resulting from a unilateral lesion in their dominant hemisphere took part in the study. Participant CM1, a male wholesaler, was 39 years old, had received 10 years of education and was 3 years 4 months post-onset at the beginning of the present investigation; participant CM2, a female lawyer, was 57 years old, had received 13 years of education and was 11 years 4 months post-onset when the treatment study started. Neither participant suffered from dysarthria and only CM1 presented with a mild apraxia of speech. Both participants were classified as individuals with Broca's aphasia in a standardised aphasia battery (Aachen Aphasia Test, AAT; Huber, Poeck, Weniger, & Willmes, 1983) and showed non-fluent, agrammatic speech output. The mean length of utterance (MLU) and noun-verb-ratio (N/V) were as follows: CM1: MLU = 1.33, N/V = 30.0; CM2: MLU = 3.0, N/V = 1.07. Besides word-finding difficulties and phonological paraphasias, their spontaneous speech was syntactically simplified with frequent omission or substitution of function words and of bound grammatical morphemes. During the present study, CM2 received no further speech language therapy and CM1 was involved in anomia treatment targeting nouns, without focusing on sentence processing.
Inclusion in the present study required additional deficits in comprehension of non-canonical sentences prior to treatment, assessed with a German sentence comprehension test (Sätze verstehen; Burchert, Lorenz, Schröder, De Bleser, & Stadie, 2011). Here, both participants performed at chance with reversible non-canonical active sentences and significantly better with reversible canonical active sentences, indicating a word order effect, CM1: χ²(1) = 24.92, p < .05; CM2: χ²(1) = 4.53, p < .05, both two-tailed chi-square tests without Yates correction. In order to rule out pre-lexical and lexical deficits as a possible cause for impaired sentence comprehension, both participants were required to show performance within the range of normal controls in a task requiring the discrimination of minimal word pairs and in an auditory word-picture-matching task taken from LeMo (De Bleser, Cholewa, Stadie, & Tabatabaie, 2004). Details of the assessment administered before the treatment study are provided in Table 2.

Materials
Sentences and their corresponding black-and-white pictures were taken from a German treatment programme for sentence production deficits (Komplexe Sätze; Schröder, Lorenz, Burchert, & Stadie, 2009). Additionally, concept cards and figurines were used during treatment.

Sentences
All 90 sentences used in the present study were derived from semantically reversible actions of 20 transitive verbs combined with two animate nouns. Across the sentences, nouns and verbs were matched for their combined written and spoken lemma frequencies (Baayen, Piepenbrock, & Rijn, 1993). In total, 60 object relative clauses (ORC), 10 subject relative clauses (SRC) and 20 object who-questions (whoQ) were used for baseline and post-treatment assessments. Relative clauses (ORC, SRC) were either case-marked (with two masculine, singular nouns) or number-marked (one feminine noun in plural and one neuter singular noun).
ORC were divided into the following three sets, counterbalanced across participants and treated modality (comprehension and production). Set 1 consisted of 20 ORC derived from 10 different verbs. Each verb was paired with two animate nouns in order to construct two different ORC, in which the nouns were allocated to different grammatical roles. Five verbs were combined with two masculine singular nouns resulting in 10 case-marked sentences (i.e., both noun phrases were unambiguously marked for either nominative or accusative case). The other five verbs were combined with one neuter singular noun and one feminine noun in plural, resulting in 10 number-marked sentences (i.e., both nouns were ambiguous between nominative and accusative case and only verb inflection disambiguated the sentence meaning). In order to assess performance on untrained exemplars of the trained sentence type, we arranged two further sets of ORC. Set 2 comprised ORC with the same verbs as in Set 1 and differed with respect to the nouns and their gender. In this set, verbs that resulted in case-marked sentences in Set 1 were used to construct number-marked sentences and vice versa. Set 3 comprised 20 ORC constructed with verbs and nouns other than those of Set 1 and 2. The SRC and whoQ contained the same verbs as in the Set 3 ORC but other nouns. Table 3 gives an overview of the item structure and provides examples for each sentence type.
Note to Table 2: For the AAT (Huber et al., 1983), percentiles are provided. For selected tasks of LeMo (De Bleser et al., 2004) and selected subtests of Sätze verstehen (Burchert et al., 2011), the number of correct responses is provided with corresponding performance levels in parentheses (normal = scores within range of age-matched controls, impaired = scores below 2 SD from mean of age-matched controls, chance = scores within chance range).

Pictures, concept cards, figurines
There were 90 black-and-white drawings displaying the characters engaged in the action of a respective sentence (Schröder et al., 2009). There were two concept cards of a doer and a receiver (see Figure 1). The concept card of the doer was red-framed and depicted a running person, indicating an activity. The concept card of the receiver was blue-framed and depicted a figure standing quietly, thus, being passive. Finally, there were 40 Playmobil® figurines unambiguously representing the nouns of the sentences. All figurines were used for acting out the sentences. In addition, tools required to enact actions like "vaccinate" (syringe) or "measure" (yardstick) were provided.

Study design
In the intervention study, we applied a crossover treatment design (AABACA). Following two pre-treatment baseline assessments (BL1 and BL1'), ORC were trained in two successive treatment phases (Tx1 and Tx2) with post-treatment performance being assessed at the end of each phase (i.e., post-Tx1 and post-Tx2). In each phase, intervention focused solely on comprehension or on production in order to evaluate cross-modal generalisation effects from production to comprehension and vice versa (see Figure 2). The sequencing of production and comprehension treatment was randomly allocated and counterbalanced across participants: CM1 started with comprehension treatment in Tx1 followed by production treatment in Tx2, and the reverse ordering was administered to CM2 (Tx1: production, Tx2: comprehension treatment).
Note to Table 3: Example whoQ: Who is the son catching? ORC = object relative clauses; SRC = subject relative clauses; whoQ = who-questions.
Assessments of sentence comprehension and production were administered before and after each treatment phase, involving the set of trained and untrained ORC (each n = 20), control ORC (n = 20 in comprehension, n = 10 in production), control SRC (n = 10) and control whoQ (n = 20). Thus, all assessments involved the production and comprehension of trained and untrained sentence structures in order to detect improvements within a modality (modality-specific or uni-modal treatment effects) or across modalities (cross-modal). Here, we distinguish between practice effects, i.e., improvement in trained sentences, and generalisation effects to untrained items. Generalisation can be observed to sentences of the treated structure either containing the same verbs with other nouns, or differing with respect to both nouns and verbs. The former sentences are referred to as untrained ORC, the latter as control ORC.
During intervention in Tx1, participants were trained on ORC of Set 1, while Set 2 remained untrained. In Tx2, both participants were trained on ORC of Set 2. ORC of Set 3, SRC and the whoQ remained entirely untreated during both intervention phases and, thus, served as control items. Furthermore, treated items were pseudo-randomised with a maximum of three successive sentences marked for either case or number. Additionally, sentences containing the same verb were never adjacent.
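The ordering constraints just described can be illustrated with a short sketch. It does not reproduce the procedure used in the study, which is not specified further; it assumes simple rejection sampling (reshuffle until both constraints hold), and the function names, the item tuples and the placeholder verb labels are hypothetical.

```python
import random

def is_valid(order, max_run=3):
    """Check both constraints on a list of (verb, marking, variant) tuples."""
    run = 1
    for prev, cur in zip(order, order[1:]):
        if prev[0] == cur[0]:
            # Sentences containing the same verb must never be adjacent.
            return False
        # Count successive sentences sharing the same marking (case/number).
        run = run + 1 if prev[1] == cur[1] else 1
        if run > max_run:
            return False
    return True

def pseudo_randomise(items, max_run=3, seed=None, max_tries=10_000):
    """Reshuffle until a valid order is found (rejection sampling, assumed)."""
    rng = random.Random(seed)
    order = list(items)
    for _ in range(max_tries):
        rng.shuffle(order)
        if is_valid(order, max_run):
            return order
    raise RuntimeError("no valid order found within max_tries")

# Hypothetical item list mirroring the structure of one ORC set:
# 10 verbs, two sentences per verb, 5 verbs case-marked, 5 number-marked.
items = [(f"verb{i}", "case" if i < 5 else "number", v)
         for i in range(10) for v in (1, 2)]
order = pseudo_randomise(items, seed=1)
```

Rejection sampling is adequate here because the item set is small; for much larger sets a constructive algorithm might be preferable.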
Each treatment phase comprised a maximum of eight treatment sessions (approximately 45-60 minutes, twice a week) or ended if a participant reached 90% correct responses in three consecutive sessions. During the present treatment study, participants received no other treatments targeting sentence comprehension or production.
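The termination rule for a treatment phase can be stated compactly in code. The sketch below assumes that the criterion counts as met when accuracy is at least 90% in three sessions in a row; the function name and the accuracy values in the usage lines are hypothetical and are not study data.

```python
def sessions_until_stop(accuracies, criterion=0.90, streak=3, max_sessions=8):
    """Return how many sessions are delivered before the phase ends.

    accuracies: per-session proportion correct, in chronological order.
    The phase ends after `streak` consecutive sessions at or above
    `criterion`, or after `max_sessions`, whichever comes first.
    """
    run = 0
    for n, acc in enumerate(accuracies[:max_sessions], start=1):
        run = run + 1 if acc >= criterion else 0
        if run == streak:
            return n
    return min(len(accuracies), max_sessions)

# Hypothetical accuracy profiles (not taken from the study):
early_stop = sessions_until_stop([0.5, 0.9, 0.95, 1.0, 0.3])  # criterion met in session 4
full_phase = sessions_until_stop([0.5] * 10)                  # capped at eight sessions
```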
Reading of non-words (LeMo subtest 14; De Bleser et al., 2004), a task tapping language activities unrelated to those targeted during intervention, was administered as a control task at three points in time (i.e., before treatment and after each treatment phase). Before the onset of the treatment study, participants had received speech-and-language therapy for varying periods of time.

Assessment and scoring of responses
Comprehension and production of sentence structures (ORC, SRC, and whoQ) were assessed in blocks varying systematically according to morphological properties (case- versus number-marked) and lexical content of the verb. Baseline and pre- and post-treatment testing of comprehension was assessed with an acting-out task using figurines, which were introduced for each item by the examiner. Participants' responses were scored as correct when the thematic agent and theme roles were unequivocally assigned to the correct figurine and the action was accurately acted out. For example, for a sentence like I see the son, who the father is kissing, the participant should use the father figurine to actually do the kissing, while the son receives the action.
Control data for accuracy in the acting-out task (see Table 4) were gathered in a pilot study involving 19 participants without any history of neurological or learning impairment (10 male, 9 female, mean age: 42.1 years, SD = 21.77, range: 20-75 years). Based on the mean correct responses, the normal range, that is, absolute scores which do not significantly differ from controls' performance, was determined using the criterion suggested by Crawford and Garthwaite (2002).
Production was assessed in a sentence-elicitation task (Jacobs & Thompson, 2000; Schröder et al., 2009). For this task, the examiner presented two pictures, a target and a foil displaying the same action with reversed thematic roles. The participant was asked to produce a sentence for the target picture using the same syntactic structure as provided by the therapist in a sentence describing the foil picture. The participant's responses were scored as correct when word order was accurate (non-canonical in ORC and whoQ, canonical in SRC). Moreover, accuracy of case and number marking of determiners, the relative pronoun (in ORC and SRC) or the interrogative pronoun (in whoQ) as well as verb inflection was evaluated. In contrast, errors in gender assignment and the occurrence of semantic and phonological paraphasias were tolerated. Participants' performance prior to treatment was classified with respect to control data taken from Komplexe Sätze (Schröder et al., 2009).

Treatment procedures
In a practice session prior to the intervention, the therapist introduced the concept cards of the doer and receiver using five semantically irreversible ORC. Following this, the participants were asked to identify agent and theme in each of those practice sentences. Participants had to perform 100% correct twice during the practice session before treatment of sentence comprehension or production started. Sentence production therapy was based on a German treatment programme (Komplexe Sätze; Schröder et al., 2009) adopting the method of TUF (Jacobs & Thompson, 2000) by mirroring the movement steps necessary to derive the non-canonical target sentence with word cards (see Stadie et al., 2008, for a detailed description). It involved a sentence-elicitation task targeting ORC. If the participant's response was incorrect during the first elicitation probe, a series of hierarchical steps was administered. The therapist used word cards to illustrate the required phrasal movements of how an ORC is derived from two canonical active sentences. Thereafter, the participant was asked to repeat the manipulations of word cards and to produce the corresponding ORC orally.
Comprehension treatment primarily focused on the demonstration of thematic roles (doer and receiver) and the action expressed in an ORC using an acting-out task (Kiran et al., 2012) with figurines. Following the participant's response during the first comprehension probe, a series of hierarchical steps was administered. On the basis of the oral and written presentation of the sentence, the black-and-white picture and the concept cards (see Figure 1), the therapist pointed out the action, the agent and the theme several times. Finally, the participant was asked to act out the sentence meaning using the figurines. The different steps of the treatment protocols are provided in Table 5.

Data analysis and identification of treatment effects
For statistical significance, we adopted an alpha level of .05. Comparisons of the number of correct responses within an assessment were computed using Fisher's exact test (two-tailed). In order to detect treatment effects, significance of changes across assessments was determined using McNemar's test. According to the procedure provided by Cohen (1992a), we additionally report effect sizes (g) for significant McNemar results. The maximum effect size is g = .5; thus, g = .25 indicates a large effect, g = .15 a medium effect, and g = .05 a small effect. Note that, as "every statistical test has its own effect size index" (Cohen, 1992b, p. 98), g is calculated differently from other effect size indices and, thus, the benchmarks for large, medium and small magnitude are also numerically different from benchmarks of other effect sizes.
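As an illustration of the paired comparison described above, the following minimal sketch computes a McNemar statistic and a signed effect size g from the two discordant cell counts. It rests on assumptions that the text does not confirm: the uncorrected chi-square form of McNemar's test (no continuity correction), and g defined as the proportion of changed items that improved minus .5, so that negative values indicate a decline. The function names and the example counts are hypothetical.

```python
from math import erfc, sqrt


def mcnemar_chi2(improved: int, declined: int) -> tuple[float, float]:
    """Uncorrected McNemar chi-square (df = 1) and its p-value.

    improved: items incorrect at the first and correct at the second assessment
    declined: items correct at the first and incorrect at the second assessment
    Items with unchanged accuracy do not enter the test.
    """
    changed = improved + declined
    if changed == 0:
        return 0.0, 1.0
    chi2 = (improved - declined) ** 2 / changed
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


def cohens_g(improved: int, declined: int) -> float:
    """Signed effect size g: share of changed items that improved, minus .5."""
    changed = improved + declined
    if changed == 0:
        return 0.0
    return improved / changed - 0.5


# Hypothetical counts: 9 items improved, 1 item declined
chi2, p_value = mcnemar_chi2(9, 1)
g = cohens_g(9, 1)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}, g = {g:.2f}")
```

Under this definition, g cannot exceed .5 in absolute value, which matches the maximum stated above; with small numbers of changed items an exact binomial version of the test may be preferable.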
In the following, we report unique treatment effects, as only these can unequivocally be attributed to the preceding intervention phase (Stadie et al., 2008). Thus, all significant practice or generalisation effects arising after the first treatment phase (Tx1) were regarded as being unique. Following the second treatment phase (Tx2), significant improvements can only be interpreted as being unique if no practice or generalisation effects occurred previously. According to Stadie et al. (2008), in case of significant improvements after the first intervention phase, these improvements may be sustained throughout the second treatment phase. In this case, any additional improvements arising after the second treatment phase could in principle result from the specific treatment applied to the other modality in the second phase or from a summation of improvements due to treatment of both modalities.

Note. ORC = object relative clause, e.g., I see the son who the father is kissing; picture (A) = depicted action; picture (B) = same depicted action as in (A) but with reversed theta roles; written sentence (1) = declarative canonical sentence, e.g., The father is kissing the son; written sentence (2) = matrix clause for ORC, e.g., I see the son.
For the investigation of the uni-directional linkage hypothesis from sentence comprehension to production (Schröder et al., 2015), we tested whether sentence comprehension performance before production treatment was a significant predictor for uni-modal generalisation within sentence production after treatment. Additionally, we explored the opposite path, i.e., whether sentence production performance prior to comprehension treatment significantly predicted post-Tx uni-modal generalisation within sentence comprehension. For exploration of the linkage hypothesis from comprehension to production, we operationalised the dependent variable as follows: For each item of the untrained and control ORC sets that was not produced correctly in the pre-treatment assessment, we coded the post-treatment performance as 1 if it was correctly produced at post-Tx (indicating generalisation in performance). If production was still incorrect at post-Tx, the respective item was coded as 0. Thus, this matrix represented positive changes in paired observations (i.e., the number of changes from incorrect to correct responses) as 1 and the lack of change as 0, thereby inherently encompassing information about the extent of uni-modal generalisation. The logistic regression model, fitted with a binomial error distribution and logit link function, included the proportion of correctly comprehended ORC for each of the untrained sets prior to sentence production treatment as fixed effects (i.e., as the predictor values), and intercepts for subjects and by-subject random slopes for the effect of pre-treatment comprehension performance as random effects. For exploration of the linkage hypothesis from production to comprehension, we applied the mirror image of this procedure (re-coding pre- and post-treatment comprehension performance instead of production performance and including the proportion of correctly produced ORC for each of the untrained sets prior to sentence comprehension treatment as predictor in the model).
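The coding scheme for the dependent variable and the predictor can be sketched as follows. This is only an illustration under assumptions: item-level accuracy is represented as lists of booleans, and the function names and example data are hypothetical rather than taken from the study.

```python
def code_generalisation(pre_correct: list[bool], post_correct: list[bool]) -> list[int]:
    """Code post-treatment outcomes for items that were incorrect before treatment.

    Returns 1 for an item that changed from incorrect to correct and 0 for an
    item that stayed incorrect. Items already correct before treatment are
    excluded, as they cannot show a positive change.
    """
    if len(pre_correct) != len(post_correct):
        raise ValueError("pre and post assessments must cover the same items")
    return [
        1 if post else 0
        for pre, post in zip(pre_correct, post_correct)
        if not pre
    ]


def proportion_correct(responses: list[bool]) -> float:
    """Predictor value: proportion of correct responses in one item set."""
    return sum(1 for r in responses if r) / len(responses)


# Hypothetical item set: production accuracy before and after production treatment
pre_production = [False, True, False, False]
post_production = [True, True, False, True]
outcome = code_generalisation(pre_production, post_production)

# Hypothetical comprehension accuracy for the same set before production treatment
pre_comprehension = [True, False, True, True]
predictor = proportion_correct(pre_comprehension)
print(outcome, predictor)
```

Each coded item would then enter the regression as one binary observation, paired with the predictor value of the set it belongs to.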
For model parameter estimation, we used the maximum likelihood estimation procedure and determined statistical significance of the predictor variable by model comparisons using log-likelihood ratio tests, for which we report the Chi-square statistics. Furthermore, we provide the coefficient estimate, its standard error, z-score, and the corresponding p-value.
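The model comparison can be illustrated with a short sketch of a log-likelihood ratio test for a single added predictor (one degree of freedom). The log-likelihood values below are hypothetical placeholders and not results from the study; the helper names are likewise assumptions. Only the conversion from a chi-square statistic with df = 1 to a p-value is standard.

```python
from math import erfc, sqrt


def chi2_p_value_df1(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(max(chi2, 0.0) / 2))


def likelihood_ratio_test(loglik_null: float, loglik_full: float) -> tuple[float, float]:
    """Compare a null model with a model that adds one fixed-effect predictor.

    The statistic is twice the gain in log-likelihood; with one added
    parameter it is referred to a chi-square distribution with df = 1.
    """
    chi2 = 2 * (loglik_full - loglik_null)
    return chi2, chi2_p_value_df1(chi2)


# Hypothetical log-likelihoods of the null model and the model with the predictor
chi2, p_value = likelihood_ratio_test(loglik_null=-30.0, loglik_full=-28.0)
print(f"chi2(1) = {chi2:.2f}, p = {p_value:.3f}")
```

With these placeholder values the statistic is 4.0, which corresponds to a p-value of about .046 at one degree of freedom.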
Concerning the number of treatment sessions, both participants reached the criterion for ending the sentence comprehension treatment after five sessions. Sentence production treatment ended after the predefined maximum of eight sessions for both CM1 and CM2.
Participants' performance in production and comprehension in two separate pre-treatment assessments is illustrated in Table 6. Both participants displayed stable impairments in producing and comprehending ORC across pre-treatment assessments (p > .05 for all χ² comparisons). No significant differences were observed for either participant across sets of ORC (Sets 1, 2, 3) within production and comprehension prior to treatment. Overall, comprehension performance was significantly better than production for both participants (p < .05 for all χ² comparisons). Yet, both production and comprehension performance were sufficiently impaired to allow for the detection of significant improvements after treatment. Treatment effects were computed based on results of the first pre-treatment baseline. Table 7 summarises results obtained before and after each treatment phase from both participants. As outlined in the method section, the sets of trained ORC were not identical across modalities, i.e., ORC trained in production remained untrained during comprehension treatment, whereas untrained ORC during production were trained in comprehension. As can be derived from Table 7, participant CM1 received comprehension treatment in the first phase, so the respective comparisons involve BL1 vs. post-Tx1, and production treatment in the second phase, so the respective comparisons involve post-Tx1 vs. post-Tx2; the reverse holds for participant CM2.

Post-treatment performance
In the following, we will first report unique practice and generalisation effects arising within a trained modality, i.e., uni-modal treatment effects, and then focus on improvements arising across modalities, i.e., from production treatment to sentence comprehension and vice versa.

Uni-modal treatment effects (comparisons within a single modality)
Following sentence production treatment, both participants showed significant changes with large effect sizes regarding the number of correctly produced trained

Note. ORC = object relative clauses; SRC = subject relative clauses (assessed in BL1 only); whoQ = who-questions; n/a = not applicable. Performance levels are classified based on control data (see Table 4 for sentence comprehension norms and Schröder et al., 2009, for sentence production); normal range = within normal range; impaired = below normal range.

Cross-modal treatment effects (comparisons across modalities)
Since both participants manifested relevant treatment effects after the first treatment phase, unique cross-modal generalisation can only be analysed following this phase. Nevertheless, Tables 8 and 9 comprise scores and comparisons obtained after both treatment phases. In what follows, we analyse unique cross-modal generalisation after production treatment for CM2 and after comprehension treatment for CM1 only.
From production to comprehension. Participant CM2 showed no increase in the comprehension of ORC that had been trained or remained untrained during production treatment (χ² = 0.57, p > .05; χ² = 0.1, p > .05). Comprehension performance significantly decreased for the set of control ORC (χ² = 4.0, p = .046, g = −.4). No changes were observed with respect to the other control item sets (SRC: χ² = 0.8, p > .05; whoQ: χ² = 0.25, p > .05).²
From comprehension to production. For CM1, no changes occurred in the production of ORC after comprehension treatment, neither in the trained nor in the untrained item sets, as performance remained at floor (all sentence types: 0% correct).
Uni-directional linkage from sentence comprehension to production. With respect to the relation between pre-treatment comprehension performance and the occurrence of uni-modal generalisation in production, model comparisons revealed that adding the fixed effect of proportions of correctly comprehended ORC (prior to sentence production treatment) led to a significant improvement in model fit compared to a null model without the predictor, χ²(1) = 4.0, p = .046. As revealed by the model coefficients, pre-treatment comprehension performance significantly predicted the occurrence of uni-modal generalisation effects in production (b = 2.26, SE = 0.98, z = 2.3, p = .02).
For the effect of pre-treatment production performance on the occurrence of post-Tx uni-modal generalisation in sentence comprehension, the model results showed that the predictor did not significantly account for any of the variance in the data, χ²(1) = 2.59, p = .11, b = −5.05, SE = 2.76, z = −1.8, p > .05.

Summary of results
Both participants demonstrated unique uni-modal practice effects following either comprehension or production treatment. Additionally, uni-modal generalisation was observed, but only for the trained sentence type (i.e., untrained exemplars of ORC), and not for untrained sentence types (SRC, whoQ). Cross-modal generalisation did not occur either from comprehension to production or from production to comprehension.³ However, pre-treatment comprehension performance was a significant predictor of uni-modal generalisation to untrained exemplars of ORC after sentence production treatment. In contrast, pre-treatment production performance did not significantly predict generalisation in comprehension performance following comprehension treatment. All reported effects are treatment induced, as there were no significant differences between the pre-treatment assessments, and, moreover, we observed no significant changes in a functionally unrelated control task before and after treatment.

Discussion
The present study examined unique uni- and cross-modal treatment effects occurring after training of sentence comprehension and production. The treatment was applied to two IWA displaying associated deficits in both modalities prior to treatment. According to the CATE (Thompson et al., 2003), complex non-canonical sentences (ORC) were trained, and other exemplars of ORC served as control items. In addition, control sentences encompassed complex sentences involving a different type of syntactic movement (SRC) and less complex sentences with the same movement type (whoQ). Comprehension therapy targeted mapping relations between syntactic and semantic information (e.g., Schwartz et al., 1994) by means of an acting-out task, calling upon processing capacities essential to sentence comprehension (Kiran et al., 2012). Treatment of sentence production followed the TUF protocol (Jacobs & Thompson, 2000; Stadie et al., 2008; Thompson, 2001).

Table 9. Number of correct responses and proportion correct in production after treatment in the other modality.
Production performance before and after comprehension treatment
Note. For CM2, changes in production performance from post-Tx1 to post-Tx2 cannot be considered, as performance already increased between BL1 and post-Tx1 (i.e., after treatment phase 1).

³ Throughout the application of assessment and treatment, participants' errors in sentence production were classified according to the scoring criteria outlined in the methods section. However, an anonymous reviewer raised the issue of whether a more lenient scoring would have changed the results in any way. We thus performed a descriptive qualitative re-analysis of the different error categories. This revealed that for CM1, at post-treatment, there was still a significant decrease in errors due to incorrect word order realisation even if incorrect responses with purely morphological errors were left aside. For participant CM2, most pre-treatment errors were due to her producing the canonical version of the target, whereas the number of such errors had markedly decreased post-treatment, i.e., changes in this error type in particular (and not in purely morphological errors) fundamentally contributed to the improvements observed post-treatment.
To control for non-specific treatment effects, we used a crossover design encompassing a control task (administered pre-treatment, post-Tx1, and post-Tx2) and two pre-treatment assessments. We acknowledge that more than two pre-treatment assessments would be preferable in order to demonstrate stable pre-treatment performance, especially since there was slight fluctuation in some instances; however, this fluctuation was not significant. Efficacy of the treatment protocols was investigated with respect to the occurrence of practice effects and generalisation patterns (to trained and untrained sentence types). Outcomes were explored both in the treated (i.e., uni-modal effects) and untreated modality (i.e., cross-modal effects, either from comprehension to production or vice versa). More specifically, we aimed at corroborating the assumption that sentence comprehension is related to sentence production via a uni-directional link, as suggested by Schröder et al. (2015).
In what follows, we relate our findings to previously reported treatment outcomes and discuss how our results add to the theoretical debate about single versus distinct but (uni-directionally) linked sub-systems for sentence processing.

Uni-modal treatment effects
In line with previous treatment studies applying MT or MT combined with an acting-out task (Byng, 1988; Mitchum et al., 1995; Kiran et al., 2012; Rochon & Reichman, 2004; Schwartz et al., 1994), we observed uni-modal practice effects following sentence comprehension treatment in both participants. The additional occurrence of uni-modal generalisation to untrained exemplars of the trained sentence type (ORC) shows that improvements are not solely due to repeated exposure to trained sentences (Coltheart, 1991). Instead, it reflects participants' regained ability to apply mapping rules to ORC containing different lexical items, supporting outcomes of former treatment studies targeting sentence comprehension (Byng, 1988; Jones, 1986; Mitchum et al., 1995; Rochon & Reichman, 2004; Schwartz et al., 1994).
In line with the relevant literature reporting a lack of generalised improvements in comprehending non-treated sentence structures following MT or MT-act treatment, we also did not observe uni-modal generalisation to untrained sentence types (Byng, 1988; Mitchum et al., 1995; Nickels et al., 1991; Rochon & Reichman, 2004). The results are, due to ceiling effects, ambiguous for CM1 and they only reveal structure-specific improvements for CM2, that is, increased comprehension performance was restricted to the trained sentence type. This lack of generalisation to untrained sentence types conforms to the results by Rochon and Reichman (2003), who also used MT-act without revealing uni-modal generalisation in comprehension performance. However, it is not in accordance with the hypothesis put forward by Kiran et al. (2012) that an acting-out task included in the mapping procedure would enhance processing resources necessary for the comprehension of other sentence types.
In fact, we found a structure-specific treatment effect for the trained sentence type, in the face of significantly decreased performance in comprehending whoQ. This indicates that although the specific treatment protocol, which we adapted from Kiran et al. (2012), led to improvements in comprehending the target structure, it also brought about a negative effect with respect to an untrained sentence structure. Similar declines concerning untreated material in the presence of increased performance for treated material after treatment have also been reported in a few other studies (e.g., Jacobs & Thompson, 2000; Kiran et al., 2012; Schröder et al., 2015), yet the cause so far remains unclear.
In the case of CM2, the diminished performance, although not expected, could be explained by assumptions put forward by the resource allocation account (e.g., Caplan et al., 2007; McNeil, Odell, & Tseng, 1991). According to Kiran and colleagues (2012), syntactic deficits can be ascribed to reduced abilities in allocating resources; thus, the treatment protocol applied here should induce an increase in these resources. Indeed, we believe that, for CM2, treatment specifically touched upon re-allocating those resources necessary for comprehension of the trained sentence type, resulting in practice effects and structure-specific generalisation. At the same time, the structure-specific nature of the re-allocation might also explain why comprehension performance decreased for whoQ. Because, prior to treatment, the available resources for CM2 were very low, treatment may have tapped only into those resources needed for the specifically trained structure at the expense of other sentence types (McNeil et al., 1991). Although this assumption helps to explain why negative effects concerning untreated sentences may have occurred, it remains an open question whether and why resource re-allocation, supposedly introduced by the treatment protocol, is specific to the target sentence structure only. Future research is therefore needed in order to disentangle this issue.
Concerning sentence production treatment, we found, in accordance with previous studies using the TUF approach, evidence for uni-modal practice effects and generalisation to untrained exemplars of the trained sentence type (Jacobs & Thompson, 2000; Murray et al., 2004; Stadie et al., 2008; Thompson, 1998). As suggested by Stadie et al. (2008), these generalised improvements indicate re-learned abstract grammatical properties underlying ORC and, moreover, the regained ability to apply them flexibly in the production of ORC with novel lexical material.
Yet, we acknowledge that this generalisation may be restricted to ORC containing untrained nouns but not verbs, as generalisation to control ORC (with different verbs) was limited. Until now most studies have not consistently differentiated between control sentences containing the verbs of the trained sentences as opposed to control sentences with other verbs. Moreover, there was a marginal generalisation to control sentences with other verbs for CM1 (following production treatment). Therefore, it remains unclear to what extent uni-modal generalisation depends on the presence of the same verb in trained and control sentences.
However, neither of our participants showed evidence of uni-modal generalisation in producing untrained sentence types. Regarding SRC production, the findings are in line with our hypothesis, as they support the assumption that generalisation is restricted to sentences that rely on the same underlying grammatical properties as those trained, that is, sentences that are linguistically related regarding the movement operations involved. As outlined in the introduction section, Thompson and Shapiro (2005) suggest that generalisation to SRC following treatment of ORC production may be absent, because both structures involve different types of syntactic movement operations (NP- vs. wh-movement).
However, generalisation to who-questions would be expected, as they involve the same syntactic movement operation as ORC. Yet, improvements in producing who-questions following treatment of ORC did not occur in our participants. Nevertheless, this lack of generalisation to who-questions has also been reported in the relevant literature, demonstrating that not all IWA show generalised improvements in producing untreated sentence types following TUF (Jacobs & Thompson, 2000; Murray et al., 2004; Stadie et al., 2008; Thompson, 1998). Certainly, various factors could influence the occurrence/absence of generalisation, such as number of treatment sessions, severity of the sentence processing deficit and associated deficits in working memory. Therefore, future research is needed in order to determine in more detail which factors are responsible for detecting uni-modal generalisation to untreated sentence types in different IWA.

Cross-modal treatment effects
We will discuss findings of cross-modal generalisation to production following comprehension treatment with respect to participant CM1 only. This is because CM2 already improved significantly in sentence production after production treatment, which is why cross-modal generalisation possibly occurring after comprehension treatment is ambiguous. Similarly, we will only consider participant CM2 for the discussion of cross-modal generalisation to sentence comprehension following production treatment, since CM1 already showed significant gains in sentence comprehension following comprehension treatment, again rendering cross-modal effects ambiguous.
The results from the present study provide further evidence for the absence of cross-modal generalisation following treatment of sentence production. Even though the applied German adaptation of the TUF protocol contained aspects of sentence comprehension treatment (i.e., illustrating the assignment of thematic roles and how they are maintained when deriving an ORC from its underlying active sentences), intervention targeting sentence production had no effect on comprehension performance. Thus, following treatment, CM2 showed significant improvements in producing ORC, but was still unable to comprehend them correctly. In fact, for the ORC control items, post-treatment comprehension performance even declined after production treatment. However, we are cautious about the impact of this change, as the lower post-treatment performance still aligns with the performance level observed with the other two sets prior to treatment. Overall, the lack of cross-modal generalisation is in line with previous treatment studies investigating the effectiveness of TUF for sentence production (Jacobs & Thompson, 2000; Murray et al., 2004; Schröder et al., 2015).
However, contrary to the results of previous studies investigating treatment effects following MT and MT-act comprehension treatment (Byng, 1988; Jones, 1986; Nickels et al., 1991; Rochon & Reichman, 2004; Schwartz et al., 1994), cross-modal generalisation to production did not occur in participant CM1. Note that this implies that after sentence comprehension treatment, CM1 showed significant gains in comprehending ORC, performing even within the normal range, but was still completely unable to produce ORC.
A possible explanation for the diverging results between our study and previous studies may be found in the way cross-modal generalisation was measured. So far, studies on comprehension treatment involving MT mainly employed narrative production tasks (e.g., storytelling, picture or video description) and used several measures (e.g., production of thematic roles, verb-argument structures) to detect cross-modal generalisation in order to provide evidence that sentence production had improved (Byng, 1988; Jones, 1986; Mitchum et al., 1995; Nickels et al., 1991; Schwartz et al., 1994). To our knowledge, only one study (Rochon & Reichman, 2004) assessed cross-modal generalisation using elicited production of the sentence types trained during comprehension treatment. However, the authors only reported numerical changes without providing results of statistical analyses. Thus, previous findings of cross-modal generalisation do not provide evidence that participants did actually improve in producing those sentence structures for which they showed gains in comprehension.
However, in our study, examining cross-modal generalisation using a sentence-elicitation task allowed us to assess whether production improved for exactly the same sentence types trained (and untrained) during comprehension treatment. Crucially, assessing structure-specific cross-modal generalisation is particularly important given the question of whether sentence comprehension and production rely on a single syntactic processing system, shared by both modalities, or whether there are distinct modality-specific sub-systems. Since such structure-specific cross-modal generalisation was absent following treatment in either modality, the outcomes of our study and of previous treatment studies reporting a lack of cross-modal generalisation support a model of sentence processing in which distinct modality-specific sub-systems subserve the processes involved in sentence comprehension and in sentence production.
An alternative explanation for the absence of cross-modal generalisation following comprehension treatment may concern the number of treatment sessions devoted to sentence comprehension. While comprehension treatment in the present study only comprised five sessions before the ending criterion was reached, intervention in previous studies lasted from 12 (Nickels et al., 1991) to 58 sessions (Schwartz et al., 1994). Further research should determine whether the number of treatment sessions is an essential factor contributing to the occurrence of cross-modal generalisation following sentence comprehension therapy.

Uni-directional linkage from comprehension to production
Although the absence of cross-modal generalisation provides evidence for modality-specific sub-systems subserving sentence comprehension and production, respectively, the results of our study provide preliminary support for the assumption put forward by Schröder et al. (2015) that the processing components responsible for sentence comprehension are connected to the production system, but only via a uni-directional link. This is demonstrated by the regression analysis, which revealed that comprehension performance prior to production treatment was a significant predictor of generalised improvements in producing untrained ORC in both our participants, but not vice versa.
We acknowledge that the data supporting the uni-directional linkage hypothesis are still limited, as only the data sets from two participants entered the regression analyses. However, overall, the findings are largely consistent with the results obtained by Dickey and Yoo (2010). Clearly, further research is needed to investigate whether the current results are replicable with more participants. Thus, the uni-directional linkage hypothesis needs to be explored in more detail. Yet, the findings provide further support for Schröder et al.'s (2015) assumption that relatively retained abilities in sentence comprehension assist re-learning processes in sentence production, promoting uni-modal generalisation to untrained exemplars of the trained sentence type. As there is no indication for the reverse relationship so far, it seems that production abilities are less likely to assist re-learning mechanisms in sentence comprehension. In order to better understand the relationship between cognitive functions involved in sentence processing, future research should take into account in more depth the complexity of cognitive inter-relations and, in particular, their interplay with re-learning mechanisms.

Conclusion
The present study aimed to contribute to the debate about the functional relationship between sentence comprehension and sentence production by investigating uni-modal treatment effects and cross-modal generalisation following treatment of sentence comprehension and production, respectively. The findings corroborate previous evidence supporting the notion of distinct, modality-specific sub-systems underlying sentence comprehension and production rather than a single mechanism that is shared by both modalities (Schröder et al., 2015). Moreover, uni-modal generalisation to untrained items in production was assisted by (relatively) spared comprehension abilities, whereas generalisation following comprehension treatment was not dependent on production abilities. Although preliminary, this provides support for Schröder et al.'s (2015) proposal that the two distinct sentence processing sub-systems are connected via a uni-directional link from comprehension to production.
With respect to clinical decision-making in sentence processing treatment in aphasia, the results of the present study and those of Schröder et al. (2015) suggest the following: Sentence comprehension deficits do not seem to profit from sentence production treatment, since no cross-modal generalisation has been observed from production to comprehension. In case of sentence production deficits, it seems advisable first to assess comprehension performance and, in case of deficits, to treat comprehension before or in combination with sentence production, since generalisation within the production modality then seems more likely to occur.