A Novel Morphology-Based Naming Therapy for People with Aphasia

ABSTRACT Background Previous studies have demonstrated that naming treatments can improve language abilities in people with aphasia (PWA). However, there is currently a lack of protocols for evidence-based naming treatment in Hebrew. Aims This study aims to evaluate the efficacy of a novel morphology-based naming treatment for Hebrew-speaking PWA and to investigate subject-related factors that influence responsiveness to the treatment. Method & Procedures Twelve PWA with chronic stroke and moderate to severe anomia participated in 20 treatment sessions focused on the root-structure morphology of Hebrew. Treatment stimuli incorporated morphologically complex words comprising root and template. Treatment effects were assessed at both subject level and group level. Outcomes & Results The treatment showed promising results, with a significant increase in correct naming for both treated and untreated complex words. These gains were maintained for at least 10 weeks post-treatment. Most of the benefit was achieved during the first 10 treatment sessions. Additionally, the group demonstrated generalization effects to naming simple words. Pre-treatment performance in naming morphologically complex words predicted higher treatment gains during the follow-up session, irrespective of word type. Conclusions These findings provide preliminary evidence supporting the efficacy of root-based naming treatment for Hebrew-speaking PWA. Future research should compare this treatment to an untreated control group and to other treatment methods in Hebrew speakers to further validate its benefits.


Introduction
Evidence-based naming treatments for Hebrew-speaking people with aphasia (PWA) do not exist in the aphasia literature. This study investigates the effects of a novel morphology-based treatment on naming abilities in Hebrew-speaking PWA. We examined whether increasing participants' awareness of root morphemes, and practising the use of a root-based cue during treatment, would facilitate word retrieval.
In many languages, including Hebrew, morphological processing affects word retrieval, yet there is a paucity of morphology-based naming treatments (Archer et al., 2016). This gap could be attributed to the English-centric bias in the aphasia literature (Beveridge & Bak, 2011). For instance, using the initial phoneme of the word as a cue to facilitate word retrieval, as in the Phonological Component Analysis treatment (Leonard et al., 2008), may be effective in English, which lacks inflectional prefixes in nouns (Archer et al., 2016). However, this approach may not be equally beneficial in languages where the first phoneme could be a prefix or part of a morphemic template rather than the root, and is thus less informative. In one study conducted in South African Sesotho (Archer et al., 2016), two cueing methods were compared in naming treatment of two PWA: a phonological prefix-based cue (PBC), targeting the first phoneme of the word, and a root-based cue (RBC), targeting the first phoneme of the root, which was not necessarily the first phoneme of the word. The RBC treatment improved participants' naming more than the PBC treatment, exemplifying the benefit of morphological cues in languages like Sesotho. This study suggests that practices within speech-language pathology should be grounded in theories of normal and impaired language functioning but also be informed by the uniqueness of each language.

Hebrew morphology and naming therapies
Hebrew is a Semitic language with rich morphology, in which most words are composed of three, or occasionally four, consonant roots. The root is embedded in a phonological pattern (the morphological template), which is a sequence of vowels, or vowels and consonants (Frost et al., 1997; Kolan et al., 2011). Roots and templates are intertwined in a non-concatenative way to form words, e.g., /miʃKeFet/ (binoculars) is composed of the root /ʃ.K.F/ and the template /miXXeXet/ (where the X's represent the slots for the root consonants). In most cases, words sharing the same root are semantically related (Dotan & Friedmann, 2015).
Studies in Hebrew have shown that the root morpheme plays a crucial role in the Hebrew lexicon. Children as young as three years old demonstrate an awareness of root morphemes (Berman, 1982), while awareness of the morphemic pattern is typically attained around age fifteen (Ravid & Malenky, 2001). Priming studies in Hebrew indicate that root activation prior to word presentation enhances lexical recognition and the retrieval of derived words (Bentin & Feldman, 2007; Frost, Deutsch, & Forster, 2000; Frost, Deutsch, Gilboa, et al., 2000). Moreover, the morphological structure of Hebrew words was found to facilitate written word recognition in Hebrew-speaking adults with dyslexia (Bitan et al., 2020) and word retrieval in Hebrew-speaking school children with delayed language development (DLD; Kraizer & Novogrodsky, 2012). Imaging studies of Hebrew-speaking adults also suggest that distinct morphological processing occurs during lexical access to complex words (Bick et al., 2008, 2010; Bitan et al., 2020).
Despite the prominent role of the root morpheme in lexical access in Hebrew, to our knowledge, no studies have investigated morphology-based anomia interventions in Hebrew. While there have been studies that included aphasia treatment in Hebrew as part of various research questions, they have adapted protocols from English (Ben-Arie et al., 2015; Lerman et al., 2022), or did not focus on treatment details (Fridler et al., 2012; Gil & Goral, 2004). Our study aims to address this gap by developing an anomia treatment that addresses the unique characteristics of lexical access in Hebrew, and by examining its impact on naming ability in Hebrew-speaking PWA. The treatment stimuli consist of bi-morphemic nouns, and the tasks are designed to increase participants' awareness of and sensitivity to root morphemes, and eventually to have them use the root morpheme as a cue to facilitate word retrieval. We hypothesize that: (1) Naming of bi-morphemic treated words would improve following treatment and be maintained at the follow-up assessment. This could be due either to recognition of the root morpheme, which can lead to naming of root-based words, or to strengthening of the lexical representations of treated words. (2) The effect of treatment would generalize to bi-morphemic and mono-morphemic untreated words. While generalization to bi-morphemic words may suggest the use of a root-cueing strategy, improvement on both types of words can result from a strengthened lexical system following treatment. (3) Repetition of morphologically complex words would improve following treatment, due to the enhancement of morphological decomposition and a strengthened lexical system, making them more easily repeated via the lexical route (McCarthy & Warrington, 1984).
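The non-concatenative combination of root and template described above can be illustrated with a minimal sketch. The function name `fill_template` and the convention of marking slots with `X` in a plain string are assumptions made for illustration only; the sketch simply interleaves the root consonants into the template slots, using the two examples given in the text.

```python
def fill_template(template: str, root: list[str], slot: str = "X") -> str:
    """Interleave root consonants into the slots of a morphological template.

    Illustrative helper (not from the study): each slot character in
    `template` is replaced, left to right, by the next root consonant.
    """
    if template.count(slot) != len(root):
        raise ValueError("number of slots must match number of root consonants")
    consonants = iter(root)
    return "".join(next(consonants) if ch == slot else ch for ch in template)


# Root /ʃ.K.F/ in the template /miXXeXet/ gives /miʃKeFet/ (binoculars).
binoculars = fill_template("miXXeXet", ["ʃ", "K", "F"])
# Root /K.L.D/ in the same template gives /miKLeDet/ (keyboard).
keyboard = fill_template("miXXeXet", ["K", "L", "D"])
print(binoculars, keyboard)
```

Because the root consonants are spread across the word, the first phoneme of the word (/m/ here) belongs to the template and not to the root, which is why a word-initial phonemic cue carries less lexical information in such words.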

Participants
Twelve participants (7 females; mean age 61.75 years, range 29-82) with chronic post-stroke aphasia were recruited through the Loewenstein hospital and through stroke survivors' clubs. Participants had sustained a single unilateral left-hemisphere stroke at least 12 months before recruitment (range 1-24 years). Aetiology and lesion location were determined by MRI or CT scans as indicated by medical records from the time of hospitalization. All participants had been diagnosed with aphasia by a speech-language pathologist (SLP) during their hospitalization. All had received some standard language therapy during the months following their stroke but were not engaged in any formal speech-language therapy during the study. Aphasia syndrome was not part of the inclusion criteria, in order to better reflect the heterogeneity of the clinical population (Best et al., 2013). See Table 1 for participants' demographic and linguistic profiles.
All participants met the following inclusion criteria: (1) Native speakers of Hebrew, or speakers who had used Hebrew as their dominant language since adolescence. All had lived in an Israeli Hebrew-speaking environment for most of their lives. (2) Moderate to severe anomia, as determined by less than 70% accuracy on the Hebrew standardized SHEMESH picture naming test of one hundred nouns (Biran & Friedmann, 2004, 2005) and less than 70% accuracy on a list of morphologically complex words (MCW), developed for the current study. (3) Adequate comprehension, determined by a score above 90% in the spoken word to picture matching subtest (no. 47) of the Psycholinguistic Assessments of Language Processing in Aphasia test (PALPA; Kay et al., 1992; Hebrew version: Gil & Edelstein, 2001), as well as a score above 90% in the auditory comprehension of yes/no questions subtest from the Hebrew version of the Western Aphasia Battery (WAB; Kertesz, 1982; Hebrew version: Soroker, 1997). This relatively strict criterion was used to ensure that participants could comprehend and provide informed consent for their participation in the study, and that they could understand the meta-linguistic aspects of the treatment. (4) Right-handed with normal or corrected hearing and vision, and no history of major psychiatric illness. (5) Intact cognitive abilities in the fields of memory, spatial attention, time orientation and executive functions, as indicated by the Hebrew version of the Cognitive Assessment scale for Stroke Patients (CASP; Barnay et al., 2014; Crivelli et al., 2018; Hebrew version: Rosenheck et al., 2021). Additionally, the BAFI repetition test of mono- and multi-morphemic words (Friedmann, 2006) was administered in order to assess lexical impairment, as well as generalization of treatment effects to other linguistic abilities that involve morphological processing.
For the first four participants (P1-P4), all interactions were conducted in person at participants' homes. However, due to the Covid-19 pandemic, the remaining eight participants (P5-P12) had all assessment and treatment sessions conducted online via Zoom©. Despite the shift to an online format, no changes were made to the procedure, and the possible effects of the online treatment and assessments were tested in the statistical analysis. The study was conducted in accordance with ethical guidelines and received approval from the Helsinki committee of the Loewenstein Rehabilitation Medical Centre and the Ethics Committee of the University of Haifa. Prior to recruitment, the study was registered in the Ministry of Health database of studies. Informed consent was obtained from all participants, either in writing or as video-recorded oral consent, after a thorough explanation of the study was provided and participants' comprehension was verified.

Materials
The MCW list, developed for the current study, consists of 60 coloured pictures of Hebrew bi-morphemic (root + template) concrete nouns on a white background. Each noun in the list shares a root with a semantically related verb. For example, the target noun /miʃkefet/ (binoculars) shares the root /ʃ.k.f/ with the verb /maʃkif/ (looks around). See Table 2 for more examples. To ensure the accuracy and clarity of the stimuli, pre-testing was conducted on a group of native Hebrew-speaking healthy adults (N = 57; mean age = 31 years, range 16-70), and only pictures that elicited the desired target word in at least 95% of cases were included.
When assessing word frequency for the MCW list, we faced a challenge in estimating the frequency of spoken words from the frequency of written words, due to the high proportion of homographs in Hebrew, which account for an estimated 25% to 40% of texts (Alon, 1996; Shimron & Sivan, 1994). This characteristic introduces a bias when determining the frequency of spoken words based on these homographs (Bar-On et al., 2019). For instance, the sequence of graphemes /ספר/ in its unpunctuated form represents multiple nouns [book /sefeʁ/; barber /sapaʁ/; border /sfaʁ/], and verbs [he told /sipeʁ/; he cut the hair /sipeʁ/; he/it was told /supaʁ/; its/his hair was cut /supaʁ/; he counted /safaʁ/]. Consequently, the same frequency count applies to all these different meanings in an unpunctuated corpus. It is worth noting that the MCW list itself contains 37% homographs.
Therefore, to enhance the reliability of our frequency measures, we engaged a group of native Hebrew-speaking healthy adults (N = 36; mean age 28.5 years, range 20-54 years) in a subjective assessment of word frequency on a scale ranging from 1 to 5. To avoid ambiguity in this assessment, homographs were presented with diacritic marks, and participants were explicitly instructed that all words in question were concrete nouns, together ensuring a single interpretation for each word. The subjective frequency ratings ranged from 2.4 to 4.9 (M = 3.76, SD = .72) across the various words.
Furthermore, when comparing word frequency for treated and untreated words (see Assessment below), we expanded our verification by obtaining frequency values from two distinct corpora: the heTenTen corpus of the Sketch Engine© (Kilgarriff et al., 2014) for frequencies of written words, and the OPUS corpus (Tiedemann, 2012) for frequencies of spoken words based on Hebrew-translated movie subtitles. This multi-source verification allowed us to ensure frequency balance between treated and untreated words.

Naming the MCW list
Naming of words on the MCW list was tested one week before the beginning of treatment, one week after treatment termination, and again in a follow-up session, ten weeks after treatment termination. See Figure 1 for the timeline. The pre- and post-treatment tests each consisted of three measures, conducted on three consecutive days, as per recommendations for single subject designs (Beeson & Robey, 2006). Pictures were presented one at a time on a computer screen with a minimum size of 15", and participants were required to name each of them. Items were scored as correct if the target word was solely and accurately produced within 5 seconds of the picture presentation (Biran et al., 2017; Biran & Friedmann, 2005). No feedback was given (Kiran & Johnson, 2008). For each participant, the MCW list was split into two sub-lists of 30 treated and 30 untreated words. The two sub-lists were balanced as much as possible for word frequency, word length (measured by number of syllables and number of phonemes), and success in naming in the three MCW pre-tests (Lambon Ralph et al., 2010). T-tests indicated no significant differences between the lists in terms of length (p > .05) and frequency measured by the three different sources: heTenTen, OPUS, and subjective ratings (all ps > .05).

Linguistic tests
The SHEMESH naming test and the BAFI repetition test were administered before the beginning of treatment, to assess participants' lexical deficit profile, and after the treatment was completed, to capture generalization of treatment effects to words with varied morphological structures, as well as other morphological abilities. Fourteen bi-morphemic words in the SHEMESH naming test overlapped with the MCW list, and were therefore excluded from this analysis. The remaining 86 words were divided into mono-morphemic (58) and bi-morphemic (28) categories. Bi-morphemic words were defined as those composed of a tri-consonantal productive root and a template, as described in previous studies (Barouch et al., 2022; Bitan et al., 2020; Haddad et al., 2018). The two SHEMESH sub-lists did not differ from each other in the proportion of high and low-frequency words, as measured by the authors, χ²(1, N = 86), p = .03, and there was no significant difference in word length, measured by the number of phonemes, t(84) = 1.3, p > .05.
Post-test SHEMESH score was missing for one participant (P9). Four of the participants (P7, P8, P9, P11) underwent fMRI scans in which they performed an overt naming task with the same 60 words from the MCW list before and after treatment. The results will be reported in a separate paper (Truzman et al., in preparation).

Assessing lexical impairment
Participants' performance on the linguistic tests was analysed by two certified SLPs (co-authors T.T. and M.B.). A semantic deficit was determined based on the existence of semantic and associative errors in the SHEMESH naming test. A deficit in the phonological output lexicon was determined based on phonemic paraphasias in the SHEMESH naming test, as well as the existence of a frequency effect, i.e., significantly better performance on frequent than infrequent words (Laganaro, 2005, 2010). A deficit in the phonological output buffer was determined based on the existence of phonemic paraphasias in the SHEMESH naming test, in addition to a length effect, i.e., significantly better naming of shorter words (Best et al., 2013; Gvion & Friedmann, 2012). Frequency and length effects in the SHEMESH naming test were tested for each participant, using χ² tests with α = .05. Impaired repetition on the BAFI test, as determined by accuracy lower than 60% on mono-morphemic words, was also used as an indicator for impairment in the phonological output buffer (Gvion & Biran, 2023). Only mono-morphemic words were considered to determine impairment in the BAFI task, in order to control for the effect of morphological complexity on naming. The results of this classification are presented in Table 1 and were used in the interpretation of single subject performance.
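As a rough illustration of how a per-participant frequency effect could be tested, the sketch below computes a Pearson chi-square statistic for a 2 x 2 table (frequent vs. infrequent words by correct vs. incorrect naming) and compares it with the critical value for one degree of freedom at α = .05. The counts are hypothetical, the function name is invented for this example, and the absence of a continuity correction is an assumption; the source does not report which variant of the test was used.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column must contain at least one observation")
    return n * (a * d - b * c) ** 2 / denom


# Critical chi-square value for df = 1 at alpha = .05.
CRITICAL_DF1_ALPHA_05 = 3.841

# Hypothetical participant: 30/50 frequent words correct, 15/50 infrequent words correct.
stat = chi_square_2x2(30, 20, 15, 35)
frequency_effect = stat > CRITICAL_DF1_ALPHA_05 and (30 / 50) > (15 / 50)
print(round(stat, 2), frequency_effect)
```

The direction check matters: a significant statistic only counts as a frequency effect when accuracy is higher for the frequent words. The same computation applies to a length effect, with short vs. long words as the rows.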

Treatment schedule
The treatment consisted of twenty sessions administered twice a week by a certified SLP (co-author T.T.). In each session, treatment was given for six words, and the session ended when the treatment of the six words was completed. On average, treatment sessions lasted approximately one hour. As the treatment progressed and participants improved, the sessions typically became shorter, lasting between 45 and 60 minutes. A treatment cycle was completed after five sessions, in which all 30 words were treated. After each cycle, a probe test was conducted for naming of all 30 treated words to monitor participants' improvement. For each treatment cycle, the treated words were randomly rearranged into sessions. Treatment was completed after four treatment cycles were administered, namely, when each word had been treated four times.
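The schedule arithmetic (30 treated words, six per session, five sessions per cycle, four cycles, 20 sessions in total) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' procedure: the function name, the placeholder word labels, and the use of a simple shuffle for the random rearrangement are all hypothetical.

```python
import random
from collections import Counter


def build_schedule(words, cycles=4, words_per_session=6, seed=None):
    """Return a list of sessions; each cycle reshuffles all words into sessions."""
    rng = random.Random(seed)
    sessions = []
    for _ in range(cycles):
        order = list(words)
        rng.shuffle(order)  # words are randomly rearranged for every cycle
        for start in range(0, len(order), words_per_session):
            sessions.append(order[start:start + words_per_session])
        # In the study, a probe test of all treated words followed each cycle.
    return sessions


treated_words = [f"word_{i:02d}" for i in range(1, 31)]  # placeholder labels
schedule = build_schedule(treated_words, seed=1)
exposures = Counter(word for session in schedule for word in session)
print(len(schedule), set(exposures.values()))
```

Under these assumptions every word is treated exactly once per cycle, so each word receives four treatment exposures across the 20 sessions.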

Treatment protocol
All treatment stimuli were presented on a computer screen with a minimum size of 15". The treatment comprised three steps, and for each step all six words in a given session were presented sequentially before moving on to the next step: (1) Morphological awareness instruction: Participants saw a picture of the target noun and its matching written word below, with root letters depicted in coloured bold type. The SLP named the word, and explained the shared root, and therefore the shared sounds and meaning, of the target word with a related verb and gerund, which were written next to the target word (e.g., the target word /miKLeDet/ (keyboard) was presented with /lehaKLiD/ (to type) and /maKLiDim/ (typing), which all share the root K.L.D). Participants were then asked to repeat the target word. See an example in Figure 2a.
(2) Morphological recognition: Participants were shown four pictures on the screen and asked to select, with a mouse click, the one that shared a root with an orally presented gerund. For example, for the target word /miKLeDet/ (a keyboard) they were asked: "Which word has the same root as /lehaKLiD/ (to type)?". The four pictures included: A) The target noun, using a different exemplar from the picture used in the first step; B) An unrelated item that was presented in step 1 as another target word (e.g., a screwdriver). The use of a different exemplar (for A) and of a distractor taken from the list of target words (for B) was intended to minimize reliance on visual memory and familiarity; C-D) Two pictures of objects that are semantically related to the target word but have no shared root (e.g., the palms of the hands and a computer mouse). The pictures were chosen based on frequent co-occurrence of their names with the target word according to the heTenTen Sketch Engine© corpus. Notably, co-occurrences of words in a written language corpus can serve as a measure of semantic relatedness, unlike in the assessment of frequency, because they narrow down the meaning space of homographs to a single meaning. Computerized auditory feedback was given for correct and incorrect responses, and the SLP named the target word and demonstrated its relatedness to the gerund regardless of participants' success. See an example in Figure 2b.
(3) Picture naming with morphological cues: Participants saw a picture of the target word and heard a conductive indicative SV (subject + verb) sentence in which the verb shares a root with the target word. Participants were then asked to complete the sentence by naming the target word, e.g., /Huʔ maKLiD be … / (he is typing on a …) should elicit the response /miKLeDet/ (a keyboard). Regardless of participants' naming success, the SLP then said the full sentence, including the target word, and participants were asked to repeat the target word once again. If, during naming attempts, participants produced an incorrect word with the same root as the target, they were encouraged to pay attention to such words and try to use them as a cue for retrieving the target word.

Reliability
All the sessions were audiotaped. Forty percent of the recordings, from both pre- and post-treatment tests, were scored by two independent observers: a speech-language therapist with experience in working with individuals with aphasia, and an MA student in neuropsychology. Both observers were trained by the authors on how to analyse the data. Point-to-point scoring of accuracy reached high agreement (mean 97% agreement; range 94-100%).
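Point-to-point agreement is simply the percentage of items on which the two observers assigned the same score. A minimal sketch, with a hypothetical function name and made-up scores (1 = correct, 0 = incorrect), is:

```python
def point_to_point_agreement(scores_a, scores_b):
    """Percentage of items on which two observers gave the same score."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("both observers must score the same non-empty item list")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return 100 * matches / len(scores_a)


# Hypothetical scores for ten naming attempts; the observers disagree on one item.
observer_1 = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0]
observer_2 = [1, 1, 0, 1, 0, 1, 1, 1, 1, 0]
agreement = point_to_point_agreement(observer_1, observer_2)
print(agreement)
```

This index does not correct for chance agreement; it is shown only because it is the measure named in the text.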

Statistical analysis
A generalized linear mixed model analysis was utilized to assess treatment outcomes of treated and untreated words across the entire group. This method allows us to examine individual data points across repeated measures, while also considering specific characteristics of the stimuli, and is especially useful when working with smaller sample sizes (Wiley & Rapp, 2019). Logistic regression was used, considering the binary nature of the dependent measure, which represented accuracy as 0 or 1 in each trial. We conducted the analysis using R (Version 4.3.1; R Core Team, 2020).
A maximal model, incorporating crossed random effects of by-participant and by-item intercepts and slopes (Bell et al., 2019; Brauer & Curtin, 2018), was passed to the buildmer function from the buildmer package (v2.9; Voeten, 2023), which uses the glmer function from the lme4 package (Bates et al., 2015). By employing a backward-fitting model selection procedure, buildmer systematically simplifies random slopes until convergence, and likelihood ratio tests (LRTs) are utilized to evaluate the contribution of random slopes to the model fit (Matuschek et al., 2017). Additionally, the buildmer function assesses the contribution of each fixed effect to the model fit through chi-squared tests on the residual sum of squares. P-values for fixed effects were determined based on Wald degrees of freedom. Detailed model outputs, obtained with the summary function, are presented in the tables.
For categorical fixed effects with more than two levels that were included in the optimal model suggested by buildmer, LRTs were conducted to determine their contribution to the model. To describe interactions and examine pairwise comparisons, the selected model was refitted using the glmer function, followed by the pairs function from the emmeans package (v1.8.7; Lenth, 2023) for pre-planned contrasts, with Bonferroni adjustments for multiple comparisons. Graphs were plotted using the interactions (v1.1.5; Long, 2019), ggplot2 (Wickham, 2016), and effects (v4.2.2; Fox & Hong, 2010) packages.
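The Bonferroni adjustment itself is simple arithmetic: each raw p-value is multiplied by the number of comparisons in the family and capped at 1. The sketch below is written in Python purely for illustration (the analysis in the study was run in R), and the raw p-values are invented.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the family size, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values from a family of three pre-planned contrasts.
raw = [0.001, 0.02, 0.5]
adjusted = bonferroni(raw)
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

In this made-up family, the second contrast is significant before adjustment (p = .02) but not after it (p = .06), which is the conservative behaviour expected from this correction.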

Model 1: General treatment effects
The first model was used to examine the effects of the treatment on naming accuracy of treated and untreated words (the dependent measure). The buildmer function was employed with the following fixed effects: Word Type (a categorical variable with two levels: 'treated' and 'untreated' words, with 'treated' serving as the reference), Session (a categorical variable representing seven levels of time points of naming tests: pre-treatment sessions S1, S2, S3; post-treatment sessions S7, S8, S9; and follow-up session S10, with S1 serving as the reference), Standardized Word Frequency (based on subjective frequency measures), and Length, with the mean as the intercept for both. The model also included the interactions between Session and Word Type, as well as between Session and Treatment Mode (a categorical variable with two levels: 'frontal' and 'online', with 'frontal' as the reference). Crossed random slopes and intercepts of Word Type and Session by participant, and of Session by item, were also included. The Treatment Mode variable was included to explore any potential effects of treatment mode on treatment gains. Categorical modeling of Session was preferred over linear modeling to allow for non-linear growth in naming accuracy across sessions, and to enable post hoc comparisons. The variable characteristics (categorical/continuous/standardized) and the reference levels were consistent across all models.
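One way to write down the maximal model described above is sketched here. The notation is introduced for illustration and does not appear in the source: $p$ indexes participants, $i$ items and $s$ sessions; $W_{pi}$ is 1 for an untreated word and 0 for a treated word (Word Type varies by participant because the lists were split individually); $M_p$ is 1 for online and 0 for frontal treatment; $F_i$ and $L_i$ are the centred Frequency and Length of item $i$.

```latex
\begin{aligned}
\operatorname{logit}\Pr(y_{pis} = 1) ={}& \beta_0 + \beta_s + \beta_W W_{pi}
   + \beta_{sW} W_{pi} + \beta_{sM} M_p + \beta_F F_i + \beta_L L_i \\
 &+ u_{0p} + u_{Wp} W_{pi} + u_{sp} + v_{0i} + v_{si}
\end{aligned}
```

Here $\beta_s$, $\beta_{sW}$ and $\beta_{sM}$ are zero for the reference session S1, the $u$ terms are by-participant random intercepts and slopes, and the $v$ terms are by-item random intercepts and slopes, each assumed to be normally distributed with mean zero. The backward-fitting procedure then removes terms from this starting point, so the final reported model is a simplification of it.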

Model 2: Dynamics of treatment effects
This model was used to investigate the dynamics of the effect of treatment on naming accuracy of treated words, during the probe tests (the dependent measure).This analysis has the potential to offer valuable insights into the evolution of the treatment impact, and act as an indicator of treatment efficacy throughout the treatment.The buildmer function was utilized with the following fixed effects: Session, which encompassed 5 levels: the last pre-treatment naming test (S3, as the reference level), the probe tests (S4, S5, and S6), and the first post-treatment naming test (S7).Additionally, the standardized word Frequency and Length were included as fixed effects.Crossed random slopes and intercepts of Session by participant and by item were entered into the model.

Model 3: Psycholinguistic and demographic effects
This model was used to examine the influence of demographic and linguistic factors, measured before treatment, on treatment accuracy gains for both treated and untreated words at specific time points (the dependent measure). These factors were not included in the first model, with all the sessions, because the model did not converge. The model included the following fixed effects: Session, focusing on three specific sessions: pre-treatment (S3), post-treatment (S7), and follow-up (S10); word Frequency and Length, time post-onset (TPO), Age, Education and pre-treatment performance on the SHEMESH bi-morphemic and mono-morphemic words (SHEMESH Bi Pre and SHEMESH Mono Pre, respectively). Additionally, the model considered interactions between Session and the following factors: TPO, Age, Education, SHEMESH Bi Pre, SHEMESH Mono Pre, and Word Type. To address potential multicollinearity, continuous variables were standardized. Crossed random slopes and intercepts were included for Word Type and Session by participant, as well as for Session by item.

Model 4: Generalization of treatment effects
This model was used to examine whether treatment effects generalized to the naming of different types of words from the SHEMESH naming test, specifically complex (bi-morphemic) and simple (mono-morphemic) words (the dependent measures). The model passed to the buildmer function incorporated the following fixed effects: Time (a categorical variable with two levels: pre-treatment and post-treatment naming tests, with pre-treatment as the reference level), Morphological Complexity (a categorical variable with two levels: mono-morphemic and bi-morphemic words, with bi-morphemic as the reference), SHEMESH word Length, SHEMESH word Frequency (a categorical variable with three levels as given by the test authors: very low, medium, and very high frequency, with the very-low-frequency words as the reference), Age, TPO, Education, and the interaction between Time and Morphological Complexity. Crossed random slopes and intercepts were included for Time and Morphological Complexity both by participant and by item. Continuous variables were standardized to reduce multicollinearity.
Notably, we did not conduct a statistical test to assess improvement in the BAFI repetition test from pre- to post-treatment, because the majority of participants (9/12) did not show any numerical improvement following treatment; a statistical analysis was therefore deemed unnecessary.
In addition to the group-level analysis, individual performances of participants in the MCW test are reported. To gauge the magnitude of the treatment effects, we calculated effect sizes based on the approach proposed by Busk and Serlin (1992). The effect size was computed by comparing the mean raw performance of the three post-treatment measures ($M_{post}$) to the mean raw performance of the three baseline pre-treatment measures ($M_{pre}$), relative to the standard deviation observed during the baseline phase ($SD_{pre}$). The formula used for calculating the effect size is as follows: $d = (M_{post} - M_{pre}) / SD_{pre}$. For further interpretation, the study employed the benchmarks set forth by Beeson and Robey (2006). These benchmarks categorize effect sizes as small (d = 4.0-7.0), medium (d = 7.0-10.1), and large (d > 10.1).
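The effect size calculation can be sketched as below. The scores are hypothetical, the function names are invented for this example, and two details are assumptions not stated in the source: the baseline SD is computed as a sample standard deviation, and a value lying exactly on a boundary is assigned to the higher category.

```python
from statistics import mean, stdev


def effect_size_d(pre, post):
    """d = (mean of post-treatment probes - mean of baseline probes) / baseline SD."""
    sd_pre = stdev(pre)  # sample SD; assumed, not specified in the source
    if sd_pre == 0:
        raise ValueError("d is undefined when the baseline shows no variability")
    return (mean(post) - mean(pre)) / sd_pre


def benchmark(d):
    """Benchmarks as quoted in the text: small 4.0, medium 7.0, large above 10.1."""
    if d > 10.1:
        return "large"
    if d >= 7.0:
        return "medium"
    if d >= 4.0:
        return "small"
    return "below small"


# Hypothetical number of correctly named treated words (out of 30) per probe.
pre_scores = [8, 9, 10]
post_scores = [18, 19, 20]
d = effect_size_d(pre_scores, post_scores)
print(d, benchmark(d))
```

Because the denominator is the baseline variability, a participant with identical scores on all three baseline probes has no computable d under this formula, which is why the sketch raises an error in that case.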

Results
Individual performances in the MCW naming test, in terms of the number of correctly named treated and untreated words during individual sessions before and after treatment are presented in Table 3, along with effect size.
Group-level analyses were conducted using generalized linear mixed models (GLMM) with logistic hierarchical regressions. Three separate models were used to examine the relationship between treatment and naming accuracy in the MCW words, as described below.

Model 1: General treatment effects
The model with the highest explanatory power included fixed effects of Session, Frequency, Length, Word Type, and the interaction between Session and Word Type, along with random slopes and intercepts for Word Type by-Participants (Akaike information criterion; AIC = 5804).The dataset consisted of 5040 observations grouped by twelve participants.The Treatment Mode effect was not included in the model as a main effect or as an interaction with Session., as its contribution to the explained variance was not  significant, χ 2 (1) = 2.5, p = .1,indicating no difference in the effect of treatment between online and frontal groups.Table 4 presents the estimated coefficients, standard errors, zvalues, p-values and confidence intervals (CI) for the fixed effects, as well as variance and standard deviation (SD) of random effects.
Holding all other predictor variables constant, the effects of sessions S7, S8, S9 and S10 were significant (all p < .001).Frequency and Length were also significant (p < .001for both), such that higher frequency words were associated with an increased likelihood of correct naming, while longer word length was associated with a decreased likelihood of correct naming.See Figure 3.The effect of Word Type was insignificant (p > .05).
The analysis of variance indicated a significant interaction between Session and Word Type, χ 2 (6) = 23.3,p < .001,and prompted a follow-up post-hoc analysis.Pairwise comparisons between predicted accuracies of different sessions, separately for treated and untreated words, as well as within sessions between treated and untreated words, were conducted using z-tests with Bonferroni correction for multiple comparisons.The results are presented in Table 5.Specifically, the comparisons between the three pre-treatment sessions (S1, S2, and S3) and between the three post-treatment sessions (S7, S8 and S9) did not yield significant differences among themselves for neither treated nor untreated words (all ps > .05).However, significant differences emerged between the last pre-treatment session (S3) and all post-treatment sessions (S7, S8 and S9), as well as the follow-up session (S10), for both treated (all ps < .001)and untreated (all ps < .01)words.Based on the observed estimates (presented in Table 5), the odds of correct naming increased by 3.56 times ðe 1:27 ) from S3 to S7 for treated words, and by 1.99 times (e 0:69 ) for untreated words.These odds are further transformed into more familiar probabilities presented in Figure 4.For example, an individual with approximately 40% accuracy in naming treated words in S3 would improve to around 70.3% in S7.The comparison between the last post-treatment naming test (S9) and the follow-up test (S10) did not yield a significant difference (p > .05),indicating the long-term maintenance of naming accuracy.
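The conversion from a log-odds estimate to an odds ratio and then to a predicted probability can be checked with a few lines of arithmetic. The helper name is illustrative, and the 40% starting accuracy is the worked example from the text; applying the same starting point to untreated words is a hypothetical added here for comparison only.

```python
import math


def shift_probability(p_before: float, log_odds_change: float) -> float:
    """Apply a change on the log-odds scale to a starting probability."""
    odds_before = p_before / (1 - p_before)
    odds_after = odds_before * math.exp(log_odds_change)
    return odds_after / (1 + odds_after)


odds_ratio_treated = math.exp(1.27)    # about 3.56, S3 to S7, treated words
odds_ratio_untreated = math.exp(0.69)  # about 1.99, S3 to S7, untreated words

treated_after = shift_probability(0.40, 1.27)
untreated_after = shift_probability(0.40, 0.69)  # hypothetical 40% starting point
print(round(odds_ratio_treated, 2), round(odds_ratio_untreated, 2))
print(round(treated_after, 3), round(untreated_after, 3))
```

The same odds ratio produces different gains in percentage points depending on the starting accuracy, which is why the effects are reported as odds and only illustrated as probabilities.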
When examining the differences between treated and untreated words within each session, no significant differences were found in any of the pre-treatment sessions, S1, S2 and S3 (all ps > .05). However, a significant difference between treated and untreated words became apparent following treatment, as seen in S7 (p = .002) and S8 (p < .001). The estimates presented in Table 5 suggest that after treatment, the odds of correct naming were 1.82 times (e^0.6) higher for treated words than for untreated words. However, this disparity was not sustained, as no significant differences were identified at S9 (p = .1) or at S10 (p = .06). Figure 4 shows the predicted probabilities of naming accuracy across sessions for treated and untreated words.
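
The conversion from log-odds estimates to odds ratios and predicted probabilities used in this section can be illustrated with a short sketch. It is not part of the original analysis: the function names are hypothetical, and the only inputs are the estimates quoted in the text (1.27, 0.69 and 0.6) together with the illustrative 40% baseline.

```python
import math


def odds_ratio(log_odds_estimate: float) -> float:
    """Exponentiate a log-odds contrast from a logistic model to obtain an odds ratio."""
    return math.exp(log_odds_estimate)


def shifted_probability(baseline_prob: float, log_odds_estimate: float) -> float:
    """Apply a log-odds change to a baseline probability and return the new probability."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * math.exp(log_odds_estimate)
    return new_odds / (1.0 + new_odds)


# Estimates quoted in the text, used here for illustration only
print(round(odds_ratio(1.27), 2))  # S3 to S7, treated words: 3.56
print(round(odds_ratio(0.69), 2))  # S3 to S7, untreated words: 1.99
print(round(odds_ratio(0.60), 2))  # treated vs untreated after treatment: 1.82
print(round(shifted_probability(0.40, 1.27), 2))  # 40% baseline at S3: 0.7
```

Because the odds ratio acts multiplicatively on the odds rather than on the probability, the same estimate implies a different absolute gain for participants starting from different baseline accuracies.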

Model 2: Dynamics of treatment effects
The model with the highest explanatory power included Session and Frequency as fixed effects, along with by-participant and by-item random intercepts (AIC = 1928.4). The dataset consisted of 1800 observations grouped by 12 participants and 60 items. Table 6 presents the estimated coefficients, standard errors, z-values, and p-values for the fixed effects, as well as the variance and standard deviation of the random effects.
Consistent with Model 1, a main effect of Frequency was found (p < .001) when holding other predictor variables constant. Additionally, the Session factor significantly contributed to the explained variance in the model, χ²(4) = 101.7, p < .001, indicating the need for a post-hoc analysis comparing sessions. Pairwise comparisons were conducted using z-tests with Bonferroni correction for multiple comparisons. The results indicated significant differences in the likelihood of correct naming between S3 and S4, z = 5.19, p < .001, 95% CI [-1.44, 0.43], and between S4 and S5, z = 3.3, p = .009, 95% CI [-1.12, -0.09]. However, no significant differences were observed in the likelihood of correct naming between S5 and S6, z = -0.66, p > .05, 95% CI [-0.4, 0.6], or between S6 and S7, z = 0.38, p > .05, 95% CI [-0.6, 0.45]. These findings suggest that treatment effects on accuracy are primarily observed during the first two treatment cycles, i.e., 10 treatment sessions, and are expected to be maintained thereafter. See Figure 5.
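
As a rough illustration of the Bonferroni adjustment applied to pairwise z-tests of this kind, the sketch below multiplies a two-sided p-value by the number of comparisons and caps the result at 1. The count of ten comparisons (all pairs among the five sessions S3 to S7) is an assumption made for illustration and is not reported in the text; the function name is hypothetical.

```python
import math


def bonferroni_p_from_z(z: float, n_comparisons: int) -> float:
    """Two-sided normal p-value for z, Bonferroni-adjusted for n_comparisons tests."""
    # erfc(|z| / sqrt(2)) is the two-sided tail probability of a standard normal
    raw_p = math.erfc(abs(z) / math.sqrt(2.0))
    return min(1.0, raw_p * n_comparisons)


# ASSUMPTION: five sessions (S3..S7) give 5 * 4 / 2 = 10 pairwise comparisons
n_pairs = math.comb(5, 2)

contrasts = [("S3 vs S4", 5.19), ("S4 vs S5", 3.3), ("S5 vs S6", -0.66), ("S6 vs S7", 0.38)]
for label, z in contrasts:
    print(label, bonferroni_p_from_z(z, n_pairs))
```

Under this assumption, the adjusted p-value for z = 3.3 falls just under .01, the same order as the value reported for S4 versus S5, while the two later contrasts are capped at 1.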

Model 3: Psycholinguistic and demographic effects
The model with the highest explanatory power included the fixed effects of Session, Frequency, Length, Word Type, pre-treatment naming of SHEMESH bi-morphemic words (Shemesh Bi Pre), time post-onset (TPO), Age, and Education, as well as the interactions between Session and Word Type and between Session and Shemesh Bi Pre. Random intercepts by participant were also included (AIC = 2473). The dataset consisted of 2160 observations grouped by 12 participants. Table 7 presents the estimated coefficients, standard errors, z-values, and p-values for the fixed effects, as well as the variance and standard deviation of the random effects.
Holding all other predictor variables constant, significant main effects were found for the following demographic variables: Age (p = .005), indicating that older participants in our sample were more likely to have higher naming accuracy; TPO (p = .004), indicating that participants with a shorter time since their lesion were more likely to have higher naming accuracy; and Education (p = .018), with participants having more years of education being more likely to have higher naming accuracy. Consistent with Model 1, Frequency and Length had significant effects (p < .001 and p = .02, respectively) in the same directions, while the effect of Word Type was not significant (p > .05).
A significant interaction between Session and Shemesh Bi Pre was found, χ²(2) = 7.7, p = .02. Visual analysis revealed that higher proficiency in naming bi-morphemic words prior to treatment was associated with an increased likelihood of improvement in accuracy following treatment. Figure 6 shows the predicted probabilities of naming accuracy in sessions S3, S7 and S10 by performance on the SHEMESH bi-morphemic words, irrespective of Word Type (treated/untreated words).

Model 4: Generalization of treatment effects
The model with the highest explanatory power included fixed effects of Time, SHEMESH Length, and SHEMESH Frequency, as well as random intercepts by participant and by item (AIC = 2184.8). The dataset consisted of 1892 observations grouped by 11 participants and 86 items.

Discussion
The primary objective of this study was to investigate the efficacy of a new morphology-based treatment on the naming abilities of Hebrew-speaking people with post-stroke aphasia (PWA). The treatment targeted the root morpheme in morphologically complex words, with the aim of enhancing participants' awareness of roots and their use as cues to facilitate word retrieval. The results indicated that the treatment was successful. In this section, we discuss the findings in relation to our hypotheses and the existing literature on aphasia treatment.

Treatment outcome measures
The results revealed a significant increase in the likelihood of correct naming for both treated and untreated words following treatment. Furthermore, the predicted improvements in naming accuracy remained stable even 10 weeks after the termination of treatment, indicating that the efficacy of the treatment was maintained in the long term. The results also supported the hypothesis that the treatment effects would generalize to the naming of words without roots. This generalization was evident in the improvement of naming performance for both mono- and bi-morphemic words in the SHEMESH naming test, observed from pre- to post-treatment assessments.
The literature on aphasia is replete with studies demonstrating enhanced naming abilities for treated words after naming therapy. However, generalization to untreated words is relatively infrequent. One study (Nickels, 2002) suggests that around one quarter of patients demonstrate generalization in word production after naming intervention, and proposes that generalization is more likely when the treatment offers a strategy, rather than simply practicing the pairing of meaning and form. A systematic review by Efstratiadou et al. (2018) analysed 21 studies on Semantic Feature Analysis (SFA) therapy for aphasia. Comparing pre- and post-treatment performance, the treatments resulted in improvement for most patients, but generalization to untreated words and to connected speech was observed in less than half of them. However, effect size (ES) calculations for treated words, using benchmarks similar to those in the current study (Beeson & Robey, 2006), were more moderate, with most participants showing small or less-than-small treatment ES (Efstratiadou et al., 2018). In contrast, in the current study a third of the participants showed medium and large effect sizes, and another half showed small effect sizes for treated words. The improvement for untreated words in the current study showed mostly less-than-small effect sizes, which is in line with a previous meta-analysis conducted on SFA naming treatments of nouns (Quique et al., 2018). Another meta-analysis by Wisenburn and colleagues (2009), who explored the effects of naming therapies, including semantic, phonological and mixed methods, found higher gains for treated items and the lowest gains for untreated items, as in the current study.
Exploring the dynamics of naming accuracy change during treatment in the current study, we found that improvement in naming treated words was significant in the first two cycles of treatment, i.e., after ten treatment sessions. Similar findings were demonstrated by Simic et al. (2020), who found that the steepest improvement during phonological naming treatment (PCA) occurred in the first four sessions. While there was no significant improvement at the group level for the third and fourth treatment cycles, the trajectory varied across individuals, and the additional treatment cycles may have contributed to maintenance and generalization.
Several mechanisms may have contributed to treatment gains. First, the frequent attempts to name words during treatment have been suggested as a potential factor that contributes to the gains observed in treated words (Wisenburn & Mahoney, 2009). Repeated exposure to treated words during treatment could strengthen the links between word form and meaning every time a word is retrieved (Best et al., 2013). While this mechanism alone cannot explain improvement in naming untreated words, spreading activation may play a role. The spreading activation theory suggests that the activation of related semantic, phonological, and orthographic features during the treatment may lead to strengthening of the overall network (Dell, 1986; Dell et al., 2014) and serves as the basis of the SFA treatment method (Collins & Loftus, 1975). This process can explain the generalization of treatment effects to untreated words that share lexical features with treated items (Efstratiadou et al., 2018; Kiran & Bassetto, 2008; Quique et al., 2018; Robey, 1994, 1998). Nonetheless, given the lack of a control group in the current study, it is important to consider that the act of naming items through repeated probing may also contribute to improvement through priming, even without explicit feedback or treatment (Webster et al., 2015).
More specific to the current treatment method, the improvement in naming bi-morphemic words, whether treated or untreated (both from the MCW list and from the SHEMESH test), can also be explained by a specific mechanism of decomposition-induced retrieval. The treatment involves breaking down morphologically complex words to extract the root morpheme and using it as a cue to retrieve the target word, which shares the same root. During treatment, participants were instructed to pay attention to relevant information in their own naming attempts and to identify relevant morphological information that could assist in retrieving the target word. When faced with difficulty naming an untreated bi-morphemic word, participants could employ the same strategy by explicitly attempting to evoke or identify the target root. For example, when trying to name the word "screwdriver" /maVʁeG/, one participant said "what is it? One screws with it /maVʁiGim/. Ha! Screwdriver! /maVʁeG/". Another participant described her moment of understanding the root-decomposition principle in the treatment this way: "Only now it hits me! At the beginning I said 'ok, he is combing /meSaʁeK/ his hair', what's the problem to say it? but I understood it make it easier for me to retrieve the word 'comb' /maSʁeK/".
Finally, treatment gains that extended beyond words with a bi-morphemic structure may suggest the acquisition of a semantic self-cueing strategy (Lowell et al., 1995). This mechanism has been used to explain the improvement achieved during SFA and similar naming therapies (Biedermann et al., 2010; Boyle, 2004; Efstratiadou et al., 2018). In the current study, the instruction to participants to pay attention to relevant information in their own naming attempts may also have helped them recall semantic information that facilitated the retrieval of mono-morphemic untreated words, similarly to semantic treatment approaches (Boyle, 2004).

Treatment effects on word repetition
Our final prediction, which suggested that treatment effects would extend to repetition (BAFI) of morphologically complex words, was not supported by the results. The dual route model proposed by McCarthy and Warrington (1984) posits that both lexical-semantic and non-lexical routes contribute to auditory word repetition. Following that, we hypothesized that strengthening complex word representations would enhance the lexical-semantic route, resulting in improved auditory real word repetition. The lack of improvement in the repetition of complex words in the BAFI test, despite the improvement found in naming bi-morphemic words, could be explained by differences between the word lists. For example, words in the BAFI test were significantly longer than those used in the MCW list, i.e., 3.2 ± 0.7 syllables for the BAFI compared to 2.6 ± 0.52 syllables in the MCW list, t(98) = 5.5, p < .001, and also contained inflections, making them more difficult to process at the phonological output buffer lexical stage.

Predicting response to treatment from participant factors
Accurately predicting treatment outcomes is of paramount importance for both clinicians and patients, as it enables the selection of individualized treatment strategies (Moons et al., 2009). In our logistic regression analyses, we found that only pre-treatment naming performance on SHEMESH bi-morphemic words significantly predicted long-term treatment effects for treated items, whereas pre-treatment naming of mono-morphemic words was not significant in the exploratory buildmer model. This finding is consistent with previous research on Hebrew-speaking PWA (Bahar-Sharabi, 2014), showing that naming SHEMESH bi-morphemic words is generally more challenging than naming mono-morphemic words. These results, along with previous findings showing the importance of the root morpheme in lexical access (Berman, 1982; Dromi & Berman, 1982; Frost, Deutsch, & Forster, 2000; Goral et al., 2003), suggest that unconscious, unique processing occurs for words with roots compared to mono-morphemic words, even in simple naming tasks. Moreover, a prior ability to manipulate and decompose roots in Hebrew words may have facilitated internalization of the current root-based treatment principles, leading to greater treatment success.
The findings from our third model demonstrated significant effects of time post-onset, education, and age on naming accuracy. Specifically, older participants and those with higher levels of education showed an increased likelihood of correct naming, while a longer time since the lesion was associated with a decreased likelihood of correct naming. However, it is essential to note that these factors did not interact with treatment sessions, indicating that they did not contribute significantly to treatment gains. In the existing aphasia literature, the relationship between demographic variables and treatment outcomes remains inconclusive. While some studies found no significant association between demographic variables and treatment-related naming improvements (Quique et al., 2018), others reported that pre-treatment severity of language impairment and psycholinguistic abilities, including pre-treatment naming, were predictive of treatment outcomes (Geranmayeh et al., 2014; Gilmore et al., 2019; Gu et al., 2020; Lambon Ralph et al., 2010). This uncertainty can be attributed to the intricate nature of aphasia, wherein a diverse array of factors collectively influences language recovery. Furthermore, the small sample sizes commonly encountered in aphasia treatment research make the investigation of demographic variables particularly difficult.
It is worth noting that most participants in our study had a phonological impairment, with relatively mild semantic difficulties, which could have contributed to their high responsiveness to the morphology-based treatment. These results are similar to those of a previous study (Best et al., 2013) in which participants with either semantic or phonological lexical deficits were treated with a cueing hierarchy intervention of phonemes/graphemes. Their results showed that individuals with less severe semantic deficits and more dominant phonological output deficits tended to show generalization to untreated words. Best et al. (2013) suggest that when there is a lexical-semantic deficit, there is insufficient activation feeding through to the level of phonological encoding during word retrieval. When lexical-semantic processing remains relatively well preserved, it enables partial activation at the phonological encoding level, which can produce generalised changes.
Among the participants, it is important to highlight that one individual (P7) did not demonstrate any improvement in naming following the treatment (showing the smallest treatment effect size). This was the only participant diagnosed with conduction aphasia, as indicated by their markedly poor performance on the BAFI repetition test (14%) and a high occurrence of phonological paraphasias. Conduction aphasia is known for its characteristic features of phonological paraphasias and poor speech repetition (Ardila, 2010; Bartha & Benke, 2003). These specific deficits experienced by P7 could account for the limited treatment effect, as the treatment relied heavily on recognizing phonological similarities between roots and targets, which may have been challenging for this individual. However, it is important to acknowledge that other factors, such as lesion characteristics or untested cognitive abilities beyond the scope of this study (Boyle, 2015; Lambon Ralph et al., 2010), may also have played a role in influencing treatment outcomes.

Limitations
An important limitation of the current study is the lack of a control group undergoing therapy with a different treatment method. The observed enhancements in naming abilities could potentially arise from mere exposure to the stimuli. Consequently, we could not determine the specificity of the mechanism at play, or the potential advantage of morphology-based treatment over other treatments that are not uniquely tailored to the structure of the Hebrew language. Further investigation with larger sample sizes is required to explore the treatment effects on individuals with different psycholinguistic profiles and to compare this treatment with other naming treatments such as SFA and PCA.

Conclusions
This study presents the first evidence-based Hebrew anomia treatment for PWA, demonstrating potential effectiveness by improving naming of both treated and untreated words. Most of our results are consistent with either a specific treatment mechanism, involving improved morphological decomposition, or a more general mechanism of a strengthened lexical system and improved self-cueing strategies. These could be differentiated in the future by comparing the results to other treatment methods. Our findings underscore the importance of exploring naming therapies that are specifically adapted to each language. The principles of the current treatment may also be applicable to improving verb retrieval in Hebrew and other Semitic languages with root-based morphology. While treatment outcome is affected by many uncontrollable factors, we can control the type of therapy delivered to PWA. Therefore, it is our duty to search for the best treatment approach in each language and for each individual.

Figure 1. Timeline of assessments and treatment.

Figure 2. Examples of the treatment steps. Step 1 (a). The target word /miKLeDet/ (a keyboard) is presented below its representative picture. In the upper left corner, the written forms of the root, target word, related verb, and gerund are displayed. The root graphemes are highlighted in red in all words for easy identification. The SLP explains the morpho-phonological relations between words sharing the same root. Step 2 (b). Participants are presented with four pictures and are required to choose the one that shares a root with an auditorily presented gerund. Once the target word is identified with a mouse click, it is highlighted, and auditory feedback is given to indicate correct or incorrect selections.
Participants' naming performances (number of correctly named words out of 30) in pre- and post-treatment sessions of the MCW -up; ES = effect size; Effect sizes are presented in parentheses. S = small; M = medium; L = large.

Figure 3. Predicted probabilities of naming accuracy per change in word Length (a) and word Frequency (b).

Figure 6. Predicted probabilities of naming accuracies for the interaction of pre-treatment naming of Shemesh bi-morphemic words (standardized) and Session levels.

Table 1. Participants' demographics and pre-treatment performance on language tests

Table 2. Examples of the bi-morphemic nouns and their related verbs from the MCW list

Table 4. GLMM results of Model 1: General treatment effects

Table 6. GLMM results of Model 2: Dynamics of treatment effects