Are There Sustained Effects of a Preschool Shared-Reading Intervention Addressing Dual Language Learners?

ABSTRACT Purpose Previous research has demonstrated that immediate effects of language interventions tend to fade, but has also suggested that differentiating language skill types may be essential for understanding fade-out processes. This paper examines the longer-term effects of participating in a shared-reading intervention. Method The study included 464 dual-language learners (DLLs) (49.6% girls) in Norway with a mean age of 52.60 months when the intervention started. They were randomly assigned to an intervention condition receiving a shared-reading program in preschool and at home or a business-as-usual control condition. The children spoke a number of first languages and were second-language speakers of Norwegian. Children’s second-language skills were assessed seven months following the completion of the intervention. We asked whether the developmental advantages induced during the intervention faded or remained when the intervention conditions were no longer present, using autoregressive structural modelling and second-order latent growth modelling to answer the question. Results While some immediate intervention effects disappeared (grammar) or showed tendencies to fade (vocabulary and perspective taking), second-order latent growth modelling suggested that narrative skills emerged. Conclusion The results demonstrate the need to consider skill type in future research on fade-out and offer a longer-term perspective on how DLLs respond to shared-reading interventions.

An increasing number of children speak a language at home that is different from the language of instruction in schools, rendering questions of how preschools prepare dual-language learners (DLLs) for schooling ever more urgent. In spite of the recognized importance of support for both home and school languages in early childhood, we have limited knowledge about the longer-term impacts of language-promoting interventions and programs designed for young DLLs (Abenavoli, 2019). Do language skills that have been spurred by a preschool intervention continue to develop, conferring a lasting advantage on participating children? Are some language skills more resistant to fade-out than others as time passes? Do certain skills require time to develop, showing effects only some months or years after the completion of the intervention? In the present study, we examine the delayed effects of a randomized and controlled language intervention in Norwegian preschools and homes that used shared reading as a core language-promoting component to support young DLLs' second-language vocabulary, grammar, perspective taking and narrative skills. Significant immediate post-intervention effects on the vocabulary, grammar and perspective-taking skills of the children who received the shared-reading intervention have previously been reported (Grøver et al., 2020), while no immediate effects on narrative skills were identified in the same report. The current study asks whether these impacts remained or faded seven months after the intervention was concluded, and whether any novel intervention effects emerged.

Persistence and fade-out of language interventions in early childhood
The question of whether receiving a time-limited educational intervention early in life has the potential to alter developmental trajectories has interested researchers from fields ranging from education and developmental psychology to economics. Precedents in the early education research field have examined maintenance or fade-out of early interventions, years after reports of their success in early childhood (Burchinal et al., 1997; Campbell & Ramey, 1995; Gormley et al., 2018; Ramey et al., 2000). Particularly relevant to the current study are recent reports on follow-up effects during the first year after the completion of a particular preschool program or a researcher-developed intervention (Bierman et al., 2014; McCormick, Weiland, et al., 2021; Moffett et al., 2022; Timperley et al., 2022). The time that passed between the completion of an intervention and the assessment of delayed effects is of course crucial to consider. Yoshikawa et al. (2016) demonstrated, for example, that the advantages children accrued during a preschool program diminished every year, with the decline being steepest in the first years. There is no well-established position on the optimal time point for studying delayed intervention effects, nor agreement on what can be described as longer-term effects (compare for example the life-long perspectives in studies of the effects of the Abecedarian preschool program (Campbell et al., 2012) versus effect studies undertaken about eight months after literacy interventions were completed (Bus & Ijzendoorn, 1999), both time windows described as "long-term" by the authors). In the present study we refer to effects studied seven months after the completion of a shared-reading intervention as longer-term effects, contrasting them with immediate effects.
Bailey et al. (2020) concluded in their extensive review of persistence and fade-out following educational interventions that we lack a theoretical framework for understanding sustainability of skill advantages achieved through interventions. They demonstrated that studies find both persistent effects and effects that diminish or fully disappear when the intervention conditions are no longer present. The fade-out hypothesis is typically the default in studies where environmental conditions prior to the intervention remain unchanged after the intervention. Many long-term intervention studies suggest that child effects fade or disappear completely (Barnett, 2011; Bierman et al., 2008) with control groups catching up so that intervention- and control-group outcomes converge (Noble et al., 2019; Yoshikawa et al., 2013, 2016). The persistent effect hypothesis, on the other hand, tends to be the default in studies arguing that earlier skill levels are strong predictors of later skill levels (Cunha & Heckman, 2007; for a discussion of skill building as a mechanism for persistent intervention effects, see also Bailey et al., 2017). Skills may lead to more skills, and advantages acquired through an intervention may not only persist but even grow, a phenomenon captured by the term "the Matthew effect" (Stanovich, 1986). Bailey et al. (2020) concluded that even though fade-out is widespread, it occasionally co-exists with persistence and this co-existence may be dependent upon the types of skills targeted in the intervention. Paris (2005) argued, for example, for the importance of differentiating constrained skills, which are easily accelerated by brief interventions but then also universally mastered even without an intervention, from unconstrained skills, which continue to develop over a long period and for which intervention-induced advantages are more likely to persist. McCormick, Weiland, et al. (2021) confirmed that differentiating skill type was essential for understanding the long-term language effects of preschool programs; effects on unconstrained skills, such as vocabulary, were more likely to be sustained than effects on constrained skills, such as decoding, and this was in particular the case for DLLs during the fall term of kindergarten before classmates had received more intensive literacy teaching and started catching up. Thus we formulate a third hypothesis, the skill-type matters hypothesis, which suggests that fade-out may be more common for some skill types than for others, depending on the extent to which the skill type assessed is constrained or unconstrained.
DLLs, speaking one language at home and another in the preschool setting, have typically had less exposure to the language of instruction in preschool than their monolingual peers and may thus demonstrate different second language learning trajectories post-intervention. Though DLLs may benefit from language-of-instruction interventions as much as or sometimes more than native speakers (McCormick, Weiland, et al., 2021; Yoshikawa et al., 2013), the evidence for persistence or fade-out of intervention-induced second-language boosts for DLLs is at best limited (for review, see Abenavoli, 2019). In the effect studies reviewed below, only three studies specifically addressed DLLs: McCormick, Weiland, et al. (2021) (about half the sample was DLLs); Rodge et al. (2016) (entire sample was second-language speakers); Vadasy et al. (2015) (compared language-minority children with language-majority). A number of other studies (Bierman et al., 2014; McCoy et al., 2018; Nix et al., 2016; Sim et al., 2014; Welsh et al., 2020; G. J. Whitehurst et al., 1994) reported on percentages of Hispanic/Latino children in the sample, information that may suggest the sample included DLLs, but offered no specific information about the children's language status and use.

Vocabulary outcomes
The most researched variable in sustainability studies is vocabulary. Several studies have reported no longer-term effects of shared reading on vocabulary (Bierman et al., 2014; Sim et al., 2014; G. Whitehurst et al., 1988). G. Whitehurst et al. (1999), on the other hand, did find effects of a shared book-reading intervention on emergent literacy skills, including vocabulary, one year after the completion of the intervention, and G. J. Whitehurst et al. (1994) similarly demonstrated growth in measures of expressive vocabulary in a follow-up study six months after the intervention. Vadasy et al. (2015) found that DLLs in shared reading conditions exhibited better longer-term receptive vocabulary gains if their receptive vocabulary skills were low. Rodge et al. (2016), who also studied DLLs, found only slightly reduced effects compared to immediate post-intervention scores at a seven-month follow-up assessment. Fricke et al. (2013) and Hagen et al. (2017) likewise found vocabulary effects of a language intervention that addressed children with language difficulties and included a dialogic reading component, at six- and seven-month follow-ups, respectively. In sum, evidence of vocabulary effects from shared-reading interventions assessing children six to 12 months after an intervention is meager but not absent. The available studies have reported sustained effects, reduced effects, and effects that completely disappeared.

Grammar outcomes
Of the few studies that have assessed the longer-term effects of shared reading interventions on grammar development, none has shown sustained grammatical skill improvement. Rodge et al. (2016) and Bianco et al. (2010) observed no effects of shared reading on syntactic comprehension in seven- to nine-month follow-up studies, and Hagen et al. (2017) similarly concluded there were no follow-up effects on morpheme generation after seven months.

Narrative outcomes
Though some studies have reported no narrative effects in follow-up studies of shared book-reading interventions (Bianco et al., 2010; Feagans & Farran, 1994), other publications identify emerging narrative sleeper effects (see below). This is the only skill for which delayed learning outcomes of shared book-reading emerged that were not present at immediate post-intervention assessment. Peterson et al. (1999) in a storytelling intervention (not a shared-reading intervention per se) found narrative sleeper effects one year after the completion of the intervention, despite finding no immediate effects. Moreover, Reese et al. (2023) recently reported narrative effects (both story comprehension and retelling skills) in a one-year follow-up assessment of an intervention that included shared reading, while no immediate effects of the same intervention had been identified (Riordan et al., 2022). Narrative skills are an excellent example of unconstrained skills that take more time to develop. The sleeper effects on narrative skills identified by Reese et al. may thus confirm the importance of the unconstrained-constrained dimension suggested by Paris (2005) and confirmed by McCormick, Weiland, et al. (2021) in their follow-up study of a preschool intervention.

Perspective-taking outcomes
Perspective taking is a multidimensional skill that develops during the preschool years. Shared reading is an activity that seems to promote the development of perspective taking through opportunities to talk about characters' internal states and emotions (Dowdall et al., 2020; LaForge et al., 2018; for review, see Grøver et al., 2023). In the present study perspective taking encompasses emotion comprehension, internal state comprehension and skills in shifting protagonist perspectives during narration. It is thus a skill needed for emotional and social understanding and is an example of an unconstrained skill important for communication and narration. There are some indications that socio-emotional effects of interventions are more robust to extinction than traditional academic skills, both when observed one year post-intervention (Bierman et al., 2014; McCoy et al., 2018) and through the elementary school years (Nix et al., 2016; Welsh et al., 2020).
However, the evidence is also mixed for this developmental domain. Dowdall et al. (2021) found changes in parent-child interactions, such as increased talk about mental states, from a parenting intervention using shared reading to promote toddlers' socio-emotional development. Contrary to expectations, though, no immediate or six-month delayed child outcome effects appeared in social behavior or theory-of-mind. The authors argue that a longer-term follow-up study would be important before concluding the intervention was ineffective, particularly as the intervention led to positive improvements in caregiver talk.

The present study
The DLLs participating in the present study had received a researcher-developed and loosely scripted shared-reading intervention called Extend in preschool and at home. The intervention took place in four book-sharing units, each lasting for four weeks, over one preschool year. A randomized controlled research design was applied to evaluate the effects of the intervention, immediately and in a longer-term perspective. Children's vocabulary, grammar, perspective-taking and narrative skills were assessed prior to the intervention, immediately following the intervention and seven months past its completion. In the present study we asked whether the developmental advantages induced under the particular intervention conditions had faded or remained when those conditions were no longer present. More specifically, we asked the following research question: Were the intervention effects on vocabulary, grammar, perspective taking, and narrative skills sustained seven months after the completion of the shared-reading intervention?
We used two different statistical approaches to respond to the question: first, we applied autoregressive structural equation modeling (SEM) in which the delayed assessments were regressed on the immediate post-intervention and pre-intervention assessments. Second, we analyzed delayed effects using second-order latent growth modeling that included data from the three waves of assessment.

Method
The study's recruitment procedures, treatment of participants and data handling complied with the Norwegian Personal Data Registers Act and were approved by the Norwegian Social Sciences Data Services. The parents gave informed consent to their child's participation in the study. Teachers offered informed consent to their participation.

The children
The study included 464 children (49.6% girls) aged 3-5 years. They attended 123 classrooms in 60 preschools in the larger Oslo area, with a mean number of 3.77 children participating per classroom. In Norway, children commonly attend age-heterogeneous preschool classrooms with approximately 18 children aged 3-5 years per classroom, while in some preschools children are grouped in more age-homogeneous ways. There is no kindergarten class in the Norwegian education system equivalent to kindergarten classes in the US. Children leave preschool and start first grade in elementary school the calendar year they turn six years. Pre-intervention, the children's mean age in months was 52.60 (SD = 9.63). Of the sample, 40.7% had recently begun their last year of preschool attendance when the intervention started and were assessed as first-graders at follow-up assessments. Similarly, the sample encompassed 36.4% 4-year-olds and 22.8% 3-year-olds pre-intervention, and they were all still preschoolers at the follow-up assessment. To qualify for the study, children had to be identified as DLLs by parents both of whom spoke a non-Scandinavian language as their first language. Though the children spoke a variety of languages at home, more than half of the sample spoke one of the following four best-represented languages: Urdu (20.3% of the sample), Somali (14.0%), Polish (9.7%) and Arabic (9.3%). The large majority of the children, 86.2%, had entered preschool before age three (mean age in months at preschool entrance 26.11, SD = 10.72). The parents reported on child language use at home on a scale from 1 = mostly first language, 2 = about equal use of first and second language, 3 = mostly second language (Norwegian). Most children combined their first language and Norwegian when interacting with their parents (child to mother M = 2.08, SD = 0.87, child to father M = 1.95, SD = 0.88).

The parents
The sample comprised considerable variation in parental education. Every fourth mother and father had middle school or less (up to 10 years of schooling) as their highest educational level (25% of mothers in the sample, 24% of fathers). A larger group of parents had completed high school (43% of mothers, 42% of fathers). Every third parent (33% of mothers, 34% of fathers) had a university degree (BA or above). The majority of parents studied or worked outside of the home in a full- or part-time position (63% of mothers and 79% of fathers who reported on study/work status). Most parents (92% of both mothers and fathers) had been born outside of Norway and immigrated as young adults (for mothers mean age at immigration to Norway approached 22 years, fathers' mean age was approximately 24 years at time of immigration). Parents mostly used their first language in daily communication with their child (mother to child M = 1.39, SD = 0.65, father to child M = 1.37, SD = 0.66; see scale above for reports on language use).

The teachers
Most teachers (78%) had degrees in early childhood education and more than half had worked for six years or more as preschool teachers. All participating classrooms were characterized by high language diversity.

Random assignment to experimental condition
After recruitment was completed, we applied a two-step randomization procedure at the classroom level to optimize similarity in socioeconomic backgrounds between intervention and control classrooms (for detailed information on randomization, see supplementary material, Grøver et al., 2020). Pre-intervention, the sample consisted of 246 children in the intervention group and 218 children in the control group, attending respectively 61 and 62 classrooms. An independent samples t-test was conducted to compare language use and demographics in the intervention and control conditions at pretest. We found only one difference between the intervention and control groups on child, parent, teacher or classroom variables: mean age at pretest for the intervention group was 53.84 months (SD = 9.51) and for the control group was 51.19 months (SD = 9.60); t(464) = 2.99, p < .01. This age difference appeared to result from the intervention sample randomly including more five-year-olds while the control group included more three-year-olds. The percentage of three-year-olds in the intervention group was 19, of four-year-olds 34, and of five-year-olds 47, while the percentage of three-year-olds in the control group was 27, of four-year-olds 39, and of five-year-olds 34.
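The pretest age comparison can be checked against the reported summary statistics. The following is a minimal sketch, assuming a pooled-variance (Student's) independent-samples t-test; the function name `pooled_t` is introduced here for illustration only, and the source does not state which variant of the test was used.

```python
import math

def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Student's t for two independent groups from summary statistics.

    Assumes equal population variances (pooled-variance test); this is
    an assumption made for the sketch, not a detail given in the study.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

# Reported pretest ages in months: intervention vs. control group.
t_value, df = pooled_t(53.84, 9.51, 246, 51.19, 9.60, 218)
print(f"t({df}) = {t_value:.2f}")
```

Under this assumption the degrees of freedom are n1 + n2 - 2, and the statistic recomputed from the rounded means and standard deviations comes out close to the reported value; small discrepancies are expected because the inputs are rounded.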

Attrition
Attrition of individual children mostly resulted from families moving out of the larger Oslo area. One entire intervention classroom was lost during the intervention year as all three participating children moved. At immediate posttest, which occurred a mean of 7.43 months after the pretest, the sample included 122 classrooms with 429 participating children (228 intervention and 201 control children), with a mean age in months of 60.03 (SD = 9.67). At the delayed posttest 418 children (221 in intervention and 197 in control group) with a mean age in months of 66.89 (SD = 9.85) participated. The mean interval between posttest and delayed posttest was thus 6.86 months. Four children who were not assessed immediately post-intervention due to their extended summer vacations were assessed at the delayed posttest. At the delayed posttest 171 (40.9%) children were attending first grade, while 151 (36.1%) children were in their last year of preschool attendance and 96 (23.0%) children in their penultimate year of preschool attendance. Attrition was equal across conditions and age groups.
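The attrition figures above can be restated as retention rates per condition. The sketch below uses only the sample sizes and mean ages reported in the text; the dictionary layout and the helper name `retention_pct` are illustrative assumptions.

```python
# Sample sizes reported at the three waves, by condition.
counts = {
    "intervention": {"pretest": 246, "posttest": 228, "delayed": 221},
    "control": {"pretest": 218, "posttest": 201, "delayed": 197},
}

def retention_pct(group, wave):
    """Share of the pretest sample still assessed at a later wave, in percent."""
    return 100 * counts[group][wave] / counts[group]["pretest"]

for group in counts:
    for wave in ("posttest", "delayed"):
        print(f"{group:12s} {wave:8s} {retention_pct(group, wave):.1f}%")

# Totals across conditions at each wave.
total_pre = sum(c["pretest"] for c in counts.values())
total_delayed = sum(c["delayed"] for c in counts.values())

# Grade-level breakdown at the delayed posttest should sum to the total.
grade_breakdown = [171, 151, 96]

# Mean interval between posttest and delayed posttest, taken as the
# difference between the two reported mean ages (in months).
interval = round(66.89 - 60.03, 2)
print(total_pre, total_delayed, sum(grade_breakdown), interval)
```

The per-condition rates make the claim of equal attrition across conditions easy to inspect, although the sketch does not itself test that claim statistically.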

Child assessments
The children's second language skills were individually assessed in a quiet room in preschool or elementary school by research assistants who were not informed about the condition to which the child was assigned. Similarly, transcription and coding were done by research assistants who were blinded to the purpose of the study. Children were first assessed in early fall prior to the start of the intervention (end of August and September). They were assessed a second time immediately after the completion of the intervention in April and May. The delayed posttest took place from the end of October to the beginning of December.
Seven instruments were administered in a fixed order and with no time limits imposed, allowing breaks or a split of assessments into two sessions if the child lost interest or attention. To keep the child engaged the assessments alternated between receptive and expressive tasks in the following order: Targeted receptive vocabulary (VOC_RECEPTIVE), Targeted expressive vocabulary (VOC_EXPRESSIVE), The Multilingual Assessment Instrument for Narratives (MAIN), Test for Reception of Grammar second edition (TROG-2), British Picture Vocabulary Scale second edition (BPVS-2), the HUG instrument, and the Test of Emotion Comprehension (TEC). The MAIN and the HUG instrument each included two separate subsets. The first set of each instrument was used to assess narrative skills (MAIN_NARRATIVE, HUG_NARRATIVE) and the second to assess perspective-taking skills (MAIN_INTERNAL STATE COMPREHENSION, HUG_PERSPECTIVE SHIFTING).
Vocabulary. The BPVS-2 (Dunn et al., 1997; adapted to Norwegian by Lyster et al., 2010) assessed children's general, nontargeted receptive vocabulary. Three additional researcher-developed indicators of words targeted in the intervention were also used: VOC_RECEPTIVE (a 46-item receptive targeted vocabulary test modeled on the BPVS-2, in which children were shown panels of four pictures and asked to point to the picture that matched the word, receiving one point per correct answer), VOC_EXPRESSIVE (a nine-item expressive vocabulary test to assess skills in defining targeted words, answers coded along a 3-point scale with an interrater reliability of .86 (Cohen's κ)), and VOC_SPONTANEOUS USE (the number of targeted words that the children spontaneously used while narrating the wordless picture book Hug (Alborough, 2002, see below), thus indicating the extent to which these words had become part of their productive repertoire) (for further description of the project's instruments, see Grøver et al., 2020).
Grammar. Three sets (set C, D, and E, in total 12 items) of the TROG-2 (Bishop, 2003; translated and adapted to Norwegian by Lyster & Horn, 2009) were used to assess children's syntactic comprehension skills.
Narrative skills. To assess children's narrative skills, two different instruments that both addressed narrative production were applied. In the HUG_NARRATIVE assessment children were invited to produce a narrative about a baby monkey protagonist based on the wordless book Hug. Their audio-recorded narratives were transcribed according to the transcription conventions of the Child Language Data Exchange System, CHAT (MacWhinney, 2000), and coded in accordance with content categories developed by Luo et al. (2014) (maximum score 33 points if the child included all possible narrative content components). Interrater reliability was .83 (Cohen's κ). The MAIN_NARRATIVE (Gagarina et al., 2012) is a six-picture-based narrative task available in several versions. We used the baby goat version with its elicitation and coding procedures. Coding addressed children's inclusion of predefined narrative content components and was applied to children's audiotaped CHAT-transcribed narratives (maximum score 16 points) and had an interrater reliability of .83 (Cohen's κ).
Perspective-taking skills. Children's perspective-taking skills were assessed using three instruments that respectively addressed internal state comprehension, emotion comprehension, and skills in perspective shifting; none of these required children to use or understand sophisticated vocabulary. Internal state comprehension was assessed using the second and separate part of the MAIN (MAIN_INTERNAL STATE COMPREHENSION). For this instrument five questions relevant to assessing internal state comprehension were posed, with children responding verbally (one point per correct answer). Set two of the Test of Emotion Comprehension (TEC; Pons & Harris, 2000) was used to assess emotion comprehension (EMOTION_COMP). TEC's set two addresses children's understanding of causes of emotions (such as being happy when receiving a birthday present) in response to illustrated stories. The children's understanding of the emotion terms used in the instrument was checked before they were asked to identify the relevant emotion by pointing to one of four drawings of facial expressions for each of four stories. The HUG_PERSPECTIVE SHIFTING tool was based on but extended the HUG_NARRATIVE. The children who produced a story about the protagonist, a baby monkey, during the HUG_NARRATIVE (narrating events rather than just labeling the pictures) were invited to retell the story from the perspective of another character, a baby elephant, who had a secondary role in the narrative (for description of the perspective-shifting task and its coding criteria, see Grøver et al., 2020). Interrater reliability was .82 (Cohen's κ).

Parent interview and teacher questionnaire prior to the onset of the intervention
At the onset of the study, parents were interviewed by telephone about family demographics and language use by bilingual research assistants who spoke the family's first language as well as Norwegian. The classroom pedagogical leader (for work role description, see Norwegian Directorate for Education and Training, 2017) responded to a questionnaire that included questions on teacher demographics and classroom language composition.

The intervention classrooms
The Extend intervention was based on a social-interactionist approach to language learning, grounded in the thinking of Vygotsky (1978), Bruner (1981), and Snow (1977) among others. The Extend intervention aimed to support teachers and parents to interact with DLLs in language-promoting ways and to use books to encourage shared attention and discussion, the expectation being that experiencing such book-based interactions would affect the broad scope of language skills developed in early childhood. Observations in all classrooms prior to the intervention showed that text-related activity of any kind was infrequent in both intervention and control classrooms, with no differences between the conditions (Grøver et al., 2022).
The Extend intervention was piloted in close collaboration with experienced preschool teachers and leaders to adapt to the Norwegian Framework Plan for Kindergartens (Norwegian Directorate for Education and Training, 2017) as well as educator values guiding the play-focused early childhood education system in Norway and was developed to support a broad set of skills hypothesized to result from shared book reading: vocabulary, grammar, narrative, and perspective taking. Teachers were requested to invite child participation and to adapt their book sharing to the skill level of the children they worked with, to ensure shared attention.
The intervention consisted of four thematically defined four-week book-sharing units throughout the year; teachers received one theme-relevant book per week to share with participating children. For the book-sharing program in preschool 15 books were selected, both wordless and with text. The books were selected because they were considered useful in inviting content-rich discussions with preschool children and because they fit into one of the curricular themes, thus allowing words and themes covered in the discussion of one book to be revisited when discussing another. During a shared reading session, the teachers were asked to teach 4-5 targeted words and build knowledge aligned with the targeted vocabulary, to invite child reasoning and explore ideas through questions, to support identification of emotions and internal states in the text, and to invite shifts in protagonist perspectives. Teachers were asked to read each book at least three times during the week and audiotape one reading. Because play is highly valued in Norwegian early education, teachers were also asked to invite children to extend book themes to play and other activities outside of shared reading. All preschool readings took place in the common language, Norwegian. Each book came with support material that offered suggestions and ideas for what to emphasize. Teachers shared the books in groups of children and could include the number of children they wanted, the only exception being when they recorded the readings; for these sessions only consented children could participate. Teachers in the intervention group were offered a one-day workshop prior to the start of the shared-reading intervention and were coached once during each unit. The purpose of the workshop was to introduce the main components of the intervention and to offer the teachers opportunity to familiarize themselves with and discuss the books and support material that were used during the first thematic unit. During the coaching sessions, which took place in each teacher's classroom as a conversation between the teacher and one of two authors of this article, the teacher was invited to discuss challenges in intervention implementation.
Families received in total four books that were also read in preschool, one from each thematic unit, and were asked to share the book with their child as they normally would and in their preferred language. The majority of families returned tapes in which they used their first language (Nomat et al., 2023). The books sent home were fully or mostly wordless to allow the parents to use the family's language of choice. The parents received the books from the teachers together with a list of the words being targeted during reading in preschool (information offered in the family's first language and Norwegian with additional word illustrations). A stuffed animal went back and forth between the classrooms and the children's homes and was introduced to offer children support in sharing the books in the two settings (preschool and home). The books remained at home after the completion of the intervention.

The control classrooms
A business-as-usual condition was applied for the control classrooms.Teachers did not receive any guidance on how to share books with children, but received one to two books (seven all together) within each thematic unit as an appreciation of their collaboration.The books were not identical with the books that the intervention group received but were thematically linked.Parents in the control group received no books, but thank you notes, holiday cards, and certificates of participation, as did the intervention group parents.While intervention and control group teachers collaborated across classrooms in other curricular activities and projects, they were asked not to discuss the Extend program during the intervention year.
All books and material remained in the preschool post intervention, and teachers across intervention and control classrooms were told at the start of the intervention that they could share books and support material post intervention. Neither the intervention nor the control group received any further support in how to use the books or read interactively with children after the intervention was completed.

Intervention fidelity and attendance
Preschools. The teachers reported on the frequency of their book sharing and on child presence during reading. During the intervention year teachers engaged in 32.69 book readings (SD = 10.51) out of the required 45 (three readings per book across four thematic units, the first three units with four books and the final unit with three books). Teachers were asked to audiotape and return one reading of each book. They recorded on average 12.69 book readings (SD = 3.25) out of the required 15. Due to frequent child absences, reflecting the noncompulsory nature of preschool in Norway, children were exposed to about half of the shared readings (M = 24.99, SD = 11.08, maximum possible = 45).
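The fidelity figures above can be re-derived with simple arithmetic. The following is a minimal sketch that uses only the counts and means reported in this section; the variable and function names are illustrative and do not come from the study materials.

```python
# Books per thematic unit as described above: three units with four books, one with three.
books_per_unit = [4, 4, 4, 3]
readings_per_book = 3  # each book was to be read at least three times

total_books = sum(books_per_unit)
required_readings = total_books * readings_per_book


def share(mean_count: float, maximum: int) -> float:
    """Express a reported mean count as a proportion of the maximum possible."""
    return mean_count / maximum


# Reported means, taken from the text above.
teacher_reading_share = share(32.69, required_readings)  # readings carried out by teachers
recording_share = share(12.69, total_books)              # readings audiotaped and returned
child_exposure_share = share(24.99, required_readings)   # readings an average child attended

print(total_books, required_readings)
print(f"{teacher_reading_share:.2f} {recording_share:.2f} {child_exposure_share:.2f}")
```

Under these figures, the average child attended a little more than half of the maximum possible readings, which is consistent with the description of exposure given above.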

At home. Parents were asked to audiotape and return one reading of each book. The frequency of returned audiotaped shared readings at home is an indicator of home implementation. Parents returned on average 2.22 (SD = 1.52) tapes out of four requested; every fifth parent returned no tapes while half of the parents returned three or four tapes.

Overview of analytic plan to test sustained program effects
First, we applied structural equation modeling (SEM-modeling) to estimate the sustained program effects on development. We used multiple observed indicators of each targeted latent variable: for vocabulary four indicators (BPVS-2, VOC_RECEPTIVE, VOC_EXPRESSIVE, VOC_SPONTANEOUS USE), for grammar three indicators (we used each of the three TROG-2 sets as separate indicators), for perspective taking three indicators (MAIN_INTERNAL STATE COMPREHENSION, EMOTION_COMP, and HUG_PERSPECTIVE SHIFTING) and for narrative two indicators (HUG_NARRATIVE and MAIN_NARRATIVE). The model had an autoregressive structure, with the four latent delayed posttest assessments being regressed on corresponding latent immediate posttests, while these were similarly regressed on latent pretest assessments. To control for the age difference between the intervention and control group, the variable age in months was included as a predictor of observed variables at pre- and posttest. The model also included a dummy variable that represented the condition (intervention or control).
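The structural part of such an autoregressive model can be written compactly for a single skill domain. The sketch below is illustrative only: the notation is ours, the measurement model and the age covariate are omitted, and we assume that condition has a direct path to both posttest factors, which the text does not state explicitly.

```latex
% Assumed notation: \eta_{1}, \eta_{2}, \eta_{3} denote one latent skill at pretest,
% immediate posttest, and delayed posttest; x is the condition dummy (1 = intervention).
\begin{align}
\eta_{2} &= \alpha_{2} + \beta_{21}\,\eta_{1} + \gamma_{2}\,x + \zeta_{2} \\
\eta_{3} &= \alpha_{3} + \beta_{32}\,\eta_{2} + \gamma_{3}\,x + \zeta_{3} \\
\text{indirect effect via immediate posttest} &= \beta_{32}\,\gamma_{2} \\
\text{total effect on delayed posttest} &= \gamma_{3} + \beta_{32}\,\gamma_{2}
\end{align}
```

In this formulation, an effect that fades shows up as a total effect that is smaller than the indirect effect (a negative direct path), whereas an emerging effect shows up as a total effect that exceeds the indirect effect.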
Second, to further address whether intervention children developed differently post intervention, we applied second-order latent growth modeling. These approaches developed from a somewhat different perspective than the analyses of change over time using SEM-modeling. Specifically, they describe the children's development using their initial pre-intervention level and their developmental trajectories from that level. They also determine the variability across children in initial levels. This approach thus focuses on changes in variance and in mean values on latent variables over the three time points. In the first step, linear growth models were used to estimate intercept and slope parameters from the three points of measurement for each of the observed variables. In the second step, second-order latent variables were fitted separately to the estimated intercept and slope parameters. Three correlated second-order factors were estimated for the intercepts, and two correlated second-order factors were estimated for the slopes. We did not control for age in the final second-order latent growth model, because adding the age variable did not affect parameter estimates but worsened the model fit.
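The two steps described above can be summarized in equations. This is a sketch under assumed notation rather than the exact specification that was estimated: the time scores, the symbols, and the placement of the condition dummy are our assumptions for illustration.

```latex
% Assumed notation: i = child, j = observed measure, t = wave, d(j) = skill domain of measure j.
% Time scores \lambda_{t} = 0, 1, 2 are an illustrative choice for three waves.
\begin{align}
y_{ijt} &= \pi_{0ij} + \pi_{1ij}\,\lambda_{t} + \varepsilon_{ijt} \\
\pi_{0ij} &= \nu_{0j} + \lambda^{I}_{j}\, I_{i,d(j)} + u_{0ij} \\
\pi_{1ij} &= \nu_{1j} + \lambda^{S}_{j}\, S_{i,d(j)} + u_{1ij} \\
I_{id} &= \alpha^{I}_{d} + \gamma^{I}_{d}\, x_{i} + \zeta^{I}_{id} \\
S_{id} &= \alpha^{S}_{d} + \gamma^{S}_{d}\, x_{i} + \zeta^{S}_{id}
\end{align}
```

The first line is the linear growth model for each observed measure; the second and third lines let the measure-specific intercepts and slopes load on second-order intercept and slope factors for their domain; the last two lines regress those second-order factors on the condition dummy x.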
The models were estimated using the Mplus program (Muthén & Muthén, 1998-2017). Missing data were handled using full information maximum likelihood (FIML) estimation, drawing on all available data. The "complex" option to account for cluster effects was not applied, because the model estimated more parameters than the number of preschool classes participating in the study. However, the cluster effects were small, so the complex option had only little impact on the results.

Results
We first present descriptive statistics on the children's raw scores with means, standard deviations, and effect sizes for observed measures at pretest, immediate posttest, and delayed posttest (see Table 1). The table also includes the reliability measures at pretest (Cronbach's alpha).
Table 1 illustrates that both the intervention and control groups increased their raw scores from immediate posttest to delayed posttest on all observed variables. For the intervention group, the effect size between delayed posttest and immediate posttest was much smaller than the total effect size (between delayed posttest and pretest) for all observed variables. Also, in the control group the effect sizes between delayed posttest and immediate posttest were less than half of the total effect size between delayed posttest and pretest for most variables. The intervention group had a steeper gradient between pretest and immediate posttest than between immediate posttest and delayed posttest for all measures, while the growth appeared more linear for the control group.
The SEM-model allowed us to regress the latent variables on condition to estimate the delayed intervention effect, both the total effect of the intervention on delayed outcomes and the indirect effects via the immediate posttests. A dummy variable to represent the condition was included. Table 2 offers the standardized factor loadings for the measurement model for the latent variables at pretest, immediate posttest, and delayed posttest.
All indicators loaded significantly on their respective constructs at pretest, immediate posttest and delayed posttest. Indicators with few items, such as HUG_PERSPECTIVE SHIFTING and VOC_SPONTANEOUS USE, tended to have lower factor loadings, but they still contributed to the estimation of the latent variables. The model fitted the data well (χ² = 1208.143, df = 657, p < .001; RMSEA = 0.043, 90% CI [0.039, 0.046]; CFI = 0.930; TLI = 0.922; SRMR = 0.054). The total standardized intervention effects (assessed at delayed posttest) and standardized indirect effects (assessed at immediate posttest) are reported in Table 3.
For vocabulary, a highly significant total effect remained at delayed posttest. The t-value was larger for the total indirect effect, reflecting a declining effect at delayed posttest. A similar tendency was observed for perspective taking, with a significant total intervention effect at delayed posttest (p = .015) but with a reduced t-value compared to the total indirect effect. For grammar, Table 3 illustrates that the entire effect found at immediate posttest had vanished at delayed posttest. Finally, for narrative skills we detected an opposite pattern. While there were no immediate intervention effects on narrative skills, we observed an increased t-value approaching significance at delayed posttest. As 40% of the sample attended first grade at delayed posttest and thus experienced an environmental change (cf. the importance of environmental stability or change in the fade-out hypothesis), we checked whether the first-graders differed from the rest of the sample on delayed narrative outcomes. This analysis did not reveal significant findings on delayed narrative outcomes for either group (children in first grade versus children still attending preschool), and the model fit worsened.
Next, we tested the effects of the intervention using second-order growth modeling with three observation points to check potential differences in growth in the intervention and control group. The model we built included second-order intercept estimates for the three latent variables vocabulary, narrative and perspective taking that had demonstrated intervention effects in the SEM-modeling, and also second-order slope estimates for vocabulary and narrative. Condition was included as a dummy in the model. The model fit was acceptable: χ² = 872.166, df = 327, p < .001; RMSEA = 0.060, 90% CI [0.055, 0.065]; CFI = 0.900; TLI = 0.884; SRMR = 0.065.
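The reported RMSEA point estimates can be approximately reproduced from the χ² statistics and degrees of freedom given for the two models. The following is a minimal sketch using one common textbook formula; the helper name is hypothetical, and we assume the full sample of 464 children as N, whereas the estimation software may use a slightly different scaling or effective sample size.

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate of RMSEA from a model chi-square, its df, and the sample size."""
    # Misfit in excess of the degrees of freedom, floored at zero, scaled by df * (n - 1).
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


N = 464  # assumption: all children in the study contribute to the estimate

sem_rmsea = rmsea(1208.143, 657, N)    # autoregressive SEM
growth_rmsea = rmsea(872.166, 327, N)  # second-order latent growth model

print(f"{sem_rmsea:.3f} {growth_rmsea:.3f}")
```

Under these assumptions, the values round to the reported estimates of 0.043 and 0.060.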
Table 4 demonstrates that all observed variables in the model had significant intercept estimates. The estimates and t-values were particularly high for BPVS-2 and VOC_RECEPTIVE and somewhat lower, but still significant, for instruments with few items such as VOC_SPONTANEOUS USE and HUG_PERSPECTIVE SHIFTING. Table 4 also demonstrates that most variables in this model had significant t-values for slope, except for BPVS-2 (t = 1.42). The two narrative instruments and HUG_PERSPECTIVE SHIFTING had t-values approaching 3. The remaining variables all had t-values suggesting significance. Table 5 reports standardized factor loadings for the second-order latent intercepts vocabulary, narrative skills and perspective-taking skills, as well as factor loadings for the two second-order slopes (vocabulary and narrative) included in the model. All three second-order intercepts were significantly affected by the intervention (vocabulary: β = 0.13, t = 2.61, p = .009; narrative: β = 0.14, t = 2.68, p = .007; and perspective taking: β = 0.23, t = 4.42, p < .001). Second-order slopes were also affected by the intervention (vocabulary: β = 0.44, t = 5.64, p < .001; and narrative: β = 0.12, t = 2.01, p = .044).
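The correspondence between the reported t-values and p-values can be checked with a short calculation. The sketch below assumes that each t-value is an estimate divided by its standard error and is referred to a standard normal distribution with a two-sided test, as is common in SEM software output; the function name is illustrative.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a z statistic under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# t-values for the effect of condition, as reported in the text above.
reported_t = {
    "vocabulary intercept": 2.61,
    "narrative intercept": 2.68,
    "perspective-taking intercept": 4.42,
    "vocabulary slope": 5.64,
    "narrative slope": 2.01,
}

p_values = {name: two_sided_p(t) for name, t in reported_t.items()}

for name, p in p_values.items():
    print(f"{name}: p = {p:.3f}")
```

Under this assumption, the computed values agree with the reported p-values to three decimals, and the non-significant BPVS-2 slope (t = 1.42) corresponds to a p-value well above .05.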
A model that included second-order slopes for all three domains (vocabulary, narrative and perspective taking) did converge, but results were hard to interpret, probably due to multicollinearity, and the model fit did not improve. The analysis thus demonstrated a significant delayed effect of the intervention on the second-order vocabulary slope, which was explained by growth in the three instruments assessing targeted vocabulary, but not by growth in general receptive vocabulary. There was also a significant effect of the intervention on the second-order slope of narrative skills. The t-values for second-order slopes on condition for vocabulary and narrative in the growth model were marginally stronger than the t-values for total effects in the SEM-modeling using autoregressive techniques with all four latent variables included. For narrative skills the increase in t-value reflected a significant slope effect. The two different analytical tests thus led to similar findings, offering stronger support for the conclusions than either of them alone.

Discussion
The study of longer-term intervention effects revealed the following main findings: (1) At delayed posttest the immediate posttest effect on grammar had disappeared. We found an almost flat trajectory from immediate posttest to delayed posttest for the intervention group, while the control group demonstrated a steadier development and approached the grammar skills of the intervention group at delayed posttest. This suggests that grammar is a domain that all children will eventually master.
(2) The immediate posttest effect on vocabulary and perspective taking was sustained at delayed posttest, but the total effect of participating in the intervention was weaker, suggesting some catch-up by the control participants. (3) For narrative skills, which had shown no intervention effects at immediate posttest, the total effect approached significance at delayed posttest as assessed with autoregressive modeling. Second-order latent growth modeling confirmed a significant effect of the intervention on narrative skills at delayed posttest. The existence of a delayed narrative effect should be asserted with caution as it was identified in only one of the two statistical tests we applied.
As noted in the introductory section, there is currently no strong theoretical framework for understanding maintenance or fade-out of advantages achieved through interventions. Three hypotheses were introduced: one proposed fade-out when the support of a time-limited intervention is no longer present, another suggested persistent effects grounded in the expectation that acquired skills may be predictive of later skills. A third hypothesis claimed that persistence and fade-out depend on the specific skill type being examined. We believe that our results lend partial support to all three hypotheses.

Hypothesized effects and main findings
The fade-out hypothesis depends in part on the extent to which the environment subsequent to the intervention no longer includes affordances in line with the intervention (for discussion of findings in support of this hypothesis, see for example Bailey et al., 2017, 2020; Barnett, 2011; Noble et al., 2019).
In the current study, we examined the longer-term effects of a time-limited shared-reading preschool intervention with support conditions that were not present at the time of the follow-up study. The identified fade-out effect (grammar) and diminishing effects (vocabulary and perspective taking) lend support to the hypothesis predicting fade-out when the intervention-induced environmental support is no longer available. According to the persistent effects hypothesis, intervention-induced advantages should be expected to remain in follow-up examinations (for discussion, see for example Bailey et al., 2017; Cunha & Heckman, 2007). New language skills are affordances that may create possibilities for participating in new forms of interaction. New words learned may make it possible to join new types of conversations with implications for further word learning. More sophisticated grammatical skills help children perceive nuances in conversations and express their own intentions in more detail. New skills in perspective taking may increase comprehension of others' statements and promote sensitivity to their points of view. The persistent effects hypothesis finds less support in the results, though we identified a version of persistence for narrative skills close to what is sometimes called sleeper effects: effects that were not immediately identified post-intervention appeared in the longer-term assessment.
Finally, the data offer some support for the skill-type matters hypothesis, distinguishing between constrained and non-constrained skills (McCormick, Weiland, et al., 2021; Paris, 2005). No skill types assessed in the study can easily be acquired within a relatively short time span as is typical of constrained skills (decoding skills being the default example). However, on a continuum from unconstrained to constrained skills, grammar can be considered closer to the constrained end of the continuum than the other skills assessed; syntactical comprehension was assessed by three sets of grammatical constructions typically mastered during the preschool years. Conversely, narrative production can be considered an example of an unconstrained skill. We thus believe that the results also offer some support to the skill-type matters hypothesis.
Below we review whether the results confirm or diverge from other longer-term studies of the targeted four skills and then examine the results in light of methodological design characteristics, environmental characteristics, intervention program characteristics and sample characteristics before we identify limitations.

Grammar
To the best of our knowledge, no previous studies have reported significant delayed grammar effects in studies of language intervention programs. The vanishing grammar effects at the seven-month follow-up in the present study coincide with findings of Bianco et al. (2010), Hagen et al. (2017), and Rodge et al. (2016), who used both similar and different measures of grammatical skills. Of particular relevance to the current study is the Rodge et al. study that found no seven-month delayed grammar effects in a sample of second-language preschoolers.

Vocabulary
Several studies found that immediate post-intervention vocabulary effects disappeared only a few months after an intervention was completed, whether the sample included DLLs (Bierman et al., 2008; Sim et al., 2014) or monolinguals only (G. Whitehurst et al., 1988).
Other studies with diverse samples, such as Fricke et al. (2013) and Hagen et al. (2017) studying children with language difficulties or Rodge et al. (2016) studying DLLs, did report sustained vocabulary effects. In the present study we identify tendencies for vocabulary intervention effects to fade. The estimated immediate post-intervention effect, explained fully by intervention-targeted vocabulary skills, was however sufficiently strong to remain significant seven months later.

Perspective taking
We similarly found a sustained significant effect on perspective taking, but with a smaller effect than at immediate posttest. Children who received the intervention were on a faster developmental trajectory during the intervention year, but after the completion of the intervention control group children started catching up on perspective taking, assessed as skills in identifying emotions, understanding internal states, and handling shifting protagonist perspectives while reading a book. Though several previous studies of diverse samples including DLLs have suggested that socio-emotional skills may be more resistant to fade-out than language and other more academic skills (Bierman et al., 2014; Love et al., 2013; Nix et al., 2016), fade-out would likely have emerged in the present study had the children been observed over a longer time period. Ultimately the skills assessed are ones which most children master in the early elementary years.

Narrative
Narrative skills demonstrated a different pattern, with close to significant (SEM-modeling) or significant (second-order latent growth modeling) differences between intervention and control group at follow-up assessment, while no effects had been found for narrative skills at immediate posttest. Similar findings of longer-term effects on narrative skills (with no immediate effects) have been reported by Peterson et al. (in a storytelling project) and recently by Reese and colleagues (2023) in monolingual samples participating in interventions at home. Reese et al. tested narrative skills using a narrative comprehension test and a productive retelling task, both addressing children's inclusion of relevant content components, while the present study used two narrative production tasks, also assessing narrative content inclusion. Despite the differences in the assessment formats, the Reese et al. and the present study demonstrate longer-term effects on narrative skills, in both cases without immediate effects. There is little evidence of emerging delayed effects/sleeper effects in the literature, but it is understandable that narrative skills might show such effects. They likely take considerable time to develop in ways that are measurable with available instruments. It should be noted that in the Reese et al. study the narrative developmental trajectory may also have shown effects of sustained changes in the ways parents interacted with their children after the intervention was concluded.

Methodological design characteristics
Intervention impacts with smaller effect sizes tend to fade faster (Barnett, 2011). We speculate that the lack of a total effect for grammar reflects the same decline that was identified for vocabulary and perspective taking; for the latter two latent variables, however, the intervention effects at immediate posttest were estimated to be higher than for grammar. Except for narrative skills, our findings confirm previous reports that test scores for intervention and control groups converge as time passes. Yoshikawa et al. (2016) concluded, for example, that the test score advantage of children exposed to preschool over children who were not exposed diminished every year, and that the decline was steepest in the first years.
An important factor to keep in mind when understanding fade-out in longer-term experimental studies is what happens to the counterfactual, that is, the control group (for discussion, see Bailey et al., 2020; Barnett, 2011), as well as whether there are any support conditions remaining for the intervention group or emerging for the control group. Books and support materials related to the intervention remained in the preschools after intervention completion and may have spurred follow-up use by the intervention teachers, leading to maintenance of effects. Similarly, intervention teachers were allowed to share support material and books with control group teachers after the completion of the intervention, which might have led to catch-up effects.
For the purpose of comparability, the same instruments were used across measurement waves. This is basically a strength, as many assessments of longer-term interventions need to switch to assessment instruments that are developmentally more appropriate, but that may make the study of developmental trajectories more challenging. In none of the test formats, whether receptive or expressive, did children receive information as to whether their responses were correct or not. However, in the two narrative tests, in which children were exposed to series of pictures and asked to build a narrative, we cannot exclude a test-retest effect. This is a potential weakness in the assessment of immediate effects, but even more so when the children are exposed to the test material a third time and are older and potentially more able to remember the test instruments. Both picture materials portrayed a dramatic event for which there was a final happy ending. Recognizing the picture materials and knowing the narrative ending from previous assessments might have encouraged the children to make "short-cuts" in their narration, particularly the third time they narrated. Children, however, were credited based on several different narrative components. They may have been less motivated to include components that did not contribute to the final narrative conclusion (the happy ending), potentially explaining why both groups demonstrated less growth in the raw scores of the observed variables between waves 2 and 3 than between the first and second waves. Though repetition should not affect the two conditions differently, it could impact estimates of growth in each group.
Even though the time span between waves 1 and 2 (pre- and posttest) was only about half a month longer than between waves 2 and 3, the gains between waves 2 and 3 were lower. McCormick, Pralica, et al. (2021) demonstrated that children's test score growth tended to slow down during summer, especially DLLs' second language scores. Between immediate posttest and delayed posttest children had summer vacations, increasing the relative exposure to the family language at home and reducing exposure to the language in which they were assessed in the present study. Some children visited the parents' country of origin or received extended visits from family during their summer vacations, offering more exposure to and development of the language spoken at home. The timing of the follow-up assessments may have reduced estimates of child second-language skill at the delayed posttest, but the potential summer effect should be equivalent across groups.

Language-learning environment characteristics
It is important to consider environmental sustainability in follow-up effect studies in real-life experiments. After the conclusion of the intervention, many children in the present study experienced environmental transitions. Four out of ten children left preschool to attend first grade, located in a primary school environment with new teachers and a new curriculum which during the fall term of first grade emphasizes code-based literacy skills (not taught in preschool classrooms in Norway). Though the rest of the children (the three- and four-year-olds at the start of the intervention) remained in preschool settings, the turnover rate among pedagogical leaders in Oslo is consistently high (36% according to recent calculations). In some of the city districts participating in the present study, the turnover rate is even higher (Oxford research report, 2022/6). Thus, a number of children at follow-up assessment were taught by teachers who had not received any support in book-sharing, but this was likely to happen for children in both conditions.
The intervention children received four books for home reading in the families' preferred language. Some studies suggest that intervention-induced changes in parental interaction with children remain after the intervention is concluded (Bierman et al., 2021; Blom-Hoffman et al., 2007; Dowdall et al., 2021; Levin & Aram, 2012; Marshall & Reese, 2022; Reese et al., 2020). While the assessment of delayed effects was limited to second-language skills in the present study, it has previously been demonstrated that growth in L1 targeted vocabulary skills during the intervention marginally mediated growth in equivalent L2 skills (Grøver et al., 2020). We thus cannot exclude that sustained intervention-induced changes in home reading may have impacted the described results.

Program and sample characteristics
The Extend program was intentionally constructed as a low-intensity shared-reading program. Low-intensity interventions with smaller effect sizes have demonstrated faster fade-out, while more robust programs likely show more lasting effects (Barnett, 2011).
Though most of the sample had started in preschool before age three, they had been much less exposed to Norwegian than native speakers. It is possible that the DLL status of the sample made them more responsive to the affordances of the book-sharing and resulted in more immediate post-intervention program effects than could be expected in a sample of native speakers receiving similar interventions. On the other hand, as many children were less exposed to Norwegian during their summer vacation, it is also possible that the children's DLL status increased their summer slowdown and reduced follow-up effects compared to native speakers (see discussion in McCormick, Pralica, et al., 2021). Though the results from the current study converge with other follow-up studies of young native speakers, we should be aware that the specific conditions DLLs experience might have impacted delayed intervention effects.

Limitations
There are several limitations to this study that warrant note. The intervention included a preschool component taking place in Norwegian as well as a smaller component taking place in the families' preferred language. We do not have a methodological design that allows us to determine the separate delayed effects on second-language learning of the home vs. the classroom components. Also, we do not have data to analyze whether a program with higher dosage, longer duration or a more manualized and structured approach would have shown longer-lasting impacts (for discussion, see Dowdall et al., 2020; Language and Reading Research Consortium, 2016), and we do not have information on the extent to which the children post-intervention experienced a language learning environment likely to sustain the intervention. The instruments we used to assess growth might not have been sufficiently sensitive to minor changes in developmental trajectories that could still be meaningful, and we cannot ignore that any test-retest effects may have emerged differentially across various observed variables. We believe, however, that the randomized design of the study reduces the impact of these limitations on our longer-term effect estimates.
The study sampled participants from language-minority groups living in a small country. As the possibility of identifying participants by intersecting demographic data could not be completely ruled out, and to guarantee the families' privacy, we did not ask for parental consent to publish the data.

Conclusion
The study adds to our limited knowledge of how DLLs respond to shared-reading interventions in a longer-term perspective, acknowledging both their greater potential responsiveness to environmental support and fade-out risks when support is withdrawn. We found some support for the fade-out hypothesis, the persistent effects hypothesis, and the skill-type matters hypothesis. Future research should further examine fade-out or persistence of intervention effects along the continuum from constrained to unconstrained skills. Some applied implications can be drawn. First, shared-reading intervention programs should carefully attend to the skill types addressed. Future planning of shared-reading interventions addressing DLLs should consider supporting those essential unconstrained skills that are most dependent on environmental support. Second, the study demonstrates that an inoculation metaphor does not apply to the effects of a time-limited shared-reading program; several intervention effects showed signs of fading after only seven months. Thus, when developing shared-reading interventions it is important to consider how environments enriched through an intervention may remain language-promoting post-intervention. Indeed, future designs of shared-reading interventions addressing DLLs in early education should emphasize not only how to sustain intervention-induced language skills, but how to sustain the environmental conditions that promoted growth of those skills.

Table 1.
Descriptive statistics at pretest, immediate posttest and delayed posttest: means, standard deviations, sample sizes, effect sizes and reliabilities.

Table 3.
Standardized total and total indirect effects of treatment on latent variables.

Table 4.
Standardized factor loadings at pretest, immediate posttest and delayed posttest for intercept and slope for observed variables.

Table 5.
Standardized factor loadings for second-order intercept and slope for observed variables.