Agreement attraction in comprehension: do active dependencies and distractor position play a role?

ABSTRACT
Across four eye-tracking studies and one self-paced reading study, we test whether attraction in subject-verb agreement is affected by (a) the relative linear positions of target and distractor, and (b) the active dependency status of the distractor. We find an effect of relative position, with greater attraction in retro-active interference configurations, where the distractor is linearly closer to the critical verb (Subject…Distractor…V), than in pro-active interference, where it is more distant (Distractor…Subject…V). However, within pro-active interference configurations, attraction was not affected by the active dependency status of the distractor: attraction effects were similarly small whether or not the distractor was waiting to complete an upcoming dependency at the critical verb, with Bayes Factor analyses showing evidence in favour of a null effect of active dependency status. We discuss these findings in terms of the decay of activation, and whether such decay is affected by maintenance of features in memory.


Introduction
Agreement attraction in comprehension occurs when the computation of a dependency is affected by items in memory that partially match features that are required by the agreement relation. For example, (1) is easier to process than (2), even though both of these sentences are ungrammatical (Wagers et al., 2009):
(1) *The musicians who the reviewer praise so highly will probably win a Grammy.
(2) *The musician who the reviewer praise so highly will probably win a Grammy.
According to the cue-based retrieval model (Lewis & Vasishth, 2005), the reason for the relatively facilitated processing of (1) is the partial match of features between the plural distractor the musicians and the plural requirement of the verb praise: the match boosts the activation level of the distractor, resulting in it being mis-retrieved on a proportion of trials. Since the overall retrieval process ends when one item has been retrieved, this situation leads to relatively short retrieval times associated with the processing of the verb, when averaged across trials. In contrast, in (2), neither the musician nor the reviewer matches the plural feature required by the verb, leading to a prediction of greater processing difficulty due to the low activation level of the mismatching agreement target the reviewer, and a relatively low probability of misretrieval of the distractor, and thus longer retrieval times. This phenomenon has been supported by a large number of studies that have found similar processing patterns for ungrammatical sentences that are analogous to (1) and (2) (e.g. Dillon et al., 2013; Jäger et al., 2020; Lago et al., 2015; Pearlmutter et al., 1999).
As should be clear from the above discussion, the cue-based retrieval model assumes an important role for features, such as grammatical number, and features are often assumed to correspond directly to retrieval cues. Features have been the focus of several previous studies on attraction in comprehension; such studies have asked, for example, how features are combined (Parker, 2019), or whether different types of features are used for different types of dependencies (Dillon et al., 2013). Another research question, which has arguably been less widely studied, is the role of sentence structure in modulating attraction. This question has been more frequently considered in studies of language production. For example, Franck et al. (2006) argued that attraction effects are modulated by structural relations like c-command. [...] may increase the level of attraction. In the complement clause example (4), by contrast, there is no direct dependency between the critical verb (sortent) and the distractor (la prisonnière), and thus no reactivation would be triggered by the verb.
The study that we report in this paper will also examine how agreement attraction is affected by sentence structure, with the specific aim of investigating the role of maintenance of relevant information in working memory on attraction. In the cue-based retrieval model as presented by Lewis and Vasishth (2005), the degree of attraction is affected by the relative activation of target and distractor phrases, and activation levels are assumed to decay over time, relative to the point where a given phrase is first encountered, or reactivated, in the input string. One straightforward prediction of this is that attraction effects should be greater when the attractor linearly intervenes between the target and the verb (retro-active interference) relative to when the target linearly intervenes between the attractor and the verb (pro-active interference). For example, consider (5) and (6), which show retro- and pro-active interference respectively:
(5) Retro-active interference: *The nurse who the widows relied on definitely were reluctant to work long shifts.
(6) Pro-active interference: *The widows said that the nurse most definitely were reluctant to work long shifts.
In each of the examples above, the nurse is the target of the agreement dependency, while the widows is the distractor. In (5), the distractor is linearly closer to the verb were than the target, while this is reversed in (6). Other things being equal, the activation level of the distractor, and thus the degree of attraction, should therefore be higher in (5) than in (6). Note, incidentally, that (6) does not involve a direct dependency between the critical verb were and the distractor the widows, so no reactivation of the distractor is involved before the critical retrieval event, and thus, levels of attraction would be expected to be relatively weak, as in the complement clause examples examined by Franck et al. (2015). In addition, in a recent development of the cue-based retrieval account, Engelmann et al. (2019) have claimed that baseline activation levels can be affected by information structure, with a boost to activation for items that are typically discourse topics, such as main clause subjects. If this is the case, then this would affect the distractor (the widows) in (6), as this phrase is a main clause subject. Therefore, the Engelmann et al. (2019) version of the cue-based retrieval model predicts greater levels of attraction for these pro-active interference conditions relative to the original Lewis and Vasishth (2005) model. These predictions are consistent with previous results. For example, while Kwon and Sturt (2019) failed to find clear differences in the levels of attraction between sentences with retro- and pro-active interference, this lack of clear difference could have been because their target sentences differed in the structural role of a distractor. On the other hand, the authors found a stronger attraction effect when a distractor was a main clause subject than when it was a dative object, and this is consistent with the claims of Engelmann et al. (2019).
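The decay-based prediction discussed above (higher distractor activation in (5) than in (6)) can be illustrated with a minimal sketch. It assumes the ACT-R base-level activation equation, B = ln(Σ t_j^(-d)), with a decay parameter of d = 0.5; the elapsed times are hypothetical values chosen only for illustration, and the sketch ignores cue matching, noise, and any topicality boost.

```python
import math

def base_level_activation(times_since_access, decay=0.5):
    """ACT-R-style base-level activation: ln of the summed t^(-d)
    over all past accesses (encounters or reactivations) of a phrase."""
    return math.log(sum(t ** -decay for t in times_since_access))

# Hypothetical elapsed times (seconds) between the distractor's last
# access and the retrieval at the verb "were". These are assumptions,
# not measurements from the experiments.
retro_distractor = base_level_activation([1.5])  # (5): distractor close to verb
pro_distractor = base_level_activation([3.0])    # (6): distractor far from verb

# A reactivation shortly before the verb adds a recent access and
# therefore raises activation relative to the non-reactivated case.
pro_reactivated = base_level_activation([3.0, 0.5])

print(f"retro-active distractor activation: {retro_distractor:.3f}")
print(f"pro-active distractor activation:   {pro_distractor:.3f}")
print(f"pro-active, reactivated:            {pro_reactivated:.3f}")
```

Under these assumptions the distractor is more active at the verb in the retro-active configuration, which is the source of the predicted difference in attraction.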
In the experiments reported below, given that attraction effects for pro-active interference (as in (6)) are expected to be relatively low (modulo possible effects of topicality, as we keep the distractor as the main clause subject), we ask specifically whether the active dependency status of a distractor has an effect on its activation level, as measured by the attraction effect.
Compare (6) (repeated below) with (7):
(6) Non-active dependency: *The widows said that the nurse most definitely were reluctant to work long shifts.
(7) Active dependency: *The widows who said that the nurse most definitely were reluctant to work long shifts had become quite annoyed.
Sentences (6) and (7) are both ungrammatical (due to the number mismatch between were and the nurse) and include a plural distractor the widows occurring in the main clause subject position. However, the two sentences differ. In (6), the distractor the widows forms a dependency with the immediately following word said. Thereafter, the distractor does not participate in dependency formation at the point where were is processed, nor does it participate in any other dependencies throughout the sentence. Thus, we would expect the baseline activation of this phrase to be relatively low, due to decay, at the point where were is processed. In contrast, in (7), the distractor is modified by a relative clause, and later forms a dependency with the main auxiliary verb had (henceforth: active dependency). We assume that the use of the relative clause in (7) therefore leads to the requirement to maintain relevant features of the distractor in working memory, while the expectation of the main verb is active. If this maintenance requires the features of the distractor to remain activated, the rate of decay may be lower. This would lead to a prediction of greater attraction in (7) than in (6). Note, as discussed above, that unlike in the relative clause examples tested by Franck et al. (2015) (see (3) above), the sentence structure of our active dependency condition does not allow for the reactivation of the distractor at the critical verb were, because there is no direct dependency between these two elements. The crucial question is whether or not a dependency involving the distractor is waiting to be resolved. We should note that cue-based retrieval models, and other related accounts of content-addressable memory, typically do not incorporate a storage component, and thus, the effect of maintenance on decay, if supported by the evidence, would need to be a new addition to such models. However, as we discuss below and in the general discussion, there is empirical evidence suggesting that maintaining unresolved dependencies has a measurable cost, resulting in slower processing times and a distinct ERP signature. In our view, it is reasonable to test whether such maintenance also affects attraction.
As we have seen in the discussion above, in the cue-based retrieval model, memory retrieval plays a crucial role in explaining attraction effects. However, cue-based retrieval is not the only theoretical account of attraction, and other theories instead highlight the importance of representation distortion, whereby attraction effects arise as a result of previously encoded items in memory becoming corrupted (see Nairne, 1990; Oberauer & Kliegl, 2006, for domain-general memory-based implementations of this idea). This can occur either through similarity-based encoding interference affecting target and distractor noun phrases that share features (e.g. Gordon et al., 2001; Jäger et al., 2015; Laurinavichyute et al., 2017; Smith et al., 2021; Villata & Franck, 2020; Villata et al., 2018), or through feature percolation, where the number feature of a distractor can distort or overwrite the analogous feature of a target (Eberhard et al., 2005; Yadav et al., 2023).
Above, we discussed potential effects of active dependencies solely in terms of the cue-based retrieval model. However, we believe that similar predictions could be applied to other models, including those based on representation distortion, given appropriate assumptions. The key point is that active dependencies should increase the size of attraction effects in any model as long as the following conditions hold: (a) an open active dependency increases the activation of a distractor with which it is associated; and (b) greater relative activation of the distractor leads to larger attraction effects, other things being equal. For example, Yadav et al. (2023) implement a number of models of attraction in a Bayesian framework, including a feature percolation model, which is derived from earlier proposals from the production literature (e.g. Bock & Eberhard, 1993). In such a model, the number feature of a plural distractor can overwrite the representation of the target subject noun phrase, so that a target that is singular in the input becomes represented as plural. Attraction in ungrammatical sentences like (1) (repeated below) occurs because the plural feature on the musicians leads to the target the reviewer being represented as plural, therefore matching the features of the verb praise, resulting in the sentence being perceived as grammatical:
(1) *The musicians who the reviewer praise so highly will probably win a Grammy.
In contrast, no such facilitation is predicted for (2) (repeated below), because the distractor is singular, so that, even if feature percolation occurs, the target will still be represented as singular, therefore mismatching the required features of the verb:
(2) *The musician who the reviewer praise so highly will probably win a Grammy.
It is possible to envisage a model where the probability of feature percolation is correlated with the activation of the distractor. If this is the case, and, crucially, if active dependencies also increase activation, as we speculated in the discussion above, then active dependencies would also be expected to be associated with larger attraction effects in such a percolation model, even if such a model does not rely on cue-based retrieval.
Across the four eye-tracking experiments and one self-paced reading experiment reported below, we will test for attraction effects in a number of different configurations, crucially including pro-active interference configurations where the distractor either participates in an active dependency (7) or does not (6). For the non-active dependency sentences (6), attraction effects are expected to be relatively small, due to the low levels of activation of the distractor in this pro-active interference configuration, for reasons explained above. The question then is whether the active dependency in (7) leads to a larger attraction effect over this baseline.
To our knowledge, only a small number of previous studies have examined the effect of maintenance on number agreement, and we briefly discuss two of those studies below, before turning to describe our own studies.
Since the active dependency that we will test in the studies below is a subject-verb dependency, it is important to consider prior evidence for storage in such dependencies. A study by Ristic et al. (2022) showed such evidence for subject-verb dependencies. In their Experiment 2, the authors tested sentences like (8), in which an adverbial clause (e.g. as soon as a professor finishes the class) linearly intervenes (a-d) or does not intervene (e-h) in a subject-verb dependency. Thus, in the intervening cases, the subject-verb dependency is active during the processing of the adverbial clause, while in the non-intervening cases it is not.
(8) Intervening conditions:
a. That student, as soon as a professor finishes the class, leaves the classroom.
b. Those students, as soon as a professor finishes the class, leave the classroom.
c. Those students, as soon as professors finish the class, leave the classroom.
d. That student, as soon as professors finish the class, leaves the classroom.
Non-intervening conditions:
e. I watched that student, and as soon as a professor finishes the class, she leaves the classroom.
f. I watched those students, and as soon as a professor finishes the class, they leave the classroom.
g. I watched those students, and as soon as professors finish the class, they leave the classroom.
h. I watched that student, and as soon as professors finish the class, she leaves the classroom.
The authors found that first-pass times and go-past times at leaves were longer in the intervening conditions than in the non-intervening conditions, suggesting that there is a processing cost for maintaining the active subject-verb dependency before the main clause verb is received in the input. As mentioned above, this is consistent with our assumption that the subject-verb dependency is actively maintained during the processing of intervening material, assuming that the elevated reading times are due to this cost. Also relevant to our studies is the fact that, in addition to intervention, Ristic et al. (2022) manipulated whether the main clause and embedded clause subjects matched in number (8a,c,e,g) or mismatched (8b,d,f,h). They did not find clear evidence for an interaction between number matching and intervention. This aspect of Ristic et al.'s results may be considered in the light of the research question of the present paper, which asks whether number attraction is greater when the relevant retrieval is launched during the maintenance of an active dependency (as in Ristic et al.'s intervening conditions) than when there is no such active dependency (as in the non-intervening conditions). The lack of such an interaction in Ristic et al.'s study may be seen as evidence against such an effect of active dependency maintenance. However, note that in all of the conditions in (8), the dependency between professor(s) and leave(s) is grammatical, and, as we will discuss below, interference effects involving grammatical subject-verb agreement dependencies are quite variable across studies, in contrast to the more consistent and larger effects in ungrammatical sentences (Jäger et al., 2017, 2020; Yadav et al., 2023). Thus, any interference effect in this part of the sentence may have been difficult to detect. Therefore, the lack of an interaction in Ristic et al.'s (2022) study is not necessarily evidence against the idea that attraction is modulated by active dependency maintenance.
In a series of acceptability judgment and self-paced reading studies, Kim et al. (2020) investigated the role of both re-activation and maintenance in English number agreement attraction. In contrast to Ristic et al. (2022), these authors did include ungrammatical conditions.
In their Experiment 3b, Kim et al. (2020) presented participants with sentences like (9) in a self-paced reading task:
(9) a. Active filler: grammatical (singular/plural distractor): Which mistake in the program/programs that will be disastrous for the company certainly is harmful for everyone involved?
b. Active filler: ungrammatical (singular/plural distractor): *Which mistake in the program/programs that will be disastrous for the company certainly are harmful for everyone involved?
c. Re-activated filler: grammatical (singular/plural distractor): Which mistake in the program/programs will be disastrous for the company and certainly is harmful for everyone involved?
d. Re-activated filler: ungrammatical (singular/plural distractor): *Which mistake in the program/programs will be disastrous for the company and certainly are harmful for everyone involved?
In each of the conditions (9a-d), a dependency is formed between which mistake and the predicate is/are harmful. Thus, according to cue-based retrieval, a retrieval is launched at is/are, involving a retrieval cue for number. In the active filler conditions, the authors assume that the filler which mistake in the program(s) has been maintained in memory over the intervening words without participating in further dependencies, because the phrase requires a main verb, which does not appear in the input until is/are harmful. In contrast, in the re-activated filler conditions, the filler is released from maintenance at will, allowing the subject-verb dependency to be formed, and is subsequently re-activated at and. The reason for the reactivation is that the across-the-board constraint requires which mistake to be coindexed with a gap in each of the two conjuncts. The word and then provides bottom-up evidence for the second conjunct, thus leading to reactivation. Kim et al. (2020) also manipulated whether a distractor (program/programs) matched or mismatched the number retrieval cue, and whether the overall sentence was grammatical or ungrammatical. In the self-paced reading data, Kim et al. (2020) found the expected number attraction effect, i.e. a smaller ungrammaticality cost in conditions where the distractor matched the retrieval cue than where it mismatched, resulting in a two-way interaction of local noun with grammaticality. At the word position immediately following the critical verb (i.e. at the word harmful in 9a-d), this two-way interaction was significant only for the active filler conditions, but not for the reactivated filler conditions, resulting in a three-way interaction. However, at the next word (for in 9a-d), the two-way interaction was not modulated by a significant three-way interaction. Thus, the attraction effect appeared earlier in the active filler conditions than in the reactivated filler conditions. This difference in timing of the attraction effect might be due to differences in the memory representation of the filler phrase which mistake in the program(s). Since the sentence structure of the active filler requires this phrase to be maintained over a long span of the sentence, this could increase the availability of the contents of the phrase, crucially including the distractor the program(s), leading to larger (or earlier) attraction effects. In contrast, for the reactivated filler conditions, the filler phrase has been released from maintenance in the first conjunct, allowing decay of its contents in memory before being reactivated at and, thus leading to smaller (or later) attraction effects.
The goals of the experiments that we report below are similar in some ways to those of Kim et al. (2020). However, there are some major differences. While Kim et al. (2020) concentrated on retro-active interference, we examine both retro- and pro-active interference. A second difference is that our active-dependency manipulation is focussed on the distractor, while keeping the active status of the retrieval target constant. Specifically, in (7) (but not in (6)) above, we assume that the distractor (the widows) is maintained in memory, and that maintenance is active at the point where were is processed. At the same time, the retrieval target (the nurse) does not differ in active status between (6) and (7). In contrast, in the study of Kim et al. (2020), the retrieval target (which mistake) and the distractor (the programs) form part of the same phrase, and this phrase either participates in an active dependency (9a,b) or is reactivated (9c,d). This difference means that our manipulation can be expected to affect the activation of the distractor specifically, while that of Kim et al. (2020) may have affected the activation of both the distractor and the retrieval target, both in terms of maintenance and reactivation. This difference simplifies the interpretation of any effects of active dependency status in our experiments, relative to those of Kim et al. (2020). Arguably, it also provides a clearer test of the modulating effect of maintenance on attraction, particularly as attraction in the cue-based retrieval model is linked to the activation of the distractor, as a factor in its misretrieval.
In the four eye-tracking experiments and one self-paced reading experiment reported below, we test the prediction that attraction in subject-verb number agreement is modulated by active dependencies. We also compare attraction effects between pro-active and retro-active interference. In Experiment 1, we obtain a baseline estimate for attraction effects in retro-active interference using sentences similar to (5). In Experiments 2 and 3, we test pro-active interference using non-active dependencies (as in (6)) and active dependencies (as in (7)) respectively. In Experiment 4, we replicate the attraction effect for retro-active interference, with a small change from the conditions of Experiment 1, namely, having the distractor in object position instead of subject position. The stimuli for all four experiments were very similar, with identical critical regions, to allow for cross-experiment comparisons. To test the effect of active dependency status, we compare the attraction effect in Experiment 2 with that of Experiment 3. To test the effect of retro-active vs. pro-active interference, we compare Experiments 1 and 4 on the one hand, with Experiments 2 and 3 on the other. Finally, in Experiment 5, we report a self-paced reading experiment that combines the conditions of Experiments 2 and 3 in a single within-participant design, in order to provide a further test of the effect of active dependencies on pro-active interference.

Experiment 1
Participants
Thirty-nine native speakers of English from the University of Edinburgh community were paid to participate in the experiment.

Stimuli
Forty-eight stimuli were constructed on the model of (10a-c):
(10) a. Ungrammatical; Matching distractor: *The nurse who the widows relied on definitely were reluctant to work long shifts.
b. Ungrammatical; Mismatching distractor: *The nurse who the widow relied on definitely were reluctant to work long shifts.
c. Grammatical: The nurses who the widow relied on definitely were reluctant to work long shifts.
The stimuli always used the plural auxiliary verb were. We manipulated grammaticality by changing the number of the head noun (nurse(s)), and manipulated attraction by changing the number of the distractor (widow(s)), leading to a design with one grammatical baseline condition (10c) and two ungrammatical conditions, with a distractor that either matched (10a) or mismatched (10b) the number of the verb.
We adapted the stimuli from those of Dillon et al. (2013), by making the following changes: (a) we adapted the content of some of the sentences to make them suitable for readers of British English (the original study used North American participants); (b) we used object relative clauses instead of subject relative clauses; and (c) we used only three conditions instead of Dillon et al.'s eight. Of the three changes, (b) and (c) were designed to maximise the chances of finding an attraction effect. Using object relative clauses means that the distractor (widow(s)) is a subject, and subjecthood presumably corresponds to a retrieval cue for verb-subject agreement. Using only three conditions with 48 items yields a greater number of items per condition (i.e. 16) than Dillon et al. (2013).
Note that our design is set up to examine attraction effects specifically in ungrammatical sentences, and the lack of a second grammatical condition precludes the possibility of looking for attraction effects among grammatical sentences. However, attraction effects among grammatical conditions are variable, with some studies showing a reading time cost after the verb when a distractor matched the number retrieval cue (Franck et al., 2015; Nicenboim et al., 2018), and other studies showing the opposite pattern, with a reading time cost in the context of a mismatching distractor (Acuña-Fariña et al., 2014), while still other studies have shown null differences among grammatical conditions (Dillon et al., 2013; Lago et al., 2015; Wagers et al., 2009). In contrast, attraction effects among ungrammatical conditions appear to be larger and to have a more consistent direction (Dillon et al., 2013; Jäger et al., 2017; Lago et al., 2015; Wagers et al., 2009; see also Yadav et al., 2023, for further discussion). Moreover, we did not have any theoretical predictions that applied specifically to attraction in grammatical sentences. We therefore took the decision to use three conditions instead of four, as a way to maximise the chance of finding a reliable attraction effect. We will return to discuss our decision to use ungrammatical sentences in the General Discussion.
Given that we are measuring attraction only in ungrammatical sentences, this effect is measured by comparing the reading times of the two ungrammatical conditions (see below for an explanation of this contrast, and its predicted outcome). This raises the question of whether the grammatical condition is in fact needed. In our view, number attraction in ungrammatical sentences equates to the reduction of the cost of ungrammatical agreement. Therefore, it seems desirable to show both that ungrammatical sentences have a cost, and also that this cost is reduced in the context of a distractor matching the retrieval cue. Given this, the grammatical condition is needed as a baseline to demonstrate the cost of ungrammatical sentences.
A further question is whether it would have been more desirable to have a fully factorial design, with two grammatical conditions, in which the distractor's match with the verb's retrieval cue is manipulated in a way that parallels the two ungrammatical conditions. In such a design, the evidence for attraction would be found in the interaction of grammaticality by distractor match: facilitation would be expected in the ungrammatical sentences where the distractor matched the verb's retrieval cue, relative to sentences where it did not, while this effect might be expected to be absent among the grammatical conditions (Wagers et al., 2009). Such a fully factorial design would undoubtedly provide a higher level of experimental control than is possible with the design that we employ in the current experiments, as the interaction would rule out alternative explanations of the attraction effect based solely on the processing of the distractor. For example, if reading times at the verb differ as a function of distractor matching in the ungrammatical conditions, then this could potentially be explained as a delayed effect of processing differences between singular and plural distractors, rather than as a genuine attraction effect. Note that, in a fully factorial design, if such an effect is found for the ungrammatical conditions but is absent, reduced or reversed in the grammatical conditions, then the resulting interaction must be explained by factors that are over and above simple processing differences between singular and plural distractors. However, we believe that such a confound is unlikely given the current design. In all our experiments, the distractor is separated from the verb by at least one agreement-neutral word, making an influence of the distractor on reading times at or following the verb unlikely. Moreover, although it is possible that a plural distractor could cause local processing difficulty due to its conceptual complexity relative to a singular distractor (see discussion in Wagers et al., 2009), in our design, the verb is always plural (i.e. were), so the expected attraction effect would equate to a facilitation in the context of a plural distractor, not the inhibitory effect that would be predicted on the basis of the alternative explanation. Finally, as mentioned above, although the fully factorial design would provide more experimental control, it would likely require more observations to reach the same level of power, given the larger number of conditions.

Procedure
Eye movements were recorded using an SR Research Eyelink 1000 eye tracker, with a sampling rate of 1 kHz. The stimuli were displayed on a 19-inch Viewsonic monitor, at a viewing distance of 81 cm, using Times New Roman 18 pt, with black text on a white background. Stimuli were rotated in a Latin-square design, so that each participant was exposed to an equal number of trials for each condition, and no participant saw the same item in more than one condition. The 48 stimuli in each list were combined with 100 filler sentences, of which 36 were part of an unrelated experiment on parallelism effects (example item: The footballer who Trevor admired, and Angela imitated, scored in the championship.). Fillers and stimuli were combined so that no two items from the current experiment appeared adjacent to each other. Approximately 1/3 of all stimuli were followed by a Yes/No comprehension question, which the participants answered by pressing the appropriate button on a gamepad. The correct answer was "yes" on 50% of the trials. Each experimental and filler item fitted on one line. Mean comprehension accuracy was 92%.
At the start of each trial, a black square was displayed on the left of the screen. When a stable fixation was detected on this square, it was replaced by the text.
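The Latin-square rotation described in this section can be sketched as follows. This is only an illustration under stated assumptions: the function name, the condition labels "a", "b" and "c" (standing for 10a-c), and the particular rotation scheme are hypothetical, and the sketch does not cover fillers or trial ordering.

```python
def latin_square_lists(n_items, conditions):
    """Build one presentation list per rotation.

    Each list maps item index -> condition label. Within a list every
    condition occurs equally often (given n_items divisible by the
    number of conditions), and across lists every item appears once in
    each condition.
    """
    n_cond = len(conditions)
    lists = []
    for list_idx in range(n_cond):
        assignment = {
            item: conditions[(item + list_idx) % n_cond]
            for item in range(n_items)
        }
        lists.append(assignment)
    return lists

# 48 items and three conditions give 16 items per condition in each list.
lists = latin_square_lists(48, ["a", "b", "c"])
for i, presentation_list in enumerate(lists):
    counts = {c: list(presentation_list.values()).count(c) for c in "abc"}
    print(f"list {i}: {counts}")
```

Each participant would be assigned to one of the three lists, so that no participant sees the same item in more than one condition.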

Data analysis
Fixation data were screened and corrected for vertical drift. Following this, each fixation that was shorter than 80 msec was combined with the previous (or next) fixation, provided that the two fixations were within 1 character of each other. We then removed any remaining fixations that were shorter than 80 msec or longer than 1200 msec.
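The merging and exclusion steps just described can be sketched as below. The data representation and function names are hypothetical, and two details are assumptions not stated in the text: the previous fixation is tried before the next one, and a merged fixation keeps the position of the longer neighbour. Vertical drift correction is not covered.

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    x: float   # horizontal position in character units
    dur: int   # duration in msec

def clean_fixations(fixations, min_dur=80, max_dur=1200, max_dist=1.0):
    """Merge short fixations into a close neighbour, then drop outliers."""
    fixes = [Fixation(f.x, f.dur) for f in fixations]  # work on a copy
    i = 0
    while i < len(fixes):
        f = fixes[i]
        if f.dur < min_dur:
            # Assumption: try the previous fixation first, then the next.
            if i > 0 and abs(fixes[i - 1].x - f.x) <= max_dist:
                fixes[i - 1].dur += f.dur
                del fixes[i]
                continue
            if i + 1 < len(fixes) and abs(fixes[i + 1].x - f.x) <= max_dist:
                fixes[i + 1].dur += f.dur
                del fixes[i]
                continue
        i += 1
    # Remove any remaining fixations outside the accepted duration range.
    return [f for f in fixes if min_dur <= f.dur <= max_dur]

raw = [Fixation(10, 200), Fixation(10.5, 50), Fixation(20, 60),
       Fixation(30, 1500), Fixation(40, 300)]
cleaned = clean_fixations(raw)
print([(f.x, f.dur) for f in cleaned])
```

In this example the 50 msec fixation is absorbed by its neighbour half a character away, while the isolated 60 msec fixation and the 1500 msec fixation are removed.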
The analysis regions are illustrated in (11) below: (11) The nurses who the widow relied on definitely/ were/ reluctant/ to work/ long shifts.
The Verb region consisted of the word were. The Verb+1 region consisted of the next word after were. Finally, the Verb+2 region consisted of the next two words after the Verb+1 region. In all cases, the region includes the immediately preceding space. As the Verb region was a short function word, and was expected to be skipped on a high proportion of trials, an iterative procedure extended its left region boundary one character to the left in cases where no initial fixation was detected within the originally defined Verb region. If there was still no fixation on the region with the adjusted boundary, the region was extended by one further character, and this was done iteratively, until either a fixation was detected, or the region was extended a maximum of four characters to the left of the original region boundary, in which case the trial was not included in the analysis (see also Sturt, 2003; Sturt & Lombardo, 2005). This procedure only applied to the Verb region. The procedure was used in the calculation of means for the Verb region in Experiments 1-4, and in the analysis of the Verb region for Experiment 1.
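The iterative boundary adjustment for the Verb region can be illustrated with the sketch below. It is one possible reading of the procedure, not the authors' implementation: the function names are hypothetical, fixation positions are assumed to be given in character units in temporal order, and a trial is treated as excluded when no first-pass fixation is found even after extending the boundary by four characters.

```python
def first_pass_hit(xs, left, right):
    """Index of the first fixation in [left, right), or None if the region
    was skipped (a fixation landed to its right before any landed inside)."""
    for i, x in enumerate(xs):
        if left <= x < right:
            return i
        if x >= right:
            return None
    return None


def locate_with_extension(xs, region_left, region_right, max_extension=4):
    """Extend the left boundary one character at a time until a first-pass
    fixation is found. Returns (fixation index, characters extended), or
    None when the trial would be excluded."""
    for extension in range(max_extension + 1):
        hit = first_pass_hit(xs, region_left - extension, region_right)
        if hit is not None:
            return hit, extension
    return None
```

For example, with fixations at character positions 2.0, 8.0, 17.5 and 25.0 and an original region spanning characters 19 to 24, the third fixation is captured once the boundary has moved two characters to the left.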
We concentrated on two first-pass measures within these regions, namely first-pass reading time (the sum of fixation durations from the reader's first entry into the region from the left, until the first exit from the region, either to left or right), and go-past times (the sum of fixation durations from the reader's first entry into the region from the left, until the first rightward exit from the region). Note that first-pass time includes only fixations in the region, and does not include any fixations following either a leftward or rightward exit of the region. However, go-past time may include fixations to the left of the region, in cases where a first-pass regression is made. Neither of these measures includes trials where the region was skipped. Note that Dillon et al. (2013) reported attraction effects in Total Time, a measure consisting of the summed duration of all fixations on the region. We do not report Total Time, for several reasons. The first reason is that we wanted to be able to compare reading times on the target regions across all four experiments reported in this paper. The four experiments have identical content in the analysis regions, but differ in content towards the end of the sentence. In particular, Experiments 2 and 3, reported below, use sentences that are longer by several words, relative to Experiments 1 and 4. In contrast to the first-pass measures that we use in this paper, Total Time is affected by regressions that are launched from beyond the analysis regions. Thus, if we had included Total Time as a measure, comparisons across experiments could have been confounded by regressions launched from the later part of the sentences. A second reason for excluding Total Time is that we wanted to avoid reporting a large number of measures and regions, as this can lead to problems associated with familywise error (von der Malsberg & Angele, 2017).
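The definitions of first-pass reading time and go-past time given above can be made concrete with a short sketch. The function name and the data layout (a temporally ordered list of `(position, duration)` pairs, with positions in character units) are assumptions made for illustration only.

```python
def reading_measures(fixations, left, right):
    """Return (first_pass, go_past) in msec for the region [left, right),
    or None when the region was skipped on first pass."""
    start = None
    for i, (x, _) in enumerate(fixations):
        if left <= x < right:
            start = i
            break
        if x >= right:
            return None  # reader went past the region without fixating it
    if start is None:
        return None

    first_pass = 0
    go_past = 0
    in_first_pass = True
    for x, duration in fixations[start:]:
        if x >= right:
            break  # first rightward exit ends go-past time
        if in_first_pass and left <= x < right:
            first_pass += duration  # still inside the region on first pass
        else:
            in_first_pass = False  # a leftward exit ends first-pass time
        go_past += duration  # includes regressive fixations and re-reading
    return first_pass, go_past
```

In a trial where the reader fixates the region twice (250 and 100 msec), regresses for 300 msec, returns to the region for 150 msec and then moves on, this gives a first-pass time of 350 msec and a go-past time of 800 msec.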
For data analysis, linear mixed-effects models were computed using Bayesian inference. The models were implemented using the brms R package (Bürkner, 2017, 2018, 2021), which provides an interface to the Stan programming language (Carpenter et al., 2017). The models included the three-level factor of condition, using dummy coding, and treating the ungrammatical-mismatching distractor condition as the reference level. Thus, there were two fixed effect contrasts, which we will call Grammaticality (contrasting the ungrammatical-mismatching distractor condition with the grammatical condition) and Attraction (contrasting the ungrammatical-mismatching distractor condition with the ungrammatical-matching distractor condition). Both of these contrasts are expected to show negative coefficients: for the grammaticality contrast, the negative coefficient reflects the facilitation of a grammatical sentence relative to an ungrammatical sentence, and for the attraction contrast, the negative coefficient reflects the facilitation of a number-matching distractor relative to a number-mismatching distractor, among the two ungrammatical conditions. The models also included crossed random effects for participants and items, with random slopes for both participant and item for both fixed effect contrasts, and the full correlation matrix. The reading times were log-transformed in the analysis. The model used uninformative priors. The prior for the intercept was Normal(0, 10). Given the contrasts for our model, this implies that we are 68% certain that the mean reading time for the ungrammatical condition (i.e. the intercept) will be between a number very close to zero (i.e. exp(−10)) and 22,026 milliseconds (i.e. exp(10)). The prior for the fixed effect contrasts was set to Normal(0, 1). Here, if we assume a mean of 400 milliseconds for the baseline (ungrammatical mismatch) condition, this would equate to an intercept of around 6 on the log-millisecond scale. Given this intercept and the specified fixed effect prior, this implies that we are 68% certain that the mean for the comparison (either grammatical or ungrammatical match) condition will range between 148 msec (i.e. exp(6 − 1)) and 1097 msec (i.e. exp(6 + 1)). The prior for the random effect correlation matrix was set to lkj(2). This provides a bias against overestimating random effect correlations. Remaining priors used brms defaults.
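The millisecond ranges implied by these priors follow from exponentiating one standard deviation either side of the prior's centre on the log scale. The sketch below reproduces that arithmetic; the helper name `one_sd_range_ms` is hypothetical and is used only for illustration.

```python
import math


def one_sd_range_ms(log_centre, sd):
    """Back-transform the central +/- 1 SD interval of a normal prior on the
    log-millisecond scale into milliseconds (about 68% of the prior mass)."""
    return math.exp(log_centre - sd), math.exp(log_centre + sd)


# Intercept prior Normal(0, 10): centred on 0 on the log scale.
intercept_low, intercept_high = one_sd_range_ms(0.0, 10.0)

# Contrast prior Normal(0, 1), evaluated around an intercept of 6.
contrast_low, contrast_high = one_sd_range_ms(6.0, 1.0)
```

Rounded to the nearest millisecond, `intercept_high` is 22,026, and `contrast_low` and `contrast_high` are 148 and 1097, in line with the values quoted in the text.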
It should be noted that Bayesian inference does not involve the calculation of p-values. In the analyses below, we report model coefficients (β), along with a term P(β < 0), representing the probability that the coefficient in question is less than zero. The probability estimates were obtained by sampling coefficient estimates from the posterior distribution of the LME model, and calculating the proportion of these estimates that fell below zero. Thus, for a coefficient whose predicted sign is negative, evidence for the hypothesis is deemed to be strong to the extent that the value of P(β < 0) approaches 1. Conversely, for a coefficient whose predicted sign is positive, evidence for the hypothesis will be strong to the extent that the value of P(β < 0) approaches 0.
We also report 95% credible intervals for each coefficient. These indicate the range of the coefficient's possible values that cover 95% of the probability density, based on the posterior sample. Coefficient estimates were calculated using the mean of the posterior samples, which is the brms default.
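Given a set of posterior draws for one coefficient, the three reported quantities (posterior mean, P(β < 0) and the 95% credible interval) can be computed as in the following sketch. The function name is hypothetical, and the interval is assumed to be an equal-tailed quantile interval computed with a simple index-based rule; the text does not state which kind of interval was used.

```python
import statistics


def summarise_posterior(samples, mass=0.95):
    """Summarise posterior draws of a single coefficient."""
    ordered = sorted(samples)
    n = len(ordered)
    # Proportion of draws that fall below zero.
    p_below_zero = sum(s < 0 for s in ordered) / n
    # Equal-tailed interval (an assumption): cut the same mass from each tail.
    tail = (1 - mass) / 2
    lower = ordered[int(tail * n)]
    upper = ordered[min(n - 1, int((1 - tail) * n))]
    return {
        "mean": statistics.fmean(ordered),
        "p_below_zero": p_below_zero,
        "interval": (lower, upper),
    }
```

With a predicted negative coefficient, a `p_below_zero` value close to 1 corresponds to strong evidence for the predicted effect, as described above.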
For Experiment 1, we report analyses of all three regions for first-pass reading time and go-past time.
The means for each region are given for both eye-movement measures in Table 1, and results of the Bayesian linear mixed effect model are given in Table 2. Means for go-past time are displayed graphically in Figure 1.
The results for first-pass reading time showed the predicted negative coefficient for the grammaticality effect in the Verb and Verb+1 regions, but no strong evidence for a grammaticality effect in the Verb+2 region. There was no strong evidence for an attraction effect in first-pass reading time.
In contrast, go-past time showed clear evidence for a grammaticality effect in all three regions, and for an attraction effect in the Verb+1 and Verb+2 regions, with P(β < 0) > .99 in both of these regions.
Since the present paper is specifically concerned with the attraction effect, it was decided to restrict analyses in the remaining experiments to the Verb+1 and Verb+2 regions, for go-past time only. This also serves to alleviate statistical concerns with using multiple regions and measures (von der Malsberg & Angele, 2017).
In summary, Experiment 1 examined sentences with retro-active interference. The distractor used in Experiment 1 was also the subject of its clause. Both of these design features were intended to maximise the chance of finding an attraction effect. The results showed an attraction effect in go-past time, on the Verb+1 and Verb+2 regions: ungrammatical sentences had shorter go-past times when the distractor matched the number cue of the verb than when it did not.

Experiments 2 and 3
In Experiments 2 and 3, we examine attraction effects in sentences with pro-active interference, such that the distractor appears before the retrieval target, unlike in Experiment 1. However, as in Experiment 1, the distractor was the subject of its clause in both experiments. Experiments 2 and 3 differ in terms of active dependency status. In Experiment 3, the distractor is involved in an active dependency at the point where retrieval takes place, but it is not in Experiment 2. By comparing the attraction effects in Experiments 2 and 3, we aim to test the effect of active dependency status.

Participants
Experiments 2 and 3 each used 39 native speakers of English, who were paid to participate in the study.

Stimuli
The 48 experimental stimuli for Experiment 1 were adapted to create 48 stimuli for Experiment 2 and 48 for Experiment 3, with designs summarised below:
The widow who said that the nurses most definitely/ were/ reluctant/ to work/ long shifts had become quite annoyed.
All other things being equal, the Lewis and Vasishth (2005) model would predict a smaller interference effect for Experiments 2 and 3, where interference is pro-active, relative to Experiment 1, where interference is retro-active. This is because, in the pro-active case, the activation of the distractor the widow will have decayed to a greater extent at the point where the retrieval occurs at were. Experiment 2 therefore serves as a baseline for Experiment 3, where, although the interference is pro-active, the distractor the widow(s) is involved in an active (i.e. unresolved) dependency at the point where retrieval takes place. If maintaining the distractor in memory for the purpose of later resolution involves a slower rate of decay, then we would expect increased interference in Experiment 3 relative to Experiment 2.

Procedure
The procedure was identical to that of Experiment 1, except that the screen was placed at a distance of 70 cm, instead of 81 cm, due to a change in the laboratory set-up. At the same time, the font size was reduced from 18 pt to 16 pt, maintaining a similar size of visual angle per character. 5 Mean comprehension accuracy was 94% for Experiment 2 and 92% for Experiment 3.
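The claim that the two set-ups give a similar visual angle per character can be checked with a rough calculation. The sketch below rests on assumptions that are not in the text: that text was rendered at its nominal physical point size (1 pt = 1/72 inch), and that the average character width is a fixed fraction of the point size (the `width_ratio` value is a hypothetical placeholder, and cancels out in the comparison).

```python
import math

POINT_CM = 2.54 / 72  # one typographic point in centimetres


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle subtended by an object of a given size at a given distance."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


def angle_per_character(font_pt, distance_cm, width_ratio=0.5):
    """Approximate visual angle of one character.
    width_ratio is a hypothetical average character width / point size."""
    return visual_angle_deg(font_pt * POINT_CM * width_ratio, distance_cm)


experiment_1 = angle_per_character(18, 81)   # 18 pt at 81 cm
experiments_2_3 = angle_per_character(16, 70)  # 16 pt at 70 cm
ratio = experiments_2_3 / experiment_1
```

Under these assumptions the two angles differ by only a few percent, which is consistent with the description of the sizes as similar.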

Results
As mentioned above, for Experiment 2 and the remaining experiments, we restricted our analyses to go-past time in two regions (namely Verb+1 and Verb+2). Based on the results of Experiment 1, this gives the best possibility of observing an attraction effect, without the statistical problems introduced by reporting multiple measures and regions (von der Malsberg & Angele, 2017).
Table 3 shows the means for go-past time for Experiment 2, and Table 4 shows the results of the statistical analysis for Experiment 2. Means are displayed graphically in Figure 2.
Experiment 2 again showed strong evidence for a grammaticality effect (P(β < 0) > .99 for both regions). However, there was no strong evidence for an attraction effect in this experiment, although the means are in the predicted direction (P(β < 0) = .87 for Verb+1, and P(β < 0) = .68 for Verb+2).

Experiment 3: results
Means for Experiment 3 are presented in Table 5, and results of statistical analysis in Table 6. Means are displayed graphically in Figure 3. As in Experiments 1 and 2, there was again a large grammaticality effect. However, as in Experiment 2, there was again no strong evidence of attraction (P(β < 0) = .87 for Verb+1, and P(β < 0) = .53 for Verb+2), although the means were again in the predicted direction for an attraction effect.

Combined analysis of Experiments 2 and 3
We conducted a combined analysis of Experiments 2 and 3, in order to quantify the degree to which the attraction effect depended on the active status of the dependency in which the distractor participated. This is measured by the attraction-by-experiment interaction, which should show a larger attraction effect in Experiment 3, where there was an active dependency, than in Experiment 2, where there was not. In this analysis, we combined data for the Verb+1 and Verb+2 regions into one analysis, including region as a factor (Cunnings & Sturt, 2014). We argue that this combined analysis simplifies the interpretation of results, due to having only one single coefficient for each of the effects of Attraction, Experiment, and the crucial Attraction-by-Experiment interaction, as opposed to the six separate coefficients that would be expected with the two separate analysis regions. The region factor was coded as a centred predictor, with a mean of zero and a range of 1. The experiment factor was also coded in the same way. Therefore, the effect of attraction is evaluated at the average across the two regions and across the two experiments, while the interaction of attraction by experiment or by region measures the degree to which attraction differs by experiment or by region. To account for the statistical dependency between the data points across the pair of regions in any given trial, this model also included a random intercept for trial. This random factor had a single level for each unique trial in the combined two experiments. 6
The results of this analysis are given in Table 7. As expected, the combined analysis shows clear evidence for the grammaticality effect (P(β < 0) > .99). However, there was no strong evidence for an Attraction effect, even though the coefficient was in the expected direction (P(β < 0) = .94). There was also no evidence for the predicted interaction of the attraction effect by experiment (P(β < 0) = .4), and indeed, the coefficient for this effect is positive, whereas it would have been expected to be negative if Experiment 3 had shown more attraction than Experiment 2. Because of the theoretical importance of the Attraction-by-experiment interaction, we computed Bayes Factors to evaluate the evidence that this effect was null (i.e. BF01). In addition, because of the unexpectedly small effect of Attraction, we decided to compute Bayes Factors for this effect as well, given that previous studies have not investigated this type of pro-active interference sentence structure, and it is theoretically informative to test which types of sentence structure lead to attraction and which do not.
The Bayes Factor analyses used the Savage-Dickey method, and in order to obtain stable Bayes Factor estimates, the models were run with a large number of iterations (12,000, of which 2000 were warmup). Also, because of the sensitivity of Bayes Factors to the choice of prior, the models were run with a range of priors for the beta coefficients: in addition to the normal(0,1) prior that we used to obtain the coefficient estimates, the Bayes Factor models were also run with normal(0,0.25), normal(0,0.5), and normal(0,2). Given an intercept of 6 on the log-millisecond scale, corresponding to an overall mean of around 400 msec for the ungrammatical-mismatch conditions, these priors would imply that we are 68% confident that the mean of the grammatical condition, or that of the ungrammatical match condition, would range between 312 and 514 msec for the normal(0,0.25) prior, between 243 and 659 msec for the normal(0,0.5) prior, and between 54 and 2956 msec for the normal(0,2) prior. The outcomes of these Bayes Factor analyses are reported in Table 8. The Bayes Factor analyses suggest moderate to strong support for a null effect of both Attraction and the Attraction-by-experiment interaction, across the range of priors.
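The Savage-Dickey method estimates BF01 as the ratio of the posterior density to the prior density of a coefficient at zero. The sketch below illustrates the idea under a simplifying assumption that is not taken from the text: the posterior density at zero is approximated by fitting a normal distribution to the posterior draws. The function name is hypothetical, and the authors' own computation may have estimated the density differently.

```python
from statistics import NormalDist, fmean, stdev


def savage_dickey_bf01(posterior_samples, prior_sd, prior_mean=0.0):
    """Approximate BF01 for the point null beta = 0.

    Assumption: the posterior is summarised by a normal distribution fitted
    to the draws, so that its density at zero can be evaluated directly.
    """
    posterior = NormalDist(fmean(posterior_samples), stdev(posterior_samples))
    prior = NormalDist(prior_mean, prior_sd)
    # Values above 1 favour the null; values below 1 favour the alternative.
    return posterior.pdf(0.0) / prior.pdf(0.0)
```

Because the prior density at zero grows as the prior standard deviation shrinks, the same posterior yields a smaller BF01 under a narrower prior, which is why the analysis is repeated over a range of priors.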

Discussion of Experiments 2 and 3
Experiment 2 was a baseline experiment, designed to quantify the degree of attraction in a situation where interference is pro-active, and where the distractor does not participate in an active dependency at the point of retrieval. Experiment 3 was closely matched to Experiment 2 in that it involved pro-active interference, but it differed from Experiment 2 in that the distractor participated in an active dependency at the point where the crucial retrieval event occurred. We predicted that attraction effects would be larger in Experiment 3 than in Experiment 2, due to the active status of the dependency. However, the posterior distributions of the Bayesian analyses did not support this prediction, and Bayes Factor analyses instead supported the absence of such an effect, as well as the absence of an overall attraction effect.

Experiment 4
Experiment 1, which used retro-active interference, showed a robust attraction effect, while Experiments 2 and 3, which used pro-active interference, did not. We designed Experiment 4 as a partial replication of Experiment 1, with the aim of confirming that our method was sensitive to attraction, and that this effect could be reliably detected using retro-active interference. Experiment 4 was similar to Experiment 1 except that the distractor was an object in Experiment 4, while it had been a subject in Experiment 1. Although there have been demonstrations that subject-based attraction is stronger than object-based attraction for certain types of subject-verb dependencies (Van Dyke, 2007), our purpose here is not to compare these two types of attraction directly. Rather, our purpose is to confirm that the retro-active attraction effect that we found in Experiment 1 generalises to a different type of sentence.

Participants
Thirty-nine native speakers of English from the University of Edinburgh community participated in the experiment. The data for an additional 11 participants were collected, but were not included in the analysis, due to an error in the script, resulting in incorrect counterbalancing. 7

Stimuli
There were forty-eight stimuli similar to (14):
(14) The nurses who cared for the widow definitely/ were/ reluctant/ to work/ long shifts.

Procedure
The procedure was identical to that of Experiments 2 and 3. Mean comprehension accuracy was 92%.

Results
Mean go-past times are presented in Table 9, and results of statistical analysis are presented in Table 10. Means are displayed graphically in Figure 4. As in all the other experiments, Experiment 4 showed a robust grammaticality effect in both Verb+1 and Verb+2 regions (P(β < 0) > .99). As in Experiment 1, Experiment 4 showed strong evidence for an attraction effect in the Verb+1 region (P(β < 0) > .99). However, unlike Experiment 1, there was only weak evidence for the attraction effect in the Verb+2 region (P(β < 0) > .87).
Experiment 4 confirmed that our method was able to detect an attraction effect for retro-active interference. This effectively replicates Dillon et al. (2013), who also found evidence for interference effects where the distractor is an object. Although it was not our aim to compare Experiments 1 and 4 directly, we note incidentally that Experiment 1, which used a subject distractor, showed a large attraction effect in both the Verb+1 and Verb+2 regions, while Experiment 4, which used an object distractor, showed clear evidence for the effect only in the Verb+1 region. If this reflects a difference in the true effect, it may support subjecthood as a retrieval cue in this type of dependency, and this would fit with findings using other subject-verb dependencies (Van Dyke, 2007).
The relatively large attraction effect for our retro-active interference experiments (1 and 4) appears to contrast with the smaller interference effect for our pro-active interference experiments (2 and 3). We tested this difference by combining all four experiments into one large analysis, grouping the relevant pairs of experiments together to form a factor of interference type (Pro-active [Expts 2,3] vs. Retro-active [Expts 1,4]). This factor was treated as a centred predictor variable, with a mean of zero and a range of 1. As with the combined analysis of Experiments 2 and 3, we used both analysis regions, Verb+1 and Verb+2, including region as a centred factor. As well as allowing the comparison of pro-active and retro-active interference effects, the combined analysis also provides the opportunity to observe estimates of the grammaticality and attraction effects based on a large sample of 156 participants. The results of this analysis are summarised in Table 11. The results of the analysis are consistent with a larger attraction effect for the retro-active interference experiments relative to the pro-active interference experiments (Attraction × Pro/retro-interference: P(β < 0) = .01). Interestingly, there was no evidence for a modulation of the grammaticality effect as a function of interference type (P(β < 0) = .39). The analysis also confirmed overall main effects of both grammaticality and attraction (both P(β < 0) > .99).

Experiment 5
The results of Experiments 2 and 3 suggest that (a) attraction is weak or non-existent in the types of pro-active interference sentences that we examined, and (b) the degree of attraction in these sentences is not increased by an active dependency involving the distractor. However, one possible objection to this argument is that the power of the experiments may have been too low, given that each experiment used only 39 participants. The power of the combined analysis of Experiments 2 and 3 may also have been compromised by the fact that one of the crucial factors, active-dependency status, was a between-participant variable, meaning that the error term for the interaction between active dependency and attraction would be inflated, due to the effect of individual differences in reading time, relative to a fully within-participant design where this variance would be partialled out. Experiment 5 was a web-based self-paced reading experiment that was designed as a partial replication of Experiments 2 and 3, with an increased sample size (198 participants), and a fully within-participant design.
Sample size, data exclusion criteria, and analysis methods were pre-registered. 8

Participants
The participants were 198 native speakers of English, recruited and paid via Prolific.co. One further participant was excluded due to low accuracy on comprehension (below 70%), in accordance with the pre-registered analysis plan. Participants were paid GBP 6 for participation. Before the experiment took place, a separate group of eighteen participants, again recruited and paid via Prolific, ran through the experiment. The purpose of this pilot study was to test the procedure, and to establish reasonable data exclusion criteria for the pre-registration. The analyses reported below do not include these eighteen pilot participants.

Stimuli
The 48 stimuli were adapted from Experiments 2 and 3 (see (15) for an example).
(15) The widow/ who said that/ the nurses/ most/ definitely/ were/ reluctant/ to work/ long shifts/ had become quite annoyed.
The non-active dependency conditions were adapted from the stimuli of Experiment 2. An extra region was added to the end of each non-intervening item (e.g. "and seemed quite annoyed" in (15)), in order to allow a rough match in length to the active dependency conditions. The active dependency conditions were adapted from the stimuli of Experiment 3. In some cases, changes were made to the final region, in order to allow a closer match of the content with the non-active dependency conditions.
Given that there were forty-eight items, and six within-participant conditions, each participant would be exposed to eight examples of each condition. This is a smaller number than the 16 examples per condition for Experiments 2 and 3. However, due to the increased number of participants, overall there were around 2.5 times more observations per condition in Experiment 5 than there were in Experiments 2 and 3 (1584 vs. 624, before data exclusion).
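The observation counts quoted above follow directly from the design parameters, as the short calculation below shows. The function name is hypothetical; the participant, item and condition counts are those given in the text.

```python
def observations_per_condition(n_participants, n_items, n_conditions):
    """Observations per condition before data exclusion, assuming each
    participant sees every item once, spread evenly over the conditions."""
    items_per_condition = n_items // n_conditions
    return n_participants * items_per_condition


experiment_5 = observations_per_condition(198, 48, 6)   # 8 items per condition
experiments_2_3 = observations_per_condition(39, 48, 3)  # 16 items per condition
ratio = experiment_5 / experiments_2_3
```

The counts come out at 1584 and 624 respectively, a ratio of roughly 2.5.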

Procedure
The experiment was implemented on PCIbex (Zehr & Schwarz, 2018), and participants took part on the web. Each trial began with a row of underscores, indicating the positions of the words of the sentence, preserving spaces between words. The participant pressed the spacebar on their computer to reveal each new segment of the sentence. The segments are indicated in (15), and include segments that match the analysis regions of the eye-tracking experiments. The self-paced reading function required the participant to release the spacebar between each button-press, to prevent participants from simply holding down the spacebar to reveal each segment in quick succession. Time between button presses was recorded in milliseconds.
The six conditions of the experimental sentences were distributed among six Latin-square groups, such that a participant in any given group was exposed to only one condition of each item, and an equal number of the six conditions. An equal number of participants was assigned to each group.
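A Latin-square distribution of this kind can be generated by rotating condition assignments across groups. The sketch below is illustrative only: the function name is hypothetical, items and conditions are represented by integer indices, and the particular rotation scheme is an assumption rather than the scheme used in the experiment scripts.

```python
def latin_square_lists(n_items, n_conditions):
    """Return one list per group; element i of a list is the condition index
    in which that group sees item i."""
    return [
        [(item + group) % n_conditions for item in range(n_items)]
        for group in range(n_conditions)
    ]


# Six groups, 48 items, six conditions, as in Experiment 5.
lists = latin_square_lists(48, 6)
```

Each group sees every item in exactly one condition and each condition eight times, and across the six groups every item appears once in every condition.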
The experimental sentences were interleaved with 62 filler sentences, adapted from the fillers of Experiments 2 and 3. Around one third of stimuli were followed by comprehension questions, which were answered by selecting one of two possible answers using the 1 and 2 keys on the keyboard.Mean comprehension accuracy was 93%.

Results
Stimuli, data, scripts and pre-registration for Experiment 5 are available at https://osf.io/g6tdn/
Before analysis, we removed any data point for an individual segment that was greater than 3000 msec or less than 150 msec, affecting less than 0.5% of the data. This criterion was pre-registered.
Given the sentence segmentation (see (15) above), segment 6 (were) corresponds to the Verb analysis region of the eye-tracking experiments, segment 7 (reluctant) corresponds to the Verb+1 region, and segment 8 (to work) corresponds to the Verb+2 region. Henceforth, we will use these region names, to facilitate comparison with the eye-tracking results. Table 12 shows the mean self-paced reading times for the Verb, Verb+1 and Verb+2 regions, and these are also displayed graphically in Figure 5.
As Experiment 5 was pre-registered, we will report the outcomes of pre-registered and exploratory analyses separately.
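The trimming criterion for segment reading times amounts to a simple range filter, sketched below. The function name is hypothetical, and the code is an illustration rather than the analysis script available from the repository.

```python
def trim_reading_times(reading_times, lower=150, upper=3000):
    """Drop segment reading times below `lower` or above `upper` (msec).
    Returns the retained values and the proportion of data removed."""
    kept = [rt for rt in reading_times if lower <= rt <= upper]
    proportion_removed = 1 - len(kept) / len(reading_times)
    return kept, proportion_removed
```

The returned proportion can be compared against the reported figure of less than 0.5% of the data.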

Pre-registered analyses
The main pre-registered analysis was analogous to the combined analysis of Experiments 2 and 3. A Bayesian linear mixed effects model was computed to analyse the reading times of the Verb+1 and Verb+2 regions, combined into one analysis, treating region as a factor. The fixed effects were Region (Verb+1 vs. Verb+2, centred), Active dependency status (Active vs. Non-Active, also centred), and condition (Ungrammatical-match vs. Ungrammatical-mismatch vs. Grammatical, treatment coded with the Ungrammatical-mismatch condition as the reference). The model also included all interactions among these factors. Full random effects were included for participant and item, with random intercepts as well as random slopes for all fixed effect interactions and main effects. As in the combined analysis of Experiments 2 and 3, a random intercept was also included for trial-id, whose value was unique to each trial. This was to account for the lack of independence between two regions of the same trial. Random correlation parameters were not included in the model.
The Bayesian model used a prior of normal(0,1) on the fixed effects, and normal(0,10) on the intercept, with brms default priors for all other parameters. The model was run with 6000 iterations, of which 1000 were warmup. The outcome of the analysis is given in Table 13.
The analysis did not show strong evidence of an attraction effect, although the effect was in the expected direction (P(β < 0) = .86). There was also no clear evidence for the crucial interaction between attraction and active dependency, and in any case, the direction of the effect was opposite to what would have been predicted if active dependencies increased attraction relative to non-active dependencies (P(β < 0) = .86).
The pre-registration included computation of Bayes Factors for the attraction effect and for the attraction-by-active-dependency interaction, to test the null hypothesis that each of these effects was equal to zero, in comparison to the alternative hypothesis that the effect was not zero. These Bayes Factors yielded strong evidence in favour of the null hypothesis (Attraction: BF01 = 96.22; Attraction × Active-dependency: BF01 = 56.3). Thus the Bayes Factor results replicate those of the combined analysis of Experiments 2 and 3 in supporting a null effect of attraction, and of the effect of active dependencies on attraction.
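The two coding schemes used for the fixed effects can be illustrated as follows. The function names and condition labels are hypothetical, and the choice of which level of a two-level factor receives −0.5 is an assumption, since the text specifies only that these predictors are centred with a range of 1.

```python
def centred_code(level, levels):
    """Code a two-level factor as -0.5 / +0.5 (mean zero, range 1)."""
    if len(levels) != 2:
        raise ValueError("centred coding here assumes exactly two levels")
    return levels.index(level) - 0.5


def treatment_code(condition, reference, levels):
    """Dummy-code a factor: one 0/1 indicator per non-reference level."""
    return {lvl: float(condition == lvl) for lvl in levels if lvl != reference}


REGIONS = ["Verb+1", "Verb+2"]
CONDITIONS = ["ungram_mismatch", "ungram_match", "grammatical"]

region_codes = [centred_code(r, REGIONS) for r in REGIONS]
attraction_row = treatment_code("ungram_match", "ungram_mismatch", CONDITIONS)
```

With this coding, the indicator for the ungrammatical-match condition corresponds to the Attraction contrast, and because region and active dependency status are centred, that contrast is evaluated at the average of their levels.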

Exploratory analyses
In addition to the pre-registered analysis reported above, we ran two exploratory analyses that were not part of the pre-registration plan. The first exploratory analysis was run in order to counter a possible objection to the outcomes of our Bayes Factor analysis reported above. In the pre-registered analysis, we used a prior of normal(0,1), which, in the context of log-transformed self-paced reading times, is quite a flat, unconstraining distribution. It could be the case that the Bayes Factor support for the null hypothesis that we found in the pre-registered analysis was amplified by using an unconstraining prior. The Bayes Factors that we previously reported in the combined analysis of Experiments 2 and 3 showed that BF01 values were reduced for more constraining priors centred on zero (i.e. priors with smaller standard deviation parameters), compared to the less constraining priors.
In the first exploratory analysis, we therefore repeated the Bayes Factor analyses as described above, but using a fixed effect prior of normal(0,0.25) instead of the pre-registered prior of normal(0,1). The prior of normal(0,0.25) corresponds to the most constrained prior that we ran in the combined analysis of Experiments 2 and 3, and in those analyses, it yielded the smallest value for BF01 among the four priors tested. If we continue to find strong evidence for the null hypothesis with this more constrained prior, it will therefore strengthen the support for the null effects of attraction and attraction-by-active-dependency. The analysis with the normal(0,0.25) prior continued to find strong evidence in favour of the null hypothesis, both for attraction (BF01 = 23.77), and for the interaction of attraction with active dependency status (BF01 = 13.45). Thus, the support for the null hypothesis that we found in the pre-registered analysis generalises to a more specific prior.
The second exploratory analysis was an analysis of the Verb region. We had originally chosen the combined analysis of Verb+1 and Verb+2 as the target of the pre-registration, on the basis of the results of the eye-tracking experiments. However, it remains possible that evidence for a role of active dependencies would be found on the Verb region, which we may have missed by concentrating on the pre-registered analysis. We report the outcome of the analysis of the Verb region below. The model that we used to analyse the Verb region included the crossed factors of active-dependency and condition, with the same contrast coding as in the pre-registered analysis. Since only one region was being analysed, the region factor was not included in the fixed effects, and the trial-id factor was not included in the random effects. The model included random effects for both participants and items, with random intercepts and random slopes corresponding to the full crossing of the fixed effects, and, like the pre-registered analysis, excluded random correlation parameters. The analysis used the same priors as the pre-registered analysis.
The coefficients and credible intervals for the exploratory analysis of the Verb region are given in Table 14. The exploratory analysis of the Verb region clearly confirmed a grammaticality effect, with faster reading times for grammatical sentences than for ungrammatical sentences with a mismatching distractor (P(β < 0) = .99). However, the evidence for an attraction effect was not strong, even though the coefficient was in the expected direction (P(β < 0) = .9). Similarly, the interaction of attraction and active-dependency did not correspond to a clear effect in the analysis, although it was in the expected direction (P(β < 0) = .3). Bayes Factor analyses confirmed strong support for the null hypothesis, both for attraction (BF01 = 62.08), and for its interaction with active-dependency (BF01 = 65.89). A further analysis using the more constraining prior of normal(0,0.25) again confirmed strong support for the null hypothesis for both of these effects (Attraction: BF01 = 15.67; Attraction × Active-dependency: BF01 = 15.54).

General discussion
Across five experiments, we examined attraction effects in subject-verb agreement. We found stronger attraction effects for retro-active interference relative to pro-active interference. This may reflect effects of decay of activation of the distractor, which was at a greater distance from the critical verb in the pro-active experiments than in the retro-active experiments. However, within pro-active interference, we did not find clear evidence that the degree of attraction was modulated by whether the distractor was participating in an active dependency at the point where the agreement computation took place.
One important question to consider is why number attraction was relatively weak in both of the pro-active interference configurations that we tested (Experiments 2, 3 and 5).
As noted in the introduction to this paper, other things being equal, standard accounts of cue-based retrieval (Lewis & Vasishth, 2005) would have predicted relatively weak effects of attraction for pro-active interference, due to the decay of activation of the distractor. This is consistent with the fact that we only found clear evidence for attraction using retro-active interference (Experiments 1 and 4), while for pro-active interference (Experiments 2, 3 and 5), we found evidence against the presence of an attraction effect. On the other hand, more recent accounts of cue-based retrieval assume that activation is also affected by the information theoretic status of the distractor, with a boost of baseline activation being assumed for main clause subjects, due to their high topicality (Engelmann et al., 2019). If topicality does have an influence, however, it failed to yield reliable attraction effects in Experiments 2 and 3 reported above, where the distractor was a main clause subject. A second factor that we should consider is whether the distractor is reactivated in pro-active interference before the retrieval event occurs. In the introduction to this paper, we discussed two studies that showed evidence of pro-active interference in language comprehension, namely those reported by Wagers et al. (2009), in (16) and (17), and Franck et al. (2015), in (18):
(16) *The musicians who the reviewer praise so highly will probably win a Grammy.
(17) *The musician who the reviewer praise so highly will probably win a Grammy.
(18) a. *Jerôme parle à la prisonnière que le gardien sortent.
Jerôme talks to the prisoner who the guard take out.
b. *Jerôme parle aux prisonnières que le gardien sortent.
Jerôme talks to the prisoners who the guard take out.
Both studies (Franck et al., 2015; Wagers et al., 2009) showed evidence of attraction, though with different measures, despite the fact that the distractor (the musicians, the prisoners) appears before the retrieval target (the reviewer, the guard). However, in both cases, the distractor forms a dependency with the verb that signals the critical retrieval event (praise, take out). Thus, arguably, the distractor is reactivated at the point where the number features of the target phrase have to be retrieved; in fact, a second retrieval event (that of the verb retrieving its object) is taking place at the same time. This situation is in contrast with the sentence types that we used in our study, where we wished to test for effects of maintenance without the complicating factor of reactivation. Nevertheless, it may be the case that reactivation is a necessary condition for finding measurable attraction effects with pro-active interference.⁹ Moreover, reactivation may be a stronger influence on attraction effects than topicality. Note that the distractor in Wagers et al.'s (2009) study is a main clause subject (the musicians in 16 and 17), but that of Franck et al.'s (2015) study is not (the prisoner(s) in 18). The relative influence of reactivation and topicality should be reassessed in future studies that manipulate both of these factors.
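The reasoning about decay and reactivation can be made concrete with a minimal Python sketch of the standard ACT-R base-level activation equation, B = ln(Σ_j t_j^(-d)), where t_j is the time elapsed since the j-th access of an item and d is the decay parameter (conventionally 0.5). All timings below are hypothetical and chosen only for illustration. The sketch also ignores the other components of activation in cue-based retrieval (cue match and noise), so it should not be read as a simulation of our experiments.

```python
import math


def base_level_activation(now, access_times, decay=0.5):
    """ACT-R base-level activation: ln(sum over past accesses of lag ** -decay)."""
    lags = [now - t for t in access_times if t < now]
    if not lags:
        raise ValueError("the item must have been accessed before 'now'")
    return math.log(sum(lag ** -decay for lag in lags))


VERB_TIME = 4.0  # hypothetical time (in seconds) at which the verb triggers retrieval

# Hypothetical access histories for the distractor (times in seconds).
histories = {
    # Subject ... Distractor ... V: distractor encoded 1.0 s before the verb
    "retro-active": [3.0],
    # Distractor ... Subject ... V: distractor encoded 3.0 s before the verb
    "pro-active": [1.0],
    # As above, plus a reactivation of the distractor 0.3 s before the verb
    "pro-active, reactivated": [1.0, 3.7],
}

activations = {
    label: base_level_activation(VERB_TIME, times)
    for label, times in histories.items()
}

for label, value in activations.items():
    print(f"{label}: base-level activation = {value:.3f}")
```

Under these assumed timings, the distractor's base-level activation at the verb is lower in the pro-active configuration than in the retro-active one, and a reactivation shortly before the verb raises it again, which is the pattern that the argument above relies on.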
Given the discussion above, one potential criticism of our studies is that we were unable to observe an effect of maintenance simply because the sentence structures that we were using in Experiments 2 and 3 were not appropriate to yield reliable attraction effects, whether or not an active dependency was being maintained, or that we would have needed more power to detect interference in our sentence structures.Although we cannot discount this possibility on the basis of the data reported here, our results may instead indicate that the types of active subject-verb dependencies that we examined, or perhaps active dependencies in general, do not affect the activation of items in memory, or that they do not require the maintenance of relevant features in memory.However, this view would need to be reconciled with a substantial literature showing ERP effects of maintenance of active dependencies, which we review below.
A number of ERP studies have shown a sustained ERP response for the sequence of words within a sentence where an active long-distance dependency remains unresolved, relative to control sentences (Cruz Heredia et al., 2022; Fiebach et al., 2002; Hagiwara et al., 2007; King & Kutas, 1995; Phillips et al., 2005). For example, Phillips et al. (2005) examined sentences like (19), among other conditions:
(19) a. The lieutenant knew which accomplice the detective hoped that the shrewd witness would recognize in the lineup.
b. The lieutenant knew that the detective hoped that the shrewd witness would recognize the accomplice in the lineup.
In (19a), there is a long-distance dependency linking which accomplice with the most deeply embedded verb recognize. In the control condition (19b), there is no such wh-dependency. In their ERP experiment, Phillips et al. (2005) found a sustained negativity in (19a), where the dependency was active, relative to the control condition (19b), for word positions between the two end-points of the wh-dependency. This, along with other similar demonstrations, shows that the brain is sensitive to active dependencies. However, there are different possible ways in which active dependencies might be represented. According to the view that we have assumed in this paper, the relevant features of the left-hand element of the dependency are actively maintained in working memory until the right-hand element of the dependency is encountered. As an example, for the distractor phrase in our Experiment 3, the maintained features would include its number features. In this maintenance view, it would be natural to expect that attraction would be affected by the active status of the dependency, due to the maintenance of the features in working memory. However, according to another possible account discussed by Cruz Heredia et al. (2022), rather than the maintenance of the left-hand element, active dependencies might instead be characterised by the prediction of the right-hand element itself, or features of it. If this is the case, then activation of the distractor may not be affected by active dependency status, and therefore, a difference may not be expected in the size of any attraction effect for active versus non-active dependencies.
One possibility that should be considered is that dependency types may differ in whether they show effects associated with active dependencies. In an ERP study, Cruz Heredia et al. (2022) examined sentences like (20a-d):
(20) a. What did the commentary from the spokesman interrupt?
b. Did the commentary from the spokesman interrupt the game?
c. After Charlotte quit her boring job, she traveled to Europe like she always wanted to.
d. Today Charlotte quit her boring job (and traveled to Europe like she always wanted to).
In (20a), there is a dependency between what and the verb interrupt, while in (20c), the initial subordinate clause (after…) indicates the upcoming presence of the main clause, making prediction possible. Thus, there is a sense in which we can think of (20c) as also involving an active dependency. In both (20a) and (20c), the relevant dependencies are arguably active while the comprehender is processing the intervening words. Sentences (20b) and (20d) are control conditions that lack the active dependencies of (20a) and (20c) respectively. If sustained negativity is a general ERP signature of unresolved dependencies, then such an effect would be expected for both (20a) and (20c) relative to their respective control conditions. However, Cruz Heredia et al. (2022) found the expected effect only in (20a). This might be taken to suggest that the expectancy-based processing mechanisms underlying the sustained negativity effect are limited to wh-dependencies.¹⁰ If this reflects a general difference between wh-dependencies and other active dependencies, then this might also explain why we failed to find modulation of number attraction as a function of active subject-verb dependencies, as our studies did not use wh-dependencies. In addition, this may also explain why Kim et al. (2020) were able to observe active dependency effects, given that their active dependencies were wh-dependencies (see example 9 above). In other words, it may be the case that number agreement attraction is affected by the active status of wh-dependencies, but not other forms of dependency.¹¹ However, recall from above that Ristic et al. (2022) detected a processing cost for active (non-wh) subject-verb dependencies in their self-paced reading experiment, so it seems that active subject-verb dependencies do have some measurable impact on processing, even if this does not modulate attraction or lead to a sustained negativity ERP signature.
Finally, it is worth commenting on our decision to test attraction effects only in ungrammatical conditions, without also manipulating distractor number in grammatical sentences. As mentioned above, previous comprehension studies of subject-verb number agreement have shown a grammaticality asymmetry, such that more consistent effects have been found in ungrammatical sentences than in grammatical sentences,¹² and therefore, it seemed reasonable to test for attraction only in ungrammatical sentences. However, it should be acknowledged that the processing of ungrammatical sentences may be subject to individual strategies, and may be atypical of normal sentence processing. It is also possible that, by concentrating on ungrammatical sentences, we missed effects of active dependencies that we would have found if we had also examined grammatical sentences, although we are not aware of any theoretical framework that would predict larger effects of active dependencies in grammatical than in ungrammatical sentences.

Conclusion
Across five experiments, we investigated the factors that affect number attraction in the comprehension of subject-verb dependencies. We found evidence for greater attraction in retro-active interference, where the distractor linearly intervenes between the verb and retrieval target, relative to pro-active interference, where the distractor precedes the target. This is compatible with the notion of decay of activation, as proposed in the original version of the cue-based retrieval model (Lewis & Vasishth, 2005). We also asked whether the maintenance of a dependency affected attraction within pro-active interference conditions, as would be expected if active dependencies required maintenance of the relevant features in memory. However, this result was not obtained. This might indicate that active dependencies do not involve the maintenance of features in memory, or otherwise might indicate that our pro-active interference conditions were not suitable for examining attraction effects.

Notes
1. In this paper, we will indicate ungrammatical sentences with an asterisk.
2. Although the cue-based retrieval model assumes the decay of activation, it should be noted that the notion of decay is controversial in the domain-general memory literature (see Lewandowsky et al., 2009).
3. Note that, in the production literature, it is usually assumed that percolation can only act upwards in a syntactic tree, which would rule out attraction occurring in cases like (1), and in our pro-active interference experiments, because in these cases the distractor is above the target in the tree. However, in Bayesian simulations of a model that combined feature percolation with cue-based retrieval, Yadav et al. (2023) find good fits with data from a number of experiments, regardless of whether the distractor was lower or higher than the target in the tree.
4. However, note that among the models tested by Yadav et al. (2023), the best fit to experimental data across both grammatical and ungrammatical sentences was provided by a model that combined both feature percolation and cue-based retrieval.
5. We cannot give a precise measurement for the visual angle per character, because the experiments used Times Roman font, which is a variable width font.
6. An initial model also included a random slope for trial by region, but inspection of the trace plot indicated that the estimate for this slope did not converge.
7. Further analyses including these extra participants yielded very similar results to those reported below. These extra analyses, along with the data, are available on the OSF site for this experiment.
8. See https://osf.io/g6tdn/.
9. Another possibility is that the referent of the distractor must participate in the same event as the referent of the target, as is the case in the relative clause examples of both Franck et al. (2015) and Wagers et al. (2009), but not in our Experiments 2 or 3.
10. In fact, Cruz Heredia et al.'s full results suggest that the sustained negativity effect is limited to matrix wh-questions, as they did not find an effect for embedded wh-questions. However, Phillips et al. (2005) did find sustained negativity for embedded questions.
11. Note, however, that Kim et al. (2020) tested the effect of active dependency status only in retro-active interference, so their results are not strictly comparable to ours.
12. Explaining the grammaticality asymmetry is beyond the scope of this paper, though one interesting recent suggestion is that, in a model that uses both cue-based retrieval and feature percolation, these two processes both push in the same direction for ungrammatical sentences, resulting in a facilitatory attraction effect, while the two processes push in opposite directions in grammatical sentences, resulting in a small net effect, of variable direction (Yadav et al., 2023).

Figure 5. Mean self-paced reading time for Experiment 5 (upper panel: active dependencies; lower panel: non-active dependencies).

Table 1. Means and standard errors (aggregated by participants) for Experiment 1.
*The widow said that the nurse most definitely/ were/ reluctant/ to work/ long shifts.

Table 2. Statistical analyses of first-pass and go-past reading time for Experiment 1. Model estimates, error, lower and upper 95% credible interval limits (in log-millisecond units) and probability that the coefficient is smaller than zero, based on posterior samples.

Table 3. Means and standard errors (aggregated by participants) for Experiment 2.

Table 4. Model estimates for go-past time, for Experiment 2.

Table 5. Means and standard errors of go-past time (aggregated by participants) for Experiment 3.

Table 6. Model estimates for go-past time, for Experiment 3.

Table 7. Model estimates for combined analysis of Experiments 2 and 3.

Table 9. Means and standard errors (aggregated by participants) for Experiment 4.

Table 10. Model estimates for go-past reading time, for Experiment 4.

Table 11. Results of combined analysis of Experiments 1-4.

Table 12. Means and standard errors (aggregated by participants) for Experiment 5.

Table 13. Results of the pre-registered analysis of Experiment 5. Note that, given the condition labelling in this experiment, the interaction between attraction and active dependencies is expected to be positive if active dependencies increase the size of the attraction effect.

Table 14. Results of the exploratory analysis of the Verb region in Experiment 5.