The neural correlates of word position and lexical predictability during sentence reading: evidence from fixation-related fMRI

ABSTRACT By combining eye-tracking and fMRI, the present study aimed to investigate aspects of higher linguistic processing during natural reading which were formerly hard to assess with traditional paradigms. Specifically, we investigated the haemodynamic effects of incremental sentence comprehension – as operationalised by word position – and its relation to context-based word-level effects of lexical predictability. We observed that an increasing number of words being processed was associated with an increase in activation in the left posterior middle temporal and angular gyri. At the same time, left occipito-temporal regions showed a decrease in activation with increasing word position. Region of interest (ROI) analyses revealed differential effects of word position and predictability within dissociable parts of the semantic network – showing that it is expedient to consider these effects conjointly.


Introduction
Over the last years, researchers have shown growing interest in studying the neurocognitive processes of language in ecologically valid settings to embrace the complexity of real-time linguistic processing. In the context of reading research, this can be achieved by combining neuroimaging recordings, such as electroencephalography (EEG) or functional magnetic resonance imaging (fMRI), with recordings of eye movements (e.g. EEG: Hutzler et al., 2007; Dimigen, Sommer, Hohlfeld, Jacobs, & Kliegl, 2011; fMRI: Henderson, Choi, Luke, & Desai, 2015; Richlan et al., 2014; Schuster, Hawelka, Richlan, Ludersdorfer, & Hutzler, 2015). This integrative approach enables the investigation of aspects of higher linguistic processing during natural reading which were formerly hard to assess with traditional approaches (see Himmelstoss, Schuster, Hutzler, & Hawelka, under review). Beyond the level of single words, sentence comprehension involves a multitude of linguistic operations such as lexical, phonological, syntactic, semantic and compositional semantic processing. However, the neural mechanisms underlying the construction of sentence meaning as linguistic information unfolds and, in particular, how this can be related to context-based influences such as effects of word predictability are, as yet, poorly understood. The present study therefore investigated the haemodynamic effects of incremental sentence processing (operationalised by word position) and the effects of lexical word predictability during reading while simultaneously recording participants' eye movements and blood-oxygen-level dependent (BOLD) signals.
Reading (or listening to) coherent sentences compared to randomly arranged lists of words recruits a neural network encompassing large portions of the temporal lobe(s), including the superior, middle and inferior temporal gyri and the temporal pole, along with the angular gyrus and the prefrontal cortex (e.g. Bavelier et al., 1997; Bottini et al., 1994; Humphries, Binder, Medler, & Liebenthal, 2006; Kuperberg et al., 2000; Mazoyer et al., 1993; Vandenberghe, Nobre, & Price, 2002; Xu, Kemeny, Park, Frattali, & Braun, 2005). Of particular interest with regard to sentence comprehension are activations of the anterior temporal lobe(s) and the left inferior frontal gyrus, although it is as yet unclear which specific processes are associated with these cortical activations (i.e. compositional semantics: e.g. Vandenberghe et al., 2002; or syntactic processing: e.g. Humphries et al., 2006).
Addressing the question to what extent these core regions of the language network are modulated by the amount of incoming information, Pallier, Devauchelle, and Dehaene (2011) investigated the effect of constituent structure (i.e. a coherent unit formed by a word or phrase). Treating the number of words forming a meaningful entity as a proxy of constituent size, the authors demonstrated that increasing constituent size is associated with an increase of activation in the left temporal pole, the anterior superior temporal sulcus and the temporo-parietal junction (including the angular gyrus). Thus, the authors reasoned that these regions may encode semantic components. By contrast, left inferior frontal and posterior temporal regions exhibit such an effect even in the absence of meaningful semantic content (i.e. jabberwocky stimuli) which may indicate an autonomous encoding of syntactic regularities. Furthermore, findings from intracranial recordings showed a monotonic increase of gamma power over the course of sentence processing which, critically, was higher for meaningful sentences than for randomly arranged word-lists and syntactically "legal" jabberwocky sentences (Fedorenko et al., 2016).
That being said, there is a substantial correlation between word position (and by association constituent size) and word predictability. That is, as contextual information accumulates, words tend to become more predictable (Levy, 2008; Marslen-Wilson & Tyler, 1975). Indeed, in an event-related potential (ERP) study, Dambacher, Kliegl, Hofmann, and Jacobs (2006) demonstrated that, when accounting for word predictability, the effect of word position on the N400 component is assimilated. According to the authors, this finding substantiates the notion that word position can be considered a proxy of contextual constraint (Van Petten, 1993; Van Petten & Kutas, 1990, 1991). Word position, however, elicited an effect on the P200 component which could not be accounted for by word predictability, indicating that position and predictability may elicit dissociable effects. With regard to fMRI, various studies reported that increasing word predictability leads to a decrease in activation within left temporal regions, including the inferior, middle and superior temporal gyri, and within the inferior frontal gyrus (e.g. Baumgaertner, Weiller, & Büchel, 2002; Dien et al., 2008; Hartwigsen et al., 2017). In light of these findings, the involvement of left inferior frontal regions has been suggested to reflect top-down retrieval/selection processes of semantic information, with anterior portions dedicated to controlled retrieval and posterior portions to selection processes (Badre, Poldrack, Paré-Blagoev, Insler, & Wagner, 2005; Thompson-Schill, D'Esposito, Aguirre, & Farah, 1997; Thompson-Schill, D'Esposito, & Kan, 1999; Wagner, Paré-Blagoev, Clark, & Poldrack, 2001). The actual access to lexico-semantic information is supposed to be achieved by left temporal regions (e.g. Hickok & Poeppel, 2007; Lau, Phillips, & Poeppel, 2008).
To summarise, the effects of word predictability and word position seem to influence brain activation in opposite directions: while an increase in the number of words being processed is associated with increasing activation, increasing word predictability is associated with decreasing activation. Evidence from eye movement research likewise suggests independent contributions of both word position and predictability to fixation duration (Kuperman, Dambacher, Nuthmann, & Kliegl, 2010). To be specific, while increasing predictability is associated with shorter fixation durations and more frequent word skipping (Balota, Pollatsek, & Rayner, 1985; Kliegl, Grabner, Rolfs, & Engbert, 2004; Kliegl, Nuthmann, & Engbert, 2006), fixation durations increase over the course of sentence processing. The facilitatory effect of word predictability has been argued to result from a graded activation of probable upcoming words (see Staub, 2015). The effect of word position, on the contrary, supposedly reflects the evolution of compositional semantics (i.e. sentence "wrap-up"; Kuperman et al., 2010; see also Balogh, Zurif, Prather, Swinney, & Finkel, 1998; Rayner, Kambe, & Duffy, 2000; Rayner, Sereno, Morris, Schmauder, & Clifton, 1989).
As yet, in contrast to eye movement and ERP studies, fMRI studies have investigated the effects of word position and predictability independently of each other. To illustrate, most of what we know about the neural underpinnings of the effect of word predictability stems from studies manipulating sentence-final words, thereby keeping (relative) word position constant. For a conjoint analysis of the effects of word position and predictability, however, one would require reading materials in which word predictability unfolds naturally over sentences. This feature is offered by sentence corpora which provide predictability values for each constituent word, e.g. based on norming studies (e.g. Kliegl et al., 2004). Such corpora are, for example, the Potsdam Sentence Corpus (German; Kliegl et al., 2004), the Dundee Corpus (English; Kennedy, 2003) and the Provo Corpus (English; Luke & Christianson, 2018). Alternatively, one can approximate a similar measure using language models (from the field of computational linguistics) which compute probability distributions over potentially upcoming words (Hale, 2001; Levy, 2008).
A modest number of fMRI studies have been conducted using either corpus-based or language-model-based estimates of word predictability in naturalistic experiments (i.e. natural reading or speech comprehension; e.g. Henderson, Choi, Lowder, & Ferreira, 2016; Schuster, Hawelka, Hutzler, Kronbichler, & Richlan, 2016; Willems, Frank, Nijhof, Hagoort, & van den Bosch, 2016). Using language-model-based estimates of syntactic surprise, Henderson et al. (2016) demonstrated that increasing syntactic surprise results in higher activation in the left inferior frontal gyrus and the left anterior superior temporal lobe. For the effect of lexical surprise, Willems et al. (2016) showed higher activation with increasing surprise in the left inferior temporal sulcus, bilateral superior temporal gyrus, bilateral anterior temporal poles, and in the right inferior frontal sulcus. In a study from our lab, we investigated the impact of word predictability (in addition to the effects of word length and frequency) on brain activation by utilising corpus-based predictability norms (Schuster et al., 2016). In an exploratory analysis, we observed a decrease in activation in bilateral inferior frontal gyri, posterior-to-anterior middle temporal gyri and in the left occipito-temporal cortex with increasing word predictability.
These studies provide first insights into the effects of contextual processing during natural language comprehension. However, we are not aware of any fMRI study which conjointly analysed the effects of word predictability and word position to identify potentially independent contributions within the supposed reading network (for such an assessment with EEG see Dambacher et al., 2006). The present study is a re-analysis of a recent study from our lab in which we simultaneously recorded the BOLD signal and eye movements of participants who silently read sentences for comprehension (Schuster et al., 2016). The sentences were selected from a corpus in which word predictability unfolds naturally and predictability estimates are available for each and every word (i.e. the Potsdam Sentence Corpus; Kliegl et al., 2004). The main objective of the following analysis was to investigate in which regions of the reading network, and particularly in regions associated with semantic processing (e.g. Binder & Desai, 2011; Binder, Desai, Graves, & Conant, 2009), word predictability and word position exert (differential) effects.

Method
We re-analysed combined fMRI and eye-tracking data from a previous study of our lab (Schuster et al., 2016). In the following we only briefly describe the material, procedure, data acquisition and statistical analysis. Please note that the differences between this and our previous analysis are: (i) the preprocessing was done in SPM12 (previously in SPM8), (ii) in the previous study we modelled linear and quadratic effects of word length, whereas in the current analysis we only added the linear term and (iii) we now included (relative) word position into the model.
Data from forty-seven participants exhibiting average and above-average reading performance were used for the present analysis. Participants silently read 117 sentences from the Potsdam Sentence Corpus (Kliegl et al., 2004) for comprehension. For the analysis, we excluded sentence-initial words and closed-class words (i.e. determiners, particles, conjunctions, prepositions and pronouns), leaving a total of 518 words. As mentioned in the Introduction, the Potsdam Sentence Corpus provides predictability estimates for each and every word of the sentences based on an independent norming sample. These norms range between 0 and 1, denoting completely unpredictable and highly predictable words, respectively (M = 0.21; SD = 0.28). Word position was transformed into relative word position, which is defined as the ordinal rank divided by sentence length. The sentence lengths ranged from 5 to 11 words (M = 7.6; SD = 1.2). The measure of relative word position ranged from .18 to 1. In addition to word predictability and relative word position, we further considered word frequency (i.e. log-transformed occurrences per million; range: 0.0-4.4; M = 2.1; SD = 1.3) and word length (M = 5.4; SD = 2.6) in our statistical model of the fMRI analysis (see below).
In brief, the procedure was as follows: Sentence presentation was preceded by the appearance of two fixation-bars at the vertical centre near the left border of the screen. The bars remained on screen for a variable time ranging from 1000 to 3000 ms (with increments of 500 ms). While participants fixated between these bars, a drift correction/fixation control was administered by the eye-tracking system. Thereafter, a sentence appeared in the horizontal centre of the screen. Fixating a cross in the bottom right corner of the screen terminated the sentence presentation. After 10% of the sentence presentations, participants had to answer (via a button press) a simple two-alternative forced-choice question with regard to the content of the preceding sentence (12 questions in total). During the 24 null-events the fixation-bars remained on the screen for an additional 2 seconds.
Eye movements were recorded monocularly with an Eyelink CL system in the long-range setup (SR-Research, Ontario, Canada) with a sampling rate of 1 kHz. The camera was placed at the rear end of the scanner bore at a distance of approximately 90 cm behind the participant and approximately 120 cm in front of the screen. A horizontal three-point calibration routine preceded each of the three scanning sessions. Additionally, each trial was preceded by a drift correction/fixation control procedure (see above) in which a fixation had to be detected by the eye-tracking system between the fixation bars. In case the control procedure failed, the system was recalibrated.
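To make the definition of relative word position concrete, the following minimal Python sketch computes it as ordinal rank divided by sentence length. The function name `relative_positions` and the `skip_initial` flag are illustrative assumptions and not part of the original analysis pipeline; skipping the first word mirrors the exclusion of sentence-initial words described above (the exclusion of closed-class words is not modelled here).

```python
def relative_positions(sentence_length, skip_initial=True):
    """Return relative word positions (ordinal rank / sentence length).

    `skip_initial` is a hypothetical flag: when True, the first word is
    dropped, mirroring the exclusion of sentence-initial words.
    """
    if sentence_length < 1:
        raise ValueError("sentence_length must be a positive integer")
    first_rank = 2 if skip_initial else 1
    return [rank / sentence_length
            for rank in range(first_rank, sentence_length + 1)]


if __name__ == "__main__":
    # Sentence lengths in the material ranged from 5 to 11 words.
    for length in (5, 11):
        values = relative_positions(length)
        print(length, [round(v, 2) for v in values])
```

Under these assumptions, the second word of an 11-word sentence yields 2/11 ≈ .18, which would be consistent with the reported lower bound of the measure; the last word of any sentence always yields 1.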
We analysed our eye movement data by means of linear mixed models (LMM) with the lme4-package (version 1.1-12; Baayen, Davidson, & Bates, 2008) running in the R environment for statistical computing (R Core Team, 2017). Note that fixations shorter than 80 ms were excluded from both the eye-tracking and fMRI analysis (i.e. 4.2% of the data).
Functional imaging data were recorded with a Siemens Magnetom Trio 3 Tesla scanner (Siemens AG, Erlangen, Germany) equipped with a 12-channel head coil. Functional images sensitive to BOLD contrast were acquired with a T2*-weighted gradient echo EPI sequence (TR 2000 ms, TE 30 ms, matrix 64 × 64, FOV 192 mm, flip angle 80°). Thirty-six slices with a slice thickness of 3 mm and a slice gap of 0.3 mm were acquired within the TR. Scanning was divided into three sessions with a variable number of scans per session. The exact number of scans depended on the participants' reading speed and potential re-calibration procedures, and ranged from 106 scans to 437 scans (M = 152; SD = 39 scans). In addition to the functional images, a gradient echo field map (TR 488 ms, TE1 = 4.49 ms, TE2 = 6.95 ms) and a high resolution (1 × 1 × 1.2 mm) structural scan with a T1-weighted MPRAGE sequence were acquired from each participant.
For preprocessing and statistical analysis of the fMRI data we used SPM12 software (http://www.fil.ion.ucl.ac.uk/spm/) running in a MATLAB 8.1 environment (Mathworks Inc., Natick MA, USA). Functional images were corrected for geometric distortions with the FieldMap toolbox, realigned and unwarped, and then coregistered to the high resolution structural image. The structural image was normalised to the MNI T1 template image, and the resulting parameters were used for normalisation of the functional images, which were resampled to isotropic 3 × 3 × 3 mm voxels and smoothed with a 6 mm FWHM Gaussian kernel. No slice timing correction was applied.
Statistical analysis was performed by computing a fixed effects model on the first level and a random effects model on the second level. The BOLD response was related to the eye-tracking data in the specifications of the subject-specific first level model: each onset of a first fixation on a word was used to model the canonical hemodynamic response function. First fixation onsets on the first word of each sentence, the closed-class words, as well as the onsets and durations of the comprehension questions were not analysed further, but coded in separate onset vectors of no interest along with six head movement parameters derived from the realignment step during preprocessing. The functional data of these first level models were high-pass filtered with a cutoff of 128 seconds and corrected for autocorrelation by an AR(1) model (Friston et al., 2002). Parameter estimates of the first level models were further calculated in the context of a General Linear Model (GLM; Henson, 2004). For the conjoint analysis of relative word position and word predictability, these variables were added as parametric regressors of the reading versus baseline (i.e. including interstimulus intervals, null-events, and drift correction/re-calibration procedures) contrast. Additionally, we included word frequency and word length as parametric regressors in the model, so that potential effects of word predictability and relative word position can explain variance over and above what is already explained by these variables. Note that orthogonalization was deactivated in the single subject analyses which ensures that the present results capture the unique variance assigned to each of the parametric regressors (see Mumford, Poline, & Poldrack, 2015). The resultant subject-specific contrast images for word predictability and relative word position were then used for the second level random effects analysis and submitted to one-sample t-tests. 
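The logic of fixation-related parametric regressors without orthogonalisation can be sketched in a few lines of Python. This is a simplified illustration and not SPM12's implementation: the double-gamma parameters in `hrf`, the mean-centring of each modulator, the sampling at scan onsets and all function names are assumptions made for this sketch; only the TR of 2 s and the choice of modulators follow the description above.

```python
import math


def hrf(t, peak=6.0, undershoot=16.0, ratio=1.0 / 6.0):
    """Illustrative double-gamma response at time t (s); parameters are assumed."""
    if t < 0:
        return 0.0

    def gamma_pdf(x, shape):
        return x ** (shape - 1) * math.exp(-x) / math.gamma(shape)

    return gamma_pdf(t, peak) - ratio * gamma_pdf(t, undershoot)


def mean_centre(values):
    """Subtract the mean so a modulator is expressed relative to the main effect."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def convolve_events(onsets, amplitudes, n_scans, tr=2.0):
    """Sum the HRF-shaped responses to all events at each scan time."""
    return [sum(a * hrf(scan * tr - onset)
                for onset, a in zip(onsets, amplitudes))
            for scan in range(n_scans)]


def design_columns(onsets, modulators, n_scans, tr=2.0):
    """Build one main-effect column plus one column per parametric modulator.

    Each modulator column is built from its own mean-centred values only;
    no column is orthogonalised with respect to another, so a GLM fit
    attributes only unique variance to each modulator.
    """
    columns = {"reading": convolve_events(onsets, [1.0] * len(onsets), n_scans, tr)}
    for name, values in modulators.items():
        columns[name] = convolve_events(onsets, mean_centre(values), n_scans, tr)
    return columns


if __name__ == "__main__":
    # Hypothetical first-fixation onsets (s) and word properties for three words.
    onsets = [1.0, 3.5, 6.0]
    modulators = {
        "relative_position": [0.4, 0.6, 1.0],
        "predictability": [0.10, 0.00, 0.50],
        "frequency": [2.1, 0.8, 3.0],
        "length": [5, 9, 4],
    }
    columns = design_columns(onsets, modulators, n_scans=10)
    print(sorted(columns))
```

Because the columns are entered side by side rather than orthogonalised serially, the order in which the modulators are listed does not change the estimate for any of them, which is the property the analysis above relies on.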
Statistically significant effects on the whole-brain level were identified using a voxel-level threshold of p < .001 (uncorrected) and a cluster-level threshold of p < .05 (FWE-corrected for multiple comparisons).

Behavioural results
Performance on the two-alternative forced-choice comprehension questions was close to ceiling, with a mean accuracy of 98.94%. In order to assess whether relative word position and word predictability contribute significantly to predicting first fixation duration and gaze duration, we computed two LMMs for each measure (the relationships are displayed in Figure 1). In one of the LMMs, we included relative word position, whereas in the other one we omitted relative word position. As additional predictors (i.e. fixed effects), we considered word predictability, frequency and length. Random effects were participants, words and sentences. The results of the two LMMs are provided in Table 1. Both models revealed significant effects of word frequency and word length; first fixation and gaze duration decreased with increasing frequency and increased with increasing word length. The LMM without word position as a predictor did not yield a significant effect of word predictability in the expected direction (on first fixation duration the effect of word predictability even indicated a significant positive relationship). The model which included the predictor word position, by contrast, revealed significant effects in the expected direction of both word predictability and relative word position. Mean first fixation duration and gaze duration decreased with increasing predictability and increased towards the end of the sentences (i.e. with increasing word position). A model comparison revealed that the models including word position are significantly better in predicting first fixation and gaze duration than the models without the predictor word position (χ²s > 72; ps < .001).
Note: fixdur = fixation duration; prd = predictability; len = length; frq = frequency; sbj = subject; wrd = word; sen = sentence.
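The arithmetic behind such a model comparison can be illustrated with a likelihood-ratio test. The sketch below assumes that the comparison was a likelihood-ratio test and that the two nested models differ by exactly one parameter (the fixed effect of relative word position), so that the statistic is referred to a chi-square distribution with 1 degree of freedom; both points are assumptions, and the log-likelihood values are hypothetical.

```python
import math


def lrt_statistic(loglik_reduced, loglik_full):
    """Likelihood-ratio statistic: twice the log-likelihood difference."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi_square_p_df1(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    For df = 1 the survival function reduces to erfc(sqrt(x / 2)).
    The choice of df = 1 is an assumption of this sketch.
    """
    if statistic < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.erfc(math.sqrt(statistic / 2.0))


if __name__ == "__main__":
    # Hypothetical log-likelihoods chosen so that the statistic equals 72.
    statistic = lrt_statistic(loglik_reduced=-100.0, loglik_full=-64.0)
    print(statistic, chi_square_p_df1(statistic) < 0.001)
```

With one degree of freedom, any statistic above 72 corresponds to a p-value far below .001, in line with the reported comparison.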

fMRI results
The results of the analysis of our parametric design are provided in Table 2 and illustrated in Figure 2. As can be seen, this analysis revealed effects of relative word position in the left posterior middle temporal gyrus, the left angular gyrus and left occipito-temporal regions. The left posterior middle temporal gyrus and the left angular gyrus exhibited an increase in activation with increasing relative word position. The left occipital pole, the left inferior occipital gyrus and the left fusiform gyrus, by contrast, showed a decrease in activation as a function of increasing relative word position. In order to investigate whether sentence-final words evoke elevated processing demands on their own, we contrasted activation elicited by fixations on sentence-final words with activation elicited by fixations on previous words. This analysis did not reveal any differences at the whole-brain level in either direction (i.e. final > previous words, previous > final words).
As reported in the original analysis, we did not observe any significant clusters for the effect of word predictability at the whole-brain level when applying a conservative threshold. Only lowering the threshold to p < .05 revealed effects within bilateral inferior frontal regions, left posterior-to-anterior middle and superior temporal regions, the left precentral gyrus, the left middle frontal gyrus and the right insular cortex which showed a decrease in activation as a function of increasing predictability.
Since the aim of the present study was to investigate whether regions associated with semantic processing exert differential effects of word predictability and word position, we conducted ROI analyses based on coordinates previously identified in a meta-analysis on semantic processing (Binder et al., 2009). We only considered regions in the left hemisphere, namely, inferior parietal (inferior parietal cortex:   word predictability was located in the intermediate section of the left middle temporal gyrus (Figure 3).

Discussion
The present co-registration study investigated the haemodynamic effects of incremental sentence comprehension and local, word-level contextual effects. To this end, we re-analysed data from combined recordings of BOLD signals and eye movements while participants silently read sentences for comprehension. As proxies for incremental comprehension and word-level contextual effects we considered word position and word predictability, respectively. Corroborating previous evidence from eye movement research, we observed an increase in fixation durations over the course of sentence processing as well as a decrease in fixation durations as a function of increasing word predictability (Ashby, Rayner, & Clifton, 2005; Balota et al., 1985; Fitzsimmons & Drieghe, 2013; Hawelka, Schuster, Gagl, & Hutzler, 2015; Kliegl et al., 2004, 2006; Kuperman et al., 2010; Staub, 2011), however only when considering both factors within the statistical model. Our fMRI analyses revealed that incremental sentence processing (reflected by increasing word position) is associated with increasing activation in the left posterior middle temporal gyrus (pMTG) and the angular gyrus (AG). At the same time, left occipital regions extending to the left fusiform gyrus exhibited a decrease in activation with increasing word position. By contrast, the effect of word predictability did not exceed our predefined (conservative) threshold at the whole-brain level. Literature-based ROI analyses within the semantic network (see Binder et al., 2009) revealed effects of word predictability in the left inferior frontal gyrus (IFG), left middle temporal gyrus (MTG) and the anterior temporal lobe (ATL). In sum, word predictability and word position largely affected distinct cortical regions. The only region sensitive to both predictability and position was the intermediate portion of the left MTG.
Figure 3. ROIs representing contrast estimates in left-hemispheric "semantic network" regions (coordinates are based on a meta-analysis of Binder et al. (2009) and are indicated by the red blobs) for the effect of word position (dark grey) and word predictability (light grey). Error bars represent 1 SEM. Significant effects are indicated with an asterisk. Abbreviations: SMG = supramarginal gyrus; AG = angular gyrus; IPC = inferior parietal cortex; pMTG = posterior middle temporal gyrus; MTG = middle temporal gyrus; ATL = anterior temporal lobe; IFG orb = inferior frontal gyrus pars orbitalis; IFG tri = inferior frontal gyrus pars triangularis.
Generally, word position revealed stronger effects on brain activation than word predictability at the whole-brain level. Only targeted ROI analyses in the left MTG and the left IFG showed that higher predictability resulted in a decrease of activation. A probable explanation for the relatively small effect of predictability on brain activation is that we utilised a sentence corpus in which the predictability of the words was not explicitly manipulated (as is the case in studies which compared activation elicited by highly predictable versus unpredictable target words; e.g. Baumgaertner et al., 2002; Dien et al., 2008; Hartwigsen et al., 2017). The mean predictability of the words in the present analysis, in contrast, was rather moderate (M = 0.21).
Furthermore, semantic and morphosyntactic information may be more predictable from the previous sentence context than the actual word form (i.e. lexical predictability). Using the Provo Corpus in an eye-tracking study, Luke and Christianson (2016) showed that predictable semantic and morphosyntactic information had a facilitatory effect on reading times beyond the effect of pure lexical predictability (in the Provo Corpus, comparable to the Potsdam Sentence Corpus, only 5% of the content words are highly predictable; Luke & Christianson, 2018). Future neuroimaging studies therefore may investigate whether regions of the language network are more sensitive to semantic and morphosyntactic predictability than to lexical predictability and/or whether these different levels of predictability indeed elicit discernible effects (see Carter, Foster, Muncy, & Luke, 2019).
Increasing word position was associated with an increase in activation in the left pMTG and in areas of the left (inferior) parietal cortex, that is, the supramarginal gyrus and the AG. In a meta-analysis on semantic processing, Binder et al. (2009) reported that most activation foci were located in the AG. Thus, the AG is supposed to form the "top of a processing hierarchy" dedicated to semantic knowledge retrieval and integration (Binder et al., 2009, p. 2776). Beyond the semantic processing of single words, the AG has been proposed to bind multiple semantic concepts into a coherent semantic representation (Friederici, Rüschemeyer, Hahne, & Fiebach, 2003; Newman, Just, Keller, Roth, & Carpenter, 2003; Ni et al., 2000). Moreover, investigating the time course of the BOLD signal in the AG during sentence and word list processing revealed, compared to baseline, an increase in activation only when consecutive single words can be translated into a coherent sentence interpretation (Humphries, Binder, Medler, & Liebenthal, 2007). Critically, this response occurred relatively late. This late unfolding of AG activation was interpreted as reflecting combinatorial semantic processing which, supposedly, enables the construction of an overall sentence meaning. The presently observed effect of word position in the AG would conform to such an interpretation.
In contrast, the left MTG and surrounding superior temporal regions in Humphries et al. (2007) showed an instantaneous increase in activation not only for semantically congruent sentences, but also for congruent word lists (i.e. the scrambled version of a coherent sentence, e.g. "on vacation lost a and bag wallet man a") compared to a list of semantically unrelated words. The fact that the MTG does not show a differential effect for coherent sentence structure (beyond congruent word lists) was interpreted as reflecting its role in semantic processing of individual words. This notion is corroborated by evidence reporting severe comprehension difficulties at the single word level when a lesion affects the left (posterior) MTG (e.g. Dronkers, Wilkins, Van Valin, Redfern, & Jaeger, 2004). By contrast, lesions affecting more anterior portions of the MTG, surrounding superior temporal regions, and frontal and parietal regions (including the AG) have been associated with comprehension deficits beyond the level of single words. The increase in activation over the course of sentence processing in both AG and pMTG observed in the present study therefore might indicate that the pMTG feeds single-word-based semantic information towards the AG, which subsequently generates an overall sentence interpretation.
This evidence for the involvement of the AG and the pMTG in word-level and sentence-level semantic processing, however, does not necessarily mean that the activation could not also reflect additional or different processes, such as working memory processes or spatial coding. To illustrate, Dambacher et al. (2006) speculated that effects of word position might indicate working memory load, which presumably is highest in the middle of a sentence, resulting in a U-shaped pattern of P200 amplitudes as a sentence unfolds. Furthermore, Kuperman et al. (2010) provided evidence that the sentence "wrap-up" effect is also triggered by visual properties of the text. Specifically, the authors observed the wrap-up effect only at the end of a sentence in single-line reading and at the end of the last sentence of a paragraph, but not at the end of the sentences (or clauses) within paragraphs. More targeted fixation-related studies would be needed to resolve the contribution of these factors to the effect of word position.
In the present study, more anterior regions previously attributed to sentence comprehension, namely the left IFG and ATL, did not exhibit effects of word position, but effects of word predictability. As noted before, it is as yet unclear which particular processes are associated with these cortical activations (i.e. compositional semantics: e.g. Vandenberghe et al., 2002; or syntactic processing: e.g. Humphries et al., 2006). The absence of an effect of word position and the presence of a predictability effect in these regions (to be specific, in the IFG pars orbitalis and the ATL) indicates that the activation is driven more by local, word-based information processing than by incremental sentence-level processing. Previous research reported IFG activation preferentially when linguistic complexity was high (either syntactically or semantically; e.g. Constable et al., 2004; Haller, Klarhoefer, Schwarzbach, Radue, & Indefrey, 2007; Friederici, Fiebach, Schlesewsky, Bornkessel, & von Cramon, 2005; Friederici et al., 2003). Thus, one may speculate that the IFG engages when controlled top-down processes, such as syntactic and semantic disambiguation, are required for sentence comprehension (e.g. Hagoort, Hald, Bastiaansen, & Petersson, 2004; Kiehl, Laurens, & Liddle, 2002; Kuperberg, Sitnikova, & Lakshmanan, 2008; Kuperberg et al., 2003; Newman, Pancheva, Ozawa, Neville, & Ullman, 2001; Zhu et al., 2012, 2013).
Regions which were inversely associated with word position (i.e. showing a decrease in activation with increasing contextual information) were left occipital areas extending to the left fusiform gyrus. This effect could be explained as follows: During reading, each incoming word contributing to sentence comprehension may reduce the reliance on bottom-up visual information, which might be reflected in a steady decrease in activation within early visual areas (Rao & Ballard, 1999).
Notably, when one does not control for word position, this pattern of reduced activation in occipito(-temporal) regions is observed for, and ascribed to, word predictability (Schuster et al., 2016; Willems et al., 2016). It is conceivable that occipito-temporal regions are sensitive to both word position and predictability when, for example, predictability is (on average) higher than in the presently used sentence corpus or when predictability is operationalised differently, for instance by the predictability of syntactic categories (e.g. Dikker, Rabagliati, Farmer, & Pylkkänen, 2010; Dikker, Rabagliati, & Pylkkänen, 2009).
As mentioned above, the only region which exhibited a conjoint effect of predictability and word position was the intermediate portion of the left MTG. Tractography and functional connectivity analyses revealed that this region is characterised by a distributed structural and functional connectivity pattern within the brain's language network (Turken & Dronkers, 2011). Moreover, the left mid MTG has been identified as a provincial hub, that is, a hub primarily linking nodes within the perisylvian language module (Xu et al., 2016). One may speculate that the sensitivity to both word position and word predictability reflects the region's involvement in shaping linguistic predictions which are further propagated within the perisylvian language network. Thus, accumulating contextual information during sentence processing may facilitate perceptual and cognitive processing of subsequently processed words as envisioned by recent neurocognitive theories about predictive coding (Clark, 2013; Friston, 2009, 2010).
To conclude, the present study shows that word predictability and word position, although correlated, elicit dissociable effects in different brain regions within the semantic network. Based on the association between these two variables it has been reasoned that word position can serve as a proxy of contextual constraint (e.g. Schoffelen et al., 2017; Van Petten & Kutas, 1990; Yarkoni, Speer, Balota, McAvoy, & Zacks, 2008). In order to fully examine effects of word position and predictability, these variables would have to be independently manipulated by means of embedding predictable and unpredictable target words at the beginning, middle or end of a line of text (e.g. Parker, Kirkby, & Slatter, 2017). When adopting a quasi-experimental approach, however, the present co-registration study demonstrates that it is expedient to consider word position for the analysis of predictability effects.

Funding
This work was supported by the Austrian Science Fund (FWF P 25799-B23).