Reading Poetry and Prose: Eye Movements and Acoustic Evidence

ABSTRACT We examined genre-specific reading strategies for literary texts and hypothesized that text categorization (literary prose vs. poetry) modulates both how readers gather information from a text (eye movements) and how they realize its phonetic surface form (speech production). We recorded eye movements and speech while college students (N = 32) orally read identical texts that we categorized and formatted as either literary prose or poetry. We further varied the text position of critical regions (text-initial vs. text-medial) to compare how identical information is read and articulated with and without context; this allowed us to assess whether genre-specific reading strategies make differential use of identical context information. We observed genre-dependent differences in reading and speaking tempo that reflected several aspects of reading and articulation. Analyses of regions of interest revealed that word-skipping increased particularly while readers progressed through the texts in the prose condition; speech rhythm was more pronounced in the poetry condition irrespective of the text position. Our results characterize strategic poetry and prose reading, indicate that adjustments of reading behavior partly reflect differences in phonetic surface form, and shed light on the dynamics of genre-specific literary reading. They generally support a theory of literary comprehension that assumes distinct literary processing modes and incorporates text categorization as an initial processing step.


Introduction
Readers' schematic conceptions of text types and genres allow them to strategically process and comprehend unfamiliar texts (e.g., Van Dijk & Kintsch, 1983). Readers adjust, for instance, to different types of literary texts such as literary prose and poetry, two broad categories of literary composition distinct enough to result in strategic differentiation. Previous work has revealed a number of differences between the processing strategies for these literary genres, for example, that readers associate them with distinct conceptions of how their phonetic surface form should be realized. Converging phonetic evidence even allows us to characterize these genre-appropriate articulation strategies in terms of distinctive features (Bröggelwirth, 2007; Byers, 1979; Fant et al., 1991; Wagner, 2012), although the status of some features remains disputed (Barney, 1999). A similarly fine-grained characterization of poetry- and prose-specific reading strategies is currently impossible, though, since relevant contrastive evidence remains sparse and partly contradictory.
[...] verse lines) whose rhythmic patterns repeat periodically (Fabb, 2015). These fundamental differences between prototypical prose and poetry should lead to a basic differentiation of readers' literary genre conceptions (Steen, 1999) and to an appropriate differentiation of literary processing strategies.
Previous research indicates that these differences are indeed reflected in the processing strategies for literary prose and poetry. Not only does poetry comprehension result in particularly strong representations of surface form (Hanauer, 1998), which are more durable and accurate than those resulting from prose comprehension (Tillmann & Dowling, 2007), but contrastive phonetic studies also show consistently that readers associate poetry and prose with distinct conceptions of how phonetic surface form should be realized (e.g., Byers, 1979; Wagner, 2012). These systematic phonetic differences indicate that strategic literary comprehension involves genre-appropriate prosodic modulations; converging cross-linguistic evidence has even identified a number of melodic and rhythmical features that characterize "poetic intonation" and that set it apart from the oral performance of prose. For one, speakers articulate poetry more slowly than prose and systematically lengthen speech units like phonemes, syllables, and prosodic feet (Bröggelwirth, 2007; Kruckenberg & Fant, 1993). Poetry-specific syllable lengthening affects stressed syllables to a greater degree than unstressed ones, which enhances prosodic prominence contrasts (as indexed by increased duration ratios of strong [=stressed] and weak [=unstressed] syllables) and thus leads to more pronounced speech rhythm for poetry than for prose (Kruckenberg & Fant, 1993; Wagner, 2012). Moreover, silent speech pauses are more frequent in poetry (Barney, 1999; Byers, 1979). Although driven by articulation speed and speech pauses, modulations of the global speaking rate (or speech rate), a distinctive feature of Byers's (1979) original "formula for poetic intonation", have proven less uniform in earlier investigations (Barney, 1999), so that their genre-distinctive status remains unclear.
The present study aimed to reassess these genre-specific articulatory adjustments and to clarify the status of speaking rate modulations and the relative contribution of articulation and speech pauses.
Whereas these genre-appropriate articulatory adjustments are fairly well established, extant contrastive evidence for poetry- and prose-specific reading behavior remains relatively sparse. In line with the speaking rate reduction observed in most contrastive phonetic studies, strategic poetry reading appears to be slower than prose reading (Hanauer, 1998; Peskin, 2007, 2010), which has previously been attributed to distinct interpretive operations in poetry comprehension (Blohm et al., 2017; Gibbs et al., 1991; Hoffstaedter, 1987; Peskin, 2007). But unidimensional measures like reading times allow only for a very crude characterization of genre-specific reading behavior, whereas eye-tracking can provide a more fine-grained picture. A contrastive eye-tracking study by Koops van 'T Jagt et al. (2014) focused on lineation in poetry and demonstrated that this characteristic formal feature locally increases reading times and affects regressions at the words preceding and following line breaks compared to prose versions where these regions occurred line-medially. The most detailed description of genre-specific eye-movement correlates stems from a study by Fischer et al. (2003) in which participants read poems in both the original layout and in prose format. Results revealed that average fixation durations were longer for the original poem versions and that progressive saccades were smaller, whereas regressive ones were longer and more frequent. Observed differences in the number of fixations and in global reading rates did not reach statistical significance, which is at odds with the respective reading-time data. A recent study by Fechino et al. (2020) confirmed that the poetry (vs. prose) layout leads readers to reduce their reading speed and to regress more frequently.
The present study extends this sparse evidential basis, aiming to characterize poetry- and prose-specific reading behavior in terms of distinctive eye-movement adjustments and to show that genre-specific reading partly reflects modulations of phonetic surface form.
Taken together, the available evidence seems to confirm that readers pursue distinct strategies for literary prose and poetry. Zwaan's model of literary comprehension accounts for this differentiation of literary comprehension in terms of gradient effects of the LCCS, which are more pronounced for poetry than for prose, because readers expect a greater degree of "literariness" in verse, for example, systematic sound patterns and creative figurative language. This gradient conception of literary reading strategies contrasts with a categorial difference, proposed by Kintsch (1998, pp. 206-209), that acknowledges the cohesive force of sound patterning in verse. He argued that prose comprehension results only in representations of linguistic surface form, propositional content, and in mental models of the states of affairs described, whereas poetry comprehension additionally results in the construction of a versification level that represents the systematic sound patterns of verse. But contrary to linguistic, propositional, and situational representations, the online construction of the versification level abstracts implicit patterns from periodic sound recurrences encountered in the phonetic surface form of the context. This entails that, even if the reading strategy for poetry contains "instructions to attend to sound recurrences in the assumption that these are not random" (De Beaugrande, 1978, p. 24; see also Rosenblatt, 1978), the representations of rhythmic patterns emerge incrementally if the prosodic context gives sufficient evidence of the underlying regularities.
Such dynamics are captured by the structure building framework (Gernsbacher, 1991, 1997), which maintains that text-initial information results in the representational foundation(s) onto which later input is mapped. Based on this idea, we hypothesized that early poetry comprehension seeks to construct strong foundations of surface form and of its rhythmic patterns at the versification level and that later prosodic information is then integrated in accordance with these regularities. Specifically, we expected that, reflecting the incremental construction of the versification level, oral poetry performance becomes increasingly rhythmic as readers navigate through a text. By contrast, early prose comprehension should focus on the foundations of a coherent propositional discourse model, which thematically constrains and guides later comprehension and thus warrants less careful navigation through a text. Hence, we expected that prose reading becomes increasingly risky with mounting discourse context.

Present study
We recorded speech signals and eye movements while participants (N = 32) orally read short texts (N = 48) that we categorized and formatted as either literary prose or poetry (genre: prose vs. poetry). Presenting identical texts in both genre conditions controlled linguistic and thematic variables and isolated effects of the text category. This enabled us to address unresolved issues regarding the articulation strategies of these genre conceptions, to extend the sparse contrastive eye-movement evidence and characterize poetry- and prose-specific reading behavior in terms of distinctive eye-movement indices, and to assess to what extent potential genre-dependent adjustments of reading speed (previously shown to correlate with distinct interpretive operations) reflect genre-appropriate articulation strategies and the construction of phonetic surface form.
The genre-specific acoustic and behavioral profiles we aim to obtain are useful approximations to strategic default adjustments. But since they constitute genre-dependent differences averaged across readers and texts, they fail to reflect the incrementality of the comprehension process and, hence, cannot reveal the genre-specific processing dynamics we hypothesized. To test these hypotheses, we varied the position of a critical region, which either occurred at the very beginning of the text (where no context information is available) or following a short context (position: text-initial vs. text-medial). The rationale behind this manipulation was that genre-appropriate default adjustments should result in differences that remain constant across the text or that occur only text-initially. Genre-specific processing dynamics, by contrast, should be reflected in differences between genres that emerge only later in the text. Comparing acoustic correlates of word stress in the critical region thus enabled us to examine whether (only) the oral performance of poetry becomes increasingly rhythmic. Analyzing word-skipping rates in the critical region allowed us to assess whether (only) prose reading becomes riskier if guiding discourse context is available.

Design and hypotheses
We used a 2 × 2 factorial design that crossed genre (poetry vs. prose) and the text position of critical regions (initial vs. medial) within participants and texts using a Latin square (cf. Koops van 'T Jagt et al., 2014). Presenting identical texts in either a four-line stanza format (genre: poetry) or, with two of the line breaks removed, in a two-line prose layout (genre: prose) kept linguistic variables constant and thus avoided potential confounds between text and genre (cf. Bröggelwirth, 2007; Fischer et al., 2003); see Materials below for details. In addition to this visual layout cue, participants received written instructions that explicitly specified the genre of the texts (cf. Schmitz et al., 2017; Zwaan, 1994). Based on previous findings, we predicted that readers reduce their speaking tempo for poetry versus prose and that this reduction reflects slower articulation and more frequent speech pauses. We further predicted a similar reduction of the reading tempo, which, following earlier eye-movement results, we expected to reflect more and longer fixations in poetry than in prose, shorter progressive saccades, and longer and more frequent regressive ones. Finally, we expected that genre-specific adjustments of reading speed partly reflect the respective modulations of phonetic surface form.
To test our hypotheses regarding genre-specific processing dynamics, we varied the text position of critical regions by inverting the order of the two complex sentences that made up each text, which kept linguistic variables constant across position conditions, too. Critical regions thus occurred either without prior context at the very beginning of the text (position: text-initial) or following a complex context sentence (position: text-medial). The rhythmicity of oral text performance was captured by phonetic S/W (=strong/weak) ratios of the critical region, that is, by the relative duration, pitch, and intensity of strong and weak syllables. Based on the results of Wagner (2012), we predicted greater S/W ratios, and thus more pronounced speech rhythm, in poetry than in prose. If this rhythmicity effect is entirely driven by genre categorization, then readers should emphasize linguistic rhythm independently of prosodic context, and we should observe a stable genre difference across text positions. But if it reflects text-specific metrical representations at the versification level, then the initial lines of a poem first need to lay this foundation before a genre difference arises text-medially (interaction effect of genre and position). Finally, if poetry readers initially rely on their schematic genre conceptions and then integrate the prosodic regularities of the recurrent metrical pattern, we should observe an initial prosodic contrast between genres that further diverges text-medially (main effect of genre + interaction effect of genre and position). We further predicted that words in context are skipped more frequently than text-initial ones (main effect of position) and that this tendency, indicative of readers' increasing reliance on discourse-semantic constraint as they navigate through a text, is absent or greatly attenuated in poetry reading (interaction effect of genre and position).

Materials
We constructed 48 short critical texts of comparable complexity (Table 1), each consisting of two (typically biclausal) sentences (Figure 1A); critical texts are provided in Appendix A. Care was taken that the speech rhythm of all texts was consistent with a binary metrical pattern in which strong and weak syllables alternate (Figure 1B). Stimulus texts were either formatted as a two-line prose excerpt or as a four-line stanza of poetry (Figure 1C; see also Beck & Konieczny, 2021; Fechino et al., 2020); in line with genre conventions, the poetry layout further differed from the prose format in terms of line-initial capitalization irrespective of word class (cf. Fischer et al., 2003), which otherwise determines sentence-internal capitalization in German orthography. Note, however, that this did not affect the critical regions, which consistently featured sentence-initial capitalization across genre conditions. Each text contained a critical region of two monosyllabic function words (FWs, e.g., "on the") whose phonetic realization and fixation probability we expected to differ between conditions. We ensured that the critical regions occurred line-initially in both genre conditions and that none of the line breaks necessitated syntactic reanalysis at the beginning of the next line (in most cases, line breaks corresponded to punctuation-marked clause boundaries). Each critical FW pair constituted the initial prosodic foot of the respective sentence and conformed to one of four syntactic frames that allowed for sufficient lexical variation between items: (1) dummy pronoun + auxiliary (e.g., "es hat" = "it has"); (2) subordinating conjunction + pronoun (e.g., "als ich" = "when I"); (3) relative wh-pronoun + determiner/pronoun (e.g., "was ich" = "what I"); and (4) preposition + determiner (e.g., "auf der" = "on the"). Each syntactic frame was represented by 12 critical items; the number of items per syntactic frame and condition was balanced for each participant.
Critical regions were followed by trochaic content words that were either free lexemes (44 items) or first constituents of compound words (4 items). This postcritical region served for comparison with the critical region and biased readers toward realizing the preceding critical region as a trochaic foot (=stressed-unstressed) to optimize rhythm (Vogel et al., 2015) and to parallelize adjacent prosodic feet (Kentner, 2015;Wiese, 2016;Wiese & Speyer, 2015).
All texts were presented in each of the four conditions. The resulting 192 critical text versions were distributed over four lists according to a Latin square (participants/texts/conditions). Each participant read 24 critical texts per genre condition, presented in separate blocks and with the appropriate layout; each genre block additionally contained 24 authentic texts (short prose excerpts or stanzas of poetry with various meters) that only served to reinforce the instruction and were excluded from the analysis. The order of genres was counterbalanced across participants so that each ordered list was read by four participants.
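The list construction described above can be illustrated with a minimal sketch. It assumes a simple cyclic rotation of conditions over text indices; the function name `latin_square_lists` and the rotation scheme are hypothetical, and the sketch does not reproduce the additional balancing of syntactic frames or the counterbalanced block order.

```python
from collections import Counter

GENRES = ("poetry", "prose")
POSITIONS = ("initial", "medial")
# The four cells of the 2 x 2 design (genre x position).
CONDITIONS = [(g, p) for g in GENRES for p in POSITIONS]


def latin_square_lists(n_texts=48, conditions=CONDITIONS):
    """Distribute n_texts x len(conditions) text versions over len(conditions) lists.

    Hypothetical scheme: list k shows text t in condition (t + k) mod 4, so each
    text occurs once per list and in every condition across the lists.
    """
    n_lists = len(conditions)
    return [
        {t: conditions[(t + k) % n_lists] for t in range(n_texts)}
        for k in range(n_lists)
    ]


lists = latin_square_lists()
# Each list contains 12 texts per condition, i.e. 24 texts per genre.
print(Counter(lists[0].values()))
```

With 48 texts and four conditions, this rotation yields 192 text versions in total, matching the count given above.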

Participants
We recruited 32 adult native speakers of German from the University of Frankfurt community (23 women; mean age, 25.4 years; range, 18-39 years); all of them had normal or corrected-to-normal vision, and none reported any known reading or speech disorders. Data from three further participants were excluded, as the removal of outliers and trials containing speech errors resulted in less than 75% of the original observations in at least one condition; the recording session of one further participant was aborted due to ongoing calibration problems. Note that our participants were "experienced readers" in the sense that they had learned to read at (roughly) the age of 6 and that their education (>10 years of school) had familiarized them with different literary genres. Still, they were literary nonexperts in the sense that none of them had extensive creative-writing experience or had received a higher education focusing on literature (cf. Hanauer, 1996).

Procedure
Before the main experiment, we informed participants about the experimental procedures and familiarized them with the technical equipment. Experimental sessions took about 45 minutes, comprising two blocks (poetry and prose; ~15 minutes each) separated by a short break (~5-10 minutes). During the main experiment, participants were seated in a dimly lit and sound-attenuated recording booth while two experimenters (seated outside the recording booth) monitored the eye-movement recording and documented speech errors. Participants received written instructions that specified the genre (prose or poetry) at the beginning of each block (Appendix B). To minimize conscious articulatory planning prior to articulation, participants were instructed to orally read each text as soon as it appeared on the screen and to press a button on a handheld gamepad once they had finished reading the text. Having made sure that the instructions had been read and understood, an experimenter fixed and checked the headset microphone, and the experiment began with a short practice session of two trials per block. During the break, we removed the microphone and led participants into an adjoining room where they had a cup of water and answered a brief questionnaire about their reading habits (~5 minutes). Once participants had filled out the questionnaire, the experimental session was resumed with the second genre block, following the procedure described above. Each trial began with the presentation of a small black square just to the left of where the first word of the text was going to be displayed. Onset of the text presentation and start of the voice recording were gaze-contingent and began once a valid fixation had been registered on this square. All experimental procedures were approved by the Ethics Council of the Max Planck Society and were undertaken with written informed consent from each participant.

Recording
Text presentation and audio recording were controlled via the open-source software EyeTrack. Voice recordings were made using a directional headset microphone (DPA Microphones A/S, Alleroed, Denmark) and sampling at 44.1 kHz with a 16-bit resolution. The recording level was fine-tuned for each participant prior to the recording session.
Eye movements were registered with an EyeLink 1000 eye tracker (SR Research Ltd., Ontario, Canada) sampling at 1,000 Hz. Viewing was binocular, but only one eye was monitored (the right eye whenever possible). Participants were seated at a distance of approximately 60 cm from the screen. Texts were aligned to the left edge of the display and presented in a black 28-point Courier font on a light-gray background; this presentation ensured that all texts could be displayed as intended and be easily read by the participants. Recording sessions began with a nine-point calibration of the eye tracker, and a drift correction was performed before each trial. Calibration was repeated after breaks and when deemed necessary by the experimenter.

Outlier removal
We excluded all trials from the analysis that contained speech errors (as documented during the experiment by one of the experimenters) or that had late articulation onsets exceeding a threshold of three median absolute deviations above the median value (Leys et al., 2013), corresponding to 1.79 seconds post onset of the text display. On the basis of these criteria, 10.87% of the data points were excluded. The remaining observations were distributed evenly across genres (chi-squared test for given probabilities: χ²(1) = 0.07, p = .787) and conditions (χ²(3) = 2.70, p = .440); average articulation onsets of the remaining trials did not differ between genre conditions (Welch's two-sample t-test: t(1367.8) < 1, p = .696).
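As an illustration of the onset criterion, the following sketch computes an upper cutoff of the form median + 3 × MAD. The consistency constant 1.4826 follows the convention discussed by Leys et al. (2013); whether the reported threshold used this scaling is an assumption here, so it is exposed as a parameter. The function name and the example onsets are hypothetical.

```python
from statistics import median


def mad_upper_threshold(values, n_mads=3.0, scale=1.4826):
    """Upper cutoff: median + n_mads * (scale * median absolute deviation).

    scale=1.4826 makes the MAD comparable to a standard deviation under
    normality; set scale=1.0 for the raw MAD (assumption: the source does
    not state which variant was used).
    """
    med = median(values)
    mad = scale * median([abs(v - med) for v in values])
    return med + n_mads * mad


# Hypothetical articulation onsets in seconds (not data from the study).
onsets = [0.8, 0.9, 1.0, 1.1, 1.2, 3.0]
cutoff = mad_upper_threshold(onsets)
kept = [o for o in onsets if o <= cutoff]
print(round(cutoff, 3), kept)
```

Only late onsets are removed in this sketch, mirroring the one-sided criterion described above.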

Acoustic analysis
Voice recordings were presegmented using the software MAUS (Kisler et al., 2017). Subsequently, syllable boundaries in critical and postcritical regions were inspected and manually corrected if necessary, before we extracted phonetic parameters using Praat (Boersma & Weenink, 2017).
Global measures. We extracted the total speaking time per trial (i.e., the time from articulation onset to offset) and the number and durations of silent speech pauses; total articulation time per trial was obtained by subtracting the summed pause durations from the total speaking time. From these unstandardized measures we then calculated trial-level speaking/speech rates (=number of syllables/total speaking time) as a general index of oral reading fluency (cf. Byers, 1979); speaking rates reflect the average number of syllables produced per second so that values decrease when speakers slow down. To obtain a more fine-grained characterization of oral text performance, we further calculated articulation rates (=number of syllables/articulation time), the proportion of pauses per trial (=summed pause durations/total speaking time), pause rates (=number of syllables/number of pauses), and the average pause duration.
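The global acoustic measures defined above reduce to a few ratios per trial. The sketch below spells them out; the function name, argument names, and example values are hypothetical, and times are assumed to be in seconds.

```python
def global_speech_measures(n_syllables, speaking_time, pause_durations):
    """Trial-level measures as defined in the text (times in seconds).

    speaking_time   : articulation onset to offset, pauses included
    pause_durations : durations of the silent pauses within the trial
    """
    total_pause = sum(pause_durations)
    articulation_time = speaking_time - total_pause  # speaking time minus pauses
    n_pauses = len(pause_durations)
    return {
        "speaking_rate": n_syllables / speaking_time,          # syllables per second
        "articulation_rate": n_syllables / articulation_time,  # pauses excluded
        "pause_proportion": total_pause / speaking_time,
        # Undefined for trials without pauses; None is a placeholder choice.
        "pause_rate": n_syllables / n_pauses if n_pauses else None,
        "mean_pause_duration": total_pause / n_pauses if n_pauses else None,
    }


# Hypothetical trial: 30 syllables spoken in 8 s with three silent pauses.
print(global_speech_measures(30, 8.0, [0.5, 0.3, 0.2]))
```

Because the articulation rate excludes pause time, it can never be lower than the speaking rate for the same trial.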
Local measures. We further extracted the duration, mean pitch, and mean intensity for each syllable in the critical and postcritical regions. From these values we calculated S/W ratios (cf. Wagner, 2012), that is, the relative duration (in ms), intensity (in dB), and pitch (in Hz) of strong and weak syllables per prosodic foot (=region), assuming a trochaic rhythm (=SW) for both critical and postcritical feet.
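A minimal sketch of the S/W ratio computation, assuming each syllable is summarized by its duration, mean pitch, and mean intensity, and that the first syllable of each foot is the strong one (trochaic pattern). The dictionary layout and all values are hypothetical.

```python
def sw_ratios(strong, weak):
    """Strong/weak ratio per acoustic dimension for one trochaic foot.

    strong, weak: dicts with 'duration' (ms), 'pitch' (Hz), 'intensity' (dB).
    Values above 1 indicate that the strong syllable is more prominent.
    """
    return {key: strong[key] / weak[key] for key in ("duration", "pitch", "intensity")}


# Hypothetical measurements for a critical function-word pair.
foot = [
    {"duration": 180.0, "pitch": 210.0, "intensity": 66.0},  # strong (first) syllable
    {"duration": 120.0, "pitch": 200.0, "intensity": 60.0},  # weak (second) syllable
]
print(sw_ratios(foot[0], foot[1]))
```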

Eye-movement analysis
Global measures. We first calculated trial-level reading rates, a general index of participants' reading performance that relates reading time to the amount of information processed. Reading rates were calculated for each trial by dividing the summed fixation durations (i.e., excluding saccades) by the number of characters per text (cf. Fischer et al., 2003); note that, contrary to speaking rates, reading rate values increase when speakers slow down. To obtain a more nuanced characterization of how participants gathered information from each text, we further analyzed the fixation rate per trial (=number of fixations/number of words) and the average fixation duration as well as the percentage of regressive saccades and the average length of progressions and regressions.
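The global eye-movement measures can be sketched in the same way. The example assumes that saccades are available as signed amplitudes (positive = progressive, negative = regressive); how return sweeps between lines were classified is not stated in the text, so that preprocessing step is left out. All names and values are hypothetical.

```python
def _mean(xs):
    return sum(xs) / len(xs) if xs else None


def global_eye_measures(fixation_durations_ms, saccade_amplitudes, n_chars, n_words):
    """Trial-level eye-movement measures as defined in the text.

    saccade_amplitudes: signed lengths (assumed unit: characters);
    positive values are progressions, negative values are regressions.
    """
    progressions = [a for a in saccade_amplitudes if a > 0]
    regressions = [-a for a in saccade_amplitudes if a < 0]
    n_saccades = len(saccade_amplitudes)
    return {
        # Summed fixation time per character: larger values mean slower reading.
        "reading_rate": sum(fixation_durations_ms) / n_chars,
        "fixation_rate": len(fixation_durations_ms) / n_words,
        "mean_fixation_duration": _mean(fixation_durations_ms),
        "pct_regressions": 100 * len(regressions) / n_saccades if n_saccades else None,
        "mean_progression_length": _mean(progressions),
        "mean_regression_length": _mean(regressions),
    }


# Hypothetical trial: four fixations, three saccades, 50 characters, 10 words.
print(global_eye_measures([200, 250, 150, 300], [6, -4, 8], n_chars=50, n_words=10))
```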
Local measures. Analyses of local effects were restricted to word-skipping rates for critical function words and postcritical content words. We extracted for each trial whether the words in critical and postcritical regions were fixated or skipped; spaces between words were included in the following word to account for the forward-biased perceptual span (Ashby et al., 2012).
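The region definition for skipping can be made concrete with a small sketch. It assumes a single line of text and fixation positions already mapped to character indices; the actual analysis presumably operated on screen coordinates, so this is an illustration of the space-assignment rule only. Function names are hypothetical.

```python
import re


def word_regions(line):
    """Character spans (start, end) per word; each span includes the space(s)
    preceding the word, i.e. spaces belong to the following word."""
    return [m.span() for m in re.finditer(r" *\S+", line)]


def skipped(line, fixated_char_indices):
    """True for every word whose region received no fixation."""
    return [
        not any(start <= x < end for x in fixated_char_indices)
        for start, end in word_regions(line)
    ]


line = "Auf der Wiese"
print(word_regions(line))      # [(0, 3), (3, 7), (7, 13)]
print(skipped(line, [1, 9]))   # [False, True, False]
print(skipped(line, [3]))      # [True, False, True]: the space counts toward "der"
```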

Statistical analysis
Trial-level data were analyzed using linear and logistic mixed-effects regression with crossed random effects for participants and items (Baayen et al., 2008). The parsimonious random effect structure for each model was determined using a forward-selection heuristic based on likelihood ratio tests with a liberal alpha level of 0.1 (Bates, Kliegl et al., 2015; Matuschek et al., 2017). Global analyses tested for fixed main effects of genre (poetry vs. prose) on text-level measures. Local analyses tested for fixed main and interaction effects of genre (poetry vs. prose) and text position (initial vs. medial) on calculated S/W ratios in critical and postcritical regions and on the likelihood of skipping (fixated vs. skipped) the words in these regions.
The calculation of p-values and effect size estimates (Cohen's d) used Satterthwaite's method for estimating the degrees of freedom; odds ratios (ORs) are reported as effect size estimates of logistic regression models; the false discovery rate was controlled in multiple post-hoc comparisons (Benjamini & Hochberg, 1995). All analyses were carried out in R (R Core Team, 2019) using the packages car (Fox & Weisberg, 2011), lme4 (Bates, Mächler et al., 2015), lmerTest (Kuznetsova et al., 2017), and EMAtools (Kleiman, 2017).
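For readers unfamiliar with false-discovery-rate control, the following sketch shows the Benjamini-Hochberg step-up adjustment in plain Python. The analyses reported here were run in R, so this is an illustrative re-implementation rather than the code used; the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Each p-value is multiplied by m / rank (rank within the sorted p-values);
    a running minimum from the largest rank downward enforces monotonicity.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical post-hoc p-values.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

A comparison counts as significant at a chosen false discovery rate if its adjusted p-value falls below that rate.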

Global differences between genres
Our initial analyses aimed to characterize genre-induced behavioral adjustments in terms of articulation-related indices and eye-movement measures. Observed values and results of the statistical analyses are summarized in Table 2.

Global articulation differences
We had predicted a general slowdown for oral poetry (vs. prose) reading, with slower speaking and articulation rates and with more silent speech pauses. Analyses of the acoustic data confirmed that participants orally read poetry at a slower rate than prose (Figure 2A). This reduction reflected that they articulated speech units in poetry more slowly (Figure 2B) and that the proportion of silent speech pauses was greater when they read poetry (Figure 2C). To assess the relative contribution of articulatory slowdown and speech pauses we fitted a linear model (adjusted R² = 0.99) that predicted participant-level speaking-rate adjustments (difference: prose-poetry) as a function of participants' genre-dependent adjustments to their articulation speed and to the proportion of speech pauses; results indicated that slower articulation accounted for roughly twice as much variance (ΔR² = 0.61) as the increased proportion of pauses (ΔR² = 0.32). The increased proportion of pause time reflected that silent speech pauses were more frequent in poetry (Figure 2D), whereas their average duration was in fact shorter than in prose (Figure 2E).
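One common way to obtain such ΔR² values is to compare the full two-predictor model with the model lacking the predictor in question. The text does not state how ΔR² was computed, so the sketch below is an assumption; it uses the closed-form R² for two predictors based on pairwise Pearson correlations, and the data are synthetic.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def delta_r2(y, x1, x2):
    """Full-model R^2 and the drop-one contribution of each predictor.

    R^2 = (r_y1^2 + r_y2^2 - 2 r_y1 r_y2 r_12) / (1 - r_12^2); the unique
    contribution of x1 is R^2 minus the R^2 of the model with x2 only.
    """
    r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    r2_full = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return {"r2_full": r2_full, "delta_x1": r2_full - r_y2**2, "delta_x2": r2_full - r_y1**2}


# Synthetic data: y depends on two uncorrelated predictors, more strongly on x1.
x1 = [1, 2, 3, 4]
x2 = [1, -1, -1, 1]
y = [2 * a + b for a, b in zip(x1, x2)]
print(delta_r2(y, x1, x2))
```

When the predictors are correlated, the two contributions no longer sum to the full-model R², which may explain why the reported values (0.61 and 0.32) do not add up exactly to the model fit.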
Fully supporting our hypotheses, these results are consistent with previous cross-linguistic evidence (e.g., Swedish, English, and German) indicating that readers realize speech units in poetry with longer durations than in prose (Bröggelwirth, 2007; Byers, 1979; Kruckenberg & Fant, 1993; Nord et al., 1990; Wagner, 2012) and that silent speech pauses are more frequent (Barney, 1999; Byers, 1979). Participants' genre-dependent modulations of their speaking rates support Byers's (1979) original "formula for poetic intonation" but are inconsistent with the revised formula proposed by Barney (1999), who had argued that speaking rate adjustments are a general feature of text performance rather than a distinctive feature of poetic intonation. Barney's observations and the results of Byers and the present study can be reconciled if one assumes that genre-specific oral reading strategies are differentiated by distinct but possibly overlapping ranges of gradient articulatory adjustments. Many of our phonetic dependent measures (reading rates, frequency, and duration of speech pauses) are considered indices of oral reading proficiency (e.g., Kowal et al., 1975). Here, the observed effects reflect neither interindividual differences nor linguistic variables but rather expose genre-driven top-down adjustments within individuals, indicating that readers (of literature) pursue genre-schematic articulation/intonation strategies.
Note (Table 2). Observed condition means and standard deviations as well as results of multilevel regression analyses: coefficient estimates (B) and standard errors (SE), along with 95% confidence intervals (95% CI), t-values, and effect size estimates (Cohen's d); Satterthwaite's method was used to approximate the degrees of freedom. Models tested for the fixed main effect of genre (poetry vs. prose) and contained random effects for both participants (N = 32) and texts (N = 48). *p < .05. **p < .01. ***p < .001.
Note, though, that the greater frequency of speech pauses in poetry performance might not only reflect strategic top-down control but may be partly due to additional caesuras induced by the verse format (Kien & Kemp, 1994), thus highlighting the verse line as a key unit of poetic text organization (Fabb, 2015). These additional short caesuras might account for our observation that speech pauses were, on average, slightly shorter in poetry than in prose. Although the exact relation of overt and inner speech is still a matter of debate (Perrone-Bertolotti et al., 2014), we assume that the strategic global articulatory adjustments we observed characterize both overt and silent articulation, since their general high-level parameters (e.g., tempo and rhythm) appear to be quite similar in healthy populations (MacKay, 1992).

Global eye-movement differences
Since reading speed adjustments have been reported in several contrastive studies of genre-specific reading, we had predicted a general slowdown for poetry reading compared to prose reading. Based on the results of a previous eye-tracking investigation of poetry and prose comprehension, we had expected that the reading strategy for poetry leads readers to fixate longer and more frequently and to make shorter progressive saccades but longer and more frequent regressive ones.
Results confirmed that participants read poetry more slowly than they read prose, resulting in greater reading rates as readers spent more time per character reading poetry (Figure 2F). This slowdown is consistent with previous results (Hanauer, 1998; Peskin, 2007) and reflected that, as expected, fixations were more frequent (Figure 2G) and slightly longer (Figure 2H) for poetry than for prose. These findings are partly inconsistent with the results of Fischer et al. (2003), who found that observed differences in reading rates and fixation frequency were not reliable; this inconsistency presumably reflects differences in sample size and statistical power to detect the rather subtle effects of genre categorization (Fischer et al., 2003: 96 observations from 12 participants; the present study: more than 1,200 observations from 32 participants). Contrary to our expectations and prior results (Fechino et al., 2020; Fischer et al., 2003), regressive saccades were less likely in poetry than in prose (Figure 2I), and both progressive and regressive saccades covered shorter distances in poetry reading (Figures 2J and 2K). We assume that this discrepancy reflects distinct task demands in these experiments. Performance in the oral reading task of the present study does not benefit from regressions once the end of the text has been reached and the text has been read aloud entirely; the slight decrease of regressive saccades we observed for the poetry strategy thus presumably reflects that readers' careful navigation through the text was less error-prone. By contrast, in the comprehension task used by Fischer and colleagues, performance profits from strategic rereading as participants ensure that they have gathered all the relevant information; in line with this interpretation, the probability of regressions was generally higher in Fischer's study (~28%) than in the present one (~19%).
The need to reread prior information in the comprehension task might be greater for poetry if the implicit task demand is higher (Weiss et al., 2018), that is, if there seems "more to gather and comprehend" in poetry than in prose as participants expect to encounter significance beyond plain sense (for supporting evidence, see Peskin, 2007). In this sense, the regression results of prior studies seem to be more ecologically valid than the present ones.
The observed effects on eye movements are usually seen as indices of individual reading proficiency or varying processing demand due to linguistic, semantic, or task-related variables (Rayner, 1998). These factors were carefully controlled in the present study and can be ruled out as potential explanations for the systematic differences we observed. Since these differences occurred within individuals and showed considerable overlap across readers, we interpret them as behavioral correlates of the genre-specific processing strategies for literary prose and poetry. Corroborating the notion of genre-appropriate language processing (Graesser et al., 1997; McDaniel et al., 1986; Perfetti & Stafura, 2014; Zwaan, 1993), these results refine and extend the relatively sparse evidence for within-reader differentiation of generalized literary text categories and of the appropriate reading strategies. More generally, the observed slowdown for poems is consistent with the idea of "savoring" during poetry reading (Menninghaus & Wallot, 2021).

Articulation strategies affect reading speed
A number of previous studies have established that the reading strategy for poetry is characterized by its reduced reading tempo, which has usually been attributed to additional interpretive processes rather than to the construction of phonetic surface form. To assess the extent to which readers' reduced reading tempo reflects aspects of oral text performance, we fitted a linear model that predicted participant-level reading-rate adjustments (difference: prose - poetry) on the basis of all observed articulatory adjustments. After stepwise elimination of nonsignificant predictors, results of the final model (adjusted R² = 0.40) confirmed that the observed reading-speed adjustments can partly be accounted for by the genre-specific regulation of articulation tempo (ΔR² = 0.30) and the proportion of speech pauses (ΔR² = 0.10). These results indicate that genre-specific reading behavior partly reflects the respective articulation strategies.
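The quantities reported for this model (R², adjusted R², and the ΔR² attributed to a single predictor) can be illustrated with a minimal sketch. It assumes that ΔR² is the loss in R² when one predictor is dropped from the final model; the text does not state whether the reported values were computed this way or sequentially. All data values and function names below are hypothetical, and the stepwise elimination itself (which requires p-values) is not shown.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def r_squared(columns, y):
    """R^2 of an ordinary least-squares fit of y on the predictor columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Adjust R^2 for n observations and p predictors (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def delta_r_squared(columns, y, drop):
    """Assumed definition: loss in R^2 when predictor number `drop` is removed."""
    reduced = [c for i, c in enumerate(columns) if i != drop]
    return r_squared(columns, y) - r_squared(reduced, y)


# Hypothetical participant-level adjustments (prose minus poetry); not the study's data.
tempo = [0.1, 0.4, 0.2, 0.5, 0.3, 0.6]    # articulation-tempo adjustment
pauses = [0.3, 0.1, 0.2, 0.3, 0.1, 0.2]   # pause-proportion adjustment
rate = [2.0 * t + p for t, p in zip(tempo, pauses)]  # reading-rate adjustment

full_r2 = r_squared([tempo, pauses], rate)
print(full_r2, delta_r_squared([tempo, pauses], rate, drop=0))
```

Because the hypothetical outcome is an exact linear combination of the two predictors, the full model fits perfectly here; real participant data would leave residual variance, as the reported adjusted R² of 0.40 indicates.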

Articulation
To assess potential effects of genre and context on speech rhythm in critical and postcritical regions, we calculated the ratio of strong and weak syllables (S/W ratio) in terms of three acoustic correlates of word stress: syllable duration, pitch, and intensity; absolute values and S/W ratios are summarized in Table 3. We had predicted that the poetry-specific strategy leads readers to highlight speech rhythm and to impose additional accent on stressed syllables, which increases S/W ratios, and that the rhythmicity of oral poetry reading increases incrementally if readers attend to prosodic regularities and represent the text's underlying rhythmic pattern, which should particularly increase S/W ratios in text-medial position.
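As an illustration of the duration-based S/W ratio described above, the following minimal sketch divides the duration of the strong syllable by that of the weak syllable within each prosodic foot and averages the result per condition. The durations and function names are hypothetical, and the sketch covers duration only; how the ratios for pitch and intensity were scaled is not specified here.

```python
from statistics import mean


def sw_ratio(strong, weak):
    """Ratio of the strong syllable's value to the weak syllable's value in one foot."""
    if weak <= 0:
        raise ValueError("weak-syllable value must be positive")
    return strong / weak


def mean_sw_ratio(feet):
    """Average S/W ratio over a list of (strong, weak) pairs."""
    return mean(sw_ratio(s, w) for s, w in feet)


# Hypothetical syllable durations in milliseconds; not the study's data.
prose_feet = [(180.0, 120.0), (200.0, 160.0)]
poetry_feet = [(240.0, 120.0), (210.0, 140.0)]

print(mean_sw_ratio(prose_feet))   # 1.375
print(mean_sw_ratio(poetry_feet))  # 1.75
```

A larger ratio corresponds to a sharper prominence contrast between the strong and the weak syllable, which is the sense in which the prediction speaks of increased S/W ratios for poetry.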
These results partly confirm our predictions. Participants selectively lengthened syllables to achieve sharper prominence contrasts and thus more pronounced speech rhythm, which appears to be part of the articulation strategy for poetry. This generally corroborates previous findings but suggests a more nuanced characterization of the genre-specific phonetic realization of prosodic prominence contrasts.
Note (Table 3). Observed condition means and standard deviations of three major acoustic correlates of syllable prominence; ratios reflect the relative prominence of strong and weak syllables within prosodic feet and served as indices of local speech rhythm.
Wagner (2012) reported that both stressed and unstressed syllables are lengthened in poetry versus prose performance and that duration-based S/W ratios are generally greater. Here, we observed that consistent syllable lengthening was restricted to lexical trochees (in the postcritical region), whereas selective lengthening occurred for function words, and that only function-word pairs in the critical region (but not lexical trochees) had greater duration-based S/W ratios in poetry; additionally, only the S/W ratios of lexical trochees approximated Wagner's findings (~3:2), whereas critical function-word pairs had lower ratios (~4:3) across all conditions. These discrepancies suggest that due to their prosodic flexibility, monosyllabic function words play a special role not only in the rhythmic optimization of spontaneous speech (Vogel et al., 2015) but also in that of metered verse (Fabb, 2001). The predicted interaction of genre and context was not borne out by our data; that is, we did not observe increasingly rhythmic articulation in oral poetry reading. Thus, our results do not lend support to the idea that strategic poetry comprehension by default involves constructing representations of systematic prosodic regularities (i.e., the metrical pattern). However, while the occurrence of such an effect would have provided strong evidence for this idea, its absence is inconclusive with respect to both meter recognition and the genre-specific allocation of attention during reading. It might simply be the case that the poetry strategy does not entail increased attention to sound recurrences.
Alternatively, it might be that poetry readers do attend to prosodic patterns but that either abstracting these patterns takes longer than two lines of verse and presupposes the accumulation of more evidence (at least for nonexpert readers like the ones in our experiment), or that readers simply do not apply the abstracted pattern during oral text reading, possibly with the intention to avoid an overly stylized performance. Finally, readers' increased attention might be restricted to perceptually salient sound recurrences at the subsyllabic level, like rhyme or alliteration, rather than focusing on the ongoing prosodic patterns that could just as well occur unnoticed over stretches of written prose or conversational speech (Schlüter, 2005). Kintsch had warned that the versification-level hypothesis is based on data from counting-out rhymes, which are a genre grounded in social interaction and with extremely structured forms and basically interchangeable lexicosemantic content (Rubin et al., 1997). Thus, it might also be the case that detailed representations of a text's versification level presuppose either repeated exposure, dense and interrelated phonological patterning, reduced informativity of higher levels of text representation, or a combination thereof. In any case, less indirect methods might be better suited to examine whether readers' attention to sound structure is indeed subject to genre-induced top-down modulations.
Figure 3. Participants (N = 32) read short texts that we categorized and formatted as either poetry or prose. Each text consisted of two complex sentences; one sentence began with a region of interest that comprised two monosyllabic function words and a disyllabic content word (e.g., "at the counter"). Regions of interest occurred either in text-initial position (i.e., without prior context) or in text-medial position (i.e., preceded by a context sentence), which allowed us to compare the influence of contextual information (i.e., absence vs. presence of a preceding sentence) across genres. (a) Syllable duration was the most reliable acoustic correlate of word stress and accent; points represent mean syllable durations per condition; error bars indicate the standard error of the mean; lines reflect prosodic prominence gradients within prosodic feet. (b) Word-skipping differences between genres (in %) for critical function words and postcritical content words; points represent mean differences (poetry - prose); error bars indicate the standard error of the mean.

Eye movements
We analyzed word-skipping rates of short critical function words and postcritical content words; observed values are summarized in Table 4. We had predicted that words in context (i.e., text-medial ones) are skipped more frequently than text-initial words without prior context, reflecting that mounting contextual constraint allows for less careful progression through a text. We had further predicted that this increase is particularly strong for prose, which is characterized by semantic discourse coherence to a larger degree than poetry, whose semantic coherence may be demoted in favor of formal coherence achieved by cohesive parallelism (e.g., meter and rhyme).
Our results confirmed these hypotheses. Critical function words were skipped more frequently in text-medial position than in text-initial position (1st FW: z = 9.53, p < .001, OR = 9.07; 2nd FW: z = 3.00, p = .003, OR = 1.93); postcritical content words showed no main effect of position (z = -1.08, p = .279); there were no main effects of genre (all ps > .3). The observed word-skipping increase from text-initial to text-medial position most likely reflects that mounting discourse constraint warrants less careful navigation through the text. However, since line-initial words were particularly prone to text-medial word skipping, we assume that these position effects additionally reflect the return sweep between lines, whose landing site is usually six or more characters from the left line margin in left-to-right writing systems (Hofmeister et al., 1999).
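To make the reported odds ratios (OR) easier to read, the following minimal sketch shows how an odds ratio relates to skipping probabilities and to a coefficient on the log-odds scale. The function names and the example probabilities are hypothetical. The reported ORs stem from the fitted models, so ratios computed from raw condition means will generally not reproduce them exactly.

```python
import math


def odds(p):
    """Convert a probability strictly between 0 and 1 into odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(p_target, p_reference):
    """Odds of skipping in the target condition relative to the reference condition."""
    return odds(p_target) / odds(p_reference)


def or_from_log_odds(beta):
    """Convert a coefficient on the log-odds (logit) scale into an odds ratio."""
    return math.exp(beta)


# Hypothetical skipping probabilities; not the study's data.
p_medial, p_initial = 0.5, 0.2
print(odds_ratio(p_medial, p_initial))   # about 4: the odds of skipping are four times higher
print(or_from_log_odds(math.log(9.07)))  # a log-odds coefficient of ln(9.07) corresponds to OR = 9.07
```

An OR above 1 indicates more skipping in the target condition, and an OR below 1 indicates less.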
We further observed the predicted interaction effects of genre and text position at the 1st FW (z = -4.21, p < .001) and the postcritical content word (z = -3.75, p < .001) but not at the 2nd FW (z = -0.19, p = .851) (Figure 3B). As expected, these interactions reflected that the word-skipping increase was more pronounced in prose than in poetry. In text-initial position, neither the 1st FW nor the postcritical content word showed genre-dependent differences (both ps > .21); text-medially, however, both words were skipped less frequently in poetry than in prose (1st FW: z = -8.00, p < .001, OR = 0.18; content word: z = -3.95, p < .001, OR = 0.02). These distinct dynamics probably also partly reflect that the landing site of return sweeps shifts rightward with increasing line length (Hofmeister et al., 1999) and thus results in more line-initial word skipping in prose layout (~60 characters per line) than in poetry format (~30 characters per line). However, the return sweep is an unlikely explanation for the genre effect on skipping rates of line-internal content words in postcritical regions. Skipping rates for these words increased only during prose reading and actually decreased during poetry reading. Thus, it appears that the observed asymmetries between genres reflect both a bottom-up effect that affects line-initial words and that is enhanced by the conventional differences in line lengths, and a top-down effect of genre-appropriate reading strategies that affects line-initial and line-internal words alike. Crucially, these asymmetries reveal distinct processing dynamics of genre-specific reading rather than indexing general processing defaults.
Note (Table 4). Condition means and standard deviations of word-skipping rates (in %) for critical function words (monosyllabic) and postcritical content words (disyllabic). These regions of interest occurred either at the beginning or in the middle of short texts (position: text-initial vs. text-medial) that were categorized and formatted as either literary prose or poetry (genre: prose vs. poetry).

Discussion
The present study examined genre-schematic processing strategies for literary prose and poetry. We recorded speech and eye movements while participants orally read short texts categorized and presented as literary prose or poetry; we hypothesized that readers adjust their reading behavior and their oral text performance to the literary genre. Analyses focused on global differences between genres and, in contrast to prior investigations, on the dynamic interaction of genre and contextual information. Our results confirm that readers differentially adjust their reading behavior and their articulation style to literary prose and to poetry and also provide initial evidence for genre-specific processing dynamics. Readers orally read poetry more slowly than prose, which mainly reflects that they articulate more slowly and lengthen speech units and that silent speech pauses make up a larger portion of an oral reading. Oral poetry performance is also characterized by more pronounced speech rhythm than prose performance, but we found no evidence that readers recognize the underlying metrical pattern early and then use it to make their performance increasingly rhythmical. Taken together, these results corroborate and refine previous findings.
We observed a similar slowdown in reading speed for poetry versus prose. Prior results had established that this adjustment correlates with modulations of comprehension proper (Peskin, 2007). Here, we observed that 40% of the variance in genre-appropriate reading-speed adjustments can be accounted for by articulatory modulations, but these results probably overestimate the contribution of inner speech to eye-movement control during silent reading for comprehension and leisure. Nonetheless, they clearly demonstrate that reading-speed adjustments reflect not only genre-specific interpretive operations but also distinct constraints of genre-specific articulation strategies on information flow in the processing system. Considering that articulation is assumed to play a crucial role in monitoring speech production (Levelt, 1989), slowing inner speech down might allow readers to better monitor the (re)production of a text during silent reading but also to regulate the depth of the comprehension process, because "when the speaker selects a rate, he or she is essentially controlling the rate of construction of the representations" (Dell, 1986, p. 289). Taken together, our eye-movement results extend previous findings and identify eye-movement correlates of poetry- and prose-specific reading behavior. These adjustments suggest that the reading strategy for poetry is less "risky" than the strategy for prose (Fischer et al., 2003; Vančová, 2014) in the sense that it features longer fixations, shorter progressive eye movements, and less word skipping (McGowan & Reichle, 2018; Rayner et al., 2006). However, the observed differences between genres also seem to reflect bottom-up effects of the graphical format that distinguishes poetry from prose. Crucially, we found that the riskiness of prose reading increases with the presence of a context, which we interpreted as increasing reliance on the propositional discourse model. This incremental adaptation differentiates the comprehension strategy for (literary) prose from the strategy for poetry and constitutes initial evidence for genre-specific processing dynamics.

Implications for models of literary comprehension
The genre-dependent behavioral adjustments we observed are consistent with approaches to discourse comprehension assuming that readers make strategic use of text-type and genre schemata to optimize processing (e.g., Meutsch, 1986; Van Dijk & Kintsch, 1983; Verdaasdonk, 1982; Viehoff, 1995). In particular, they corroborate the notion of the LCCS proposed in Zwaan's (1996) model of literary comprehension. The LCCS allows readers to deal with the "deliberate inconsiderateness" encountered in many literary texts and leads them to adjust their reading behavior and the allocation of their attentional resources, resulting in slow reading and improved verbatim text memory. Gradient effects of the LCCS account for the differentiation of literary comprehension strategies that we observed. However, this account is based on an ill-defined notion of gradient "literariness," which basically reflects formal and thematic text features. It neglects that these features may differ systematically between literary genres and that readers categorize texts. Thus, we rather assume that readers develop not just one LCCS but rather distinct literary control systems, or strategies, that are activated via genre categorization. These strategies reflect the respective exigencies and affordances of different genres and constrain the construction of potentially all levels of mental text representation, including the construction of phonetic surface form. Some adjustments remain constant whereas others lead to diverging processing dynamics during text comprehension.
Modulations of literary text processing in the absence of linguistic differences are hard to accommodate within the neurocognitive poetics model (Jacobs, 2015a, 2015b; Nicklas & Jacobs, 2017), a recent theoretical proposal that aims to relate stylistic theory to neuronal, affective-cognitive, and behavioral effects of literary comprehension. Arguably the most sophisticated formulation of stylistic foregrounding theory's reception-related aspects, this dual-route model posits a fundamental distinction between foreground and background elements of literary texts that determine the processing trajectory (immersive vs. esthetic) and modulate reading behavior. The reading modes observed in the present study appear to map onto the model's processing routes (prose = immersive; poetry = esthetic), but they did not depend on the linguistic features of the texts and seemed to be driven by both top-down text categorization and bottom-up layout differences. These and related findings could be accommodated within the neurocognitive poetics model by assuming that the background/foreground distinction is register- and genre-specific, as argued by early foregrounding theorists (Havránek, 1964; Mukařovský, 1964), and that text categorization prior to reading selects the appropriate background/foreground profile, that is, expectations of prototypical and permissible formal and thematic text features, and thus codetermines the appropriate processing defaults.

Limitations
To what extent do our results generalize to other texts and readers? We presented participants with short constructed texts (designed to be equally acceptable as literary prose or poetry) rather than with authentic literary texts. This allowed us to present texts in both genre conditions to rule out that observed differences were due to confounding text variables. But even with these genre hybrids we obtained clear evidence for conventionalized processing routines that are activated by genre categorization independent of whether or not the texts are prototypical exemplars of their genre. In fact, we assume that the observed behavioral correlates of genre categorization are even more pronounced for authentic texts, because if a text exhibits many distinctive or characteristic features of a literary genre or subgenre, it is likely to result in discourse representations that resonate with readers' genre schema (Hanauer, 1995, 1996), reinforcing the initial genre categorization that triggers the appropriate processing strategies. For instance, the average articulation rate we observed for poetry (5.1 syllables/second) matches the rate that Byers (1979, p. 369) reported for "light verse" rather than the slower rate she reported for traditional poetry (4.8 syllables/second).
A similar logic applies to the generalizability across readers and the role of expertise. The observed effects depend on the conceptual differentiation of literary genres in the minds of readers and presuppose a certain degree of experience with literary texts. Here, we chose to sample from a student population so that we could presuppose sufficient literary experience to ensure that participants had acquired distinct enough conceptions of literary prose and poetry. We assume that their mostly basic level of experience reflects an intermediate step toward the conceptual differentiation of literary genres that comes with greater expertise. Conceptual differentiation should, in turn, affect genre categorization (Hanauer, 1995) and result in greater behavioral differentiation of genre-appropriate reading (Peskin, 2010) and oral text performance (Funkhouser & O'Connell, 2013; Kowal et al., 1975).
Finally, we had more female than male participants in our experiment. To assess whether there were systematic differences between men and women in terms of the observed genre effects, we compared their average modulations of reading and speaking rates. Results revealed no reliable differences (Welch's two-sample t-tests; reading rate: t(14.89) < 1, p = .639; speaking rate: t(10.53) < 1, p = .552), indicating that genre categorization had indistinguishable effects in men and women.
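The fractional degrees of freedom in these tests follow from the Welch-Satterthwaite approximation, which does not assume equal variances in the two groups. A minimal sketch of the test statistic and its degrees of freedom is given below; the function name and the data are hypothetical, and the p-value, which requires the cumulative t distribution from a statistics library, is omitted.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va = variance(a) / na  # squared standard error of the mean, group a
    vb = variance(b) / nb  # squared standard error of the mean, group b
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical rate modulations for two groups; not the study's data.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]

t_value, df_value = welch_t(group_a, group_b)
print(round(t_value, 3), round(df_value, 2))  # -1.897 5.88
```

With unequal group sizes and variances, the resulting degrees of freedom fall below the pooled value of n1 + n2 - 2, which is why non-integer values such as 14.89 and 10.53 appear in the reported tests.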

Conclusion
Replicating, extending, and refining a number of previous results, our findings lend further support to the general idea that strategic text reading is genre-specific and demonstrate that readers differentiate literary processing strategies. Although these relationships are imperfectly understood at present, the genre expectations reflected in distinctive behavioral adjustments must be grounded in, and tailored to, systematic formal and thematic differences between previously encountered texts. While we found no support for the idea that poetry comprehension by default involves the representation of systematic sound patterns at the versification level, the present study provided initial evidence for genre-specific processing dynamics, which deserve closer examination in future research.