Reading comprehension and metacognition: The importance of inferential skills

Soto et al., Cogent Education (2019), 6: 1565067
https://doi.org/10.1080/2331186X.2019.1565067
© 2019 The Author(s). This open access article is distributed under a Creative Commons Attribution (CC-BY) 4.0 license.
Received: 21 December 2017. Accepted: 21 December 2018. First Published: 17 January 2019.
*Corresponding author: Christian Soto, Department of Spanish, University of Concepción, Victor Lamas 1290, Concepción 4030000, Chile. E-mail: christiansoto@udec.cl
Reviewing editor: Richard Kruk, Psychology, University of Manitoba, Canada
Additional information is available at the end of the article.

Abstract: We explored relations between reading comprehension performance and self-reported components of metacognition in middle-school children. In Study 1, students' self-reported metacognitive strategies in planning and evaluation accounted for significant variance in reading comprehension performance on questions involving inferences. In Study 2, middle school students read a science text and then made predictions about how they would perform on a comprehension test. Students' metacomprehension accuracy was related to their performance at different levels of understanding. Students' text-based question performance accounted for significant variance in metacomprehension accuracy for text-based questions, and inference-based question performance accounted for significant variance in metacomprehension accuracy for inference-based questions. Results from the two studies suggest that metacognitive and metacomprehension knowledge is aligned with the level of information given in text, and is related to deeper understanding of texts, particularly for inferential information. We discuss the implications of these findings.

ABOUT THE AUTHOR
Christian Soto
Our research team is currently investigating metacomprehension in reading and how it relates to other metrics of metacognitive monitoring, most notably learners' ability to accurately report what they know or do not know about a topic. We are also examining whether intelligent tutoring systems based on artificial intelligence can effectively and efficiently train reading comprehension skills in children and adolescents. We believe these lines of inquiry are essential to inform not only the educational practice of teachers in classrooms but also educational policy, such as funding decisions for school systems and educational research.

PUBLIC INTEREST STATEMENT
Understanding how adolescents learn is an important endeavor for learning scientists. We explored relations between reading comprehension performance and self-reported components of metacognition in middle-school children. In Study 1, students' self-reported metacognitive strategies in planning and evaluation were significantly and positively related to reading comprehension performance on questions involving inferences. In Study 2, middle school students read a science text and then made predictions about how they would perform on a reading comprehension test. Students' reading comprehension accuracy was related to their performance at different levels of the text. Students' text-based question performance, which relies on superficial understanding of texts, was positively related to reading comprehension accuracy for text-based questions. Inference-based question performance, which requires a deeper understanding of the text because it necessitates linking what one reads with prior knowledge of the topic, was significantly and positively related to reading comprehension accuracy for inference-based questions. This information will help classroom teachers tailor reading comprehension interventions to student needs and encourage deeper understanding of texts.

Introduction
Metacognition is a complex construct that has a rich history in the research literature. Flavell (1979) defined "metacognition" as taking one's own cognition as the object of thought. Since Flavell's seminal work on metacognition, researchers have strived to provide more nuanced definitions of this complex concept. Nelson and Narens (1990, 1994), for instance, further conceptualized metacognition as comprising two fundamental processes, monitoring and control. Subsequent research has more clearly specified metacognition as inclusive of two main dimensions: knowledge of cognition and regulation of cognition, which subsume more fine-grained subprocesses (Schraw & Dennison, 1994). Knowledge of cognition involves declarative knowledge (repertoire of strategies to employ during learning), procedural knowledge (steps necessary to employ strategies), and conditional knowledge (the knowledge of where, when, and why to employ more versus less successful strategies, given task demands). Regulation of cognition, on the other hand, incorporates those processes necessary to monitor and control learning: planning, information management strategies, debugging strategies, evaluation of learning, and comprehension monitoring (Schraw & Dennison, 1994). We focus on one of the sub-processes of regulation in the present study, comprehension (metacognitive) monitoring.
Comprehension monitoring involves the skill to accurately, efficiently, and effectively monitor the learning task and control subsequent actions to successfully achieve learning objectives (Schraw & Dennison, 1994). Monitoring and control are hypothesized to be cyclical, reciprocal processes in this context (Nelson & Narens, 1990, 1994). Over time, various indices and metrics of metacognitive monitoring have been proposed. For the purposes of this study, metacognitive monitoring accuracy has been defined as a feeling-of-knowing (FOK) judgment in which learners make judgments about future performance on a task (e.g., a test or exam; known as prospective judgments), although it is also possible to assess judgments of past performance (retrospective judgments) (Schraw, 2009). The choice of whether to measure global (holistic, overall) or local (item-by-item) judgments also has implications for the interpretation of metacognitive monitoring (Schraw, 2009). The extent to which individuals' judgments of their performance are congruent with actual performance is known as monitoring accuracy (henceforth in our research, metacomprehension accuracy), whereas the mismatch between judgments and performance is known as metacomprehension bias or error (i.e., overconfidence and underconfidence; Boekaerts & Rozendaal, 2010; Efklides, 2008; Winne & Nesbit, 2009). Metacomprehension accuracy, as a metric of metacognitive monitoring, has been studied using absolute versus relative judgments and with ease-of-learning judgments and judgments of learning (see Schraw, 2009, for a detailed review of these metacognitive monitoring indices).
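As a rough illustration of the distinction between accuracy and bias, the sketch below compares a single global prospective judgment with actual performance. It assumes that both are expressed as proportions of the maximum score, takes the signed difference as bias, and takes its absolute value as absolute error. The function names and example values are hypothetical, and this is only one simple operationalization, not necessarily the index computed in the present studies.

```python
def signed_bias(predicted: float, actual: float) -> float:
    """Signed mismatch: positive means overconfidence, negative underconfidence."""
    return predicted - actual


def absolute_error(predicted: float, actual: float) -> float:
    """Size of the mismatch regardless of direction (0 = perfectly accurate)."""
    return abs(predicted - actual)


def describe_bias(predicted: float, actual: float) -> str:
    """Label the direction of the mismatch for one global judgment."""
    bias = signed_bias(predicted, actual)
    if bias > 0:
        return "overconfident"
    if bias < 0:
        return "underconfident"
    return "accurate"


# Hypothetical student: predicts 75% of the maximum score, earns 50%.
print(signed_bias(0.75, 0.5))     # 0.25
print(absolute_error(0.75, 0.5))  # 0.25
print(describe_bias(0.75, 0.5))   # overconfident
```

Under this sketch, two students with the same absolute error can differ in bias, which is why the direction and the size of the mismatch are reported separately.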
Metacognitive monitoring has been employed in many domains. However, we focused on reading for the present studies. Reading comprehension is the set of skills that readers invoke to generate a mental representation of the text that is sufficiently coherent and rich to adequately understand the material being read. This presupposes the implementation of diverse cognitive tasks, ranging from understanding words and the relations between sentences and paragraphs to capturing the sense of the text as a whole, among others. As theory has shown, students are able to process the content of a text at a minimum of two levels of depth, base (henceforth, text-based) and inferential (Kintsch, 1998), depending on how the reader relates the ideas of the text, eventually incorporating previous knowledge and re-elaborating the ideas proposed in the written material. In the case of reading, metacomprehension involves metacognitive processes that support comprehension, allowing readers to evaluate their understanding in progress and tentatively implement the adjustments needed to improve the coherence of the mental representations generated during reading. Thus, in this study, we explored how global reading metacomprehension absolute accuracy, self-reported use of reading strategies, and reading performance are associated.

Efficient consumers of information possess an awareness of what they know as well as what they do not know and understand the specific actions they need to take to maximize their learning efficiency. This awareness is referred to as metacognitive knowledge and has been identified as an important part of successful learning (Baker & Beall, 2009; Palincsar & Brown, 1987; Schraw & Dennison, 1994).
The ability to monitor the learning process is a key component of metacognition that allows people to recognize when their understanding of incoming information is not meeting their standards (Boekaerts & Corno, 2005; Flavell, 1979; Nelson & Narens, 1990, 1994; Zimmerman, 2000). Once people recognize a deficiency in their understanding, they are better able to regulate their behavior and improve their understanding (Gutierrez & Schraw, 2015; Hacker, Bol, & Bahbahani, 2008; Schraw, 1998). Successful readers, for example, must determine when they have acquired sufficient knowledge from a text. If they recognize that they have not reached an acceptable level of understanding, they engage in additional processing. Both of these processes (i.e., monitoring and control) are essential in reading metacomprehension.
Effective metacognitive strategies involve more than simply making it to the end of a text and assuming an acceptable level of understanding of the text. When readers have an accurate understanding of their knowledge about a text, their metacomprehension accuracy is said to be high. In cases of poor metacomprehension accuracy, however, readers are less able to appropriately regulate their efforts. For example, consider a conscientious student studying for a biology midterm. Although she may spend ample time studying, if her metacomprehension is low, she may not know when to stop covering a certain topic because she cannot accurately gauge her level of understanding. Similarly, she would be less able to choose particular topics on which to focus, again because she is unsure just how well she comprehends each topic. Alternatively, she may be overconfident, which would likewise result in poor metacomprehension. Indeed, poor metacomprehension has been linked to lower achievement (Bol & Hacker, 2001; Hacker et al., 2008; Zabrucky, Moore, Agler, & Cummings, 2015) and less well-regulated study (Thiede, 1999).
Unfortunately, students' metacomprehension accuracy is generally not proficient (Glenberg & Epstein, 1985; Glenberg, Sanocki, Epstein, & Morris, 1987; Griffin, Wiley, & Thiede, 2008). Primary goals in the field have thus far been to better understand metacomprehension and to develop techniques and interventions to enhance metacomprehension (e.g., Dunlosky & Rawson, 2005; Griffin et al., 2008; Gutierrez & Schraw, 2015). Despite many advances in the understanding of how metacognition influences reading comprehension, several questions remain. In the current project, we focus on two emerging topics regarding readers' metacomprehension. First, we examine which components of metacognitive knowledge are most predictive of reading comprehension performance for questions that require a text-based level understanding and for those that require a deeper level of understanding (i.e., requiring inferential reasoning). Second, we examine how reading comprehension skills relate to metacomprehension accuracy for both text-based and inferential understanding. In combination, these topics can elucidate the relations between metacognitive skills and reading comprehension, such as the ability to evaluate one's understanding and the resulting level of comprehension.
Despite empirical evidence supporting the benefits of metacognitive knowledge, its exact relation with reading comprehension is not entirely clear. For example, McNamara and Magliano (2009) claimed that the specific nature of the relation between metacognition and reading strategy use is unclear based on findings from a study using verbal protocols to relate metacognition and self-explanations. Peronard (2005) found that an intervention aimed at increasing metacomprehension knowledge had little impact on reading comprehension. Similarly, Puente Jiménez and Alvarado (2009) found weak relations between a reading comprehension test (Batería de Evaluación de los Procesos Lectores en Secundaria; PROLEC-SE) and a reading awareness test (Escala de Conciencia Lectora; ESCOLA). Thus, students' explicit metacomprehension knowledge does not seem to be a consistently strong predictor of their reading comprehension. In the following sections, we will describe some reasons for these mixed findings, including concerns about the types of reading comprehension tests used in previous studies.
Before delving into the literature about how metacognition influences reading comprehension, it is important to disambiguate our conceptualization of comprehension monitoring. Researchers have described monitoring as including different processes. According to Hacker and colleagues, monitoring has often been discussed as including both the processes of evaluation and regulation (Hacker et al., 1994; Keener & Hacker, 2012). From this perspective, readers' monitoring would be said to be successful only when they, for example, both noticed that they did not understand some part of the text and deployed cognitive effort to remedy their understanding (e.g., by rereading the section). An alternative view of monitoring is that there is a clear distinction between the processes of monitoring (e.g., evaluating comprehension) and regulation (e.g., doing something to fix comprehension deficits) (Schraw & Dennison, 1994; Schraw & Moshman, 1995). We adopt this second view, as we believe it to be more useful when studying metacomprehension and its influence on reading comprehension.

Metacomprehension accuracy
As stated earlier, students' metacomprehension accuracy tends to be quite poor. Studies that find these discouraging results often ask students to read a text, make a prediction (or predictions) about how they will perform on a comprehension test, and then complete a comprehension test (e.g., Bol & Hacker, 2001; Bol, Hacker, O'Shea, & Allen, 2005; Dunlosky & Lipko, 2007; Hacker et al., 2008). Although many of the predictions are about students' overall comprehension level (e.g., asking how well students think they will do on an upcoming test), even when students are asked to predict performance regarding specific bits of information from a text (e.g., asking how well students think they will be able to recall a definition), accuracy is still quite low (Dunlosky, Rawson, & McDonald, 2002). The relation between students' predictions and their performance is considered a measure of monitoring; when students successfully evaluate their level of comprehension, they should be quite accurate in their predictions.
Thiede and colleagues (Thiede, Griffin, Wiley, & Redford, 2009) conducted an extensive analysis of the relations between evaluation and reading comprehension. They found that the average correlation, across more than 40 studies, was about .27. Thiede et al. explained this low correlation by describing several concerns that could distort the relation between students' predictions of performance and their reading comprehension scores, or that could lead to low levels of precision in evaluation. The primary concern relates to a lack of evidence supporting the validity of scores from the reading comprehension tests used in previous studies. Weaver (1990) demonstrated that metacomprehension accuracy improves when several elements of the text are included in the comprehension assessment. Additionally, the assessment must include information from all sections of the text (Dunlosky, Griffin, Thiede, & Wiley, 2005). Therefore, the authors suggest that measures of reading comprehension must cover various components of the text as well as relations among these components. When a measure of comprehension focuses too heavily on certain material, the correspondence between predictions and performance will, on average, be lowered. Dunlosky and Lipko (2007) also noted that metacomprehension accuracy can be influenced by the length of a text, such that for longer texts readers will have more trouble making accurate predictions of their comprehension, and hence, their metacomprehension accuracy will suffer (Thiede et al., 2009). It is important to note, however, that they employed a relative accuracy index whereas we use absolute accuracy.
The other key explanation of poor metacomprehension accuracy is that readers may be using different cues to generate their judgments of understanding. Dunlosky et al. (2002) proposed the "levels of disruption" theory, which assumes that when readers make judgments about their understanding of a text, they base this prediction on cues derived from disruptions that threaten the flow of reading. The inference assumption proposes that metacomprehension judgments are derived from inferences that people make based on instances of disruption. Many factors can cause disruptions, such as reading unfamiliar words, ambiguous pronouns, and so on. According to Dunlosky and his colleagues, as more disruptions occur, people are increasingly likely to infer that the text has been misunderstood. The accuracy assumption proposes that metacomprehension judgments are a function of the degree to which key underlying judgments are predictive of comprehension test performance. As an example, readers might infer that the length of a text will influence their test performance and therefore will use this as a cue when making metacomprehension judgments. If length is correlated with the difficulty of a test, then the accuracy of judgments will be relatively high, but this will not always be the case. Finally, the representation assumption proposes that disruptions can occur at different levels of representation of the text (i.e., text-based and situation model; Kintsch, 1998). When disruptions occur primarily at one specific level, readers' metacomprehension is based on that level of the text. We propose that the representation assumption is especially important in metacomprehension and focus on this assumption throughout this paper.

Levels of representation and metacomprehension
According to the Construction-Integration Model (Kintsch, 1988, 1998), text comprehension involves multiple levels of information processing. The linguistic level involves recognizing words and understanding the syntactic links between them (Riffo, Reyes, Novoa, Véliz, & Castro, 2014). A second, text-based level involves the generation of meaning through the integration of propositions. A third level, known as the situation model, integrates textual information with additional information from the reader's prior knowledge (Kintsch & Rawson, 2007). Generating inferences is perhaps the most important mental operation in the construction of the situation model. Inferential processes result in readers generating information that is not directly stated in the text (i.e., by invoking prior knowledge). Importantly, this new information yields deeper understanding because readers add these semantic elements to their mental representation of text information. Thus, inferences play an important role in the quality of mental representations (Vieiro & Gómez, 2004).
Many researchers have found that higher inferential abilities are linked to reading comprehension. For example, research has shown that children who experience comprehension difficulties often lack skills needed to generate inferences (Cain, Oakhill, Barnes, & Bryant, 2001). Similarly, high inferential abilities, such as those that require individuals to use prior knowledge and within-text information to reach conclusions, are associated with good reading comprehension (Cain & Oakhill, 1999). Other lines of work also indicate that teaching inferential skills can improve reading comprehension. McNamara, for example, developed Self-Explanation Reading Training (SERT), the goal of which is to teach students reading strategies that involve self-explanation and encourage the generation of inferences during reading (McNamara, 2004, 2017). An evaluation of SERT showed that students who were low in prior knowledge benefitted from SERT (McNamara, 2004). This same study also found a positive relation between the generation of inferences that establish connections between text elements (i.e., bridging inferences) and performance on reading comprehension questions.
Prior research, thus, supports the idea that inferencing is an important factor influencing the quality of readers' mental representations. Inferential skills should, therefore, influence readers' metacomprehension. Specifically, we argue that metacomprehension judgments, which require careful evaluation, should be influenced by readers' inferential skills. Because the representation assumption posits that readers will base metacognitive judgments on disruptions at particular levels of representation, readers who have higher or lower inferential skills will base their judgments on different information. For example, when readers with higher inferential skill read a text, they will be more likely to encounter disruptions at the situation model level of representation. In other words, because they are frequently generating inferences and tying information to their prior knowledge, they will be more likely to notice discrepancies at this deeper level of understanding. Conversely, readers with lower inferential skill will generate fewer inferences and be less likely to notice disruptions at the situation model level, instead noticing more difficulties in building their text-based level knowledge.
Other researchers have argued that the quality of readers' mental representations relates to metacomprehension accuracy. For example, Thiede and colleagues (Thiede et al., 2009) noted that if readers are generating valid inferences, metacomprehension accuracy increases as they use cues based on the situation model rather than a surface level or text-based level (Dunlosky et al., 2002). This idea has been supported by some pioneering studies that have improved metacomprehension accuracy using techniques, during or after reading, that are intended to support more complete, readily accessible mental representations. These techniques have included summarization or keywords after a delay, self-explanation while reading, and the construction of concept maps. To relate this to Kintsch's model, if readers write a delayed summary after completing a text, they will likely have access to a more complete situation model of the text. Similarly, when readers generate a list of keywords for a text, their metacomprehension judgments more closely align with comprehension test questions, allowing students to make more valid judgments. For example, in a study by Anderson and Thiede (2003), metacomprehension accuracy was dramatically higher for a group that made a delayed summary (r = .60) than for the control group (r = .26); this result was replicated for both longer and shorter texts. Again, these results suggest that metacomprehension is influenced by readers' mental representations and that encouraging readers to engage in inferential processes can improve metacomprehension accuracy.

Purpose of the studies
Different approaches can be used to investigate how readers' metacognitive knowledge links to their level of comprehension at multiple levels (i.e., linguistic, text-based, and situation levels). To this end, we have divided our research into two studies.

Study 1
In Study 1, students completed a self-report survey about their metacognitive knowledge, specifically as it relates to reading. Scores on three subcomponents of metacognition were measured: planning (i.e., engaging in processes aimed at preparing for a reading task), monitoring (i.e., recognizing problems with comprehension and making necessary adjustments during reading), and evaluation (i.e., assessing and recognizing successes and failures during reading). The main research questions that guided Study 1 were:
(1) To what extent are self-reported planning, monitoring, and evaluation related to performance on text-based and inferential comprehension questions among a cohort of Chilean middle school students?
(2) How much variance do planning, monitoring, and evaluation account for in performance regarding text-based and inferential comprehension questions?
H1: Based on our review of the literature, we predicted that metacognitive knowledge would be more highly associated with students' performance on comprehension questions that require inferencing (versus text-based questions).
H2: In line with our previous hypothesis, we expected that planning, monitoring, and evaluation would be better predictors of performance on inferential comprehension questions than on text-based questions.

Study 2
In Study 2, students made predictions about their upcoming performance on a comprehension test for a text they had just read. Metacomprehension absolute accuracy was then calculated for performance on text-based and inference-level questions. The following research questions guided Study 2:
(3) To what extent does reading comprehension performance predict metacomprehension absolute accuracy?
(4) Does the relation between metacomprehension absolute accuracy and reading comprehension performance vary by question type (i.e., inferential, text-based)?
H3: We expected higher performance on reading comprehension to significantly and positively predict metacomprehension accuracy.
H4: We expected reading comprehension performance to be more highly positively correlated for inferential-level than for text-based-level judgments. Moreover, we expected reading comprehension and metacomprehension absolute accuracy to be greater for text-based-level judgments than for inferential-level judgments.

Participants
A sample of 190 7th and 8th grade students was recruited from four different Chilean schools (two public schools and two private schools). Their ages ranged from 11 to 13 years (M = 12.25, SD = 1.17). The schools had similar performance on the National System of Evaluation of the Quality of Education, which was in turn similar to the national average. Entire randomly chosen classes participated in this study. There were 93 females and 97 males.
The 190 students were distributed as follows: 99 students from the public education system and 91 from the private education system. The students came from two grade levels at each school in both educational systems (private and public). For this reason, the final sample was distributed as follows: Public School 1: 23 7th grade students and 25 8th grade students; Public School 2: 20 7th grade students and 31 8th grade students; Private School 1: 23 7th grade students and 28 8th grade students; and Private School 2: 21 7th grade students and 19 8th grade students.

The text did not contain images and was 433 words long. The comprehension test consisted of 20 open-ended questions, 10 of which were basic text-based questions and 10 of which were inferential questions. For the text-based questions, the information needed to answer is explicitly available in the text, whereas the answers to the inferential questions must be inferred. For both types of questions, we used a rubric per question, with scores from 0 to 2, depending on how complete or partial the information included in the answer was. More complete, coherent, and cogent responses, for instance, received complete credit, whereas those with incomplete or weak answers were given 0. Responses with sound arguments but with gaps or incomplete information were given partial credit (i.e., a score of 1). A similar evaluative tool was used by McNamara, Kintsch, Songer, and Kintsch (1996), with a text on the topic of heart disease, to assess the impact of prior knowledge and cohesion on reading comprehension. The underlying model (Kintsch, 1998) is the most relevant model of comprehension in the field of discourse processing, and assumes different levels of mental representation of the text (linguistic, text-based, and situation model).
Sample items for text-based and inferential questions can be found in the Appendix. Students' responses were assigned a score from 0 to 2 points for each question, and hence, the range of possible scores was 0-40. Each student was thus given a total score, as well as separate scores for text-based and inferential performance. The assumption behind the text-based/inferential distinction is that students who can construct a coherent mental model during or after reading will be better able to answer inferential questions. Students who are only able to answer explicit (text-based) questions are limited in their understanding. Internal consistency reliability coefficients (Cronbach's α) for this measure were adequate: text-based = .74; inferential = .81.
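The scoring scheme and the reliability coefficient described above can be sketched as follows. This is a minimal illustration rather than the authors' scoring procedure: the function names are hypothetical, the positions of the text-based items within the test are an assumption (the source does not state the item order), and Cronbach's α is computed with its standard formula using sample variances.

```python
from statistics import variance

N_ITEMS = 20          # 10 text-based + 10 inferential questions
VALID_SCORES = (0, 1, 2)  # rubric: no credit, partial credit, full credit


def score_student(item_scores, text_based_items):
    """Return (total, text_based, inferential) for one student's rubric scores.

    text_based_items holds the indices of the text-based questions; which
    positions these occupy is an assumption made for illustration.
    """
    if len(item_scores) != N_ITEMS:
        raise ValueError("expected 20 item scores")
    if any(score not in VALID_SCORES for score in item_scores):
        raise ValueError("each item score must be 0, 1, or 2")
    total = sum(item_scores)
    text_based = sum(item_scores[i] for i in text_based_items)
    return total, text_based, total - text_based


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-student item-score lists."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical student: full credit on items 0-9, partial credit on items 10-19.
print(score_student([2] * 10 + [1] * 10, range(10)))  # (30, 20, 10)
```

In practice, `cronbach_alpha` would be applied separately to the 10 text-based items and the 10 inferential items, mirroring the two coefficients reported above.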

Procedure
Institutional review board (IRB) approval was secured from all relevant institutions prior to any data collection activities. Students first completed the ESCOLA test, which lasted approximately 40 minutes. Immediately after, they were given an expository text regarding the endocrine and gastro-intestinal systems and digestion. Students were given 50 minutes to read the text, after which the text was collected and the reading comprehension test was administered.

Data analysis
We screened the data for outliers within the scales of ESCOLA as well as the reading comprehension measure and evaluated the data for requisite assumptions prior to data analysis. The outlier analysis revealed the presence of 16 outliers (6 within planning and 10 within evaluation of the ESCOLA subscales). These outliers were detected using the casewise diagnostic command in regression by specifying standardized residuals beyond three standard deviations. Neglecting to address outliers may undermine the trustworthiness of results due to the undue influence these outliers exert on variable descriptive statistics (Tabachnick & Fidell, 2013). Thus, these outliers were omitted from data analyses, leaving 174 complete cases for analysis. These data met all assumptions, including normality, homoscedasticity, and linearity. Therefore, data analysis proceeded without making any adjustments to the data. Descriptive statistics for the ESCOLA scales and reading comprehension performance are presented in Table 2. Pearson's zero-order correlation coefficients were calculated to answer part of the first research question, and these are presented in Table 3. We subsequently conducted a series of simultaneous/standard ordinary least squares regressions to answer the second research question, on the proportion of variance in comprehension performance accounted for by each component of metacomprehension, adjusting the a priori p-value using the Bonferroni adjustment for multiple analyses. Table 3 shows that all correlations were in the theoretically expected direction (i.e., positive). Interestingly, the planning and evaluation metacomprehension components correlated more strongly with inferential comprehension performance than with text-based comprehension performance, whereas monitoring correlated more strongly with text-based comprehension questions.
Results of the simultaneous regression with text-based comprehension questions as the criterion revealed that planning, monitoring, and evaluation were not significant predictors of text-based comprehension question performance, F (3,170) = 2.45, p = .06. As would be expected based on Table 4, the predictor of text-based comprehension performance that came closest to statistical significance was monitoring (p = .07).

The simultaneous regression results with inferential comprehension questions as the criterion were statistically significant, F (3,170) = 7.38, p = .001, R² = .12. However, only planning and evaluation were significant predictors, and evaluation was the strongest predictor of inferential comprehension question performance (see Table 4). Specifically, questions that related to students' evaluation of themselves as readers and their ability to plan effectively were related to their inferential performance.
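The outlier screening described in the data analysis above flags cases whose standardized residuals fall beyond three standard deviations. The rule can be sketched as follows, under two simplifying assumptions that are not taken from the study: a single predictor is used (the regressions reported here included three ESCOLA components), and residuals are standardized by the residual standard error, sqrt(SSE / (n - 2)). The function names are hypothetical, and this is an illustration rather than the statistical package's own casewise diagnostic.

```python
from math import sqrt


def standardized_residuals(x, y):
    """Fit y = a + b*x by ordinary least squares and return residual / s,
    where s = sqrt(SSE / (n - 2)) is the residual standard error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s = sqrt(sum(e * e for e in residuals) / (n - 2))
    if s == 0:  # perfect fit: no case can be an outlier
        return [0.0] * n
    return [e / s for e in residuals]


def flag_outliers(x, y, cutoff=3.0):
    """Return indices of cases with |standardized residual| > cutoff."""
    z = standardized_residuals(x, y)
    return [i for i, zi in enumerate(z) if abs(zi) > cutoff]


# Made-up data: a perfect line with one grossly shifted case at index 10.
x = list(range(20))
y = [2 * xi + 1 for xi in x]
y[10] += 50
print(flag_outliers(x, y))
```

Cases returned by `flag_outliers` would then be inspected and, as in the analysis above, possibly omitted before fitting the final models.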

Participants
A distinct sample of 87 7th and 8th grade students (independent of the Study 1 sample) was recruited from four different Chilean schools (two public schools and two private schools). As in Study 1, these schools had scores similar to the national average on the National System of Evaluation of the Quality of Education. Again, students from entire, randomly chosen classes participated in the study. This led to a similar distribution of males and females. Their mean age was 12.75 (SD = 0.76).

Measures
Reading comprehension test. Students' reading comprehension was assessed using comprehension questions for two texts: one about extinction and one about the endocrine system (Thiede, Wiley, & Griffin, 2011). These texts have been used in studies seeking to assess metacomprehension accuracy (e.g., Thiede et al., 2011). The two texts averaged 380 words in length, and each included five multiple-choice text-based questions and five multiple-choice inferential questions, for a total of 20 questions across the two texts. Each question had one correct answer, and hence, the total possible score for each text was 10. We chose this approach because we did not want to merely replicate texts from Study 1 but to remain consistent with the procedure employed by Thiede et al. (2011). The Kuder-Richardson 20 internal consistency reliability coefficients for the tests were as follows: endocrine = .72; extinction = .69.

Metacomprehension accuracy.
Immediately after reading each text, students made one holistic, overall judgment regarding how many questions out of five they believed they would answer correctly on an upcoming test; this procedure was done for text-based and inferential questions separately. In the metacognitive monitoring literature, these FOKs of future performance are known as global prospective confidence in performance judgments. It is worth noting that this single global judgment is a coarse-grained analysis; including local judgments (i.e., item-by-item) would have provided a finer-grained analysis. During analyses, these judgments were compared with students' actual performance on text-based and inferential questions separately, thereby yielding an absolute monitoring accuracy index, one each for text-based and inferential questions. Some studies calculate metacomprehension accuracy by calculating the correlation between judgments and performance (choosing a relative accuracy measure; Schraw, 2009). However, we elected to use the absolute accuracy approach. Specifically, scores were calculated by comparing participants' global predictions of performance judgments (again, made immediately after reading the text but before the test of performance on the text, i.e., prospectively) against their actual performance scores for each set of five text-based and five inferential questions, respectively. More specifically, we subtracted their average global prospective judgment scores from their actual performance scores (i.e., number of correct responses on the text-based and on the inferential questions). Comparing predictions of performance against actual performance yielded continuous, absolute metacomprehension accuracy scores (henceforth, "metacomprehension accuracy"), as described by Schraw (2009): one text-based metacomprehension accuracy score and one inference-based metacomprehension accuracy score for each passage.
A score of "0" indicates perfect accuracy; on the other hand, the higher the value, and thus the farther away from "0", the greater the metacomprehension inaccuracy. Thus, the higher the score, the less accurate the participant's predictions of comprehension performance on the text.
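The scoring rule described above can be illustrated with a short Python sketch. It assumes that the index is the absolute value of the difference between actual and predicted number correct, which is consistent with 0 indicating perfect accuracy and larger values indicating greater miscalibration, although the text does not spell out the absolute-value step. The function names and the dictionary layout are hypothetical.

```python
def absolute_accuracy(predicted_correct, actual_correct, n_items=5):
    """Absolute metacomprehension accuracy for one set of questions.

    Returns |actual - predicted|: 0 means a perfectly calibrated
    prediction; larger values mean greater miscalibration.
    """
    for value in (predicted_correct, actual_correct):
        if not 0 <= value <= n_items:
            raise ValueError("scores must lie between 0 and n_items")
    return abs(actual_correct - predicted_correct)


def accuracy_by_question_type(predictions, performance):
    """Compute one index per question type for a single passage.

    Both arguments map a question type ("text_based", "inferential")
    to a number of questions out of five.
    """
    return {
        qtype: absolute_accuracy(predictions[qtype], performance[qtype])
        for qtype in predictions
    }


# Made-up student: overconfident on inferential items, calibrated on text-based.
student_predictions = {"text_based": 4, "inferential": 5}
student_performance = {"text_based": 4, "inferential": 2}
print(accuracy_by_question_type(student_predictions, student_performance))
```

Because the index measures distance from perfect calibration, overconfidence and underconfidence of the same size receive the same score; a signed difference would be needed to separate the two.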
We elected to use this measure because measures of absolute accuracy can be used to gauge metacomprehension accuracy (Keener & Hacker, 2012). Students who are proficient at predicting their future performance with reading comprehension tasks are more likely to take appropriate actions that promote metacomprehension accuracy and learning (Gutierrez & Schraw, 2015; Keener & Hacker, 2012; Schraw, Potenza, & Nebelsick-Gullett, 1993), and thus, are of interest to the current study. When students are well calibrated, they are better able to learn from their studied materials (Winne, 2004) because they can take appropriate actions when necessary (Hacker et al., 2008), given task demands.

Procedure
Students completed the reading task for the first text, made a prediction about their subsequent performance, and then answered comprehension questions corresponding to that text. They then repeated this process for the second text. In total, the study lasted 60 minutes. Counterbalancing in this second study was implemented by presenting the texts in alternating order across schools of the same type. More specifically, in 7th grade of one of the public schools, students were first asked to read and make performance judgments about the text that deals with extinction and then about the text that deals with the endocrine system. In the other public school, the order was reversed; that is, first we worked with the text on the endocrine system and then with the text about extinction. In the case of the two private schools, the same procedure for counterbalancing texts was applied.

Data analysis
Data screening and assumption testing procedures were conducted prior to data analysis. Data met the assumptions of normality and linearity. Further, an outlier analysis detected no cases that would be deemed outliers, and hence, all cases were retained for analysis. We subsequently conducted a series of standard/simultaneous ordinary least squares regressions, with each metacomprehension accuracy score (inferential and text-based extinction as well as inferential and text-based endocrine) serving as the criterion in each model, respectively. Finally, in order to evaluate the effects of question level type (text-based, inferential) on reading comprehension performance and metacomprehension accuracy, we conducted two separate one-way multivariate analyses of variance (MANOVAs), one for endocrine and extinction comprehension performance as outcomes and the other for metacomprehension accuracy scores for endocrine and extinction questions as outcomes. We conducted two separate MANOVAs because we sought to more clearly disentangle the unique effects of performance and metacomprehension accuracy. We controlled for Type I error rate inflation using the Bonferroni adjustment. Table 5 presents the descriptive statistics of performance and Table 6 presents metacomprehension accuracy scores as a function of the text type. Table 7 contains the Pearson's zero-order correlation coefficients. Negative correlations between performance and metacomprehension accuracy can be interpreted as better comprehension performance being related to better metacomprehension accuracy, as the higher the comprehension performance, the lower the miscalibration. This was a function of the method we chose to compute absolute metacomprehension accuracy.
Table 7 shows that metacomprehension accuracy was strongly related to question type (i.e., inferential question performance correlated more strongly with metacomprehension accuracy on those items, and text-based performance correlated more strongly with text-based metacomprehension accuracy scores). Results of a standard regression revealed that only inferential question performance on the extinction test was a significant predictor of inferential question metacomprehension accuracy, F (4,82) = 42.21, p = .001, R² = .53. Similarly, text-based question performance on the extinction test predicted text-based question metacomprehension accuracy, F (4,82) = 24.21, p = .001, R² = .37. The same pattern held for the endocrine text. Inferential question performance on the endocrine test predicted inferential question metacomprehension accuracy, F (4,82) = 29.66, p = .001, R² = .40; and text-based question performance on the endocrine test was the only significant predictor of text-based question metacomprehension accuracy, F (4,82) = 28.34, p = .001, R² = .40. Table 8 presents the results of the standard regression models. These analyses suggest that students' appraisal of their understanding will be somewhat dependent on whether they understood the text at a deep or surface level. Because students were not asked to make separate predictions for their inferential and text-based comprehension performance (students, in fact, were not told anything about these distinctions), these different patterns of relations seem to emerge without prompting or priming.

With respect to our final research question, correlations between reading comprehension performance and metacomprehension accuracy revealed that correlation coefficients were higher in absolute value for inferential-level questions (Pearson's correlations ranged from r = .32 to r = .92 in absolute value) than for text-based-level questions (Pearson's correlations ranged from r = .08 to r = .63 in absolute value; see Table 7 for the correlation matrix). Because our absolute metacomprehension accuracy index is interpreted such that higher values indicate greater miscalibration, negative correlations between the comprehension performance and absolute metacomprehension accuracy measures suggest that the higher the comprehension performance the lower the miscalibration, and hence, the greater the metacomprehension accuracy.
Results of the one-way MANOVA with endocrine and extinction comprehension performance as outcomes revealed that question level (text-based, inferential) had a significant effect on the linear combination of dependent variables, multivariate F (2,171) = 17.33, p < .001, η² = .169. Given these significant results, the univariate results were interpreted next. Question level had a significant effect on both sets of questions of the text: endocrine reading comprehension

Table 7. Correlation matrix of reading comprehension performance and metacomprehension accuracy for endocrine and extinction questions by text-based- and inferential-level questions

Findings regarding metacomprehension accuracy for the endocrine- and extinction-related passages indicated that question level (text-based, inferential) had a statistically and practically significant effect on overall metacomprehension accuracy for each passage, multivariate F (2,171) = 9.56, p < .001, η² = .101. Univariate results revealed that question level had a significant effect on students' endocrine, F (1,172) = 10.91, p = .001, η² = .061, and extinction, F (1,172) = 10.88, p = .001, η² = .060, metacomprehension accuracy. Students exhibited significantly greater metacomprehension accuracy for text-based (endocrine, M = 1.57, SD = 1.12; extinction, M = 1.52, SD = 1.04) than for inferential questions (endocrine, M = 2.14, SD = 1.13; extinction, M = 2.07, SD = 1.16).
In sum, findings of our two one-way MANOVAs indicate consistently that students tend to perform better and manifest greater metacomprehension accuracy for text-based level comprehension performance than for questions that require them to draw inferences based on information provided. This pattern generalized across both sets of questions of the text, as it held constant in both questions assessing students' knowledge of the endocrine system and those evaluating knowledge of factors related to extinction.
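For readers who wish to relate the reported F statistics to the reported effect sizes, the following sketch computes partial eta squared from an F value and its degrees of freedom. It assumes that the reported η² values are partial eta squared, which the text does not state explicitly; the function name is hypothetical. Under that assumption, the formula reproduces the two multivariate effect sizes reported above (.169 and .101), while the univariate values come out within about .002 of those reported, a gap that may reflect rounding of the F statistics.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F test:

    eta_p^2 = (F * df_effect) / (F * df_effect + df_error)
    """
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Multivariate tests reported for the two MANOVAs (F, df_effect, df_error).
reported_tests = {
    "comprehension performance": (17.33, 2, 171),
    "metacomprehension accuracy": (9.56, 2, 171),
}
for label, (f_value, df1, df2) in reported_tests.items():
    print(label, round(partial_eta_squared(f_value, df1, df2), 3))
```

Effect sizes recovered this way are only as precise as the rounded F values they are computed from, so small discrepancies from published values are to be expected.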

General discussion
In this series of studies, we sought to better understand how text-based and inferential performance related to self-reported measures of metacognition (Study 1) and students' metacognitive global absolute monitoring judgments of their own reading comprehension made immediately after the text was read (Study 2). Study 1 demonstrated a relation between students' evaluative reading knowledge, as measured by ESCOLA, and their performance on comprehension questions that required inferential reasoning. This suggests that being aware of self-reported use of evaluative comprehension processes and how they relate to success in reading could be a key factor in supporting inferential comprehension. This is specifically related to declarative knowledge of reading strategies and the when, where, and why to apply strategies given task demands (i.e., conditional knowledge), which elucidates how those strategies influence comprehension (Jacobs & Paris, 1987; Palincsar & Brown, 1987; Schraw, 1994). Planning was also a significant positive predictor of inferential comprehension performance. This suggests that learners need more developed planning skills in order to more deeply comprehend texts, especially when those texts are complex, and thus, require inferential reasoning. Presumably, readers with proficient comprehension of texts incorporate planning as an important process to help them generate high-quality inferences (Dunlosky et al., 2002; Schraw, 1994).
Study 2 demonstrated that students' global absolute metacomprehension accuracy was related to their performance at different levels of understanding, and that these relations were significantly higher for questions that required inferential reasoning than for declarative-related questions (i.e., text-based). Students who performed well on inferential questions tended to also have high global absolute metacomprehension accuracy for inferential questions. This implies that readers' evaluations of their comprehension may be dependent on their level of processing of a text (see Kintsch, 1988, 1998). This is consistent with the representation assumption from the levels of disruption theory (Dunlosky et al., 2002). Students may base their level of understanding on the disruptions they encounter at a particular level of processing; hence, if students are attempting to generate many inferences, they will likely use their success at generating these inferences to estimate their level of understanding. For readers who do not generate as many inferences (i.e., less skilled readers or perhaps readers who are exerting less effort while reading), estimating their level of understanding will be done at a different level. Thus, if the goal of metacomprehension is to accurately be aware of one's deeper level understanding, it will be necessary to have adequate inferential abilities. Because evaluation involves generating an appropriate judgment about the success of comprehension (e.g., Hacker, 2014; Schraw, 2009; Thiede et al., 2009), it makes theoretical sense that inferential skills are related. It is important to note that we employed global absolute monitoring judgments provided immediately after the text was read. Nevertheless, research shows that delayed judgments tend to be more accurate than immediate judgments (e.g., Van Overschelde & Nelson, 2006; Weaver & Kelemen, 1997).
Next, we propose an explanation of how inferential skills may influence metacomprehension and then provide suggestions for how this account can be supported or refuted. We base this both on the results of the current studies and on the levels of disruption theory (Dunlosky et al., 2002).

The influence of inferential processes on global absolute metacomprehension and understanding
One explanation for the differences in metacomprehension accuracy is that readers may be using different cues to generate their judgments of understanding. Dunlosky et al. (2002) proposed the "levels of disruption" theory, which assumes that when readers make judgments about the understanding of a text, they base this prediction on cues that are derived from three possible disruptions that threaten the flow of reading: the inference assumption, the accuracy assumption, and the representation assumption. The latter proposes that disruptions can occur at different levels of representation of the text. When disruptions occur primarily at one specific level, readers' metacomprehension accuracy is based on that level of the text rather than on high-quality inferences.
According to other research, readers who excel at text-based comprehension performance understand the information of the text depending on the explicit and local information between adjacent ideas (McNamara & Magliano, 2009; Soto, Rodriguez, & Gutierrez de Blume, 2018). However, these same readers have some limitations in the monitoring cues to generate a more accurate judgment about their comprehension because at times that explicit information involves different dimensions (e.g., explicit ideas, words, or simply details of the text). In fact, Soto, Rodriguez, and Gutierrez de Blume (2018) employed the error detection paradigm among students with intellectual disabilities and found that internal inconsistencies were a significant predictor of metacomprehension accuracy. However, the researchers showed that inferential readers operate using more sophisticated cues, such as self-explanation or elaboration. Our assumption is that the mental representation of readers with high inferential comprehension performance involves more coherent representations of the text, which, in turn, yields a greater alignment between performance judgments and actual performance (i.e., greater metacomprehension accuracy).
The fact that two different metacognitive profiles could affect how the reading processing is generated, and particularly which information is considered (or not), is useful information for the appropriate representation of the text. For example, McNamara (2004) concluded that the most cohesive texts benefit poor readers, yet they could pose an obstacle for proficient readers. This is especially interesting because it could be evidence that, although metacognition in general affects reading outcomes, the specific metacognitive cues that students use differ depending on the reader's profile, as our findings indicate. In the same way, the results of our present study show that the inferential readers demonstrate less accurate monitoring about the explicit information and details (text-based level), and hence, they have difficulty accessing inferential representation (situation level model) of the text.

Implications for learning and instruction
Our studies indicate that students' understanding of text is a function of the type of material they are reviewing. This is important information for educators to have at their disposal because they can use it to explicitly teach and encourage students to engage with reading materials more deeply, and thus, better understand the information presented. This is especially important for complex information, such as the concepts of biology used in Study 2 (namely, extinction and the endocrine system), which oftentimes are difficult for students, especially middle school students, to fully grasp. Findings from this research suggest that inferential reasoning plays a crucial role in deeper levels of understanding of materials. Hence, educators should emphasize the importance of inferential reasoning in their instruction, and thereby assist students to improve not only comprehension but also metacomprehension monitoring. Previous research has demonstrated that learners who are more proficient in their monitoring of learning are also more self-regulated and successful (e.g., Gutierrez & Schraw, 2015; Hacker et al., 2008; Nietfeld & Schraw, 2002; Schraw, 1994; Schraw & Dennison, 1994). Thus, our results have practical implications and application to classroom educators.
Although our current results cannot suggest which reading skills would be more important for novice readers to practice, they do suggest that interventions, such as the Interactive Strategy Training for Active Reading and Thinking (iSTART; McNamara, Levinstein, & Boonthum, 2004; Snow, Jacovina, Jackson, & McNamara, 2016), that teach metacognitive skills and inferential skills, should be particularly effective at encouraging both better comprehension and metacomprehension.

Avenues for future research, methodological reflections and limitations
Together, the results from these studies provide evidence for the hypothesis that inferential processes are important for successful global absolute metacomprehension (Otero, 2002). However, more work is needed to more fully understand the role of inferential processes in metacomprehension. Ample research suggests that readers engage in inferencing during reading, aiding comprehension; but is this sufficient to yield high levels of metacomprehension? Alternatively, inferencing can occur both during reading and after the evaluation process during regulation. Clearly, this latter explanation is more likely. Still, more work is needed to shed light on the nuances of these processes and how inferential abilities influence metacomprehension in a moment-by-moment manner. Future research should explore these avenues using rigorous methods such as think-aloud protocols, especially online as students are engaged in the task itself. This approach would help clarify these nuances and provide additional empirical support for theoretical assumptions. Likewise, future work should employ both global and local (i.e., item-by-item) as well as absolute and relative monitoring judgments, and researchers should include prospective and retrospective monitoring judgments to provide a clearer, finer-grained analysis of the phenomena we investigated. Furthermore, experimental studies providing training on inferential reasoning should be conducted to ascertain its effect on metacomprehension monitoring. Finally, researchers should delay performance judgments to ascertain whether the delayed judgment effect improves metacomprehension accuracy.
These studies employed a correlational design. Non-experimental designs do not permit the isolation of causal effects, and hence, the inferences and conclusions drawn from correlational data, such as regression, are not as strong. Further, some of our effect sizes were relatively small, and hence, our findings should be interpreted with the magnitude of effects in mind. Additionally, we did not collect performance metrics during reading or real-time reading processing, thereby limiting the utility of our findings during actual reading. We also acknowledge that we did not collect information on prior reading ability, and hence, this may behave as a potential confound in the study, as better readers would likely outperform poor readers by virtue of increased prior reading comprehension. In addition, the data were nested for students within classrooms, and we did not account for this nesting of the data.
Moreover, although we used objective measures such as performance and metacomprehension accuracy, it should be noted that we used a self-report survey to capture metacognitive knowledge, and hence, the survey may not have captured the full dimensionality of metacognition. Also, participants may not have been fully honest in their self-reported data. Despite these limitations, however, the sample sizes for the two studies were relatively large and provided enough statistical power to support our expectations regarding the data. Thus, we believe these studies contribute to the literature on metacomprehension monitoring and how it is influenced by factors such as type of performance (i.e., inferential and text-based).

Conclusions
The results of the present study suggest that students who have high inferential skills are also those with higher self-reported metacognitive knowledge, specifically knowledge pertaining to evaluation. These readers can efficiently repair their mental representations of a text, and exhibit higher metacomprehension accuracy in their inferential level of understanding. On the other hand, readers who operate primarily at the text-based level tend to have high metacomprehension accuracy only on a text-based level. Thus, our combined findings suggest the importance of both inferential skills and metacognitive knowledge of reading strategies, particularly as they relate to evaluation of learning. In turn, readers should be able to better understand while they read and have better regulatory abilities to guide their future efforts.