Implementing a Simple, Scalable Self-Regulated Learning Intervention to Promote Graduate Learners’ Statistics Self-Efficacy and Concept Knowledge

Abstract Learners’ efficacy beliefs are an important determinant of their performance and future study in an academic domain, and recent work has highlighted the complexities associated with promoting learners’ statistics efficacy. The current study tested the effectiveness of a course-embedded, activity-driven intervention, grounded in principles of self-regulated learning, in promoting graduate learners’ statistics efficacy and concept knowledge. The intervention was designed to elicit and scaffold students’ analysis-specific efficacy beliefs and facilitate their monitoring of and critical reflection on their statistics understanding across engagement in an intermediate, graduate-level statistics course. Students’ pre- and post-course statistics efficacy was assessed; students’ statistics concept knowledge was assessed at the completion of the course. Engagement in the intervention explained a moderate amount of variance in learners’ post-course statistics efficacy and concept knowledge; efficacy and concept knowledge scores were higher for students in the intervention condition compared with students in the comparison condition. The findings suggest the use of these activities as a method of prompting students to engage more strategically with learning material in statistics. Implications for future research and pedagogy are discussed.

Students' efficacy beliefs are an important determinant of their performance, effort, and perseverance (Pajares 1996; Schunk 1991). Self-efficacy has been shown to predict students' strategy use and intrinsic interest in a variety of academic contexts (Bong and Skaalvik 2003). Self-efficacy beliefs have also been examined as key predictors of students' motivation and persistence (Skaalvik, Federici, and Klassen 2015) and, in some cases, as a barrier to further study in mathematics domains (Larson et al. 2015). In part for this reason, particular empirical effort has been directed toward understanding the contributions to students' efficacy beliefs and achievement in statistics (e.g., Bartsch, Case, and Meerman 2012; Chiesi and Primi 2010; Zare, Rastegar, and Hosseini 2011). At the same time, improving students' statistics self-efficacy can be complex (Bandalos, Yates, and Thorndike-Christ 1995; Hall and Vance 2010), and fewer studies have leveraged principles of self-regulated learning (SRL) to ground the implementation of interventions designed to promote students' statistics efficacy. Further, effectively promoting students' SRL skills, both generally and in the context of statistics education, presents unique challenges to educators (e.g., Smith et al. 2015). Indeed, statistics educators often face complexities and constraints in attempting to promote students' statistics efficacy while at the same time experiencing a need for interventions and tools that are effective, scalable, and easy to implement (e.g., Bartsch, Case, and Meerman 2012). SRL is commonly conceptualized as thoughts, feelings, and actions that are planned and cyclically adapted to help learners attain goals (Cleary, Callan, and Zimmerman 2012; see Zimmerman 2008 for a conceptual overview of SRL).
Based on this conceptualization, SRL reflects a multidimensional process whereby learners attempt to monitor and exert control over their cognition, motivation, behaviors, and environments with the aim of optimizing their learning (Cleary, Callan, and Zimmerman 2012). SRL has been identified as a 21st-century skill (National Research Council 2011) that supports academic success and has been linked with a range of learning outcomes (e.g., reading comprehension; Follmer 2018; Follmer and Sperling 2020; Sperling et al. 2016; Winne 1996). Importantly, SRL draws on the use of cognitive, metacognitive, and motivational skills (Roebers 2017) and has been shown to improve through intervention (Dent and Koenka 2016).
A hallmark feature of Zimmerman's social cognitive model of SRL is that learners' efforts to be self-regulated are cyclical and are leveraged to support their learning across key regulatory phases. These phases consist of forethought, performance, and self-reflection (Zimmerman and Martinez-Pons 1988). Broadly, forethought processes precede and prepare students for learning. Performance processes support learners' strategic engagement during learning, while self-reflection processes promote learners' evaluation and attribution of their performance to ground future learning efforts. Example regulatory processes of self-regulation include students' use of self-efficacy beliefs (occurring during the forethought phase), metacognitive monitoring (occurring during the performance phase), and self-evaluation (occurring during the self-reflection phase). Feedback obtained from current and prior learning experiences is incorporated by self-regulated learners to inform and improve the establishment of learning goals, monitoring during learning, and the use of targeted and task-appropriate strategies (among other processes) in support of subsequent learning efforts (Puustinen and Pulkkinen 2001).
Self-efficacy is commonly defined as the level of confidence an individual has in their ability to execute a course of action or attain a specific performance outcome (Bandura 1977). Self-efficacy has been shown to directly predict performance and achievement in a range of learning domains (e.g., Cleary and Sandars 2011). Correspondingly, a range of approaches to promoting students' efficacy beliefs have been suggested in previous research, both generally and in statistics education more specifically. Finney and Schraw (2003), for example, discussed the possible benefits of statistics instructors implementing cooperative learning groups and providing targeted, positive feedback in evaluation of learning and related activities. Others, such as Pajares (1996), suggested providing students authentic mastery experiences that afford opportunities for academic success that are performance-based and tailored to the academic domain. Taken together, this and other work (Chiesi and Primi 2009; Chiesi and Primi 2010) acknowledges the importance of fostering positive beliefs and attitudes about statistics and statistics education. Efficacy beliefs can influence students' views of the utility of statistics and statistical reasoning as well as their interest in and willingness to invest effort and embrace challenge in statistics learning (Ramirez, Schau, and Emmioglu 2012). At the same time, postsecondary students' views of and beliefs about statistics are perhaps most directly shaped by their experiences with statistics coursework, suggesting a need for the study of course-based tools and supports that aim to foster students' statistics efficacy.
In an examination of graduate students' academic self-efficacy, Bartsch et al. (2012) examined the effects of a vicarious experience presentation on master's students' efficacy beliefs in a research design and statistics course. Students either engaged in a presentation of a former student discussing their math anxieties and behaviors supporting course success or completed a written activity in which they described the characteristics of a successful research methods student. Students who engaged with the vicarious presentation reported an increase in their academic self-efficacy beliefs. Other work has implemented teaching frameworks premised on the use of active pedagogical techniques and found positive effects on students' statistics self-efficacy but mixed evidence of effects on performance and achievement-related variables (McGrath et al. 2015). Collectively, much of the published research examining the potential of interventions to promote graduate students' statistics efficacy and achievement has implemented supplemental or extracurricular experiences, has relied on brief interventions (i.e., limited dosage), or has been grounded in more extensive course redesign work (e.g., Finney and Schraw 2003). Comparatively less work has centered on examinations of extended and, as described, course-embedded approaches to engendering students' statistics efficacy and, relatedly, concept knowledge.

The Current Study
The study and promotion of students' attitudes and efficacy beliefs is recommended practice and an essential component of instruction and pedagogy in related fields such as mathematics education (e.g., National Council of Teachers of Mathematics 1995; Sonnert, Barnett, and Sadler 2020). As others have noted, however, a similar emphasis on the assessment and promotion of students' attitudes and efficacy beliefs has not been observed in statistics education (Ramirez, Schau, and Emmioglu 2012). Additional work (McGrath et al. 2015) has demonstrated initial promise in implementing activity-driven intervention work aimed at improving graduate learners' statistics efficacy. Despite this work, there remains a need for tested and efficient pedagogical supports designed to improve students' statistics efficacy, knowledge, and understanding (cf. Bartsch, Case, and Meerman 2012). In addition, many of the existing studies examining the contributions of student attitudes and beliefs about statistics to statistics achievement are based on samples of undergraduate learners and those enrolled in introductory statistics courses (e.g., Hood, Creed, and Neumann 2012; Sotos et al. 2009). With a few exceptions, there is a dearth of research examining the effects of supports for learners' statistics efficacy and, in turn, understanding at the graduate level (cf. McGrath et al. 2015). Accordingly, the current study utilizes Zimmerman's model of SRL to frame and examine the contribution of a course-embedded intervention to graduate learners' statistics efficacy and concept knowledge.
The current study is based on two related aims: 1) examine the effect of a course-embedded, activity-based intervention on graduate learners' statistics self-efficacy and 2) examine a secondary effect of the activity-based intervention on graduate learners' statistics concept knowledge. To address these aims, planning and evaluation activities, designed to elicit and scaffold students' appraisals of their statistics efficacy beliefs as well as their monitoring of and critical reflection on their statistics learning, were implemented across 10 weeks of an intermediate, graduate-level statistics course. In the current study, planning and evaluation activities comprised structured, weekly activities that aimed to promote students' strategic planning for and reflection on their studying in the course. Analogous supports have been implemented in recent research in the form of online student response systems designed to promote student engagement in introductory statistics courses (see Muir et al. 2020). The intervention activities implemented in the current study were designed to foster learners' statistics efficacy beliefs in several related ways. First, they provided learners continued practice evaluating their analysis-specific efficacy beliefs in order to situate their ability to reflect on their understanding and ways to improve their understanding. Second, they provided learners with concrete and task-aligned assessment experiences that allowed them to gauge their learning and receive performance feedback. In addition, the intervention afforded learners the opportunity to practice metacognitively monitoring their understanding and performance on these assessment items.
Finally, the intervention elicited students' questioning of the material covered in each unit.
Taken together, the intervention tool implemented in this study was tailored to target three self-regulated learning processes across the three regulatory phases: self-efficacy beliefs (forethought), metacognitive monitoring (performance), and self-evaluation (self-reflection). It was expected that providing students with repeated practice engaging in task-aligned activities designed to promote these regulatory processes would have an overall facilitative effect on their statistics efficacy and understanding. This expectation was based on the finding that learners' use of performance-phase strategic processes, including metacognitive monitoring, as well as self-reflection-phase processes, has been predictive of improvements in self-efficacy in a range of learning tasks and academic domains (e.g., Zimmerman 2008; Zimmerman and Kitsantas 1997; Zimmerman and Kitsantas 1999).
To gauge the primary effect of this intervention activity, pre-course and post-course assessments of learners' statistics efficacy were administered. In addition, to evaluate the effect of the intervention on students' statistics learning, an established measure of students' statistics concept knowledge was implemented at the conclusion of the course. Prior research has demonstrated the importance of aligning the measurement of students' statistics efficacy beliefs directly with the area and topic of statistics under study (Finney and Schraw 2003). For this reason, in the current study, graduate students' efficacy beliefs were assessed using a constructed, task-specific self-efficacy measure (see, e.g., seminal work by Pajares and Miller 1995) that aligned with both the scope and aims of the course and the nature and focus of the assessment of students' statistics concept knowledge. This alignment was intended to facilitate a direct examination of the contribution of the SRL-based intervention to students' post-course statistics efficacy.

Participants and Context
A total of 40 graduate learners (60.0% female) enrolled in graduate degree programs at a large, Mid-Atlantic research university in the United States completed the current study. Approximately 75.0% of participants identified as White/Caucasian, 22.5% as Asian, and 2.5% as Black/African American. Participants were primarily doctoral students (95.0%) and represented a range of graduate majors, including the learning sciences, higher education, strategic management, sport psychology, coaching and teaching studies, and communication studies.
Participants were enrolled in two sections of an intermediate, graduate-level statistics course offered sequentially across two spring semesters. A total of 16 students were enrolled in the first course section, while 24 students were enrolled in the second course section. The course is grounded in frequentist inference with an emphasis on statistical modeling and evaluation (Rodgers 2010; Son et al. 2021). It provides a survey of select inferential analyses based on single dependent variable and repeated measures designs, including analysis of variance (ANOVA), repeated measures ANOVA, analysis of covariance, mixed ANOVA models, linear regression, and foundations of mediation and moderation. The course fulfilled a quantitative methods course requirement listed by the programs in which students were enrolled. Prerequisite experiences necessary to be successful in the course included coverage and understanding of sampling distributions, the central limit theorem, probability concepts, tests of hypotheses, confidence intervals, correlation and regression, and analysis of variance. These prerequisite experiences are covered in one introductory, graduate-level statistics course that precedes the course examined in the current study and is taught by the same instructor. The course emphasizes the development of students' conceptual and procedural understandings as well as the application of select analyses to students' empirical interests and independent practice. A secondary emphasis is placed on fostering students' conditional knowledge related to applied statistical analysis (i.e., knowing when, why, and how to conduct analyses appropriately and in a way that is empirically grounded).
The course is supported by learning outcomes that target students' abilities to: design and tailor statistical models to test specific research questions; evaluate key assumptions undergirding statistical models; use statistical software to build, test, and report quantitative analyses; and critically assess the appropriateness and utility of differing statistical models. Learning activities and assessments are implemented to address each of these broad outcomes.
The course was implemented across 15 weeks of instruction using a HyFlex design. A HyFlex approach was selected in part to accommodate and support students' learning needs given the COVID-19 pandemic. While there is some variation in the manner in which HyFlex course designs are implemented, a primary feature of the design is the integration of online and face-to-face engagement in the same course with the opportunity for students to choose when and how they participate (e.g., Abdelmalak and Parra 2016). In other words, a HyFlex design is intended to be both hybrid (i.e., Hy, based on the blending of both face-to-face and online course structures) and flexible (i.e., Flex, based on students choosing how to engage in the course over time) and to afford meaningful and autonomous student engagement. In the current study, students opted to engage in the course either face-to-face or online on a weekly basis. Students' learning was supported via standard learning management system-based resources, including week-by-week course overviews and learning objectives, folders for submitting course learning activities, and access to course-based datasets. Learning activities were administered and assessed via the learning management system regardless of the manner in which students engaged in the course. Detailed learning modules were developed to support students who engaged online and included written descriptions and demonstrations of principles, assumptions, computations, interpretations, and reporting of select analyses as well as procedural videos demonstrating how to conduct and interpret major analyses via statistical software. Face-to-face meetings consisted of targeted review of specific content and principles and guided practice and hands-on activities with statistical analyses based on provided datasets.
The structure of these meetings was designed to mirror the design of the online learning modules completed by students who engaged in the material online, emphasizing a conceptual and procedural overview of the material and demonstration of key analytical procedures. In this way, face-to-face and online work in the course was designed to be complementary in order to equitably balance variation in students' engagement in the course.

Design and Procedures
This study employed a quasi-experimental design: course sections were randomly assigned to either the SRL-based intervention or a comparison condition, such that assignment to condition occurred at the course section level rather than at the level of individual students. Students in the SRL-based intervention completed planning and evaluation activities (described in detail below) that were designed to elicit their appraisals of their analysis-specific efficacy beliefs and facilitate their monitoring of and critical reflection on their statistics understanding. Students completed a total of 10 planning and evaluation activities across Weeks 2 to 13 of the course (excluding two weeks in which major examinations were administered). Construction and assignment of these planning and evaluation activities aligned with the major analyses covered in the course. The planning and evaluation activities were each estimated to require approximately 25 minutes to complete.
Students in the comparison condition completed a question and answer discussion activity in which they received instructor responses to specific content-based questions intended to improve their understanding of the course material. Specifically, students were provided with 10 online discussion threads, via the learning management system used to support the course, and were asked to pose clarification, elaboration, or other specific questions that helped improve and deepen their conceptual understanding. The questions were expected to be specific (e.g., based on principles, assumptions, procedures, etc. undergirding the major analyses covered in each unit) and referenced (i.e., by including relevant chapter and page numbers, where applicable). Questions were reviewed and addressed via text responses by the instructor in the discussion threads. The question and answer activities were each estimated to require approximately 15 minutes to complete.
Both the planning and evaluation and the question and answer activities were assigned on the same day (i.e., Day 3) of the week; this structure was implemented to promote students' engagement with the course material prior to the required submission of the analysis and application activities (described below). Students completed the same number of activities (i.e., 10) across activity type. All students completed either the planning and evaluation activities or the question and answer activities in the course. Other aspects of the course and its instruction, including the primary instructor; content scope, coverage, and sequence; remaining learning activities and assessments; and the structure and content of the learning modules remained the same across course sections. In terms of remaining course activities, students completed: 10 analysis and application activities (designed to provide students practice computing, conducting, and reporting statistical analyses based on specified research questions and to apply specific analyses to their developing empirical interests); two major examinations (designed to assess students' understanding and application of course concepts and principles); and one cumulative final project (designed to evaluate students' ability to independently select, conduct, and report quantitative analyses to address varied research questions and to critically evaluate obtained findings).
Institutional review board approval was obtained for the current study and informed consent was obtained from all participants; all data were collected in accord with human subjects guidelines provided by the American Psychological Association. Participants completed all measures individually and in a self-paced fashion via Qualtrics. The intervention was also administered using Qualtrics. Participants completed the pre-course assessment of statistics self-efficacy during Week 1 of the course. Students completed the post-course assessments of statistics self-efficacy and statistics concept knowledge in Week 15 of the course. Finally, participants completed a brief demographics questionnaire.

SRL-Based Intervention
Grounded in Zimmerman's social cognitive model of SRL (Zimmerman 2008; Zimmerman and Martinez-Pons 1988), planning and evaluation activities were implemented that promoted students' use of regulatory processes during the forethought, performance, and self-reflection phases of SRL as they engaged with the course material week-by-week. Each planning and evaluation activity was administered in three sections. In the first section, students were asked to provide efficacy ratings associated with the specific statistical analysis covered in each instructional unit (see Figure 1). They were also asked to summarize, via an open-ended essay response, the aims and procedures of each analysis, with specific emphasis on the variables (i.e., types, levels) examined, the assumptions supporting the analysis, and the procedures for computing the analysis (e.g., calculation of the F statistic in ANOVA-based analyses; see Figure 2). The overarching aims of the first section were to gauge students' confidence with the analytical concepts to be covered in the unit, elicit and scaffold students' analysis-specific efficacy beliefs, and activate students' prior knowledge of relevant analytical concepts and procedures.
In the second section of the planning and evaluation activities, students were provided with two practice test items that assessed their understanding of the analytical concepts covered in each unit. The practice test items were multiple choice (four response options) and emphasized evaluation of students' abilities to interpret statistical output, compute relevant test statistics, evaluate assumptions, and understand the logic of calculation (e.g., partitioning sums of squares). The practice test items included on the planning and evaluation activities were structurally and conceptually related to but distinct from the items included in the measure of statistics concept knowledge (described below). This procedure ensured that variation observed in the measure of concept knowledge was not the result of practice effects (see similar procedures employed in a related examination by Follmer and Tise 2022, under review).
After students completed the test items, they rated their confidence in the correctness of their response on a sliding scale ranging from 0-Not at all confident to 100-Extremely confident (e.g., Nietfeld, Cao, and Osborne 2005). They were then asked to justify their confidence ratings by responding to a prompt asking them to describe why they indicated that level of confidence in their response (see Figure 3 for select output from an example practice test item). Following completion of these confidence ratings, students were provided with correct answers to both of the test items. The overarching aim of the second section was to assess students' calibration and metacognitive monitoring of their statistics understanding and to engender students' metacognitive justifications supporting their confidence judgments.
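The calibration construct targeted in this section can be made concrete with a small computation. One common index is absolute accuracy: the mean absolute difference between a learner's confidence rating (rescaled to 0-1) and the correctness (0 or 1) of the corresponding response. The sketch below is illustrative only; the function name and the item values are hypothetical and are not drawn from the study's materials or data.

```python
# Absolute accuracy as a simple calibration index: the mean absolute gap
# between confidence (0-100, rescaled to 0-1) and correctness (0/1).
# Values closer to 0 indicate better-calibrated confidence judgments.
# All values below are hypothetical and for illustration only.

def absolute_accuracy(confidences, correct):
    """Mean |confidence/100 - correctness| across a set of practice items."""
    assert len(confidences) == len(correct)
    gaps = [abs(c / 100.0 - k) for c, k in zip(confidences, correct)]
    return sum(gaps) / len(gaps)

# Item 1: 80% confident and correct; Item 2: 95% confident but incorrect.
print(round(absolute_accuracy([80, 95], [1, 0]), 3))  # 0.575
```

Under this index, the hypothetical learner's overconfidence on the second item dominates the score, which is the kind of miscalibration the confidence-justification prompts were designed to surface.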
In the final section, students were asked to reflect on their understanding of the analysis and analytical concepts addressed in each unit by responding to the following prompt:

Based on the readings, your engagement in the course material, and your completion of the practice items, reflect on your understanding of [repeated measures analysis of variance]. In what area(s) is your understanding of [repeated measures analysis of variance] strong? In what area(s) do you need to improve your understanding? (see Figure 4)

Students were provided an essay-style response box to facilitate their reflections. Finally, students were given an open-ended item prompting them to identify (1) questions they had regarding the analytical concepts covered and (2) areas of support they felt they needed from the course instructor. The major aims of the final section of the planning and evaluation activity were to engender students' critical self-evaluation as a regulatory subprocess contributing to self-reflection (Zimmerman 2008) and to elicit students' approach motives toward help-seeking (Karabenick 2004).
Taken together, the SRL-based intervention was intended to promote students' awareness, appraisal, and evaluation of their understanding of the analytical concepts covered by cueing their efficacy beliefs, providing scaffolded practice metacognitively monitoring their understanding, and affording opportunity for critical self-reflection across their engagement with the course and learning material. These regulatory subprocesses have been shown to serve as related and cyclical determinants of students' strategic learning across a variety of academic domains (cf. Dent and Koenka 2016; Puustinen and Pulkkinen 2001; see also Winne 1996).

Statistics Self-Efficacy
Students' statistics self-efficacy was measured through an 11-item instrument designed to capture their efficacy beliefs aligned with their conceptual and procedural understanding of statistics (see Appendix A). Given the task- and domain-specific nature of learners' self-efficacy beliefs (e.g., Bong 2002; Woodruff and Cashman 1993), the instrument used in the current study was constructed to align with the major learning outcomes of the course in which the SRL-based intervention was implemented. Items assessed students' confidence in their ability to identify, align, conduct, and evaluate statistical analyses (e.g., "Align statistical analyses to specific research questions."; "Evaluate key assumptions that support specific statistical analyses."). Responses to each item were obtained using a 5-point scale ranging from 1-Not at all confident to 5-Extremely confident. Composite scores were formed by computing the mean of participants' responses to the instrument items. The measure has been used in prior versions of the course as well as related graduate-level introductory statistics courses (Follmer and Tise 2022) and has been shown to relate with and predict students' statistics concept knowledge (where Pearson correlation coefficients ranged from 0.43 to 0.50 and prior examination of the standardized regression coefficient indicated β = 0.47). In addition, prior examination of the instrument has indicated that students' pre-course efficacy beliefs were related to their post-course efficacy beliefs (rs = 0.65-0.72) and has also demonstrated strong internal consistency reliability (based on Cronbach's α = 0.93). In the current study, the internal consistency reliability of scores on the instrument was strong for both pre-course (α = 0.95) and post-course (α = 0.96) administrations of the instrument.
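The scoring described above (item-mean composites and Cronbach's alpha) can be sketched in a few lines. The response matrix below is hypothetical and uses only three items for brevity; the actual instrument has 11 items rated 1-5.

```python
# Sketch of the scoring described above: a composite self-efficacy score as the
# mean of a respondent's item ratings, plus Cronbach's alpha computed across
# respondents. The response matrix below is hypothetical (3 items, 4 respondents).
import statistics

def composite(ratings):
    """Mean of one respondent's item ratings."""
    return sum(ratings) / len(ratings)

def cronbach_alpha(data):
    """data: list of respondents, each a same-length list of item ratings."""
    k = len(data[0])                                  # number of items
    items = list(zip(*data))                          # transpose to item columns
    item_vars = sum(statistics.variance(col) for col in items)
    total_var = statistics.variance([sum(row) for row in data])
    return (k / (k - 1)) * (1 - item_vars / total_var)

responses = [[4, 5, 4], [2, 2, 3], [5, 4, 5], [3, 3, 2]]
print(round(composite(responses[0]), 2))      # 4.33
print(round(cronbach_alpha(responses), 2))    # 0.89
```

Note that `statistics.variance` computes the sample (n - 1) variance; since the same estimator is used for the item and total variances, the choice cancels in the alpha ratio.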

Statistics Concept Knowledge
Students' statistics concept knowledge was measured through a 16-item multiple-choice assessment (four response options) based on applied problems and statistical output. The assessment emphasized evaluation of students' concept understanding, ability to compute relevant test statistics, understanding of key assumptions, and interpretation of analyses and select statistical output (e.g., Follmer and Tise 2022). An example item included in the assessment is presented in Figure 5.
"The following diagram depicts select information from a simple mediation model (N = 300) in which word knowledge is positioned as a mediator of the relationship between reading fluency and language skills".
The diagram depicts the a and b paths but does not show the indirect effect. Based on the diagram and the regression coefficients presented, compute and select from the options below the indirect effect for this model. a. The indirect effect is b = 0.31; word knowledge mediates the relationship between reading fluency and language skills. b. The indirect effect is b = 0.06; word knowledge mediates the relationship between reading fluency and language skills. c. The indirect effect is b = 1.29; word knowledge does not mediate the relationship between reading fluency and language skills. d. The indirect effect is b = 0.03; word knowledge does not mediate the relationship between reading fluency and language skills.
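The product-of-coefficients logic behind items like this one can be sketched briefly: in simple mediation, the indirect effect is the product of the a path (predictor to mediator) and the b path (mediator to outcome, controlling for the predictor). The coefficients below are hypothetical placeholders and deliberately do not reproduce the figure's values, which are not shown here.

```python
# Product-of-coefficients estimate of a simple mediation indirect effect.
# a: predictor -> mediator path; b: mediator -> outcome path (controlling
# for the predictor). The coefficients used below are hypothetical.

def indirect_effect(a, b):
    """Indirect effect as the product a * b."""
    return a * b

# e.g., a = 0.50 (fluency -> word knowledge), b = 0.40 (word knowledge -> language)
print(indirect_effect(0.50, 0.40))  # 0.2
```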
Items were included on the assessment to represent and evaluate understanding of each of the major analyses covered in the course. Items were scored as either correct (1) or incorrect (0). Composite scores were formed by summing across participants' correct responses to items on the measure. Prior examination of the measure of concept knowledge has indicated that students' scores on the measure related strongly to overall course performance (r = 0.70) and demonstrated adequate internal consistency reliability (Cronbach's α = 0.72). The internal consistency reliability of scores on the measure was 0.75 in the current study.

Results
Preliminary analyses were conducted to examine (1) evidence of differences in students' engagement in the course and their pre-course statistics self-efficacy scores based on intervention condition and (2) evidence of an association between students' gender and intervention condition. In the first preliminary analysis, a multivariate analysis of variance (MANOVA) was conducted to examine variation in pre-course efficacy and course engagement by condition. Course engagement was defined as the amount of time (operationalized as the total number of hours) students engaged with the course, including attendance at face-to-face meetings and completion of learning material (e.g., learning modules, viewing of videos) and course activities through the learning management system. Primary statistical assumptions supporting the MANOVA, including homogeneity of variance-covariance matrices (Box's M = 2.02, p = 0.6) and equality of error variances (pre-course statistics self-efficacy: Levene's F(1, 38) = 0.57, p = 0.5; course engagement: Levene's F(1, 38) = 0.12, p = 0.7), were met. A multivariate effect of condition on pre-course efficacy and course engagement was not observed, Pillai's trace = 0.01, F(2, 37) = 0.11, p = 0.9. Condition explained a nominal amount of the variance in the linear combination of pre-course efficacy and course engagement based on partial eta squared, ηp² = 0.01. Similarly, follow-up univariate tests of between-subjects effects were not statistically significant for either pre-course efficacy, F(1, 38) = 0.01, p = 0.9, ηp² = 0.001, or course engagement, F(1, 38) = 0.17, p = 0.7, ηp² = 0.004. In the second preliminary analysis, a Pearson chi-square test of independence was conducted to examine the association between students' gender and condition. All cell frequencies exceeded five in the contingency table. The result of the test was not statistically significant, χ²(1) = 1.11, p = 0.3.
The phi coefficient was small and likewise not statistically significant, ϕ = 0.17, p = 0.3. Evidence of an association between students' gender and condition was not obtained.
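A minimal sketch of this second preliminary analysis, using a hypothetical 2 × 2 gender-by-condition table (the cell counts below are illustrative, not the study's data). Setting `correction=False` in SciPy yields the uncorrected Pearson chi-square statistic of the kind reported above; for a 2 × 2 table, the phi coefficient is the square root of χ²/n.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 contingency table (rows: male/female,
# columns: comparison/intervention); all cell counts exceed five,
# as in the analysis described above.
table = np.array([[8, 6],
                  [12, 14]])

# correction=False gives the uncorrected Pearson chi-square
chi2, p, dof, expected = chi2_contingency(table, correction=False)

# Phi coefficient for a 2x2 table
n = table.sum()
phi = np.sqrt(chi2 / n)
```

The same call generalizes to larger tables, in which case SciPy applies no continuity correction regardless of the flag.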
Descriptive statistics for primary measures of pre-course and post-course statistics efficacy and statistics concept knowledge are presented in Table 1. Scores representing students' pre-course and post-course statistics efficacy were related based on Pearson's correlation coefficient, r = 0.49, p = 0.001. Students' post-course efficacy and post-course concept knowledge were likewise discernibly correlated, r = 0.47, p = 0.003.
To evaluate the effect of the SRL-based intervention on students' post-course statistics efficacy, analysis of covariance (ANCOVA) was conducted. Students' initial (i.e., pre-course) statistics efficacy scores and course engagement were modeled as covariates in the analysis, while students' post-course statistics efficacy scores were modeled as the primary dependent variable. In addition, acknowledging existing research demonstrating gender differences in efficacy beliefs by academic domain, including mathematics (e.g., Huang 2013), potential gender effects were also statistically controlled (0 = male, 1 = female). Primary statistical assumptions supporting the ANCOVA, including equality of error variances (Levene's F(1, 38) = 1.16, p = 0.3) and homoscedasticity (Breusch-Pagan χ² = 3.29, p = 0.1), were met. Type III sum of squares estimation was implemented. Significance levels were set at 0.05; 95% confidence intervals are reported. Parameter estimates (given by b as an estimate of the regression coefficient) are reported for each variable in the analysis. Robust standard error estimates are reported (Table 2). Examination of the self-efficacy model indicated a significant effect of condition, F(1, 35) = 4.85, p = 0.03, MSE = 0.31. Condition explained a moderate amount of the variance in students' post-course statistics efficacy scores based on partial eta squared, ηp² = 0.12, as an estimate of effect size for the group mean difference. The mean post-course efficacy score, adjusted for differences based on gender, course engagement, and pre-course efficacy, was higher for students in the intervention condition, M = 4.38, compared with students in the comparison condition, M = 3.98, b = −0.40, SE = 0.18, 95% CI [−0.76, −0.04], p = 0.03 (Table 3).
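An ANCOVA of this form can be expressed as a linear model of the post-course outcome on condition plus the covariates, with adjusted group means computed at the covariate means. The sketch below uses simulated data (all values hypothetical) and statsmodels; HC3 is one common choice of heteroscedasticity-robust standard error and is an assumption here, as the specific robust estimator is not named in the text.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 40

# Simulated data mirroring the design: condition (0 = comparison,
# 1 = intervention), gender (0 = male, 1 = female), pre-course
# efficacy, and course engagement (hours)
df = pd.DataFrame({
    "condition": np.repeat([0, 1], n // 2),
    "gender": rng.integers(0, 2, n),
    "pre_efficacy": rng.normal(3.5, 0.5, n),
    "engagement": rng.normal(60, 10, n),
})
df["post_efficacy"] = (3.0 + 0.4 * df["condition"]
                       + 0.3 * df["pre_efficacy"]
                       + rng.normal(0, 0.5, n))

# ANCOVA as a linear model: post-course efficacy on condition,
# adjusting for pre-course efficacy, engagement, and gender,
# with HC3 robust standard errors (an assumed choice)
model = smf.ols(
    "post_efficacy ~ condition + pre_efficacy + engagement + gender",
    data=df,
).fit(cov_type="HC3")

# Adjusted group means: predictions for each condition with all
# covariates held at their sample means
base = df[["pre_efficacy", "engagement", "gender"]].mean().to_dict()
adj = {}
for c in (0, 1):
    row = pd.DataFrame({**base, "condition": c}, index=[0])
    adj[c] = float(model.predict(row).iloc[0])
```

Because the model is linear, the difference between the two adjusted means equals the condition coefficient, which corresponds to the b estimate reported for the group contrast.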
Finally, an ancillary effect of the SRL-based intervention on students' statistics concept knowledge was examined. Analysis of covariance was again conducted; students' pre-course statistics efficacy scores, course engagement, and gender were statistically controlled. An incomplete response to the concept knowledge assessment was provided by one student; as a result, the analysis was based on 39 rather than 40 total participants. Primary statistical assumptions supporting the ANCOVA, including equality of error variances (Levene's F(1, 37) = 3.88, p = 0.1) and homoscedasticity (Breusch-Pagan χ² = 2.80, p = 0.1), were again met. A significant effect of condition, F(1, 34) = 4.77, p = 0.04, MSE = 2.06, was observed; condition explained a moderate amount of the variance in students' statistics concept knowledge, ηp² = 0.12. The mean concept knowledge score was higher for students in the intervention condition, M = 14.33, compared with students in the comparison condition, M = 13.27, b = −1.06, SE = 0.49, 95% CI [−2.05, −0.07], p = 0.04.

Discussion
Promoting students' efficacy beliefs in statistics contexts reflects a unique and complex task. The primary purpose of the current study was to examine the effectiveness of a course-embedded, SRL-based intervention in supporting graduate learners' statistics efficacy and concept knowledge. Completion of the 10 planning and evaluation activities across Weeks 2 to 13 of the course appeared to engender higher efficacy beliefs among students. Adjusted post-course efficacy scores were higher for students in the intervention condition compared with students in the comparison condition, suggesting increases in students' conceptual and procedural efficacy beliefs. Students completing the planning and evaluation activities also evidenced higher concept knowledge at the end of the course. The intervention explained a moderate amount of variance in both statistics efficacy and concept knowledge.
The obtained findings have related implications for both pedagogy and theory. Broadly, the current work suggests that providing graduate students with repeated opportunities to appraise their efficacy beliefs, monitor their understanding and performance, and use performance information to ground critical self-evaluations of their learning enhanced their statistics self-efficacy. These effects are in line with theoretical models of self-regulated learning (Puustinen and Pulkkinen 2001; Zimmerman 2008) as well as empirical work grounded in Zimmerman's three-phase model (e.g., Cleary, Callan, and Zimmerman 2012; Cleary and Sandars 2011). One explanation for these findings is that providing students with authentic, mastery-oriented, and task-aligned learning activities, accompanied by feedback, afforded students the opportunity to adjust their study approach, seek more targeted help, and optimize their review of course content and material. Taken together, these planning and evaluation activities, completed across the course, may have prompted students to engage more thoughtfully and strategically with the learning material when compared with a more traditional, discussion-based activity.
Pedagogically, the current work also supports general implications for statistics education. As discussed, the intervention tool implemented in this research was designed to be adaptable and easily fit to a variety of course contexts and learning management systems. It can also be implemented at little to no cost. Further, given the reflective nature of many of the embedded questions, combined with the performance feedback that is provided automatically (i.e., in the second section of the activities), assessment of the intervention activities is not grading-intensive. In this sense, while the questions embedded in these intervention activities need to be purposefully constructed and meaningfully aligned to course material, the intervention work is, on the whole, likely to be cost-effective. While the intervention was implemented with graduate learners enrolled in an intermediate applied statistics course, such a support could easily be modified for undergraduate as well as secondary statistics education settings. Similarly, the planning and evaluation activities implemented in this study could be aligned with a range of statistical content and concepts. In addition, given that such a support can be implemented via standard survey software such as Qualtrics, interventions of similar scope and sequence could likewise be embedded in a variety of course formats, including those that are primarily online, flipped, traditionally in-person, or hybrid (e.g., Gundlach et al. 2015; Loux, Varner, and VanNatta 2016).

Limitations and Future Directions
While the current study provided evidence of the utility of an SRL-based intervention tool in supporting graduate students' statistics efficacy, several important limitations must be noted. First, and perhaps most pressing, for practical reasons, assignment to condition, while random, occurred at the group rather than the individual level. In addition, while efforts were directed toward obtaining an assessment of students' pre-course statistics knowledge, pre-course data representing students' statistics concept knowledge were not available for both sections of the course.¹ For this reason, interpretation of the effect of this SRL-based intervention on students' statistics concept knowledge is based on scores that have not been adjusted for learners' initial concept knowledge and, as such, should be viewed both critically and tentatively. Relatedly, information on additional learner and background factors that may have influenced learners' statistics efficacy beliefs, including age, was not available and could not be controlled in the primary analyses. Based on these considerations, future work would directly benefit from further study of this intervention tool. In particular, future research should more fully examine the possible effects of such an intervention tool on improvements in students' statistics concept knowledge using both experimental and pretest designs that incorporate prior measures of students' concept knowledge.
Next, the intervention, as well as the course as a whole, was implemented during the COVID-19 pandemic. As described, a HyFlex approach was implemented in both sections of the course in order to accommodate and support students' learning needs. As expected, participation in the course varied across students; some students engaged in an online or a face-to-face fashion consistently, while others vacillated between online and face-to-face engagement on a week-to-week basis. Specific emphasis was placed on implementing the course sections in ways that were consistent and complementary. Despite this emphasis, it remains possible that variation in student engagement and implementation of the course by section could have had a direct impact on students' learning-related outcomes, including their efficacy and concept knowledge. Similarly, the effects of external factors (e.g., stressors associated with the pandemic) on students' outcomes are not known. Future research would benefit from more directly examining the effects of learners' engagement, in both the intervention tool and the coursework as a whole, on their improvement in statistics efficacy and knowledge over time.
A further limitation centers on the nature of the measures implemented to evaluate the effects of the intervention activities. Specifically, in this study, measures of students' efficacy beliefs that aligned with their conceptual and procedural understanding of statistics were administered, which reflects a close examination of the effects of the intervention activities. Future evaluations of these activities would benefit from the use of a measure of students' global efficacy beliefs in statistics (e.g., in a manner similar to that which was implemented in Finney and Schraw 2003) to provide a transfer-appropriate examination of the effect of this intervention on increases in students' general self-efficacy.
Finally, the intervention activities developed for this study were designed to be flexibly implemented and to align with and support applied statistics coursework in the educational and behavioral sciences. In addition, the participants examined in this study were graduate learners, many of whom were pursuing doctoral degrees in their respective fields. Despite the specific application of this intervention tool with graduate learners, the structure of the questions included in this research that were designed to foster students' self-regulated learning skills could easily be fit to a variety of applied statistics contexts, and could likewise be implemented with undergraduate learners in courses that emphasize, for example, statistical inference and statistical modeling. In a similar vein, these questions could be adapted to target additional and overlapping SRL skills that have been shown to promote learning and performance among a range of learners (e.g., goal-setting, task interest/value, causal attributions for performance; Cleary and Sandars 2011; Cleary, Callan, and Zimmerman 2012; Follmer and Sperling 2019). However, the applicability, or even utility, of such a tool in contexts that extend beyond a focus on applied statistics, such as statistical theory or data science, or in domains outside the behavioral sciences, remains an open consideration. Additional work is needed to better understand the usefulness of such a course-embedded support in promoting students' attitudes and beliefs in varied statistics education settings.