Evaluation of the effects of discourse-based mathematics instruction on eleventh grade students’ conceptual and procedural understanding of probability and statistics

Abstract This study attempted to develop a literature-informed discourse-based instructional approach and evaluated its effects on eleventh-grade students’ conceptual and procedural understandings in probability and statistics. To this end, a quasi-experimental study that involved two control groups and one experimental group was used. One-way analysis of variance of the baseline data showed that the three groups were comparable at the start of the experiment. After the intervention, the Kruskal-Wallis test analysis performed on the posttest data showed significant differences in conceptual understanding and procedural understanding. Pairwise comparisons of the three groups using Dunn’s test as post-hoc analysis showed that students who underwent discourse-based instruction demonstrated a better understanding of probability and statistics compared to those students who received traditional instruction. The analysis of survey data about students’ learning mathematics through discourse provided an insight into how the intervention strategy helped students improve their learning. Overall, the main finding of the study is that discourse-based instruction enables students to develop a better understanding of mathematical topics when appropriately orchestrated interactive discourse practices occur. The study contributes a discourse-based instructional design framework and empirical evidence to the field of knowledge of teaching approaches. The findings may motivate educators to examine students’ textbooks for mathematical discourse considerations. A possible recommendation for future research would be conducting longitudinal studies to explore the relations between students’ conceptual understanding and procedural understanding in a specific domain of mathematics and to examine how discourse-based instruction affects the relations between conceptual and procedural knowledge.


PUBLIC INTEREST STATEMENT
The rationale behind the study was to enhance students' learning of mathematics through discourse-based instruction by creating opportunities for students to explain their ideas, justify solution algorithms and procedures, compare and exchange problem-solving strategies, and describe problem situations using multiple representations. The study was thus motivated to develop a literature-informed framework for designing and implementing discourse-based instruction and to determine its effects on students' learning outcomes. The study contributed an instructional design framework that substantially assists mathematics teachers in preparing and implementing discourse-based lessons to promote their students' learning. It added empirical evidence to the body of knowledge on teaching approaches, supporting the claim that mathematical discussion enhances mathematical understanding. It also offered an instructional guide that makes research-based information available to school teachers for designing and implementing discourse-based mathematics lessons. Overall, the research processes and findings reported in the study provide insights into how school teachers and policymakers can integrate classroom discourse into curriculum materials and develop discourse-based syllabi and professional training materials.

Introduction
Over the past three decades, teaching mathematics for understanding has remained among the major research focuses in the mathematics education literature (Simon, 2016). Researchers and curriculum designers have consistently emphasized mathematical understanding as the most desirable instructional goal (Haapasalo, 2003; Hiebert & Lefevre, 1986; Kadijevich, 2018; Rittle-Johnson & Star, 2007; Simon, 2016; Star & Stylianides, 2013). However, despite this emphasis, studies have shown that students' poor performance in mathematics, an indication of a lack of mathematical understanding and problem-solving skills, remains a concern in many countries (Cai, 2016; Johansson & Strietholt, 2019). The question of how mathematics should be taught to develop students' understanding of concepts and procedures therefore remains a major focus in educational research (Cruz & Antonio, 2008; Garfield, 1995; Li & Schoenfeld, 2019; Shaughnessy, 2007).
Probability and statistics are among the domains of the school mathematics curriculum in which students have more difficulties and misconceptions (Cruz & Antonio, 2008; Garfield, 1995; Woldemichael, 2015). A study by Woldemichael (2015) in an Ethiopian classroom setting indicated that secondary school students have difficulties in learning and understanding descriptive statistics. This evidence suggests that researchers should continue to look for alternative instructional methods that support the development of both conceptual and procedural understanding of mathematical topics.
As a solution to the problem of inadequate mathematical understanding and problem-solving capacity, mathematics education reform efforts have widely promoted the benefits of using mathematical discourse as an instructional approach for fostering students' construction of mathematical understanding (Chapin et al., 2003; Cirillo, 2013). Sfard et al. (1998) framed the use of discourse as a teaching method as follows: "the question is not whether to teach through conversation, but rather how" (p. 50). Mathematical discourse is viewed as: "a vehicle for constructing knowledge by using concepts interactively, the concepts themselves become clearer and more defined through the practice of relevant language" (Bradford, 2007, p. 41); "a central element of acquiring mathematical knowledge and understanding the nature of mathematics" (Steeley, 2017, p. 1); and "a tool for equity, a vehicle for developing reasoning, and an engine for lasting learning" (Steeley, 2017, p. 7).
Mathematical discourse as a vehicle for students' learning enhances the construction of mathematical understandings (Bradford, 2007; Harbour & Denham, 2021; Steeley, 2017). From this pedagogical perspective, mathematical discourse is expected to create a supportive and equitable learning environment wherein students discuss and share their ideas, articulate their thinking, and compare and justify solution methods (Conner et al., 2014; Harbour & Denham, 2021). The benefits of discourse-based learning opportunities are further supported by research suggesting that comparison, explanation, and exploration are among the promising instructional strategies for developing mathematical knowledge.
Orchestrating and facilitating discourse-based mathematics teaching is thought to promote the construction of mathematical understanding because it creates opportunities for students to discuss and share their ideas, explain their reasoning, compare and justify problem-solving strategies, challenge each other's reasoning through questioning, reflect on and clarify their thinking, and listen to others' viewpoints (Steeley, 2017).
Multiple theoretical perspectives concur that students learn mathematics best when they discuss, compare, share, explain or justify solution methods, construct arguments, reflect on and clarify ideas, and communicate ideas and thinking (Bell & Pape, 2012; Cirillo, 2013). However, studies that explore how mathematics teachers can design and implement discourse-based mathematics lessons, and empirical research on the effectiveness of discourse-based instruction in targeted learning domains (e.g., mathematical understanding in a content-specific domain of mathematics), are scant and limited to students in primary and lower secondary schools; in the Ethiopian context in particular, there are almost none. The present study, therefore, sought to investigate the instructional use of mathematical discourse and its effects on students' conceptual and procedural understanding of the probability and statistics unit topics outlined in the eleventh-grade mathematics syllabus (MOE, 2009a).
Recent shifts in learning approaches in Ethiopia, as highlighted in curriculum and policy documents, promote the use of student-centered teaching methods in school classrooms across all grade levels to improve the acquisition of knowledge and skills. For instance, the national curriculum framework (MOE, 2009b) states that mathematics teaching in Ethiopian secondary schools should enable all students to acquire and develop solid, applicable, and extendable mathematical knowledge and skills, and to appreciate the usefulness and relevance of mathematics. However, based on existing evidence and personal observations, traditional forms of instruction and assessment largely dominate educational practice. As a result of ineffective classroom teaching practices and other factors, the vast majority of secondary students are characterized by inadequate understanding of mathematical concepts and procedures, a lack of mathematical reasoning and problem-solving skills, and negative dispositions towards mathematics. As reported by the Ethiopian National Examination Agency [ENEA] (2010), the majority of secondary school students scored below the average passing mark (50%), which implies that students lacked an adequate understanding of mathematical concepts and procedures across different domains of the mathematics curriculum. A study by Woldemichael (2015) in an Ethiopian classroom setting revealed that students had difficulties in learning and understanding basic descriptive statistical concepts and procedures. Cruz and Antonio (2008) found that a lack of profound understanding of probability and statistics in upper secondary school follows students to university and affects their learning of university courses.
Unless this problem, whose resolution fundamentally depends on classroom teaching practices, is addressed, it may cause difficulties for students in learning mathematics and science subjects and limit their future job opportunities (Li & Schoenfeld, 2019). Despite recommendations for using discourse as a teaching method for advancing students' construction of mathematical ideas and relations, there is a lack of empirical research on mathematics learning through discourse practices; for instance, studies that implemented discourse-based instruction as an intervention strategy for enhancing students' understanding of probability and statistics are scant (Shaughnessy, 2007).
Research has well established the importance of both conceptual and procedural understanding of mathematical topics as the most desirable instructional goals (Star & Stylianides, 2013). Despite an increasing focus on using mathematical discourse as an instructional approach for enhancing students' mathematical understanding and thinking (Buchheister et al., 2019), the literature indicates a lack of a viable model of discourse-based teaching and of empirical research on student learning (Kooloos et al., 2020). The study, therefore, attempted to develop a literature-informed framework for designing and enacting discourse-based instruction and to evaluate its effects on students' procedural and conceptual understanding of the probability and statistics topics outlined in the Ethiopian 11th-grade mathematics syllabus (MOE, 2009a). Findings on the effects of discourse-based instruction on students' conceptual and procedural understanding of mathematical topics would add empirical evidence to the body of knowledge about instructional approaches in the mathematics and statistics education literature.

Conceptual and procedural knowledge of mathematics
Research has established the importance of teaching mathematics for developing both conceptual and procedural understanding (Haapasalo, 2003; Hiebert & Lefevre, 1986; Kadijevich, 2018; Kilpatrick et al., 2001; Rittle-Johnson & Star, 2007; Star & Stylianides, 2013). Both conceptual and procedural understandings are seen as the most desirable instructional goals (Hiebert & Lefevre, 1986; Kadijevich, 2018; Star & Stylianides, 2013) and are expected to provide a strong foundation for building students' mathematical proficiency. Consequently, there have been continual efforts to devise alternative instructional methods that support the development of both conceptual and procedural knowledge (Rittle-Johnson & Schneider, 2015).

Relationship between conceptual and procedural knowledge
There is a long-standing and continuing controversy on the relations between conceptual and procedural knowledge of mathematics (Haapasalo & Kadijevich, 2000; Rittle-Johnson & Alibali, 1999; Schneider & Stem, 2010). There are three theoretical viewpoints on these relations: concepts-first, procedures-first, and bidirectional (Kadijevich, 2018; Schneider & Stem, 2010). The two unidirectional views argue either that the development of conceptual knowledge precedes the development of procedural knowledge (concepts-first view) or the reverse (procedures-first view). In other words, the concepts-first view postulates that "students will initially acquire conceptual knowledge, for example, by listening to verbal explanations, and will then, by practice, derive procedural knowledge from it" (Schneider & Stem, 2010, p. 1955). In contrast, the procedures-first view conjectures that "students initially acquire procedural knowledge in a specific domain, for example, by trial-and-error learning, and then gradually construct conceptual knowledge from it by reflection" (Schneider & Stem, 2010, p. 1955).
The bidirectional view posits bi-directional causal relations between conceptual and procedural knowledge of mathematics (Kadijevich, 2018; Rittle-Johnson & Alibali, 1999; Schneider & Stem, 2010). Bi-directional causal relations mean that an "increase in one kind of knowledge will prompt an increase in the other one as well" (Schneider & Stem, 2010). Rittle-Johnson and Alibali (1999) showed a bidirectional relation between conceptual and procedural knowledge of equivalence in primary school grades.
The currently most widely accepted view is that the relationship between conceptual and procedural knowledge is bi-directional and iterative (Kadijevich, 2018; Schneider & Stem, 2010). That is, increases in one lead to increases in the other (Kadijevich, 2018), and conceptual and procedural knowledge are mutually interdependent and support each other's growth interactively (Hiebert & Lefevre, 1986). Related to the bi-directional causal relations, an important question for research is how different teaching methods affect the relations between conceptual and procedural knowledge in a specific domain of mathematics (Kadijevich, 2018; Rittle-Johnson & Star, 2007; Star & Stylianides, 2013).
As depicted in Figures 1 and 2, the design of discourse-based instruction focuses on engaging students in mathematical tasks that embody the content to be taught, wherein students engage in comparing and contrasting solution strategies, discussing and sharing their ideas, challenging one another's reasoning, explaining and justifying their solution procedures, exemplifying their understanding of concepts, and elaborating problem situations. The discourse-based lesson orchestrated around challenging tasks is structured into individual work, small group discourse, whole-class discourse, and reflection. The discourse-based lesson sequence in the classroom consists of an introduction and task presentation, followed by engaging students with the task in individual work, small-group discourse, whole-class discussion, and reflection. The teacher starts the class by activating prior knowledge, to enable students to engage with the mathematical task(s), and by articulating the goals of each lesson.

Figure 2. A diagrammatical representation of the relationship between study variables.

Purpose of the study
The study aimed to promote a deeper conceptual and procedural understanding of mathematics in Ethiopian schools. The purpose of the study was to develop a framework for the design and implementation of discourse-based instruction and to evaluate its effects on students' learning of probability and statistics topics.

Research questions and hypotheses
To investigate the impact of designing and implementing discourse-based instruction on 11th-grade students' mathematical understanding in probability and statistics, the study was guided by the following research question: What impact does an attempt to design and implement discourse-based instruction have on eleventh-grade students' mathematical understanding in probability and statistics? The two sub-questions addressed in the study are given below.
(a) Are there significant differences in conceptual and procedural understanding of probability and statistics between students who were taught probability and statistics using discourse-based instruction (representing the experimental group) and those who were taught the same unit topics using the traditional lecture method (representing the control groups)?
(b) How do students in the experimental group experience learning mathematics through mathematical discourse?
Below are the specific questions used to address sub-question (a) above:
(i) Is there a statistically significant difference in procedural understanding among the control and experimental groups?
(ii) Is there a statistically significant difference in conceptual understanding among the control and experimental groups?
(iii) Is there a statistically significant difference in overall mathematics performance among the control and experimental groups?

Null hypotheses
• There is no statistically significant difference in procedural understanding of probability and statistics among the control and experimental groups.
• There is no statistically significant difference in conceptual understanding of probability and statistics among the control and experimental groups.
• There is no statistically significant difference in the overall performance in probability and statistics among the control and experimental groups.

Research methodology
The study adopted a quantitative experimental research approach because it sought to determine the effects of discourse-based instruction on student learning outcomes. Sarantakos (2005) viewed research methodology as "a research strategy that translates ontological and epistemological principles into guidelines that show how research is to be conducted" (cited in Bengat, 2015, p. 226). However, care should be taken when making decisions regarding the methodological aspects of the research process as each methodological choice represents a compromise between the ideal and the possible (Krathwohl, 1964).

Research design
Research design serves as a blueprint for conducting a study. It describes the settings under which the study is to be conducted and how the data is to be generated, collected, and analyzed to answer the research questions (Ary et al., 2010; McMillan & Schumacher, 2013). The study aimed to determine whether discourse-based instruction can increase students' procedural and conceptual understanding of probability and statistics. According to Cashin (1995), an approach to evaluate the effectiveness of a method of teaching is to determine changes in students' learning outcomes.
Consequently, an experimental study was chosen as the method of inquiry to examine the effects of discourse-based teaching on students' knowledge acquisition. An experimental study is defined as "research in which variables are manipulated and their effects upon other variables observed" (Campbell et al., 1963, p. 1).
To address its research questions, the study adopted a quasi-experimental design. More specifically, a quasi-experimental posttest-only control group design (Campbell et al., 1963; Goodwin, 2009) was chosen, for two reasons. First, it was difficult to randomly assign individual students to either the control group or the experimental group. Second, the goal of the study was to evaluate the effectiveness of discourse-based mathematics instruction by comparing outcome differences in the dependent variable among groups.
In a quasi-experimental design, intact groups are assigned to control and experimental conditions when randomization of individual participants is not possible (Gall et al., 2003; Shadish et al., 2002). Random assignment to the treatment condition within intact classes increases both internal and external validity (Rittle-Johnson & Star, 2007). Furthermore, a quasi-experimental design makes it possible to conduct evaluation studies in natural classroom settings (Creswell, 2014). However, group comparability should be ensured at the beginning of any experimental study (Creswell, 2014).
A posttest-only control group design is suitable for testing a teaching method on new subject matter (Campbell et al., 1963). The study employed a quasi-experimental design with a posttest-only control group (Campbell et al., 1963; Shadish et al., 2002) to determine the effects of discourse-based instruction on eleventh-grade students' conceptual and procedural knowledge of probability and statistics. It was not feasible to conduct a pretest before the beginning of the experiment because the participants had not been taught all of the content topics considered in the study during their earlier grade levels. The tenth-grade mathematics syllabus does not incorporate topics from probability and statistics. The participants were taught some probability concepts and measures of central tendency and dispersion for ungrouped data in ninth grade.
Following Shadish et al.'s (2002) recommendation to use multiple control groups to strengthen the internal validity of a quasi-experimental design with a posttest-only control group, the study used two intact classes as control groups and another intact class as an experimental group. One of the three intact classes was randomly assigned to the treatment condition (the experimental group, designated Group A; n = 34; 15 females, 19 males), and the remaining two classes served as control groups (designated Group B, n = 31; 16 females, 15 males; and Group C, n = 36; 17 females, 19 males). The random assignment of the treatment condition protects against threats due to "differential selection" (Gall et al., 2003, p. 371).

Study variables
The study involved the method of instruction as the independent variable and participants' performance on conceptual and procedural understanding tasks as the dependent variable. The independent variable is the variable that is manipulated to cause some variation in another variable (Shadish et al., 2002). The dependent variable is the variable that is influenced by the manipulation of the independent variable (Shadish et al., 2002).

Validity and reliability
The initial version of the test instrument that consisted of fifteen short constructed-response tasks was provided for experts in the field to review the construct and content validity. Validity addresses whether the instrument measures what it claims to measure (Portney & Watkins, 2015). Content validity refers to how well the content topics embedded within the test items represent the specified content topics to form an instrument that measures the construct under consideration (Portney & Watkins, 2015). The content validity of test items was specified based on the content and goal of each test item following the content descriptions and learning goals outlined in the mathematics syllabus. The content of the test items that formed the instrument covered core unit topics of probability and statistics outlined in the Ethiopian 11th-grade mathematics syllabus (MOE, 2009a).
Construct validity refers to how well the test items that make up the measuring instrument reflect the characteristics of the constructs being measured (Oppenheim, 1992). It requires that test items be constructed so as to adequately reflect the characteristics of the constructs that elicit item responses (Osterlind, 1998). The constructs of conceptual and procedural mathematical understanding were operationalized through the test items. Construct validity of the test items was checked against theoretical characterizations drawn from a review of the literature on measures of conceptual and procedural knowledge in mathematics. Two senior mathematics teachers independently classified each test task as conceptual, procedural, or undecided, according to which construct the item predominantly reflects. Test items that the coders found ambiguous were discarded.
After the feedback and comments were incorporated, the test tasks were modified and reduced to eleven tasks, which were then pilot tested to check their reliability. Internal consistency reliability, computed separately for the conceptual and procedural tasks, was found to be good: Cronbach's alpha coefficients were 0.78 and 0.72, respectively. The final version of the test instrument, made up of five procedural and six conceptual tasks, was used for data collection.
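For readers unfamiliar with the coefficient, the following is a minimal Python sketch of how Cronbach's alpha is commonly computed from an item-score matrix. It is an illustration only: the function name `cronbach_alpha` and the pilot scores are hypothetical, and the sketch does not reproduce the study's own computation or data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix (one row per student, one column per item)."""
    k = len(scores[0])  # number of items
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item (column) across students.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the students' total scores.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical pilot data: 5 students x 3 items (not taken from the study).
pilot = [
    [2, 3, 3],
    [1, 1, 2],
    [3, 3, 4],
    [0, 1, 1],
    [2, 2, 3],
]
alpha = cronbach_alpha(pilot)
print(round(alpha, 3))
```

In this formulation, alpha approaches 1 when item scores rise and fall together across students, which is why it is read as a measure of internal consistency.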

Experimental and data collection procedures
Before conducting the study, the researcher explained its purpose to the school principal and obtained permission. Two 11th-grade mathematics teachers with similar academic profiles and professional qualifications were purposefully selected. The principal and mathematics department head assisted the researcher in the selection of comparable teachers. The two teachers (hereafter named teacher A and teacher B) expressed their willingness to be involved in the study. Because the teachers were not familiar with discourse-based instruction, they underwent two training sessions, each lasting three hours. The training covered designing and implementing discourse-based lessons, including the choice or design of discourse-eliciting tasks, crafting different questions, setting up and maintaining social and sociomathematical norms, and assessing conceptual and procedural knowledge acquisition (Ryve, 2004). After the training, a coin toss was used to assign the teachers to the experimental group or the control group. To ensure internal validity, the teachers were blind to the hypotheses of the study and were advised not to disclose information about the nature of the experiment to participants (Kocakaya, 2011).

Classroom instruction in the control groups
The control groups (Group B and Group C) were taught with the traditional lecture method and attended the same number of class hours per week, during a similar shift, as the experimental group. Each lesson lasted forty-two minutes. Group B was taught by teacher A, who also taught the experimental group (Group A), while Group C was taught by teacher B, who received the same training but was instructed not to implement discourse-based instruction. The same probability and statistics topics were taught to the three groups.
The traditional lecture method can be characterized as the "chalk and talk" method (Serbessa, 2006). The traditional teaching of mathematics involves showing and telling, in which rote memorization and calculation are highly valued (Lampert, 1990). The discourse is unidirectional, with weak teacher-student and student-to-student interactions (Thompson, 2007) and passive transmission of knowledge from the teacher to students with minimal effort to elicit students' mathematical cognition.
In the control groups, students mostly engaged with procedural and computational tasks that offered few opportunities to construct mathematical ideas and concepts. They spent most of the lesson time listening to the teacher's lecture, taking notes, and copying worked examples that demonstrated procedural rules. Only a few students participated in asking and answering questions. In such classrooms, doing mathematics means following already established rules and procedures stated either in textbooks or by the teacher, and learning mathematics means remembering rules and procedures and carrying out routine tasks (Lampert, 1990).
The lesson on classifying statistical variables as qualitative or quantitative, as taught in the control groups, can be summarized as follows. The teachers began the lesson by stating its goal and introduced the topic by stating and writing the definitions of qualitative and quantitative variables. Students listened to the teacher's explanation of the variables and copied the notes and examples written on the blackboard. There were very limited opportunities for students to exemplify and explain their understanding of concepts and procedures.

Data collection instruments and procedures
At the start of the experiment, baseline data were collected. Three days after the completion of the experiment, after participants had been informed orally and in writing about the purpose of the data collection, posttest data on conceptual and procedural understanding in probability and statistics were collected using a researcher-constructed test instrument. The posttest was administered to the three groups under similar conditions. School teachers and doctoral students invigilated the test. The researcher attended the invigilation as a passive observer.

Statistical methods of data analysis
The statistical data analysis was performed using the Statistical Package for the Social Sciences (SPSS), version 23. Both descriptive and inferential statistical techniques were employed. Parametric or non-parametric inferential tests were chosen according to whether the data met the relevant statistical assumptions: equivalent non-parametric tests were used when the raw scores violated the parametric test assumptions (Pallant, 2016). Accordingly, baseline data on participants' mathematics performance were analyzed using one-way analysis of variance (ANOVA), whereas the posttest data on conceptual and procedural understanding were analyzed using the Kruskal-Wallis test.
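To make the rank-based logic of the Kruskal-Wallis test concrete, the sketch below computes the tie-corrected H statistic in plain Python. It is an illustration under stated assumptions rather than the SPSS procedure used in the study: the function name `kruskal_wallis_h` and the three small score lists are hypothetical. For exactly three groups (two degrees of freedom), the chi-square approximation to the p-value reduces to exp(-H/2), which the example uses.

```python
from collections import Counter
from math import exp


def kruskal_wallis_h(*groups):
    """Return the tie-corrected Kruskal-Wallis H statistic and its degrees of freedom."""
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    # Assign each distinct value the mean of the ranks it occupies (average ranks for ties).
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    # H = 12 / (N (N + 1)) * sum(R_j^2 / n_j) - 3 (N + 1), with R_j the rank sum of group j.
    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N) over groups of tied values.
    correction = 1 - sum(t ** 3 - t for t in Counter(pooled).values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction, len(groups) - 1


# Hypothetical posttest scores for three groups (not the study's data).
group_a = [7, 8, 9]
group_b = [4, 5, 6]
group_c = [1, 2, 3]
h, df = kruskal_wallis_h(group_a, group_b, group_c)
# With df = 2 the chi-square survival function is exp(-h / 2); this shortcut holds only for df = 2.
p_value = exp(-h / 2)
print(round(h, 2), df, round(p_value, 4))
```

Because the statistic depends only on ranks, it makes no assumption that the scores are normally distributed, which is why it serves as the non-parametric counterpart of one-way ANOVA.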

Ethical consideration
The researcher received a letter of approval from the Science College at Bahir Dar University to conduct the experimental study. Before the start of the experiment, the researcher visited the participating school and met with the principal to discuss the purpose and nature of the study. Following a brief discussion, the school administration granted permission to conduct the study. Participation was unlikely to involve any harm or discomfort because the study did not affect or disturb the normal class schedule, and its main intention was to improve the process of learning through which the goals and objectives of mathematics teaching are achieved. To the best of the researcher's knowledge, ethical issues were respected during data analysis and interpretation and in writing and disseminating the research findings.

Data analysis results and discussion
The quantitative data analyses were anchored to the two research sub-questions: (a) Are there significant differences in conceptual and procedural understanding of probability and statistics between students who were taught probability and statistics using discourse-based instruction (representing the experimental group) and those who were taught the same unit topics using the traditional lecture method (representing the control groups)? (b) How do students in the experimental group experience learning mathematics through mathematical discourse?

Null hypotheses
(a) There is no statistically significant difference in mathematics performance among the control and the experimental groups at the start of the experiment.
(b) There is no statistically significant difference in procedural understanding of probability and statistics among the control and the experimental groups.
(c) There is no statistically significant difference in conceptual understanding of probability and statistics among the control and the experimental groups.
(d) There is no statistically significant difference in the overall performance in probability and statistics among the control and the experimental groups.

Analysis of baseline data
The following research question was addressed by analyzing and comparing the baseline data. Participants' scores on the 11th-grade first-semester midterm and final mathematics examinations were used as baseline data to check group comparability in terms of prior mathematical knowledge and skills.

Research question:
Is there a statistically significant difference in mathematics performance among the control and experimental groups at the start of the experiment?
One-way analysis of variance (Pallant, 2016) of the baseline data at the alpha level of 0.05 showed that there were no statistically significant differences in mathematics performance among the three groups at the start of the experiment (F(2, 98) = .32, p = .73). This result indicated the comparability of the control and the experimental groups. Hence, the study consisted of three comparable groups of students (Group A, Group B, and Group C) at the start of the experiment.
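The baseline comparison described above can be sketched in code. The study ran this analysis in SPSS; the snippet below is only an illustrative equivalent using SciPy's `f_oneway`. The scores are randomly generated placeholders, and the group sizes (34, 34, 33) are an assumption chosen only so that the degrees of freedom match the reported F(2, 98); they are not the study's data.

```python
import numpy as np
from scipy import stats

# Hypothetical baseline scores (NOT the study's data). Group sizes are assumed
# so that the total N = 101 matches the reported degrees of freedom.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=55, scale=12, size=34)
group_b = rng.normal(loc=55, scale=12, size=34)
group_c = rng.normal(loc=55, scale=12, size=33)

# One-way ANOVA across the three independent groups.
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)

# Degrees of freedom: (k - 1) between groups, (N - k) within groups.
k = 3
n_total = len(group_a) + len(group_b) + len(group_c)
df_between = k - 1
df_within = n_total - k

print(f"F({df_between}, {df_within}) = {f_stat:.2f}, p = {p_value:.3f}")
```

A p-value above the chosen alpha level (0.05 in the study) would be read as no evidence of a baseline difference among the groups.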

Post-test data analysis
Histograms showed that the distribution of scores in each group has a different shape, which indicates skewness in the scores. The posttest scores for each group were examined for normality using skewness and kurtosis (Orcan, 2020; Warner, 2013). For a small sample size, when the absolute values of the Z ratios for skewness and kurtosis are greater than 1.96, the data are considered non-normally distributed (Orcan, 2020; Warner, 2013). Each Z ratio is determined by dividing the skewness or kurtosis value by its standard error (Warner, 2013).
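As an illustration of the Z-ratio check described above, the following sketch divides the sample skewness and excess kurtosis by their standard errors. The standard-error formulas are the ones commonly used for the bias-corrected sample statistics; that they match what the study's software computed is an assumption. The function name `skew_kurtosis_z` and the example scores are hypothetical.

```python
import numpy as np
from scipy import stats

def skew_kurtosis_z(scores):
    """Return (z_skewness, z_kurtosis): each statistic divided by its standard error."""
    x = np.asarray(scores, dtype=float)
    n = x.size
    # Bias-corrected sample skewness and excess kurtosis.
    skewness = stats.skew(x, bias=False)
    kurt = stats.kurtosis(x, fisher=True, bias=False)
    # Commonly used standard errors for these statistics (assumed here).
    se_skew = np.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2.0 * se_skew * np.sqrt((n**2 - 1) / ((n - 3) * (n + 5)))
    return skewness / se_skew, kurt / se_kurt

# Hypothetical right-skewed posttest scores (NOT the study's data).
rng = np.random.default_rng(42)
example_scores = rng.exponential(scale=5.0, size=34)
z_skew, z_kurt = skew_kurtosis_z(example_scores)
print(f"z(skewness) = {z_skew:.2f}, z(kurtosis) = {z_kurt:.2f}")
print("Flag as non-normal:", abs(z_skew) > 1.96 or abs(z_kurt) > 1.96)
```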
As displayed in Tables 1 and 5, the absolute value of the Z ratio for the skewness of procedural understanding scores in control group one and that of procedural fluency scores in control group two are greater than 1.96, which indicates that the procedural understanding scores are non-normally distributed for the entire sample (Orcan, 2020). Results of the Shapiro-Wilk test support the conclusion that the procedural understanding scores fail to satisfy the normality assumption for the entire sample. Similar observations reveal that the conceptual understanding scores and overall performance scores are non-normally distributed for the entire sample. Graphical techniques and the Shapiro-Wilk test showed that the posttest scores on the dependent variables (procedural understanding, conceptual understanding, and overall performance) did not meet the normality assumption.
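A Shapiro-Wilk check of the kind mentioned above might look like the following sketch, which uses SciPy's `shapiro` rather than SPSS. The three score arrays and their sizes are hypothetical placeholders, not the study's data.

```python
import numpy as np
from scipy import stats

# Hypothetical posttest scores per group (NOT the study's data).
rng = np.random.default_rng(1)
posttest = {
    "Group A": rng.exponential(scale=8.0, size=34),
    "Group B": rng.exponential(scale=6.0, size=34),
    "Group C": rng.exponential(scale=6.0, size=33),
}

alpha = 0.05
results = {}
for name, scores in posttest.items():
    # Shapiro-Wilk: the null hypothesis is that the scores come from a normal distribution.
    w_stat, p_value = stats.shapiro(scores)
    results[name] = (w_stat, p_value)
    verdict = "normality rejected" if p_value < alpha else "normality not rejected"
    print(f"{name}: W = {w_stat:.3f}, p = {p_value:.3f} ({verdict})")
```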
A comparison of three or more independent groups is carried out by applying one-way analysis of variance or nonparametric equivalent tests (Pallant, 2016). Initially, a one-way analysis of variance (ANOVA) was to be used to analyze the posttest scores. However, the posttest data did not meet the normality assumption required to run an analysis of variance (Pallant, 2016). Davison (1999) recommended non-parametric tests when the assumptions for parametric tests are violated. As a nonparametric equivalent to ANOVA, the Kruskal-Wallis H test was used to test the overall hypothesis that no significant differences exist among two or more groups based on mean ranks (Pallant, 2016; Sheskin, 2000).

Table 1 presents the research matrix linking research questions, data sources, and data analysis.

Table 1. Research matrix linking research questions, data sources, and data analysis

| Specified research questions | Data to be analyzed | Data analysis |
| --- | --- | --- |
| Is there a statistically significant difference in mathematics performance among the control and experimental groups at the start of the experiment? | Students' performance in midterm and final examinations served as baseline data. | One-way analysis of variance on baseline data was performed. |
| Is there a statistically significant difference in procedural understanding among the control and the experimental groups? | Students' performance in procedural probability and statistics tasks. | Kruskal-Wallis H test on the raw scores in procedural tasks was performed. |
| Is there a statistically significant difference in conceptual understanding among the control and the experimental groups? | Students' performance in conceptual probability and statistics tasks. | Kruskal-Wallis H test on the raw scores in conceptual tasks was performed. |
| Is there a statistically significant difference in the overall mathematics performance among the control and the experimental groups? | Students' total sum scores in procedural tasks and conceptual tasks. | Kruskal-Wallis H test on the total sum scores in procedural and conceptual tasks was performed. |
| How do students in the experimental group experience learning mathematics through mathematical discourse? | Survey response data | Descriptive statistical analysis |

Table 2. Comparison of groups based on descriptive analysis of baseline data

The assumptions required by the Kruskal-Wallis H test (Laerd Statistics, 2018) are
• The dependent variable should be continuous. Posttest scores served as the dependent variable.
• The independent variable should consist of two or more independent groups. Two control groups and an experimental group were involved in the study. There were three independent groups.
• Independence of observations within each group. The posttest data were collected from three independent groups.
• Variability of the distribution of the data. The histogram for each outcome measure shows that the distribution of scores for each independent group has a different shape. The groups were compared using the mean ranks (Laerd Statistics, 2018).
Several statisticians have stressed that running separate pairwise comparisons across three or more independent groups increases the probability of making a Type I error (Sheskin, 2000). As a result, the Kruskal-Wallis test was chosen as an appropriate method of analysis to determine whether there were statistically significant differences in participants' scores in procedural understanding, conceptual understanding, and overall performance among the three groups. The Kruskal-Wallis test can be used to analyze numerical data that come from experimental, quasi-experimental, and field studies (Green & Salkind, 2011). According to Fancher (2013), an analysis of variance or an equivalent non-parametric test on posttest scores suffices to infer differences among comparable groups.
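The omnibus test described above can be sketched with SciPy's `kruskal`, together with the mean ranks on which the comparison is based. The data below are randomly generated placeholders, and the group sizes (34, 34, 33) are an assumption chosen only to give a total of N = 101; none of this reproduces the study's SPSS output.

```python
import numpy as np
from scipy import stats

# Hypothetical posttest scores (NOT the study's data); group sizes are assumed.
rng = np.random.default_rng(3)
groups = {
    "A": rng.integers(5, 31, size=34),   # placeholder for the experimental group
    "B": rng.integers(0, 26, size=34),   # placeholder for control group one
    "C": rng.integers(0, 26, size=33),   # placeholder for control group two
}

# Kruskal-Wallis H test (tie-corrected) across the three independent groups.
h_stat, p_value = stats.kruskal(*groups.values())

# Mean ranks: rank all scores together, then average the ranks within each group.
pooled = np.concatenate(list(groups.values()))
ranks = stats.rankdata(pooled)  # tied scores receive average ranks
mean_ranks = {}
start = 0
for name, scores in groups.items():
    mean_ranks[name] = ranks[start:start + len(scores)].mean()
    start += len(scores)

n_total = pooled.size
print(f"H(2, N = {n_total}) = {h_stat:.2f}, p = {p_value:.3f}")
for name, mr in mean_ranks.items():
    print(f"Group {name}: mean rank = {mr:.2f}")
```

Because the test works on ranks, a group with a higher mean rank tended to score higher than the others; the omnibus p-value alone does not say which pairs of groups differ.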

Research question:
Is there a statistically significant difference in procedural understanding among the control and the experimental groups?
A Kruskal-Wallis test analysis on posttest scores in procedural understanding at the alpha level of 0.05 revealed a statistically significant difference across the three groups (χ²(2, N = 101) = 7.96, p = .019), with a mean rank of 45.92 for Group B, 44.64 for Group C, and 62.37 for Group A.
Research question:
Is there a statistically significant difference in conceptual understanding among the control and the experimental groups?
A Kruskal-Wallis test analysis on posttest scores in conceptual understanding at the alpha level of 0.05 showed that there was a statistically significant difference among the three groups.

Table 3. Results of one-way analysis of variance of the baseline data

Research question:
Is there a statistically significant difference in the overall mathematics performance among the control and the experimental groups?
A Kruskal-Wallis test analysis on overall performance scores revealed a statistically significant difference among the three groups (χ²(2, N = 101) = 10.04, p = .007), with a mean rank of 43.15 for Group B, 45.61 for Group C, and 63.87 for Group A.
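The reported test statistics can be checked for internal consistency. The sketch below recomputes the p-values from the reported H values using the chi-square approximation with k − 1 = 2 degrees of freedom. It also computes an eta-squared effect size based on H, (H − k + 1)/(N − k). This section does not state which effect size measure the study used, so that formula is an assumption. The function name `kruskal_summary` is hypothetical.

```python
from scipy import stats

def kruskal_summary(h, k, n):
    """Return (p_value, eta_squared_H) for a Kruskal-Wallis H with k groups and n cases."""
    # Chi-square approximation with k - 1 degrees of freedom.
    p_value = stats.chi2.sf(h, df=k - 1)
    # Eta-squared based on H (an assumed choice of effect size, not the study's stated one).
    eta_squared = (h - k + 1) / (n - k)
    return p_value, eta_squared

# H statistics as reported in the text for the two outcomes with complete results.
reported = {
    "procedural understanding": 7.96,
    "overall performance": 10.04,
}

summary = {}
for outcome, h in reported.items():
    summary[outcome] = kruskal_summary(h, k=3, n=101)
    p, eta_sq = summary[outcome]
    print(f"{outcome}: p = {p:.3f}, eta-squared(H) = {eta_sq:.3f}")
# procedural understanding: p = 0.019, eta-squared(H) = 0.061
# overall performance: p = 0.007, eta-squared(H) = 0.082
```

The recomputed p-values agree with the reported values of .019 and .007.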

Summary and discussion of the findings
Over the past two decades, a growing body of research literature in mathematics education has consistently promoted the use of mathematical discourse as a powerful instructional approach for developing students' understanding of mathematical topics and problem-solving capacity (Lampert & Blunk, 1998; Linell, 1998; Ryve, 2004; Sfard et al., 1998; Smith, 2018). Discourse-based learning opportunities are thought to be powerful instructional strategies because they (a) promote active student learning based on interactive participation in disciplinary mathematical discourse practices by allowing students to share and listen to each other's ideas, compare and justify their solution methods, challenge one another's reasoning, construct convincing arguments, and reflect on and clarify their thinking, (b) provide meaningful contexts that nurture the culture of classroom participation, and (c) create a supportive and inclusive learning environment for all students to construct knowledge by fostering interactive mathematical communications (Bell & Pape, 2012, 1996; Harbour & Denham, 2021; Lampert & Blunk, 1998; Steeley, 2017). According to Harbour and Denham (2021), creating discussion-based opportunities has many benefits wherein students across all grade levels (i.e., elementary, middle, and high school) can (a) engage in learning complex mathematical concepts that allow productive struggle and active learning; and (b) share their thinking, justify their thinking, critique the thinking of others, and make connections to others' thinking (p. 1).
There is a lack of empirical research evaluating the impact of discourse-oriented mathematics instruction on students' mathematics learning outcomes, and a lack of a viable discourse-based instruction model that assists teachers in using mathematical discourse to improve their classroom teaching practices (e.g., Henning et al., 2012; Kooloos et al., 2020; Shaughnessy, 2007; Smith & Stein, 2011). This indicates a need for further research to investigate the instructional use of mathematical discourse and to evaluate its effects on student learning outcomes in a domain of mathematics in different cultural contexts. The purpose of this study was to investigate the effects of discourse-based mathematics instruction on students' acquisition of procedural and conceptual understanding in probability and statistics unit topics. The study employed a quasi-experimental posttest control group design that involved two control groups and one experimental group (Campbell et al., 1963; Shadish et al., 2002). Baseline data, representing participants' scores in midterm and final mathematics examinations as administered by the participating school, were gathered from mathematics teachers to check whether the selected intact classes were comparable.

Table 5. Descriptive analysis of post-test data for the three groups
One-way analysis of variance on baseline data showed the comparability of the three groups at the start of the experiment. After the intervention, posttest data were collected from the participants using a researcher-constructed test instrument. Results of the Kruskal-Wallis test analysis on posttest data revealed that there were statistically significant differences in conceptual understanding, procedural understanding, and overall performance among the three groups. Subsequently, pairwise comparisons of the three groups using Dunn's test as post-hoc analysis (Pallant, 2016) were performed. Results of the pairwise comparisons revealed that students who underwent discourse-based instruction demonstrated greater conceptual and procedural understanding in probability and statistics compared to those students who received traditional instruction. The main finding of the study is that discourse-based interventions enable students to develop a better understanding of mathematical topics when appropriately orchestrated interactive discourse practices occur. The effectiveness of such interventions depends on the extent of students' participation in discourse practices of explanation, justification, making conjectures, questioning, comparison of solution procedures, and sharing of ideas with others. The findings reported in the thesis are consistent with other studies (e.g., Berthold & Renkl, 2009; Harbour & Denham, 2021; Smith, 2018). The calculation of effect sizes demonstrated that discourse-based interventions, when designed and implemented appropriately, have practical significance for improving students' understanding of various topics in mathematics. The descriptive statistical analysis of survey data provided an insight into how the intervention strategy links to student learning outcomes.
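The pairwise comparisons with Dunn's test mentioned above can be sketched as follows. This is a minimal hand-written version of the standard tie-corrected Dunn z statistic with a Bonferroni adjustment. The choice of Bonferroni adjustment is an assumption, since the adjustment applied in the study's SPSS output is not stated in this section. The function name `dunn_test` and the example scores are hypothetical.

```python
import itertools
import numpy as np
from scipy import stats

def dunn_test(groups):
    """Pairwise Dunn comparisons; returns {(name_i, name_j): (z, bonferroni_p)}."""
    names = list(groups)
    samples = [np.asarray(groups[name], dtype=float) for name in names]
    sizes = [len(s) for s in samples]
    pooled = np.concatenate(samples)
    n_total = pooled.size
    ranks = stats.rankdata(pooled)  # tied scores receive average ranks
    bounds = np.cumsum([0] + sizes)
    mean_ranks = [ranks[bounds[i]:bounds[i + 1]].mean() for i in range(len(names))]
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    _, counts = np.unique(pooled, return_counts=True)
    tie_term = (counts**3 - counts).sum() / (12.0 * (n_total - 1))
    base_variance = n_total * (n_total + 1) / 12.0 - tie_term
    pairs = list(itertools.combinations(range(len(names)), 2))
    results = {}
    for i, j in pairs:
        se = np.sqrt(base_variance * (1.0 / sizes[i] + 1.0 / sizes[j]))
        z = (mean_ranks[i] - mean_ranks[j]) / se
        p_raw = 2.0 * stats.norm.sf(abs(z))  # two-sided p-value
        # Bonferroni adjustment for the number of pairwise comparisons (assumed).
        results[(names[i], names[j])] = (z, min(1.0, p_raw * len(pairs)))
    return results

# Hypothetical scores (NOT the study's data).
example = {
    "A": [21, 22, 23, 24, 25, 26, 27, 28, 29, 30],
    "B": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "C": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
}
for pair, (z, p_adj) in dunn_test(example).items():
    print(pair, f"z = {z:.2f}, adjusted p = {p_adj:.4f}")
```

Running the pairwise step only after a significant omnibus test, and adjusting the p-values, is what keeps the Type I error rate under control when several comparisons are made.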
The findings provide insights into how mathematics should be taught in secondary school classrooms in an Ethiopian context and help mathematics teachers become aware of the benefits of discourse-based instructional strategies.

Limitations of the study
The findings reported in the thesis should be viewed in light of some inevitable limitations. For instance, the data collected represent participants' mathematical performance in the unit topics covered during the intervention period, not the entire syllabus. Teacher-related variables and some inevitable differences between the control and experimental groups might weaken the internal validity. The chosen study design, although adequately justified, might limit the generalizability of the findings. The interpretation of the findings based on ranked data might weaken the differences observed as ranked data do not capture all information contained in the original data (Pallant, 2016). Nevertheless, nonparametric statistical tests based on mean ranks are appropriate and powerful statistical methods of data analysis when the normality assumption on the dependent variable fails and the sample size is relatively small (Pallant, 2016;Warner, 2013).
Lack of studies conducted in the local context on discourse-oriented teaching might limit possible comparisons of the findings with similar contexts. Conducting task-based interviews may provide an insight into the quality of students' understanding of concepts and procedures in a particular domain of mathematics (Star & Stylianides, 2013). Future research may conduct task-based interviews to explore the quality of students' conceptual and procedural knowledge.

Contributions and implications
The study has contributed to current research on effective teaching practices in several ways by demonstrating discourse-based instruction as an alternative way of teaching mathematics for developing students' mathematical understanding. It added empirical evidence to the field of knowledge of teaching approaches that discourse-based teaching has led to an increased conceptual and procedural understanding of mathematical topics. The evidence may be used to extend the understanding of the link between students' participation in classroom discourse practices and their mathematics learning. It also provides a comprehensive literature review that gives readers a concise list of scholars who have engaged with the research topic, so that researchers conducting similar studies have a starting point. Among the contributions is an intervention manual that may assist teachers in designing and facilitating discourse-based lessons in their classes.
Knowledge about the orchestration of classroom discourse practices can help teachers to improve their classroom teaching practices for promoting students' construction of mathematical understanding (Pourdavood & Wachira, 2015). More specifically, the findings of the study have practical implications for the orchestration of discourse-based mathematical lessons and the design and development of teacher training programs (Shilo & Kramarski, 2018). The findings of the study provide empirical evidence to theoretical perspectives that theorize the role of classroom discourse in mathematics learning. Although it is challenging for many mathematics teachers to create classroom discourse and social norms that equally recognize the contributions of all students (Walshaw & Anthony, 2008), developing and orchestrating rich classroom discourse should be an integral component of classroom teaching practices (Kooloos et al., 2019). To help teachers orchestrate productive classroom discourse, the study contributes a framework for the design and implementation of discourse-based mathematics instruction. The framework not only guides teachers to design and implement discourse-based lessons but also helps teachers to decide on the goal of each lesson. For effective implementation, mathematics teachers should be supported on how they plan, design, and implement discourse-based instruction as a component of professional development programs (Kooloos et al., 2020).

Recommendation for future research
Language plays a role in the learning of mathematics (Hoyles, 1985). Lack of knowledge of mathematical terms, symbols, and notations impedes students from expressing their ideas and thinking. Teachers should pay attention to supporting students in developing appropriate mathematical language at each grade level so that students can effectively communicate their mathematical ideas and reasoning. If the effectiveness of discourse-based mathematics instruction is to be increased, teachers should regularly support students in learning how to engage in discursive mathematical activities (Sfard et al., 1998). Considering the limitations reported, and to increase the validity of the findings and the practicality of the intervention strategy, other researchers may replicate this study in different content domains of the mathematics curriculum across different grade levels using a similar methodological or mixed-design research approach.
Based on the findings of the study, teacher professional development programs and curriculum materials should incorporate discourse-based teaching strategies for the improvement of student learning outcomes. The effectiveness of discourse-based mathematics instruction can be improved by incorporating discourse-oriented activities in school mathematics textbooks and teachers' professional development training programs. For instance, future research may investigate the impact of efforts to help teachers design and implement discourse-based instruction in their classes. It would be highly valuable if other researchers conducted longitudinal studies to explore the relations between conceptual understanding and procedural understanding in a specific domain of mathematics in secondary and upper secondary schools and to examine how discourse-based instruction affects the relations between conceptual and procedural knowledge. Such studies would be useful for extending the discussion of the effectiveness of discourse-based instruction in secondary schools and higher education institutions and for linking the learning outcomes to changes in students' mathematical behaviors.