The effect of physical activities and self-esteem on school performance: A probabilistic analysis

Abstract In the international literature, discussion of the impact of physical activities focuses mainly on improvements in quality of life, particularly in terms of health. However, sports can have other positive effects on individuals that may contribute to improving their academic performance. This study explores the effect of physical activities and self-esteem on the academic performance of high school students in central southern Chile, a topic on which there is currently no evidence. Linear cross-sectional regression and probit methods are used to determine the probability of achieving good school performance. Because there may be selection bias, a Heckman model is also estimated in two stages. Individual, academic, family, and socioeconomic factors are taken into consideration. The data come from a survey applied to 2,010 high school students. The results provide evidence of the positive impact of physical activities and self-esteem on school performance. However, the effect on academic results increases at diminishing rates. That is, engaging in sports and physical education contributes positively (0.24 to 0.05%), but spending too much time engaging in sports negatively affects school performance (0.89 to 0.1%). Negative self-esteem, in turn, has an influence of between −5.8% and −2.9%. Therefore, high school students who engage in sports activities and have positive self-esteem achieve better academic performance. Thus, given the implementation of a government policy to increase the hours of physical education, an improvement in academic results could be expected.


PUBLIC INTEREST STATEMENT
The impact of sports activities is mainly focused on improvements in quality of life, particularly in terms of health. However, sports can have other positive effects on individuals that may contribute to improving their academic performance. This becomes more important when the state of Chile has been defining policies aimed at reducing the hours of physical education in schools, ignoring the effects of these activities not only on the health of students but also on aspects related to self-esteem and academic performance.

Introduction
Discussion of the impact of physical activity (PA) on people is broad, and practicing sports is mainly associated with improvements in quality of life, especially health (Lee et al., 2017; Schmidt et al., 2020). It is widely recognized that physical education, sports, and other types of physical activity provide numerous benefits for young people (Cardinal, 2016). Studies have shown that the benefits vary widely and can affect areas such as development; physical, social, and emotional wellbeing; and cognitive and academic achievement (Biddle et al., 2019; Moeijes et al., 2018). Some researchers have also suggested that physical activity, physical education, and sports not only improve physical development and health but can also strengthen academic performance.
In schools, the importance of physical education and extracurricular activities is sometimes overlooked in favor of more conceptual subjects that have a direct effect on university entrance exam results. This occurs even though the international literature indicates that physical education is important to student performance (Donnelly et al., 2016; Padulo et al., 2019; Podnar et al., 2018; Resaland et al., 2016; Van der Niet et al., 2014). The reason given is that PA and sports increase the flow of blood to the brain, increasing alertness and oxytocin secretion levels and improving self-esteem (Dale et al., 2019; Rassovsky et al., 2019). Other authors mention that physical activity has a positive influence on concentration, memory, and classroom behaviour, showing a positive relationship between PA and intellectual performance (Aguayo et al., 2019; Gallotta et al., 2015). However, other studies have found only an insignificant association between PA and cognitive skills or have concluded that the relationship between PA and academic achievement has not been demonstrated (Howie & Pate, 2012; Rasberry et al., 2011).
There is a belief that "a sound body equals a sound mind" and that physical activity can support skill development among children (Norris et al., 2020; Zeng et al., 2017). However, although parents accept that physical education and sports activities have a place in child development, they do not want sports to interfere with academic activities in the curriculum, based on the belief that academic activities are what lead to better academic results. Additionally, worldwide, there is growing concern about the effects of little PA or participation in sports. National organizations responsible for protecting public health are alarmed by the impacts of minimal physical activity and its social consequences in terms of physical well-being (Ramirez et al., 2004; Rankin et al., 2016).
Additionally, psychological studies have recognized that good self-esteem helps to achieve other important personal results. Therefore, it is considered that self-esteem and self-concept have direct and indirect effects on academic outcomes (Cvencek et al., 2018; Huang, 2011; Marsh & Martin, 2011). Note that some studies indicate that the effect of self-esteem on school performance depends on how it is measured (Valentine et al., 2004). Therefore, it is important to consider the effect of this factor on academic performance.
Overall, considering the effects indicated in the literature, sports could also be expected to contribute to academic performance, without overlooking the fact that academic performance is determined by diverse factors such as parental characteristics, teacher-student relationship, school, infrastructure, socioeconomic strata, and individual student characteristics, among others (Delprato & Chudgar, 2018; Henry et al., 2020; Liu et al., 2020). Therefore, this study considers the relevant variables to explain school performance based on the literature reviewed above.
In the Chilean case, it is important to study the effects of sports and physical education, as results from the physical education component of the Education Quality Measurement System (Sistema de Medición de la Calidad de la Educación, known as SIMCE) have shown high levels of obesity (18%) and overweight (16%) among students. Moreover, 92% of eighth-grade students did not reach the cut-off point established for being in satisfactory physical condition, based on measurements of muscular strength, resistance, flexibility, aerobic capacity, and body mass index (BMI) among 13,585 students. This is a cause for concern not only because of the high levels of obesity found on the SIMCE test, but also because of the impact that students' poor physical condition could have on other related dimensions or costs.
However, in Chile, there is little evidence concerning how sports and self-esteem affect school performance. This issue is relevant for understanding the effects of both national educational policies and the policies of individual educational establishments, because decisions about the curricular and extracurricular hours that students dedicate to PA could affect their school performance. It is especially important in the Chilean case because in 2019 the Ministry of Education eliminated the hours of physical education from the curriculum for students in the last two years of secondary education. Therefore, the main objective of this study is to analyse the impact of PA and self-esteem on the academic performance of high school students. The school performance variable considered is grades, expressed as a dichotomous variable indicating whether the grade is excellent-good or average-poor, which has been widely used in the literature as a proxy variable (Toconi, 2010). Following the literature, the study includes variables associated with student, family, and school that contribute to determining academic performance (Delprato & Chudgar, 2018; Meda et al., 2017; Toconi, 2010). A self-esteem indicator is constructed based on a series of questions taken from the Rosenberg scale. The data used are from a survey applied to 2,010 high school students in central southern Chile. With this information, linear regression and probability models are estimated, together with the Heckman method in two stages.
It should be noted that the literature on the effect of physical activity on school performance has focused on statistical comparisons between groups or on factor analyses that do not imply causality (for example, Donnelly et al., 2016; Podnar et al., 2018; Resaland et al., 2016). Other authors, such as Kari et al. (2017), have recognized that estimating the effect requires addressing the omission of relevant variables and measurement errors in causal models. However, these authors do not include a variable that could have an important effect on school performance, namely self-esteem, which the literature has shown to be a determining factor in school achievement (Giofrè et al., 2017). Therefore, we use a methodology that has not been applied to this issue. Specifically, we use the Heckman method in two stages to correct for sample selection bias and the omission of relevant variables, and we include self-esteem among the explanatory variables of school performance. The use of this method is important because it reduces the underestimation or overestimation of the effect (Cinelli & Hazlett, 2020; Cook et al., 2020; Heckman, 1979).
Consequently, this study aims to answer the following research questions: (1) Is the relationship between PA and academic performance always positive?
(2) Does self-esteem affect the academic performance of students?
(3) Is the causal effect of PA and self-esteem on academic performance maintained when correcting for selection bias and controlling for other determinants?

Materials and methods
To determine the effect of PA and self-esteem on school performance, linear regression models of the probability of achieving "good" academic results were estimated. In addition, Heckman's estimation model in two stages was used (Cook et al., 2020; Heckman, 1979), including assessments of self-perception, school attendance, bad habits, a proxy for physical skills, and sociodemographic variables. The data were obtained from the "Survey of Academic Performance, Sports Activity and Physical Education" applied to 2,010 high school students attending 13 schools (public, private subsidized, and fully private) in Chile. Ethical approval for conducting the survey was granted by the Ethics Committee of the University of Talca, ID 16-2018. The sample is representative of the population of high school students, with a confidence level of 95% and a margin of error of 2.18%.
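As a rough check on the reported sampling precision, the following sketch computes the margin of error for a proportion under simple random sampling. It assumes the most conservative proportion (p = 0.5), a z-value of 1.96 for 95% confidence, and no finite population correction; none of these choices is stated in the text, and the function name is illustrative. Under those assumptions the result is close to the reported figure, if that figure is read in percentage points.

```python
import math


def margin_of_error(n, z=1.96, p=0.5):
    """Margin of error for a proportion under simple random sampling.

    Assumptions (not stated in the article): worst-case proportion p = 0.5,
    z = 1.96 for 95% confidence, and no finite population correction.
    """
    return z * math.sqrt(p * (1 - p) / n)


# Sample size reported in the article.
moe = margin_of_error(2010)
print(f"Margin of error: {100 * moe:.2f} percentage points")
```

A finite population correction, which needs the population size, would shrink this value slightly.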
The survey consists of 18 questions and 45 subquestions (sociodemographic characterization data, number of hours dedicated to sports, number of hours of physical education, self-esteem scale, cigarette and alcohol consumption, drug use, grade point average, educational establishment, school attendance rate, perceived benefits of sports activity, among others). The content validation of the instrument was carried out by a group of six experts. In addition, the estimated McDonald's omega was 0.68, which indicates that the scale questions composed of continuous variables have acceptable reliability, and Cronbach's alpha was 0.86, which is good. Thus, considering that most of the variables are qualitative, we consider that the instrument is adequate. The questionnaire was applied directly in schools with the corresponding authorization from principals and parents and the consent of the students. The average intra-school response rate was between 90 and 98%. The descriptive statistics of the survey are presented in Table 1.
The survey sought to capture information about the school, family, and students. Specifically, the students were asked whether they liked sports, and about their practice of PA, their self-esteem, and their performance. Considering the definitions of PA and sports by Thivel et al. (2018), PA is considered as the performance of sports (extracurricular activity) and physical education (curricular or intra-school activity). Specifically, we measure PA as the average number of weekly hours spent in sports activities and physical education, self-reported by students. The Rosenberg scale, which is the most widely used in the literature, was used to measure self-esteem (García et al., 2019). The score was based on responses related to the level of agreement about how valuable, important, and useful the students considered themselves, as well as the level of agreement about their own qualities, skills, pride, attitude, satisfaction, and respect. In terms of performance, two variables were considered: the student's grade point average and the dummy variable of perception of academic performance (1: good and 0: poor).
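The reliability figure cited above can be illustrated with a minimal sketch of Cronbach's alpha, using the standard formula alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The response data below are hypothetical and serve only to show the computation; they are not the survey data, and the function name is illustrative.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents.

    rows: list of respondents, each a list of item scores of equal length.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(rows[0])
    # Variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Variance of the respondents' total scores.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents, 4 items rated 1 to 4.
responses = [
    [3, 3, 4, 3],
    [2, 2, 2, 3],
    [4, 4, 4, 4],
    [1, 2, 1, 2],
    [3, 4, 3, 3],
]
print(round(cronbach_alpha(responses), 3))
```

Values closer to 1 indicate that the items move together, which is why a value of 0.86 is read as good reliability.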
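The self-esteem score described above can be sketched as follows. The article states that the scale ranges from 10 to 40 points and that scores between 20 and 33 are considered normal. The sketch assumes ten items rated from 1 to 4 and assumes that any reverse-worded items have already been recoded. The labels "low" and "high" for scores outside the normal interval are illustrative names, not taken from the article.

```python
def rosenberg_score(responses):
    """Total score for ten items rated 1 to 4 (range 10 to 40).

    Assumes reverse-worded items were recoded before this step.
    """
    if len(responses) != 10 or any(r not in (1, 2, 3, 4) for r in responses):
        raise ValueError("expected ten responses, each between 1 and 4")
    return sum(responses)


def classify_self_esteem(score):
    """Classify a score using the 20 to 33 'normal' interval from the article.

    The labels 'low' and 'high' are hypothetical names for the other ranges.
    """
    if score < 20:
        return "low"
    if score <= 33:
        return "normal"
    return "high"


# Hypothetical student who answers "Agree" (3) to every item.
score = rosenberg_score([3] * 10)
print(score, classify_self_esteem(score))
```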
With the dependent variable of the grade point average (Y), we proceeded to estimate simple linear regression models through ordinary least squares (OLS), considering as explanatory variables (X) those associated with school (type), family (whether the parent and siblings engage in sports, proxy income (car)), and the student (age, gender, time spent studying and playing sports, hours dedicated to PA (sports plus physical education), self-perception, and self-esteem, among others).
Since there is self-selection among children or young people who decide to play sports, an estimation method is required to correct the selection bias. In other words, the sample we observed was not chosen randomly, so using only ordinary least squares would yield biased coefficients. To correct this, we chose to carry out the estimation using the Heckman method in two stages (Cook et al., 2020). The model's specification consists of two equations: the equation we are trying to estimate (equation 1, academic performance) and an auxiliary regression (selection equation or equation 2) that corresponds to a discrete selection (probit) model that measures the probability of being in the sample. It should be noted that the auxiliary regression has in common continuous variables that are determinants of academic performance, to prevent identification problems.
Specifically, in the first stage, the auxiliary regression is estimated by maximum likelihood to obtain the probability that a student engages in sports or not (S). To do this, a vector of variables $X_2$ is constructed to determine the probability of carrying out sports activities (selection equation):

$$S = \mathbf{1}\left[X_2 \delta_2 + \mu_2 > 0\right]$$

where $\delta_2$ is the vector of estimated coefficients, $\mu_2$ is the error, and the subscript 2 indicates that the term belongs to the auxiliary regression. The inverse Mills ratio, which corresponds precisely to the omitted variable, is then estimated. It is the ratio of the standard normal density $\phi$ to the standard normal cumulative distribution function $\Phi$, both evaluated at the estimated selection index:

$$\lambda = \frac{\phi(X_2 \delta_2)}{\Phi(X_2 \delta_2)}$$

Finally, in the second stage, an ordinary least squares regression is run for the academic performance variable on a vector of student variables $X_1$ together with the previously omitted variable, the inverse Mills ratio. This is the equation of interest (academic performance equation):

$$Y = X_1 \beta_1 + \gamma \lambda + \upsilon_1$$

where $Y$ is the student's grade point average, $\beta_1$ is the vector of estimated coefficients, $\lambda$ is the inverse Mills ratio (which depends on the estimation of the selection equation), $\gamma$ is its coefficient, and $\upsilon_1$ is the error.
The vector $X_2$ shares variables with the vector of characteristics that influence school performance, that is, any characteristic that causes the expected academic performance to increase or decrease. Regarding the errors of both equations, it is assumed that they are independent of the variables that make up each vector ($X_j$) and that the expected value of both errors is equal to zero. Furthermore, given that the elements of $X_1$ are in $X_2$ and the expected value of the error of the selection equation is a function of the latter, we assume that the error of the school performance equation is linear in the error of the selection equation:

$$E(\upsilon_1 \mid \mu_2) = \gamma \mu_2$$

where $\gamma$, the coefficient of this equation, corresponds to the coefficient of the inverse Mills ratio in the school performance equation (Heckman, 1979). Finally, it is assumed that the error of the indicator variable for the performance of sports activities is normally distributed with mean zero and variance one.

Table notes: (a) In the variables with a scale from 1 to 3, 1 is "poor," 2 is "average," and 3 is "good," except drugs, where the choices for frequency of consumption are 1: "never"; 2: "sometimes"; and 3: "regularly." (b) Self-esteem was measured using the Rosenberg scale of 10 to 40 points; self-esteem is considered normal if it is between 20 and 33 points.
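The inverse Mills ratio that links the two stages can be computed directly from the standard normal density and distribution function. The sketch below uses the standard definition from the selection-model literature (density divided by cumulative distribution, evaluated at the selection index); the function names and the example index values are illustrative and are not taken from the article's estimates.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def inverse_mills_ratio(index):
    """Inverse Mills ratio evaluated at a probit selection index."""
    return norm_pdf(index) / norm_cdf(index)


# Hypothetical selection indices: the ratio shrinks as selection
# into the sample becomes more likely.
for index in (-1.0, 0.0, 1.0):
    print(index, round(inverse_mills_ratio(index), 4))
```

In the second stage this value would enter the performance regression as one additional regressor per student.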
It should be noted that a selection bias test is carried out, the null hypothesis of which is that the coefficient of the inverse Mills ratio is not significant; that is, there is no bias. If the null is not rejected, the classical OLS estimation is more appropriate.
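The decision rule behind the selection bias test can be sketched as a two-sided significance test on the coefficient of the inverse Mills ratio. The sketch assumes a normal approximation for the test statistic and a 5% significance level; the coefficient and standard error values are hypothetical, and the function names are illustrative. It follows the logic reported in the results, where a non-significant Mills ratio leads to retaining ordinary least squares.

```python
import math


def two_sided_p_value(coef, se):
    """Two-sided p-value for coef / se under a standard normal approximation."""
    z = coef / se
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def preferred_model(mills_coef, mills_se, alpha=0.05):
    """Keep OLS unless the Mills ratio coefficient is significant."""
    p = two_sided_p_value(mills_coef, mills_se)
    return "heckman" if p < alpha else "ols"


# Hypothetical estimate: small coefficient relative to its standard error.
print(preferred_model(0.02, 0.05))
```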
In addition, a probit model was used to determine the probability that a student's academic performance is good. To do this, the aforementioned regressor variables in the simple linear regression model were included and academic performance was established as the dependent variable. The value of this variable was one if the student had a grade point average equal to or higher than 5.5, and zero if the student's grade point average was below this threshold. This value was defined to match the Chilean Ministry of Education's value for identifying a grade as "good".
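The binary outcome and the probit probability described above can be sketched as follows. The 5.5 threshold comes from the text; the regressors and coefficients in the example are hypothetical and serve only to show how a probit index is mapped to a probability through the standard normal distribution function.

```python
import math

GOOD_THRESHOLD = 5.5  # threshold stated in the article


def good_performance(gpa):
    """Return 1 if the grade point average counts as 'good', else 0."""
    return 1 if gpa >= GOOD_THRESHOLD else 0


def probit_probability(x, beta):
    """P(good performance = 1 | x) = Phi(x . beta) for a probit model."""
    index = sum(xi * bi for xi, bi in zip(x, beta))
    return 0.5 * (1 + math.erf(index / math.sqrt(2)))


# Hypothetical student: constant term, weekly PA hours, PA hours squared.
x = [1.0, 4.0, 16.0]
beta = [-0.2, 0.15, -0.01]  # hypothetical coefficients, not the estimates
print(good_performance(5.8), round(probit_probability(x, beta), 3))
```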

Results and discussion
The literature recognizes a series of benefits derived from physical and sports activities (Guddal et al., 2019; Moeijes et al., 2018; Padulo et al., 2019; Schmidt et al., 2020). For example, physical activities have been shown to improve self-esteem, health, socialization, concentration, grades, and the ability to make friends. Therefore, in this study those aspects were considered. In terms of empirical evidence, the respondents were asked about their perception of these aspects. The data indicate that high school students believe that sports improve or aid health (80.2%), self-esteem (63.7%), and socializing (52.1%) (Table 2). However, for the students it is not so clear that sports help improve grades and concentration, as only 22.5 and 21.9% of the students, respectively, agree strongly that playing sports benefits those aspects.
It was found that on average the students have good self-esteem, with an average sample value on the Rosenberg scale of 26.8 points, which falls within the interval of normal self-esteem (20-33 points). The aspects most highly valued by high school students are the feeling that "I am capable of doing things as well as others" and that "I am a valuable person, at least in comparison to others" (Table 3).
In addition, the simple linear regression model indicates that engaging in sports activities apart from physical education (hours of PA) at school positively influences grades in high school (0.4%), but at a diminishing rate (−0.9%). That is, spending time playing sports improves academic performance, but as the hours spent on sports increase, the additional benefit in terms of grades decreases. This is because the estimated parameter associated with the variable has a positive sign, while the parameter associated with its square has a negative sign.
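The diminishing-rate pattern follows from the quadratic specification: with a positive coefficient on hours and a negative coefficient on hours squared, the marginal effect falls as hours rise and reaches zero at −b1 / (2 × b2). The sketch below illustrates this with hypothetical coefficients; they are not the article's estimates, and the function names are illustrative.

```python
def marginal_effect(hours, b1, b2):
    """Derivative of b1 * hours + b2 * hours**2 with respect to hours."""
    return b1 + 2 * b2 * hours


def turning_point(b1, b2):
    """Hours at which the marginal effect of PA on grades reaches zero."""
    if b2 >= 0:
        raise ValueError("a maximum requires a negative squared-term coefficient")
    return -b1 / (2 * b2)


# Hypothetical coefficients chosen only to show the shape of the relationship.
b1, b2 = 0.5, -0.0625
print("turning point (hours):", turning_point(b1, b2))
for hours in (1, 4, 6):
    print(hours, marginal_effect(hours, b1, b2))
```

Below the turning point an extra hour is associated with higher grades, and above it with lower grades.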
The results above hold in the simple linear regression, in the two-stage Heckman model, and in the probit model for obtaining good academic performance. As shown in Table 4, the selection bias test indicates that there is no bias in the estimation. Therefore, the appropriate model for assessing the influence of sports on grades is the ordinary least squares estimation (1), and the appropriate model for assessing the probability of achieving good academic performance is the probit model (3). Unlike the studies analysed by Howie and Pate (2012), the method used allows the bias of the previous literature on the relationship between physical education and academic performance to be corrected.
The model results indicate that engaging in sports activities apart from physical education at school positively influences grades in high school (0.5%), but at a diminishing rate (−1%). This could be explained by the fact that sport favours concentration and brain activity and generates a positive effect when studying (Aguayo et al., 2019; Dale et al., 2019), but when a student does many hours of sports, a substitution effect occurs in which hours of study are replaced by hours of sports. As indicated by Rosenberger et al. (2019), the day has 24 hours, so the time spent on one activity can only increase by decreasing the time spent on another. This could also explain the result that the assessment of sports skills has a negative influence on the variable of school achievement. In other words, the stronger the student's belief that he or she is good at sports, the lower their grades. This could be explained by the fact that if a student considers herself to be good at physical activities but not at academic ones, she may dedicate more time to sports and reduce hours spent studying, since students in the same course have the same number of hours of physical education. In other words, students may spend more time on the activity for which they think they have better skills.
Thus, the results show that the number of hours dedicated to physical activities, both intra-school (physical education) and extracurricular (sports practice), influences the school result (grades and performance). This is consistent with Rasberry et al. (2011), who indicate that the intensity and duration of sports activity do influence school achievement. The previous literature had already shown the positive effect of PA, but had not shown that it increases at decreasing rates, that is, that there is an optimal level of time dedicated to these activities.
Therefore, it is beneficial to encourage PA and increase students' hours of physical education as an educational policy, provided that this does not imply a workload that negatively affects academic performance. Currently in Chile, physical education is limited to two hours; it would therefore be advisable to increase this allocation, not only because it improves academic performance, as we have shown, but also because it has other positive effects on students (Norris et al., 2020; Schmidt et al., 2020).
Moreover, the variables associated with whether the student's mother or sibling plays sports also have a positive influence. On the one hand, this result is consistent with studies of determinants of school achievement that demonstrate the role of the mother in the performance of the child (Delprato & Chudgar, 2018; Qurban et al., 2019). On the other hand, the higher the level of sports activity carried out by their parents, the more likely a child is to play more sports and with more intensity (Rodrigues et al., 2018).
The results show that having good self-esteem positively affects grades (Table 4). Conversely, low self-esteem has a negative impact: the influence of negative self-esteem on school performance is verified, with an impact of −5.8%. This is consistent with the result obtained by Giofrè et al. (2017), even though a different methodology is used to measure self-esteem and estimate the effect. Additionally, our results are consistent with the finding that the higher the self-esteem, the more PA is performed (Qurban et al., 2019), which in turn contributes to improving performance (Kayani et al., 2018). However, this relationship can also be bidirectional because carrying out more sports activities contributes to self-esteem (Andermo et al., 2020; Dale et al., 2019; Kayani et al., 2018). Previous studies, such as that of Kayani et al. (2018), perform factor analyses that measure the correlation between these variables, but correlation does not imply causality (Rohrer, 2018). Therefore, we explore how self-esteem and sports activities independently affect academic performance, finding that both positively affect the academic performance of high school students in Chile.

Table notes: (a) Sample of 2,010 students; self-rating from 1 to 4 (1: "Strongly disagree"; 2: "Disagree"; 3: "Agree"; 4: "Strongly agree"). (b) The Rosenberg scale ranges from 10 to 40 points; self-esteem is considered normal if it is between 20 and 33 points.
Other important determinants that have a positive influence on academic performance as measured by grades are age, attending a subsidized private school, number of cars in the household (income proxy), and hours spent studying (Table 4). Meanwhile, grades are negatively affected by poor class attendance and drug use. The sign of the estimated coefficient of this last variable is consistent with the findings of Meda et al. (2017), who also found that drug use negatively affects academic performance. Thus, the estimates show that confidence and commitment measures affect school achievement, as indicated by Cvencek et al. (2018).
Thus, physical education, promotion of school sports and self-esteem are not only important from a health perspective, but also because of their influence on academic performance. This is the basis of the importance of schools considering this aspect not merely as an activity aimed at improving student health but also as a determining factor of student academic and personal development. Therefore, sports should be encouraged both inside and outside educational establishments. Schools should also promote self-esteem, not only for its intrinsic value in terms of emotional health but also because of its importance in explaining the academic performance of high school students.
In addition, considering the results of this study, the increase in the number of hours of physical education (from three to four hours per week) implemented by the Ministry of Education in 2013 will have indirect effects on academic performance, although the aim of the policy change was to improve the health of the country's students. Moreover, because it is only a one-hour increase, it is likely to generate only positive effects, since dedicating many hours to sports can also have a negative influence on performance (given the negative sign of the parameter associated with the squared sports variable). However, the curricular change of the Chilean Ministry of Education in 2019 to eliminate the hours of physical education for the last two levels of secondary education, could imply negative effects on school performance if these hours are not adequately compensated by extra-curricular sports. This could be considered an inappropriate educational policy because, as shown by both our results and those of Aguayo et al. (2019), there are positive effects on school performance.
For other countries or states, the results of this study show that care must be taken in the allocation of curricular hours by subject. Devoting more time to the subjects that are traditionally considered the most relevant (for example, mathematics, language, and science) to the detriment of hours of physical education would not necessarily be favourable for children's learning. As Whitehead (2019) indicates, physical education in education can be a means or an end.
This is even more so considering that PA can improve academic performance not only in the short term but also in the long term in subjects such as mathematics and language (Mullender-Wijnsma et al., 2016). Furthermore, according to Resaland et al. (2016), PA improves school performance especially in children with lower school performance.
Although the findings provide evidence of the statistical significance of sports as a determinant of grades, the results should be approached with caution. This is because the impact is small and the predictive capacity of the models is relatively low, which may be due to the omission of important variables such as household income. That variable could not be directly considered because the survey was applied to children and youth, who are generally unaware of their household income. Therefore, the number of cars in the household and the type of dwelling (owned or rented) were used as proxies for income, but while the first is significant, the second is not.

Conclusions
The main descriptive statistical findings provide evidence that students recognize that playing sports and engaging in physical activity generate benefits related to health, self-esteem, and socialization. In other words, on average the students self-report a positive perception of the benefits of PA. What is most important, given the objective of this study, is that the econometric estimation, correcting the statistical problem of selection bias, demonstrated a positive causal relationship between academic performance, self-esteem, and sports activities, which is statistically significant. However, while spending time engaging in sports improves academic performance, as the hours spent on sports increase, the additional benefit in terms of grades diminishes. That is, an increase in hours of PA increases academic performance at decreasing rates. Therefore, the relationship between PA and academic performance is not always positive. Thus, the results suggest that the national policy to increase physical education hours in schools will lead to an improvement in academic results in high school.
Additionally, according to the estimations, among the other determinants of school performance, variables related to bad habits (drug use), low self-esteem, and class absence negatively influence school performance. Meanwhile, age, attending a subsidized private school, having positive self-esteem, number of cars in the household, and number of hours spent studying have a positive influence.