The relationship between TIMSS mathematics achievements, grades, and national test scores

ABSTRACT The aim of this paper was to examine the relationship between Trends in International Mathematics and Science Study (TIMSS) mathematics achievement and the two school achievement measures of grades and national test results in Sweden. A further aim was to examine the association of TIMSS mathematics achievement with different subgroups of students. The results show that there is a strong positive relationship between TIMSS mathematics achievement and both national test results and grades from grade 6 and grade 9. Students from more educated homes performed better overall on TIMSS than those from less-educated homes, and the relationships between TIMSS and the school achievement measures were stronger for students from more educated home backgrounds. The school context and the students’ background had an impact on the students’ TIMSS results. The results have implications for how one should view the results from TIMSS as a measure of student mathematics achievement and thus how TIMSS results can be used in a national context.


Introduction
Results from the Trends in International Mathematics and Science Study (TIMSS) assessment in 2011 showed that Sweden had declining results in mathematics achievement (Mullis, Martin, Foy, & Arora, 2012). This pattern changed for TIMSS 2015, where Sweden showed increasing results in mathematics achievement (Mullis, Martin, Foy, & Hooper, 2016). An ongoing discussion in Sweden is whether TIMSS measures the same things as national tests in mathematics and school mathematics grades or something different. Sollerman and Petterson (2016) performed an alignment study of the Swedish mathematics curriculum with the national tests in mathematics and TIMSS mathematics. The overall conclusion was that TIMSS is a relevant instrument for measuring mathematics skills among Swedish students with respect to the framework and the mathematics content in the items, although not all parts of the national curriculum were included in the assessment. Although TIMSS was concluded to be relevant in the Swedish context, the relationships with school grades and national tests were not completely clear.
Examining the relationship between international and national measures is important because if the relationship is strong, we can use information from TIMSS to support political decisions about a nation's schooling. If the relationship between national and international assessments is weak and international assessment results are low for a nation, educational policymakers might decide to change the national curriculum or make other educational policy changes. Lipscy and Kijima (2016) made a systematic review of the impact of cross-national assessments, like TIMSS, on education policy outcomes. They concluded that these kinds of assessments increase both the capacity and motivation of policymakers to implement improvements in education. In their study, a US delegate said that the TIMSS and PISA (Programme for International Student Assessment) results helped to justify the reforms of the No Child Left Behind Act. Yariv (2018) concluded that policymakers' discourse regarding mathematics education changed in response to the TIMSS 1999 results and that, to improve Israel's ranking, the middle school mathematics curriculum was aligned with the TIMSS curriculum. The declining results in international assessments in Japan were used politically to abandon creativity under a relaxed curriculum and instead implement a more traditional and academically rigorous curriculum (Takayama, 2008). Educational reforms as a direct response to results of international assessments can be observed in other countries as well (see e.g. Grek, 2009; Lingard & Grek, 2006; Martens, Nagel, & Windzio, 2010). International assessments and national assessments might not always show the same pattern. For example, in Canada international assessment results and provincial assessment results often differ (Cartwright, Lalancette, Mussio, & Xing, 2003). There have also been inconsistent results from TIMSS and PISA for some countries (Wu, 2009). In Poland, national assessment trends have not always been consistent with trends in PISA (Szaleniec, Grudniewska, Kondratek, Kulon, & Pokropek, 2013). There is therefore an interest in examining the relationship between international and national assessments within each country.
It is well known that students who take national and international assessments are not a homogeneous group in terms of their home background. Many researchers have concluded that students from homes with a high socioeconomic background are more likely to perform well in school and to have better academic performance than their peers with a lower socioeconomic background (e.g. Erberber, Stephens, Mamedova, Ferguson, & Kroeger, 2015; Hanushek & Luque, 2003; Jurdak, 2014; Sirin, 2005; Wikström, 2005). This association is universal across nations, school subjects (e.g. science and mathematics), and grades (e.g. from primary to secondary education) (Erberber et al., 2015; OECD, 2011). Akyuz and Berberoglu (2010) examined TIMSS data from nine European countries and found that home educational resources were significantly related to mathematics achievement in eight of the countries. Teodorovic (2011) concluded, after reviewing school effectiveness research, that student-level variables are very important in determining student achievement in rich countries but are less pronounced in poor and developing countries. Kaleli-Yılmaz and Hanci (2016) examined TIMSS 2011 mathematics achievement in Turkey in connection with different student variables, including gender, school grades, parents' educational level, and cognitive domains. They did not find any gender effect, but they did find a relationship between TIMSS achievement and both school grades and parents' educational level. In their study, students whose mothers had a high educational level had on average higher TIMSS achievement than students whose mothers had a lower educational level.
The average score on TIMSS was also higher when the mothers' educational level was used to divide the students compared with if the students' fathers' educational level was used. Caponera and Losito (2016) examined contextual factors and student achievement in all participating countries in TIMSS 2011 and concluded that high socioeconomic status has a significant and positive effect on student achievement and that students from advantaged schools perform better. For Sweden, especially the student's gender and socioeconomic status and the average student socioeconomic status in the school were significant in their study. In Hastedt (2016), it was concluded that an immigrant background can affect student achievements either positively or negatively depending on the background of the immigrants. From the results of the referred studies, we believe it is of interest to examine TIMSS mathematics achievement in connection to students' home background variables as well as the school context in terms of the composition of students within the schools. Summing up, this study aims to examine the relationship between TIMSS mathematics achievement and the two national school achievement measures of grades and national test results in Sweden. A further aim is to examine the association of TIMSS mathematics achievement with different subgroups of students.
To analyse TIMSS mathematics achievement and its association with different student groups, several country-specific studies have used multilevel analyses that take into account both student variables and school context variables. However, these studies yield different conclusions depending on which countries are examined. For example, Ghagar, Othman, and Mohammadpour (2011) examined Malaysian and Singaporean TIMSS mathematics achievement with multilevel analyses and concluded that there were student variables that were significant in both countries and that there were school differences in Malaysia, while the differences between classrooms were more evident in Singapore. Harmouch, Khraibani, and Atrissi (2017) modelled Lebanese and Singaporean TIMSS mathematics achievement and concluded that both student variables and school context were relevant, but that the variables that were significant differed between the countries. Wiberg, Rolfsman, and Laukaityte (2013) used multilevel analyses to model Swedish TIMSS mathematics achievement in 2003. They concluded that students' home background variables as well as the school's average socioeconomic background were significant, while other school-level factors were not. Thus, besides examining the association between student variables and TIMSS achievement, the composition of students in the schools might also be of interest when modelling TIMSS mathematics achievement in Sweden.
Most TIMSS studies have only used material from the TIMSS international database (e.g. Caponera & Losito, 2016; Kaleli-Yılmaz & Hanci, 2016; Wiberg, Rolfsman, & Laukaityte, 2013; see Drent, Meelissen, & van der Kleij, 2013, for a review of TIMSS studies until TIMSS 2011). This study is unique because it combines TIMSS mathematics achievement with other school achievement measures and register data. Thus, the obtained results are important in Sweden because they may affect how TIMSS results are used and which value they are given in comparison to grades and national test results. Although the study is conducted within the national context of Sweden, the results might be of interest in other countries because many countries participate in the international TIMSS assessment, most countries use subject grades, and many countries have national tests. We also believe that this study is important in a wider context because we have not been able to find any similar studies with TIMSS data. The closest studies we have found are studies that link different large-scale assessments, such as the linking between TIMSS and the US National Assessment of Educational Progress (NAEP) scale (e.g. Johnson, 1998; Hambleton, Sireci, & Smith, 2009; Lim & Sireci, 2017). The focus in those studies was on how students from other countries would likely perform on the NAEP by linking the NAEP scale to TIMSS and PISA both for a particular year and over time. For example, in Lim and Sireci (2017) an equipercentile equating between NAEP and TIMSS was carried out, and Hambleton et al. (2009) compared the students' performance on NAEP through the link between NAEP and TIMSS and PISA. A major finding was that the standards for NAEP are set high (especially on the highest level) but not too high in an international context.
Those studies are thus very different from this study because they did not examine the individual link between an individual's different international and national achievement measures but rather were studies on the overall link between two different achievement measures.
Although we have not been able to find any similar TIMSS studies, there are studies that have examined the relationship between large-scale Scholastic Assessment Test (SAT) scores, high school grades, and socioeconomic factors (e.g. Zwick & Green, 2007; Zwick & Himelfarb, 2011), including the role of student ethnicity and first language (Zwick & Sklar, 2005). In these studies, it is evident that the SAT and grades have similar connections to socioeconomic factors if one takes into account the school context. Students' socioeconomic status correlates highly with both SAT scores and grades, and students who have a first language other than English are disadvantaged on the SAT. It is important to note that the student background data in those studies were self-reported. The results are nevertheless interesting because, even if TIMSS and the SAT are not the same kind of measure, they share similar features. They are, for example, freer from the curriculum than tests given in schools or national tests, which are based completely on the curriculum. Recently, some preliminary analyses were conducted by the National Agency for Education (2017) in Sweden with a focus on TIMSS and grade 9 in mathematics and science. This study differs from that report because our focus is on the mathematics results, and thus we also include the student grades and national test results from grade 6 and control for somewhat different student background variables. Based on our aims and the literature review, we thus focus on the following research questions:
(1) How is student achievement in school in terms of grades and national test results associated with TIMSS mathematics achievement in Sweden?
(2) How are student home background variables associated with TIMSS mathematics achievement in Sweden?
(3) How is the school context in terms of average level of student background variables associated with TIMSS mathematics achievement in Sweden and can the school context explain some of the variation in TIMSS mathematics in Sweden?
We hypothesised that there is a high degree of association between TIMSS and the national school achievement measures, but we expected the national achievement measures to be more highly correlated with each other because they follow the same curriculum. From previous research on Swedish TIMSS data (e.g. Caponera & Losito, 2016), we expected student-level variables as well as the school context in terms of the composition of students at the school level to be associated with TIMSS mathematics achievement. We expected, however, that only a small part of the variance could be explained by the school context.
In the next section, the method used is described, including the participants, the instruments used, and the statistical analyses conducted. The third section contains the results, and the paper ends with a discussion section where some concluding remarks and some practical implications are given.

Participants
Students' mathematics achievement results from TIMSS 2015 8th grade in Sweden were used (IEA, 2017). The students in TIMSS 2015 were randomly selected in a two-stage procedure so that they are representative of all the grade 8 students in Sweden. In TIMSS 2015, the students came from 150 different schools, and a total of 4,090 students participated in grade 8 from Sweden. In the sample used, 52% were boys, 20% were nonnative born or had nonnative-born parents, 46% had mothers with higher than high school education, and 37% came from homes with more than 100 books. The response rate in Sweden tends to be generally high, and only about 5% did not respond to the TIMSS items. The nonresponding students were students who were out of school the day TIMSS was given at their school. The students who were part of the TIMSS assessment were instructed that the test is important. However, because the results are not used for grading the student, it is a low-stakes test. The TIMSS 2015 assessment offered a unique possibility to carefully study TIMSS in Sweden because social security numbers were collected from all students, and this made it possible to connect the information in TIMSS with Swedish register data containing information about national test results, school grades, immigrant status, and parental education level.

National tests
The Swedish national tests are given in grades 3, 6, and 9 in different core subjects and comprise different subtests. The starting point for the tests is the curriculum and the syllabus. The aim of the national tests is to support an equal and fair grading process and to provide information about how the knowledge demands are fulfilled on a school level and on a national level (National Agency for Education, 2016). The national tests are not final tests, but are rather one out of many tests that help the teacher to provide a fair grade to the student. The mathematics national test aims to measure the students' different mathematics abilities through the use of four (grade 9) or five (grade 6) subtests. One subtest is oral, is given in groups of 3 or 4 persons, and takes about 20-30 minutes (grade 9) or 60-80 minutes (grade 6) to complete. The three (grade 9) or four (grade 6) written subtests are individual and given on specific national test days. The written subtests contain items that require responses in the form of multiple choice, short answers, or complete solutions. Each written subtest takes about 80-100 minutes (grade 9) or 40-80 minutes (grade 6) to complete. The national tests are designed to give the teachers a tool so they can provide a fair grade to each of the students regardless of whether they are a low-performing or a high-performing student. This means that the national tests are high-stakes tests for the students. In this study, we had access to national test results in mathematics and other subjects from grade 6 and grade 9.

The Swedish grading system
The purpose of subject grades is to measure students' subject knowledge. The current grading system in Sweden is criterion-referenced and is an evaluation of a student's knowledge in comparison to specific knowledge criteria. The grading scale comprises six steps, A-F, where F is fail and A-E are different levels of passing with A as the top grade. For each grade step, there are rubrics that describe what a student needs to know in order to obtain a particular grade. The grades are determined by the teacher at the end of a course or at the end of a school semester. If there is too little information about a student, then no grade is given to that student. The letter grades also correspond to the following numeric grading scale: A = 20, B = 17.5, C = 15, D = 12.5, E = 10, F = 0. The numeric grading scale allows us to get a joint metric of the sum of the 16 subject grades a student receives in grade 9. This overall grade value is referred to as the merit value and ranges from 0 to 320 because the best-performing student getting all As would obtain 16 × 20 = 320 (National Agency for Education, 2017). Because the merit value combines 16 grades set by many different teachers, it has larger variability than a single subject grade. The value could, however, be viewed as a measure of how well the student performs overall in school. In this study, we had access to the students' grades in mathematics from grade 6 and grade 9 as well as their merit values from grade 9.
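The conversion from letter grades to the merit value can be illustrated with a minimal sketch. The function name `merit_value` and the handling of ungraded subjects (skipped, contributing nothing) are assumptions made for illustration; only the grade-to-number mapping and the maximum of 320 come from the description above.

```python
# Numeric equivalents of the Swedish letter grades, as listed in the text.
GRADE_POINTS = {"A": 20.0, "B": 17.5, "C": 15.0, "D": 12.5, "E": 10.0, "F": 0.0}


def merit_value(letter_grades):
    """Sum the numeric equivalents of a student's subject grades.

    Subjects without a grade (None) are skipped; this is an assumption of
    the sketch, not a rule stated in the text.
    """
    return sum(GRADE_POINTS[g] for g in letter_grades if g is not None)


# Hypothetical students with 16 subject grades each.
all_a = ["A"] * 16
mixed = ["A"] * 4 + ["C"] * 8 + ["E"] * 3 + ["F"]

print(merit_value(all_a))  # 320.0, the maximum merit value
print(merit_value(mixed))  # 4*20 + 8*15 + 3*10 + 0 = 230.0
```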
Grades vs. national tests
Grades and national tests are different in several respects. The national test results stem from a single test or a few single tests given on a single occasion that test many parts of the curriculum, but not the whole curriculum. Grades, on the other hand, are an overall judgment of the student that typically builds on several assessments on several occasions and covers the whole curriculum.

TIMSS 2015
TIMSS is an international assessment of mathematics and science given every four years since 1995, and the respondents are students from grades 4 and 8. The idea is to examine trends in student achievement together with some contextual data. Fifty-seven countries participated in TIMSS 2015, and the assessment was constructed on frameworks developed by the participating countries for each curriculum area and for each grade. Most items are constructed to assess students' application and reasoning skills (Mullis et al., 2016). In TIMSS, the students' achievements are summarised with five plausible values. In this study, the five plausible values were used for mathematics achievement from grade 8 for Swedish students.
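As an illustration of how statistics based on five plausible values are commonly combined, the sketch below applies the standard multiple-imputation combining rules (Rubin's rules): the point estimate is the mean of the five per-value estimates, and the total variance adds the between-value variance, inflated by (1 + 1/M), to the average sampling variance. The function name and all numbers are hypothetical, and the sampling variances are assumed to be supplied by the analyst (in TIMSS they are normally obtained by jackknife replication, which is not reproduced here).

```python
import math
from statistics import mean, variance


def combine_plausible_values(estimates, sampling_variances):
    """Combine per-plausible-value estimates into one estimate and its SE.

    estimates: the statistic computed once per plausible value.
    sampling_variances: the sampling variance of each of those estimates.
    """
    m = len(estimates)
    point = mean(estimates)
    within = mean(sampling_variances)      # average sampling variance
    between = variance(estimates)          # variance across values, divisor m - 1
    total = within + (1 + 1 / m) * between
    return point, math.sqrt(total)


# Hypothetical mean scores computed separately for each of five plausible values.
pv_means = [500.0, 502.0, 498.0, 501.0, 499.0]
pv_sampling_var = [9.0, 9.0, 9.0, 9.0, 9.0]

estimate, se = combine_plausible_values(pv_means, pv_sampling_var)
print(estimate, round(se, 3))
```

Running the analysis once per plausible value and combining afterwards, rather than averaging the five values per student first, keeps the measurement uncertainty in the standard error.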

TIMSS 2015 levels
A number of benchmarks define the achievement levels in TIMSS 2015 that the student can be categorised within. These levels are defined from the TIMSS achievement values and include the elementary level (400-474 points), average level (475-549 points), high level (550-624 points), and advanced level (625 or more points) (Mullis et al., 2016). In this paper, we will examine these levels in connection to the students' school grades.
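The benchmark categories can be expressed as a small lookup. This is a sketch only: the label for scores under 400 is an assumed name (the text does not name that category), and non-integer scores are assigned by the lower bound of each interval, which is also an assumption.

```python
def timss_level(score):
    """Map a TIMSS achievement score to the benchmark labels used in the text."""
    if score >= 625:
        return "advanced"
    if score >= 550:
        return "high"
    if score >= 475:
        return "average"
    if score >= 400:
        return "elementary"
    return "below elementary"  # assumed label for scores under 400


for s in (390, 400, 474, 475, 549, 550, 624, 625):
    print(s, timss_level(s))
```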
Comparisons between TIMSS, grades, and national tests
We used data on students' grades in mathematics and their national test results in mathematics from grade 6. At the time the international database from TIMSS 2015 was released, the students had received their final grades in grade 9 and had taken national tests in grade 9 in the core subjects that are assessed in TIMSS. We used national mathematics test results from grade 9 and information about the students' mathematics grade and a summary of 16 different subject grades from grade 9 (merit value). We are aware that the achievement measures used in this study were not measured at the same time that TIMSS 2015 was given to grade 8 students, and we used school achievement measures from grade 6 and grade 9. Using information from different grades tells us whether the observations are consistent over time or not. We are aware that the students in grade 9 might have matured and that they had learned more between grade 8 and grade 9. However, we think the measures are relevant to use because the national tests share similar features with the international TIMSS assessment. Both are given on certain test days with certain subtests, and the results are later used in the public discussion about the students' achievements. The national tests are also used to ensure that the grades from one school are not inflated. Thus, the grades from grade 9 and national tests from grade 9 should be somewhat close to each other. We thus examined the connection between different school grades and national test results and TIMSS mathematics achievement, but we did not use the national school achievement measures from grade 9 to try to explain the TIMSS mathematics achievement.
In summary, we compared the students' TIMSS mathematics achievement results with their mathematics grades from grade 6 and grade 9, with their results from national tests in mathematics from grade 6 and grade 9, and with their merit values from grade 9.

Statistical analysis
We used descriptive statistics such as average scores, standard errors, and correlations between TIMSS 2015 mathematics achievement and national test results and the students' grades from grade 6 and 9 to answer the first research question. All five mathematics plausible values from TIMSS 2015 were used in the analyses as a measure of the students' mathematics achievement in line with the recommendations on how to use plausible values given in Laukaityte and Wiberg (2017). Student weights were used on the student level when conducting the analyses. For the statistical analyses we used IEA IDB analyser 4.0.12 (2018), SPSS 24.0 and HLM.
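To make the correlation step concrete, the following sketch computes a weighted Pearson correlation between a school achievement measure and each plausible value, then averages the five correlations. It is not the procedure of the IEA IDB Analyzer, which also produces standard errors; the function names and the toy data are invented for illustration.

```python
import math


def weighted_corr(x, y, w):
    """Weighted Pearson correlation of x and y with case weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(var_x * var_y)


def corr_over_plausible_values(plausible_values, measure, weights):
    """Average the weighted correlation over all plausible values."""
    rs = [weighted_corr(pv, measure, weights) for pv in plausible_values]
    return sum(rs) / len(rs)


# Invented toy data: five students, numeric grades, student weights,
# and five plausible values per student (one list per plausible value).
grades = [10.0, 12.5, 15.0, 17.5, 20.0]
weights = [1.2, 0.8, 1.0, 1.1, 0.9]
pvs = [
    [430.0, 470.0, 500.0, 545.0, 590.0],
    [445.0, 460.0, 510.0, 530.0, 600.0],
    [420.0, 480.0, 495.0, 550.0, 585.0],
    [440.0, 465.0, 505.0, 540.0, 595.0],
    [435.0, 475.0, 490.0, 535.0, 605.0],
]

r = corr_over_plausible_values(pvs, grades, weights)
print(round(r, 3))
```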
To answer the second research question, we examined the association between TIMSS mathematics achievement in Sweden and the students' home background because previous studies have shown that a student's socioeconomic status might be of importance when examining students' achievements. There is no universal definition of socioeconomic status, but basic dimensions include parents' education level, parents' occupation, and family income. Unfortunately, we did not have access to family income data and the parents' occupation contained too much missing information. We did, however, have access to the parents' educational level. Because the mothers' educational level contained fewer missing observations, and inspired by the results of Kaleli-Yılmaz and Hanci (2016) as described in the introduction, the mothers' educational level was used instead of the fathers' educational level. Note that we also reran the analysis using the fathers' educational level and got similar results. We chose not to use the TIMSS home educational resource index, which is based on the following TIMSS variables: number of books at home, possession of an Internet connection, if the student has their own room, and the parents' highest educational level. One reason was that we thought it was more interesting to examine the influence of selected variables instead of an overall index. Another reason was that most Swedish students have an Internet connection (99%) and their own room (93%). Not using home possessions differs from the study by Caponera and Losito (2016). Although sex is not part of defining socioeconomic background, it was shown to be significant for Sweden in the study by Caponera and Losito (2016) and it is of interest in Sweden. 
For example, the national test results in mathematics from grade 6 and grade 9 do not show any large differences between boys and girls (National Agency for Education, 2016), but a previous study has shown that girls tend to get better grades than boys in relation to their performance on national tests (Nycander, 2006). Although girls tend to get better grades at the end of compulsory school in all subjects except sports/athletics, the difference is smallest in mathematics, physics, and technology. We thus did not expect sex to have a large impact on the relationships examined here. Migration, in terms of whether students or their parents were born in Sweden or not, was examined because it is well known that it can affect mathematics achievement in both a positive and a negative way depending on the background of the immigrants, as seen in Hastedt (2016). In summary, student background variables were chosen from the students' questionnaires based on previous studies (e.g. Caponera & Losito, 2016; Ilie & Lietz, 2010) and the availability of data in TIMSS and in the register we had access to. Linear regressions were used to model TIMSS mathematics achievement with the student home background variables in order to answer the second research question. They were also used for comparison when examining the third research question connected to the school context, which was primarily examined with multilevel analyses. Multilevel analyses (Snijders & Bosker, 2012) are often used to analyse large-scale assessments, and such analyses take into consideration the two-stage sampling procedure of TIMSS. Because such analyses allow one to enter variables on several levels, they give a deeper understanding of the relationship between TIMSS mathematics achievement and the school achievement measures, the student-level variables, and the average school-level student variables.
The students' background variables were recoded as binary indicators for the analyses. The students' sex was coded as 0 for girls and 1 for boys, and the migration variable was coded as 0 (labelled Swe) if the student was native born with native-born parents and coded 1 (labelled Imm) otherwise. The mothers' educational-level variable Med was coded 1 if the mother had at least one year of education after high school (labelled Hmed) and coded 0 otherwise (labelled Lmed). The variable book was coded 1 if the student had more than 100 books at home (labelled Hbook) and coded 0 otherwise (labelled Lbook). The students' letter grades (A-F) were converted into the numeric grading scale (20-0) as described above in the grading system subsection. All of the home background variables examined in the later linear regressions and multilevel analyses were significant on at least the 0.05 level. Missing data were in general low in the students' background variables, ranging from 0.3% (sex) to 7% (national test results from grade 6), and listwise deletion was therefore used to exclude missing data (Tabachnik & Fidell, 2007). The missing data in national tests were due to students who were absent on the day of testing. We are aware that it is theoretically better to impute missing data, but the choice to use listwise deletion was made because reasonably few cases were deleted. Note that we only used grade 6 school achievement measures as independent variables in the linear regressions and multilevel analyses because it would be improper to use grade 9 measures to explain TIMSS mathematics achievement obtained in grade 8.
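The recoding and listwise deletion described above can be sketched as follows. The raw field names (`sex`, `native`, `mother_postsecondary`, `books`, `grade6`) are hypothetical and do not correspond to actual TIMSS or register variable names; only the coded variables Sex, Imm, Med, book, and G6 and their definitions follow the text.

```python
GRADE_POINTS = {"A": 20.0, "B": 17.5, "C": 15.0, "D": 12.5, "E": 10.0, "F": 0.0}
NEEDED = ("sex", "native", "mother_postsecondary", "books", "grade6")


def recode(record):
    """Return the coded record, or None if any needed field is missing."""
    if any(record.get(key) is None for key in NEEDED):
        return None  # listwise deletion: drop the whole case
    return {
        "Sex": 1 if record["sex"] == "boy" else 0,
        # 'native' is True only for native-born students with native-born parents.
        "Imm": 0 if record["native"] else 1,
        "Med": 1 if record["mother_postsecondary"] else 0,
        "book": 1 if record["books"] > 100 else 0,
        "G6": GRADE_POINTS[record["grade6"]],
    }


# Hypothetical raw records; the second one lacks a grade 6 mathematics grade.
raw = [
    {"sex": "boy", "native": False, "mother_postsecondary": True, "books": 150, "grade6": "B"},
    {"sex": "girl", "native": True, "mother_postsecondary": False, "books": 40, "grade6": None},
    {"sex": "girl", "native": True, "mother_postsecondary": True, "books": 100, "grade6": "A"},
]

coded = [c for c in (recode(r) for r in raw) if c is not None]
print(len(coded), coded[0])
```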

Results
The results in Tables 1-3 are related to the first research question. In Table 1, the average scores and standard errors on TIMSS 2015 in mathematics for students receiving different grade levels in grade 6 and grade 9 are displayed. From Table 1 it is evident that, regardless of year, the higher the students' grades, the higher their average TIMSS achievement. Although this was true in both grade 6 and grade 9, the TIMSS results were in general higher for the different grade groups in grade 9. This is not surprising because there are more criteria that need to be met by the end of grade 9 in comparison to grade 6. Consequently, a smaller proportion of students obtained the highest mathematics grade in grade 9 (10%) than in grade 6 (16%), where there are fewer goals to achieve. The pattern was similar for the students scoring highest on the national test in grade 9 (6%) compared with grade 6 (20%).
Table 1. The proportions of students (%) in grades 6 and 9 receiving grades A-F on the national mathematics assessments and their respective mean scores (M) and standard errors (SE) on the TIMSS mathematics assessment.
Next, we examined the proportions of students who received each grade in grade 6 and grade 9 distributed on the TIMSS achievement levels (Table 2). By merging the two last TIMSS categories of 550-624 points and 625 or more points, we could conduct chi-square tests that showed that grades and TIMSS achievement levels cannot be seen as independent either in grade 6 (χ²(15) = 442, p < 0.001) or in grade 9 (χ²(15) = 594, p < 0.001). Noticeable is the consistency of the overall results regardless of whether we use data from grade 6 or grade 9. It is also interesting to note that no student with a grade of A got fewer than 400 points on the TIMSS achievement scale. Likewise, no student with a grade lower than C reached the TIMSS advanced level. The overall correlations, which were used to examine the relationships between grades, national tests, and TIMSS achievement, can be seen in Table 3 in general and in Table 4 for different subgroups of students. The correlations between grades and national tests for both grade 6 (0.89) and grade 9 (0.86) were strong and positive. This indicates that the national test helps the teachers in their grading exactly as it is supposed to do. Recall that the aim of the national tests is to support an equal and fair grading process and to give information about how the knowledge demands are fulfilled on a school level and on a national level. From Table 3 it is evident that the highest correlation with TIMSS was for mathematics grade 9 (0.74), followed by national test grade 9 (0.73).
This is not surprising if we believe that these achievement measurements measure similar things and because these were collected only one year after (grades and national tests in grade 9) or two years before (grades and national tests in grade 6) the TIMSS assessment was conducted. The correlations are quite high and in line with our expectations, although we thought they would be closer to the correlation between grades and national test results. It is interesting to note that the correlation between merit value and TIMSS was quite high, even though the merit value is a combined measure of 16 subject grades in grade 9. The merit value indicates what is usually seen in school studies, namely that students who perform well in one subject area also tend to perform well in other subject areas.
The results in Tables 4 and 5 were used to answer the second research question. When examining the subgroup correlations of the measures in Table 3, no significant differences in the relationships were found between girls and boys, as seen in Table 4. Whether the student was nonnative or had nonnative parents gave a significant difference in correlation only for the merit value. This is not surprising because the merit value contains all subject areas, and thus it is possible that language issues will have a stronger effect on that measure. The relationship was significantly different for most achievement measures if the mother had a high educational level or if the student lived in a home with more than 100 books. The exceptions were the national test score in grade 6 for students from homes with a high number of books, and the merit value. The latter result is probably due to the diversity among school subjects, which the merit value captures.
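The chi-square tests of independence reported with Table 2 can be illustrated with a short sketch. With six grade steps (A-F) and four TIMSS categories after merging the two highest levels (below 400, 400-474, 475-549, 550 or more), the degrees of freedom are (6 - 1) × (4 - 1) = 15, which matches the reported χ²(15). The counts below are invented for illustration and are not the study's data; the function computes only the statistic and degrees of freedom, not the p-value.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Invented counts: rows are grades A-F, columns are the four TIMSS categories
# (below 400, 400-474, 475-549, 550 or more).
toy_table = [
    [1, 5, 40, 120],   # A
    [2, 15, 90, 110],  # B
    [5, 60, 150, 60],  # C
    [15, 90, 100, 20], # D
    [40, 110, 60, 5],  # E
    [50, 30, 8, 1],    # F
]

stat, df = chi_square_independence(toy_table)
print(round(stat, 1), df)
```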
Table 5. Regression and multilevel analyses with TIMSS mathematics achievement as the dependent variable and student background variables and different types of grades as the independent variables. G6 = Mathematics grade 6. NT6 = National test in mathematics in grade 6. *Achieve = Achievement measure used varies and the achievement measure used is given as the name of each column. Sex = 1 if boy, 0 if girl. Imm = 1 if student or student's parents not born in Sweden. Med = 1 if student's mother has higher education than high school education, 0 otherwise. book = 1 if student's home has more than 100 books. Imm_A = aggregated Imm. Book_A = aggregated book. R² = Explained variance. * p-value < 0.05.
G6 = Mathematics grade 6, G9 = Mathematics grade 9, NT6 = National test grade 6, NT9 = National test grade 9. Mvalue = Merit value. Swe = parents or student born in Sweden, Imm = Nonnative students or nonnative parents. Lmed = Student's mother has at most high school education, Hmed = Student's mother has higher education than high school education, Lbook = Student's home has 100 or fewer books, Hbook = Student's home has more than 100 books.
Correlation can only give us a measure of the linear relationship between two variables. To answer the second and third research questions, and to be able to examine the impact of several variables on both the student and school levels, we used linear regressions and multilevel analyses. In the left part of Table 5, linear regressions and multilevel analyses are displayed for the grade 6 subject grade as the achievement measure, and likewise in the right part of Table 5 these are given for the grade 6 national test achievement measure. Note that we examined all possible average student background variables on the school level, but only the significant school-level variables are shown in Table 5. The analyses indicated that being a nonnative student was associated with lower TIMSS mathematics achievement on average, and being a boy with higher TIMSS mathematics achievement on average. A home with many books and a mother with a higher education were associated with higher TIMSS achievement in general. In addition, regardless of which achievement measure was used, being in a school where students have many books at home was associated with higher TIMSS achievement on average. In contrast, the aggregated immigration variable was only significant for the national test data. Students who received a higher grade in the national achievement measures had on average higher TIMSS mathematics achievement. The multilevel analyses were similar to the linear regressions with student home background in terms of proportion of explained variance, which ranged from 0.46 (national tests) to 0.48 (grade 6). The explained variance with only the achievement measure and no other school background measure was a bit lower at 0.43 (national test) and 0.45 (subject grades). The intraclass correlation coefficients (ICCs) in the multilevel analyses were 0.11 (national test) and 0.13 (grade 6). Thus, the school context seemed to have only a small impact on the students' results.
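To make the ICC values concrete, the following minimal sketch assumes the usual two-level definition, in which the ICC is the share of total variance that lies between schools. The text does not give the formula or the estimation software, so the function names, the simple balanced one-way ANOVA estimator, and the numbers are illustrative assumptions rather than the method used in the analyses.

```python
def icc_from_variance_components(between_var, within_var):
    """Share of the total variance that lies between groups (e.g. schools)."""
    total = between_var + within_var
    if total <= 0:
        raise ValueError("total variance must be positive")
    return between_var / total


def icc_one_way_anova(groups):
    """ICC(1) from a balanced one-way ANOVA; groups is a list of equally sized lists."""
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equally sized groups")
    grand_mean = sum(sum(g) for g in groups) / (k * n)
    group_means = [sum(g) / n for g in groups]
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    ms_within = ss_within / (k * (n - 1))
    return (ms_between - ms_within) / (ms_between + (n - 1) * ms_within)


# Hypothetical variance components: 13 units between schools, 87 within schools.
print(icc_from_variance_components(13.0, 87.0))
```

Under this definition, an ICC of 0.13 means that about 13% of the variance in TIMSS scores is attributable to differences between schools and the remainder to differences between students within schools.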
Summarising the findings from the perspective of the posed research questions, there seems to be a fairly strong association between TIMSS mathematics achievement and national school achievement in Sweden, although the association is weaker than that between the two national achievement measures. Further, there appears to be a strong association between TIMSS mathematics achievement and students' home background variables when it comes to immigration and socioeconomic status in terms of books at home and parental educational level, while there is no strong association with gender. Finally, the school context in terms of the average student background seems to explain only a small amount of the variance.

Discussion
The aim of this paper was to examine the relationship between grades, national test scores, and TIMSS mathematics achievement as well as to examine the association between TIMSS mathematics achievement and different subgroups of students. The overall result is that the relationship between TIMSS, national tests, and grades is strong in Sweden regardless of whether we examine the relationships from before TIMSS (grade 6) or after TIMSS was conducted (grade 9). A somewhat weaker relationship was observed if the joint grade from 16 subjects (merit value) was used as the achievement measure. This is not surprising because this measure contains greater variability, given that many different teachers take part in grading the students in the 16 subjects that the merit value is based upon. As expected and in line with, for example, Nycander (2006), the gender differences were small. The educational background of the student's mother and whether the student came from a home with many books did, however, have a clear influence on the students' academic performance. These results are in line with the results from a wide range of previous studies (e.g. Chiu & Xihua, 2008; Erberber et al., 2015; Hanushek & Luque, 2003; Ismail & Awang, 2008; Jurdak, 2014; OECD, 2011; Sirin, 2005; Wikström, 2005).
We used multilevel analyses in order to take into account the hierarchical structure of TIMSS and its two-stage sampling design, where the probability of selecting a sample unit is unequal (Kyriakides & Charalambous, 2005). An advantage of such analyses is the possibility to examine whether the average level of school achievement influences the results of students' achievement and to control for both student and school factors in the analyses. Although the aggregated number of books variable was significant when using both grade 6 subject grades and national test scores in the analyses, this was not the case for the aggregated migration variable, which was only significant when using national test scores as the achievement measure. The results from the multilevel analyses were compared with the results from the linear regressions, and it was concluded that school-level variables only account for a small amount of the variance. As expected, there were only a few variables that were significant on the school level in Sweden in TIMSS 2015, which is in line with previous studies based on TIMSS 2003, TIMSS 2007, and TIMSS 2011 achievement in mathematics and science in Sweden (e.g. . The quite low ICC values are similar to those in other studies that have used international assessments. Because Sweden is a high-income country, it is also unlikely that the school has a large effect according to the Heyneman-Loxley effect, which states that the quality of schools has a greater impact on achievement in low-income countries than it does in high-income countries. It is also true that the effects of the student family context on achievement tend to be stronger in higher-income countries (Ilie & Lietz, 2010), something that could be seen in the multilevel analyses presented here.
This study also resembles studies examining SAT scores (a large-scale assessment) and their correlation with high school grades and socioeconomic factors. The conclusion that TIMSS results were lower for nonnative students is in line with the results of the Zwick and Green (2007) SAT study, which found that students with a first language other than English were disadvantaged on the SAT. A conclusion shared with other studies is that a significant challenge in everyday school life is how to help students who come from homes with a lower socioeconomic background and/or are nonnative succeed in all subjects in school.
The unique contribution of this study was that it allowed TIMSS mathematics achievement to be connected with individual students' grades and results on national tests in Sweden. The use of grade 6 measures and background information on the students can also help in explaining the Swedish students' TIMSS mathematics achievement. The strong positive relationship found between TIMSS mathematics achievement and school grades and national tests in mathematics in both grade 6 and grade 9 implies that Sweden can use TIMSS as a trend indicator for how the students perform over time within the schools and thus use the results more strategically. Although this study was set in a Swedish context, the results are important for other countries that take part in TIMSS. It is likely that the results would be similar for other countries, and thus it would be interesting to repeat the analyses in other countries.