Teachers’ perceived challenges in group work assessment

Abstract
Group work assessment is a challenging and complex practice for teachers. This study focuses on the challenges teachers perceive before and after participating in a group work assessment project that emphasizes individual assessment. By conducting a qualitative thematic analysis of twelve interviews with six teachers at upper secondary schools in Sweden, several challenges could be identified. The most prominent challenge concerning group work assessment is how to discern students’ individual performance within groups. This challenge has consequences for both the validity and the fairness of the assessment. Further, teachers experienced challenges with (un)fairness in group work assessment, in terms of both achieving fairness and having to deal with students’ emotions regarding perceived unfairness. The results also show that teachers perceive inadequate conditions, such as a lack of time and methods, as generating challenges in their practice, which is also related to reliability.


PUBLIC INTEREST STATEMENT
Group work assessment in educational settings has been shown by previous research to be complex and challenging for teachers. To provide scientific knowledge, increase understanding, and supply teachers with methods that can support them in facing these difficulties in their practice, it is important to understand what teachers perceive as challenging. This study focuses on teachers' perceived challenges when practicing group work assessment and found that the most prominent challenge is discerning individual students' performance within groups, that is, determining who knows or did what. This challenge has consequences for both the validity and the fairness of group work assessment. Furthermore, the conditions regarding teachers' practice, such as lack of time and methods, are also considered to be challenging in connection with reliability. Validity, fairness, and reliability are measures of quality in assessment that are all important to consider when facing these challenges.

Introduction
This study focuses on teachers' perceived challenges when conducting group work assessment at upper secondary schools in Sweden. Group work in an educational setting has been shown by a body of research to have a positive impact on acquiring academic knowledge, social skills and the ability to collaborate (e.g., Baines et al., 2007; Gillies, 2016; Gillies & Boyle, 2010, 2011; Hammar Chiriac, 2014; Hattie, 2008). Yamarik (2007) also concludes that students who worked cooperatively in groups performed better in exams than students who studied alone. Further, Van Aalst (2013) and Pellegrino and Hilton (2012) express that important twenty-first century skills, such as collaboration, problem-solving, critical thinking, and self-management, are developed in group work. Therefore, it is important that education at all levels supports children and young people in developing these skills. However, group work is also connected to challenges, and one of these is assessing students' knowledge and abilities. In addition, schools' curricula around the world (Lundahl et al., 2016) require teachers to assess each student's knowledge and abilities individually, and to give individual grades. This need for individual assessment also emphasizes the need to find tools that can assist teachers with this task. Furthermore, group work assessment is intertwined with anxiety, guilt, and uncertainty in the minds of teachers (Ross, Rolheiser & Hogaboam-Gray, 1998, p. 299; Ross & Rolheiser, 2003, p. 120), and is described as a complex and challenging task (Forslund Frykedal & Hammar Chiriac, 2011; Murray & Boyd, 2015). In Ross et al.'s (1998) study, the teachers expressed that the assessment they performed was imprecise and unsystematic. Forsell, Forslund Frykedal and Hammar Chiriac (2020), Khuzwayo (2018), Meijer et al. (2020), and Steel et al. (2014) have all suggested that group work assessment is a research area that is rather unexplored, and it is therefore likely that the challenges described by teachers have not been studied sufficiently. We argue that the challenges of group work assessment that teachers perceive in their practice are an essential key for understanding the complexity of group work assessment. Hence, to gain more knowledge, we focus in this study on further investigating these challenges to achieve a deepened understanding of their intricacy.

Background
In this study, group work is defined as two or more students working together in an educational setting in order to develop knowledge and abilities (Forsyth, 2010). This broad definition accordingly encompasses several concepts in the field, such as cooperative and collaborative learning (e.g., Davidson & Major, 2014; Johnson & Johnson, 1999; Hmelo-Silver et al., 2013). Using group work as a way to support students' learning in educational settings requires teachers to assess and grade students' knowledge and skills (e.g., Brookhart, 2013; Dijkstra et al., 2016; Forsell et al., 2020). Collaboration and joint learning, which are important goals when students work together in groups, are not always compatible with individual assessment and grading (e.g., Steel et al., 2014; Walker, 2001). In addition to the challenges experienced by teachers, individual group work assessment can also create competition in the group instead of the desired collaboration (Forslund Frykedal et al., 2019). The students can also experience unfairness in the assessment and grading (Alm & Colnerud, 2015; Harrison et al., 2013). Fairness involves the students having equal opportunities to demonstrate their achievements and the teacher preventing or avoiding bias in the assessment (McMillan, 2018). By reviewing research into group work assessment, additional nuances regarding its complexity and challenges have been found. In this study, we define challenges as hindrances and problems that teachers come across in their practice of group work assessment.

Individual group work assessment
One challenge of group work assessment is the difficulty in discerning the individual's contribution within the group's process during group work (Dijkstra et al., 2016; Forsell et al., 2020; Van Aalst, 2013), as well as distinguishing each individual's knowledge from the group's knowledge (Dijkstra et al., 2016; Forsell et al., 2020), in other words, who in the group knows what, or did what. Gammie and Matson (2007) and Nordberg (2006) argue that in group assessment, there is a chance that students will get a higher grade at the expense of other students' performance, or that students may get a lower grade than what they are capable of since they worked with others who did not achieve. Van Aalst (2013) also addresses this challenge and points out that when a group product is graded, it is difficult to address what has been learnt since participation and effort within the group's process may be mistaken for learning. This too may be considered unfair by students. Meijer et al. (2020) argue that group work assessment of individuals may cause misaligned behavior among students, such as less sharing with and support of each other in the group. Furthermore, Meijer et al. (2020) argue that individual assessment may hamper collaboration in groups and even create rivalry. However, individual assessment is likely to be more valid than group assessment. Consequently, it may be considered challenging for teachers to balance these misaligned behaviors against the gain in validity.

Group assessment
Concerning group assessment (i.e., where all group members get the same assessment or grade), Dijkstra et al. (2016) point out that it is a challenge for teachers to make sure that the group's achievement reflects each individual achievement in a group assessment. Strijbos (2011, 2016) uses the concepts of convergence and similarity to show the degree to which students who work in a group develop the same knowledge. He argues that the group members probably have similar knowledge rather than the same knowledge. When the correlation between group grades/assessment and individual grades/assessment has been investigated, no correlation could be found (Epstein, 2007; Plastow et al., 2010), which accordingly supports Strijbos's (2011, 2016) arguments concerning convergence and similarity. If students do not have the same knowledge but are still assessed as a group, this may lead to challenges regarding the trustworthiness of the assessment (Dijkstra et al., 2016; Khuzwayo, 2018).
Group assessment may also be associated with social interdependence (Strijbos, 2011, 2016) and may be positive in terms of structuring social interdependence but negative in terms of developing individual accountability (Meijer et al., 2020). This relates to the previous challenge of individual assessment in terms of balancing valid assessment while simultaneously promoting the group processes. Strijbos (2011, 2016) asserts that group work is not only about developing cognitive outcomes (knowledge and abilities); it is also about developing social gains and motivational aspects. Van Aalst (2013) argues that collaboration is a human capability and an important 21st century skill that may be assessed in its own right. Consequently, a further challenge involves knowing what knowledge to assess in connection with group work assessment.

Lack of methods for carrying out group work assessment
There seems to be a lack of methods for helping and guiding teachers in their practice of assessing group work. Meijer et al. (2020) conclude that there is a lack of both practical methods and research literature concerning group work assessment. According to Ross et al. (1998), most of the advice given concerns the assessment of social skills, with little attention being paid to the assessment of cognitive growth of knowledge and abilities. Khuzwayo (2018) gives an example from South Africa, where there are no techniques or methods for group work assessment of students' performance. Thus, teachers have to rely on methods and techniques they develop themselves. There are challenges not only in terms of the lack of methods, but also with some of the available methods. A common method used in group work assessment is peer assessment, where students assess each other's contributions in group work (Conway et al., 1993; Forsell et al., 2020). Some researchers use peer assessment of group members' contributions as a weighting factor in combination with group assessment to derive an individual assessment for each student and thereby maintain fairness (Conway et al., 1993; Cheng & Warren, 2000; Sung-Seok, 2014). Further, Ross et al. (1998) point out that when teachers share control of the assessment process with students, this also allows for a new source of bias. For instance, research has shown that students rate their own contributions higher than those of their peers (Li, 2001). Further bias in peer assessment can occur based on friendship, gender, race, or group role (Forsell et al., 2020). Accordingly, this leaves teachers with the challenge of defending the grades they award based on peer assessment (Ross et al., 1998).

Aim
This study aims to provide a deeper understanding of what teachers perceive as challenging regarding group work assessment, and also to explain why. The study also aims to examine whether teachers' perceptions of challenges change after carrying out a group work assessment project. With a deeper understanding of these challenges, we can build a more holistic view of the complexity of group work assessment.

Methods
The study is based on twelve qualitative semi-structured interviews with six teachers, analyzed using thematic analysis (Braun & Clarke, 2006).

Context and participants
Six female teachers from upper secondary schools in three Swedish cities participated. Their ages ranged from 35 to 50, and they had between 7 and 19 years of experience as practicing teachers. The teachers were selected because they all taught the same subject and courses and showed interest in participating in the study. They were interviewed on two occasions, before and after carrying out a group work assessment project together with their classes. They were teaching Swedish course one, including Swedish language, comprising reading and writing, as well as speaking, listening, and talking. The course content also includes oral presentation, argumentation techniques, rhetoric, written communication, and understanding linguistic variations (Gy 11).

Design and assignment
Six teachers participated in the study, of whom three took part in a two-day educational session. The first day included theory about group work (i.e., designing group assignments and assessment). One essential objective was to clarify and distinguish between the process of learning in groups and the assessment of knowledge and abilities that have been learnt and developed during the process of working together in a group. The second day was more of an applied workshop, where the teachers produced a group work assignment based on the following conditions: (a) group work as a method, (b) assessment based on goals and criteria from Swedish language course one, (c) enabling formative assessment at group level, and (d) enabling individual summative assessment (see Figure 1). Finally, the assignment should be based on teachers' actual practice and on actual goals and criteria from the course they were teaching. This last condition aimed to enable individual summative assessment; according to the curriculum, teachers are required to assess and grade knowledge and abilities individually. This individual assessment requirement is common in curricula around the world (Lundahl et al., 2016). However, since formative assessment in group work is a rather neglected research area (Forsell et al., 2020), we wanted to investigate how formative assessment could take place at group level. Previous research (e.g., Ashman & Gillies, 1997; Black et al., 2003; Healy et al., 2018) has shown that educational sessions have a positive effect on both students' learning and teachers' practice. The educational session was expected to increase teacher knowledge and experience concerning individual group work assessment and to provide an increased understanding of challenges connected to group work assessment.
Three of the teachers designed the assignment, but all six teachers implemented it in their classrooms. One of the assignment's goals for the students was to learn and reflect on academic knowledge regarding linguistic variations. Another goal was to practice and develop the ability to present orally. During one of the lessons, teachers carried out a formative assessment of the groups' current work. In total, the assignment lasted for six lessons.
At the end of the assignment, the students presented the groups' results individually through inter-group oral presentations. The teachers produced summative assessments of each student's knowledge regarding linguistic variation and the ability to give oral presentations according to the criteria for Swedish course one (see Figure 1 and Figure 2).

Central content
• Oral presentation with a focus on adaption according to the listener. Factors that make an oral presentation interesting and convincing. Use of technical aids for oral presentation. Various ways to listen and respond, adapted according to the communication situation.
• Basic language concepts needed for a structured way of speaking about and analyzing language and language variation, and discussing accuracy regarding the use of language.
• Dialects and language variation in spoken and written language related, for instance, to age, gender and social background. Differences between formal and informal use of language and attitudes to types of language variation.

Task
You should perform a small study in groups using a questionnaire or interview regarding attitudes and connections to power and status concerning a linguistic variation your group has been given. Your group should also compile a summary of the results from the study and present it as a group to the teacher and individually in intergroups.
• Each group focuses on one of the linguistic variations below and presents it orally as a group for approximately ten minutes.
• A short written summary of the group's work.
• Intergroup presentations given individually, approximately five minutes for each student.
Abilities that will be assessed are:
• The ability to speak in front of others in a proper manner regarding the situation of communication, and to participate in a constructive way in a prepared conversation and discussion.
• The ability to reflect on different types of language variation.
Group presentations will be assessed formatively. Intergroup presentations will be assessed summatively.

Data collection
The data consist of twelve semi-structured interviews between 20 and 56 minutes in length (mean 35.5 minutes), resulting in a data corpus of 425 minutes in total. Six of the interviews were carried out before the group work assessment project, and the other six interviews were carried out afterwards. The purpose of interviewing the teachers before and after participation was to be able to study differences in the teachers' experiences of group work assessment. Two interview guides (Patton, 2002) were used, including the following four main question areas: "How do you perceive group work?", "What experience do you have of assessing group work?", "How do you assess group work?" and "What methods have you used when assessing group work?". The guide before participation focused on the teachers' previous experience of group work assessment, while the guide after participation focused on the experiences from the assignment.

Data analysis
All interviews were transcribed verbatim, followed by a thematic analysis (TA) in six steps (Braun & Clarke, 2006). The first step, (a) familiarization with the data, began during the transcription and continued by reading the transcripts repeatedly. Through this process, we noticed that the teachers recurrently described challenges perceived in group work assessment. Based on this discovery, the process continued with (b) a selective coding of the data to identify sections in the interviews in which the teachers described challenges. These sections were coded using the software MAXQDA12. Since the idea was also to study differences in the interviews before and after participating in the group work assessment project, we treated the before and after interviews as two different data sets. The codes were elaborated using a comparative method, and we began to (c) identify themes in the empirical data. At this stage of the analysis, the themes were still undeveloped, and we worked on the relations between codes, sub-themes and themes. In Figure 1, the process for the theme Discerning the individual's performance is presented visually. Braun and Clarke (2006, 2013) suggest that when using TA, the analysis process tends to go back and forth, which we are inclined to agree with. Moving further in the analysis, we began (d) reviewing the themes, to be sure that the meaning of each theme's core and dimensions was coherent and accurately reflected the meaning evident in the empirical data. At the end of this step, we had a good idea of the themes and how they were related to each other. This led to the process of (e) defining and naming the three constructed themes: Discerning the individual's performance, (Un)Fairness in group work assessment and Conditions for group work assessment. A fourth theme was added after this thematization, Teachers' perceptions of challenges after participating in the group work assessment project, based on differences that could be identified between the interviews before and after participating in the group work assessment project. After this, we (f) finalized the results of this article.

Ethics
The ethical principles of the American Psychological Association (American Psychological Association, 2017) were applied throughout the study. The study was approved by the regional Research and Ethics Committee at Linköping University, Sweden (Dnr 2013/401-31, Dnr 2014/134-32 and Dnr 2016. All the participants were informed both orally and in writing about the study and agreed to participate by signing a written informed consent form. All names were changed in order to process all data confidentially.

E: You can, in prepared conversations and discussions, orally present your own thoughts and opinions and perform an oral presentation in front of a group. You do this with a degree of certainty.
C: You can, in prepared conversations and discussions, orally present your own thoughts and opinions with nuances and perform an oral presentation in front of a group. You do this with a degree of certainty.
A: You can, in prepared conversations and discussions, orally present your own thoughts and opinions with nuances and perform an oral presentation in front of a group. You do this with certainty.

E: You can produce simple reflections on how language variation is connected to the speaker and communication situation, and give examples of how language and use of language can mark distance and togetherness. Furthermore, you argue synoptically regarding attitudes concerning some forms of linguistic variation.
C: You can produce well-grounded reflections on how language variation is connected to the speaker and communication situation, and give examples of and discuss how language and use of language can mark distance and togetherness. Furthermore, you argue synoptically from some different perspectives regarding attitudes concerning some forms of linguistic variation.
A: You can produce well-grounded and nuanced reflections on how language variation is connected to the speaker and communication situation, and give examples of and discuss with nuance how language and use of language can mark distance and togetherness. Furthermore, you argue in detail from several different perspectives regarding attitudes concerning several forms of linguistic variation.

Forsell et al., Cogent Education (2021)

Results
Under the following headings, the results are presented based on the themes identified in the thematic analysis.

Discerning the individual's performance
The first theme focuses on teachers' perceived challenges in discerning each student's individual performance when working in a group. The aspects they address concern discerning the individual's (a) knowledge and (b) contribution.

Discerning individual knowledge
One challenge that was frequently described by the teachers concerns how to discern and obtain evidence of each individual's knowledge. Teachers find this difficult because the individual's knowledge is intertwined with the group's performance. The challenge perceived by the teachers involves unraveling what is the group's knowledge and what is each individual's knowledge.
I think it is hard to assess the students' own achievements if just one [student] presents the group's work, because you do not know who has done what and whose knowledge is expressed. (Teacher F)
Accordingly, the teachers need to find an assessment situation in which they are able to discern each individual student's knowledge. The teachers also describe situations where they practice group assessment, but this kind of assessment may generate distrust regarding whether one, a few, or all students have the knowledge demonstrated by the group. This distrust regarding evidence of each individual student's knowledge generates uncertainty among the teachers. Another aspect that generates uncertainty is the teachers' expressed concerns regarding situations where students perform beyond expectations, and the teachers describe it as being hard to trust their assessment.
One is a little afraid that this weak student who usually gives a weak performance [. . .] and now, in a group, this person gives a very good performance and in this case one becomes skeptical. Is it really this student's knowledge that is shown now, or how much help has this student got? (Teacher B)
The assessment situation seems to make it difficult for teachers to find solid, trustworthy evidence of each student's knowledge. However, this challenge depends on which knowledge and abilities the teachers assess. They describe a difference between the challenge of assessing oral presentations and the challenge of assessing content in, for instance, a written manuscript.
It is maybe easier to see if it is an oral presentation, because then I can see how they act in some way. Content, I mean the subject, then I have to see what each and every student knows to make it fair. Not that someone just says something, because that does not convince me that they know it. (Teacher D)
As the teacher describes in the excerpt above, assessing content and assessing the ability to present orally differ in how visible the evidence is. Oral presentation can be observed during presentations; the evidence is displayed and can thereby be collected by the teachers. However, the challenge is greater when it comes to obtaining evidence of students' academic knowledge from, for instance, a manuscript for a presentation. In these cases, the teachers do not know whether the group has written the text together or whether only one or a few group members have done most of the work.

Discerning individual contributions
The second challenge regarding discerning the individual's performance concerns their contribution to the group's process, such as writing the group's manuscript for the presentation, finding materials for the group's presentation, or contributing ideas. There are criteria in the curriculum describing what to assess in terms of knowledge, but there are no criteria relating to contribution.
The challenge remains since the curriculum does not give any answer to whether and how to assess contribution; hence, this is still unclear to teachers.
Like many teachers, I also think it is useful [for the students] to learn to collaborate and become a little humbler perhaps, and not only want to have it my way all the time. But is it a part of our assignment [to assess collaboration skills]? You may think that, but how can we connect that to knowledge requirements? [. . .] Anyway, I do not teach any courses where being good at collaborating is a knowledge requirement. It is not written anywhere, so I can't use that when I am grading. (Teacher A)
The teachers describe how they lack insight into the students' group work and their process; the challenge lies in discerning who has contributed to the group's process. Consequently, the teachers think it is hard to discern individual contributions. Reasons given by the teachers include students lacking involvement or working somewhere else and returning to the classroom at the end of the lesson.
They are often allowed to go to the library, and then I cannot be with them all the time. But of course, I go around and make sure that they do what they should and listen. I do that, but not that much, actually. (Teacher D)
The teachers express a desire to be able to follow the process, but when they are unable to do so, they are left without any evidence of what each student has contributed. Without insight or documentation regarding the process, the teachers have no real evidence to base their assessment on, only a feeling.
All in all, a prominent challenge in discerning the individual's performance in group work assessment is the need for trustworthy evidence of each student's academic knowledge, abilities, and contribution. This challenge seems to differ depending on the assessment situation and on which knowledge and abilities are to be assessed. The lack of insight into individual contributions generates distrust among teachers.

(Un)Fairness in group work assessment
Another challenge that is closely linked to the previous challenge of discerning individual knowledge concerns (un)fairness. This is because the teachers define fairness in group work assessment as the possibility to assess each individual student's knowledge.
I have to be able to see what each one can do to make it fair. Not just that someone says something, because then I do not know what they know [. . .]. Fairness involves being able to see what everyone knows as far as possible. (Teacher D)
However, in the interviews with the teachers, two further aspects regarding challenges in terms of (un)fairness in group work assessment can be found: (a) unequal contributions and (b) dealing with students' emotions.

Unequal contributions
A perceived challenge concerning fairness arises when students' contributions to the group's joint work are unequal. This occurs in situations where not all students do their fair share of the group work but still get the same assessment, even though they have not contributed equally.
Thus, it is not fair to those who present the work. Some students can talk extensively and really show their presence, while others are much more withdrawn and do not [contribute so much]. And then it does not feel right that you can give a C for group work for an entire group when not everyone has contributed in the same way. (Teacher F)
The teacher describes it as unfair to give a group grade when not everyone in the group has contributed equally. Accordingly, the challenge of discerning each student's contribution is also connected to fairness.

Dealing with students' emotions
The second aspect of challenges relating to teachers' perceptions of fairness concerns students' emotions and reactions toward group assessment. According to the teachers, the students often perceive group assessment as being unfair, which creates a challenge in terms of teachers having to deal with students' emotions.
They prepare as a group, but they are also assessed as a group. I know this usually generates very heated feelings where the students do not think they have been assessed fairly. (Teacher A)
As a consequence of the students' disapproval of group assessment, they may even complain to the school's principal. This can lead to a challenging situation where the teachers need to be aware of the possibility of unfairness in using group assessment. Additionally, the teachers describe how they have to deal with students' emotions when they perceive group assessment as unfair, for instance, when group members contribute unequally but get the same assessment or grade. This is particularly applicable in situations where students who do not contribute or do not show up for group work hold back more ambitious students and lower the group's performance. One teacher remembers how she perceived this unfairness when she was a student.
[. . .] then the teacher wants those who were orderly and ready, who enjoyed school and wanted to work, then placed them with those who were disorganized, who were not so motivated to study. So, they thought they would help the others. And I was one of the ones who was ambitious and so on. And it was extremely annoying that I and maybe someone else did the whole job while the others just had a free ride or maybe even sabotaged our work or didn't make the effort. And then the whole group would get the same assessment. Of course, it was very unfair. (Teacher A)
The teacher's own experience indicates that she understands the students' aversion to group assessment. Here, students' emotions seem to be aligned with the teacher's perspective of unfairness.
In summary, the challenge of achieving fairness in group work assessment is connected to the challenge of assessing each student individually. Based on this definition, group assessment is therefore considered to be unfair. It also leads to the challenge of dealing with students' feelings of unfairness regarding unequal contributions to the group's work.

Conditions for group work assessment
The teachers' practice of group work assessment is not only dependent on challenges that arise from their own perspectives and experiences. The context of the practice is also important, since it affects the teachers' conditions for managing group work assessment. These conditions can be divided into two aspects: (a) lack of time and (b) lack of methods.

Lack of time
The teachers say that they have limited time to practice assessments, which leaves them with the challenge of finding enough time for implementation. Following the group's process, giving feedback and practicing individual assessment are described as time-consuming. The opportunity to obtain evidence of individual knowledge during oral presentations by asking the students questions is also a challenge, due to the lack of time. Accordingly, a lack of time is an obstacle that affects assessment situations and leaves the teachers with less evidence.

Lack of methods
The teachers' conditions for group work assessment also involve a lack of methods, with the teachers describing how they are 'left in their practice' without suitable methods. This leaves the teachers with the challenge of finding usable methods on their own.
[. . .] there is a need to find emergency solutions and somehow make it possible, so it is clear that you may want to find some tools for making it even more efficient and clear. (Teacher C)
The teachers obviously have no sources describing how to practice effective, clear and fair group work assessment. In those cases where they have discussions with colleagues, the teachers say that there is no consensus since they all seem to practice group work assessment differently. One of the teachers describes how she does not actually know what she is doing when assessing groups.
[. . .] and then when they present and then when you sit there and somehow assess, what should I assess here really, and who has done what. Yes, I can remember that I just sat there and just thought this, I do not really know how to assess now at all, and I thought that I will take this lightly. (Teacher D)
In summary, the conditions for practicing group work assessment create challenges for teachers. Both finding time and a lack of methods for practicing group work assessment exacerbate the teachers' experiences of a challenging practice.

Teachers' perceptions of challenges after participating in the group work assessment project
The fourth theme is based on the interviews with the teachers, focusing on their new experiences and differences that can be identified in the interviews after participating in the group work assessment project. The challenges identified after taking part in the project can be summarized in four aspects: (a) silence, (b) bias, (c) enough time, and (d) assessment in the moment.

Silence
One challenge perceived by the teachers when using intergroup presentations was the difficulty in getting silent students to demonstrate their knowledge in the intergroups. For instance, this involved assessing students who did not talk early on in the presentations, and who then did not have the opportunity to demonstrate their knowledge since there was nothing left to add to the discussions.
The last groups were much quieter and asked no questions and thought that everything the others had said was great and there was nothing more to add. And then it became much more difficult too, where there was no conversation. (Teacher D)
Consequently, silence made the assessment situation even more challenging, which became obvious during the intergroup presentations since no evidence for assessment was generated.

Bias
After participating in the group work assessment project, the teachers talked about how bias in assessment may present challenges. The teachers described certain factors that can affect their assessment and make it biased, such as when they have preconceptions of students as high and low performers, or their positive or negative impressions of a student. One of the teachers expressed how she became more suspicious of students who did not usually perform well, but who then surpassed themselves in group work.
Another aspect described by one of the teachers that may cause bias is the question of how to include effort in the assessment. In this case, students' efforts were interwoven with their achievements.
I am a little scared that, having followed them throughout this process, they were so engaged and thought like, yes, that they had worked so well. So that is always it, if you give a grade, you give an assessment that you think they have worked well, or you give an assessment of what they actually know. That is, I think it is not easy, it is not clear what is what, really. (Teacher B)
The assessment situation seems to affect the teacher. She describes how she has followed the students throughout their process and seen their engagement and effort. Nevertheless, it is hard for her to distinguish which knowledge is represented in the assessment she has carried out. The challenge of discerning contributions has previously been problematized, but in the interviews after the project a new challenge consists of how effort may also cause bias in group work assessment.

Enough time
Just as before participating in the project, the teachers were still concerned about the time aspect, especially in terms of not giving students enough time in the assessment process. The teachers highlight this challenge by citing two different group work assessment situations: (a) the scheduled occasion for formative group assessment (i.e., teachers having time for formative assessment on the group's performance so far), and (b) individual presentations in intergroups (i.e., estimating time for intergroup presentation and interaction). For instance, one teacher highlights that there was not enough time to ask the students questions during the intergroup presentations.
It would have been great if there had been time for them to ask each other questions. What did you think about it, and how did you think, because that can also show knowledge, but there was no time because we only had one lesson and the time was not sufficient. (Teacher A)
Accordingly, not having enough time was perceived as challenging in both group work assessment situations.

Assessment in the moment
The fourth challenge of group work assessment identified in the post-project interviews concerns the difficulty of assessing students in the moment. This was the case during the intergroup presentations, where the teachers were supposed to assess each student individually by using a matrix and making brief notes. Even with experience of similar situations, this was perceived as a challenging assessment situation since the teachers needed to concentrate on what the students said while at the same time carrying out the analysis to obtain evidence of each student's knowledge.
The situation is heated, and it is difficult to concentrate on what they are saying and to assess at the same time. So, it is hard. I have worked as a teacher for a long time, and I still find it difficult. (Teacher B)
As one of the teachers put it, it is important in these situations to feel confident about what the students have performed because otherwise there is a risk of going too easy on the students and giving higher grades than they actually deserve based on their performance. Hence, it is a challenge to handle several tasks at the same time.
In sum, after participating in the group work assessment project, the teachers experienced new challenging situations connected to silent students, bias, time constraints, and the difficulties of managing multiple tasks.

Discussion
The findings from this study illustrate teachers' perceived challenges with group work assessment and reveal the complexity of the area. Some conclusions can be drawn from this study, but with caution given the limited sample it is based on. However, this is not an argument for ignoring or downplaying the results of this study, which problematizes teachers' perceived challenges with group work assessment (Mercer, 2008), since even small studies can contribute important results. These challenges are important to understand since they also constitute obstacles to developing group work assessment in the teachers' practice. A first step to providing teachers with methods for group work assessment is to better understand what they perceive as challenging.

The prominent challenge of discerning the individual performance
The most prominent challenge found in the results concerns discerning each individual's performance, which is a challenge that has been clearly recognized in previous research (e.g., Dijkstra et al., 2016; Forsell et al., 2020; Van Aalst, 2013). Before taking part in the group work assessment project, the teachers described how they were challenged by the fact that what had to be assessed (knowledge, abilities, or contribution) was generally not obvious to them. However, after the group work assessment project, the teachers discovered the importance of having trustworthy, undisclosed evidence of the knowledge and abilities defined in the assessment criteria in order to carry out a valid assessment. It became evident to the teachers that it was clearly more challenging for them to discern academic knowledge compared to discerning oral presentation in the intergroup presentation. Regarding assessing effort, the challenge concerned whether or not it should be included in their assessment. This is in line with what Van Aalst (2013) states: effort may be mistaken for knowledge, which might be the case regarding the teachers' ambiguity when assessing contribution.
The issue of discerning individual performance is also related to questions of quality in group work assessment, where validity is one of the main pillars according to McMillan (2018). Regarding validity, Messick (1989) emphasizes the importance of having empirical evidence of what should be assessed. Since curricula require teachers to assess students' knowledge and abilities individually, this means that teachers also need to collect empirical evidence at the individual level to ensure validity. This may explain why the challenge of discerning each student's individual performance seems so crucial and necessary for the teachers to solve. This is an important finding that illustrates how the challenge of discernment is also significant for validity. If they ignore this challenge, teachers also jeopardize the validity of group work assessment.
However, when striving for validity, teachers must also be aware of misaligned behaviors, such as less collaboration, caused by individual assessment (Meijer et al., 2020). Collaboration skills are highlighted as important 21st-century skills (Pellegrino & Hilton, 2012; Van Aalst, 2013). Previous research also points out that the potential for learning offered by collaboration (e.g., Johnson & Johnson, 1999, 2004) might be overlooked in individual group work assessments (Meijer et al., 2020). On the one hand, teachers need to ensure validity in group work assessment; on the other hand, teachers also need to support collaboration. This balancing act emerges as another potential challenge for teachers.

Fairness in group work assessment needs discernment
A second pillar of quality in assessment is fairness, defined as the absence of bias (McMillan, 2018). However, in the context of group work assessment, the teachers described fairness as "being able to assess each individual performance in the group work", which differs from the general definition in educational settings. How teachers define fairness in group work assessment is also a key to understanding their experienced challenge of achieving fairness in group work assessment. Hence, group assessment is probably perceived as unfair since it does not consider each student's individual achievement. Dealing with students' feelings of unfairness regarding unequal contributions to the group's work also increases the experience of the challenging practice. Further, validating Khuzwayo's (2018) and Dijkstra et al.'s (2016) conclusions, the teachers describe the evidence they collect from group assessment as untrustworthy. The rejection of group assessment may be understood as the teachers finding students' knowledge to be similar rather than converging when working in groups, which is in line with Strijbos' (2011, 2016) conclusion. If the teachers were confident about the convergence in the students' knowledge when working in groups (i.e., all students' knowledge is identical), then group assessment would not be considered problematic. Accordingly, the challenge of obtaining fairness is intertwined with the challenge of discerning individual knowledge and is thereby connected to the previous discussion regarding validity.

The conditions affect group work assessments
In line with Khuzwayo (2018) and Meijer et al. (2020), the teachers described not having the proper conditions in terms of time and methods for group work assessment. Individual practice with little collaboration between the teachers and a lack of support from the organization entails perceived challenges. After participating in the group work assessment project, the same challenges remained according to the teachers, although the lack of time had become even more apparent. Lack of time can be understood as a challenge for reliability (i.e., consistency of assessment), the third pillar of assessment quality according to McMillan (2018). Since a lack of time does not give the teachers the possibility to collect enough evidence, it consequently makes it difficult for the teachers to ensure reliability. In a similar way, reliability is also affected by silence and hectic situations where teachers have to practice assessment in the moment. Furthermore, teachers also risk being biased by their preconceptions of students as high and low performers when students' efforts are interwoven with their achievements. This also puts validity at risk, since, according to the curriculum, effort should not be assessed (Lundahl et al., 2016; Gy 11).

Conclusions
This study enhances the understanding of teachers' perceived challenges in group work assessment when assessing individual knowledge and abilities to obtain group work assessment that is valid, fair, and focused on the given criteria in the curriculum. By highlighting these challenges, the study contributes a deeper understanding of the complexity of group work assessment. In the context of group work assessment, empirical evidence of students' individual performance is a prerequisite for ensuring a valid assessment. The teachers cited discerning this evidence as the most prominent challenge. The teachers suggest that individual assessment is fairer, since evidence collected from group assessments is perceived to be untrustworthy. Accordingly, the challenge of ensuring fairness is intertwined with the challenge of discernment and ensuring a valid assessment. Furthermore, teachers' time constraints, students' silence, and hectic situations have consequences for reliability, and the risk of teachers being biased and confusing effort with achievement jeopardizes validity. After taking part in the group work assessment project, the teachers' challenges became more clearly defined and they experienced a greater awareness of how to perform group work assessments more effectively. Overall, the study has revealed that teachers need better management methods for high-quality group work assessment that takes validity, fairness, and reliability into consideration. These important insights could facilitate both students' learning and teachers' professional development regarding their practice. One suggestion for further research is to investigate how teachers describe their practice of group work assessment and how they face up to these challenges.