Enhancing written feedback: The use of a cover sheet influences feedback quality

Abstract
Feedback can be effective for student learning if the feedback practice meets several success criteria. It appears, however, that it is not easy to put insights from theory into practice. Using a cover sheet to provide structured feedback may provide a solution. Just how cover sheets influence feedback practice is, however, still largely unknown. The present study offers an in-depth evaluation of the effects of the implementation of cover sheets on feedback practice. The study gathered data from almost 1000 feedback instances, from tutor and student interviews and from a student questionnaire. The analysis shows that the use of the cover sheet led to an increased use of feed up, feed forward and feedback on process level. Tutors and students valued the use of the cover sheet as positive, and the cover sheet helped students resolve issues better than with annotations alone. The study adds to the field of research by providing empirical data on how a cover sheet influences educational practice. It furthermore shows that a cover sheet can be used to enhance student feedback literacy, and it offers background for an easy-to-introduce educational intervention.

ABOUT THE AUTHOR
Jorik Arts is a teacher trainer with a PhD degree in Biology at the Fontys University of Applied Science in Tilburg in the Netherlands. His research focuses on assessment in higher education, feedback, and multimedia testing. The present study is about a quite common intervention in higher education: the introduction of a structured feedback cover sheet. Interestingly, there is remarkably little empirical research on this topic. He is a member of several education and assessment network groups, for example the European Association for Research on Learning and Instruction (EARLI) and the Dutch platform for assessment as learning (platform leren van toetsen). He is presently involved in a research project on multimedia testing together with the Open University in Heerlen, the Netherlands.

PUBLIC INTEREST STATEMENT
In (higher) education, teachers spend a lot of time providing written feedback to learners. The literature suggests that organizing a feedback practice that is beneficial for learners is not easy. A quite common intervention to streamline feedback practice is the use of a structured feedback cover sheet. Such cover sheets contain questions and cues for tutors, and sometimes for students, that guide attention to important aspects of effective feedback. Despite the widespread use of these cover sheets, little is known about their impact on feedback. The present study gives an overview of what is known about structured feedback and gives insight into the effects of a structured feedback cover sheet on feedback quality.

Introduction
In higher education, teachers spend much time on formulating feedback on assignments, often in the form of written comments (Carless, 2006). Feedback can have strong effects on learning, provided that there is an effective feedback practice and that the feedback messages meet certain quality standards (see, for example, Hattie & Timperley, 2007). Based on a meta-analysis, Hattie and Timperley (2007) described a model for the information that feedback messages should contain to be effective for the recipient. Effective feedback provides answers to the questions "where do I go?", "how am I going?", and "where to next?" (Hattie & Timperley, 2007). Feedback information that answers the first question, which is about goals, was referred to as feed up; information that answers the second and third questions was referred to as feedback and feed forward, respectively. So, effective feedback messages include three types of feedback: feed up, feedback and feed forward. Moreover, Hattie and Timperley (2007) indicated that feed up, feedback, and feed forward can be given at four levels: task, process, self-regulation, and self as a person. According to the meta-analysis, feedback at the process and self-regulation levels is most effective for deep processing of tasks. Since feedback messages should help the recipient in learning, it is also important that feedback is not merely an indication or a correction, but that the message contains some explanation of why something is good or bad. This aspect was named depth of feedback (Glover & Brown, 2006). Apart from the formulation of feedback messages, it is important to emphasize that feedback should not be seen as one-way written comments, since one-way comments often have little effect (Carless et al., 2011). Feedback should instead be considered part of a dialogue (Ajjawi & Boud, 2018; Carless, 2006; Higgins et al., 2002).
Because of this, several definitions of feedback refer to an interaction between teachers and students (e.g., Carless et al., 2011). In this paper, the following definition is adopted: "feedback is a process through which learners make sense of information from various sources and use it to enhance their work or learning strategies" (Carless & Boud, 2018).

Issues with feedback
From the literature, it appears that providing effective feedback is complicated (Price et al., 2010; Sadler, 2010) and often problematic in (higher) education (Boud & Molloy, 2013; Glover & Brown, 2006; Henderson et al., 2019; Price et al., 2010; Walker, 2009), and that the effect sizes show considerable variability (Hattie & Timperley, 2007; Kluger & DeNisi, 1996). Possible explanations for the issues with the quality of feedback are tutors' lack of time and their lack of knowledge of feedback processes (Orrell, 2006; Price et al., 2010; Walker, 2009). This last assumption is strengthened by the fact that in higher education many aspects of (formative) assessment are learned on the job, without explicit prior instruction (Perera et al., 2008; Price et al., 2010). Possibly, the issues with respect to formulating effective feedback are worsened by the fact that tutors do not always act congruently with the ideas they communicate (Adcroft, 2011; Lee, 2009; Perera et al., 2008).
Focussing on written feedback, the literature indicates, among others, the following issues:
• Feedback is given in the form of remarks or corrections without explanations (Glover & Brown, 2006; Lee, 2009; Walker, 2009);
• Feedback is focused on a task that (often) does not recur in the curriculum, and it is not directed at future tasks or at the development of the receiver (Carless, 2006; Glover & Brown, 2006; Hyatt, 2005; Lee, 2009; Orsmond & Merry, 2011; Perera et al., 2008).
An additional problem lies in the dialogical nature of feedback. The recipient must be capable of actually using the feedback information, and this requires the recipient to be feedback literate. Feedback literacy refers to the ability to evaluate and use feedback, and to self-regulate cognitive and affective reactions (see, for instance, Carless & Boud, 2018; Yu & Liu, 2021). So, even well-constructed feedback messages can prove useless when the recipient is not equipped to respond. In the framework described by Chong (2021), student feedback literacy encompasses three dimensions (a contextual dimension, an engagement dimension and an individual dimension), which are further subdivided into levels, emphasizing the complex nature of feedback dialogues.

Improving the quality of written feedback
The effectiveness of feedback may be improved by using a form that asks tutors to answer several questions about the student work that is being evaluated (Bloxham & Campbell, 2010; Newton et al., 2012). Such forms are referred to as structured feedback cover sheets. A structured feedback cover sheet can guide tutor attention to what information to give to the student. By doing so, it may encourage tutors to add explanations to remarks and corrections and to give recommendations for future use (feed forward). So, the rationale behind using structured feedback cover sheets is that they enhance the effectiveness of feedback comments by guiding the attention of tutors to the use of feed up, feedback, and feed forward and to the use of explanations of why things are good or bad (adding depth). Structured feedback cover sheets may also be a tool to improve the feedback dialogue, since they may include fields for student responses and questions (see, for instance, the structured feedback cover sheet in Ellegaard et al., 2018). Together, these assumptions may explain the numerous structured feedback cover sheets that circulate in educational practice.
Although structured feedback cover sheets have been widely implemented in education, there are very limited empirical data on how cover sheets influence feedback practice. In the study of Newton et al. (2012), the use of a cover sheet resulted in (i) more feedback and (ii) more elaborate explanations as part of the feedback (more depth). As far as we are aware, no other studies have analysed how tutor feedback messages are influenced by the use of structured feedback cover sheets. The feedback that was analysed in the study by Newton et al. (2012) was provided at the end of a summative assignment and not on intermediate versions of students' work.
The study by Bloxham and Campbell (2010) looked into perception of students and staff on the value of an interactive cover sheet. The interactive cover sheet asked students to identify specific aspects of writing on which they would like feedback. This study showed that (first year) students found it rather hard to do so. This is summarised with the following notion: "There is something here that I don't understand, but I don't understand enough to ask questions about it" (Bloxham & Campbell, 2010). Tutors were positive, however: the cover sheet speeded up the marking process and answering questions helped staff to focus the feedback (Bloxham & Campbell, 2010). The feedback practice analysed in this study involved written feedback coupled to marks as part of summative assignments.
The study of Bitchener and Knoch (2008) looked into the value of a focused approach to written corrective feedback in second language acquisition. Although this study did not make use of a cover sheet, it did show that a focused approach in which tutors focus on a limited set of criteria helped improve student performance (Bitchener & Knoch, 2008). Feedback in this study was provided just before participants did a post-test, so feedback was provided in a formative setting. Ellegaard et al. (2018) focussed on how the dialogical nature of formative feedback was influenced by formulation of feedback on a cover sheet. The study showed that open questions, wondering questions and leading questions led to productive responses by the students and that long and comprehensive feedback had the tendency to lead to a frustrated response or to a lack of response. The study did not encompass an analysis of the feedback messages of the tutors.
An analysis by Dirkx et al. (2019) showed that in-text and rubric-referenced feedback in a formative context differed with respect to focus, level, and function. Although this study did not look at the use of a cover sheet, it makes clear that modality may elicit different approaches from tutors. As such, feedback behaviour may be influenced by the cover sheet.
In this study, we aimed at gaining insight into how a feedback cover sheet influences tutor feedback messages. The research question is "What is the effect of the use of a cover sheet on the quality of the feedback provided by the tutors?" Quality of the feedback is operationalized as providing feedback with (i) more depth (explaining indications/corrections, see Glover & Brown, 2006), (ii) references to goals and criteria (feed up), (iii) references to future development (feed forward), and (iv) a focus on the process and self-regulation levels (see Hattie & Timperley, 2007) in order to increase students' learning.
Apart from an analytical approach, we wanted to determine whether we could reproduce the positive perceptions described by Bloxham and Campbell (2010), since positive perceptions may by themselves influence tutor behaviour. Moreover, we were interested to see whether the feedback was actually used by the students and whether this was influenced by the use of a cover sheet, since this may also reflect the quality of the feedback. In doing so, we wanted to see whether the results were in line with the outcomes of the studies by Bitchener and Knoch (2008) and Ellegaard et al. (2018).
Four sub-questions are formulated: (1) Which quantitative effects does the use of a cover sheet have on depth, feed up, feed forward and on the amount of feedback at process and self-regulation level? (2) How do tutors value the use of the cover sheet in relation to efficiency and effectiveness? (3) How do students value the use of the cover sheet in relation to their learning process? (4) What are the effects of the cover sheet on the use of the feedback by students?
To answer the research questions, a mixed-methods approach was used. Feedback comments by tutors were analysed using the theoretical frameworks of Hattie and Timperley (2007) and Glover and Brown (2006) to determine whether the cover sheet influenced feedback messages in a way comparable to the study of Newton et al. (2012). Tutors and students were interviewed to get insight into their perceptions and to compare outcomes with the study of Bloxham and Campbell (2010). Finally, final versions of student work were compared with draft versions to get an idea of the effects of feedback and to determine whether the cover sheets in some way helped learning, just like the focused approach described by Bitchener and Knoch (2008) or the approach described by Ellegaard et al. (2018).
Our study shows that (i) the cover sheet helped to improve the quality of the feedback supplied by tutors; (ii) tutors and students valued the use of the cover sheet as positive; and (iii) the cover sheet helped students resolve issues better than with annotations alone. This outcome is in line with previous research (Bitchener & Knoch, 2008; Bloxham & Campbell, 2010; Newton et al., 2012), yet provides a more in-depth exploration.

Context
The study took place at the Biology department of a teacher training institute in the Netherlands. The course "action research" in the final year of the bachelor program was selected to study the quality of written formative feedback. As part of this course, students had to write a paper. This paper focused on the students' performance in a classroom at their internship, where students chose a certain point of concern that they wanted to improve (e.g., how do I organize fieldwork for Biology pupils in such a way that pupils accomplish the learning goals?). Students first wrote an interim version of their paper, which described the context, the point of concern coupled to theoretical background (what is known from literature about field work with pupils?) and data with context information from pupils and colleagues (what do pupils think about the present field work? How do colleagues organise field work?), accompanied by a description of the methodology used for data gathering and a conclusion/discussion section. Based on the outcomes of this first study, students chose established interventions in the classroom at the internship. The final version of the paper encompassed the interim version, coupled to a second round of data gathering, accompanied by, again, methodology and a conclusion/discussion section. This second round of data gathering was used to evaluate the interventions that the students implemented. Since the interim and final versions of the paper contained similar sections that asked for similar approaches, feedback on the interim version of the paper aimed at helping students to do well in the second round of data gathering. Students had to hand in an interim version of the paper, which was returned with feedback from a tutor in the form of written (digitally inserted) annotations. The final version of the student paper was graded by two independent tutors who were not involved in providing feedback on the draft version of this particular student.
For the grading of the final version, a scoring rubric was filled out and argumentation was added. Criteria upon which the final papers were evaluated relate to general requirements of research papers, like usage of literature, writing style, clarity of methodology, data analysis, etc.

Participants
Thirty-four students were in class and received feedback. The analysis of feedback was done for 18 students (9 males, 9 females, aged 21-26). Four tutors provided the feedback. Tutor 1 is the first author of this publication and had been involved with the course "action research" for five years when data were gathered. Tutors 2 and 3 participated in the course for the third year and tutor 4 participated for the second year. Students and tutors were informed about the aim of the study and all agreed on participating.

Study setup
To determine the effects of a structured feedback cover sheet (hereafter referred to as cover sheet) on (perceived) feedback quality, a quasi-experimental study was set up. Participants of the course "action research" were randomly divided over a control and an experimental group. Both groups received feedback in the form of digitally inserted comments (hereafter termed annotations) on an interim version of a paper that they had to write. In addition to the annotations on the interim version of the paper, the experimental group received feedback in the form of a filled-out cover sheet. Tutors involved in providing feedback were the same for both the control and the experimental groups.

Data collection
Use of depth, feedback levels and feedback types in the annotations on the student papers
Categorization of in-text and side-line feedback
Halfway through the course, after fourteen weeks, the students sent in their interim version of the paper by email. The papers were randomly divided into two groups. One group was the control group, where tutors provided feedback only in the text of the paper itself. The other group was the experimental group, where tutors provided feedback in the paper itself in combination with feedback on a cover sheet. To determine whether the use of the cover sheet influenced the feedback provided in the paper itself, in-text and side-line annotations were analysed for both the control and the experimental group. All interim versions of the papers with feedback from the tutors were collected. Eighteen interim versions of papers were randomly chosen (9 papers from the control group, consisting of 4 male and 5 female students, aged 21-26, and 9 papers from the experimental group, consisting of 5 males and 4 females, aged 21-26) and used for analysis: six papers from tutor 1, five papers from tutor 2, three papers from tutor 3, and four papers from tutor 4.
Categorization of in-text and side-line annotations in the papers was done by the first two authors of this study in line with previous research by the authors (2016), where a measure of agreement (kappa) was found to be κ = 0.765 (p < 0.001). First, the annotations were categorised based on depth as described by Glover and Brown (2006). Do the comments indicate a problem (depth 1) or correct it (depth 2)? Do these comments also include an explanation (depth 3)? Second, annotations were categorised based on descriptions by Hattie and Timperley (2007). Comments like "add citation in the text" were classified as feedback on task level, comments linked to the strategy used, like "interlink the insights from literature, instead of summing them up", were classified as feedback on process level, comments like "what may be the reason that your data does not match with views from literature?" were classified as feedback on self-regulation, and comments like "you are a bright student" were classified as feedback on self as a person. When identical or similar comments were repeated (like "add citation"), all comments were counted. Lastly, annotations were categorized as feedback, feed up, or feed forward. Comments were counted as feed up only when they were explicitly coupled to goals and/or criteria ("if you make the following adjustments, your paper will improve for this criterion"). Similarly, only comments that explicitly referred to future use were counted as feed forward ("In your future job as a teacher these kind of conclusions will help to give colleagues a clear picture"). All comments that lacked a future dimension or a reference to goals/criteria were counted as feedback. The division of annotations over the different categories was compared between the experimental group and the control group. The comparison was done using percentages to give insight into relative abundance.
To determine whether differences in the division over categories were significant, a t-test analysis was done.
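The agreement measure reported for the categorization can be illustrated with a minimal sketch of Cohen's kappa for two coders. The function name `cohens_kappa` and the example labels are hypothetical and are not data from the study; the source does not describe how the authors computed kappa.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items that received identical labels.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical level codes for eight annotations (illustration only).
coder_1 = ["task", "task", "process", "process", "task", "self-regulation", "task", "process"]
coder_2 = ["task", "task", "process", "task", "task", "self-regulation", "task", "process"]
kappa = cohens_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects the raw share of identical codes for the agreement that two coders would reach by chance given how often each of them uses every category.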

Use of depth, feedback levels and feedback types in the annotations on cover sheets
After the analysis of feedback as in-text and side-line annotations in the papers, the feedback on the cover sheets was analysed. The actual cover sheet that was used is in Dutch; a translated version is added as Figure 1. The cover sheet encompasses five sections for tutors to fill out. The first section is about general impression. The second section is for answering the questions "what was done well? Explain why this is well done", and "what needs improvement, and why?" (feedback).
The third section is about the guiding questions "what are essential improvements in order to reach the goals?", and "which aspects don't yet meet the assessment criteria?" (feed up). The fourth section is for answering the questions "what are points of concern (process) for the student and how can the student work on these points (process and self-regulation)?" (feed forward). The last section asks the tutor to answer the question "Which of the choices made require additional consideration by the student (self-regulation)?" (feed forward). A first analysis was done by counting the annotations made on the cover sheet and determining how they were divided over the answering boxes. A second analysis focused on categorizing annotations based on their formulation. Categorization was done by the first two authors in line with previous research of the authors (2016). Categories for analysis were (i) depth (1-3), (ii) level (task, process, self-regulation, personal) and (iii) goal (feedback, feed up, feed forward). When similar annotations were repeated (like "add citation"), all annotations were counted.
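The three-way coding scheme described above (depth, level, and feedback type) can be sketched as a small data structure with a tally of relative frequencies. The class and function names are hypothetical, and the depth and type codes assigned to the example comments are assumptions for illustration; only the level codes follow the examples given in the text.

```python
from collections import Counter
from dataclasses import dataclass

DEPTHS = {1: "indication", 2: "correction", 3: "explanation"}
LEVELS = ("task", "process", "self-regulation", "self")
TYPES = ("feed up", "feedback", "feed forward")


@dataclass(frozen=True)
class CodedAnnotation:
    text: str
    depth: int   # 1-3, after Glover and Brown (2006)
    level: str   # after Hattie and Timperley (2007)
    kind: str    # feed up, feedback or feed forward

    def __post_init__(self):
        # Reject codes that fall outside the scheme.
        if self.depth not in DEPTHS or self.level not in LEVELS or self.kind not in TYPES:
            raise ValueError(f"invalid code for annotation: {self.text!r}")


def percentages(annotations, attribute):
    """Relative frequency (in %) of each code for one attribute."""
    counts = Counter(getattr(a, attribute) for a in annotations)
    total = sum(counts.values())
    return {code: 100 * n / total for code, n in counts.items()}


# Repeated comments are counted each time, as in the study.
sample = [
    CodedAnnotation("add citation in the text", 2, "task", "feedback"),
    CodedAnnotation("add citation in the text", 2, "task", "feedback"),
    CodedAnnotation("interlink the insights from literature, instead of summing them up",
                    2, "process", "feedback"),
    CodedAnnotation("what may be the reason that your data does not match with views from literature?",
                    1, "self-regulation", "feedback"),
]
level_shares = percentages(sample, "level")
print(level_shares)
```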

Perception of tutors on the use of the cover sheet in relation to efficiency and effectiveness
To explore the perception of the tutors on the use of the cover sheet, tutors were interviewed shortly after they had provided the feedback. The criteria-based interviews included questions about the overall perception of the usefulness of the cover sheet, about efficiency (workload), and about the value of the cover sheet for students' learning. The first and the second author of this paper conducted the tutor interviews. All interviews were recorded and transcribed.

Perception of the students on the quality and usefulness of feedback on the interim version of the paper.
After receiving the feedback, students were informed about the research activities described in this paper and were asked to fill out the questionnaire. This questionnaire was developed (see also Arts et al., 2016) based on quality criteria mentioned in literature (Bruno & Santos, 2010;Gibbs & Simpson, 2004;Nicol & Macfarlane-Dick, 2006) and consisted of 12 open-ended questions, which focus on (i) quality criteria that feedback should meet, (ii) students' use of feedback and (iii) students' feedback needs. Shortly after receiving feedback, the questionnaire was mailed to all participants of the course (n = 34). In total 18 students filled out the questionnaire (response 53%). Eleven of these students received feedback as annotations in the student paper combined with the cover sheet (experimental group), seven students belonged to the group that only received feedback as annotations in their paper (control group).
To further explore student perceptions, 12 randomly chosen students were interviewed (6 students from the control group and 6 students from the experimental group). The criteria-based interviews included questions about the usefulness of the cover sheet (if applicable) and about which feedback messages were helpful. Student interviews took place after the students' (successful) completion of the course. In this way, the chance of socially desirable answers was minimized. The first author of this paper conducted the student interviews. All interviews were recorded and transcribed.
The answers of the students on the questionnaires and in the interviews were qualitatively analysed on their content and compared. Since we were mainly interested in the effects of the use of the cover sheet, we focused on differences in the answers between the control group and the experimental group.

Effects of feedback
To determine whether the feedback provided on the interim version of the paper was actually used by the students, a final exploration was done. First, the interim versions of the student work were compared to the final version (using the Word compare function) to track all changes. This gave insight in how specific annotations were used. To analyse effects of more general remarks, like "Let someone help you check on language. There are several language errors in your paper and some sentences are improperly formulated." a sample was taken. Four student papers out of the control group and four student papers out of the experimental group were randomly chosen. Argumentation on the scoring rubric for the final version of the paper was compared with annotations on the interim version and/or annotations on the cover sheet. This approach gave insight in issues mentioned in earlier feedback that were apparently resolved in the final version of the paper and issues that remained unresolved. Moreover, since feedback messages on interim versions were categorised, the analysis allowed us to examine categories of feedback that were apparently easier to follow up, or harder to follow up.
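The version comparison described above was done with the Word compare function. As a rough analogue only, the same idea can be sketched on plain-text paragraphs with Python's standard `difflib` module; the function name and the example paragraphs are hypothetical and do not come from the study.

```python
import difflib


def changed_passages(interim_paragraphs, final_paragraphs):
    """List the paragraphs that were replaced, deleted or inserted."""
    matcher = difflib.SequenceMatcher(
        a=interim_paragraphs, b=final_paragraphs, autojunk=False
    )
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, interim_paragraphs[i1:i2], final_paragraphs[j1:j2]))
    return changes


# Hypothetical paragraphs of an interim and a final paper.
interim = ["Field work is useful.", "Pupils like it.", "Conclusion."]
final = ["Field work is useful (source added).", "Pupils like it.", "Conclusion.", "Limitations."]
changes = changed_passages(interim, final)
for tag, before, after in changes:
    print(tag, before, after)
```

Each reported change can then be matched by hand against the annotation that may have prompted it, which is the step that gives insight into how specific annotations were used.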

Use of depth, feedback levels and feedback types in the annotations on the student papers
To determine whether the use of a cover sheet influenced feedback behaviour of tutors when adding in-text or side-line annotations in the interim papers, a comparison was made between the control and the experimental group. Eighteen interim versions of a student paper were analysed for the type of feedback (feed up, feedback, feed forward), level of feedback (task, process, self-regulation, self as a person) and depth (depth 1: indication, depth 2: correction, depth 3: indication/correction with explanation). In total, 802 annotations in the papers (mean = 45; sd = 13.2) were counted, divided over papers in the control group (n = 389; mean = 47, sd = 11.0) and papers in the experimental group (n = 413; mean = 46, sd = 9.4).
For the feedback given as in-text or side-line annotations in the papers, no statistical differences between the control and the experimental group were found for depth, for the use of different levels of feedback or for the types of feedback (see Table 1). Annotations that qualify as indications (71%) or corrections (19%) were most common. The remarks made were mostly related to task (48%) and process (30%). No feedback on a personal level was found. With respect to the different types of feedback, nearly all annotations were labeled as feedback (97.6%), with only a marginal amount of feed up (1.5%) or feed forward (0.9%).

Use of depth, feedback levels and feedback types in the annotations on cover sheets
Subsequently, attention was shifted to feedback on the cover sheet. There were 173 annotations (mean = 19; sd = 7) on the cover sheets (n = 9) divided over the five sections of the cover sheet (see Figure 1). For the categorization for the types of feedback (feedback, feed up and feed forward), the texts in the accompanying text boxes were analysed (n = 142 annotations; the box for general impression was left out). This approach led to the notion that feed up and feed forward were used quite often (25% and 32% of the annotations, see Table 2).

Table 1. Distribution of annotations in the papers (n = 802) and on the cover sheet (n = 173) across the depths, feedback levels and feedback types. Mean and standard deviation are given for the number of annotations per student paper
Similar to the annotations in the student papers, the most prominently used depth for providing feedback is indication (81%, see Table 1). The second most often used depth is indication/correction combined with an explanation (17%).
The most prominent category of feedback as in-text or side-line annotations in the student papers is that of feedback on task level (53% in the control group, 42% in the experimental group). In contrast, the largest group of feedback on the cover sheet is feedback classified as being on the process level (78 annotations, 45%). The results in percentages were compared and the differences were tested for significance. A significant difference was established only for feedback at process level (t(25) = −3.230; p = 0.003, 95% CI [−0.52, −0.24]).
As described, the annotations on the cover sheet were divided over five sections, each with a title (global impression, feedback, feed up and feed forward; see Figure 1). To test whether the formulation of annotations on the cover sheet fits the section where they were posted, the annotations were further analysed. This analysis showed that, based purely on formulation, most annotations (90%) would qualify as feedback. Annotations that could be classified as either feed up (7%) or feed forward (3%) were quite sparse (see Table 2).
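The source does not state which variant of the t-test was used. As a sketch under that caveat, an independent-samples t-test with pooled variance on per-document shares of process-level feedback would have n1 + n2 − 2 degrees of freedom; with 18 papers and 9 cover sheets this gives 25, which would be consistent with the reported t(25). This reading is an inference, not something the source confirms. The function name and the shares below are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_1, sample_2):
    """Student's t statistic and degrees of freedom for two independent samples."""
    n1, n2 = len(sample_1), len(sample_2)
    df = n1 + n2 - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(sample_1) + (n2 - 1) * variance(sample_2)) / df
    t = (mean(sample_1) - mean(sample_2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


# Hypothetical shares of process-level feedback per document (not study data).
paper_shares = [0.25, 0.35] * 9        # 18 papers
sheet_shares = [0.40, 0.50, 0.45] * 3  # 9 cover sheets
t_value, df = pooled_t(paper_shares, sheet_shares)
print(f"t({df}) = {t_value:.3f}")
```

A negative t value in this setup means that the mean share in the papers is lower than on the cover sheets, which matches the direction of the difference reported in the text.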

Perception of tutors on the use of the cover sheet in relation to efficiency and effectiveness
In interviews, all tutors indicated that the cover sheet helped them to distance themselves from the paper and to give feedback on the most relevant aspects. For example, tutor 2 replied "The annotations in the student paper are more technical. The cover sheet forces me to take more distance". When asked about time efficiency, the tutors differed in their perception. Tutors 2 and 3 indicated that using the cover sheet costs additional time. Tutor 4 indicated that using the cover sheet does not take more time, since it changed her way of working with annotations: "I use the cover sheet in addition to the annotations in the text, but it doesn't take me more time. Before using the cover sheet, I used more annotations and they were more elaborated".
When asked how effective they considered the feedback on the cover sheet, tutors reported different experiences. Tutor 2 emphasized the value of the cover sheet, but only when it is accompanied by annotations in the text. Tutor 3 received more requests for explanation from the students, and he thought that this was caused by the cover sheets: the cover sheet might trigger some uncertainty in students. Tutor 4 also hinted at students asking for explanation, although she did not relate this to uncertainty: "Students seem to experience the cover sheet as a summary of the feedback. Students actually use the term summary. It makes me spend less time talking the students through my annotations. I spend less time talking to students, since they (the students) are more focused".

Perception of the students on the use of the cover sheet in relation to their learning process
Based on the criteria-based interviews, students felt very positive about the cover sheet; see, for instance, the following replies. "The annotations in the text are very specific, but the coversheet gives a global overview. I used the coversheet throughout the entire process: am I proceeding in the right direction?" [Student 1]

"Very useful addition to the annotations in the text. It explains to a certain level what the intentions were of the annotations. The coversheet described the most important aspects of my paper. The questions at the end of the coversheet made me really think about my action research" [Student 2] "What I really like is that the annotations in the text reappear in a sorted way. In this way it becomes clear which annotations really matter." [Student 3]
Students were able to pinpoint annotations that they considered useful for learning. "A useful comment for me was 'exploration of literature is very general. Is there anything specific on Biology that you could use?'. It made me reflect on the scope of my research." [Student 1]. "Most useful for me was this comment: the aims of my action research. I had to adjust that in the end. That is an example that was very useful to me. I think all annotations were useful" [Student 2]. However, students were also able to pinpoint annotations that they believed did not contribute to learning. Noteworthy is the overlap with one of the answers above: "A non-useful comment for me was 'describe the aims of your action research'. To my opinion, the paper already describes this so I ignored this comment." [Student 1] In replying to the question about the usefulness of the cover sheet, students indicated that they really need the in-text annotations for the revision of their paper; see, for instance, the following answer:

Perception of the students on the quality and usefulness of feedback on the interim version of the paper
The opinion of students on the quality and usefulness of feedback was further examined with a questionnaire. Both groups of students, the control group and the experimental group, gave similar answers to the open-ended questions in the questionnaire on quality criteria like timing of the feedback, amount of feedback and usefulness of feedback. Overall, students were quite positive about these criteria: 15 out of 18 students indicated that they received feedback within 2 weeks. The remaining three students received feedback sometime between 2 and 3 weeks. Students indicated that the amount of feedback was sufficient and that it helped them to continue. There did seem to be some differences between the groups with respect to three questions.
When replying to the question "what do you value as most important from what you have learned so far?", students in the control group gave quite generic answers, e.g., "using APA guidelines". Students in the experimental group replied to the same question with answers like "linking theory and practice. In practice you often do what you think is right. I've learned that literature offers a great deal of support for how to act in practice." When replying to the question "Can you give an example of feedback that was useful for you?", the students in the control group again answered in a similarly generic way, e.g., "being more to-the-point". The students in the experimental group replied to the same question with answers like "I received feedback on linking theory and practice. I already felt that this was not very strong, but I didn't know how to improve this. The feedback gave me insight in how to make this linking of theory and practice more explicit. This gave me a better insight and helped me to proceed." When replying to the question "can you mention the strengths of your paper and the aspects that need improvement?", students in the control group once again answered in a similarly generic way, e.g., "being critical towards new things in education". Students in the experimental group replied to the same question with answers like "I could improve by making the paper more reliable with respect to the methodology, although this is hard because of practical possibilities at an internship."

Effects of feedback
Since the interim paper was an integral part of the final paper, it was possible to determine how students processed the feedback. All annotations on the interim versions that were coupled to a specific passage of text, such as "substantiate this remark with actual student numbers", were used by the students. This became clear when the interim and final versions were compared with the compare documents option in MS Word.
A different approach was needed for annotations that were not coupled to a specific text, but that instead were more general like "interleave insights from literature, instead of summing them up". Feedback on the scoring rubric for the final paper was analysed for points of concern that were already pointed out as problematic in feedback on the interim versions to determine whether students were able to use (and transfer) feedback to make improvements. Table 3 provides some examples of feedback messages that appeared unresolved. Not all feedback on the scoring rubric for the final paper could be linked to annotations made on the interim version (see "new comments" in Table 4). Issues that were mentioned earlier in relation to the interim version and which reappear on the final grading rubric show an overrepresentation of annotations that can be qualified as either feedback on process or as feedback on a self-regulation level.
In the control group, 20 remarks, divided over 4 scoring rubrics, point at issues that had already been identified in the feedback on the interim versions and that appear to have remained unresolved. In the experimental group, only 7 such remarks were found, again divided over 4 scoring rubrics.
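To make the comparison between the two groups easier to read, the counts above can be expressed as a mean number of unresolved remarks per scoring rubric. The short sketch below does only this arithmetic. The per-rubric mean is an illustrative summary that is not reported in the study itself, the variable and function names are hypothetical, and with four rubrics per group the figures should not be read as a statistical test.

```python
# Counts of unresolved remarks as reported in the text above.
unresolved_remarks = {"control": 20, "experimental": 7}

# Each group's remarks were divided over 4 scoring rubrics (as reported).
RUBRICS_PER_GROUP = 4


def mean_per_rubric(remarks: int, rubrics: int) -> float:
    """Return the mean number of unresolved remarks per scoring rubric."""
    return remarks / rubrics


# Illustrative summary only: this mean is not a figure from the study.
means = {
    group: mean_per_rubric(count, RUBRICS_PER_GROUP)
    for group, count in unresolved_remarks.items()
}

for group, mean in means.items():
    print(f"{group}: {mean:.2f} unresolved remarks per rubric")
```

Under these assumptions, the control group averages 5.00 unresolved remarks per rubric and the experimental group 1.75.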

Discussion
Although there is little debate about the positive effects of feedback on learning, there does seem to be a gap between everyday practice in (higher) education and effective feedback practice described in the literature. Providing students with quality feedback appears to be a quite difficult task (Boud & Molloy, 2013; Glover & Brown, 2006; Price et al., 2010; Walker, 2009). The use of cover sheets for structured feedback might be a way to improve feedback quality. Although such cover sheets are widely used in education, surprisingly little is known about their effects on feedback practice. In this quasi-experimental study, the effects of using a quite straightforward cover sheet were studied.
Although this study gives valuable insights into the effects of a cover sheet on feedback practice, there are some limitations to keep in mind. First of all, given the small number of students participating in the course and in the study, it is unclear whether the conclusions will hold true for other contexts. Secondly, in the study setup, tutors provided feedback for both the control and the experimental group. This might have influenced tutor behavior. On the one hand, tutors might have had "cover sheet information" in their minds when providing feedback to the control group. On the other hand, tutors might unconsciously have added more information to the cover sheet since they were aware that they took part in the study. As such, the effect of the intervention of using cover sheets might be either weaker or stronger than measured. However, since data for the control and the experimental group are quite alike (see Table 1), the study setup in which tutors provided feedback to both groups does not seem to be very problematic. The latter notion is strengthened by the fact that the outcomes of the present study are in line with other studies (see below).
In the study by Newton et al. (2012), the use of a cover sheet led to feedback messages with more explanations (depth). This result is replicated in the present study, where the use of a cover sheet resulted in more feedback messages with explanations (depth 3) in comparison to the in-text annotations (17% and 10%, respectively). The present study is, however, more comprehensive than that of Newton et al. (2012) since it also uses the theoretical framework of Hattie and Timperley (2007) to categorize the feedback. Students will benefit most from feedback if the feedback message contains feedback, feed up and feed forward, not only on task level, but also on process and self-regulation levels (Hattie & Timperley, 2007). As shown in Table 1, the use of the cover sheet led to more feed up and feed forward messages and to more feedback on process level. Taken together, the use of the cover sheet led to higher quality feedback messages, according to the theoretical models of both Glover and Brown (2006) and Hattie and Timperley (2007). The study is also in line with the publications by Nordrum et al. (2013) and Dirkx et al. (2019) that showed that modality influences tutor feedback. Usage of a cover sheet may (partly) circumvent common problems in feedback practice in (higher) education, like tutor feedback messages that lack explanations (Arts et al., 2016; Glover & Brown, 2006; Lee, 2009; Walker, 2009) and tutor feedback messages that do not contain feed forward (Arts et al., 2016; Carless, 2006; Glover & Brown, 2006; Hyatt, 2005; Lee, 2009; Orsmond & Merry, 2011; Perera et al., 2008).
It is, however, important to emphasize that the categorization of feedback messages on the cover sheet into feedback, feed up, and feed forward, as shown in Table 1, was based on the pre-structured text boxes in which tutors placed their feedback. When evaluating the texts in the feed up and feed forward boxes, there seemed to be a less explicit relation to the learning objectives (feed up) and future activities (feed forward) than expected (Table 2). Although this might seem to point at misclassification, this does not seem to be the case, since annotations that were placed in the category feed up do implicitly relate to goals and criteria because of the guiding questions in the heading of the text box. For example, the remark "In the method section the data analyses is missing" does not state anything about the objectives; however, by putting this remark in the text box with the title feed up and the guiding question "what are essential improvements to be made?", it will be clear for the student that this relates to assessment criteria and as such to goals and aims of the course. It is possible that the division of the cover sheet into different text boxes, each with an explicit heading, leads to tutors paying less attention to formulating feedback messages whose wording links to goals/criteria (feed up) or to future goals (feed forward). An alternative explanation is that formulating feed up and feed forward by itself is a difficult and complex task. This notion was also described in a study by Walker (2007), which showed that tutors found formulating feed up and feed forward a difficult task even after having undergone specific training.
In line with the study by Bloxham and Campbell (2010), the use of the cover sheet is valued as positive and effective by both students and tutors (see interview results). Tutors and students mentioned that the cover sheet highlighted the overall quality of the paper. Students stated that the information on the cover sheet was helpful in their learning process as it provided insight into how to continue, which was also mentioned as a valued aspect of feedback in the study by Dawson et al. (2019). Possibly, the cover sheet indicates that there is a certain hierarchy in the annotations. Annotations in the paper itself visually all look the same, so it may be quite difficult for a student to identify the most important ones.
Students and tutors both indicated that the cover sheet could not replace the annotations in the papers. This may imply that feedback messages differ depending on modality (in-text or cover sheet) and serve different functions, as was also pointed out in previous studies (Dirkx et al., 2019; Nordrum et al., 2013). The notion that the cover sheet cannot replace in-text annotations seems to put time constraints on the use of the cover sheet, since it may ask additional time of the tutors (so it does not score high on efficiency). Since time constraints are one of the possible explanations mentioned in the literature for poor quality of feedback (Orrell, 2006; Price et al., 2010; Walker, 2009), this is important to bear in mind. It may call for reflection on the position of feedback in the whole curriculum and on how and when to include workload-friendly activities that enhance student feedback literacy (Carless & Boud, 2018; Carless et al., 2011; Yu & Liu, 2021). Possibly, the way feedback is provided and its modality may differ between early and later stages of the curriculum (e.g., going from in-text annotations to feedback on cover sheets). Bitchener and Knoch (2008) showed that students can benefit from structured feedback on a limited set of criteria. In line with that study, students who received feedback on the cover sheet seemed somewhat more successful in resolving issues addressed in the interim versions (see Table 4). Moreover, students in the experimental group seemed to reflect more deeply on the feedback (see results of the questionnaire), which is in line with the study by Ellegaard et al. (2018). When combined with the outcomes of the student interviews, these data imply that the cover sheet enhanced student and teacher feedback literacy. Students in the experimental group seem to have a better understanding of the feedback, to use it in a more effective way and, as such, to benefit more from the feedback.
This might be explained by the fact that feedback cover sheets can influence both the contextual dimension and the engagement dimension of feedback literacy. The contextual dimension of feedback literacy encompasses four levels: a textual, an interpersonal, an instructional, and a sociocultural level (Chong, 2021).
The textual level includes the content of the feedback. Since the cover sheet added content (see Table 1), this textual level was influenced, and the perceptions of the students (see interviews and results from the questionnaire) suggest that it was influenced positively.
The interpersonal level may also have been influenced, since tutors determined what essential improvements were to be made (see Figure 1), which may have strengthened trust by helping students to determine what the main message of the feedback was. As the cover sheet asks teachers to review students' work from a more holistic point of view, it also influenced the instructional level. The engagement dimension may be influenced by the use of cover sheets as well. The engagement dimension of feedback literacy is divided into a cognitive, an affective and a behavioural level (Chong, 2021). Our data imply that cover sheets help students (i) to understand the feedback (influencing the engagement dimension on the cognitive level) and (ii) to continue (influencing the engagement dimension on the behavioural level).
As for the effects of feedback, it was very interesting to look at the actual use of feedback by comparing interim and final documents and by comparing feedback in the form of explanations on the final scoring rubric with feedback that was given halfway through the course. This approach was somewhat difficult, since the tutor who provided feedback on the interim version of a student paper was never involved in the final assessment of that same student, which made it necessary to interpret the formulations of one tutor against those of another. For the examples in Table 4, the comparison was easy to make; in other cases it was much more implicit. Therefore, the data presented in Table 4 and the accompanying discussion below must be approached with care.
In line with a previous publication (Arts et al., 2016), all feedback annotations were used in some way by students. Solving interpretation issues with feedback on task level seemed to be quite easy for students. Solving issues that require processing feedback on process level or self-regulation level appeared more difficult. By itself, this seems logical and understandable. However, close examination of previous literature seems to point to a more fundamental problem here. Studies that examined the effects of feedback on cognitively demanding assignments, like writing a paper, are hard to find and the ones that are available seem to indicate a lack of effectiveness (see for instance, Duijnhouwer et al., 2012). Moreover, studies on cognitively demanding assignments were largely absent in meta-analyses (Kluger & DeNisi, 1996; Shute, 2008). Boldly formulated, no data seem to be available that show positive effects of feedback on cognitively demanding assignments. This is a quite disturbing observation, since it could be argued that these are the assignments that tutors invest most time in. Research on effective feedback for higher-order thinking assignments would thus be of enormous value. Maybe a similar strategy as employed in this study would be worthwhile. Additionally, it may be worthwhile to examine how the students' uptake of feedback on process and self-regulation level improves through curriculum-embedded activities designed to enhance feedback literacy, like peer feedback and discussion of exemplars (Carless & Boud, 2018).
Taken together, the conclusion of this study is that a cover sheet helps to improve the quality of the feedback supplied by tutors. This outcome is in line with previous research (Bloxham & Campbell, 2010; Newton et al., 2012), but provides a more elaborate exploration. With this study, we hope to inspire the field of education to look into a more evidence-based approach in developing feedback cover sheets. It may be possible to create cover sheets that not only increase the quality of feedback, but also improve the dialogical aspect of the feedback practice.

Conclusion
The use of the cover sheet tested in this study led to more feed up and feed forward messages and to more feedback on process level. As such, the use of the cover sheet led to higher quality feedback messages. Moreover, the use of the cover sheet was valued as positive and effective by both students and tutors. Tutors and students stated that the cover sheet helped tutors in formulating a more holistic evaluation of the work. In addition, students said that the information on the cover sheet was helpful in their learning process as it provided insight into how to continue. As such, the structured feedback cover sheet turned out to be a helpful instrument to enhance both teacher and student feedback literacy. The present study adds knowledge to the field by giving a rare insight into the effects that the implementation of a cover sheet has on feedback practice. From a theoretical point of view, structured feedback cover sheets offer an intervention that enhances student feedback literacy and that might be a valuable addition to, e.g., peer feedback; this intervention seems to have been largely overlooked in the literature up until now. Structured feedback cover sheets can relatively easily be included in various educational contexts and the present study provides insights into the effectiveness of this intervention. As such, the present study can contribute to further development of research-based structured feedback cover sheets.