Re-balancing of assessment methods derived from semesterisation

ABSTRACT This paper proposes a model to re-balance assessment schedules through the quantification of students’ workload. The initiative derives from the migration from a term-based to a semester-based calendar in a HE institution. The change required scrutiny of the former and new assessment calendars, which highlighted deficiencies in the administration of engineering courses, such as the uneven distribution of assignments during the academic year. Applying the model produced recommendations to achieve a better balance of workloads, following reflection on the nexus between students’ learning experience and the outcome of mechanisms to evaluate the quality of education.


Introduction
The adoption of a semester-based calendar was first seen in British higher education (HE) institutions at the University of Stirling in the 1960s. Stirling developed programmes based on credit-bearing modules that run through semesters, in contrast with the three-term structure adopted by most European universities. To date, no consensus exists on the ideal academic calendar; some argue that semester-based delivery maximises learning rates and student-staff interaction while decreasing the study burden preceding annual examination periods. Others counter with the difficulties around operations and administrative procedures, as in Manson, Arnove, and Sutton (2001), Morris (2002), Bhattacharya (2011), Aslam et al. (2012), and Du Plessis and Pretorius (2013). Further issues range from the disproportion between the amount of teaching students receive and the amount of study they undertake, to negative correlations between assessment workloads and student satisfaction, as reported in Gibbs and Dunbar-Goddet (2007), Kingsland (1996), Greenwald and Gillmore (1997), and Rummel and Rummell (2015). Past research also highlights welfare-related problems, such as stress, anxiety, and depression, derived from students' workload. Although these and other issues may vary across disciplines, they are indicative of the impact that poorly managed courses can have on learners.
Regulatory bodies take no fixed position on calendar structures. Instead, they focus on the achievement of learning outcomes through accumulated credits, leaving scope for educational providers to select and develop their own delivery plans. This is implicit in the HE framework in England that came into force in 2005 (QAA, 2008), following the 2004 report of the Measuring and Recording Student Achievement Scoping Group (Burgess Group). In compliance, the Quality Assurance Agency for Higher Education recognises modules as a basic unit in the HE framework and the vehicle for accumulating credits. In this context, the University of Birmingham (UoB) in England replaced the three-term structure from the 2020-21 session with a semester-based calendar. The basic premise for the change is that the new structure will enable a redistribution of assessments throughout the academic year, mitigating the heavy workloads imposed on students and staff during a single assessment period.
The present investigation explores the impact of the academic calendar on the students' learning experience. It does so through the scrutiny of workload scheduled over a year and the development of a model to quantify students' availability to complete academic tasks. The study builds on past investigations that provide objective approaches to measure students' behaviours, for example Krohn and O'Connor (2005), who propose a mathematical model to determine the degree of correlation between students' effort and performance at school. Other authors have attempted to measure factors that influence performance using subjective methods, for example through critiques of survey results and interviews, as in Svanum and Bihatti (2009) or Natriello and McDill (1986). The common denominator in past research is the recognition of the negative impact that heavy workloads can have on students' performance.
The research methodology adopted here reflects the structure of the paper. Section 2 establishes the influence of assessment workloads on students' level of satisfaction; Section 3 describes the three-term calendar formerly used at the UoB and compares it to the new semester-based calendar. Section 4 develops metrics to quantify the time needed by students to complete tasks, comparing this availability to the effort required. Section 5 identifies measures to mitigate heavy workloads through a semester-based calendar, and Section 6 concludes with some final remarks.

Nexus between assessment workload and students' satisfaction
In England, the Teaching Excellence Framework (TEF) and the National Student Survey (NSS) assess the quality of education and provide resources for undergraduate students to judge academic practice in universities (Bhardwa, 2019; Kovacs et al., 2010). Data collected through the NSS enable the measurement of students' satisfaction and feed into two TEF metrics, namely Assessment and Feedback, and Academic Support (see Figure 1). Looking at the 2018-19 NSS results at the UoB School of Engineering (Civil), we find overall satisfaction ratings of 80% (BEng) and 86.5% (MEng) for Assessment and Feedback, and 83% (BEng) and 71.7% (MEng) for Academic Support. The overall UK results for the same exercise are 73% and 80%, respectively, suggesting that, while the UoB NSS results are close to UK averages, there is room for improvement. The quote below, taken from the NSS (free text) results, illustrates that assertion.
"balance workload more evenly"

Academic over-burden on students is not new. It has motivated research to understand its causes and formulate strategies to reduce it. The list below cites examples of mitigation measures identified by Dunlap (2005), Reyes Ruiz-Gallardo, Castaño, Gómez-Alday, and Valdés (2011) and Greening (1998):
(1) Providing project and schedule details at the start of the course.
(2) Asking learners to post any questions about the course by the end of the first week and updating course materials to include clarification.
(3) Making the course available to learners a few days before it starts so they can become accustomed to the site and materials.
(4) Ensuring learners read the materials and understand the course requirements and expectations at the earliest opportunity.
(5) Providing information in a Frequently Asked Questions format.
(6) Balancing the breadth and depth of contents to control workloads and promote generic skills.
(7) Where applicable, making Problem-Based Learning (PBL) more structured for first-year courses and providing more detailed guidance to students.
More broadly, reflective practitioners in professional and higher education could seek other ways to develop broad, flexible assessment approaches. The 53 suggestions discussed in Burns (2015) could provide a departure point to support both the assessment of learning and assessment for learning. For example, through written tasks, examinations, problem-based activities, live and authentic forms of assessment, assessment over different timescales, interpersonal aspects -such as group work, student involvement, and feedback -and quality assurance.
To maximise results, these measures could be complemented with structural changes to course administration, for example, by creating mechanisms to optimise the time and effort associated with assessments. In that sense, the following section identifies changes to balance students' effort, tailored to UoB engineering undergraduate courses as they run in the new academic teaching year (NATY).

Overview of pre- and post-NATY course structures at UoB
The new academic calendar implied administrative and operational changes while leaving course contents and learning outcomes intact. The transition to NATY required that most modules originally taught over two terms contract into one semester, as shown in Figure 2.

NATY: an overview
The reorganisation of the academic year changed the rate at which students attempt credits in each assessment period, as well as the associated timescales. The key points are the implementation of two examination periods (P1, P2), each preceded by scheduled assessment and revision support (ARS).
NATY does not modify the length of teaching periods, nor does it change rates of contact time. The rationale for the change reflects the desire to spread student and staff workloads across the year. The following sub-sections demonstrate that such a redistribution occurs with the change of the academic calendar, although arguments presented further down reveal that the direct mapping of courses from the old to the new calendar was not ideal, which motivates further strategies to rebalance and refine assessment workloads.

Course structures
The undergraduate Civil Engineering BEng and MEng course structures include 10- and 20-credit modules that are compulsory for Y1 and Y2 students. On the other hand, Y3 and Y4 courses (the latter MEng only) incorporate optional modules, as shown in Tables 1 and 2, which illustrate pre- and post-NATY course structures for the relevant programmes, while Figure 3 illustrates pre- and post-NATY programme structures for Y2.
It can be seen in these tables and figure that most NATY subjects span one semester and conclude with the assessment in either period P1 or P2 (see Figure 2). A few other modules transition into NATY with no change, either because they carry 10 credits and fit in one semester or because they are research projects that require two semesters to develop. Based on this, one infers that NATY implied a redistribution of workload; although, as pointed out above, the direct mapping of courses from the old to the new calendar required fine-tuning. This outcome demands a more in-depth analysis to inform further decision-making for managing undergraduate courses. The following section introduces a model to quantify students' workload, while subsequent sections refine and apply the model to a case study.

A model to quantify students' net availability to complete work
The basic framework for balancing workload takes time as the unit of measure, a criterion that consolidates past investigations (Reyes Ruiz-Gallardo et al., 2011). The demand for effort from students derives from the conversion of credit numbers into hours; in England, one credit equals 10 hours. Table 3 illustrates the result of applying this rule to Y1 subjects for each semester, contrasting pre- and post-NATY experiences. Each column relates to either coursework or exam; therefore, the net effort required (the sum of hours across the year) totals 10 times the credit value carried by each subject.
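The conversion rule can be stated in a couple of lines, as sketched below; the module names and credit values are illustrative, not the actual Y1 syllabus.

```python
HOURS_PER_CREDIT = 10  # England: one credit represents 10 hours of learning

modules = {"Fluids": 20, "Computing": 10, "Mechanics 1": 20}  # hypothetical credits

# Net effort required per module (hours), as tabulated per subject in Table 3:
effort = {name: credits * HOURS_PER_CREDIT for name, credits in modules.items()}
total_effort = sum(effort.values())  # 500 hours for the 50 credits above
```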
Table 3 also illustrates that the total workloads associated with coursework, balanced pre-NATY, became slightly imbalanced past the transition. For example, in 2019-20 the effort was spread evenly across the two semesters; in 2020-21, however, semester 2 demands about 30% more effort (250 hours) than semester 1 (190 hours). This imbalance does not seem a major problem given the time that students have available to attempt credits during term time (see Table 5). Moreover, asymmetries associated with coursework tend to decrease during the two NATY exam periods, which modifies the pre-exam study burden seen in past years.

In this paper, students' availability consists of hours outside contact time but within working hours. Typical contact time includes lectures, tutorials, laboratories, site visits, and seminars. This already indicates that students' availability to complete academic duties fluctuates over time. Hence, the first step in quantifying workloads consists of estimating contact time. Table 4 summarises the distribution of contact time over the teaching blocks, exemplified for Year 1.
We now determine the time that students effectively have for attempting credits, hereafter referred to as student availability (A). Eq. (1) determines the value of A by integrating three terms. The first relates to the total time comprised in the academic calendar; the second and third deduct the contact time during which students are busy attending lectures or doing practical work, while recognising that a fraction of that time helps students to complete assignments or revise for exams. It is thus hypothesised that lectures, solving tutorials, and collecting information through research help students to complete coursework and prepare for exams.

A = DW · NW · Ψ − (1 − φ) · CT − (1 − φ) · λ · LT    (1)

where:
A: student's availability to attempt credits (hours (hr))
DW: days of the week the student works
NW: number of weeks
CT: contact time (hr)
LT: laboratory/practical time (hr)
λ: percentage of time spent on practical work
φ: percentage of CT contributing towards assignments
Ψ: time that the student puts in on working days (hr)

It is worth highlighting that the model implicitly integrates welfare or extenuating circumstances that students could be facing, through the vector of variables {λ, φ, Ψ}. The parametric analysis shown in Section 5 defines reasonable intervals for the cohort and discusses potential variations for the tails or student subsets.
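As a concrete check, the availability computation can be sketched as follows; the grouping of the deduction terms reflects our reading of the three-term description rather than a verified reproduction of the original formula, and the numbers in the example call are illustrative, not the Table 4 figures.

```python
def availability(DW, NW, CT, LT, lam, phi, psi):
    """Student availability A (hours), following Eq. (1).

    Total calendar time (DW * NW * psi) minus the contact and practical
    time that does not contribute towards assessments. The grouping of
    the deduction terms is our reading of the text.
    """
    total = DW * NW * psi              # time comprised in the academic period
    lectures = (1 - phi) * CT          # contact time not aiding assessment
    practicals = (1 - phi) * lam * LT  # attended share of practical time
    return total - lectures - practicals

# Illustrative values only (CT and LT are not taken from Table 4):
A = availability(DW=5, NW=11, CT=48, LT=96, lam=0.125, phi=0.5, psi=8)  # 410.0 hours
```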
The quantification of A enables a comparison of students' availability to complete work against the effort required for it. This defines the R-Index: R = A/E, where E represents effort, expressed in hours. Accordingly, R > 1 implies that students could complete more work during the established period, whereas R < 1 indicates that students are overloaded. In line with this, and using the data given in Tables 3 and 4, we obtain the results in Table 5. Note that the NULL entries in Table 5 (#DIV/0! in the numerical processing) mean that no assessment took place in the corresponding assessment period, i.e. E = 0; hence, the time that students had available became an unused resource. It is worth highlighting that the case study depicted in Table 5 derives from allocating specific values to the variables λ, φ, and Ψ in Eq. (1), which appear in Table 6.
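The R-Index and the NULL convention for empty assessment periods can be summarised in a few lines; the availability and effort figures in the usage example are hypothetical.

```python
def r_index(A, E):
    """R = A / E, comparing availability A with required effort E (hours).

    Returns None when E == 0, mirroring the NULL (#DIV/0!) entries in
    Table 5: no assessment was scheduled, so the availability went unused.
    """
    if E == 0:
        return None
    return A / E

print(r_index(410, 205))  # 2.0 -> capacity for more work in this period
print(r_index(410, 0))    # None -> no assessment scheduled (NULL in Table 5)
```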
From Figure 2, we infer that the standard teaching period comprises 11 weeks and that students have approximately 3 weeks to prepare before examination periods. The remaining parameters in Table 6 are a matter of judgement; for example, during teaching periods, students would spend 5 days per week on academic activities, a figure that increases to six when preparing for exams. Working hypotheses include 8-hour working days, that 50% of contact time contributes towards assessments (e.g. through active learning), and that students spend 12.5% of their time attending each laboratory. The latter figure derives from personal timetables standardised to teaching delivery at UoB (Engineering).
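Using the parameter values quoted above (five- or six-day weeks, 8-hour days, φ = 0.5, λ = 0.125), availability can be contrasted between a teaching semester and an exam-preparation window. The contact and laboratory hours below are placeholders, since Table 4 is not reproduced in the text, and the formula follows our reconstruction of Eq. (1).

```python
PSI, PHI, LAM = 8, 0.5, 0.125  # working day (h), useful share of contact time, lab share

def availability(DW, NW, CT, LT):
    # Eq. (1) with the Table 6 parameters fixed; CT and LT vary per period.
    return DW * NW * PSI - (1 - PHI) * CT - (1 - PHI) * LAM * LT

# CT and LT below are hypothetical, not the Table 4 figures:
teaching = availability(DW=5, NW=11, CT=120, LT=40)  # 377.5 hours
revision = availability(DW=6, NW=3, CT=0, LT=0)      # 144 hours, no contact time
```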
Although the results shown in Table 5 would change if other values were assigned to the variables in Eq. (1), the analysis reveals changes in students' workload pre- and post-NATY implementation. For example, R = 3.63 in NATY Semester 1 suggests that students could take on considerably more workload, while R = 1.06 in Semester 2 indicates a well-balanced schedule. The exam-time workload is mitigated as the R-Index passes from 0.54 for the single exam period in 2019-20 to 0.55 and 1.33 in NATY exam periods P1 and P2, respectively. These figures reiterate that, while NATY certainly redistributes workload, there is room for optimisation. The following section addresses this outcome and develops further ideas for optimising the distribution derived from NATY.

Assessment re-balancing
The assessment workloads and partial conclusions discussed in the previous section depend to some degree on the parameter values integrated in Eq. (1). Therefore, it is necessary to scrutinise the sensitivity of Eq. (1) to the vector of variables {λ, φ, Ψ}.

Parametric analysis
To assess the variability of the model, we define reasonable intervals as follows: 0.0625 < λ < 0.1875, 0.25 < φ < 0.75, and 6 < Ψ < 10. Recall that λ measures the fraction of time that one student spends on laboratories (out of the time allocated during the year for the whole cohort), φ represents the amount of contact time that contributes towards solving coursework or studying for exams, and Ψ defines the number of hours within a working day. These ranges would capture low to high student performance, commitment, or discipline.

Tables 7 and 8 show the results (min-max) obtained when varying φ and Ψ, respectively. When the resulting R-Index values are rounded to two decimal places, varying λ produces no change. Looking at the variability and confidence levels shown in Table 7, one concludes that φ = 0.5 is representative of the interval 0.25 < φ < 0.75. The reason why φ has no major influence on the R-Index relates to the magnitude of the quantified contact time with respect to the estimated time available outside teaching activities; in other words, the first term in Eq. (1) dominates. This also explains why the results in Table 8 depict a significant influence of Ψ on the R-Index, as this parameter scales the first term in Eq. (1).

Despite the variability shown in Table 8 across confidence levels associated with the R-Index, it seems reasonable to standardise the model to Ψ = 8, because the tails of the interval would represent low student numbers. For example, lower values of Ψ would represent those whose personal circumstances hinder their performance or those who simply tend to perform poorly, while larger values of Ψ would capture individuals with levels of commitment that would be difficult to sustain for the mass of the cohort. This suggests that the combined effects of the tails do not overturn the overall results. Based on the above, the values of the R-Index shown in Table 5 prevail.
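The parametric sweep can be sketched as a brute-force evaluation over the three intervals; the contact time, laboratory time, and effort figures below are hypothetical, and the availability formula follows our reconstruction of Eq. (1).

```python
from itertools import product

def availability(DW, NW, CT, LT, lam, phi, psi):
    # Our reconstruction of Eq. (1); see Section 4.
    return DW * NW * psi - (1 - phi) * CT - (1 - phi) * lam * LT

DW, NW, CT, LT, E = 5, 11, 120, 40, 200  # hypothetical case (hours)

# Evaluate R = A/E over the ends and midpoints of the three intervals.
results = [availability(DW, NW, CT, LT, lam, phi, psi) / E
           for lam, phi, psi in product((0.0625, 0.125, 0.1875),
                                        (0.25, 0.5, 0.75),
                                        (6, 8, 10))]
print(min(results), max(results))  # spread of the R-Index across the parameter box
```

As in Table 8, the spread is driven almost entirely by Ψ (here `psi`), which scales the dominant first term.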
The following section builds on those results and discusses ways to further balance assessment workloads derived from the NATY.

Re-balancing assessments
The rationale underpinning NATY includes encapsulating teaching contents and learning outcomes while re-distributing student workloads. In line with that principle, we revisit the weighting of assessment components and types to optimise the R-Index. Accordingly, Table 5 expands into Tables 9-13 to reconfigure assessment workloads per module per year. For the sake of clarity, Tables 9-13 highlight the recommended changes (last 4 columns) with respect to the configuration directly derived from NATY (central columns).
The changes reflected in Tables 9-13 essentially modify the weighting of assessment components. There are a few exceptions, such as replacing the combination CW + Exam in the Y1 computing module with CW only, the reason being that the subject seems more suited to computer-based laboratories, quizzes, or other forms of PBL. Some other radical changes take the form of the partial or total replacement of exams by their CW equivalent, as in the cases of Structures 1 and Geotechnics 1 in Y2, and Structures 2 and Geotechnics 2 in Y3. A single examination is a viable option, although experience shows that a single exam at the end of such challenging courses induces high stress amongst students, particularly at Level H where no re-assessment opportunities exist. This could promote over-training or the uneven distribution of time with respect to other subjects whose examinations take place during the same period. It is perhaps worth mentioning that Tables 11 and 12 include one pre-selected optional module only. Preliminary results showed that considering other options when repeating these tables does not overturn the overall conclusions; hence, we delimit the scope of this report by taking the most popular option for each cohort.
The assessment re-balance shown in Tables 9-13 translates into benefits. This is shown in Table 14, where the average values of the R-Index, calculated across all levels, now tend to unity.
The reader should note that the values of the R-Index per module per academic year are sensitive to the volume of teaching activities across semesters, as well as to the time needed for preparing coursework and examinations. For example, in transforming coursework into an examination or vice versa, one needs to consider different values of the controlling parameters of variable A in Eq. (1), as established in Table 6. Variation in the weighting of assessment components also introduces non-linearity to the model, and therefore the optimisation of assessment rates could not be achieved by differentiating Eq. (1) with respect to its independent variables. In other words, the changes reflected in Table 14 derive from sub-processes taking place within specific time periods, as dictated by the academic calendar.
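Because the optimisation cannot be carried out by differentiation, a simple enumeration over candidate coursework/exam splits can stand in for the re-balancing exercise behind Tables 9-13; the availabilities, module size, and candidate weightings below are hypothetical, and this is a sketch of the idea rather than the procedure actually used.

```python
def imbalance(w, A_cw, A_exam, hours):
    """Distance of the per-period R-Index from unity when a fraction w of a
    module's effort is assessed by coursework and 1 - w by examination."""
    r_cw = A_cw / (w * hours)            # R-Index of the coursework period
    r_exam = A_exam / ((1 - w) * hours)  # R-Index of the exam period
    return abs(r_cw - 1) + abs(r_exam - 1)

# Hypothetical availabilities (hours) and a 20-credit module (200 hours):
A_cw, A_exam, hours = 400.0, 150.0, 200
weights = (0.25, 0.4, 0.5, 0.6, 0.75)  # candidate coursework fractions
best = min(weights, key=lambda w: imbalance(w, A_cw, A_exam, hours))
```

With the larger availability falling in the coursework period, the search favours shifting more of the assessment weight towards coursework, mirroring the direction of the changes in Tables 9-13.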

Conclusion
This investigation examines a major re-structuring of the academic calendar at the University of Birmingham, in England. It evaluates the impact of implementing a new academic teaching year (NATY) from the perspective of assessment workloads, via the following methodology:
• Outline the influence of assessment workloads on students' level of satisfaction.
• Compare old engineering course structures based on terms with new ones based on semesters.
• Develop a model to quantify the availability and demand of students' effort.
• Test and calibrate the model through a case study and apply it to evaluate pre- and post-NATY assessment workloads at the School of Engineering, UoB.
• Reflect on the outcome of the study and recommend assessment re-configurations for improving students' learning experience.
The established link between assessment practices and NSS results (Civil Engineering) stresses the need to distribute assessment workloads more evenly through the academic calendar. This motivated a model to quantify students' availability to attempt credits. The model discretises teaching activities and estimates the time that students can use to complete coursework or revise for exams (availability), and the amount of time required to accumulate credits (effort). Once refined, the model served to determine the efficiency of assessment schedules through a single parameter, the R-Index. This study highlighted that direct mapping from term- to semester-based courses does not result in optimum workloads for students, which required the re-configuration of assessments in terms of credit balance, change of format, or re-scheduling so that they occur more gradually during the academic year. Noting that the student experience does not depend uniquely on scheduled workloads, the recommendations for improving students' level of satisfaction cite a range of assessment strategies to support both the assessment of learning and assessment for learning. Although the investigation did not intend to measure the influence of assessment workloads on students' level of satisfaction, we have been able to capture students' feedback on the new academic calendar. The following quote was taken from an internal report generated through student-staff liaison, in this case associated with Y2: "The general feeling of the New Academic Teaching Year framework amongst students is that it is fantastic as it is designed to lighten the load on students by not having to revise for all exams at the end of the year. Focusing on only 3 modules in one semester is good as we do not have to worry about structures, geotechnical engineering, and engineering maths for the rest of the academic year after January.
Had we gone through with the original teaching framework, it would be more difficult to regurgitate and revise for this content in the summer exam period"

The final reflection points out the need to improve academic practice, in this case via the balancing of students' workloads, bearing in mind that the results could also permeate the outcomes of nationwide academic assessments.