Developing medical educators – a mixed method evaluation of a teaching education program

Background: It is well accepted that medical faculty teaching staff require an understanding of educational theory and pedagogical methods for effective medical teaching. The purpose of this study was to evaluate the effectiveness of a 5-day teaching education program.
Methods: An open prospective interventional study using quantitative and qualitative instruments was performed, covering all four levels of the Kirkpatrick model: evaluation of 1) ‘Reaction’ on a professional and emotional level using standardized questionnaires; 2) ‘Learning’ applying a multiple choice test; 3) ‘Behavior’ by self-, peer-, and expert assessment of teaching sessions with semistructured interviews; and 4) ‘Results’ from student evaluations.
Results: Our data indicate the success of the educational intervention at all observed levels. 1) Reaction: The participants showed a high acceptance of the instructional content. 2) Learning: There was a significant increase in knowledge (P<0.001) as deduced from a pre-post multiple-choice questionnaire, which was retained at 6 months (P<0.001). 3) Behavior: Peer-, self-, and expert-assessment indicated a transfer of learning into teaching performance. Semistructured interviews reflected a higher level of professionalism in medical teaching by the participants. 4) Results: Teaching performance ratings improved in students’ evaluations.
Conclusions: Our results demonstrate the success of a 5-day education program in embedding knowledge and skills to improve performance of medical educators. This multimethodological approach, using both qualitative and quantitative measures, may serve as a model to evaluate effectiveness of comparable interventions in other settings.

Medical Education Online 2014. © 2014 Marco Roos et al. This is an Open Access article distributed under the terms of the Creative Commons CC-BY 4.0 License (http://creativecommons.org/licenses/by/4.0/), allowing third parties to copy and redistribute the material in any medium or format and to remix, transform, and build upon the material for any purpose, even commercially, provided the original work is properly cited and states its license.
The drive for continuous improvement in medical education is propelled by both advancements in educational theory and research evidence, which is subsequently changing the traditional requirements of a medical educator (1–4). Hence, many medical faculties have endorsed development programs to improve the teaching skills of their staff (5–7). A variety of approaches have been surveyed; however, establishing the effectiveness of new faculty development programs and their impact on student education remains a challenge (8–10). In a systematic review, Steinert et al. reported the effects of faculty development interventions on the knowledge, attitudes, and skills of educators, on the quality of education delivered, and on the institutions in which they worked (10). This review identified that repetitive interventions over time, a deliberate adoption of learning theory and educational principles, and the support of reflection and learning among participants were effective. It was recommended that such interventions should be accompanied by process- and outcome-oriented research, using multiple methods (quantitative as well as qualitative) in a performance-based way to assess changes on the basis of a conceptual framework (10). Kirkpatrick's model, developed for measuring training effectiveness, is one such useful framework (11).
It measures effectiveness on four outcome levels: 1) the participants' affective responses to training content and environment, 2) the impact of the training itself (level of learning), 3) the long-term outcome in job-related performance (level of behavior), and ultimately 4) institutional changes (level of results) (4, 5, 12–14).
The Heidelberg Medical Faculty implemented a 5-day education program in 2001 to support the implementation of its new medical curriculum HeiCuMed (Heidelberger Curriculum Medicinale) (15–17). The program prepares members of the medical faculty for their role as educators. Prior to 2001, within the old curriculum, faculty members involved in teaching medical students acted as directive instructors rather than as facilitators supporting the learning process of students, as they do within the new curriculum. The 5-day education program targeted the entire teaching staff of the faculty (including those involved in both core sciences and clinical teaching). To date, over 1,000 faculty members have passed through this education program. It is delivered by qualified faculty members holding a Master's degree in Medical Education or a Federal Certificate in Higher Education (in collaboration with external experts in adult education from the University of Education, Heidelberg).
The content of the education program covers five essential objectives: 1) learning theory and educational principles, the 'sandwich-architecture,' and 'constructive alignment' as a general framework for teaching sessions, 2) simulation and skills labs as teaching methods and environments, 3) problem-based learning (PBL), 4) modern teaching assessment methods, and 5) reflection on the role of a medical educator. Each day of the course was concluded with a session of peer-coaching on an individual teaching session (facilitator–learner feedback as well as learner–learner feedback). The intent was to actively support participants to improve the ways in which they structured their teaching sessions as well as to enhance their teaching skills. These sessions were also designed to incorporate aspects of constructive alignment (18). The program content of each day is summarized in Table 1.
The purpose of the present study was to evaluate the effectiveness of this 5-day education program by applying Kirkpatrick's framework. The questions we sought to answer were whether such an intervention improved the knowledge of the trainees on learning theory and educational principles, led to a better performance in teaching sessions, and improved peer-feedback. Interviews were conducted to evaluate the impact of the 5-day program with respect to changes in teachers' behaviors and attitudes.

Study design
To examine the effectiveness of the 5-day program, we adopted Kirkpatrick's model, applying quantitative and qualitative instruments whose endpoints were defined on each of the four levels of the model. The study design (Fig. 1) was developed in compliance with the Helsinki Declaration for Ethical Principles for Medical Research Involving Human Subjects (www.wma.net). In accordance with the general regulation of the ethics committee at the Medical Faculty of the University of Heidelberg, no ethical approval was needed. The study protocol did not include patient data or medical interventions. Data originated from a quality management intervention of a faculty development program. Furthermore, the quantitative data were collected anonymously, no negative consequences for participants were expected, and participation was voluntary, with oral consent.

Reaction
The level of reaction is defined as the grade of acceptance of training content and environment and, therefore, has a direct impact on learning performance. We measured the level of reaction daily with a standardized questionnaire on both a professional and an emotional level: on the professional level, participants evaluated relevance for on-the-job performance (as a medical educator) for each session during the day on a 5-point Likert scale (1 = strongly agree to 5 = strongly disagree); on the emotional level, participants evaluated the working atmosphere (1 = very good to 5 = poor) during the day.
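As an illustration of how such daily ratings can be summarized, the following minimal sketch computes the mean and standard deviation per questionnaire item. The item labels and all ratings are invented for the example and are not study data; the use of the sample standard deviation is likewise an assumption.

```python
from statistics import mean, stdev

# Hypothetical ratings on the 5-point scale (1 = strongly agree / very good,
# 5 = strongly disagree / poor). Labels and values are illustrative only.
ratings = {
    "S1.1": [1, 2, 1, 2, 3, 1, 2],
    "A1": [1, 1, 2, 1, 2, 1, 1],
}

def summarize(scores):
    """Return (mean, sample SD) of one item's ratings, rounded to two decimals."""
    if any(s not in (1, 2, 3, 4, 5) for s in scores):
        raise ValueError("ratings must be integers from 1 to 5")
    return round(mean(scores), 2), round(stdev(scores), 2)

summary = {item: summarize(scores) for item, scores in ratings.items()}
for item, (m, sd) in summary.items():
    print(f"{item}: {m:.2f} ± {sd:.2f}")
```

On this scale, lower means indicate higher acceptance.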

Learning
A multiple choice questionnaire (MCQ) was developed consisting of 35 questions related to the content of the 5-day program (objectives), set out in different question formats (type A, long-menu, and case-based questions) to measure basic knowledge and application of knowledge as well as analytical and synthesizing skills. Questions were developed by two educationalists and reviewed by an expert panel using the Delphi method (19). Participants answered the MCQ immediately before starting the 5-day program, immediately after finishing it, and 6 months later. Results are reported as sum scores.

Behavior
Qualitative instruments were applied to evaluate behavior. Behavior was defined as transfer of training content into teaching sessions (on-the-job performance) after completing the education intervention. Teaching sessions were observed by a peer (participant in program) and by an expert (facilitator of the 5-day program). In total, three formative assessments were conducted to evaluate the level of behavior: 1) self-assessment (participant in the program), 2) peer-assessment (by another participant in the program), and 3) expert-assessment (by a facilitator of the 5-day program). A standardized evaluation form was used to maximize objectivity for each of the three assessments. The evaluation form provided assessors with different headings such as educator attitude toward teaching, application of educational principles, and coherence with learning objectives, as well as student–educator interaction. Assessors were requested to give feedback for each heading. All formative assessments were collected and processed as qualitative data on the basis of grounded theory. In addition, semistructured interviews were conducted 4–8 weeks after the completion of the 5-day program to gain insights into changes in behavior and attitudes of participants, who were encouraged to reflect on their role as a medical educator. The interviews were guided by four questions: 1) Did the training program give you any incentives to improve your teaching sessions?, 2) Did you experience any changes in your personal view as a medical educator?, 3) Do you observe any obstacles in your work environment preventing you from being an effective medical educator?, and 4) What are your personal goals concerning effective medical education? All interviews were transcribed and processed as qualitative data on the basis of grounded theory.

Study population
The target population consisted of all participants in the 5-day education program in 2006 (n = 65). All participants were asked for consent. Nine participants declined (Fig. 1). Fifty-six participants responded on level of reaction and learning. However, because of the complexity of the qualitative evaluation of effectiveness, we randomly selected a study subgroup of 13 out of the 56 participants. This component evaluated the level of behavior according to Kirkpatrick's model. This subgroup was also selected for the observed teaching session (including self-, peer-, and expert-assessment) and the semistructured interview as described above. Experts (facilitators of the 5-day program) accompanied this subgroup during the post-training period.
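The random selection of the subgroup can be sketched as follows. The text does not describe the randomization procedure actually used, so this is only one plausible way to draw 13 of 56 participants; the participant codes, the function name, and the seed are all hypothetical.

```python
import random

def select_subgroup(participant_codes, k, seed):
    """Draw k distinct participants uniformly at random, reproducibly for a given seed."""
    rng = random.Random(seed)  # local generator, leaves the global random state untouched
    return sorted(rng.sample(participant_codes, k))

# Hypothetical anonymous codes for the 56 consenting participants.
codes = [f"P{i:02d}" for i in range(1, 57)]
subgroup = select_subgroup(codes, k=13, seed=2006)  # the seed value is arbitrary
print(subgroup)
```

Fixing the seed makes the draw repeatable, which helps when a selection has to be documented or audited later.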

Data analysis
Results of the MCQs, presented as means ± SD, were analyzed using the Wilcoxon signed-rank test to describe differences between measurements (level of significance defined as p < 0.05). In addition, the effect sizes r and Cohen's d were calculated. To identify matched pairs, we used an individual reproducible code for each questionnaire. All quantitative data were statistically analyzed with SPSS 18 (SPSS Inc., an IBM Company, Chicago, USA) and SAS 8 (SAS Institute Inc., Heidelberg, Germany).
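The quantitative analysis described above can be illustrated with a minimal pure-Python sketch. It is not the SPSS or SAS procedure used in the study: it assumes the normal approximation of the Wilcoxon signed-rank test without tie or continuity correction, takes r = |z| / sqrt(n) with n as the number of non-zero pairs, and computes Cohen's d as the mean difference divided by the SD of the differences. These are common conventions, but the text does not state which were applied. All scores and codes are invented.

```python
import math
from statistics import mean, stdev

def average_ranks(values):
    """Rank values ascending (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def paired_scores(pre, post):
    """Match sum scores by questionnaire code; codes missing at either time are dropped."""
    codes = sorted(pre.keys() & post.keys())
    return [pre[c] for c in codes], [post[c] for c in codes]

def wilcoxon_signed_rank(before, after):
    """Return (W+, z, two-sided p, r) using the normal approximation."""
    diffs = [b - a for a, b in zip(before, after) if b != a]  # zero differences discarded
    n = len(diffs)
    if n == 0:
        raise ValueError("no non-zero differences")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(rank for rank, d in zip(ranks, diffs) if d > 0)
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return w_plus, z, p, abs(z) / math.sqrt(n)

def cohens_d_paired(before, after):
    """Mean of the paired differences divided by their sample SD."""
    diffs = [b - a for a, b in zip(before, after)]
    return mean(diffs) / stdev(diffs)

# Hypothetical MCQ sum scores (0-35) keyed by anonymous questionnaire code.
pre = {"A1": 10, "B2": 12, "C3": 9, "D4": 14, "E5": 11, "F6": 13, "G7": 8, "H8": 15, "X9": 20}
post = {"A1": 15, "B2": 18, "C3": 13, "D4": 16, "E5": 17, "F6": 20, "G7": 11, "H8": 16}

before, after = paired_scores(pre, post)  # "X9" has no post-test and is dropped
w_plus, z, p, r = wilcoxon_signed_rank(before, after)
d = cohens_d_paired(before, after)
print(f"W+ = {w_plus}, z = {z:.2f}, p = {p:.4f}, r = {r:.2f}, d = {d:.2f}")
```

With samples as small as in this toy example an exact test would normally be preferred; the approximation is used here only to keep the sketch short.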
All qualitative data (peer-, self-, and expert-assessment, written semistructured interviews) were analyzed using grounded theory (20) and qualitative content analysis (21). All qualitative data were transcribed verbatim and afterwards coded word by word by two separate investigators. Categories and subcategories were generated and conceptually organized. An expert panel (including the two coding investigators) compared the generated categories, came to consensus on final domains, and tabulated the number of responses in each category.

Results
Participants (nearly 60% were male) represented a complete cross-section of the clinical and basic science disciplines of the medical school (see Table 2). Results presented according to the Kirkpatrick model are as follows: 1. Reaction: Table 3 ... feedback of peer coaching led to seven categories, divided into different groups (peer-, self-, and expert-assessment). Table 4 shows the distribution of different categories mentioned by each group.
In the semistructured interviews, 157 text units were identified. One hundred and forty-nine text units were related to the following four categories: 1) better knowledge on learning theory and educational principles, 2) reflection on the role as a medical educator, 3) barriers to being a good medical educator, and 4) goals of 'good' medical education (Table 5). Eight single text units could not be related to a definite category and were excluded from the analysis.
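The counts above, and the n (%) format used in Table 5, follow from simple arithmetic, sketched here. The helper name `share` is illustrative; the figures are those given in the text and in Table 5.

```python
def share(count, total):
    """Percentage of `total` represented by `count`, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, 1)

identified = 157  # text units identified in the interviews
excluded = 8      # units that could not be related to a definite category
categorized = identified - excluded

print(categorized)             # 149
print(share(25, categorized))  # 16.8, matching the entry "25 (16.8)" in Table 5
```

Percentages are therefore relative to the 149 categorized units, not to all 157 identified units.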

Conclusions
The primary goal of this study was to identify several factors that influence the developmental process of participants in the 5-day education program to become better medical educators. We found a high acceptance of the theoretical content, which was reflected in its approval on a professional and emotional level. This was an important prerequisite for achieving new knowledge and its transfer to on-the-job performance (in clinical settings), which was confirmed by the results of the MCQ and the qualitative measurements. Finally, we found a high concordance of self-, peer-, and expert-assessment in the subsequent clinical settings. According to the results of our study, the 5-day education program fulfills the criteria of Kirkpatrick's framework for the effectiveness of an education intervention on all explored levels.
As a precondition for acquiring knowledge and changing behavior, a high satisfaction with the education structure and content is needed (11). Our results confirmed satisfaction as shown in daily evaluations during the training, using items on professional aspects of the training content and on emotional reactions of our participants (Kirkpatrick's level of reaction). At the same time, education-specific knowledge increased significantly (with high levels of effect size) during the training period and was maintained at the 6-month follow-up, as shown by the MCQ at three time points (p < 0.001) (Kirkpatrick's level of learning). These findings are in line with previously published reviews and confirm the importance of an integrative theory of education motivation (2,4,12,22). They also support the view that reactions to education interventions have a fundamental impact on the engagement with theoretical content (knowledge and skill acquisition) and the transfer of new knowledge to job performance (22).
A supportive organizational environment is indispensable for the sustainability of educational interventions, to improve the transfer of new knowledge to job performance and to stimulate changes in behavior as a medical educator (5, 23–25). This study identified three main supportive elements in the environment of our 5-day training program. First was establishing peer-coaching structures by implementing small group work of 4–5 peer tandems (learner with learner) with the aim of transferring the theoretical content into participants' teaching portfolio (on-the-job performance) and enhancing development through collaborative working behaviors. Furthermore, a non-judgmental peer-coaching environment with constructive, formative feedback promotes a supportive learning environment and keeps participants motivated (26,27). As a result, participants used their new knowledge for peer observation in the post-intervention period by evaluating and discussing job-related performance in peer-assessment, including suggestions for improvement (formative feedback). Supporting a feedback culture among the participants was one of the major training aims to encourage collegiality and collaboration within and across disciplines. Supporting collaborative formative feedback seemed to be the most valuable factor in learning processes, promoting effectiveness of the training intervention (28). Second, the fact that we found high agreement in observations between peer- and expert-assessment implies that our participants acquired the ability to apply theoretical content. They were supported in the transfer process into their teaching practice with peer- and own-performance reflections, as an important stage in their professional development (2,14). The participants gave feedback to each other on content and quality of teaching performance. This individual feedback was comparable with expert feedback in quantity and quality, although the evaluation form only provides headings to guide and standardize feedback. Third, the delivery of the intervention by professional facilitators with medical and pedagogical backgrounds was an important prerequisite for the effectiveness of this education intervention. This seems to be an essential structural element to avoid barriers between educators and clinicians (5,10,14).
Quantitative assessment of reaction of participants to theoretical content and environment. S1.1–S5.2: measurements of professional acceptance, relevance for daily life of different sessions. A1–A5: measurements of acceptance on an emotional level (daily atmosphere). Means with SD (1 = strongly agree/very good to 5 = strongly disagree/poor).
Table 4. Qualitative peer-/expert-/self-assessment
Columns: Categories referring to evaluation of performance; 'Samples'; Peer-assessment, n (%); Expert-assessment, n (%); Self-assessment, n (%)
Using a sandwich structure in the teaching session; 'There was a transparent didactic architecture in the lecture (sandwich structure)', 'The sandwich architecture was realized very well'; 8 (6.6); 8 (6.6)
Transparency on learning objectives; 'Learning objectives were shown in the beginning . . .', 'All learning objectives were transparent to students'; 6 (4.9); 6 (4.9)
Using an agenda as a (pre-)structure; 'An agenda was visual all the time', 'Good explanation of the agenda in the beginning'
Although the study met its goals, several limitations remain. The needs of our institution did not allow for a control group. Therefore, we cannot be sure that some of the outcomes we attribute to the 5-day program are not actually due to a selection bias. Participation in the 5-day program is a prerequisite for the postdoctoral qualification for some participants. Furthermore, the 5-day program is compulsory for all faculty teaching staff within the faculty development program.
A second limitation is the small size of our study group (the whole faculty has approximately 1,200 members). However, for the intervention group, we reached 85% participation for quantitative measures (Levels 1 and 2 of Kirkpatrick's framework) and 20% for qualitative measures (Level 3 of Kirkpatrick's framework). This may contrast with other findings in the literature (29).
Third, long-term outcomes were measured at a 6-month follow-up, and some may regard this as too short an evaluation interval. However, we could demonstrate that the immediate and 6-month outcomes were still significantly higher than pre-intervention scores.
In conclusion, our findings indicate that the participants of our 5-day education intervention achieved a higher level of educational proficiency. Our participants achieved the required cognitive development. They felt familiar with learning theory and expressed their intention to apply the learned educational principles. Indeed, results of self- and peer-assessment revealed a direct impact of theoretical content on their job-related performance. Furthermore, peer-assessment motivated the participants to collaboratively work on the improvement of their teaching performance. Finally, the participants identified more with the role as a medical educator. We are convinced that this education intervention supported self-reflection of medical educators in their professional environment, promoted collegiality and collaboration within and across traditional discipline boundaries, and exerted an important impact on effective faculty development (5,10,14,30,31).
More than 10 years after the initial implementation of the 5-day program, introduced to develop teaching staff in line with the new medical curriculum HeiCuMed, the success of this continuous quality improvement is confirmed by various studies on student evaluations and satisfaction (32,33).
Table 5. Qualitative assessment of self-reflection (semistructured written interviews)
'Many times I feel unversed in using the "new" methods, but I hope it will be better with more practice'; 'I think some didactic methods are artificial and not useful in daily teaching'; 25 (16.8)
Goals of improved medical education; 'Teach students in accordance with best practice', '. . . improvement on medical knowledge, development of social and ethics competencies', 'To be an enthusiastic role model for students and to motivate students to prepare for a complex profession', 'A good physician is not a specialized "idiot" but an interdisciplinary team worker'; (25.5)
Total text units; 149 (100)