Validating a SMIIL:) Development and initial validation of a Scale Measuring the Impact of Interprofessional Learning (SMIIL)

ABSTRACT Design and evaluation of interprofessional learning (IPL) in pre-qualification education lack customization. In response to this, the Scale Measuring the Impact of Interprofessional Learning (SMIIL) was developed to facilitate the context-driven evaluation of IPL interventions in the undergraduate setting. Items of this scale reflect common IPL themes and align to levels one to three of the modified Kirkpatrick’s model. This scale was piloted in a cohort of 787 BMBS (Bachelor of Medicine Bachelor of Surgery) students at a medical school in the South West of England. A response rate of 22.7% was achieved despite the disruption of face-to-face data collection during the Covid-19 lockdown (March to July 2020). Descriptive statistics, factor analysis, and Cronbach’s Alpha were used to validate and refine the scale. The resultant SMIIL is a unidimensional instrument comprising 17 items with an acceptable internal consistency (Cronbach’s α = 0.80). Further research is required to develop the scale fully and validate it by involving different cohorts of pre-qualification healthcare students in multiple localities and varying styles of IPL interventions.


Introduction
Over the years, medical education has moved away from the traditional lecture-based curriculum to be more reflective of professional practice. This correlates with the growing interest in interprofessional education (IPE), which can be defined as, "when two or more professions learn with, from, and about each other to improve collaboration and the quality of care" (Barr, 2002). For some time now, regulatory bodies and policymakers have been reiterating the importance of IPE in the undergraduate curricula of healthcare professionals (Aresko. et al., 1988; Barr et al., 2016; Reeves et al., 2016). Whilst educational institutions have acknowledged the importance of IPE, there have been significant logistical challenges which have deterred its incorporation into the curriculum. Therefore, IPE in this setting must be heralded by a firm foundation of evidence and tools to assess its impact on students' future practice. After all, there is little point in introducing new curricular components if they cannot be evaluated effectively (Fraser et al., 2020). Currently, the body of evidence for IPE in undergraduate health education is limited and few evaluative tools have been developed for this setting (Freeth et al., 2019a; Reeves et al., 2016). This paper will outline the development and initial validation of a scale designed specifically for the undergraduate context to measure the effect of IPE. However, as with many UK medical schools, the host institution's incorporation of IPE into the curriculum is limited. Therefore, the initial validation of the scale is in the context of informal IPE encounters. Interprofessional learning (IPL) is a concept which encompasses both formal and "serendipitous" (informal) learning in an educational or workplace setting (Freeth et al., 2019a). So, the term IPL will be used going forward.

Background
Evaluation of curricular components can be guided by a variety of models and approaches. Kirkpatrick's hierarchy is an outcomes-based model which has been popular in medical education. However, the original model has been criticized for superficial evaluation; failing to look beyond the achievement of the intended learning outcomes (ILOs) of the course (Belfield et al., 2001). Barr et al. (2000) modified Kirkpatrick's model to outline the intended outcomes for IPL (Figure 1). The existing evaluative tools for IPL which have been used in the undergraduate setting can be examined using the modified Kirkpatrick's model to ascertain their effectiveness.
The Best Evidence Medical Education (BEME) guide no. 39 offers insight into the IPL outcomes commonly reported in both pre- and post-qualification settings. An update for this review was also conducted focussing specifically on pre-qualification education. Both reviews identified that a large proportion of the studies reported outcomes which align with levels 1, 2a, and 2b of the modified Kirkpatrick's model. Results from various studies demonstrate that students already possess positive attitudes to interprofessional collaboration (IPC) before they participate in any IPL interventions. Perhaps, this is indicative of a shift in culture toward a more team-oriented approach in clinical practice. Therefore, now more than ever, it is important to explore outcomes beyond student perceptions or attitudes to strengthen the evidence base for IPL in the undergraduate curricula of healthcare professionals.
Longitudinal follow-up of students who have experienced IPL is necessary to assess outcomes at levels 4a and 4b. However, some studies have demonstrated that IPL evaluation can reach higher levels by developing IPL interventions in a clinical setting, i.e., involving interprofessional student teams in patient care (Luebbers et al., 2017; Marcussen et al., 2019; Reeves et al., 2016; Tervaskanto-Mäentausta et al., 2017). Alternatively, evaluative instruments can be tailored to the pre-qualification setting through outcomes which mirror interactions and scenarios commonly faced by undergraduate students.

Questionnaires as evaluative tools for IPL
As IPL is a relatively new concept in undergraduate healthcare education, assessment and evaluation are reliant on reflections, surveys, and professionalism judgments (Brashers et al., 2016). While self-administered questionnaires are not renowned for accuracy, they facilitate the collection of large volumes and types of data from varying population sizes (Artino et al., 2014). Moreover, complex topics such as student experiences can be divided into multiple factors to facilitate a more comprehensive evaluation. These traits make self-administered questionnaires highly attractive in undergraduate IPL development, as time and resources are already so limited in medical curriculum development. Therefore, it is unsurprising that many of the existing tools are self-administered questionnaires.
One such tool is the Readiness for Interprofessional Learning Scale (RIPLS). It was originally developed to assess the willingness of qualified healthcare professionals to engage in IPL (Parsell & Bligh, 1999). It has since been validated and used in various settings, including pre-qualification training. This scale determines level 1 outcomes, and if used in a post-intervention context, level 2a. As alluded to previously, recently there have been steps toward deconstruction of hierarchy in healthcare. It is possible to infer that such a shift in paradigm has influenced the admissions process resulting in recruitment of undergraduates with a particular disposition to teamwork. This should prompt further implementation of IPL and a more robust evaluation of outcomes beyond the primary levels of Kirkpatrick's hierarchy as achieved by RIPLS.
The University of the West of England (UWE) developed a series of three questionnaires to assess changes in undergraduates' perceptions of IPL longitudinally and into post-qualification practice (Pollard et al., 2005). Individually, these instruments focus on levels 1 and 2b of the modified Kirkpatrick's model. Combined, they provide data which reflects level 2a. However, a combination of these factors in the undergraduate setting is yet to be addressed in a single questionnaire. Additionally, the UWE questionnaires, like many other evaluations of IPL, report level 2b outcomes in terms of teamwork and communication skills alone. This implies that IPL is regarded as a separate entity to the main curriculum. Although there is no obvious "gap" for IPL in the curriculum, it should be considered as a means to enhance existing aspects of the curriculum (Freeth et al., 2019). This indicates that IPL interventions should be engineered to meet the requirements of individual institutions. It also highlights the need for an evaluative tool which demonstrates the efficacy of an individual intervention as well as of IPL within the curriculum as a whole.
A variety of self-administered questionnaires are available for evaluation of IPL from the growing pool of evidence. The majority of these questionnaires seem to be based on regional or national competencies for IPC from parts of the world which have absorbed IPL into the pre-qualification curricula of healthcare professions, such as the United States and Canada. However, the present instruments available for IPL evaluation are concentrated on outcomes which practising professionals can more readily achieve.

Study aims
(1) To outline the development and initial validation of the "Scale Measuring the Impact of Interprofessional Learning" (SMIIL) in the undergraduate training of healthcare professionals, which can be used post-intervention.
(2) To create achievable and stage-appropriate outcomes for student healthcare professionals by aligning items in the SMIIL to the modified Kirkpatrick's model of evaluation.
(3) To provide an insight into the self-perceived impact of IPL on students from a UK medical school.

Scale composition
A literature review by Reeves et al. (2016) and an update completed by the researcher identified the commonly reported outcomes of IPL and assessment tools used. Following this, the RIPLS (Parsell & Bligh, 1999), UWE questionnaires (Pollard et al., 2005)  professions' roles, application of IPL to practice. Acquisition of medical knowledge and clinical skills was also added to enable evaluation of interventions integrated into the medical curriculum. These themes were elaborated on to produce 25 items employing a 5-point Likert scale to facilitate quantification and objective analysis of responses (Sullivan & Artino, 2013). A further three items (which do not form part of the SMIIL) were added to the 25 items regarding IPL to ascertain students' year of study and extent of IPL experience (formal vs informal). Participants were given a Participant Information Sheet (PIS) which contained the relevant definitions of IPL, informal IPL, and formal IPL (found in Supplementary Materials).
In order to assess face validity, the questionnaire was initially appraised by the project supervisor, three final-year medical students and an expert panel consisting of the University of Exeter Medical School (UEMS) IPL committee (staff from multiple disciplines and three pre-clinical medical students). Following a discussion with the project supervisor, six items were removed due to the repetition of content and some items were reworded for ease of understanding. The phrasing of items was further refined as a result of student critique of the scale. Two members of the IPL committee also provided feedback on scale composition. They suggested the inclusion of negatively phrased items to enable identification of rote answering, and items to ascertain if students prefer learning with other students from different healthcare professions instead of qualified professionals. In response to this, two items were added to the scale. It was also recommended that the questions should be divided into sections for ease of completion and analysis. As the items align with levels 1, 2a, 2b, and 3 of the modified Kirkpatrick's model (Table 1), they are arranged in ascending order of their corresponding level. In addition to this, the statement "As a result of engaging in interprofessional learning . . . " was added as the stem for each item to prevent the items from being too lengthy. Lastly, it was proposed that the scale could include items which explicitly explore students' perception of other professions in terms of respect and appreciation of diverse backgrounds. However, this seems more appropriate for a pre-intervention context or the end of an IPL curriculum. It was also noted that the topic of respect was implicitly assessed through items which explored changes in students' appreciation of other professions' roles in patient care. Psychometric analysis offered further recommendations for modification of language, in particular adaptation of the stem statement to read, "As a result of my experience with interprofessional learning I think that . . . " It was suggested that this would not exclude those who may not have experienced IPL yet. The 21-item SMIIL distributed for data collection can be found in Supplementary Materials.

Data collection
This study took place at the UEMS within the BMBS cohort of students in the academic year of 2019-2020 consisting of 787 students in total. The BMBS students were targeted as they were the largest, accessible cohort of pre-qualification healthcare professionals and the scale is intended for use in all undergraduate healthcare professional courses. Although the SMIIL is intended for use in a post-intervention context, the time scale of this project did not coincide with formal IPL sessions delivered by UEMS in years 2 and 5 of the BMBS programme. Therefore, the scale was validated in the context of evaluating the impact of IPL across the whole BMBS curriculum. The UEMS BMBS programme is a problem-based curriculum, which provides students with clinical exposure within the first month of the course. It was therefore considered likely that students had been exposed to situations that could result in informal IPL, and students in years 1 to 5 were approached to complete the questionnaire.
Participants were recruited through convenience sampling from lectures and small group sessions. The scale was initially distributed in paper form and handed in at the end of the session into a sealed box to maintain the anonymity of the participants. In return for the completed scale, participants were offered chocolate. Face-to-face data collection was achieved for the majority of clinical year students (years 3 to 5) in three different localities. However, due to the Covid-19 pandemic, it was not possible to approach pre-clinical year students (years 1 to 2) or half of the year 4 students in person. Instead, the PIS and the SMIIL were converted to an online format using SmartSurvey™, a General Data Protection Regulation (GDPR) compliant tool. The link to the online survey was distributed across all years via e-mail to reach both the pre-clinical and clinical students who may have been missed previously. This created a risk of duplicate responses. However, the e-mail stated that students who had previously submitted paper responses should not complete the online version.

Data analysis
The anonymized data was transcribed into Microsoft Excel and all data points were double-checked. Statistical analysis was conducted using IBM® SPSS® v26.0. First, all negatively coded items were reverse-scored and total scores were calculated for each response. The Mahalanobis distance was determined for each response to enable identification of outliers for removal. Following this, the total response rate and the response rate by year group were identified.
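The reverse-scoring and totalling step can be illustrated with a minimal sketch. It is not the SPSS procedure used in the study; the item labels and the choice of which item is negatively phrased are hypothetical, and only the 5-point scale is taken from the text.

```python
# Illustrative sketch only: item labels and the set of negatively
# phrased items are assumptions, not taken from the SMIIL itself.
SCALE_MIN, SCALE_MAX = 1, 5  # 5-point Likert scale, as in the study


def reverse_score(rating):
    """Mirror a Likert rating so that 1 <-> 5, 2 <-> 4 and 3 stays 3."""
    return SCALE_MIN + SCALE_MAX - rating


def score_response(response, negative_items):
    """Return (scored ratings, total score) for one respondent.

    response: dict mapping item label -> raw rating (1 to 5)
    negative_items: set of labels of negatively phrased items
    """
    scored = {
        item: reverse_score(rating) if item in negative_items else rating
        for item, rating in response.items()
    }
    return scored, sum(scored.values())


# Hypothetical three-item response; "q3" is treated as negatively phrased.
example = {"q1": 4, "q2": 5, "q3": 2}
scored, total = score_response(example, negative_items={"q3"})
print(scored, total)
```

With 21 items scored in this way, totals necessarily fall between 21 and 105.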
Item-level descriptive statistics such as the mean and standard deviation of Likert scores can define item-total correlations to an extent. As per Othman et al. (2011), items within a scale should yield similar mean scores and the ratio of the maximum standard deviation to the minimum should be 2:1. These statistics were analyzed to provide a preliminary stance on whether the SMIIL is a unidimensional or multidimensional scale. This was further investigated through Exploratory Factor Analysis (EFA), which aims to reduce multidimensional data into fewer variables. Of the two forms of EFA, Factor Analysis (FA) and Principal Component Analysis (PCA), FA was considered more appropriate in this instance. This is due to the intentional design of the items (observed variables) to fit the modified Kirkpatrick's model (latent variables). FA will determine how successful the intention has been. There are multiple forms of FA available, but as the intent of the study is to develop an instrument for use with multiple sample sets, Maximum likelihood or Kaiser's alpha factoring are more relevant than Principal Axis Factoring. Both orthogonal and oblique rotation methods were used in conjunction with the above extraction procedures to investigate the possibility of subscales within the SMIIL. The internal consistency of the whole scale was scrutinized by calculating Cronbach's Alpha.
In addition to this, the "Cronbach's Alpha if item deleted" function was used to explore whether removal of certain items would improve the internal consistency. Distribution of students' responses regarding the hours of formal and informal IPL they experienced was examined. Also, data from the remaining items in the SMIIL was used to investigate the difference of IPL impact between the year groups. Each item was scored as per the respondents' rating; 1 for strongly disagree to 5 for strongly agree. A composite score was then generated for each respondent for analysis (lowest = 21, highest = 105) with the understanding that a higher score indicated a greater impact of IPL. The one-way ANOVA test in conjunction with Tukey's Honestly Significant Difference (HSD) post hoc test was then used to assess the statistical significance of any differences noted between year groups.
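For readers unfamiliar with the statistic, Cronbach's Alpha and its "if item deleted" variant can be sketched as follows. This is a minimal standard-library illustration under the usual formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), not a reproduction of the SPSS output. The response matrix is invented toy data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents.

    rows: list of lists, one inner list of item scores per respondent.
    """
    k = len(rows[0])
    items = list(zip(*rows))  # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


def alpha_if_item_deleted(rows):
    """Alpha recomputed with each item left out in turn."""
    k = len(rows[0])
    return [
        cronbach_alpha([[v for j, v in enumerate(r) if j != i] for r in rows])
        for i in range(k)
    ]


# Invented toy data: five respondents, four items; the last item
# deliberately runs against the other three.
toy = [
    [4, 5, 4, 2],
    [3, 3, 4, 5],
    [5, 5, 5, 1],
    [2, 2, 3, 4],
    [4, 4, 4, 3],
]
print(cronbach_alpha(toy))
print(alpha_if_item_deleted(toy))
```

In the toy data, leaving out the last item raises alpha, which is the pattern the "if item deleted" function is used to detect.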

Ethical considerations
The project was reviewed and approved as a low-risk project by the College of Medicine and Health Research Ethics Committee (Reference No.: Sept20/D/147∆3). All collected data will be stored and deleted according to the principles of GDPR.

Scale validation
In total, 203 responses were collected. Of these, 20 responses were partially completed, so were excluded from analysis. A further four responses were excluded as outliers following generation of Mahalanobis distance for each respondent. Therefore, the response rate for the pilot study is 179/787 (22.7%). This poor response rate is likely a reflection of the Covid-19 lockdown, as it was not possible to speak to a larger proportion of the cohort in person and e-mail communication may have been lost in the large volume of information being forwarded to students. Also, intercalating medical students were harder to contact in person as they were dispersed across multiple campuses, localities, and institutions. A breakdown of response rates by year group is available in Table 2.
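The response rate follows directly from the counts reported above, as this short check shows. The variable names are illustrative.

```python
# Counts as reported in the text.
collected = 203   # responses received
partial = 20      # partially completed, excluded
outliers = 4      # excluded on Mahalanobis distance
cohort = 787      # BMBS students in 2019-2020

analysed = collected - partial - outliers
rate = 100 * analysed / cohort
print(analysed, round(rate, 1))
```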
Initial analysis of the 21-item scale using item-specific descriptive statistics (supplementary reading materials) revealed that the SMIIL could be a unidimensional scale; the ratio of highest to lowest standard deviation is approximately 2:1 and the mean Likert scores are similar. Bartlett's test of sphericity indicated that FA was appropriate for this data set (χ² = 1160.615, p < .0001). The Kaiser-Meyer-Olkin measure of sampling adequacy revealed that the strength of inter-variable relationship was high (KMO = .75). FA verified that the scale was likely unidimensional. Although various combinations of extraction and rotational procedures were applied, no subscales were identified. Therefore, Cronbach's Alpha was used to assess internal consistency of the scale and finalize the items. Cronbach's Alpha showed the SMIIL to have an acceptable reliability, α = 0.76. Most items appeared to be worthy of retention, resulting in a decreased alpha if deleted. Any item with an item-total correlation of <0.2 was removed to produce the final 17-item SMIIL. Items 1.4, 2.4, 2.6, 3.5, and 3.6 from the 21-item SMIIL had a corrected item-total correlation of <0.2. As such, all of these items except 2.4 were removed, increasing Cronbach's alpha to 0.80. Item 2.4 was retained despite a corrected item-total correlation of 0.198, as its removal reduced the item-total correlation of item 2.5. The final 17-item SMIIL can be found in Table 3 with the corresponding corrected item-total correlation for each item.
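The corrected item-total correlation used for item retention is the Pearson correlation between an item and the sum of all other items. The sketch below illustrates the calculation and the 0.2 cut-off on invented toy data; it is an assumption-laden illustration, not the study's SPSS analysis.

```python
from math import sqrt

THRESHOLD = 0.2  # cut-off reported in the text


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(rows):
    """Correlate each item with the total of the remaining items."""
    k = len(rows[0])
    result = []
    for i in range(k):
        item = [r[i] for r in rows]
        rest = [sum(r) - r[i] for r in rows]  # total excluding item i
        result.append(pearson(item, rest))
    return result


# Invented toy data: five respondents, four items.
toy = [
    [4, 5, 4, 2],
    [3, 3, 4, 5],
    [5, 5, 5, 1],
    [2, 2, 3, 4],
    [4, 4, 4, 3],
]
correlations = corrected_item_total(toy)
flagged = [i for i, r in enumerate(correlations) if r < THRESHOLD]
print(correlations, flagged)
```

A flag of this kind marks a candidate for removal rather than an automatic deletion: item 2.4 was retained in the study despite falling just below the cut-off.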

Exposure to IPL
Students within the same year group reported varied hours of participation in both formal and informal IPL (Supplementary Reading Materials). The modal group for formal IPL for all year groups was identified as 1 to 5 hours. Regarding exposure to informal IPL, the modal group varied as follows: years 1 and 3 reported experiencing 0 to 1 hours and years 2, 4, and 5 reported 20+ hours.

The SMIIL Score
There was a statistically significant difference between the mean composite score for each year group as determined by one-way ANOVA (F(4,174) = 4.438, p = .002) (Figure 2). Post hoc analyses using Tukey's HSD revealed that the composite score for the SMIIL was significantly greater for students in year 4 (68.9 ± 7.56, p = .009) and year 5 (67.86 ± 6.13, p = .014), when compared to students in year 3 (64.2 ± 5.71). There were no statistically significant differences between the other year groups.
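The F statistic and its degrees of freedom can be illustrated with a minimal one-way ANOVA sketch. The scores are invented toy data. The p value and Tukey's HSD comparisons are not reproduced here, as they require the F and studentized range distributions, which the standard library does not provide.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented toy data: three groups of three composite scores.
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova(toy_groups)
print(f_stat, df_b, df_w)
```

With five year groups and 179 respondents, the same definitions give 5 − 1 = 4 and 179 − 5 = 174 degrees of freedom, consistent with the reported F(4,174).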

Discussion
Previously, the evidence for IPL was largely generated in the post-graduate setting (Hammick et al., 2007). This means that despite the increasing evidence base for undergraduate IPL, the conceptual underpinnings for design and evaluation of IPL in pre-qualification training lack "customization" which was identified as an important factor for success (Hammick et al., 2007). In response to this, the SMIIL was developed to facilitate context-driven evaluation of IPL interventions in the undergraduate setting. This study produced empirical data which has enabled the development and initial validation of the 17-item SMIIL. Items for this scale were generated by expanding on themes extracted through examination of existing scales which have been validated in the undergraduate setting and IPC competencies from the United States and Canada. The SMIIL is an innovative scale because it presents IPL themes in a student-friendly manner. For example, level 3 of Kirkpatrick's modified evaluation model is often explored in terms of clinical practice. However, this scale acknowledges that knowledge gained from IPL can be applied to both clinical practice and approaches to learning experiences. This is demonstrated by items 15 to 17 in the 17-item SMIIL. Additionally, the inclusion of items 10 and 11 highlights the possibility of integrating IPL with aspects of the common curriculum. Certainly, moves to integrate IPL better within the curriculum will emphasize that IPC is a vital aspect of practice as opposed to a separate and optional entity. However, IPL and pre-qualification healthcare curricula share an "uneasy co-existence" in the United Kingdom (UK) (Barr, 2012). This reveals the need for the development of undergraduate-specific IPL competencies in the UK, or even reinterpretation of competencies for the pre-qualification setting, to aid design and evaluation of IPL interventions.
Analysis of the acquired data identified the unidimensional nature of the SMIIL. In fact, the original 21-item scale had an acceptable internal consistency (Cronbach's α = 0.76). However, four items were removed due to poor item-total correlation (<0.2), which increased Cronbach's Alpha value to 0.80. As the unidimensionality of the scale has been confirmed, the items are no longer separated into sections, but have been left in ascending order of the modified Kirkpatrick levels. Although the 17-item SMIIL was used to assess the impact of IPL for this particular sample set, it is not yet ready for use as an evaluative instrument. Prior to measuring internal consistency of the whole scale, FA was attempted to identify potential sub-scales. While this was not fruitful, it was noted that communalities (the extent to which items correlate with each other) of items were persistently low. According to Costello and Osborne (2005), this could imply the need for further items to explore additional factors in future studies. When constructing the scale, the intention was to limit the number of items for ease of completion and evaluation in practice. However, Artino et al. (2014) recommend using six to ten items to fully address a particular theme. Although there is scope to develop the SMIIL in the future, it still provides relevant information about the impact of IPL on students in its current form.

Impact of IPL
It is difficult to generalize the findings from this study due to a limited response rate (22.7%). Nevertheless, this provides a preliminary insight into the self-perceived impact of IPL on students from a UK medical school. One-way ANOVA of the composite scores for the SMIIL suggests that IPL has a significantly greater impact on students in years 4 and 5 compared to those in year 3. No significant differences were observed between the other year groups. Again, this is likely a reflection of the limited sample of year 1 and 2 students. Also, those pre-clinical students who responded to this scale online may have an interest in IPL or had more experience of it. The IPL curriculum for BMBS students at the UEMS consists of two formal sessions (where IPL is part of the intended learning outcomes), one in year 2 and another in year 5, which sum to approximately five hours. This is reflected by the survey responses which show that "1 to 5 hours" is the modal group for all years relating to formal IPL. Variation in responses to this item within year groups may be explained by extra-curricular activities or increased delivery of formal IPL in some localities due to availability of other pre-qualification healthcare students. However, the majority of IPL experienced by students is informal or serendipitous IPL on clinical placements which varies from student to student. The accuracy of student reports regarding informal IPL is questionable. It is possible that students have mistaken simply working or learning alongside professionals or students from another profession as IPL. Indeed, when completing the scale, the majority of students were uncertain about the term IPL itself, as well as formal and informal IPL. Barr et al. (2017) note that expecting students to identify and engage with IPL opportunities on placement is not realistic. Perhaps this is due to a lack of understanding of IPL and its importance in future practice.
This reiterates the need for raising awareness about IPL, not just among educators, but also the student population. Conceivably providing students with IPL learning outcomes, or IPC competencies, would enable them to effectively utilize and participate in IPL opportunities. The GMC has adapted the Good Medical Practice document (General Medical Council, 2019) for students to provide further guidance on how this code applies to them pre-qualification (General Medical Council & Medical Schools Council, 2016). Development of something similar for IPL would not be misplaced in assisting students and educators alike to understand what is expected of them.

Limitations
The low response rate is a significant limitation of the study. Not only does a multivariate statistical analysis require a larger sample size, but the small sample also reduces the generalizability and reliability of the findings. Additionally, this study has been conducted at one institution and assesses the impact of IPL on one profession. To explore the full impact of IPL, all professions involved need to be consulted especially when the intention of IPL is one of mutual learning.
As the data was not triangulated, the context of responses, additional dimensions of student IPL experience, and the subsequent impact may have been missed. Furthermore, this scale was originally intended to assess the impact of a particular IPL intervention. However, this study has validated the scale in a more general sense, assessing the impact of IPL across the BMBS curriculum at UEMS. This is somewhat problematic, as students reported (in written comments below the scale) difficulty in completing the SMIIL with only IPL experiences in mind. For those who filled in the paper version, it was possible to clarify how best to complete the scale. It should also be considered that the inadvertent use of overlapping bands for number of hours of formal and informal IPL makes these statistics less robust.

Recommendations for future research
This study has verified that the scale is appropriate for evaluating the outcomes of IPL. However, future validation of the scale in a post-intervention setting will be required. Studies in multiple localities, after varied styles of IPL intervention, and inclusion of a variety of undergraduate healthcare professionals would strengthen the evidence base for the SMIIL as an evaluative instrument. Prior to this, development of additional items may be necessary. These items should be generated by more than one person to avoid researcher bias and with expert input from the outset. Once items have been added to the scale, FA should be conducted again to assess multidimensionality and formation of sub-scales. If still unidimensional in nature, the scale could be further evaluated with Rasch Analysis.
To facilitate triangulation of data in future studies, the SMIIL could be adapted to create a facilitator mark sheet. This will be particularly useful as students' perception of knowledge may be inaccurate. With a more detailed scale and a larger data set, it may be possible to define the significance of a particular composite score, i.e., which score constitutes a "significant impact."

Conclusion
The SMIIL has been developed specifically for the evaluation of IPL in the pre-qualification training of healthcare professionals. The originality of this scale is the presentation of complex themes, which are better suited for evaluation of professional practice, in the undergraduate context. This study has validated the SMIIL as a unidimensional measure of IPL outcomes. However, further research is required to develop the scale fully and validate it by involving different cohorts of pre-qualification healthcare students in multiple localities and varying styles of IPL interventions.