Personalized programming education: Using machine learning to boost learning performance based on students’ personality traits

Abstract This study explores the use of machine learning and physiological signals to enhance learning performance based on students’ personality traits. Traditional personality assessment methods often yield unreliable responses, prompting the need for a novel approach utilizing objective data collection through physiological signals. Participants from a Taiwanese university’s Department of Electrical Engineering engaged in a programming video task while wearable sensors captured their physiological signals. A Big Five-factor theory questionnaire was administered to assess their personality traits, and a personality prediction model was developed using the collected data. Results indicated that galvanic skin response and heart rate variance significantly predicted extroversion, while heart rate variance also predicted agreeableness and conscientiousness. These findings hold implications for personalized programming education, enabling educators to tailor pedagogical methods based on students’ personality traits, thereby improving learning outcomes. A case study in a game development elective course demonstrated significantly better performance with personalized materials. By leveraging machine learning and physiological signals, this research presents new opportunities for personalized education, fostering engaging and effective learning environments. Future research can explore its application in other educational domains and assess its long-term impact on learning outcomes.


Motivation and background
What is the significance of personality assessment models? Over the years, a variety of personality assessment models have emerged, including the Big Five model, self-efficacy and innovativeness, locus of control, and the need for achievement (Kerr et al., 2018). Many researchers have established a correlation between an individual's personality and their level of success. For instance, O'Connor and Paunonen emphasized the strong relationship between personality and academic performance (O'Connor & Paunonen, 2007). Meanwhile, Rothmann and Coetzer's study found a connection between job performance and personality traits among employees at a pharmaceutical company (Rothmann & Coetzer, 2003). In the field of adaptive learning, most research focuses on using parameters gathered from e-learning environments. For example, as Brinton et al. pointed out, students' video-watching behaviors can provide insight into when they need help from instructors (Brinton et al., 2016). In traditional classroom settings, such information is typically unavailable, but students' personality traits may serve as a viable alternative.
Assessment methods have already been developed for evaluating personality traits using established models. For example, Goldberg introduced a marker system that transparently quantifies one's degree of Extraversion, Agreeableness, Conscientiousness, Emotional Stability, and Imagination in terms of the Big Five model (Goldberg, 1992). Similarly, Sherer et al. devised a generalized scale for measuring self-efficacy (Sherer et al., 1982), and Craig et al. created a system to assess locus of control (Craig et al., 1984). However, these existing assessment tools are mainly based on questionnaires or marker systems, and despite their widespread use and validation, the need for additional assessment methods persists. As summarized by Akash Choudhury, questionnaires have limitations such as poor, unreliable, or incomplete responses. Our study proposes an approach based on physiological signals, which offers several advantages. Firstly, participants do not need to provide manual responses. Secondly, the use of physiological signals provides a more objective measure. Thirdly, as Tiwari et al. and Šalkevicius et al. have pointed out, people's physiological signals are linked to their psychological status, such as emotion and anxiety levels (Šalkevicius et al., 2019; Tiwari et al., 2019). Therefore, analyzing changes in participants' physiological signals during events can potentially reveal valuable information about their internal psychological status and, by extension, their personality traits. This approach allows for a more in-depth understanding of individuals.
Furthermore, we conducted an experiment at Yuan Ze University, R.O.C., to examine whether delivering learning materials based on students' personality traits helps enhance their learning performance. An elective course, "Game Development", with 22 enrolled students, was chosen. Based on the results, students who were given learning materials based on their personality traits significantly outperformed the other students. To sum up, the research goals are as follows:
(1) To investigate the relationship between physiological signals and personality traits and to construct a model to assess personality traits based on physiological signals.
(2) To understand whether delivering materials based on personality traits has a positive impact on learning performance.

Related works
Numerous models for assessing personality traits have been developed over time. Among the various personality models available, the Big Five model is particularly popular. Goldberg first proposed this model in 1990, and subsequently developed a marker system that transparently quantifies one's personality traits in terms of Extraversion, Agreeableness, Conscientiousness, Emotional Stability, and Imagination (Goldberg, 1992).
Rothmann and Coetzer discovered that there were positive associations between several personality traits and task performance (Rothmann & Coetzer, 2003). A survey conducted in the healthcare sector in Pakistan found that Agreeableness, Conscientiousness, and Openness to Experience positively impacted organizational effectiveness (Butt et al., 2020). O'Connor and Paunonen established that Conscientiousness was strongly linked to academic success (O'Connor & Paunonen, 2007). Duff et al. found that individuals with different personality traits tend to choose among three types of learning strategies: the deep learning approach, the surface learning approach, and the strategic learning approach (Duff et al., 2004). The research of Donker et al. showed that there is a relationship between learning strategies and learning performance (Donker et al., 2014).
Previous research has demonstrated that physiological signals can be utilized as indicators of psychological states. Egger, Ley, and Hanke conducted a survey that found physiological signals to be 79.3% accurate in assessing individuals' emotional states, while speech recognition achieved 80.46% accuracy in detecting happiness and sadness specifically (Egger et al., 2019). Sriramprakash et al. successfully developed a model to assess an individual's stress level using electrocardiogram and GSR (Sriramprakash et al., 2017). Wache conducted a study to examine the relationship between personality traits and different physiological signals while participants watched emotional movie clips (Wache, 2014). In Bastos' research, multiple models were created to evaluate personality traits through physiological signals such as pupil, ECG (Electrocardiogram), BVP (Blood Volume Pulse), and EDA (Electrodermal Activity). A total of 473 features were extracted from these signals. The findings indicated that EDA and BVP were the best predictors for Openness, ECG and EDA for Agreeableness, and ECG and EDA for Extraversion (Bastos, 2019). The research of Butt et al. extracted 11 features from participants' EEG, GSR, and PPG (Photoplethysmography) signals for assessing their personality traits, and the classification accuracy ranged from 67% to 92% (Butt et al., 2020).

The first phase: The development of the personality assessment model
The study consisted of two phases. In the first phase, physiological signals were collected from participants to construct a personality assessment model. In the second phase, participants were provided with learning materials tailored to their personality traits to evaluate whether this information could improve their learning performance. This section explains the experimental results of the first phase.

Experiment design
Thirty participants were recruited from the Department of Electrical Engineering at Yuan Ze University in Taiwan for the first phase. Participants were selected randomly and were not informed about their personality traits, grades, or interests prior to the study. An experimental protocol was designed to collect participants' personality traits and physiological signals as follows:
(1) GSR and heart rate sensors were attached to the participants; during the experiment, these sensors collected the participants' physiological signals and saved them to a local disk.
(2) Participants watched a video about JavaScript programming lasting 8 minutes and 30 seconds.
(3) After watching the video, participants were asked to summarize its content.
(4) Participants completed the Big Five personality traits questionnaire.

Personality trait collection
Participants' personality traits were collected using the IPIP Big Five factor markers system developed by Goldberg. The system employs a 50-item questionnaire, with each question offering five response options from very inaccurate to very accurate. It yields a tendency score for each of the Big Five personality traits. Participants with higher scores in a specific personality trait are considered to have a stronger tendency towards it. For convenience, a web application was created to administer the questionnaire; a screenshot of the web application is provided in Figure 1. In addition to hosting the questionnaire, the web application also shows the level of tendency for each personality trait compared to other participants.
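The scoring step described above can be sketched as follows. This is a minimal illustration, not the authors' web application: it assumes responses are coded 1 (very inaccurate) to 5 (very accurate) and that some items are reverse-keyed, as is usual for marker questionnaires of this kind. The item-to-trait mapping and the set of reverse-keyed items are hypothetical inputs and are not taken from the paper.

```python
def score_questionnaire(responses, item_traits, reverse_keyed, scale_max=5):
    """Sum item responses into one tendency score per trait.

    responses:     {item_number: answer on a 1..scale_max scale}
    item_traits:   {item_number: trait label}   (hypothetical key)
    reverse_keyed: set of item numbers scored as scale_max + 1 - answer
    """
    totals = {}
    for item, answer in responses.items():
        if not 1 <= answer <= scale_max:
            raise ValueError(f"item {item}: answer {answer} is out of range")
        # Reverse-keyed items count in the opposite direction.
        value = scale_max + 1 - answer if item in reverse_keyed else answer
        trait = item_traits[item]
        totals[trait] = totals.get(trait, 0) + value
    return totals


# Hypothetical three-item example (not real questionnaire data).
example_scores = score_questionnaire(
    responses={1: 5, 2: 1, 3: 3},
    item_traits={1: "Ex", 2: "Ex", 3: "Ag"},
    reverse_keyed={2},
)
```

A higher total for a trait corresponds to a stronger tendency towards that trait, which matches how the scores are interpreted in the text.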

Physiological signals collection
The Grove GSR sensor and the Grove ear-clip heart rate sensor were installed on a Raspberry Pi for the collection of physiological signals. Figure 2 shows the device and its usage during the collection step. We also built a GUI-based application for the collection of physiological signals. The application was written in JavaFX and runs on the Raspberry Pi board. Figure 3 shows the application. The application collects voice frequencies, volumes, heart rates, and GSR values; however, only the heart rate and GSR values were kept in this research.

Preprocessing of data
Due to the different processing speeds of the sensor modules, the two signals were sampled at different rates:
• GSR values: 0.3 samples per second
• heart rate values: 1.1 samples per second
For each participant, we recorded the physiological signals for the first 7 minutes. In total, we recorded 4111 GSR values and 12,552 heart rate values. To reduce noise, data below the 10th percentile and above the 90th percentile were removed. The resulting dataset contained 3269 GSR values and 10,128 heart rate values. We then divided the data into three equal segments and calculated the variance of each signal in each segment. Additionally, we calculated the differences between the variances of GSR values in different segments and labeled them delta_gsr12, delta_gsr23, and delta_gsr13. The same procedure was applied to the variances of heart rate values, which were labeled delta_hr12, delta_hr23, and delta_hr13. The variance values were normalized using the z-score method, and the delta values were expressed as ratios. For example, delta_gsr12 was calculated as
delta_gsr12 = (gsr2_var - gsr1_var) / gsr1_var
where gsr1_var and gsr2_var denote the variance of the GSR values in the 1st and 2nd segments, respectively.
Participants' personality trait values were preprocessed using the steps below:
• the median value of each personality trait was calculated
• each personality trait value was then labeled "H" or "L" according to whether it was higher than the corresponding median; "H" indicates a stronger tendency towards the corresponding personality trait, while "L" indicates a weaker tendency
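The preprocessing steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' pipeline: it assumes trimming is applied per participant, uses the population variance, and drops any trailing samples that do not fit into three equal segments. The paper does not specify these details, and all function names are hypothetical.

```python
import statistics


def trim_percentiles(values, low=10, high=90):
    """Drop readings below the low-th or above the high-th percentile."""
    cuts = statistics.quantiles(values, n=100)  # 99 percentile cut points
    lo, hi = cuts[low - 1], cuts[high - 1]
    return [v for v in values if lo <= v <= hi]


def segment_variances(values, n_segments=3):
    """Variance of each of n equal-length consecutive segments."""
    size = len(values) // n_segments  # assumption: remainder is dropped
    return [statistics.pvariance(values[i * size:(i + 1) * size])
            for i in range(n_segments)]


def ratio_delta(var_a, var_b):
    """Relative change from var_a to var_b, e.g. delta_gsr12."""
    return (var_b - var_a) / var_a  # undefined when var_a is zero


def signal_features(values, name):
    """Build the six features of one signal for one participant."""
    v1, v2, v3 = segment_variances(trim_percentiles(values))
    return {
        f"{name}1_var": v1, f"{name}2_var": v2, f"{name}3_var": v3,
        f"delta_{name}12": ratio_delta(v1, v2),
        f"delta_{name}23": ratio_delta(v2, v3),
        f"delta_{name}13": ratio_delta(v1, v3),
    }


def zscores(column):
    """Normalize one feature column across participants."""
    mu, sigma = statistics.mean(column), statistics.pstdev(column)
    return [(x - mu) / sigma for x in column]


def median_split(scores):
    """Label trait scores 'H' (above the median) or 'L' (otherwise)."""
    med = statistics.median(scores)
    return ["H" if s > med else "L" for s in scores]
```

Calling `signal_features(readings, "gsr")` and `signal_features(readings, "hr")` for each participant yields the twelve input variables named in the text; `zscores` would then be applied to each variance column across participants.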

Statistics of raw data
Figures 4 and 5 show the distribution of GSR and heart rate variance in each segment. Table 1 lists the descriptive statistics of the input values before normalization, and Table 2 lists the descriptive statistics after normalization. Figure 6 illustrates the distribution of each personality trait.

The prediction models
Since the ranges of GSR values and heart rate values differ, we used the scaled dataset to construct the prediction models. The input and output variables are listed below:
• input variables: gsr1_var, hr1_var, gsr2_var, hr2_var, gsr3_var, hr3_var, delta_gsr12, delta_gsr23, delta_gsr13, delta_hr12, delta_hr23, delta_hr13
• output variables: Ex_class, Co_class, Es_class, Ag_class, Op_class
The goal of each prediction model is to use the input variables to predict the level of one output variable. The models were built with the R implementation of the Random Forest algorithm. To achieve better prediction precision, we had to identify the best set of input variables for model construction. The R package Boruta was used for input variable selection, while tuneRF was used for tuning the parameters of the Random Forest implementation. Each model was trained separately, and the results are described below.
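The out-of-bag (OOB) error reported for each model below can be illustrated with a small, self-contained sketch. This is not the R Random Forest implementation used in the study: it bags one-split decision stumps instead of growing full trees with random feature selection at each split, and the toy data are hypothetical. It only shows how each sample is scored by the learners whose bootstrap sample did not contain it.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Return (feature, threshold, label_below, label_above) with fewest training errors."""
    majority = Counter(y).most_common(1)[0][0]
    best = (len(y) + 1, 0, float("inf"), majority, majority)
    labels = sorted(set(y))
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate split halfway between neighbours
            for below in labels:
                for above in labels:
                    errors = sum((below if row[f] <= t else above) != label
                                 for row, label in zip(X, y))
                    if errors < best[0]:
                        best = (errors, f, t, below, above)
    return best[1:]


def predict_stump(stump, row):
    f, t, below, above = stump
    return below if row[f] <= t else above


def oob_error(X, y, n_learners=50, seed=0):
    """Fraction of samples misclassified by a majority vote of out-of-bag learners."""
    rng = random.Random(seed)
    n = len(X)
    votes = [Counter() for _ in range(n)]
    for _ in range(n_learners):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(idx)
        stump = fit_stump([X[i] for i in idx], [y[i] for i in idx])
        for i in range(n):
            if i not in in_bag:  # only out-of-bag samples get a vote
                votes[i][predict_stump(stump, X[i])] += 1
    scored = [(v.most_common(1)[0][0], y[i]) for i, v in enumerate(votes) if v]
    return sum(pred != truth for pred, truth in scored) / len(scored)


# Hypothetical, clearly separable toy data with "L"/"H" class labels.
X = [[0.1 * i, 1.0] for i in range(10)] + [[10 + 0.1 * i, 1.0] for i in range(10)]
y = ["L"] * 10 + ["H"] * 10
toy_oob = oob_error(X, y)
```

Because every sample is evaluated only by learners that never saw it during training, the OOB error serves as a built-in estimate of generalization error without a separate test set, which is useful with a sample as small as the one in this study.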

Ex_class
To predict Ex_class, we first used the Boruta package to identify the important variables. The output of Boruta is shown below:
    Boruta performed 38 iterations in 0.8774686 secs.
    3 attributes confirmed important: delta_gsr12, delta_hr12, and gsr2_var;
    9 attributes confirmed unimportant: delta_gsr13, delta_gsr23, delta_hr13, delta_hr23, gsr1_var and 4 more;
As suggested by Boruta, delta_gsr12, delta_hr12, and gsr2_var were the important variables for predicting Ex_class. In the resulting Random Forest model, the mean decrease accuracy values of the three variables were 50.73191, 46.04577, and 48.88196, respectively, which indicates that the accuracy of the model would drop substantially without these variables. The resulting OOB (Out-Of-Bag) error was 10.34%. Figure 7 shows the first tree in the resulting Random Forest model.

Co_class
To predict Co_class, Boruta suggested that the important variables were delta_gsr23, delta_hr13, and gsr2_var, and the corresponding mean decrease accuracy values were 12.84836, 30.57563, and 28.50746, respectively. The resulting OOB error was 24.14%, and Figure 8 shows the first tree of the resulting model.

Ag_class
To predict Ag_class, Boruta suggested that the important variables were Ex_class, Co_class, and hr2_var, and the corresponding mean decrease accuracy values were 23.55775, 37.66948, and 14.26147, respectively. The resulting OOB error was 27.59%, and Figure 9 shows the first tree of the resulting model.

Es_class and Op_class
For Es_class and Op_class, we did not find any effective model in the experiment. However, among the 30 participants recruited in the first phase, we did find some tendencies. For example, most participants with low Ex_class, high Co_class, and low Ag_class also had low Es_class. Table 3 summarizes the results.

The second phase: Enhancing learning performance using personality information

Experiment design
In this phase, we invited students enrolled in the "Game Development" course to join the second phase of the experiment. Initially, we had 22 participants in the second phase. However, by the end of the semester, the number of participants had decreased to 16. The participating students were divided into a control group and an experimental group. For students in the experimental group, we delivered learning materials based on their personality traits, while for students in the control group, the learning materials were randomly assigned. At the end of the semester, the learning performance of the two groups was compared. For students favoring the deep learning strategy, the delivered learning materials consisted of notes, programming examples, and further readings. For students favoring the strategic learning strategy, the learning materials consisted of notes, programming examples, and API (Application Programming Interface) documents. For students who preferred the surface learning strategy, the learning materials consisted of notes and programming examples (more detailed version) only.

Data analysis
The students were randomly distributed into the experimental group and the control group. After the midterm exam, 6 students withdrew and did not finish the experiment. In the end, there were a total of 10 students in the experimental group and 6 students in the control group.
During the semester, we followed the findings of Duff et al., who concluded that students tend to choose learning strategies (deep learning, surface learning, and strategic learning) based on their personality traits (Duff et al., 2004). We prepared 4 labs, which students had to complete in 6 weeks, with learning materials designed around the three learning strategies.
Students in the experimental group received learning materials based on their personality traits, while students in the control group received randomly assigned learning materials. We conducted a before test and an after test to measure students' understanding of the JavaScript language, and then calculated the difference in points between the two tests to evaluate their improvement. The equation below was used:
Δ = A_t - B_t
in which A_t and B_t refer to the points obtained in the after test and the before test, respectively. The result is shown in Table 4. A normality test using the Shapiro-Wilk method was then conducted, and the result is shown in Table 5. According to the normality test, the data were not normally distributed. Both Student's t test and the Mann-Whitney U test were performed, and the results are shown in Table 6. Because the data were not normally distributed, the non-parametric Mann-Whitney U test was used for inference. It indicated a significant difference between the two groups (p < 0.05), with the experimental group scoring significantly higher than the control group, and the resulting Hedges' g was 0.69. Although the sample was small (16 participants), the difference between the groups was statistically significant.
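The analysis described above can be sketched in plain Python as follows. This is a hedged illustration, not the software used in the study: it assumes the gain is the after-test score minus the before-test score, computes Hedges' g with the common small-sample correction factor 1 - 3/(4(n1 + n2) - 9), and obtains a two-sided p-value for the Mann-Whitney U statistic by exact enumeration of group assignments. The paper does not state which variants were used, and the scores below are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def gains(after, before):
    """Per-student improvement: after-test points minus before-test points."""
    return [a - b for a, b in zip(after, before)]


def hedges_g(group_a, group_b):
    """Standardized mean difference with small-sample bias correction."""
    n1, n2 = len(group_a), len(group_b)
    pooled_sd = sqrt(((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b))
                     / (n1 + n2 - 2))
    d = (mean(group_a) - mean(group_b)) / pooled_sd
    return d * (1 - 3 / (4 * (n1 + n2) - 9))


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: pairs where a > b count 1, ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in group_a for b in group_b)


def exact_two_sided_p(group_a, group_b):
    """Share of all group assignments with U at least as far from its centre."""
    pooled = list(group_a) + list(group_b)
    n1 = len(group_a)
    centre = n1 * len(group_b) / 2
    observed = abs(mann_whitney_u(group_a, group_b) - centre)
    extreme = total = 0
    for idx in combinations(range(len(pooled)), n1):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mann_whitney_u(a, b) - centre) >= observed:
            extreme += 1
    return extreme / total


# Hypothetical scores for illustration only (not the study's data).
experimental_gain = gains(after=[80, 75, 90], before=[60, 65, 70])
control_gain = gains(after=[70, 72], before=[65, 70])
```

With groups of 10 and 6 students, the exact enumeration covers 8008 assignments, so it remains cheap, and it avoids relying on a normal approximation for such a small sample.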

Conclusions and future work
In this research, we proposed a mathematical model to assess participants' personality traits via their physiological signals. Using the proposed mechanism is more convenient than the original personality traits assessment method, which is questionnaire-based. Knowing one's personality traits has some advantages. For example, the connection between one's personality traits and task performance has already been established in the research of Rothmann and Coetzer (Rothmann & Coetzer, 2003).
Our findings are partly consistent with those of Bastos (2019), who found that ECG and EDA can be used to evaluate an individual's degree of extraversion. In our study, we discovered that an individual's GSR and heart rate variance are the most significant factors in predicting their extraversion level. This similarity can be attributed to the fact that ECG is highly correlated with heart rate variance (as mentioned in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7472094/), and EDA is equivalent to GSR (as explained in https://en.wikipedia.org/wiki/Electrodermal_activity). Moreover, like Bastos, we found that ECG and EDA can be employed to assess an individual's degree of agreeableness. Our results also highlight the significance of heart rate variance in assessing an individual's agreeableness level.
Both Wache's model (Wache, 2014) and the proposed model showed that heart rate variance is related to one's degree of agreeableness and conscientiousness. Wache's model showed a correlation between one's EEG and degree of emotional stability, but we did not collect EEG signals in our experiment. In addition, Wache's model found that one's degree of creativity (openness to experience) was best predicted by GSR, while our model did not show such a relationship. Note that Wache's research focused on finding correlations between physiological signals and the degree of personality traits, while the proposed model focused on building a prediction model.
In addition to building a prediction model and a software system, we conducted an experiment to examine whether the proposed method enhances students' academic performance. According to our experimental findings, educational materials tailored to students' personality traits improved their academic performance. Donker et al. summarized the literature on the effectiveness of learning strategy interventions and found a mean effect size (Hedges' g) of 0.66 (Donker et al., 2014); the effect size measured in our experiment was 0.69, which is comparable with existing research. Due to the limited number of participants in the experiment and the fact that the "Game Development" course was not mandatory in our department, the results may not be widely generalizable, but they still hold some reference value.
In the future, there are several possible directions for enhancing this work. First, the developed personality traits assessment method relies on physiological signals, which may not be easy to obtain in a large-scale course using the current device and software applications, so redesigning the device and the software applications may be needed. Second, we would like to test the developed learning materials in larger classes to further verify the results.

Figure 1. The screenshot of the web application.

Figure 2. The usage scenario of the device.

Figure 5. Distribution of heart rate variance.

Figure 6. The distribution of personality traits.

Figure 9. The random forest for Ag_class.