SSVEP-based Brain-computer Interface for Music using a Low-density EEG System

In this paper, we present a bespoke brain-computer interface (BCI), developed for a person with severe motor-impairments who was previously a violinist, that allows her to perform and compose music at home. It uses steady-state visually evoked potential (SSVEP) and adopts a dry, low-density, and wireless electroencephalogram (EEG) headset. We investigated two parameters, (1) placement of the EEG headset and (2) inter-stimulus distance, and found that adjusting the former significantly improved the information transfer rate (ITR). To analyse EEG, we adopted canonical correlation analysis (CCA) without weight-calibration. The BCI for musical performance realised a high ITR of 37.59 ± 9.86 bits/min and a mean accuracy of 88.89 ± 10.09%. The BCI for musical composition obtained an ITR of 14.91 ± 2.87 bits/min and a mean accuracy of 95.83 ± 6.97%. The BCI was successfully deployed to the person with severe motor-impairments. She regularly uses it for musical composition at home, demonstrating how BCIs can be translated from laboratories to real-world scenarios.

Brain-computer interfaces (BCIs) for musical applications aim to interface brain waves directly with composition tools, instruments, algorithmic composers, and music players, to name but a few (Eaton et al., 2015; Grierson & Kiefer, 2014; Miranda, 2014). Such BCIs are beneficial for patients with locked-in syndrome, which is the loss of all or most motor abilities, because they provide a means of creative expression, which has been shown to have positive effects on mental well-being (Leckey, 2011). They also allow creative practitioners to communicate with musical applications through a novel mechanism of control. Steady-state visually evoked potential (SSVEP) has been widely used in BCIs.
The objective of this paper is to develop BCI-based performance and composition tools for a person with a severe motor-impairment condition, such that the system can be used independently in the absence of specialists or engineers. To our knowledge, this is the first study that addresses the task of using BCIs to compose music at home. To support independent use, all BCI-related operations are encapsulated into one stand-alone application. In this paper, we adopt joint frequency-phase modulation (JFPM) (Chen et al., 2015b) to present the visual stimulus and canonical correlation analysis (CCA) (Bin et al., 2009; Lin et al., 2007) to analyse EEG.
BCI studies that adopt CCA generally collect training data from users to calibrate weights (Nakanishi et al., 2014). In this study, we adopt CCA without weight-calibration (explained in section 3.4). This does not require user-training sessions and thus improves the usability of BCIs (Lin et al., 2007; Bin et al., 2009; Nakanishi et al., 2015; Yger et al., 2016). Additionally, this paper investigates 2 parameters: (1) size of and distance between flashing regions in the visual stimulus and (2) placement of the EEG headset. On one hand, Duszyk et al. (2014) demonstrated that there is a linear relationship between the size of the flashing region and SSVEP amplitude. They also stated that inter-stimulus distance does not have a significant effect on SSVEP magnitude. On the other hand, Ng et al. (2012) observed a relationship between inter-stimulus distance and classification accuracy. There were methodological differences between the two studies: one directly measured SSVEP amplitude and the other used a classifier. We understand that the design of the visual stimulus has been covered by many studies in the literature. However, these studies were conducted using wet electrodes, which have low impedance. Hence, this paper evaluates two different designs for the visual stimulus: one with large flashing squares and the other with smaller flashing squares. Furthermore, Y. Wang et al. (2006) reported improved performance for SSVEP by using subject-specific electrode placements. Therefore, in this paper, we examine whether the placement of the EEG headset has an effect on the communication rate of the BCI. We are using a commercial and low-cost EEG headset which has a fixed structure with instructions on how to place the headset on the subject's head. However, we investigate whether minor adjustments in the placement, which are generally not discussed by headset manufacturers, have an impact on the communication rate of the BCI. We conduct experiments to investigate this phenomenon.
For practical reasons, we could not visit the individual with motor-impairments to perform extensive tests. Therefore, we conducted experiments with other subjects to investigate parameters before the final deployment.

BCI Design
As BCIs are considerably different from conventional musical interfaces (with respect to communication rates and ease of use), design considerations were taken to allow the person with severe motor-impairments to perform and compose music solely by using the BCI. Due to the restricted communication rate, there needs to be a trade-off between flexibility and time taken for composition. We have developed multiple BCI systems with varying degrees of flexibility for composition. For instance, in one system, the user is allowed to choose from several musical pitches, but not control other musical parameters like rhythm. This provides high flexibility to compose melodies. In another system, the user can choose from a fixed set of musical loops that comprise a variation in multiple musical parameters. However, the choice is limited to the loops available at that time.
The operations carried out by the BCI are categorised into 4 sub-operations, which include providing the visual stimulus, recording EEG signals, and analysing the obtained data.
The next two BCIs enable the user to compose music. The user's compositions are stored on their computer. Violin composer enables the user to create a melody.
As shown in figure 1a, the user can make a choice from four different musical pitches, which are randomly displayed by the computer. If the user wishes to choose a note that is not on the screen, the shuffle option can be chosen, which would display a fresh set of four musical pitches. After a choice is made by the user, it is added to the composition stack. Afterwards, the composition up to that instant of time is played back and a new set of musical pitches is displayed. The range of musical pitches is from G3 to G5. We decided to display a random set of pitches instead of an ordered set so that the user need not wait for very long periods for the "ideal" pitch. Instead, the user can either choose a pitch from the displayed options or shuffle if none of them are preferable. The remove option can be chosen to delete the most recent choice made in the composition. The duration of each note in the composition is constant. A separate configurations programme was developed to allow the user to set the duration of each note and the number of notes occurring in the composition.
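The selection logic described above can be sketched in a few lines of Python. This is only an illustrative model, not the authors' implementation: the class name `MelodyComposer`, its method names, and the assumption that the G3 to G5 range is chromatic (MIDI notes 55 to 79) are all hypothetical.

```python
import random

# Assumption: pitches are chromatic MIDI note numbers from G3 (55) to G5 (79).
PITCH_RANGE = list(range(55, 80))


class MelodyComposer:
    """Hypothetical model of the Violin composer selection logic."""

    def __init__(self, max_notes, seed=None):
        self.max_notes = max_notes   # melody length from the configurations programme
        self.stack = []              # composition stack
        self._rng = random.Random(seed)
        self.options = self._draw()

    def _draw(self):
        # Four distinct pitches, drawn at random rather than in order.
        return self._rng.sample(PITCH_RANGE, 4)

    def shuffle(self):
        # Display a fresh set of four pitches without changing the melody.
        self.options = self._draw()

    def choose(self, index):
        # Append the selected pitch, then display a new set of pitches.
        if len(self.stack) < self.max_notes:
            self.stack.append(self.options[index])
        self.options = self._draw()

    def remove(self):
        # Delete the most recent choice, if there is one.
        if self.stack:
            self.stack.pop()

    def is_complete(self):
        return len(self.stack) == self.max_notes
```

In such a sketch, each classified BCI target would map to one of `choose(0..3)`, `shuffle()`, or `remove()`, and the melody would be played back from `stack` after every choice.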
For Violin loops composer (as shown in figure 1a), the user can choose from musical loops. In this context, a musical loop is a short section of pre-recorded sound material that can be directly used in a composition. Hence, the user can use a musical phrase directly instead of individually entering the notes. This approach was adopted to make the interface more interesting for the user and to address the low communication rates of BCIs. In this BCI, the user navigates through 18 screens to develop an entire composition. The set of musical choices is unique for each screen. In other words, the choices depend on the timeline of the composition. This enables the user to build an entire composition from the start. The play option allows the user to play the composition up to that instant of time and the remove option can be used to delete the most recent choice.

Visual Stimulus Presentation
This paper adopts JFPM (Chen et al., 2015b) to present the visual stimulus. Each flashing region is encoded with 2 unique attributes: phase and frequency. Generally, in SSVEP, the highest amplitude is elicited for low stimulation frequencies (Chen et al., 2014), which is essential for high-impedance headsets. Hence, our system is a 6-target BCI.
In order to utilise hardware-accelerated rendering and vertical synchronisation (VSync), the visual stimulus is implemented with the help of Open Graphics Library (OpenGL). Vertex shader and fragment shader programmes were written in OpenGL shading language (GLSL). The vertex shader specifies the coordinates of the flashing squares and the fragment shader varies the luminance of the region. The luminance is varied according to equation 1 using sinusoidal stimulation (Manyakov et al., 2013).
$$s(f, \phi, t) = \frac{1}{2}\left[1 + \sin(2\pi f t + \phi)\right] \qquad (1)$$
where $f$ is the stimulation frequency, $t$ is the time, and $\phi$ is the phase. Rhodes (2010) explains the importance of frame-rate independence during software development. Frame-rate independence is useful for deploying the BCI on multiple platforms that may have varying refresh rates. Hence, in this programme, the value of $t$ is updated every 1 ms. OpenGL textures were used to load graphical icons of musical notation in the BCI as shown in figure 1b.
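As a rough illustration of sinusoidal stimulation with a time variable that is advanced independently of the display refresh rate, the sketch below evaluates the commonly used form s = 0.5 · (1 + sin(2πft + φ)). The six frequency and phase values are placeholders chosen for the example; the paper's actual JFPM values are not given in this section.

```python
import math

# Hypothetical JFPM layout: the base frequency, frequency step, and phase
# step are illustrative assumptions, not the values used in the paper.
BASE_FREQ_HZ = 6.0
FREQ_STEP_HZ = 1.0
PHASE_STEP_RAD = 0.5 * math.pi
N_TARGETS = 6

TARGETS = [
    (BASE_FREQ_HZ + k * FREQ_STEP_HZ, (k * PHASE_STEP_RAD) % (2.0 * math.pi))
    for k in range(N_TARGETS)
]


def luminance(freq_hz, phase_rad, t_s):
    """Sinusoidal luminance in [0, 1] at time t_s (seconds)."""
    return 0.5 * (1.0 + math.sin(2.0 * math.pi * freq_hz * t_s + phase_rad))


def sample_targets(t_s):
    """Luminance of every flashing region at time t_s."""
    return [luminance(f, p, t_s) for f, p in TARGETS]


# Advance time in 1 ms steps, independent of the monitor refresh rate.
one_second = [sample_targets(step / 1000.0) for step in range(1000)]
```

Because the luminance is a function of elapsed time rather than of a frame counter, the same code yields the same flicker frequencies on displays with different refresh rates, provided the refresh rate is high enough to render them.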

Record EEG
EEG signals were recorded from the head with the help of a customised Quick-20 headset manufactured by Cognionics, Inc. The headset comprises 4 electrodes (Cz, Pz, O1, and O2), as shown in figure 3. In addition to the occipital and parietal electrodes, Cz was added to the set of electrodes because it is commonly chosen as the reference electrode in SSVEP studies (Chen et al., 2015b; Y.-T. Wang et al., 2017). Due to design-for-manufacturing considerations, we were restricted to choosing electrode positions from the 10-20 international system. Therefore, we were unable to choose electrodes like Oz or POz, which might pick up higher amplitudes of SSVEP signals. The sampling rate of the headset is 500 Hz. The headset provides the option to calculate the impedance of electrodes with the help of a carrier wave, which is superimposed on the EEG signals at 125 Hz. Data is transmitted from the headset to the laptop through Bluetooth. A dedicated thread was written to receive data from the headset.
Figure 3: Dry, wireless, and portable EEG headset. (a) Electrode positions of the EEG headset. (b) Components of the EEG headset.
The Cognionics website 1 suggests that for these dry sensors, impedances lie in the range of . The following steps were taken to obtain good signal quality. After the headset was applied, the electrodes were adjusted to obtain an impedance of under . One useful characteristic of these electrodes is that their contact improves with time (this was verified with the obtained impedance values).
Subsequently, the electrodes were lightly pressed against the head at regular intervals of time. In all circumstances, for all electrodes, we were able to obtain an impedance of less than within 10 minutes. After this, the carrier wave was disabled to allow an effective bandwidth of 0 to 125 Hz for the EEG signals.

Data Analysis
Lin et al. (2007) proposed the idea of using canonical correlation analysis (CCA) for SSVEP to improve communication rates of BCIs. CCA is a multivariate statistical technique that quantifies the relation between 2 sets of variables (Härdle & Simar, 2003). With this technique, multiple EEG channels can be used for analysis, which improves the accuracy of BCIs (Bin et al., 2009; Lin et al., 2007). Let $X$ be a matrix comprising samples recorded by the EEG headset, as shown in equation 2.
$$X = \begin{bmatrix} x_1(1) & \cdots & x_1(N) \\ \vdots & \ddots & \vdots \\ x_4(1) & \cdots & x_4(N) \end{bmatrix} \qquad (2)$$
where $x_c(n)$ is the $n$-th sample of EEG channel $c$ and $N$ is the number of EEG samples considered for analysis. For instance, if the analysis window is 2 s, $N$ is 1000 because the sampling rate of the headset is 500 Hz.
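Furthermore, each region in the visual stimulus corresponds to a matrix of reference signals $Y_f$, which consists of sine and cosine signals at the fundamental frequency and its harmonics, as shown in equation 3 (Bin et al., 2009).
$$Y_f = \begin{bmatrix} \sin(2\pi f n / F_s) \\ \cos(2\pi f n / F_s) \\ \vdots \\ \sin(2\pi N_h f n / F_s) \\ \cos(2\pi N_h f n / F_s) \end{bmatrix}, \quad n = 1, \dots, N \qquad (3)$$
where $f$ is the stimulation frequency, $N_h$ is the number of harmonics, $n$ is the sample index of the EEG recording, and $F_s$ is the sampling rate (500 Hz). Chen et al. (2015a) showed that 5 was an optimal value for $N_h$ and therefore, in this study, we set $N_h$ to 5.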
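$X$ and $Y$ are two multi-dimensional variables as defined in equations 2 and 3 respectively. Considering their linear combinations to be $x = X^T W_x$ and $y = Y^T W_y$, CCA finds the weight vectors $W_x$ and $W_y$ such that the correlation between $x$ and $y$ is maximised. The correlation between $x$ and $y$ is defined by equation 4.
$$\rho(x, y) = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2 \sum_{i=1}^{N} (y_i - \bar{y})^2}} \qquad (4)$$
where $x = X^T W_x$ and $y = Y^T W_y$, $\rho(x, y)$ is the correlation between $x$ and $y$, $x_i$ belongs to $x$, $y_i$ belongs to $y$, $N$ is the number of samples, $\bar{x}$ is the mean of $x$, and $\bar{y}$ is the mean of $y$.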

Classification
There are two ways in which CCA can be used for BCIs: with and without weight-calibration. The literature has generally adopted the former technique, where the user trains the system with a few trials before actually using it. Weight vectors are calculated over these trials and then averaged. This requires the user to spend additional time training the system (Bin et al., 2009; Chen et al., 2015a, 2015b). In the second technique, which is without weight-calibration, only the highest correlation value for that specific trial is calculated. The user's choice is the target with the maximum correlation value among all target frequencies, as shown in equation 5 (Bin et al., 2009).
$$\hat{k} = \arg\max_{k} \rho_k, \quad k = 1, \dots, 6 \qquad (5)$$
where $\rho_k$ is the correlation value for each target $k$. The advantage of this technique is that it does not require any time to "train" the system. As we were designing the system for a person with severe motor-impairments, it was not suitable to conduct user-training sessions before using the system. During initial experiments, we tried to conduct user-training sessions with the person. However, the person found it time-consuming and tiring. Furthermore, the training session expects the person to look precisely at specific regions at specified times, which was challenging for the person. Therefore, this paper uses CCA without weight-calibration. CCA calculations were implemented with Eigen, which is a C++ template library for linear algebra.
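To make the calibration-free procedure concrete, the following NumPy sketch builds sine and cosine reference signals for each target, computes the largest canonical correlation with a multi-channel EEG window, and picks the target with the highest value. It is an illustration under stated assumptions rather than the authors' Eigen implementation: the function names are hypothetical, the EEG window is stored as samples × channels (the transpose of the channels × samples layout used in the text), the canonical correlation is obtained through the standard QR and SVD route, and the target frequencies in the usage note are placeholders.

```python
import numpy as np

FS = 500          # sampling rate of the headset (Hz)
N_HARMONICS = 5   # number of harmonics in the reference signals


def reference_signals(freq_hz, n_samples, fs=FS, n_harmonics=N_HARMONICS):
    """Return an (n_samples, 2 * n_harmonics) matrix of sine/cosine references."""
    t = np.arange(1, n_samples + 1) / fs
    columns = []
    for h in range(1, n_harmonics + 1):
        columns.append(np.sin(2 * np.pi * h * freq_hz * t))
        columns.append(np.cos(2 * np.pi * h * freq_hz * t))
    return np.column_stack(columns)


def max_canonical_correlation(x, y):
    """Largest canonical correlation between the columns of x and y.

    Assumes both matrices have full column rank after mean removal.
    """
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    qx, _ = np.linalg.qr(x)
    qy, _ = np.linalg.qr(y)
    singular_values = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return float(min(singular_values[0], 1.0))


def classify(eeg, target_freqs):
    """eeg: (n_samples, n_channels). Returns (best target index, correlations)."""
    rhos = [
        max_canonical_correlation(eeg, reference_signals(f, eeg.shape[0]))
        for f in target_freqs
    ]
    return int(np.argmax(rhos)), rhos
```

For example, with a 2 s window (1000 samples) from the 4 channels and a hypothetical list of six target frequencies, `classify(eeg, [6.0, 7.0, 8.0, 9.0, 10.0, 11.0])` returns the index of the target whose reference signals correlate most strongly with the recording. No weights are stored between trials, which is what removes the need for a training session.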

Relax Time
For Violin, the 6 choices presented on the screen were constant. Hence, a relax time of 1 s was given for the user to shift the gaze from one region to another. For Violin composer and Violin loops composer, the choices presented on the screen were not constant. Thus, the user was given a relax time of 6 s to view and make a musical choice. As this is a prolonged relax time, the user may not expect the squares to start flashing. In order to prevent this, a preparation time of 0.3 s was given after the relax time, during which the icons disappeared and only the borders of the flashing squares were visible. Figure 4 illustrates this process through a flow diagram.

Audio Output
Each BCI choice had one sound file associated with it. The audio output was played during the relax time. All sound files were stored in Waveform Audio File Format (WAVE, commonly known as WAV). Cross-fading was implemented to give smooth transitions between sound samples. Non-musical choices like remove, play, and shuffle had sound files that uttered the corresponding commands. The composition made by the user was stored as a WAV file on the computer, which can be played by any music player.

Design of the Visual Stimulus
Two different stimuli were evaluated in this paper, termed stimulus A and stimulus B. Stimulus A had bigger flashing squares and a smaller inter-stimulus distance, and stimulus B had smaller flashing squares and a bigger inter-stimulus distance. Inter-stimulus distance is defined as the distance between two consecutive flashing squares. Moreover, it is important to note that the size of the flashing squares and the inter-stimulus distance are interdependent. If the stimulus size is increased, the inter-stimulus distance automatically reduces and vice versa. This is because the size of the laptop screen is fixed. The motivations behind exploring different sizes of flashing squares and inter-stimulus distances were: (1) larger flashing squares might elicit higher magnitudes of SSVEP, and (2) a larger inter-stimulus distance would cause less interference between consecutive flashing squares.
In stimulus A, each flashing region was a square of side 7.62 cm (384 pixels). The visual angle subtended by an object is given by
$$V = 2 \arctan\left(\frac{S}{2D}\right)$$
where $V$ is the visual angle, $S$ is the size of the object, and $D$ is the distance between the eye and the object. For our experiments, the user was seated at a distance of approximately 70 cm from the screen and hence, $D$ is equal to 70 cm.
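As a worked example of the visual angle calculation, the sketch below applies the standard relation V = 2·arctan(S / 2D) to the stimulus A square (7.62 cm) at the 70 cm viewing distance. The function name is an illustrative choice.

```python
import math


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by an object of a given size."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# Stimulus A: 7.62 cm square viewed from approximately 70 cm.
angle_a = visual_angle_deg(7.62, 70.0)
print(f"Stimulus A subtends about {angle_a:.2f} degrees")
```

Under these values the square subtends roughly 6.2 degrees of visual angle.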

Placement of EEG Headset
The headset adopted by this paper detects EEG from two electrodes in the occipital region and one electrode in the parietal area. As we are using the same headset for all users (who may have unique head sizes), the electrodes may not reach the required locations on the head. User manuals provided by Cognionics suggested that the ground dry pad sensors are to be placed on the middle region of the forehead. We considered this set-up to be placement A, as shown in figure 5a.
(a) EEG headset deployed according to placement A.
In placement B, the ground dry pad sensors were placed as high as possible on the forehead, while making sure that the sensors were not placed on hair (as shown in figure 5b). In other words, they were placed just below the hairline. Effectively, in placement B, the central, parietal, and occipital electrodes move further back compared to placement A.
Note that the headset has a fixed structure with four electrodes: Cz, Pz, O1, and O2. For placements A and B, we did not adopt any special measures to precisely calculate the position of each electrode with reference to the 10-20 international system, because it was not the goal of the study. Instead, we aim to find out whether minor deviations from the manufacturer's recommendations in placing the EEG headset impact performance. As we are adopting a dry and high-impedance EEG headset, signals may be sensitive to minor adjustments.

Configurations
In order to evaluate the two parameters mentioned above, there were 4 different configurations of the BCI: C1 (placement A and stimulus A), C2 (placement A and stimulus B), C3 (placement B and stimulus A), and C4 (placement B and stimulus B).

Offline Experiment
As mentioned earlier, we were unable to visit the individual with severe motor-impairments to perform extensive tests, for the following reasons. Our laboratory and the individual were located in different cities. Furthermore, most of our time with the person was dedicated to understanding her expectations from the composer system, explaining the BCI system to her, and enabling her carer to set up the system for her independently, without our supervision. Therefore, our BCI system was optimised and tested separately on 6 subjects.
All experimental procedures were approved by the University's ethics committee. 6 healthy subjects (5 males and 1 female) in the age range of 23 to 55 years participated in the experiments. All subjects were experienced with the process of musical composition and had normal or corrected-to-normal vision. The experiment was conducted in a dark room. Subjects were asked to minimise body movements and keep electronic gadgets away. During the experiment, subjects were requested to avoid eye blinks. However, no methods were adopted to detect unintentional eye movements.
The laptop screen was placed approximately 70 cm away from the user. Initially, one of the four configurations was randomly chosen and the system was set up accordingly. Six targets were present on the screen. A green visual cue indicated which target the subject was supposed to look at, which was randomly chosen by the system. Each test case consisted of six appearances (or six trials) of visual cues (that is, one for each target) in random order. After two test cases, subjects were asked to close their eyes and rest for a few minutes. For each configuration, four test cases were acquired from each user. The same procedure was followed for all four configurations and hence, a total of sixteen test cases was obtained from each user.
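The cueing scheme above (six targets, each cued exactly once per test case in random order, four test cases per configuration) can be sketched as follows. The function names and the use of a seeded generator are illustrative assumptions.

```python
import random

N_TARGETS = 6
TEST_CASES_PER_CONFIGURATION = 4


def make_test_case(rng):
    """One test case: every target cued exactly once, in random order."""
    order = list(range(N_TARGETS))
    rng.shuffle(order)
    return order


def make_schedule(seed=None):
    """Cue order for all test cases of one configuration."""
    rng = random.Random(seed)
    return [make_test_case(rng) for _ in range(TEST_CASES_PER_CONFIGURATION)]


schedule = make_schedule(seed=0)
total_trials = sum(len(test_case) for test_case in schedule)
```

With four test cases of six trials each, `total_trials` is 24 per configuration for one subject.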

Data Processing
In EEG recordings, we observed high magnitude under 1 Hz and a peak at 50 Hz. High-pass and notch filters at 4 Hz and 50 Hz respectively were applied to the EEG data. Both were Butterworth filters and zero-phase filtering was performed. The filters were designed in MATLAB and a code generator was used to translate the code to C++ for the JUCE application. In visual-based BCIs, a visual latency of 7 to 15 ms is generally observed (Russo & Spinelli, 1999). Therefore, this study discarded the first 20 ms of EEG data after the onset of the stimulus.

SNR
In order to compare the four configurations, SNR values were calculated. For each EEG channel, fast Fourier transform (FFT) was performed to obtain the amplitude spectrum where is the frequency resolution (0.6 Hz in this paper) and is equal to 6. Among the 4 EEG channels, the maximum SNR value was taken for comparison. The SNR values were averaged over all SSVEP frequencies.

Performance Evaluation
In order to evaluate the performance of the BCI, classification accuracy and ITR were calculated. Accuracy is defined as the ratio of the number of correct predictions to the total number of trials. For example, if the green visual cue highlights the fourth target, the user's brain waves are analysed through CCA and a prediction is made. If the fourth region is predicted by the analysis, then it is a correct prediction. Otherwise, it is a wrong prediction. For each test case, only the order in which the targets occur was randomised. Therefore, within each test case, all six targets are highlighted exactly once.
For each configuration, the total number of trials per subject was 24 (4 test cases × 6 trials).
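The ITR, in bits per minute, is calculated as
$$\mathrm{ITR} = \frac{60}{T}\left[\log_2 M + P \log_2 P + (1 - P) \log_2\left(\frac{1 - P}{M - 1}\right)\right]$$
where $T$ is the trial time in seconds, $M$ is the number of flashing regions, and $P$ is the mean accuracy averaged over all targets. The relax time for Violin is 1 s. For Violin composer and Violin loops composer, the relax time and preparation time add up to 6.3 s. These durations were added to the trial time while calculating ITR.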
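The two performance measures can be computed with a short sketch like the one below. It assumes the standard ITR definition widely used in the BCI literature, 60/T · [log2 M + P·log2 P + (1 − P)·log2((1 − P)/(M − 1))]; the function names are illustrative, and returning 0 for accuracies at or below chance is a convention chosen here, not something stated in the paper.

```python
import math


def accuracy(predicted, cued):
    """Fraction of trials in which the predicted target equals the cued one."""
    correct = sum(1 for p, c in zip(predicted, cued) if p == c)
    return correct / len(cued)


def itr_bits_per_min(p, n_targets, trial_time_s):
    """Information transfer rate in bits per minute.

    p            -- classification accuracy in [0, 1]
    n_targets    -- number of flashing regions
    trial_time_s -- data length plus relax (and preparation) time, in seconds
    """
    if p <= 1.0 / n_targets:
        return 0.0  # assumption: at or below chance level is reported as 0
    bits = math.log2(n_targets)
    if p < 1.0:
        bits += p * math.log2(p)
        bits += (1.0 - p) * math.log2((1.0 - p) / (n_targets - 1))
    return bits * 60.0 / trial_time_s


# Example: 6 targets, 2 s of data plus a 1 s relax time.
example_itr = itr_bits_per_min(0.8889, 6, 2.0 + 1.0)
```

Because the relax and preparation times are part of the trial time, the composer BCIs (6.3 s added per trial) necessarily reach lower ITRs than the performance BCI (1 s added per trial) at comparable accuracy.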

SNR
Two-way repeated measures ANOVA was performed to compare the configurations. The two factors for analysis were the parameters placement and stimulus. Placement had a significant effect on SNR ( ) and stimulus did not have a significant effect on SNR ( ). Figure 6 shows the SNR for different configurations.

Classification Accuracy
For all four configurations, one-way repeated measures ANOVA showed that there were significant differences in accuracy for different data lengths ( ). Figure 7 shows the classification accuracy of the four configurations for different data lengths.
C3 and C4 (placement B) performed better than C1 and C2 (placement A) for all four data lengths -1 s, 2 s, 3 s, and 4 s. Two-way repeated measures ANOVA was performed for these data lengths. The two factors were placement and stimulus. The significance level was set at 0.05.

Composer
C1 and C2 obtained their highest ITRs for a data length of 4 s, which was and . C3 and C4 obtained their highest ITRs for a data length of 3 s, which was and . Figure 8 shows the ITRs of the BCIs for different configurations.

Optimal Configuration
In this paper, C3 and C4 consistently obtained higher ITRs than C1 and C2.
Considering other factors such as increasing the number of flashing regions in future research and smaller stimuli causing less visual fatigue, C4 was chosen as the optimal configuration. Table 2 shows the accuracy and ITR values of individual subjects for both types of BCIs using C4.

Online Experiment
Violin was tested by asking the user to choose all options successively in a clockwise manner, starting from the first option. For Violin composer, subjects were asked to choose the duration of note and length of melody by using the configurations programme. After doing so, they composed a melody. While composing, if the BCI detected a wrong choice or the user did not like the choice after listening to it, the remove option was used. For Violin loops composer, subjects composed an entire song by using the BCI. All songs and melodies composed by using the BCIs were sent to the users after the experiment.
There were no measures taken to calculate accuracy or ITR for the online experiment because of the complex control flow of the BCI. This was due to the following reasons. Generally, accuracy and ITR in BCI spellers are calculated by asking the user to type a specific sentence. However, we asked the users to compose music using the BCI and this cannot be fixed beforehand. Additionally, the remove option in a BCI speller is used if a word was typed incorrectly. However, in our system, the user used the remove option also if they did not like the sound of it, after listening to the composed melody. Due to the above reasons, we were not able to calculate the accuracy and ITR during the online experiments.
As shown in table 3, the composer BCI had a high mean accuracy of 95.83% for an analysis window of 3 s. Accordingly, the feedback from most users was that the BCI generally selected the correct option. We also tested all the BCI systems (Violin, Violin composer, and Violin loops composer) with the person with motor impairments. She successfully used all the musical systems. First, the carer deployed the EEG headset on the person with motor impairments. Subsequently, she was asked which of the three musical systems she would like to play. Although the person with motor impairments could not provide an answer through speech, there were subtle means of communication, such as nodding and smiling, which the carer could understand. Later, the configurations programme was set by the carer, which specified the duration of note and length of melody. Then, the person with motor impairments played and composed music with the BCI system.
As mentioned earlier, most of our time with the person was dedicated to understanding her expectations from the composer system, explaining the BCI system to her, and enabling her carer to set up the system for her independently, without our supervision. Because of these time and funding constraints, we were unable to calculate her ITR with the system. However, her feedback was that the system listened to the choices made by her.

Concluding Discussions
In this paper, we presented a portable high-speed BCI that used a dry, low-density, and wireless EEG headset. It was applied to three musical systems: Violin, Violin composer, and Violin loops composer. It was a bespoke system developed for a person with severe motor-impairments to allow composing music at home. It adopted JFPM (Chen et al., 2015b) and CCA (Bin et al., 2009; Lin et al., 2007) for visual stimulation and EEG data analysis respectively. For relax times of 1 s and 6.3 s, the system obtained an ITR of 37.59 ± 9.86 bits/min and 14.91 ± 2.87 bits/min respectively. The mean accuracies were 88.89 ± 10.09% and 95.83 ± 6.97% respectively. The ITR obtained in this paper surpasses the performance of studies that use dry electrodes (Chi et al., 2011; Mihajlović et al., 2012; Y.-T. Wang et al., 2017). The improvement in performance is attributed to an optimal placement of the EEG headset. However, Xing et al. (2018) obtained a higher ITR than this paper but used a greater number of electrodes. Our BCI's performance can be improved by increasing the number of electrodes and the electrode density in the occipital and parieto-occipital regions.
Furthermore, multiple studies have shown that calibration or user-training improves the performance of BCI systems (Nakanishi et al., 2014;Wong et al., 2020). As we developed a bespoke BCI-based musical system for one user, it would be beneficial to have some user-training sessions in future research. This way, communication rate of the BCI could be boosted.
This paper found that headset placement significantly improved the performance of the BCI. This encourages researchers to explore different headset placements and find an optimal one for their task. If they are adopting dry EEG, minor adjustments in placement may significantly alter communication rates. The guidelines provided by the manufacturer need not be optimal for the placement of the EEG headset. Moreover, in this study, inter-stimulus distance did not have a significant effect. However, we analysed only two different designs for the visual stimulus, and a wider variety of designs needs to be tested to confirm the relationship between inter-stimulus distance and the performance of the BCI.
The optimal configuration for the BCI was C4: placement B and stimulus B. The BCI was successfully delivered to the person with motor-impairments. Video and text-based user manuals were created to enable the carer to set up the BCI. We witnessed the carer set up the BCI and the person use all three musical systems to compose music. The configurations programme was set up by the carer communicating with the individual. Figure 9 is a picture of the person using the BCI. We have been in touch with the person, and she regularly composes music using the BCI. RESEARCH by g.tec, to name but a few, can be adopted for our BCI-based musical systems. In this paper, we ensured that the impedance of the electrodes was under . Therefore, if a reader wishes to reproduce this work on another commercial-grade EEG headset, the impedance needs to be below to ensure reasonable signal quality. Furthermore, researchers have proposed alternative stimulation paradigms like steady-state motion visual evoked potential (SSMVEP) (Yan et al., 2017), which might be more convenient due to the absence of flashing stimuli and more interesting for musical systems.

Disclosure statement
The authors report no conflict of interest.

Funding
This study was partially funded by the School of Humanities and Performing Arts, University of Plymouth and GREY Advertising, London.