Speech-ABRs in cochlear implant recipients: feasibility study.

Abstract Objective: The aim of this study was to assess the feasibility of recording speech-ABRs from cochlear implant (CI) recipients, and to remove the artefact using a clinically applicable single-channel approach. Design: Speech-ABRs were recorded to a 40 ms [da] presented via loudspeaker using a two-channel electrode montage. Additionally, artefacts were recorded using an artificial-head incorporating a MED-EL CI with stimulation parameters as similar as possible to those of three MED-EL participants. A single-channel artefact removal technique was applied to all responses. Study sample: A total of 12 adult CI recipients (6 Cochlear Nucleus and 6 MED-EL CIs). Results: Responses differed according to the CI type. Artefact removal resulted in responses containing speech-ABR characteristics in two MED-EL CI participants; however, it was not possible to verify whether these were true responses or were modulated by artefacts. Artefact removal was successful from the artificial-head recordings. Conclusions: This is the first study that attempted to record speech-ABRs from CI recipients. Results suggest that there is a potential for application of a single-channel approach to artefact removal. However, a more robust and adaptive approach to artefact removal that includes a method to verify true responses is needed.


Introduction
Current clinical outcome measures to assess speech detection or speech-in-noise performance in cochlear implant (CI) recipients are all behavioural measures that require participation and behavioural responses from the CI recipient (e.g. Arthur Boothroyd monosyllabic words (Boothroyd 1968), Bamford-Kowal-Bench sentences (Bench, Kowal, and Bamford 1979), Hearing in Noise Test (Nilsson, Soli, and Sullivan 1994)). These same outcome measures are also used in CI research (e.g. Vickers et al. 2009; Heywood et al. 2016). Such behavioural measures cannot be used with infants or with some children and adults with additional needs (e.g. developmental delay, intellectual disabilities or other disabilities). Therefore, there is a need to develop an objective measure that does not require behavioural responses from the CI recipient.
Current and past work has focussed on auditory late (cortical) evoked responses (ALR) as an outcome measure of detection and discrimination of sounds in adults and children with CIs (e.g. Sharma, Dorman, and Spahr 2002; Zhang et al. 2010). Although ALRs have been successfully recorded in adults and children with CIs (e.g. Gordon, Wong, and Papsin 2013; Visram et al. 2015), ALRs as objective outcome measures have not yet transitioned to standard clinical practice. This may be owing to several factors relating to the physiological characteristics of ALRs such as (i) ALRs do not mature and become adult-like until around the age of 16-18 years; (ii) ALRs are influenced by attention and state of arousal where the response is more pronounced when the subject is attending to the stimulus compared to when not attending to the stimulus, which makes the response more variable within and across subjects and across test sessions and (iii) ALRs are influenced by sedation and anaesthesia (Hall 2015). An alternative to ALRs are auditory brainstem responses (ABR), which have some advantages over ALRs such as (i) they mature early in life; (ii) they can be reliably measured in infants, children, and adults with additional needs; (iii) they are more consistent than ALRs within and across subjects and across test sessions and (iv) they are not influenced by attention, state of arousal or sedation and anaesthesia (Hall 2015). ABRs can be recorded in response to short consonant-vowel speech stimuli (speech-ABR). Speech-ABR studies on normal-hearing adults and children have shown that speech-ABRs (i) follow the spectral and temporal features of the stimulus used to evoke them, and therefore may be a good candidate for an objective outcome measure of brainstem speech encoding (e.g. Johnson, Nicol, and Kraus 2005; Kraus and Nicol 2005; Chandrasekaran and Kraus 2010; BinKhamis et al. 2019); (ii) are recordable from infancy to older adulthood and repeatable within and across sessions (e.g. Song, Nicol, and Kraus 2011a; Hornickel, Knowles, and Kraus 2012; Skoe et al. 2015) and (iii) are affected by the addition of background noise, which results in reduced response amplitude, delayed response timing, and overall degradation in response quality. These changes have been linked to performance on behavioural measures where individuals with more degradation in their speech-ABRs (e.g. smaller amplitudes, delayed timing) due to background noise performed worse on behavioural speech-in-noise measures, suggesting that speech-ABRs may be a good candidate for an objective measure of behavioural speech-in-noise performance (e.g. Anderson et al. 2011; Parbery-Clark, Strait, and Kraus 2011; Song, Skoe, Banai and Kraus 2011b; BinKhamis et al. 2019).
To date, there is no published literature on speech-ABRs in CI recipients. This is likely owing to the large electrical artefact that is generated from the CI that may obscure the speech-ABR. Several CI artefact removal techniques have been reported in the literature; however, these techniques were only applied to ALRs and auditory steady-state responses (ASSR) but were not attempted for speech-ABRs. One of the commonly used CI artefact removal techniques is Independent Component Analysis (ICA). ICA has been successfully used in CI research to remove the CI artefact from acoustically evoked ALRs (e.g. Viola et al. 2011; Bakhos et al. 2012; Miller and Zhang 2014; Sandmann et al. 2015) and from electrically evoked cortical (40 Hz) ASSRs (e.g. Deprez et al. 2018). Other artefact removal techniques that have been successfully applied in CI research to electrically evoked responses include subtraction (e.g. Friesen and Picton 2010) that was applied to ALRs, linear interpolation (e.g. Gransier et al. 2016; Deprez et al. 2017a) and template subtraction (e.g. Deprez et al. 2017b) that were applied to ASSRs. In terms of acoustically evoked responses, ICA appears to be the most commonly used method to remove the CI artefact. However, ICA requires a large number of recording electrodes and complex signal processing that is currently not feasible to apply in a clinical setting. Additionally, it is generally recommended to use a short stimulus in order for the CI artefact not to overlap in time with the response (Gilley et al. 2006). However, using a short stimulus that does not overlap with the response cannot be applied to speech-ABRs due to the early occurrence of the response. McLaughlin et al. (2013) reported a CI artefact removal technique that they applied to acoustically evoked ALRs recorded with one channel (i.e. three electrodes) in response to long tone bursts (100, 300 and 500 ms) that overlap in time with the ALR. This technique, described in detail in Supplement 2, Section 1, is promising as it may potentially be applied in a clinical setting.
The aim of this study was to assess the feasibility of applying the McLaughlin et al. (2013) artefact removal technique on speech-ABRs recorded to the 40 ms [da] from adult CI recipients.

Participants
Twelve post-lingual adult CI recipients (age: mean = 49.08 years, SD = 6.43, range = 39-60, 9 males, 6 with Cochlear Nucleus CIs and 6 with MED-EL CIs) participated in this study (see Supplement 1 for participant information). All participants provided written informed consent. This study was ethically approved by National Health Services, England (IRAS ID: 226216). Participants were compensated for time and travel expenses. CI recipients from the Manchester Auditory Implant Centre were invited to participate in this study; letters were posted to potential participants and the first 12 that expressed interest were enrolled.

Equipment
Speech-ABRs were collected with Cambridge Electronic Design Ltd. (CED, Cambridge, UK) "Signal" software version 5.11 using a CED power 1401 mkII data acquisition interface (Cambridge Electronic Design Ltd.) and a Digitimer 360 isolated 8-channel patient amplifier (Digitimer Limited, Hertfordshire, UK). The 40 ms [da] (described in Banai et al. 2009) was presented from the "Signal" software through the power 1401 mkII and routed through a Tucker-Davis Technologies (TDT, Alachua, FL) PA5 programmable attenuator and a TDT HB7 Headphone Driver to a Fostex Personal Monitor 6301B loudspeaker (Fostex Company, Foster Electric Co. Ltd., Tokyo, Japan). The loudspeaker was positioned at 45° azimuth, 1.1 metres away from the speech processor microphone. The stimulus was calibrated in dB-A using a Brüel and Kjær type 2250 (Brüel and Kjær, Nærum, Denmark) sound level meter.

Recording parameters
"Signal" software sampling configuration was set to gap-free sweep mode, a sample rate of 125,000 Hz (used by McLaughlin et al. 2013), and pulses with a resolution of 0.01 ms as the output type; outputs were set at absolute levels and absolute times. Online second-order Butterworth filtering was 100 Hz (high-pass filter) and 3000 Hz (low-pass filter). Filter settings were based on 40 ms [da] speech-ABR literature (e.g. Anderson et al. 2013; Skoe et al. 2015); the 100 Hz high-pass filter was used to ensure cortical response components were excluded from the recordings. The stimulus was presented in sound field at 70 dB-A at a rate of 9.1 stimuli per second. Stimulus polarity was reversed using Adobe Audition CC (2015.1, build 8.1.0.162) to evoke speech-ABRs using two opposite stimulus polarities and remove loudspeaker stimulus artefacts (recommended by Skoe and Kraus 2010). Two blocks of 2500 epochs at each stimulus polarity were collected for each participant, for a total of 10,000 epochs. Online artefact rejection was switched off due to the large CI electrical artefact. A two-channel vertical electrode montage with Cz active, earlobe reference (A1 and A2), and high forehead ground (Fz) was used; electrode sites were based on the international 10-20 EEG system. Two-channel recording was used in order to evaluate artefact removal from the contralateral channel (opposite to CI) using the estimated artefact from the ipsilateral channel (same side as CI) versus the estimated artefact from the contralateral channel. The CI artefact is expected to be larger in the ipsilateral channel; therefore, estimating the artefact from this channel and using it to remove the artefact from contralateral channel responses may be more successful than estimating the artefact from the contralateral channel.

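As a rough illustration of the recording parameters above, the following Python sketch works out the total epoch count, the number of samples between stimulus onsets and the approximate stimulation time, and then shows on synthetic data why averaging responses to opposite stimulus polarities cancels a component that inverts with the stimulus. The function names, the toy waveforms and their amplitudes are assumptions made for illustration only; they are not taken from the study's processing code, and pauses between blocks are ignored.

```python
import math

# Values stated in the text.
SAMPLE_RATE_HZ = 125_000
STIM_RATE_HZ = 9.1
BLOCKS_PER_POLARITY = 2
EPOCHS_PER_BLOCK = 2500
N_POLARITIES = 2


def recording_plan():
    """Return (total epochs, samples per onset-to-onset period, minutes of stimulation)."""
    total_epochs = BLOCKS_PER_POLARITY * EPOCHS_PER_BLOCK * N_POLARITIES
    samples_per_period = round(SAMPLE_RATE_HZ / STIM_RATE_HZ)
    minutes = total_epochs / STIM_RATE_HZ / 60.0
    return total_epochs, samples_per_period, minutes


def average_polarities(avg_original, avg_inverted):
    """Sample-wise mean of the averaged responses to the two stimulus polarities."""
    if len(avg_original) != len(avg_inverted):
        raise ValueError("polarity averages must have the same length")
    return [(a + b) / 2.0 for a, b in zip(avg_original, avg_inverted)]


# Toy demonstration (hypothetical waveforms, arbitrary units).
n = 500
# Component that keeps its sign when the stimulus is inverted.
neural = [math.sin(2 * math.pi * 3 * i / n) for i in range(n)]
# Component that flips sign with the stimulus, e.g. loudspeaker stimulus artefact.
stim_artefact = [5.0 * math.sin(2 * math.pi * 40 * i / n) for i in range(n)]

avg_original = [s + a for s, a in zip(neural, stim_artefact)]
avg_inverted = [s - a for s, a in zip(neural, stim_artefact)]
combined = average_polarities(avg_original, avg_inverted)

if __name__ == "__main__":
    total, samples, minutes = recording_plan()
    print(f"{total} epochs, {samples} samples per period, about {minutes:.1f} min")
    residual = max(abs(c - s) for c, s in zip(combined, neural))
    print(f"largest residual of the inverting component: {residual:.2e}")
```

The cancellation only applies to components that invert with the stimulus. A CI artefact that follows the stimulus envelope would not necessarily invert, which is consistent with the need for a separate artefact removal step.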
Participant preparation and recording environment
Skin at electrode sites was prepared using Nuprep Skin Prep Gel. Ag/AgCl 10 mm disposable disc electrodes were placed on prepared sites with Ten20 Conductive EEG paste. Electrode impedances were below 3 kΩ; impedances between electrodes were balanced and below 1 kΩ. Impedance balancing <1 kΩ was essential to reduce the CI artefact (McLaughlin et al. 2013).
Participants lay in a comfortable recliner in a double-wall soundproof booth, and were instructed to remain relaxed with their eyes closed to reduce myogenic artefacts and eye blinks.
Recordings were conducted with the participant's own CI speech processor using their everyday settings.

Speech-ABR processing and artefact removal
Speech-ABR processing and artefact removal were performed offline using MATLAB R2015a (MathWorks, Natick, MA). The artefact removal method was based on McLaughlin et al. (2013); it was attempted at different low-pass (LP) filter cut-off frequencies and with artefacts estimated from either the ipsilateral or the contralateral channel (details in Supplement 2, Section 1). Responses pre- and post-artefact removal were visually inspected for waveform morphology (whether they were consistent with speech-ABRs to the 40 ms [da] containing an onset response, envelope following response, and offset response), response amplitude (whether they were the appropriate size for brainstem responses; amplitudes will be called "amplitude-BAR" before processing and artefact removal, and "amplitude-PAR" post artefact removal) and artefact (whether it was high or low in frequency).
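The effect of trying different LP cut-off frequencies can be sketched in Python as follows. This is only an illustration of the low-pass stage on a synthetic trace: the artefact estimation step of McLaughlin et al. (2013) is described in Supplement 2 and is not reproduced here. The filter order, the forward-backward (zero-phase) application, the 1000 and 500 Hz cut-offs, the synthetic 100 Hz and 2000 Hz components and all function names are assumptions; only the 125,000 Hz sample rate and the 300 Hz cut-off come from the text.

```python
import math


def butter2_lowpass(x, cutoff_hz, fs_hz):
    """Single pass of a second-order Butterworth low-pass (bilinear-transform biquad)."""
    w0 = 2.0 * math.pi * cutoff_hz / fs_hz
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / math.sqrt(2.0)  # Q = 1/sqrt(2) gives a Butterworth response
    a0 = 1.0 + alpha
    b0 = (1.0 - cos_w0) / (2.0 * a0)
    b1 = (1.0 - cos_w0) / a0
    b2 = b0
    a1 = -2.0 * cos_w0 / a0
    a2 = (1.0 - alpha) / a0
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def zero_phase_lowpass(x, cutoff_hz, fs_hz):
    """Filter forward, then backward, so that peak latencies are not shifted."""
    forward = butter2_lowpass(x, cutoff_hz, fs_hz)
    backward = butter2_lowpass(forward[::-1], cutoff_hz, fs_hz)
    return backward[::-1]


def peak_to_peak(x):
    return max(x) - min(x)


FS_HZ = 125_000                      # sample rate stated in the text
n_samples = round(0.070 * FS_HZ)     # hypothetical 70 ms analysis window
t = [i / FS_HZ for i in range(n_samples)]

# Synthetic trace: small slow component plus a large high-frequency component
# standing in for high-frequency artefact (amplitudes are made up).
trace = [math.sin(2 * math.pi * 100 * ti) + 20.0 * math.cos(2 * math.pi * 2000 * ti)
         for ti in t]

p2p_before = peak_to_peak(trace)
p2p_after = {fc: peak_to_peak(zero_phase_lowpass(trace, fc, FS_HZ))
             for fc in (1000, 500, 300)}  # 300 Hz is from the text, the others are hypothetical

if __name__ == "__main__":
    print(f"before filtering: {p2p_before:.2f}")
    for fc, value in p2p_after.items():
        print(f"LP {fc} Hz: {value:.2f}")
```

A low-pass stage of this kind can only remove artefact energy above the cut-off. Any artefact component that follows the stimulus envelope at low frequencies passes through it and has to be handled by the subsequent artefact estimation and subtraction.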

Results
Before artefact removal, responses were generally excessively large in amplitude-BAR (µV) and contained high-frequency and low-frequency artefacts. Applying artefact removal at different LP filter cut-off frequencies resulted in a slight reduction in response amplitude-PAR with decrease in cut-off frequency in some participants but not in others. The 300 Hz cut-off frequency resulted in the largest amplitude-PAR reduction and removal of high-frequency artefacts. Additionally, estimating artefacts from the ipsilateral channel did not have any advantage over estimating artefacts from the contralateral channel and in some cases resulted in responses that were larger in amplitude-PAR (artefact removal examples at different LP cut-offs in Supplement 2, Section 2). Therefore, responses were evaluated using estimated artefacts from the same contralateral channel at 300 Hz cut-off frequency.
There were clear differences in responses depending on the CI type and processing strategy. Responses from participants with Cochlear Nucleus 22 CIs (CI01 and CI04) using SPEAK bipolar (BP) processing were smallest in amplitude-BAR and contained mainly a low-frequency artefact that resembled the stimulus envelope. Artefact removal resulted in minimal changes in responses and these responses were likely artefacts rather than speech-ABRs. Responses from participants with Cochlear Nucleus 24 CIs using ACE monopolar (MP) processing were large in amplitude-BAR and contained mainly a low-frequency artefact that followed the stimulus envelope with some superimposed high-frequency artefact. Artefact removal had a limited effect on response amplitude-PAR in two participants (CI05 and CI10), and responses after artefact removal from the other two participants (CI06 and CI12) did not resemble speech-ABRs. Responses from participants with MED-EL CIs using FS4 MP processing were large in amplitude-BAR with an excessive high-frequency artefact. These responses had the greatest reduction in amplitude-PAR after artefact removal compared to participants with Cochlear Nucleus CIs. After artefact removal, responses from two MED-EL participants (CI07 and CI08) contained speech-ABR characteristics combined with residual artefact (see Figure 1 for CI08). However, it was difficult to conclude whether they were true responses as speech-ABR amplitude-PARs were larger than amplitudes previously reported for normal-hearing individuals (e.g. BinKhamis et al. 2019). Also, latencies were on average 0.51 ms (CI07) and 1.72 ms (CI08) earlier than the peaks in the estimated artefact, making them similar (but not identical) to the estimated artefact (see Figure 2 for CI08). Responses from two participants (CI02 and CI09) did not resemble speech-ABRs, while responses from the other two participants (CI03 and CI11) were large in amplitude-PAR and did not resemble speech-ABRs (remaining participant figures in Supplement 3, Section 1). In an attempt to verify whether artefact removal was successful for CI07 and CI08, Wagner et al. (2018) were approached and agreed to record artefacts based on participant MAPs. Artefacts based on the closest possible approximation to CI MAPs from three MED-EL participants (CI07, CI08 and CI09) were recorded using an artificial-head (details in Supplement 3, Section 2). The same artefact removal was applied to these recorded artefacts and the resulting waveforms contained minimal residual artefact (see Figure 3 for one example, and Supplement 3, Section 2, for the other two). This suggests that it is possible that detected responses in CI07 and CI08 were true responses; however, estimated artefacts from these recordings were smaller in amplitude than those recorded from participants (e.g. Figure 1(C) versus Figure 3(C)).
Figure 1. Contralateral responses (two sub-averages) from CI08 (MED-EL Concerto Flex 28 - FS4) showing (A) response before processing (prior to filtering and artefact removal) that is very large in amplitude-BAR containing an excessive high-frequency artefact, (B) LP filtered response (300 Hz) that is smaller in amplitude than (A) with no high-frequency artefact, (C) artefact that was estimated from the LP filtered response and (D) final response after artefact removal that is much smaller in amplitude-PAR than (A) and (B) and contains speech-ABR characteristics combined with residual artefact. *Note that in order for the response to be visible in all sub-plots, the y-axis scale is not equal across sub-plots.
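The latency comparison reported above (peaks in the response on average 0.51 ms or 1.72 ms earlier than peaks in the estimated artefact) amounts to a mean of paired latency differences. A minimal Python sketch of that arithmetic follows; the peak latencies in it are invented placeholders, not values read from the figures, and the function name is hypothetical.

```python
def mean_latency_lead(response_peaks_ms, artefact_peaks_ms):
    """Mean of (artefact latency - response latency) over paired peaks.

    A positive value means the response peaks occur earlier than the
    corresponding peaks in the estimated artefact.
    """
    if len(response_peaks_ms) != len(artefact_peaks_ms):
        raise ValueError("peaks must be paired one-to-one")
    if not response_peaks_ms:
        raise ValueError("at least one peak pair is required")
    diffs = [a - r for r, a in zip(response_peaks_ms, artefact_peaks_ms)]
    return sum(diffs) / len(diffs)


# Hypothetical paired peak latencies in ms (illustration only).
response_peaks = [7.0, 23.0, 31.5, 40.0, 48.5]
artefact_peaks = [8.5, 25.0, 33.0, 42.0, 50.0]

lead_ms = mean_latency_lead(response_peaks, artefact_peaks)

if __name__ == "__main__":
    print(f"response peaks lead artefact peaks by {lead_ms:.2f} ms on average")
```

A small average lead does not by itself separate a neural response from residual artefact, so a summary of this kind is descriptive rather than a verification method.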

Discussion
The aim of this study was to assess the feasibility of recording speech-ABRs from CI recipients using a clinically applicable recording setup combined with an artefact removal technique that has been applied to ALRs recorded using one channel and elicited with a longer stimulus that overlapped in time with the ALR (McLaughlin et al. 2013).
Results from this study showed that responses had different characteristics depending on the CI type and processing strategy. First, Cochlear Nucleus 22 CIs (SPEAK, BP) resulted in responses that resembled the 40 ms [da] envelope and had smallest amplitude-BAR compared to both Cochlear Nucleus 24 (ACE and MP) and MED-EL (FS4 and MP) CIs. It is not unexpected that BP would result in responses with smaller amplitude-BAR given that the reference electrode is intra-cochlear, which would restrict current flow within the cochlea, as opposed to MP processing where the reference electrode is extra-cochlear (located in the implant receiver/stimulator), and current flows between the intra-cochlear and extra-cochlear electrodes. Li et al. (2010) also found that CI artefacts were smaller in BP versus MP processing when they recorded CI artefacts from an implanted chicken carcase. Although responses from participants with Cochlear Nucleus 22 implants were the appropriate amplitude-BAR and amplitude-PAR plus resembled speech-ABRs, it was difficult to conclude whether responses were artefacts or brainstem responses. CI artefacts from Cochlear Nucleus CIs follow the stimulus envelope (Wagner et al. 2018), which would result in waveforms that may resemble speech-ABRs. Given the smaller amplitude CI artefact (Li et al. 2010) and that CI artefacts from Cochlear Nucleus CIs follow the stimulus envelope (Wagner et al. 2018), it is more likely that responses recorded from participants with Cochlear Nucleus 22 CIs are artefacts rather than speech-ABRs. Second, Cochlear Nucleus 24 CIs (ACE and MP) resulted in responses that followed the 40 ms [da] envelope. However, they were much larger in amplitude-BAR than those recorded from Cochlear Nucleus 22 CIs. Artefact removal resulted in either minimal changes in responses or removal of artefacts with no residual speech-ABRs. 
Both SPEAK and ACE processing strategies transmit the stimulus envelope without its fine structure; this may explain the difficulty in extracting speech-ABRs from artefacts, especially since artefacts would resemble speech-ABRs. Also, pitch is indirectly transmitted through these CIs via temporal modulations of envelope cues (Stickney et al. 2007). This indirect pitch encoding may not be sufficient to evoke speech-ABRs, as was the case in two participants with no residual speech-ABRs after artefact removal. Finally, MED-EL CIs (FS4 and MP) resulted in responses that were very large in amplitude-BAR with an excessive high-frequency artefact that was not present in Cochlear Nucleus CI responses. This may be explained by the higher stimulation rate used by MED-EL CIs compared to Cochlear Nucleus CIs (McLaughlin et al. 2013). Artefact removal resulted in removal of most of the artefact with either potentially detectable speech-ABRs or with no residual speech-ABRs, or in responses that were too large in amplitude-PAR to be speech-ABRs. These potentially detectable speech-ABRs were larger in amplitude-PAR than in normal-hearing individuals. However, large electric-ABR amplitudes (≥2 to 3 µV) have been reported in CI recipients (e.g. Brown et al. 2000; Firszt, Chambers, and Kraus 2002a; Firszt et al. 2002b). Also, larger electric-ABR compared to acoustic-ABR amplitudes have been shown in humans and in cats (e.g. van den Honert and Stypulkowski 1986; Gamgebeli et al. 2010), suggesting that our larger amplitude-PARs may be true responses. With regards to the earlier (but similar) latencies between detected peaks and peaks in the estimated artefact, timing differences of responses from different brainstem structures are typically short; e.g. the difference in latency between click-ABR peak III (generated from the Superior Olivary Complex) and V (generated from the Lateral Lemniscus and Inferior Colliculus) is around 2 ms (Hall 2015). Therefore, a 2 ms difference is considered large enough to generate distinct brainstem responses. This suggests that the latency difference between detected peaks and estimated artefacts (at least in CI08, average 1.72 ms) may be large enough to consider the detected peaks true responses. However, this does not rule out the possibility that these were modulated by artefacts. Also, amplitude-PARs for CI08 were larger than those for CI07, following the same trend as the estimated artefact that was larger in CI08 than CI07. The FS4 processing strategy transmits the stimulus envelope and its fine structure (including pitch). This pitch encoding may be sufficient to evoke speech-ABRs, though it may also cause "stimulation bursts" in the recordings resembling the temporal fine structure of the stimulus that remained after artefact removal, resulting in responses containing speech-ABR characteristics. However, when artefact removal was applied to the three artefacts recorded by Wagner et al. (2018) using an artificial-head, artefact removal was successful and consequently the obtained traces did not contain speech-ABR characteristics, indicating that these "stimulation bursts" were not present after artefact removal. Thus it is possible that true speech-ABRs were detected in CI07 and CI08. Nonetheless, true speech-ABR detection cannot be confirmed with certainty given that (i) estimated artefacts from participant recordings were much larger in amplitude than those estimated from artificial-head recordings; (ii) latencies between detected peaks and estimated artefact peaks from participant recordings were similar and (iii) speech-ABR amplitude-PARs followed the same trend as the estimated artefact amplitudes.
Figure 2. CI08: (A) Estimated artefact (as in Figure 1(C)) with axes adjusted for better visualisation (x-axis from 0 to 70 ms). Triangles point to major peaks that are closest in latency to detected peaks in (B). Latencies are marked for each peak. Amplitudes (µV) are as follows: 68.67 (first positive to first negative peak); negative peaks to preceding positive peak = 62.25, 76.73 and 66.98, respectively. (B) Response after artefact removal (as in Figure 1(D)) with axes adjusted for better visualisation (x-axis: 0-70 ms, y-axis: −1.8 to 1.8 µV). Triangles point to potential speech-ABR peaks V, A, D, E, F and O, respectively. Latencies are labelled for each peak. Amplitude-PARs (µV) are as follows: positive V to negative A = 0.96, D = 3.13, E = 1.12, F = 2.31 and O = 1.19 (measured as preceding positive peak to negative peak).
It is likely that the artefact removal technique was generally unsuccessful due to the complexity of the 40 ms [da] stimulus used in this study compared to the tone bursts used by McLaughlin et al. (2013). The 40 ms [da] would result in a more complex CI artefact that is more difficult to resolve from the response. Also, brainstem response amplitudes are smaller than cortical response amplitudes as shown in 80 Hz brainstem ASSRs versus 40 Hz cortical ASSRs in normal-hearing individuals (e.g. Picton et al. 2003), and in brainstem versus cortical responses in individuals with CIs (e.g. Makhdoum et al. 1998). These smaller amplitude brainstem responses may be more difficult to extract from the artefact. Additionally, although electrical stimulation leads to increased phase-locking in the cats' auditory nerve (e.g. Hartmann, Topp, and Klinke 1984; van den Honert and Stypulkowski 1984), phase-locking to electrical stimulation appears to be reduced in the cats' brainstem (e.g. Clark 1969; Middlebrooks and Snyder 2010), potentially affecting measurability of speech-ABRs in some CI recipients. Another issue may be that the different CI pitch/fine structure processing strategies may be influencing the capacity of the CI to evoke speech-ABRs. Alternatively, it may be that artefact removal is also removing or masking some of the neural response, resulting in the response becoming undetectable.
Figure 3. Artefact recorded based on a close approximation to CI08's MAP showing (A) artefact before processing (prior to filtering and artefact removal) that is very large in amplitude-BAR containing an excessive high-frequency artefact, (B) band-pass filtered artefact (100-300 Hz) that is smaller in amplitude than (A) with no high-frequency artefact, (C) artefact that was estimated from (B), and (D) final response after artefact removal that is much smaller in amplitude-PAR than both (A) and (B), suggesting that most of the artefact has been removed. *Note that in order for the response to be visible in all sub-plots, the y-axis scale is not equal across sub-plots.
In summary, extending the artefact removal technique developed by McLaughlin et al. (2013) to speech-ABRs collected in this study was potentially successful on responses from two of 12 participants. These two participants both had MED-EL CIs. Additionally, artefact removal appeared successful when it was applied to three MED-EL artefacts recorded using an artificial-head. This indicates that this method of artefact removal, or an adaptive version of it, may be suitable for use with some MED-EL CI recipients.

Conclusions
To the best of our knowledge, this is the only study that attempted to record speech-ABRs from CI recipients. Results from this study show that it remains challenging to remove the CI artefact from speech-ABRs recorded using two channels. However, successful artefact removal from three MED-EL recordings made with an artificial-head, and potentially successful artefact removal from two of our MED-EL participants, suggest that an expansion of the single-channel approach proposed by McLaughlin et al. (2013) may potentially be applicable, with the development of a more robust and adaptive method to model the CI artefact in order to be able to remove each participant-specific CI artefact from their brainstem responses. In addition, there is a need to develop methods to verify that responses after artefact removal are not modulated by artefacts. Thus, more work is needed on CI artefact removal techniques for speech-ABRs.