A portable Raspberry Pi-based system for diagnosis of heart valve diseases using automatic segmentation and artificial neural networks

Abstract This study proposes a Raspberry Pi-based system for the diagnosis of heart valve diseases as a primary tool to improve the diagnostic accuracy of physicians. The proposed system is able to detect and classify nine common valvular heart cases, encompassing eight types of heart valve diseases as well as the normal case of valves. The design and development of the proposed system are divided into two phases: development of a disease classification approach, and design and implementation of the diagnostic hardware system. The developed disease classification approach comprises five stages, namely obtaining phonocardiogram (PCG) signals, preprocessing, segmentation using a proposed automatic algorithm, feature extraction in three domains (time, frequency, and wavelet decomposition domains), and classification using a backpropagation neural network. The hardware of the diagnostic system consists of a PCG signal acquisition module connected to a processing and displaying unit, which is represented by a Raspberry Pi connected to a touch screen. The developed disease classification approach is implemented in the software of the Raspberry Pi, enabling it to detect the diseases in real time and fully automatically. The proposed system was clinically tested on 50 real subjects encompassing the nine cases. The diagnostic system achieved an accuracy of 96%, a sensitivity of 95.23%, and a specificity of 100%.


PUBLIC INTEREST STATEMENT
According to the World Health Organization (WHO), cardiovascular diseases are the number one cause of death globally, taking an estimated 17.9 million lives each year. Many valve-related cardiac diseases can be detected easily and at low cost by analyzing the phonocardiogram (PCG) signal, i.e. the heart sound signal. Therefore, automatic analysis of the PCG signal is used to improve the diagnosis of heart valve diseases, especially in rural health-care clinics where neither experienced physicians nor Doppler-echocardiography equipment might be available. The goal of this work is to provide a primary diagnostic tool for heart valve diseases. Thus, in this paper, a Raspberry Pi-based system is proposed for the diagnosis of nine common valvular heart cases. The proposed system is portable, easy to use, and can provide the diagnosis result immediately.

Introduction
Valvular heart disease is caused by either damage or a defect in one of the four heart valves: aortic, mitral, tricuspid, or pulmonary. Defects in these valves can be congenital or acquired (Kameswari et al., 2010; Zeng et al., 2016). Treatment of damaged valves may involve medication alone, but often involves surgical valve repair or replacement (insertion of an artificial heart valve) (Amirjani et al., 2014; Cabrera et al., 2017; Rick et al., 2014). Stenosis and regurgitation are the conditions associated with valvular heart disease. Stenosis describes a narrowing of the valve opening that prevents adequate outflow of blood. Regurgitation describes the valve's inability to prevent backflow of blood because the leaflets of the valve fail to close completely. In general, heart valve diseases include eight common classes, namely aortic stenosis, aortic regurgitation, mitral stenosis, mitral regurgitation, pulmonary stenosis, pulmonary regurgitation, tricuspid stenosis, and tricuspid regurgitation (Rick et al., 2014; Zeng et al., 2016). Doppler-echocardiography is today a well-established tool in the diagnosis of heart valve diseases, but it is expensive. On the other hand, auscultation (analyzing cardiac sounds) is an inexpensive technique commonly used by physicians for diagnosis. It is simple and effective; however, it requires long-term training and expertise (Singh et al., 2017). Therefore, many studies have been conducted toward designing systems based on the digital analysis of the phonocardiogram (PCG) signal in order to improve the diagnostic accuracy of physicians. In the field of PCG-based heart valve disease diagnosis, most studies deal with computer-based systems that can diagnose only a few valvular heart cases.
Systems are devised in (Ahmad, 2011; Grzegorczyk et al., 2016; Hofmann et al., 2016) to interpret the condition of heart valves as normal or abnormal without further classifying the abnormal ones, while in (Emre & Uguz, 2011; Uğuz, 2012), the valvular heart condition is interpreted as one of three cases (normal, mitral stenosis, pulmonary stenosis). Furthermore, (Noman et al., 2018) presents a novel system to diagnose four valvular heart cases (normal, aortic regurgitation, mitral stenosis, mitral regurgitation), whereas in (Safara et al., 2013; Suboh et al., 2008; Suhas et al., 2017), five valvular heart cases (normal, aortic stenosis, aortic regurgitation, mitral stenosis, mitral regurgitation) are diagnosed. In (Kumar et al., 2018), a system is devised to diagnose five heart valve diseases (aortic stenosis, aortic regurgitation, mitral stenosis, mitral regurgitation, pulmonary stenosis). Across the aforementioned approaches, the maximum number of diagnosed valvular heart cases is five. Moreover, the diagnosis is performed by processing a pre-recorded PCG signal, which means these systems cannot examine a patient clinically and provide the diagnosis result immediately.
The aim of this work is to develop a Raspberry Pi-based clinical system for diagnosing nine common cardiac valve cases, encompassing the eight heart valve diseases as well as the normal case of valves. This system can examine patients in real time, providing the diagnosis result immediately. It is also portable, easy to use, and low cost. These properties make it an effective primary diagnostic tool, useful in rural health-care clinics where it can improve the diagnostic accuracy of less experienced physicians, or at home so that family members can detect diseases early.

Methodology
Initially, a disease classification approach is developed in Python using the Spyder development environment. After that, the diagnostic hardware system, which consists of a PCG signal acquisition module connected to a Raspberry Pi running the developed classification algorithm, is designed and implemented.

Development of the disease classification approach using Spyder software
In this section, which comes before implementing the Raspberry Pi-based hardware system, the Python language is used. Figure 1 shows the block diagram of the developed disease classification system.

PCG signals database
The database was collected from several internet sites (Classifying Heart Sounds Challenge, 2016; I. eGeneralMedical, 2016; http://www.3m.com/healthcare/littmann/mmm-library.html; https://physionet.org/content/challenge-2016/1.0.0), as well as from a hospital for cardiology and heart surgery. This database comprises a total of 200 PCG samples encompassing the nine valvular heart cases (50 samples for the normal case and 150 samples for the eight pathological cases).

Preprocessing
The frequency range of the PCG signal is 20-700 Hz (Firuzbakht et al., 2018). Therefore, this signal was filtered by applying a second-order low-pass Butterworth filter with a cut-off frequency of 800 Hz to attenuate high-frequency noise components, such as ambient noise. Next, the filtered PCG signal was normalized to a standard scale of [−1, 1].
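The preprocessing stage can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function name `preprocess_pcg`, the use of SciPy, and normalization by the peak absolute amplitude are assumptions, since the text states only the filter order, the cut-off frequency, and the target scale.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def preprocess_pcg(signal, fs, cutoff_hz=800.0, order=2):
    """Low-pass filter a PCG recording and scale it to [-1, 1].

    fs is the sampling frequency in Hz and must exceed 2 * cutoff_hz.
    """
    x = np.asarray(signal, dtype=float)
    # Second-order Butterworth low-pass filter with an 800 Hz cut-off.
    sos = butter(order, cutoff_hz, btype="low", fs=fs, output="sos")
    filtered = sosfilt(sos, x)
    # Assumed normalization: divide by the peak absolute amplitude.
    peak = np.max(np.abs(filtered))
    return filtered / peak if peak > 0 else filtered
```

Dividing by the peak keeps the waveform shape intact and places the largest excursion exactly at +1 or −1; other schemes (for example min-max scaling) would also map the signal into [−1, 1].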

Automatic segmentation
In this stage, a single cardiac cycle is segmented from the preprocessed PCG signal in order to be used for extracting features in the next stage. An automatic segmentation algorithm was proposed that works for the nine PCG signal cases. The goal of this algorithm is to determine the length of a single cardiac cycle, representing the period (T), by detecting the periodicity within a 4-second segment (S(t)) of the PCG envelope. This 4-second segment is sufficient to contain at least two heart cycles since the heart period range is 0.5-1.5 sec. The key principle for determining T can be summarized as follows (as seen in Figure 2): Two adjacent windows (W1(t) and W2(t)) with the same length (ω) are defined within S(t). ω is gradually increased by a fixed step (20 msec), spanning the range of 0.5-1.5 sec. After every increase, the mean values (M1 and M2) of the two adjacent windows are calculated and the (M1 ≈ M2) condition is examined. If this condition is met, which means the same signal is detected in the two adjacent windows, the period is determined as (T = ω) and the algorithm is stopped.
The flow chart of the proposed automatic segmentation algorithm is shown in detail in Figure 3. It consists of three phases.

Phase 1.
The goal of this phase is to obtain the envelope of the PCG signal. At first, murmurs are attenuated, which helps in obtaining a smoother envelope. Since the frequency range of murmurs is 100-600 Hz (Patidar & Pachori, 2013), the preprocessed PCG signal (PCG(t)) is filtered with a low-pass zero-phase filter with a cut-off frequency of 100 Hz. Next, the Shannon Energy Envelope (SEE) given by (1), where x represents the murmur-attenuated signal, is computed using a fixed window of 20 msec (Suhas et al., 2017).
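As an illustration of Phase 1, the sketch below attenuates murmurs with a zero-phase low-pass filter and then computes a Shannon energy envelope. Equation (1) is not reproduced in this text, so the commonly used average Shannon energy, −(1/N) Σ x² log(x²), is assumed. The filter order (4), the sliding (overlapping) 20 msec window, and the function name are likewise assumptions made for the example.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def shannon_energy_envelope(pcg, fs, window_sec=0.02, cutoff_hz=100.0, order=4):
    """Return a Shannon energy envelope with the same length as pcg."""
    x = np.asarray(pcg, dtype=float)
    # Zero-phase low-pass filtering (forward and backward pass) at 100 Hz.
    sos = butter(order, cutoff_hz, btype="low", fs=fs, output="sos")
    x = sosfiltfilt(sos, x)
    # Rescale to [-1, 1] so that x**2 * log(x**2) is never positive.
    peak = np.max(np.abs(x))
    if peak > 0:
        x = x / peak
    sq = x ** 2
    eps = np.finfo(float).tiny            # avoids log(0)
    energy = -sq * np.log(sq + eps)       # per-sample Shannon energy, >= 0
    # Average over a sliding 20 msec window (assumed to be overlapping).
    n = max(1, int(round(window_sec * fs)))
    return np.convolve(energy, np.ones(n) / n, mode="same")
```

A sliding window keeps the envelope at the sampling rate of the PCG signal, which makes the later 20 msec standard deviation and slope checks meaningful; non-overlapping frames would leave only one envelope sample per 20 msec.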

Phase 2.
The purpose of this phase is to extract a 4-sec segment (S(t)) from the SEE. S(t) must start at a required onset (δ_0), which is determined according to two proposed main criteria:
(1) First criterion: The signal must start with a noticeable change in amplitude. This is checked by calculating the standard deviation within 20 msec of the beginning of the signal. The first criterion is met when the value of this standard deviation is higher than (ε = 0.01). The value of ε was obtained empirically and works for the nine cases of PCG signal.
(2) Second criterion: The signal must start with a positive rate of change. This is checked by calculating the slope within 20 msec of the beginning of the signal. The second criterion is met when the calculated value of the slope is positive.

First of all, S(t) is defined within the time range [δ→δ + 4 sec], where δ is a variable with an initial value of zero. Next, δ is gradually increased by a fixed step (20 msec) until both criteria are met. Thus, the desired onset is defined as (δ_0 = δ).
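The onset search of this phase can be sketched as follows. The function name `find_onset` is hypothetical, the envelope is assumed to be sampled densely enough that 20 msec spans several samples, and the slope is estimated with a least-squares line fit, which is one reasonable choice since the text does not state how the slope is computed.

```python
import numpy as np

def find_onset(envelope, fs, step_sec=0.02, eps_std=0.01, segment_sec=4.0):
    """Return the sample index of the onset delta_0, or None if not found."""
    env = np.asarray(envelope, dtype=float)
    step = int(round(step_sec * fs))        # 20 msec expressed in samples
    seg_len = int(round(segment_sec * fs))  # length of S(t) in samples
    if step < 2:
        raise ValueError("fs is too low for a 20 msec slope estimate")
    t = np.arange(step) / fs
    delta = 0
    while delta + seg_len <= len(env):
        head = env[delta:delta + step]
        # First criterion: noticeable change in amplitude.
        if np.std(head) > eps_std:
            # Second criterion: positive slope (least-squares fit, assumed).
            if np.polyfit(t, head, 1)[0] > 0:
                return delta
        delta += step                       # advance delta by 20 msec
    return None
```

The returned index divided by `fs` gives δ_0 in seconds, and S(t) is then `envelope[delta_0 : delta_0 + 4 * fs]`.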

Phase 3.
The main objective of this phase is to determine the signal period (T) in order to segment a single cardiac cycle from the PCG signal. In the beginning, a window (W1(t)) of length ω is defined within S(t), spanning the time range [δ_0→δ_0 + ω], and its mean value (M1) is calculated. Subsequently, a second window (W2(t)), adjacent to W1(t) and also of length ω, is defined within S(t), spanning the time range [δ_0 + ω→δ_0 + 2ω], and its mean value (M2) is calculated. W2(t) must also adhere to the two proposed criteria, which means that the standard deviation and the slope are calculated within 20 msec of the beginning of W2(t). Then, ω is gradually increased by a fixed step (20 msec) within a range of 0.5-1.5 sec until the following two-part condition is met: (1) |M1 − M2| < (ε_M = 0.015), in other words M1 ≈ M2, where ε_M is an experimentally determined value; (2) W2(t) meets the requirements of the two main criteria. Finally, T is defined as (T = ω) and the single cardiac cycle (C(t)) is segmented from PCG(t), where the time range of C(t) is [δ_0→δ_0 + T].
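A compact sketch of the period search described in this phase is given below. It assumes that the input `segment` is S(t) already starting at the onset δ_0, and it reuses the same assumed least-squares slope estimate for the two criteria; the function names are hypothetical.

```python
import numpy as np

def _starts_with_rise(x, fs, step, eps_std=0.01):
    """Check the two main criteria on the first 20 msec of x."""
    head = x[:step]
    if np.std(head) <= eps_std:
        return False
    return np.polyfit(np.arange(step) / fs, head, 1)[0] > 0

def estimate_period(segment, fs, step_sec=0.02, w_min=0.5, w_max=1.5,
                    eps_mean=0.015):
    """Return the period T in seconds, or None if no window length fits."""
    s = np.asarray(segment, dtype=float)
    step = int(round(step_sec * fs))
    w = int(round(w_min * fs))              # window length omega in samples
    w_end = int(round(w_max * fs))
    while w <= w_end and 2 * w <= len(s):
        w1, w2 = s[:w], s[w:2 * w]          # two adjacent windows
        means_match = abs(w1.mean() - w2.mean()) < eps_mean   # M1 ~ M2
        if means_match and _starts_with_rise(w2, fs, step):
            return w / fs                   # T = omega
        w += step                           # grow omega by 20 msec
    return None
```

Requiring W2(t) to start on a rising edge is what rejects window lengths at which the two means happen to match by coincidence.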

Discussion.
In this subsection, the importance of the two proposed main criteria is discussed, some results of the algorithm are shown, and then the algorithm's performance is evaluated. Figure 4 illustrates two cases for extracting the 4-second segment (S(t)) from the SEE. In subfigures (a)-(c), the two main criteria were not considered. On closer inspection, the (M1 ≈ M2) condition was not met and the algorithm terminated after failing to determine the period (T). The reason for this failure can be traced to the incorrect extraction of S(t), which was extracted starting from a static state of the SEE. By contrast, subfigures (d)-(f) show a successful algorithm sequence when the two main criteria were considered. S(t) was extracted after detecting both a noticeable change in amplitude and a positive signal slope. The algorithm ended when the (M1 ≈ M2) condition was met and the value of T was defined as equal to the window length (ω). Figure 5 illustrates two cases for defining the second window (W2(t)) within S(t). Subfigure (a) represents the case of an incorrect algorithm result, when the two main criteria were ignored. The algorithm finished when the (M1 ≈ M2) condition was met, even though the resultant value of T (determined to be 0.55 sec) was incorrect. Subfigure (b), however, shows that meeting the requirements of the two main criteria is necessary to obtain correct results. The algorithm finished when the (M1 ≈ M2) condition was met, which in this case translates to a value of T that equals ω (0.71 sec). The aforementioned analysis demonstrates the importance of complying with the two main criteria proposed by the authors. Figure 6 shows the results of the segmentation algorithm for segmenting a single cardiac cycle from the PCG signal for (1) a normal case and (2) an abnormal case (aortic stenosis).
The proposed algorithm was tested on the adopted database (200 PCG signals). At first, the cardiac cycles were segmented manually with the help of a physician to serve as a reference. Afterward, accuracy was computed to evaluate the performance of the algorithm as follows:

$$\text{Accuracy}\,(\%) = \frac{\text{correctly segmented cardiac cycles}}{\text{total cardiac cycles}} \times 100 \tag{2}$$

In this research, a segmented cardiac cycle is considered correct if it lies within an acceptance range of ±40 msec of the reference. As a result, an accuracy of 96% was obtained.
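The accuracy measure with the ±40 msec acceptance range can be sketched as below. The text does not say exactly which quantity is compared against the reference, so the sketch assumes that the estimated cycle length is compared with the manually determined one; the function name and the sample values are hypothetical.

```python
import numpy as np

def segmentation_accuracy(estimated_sec, reference_sec, tol_sec=0.040):
    """Percentage of cycles whose estimate lies within +/- tol_sec of the reference."""
    est = np.asarray(estimated_sec, dtype=float)
    ref = np.asarray(reference_sec, dtype=float)
    correct = np.abs(est - ref) <= tol_sec
    return 100.0 * correct.mean()

# Made-up example values (seconds), not data from the study.
example = segmentation_accuracy([0.80, 0.71, 0.95, 1.10],
                                [0.82, 0.70, 1.00, 1.10])
```

In this made-up example three of the four estimates fall inside the tolerance, so the function returns 75.0.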

Feature extraction
First, a large number of features were extracted from the segmented single cardiac cycle. After that, the importance of those features was determined using a feature selection algorithm, ReliefF (Kononenko et al., 1997). Then, the highly informative features were chosen by eliminating the features with little predictive information. As a result, 11 features were adopted in this study: one feature in the time domain, three features in the frequency domain, and seven features in the wavelet decomposition domain.
• Time domain feature: It is the standard deviation of the absolute value of the single cycle (std), which is computed according to (3), where N is the total number of samples in the signal, x_i is the ith sample value, and mean abs is the mean of the absolute value of the signal.
• Frequency domain features:
(1) Mean frequency: It is computed according to (4), where N is the number of frequency bins in the FFT of one cardiac cycle, and f_i and y_i are the frequency and intensity (in dB scale) of the FFT of the single cycle at bin i, respectively (Emre & Uguz, 2011).
(3) OBW: It is the 99% occupied bandwidth of the power spectral density (PSD) (MathWorks). A Python function was written to compute the OBW. Figure 7 illustrates the obtained OBW in relation to the PSD curve of a single cycle of a normal PCG signal.
• Wavelet decomposition features: Here, the single cycle is subjected to five levels of Daubechies db6 wavelet decomposition (Meziani et al., 2012). After that, four coefficients (approximation five, detail five, detail four, detail three) are considered (see Figure 8) and seven features based on these coefficients are extracted: (1) Mean of absolute value of detail five coefficients (mean abs (D5)).
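The text mentions a Python function for the 99% occupied bandwidth but does not show it. The sketch below is one possible version, not the authors' code. It assumes a plain periodogram as the PSD estimate and follows the convention documented for the MathWorks `obw` function, in which the band is bounded by the frequencies where the integrated power crosses 0.5% and 99.5% of the total power.

```python
import numpy as np

def occupied_bandwidth(x, fs, fraction=0.99):
    """Return (bandwidth, f_low, f_high) in Hz for the given power fraction."""
    x = np.asarray(x, dtype=float)
    # Periodogram up to a constant scale factor (the scale cancels below).
    psd = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    # Normalized cumulative power from 0 Hz up to the Nyquist frequency.
    cum = np.cumsum(psd) / np.sum(psd)
    tail = (1.0 - fraction) / 2.0           # 0.5% left out on each side
    f_low = np.interp(tail, cum, freqs)
    f_high = np.interp(1.0 - tail, cum, freqs)
    return f_high - f_low, f_low, f_high
```

For a single segmented cycle `c` sampled at `fs`, the feature would be `occupied_bandwidth(c, fs)[0]`. A smoother PSD estimate (for example Welch's method) could be substituted without changing the rest of the function.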

Classification
Different types of intelligent classification techniques are used in systems engineering and biomedicine (Darwish et al., 2018; Souliman et al., 2013). Among the varieties of artificial intelligence techniques, artificial neural networks, and their architectures (Joukhadar et al., 2020; Souliman et al., 2014), the feedforward artificial neural network (ANN) with the error backpropagation training algorithm was chosen for the classification process (see Figure 9) because it is considered the most widely used model for pattern recognition in the biomedical field (Ahmad, 2011; Kumar et al., 2018; Uğuz, 2012). The ANN used here contains three layers. The input layer has 11 points that represent the extracted features. The hidden layer has nine neurons, which represent 75% of the 11 input points. The output layer has four neurons for encoding the nine target cases of PCG signal in binary.
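To make the 11-9-4 structure and the binary output coding concrete, the following sketch runs one forward pass through such a network with logsig (logistic sigmoid) activations. The weights are random placeholders rather than trained values, and the mapping of the nine cases to the codes 0-8 as well as the 0.5 decision threshold are assumptions, since the text does not specify them.

```python
import numpy as np

def logsig(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-z))

def encode_case(index, n_bits=4):
    """4-bit binary target vector (most significant bit first)."""
    return np.array([(index >> b) & 1 for b in reversed(range(n_bits))],
                    dtype=float)

def decode_output(y, threshold=0.5):
    """Threshold the four outputs and read them as a binary number."""
    bits = (np.asarray(y) >= threshold).astype(int)
    return int("".join(str(b) for b in bits), 2)

def forward(features, w1, b1, w2, b2):
    hidden = logsig(w1 @ features + b1)     # 11 inputs -> 9 hidden neurons
    return logsig(w2 @ hidden + b2)         # 9 hidden -> 4 output neurons

rng = np.random.default_rng(0)              # placeholder (untrained) weights
w1, b1 = rng.normal(size=(9, 11)), np.zeros(9)
w2, b2 = rng.normal(size=(4, 9)), np.zeros(4)
y = forward(rng.normal(size=11), w1, b1, w2, b2)
```

Four output neurons can represent 16 distinct codes, so nine cases leave seven codes unused; a decoded value outside 0-8 would have to be treated as an invalid result.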
Six models of feedforward backpropagation ANNs, which vary in the activation functions, were experimented with to select the best model that has high accuracy. These models were trained with 140 instances (70% of the dataset) and tested with 60 instances (30% of the dataset). The performance of these models was evaluated through classification accuracy (the ratio of correct predictions to the total predictions). Table 1 shows the performance of these backpropagation ANN models.
It was noticed that the feedforward backpropagation ANN model with logsig activation functions had a high degree of accuracy (95%). Thus, this model was adopted as the best backpropagation ANN.
In addition to the previous backpropagation ANN architecture, other ANN architectures (radial basis ANN, learning vector quantization (LVQ) ANN, and probabilistic ANN) (Aggarwal, 2018; MathWorks, 1111; Mohebali et al., 2020) were experimented with. These ANNs were trained and tested using the same dataset. Their structures were as follows: The radial basis ANN includes nine neurons in the radial basis layer (75% of the 11 input points) and four neurons in the output linear layer to encode the nine target cases in binary. In the LVQ ANN, the competitive layer has 11 neurons (equal to the number of input points) and the output linear layer has one neuron for each target case (9 neurons in total). In the probabilistic ANN, the pattern layer contains one neuron for each instance in the training dataset (140 neurons in total) and the competitive layer contains one neuron for each target case (9 neurons). The performance of these architectures was compared to the adopted backpropagation ANN model as shown in Table 2.
It was observed that both the backpropagation ANN (with logsig functions) and the probabilistic ANN had the best performance (95%). However, the probabilistic ANN is more complex and requires more memory space to store the model. Thus, the backpropagation ANN (with logsig functions) was selected as the best and simplest classifier for the nine cases of PCG signal.

Design and implementation of the diagnostic hardware system
Generally, the hardware of the diagnostic system consists of a PCG signal acquisition module connected to a processing and displaying unit, which is represented by a Raspberry Pi connected to a touch screen, as shown in Figure 10. The Raspberry Pi processes the PCG signal according to the proposed disease classification approach (preprocessing, applying the automatic segmentation algorithm, extracting the 11 features, and applying the best classifier). A demo of a clinical test process using the diagnostic hardware system is shown in Figure 11.

PCG signal acquisition module
The function of this module is to pick up the PCG signal, condition it, and convert it to a digital format so it can be sent to the Raspberry Pi through the USB port. The block diagram of this module is illustrated in Figure 12.
A condenser microphone was used to acquire the PCG signal. This microphone is fixed in a tube in close proximity to the chest piece, as shown in Figure 13. The acquired PCG signal is then passed to the conditioning circuit, which consists of three sections. The first is a high-pass filter with a cut-off frequency of 0.15 Hz, used for removing baseline noise. The second is an amplifier circuit with a gain of 20. The third is a fourth-order low-pass Butterworth filter with a cut-off frequency of 800 Hz, which helps in removing high-frequency noise components. The resultant conditioned PCG signal is then passed to the USB sound card, which contains a 16-bit analog-to-digital converter (ADC) with a sampling frequency of up to 44.1 kHz. Finally, the digital output is sent to the Raspberry Pi through its USB port.

Processing and displaying unit
This unit consists of a Raspberry Pi (model 3B+) where the processing of the PCG signal is done, in addition to a 5-inch touch screen for displaying the diagnosis result. This unit is illustrated in Figure 14.
A graphical user interface (GUI) was designed in the software of the Raspberry Pi, as shown in Figure 15. This GUI displays the diagnosis result after the PCG signal, coming from the acquisition module, is processed according to the proposed disease classification approach. The GUI also makes it possible to listen to the recorded patient heart sounds by pressing the "listen" button. When the "Examination" button is pressed, an order is given to the acquisition module to record a PCG signal for a time period of 5 seconds. The recorded signal is shown in the "plot1" window; moreover, the signal is processed to obtain the segmented cardiac cycle, which is then shown in the "plot2" window. Finally, the classification result appears in the "Diagnosis Result" text box.

Results and discussion
The diagnostic hardware system was clinically tested in real time on 50 subjects encompassing the nine valvular heart cases with the help of two cardiologists. The performance of the system, as shown in Table 3, was evaluated by calculating three metrics, namely accuracy, sensitivity, and specificity. These metrics are given by (6), (7), and (8), where TP, TN, FP, and FN represent true positives, true negatives, false positives, and false negatives, respectively (Ahmad, 2011; Uğuz, 2012).
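Equations (6)-(8) are not reproduced in this text; the sketch below uses the standard definitions of the three metrics. The counts in the example are hypothetical: they are not taken from Table 3, but were chosen because they are consistent with the reported figures for 50 subjects, under the assumption that "positive" means any pathological case and "negative" means the normal case.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) in percent."""
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    sensitivity = 100.0 * tp / (tp + fn)    # true positive rate
    specificity = 100.0 * tn / (tn + fp)    # true negative rate
    return accuracy, sensitivity, specificity

# Hypothetical counts for 50 subjects (assumed, not from Table 3).
acc, sens, spec = diagnostic_metrics(tp=40, tn=8, fp=0, fn=2)
```

With these assumed counts the accuracy is 48/50 = 96%, the sensitivity is 40/42 ≈ 95.24%, and the specificity is 8/8 = 100%, which matches the reported values up to rounding.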
A comparison of the proposed diagnostic system with state-of-the-art systems is presented in Table 4. For systems that use the PCG signal to diagnose heart valve conditions, the proposed system has three advantages: First, it is capable of diagnosing more heart valve cases. Second, it uses an automatic segmentation method in the time domain which needs less processing time. Third, it is able to examine patients in real time. It must be noted that there is a previous system that is similar to the proposed system, but it relies on Electrocardiogram (ECG) signal to diagnose non-valvular diseases (arrhythmias). However, the proposed system is superior to the ECG system in regard to the number of diagnosed heart conditions.

Conclusion
In this paper, a Raspberry Pi-based system has been proposed for the diagnosis of nine common valvular heart cases (one normal and eight pathological) by processing the PCG signal. First, a disease classification approach has been developed. The developed approach consists of five steps, namely obtaining PCG signals, preprocessing, segmentation, feature extraction, and classification. Second, the diagnostic hardware system has been designed and implemented. It consists of a PCG signal acquisition module and a Raspberry Pi connected to a touch screen. The developed classification approach has been implemented, together with a GUI, in the software of the Raspberry Pi to enable it to detect the diseases in real time. The proposed system has been clinically tested on 50 real subjects covering the nine cases. It has been shown that the accuracy of the system is 96%, with a sensitivity of 95.23% and a specificity of 100%.
The developed system has been shown to be superior to other systems reported in the literature, especially in relation to the number of diagnosed valve cases. It is also portable, low cost, and provides the diagnosis result immediately, which is useful for rural health-care clinics as a primary diagnosis tool. Future improvements to the system will focus on enlarging the database of the optimal classifier and on testing a larger number of valve patients, with the aim of adopting this system as an official diagnostic device in the medical community.