Detection and classification of cardiovascular abnormalities using FFT based multi-objective genetic algorithm

ABSTRACT Signal processing and data analysis are widely used methods in biomedical research. In recent years, cardiovascular abnormalities in patients have been detected using electrocardiogram (ECG) recordings. In this paper, a fuzzy-based multi-objective genetic algorithm using Fast Fourier Transform (FFT) is proposed. Initially, the FFT is used to extract the feature points in ECG signals, such as the amplitudes and wave functions of the PQRST waves, and then the proposed multi-objective genetic algorithm is used to classify the abnormality of the heart patient. The ECG behaviour depends on various factors such as age, physical condition of the patient and the surrounding environment. The efficient detection of abnormalities (e.g. arrhythmia and myocardial infarction) can be achieved by initializing the above-mentioned factors and maintaining a database containing previously attributed signals, such as the MIT-BIH arrhythmia database. The present study provides an efficiency of around 98.7% in the detection of abnormalities in patients.


Background
The accurate detection of cardiovascular abnormalities in patients has attracted growing interest in the medical field in recent years. The main cause of cardiac malfunction is irregular function of the sinoatrial node (SA node). The SA node controls the contraction and relaxation of the heart by electrical impulses. These electrical impulses are recorded as electrocardiogram (ECG) signals. Therefore, monitoring the ECG signal provides a good way to represent the bioelectrical activity of the heart; data acquisition techniques for the ECG signal range from single-lead to multi-lead systems [1]. Accurate detection of ECG signals is necessary to determine any cardiovascular abnormalities [2]. The ECG signal consists of P, Q, R, S and T waves and each wave represents a function of the heart: the P-wave represents the atrial depolarization; the Q-wave represents the depolarization of the interventricular septum; the R-wave represents the depolarization of the ventricles and the S-wave represents the final depolarization of the ventricles.
The ECG signal is recorded from the surface of a patient's body and the recording system introduces some noise into the ECG signal. The amplitude and peak value of ECG signals vary largely for different patients with different conditions [3]. Therefore, an efficient algorithm for removing the noise in ECG signals is required. The noise in ECG signals is due to factors such as baseline wandering, motion artefacts, supply-line interference, electrode contact noise and some attenuation losses [4]. Baseline wandering noise is produced by factors such as respiration, electrode impedance variation and excessive body movements. Motion artefacts are caused by the body motion of patients during recording. Supply-line interference is due to the stray effect caused by power cables from the leads. The noise caused by baseline wandering lies in the frequency range of 0.3-1.5 Hz. Several hardware design techniques are available for the reduction of noise such as supply-line interference and motion artefacts. Therefore, an efficient algorithm for removing the lower frequency components present in ECG signals is much needed.
After the removal of noise, the features in the ECG signal should be extracted [3]. The number of fiducial points in the ECG signal can be reduced by applying an efficient threshold to the peak points during feature extraction. The Fast Fourier Transform is employed as the feature extraction technique to detect these peak points. The detected peak points include some additional points other than the PQRST waves; by using an efficient threshold technique, the P, QRS and T-waves in the ECG signal can then be separated. The peak amplitudes in the input ECG are shown in Figure 1.
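The thresholding idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `detect_peaks`, the toy trace and both threshold values are hypothetical, and the sketch works directly on time-domain samples for simplicity. It only shows how a higher threshold keeps the tall R-like peaks while a lower one also admits the smaller P- and T-like peaks.

```python
def detect_peaks(signal, threshold):
    """Return indices of local maxima whose amplitude exceeds `threshold`."""
    peaks = []
    for i in range(1, len(signal) - 1):
        # A sample is a local maximum if it rises from the left and
        # does not rise further to the right.
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_local_max and signal[i] > threshold:
            peaks.append(i)
    return peaks


# Hypothetical toy trace: two tall R-like peaks and several smaller bumps.
ecg = [0.0, 0.1, 0.0, 0.2, 1.0, 0.1, 0.3, 0.0,
       0.1, 0.0, 0.2, 1.1, 0.0, 0.3, 0.0]

r_peaks = detect_peaks(ecg, threshold=0.5)     # only the tall peaks
all_peaks = detect_peaks(ecg, threshold=0.05)  # tall and small peaks
```

In practice the threshold would have to be tuned per recording, since the text notes that amplitudes vary largely between patients.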
The identification of the type of abnormality from the detected peak points of the ECG signal poses some difficulties, because two signals may have similar patterns but indicate different diseases. In other cases, two signals might have different patterns but indicate the same disease [5]. Hence, efficient algorithms for detecting abnormalities in the heart need to be developed. Therefore, we propose a multi-objective genetic algorithm that differentiates the type of abnormality based on the condition of the patient. In the proposed scheme, the susceptibility of the patient to the disease (more-severe, severe and normal condition) is determined with the help of a fuzzy-based scheme.

General concepts
The efficiency of the classification technique is based on its ability to differentiate cardiac abnormalities of patients by considering their profile. Previous classification techniques do not have this feature, because they apply one and the same detection criteria to all patients [6]. The proposed research is divided into three phases: (i) pre-processing (noise removal); (ii) feature extraction (to separate PQRST peak values and QRS wave functions) using FFT and (iii) classification of abnormalities using a fuzzy-based multi-objective genetic algorithm.
Initially, ECG signals contain noise due to baseline wandering and attenuation losses, and these high frequency components lead to false detection or pseudo detection of peak signals in the classification. Therefore, a median filtering technique is employed to remove the noise from the ECG signal. The trade-off between noise attenuation and signal details has to be verified [7]. This median filtering technique acts as a low-pass filter for the removal of high frequency components in the ECG signal.
The second step of this classification technique is feature extraction. The Fast Fourier Transform is used to extract feature components such as the PQRST waves from the ECG signal [8]. This technique transforms the original signal from the time domain to the frequency domain. High frequency signals and the maximum values at particular frequencies are extracted from the frequency domain based on the thresholding technique [8].
The third step of the classification is abnormality detection, where the condition (objective) of the patient will be initialized. After initialization, a fitness function for the multi-objective genetic algorithm is implemented using the area of convergence for a particular abnormality [9]. If the fitness function is less than the area of convergence, it will be marked as an abnormality. Then, the second step is to determine the stage of the abnormalities with the help of the fuzzy-based scheme and this condition will deliver the degree of deviation to the output. Hence, the abnormality type and its condition can be predicted. A flow chart of our proposed abnormality detection algorithm is shown in Figure 2.

Pre-processing
ECG signals are prone to noise, since they are taken from the surface of the patient's body. Many wavelet algorithms have therefore been introduced to de-noise the output signal, and the Daubechies wavelet transform was reported to be more efficient for de-noising [10]. The noise of the input signal can also be filtered by a moving average algorithm, which considers the moving slope of the ECG signal [11].
Noise removal has been approached in numerous ways. Thakor and Zhu [12] achieved noise removal by an adaptive filtering technique, considering a constant reference input to cancel noise such as baseline wandering in the input ECG. However, this technique was found inefficient for diagnostic applications. The main concept of the wavelet technique is to decompose an ECG signal into numerous subsamples at different scales [3,13]. The wavelet transform technique does not differentiate between noise and signal coefficients of the wavelet decomposition that have a very low signal-to-noise ratio (SNR). It is nevertheless regarded as an attractive method for noise removal with improved SNR.

Feature extraction
Continuous Wavelet Transform (CWT) [13] is an efficient method for multi-lead ECG recordings. However, it was found to be unsuitable for single-lead ECG signals, as they are susceptible to noise [14]. Hence, an optimization algorithm is suitable for varying sampling frequency. Many wavelet algorithms have been introduced to de-noise ECG signals, but the Daubechies wavelet transform has been identified as more efficient for de-noising [10]. The features of the ECG signal, such as the P-wave (both peak and amplitude), should be extracted from the de-noised signal. As a result, an efficient feature extraction technique should be introduced to remove all noise in the ECG signal [15].
Discrete Wavelet Transform (DWT) is the most common algorithm for feature extraction; the input signal is decomposed into many levels. This decomposition technique gives all the information (waveform shape and peak values) of the ECG signal [15]. Many other feature extraction techniques, such as the threshold technique, are also available [16]. However, the peak value of the ECG signal obtained from DWT was found to be inaccurate for noisy signals. Hence, an adaptive wavelet decomposition method was proposed, in which least square values are used for approximation of the ECG signal. Fast Fourier Transform (FFT) is used to convert the ECG signal from the time domain to the frequency domain for more accurate extraction of peak values. The QRS complex, the peak values of the P amplitude, the QRS wave and the amplitude values are obtained with the help of FFT; the results are then fed to the classification unit to detect the abnormality in the ECG. Peak wave detection by soft computing techniques such as Support Vector Machine (SVM) and Particle Swarm Optimization has also been used to extract the peak value of the ECG [17]. This kind of soft computing leads to a separate condition for each peak. To define all the peak signals, a separate technique for feature extraction is required.

Abnormality classification
The block-based neural network was used to classify the abnormality of the ECG signal, and its values can be optimized with the help of a Particle Swarm Optimization (PSO) technique. However, the classification of abnormalities at different sampling frequencies was found inefficient [18]. The Artificial Neural Network (ANN) with Mixture of Experts (MOE) technique uses a local classifier in combination with a global classifier for categorizing abnormalities. This kind of approach is quite efficient; therefore an accurate global classifier is used to define the normal and abnormal regions for ECG features. In the classification phase, learning algorithms such as Support Vector Machine (SVM), genetic algorithm and neural network are used to find abnormalities in ECG signals. However, many of these algorithms, such as the genetic algorithm and SVM, are not robust to varying characteristics of patients, such as age, weight and physical condition, so the output of the algorithm is based on assumption [19]. If a patient with different conditions (objectives) is selected, the ranges of abnormalities are still common. Therefore, an improved learning algorithm for abnormality detection in different patients is needed, and a multi-objective genetic algorithm is proposed to detect the abnormality of a particular patient based on the condition of that patient [20].

Pre-processing
The raw ECG input is prone to noise because the electrical potential produced by the heart is subject to attenuation losses. Hence, removal of noise is essential for better abnormality prediction [10]. The removal of noise in ECG signals is carried out with the help of a relaxed median filter; the trade-off between noise attenuation and signal details has to be verified [4]. The median filter is denoted by 'm' and is characterised by the median function [17]. Every input ECG wave in the time domain is given by Equation (2).
Step 1: The input ECG signal is given by the wave function y.
Step 2: The signal is subjected to the median function.
Step 3: For median filtering, the window size is assigned first: $w = \left| \tfrac{1}{4} f_s \cdot \mathrm{length}(v) \right|$, where $f_s$ is the sampling frequency and $\mathrm{length}(v)$ is the total number of samples.
Step 4: The total signal is divided based on the sample size; the amplitude values of the signal are sorted within the window.
Step 5: Find the median of the signal with the median function $m = \mathrm{median}(y)$.
Step 6: After initialization of the median value, the median is compared with the values in the entire signal.
Step 7: The window size is incremented to $w = 2m + 1$ and the corresponding median value is calculated.
Step 8: Stop if all the criteria are met.
After the computation of the median, the vector is stored in G, where G is a subset of the signal vector S. The window size is increased in successive stages, and each stage of this computation removes different frequency components. Then, this gradient vector is subtracted from the original signal (i.e. original signal minus the gradient vector) [16]. The whole algorithm forms a low-pass filter to remove the sharp edges in the ECG signal [7]. The deviation of amplitude in the ECG signal is shown in Figure 3. The pure ECG signal without noise is shown in Figure 4.
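The procedure above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `running_median` and `remove_baseline`, the sequence of half-widths and the handling of the signal edges (shrinking the window) are hypothetical choices, not taken from the paper. The sketch follows the described order of operations: median passes with a growing window of size 2m + 1, then subtraction of the resulting vector from the original signal.

```python
from statistics import median


def running_median(signal, window):
    """Median of `signal` in a centred window of odd length `window`.

    Near the edges the window is shrunk to the available samples
    (an assumption; the text does not specify edge handling).
    """
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        out.append(median(signal[lo:hi]))
    return out


def remove_baseline(signal, half_widths=(1, 2)):
    """Estimate the slowly varying component with successive median
    passes of window size 2m + 1, then subtract it from the signal."""
    estimate = list(signal)
    for m in half_widths:
        estimate = running_median(estimate, 2 * m + 1)
    return [s - e for s, e in zip(signal, estimate)]


# Hypothetical trace: constant offset of 5 with one sharp peak.
raw = [5, 5, 5, 5, 9, 5, 5, 5, 5]
cleaned = remove_baseline(raw)
```

With this toy input the constant offset is removed and the sharp peak keeps its height above the baseline. For real recordings the half-widths would be chosen from the sampling frequency.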

Feature extraction
In this phase, the main features of the ECG (amplitude and time period of the peak signals), such as the R-peak, QRS-complex wave peak, wave function and T-amplitude, were extracted. The PQRST wave extraction was done with the help of FFT. FFT is the simplest way to analyse a signal with the help of the Discrete Fourier Transform (DFT) [8]. The response in the time domain and frequency domain is shown in Figure 5. FFT eliminates lower-order harmonics. The signal needs to be decomposed into subsamples, and the input should be periodic in nature, i.e. a summation of many sinusoidal signals of different frequencies. The input contains N samples.
The basic equation for FFT is given as $X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}$, where $k = 0, 1, 2, 3, \dots, N-1$. Note: sine and cosine functions can be expressed in polar or rectangular coordinates using Euler's equation [21]. The output spectrum of the DFT, $X(k)$, is represented as the correlation between the input time samples and N cosine and N sine waves. In complex notation, the DFT consists of both a real and an imaginary part, but for real-time applications the complex function will be removed.
$W_N = e^{-j 2\pi / N}$ is the $N$th primitive root of unity and $x(n)$ is the signal amplitude. The above equation is divided into even and odd elements: $X(k) = \sum_{m=0}^{N/2-1} x(2m)\, W_N^{2mk} + W_N^{k} \sum_{m=0}^{N/2-1} x(2m+1)\, W_N^{2mk}$. By this formula, the original signal is decomposed into many subdivisions in the frequency domain. The decomposition can be obtained by re-ordering the signal. The basic functional diagram of the FFT algorithm is shown in Figure 6. The radix-2 FFT algorithm recursively divides the entire DFT calculation into 2-point DFTs. Each 2-point DFT consists of a multiply-and-accumulate operation called a butterfly, as shown in Figure 6. The FFT algorithm is the simplest way to evaluate the DFT [8]. The computational cost of the direct DFT is of the order of $N^2$ operations for N points, whereas that of the FFT is of the order of $N \log_2 N$.
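To make the even/odd decomposition and the butterfly concrete, the following is a minimal textbook sketch of a recursive radix-2 decimation-in-time FFT in Python, checked against a direct DFT. It is not the implementation used in this work; the function names and the eight-sample test signal are hypothetical.

```python
import cmath


def dft(x):
    """Direct DFT: roughly N^2 complex multiplications."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def fft_radix2(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])  # DFT of even-indexed samples
    odd = fft_radix2(x[1::2])   # DFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle           # butterfly, upper output
        out[k + n // 2] = even[k] - twiddle  # butterfly, lower output
    return out


# Hypothetical input: a sinusoid with exactly two cycles in eight samples.
samples = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
spectrum = fft_radix2(samples)
magnitudes = [abs(value) for value in spectrum]
```

For this input the energy concentrates in bins 2 and 6, which is the kind of dominant-frequency information a frequency-domain threshold would pick up.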

RR intervals
The heartbeat of the patient is determined with the help of RR intervals, which are calculated as the distance between two R peaks. The RR interval is also used to understand the heart rhythm of a particular patient. The heart rate is defined as the number of R peaks in a particular minute. The rhythm of the heart can be determined by the following formula: $RR_i = R_i - R_{i-1}$, where $R_i$ is the time of the 'R' peak in the $i$th wave and $R_{i-1}$ is the time of the 'R' peak in the $(i-1)$th wave.
RR interval calculation provides useful information for clinical diagnosis and identification of symptoms for arrhythmia events that are associated with heart-rate variation in patients.
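As a small worked example, the RR intervals and the resulting heart rate can be computed from a list of R-peak times. The sketch below is illustrative only; the function names and the peak times are hypothetical, and the heart rate is taken as 60 divided by the mean RR interval in seconds.

```python
def rr_intervals(r_peak_times):
    """Differences between consecutive R-peak times (seconds)."""
    return [later - earlier
            for earlier, later in zip(r_peak_times, r_peak_times[1:])]


def heart_rate_bpm(r_peak_times):
    """Mean heart rate in beats per minute from the mean RR interval."""
    rr = rr_intervals(r_peak_times)
    return 60.0 / (sum(rr) / len(rr))


# Hypothetical R-peak times in seconds (one beat every 0.8 s).
r_times = [0.50, 1.30, 2.10, 2.90, 3.70]
intervals = rr_intervals(r_times)
rate = heart_rate_bpm(r_times)
```

A regular rhythm gives nearly constant intervals, whereas large variation between successive intervals is the kind of heart-rate variation associated with arrhythmia events.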

QRS detection
The duration of the QRS wave varies with the origin and conduction path of the activation pulse in the heart. QRS is therefore the main feature for the classification of the conditions of the heart. In this work, the QRS duration is represented by the time interval between the Q and S peaks, which can be calculated by $QRS_i = St_i - Qt_i$, where $Qt_i$ and $St_i$ are the time values of the Q and S waves.
In each sub-band, the signal variance can be found with the help of the average power in the sub-band. The variance of a QRS signal is calculated from the squared deviations of the samples from the sample mean $\bar{x}$, where N is the number of samples in the given input segment. The number of samples N differs from one ECG to another based on whether its type is large beat (ectopic beat) or narrow beat (normal beat). The peak signals are plotted in an ECG wave (Figure 8).
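The two QRS features described here can be illustrated with a short Python sketch. The names and numbers are hypothetical, and the variance uses the 1/N normalisation, which is an assumption made because the text relates the variance to average power.

```python
def qrs_duration(q_time, s_time):
    """QRS duration as the time from the Q peak to the S peak (seconds)."""
    return s_time - q_time


def variance(samples):
    """Mean squared deviation from the sample mean (1/N normalisation,
    an assumption; the text does not state the normalisation)."""
    n = len(samples)
    mean = sum(samples) / n
    return sum((value - mean) ** 2 for value in samples) / n


# Hypothetical values: Q at 0.40 s, S at 0.49 s, and a four-sample segment.
duration = qrs_duration(0.40, 0.49)
segment_variance = variance([1.0, 2.0, 3.0, 4.0])
```

A wider beat spans more samples, so N grows with the QRS duration, which is why N differs between ectopic and normal beats.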

Multi-objective genetic algorithm
In many real-world problems, the objectives taken into consideration conflict with each other. Hence, an optimizing value x with respect to a single objective is required. An optimal solution for the multi-objective optimization problem is to consider a set of solutions, each of which satisfies the objectives at an optimum level without affecting other solutions. The multi-objective genetic algorithm uses two operators to determine new solutions: mutation and crossover. In the crossover operation, two parent random numbers are generally combined to form new offspring. The parents are selected from the existing random numbers according to their strength to produce better offspring [21]. In each iteration, the efficient random numbers undergo mutation to produce better offspring. The selection of a parent is based on the fitness function of a particular node. The fitness value can be initialized by evaluating the particular offspring with a corresponding fitness function. For example, every objective possesses a different fitness function; factors such as age, weight and physical condition have a different objective function when compared to the physical activity of a particular patient. The offspring undergo repeated mutation until an effective result is obtained [9].
Step 1: Set i = 1; N random populations are selected for every variable.
Step 2: Initialize the number of objective functions i, where i = 1, 2, 3, …, k.
2.1: Calculate the weight for every random number, $x_k = (1 / u_k) \sum_{i=1}^{k} u_i$, where $u_k$ is the random number for that particular population. The weight of the random number is the measure of significance of the random number relative to the remaining numbers.
2.2: Calculate the selection probability $P(x) = \dfrac{f(x) - f(\min)}{\sum_{y} \left( f(y) - f(\min) \right)}$ (13), where $f(x)$ is the objective function, $f(\min)$ is the minimum of the function and $f(y)$ is the common objective. Selection probability is the chance of each random number to be selected as the best fit.
Step 3: Based on the objective function, the rank of that particular random number is calculated from the terms $x_{kp_1}$, $x_{kp_2}$, $x_{kp_3}$, $x_{kp_4}$, where $z_k$ is the objective function; the rank of an individual in the matrix is selected from the fitness value of that particular individual.
Step 4: The rank matrices of all the populations are then compared to obtain the parent individuals.
Step 5: The parent nodes are moved to the crossover phase, in which N offspring are produced.
Step 6: The above steps are repeated until a best offspring is produced.
Step 7: After initialization of the offspring, the best offspring is obtained based on the rank of that particular offspring.
Step 8: Repeat the steps until a best solution is determined. If the condition is satisfied, stop the iteration.
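The steps above can be sketched as a small weighted-sum multi-objective genetic algorithm in Python. This is a generic illustration, not the authors' algorithm: the two objectives, the population size, the mutation rate and every function name are hypothetical, and the weights are normalised to sum to one, which is a common convention that may differ from the weight formula in Step 2.1. The selection probability follows the shifted-fitness form (f(x) minus the minimum, divided by the sum over the population).

```python
import random


def objectives(x):
    """Two hypothetical conflicting objectives to maximise."""
    return (-(x - 1.0) ** 2, -(x - 3.0) ** 2)


def random_weights(rng, k):
    """Draw k random numbers and normalise them so they sum to 1."""
    u = [rng.random() for _ in range(k)]
    total = sum(u)
    return [value / total for value in u]


def scalar_fitness(x, weights):
    """Weighted sum of the objective values."""
    return sum(w * f for w, f in zip(weights, objectives(x)))


def selection_probabilities(fitness):
    """P(x) = (f(x) - f_min) / sum over y of (f(y) - f_min)."""
    f_min = min(fitness)
    shifted = [f - f_min for f in fitness]
    total = sum(shifted)
    if total == 0:  # all individuals equally fit
        return [1.0 / len(fitness)] * len(fitness)
    return [s / total for s in shifted]


def evolve(pop_size=30, generations=60, seed=0):
    rng = random.Random(seed)
    population = [rng.uniform(-10.0, 10.0) for _ in range(pop_size)]
    for _ in range(generations):
        weights = random_weights(rng, 2)       # new weights each generation
        fitness = [scalar_fitness(x, weights) for x in population]
        probs = selection_probabilities(fitness)
        children = [population[fitness.index(max(fitness))]]  # keep the best
        while len(children) < pop_size:
            p1, p2 = rng.choices(population, weights=probs, k=2)
            alpha = rng.random()
            child = alpha * p1 + (1 - alpha) * p2  # arithmetic crossover
            if rng.random() < 0.1:
                child += rng.gauss(0.0, 0.5)       # mutation
            children.append(child)
        population = children
    return population


final_population = evolve()
```

Because the weights change every generation, the population is pushed towards the whole trade-off region between the two objectives (here, values between 1 and 3) rather than towards a single optimum.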

Datasets
The proposed method uses the MIT-BIH arrhythmia database [22]. The data consist of 48 half-hour excerpts of two-channel ambulatory ECG recordings containing about 134 samples; the sampling frequency of the ECG signal is 360 samples per second, with 11-bit resolution over a maximum range of 10 mV amplitude. The input data are grouped in the same directory and each ECG recording contains a maximum of 13,000 data values. The data retrieval technique differs between databases. Hence, the system is not robust across databases, and the effectiveness calculation is carried out on the same dataset.
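The figures quoted above translate into a time span and an amplitude step by simple arithmetic. The Python sketch below is illustrative only; the constant and function names are hypothetical, and the amplitude step assumes the 11-bit range is spread uniformly over the 10 mV span.

```python
SAMPLING_RATE_HZ = 360   # samples per second, as stated in the text
ADC_BITS = 11            # resolution, as stated in the text
INPUT_RANGE_MV = 10.0    # amplitude range, as stated in the text


def duration_seconds(n_samples, fs=SAMPLING_RATE_HZ):
    """Time span covered by `n_samples` at sampling rate `fs`."""
    return n_samples / fs


def quantisation_step_mv(bits=ADC_BITS, span_mv=INPUT_RANGE_MV):
    """Smallest amplitude difference the digitiser can represent,
    assuming uniform quantisation over the full span."""
    return span_mv / (2 ** bits)


record_seconds = duration_seconds(13_000)  # about 36.1 s
step_mv = quantisation_step_mv()           # about 0.0049 mV
```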

Experimental procedure
The raw input ECG signal is read with the given sampling frequency of 360 samples per second. The amplitude values and their positions are stored in a vector whose length equals that of the ECG signal. The input ECG signal contains noise, which should be removed for better classification of the ECG peak amplitudes. De-noising is done with the help of a relaxed median filter. In the relaxed median filter technique, the window is assigned as $\tfrac{1}{4} f_s$, where $f_s$ is the sampling frequency. The sampling frequency of the filter is incremented at each step and the high frequency signals other than the fundamental ECG signals are then removed from the ECG signal. The median filter acts as a simple low-pass filter to remove the high frequency components from the input ECG signal. The output of the median filter is a pure ECG signal, which is forwarded to the feature extraction phase. Feature extraction is carried out with the help of FFT. The input ECG signal is separated into two 4-point DFTs; each 4-point DFT is subdivided into 2-point DFTs, giving four 2-point DFTs in total. These are then combined to form the input signal in the frequency domain. The outputs from the FFT are the peak amplitudes of the P-wave, the QRS wave amplitudes and the T-wave amplitude.
The amplitude and position values are given to the multi-objective genetic algorithm, and the parametric equation of the objective is considered as the objective function for the genetic algorithm [21]. Random numbers are generated for the feature-extracted output [23,24]. The disease susceptibility is determined with the help of a fuzzy concept, as very severe, severe or normal.
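One way such a fuzzy grading could look is sketched below in Python. The text does not give the membership functions, so everything here is an assumption made for illustration: the triangular shape, the breakpoints, the idea of a deviation score normalised to the range 0 to 1, and all names.

```python
def triangular(x, left, peak, right):
    """Triangular membership function with support (left, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical breakpoints on a normalised deviation score in [0, 1].
FUZZY_SETS = {
    "normal": (-0.5, 0.0, 0.5),
    "severe": (0.0, 0.5, 1.0),
    "very severe": (0.5, 1.0, 1.5),
}


def grade(deviation):
    """Return the label with the highest membership and all memberships."""
    memberships = {name: triangular(deviation, *points)
                   for name, points in FUZZY_SETS.items()}
    label = max(memberships, key=memberships.get)
    return label, memberships


label_low, degrees_low = grade(0.1)
label_mid, _ = grade(0.5)
label_high, _ = grade(0.9)
```

Unlike a hard threshold, the membership values also express how close a case is to the neighbouring grade, which corresponds to the degree of deviation delivered to the output.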

Results and discussion
We conducted an experiment on the automated detection of arrhythmia and other heart diseases using the MIT-BIH arrhythmia database. In this experiment, we used a combination of linear and non-linear methods to extract the features of ECG signals. The data obtained consist of 11,000 samples in the time interval of about 10 s with a sampling frequency of 360 Hz. In this part, the arrhythmia classification performance of the proposed multi-objective genetic algorithm was compared with that of existing methods such as PSO, SVM and the Genetic Algorithm (GA).

Performance evaluation: denoising or preprocessing
The analysis of the proposed work was carried out in the MATLAB 2015a environment; raw ECG data were obtained from the MIT-BIH database and then de-noised with a median filter to remove the noise in the input ECG signal [25]. The filtered ECG signal was then plotted over the raw ECG signal to determine the variation in the ECG signal. The filtered output is shown in Figure 9. The SNR was determined for different signals in the database and the results were compared with the existing Hilbert transform. The result was found to be more efficient than existing systems [26].
SNR is the ratio of the power of the pure signal to the power of the noise signal of the input ECG. Increased value of SNR will lead to better efficiency. The SNR values of the median filtered ECG signal are given in Table 1. The median filter was found efficient in removing the high frequency components that are present in the input ECG signal.
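The SNR definition above can be written as a short Python sketch. It is illustrative only: the function names and sample values are hypothetical, the noise is taken as the difference between the noisy and the clean signal, and the result is expressed in decibels, which is an assumption about the unit used in Table 1.

```python
import math


def power(signal):
    """Average power: mean of the squared samples."""
    return sum(value * value for value in signal) / len(signal)


def snr_db(clean, noisy):
    """SNR in decibels, treating (noisy - clean) as the noise."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10.0 * math.log10(power(clean) / power(noise))


# Hypothetical signals: unit-amplitude samples with noise of amplitude 0.1.
clean = [1.0, -1.0, 1.0, -1.0]
noisy = [1.1, -0.9, 0.9, -1.1]
snr = snr_db(clean, noisy)  # power ratio of 100, i.e. about 20 dB
```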

Performance evaluation: beat classification
FFT was used for efficient beat classification and to detect peak amplitudes. The proposed FFT was compared with the existing discrete wavelet transform algorithm and it was found that the proposed method has an efficiency of 98.7% in peak detection. The incorrect detections of peak waves are calculated by identifying the values of true positive (TP), true negative (TN) and false positive (FP) of the detected ECG signals. The TP, TN and FP values were detected by the FFT algorithm and then compared with the existing discrete wavelet transform algorithm. The results showed that the false detection of ECG peaks by FFT is reduced when compared to the existing DWT algorithm [27], as shown in Table 2. The error detected by the FFT algorithm is calculated from these counts. Sensitivity ($s_e$) and specificity ($s_p$) give the measure of positives and negatives that can be identified from the database and are likewise calculated from these counts for beat classification.
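For orientation, the conventional definitions of sensitivity and specificity can be sketched in Python as below. These are the standard textbook formulas and may not match the exact expressions used in this work; in particular, the false-negative count FN is an assumption, since the text lists only TP, TN and FP, and the counts in the example are hypothetical.

```python
def sensitivity(tp, fn):
    """Fraction of actual positives that are detected: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of actual negatives that are rejected: TN / (TN + FP)."""
    return tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    """Fraction of all decisions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts, not taken from Table 2.
se = sensitivity(tp=95, fn=5)
sp = specificity(tn=90, fp=10)
acc = accuracy(tp=95, tn=90, fp=10, fn=5)
```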
The FFT algorithm was found to be sensitive in the detection of peak signals (Table 2) and the detected features are compared in Figure 10. The plot shows the histogram of the peak signal and the standard deviation curve for the detected ECG signal. The major area of the histogram above the standard deviation curve illustrates low sensitivity of the ECG signal, and it can be seen that data above 100 and 101 show some abnormal behaviour of ECG signals. The standard deviation curve gives a clear idea of peak detection by the FFT algorithm (Figure 10).
The peak detection of ECG signals by FFT is then compared with DWT. True positive is the number of correct peaks detected, whereas true negative is the number of peaks detected incorrectly by the detector. The true positives and true negatives of the two algorithms are compared in Table 3 and Figure 11. It is clear that the detection of genuine peak values by FFT is more efficient than that of the existing DWT algorithm (Table 3 and Figure 11).

Performance evaluation: abnormality detection
A multi-objective genetic algorithm was used to differentiate between normal and abnormal ECG signals. The ECG data with detected arrhythmia from the database are given in Table 4. Abnormalities detected by the genetic algorithm and by the multi-objective genetic algorithm were compared, as shown in Tables 5 and 6. Out of 45 data entries, 22 were found to be abnormal with the help of the multi-objective genetic algorithm (Tables 5 and 6). The proposed algorithm was found to be accurate in the detection of abnormalities from ECG [28]. The classification accuracy of our proposed multi-objective genetic algorithm was compared with existing classification algorithms such as Neural Network (NN), genetic algorithm etc., and the proposed fuzzy-based multi-objective genetic algorithm was found to be more accurate than the existing systems. The accuracy of our proposed system is around 98.7% (Table 4 and Figure 12). The proposed multi-objective genetic algorithm was implemented in the MATLAB 2015a environment on a 64-bit Windows operating system. An important advantage of the proposed multi-objective algorithm is its very low computational cost for feature classification. The total computation time is about 0.28 ms for a single-CPU implementation at a resolution of 360 samples per second [29]. The speed of computation is faster than the real-time requirement. The hybrid fuzzy NN, Mixture of Experts, ANN-PSO, ANN, Probabilistic NN, MLPNN, SVM and MO-GA algorithm provided accuracy up to 96%, 98%, 95.5%, 98.4%, 96.9%, 97.5%, 97.6%, 97.7% and 98.8%, respectively (Figure 12 and Table 4). Recently, we (Table 6). For instance, for the combined dataset with 184 subjects, the WDIST2-5 and ML-5 attained 96.31% and 98.33% accuracies, respectively, while the cascaded two 99.52% subject verification accuracy.
It should be noted that the unified dataset includes subject ECG measured from mobile phones, subject ECG measured in the presence of arrhythmia and subject ECG data measured 2-20 times over a 6-month period.

Conclusions
The classified individual characteristics of heartbeats from a standard 10-s, 12-lead ECG signal database were used to identify arrhythmia in patients. The features are extracted by the FFT algorithm and then fed to the multi-objective genetic algorithm, which considers the age, weight and physical condition of the patient. The multi-objective genetic algorithm was found to be more sensitive in identifying abnormalities in patients. The results from the multi-objective genetic algorithm were compared with the existing genetic algorithm, and the simulation results illustrated that the multi-objective genetic algorithm was more efficient for varying factors. The final result showed a 20% increase in the efficiency of detecting abnormalities when compared to the existing system. The present research provides better efficiency for both SNR and mean square error, with an efficiency of around 98.70%. This work sets out an automatic method to identify arrhythmia in patients from 12-lead ECG signals by classifying heartbeats using machine-learning methods. The results showed that it is possible to obtain better efficiency using a multi-objective genetic algorithm. In future research, analysing and modelling the sequence of heartbeats using advanced machine learning methods can be implemented to achieve better performance.

Disclosure statement
No potential conflict of interest was reported by the authors.