Concurrent fault diagnosis of modular multilevel converter with Kalman filter and optimized support vector machine

ABSTRACT In this paper, the concurrent fault diagnosis problem of the modular multilevel converter (MMC) is investigated using a Kalman filter and an optimized support vector machine (SVM). A state space model combining the circulating current and the output current is first established. Using Kalman filtering theory, the circulating and output currents are estimated, and the residual is obtained from the innovation, that is, the difference between the measured and predicted currents. Based on the obtained residual, a residual evaluation function and its threshold are constructed, and the fault is detected according to the proposed fault detection strategy. Once a fault is detected, the fault localization unit is triggered and the residual data are adopted as the data set. By employing an SVM optimized with a genetic algorithm, concurrent and intermittent fault localization of the MMC is accomplished. Finally, an 11-level MMC simulation system with concurrent and intermittent faults is set up in MATLAB/Simulink, and the effectiveness of the proposed fault detection and localization method is verified.


Introduction
In recent years, there has been increasing interest in the modular multilevel converter (MMC) for high voltage direct current (HVDC) transmission projects, owing primarily to its significant advantages such as high modularity, low switching frequency, strong scalability, low operating loss and high output waveform quality (Yang, Lin, Zheng, & You, 2013; Yin, Jang, Hu, Gu, & Huang, 2015). Most research on MMCs focuses on topological structure, modulation strategy, capacitor voltage balancing control and circulating current techniques (Debnath, Qin, Bahrani, & Saeedifard, 2014). In practice, safety is one of the important challenges for MMCs. Once a submodule (SM) fails, it often causes operational failures in the entire system, resulting in property damage and casualties. Therefore, accurate and effective fault diagnosis (FD) of the MMC system is particularly important.
For decades, FD has proved to be a vitally important part of safety-critical systems. Compared with data-driven methods (Li, Jiang, Zhou, Jiang, & Kong, 2019; Yuan, Zhang, Wu, Zhu, & Ding, 2017; Zheng, Mao, Liu, Wong, & Wang, 2016), the model-based FD approach has received particular attention owing to the availability of physical models for many practical systems, and a great number of excellent results have been reported in the literature; see e.g. Zhang, Wang, and Alsaadi (2018), Zhang, Wang, Ma, and Alsaadi (2019) and the references therein. At present, FD research on MMCs is mainly based on the model-based method (Deng, Chen, Khan, & Zhu, 2015; Shao, Watson, Clare, & Wheeler, 2016). During the past decades, however, the complexity of power-electronic processes has increased drastically, which poses many challenges for model-based diagnosis methods and sometimes makes them impractical for large-scale HVDC systems. As an alternative to traditional model-based approaches, data-driven FD (Wang, Xu, Han, & Elbouchikhi, 2015; Yang, Qin, & Saeedifard, 2016) has been developed, which offers powerful tools to extract useful information for FD from the available voltage and current measurements. Among the data-driven methods, the support vector machine (SVM) is employed to diagnose faults (Li, Liu, Zhang, Chai, & Xu, 2019), and intelligent optimization algorithms (Li, Deng, Wei, Xu, & Cao, 2011; Li, Wang, Gooi, Ye, & Wu, 2017; Liao et al., 2017; Wang, Zhang, Song, Liu, & Dong, 2019; Zeng et al., 2018) are adopted to optimize the key parameters of the SVM so that accurate FD can be achieved. It should be noted that the FD results for MMCs reported so far have mostly relied on model-based methods alone (Li, Shi, Wang, & Wang, 2015; Xu, Xie, Yuan, & Yan, 2017; Zhou, Yang, & Tang, 2018).
CONTACT Yong Zhang zhangyong77@wust.edu.cn
It is, therefore, the main purpose of this paper to narrow this gap by combining the advantages of the two approaches in a hybrid FD approach for MMCs. Motivated by the above discussion, we aim to provide a systematic approach to detect and locate concurrent faults in MMCs. First, according to the behaviour of the circulating current and the output current during operation of the MMC system, a mathematical model in state space form is established. Different from the $H_\infty$ estimation method (Liu, Wang, Shen, & Liu, 2018; Wan, Wang, Han, & Wu, 2019; Wang, Tian, & Fang, 2019), Kalman filter theory is used to estimate the circulating current and the output current, and the residual is obtained with the help of the innovation. Next, the residual evaluation function and the threshold are constructed from the residual, and the concurrent fault of the MMC is detected with the introduced fault detection rule. After a fault is detected, the fault localization unit is triggered, and the residual information is processed by the SVM optimized with a genetic algorithm (GA), so that the concurrent fault category can be determined and the concurrent fault can be localized. Finally, the effectiveness of the proposed data-model fusion method for concurrent FD is verified by MATLAB/Simulink simulation results.
The main contributions of this paper are highlighted as follows: (i) a state space model of the MMC is established, which covers the circulating current and the output current; (ii) the current estimates and the residual are obtained with the help of the Kalman filter; (iii) a residual evaluation function is constructed to detect the fault, and the SVM optimized with the GA is adopted to locate the fault; (iv) both concurrent and intermittent faults of the MMC are handled by the proposed methods.

Modelling of MMC
A typical MMC topology (Debnath et al., 2014) is shown in Figure 1(a); it is composed of six arms. Each arm consists of an arm inductor, an equivalent arm resistor and $N$ series-connected SMs. The upper and lower arms of a phase form a phase unit. $u_{vj}$ and $i_{vj}$ ($j = a, b, c$) are the output voltage and the output current, $i_{dc}$ is the input dc current, $i_{pj}$ and $i_{nj}$ are the upper and lower arm currents, $u_{pj}$ and $u_{nj}$ are the upper and lower arm voltages, $R_0$ is the arm resistance, $L_0$ is the arm inductance, and $U_{dc}$ is the input dc voltage. As shown in Figure 1(b), the half-bridge SM contains two IGBTs $T_1$ and $T_2$, two antiparallel diodes $D_1$ and $D_2$, and a dc capacitor $C_{SM}$.
If an open-circuit fault occurs in an arm, the faulty SMs are bypassed, which results in an asymmetry in the number of SMs and a change in the dc component of the arm (Guan & Xu, 2011). Since the dc components of the arms can no longer remain balanced, the output voltage becomes dc biased. The different dc components in the upper and lower arms ultimately cause the circulating current and the output current of each phase to deviate from their expected values (Li & Zhao, 2015). In this paper, fault detection of the MMC system is realized by comparing the characteristics of the currents before and after the SM fault.
Similar to Debnath et al. (2014), the currents of the upper and lower arms of the MMC are expressed by (1), where $i_{pj}$ and $i_{nj}$ are the upper and lower arm currents, $i_{diffj}$ is the circulating current and $i_{vj}$ is the output current of phase $j$. According to Kirchhoff's law, the differential equations of the upper and lower arm currents of the MMC are described by (2). Taking (1) into account, the circulating current equation of the MMC is written as (3). Combining (2) and (3) gives relation (4). According to (4), the current state space model of the MMC is expressed as (5), whose state $x_{Ij}$ collects the circulating current and the output current of phase $j$. In the simulation verification of the MMC, the sampling period $T$ is of the order of milliseconds or less ($T = 10^{-4}$ s in the simulation section), so system (5) can be discretized into model (6). In addition, $\omega_j(k)$ and $\nu_j(k)$ are the errors caused by external electromagnetic interference and modelling. They are taken to be mutually uncorrelated Gaussian white noise sequences. The initial state $x_{Ij}(0)$ of the system is a normally distributed random variable with given mean and covariance, and the process noise $\omega_j(k)$ and the measurement noise $\nu_j(k)$ are uncorrelated with the initial state.
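As a hedged sketch of how relations of this kind are usually derived, the chain below uses one common sign convention for the MMC arm model. The convention, the lumped input vector and the forward-Euler discretization are assumptions made for illustration and may differ from the exact forms of (1)-(6).

```latex
% Assumed convention: both arm currents flow from the positive to the negative dc rail.
\begin{align*}
i_{pj} &= i_{diffj} + \tfrac{1}{2} i_{vj}, \qquad
i_{nj} = i_{diffj} - \tfrac{1}{2} i_{vj}, \\
\tfrac{1}{2} U_{dc} - u_{pj} &= L_0 \frac{d i_{pj}}{dt} + R_0 i_{pj} + u_{vj}, \qquad
\tfrac{1}{2} U_{dc} - u_{nj} = L_0 \frac{d i_{nj}}{dt} + R_0 i_{nj} - u_{vj}, \\
% Sum of the two arm equations:
L_0 \frac{d i_{diffj}}{dt} + R_0 i_{diffj} &= \tfrac{1}{2}\bigl(U_{dc} - u_{pj} - u_{nj}\bigr), \\
% Difference of the two arm equations:
L_0 \frac{d i_{vj}}{dt} + R_0 i_{vj} &= u_{nj} - u_{pj} - 2 u_{vj}, \\
\dot{x}_{Ij} &=
\begin{bmatrix} -R_0/L_0 & 0 \\ 0 & -R_0/L_0 \end{bmatrix} x_{Ij}
+ \frac{1}{L_0}
\begin{bmatrix} \tfrac{1}{2}(U_{dc} - u_{pj} - u_{nj}) \\ u_{nj} - u_{pj} - 2 u_{vj} \end{bmatrix},
\qquad x_{Ij} = \begin{bmatrix} i_{diffj} \\ i_{vj} \end{bmatrix}.
\end{align*}
```

Under forward Euler with sampling period $T$, the discrete state matrix of this sketch is $(1 - T R_0 / L_0) I_2$, a diagonal matrix with equal entries.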

State estimation of MMC
Based on model (6), let $\hat{x}_{Ij}(k|k-1)$ and $\hat{x}_{Ij}(k|k)$ denote the one-step prediction and the estimate of the current, respectively; they satisfy relations (7) and (8). Taking (6)-(8) into account, the prediction error $\tilde{x}_{Ij}(k|k-1)$ and the estimation error $\tilde{x}_{Ij}(k|k)$ are expressed by (9) and (10). Furthermore, the prediction error covariance $P_j(k|k-1)$ and the estimation error covariance $P_j(k|k)$ of the current of the MMC system are described by (11) and (12), where $Q_j$ and $R_j$ are the variances of $\omega_j(k)$ and $\nu_j(k)$, respectively.
Setting $\frac{\mathrm{d}\,\mathrm{tr}\,P_j(k|k)}{\mathrm{d}K_j(k)} = 0$, the Kalman filter gain $K_j(k)$ is obtained as (13). Formulas (7)-(8) and (11)-(13) are the key relations used to achieve the estimation and prediction of the circulating current and the output current of the MMC. In particular, the prediction is a critical component in constructing the residual, which provides the key data for fault detection.
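The predict/update cycle described above can be illustrated with a minimal scalar sketch for one current channel. It assumes that the channel is measured directly (measurement gain c = 1) and that the state matrix is diagonal, so each current can be filtered on its own; the noise variances, the constant input and all function names are hypothetical, and only the coefficient 74/75 is taken from the simulation section.

```python
import random


def kalman_step(x_est, p_est, y, a, c, q, r, u=0.0):
    """One predict/update cycle of a scalar Kalman filter.

    Returns (x_pred, innovation, x_new, p_new).
    """
    # Prediction: propagate the estimate and its error variance.
    x_pred = a * x_est + u
    p_pred = a * p_est * a + q
    # Innovation: measurement minus predicted measurement.
    innovation = y - c * x_pred
    # Gain that minimizes the posterior error variance.
    k_gain = p_pred * c / (c * p_pred * c + r)
    # Update: correct the prediction with the weighted innovation.
    x_new = x_pred + k_gain * innovation
    p_new = (1.0 - k_gain * c) * p_pred
    return x_pred, innovation, x_new, p_new


def run_filter(n_steps=500, a=74 / 75, c=1.0, q=1e-4, r=1e-2, seed=0):
    """Simulate one noisy current channel and filter it.

    Returns (mean squared measurement error, mean squared estimation error,
    final error variance).
    """
    rng = random.Random(seed)
    x_true, x_est, p_est = 5.0, 0.0, 1.0
    u = 5.0 * (1 - a)  # hypothetical constant input that keeps x_true near 5
    se_meas = se_est = 0.0
    for _ in range(n_steps):
        x_true = a * x_true + u + rng.gauss(0.0, q ** 0.5)
        y = c * x_true + rng.gauss(0.0, r ** 0.5)
        _, _, x_est, p_est = kalman_step(x_est, p_est, y, a, c, q, r, u)
        se_meas += (y - x_true) ** 2
        se_est += (x_est - x_true) ** 2
    return se_meas / n_steps, se_est / n_steps, p_est


mse_meas, mse_est, p_final = run_filter()
```

The innovation returned by `kalman_step` is the quantity from which the residual is built in the fault detection stage.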

Fault detection of MMC
If a fault occurs in an SM of phase $j$ of the MMC, model (6) takes the form (14). Based on model (14) and the above analysis, the residual $r_j(k)$ is generated from the innovation, that is, the difference between the measured current and its one-step prediction $\hat{x}_{Ij}(k|k-1)$ for (14). In order to detect the fault effectively, the residual evaluation function $J(k)$ and the threshold $J_{th}$ are introduced, where $k_0$ is the initial time of fault detection. Once $J(k)$ and $J_{th}$ are chosen, the decision rule (17) is employed to determine whether a fault has occurred. Based on the above definitions, the specific steps of fault detection for the MMC are described in Algorithm 1.
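A minimal sketch of this detection logic is given below. It assumes the commonly used cumulative 2-norm evaluation function $J(k) = \sqrt{\sum_{s=k_0}^{k} r^T(s) r(s)}$ and a threshold taken as the largest value of $J(k)$ observed in a fault-free run; both choices, the function names and the synthetic residuals are illustrative assumptions rather than the definitions used in the paper.

```python
import math


def evaluate_residual(residuals, k0=0):
    """Return J(k) = sqrt(sum_{s=k0}^{k} r(s)^T r(s)) for every k >= k0."""
    total, values = 0.0, []
    for r in residuals[k0:]:
        total += sum(component * component for component in r)
        values.append(math.sqrt(total))
    return values


def detect_fault(j_values, j_th, k0=0):
    """Return the first step k with J(k) > J_th, or None if no alarm is raised."""
    for offset, j in enumerate(j_values):
        if j > j_th:
            return k0 + offset
    return None


# Synthetic two-channel residuals (hypothetical values, not simulation data).
fault_free = [(0.1, -0.1)] * 100
j_th = max(evaluate_residual(fault_free))          # threshold from the fault-free run
faulty = fault_free[:50] + [(1.0, 1.0)] * 50       # residual jumps at step 50
alarm_step = detect_fault(evaluate_residual(faulty), j_th)
print(alarm_step)  # prints 50
```

Because $J(k)$ accumulates from $k_0$ onward, the choice of $k_0$ and of the evaluation window affects how quickly a fault pushes $J(k)$ above the threshold.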

Fault localization of MMC
The complete FD includes two parts: fault detection and localization. Once the fault of the MMC system is detected, the fault localization phase is triggered. Compared with the signal-based technique, the data-driven method is more efficient for FD in complex industrial systems. As a typical data-driven approach, SVM is based on statistical learning theory and structural risk minimization (Li & Shu, 2016; Wan, Wang, & Xu, 2012), and it is well suited to pattern classification with moderately sized data sets. Therefore, SVM is adopted to locate the fault for the MMC in this paper.

Algorithm 1 Fault detection of MMC.
Step 1. Establish the state space model (5) for the MMC and discretize it as (6).
Step 2. Consider the MMC model (14), and utilize Kalman filter theory to obtain the residual $r_j(k)$.
Step 3. Construct the residual evaluation function $J(k)$ and the threshold $J_{th}$.
Step 4. According to the discriminant rule (17), determine whether a fault has occurred.

The basic principle of SVM
SVM adopts the principle of structural risk minimization to find an optimal classification hyperplane that meets the classification requirements, achieves high classification accuracy and maximizes the margin on both sides of the hyperplane. Taking the two-class problem as an example, assume that the training set consists of $n$ samples $\{(x_i, y_i) \mid i = 1, 2, \ldots, n\}$, where $x_i \in \mathbb{R}^d$ is a training sample and $y_i \in \{\pm 1\}$ is its class label. The nonlinear mapping $\phi(x)$ is used to project the input sample into a high-dimensional feature space, and a hyperplane with weight vector $w$ and threshold $b$ is constructed in this feature space. Finding the optimal hyperplane can be formulated as an optimization problem in which the slack variable $\xi_i$ measures the distance between the actual label $y_i$ and the SVM output, and the penalty factor $C$ controls the degree of punishment of misclassified samples.
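For reference, the textbook soft-margin formulation that matches the symbols named above ($w$, $b$, $\phi$, $\xi_i$, $C$) is sketched below. It is the standard form and is assumed, not confirmed, to coincide with the formulation used in the paper.

```latex
% Hyperplane in the feature space
\begin{equation*}
f(x) = w^{T}\phi(x) + b
\end{equation*}
% Standard soft-margin primal problem
\begin{equation*}
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i \\
\text{s.t.}\quad & y_i\bigl(w^{T}\phi(x_i) + b\bigr) \ge 1 - \xi_i,\qquad
\xi_i \ge 0,\qquad i = 1,\ldots,n.
\end{aligned}
\end{equation*}
```

A larger $C$ penalizes margin violations more heavily, while a smaller $C$ tolerates more violations in exchange for a wider margin.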
According to the Lagrangian method, the above problem is transformed into the quadratic programming problem (20), where $K(x_i, x_j) = \phi(x_i)^T \phi(x_j)$ is the kernel function and $\alpha_i$ is the Lagrange multiplier, $0 \le \alpha_i \le C$ ($i = 1, \ldots, n$). Let $\alpha^* = (\alpha_1^*, \alpha_2^*, \ldots, \alpha_n^*)^T$ be the optimal solution of (20). The threshold $b^*$ and the weight vector $w^*$ are then obtained from $\alpha^*$. Substituting the optimal solution into (19), we obtain the classification discriminant function of the SVM.
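Once the multipliers and the threshold are known, classification reduces to evaluating the sign of a kernel expansion over the support vectors, $\mathrm{sgn}\bigl(\sum_i \alpha_i^* y_i K(x_i, x) + b^*\bigr)$. The sketch below illustrates this with a Gaussian (RBF) kernel; the kernel choice, the support vectors, the multipliers and the function names are all hypothetical.

```python
import math


def rbf_kernel(x, z, gamma):
    """Gaussian (RBF) kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def svm_decision(x, support_vectors, labels, alphas, b, gamma):
    """Evaluate sign(sum_i alpha_i * y_i * K(x_i, x) + b); returns +1 or -1."""
    score = b + sum(
        alpha * y * rbf_kernel(sv, x, gamma)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    )
    return 1 if score >= 0 else -1


# Hypothetical trained model with one support vector per class.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, 1]
alphas = [1.0, 1.0]
near_negative = svm_decision((0.1, 0.1), support_vectors, labels, alphas, 0.0, 1.0)
near_positive = svm_decision((1.9, 1.9), support_vectors, labels, alphas, 0.0, 1.0)
```

Only samples with nonzero multipliers contribute to the sum, which is why the trained model can be stored as a list of support vectors.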

The principle of optimized SVM with genetic algorithm
The genetic algorithm (GA) is an optimization search method based on evolutionary and genetic theory; it is a computational model that simulates the process of biological evolution in nature (Wu & Zhang, 2018). The specific steps for optimizing the SVM with the GA are listed in Algorithm 2.
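The loop of Algorithm 2 can be sketched with a small real-coded GA. This sketch makes several assumptions: genes are real-valued rather than decoded bit strings, selection is by binary tournament, one elite individual is kept per generation, and the fitness is a hypothetical smooth surrogate standing in for the SVM test accuracy, so that the example runs without a training set.

```python
import random


def genetic_search(fitness, bounds, pop_size=20, generations=40,
                   crossover_rate=0.8, mutation_rate=0.2, seed=0):
    """Maximize `fitness` over the box `bounds` with a simple real-coded GA."""
    rng = random.Random(seed)

    def clip(value, low, high):
        return max(low, min(high, value))

    # Step 1: random initial population.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):                      # Step 4: generation budget.
        scores = [fitness(ind) for ind in population]  # Steps 2-3: evaluate fitness.

        def tournament():
            i, j = rng.randrange(pop_size), rng.randrange(pop_size)
            return population[i] if scores[i] >= scores[j] else population[j]

        children = [best[:]]                          # elitism: keep the best so far
        while len(children) < pop_size:               # Step 5: genetic operators.
            p1, p2 = tournament(), tournament()       # selection
            if rng.random() < crossover_rate:         # blend crossover
                w = rng.random()
                child = [w * a + (1 - w) * b for a, b in zip(p1, p2)]
            else:
                child = p1[:]
            for g, (lo, hi) in enumerate(bounds):     # Gaussian mutation
                if rng.random() < mutation_rate:
                    step = rng.gauss(0.0, 0.1 * (hi - lo))
                    child[g] = clip(child[g] + step, lo, hi)
            children.append(child)
        population = children
        best = max(population + [best], key=fitness)
    return best, fitness(best)


def surrogate_accuracy(genes):
    """Hypothetical stand-in for SVM accuracy, peaked at log10(C)=1, log10(gamma)=-1."""
    log_c, log_gamma = genes
    return 1.0 / (1.0 + (log_c - 1.0) ** 2 + (log_gamma + 1.0) ** 2)


best, score = genetic_search(surrogate_accuracy, bounds=[(-2, 3), (-4, 1)])
```

In an actual GA-SVM run, `surrogate_accuracy` would be replaced by a function that trains the SVM with the decoded penalty factor and kernel parameter and returns its classification accuracy.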

Fault localization based on SVM
Based on the GA-SVM method, this part makes full use of the residual data $r_j(k)$, which are chosen as the training set. A one-versus-one classification scheme is selected as the model, and the idea of fault localization for the MMC is shown in Figure 2.

Simulation
To verify the effectiveness of the proposed method, an MMC system with the parameters listed in Table 1 is simulated.

Algorithm 2 SVM with GA
Step 1. Randomly generate the initial population.
Step 2. Decode the gene string of each individual in the population into the corresponding penalty factor and kernel function parameter. Substitute these two parameters into the SVM, and use the residual data as the training set of the SVM to obtain the classification model.
Step 3. Use the fitness function to evaluate the quality of each individual, so that fitter individuals are more likely to be selected. The classification accuracy obtained on the test set is used as the individual fitness.
Step 4. Check whether the preset number of generations has been reached. If so, exit the loop and end the genetic optimization; otherwise, continue to the next step.
Step 5. Apply the three genetic operators, namely selection, crossover and mutation, to the population to generate the next generation, and return to Step 2.

Current estimation of the MMC
The sampling period of the MMC is set as $10^{-4}$ s, and the state matrix of its mathematical model (14) becomes $\begin{bmatrix} 74/75 & 0 \\ 0 & 74/75 \end{bmatrix}$. The variances of the process noise and the measurement noise, $Q_j$ and $R_j$, are specified for each of the three phases of the MMC. By using the Kalman filter theory of the state estimation section, the estimates of both the circulating current and the output current in phases A, B and C are shown in Figures 3-5, respectively, and the corresponding estimation errors can also be found in these figures. It can be seen from these simulations that the Kalman filter achieves an effective estimation of the currents in the MMC, and the relative error does not exceed 1%.
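The diagonal entry 74/75 can be cross-checked with exact arithmetic. The sketch assumes that the continuous-time state matrix is $-(R_0/L_0) I$ and that the discretization is forward Euler, so the discrete entry is $1 - T R_0 / L_0$; the ratio $R_0 / L_0$ computed here is back-solved from the 74/75 entry under those assumptions and is not read from Table 1.

```python
from fractions import Fraction


def euler_state_entry(r0_over_l0, period):
    """Diagonal entry 1 - T * R0 / L0 of the forward-Euler state matrix I + T*A."""
    return 1 - period * r0_over_l0


T = Fraction(1, 10_000)              # sampling period 1e-4 s, as stated in the text
ratio = (1 - Fraction(74, 75)) / T   # R0/L0 implied by 74/75 (forward-Euler assumption)
entry = euler_state_entry(ratio, T)
print(ratio, entry)  # prints 400/3 74/75
```

If an exact zero-order-hold discretization were used instead, the entry would be $e^{-T R_0 / L_0}$ and the implied ratio would differ slightly.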

Fault detection simulation of MMC
Based on the parameters given in Table 1, an 11-level MMC model is built in MATLAB/Simulink, and the fault occurs at 0.3 s. Applying the proposed fault detection method of Algorithm 1, Figure 6 shows that $J_{th} = 206.4778$ and $192.4909 = J(3010) < J_{th} < J(3012) = 212.3835$. Therefore, the fault is detected at step 3011, that is, within 0.0011 s after it occurs.

Fault localization of MMC
To further verify the effectiveness of the proposed fault localization method, the Bayesian algorithm, the decision tree (DT) algorithm, the SVM algorithm, the SVM optimized with particle swarm optimization (PSO) and the SVM optimized with the genetic algorithm (GA) are all used to locate the fault, so that the advantages of the GA-SVM algorithm can be confirmed. In particular, the proposed method can be applied to both intermittent and concurrent faults of the MMC.
Different from the existing results on single faults (Deng et al., 2015; Shao et al., 2016; Wang et al., 2015; Yang et al., 2016), in this part we address the more complex diagnosis of concurrent faults. After the concurrent fault is detected with Algorithm 1, we adopt Algorithm 2 to locate the faults. First, it is assumed that two SMs fail in the MMC, so the fault label set comprises $C_6^2 = 15$ categories. The first 960 groups of data are chosen as the training set, and the remaining 341 groups are used as the test set. By employing the Bayesian, DT, SVM, PSO-SVM and GA-SVM algorithms to locate the concurrent fault, we find from Table 2 that the accuracy of the GA-SVM algorithm is significantly higher than that of the Bayesian, DT, SVM and PSO-SVM algorithms (Figure 7).
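The count of 15 categories can be reproduced by enumerating unordered pairs drawn from six candidates. The sketch below assumes that the six candidates are the six arms of the converter and that each category is one pair of simultaneously faulty arms; the arm names and the integer labels are hypothetical.

```python
from itertools import combinations

# Hypothetical names for the six arms (upper and lower arm of phases A, B, C).
ARMS = ["A_upper", "A_lower", "B_upper", "B_lower", "C_upper", "C_lower"]


def concurrent_fault_labels(arms, n_faulty=2):
    """Map each unordered set of simultaneously faulty arms to an integer class label."""
    return {pair: label
            for label, pair in enumerate(combinations(arms, n_faulty))}


labels = concurrent_fault_labels(ARMS)
n_classes = len(labels)                       # number of concurrent fault categories
n_binary = n_classes * (n_classes - 1) // 2   # pairwise classifiers in a one-versus-one scheme
print(n_classes, n_binary)  # prints 15 105
```

With a one-versus-one scheme, the number of binary classifiers grows quadratically with the number of categories, which is one reason the category count matters for training cost.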
In the other case, intermittent faults are considered. Specifically, an intermittent fault refers to the following scenario: a concurrent fault occurs in the upper arms of phases A and B, and the faulty SMs are bypassed after detection; under some extreme but realistic circumstances, a fault then occurs in the upper arms of phases A and C in the next time interval. Subsequently, a concurrent fault may occur in the lower or upper arms of phases A and B in the following time interval. Following the same procedure as in the above case, it is seen from Figure 8 that the faults can be effectively located, and Table 3 shows that the classification accuracy is 96.96%.

Conclusion
This paper deals with the fault detection and localization problem for MMCs with open-circuit faults. By using the Kalman filtering method to estimate the circulating current and the output current of the MMC system, the residual is obtained. Based on the obtained residual, both the residual evaluation function and its threshold are constructed to detect the fault. Meanwhile, using the residual data, the SVM optimized with the GA is employed to locate concurrent and intermittent faults, respectively. Simulation results show that the proposed data-model fusion method can effectively detect and locate the faults; in particular, the accuracy of the GA-SVM algorithm is significantly higher than that of the Bayesian, DT, SVM and PSO-SVM algorithms. Future research will focus on other algorithms (Ye & Shen, 2018) for SVM optimization so that more accurate fault localization can be achieved.