Deep learning approaches for fault detection and classifications in the electrical secondary distribution network: Methods comparison and recurrent neural network accuracy comparison

Abstract The electrical power system comprises several complex, interrelated, and dynamic elements that are susceptible to electrical faults. Because of their critical impact, faults in the secondary distribution network should be detected, classified, and cleared promptly. Several studies have sought appropriate methods for fault detection and classification in electrical power systems using mathematical approaches, expert systems, and conventional artificial neural networks integrated with Supervisory Control and Data Acquisition (SCADA) and Phasor Measurement Unit (PMU) systems as the sensing elements. However, few studies have explored the application of deep learning approaches to fault detection and classification. In this study, several deep learning approaches were compared, including Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Feed Forward Neural Network (FFNN), and Artificial Neural Network (ANN), to determine the appropriate approach for implementation. The simulation results show that the RNN is efficient in detecting and classifying faults in the electrical secondary distribution network, and that its accuracy increases as the complexity increases. The study takes advantage of developments in sensors and Internet of Things (IoT) technologies to capture and preprocess data along the secondary distribution network. The research used the challenge-driven education approach, with Tanzania Electric Supply Company Limited (TANESCO) as the case study and source of the training data.


PUBLIC INTEREST STATEMENT
In most developing countries, including Tanzania, defects and faults in the electrical secondary distribution network are mainly reported by customers or found through physical visual inspection by utility personnel. This makes the entire process, from fault occurrence to fault clearance, expensive and time-consuming. Advances in information and sensor technologies have paved the way for improved efficiency by allowing modern mechanisms, including deep learning approaches, to detect and classify faults whenever they occur. This study explored different deep learning approaches that can be deployed directly on a sensor node to detect and classify faults as they occur in the secondary distribution network. The study used the challenge-driven approach, in which stakeholders from industry were involved from problem identification to solution development. Furthermore, the study used a dataset collected from one of the LV transformers in the real electrical network for the years 2012 to 2020.

Introduction
Fault management in power systems is one of the main challenges facing electrical utility companies as they strive to improve efficiency and reliability using modern approaches and mechanisms that leverage advances in information and communications technology (DOE, 2015). The electrical Secondary Distribution Network (SDN) is the part of the network extending from the Low Voltage (LV) transformer to the end-users. In many developing countries, including Tanzania, electrical faults are currently reported by consumers or by utility personnel through physical visual inspection. Usually, the customer care personnel have to assign a maintenance team for troubleshooting and restoration of the electrical services. The entire process is inefficient because of the excessive time taken to detect and report faults by phone, ineffective troubleshooting techniques, and inadequate tools to identify and classify faults (Mnyanghwalo et al., 2018). Figure 1 shows a typical fault reporting process for the SDN, whereby a consumer who has experienced a power outage contacts the customer care personnel.
There have been notable advancements in electrical grid monitoring systems in transmission and primary distribution networks using Supervisory Control and Data Acquisition (SCADA) systems (Jamil et al., 2015). These systems gather real-time measurements from the remote terminal units (RTUs) installed in transmission and primary distribution substations (Thamarai & Amudhevalli, 2014). Phasor Measurement Units (PMUs) (Gou & Kavasseri) were a further development, offering enhanced synchronization by taking advantage of the Global Positioning System (GPS). Faults were mainly detected and classified using mathematical methods, based on the information gathered from the SCADA or PMU systems.

Figure 1. Typical faults reporting process.
However, these systems cannot be applied in the electrical SDN because of their complexity and high cost (Shahriar et al., 2018). Most research has focused on the transmission and primary distribution networks, where sensors are placed in the primary distribution substations, the number of nodes is small, and the network is less complex. Implementing fault detection and classification mechanisms in the secondary distribution network remains a challenge because of the high density of users, the complex nature of the network, its dynamic changes, and the massive amount of information required.
Recent advancements in information processing and sensor technologies have made it possible to deploy intelligent automation, hierarchical control, hybrid communication networks, and Internet of Things (IoT) technologies in the electric grid to enhance its performance and efficiency (Hossein Motlagh et al., 2020). The new IoT devices allow large amounts of information to be collected from numerous grid systems, preprocessed, and transmitted promptly to the central control systems, a possibility that could not be envisioned when previous generation and transmission systems were designed with legacy technologies. The availability of massive amounts of data from the grid network provides an opportunity to use state-of-the-art artificial intelligence techniques to detect and classify faults efficiently.
Several deep learning models, including Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Feed Forward Neural Network (FFNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU), were explored. The performances of the RNN, GRU, and LSTM were compared using the same dataset, gathered from a part of the secondary distribution network between 2014 and 2020, to determine the most efficient model for the application. The study adopted the Challenge-Driven Education (CDE) approach from the problem identification stage to the solution development stage, involving experts from industry and staff from academic institutions, with TANESCO, a public utility company, selected as the case study and source of training data.
This study is limited to the secondary part of the electrical distribution network, from the 11 kV/0.4 kV transformer to the end-users. Moreover, the research focused on faults detectable from the voltage and current readings captured by the Automatic Meter Reading (AMR) system in the secondary distribution network, which mimic the values to be captured by the IoT-based sensor nodes across the SDN. Figure 2 highlights the part of the secondary distribution network (SDN) that is the main focus of this study.
The paper is organized as follows. Section 2 discusses fault classification in the secondary distribution network, and Section 3 presents the fault detection mechanisms adopted. Section 4 describes the fault detection and classification methods mainly used in previous studies, and Section 5 explains the deep learning architectures that were compared to determine the appropriate one for fault detection and classification in the electrical SDN. Section 6 describes the data collection, analysis, and preparation for training and testing. Section 7 presents the modeling of the methods and the simulation results comparing them. Section 8 presents the simulation results and discussion for the RNN when modeled with basic fault classes and with detailed fault classes. Section 9 discusses the results, and Section 10 provides the conclusions and recommendations for future work.

Faults classifications in electrical secondary distribution network
Faults in electric power systems are unpredictable, irregular conditions that can be caused by changes in climate conditions, human error, fire, and electrical hardware failures. Faults in the SDN can be either open-circuit or short-circuit faults. In distribution networks, short-circuit faults occur frequently and can be identified by observing the phase currents (Karić et al., 2017). A short circuit is a type of fault in which the electric current passes through an unintended path with very low electrical impedance. Open-circuit faults are the opposite: there is infinite resistance between two nodes, and such faults are relatively hard to detect using current relays (Lau & Ho, 2017). Under ideal conditions, all phase voltages have the same maximum value and are displaced from each other by 120 degrees; deviations of 5% for transmission lines and 10% for distribution lines are considered voltage faults.
The chance that a fault occurs in an overhead transmission line is higher than in an underground line, since most of the overhead line is exposed to the atmospheric environment (Davis, 2013). Faults in the overhead secondary distribution network can mainly be categorized into two types: series (open-conductor) faults and shunt (short-circuit) faults. Series faults can be identified easily by observing each phase voltage, whereas short-circuit faults can be identified easily by observing each phase current (Prasad et al., 2018). Figure 3 shows the classification of faults in the overhead electrical SDN, where phase A, phase B, phase C, and ground are represented by the letters A, B, C, and G, respectively.
The frequency at which faults occur and the severity of their effects vary between fault types. The most common fault in the overhead electrical SDN is the Line to Ground (LG) fault; however, it is less severe than the other faults (Zahri et al., 2015). The Double Line to Ground (LLG) fault is next in occurrence and severity, as two lines in the system are affected. The most critical faults are those involving all three lines (LLL or LLLG), as they may result in total system collapse. Single line to ground faults account for around 70-80% of occurrences, whereas three-phase faults account for around 5-10% (Karić et al., 2017). The protection scheme needs to detect the fault, classify its nature, and locate it quickly to avoid major damage.
Once faults occur, they may result in poor power quality, unreliable electrical power supply, reduced consumer comfort, damage to equipment, or potential hazard to people, depending on their severity and how long they persist (Armendariz et al., 2016). This study focuses on using the voltage and current readings from the electrical SDN to detect and classify faults based on the affected line(s) to ground. The deep learning approaches aim to shorten the detection time and improve accuracy, reducing the impact of faults that would otherwise persist for longer durations.

Faults detection mechanisms
As the sensor nodes across the SDN monitor the network parameters by capturing the voltage and current values, faults need to be detected before being classified; hence, proper detection mechanisms are required. The fault detection mechanisms determine whether the captured parameter values indicate a fault before classification takes place. The mechanisms presented in this study detect violations of the limiting values for a transformer and its conductors. Furthermore, they identify faulty voltage values, either overvoltage or undervoltage, for both single-phase and three-phase lines.

Conductor current carrying capacity
The current carrying capacity of the conductors differs based on the cable type, characteristics, and standards. The commonly used cable types for overhead electrical power systems are All Aluminum Alloy Conductors (AAAC), All Aluminum Conductors (AAC), and Aerial Bundled Conductors (ABC). Based on the cable standard in Table 1, the cable types currently used by the utility company are Ant and Wasp, with cross-sectional areas of 50 mm² and 100 mm², respectively. The current-carrying capacities for these standards are 175 A and 268 A, respectively. These values form the reference for the fault detection used with the proposed deep learning models in this study.

Line voltage values
The acceptable values of the voltage ranges are highlighted in Table 2. The values present the acceptable ranges for the line to ground voltages and line to line voltages. Any values outside the range are detected as faults which can either be over voltage or under voltage.
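The limit checks described in this section can be sketched as a small rule-based detector. The sketch below is illustrative only: the voltage band of 230 V ± 5% is the phase-voltage range quoted later in the paper rather than the full contents of Table 2, and the function names and the per-phase dictionary layout are hypothetical.

```python
# Illustrative limits. The conductor ratings come from the text above; the
# voltage band is an assumption based on the 230 V +/- 5% range used later.
NOMINAL_V = 230.0
TOLERANCE = 0.05
CURRENT_LIMIT_A = {"Ant": 175.0, "Wasp": 268.0}


def voltage_state(v):
    """Label one phase voltage as normal, undervoltage, or overvoltage."""
    low = NOMINAL_V * (1 - TOLERANCE)
    high = NOMINAL_V * (1 + TOLERANCE)
    if v < low:
        return "undervoltage"
    if v > high:
        return "overvoltage"
    return "normal"


def current_state(i, cable="Ant"):
    """Label one line current against the conductor's carrying capacity."""
    return "overcurrent" if i > CURRENT_LIMIT_A[cable] else "normal"


def detect(voltages, currents, cable="Ant"):
    """Return a (voltage state, current state) pair for phases A, B, and C."""
    return {
        phase: (voltage_state(v), current_state(i, cable))
        for phase, v, i in zip("ABC", voltages, currents)
    }


# Example reading: phase B is low and overloaded, phase C is high.
result = detect([230.0, 200.0, 250.0], [100.0, 180.0, 50.0])
```

A reading is treated as faulty when any phase carries a label other than "normal"; classification then depends on which phases are affected.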

Detection and classifications algorithms
Faults in electrical SDNs have to be detected first and then classified accordingly for further processing. Once the occurrence of a fault and the affected lines are known, it is easy to engage the appropriate clearance processes and restore the service to its normal condition. Several studies have sought the best methods for fault detection and classification. These methods can be grouped into prominent, hybrid, and modern techniques, as shown in Figure 4. The prominent approaches include wavelet methods, which involve wavelet transformations, Artificial Neural Networks (ANN), and fuzzy logic. The hybrid approaches combine more than one approach to detect and classify faults, and include neuro-fuzzy techniques, wavelet and ANN, wavelet and fuzzy logic, and wavelet and neuro-fuzzy techniques. The third type, modern techniques, includes recently used approaches such as Support Vector Machines (SVM), genetic algorithms, decision trees, deep learning, and pattern recognition, to name a few.
Chakraborty et al. (2012) reported a wavelet transform method to detect and classify faults in electrical transmission lines. The algorithm used mathematical methods to determine the Root Mean Square (RMS) values of the wavelet coefficients of the electrical current signals at each end of the transmission line under consideration, over a varying window length of half a cycle. The current signal was analyzed using the db4 wavelet in PSCAD and MATLAB to obtain the detail coefficients, which were compared with benchmark values to detect and classify faults. However, this technique addressed only the transmission part of the network and only shunt-type electrical faults.
Another study, by Jamil et al. (2015), considered the use of an artificial neural network to detect and classify electrical faults in transmission lines. The input to the proposed scheme was captured from the three-phase currents and voltages at one end of the line. The results showed that the method is efficient in detecting and classifying faults in transmission lines. However, the study did not consider the electrical secondary distribution network and did not explore deep learning approaches for further improvement. Furthermore, Nayeripour et al. (2015) presented a hybrid approach that takes advantage of the wavelet transform and fuzzy logic. The method, called Fuzzy-Wavelet, uses the wavelet transform to decompose the signals from the three-phase currents of the transmission lines and uses fuzzy logic to detect and classify faults. This method also focuses on the transmission part of the network, which is not as widely scattered as the SDN.
Recent advances in machine learning show that deep learning and machine learning architectures such as Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Gated Recurrent Units (GRU), Long Short-Term Memory (LSTM), Feed Forward Neural Networks (FFNN), and Support Vector Regression (SVR) perform better than traditional approaches (Li, 2017). Table 3 summarizes the literature reviewed on fault detection and classification approaches in power systems.

Deep learning algorithms
Deep Learning originated from a multi-layer Artificial Neural Network (ANN) and is a subset of machine learning. Deep learning refers to a large deep neural network (Zhang et al., 2018). Deep Learning is a computational learning technique whereby raw data is used to hierarchically model

Deep feed forward neural network
The Deep Feedforward Neural Network (DFFNN) is adopted as a reference model for many current deep learning architectures. In the DFFNN, information travels only forward through the network. DFFNNs are highly significant in machine learning practice and form the basis of many commercial applications (Upadhyay, 2019). The input to the model is passed forward to the hidden layer, where each node combines its inputs using its weights and bias and an activation function determines the node's output. The optimal values of the network are obtained by adjusting the weights using the back-propagation method and the loss function. The deep feed-forward neural network consists of an input layer, an output layer, a hidden layer, and neuron weights, as shown in Figure 5. Figure 6 illustrates a sample DFFNN model built with the TensorFlow framework. The model is made up of an input with three entries of historical data, one hidden layer with three nodes, and an output layer with a single node. Because DFFNNs are feedforward only and have no feedback, they have no mechanism to use or remember previous outputs of the network, unlike the RNN. Therefore, the DFFNN is not highly recommended for time series forecasting applications, including fault detection and classification in the electrical secondary distribution network.
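To make the forward-only flow concrete, the following is a minimal NumPy sketch of a network with the same shape as the sample described above (three inputs, one hidden layer of three nodes, one output node). It is not the TensorFlow model of Figure 6; the ReLU activation, the random weights, and the input values are assumptions chosen for illustration.

```python
import numpy as np


def relu(z):
    """Rectified linear unit applied element-wise."""
    return np.maximum(z, 0.0)


def forward(x, W1, b1, W2, b2):
    """One forward pass: input -> hidden layer (3 nodes) -> output (1 node)."""
    hidden = relu(W1 @ x + b1)
    return W2 @ hidden + b2


# Randomly initialized parameters (illustrative, untrained).
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)

x = np.array([0.2, 0.5, 0.1])  # three historical values, made up for the example
y = forward(x, W1, b1, W2, b2)
```

Because `forward` keeps no state between calls, the same input always yields the same output regardless of what was presented before, which is the limitation for time series noted above.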

Gated recurrent unit
The Gated Recurrent Unit (GRU) is a deep learning architecture and a variant of the ANN that is commonly used in data mining and machine learning. GRU-based ANNs have been employed in numerous disciplines because of their model-independent and computationally efficient properties (James et al., 2017). Gated recurrent neural networks have proven efficient in several applications involving sequential or temporal datasets, such as natural language processing, music synthesis, speech recognition, machine translation, and classification (Dey & Salemt, 2017). Nevertheless, most neural networks tend to ignore correlations in the input data when dealing with time-domain datasets. The GRU architecture is designed to overcome this drawback by adding recurrent connections in the hidden intermediate layers of the neural network. It retains previous information in the network for future use and captures the temporal dependencies in the input data. Given a time series $X = [x_1, x_2, \ldots, x_T]$, the GRU produces a sequence of outputs $H = [h_1, h_2, \ldots, h_T]$, where each output $h_t$ is determined using all input values from $x_1$ to $x_t$. This is attained by its internal structure, described by equations (1) to (4), where all $w$ and $b$ matrices represent the learning parameters of the GRU network and $\odot$ is the element-wise product. From equations (1) to (4) it can be seen that, using learning parameters whose values are not known initially, the GRU models the relationship between the output $h_t$ and the inputs $x_1, \ldots, x_t$. In practice, known input and output data are used to tune the parameters so that they reflect the actual relationship between input and output. This process is called training the model and is typically performed off-line.
Once the model is trained, the learned parameter values can be used to calculate predicted outputs for a new set of input data for which the actual output is not known.
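For illustration, a commonly used formulation of the GRU cell can be written in a few lines of NumPy. The sketch below assumes the standard update-gate, reset-gate, candidate-state, and output equations; the parameter names (`Wz`, `Uz`, `bz`, and so on), the initialization, and the gate convention are assumptions and may differ in notation from the formulation used by the authors.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def gru_step(x_t, h_prev, p):
    """One GRU time step using a standard four-equation formulation."""
    z = sigmoid(p["Wz"] @ x_t + p["Uz"] @ h_prev + p["bz"])  # update gate
    r = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h_prev + p["br"])  # reset gate
    h_cand = np.tanh(p["Wh"] @ x_t + p["Uh"] @ (r * h_prev) + p["bh"])  # candidate
    return (1.0 - z) * h_prev + z * h_cand  # element-wise blend of old and new


def gru_run(X, p, hidden_size):
    """Map a sequence x_1..x_T to outputs h_1..h_T, starting from a zero state."""
    h = np.zeros(hidden_size)
    outputs = []
    for x_t in X:
        h = gru_step(x_t, h, p)
        outputs.append(h)
    return np.stack(outputs)


def init_params(input_size, hidden_size, seed=0):
    """Small random weights and zero biases (untrained, for demonstration)."""
    rng = np.random.default_rng(seed)
    p = {}
    for gate in "zrh":
        p["W" + gate] = rng.normal(scale=0.1, size=(hidden_size, input_size))
        p["U" + gate] = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
        p["b" + gate] = np.zeros(hidden_size)
    return p


params = init_params(input_size=6, hidden_size=4)
X = np.random.default_rng(1).normal(size=(5, 6))  # five steps of six readings
H = gru_run(X, params, hidden_size=4)
```

Each row of `H` depends only on the inputs up to that time step, which matches the description of $h_t$ as a function of $x_1, \ldots, x_t$.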

Recurrent neural network
It is essential to emphasize the key differences from feed-forward networks to understand the training of Recurrent Neural Networks (RNN). Feed-forward networks are considered without dynamic states (stateless) because for training, the data is treated as static. Feed-forward networks are trained to merely illustrate the relationship among the variables of each observation at any given time without considering the history. Hence, in certain instances, important relationships can be lost due to disregard of the impact of dynamic correlations of past and current input and output data. The concept of "stateful" models was developed to fully utilize the sequential relationships of data regarding prior inputs by storing the information in a "memory cell" over time (Mireles Gonzalez, 2018).
The RNN is designed to deal with a variety of complex computing tasks such as object classification and speech detection. Compared with traditional feed-forward neural networks, the RNN has additional complexity. The RNN maintains an internal state that represents context information and stores information about past inputs for an amount of time that is not fixed a priori but depends on the input data and the network weights. An RNN whose inputs are not of fixed length can be used to transform an input sequence into an output sequence while flexibly taking contextual information into account (Brownlee, 2017).
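The role of the internal state can be seen in a minimal sketch of a simple (Elman-style) recurrent layer. The tanh activation, the zero initial state, and the scalar weights in the example are assumptions made for illustration.

```python
import numpy as np


def rnn_run(X, Wx, Wh, b):
    """Run a simple recurrent layer over a sequence and return every state.

    The hidden state h carries context from earlier inputs to later ones.
    """
    h = np.zeros(Wh.shape[0])
    states = []
    for x_t in X:
        h = np.tanh(Wx @ x_t + Wh @ h + b)
        states.append(h)
    return np.stack(states)


# The same input value is presented twice; the second output differs
# because the state left by the first step feeds into the second.
X = np.array([[1.0], [1.0]])
states = rnn_run(X, Wx=np.array([[1.0]]), Wh=np.array([[0.5]]), b=np.zeros(1))
```

A feed-forward network would return identical outputs for the two identical inputs, whereas here the second state also reflects the first.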

Long short-term memory
Long Short-Term Memory (LSTM) is a special type of Recurrent Neural Network (RNN) that can learn long-term dependencies. It is useful for prediction tasks that require the network to preserve information over longer periods, a task that traditional RNNs struggle with (Hochreiter & Schmidhuber, 1997). The LSTM comprises the forget gate, which discards information that is no longer needed, the input gate, which adds information to the cells, and the output gate, which selects and outputs the necessary information. Figure 7 illustrates the forget gate, input gate, and output gate of the LSTM. LSTMs are designed to overcome the vanishing gradient problem, which permits them to preserve information for longer periods than traditional RNNs. Because they can sustain a constant error, LSTMs can continue to learn over many time steps and backpropagate through time and layers.
Additionally, as depicted in Figure 8, LSTMs utilize gated cells to hold information outside the regular flow of the RNN. The network can thus exploit the information in many ways, for example, storing and reading the information in the cells (Karim et al., 2017). The cells can decide on the information being processed. Furthermore, by opening or closing the gates, the cells can execute these decisions. LSTM has an edge over traditional RNNs through the capability to preserve information for a long period.
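The gating behaviour described above can be sketched with a single LSTM time step in NumPy. This follows the widely used formulation with forget, input, and output gates plus a candidate value; the stacked weight layout and the extreme bias values in the example are assumptions chosen to make the effect of the gates visible.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step.

    W has shape (4*H, H + D) and b has shape (4*H,), with rows stacked in
    the order forget gate, input gate, output gate, candidate value.
    """
    H = h_prev.shape[0]
    a = W @ np.concatenate([h_prev, x_t]) + b
    f = sigmoid(a[0:H])          # forget gate: how much of the old cell to keep
    i = sigmoid(a[H:2 * H])      # input gate: how much new information to add
    o = sigmoid(a[2 * H:3 * H])  # output gate: how much of the cell to expose
    g = np.tanh(a[3 * H:4 * H])  # candidate cell content
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c


# Gates forced open/closed through the biases: forget gate ~1, input gate ~0,
# so the cell content is carried over unchanged whatever the input is.
H, D = 2, 3
W = np.zeros((4 * H, H + D))
b = np.concatenate([np.full(H, 50.0), np.full(H, -50.0), np.zeros(H), np.zeros(H)])
c_prev = np.array([0.7, -0.3])
h_new, c_new = lstm_step(np.ones(D), np.zeros(H), c_prev, W, b)
```

With the forget gate open and the input gate closed, `c_new` equals `c_prev`, which is the mechanism that lets the cell hold information across many time steps.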

Convolutional neural network
The Convolutional Neural Network (CNN) is a feedforward neural network that applies the backpropagation (BP) algorithm to adjust the parameters of the network and decrease the value of the cost function. The CNN differs from traditional BP networks in the following concepts: local receptive fields, shared weights, pooling, and the combination of different layers. CNNs are a special type of FFNN for processing data that has a grid-like topology (LeCun et al., 1998). The structure is biologically inspired, with the connections between the neurons modeled on the animal visual cortex (Cardoso, 2017). The CNN exploits spatially local correlations by imposing local connectivity patterns between neurons of adjacent layers.
CNNs are modeled to process data through multiple layers of arrays and find applications in image recognition, face recognition, and fault detection in power systems. The main difference between the CNN and other neural networks is that the CNN takes a two-dimensional array as input and works directly on the images instead of relying on separate feature extraction (Hu et al., 2015). As CNNs are trained with the backpropagation algorithm, having fewer parameters to learn makes training easier compared to other frameworks. CNNs are highly preferred because of the reduced memory requirements for running the network, which enables the training of larger and more powerful networks. CNN applications are primarily in the visual field, with several image recognition examples (Matsugu et al., 2003; Szegedy et al., 2015). For video classification, where complexity increases because of the temporal dimension, some extensions of CNNs have been explored (Simonyan & Zisserman, 2014; Baccouche et al., 2011).
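Shared weights, local receptive fields, and pooling can be illustrated with a one-dimensional simplification of the two-dimensional case described above. In the sketch below the kernel, the pooling size, and the voltage-like signal are made up for illustration; the sliding operation is cross-correlation, as is usual in deep learning frameworks.

```python
import numpy as np


def conv1d_valid(signal, kernel):
    """Slide one shared kernel over the signal (no padding, stride 1)."""
    k = len(kernel)
    return np.array([
        np.dot(signal[i:i + k], kernel)  # each output sees only a local window
        for i in range(len(signal) - k + 1)
    ])


def max_pool1d(x, size):
    """Keep the maximum of each non-overlapping window of the given size."""
    n = len(x) // size
    return x[:n * size].reshape(n, size).max(axis=1)


# A voltage-like series with a sag in the middle (values are illustrative).
signal = np.array([230.0, 231.0, 229.0, 180.0, 178.0, 230.0, 231.0])
kernel = np.array([-1.0, 1.0])  # responds to step changes between neighbours

features = conv1d_valid(signal, kernel)
pooled = max_pool1d(features, size=2)
```

The same two kernel weights are reused at every position, which is why a convolutional layer has far fewer parameters than a fully connected layer of the same width.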

Training and test data management
The study used test and training data from the electrical secondary distribution network. The data came from the Automatic Meter Reading (AMR) system of the energy utility company in Tanzania. The collected dataset comprises the voltage and current readings for all phases from 2012 to the most recent readings in September 2020.

Study area and source of data
The case study area was selected from part of the electrical SDN of the Tanzania Electric Supply Company (TANESCO), which is the sole national electrical utility company. The selection required that the area have at least two distribution transformers fitted with AMRs, be part of the secondary distribution network, be prone to faults for ease of analysis, and be easily accessible. The specific area selected was in Kinondoni North Area near Msasani Peninsula Hospital, as shown in Figure 9. The training dataset was collected from the distribution transformers along the secondary distribution network. The transformers selected were those with ID numbers 211107296 and 211106960, as they appeared to have enough data from 2012 to 2020. Data from the meters were captured every 20 minutes and include voltage, current, and power readings. The current and voltage values from 2014 to 2020 were considered in this study. Table 1 shows the sample data from the AMR.

Proposed architecture
The proposed architecture was used for automatic fault detection and classification in the electrical SDN. Figure 9 illustrates the architecture, which consists of the sensor nodes, a hybrid communication network, Distribution Control Units (DCU), and the control center. The remote sensing unit (sensor node) is a standalone node with computing and communication capabilities on which the proposed fault detection and classification model can be deployed. The node can also host web services and thus complies easily with the IoT infrastructure for efficient fault detection and classification. The system has several processing units at different substations in the SDN and thus applies the concept of distributed processing. The fault detection and classification algorithm was deployed in the sensor nodes so that the captured data are preprocessed as they are read, before being transmitted through the hybrid communication network to the central control systems for further processing and analysis.
The proposed sensor node architecture included three current sensors, one for each phase line (Phase A, Phase B, and Phase C, or L1, L2, and L3), and three voltage sensors, also one for each line. These sensors were connected to the Raspberry Pi through the analog input lines. The Raspberry Pi hosted web services to allow easy interfacing with the node. The node had a wireless communication interface for connectivity to the central station through the communication architecture implemented for the application. A gateway facilitated interfacing between the sensor node and other smart grid applications, to send the readings and accept configurations remotely. Figure 10 illustrates the sensor node architecture.

Deep learning model development
The deep learning model was developed by first analyzing the data from the secondary distribution network to determine the training and test datasets. Fault detection was based mainly on the standard values for the conductors and the acceptable voltage ranges. Fault classification was achieved by separating the values from each line and preparing the model based on the input combinations.

Data analysis and preparations
The data were analyzed in two ways, once for faults caused by voltage only and once for faults caused by both voltage and current, to demonstrate the improved performance of the deep learning approaches as the network becomes more complex.

Data analysis and preparations for voltage and current faults
A dataset of faults for a three-phase electrical power system was developed from a time series of samples taken at 20-minute intervals from 2014 to 2020. Table 4 shows the sample data collected from the AMR for currents and voltages in phases A, B, and C. Faults in the data were detected by checking whether the phase voltage is within the range of 230 V ± 5% and whether the line current is within the limit of an AAC conductor, which is 175 A for the Ant cable type (50 mm²) and 268 A for the Wasp cable type (100 mm²). Using this method, 63 faults could be classified. The classification was done by assigning each fault a unique number from 0 to 63, where 0 means no fault. Table 5 shows the fault numbers assigned to the first 16 faults. Figure 11 shows the sample fault distribution graphs for January of 2014, 2015, 2016, 2017, 2018, 2019, and 2020.
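The count of 64 classes (63 faults plus the no-fault case) is consistent with six binary indicators, one voltage flag and one current flag per phase, since 2^6 = 64. The sketch below shows one possible numbering on that basis. The bit order is an assumption and is not necessarily the numbering used in Table 5; the function name is hypothetical.

```python
def fault_class(voltage_faulty, current_faulty):
    """Combine six fault flags into a class number between 0 and 63.

    Each argument holds three booleans for phases A, B, and C.
    Assumed bit order: voltage A, B, C, then current A, B, C.
    """
    flags = list(voltage_faulty) + list(current_faulty)
    code = 0
    for bit, flag in enumerate(flags):
        if flag:
            code |= 1 << bit
    return code


no_fault = fault_class([False, False, False], [False, False, False])
phase_a_voltage = fault_class([True, False, False], [False, False, False])
everything = fault_class([True, True, True], [True, True, True])
```

Under this numbering, class 0 corresponds to a reading with all six values within limits, and every other combination of affected phases maps to a distinct number up to 63.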

Data analysis and preparations for detailed voltage and current faults
The dataset was further analyzed to incorporate the detailed faults caused by voltage and current variations. Apart from classifying faults by the affected phase, this approach also states whether the fault is an outage, overvoltage, undervoltage, or overcurrent. The dataset of faults for a three-phase electrical power system was developed from a time series of samples taken at 20-minute intervals from 2014 to 2020. Faults in the data were detected by checking whether the phase voltage is within the acceptable range of 230 V ± 5% and whether the line current is within the limit of an AAC conductor, which is 175 A for the Ant cable type (50 mm²) and 268 A for the Wasp cable type (100 mm²). Once a fault had been detected and classified by affected line, further details were included to determine which specific type of fault had occurred. A total of eight states were used to represent the detailed fault classes, giving 512 classes. The classification was done by assigning each class a unique number from 0 to 511, where 0 means no fault. Table 6 shows the fault states with details for each entry.
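One reading that is consistent with these counts is that each of the three phases takes one of eight states, giving 8^3 = 512 classes. The sketch below follows that interpretation, which the text does not confirm. The eight per-phase states are built here from four voltage conditions and two current conditions; this composition, the ordering, and the function names are assumptions and may not match Table 6.

```python
# Assumed per-phase conditions: 4 voltage states x 2 current states = 8 states.
V_STATES = ("normal", "outage", "undervoltage", "overvoltage")
I_STATES = ("normal", "overcurrent")


def phase_state(v_state, i_state):
    """Map one phase's voltage and current condition to a number from 0 to 7."""
    return V_STATES.index(v_state) * len(I_STATES) + I_STATES.index(i_state)


def detailed_class(phases):
    """Combine the states of phases A, B, and C into a number from 0 to 511.

    `phases` is a sequence of three (voltage state, current state) pairs,
    treated as the three digits of a base-8 number.
    """
    code = 0
    for v_state, i_state in phases:
        code = code * 8 + phase_state(v_state, i_state)
    return code


healthy = detailed_class([("normal", "normal")] * 3)
worst = detailed_class([("overvoltage", "overcurrent")] * 3)
```

Whatever the actual definition of the eight states, any scheme that assigns one of eight states to each of three phases yields exactly 512 distinct class numbers, with 0 reserved for the no-fault case.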

Model Development and comparison for Three Different Methods
The algorithm for fault detection and classifications was modeled using three different methods which were GRU. LSTM and RNN. The development was done using Keras, a high-level API for TensorFlow, used to build and train models. In Keras, the model can be built by assembling layers in a certain configuration such as Sequential or Functional. In this case, a Sequential configuration which is a linear stack of layers was used because it is one suitable for deep learning. Several libraries were involved in the development; matplotlib for plotting, sklearn for data preparation and numpy for matrices processing. The algorithm was developed using Rectified Linear Unit (ReLU) since it converges faster compared to the sigmoidal activation functions. Since the deep learning approach deals with the processing of massive amount of data, it is recommended that the hardware requirement should be of the following minimum specifications; GPU 4+ GB, preferably Nvidia, CPU (Intel Core i3 or higher), and at least 4 GB RAM (depending on the dataset).
A six-neuron input layer was used to take in the three voltage values and three current values recorded from the three phases of the electrical power system in the SDN. The model was tried with different numbers of layers and nodes to determine the optimal configuration. A single hidden layer with 128 nodes was found to be optimal and was used in this study. The output layer comprises 512 nodes, one for each fault class. Since the output was a matrix of size 1 × 512, the fault category values were converted to this matrix shape so that they could be used by the designed network. The model was developed, and the losses and accuracy per year from 2014 to 2020 are presented in Tables 7 and 8, respectively. Figures 12 and 13 show the comparison of the losses and accuracy for all the years, respectively.
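To make the layer sizes concrete, the sketch below converts a fault class into a 1 × 512 one-hot row and runs a forward pass of a simple recurrent network with six inputs, 128 ReLU hidden units, and 512 outputs. It is written in plain NumPy rather than Keras and is not the study's code. The random weight initialization, the softmax output, the window of three time steps, and the names `one_hot`, `init_params`, and `forward` are all assumptions made for illustration.

```python
import numpy as np

N_FEATURES = 6    # three phase voltages + three line currents
N_HIDDEN = 128    # single hidden layer reported as optimal
N_CLASSES = 512   # one output node per fault class


def one_hot(fault_class, n_classes=N_CLASSES):
    """Convert an integer fault class into a (1, n_classes) one-hot row."""
    row = np.zeros((1, n_classes), dtype=np.float32)
    row[0, fault_class] = 1.0
    return row


def init_params(seed=0):
    """Randomly initialized weights (assumed scheme, for illustration only)."""
    rng = np.random.default_rng(seed)
    scale = 0.05
    return {
        "W_x": rng.normal(0.0, scale, (N_FEATURES, N_HIDDEN)),
        "W_h": rng.normal(0.0, scale, (N_HIDDEN, N_HIDDEN)),
        "b_h": np.zeros(N_HIDDEN),
        "W_o": rng.normal(0.0, scale, (N_HIDDEN, N_CLASSES)),
        "b_o": np.zeros(N_CLASSES),
    }


def forward(window, params):
    """Run a simple RNN over `window` of shape (timesteps, 6).

    Returns class probabilities of shape (1, 512).
    """
    h = np.zeros(N_HIDDEN)
    for x_t in window:
        # Recurrent update with ReLU activation.
        h = np.maximum(0.0, x_t @ params["W_x"] + h @ params["W_h"] + params["b_h"])
    logits = h @ params["W_o"] + params["b_o"]
    logits = logits - logits.max()          # stabilize the softmax numerically
    exp = np.exp(logits)
    return (exp / exp.sum()).reshape(1, N_CLASSES)


target = one_hot(98)                                     # label row for class 98
probs = forward(np.ones((3, N_FEATURES)), init_params()) # dummy window of 3 samples
```

In Keras terms, this structure roughly corresponds to a `Sequential` model with a `SimpleRNN(128, activation="relu")` layer followed by `Dense(512, activation="softmax")`. The softmax output is an assumption that is consistent with the categorical cross-entropy loss used for training.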
Tables 7 and 8, and Figures 12 and 13, show that the RNN outperforms the GRU and LSTM by more than 50% in accuracy. The graphs also show that the performance of the RNN is more linear, which makes it easier to analyze. Having confirmed that the RNN performs better than the rest, the model was extended to incorporate the voltage and current faults for the electrical secondary distribution network.

RNN models accuracy comparison
As the performance of the RNN in fault detection and classification was found to be the best, the work was extended to compare the performance of the RNN models in two different scenarios. In the first scenario, the fault classes identified only the affected phases. In the second scenario, the fault classes went further to incorporate fault details, so that the fault types were also identified.

Data preparations for voltage and current faults
A dataset of faults for a three-phase electrical power system was developed from a time series sampled at 20-minute intervals from 2014 to 2020. Faults in the data were detected by checking whether the phase voltage is within the acceptable range and whether the line current is within the limit of an AAC conductor, which in this case was taken to be 175 A. Using this method, a total of 64 fault classes could be distinguished for the affected-phase classification and a total of 512 fault classes for the detailed classification, where each number in an entry represents a fault class.
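One way to arrive at 64 classes for the affected-phase scenario is to treat each of the six measurements (three voltages and three currents) as either within or outside its limit, which gives 2^6 = 64 combinations. The sketch below illustrates this reading. The bit-packing scheme and the function name `affected_phase_class` are assumptions and are not taken from the study. The 230 V ± 5% voltage range is the one stated in the data description, and 175 A is the current limit given above.

```python
V_MIN, V_MAX = 218.5, 241.5   # 230 V - 5 % and 230 V + 5 %
I_MAX = 175.0                 # AAC conductor current limit used in this scenario


def affected_phase_class(voltages, currents):
    """Pack six out-of-range flags into one class number in 0..63 (assumed encoding)."""
    flags = [not (V_MIN <= v <= V_MAX) for v in voltages]
    flags += [i > I_MAX for i in currents]
    code = 0
    for flag in flags:
        code = (code << 1) | int(flag)   # append one bit per measurement
    return code


no_fault = affected_phase_class([230.0, 230.0, 230.0], [100.0, 100.0, 100.0])
phase_a_voltage = affected_phase_class([250.0, 230.0, 230.0], [100.0, 100.0, 100.0])
```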

Training and testing
The developed model was trained using half of the dataset, and the rest was used for testing the model. The RNN is trained using an optimization process, which needs a loss function to compute the model error. The loss function used in this network was categorical cross-entropy, which measures the performance of a classification model whose output is a probability between 0 and 1. Figures 14 and 15 show the results of the comparisons for the accuracy and losses, respectively. It was observed that when the detailed fault classes were considered, the maximum accuracy reached was 95.6%, whereas for the undetailed fault classes the maximum was 93.91%.
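The half-and-half split and the categorical cross-entropy loss can be illustrated with a short NumPy sketch. The text does not state how the two halves were chosen, so the split by position in the array used here is an assumption, as are the function names and the clipping constant `eps`.

```python
import numpy as np


def split_half(features, labels):
    """Use the first half for training and the second half for testing (assumed split)."""
    mid = len(features) // 2
    return (features[:mid], labels[:mid]), (features[mid:], labels[mid:])


def categorical_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean categorical cross-entropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)   # avoid log(0)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))


# Toy data: 10 samples with 6 features each and integer class labels.
features = np.arange(60, dtype=float).reshape(10, 6)
labels = np.arange(10)
(train_x, train_y), (test_x, test_y) = split_half(features, labels)

# One sample, three classes: the true class receives probability 0.8.
y_true = np.array([[0.0, 1.0, 0.0]])
y_pred = np.array([[0.1, 0.8, 0.1]])
loss = categorical_cross_entropy(y_true, y_pred)   # equals -ln(0.8)
```

For reference, a model that assigns equal probability to all 512 classes would have a loss of ln(512), which is about 6.24, so loss values well below 1 indicate that most of the probability mass falls on the correct class.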

Results discussion
The results of the comparison of the different methods are shown in Tables 7 and 8. They reveal that the accuracy of the RNN method exceeds that of the other methods by more than 50%. The results indicate that the RNN model performs best, with accuracy reaching up to 95.6%, which is sufficient to detect and classify faults accurately in a secondary distribution network. Figures 14 and 15, which compare the RNN model in two different scenarios for the years 2014 to 2020, reveal an increase in maximum accuracy from 93.91% to 95.6% and a reduction in average loss from 0.387 to 0.280. The results show an improvement of 1.69 percentage points in accuracy when the model complexity increases, which suits the secondary distribution network, where the network is more complex than the transmission and primary distribution networks. Furthermore, the results show that the fault predictions from the model and the test data align by the 10th epoch, which signifies that the model takes little time to reach its maximum performance and hence can be used to accurately classify faults in the SDN.
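The reported differences between the two scenarios can be worked out directly from the figures quoted above. The relative loss reduction in the last line is derived here from those figures and is not stated in the source.

```latex
\begin{align*}
\Delta_{\text{accuracy}} &= 95.6\% - 93.91\% = 1.69 \text{ percentage points} \\
\Delta_{\text{loss}} &= 0.387 - 0.280 = 0.107 \\
\frac{\Delta_{\text{loss}}}{0.387} &= \frac{0.107}{0.387} \approx 0.276 \quad (\text{about } 27.6\% \text{ lower average loss})
\end{align*}
```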

Conclusion and future work
The deep learning approaches were explored and a model was developed to compare the performance of the GRU, RNN, and LSTM architectures. The results show that the accuracy of the RNN for the dataset from 2014 to 2020 averages 94%, while the other methods (GRU and LSTM) recorded less than 50% accuracy for each year. This indicates that the RNN deep learning model performs better than the others and fits well for fault detection and classification in the secondary distribution network. The results from the RNN method show an improvement of 1.69 percentage points in accuracy when the model complexity increases, which suits the secondary distribution network, where the network is more complex. These results signify that the RNN deep learning model can be enhanced and implemented for fault detection and classification in the secondary distribution network. The model has been designed to allow future expansion to incorporate other types of faults. The model was developed using the Keras platform and can be exported and deployed on a sensor node in the real environment of the utility company using the Python programming language. The implementation of deep learning methods in fault detection and classification enables total control of the SDN using IoT-based sensors, which in turn will improve the efficiency and reliability of the electrical power system. Future work will involve improving the algorithm to incorporate more types of faults, deploying the model on a real sensor node, and enhancing its accuracy for faster and more accurate detection and classification.