A sustainable health and educational goal development (SHEGD) prediction using metaheuristic extreme learning algorithms

The United Nations established the 17 Sustainable Development Goals (SDGs) in 2015 to address issues like gender equality, clean water, health, education, and hunger by 2030. Of the 17 SDGs, health and education have an outsized impact on countries’ socioeconomic development, so providing insights into progress on these two goals is crucial. Machine learning can help solve many real-world problems, including working towards the SDGs. This paper proposes using a metaheuristic ensemble of Cat Swarm Optimization algorithms with Feed Forward Extreme Learning Machines, called Sustainable Health And Educational Goal Development (SHEGD) Prediction, to effectively contribute to countries’ economic growth by achieving health and education SDGs through machine learning. The model is assessed using UN SDG datasets and performance metrics like accuracy, precision, recall, specificity, and F1-score. Comparisons to other machine learning models demonstrate this model's superiority in designing a recommendation system for progressing towards the health and education SDGs. The proposed model outperforms the other approaches, proving its value for an SDG recommendation system design.


Introduction
The 17 SDGs, which 193 UN member nations endorsed in 2015, serve as a guide for constructing a stronger, more sustainable society for people and the environment and cover a wide range of human endeavours [1]. Each objective is specified with concrete targets for raising quality of life by 2030 [2]. The SDGs have since drawn growing attention from governments, organizations, enterprises, and individual scholars worldwide, primarily regarding national development policies, technology, commercial advancements, and the practical and theoretical aspects of their execution. The deep and vast socioeconomic and cultural differences worldwide, the intricate intersections among the SDGs, and the size and dynamism of the underlying data present both obstacles and opportunities. Figure 1 illustrates the Sustainable Development Goals.
Education and healthcare are important in supporting a country's progress toward socio-economic growth. According to the United Nations, education and healthcare data catalyze a progressive system that can improve socio-economic status. Since underdeveloped nations often cannot gather pertinent data, such internationally accessible data are essential to understanding the development of those nations and their contribution to sustainable development. This significant volume of data needs to be properly collected, evaluated, and processed using the right techniques and tools to provide reliable indicators regarding the SDGs [3]. Machine learning (ML), a branch of artificial intelligence, allows machines to learn from data without being explicitly programmed. Because an ML algorithm tries to find a pattern between inputs and outputs based on the data and the model, the research and development of algorithms is crucial. ML is growing rapidly and has recently opened new research options for tracking and analyzing humanitarian operations. It also has a significant impact on the analysis of datasets related to education and healthcare [4]. ML techniques, namely support vector machines (SVM), random forest (RF), and K-means clustering, have already been applied to analyze these data and to explore the relationship between the SDGs, healthcare, and education. ML and deep learning are also applied to clinical data to anticipate diseases: they give significant clinical assistance by replicating human perception and can even identify illnesses that are difficult for humans to detect. This could substantially improve the accuracy of illness prediction, which could save patients' lives if the prognosis is accurate and made promptly [5].
Additionally, these algorithms are employed to comprehend the significance of achieving the SDGs. However, these algorithms need improvement to achieve the best performance in finding correlations in SDG data. Motivated by this drawback, this paper proposes novel CSO-inspired feed-forward neural networks based on the Extreme Learning Machine (ELM). The Cat Swarm Optimization (CSO) algorithm is a population-based optimization approach that imitates the way cats seek and track prey. Because the input is processed in only one direction, the feed-forward model is considered a fundamental form of neural network. The primary contributions of this work are as follows:
1. Propose CSO-optimized feed-forward neural networks to categorize the healthcare and education data that can aid in attaining the SDGs.
2. Create feed-forward networks using the Extreme Learning Machines principle to obtain high accuracy and quick classification.
3. Calculate the performance measures for the proposed method and contrast it with other state-of-the-art ML algorithms.
The work is divided into the following sections: Section 2 presents an overview of the SDGs and the significance of ML techniques for achieving them. Section 3 describes the datasets and the suggested algorithm. Section 4 presents the implementation results and comparative analyses. Section 5 concludes the study with a discussion of future directions.

Overview of SDGs
The SDGs originated at the UN Conference on Sustainable Development held in Rio de Janeiro in 2012, as the successor to the Millennium Development Goals (MDGs). It was necessary to enhance environmental effectiveness because of climate change and other significant environmental issues [6]. Therefore, the primary purpose was to establish new objectives to address the world's critical environmental, political, and socio-economic issues [7]. The SDGs [8] constitute a strong commitment to advance the MDGs and address some of the world's most significant concerns [9]. They offer an urgent call to shift the world onto a more sustainable path. The 17 goals, each of which benefits from the achievement of the others, include No Poverty; Zero Hunger; Good Health and Well-being; Quality Education; Gender Equality; Affordable and Clean Energy; Decent Work and Economic Growth; Industry, Innovation and Infrastructure; Reduced Inequalities; Sustainable Cities and Communities; Responsible Consumption and Production; Climate Action; Life Below Water; Life on Land; and Peace, Justice and Strong Institutions [10]. The 2030 Agenda [11] establishes clear goals and reachable targets for reducing carbon emissions, managing climate change, and decreasing the likelihood of natural calamities. It was made public around the same time as another important decision made at the COP21 climate conference in Paris [12]. The SDGs are unique because they deal with global challenges and renew the commitment to eradicating poverty, enhancing health systems, improving education, reducing inequality, and so on. They also involve all countries in creating a more sustainable, secure, and prosperous planet for people [13][14][15]. Earth Observation (EO) has become a crucial component of monitoring and achieving the SDGs because it offers multiple advantages [16][17][18][19], including data at various spatial scales (local, regional, national, and even worldwide) and time intervals, reliability, a wide variety of characteristics, and cost-effective data collection.
ML is increasingly popular across various subdomains, including deep learning, natural language processing, image recognition, and statistical learning techniques [21]. In the literature, a sizable number of ML algorithms have been employed and described for a variety of tasks in a variety of domains, including agriculture [22], renewable energy [23], disasters [24], climate [25], construction [26], human living conditions [27], and health systems [28]. In one study, an ML model was used to categorize a selection of patent families registered with the European Patent Office (EPO). The investigation shed light on how the SDGs were addressed in patents. It used an SVM algorithm as an extension of ML-based recognition of the SDG orientation of patents. The results can potentially advance the identification of science and technology artifacts, which is particularly relevant in light of global objectives and activities for sustainable development [29].
Another study, on the disposition of youth in predicting the SDGs, assesses the attitudes of young people in Asia. The effective use of ML methods emphasizes the views of a nation's young population about a sustainable future, because the young population is the key to a country's future growth. Several study findings have shown the enhanced prediction capacities of neuro-fuzzy approaches. During the same period, Random Forest became more well-known as a sophisticated tool for prediction and classification. That work expands on earlier research and evaluates the predictive accuracy of the adaptive neuro-fuzzy inference system (ANFIS) and Random Forest (RF) models for three different types of SDGs.
A further study discusses both the methodology and the findings of an impartial, evidence-based evaluation of Australia's progress towards the SDGs. The evaluation examines Australia's progress using SDG targets and 144 relevant indicators [31]. These targets and indicators were chosen via an expert-guided consultation process. According to the findings, obtained by applying a decision tree (DT) algorithm, Australia has a mixed performance on the SDGs, with excellent success in goals related to health and education offset by slow progress in goals related to climate action and reducing inequalities [34]. Artificial Neural Networks (ANNs) are among the most widely used and efficient technologies for optimization, decision-making, and forecasting. To achieve environmental and socio-economic views of sustainability, one line of research proposes state-of-the-art applications of ANNs. Using Nigeria as a case study, another study investigated the influence of corruption on achieving the SDGs, using ELM predictive modelling to analyze development data and find trends that facilitate proactive and strategic decision-making. However, that research could not investigate the impacts of corruption on accomplishing the SDGs [36]. Moreover, few or none of the above techniques concentrate on education and healthcare.

Materials and methods
The main data source for this study is the United Nations SDG data repository [30], which includes the entire list of targets and indicators for each of the 17 SDGs starting in 2015. The United Nations provides a detailed overview of the targets and indicators. Out of the 17 SDGs, this paper focuses on the data from two, namely education and healthcare. The SDGs encourage the exchange of knowledge and expertise to address fundamental human rights concerns such as health, water, sanitation, quality education, and food security. An education of sufficient quality, including early childhood education, is a basic human right; it is a requirement for many long-term prospects, and it can act as a social equalizer. At the same time, more progress is needed in this sector. The Quality Education SDG aims to ensure that all children, particularly children from disadvantaged backgrounds living in rural regions, children from vulnerable populations, and children from indigenous communities, have access to quality education and can continue their education throughout their lives. The objective is to eliminate inequities of income and gender in order to provide equitable access to affordable, high-quality education at all levels. Artificial intelligence and ML systems can support the achievement of excellent education.
Higher levels of education bring better flexibility in the governance system to adjust to a continually changing environment. For illustration, it is reasonable to claim that a nation's educational level and quality affect its degree of creativity and productivity, which in turn affect its level of manufacturing and R&D. Healthcare refers to the collaborative effort of society to guarantee, supply, fund, and promote health. The ideal of well-being and the avoidance of illness and disability changed substantially throughout the twentieth century. One way to think about healthcare is as a set of standards that assist in evaluating actions or circumstances that affect the decision-making process.
Predictive analysis has a major influence on the accuracy of illness prediction: it has the potential to save patients' lives if the prognosis is accurate and made promptly, but it can put patients' lives at risk if the prediction is inaccurate. Therefore, it is necessary to anticipate and evaluate disease prevalence precisely, and there is a need for trustworthy and effective methodologies for predictive analysis in the healthcare industry. Interest in predictive analytics approaches to improve healthcare has been growing, as shown by the long-term investment in innovative technologies based on ELM and CSO techniques that improve people's health via the prediction of future occurrences. The data attributes were cleansed and reorganized to fit the modelling technique. The data could be labelled in various ways, such as by location, country, or indicator. This research concentrates on country variations and suggests that performance within countries might reveal what causes variations in indicators and that indicator variables can predict geographic locations.

Implementation scheme
This research proposes a novel hybrid feed forward network optimized by the CSO algorithm to achieve a better prediction.

An overview of ELM
The proposed scheme uses the feature maps to train the deep feed-forward learning model to classify the SDGs. The suggested structure uses the principle of ELM suggested by B. H. Pham [27] for high-speed and highly accurate classification of SDGs. This type of neural network has a single hidden layer, which does not always require tuning. ELM yields better precision and improved performance using the kernel function [28]. Minimal training error and faster approximation are the main benefits of ELM. Since ELM employs non-zero activation functions and automatic adjustment of weights and biases, it is well suited to classification. The ELM's detailed functioning mechanism is covered in [28].
In this type of network, the activation function of the output layer is linear, while the L neurons in the hidden layer use a non-linear activation function (for example, the sigmoid function). The hidden layer in ELM does not require iterative tuning. The hidden-layer weights (including the bias weights) are chosen randomly. This does not mean that the hidden layer is useless; rather, the parameters of the hidden neurons can be generated randomly, even in advance.
Before the training data are processed, Equation (1) defines the network output for a single-hidden-layer ELM, where x denotes the input features from the encoder-decoder. The target weight vector δ is then given, and a further equation determines the hidden-layer output M(x).
The hidden layer is represented by Equation (4), and the goal is to identify the output vector O, also known as the target vector.
The ELM's fundamental implementation employs the minimum-norm least-squares solution shown in Equation (5), in which the Moore-Penrose generalized inverse of M is used in place of an ordinary inverse. Additionally, the following equation can be used.
Consequently, the output function may be computed employing the above Equation.
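For orientation, the standard single-hidden-layer ELM formulation can be written in the symbols used above, with M the hidden-layer output, δ the output weight vector, and O the target vector. This is the textbook form and is assumed here rather than taken from the numbered equations; the activation g, the random input weights w_i, the biases b_i, and the dagger notation for the Moore-Penrose inverse are introduced only for illustration.

```latex
\begin{align}
f(x) &= \sum_{i=1}^{L} \delta_i \, g\!\left(w_i^{\top} x + b_i\right) = M(x)\,\delta, \\
M\,\delta &= O, \\
\delta &= M^{\dagger} O, \qquad
M^{\dagger} = \left(M^{\top} M\right)^{-1} M^{\top}
\ \text{ when } M^{\top} M \text{ is invertible.}
\end{align}
```

The closed-form solve for δ is what makes ELM training fast: no iterative weight updates are needed once the random hidden layer is fixed.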
The SDG data are classified based on the mathematical Equation (7) in which the thresholds are used for an effective prediction.
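The training and thresholded classification described in this section can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the sigmoid activation, the 0.5 threshold, and the synthetic two-class data are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Non-linear activation for the hidden layer."""
    return 1.0 / (1.0 + np.exp(-z))


def train_elm(X, O, n_hidden, rng):
    """Fit a single-hidden-layer ELM and return (W, b, delta)."""
    # Input weights and biases are drawn randomly and never tuned.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    M = sigmoid(X @ W + b)            # hidden-layer output matrix M
    delta = np.linalg.pinv(M) @ O     # Moore-Penrose least-squares solve
    return W, b, delta


def predict_elm(X, W, b, delta, threshold=0.5):
    """Linear output layer followed by a threshold (assumed to be 0.5)."""
    scores = sigmoid(X @ W + b) @ delta
    return (scores >= threshold).astype(int)


# Synthetic two-class data standing in for SDG indicator features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

W, b, delta = train_elm(X, y, n_hidden=30, rng=rng)
train_accuracy = float(np.mean(predict_elm(X, W, b, delta) == y))
```

Only `delta` is learned; `W` and `b` stay at their random initial values, which is exactly the part the CSO step later tries to improve.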

Improvisation in ELM
ELM has a major drawback in handling larger datasets, which leads to high computational overhead and low prediction performance. Although ELM is efficient in both training and testing, its main drawback is the non-optimal adjustment of input weights and biases. Compared with traditional learning algorithms, ELM uses numerous hidden neurons to adjust the appropriate weights, which may affect the prediction accuracy. To overcome these limitations, an innovative bio-inspired CSO technique is employed to optimize the input weights and biases and achieve high prediction accuracy. The main benefits of CSO algorithms are as follows:
1. High efficiency compared with the Genetic Algorithm (GA), Particle Swarm Optimization (PSO), and other heuristic algorithms.
2. A quicker and more flexible search.
The next section describes the operation of the CSO algorithm.

Cat swarm optimization technique
The classical cat swarm optimizer is a continuous, single-objective method that draws its inspiration from cats' tracing and resting habits. Cats appear to be lazy creatures who prefer to lie around and sleep. However, even when resting, they are attentive and aware of everything around them. They are continually making deliberate observations of their surroundings, and when they spot a target, they rapidly begin moving in that direction. The CSO method is therefore designed to combine these two fundamental cat behaviours: the algorithm consists of a seeking mode and a tracing mode. Each cat is a candidate solution with its own position, fitness value, and flag.
The position in the search space comprises M dimensions, each with its own velocity. The fitness value describes how well the candidate solution (cat) performs. Finally, the flag shows whether the cat is in seeking mode or tracing mode. Before running the algorithm, we should specify how many cats will participate in the iterations. The best cat from each iteration is kept in memory, and the best cat from the final iteration represents the final solution. The operation of the CSO technique is shown in Figure 2. The next section discusses the operational mechanisms of the seeking and tracing modes.

Seeking modes
This mode mimics the resting behaviour of cats. Four parameters play significant roles: the seeking memory pool (SMP), the seeking range of the selected dimension (SRD), the counts of dimension to change (CDC), and the self-position consideration (SPC). The user defines and adjusts each of these parameters through trial and error. SMP establishes the size of the cat's seeking memory, i.e. the number of candidate positions generated for the cat. For instance, if SMP is set to 5, five new candidate positions are generated for each cat, and one of them is picked as the cat's next position. The other two parameters, CDC and SRD, determine how the new positions are generated randomly. The CDC takes a value in the range [0, 1] and specifies how many dimensions are to be changed. For instance, if the search space contains five dimensions and the CDC is set to 0.2, then four randomly chosen dimensions among the five must be changed for each cat while the fifth dimension is left unchanged. The SRD specifies the amount of mutation applied to the dimensions selected by the CDC; it is also called the mutative ratio for the selected dimensions.
Lastly, SPC is a Boolean value that indicates whether the cat's present position will be kept as a candidate position for the following iteration. For instance, if the SPC flag is set to true, we must produce (SMP - 1) candidates for each cat rather than SMP, since the present position is one of them. The steps for seeking mode are as follows.
1. Make SMP copies of the cat at its current position.
2. For each copy, randomly choose as many dimensions as the CDC specifies to be modified, and replace their old values by randomly adding or subtracting an SRD fraction of the present value, as given in Equation (8), where x(n_cat) is the cat's new position, x(o_cat) is the cat's old position, and rand is a random number in the range 0 to 1.
3. Calculate the fitness function for all candidate positions. The next position is then selected based on the probability derived from the fitness values, favouring the best fitness, as shown in Equation (9), where FF(i) is the fitness of the present cat, FF(b) refers to the total population of cats, FF max is the greatest fitness value, and FF min is the lowest fitness value.
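The seeking-mode steps can be sketched as follows. This is an illustrative sketch of the common CSO formulation, not the authors' code. Several choices are assumptions: CDC is treated as the fraction of dimensions to mutate, the mutation is multiplicative (new = old × (1 ± SRD × rand)), the problem is a minimisation so the reference fitness in the selection probability is the maximum, and the function name and toy objective are hypothetical.

```python
import numpy as np


def seeking_mode(position, fitness_fn, smp, cdc, srd, spc, rng):
    """One seeking-mode move for a single cat (minimisation assumed)."""
    dim = position.size
    n_copies = smp - 1 if spc else smp           # SPC keeps the current spot
    n_change = max(1, int(round(cdc * dim)))     # assumed: CDC = fraction mutated
    candidates = np.tile(position, (n_copies, 1))
    for cand in candidates:                      # each row is a writable view
        dims = rng.choice(dim, size=n_change, replace=False)
        signs = rng.choice([-1.0, 1.0], size=n_change)
        # new = old * (1 +/- SRD * rand) on the selected dimensions
        cand[dims] *= 1.0 + signs * srd * rng.random(n_change)
    if spc:
        candidates = np.vstack([candidates, position])

    fit = np.array([fitness_fn(c) for c in candidates])
    f_max, f_min = fit.max(), fit.min()
    if f_max == f_min:
        probs = np.full(len(candidates), 1.0 / len(candidates))
    else:
        # Lower fitness gets a higher selection probability.
        probs = np.abs(fit - f_max) / (f_max - f_min)
        probs = probs / probs.sum()
    return candidates[rng.choice(len(candidates), p=probs)]


def sphere(x):
    """Toy objective to minimise."""
    return float(np.sum(x ** 2))


rng = np.random.default_rng(1)
start = np.array([1.0, -2.0, 0.5, 3.0, -1.5])
# With this convention, cdc=0.8 mutates four of the five dimensions.
new_pos = seeking_mode(start, sphere, smp=5, cdc=0.8, srd=0.2, spc=True, rng=rng)
```

Because selection is probabilistic rather than greedy, a cat can occasionally move to a slightly worse position, which helps the swarm avoid settling too early.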

Tracing modes
This mode mimics how cats track targets. In the first iteration, random velocity values are assigned to all dimensions of a cat's position. In subsequent steps, however, the velocity values must be updated. Cats in this mode move as follows: (i) update the velocities V(CAT) of all dimensions according to Equation (10), where a and c are constants. Figure 2 shows the whole operational mechanism of the suggested cat-inspired algorithm.
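A minimal sketch of the tracing-mode update is given below. It follows the common CSO formulation, in which the velocity is pulled toward the best cat found so far and the position is then shifted by the new velocity. The text names a and c as constants; here one factor is drawn uniformly from [0, 1] and the other (`c`) is constant, which is an assumption, as are the velocity clamp `v_max` and the function name.

```python
import numpy as np


def tracing_mode(position, velocity, best_position, c, v_max, rng):
    """Move one cat toward the best-known position."""
    r = rng.random(position.size)                      # random factor in [0, 1)
    new_velocity = velocity + r * c * (best_position - position)
    new_velocity = np.clip(new_velocity, -v_max, v_max)  # keep steps bounded
    return position + new_velocity, new_velocity


rng = np.random.default_rng(2)
pos = np.array([0.0, 4.0, -3.0])
vel = np.zeros(3)
best = np.array([1.0, 1.0, 1.0])
new_pos, new_vel = tracing_mode(pos, vel, best, c=2.0, v_max=10.0, rng=rng)
```

With zero initial velocity and `c = 2.0`, each coordinate lands somewhere between its old value and its mirror image about the best position, so the cat never ends up farther from the target than it started.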

Optimization of hyper parameter utilizing cat algorithm
The hyperparameters in the fully connected layers of the proposed model are optimized using CSO. Choosing the right hyperparameters is critical since they affect how well a network performs, and the right choice depends on the task for which the ELM is used. The most typical hyperparameters in the ELM are the number of hidden units, the input weights, the number of epochs, and the learning rate. Table 1 shows the importance of these hyperparameters.
These hyperparameters need to be tuned to obtain more accurate results. Algorithm 1 provides the suggested cat-inspired optimization of hyperparameters in ELM. The cat population is initialized randomly over the input weights, hidden units, learning rate, and epochs. The new fitness function is defined as

$$\text{Fitness Function} = 1 - \max(\text{Accuracy}) \qquad (14)$$

Table 2 shows the specifics of the optimized hyperparameters obtained after applying the suggested optimization procedure.
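The fitness evaluation in Equation (14) can be illustrated with a short sketch that maps a cat's position to a hyperparameter set and scores it as one minus the best accuracy. Everything specific here is hypothetical: the search ranges in `BOUNDS`, the encoding of a cat as four numbers in [0, 1], and the `evaluate_accuracy` callback, which stands in for training the ELM and returning a list of validation accuracies.

```python
import numpy as np

# Hypothetical search ranges; the source does not list them.
BOUNDS = {
    "weight_scale": (0.1, 2.0),      # beta: scale of the input/bias weights
    "hidden_units": (10, 500),       # eta
    "epochs": (50, 200),             # mu
    "learning_rate": (1e-4, 1e-1),   # alpha
}
INTEGER_KEYS = ("hidden_units", "epochs")


def decode(position):
    """Map a cat position in [0, 1]^4 to a hyperparameter dictionary."""
    pos = np.clip(position, 0.0, 1.0)
    params = {}
    for p, (name, (low, high)) in zip(pos, BOUNDS.items()):
        value = low + float(p) * (high - low)
        params[name] = int(round(value)) if name in INTEGER_KEYS else value
    return params


def fitness(position, evaluate_accuracy):
    """Equation (14): one minus the best accuracy reached."""
    accuracies = evaluate_accuracy(decode(position))
    return 1.0 - max(accuracies)


# Stand-in evaluator returning fixed accuracies, for illustration only.
example_fitness = fitness(np.array([0.5, 0.5, 0.5, 0.5]),
                          lambda params: [0.90, 0.95])
```

Framing the objective as `1 - accuracy` turns accuracy maximisation into a minimisation problem, which is the form most swarm optimizers expect.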

Hardware details
The suggested model is implemented in Python using the open-source TensorFlow platform (Tensorflow.18). The multi-class classification system was trained for 100 iterations. The loss functions were optimized using the CSO optimizer to obtain the lowest possible loss during iteration. The model was developed using an i7 CPU, 16 GB RAM, and an NVIDIA K80 GPU operating at 2.5 GHz. Teaching in a way that achieves sustainable development while also adhering to twenty-first-century learning and establishing sustainable societies is a challenging endeavour.

Performance measures
This section demonstrates the effectiveness of the suggested framework over the competing learning techniques. Table 3 shows the number of datasets utilized to train and test the suggested model. Metrics including accuracy, precision, recall (sensitivity), specificity, and F1-score are calculated to evaluate the effectiveness of the suggested design. Table 3 displays the mathematical formulae for calculating the metrics needed to evaluate the suggested architecture. The Cat-optimized ELM has produced better results in prediction.
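The metrics named above follow the standard confusion-matrix definitions, which can be sketched as below for a binary case. The function name and the example counts are hypothetical and are not results from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # also called sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Made-up counts for illustration only.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
```

Reporting specificity alongside recall matters when the classes are imbalanced, since accuracy alone can look high while one class is mostly misclassified.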
Figure 4 shows the validation performance of the proposed model on the healthcare SDG datasets, giving the training and testing accuracy over a range of epochs.
The training accuracy reaches 98.8% at 200 epochs. The testing accuracy is 97.5% at 50 epochs and reaches 98.7% at 200 epochs on the healthcare SDG datasets, since the Cat-optimized ELM has produced better results in prediction.
The evaluation performance of the suggested model in addressing the goals of education and healthcare is shown in Figures 5 and 6. The proposed model yields the lowest validation error, as seen in the figures (root mean square error = 0.001). The suggested model also achieves 98.4% accuracy for the education datasets and 98.5% for the healthcare datasets. As the data are considered big data, the Cat-optimized ELM has produced better results in prediction.
Table 4 and Figure 7 present a comparative study of the ML approaches SVM, RF, DT, ANN, and ELM on the education datasets. The proposed model achieved the highest performance, as seen in the figure and table, whereas the artificial neural network (ANN) produced the lowest. Although the Extreme Learning Machine (ELM) performed better than SVM, DT, RF, and ANN, it still fell short of the proposed SHEGD model.
The CSO technique used to tune the hyperparameters of the ELM has played a significant role in producing the best performance compared with the other traditional ML algorithms. Table 5 and Figure 8 show similar performance patterns when categorizing the SDG datasets related to healthcare.

Conclusion with future scope
In today's world, ML algorithms play an important part in a wide variety of health-related domains and

Algorithm 1: CSO-based parameter optimization in full operation
1 Input: bias weights (β), number of hidden units (η), total epochs (µ), learning rate (α)
2 Set the cat swarm population N and the CSO velocity V
3 For n = 0 to N-1, where N is the maximum number of iterations
4   Evaluate the probability and the searching agents using Equations (8) and (9)
5   Find the hyperparameter values (β, η, µ, α)
6   Estimate the fitness function using Equation (14)
7   If the fitness function equals the threshold
8     Update the latest cats and save the best
9   Otherwise
10    Update the cats' values and go to Step 04
11  Stop
12 Stop
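The overall loop of Algorithm 1 can be sketched as a compact, runnable optimizer on a toy objective. This is a simplified illustration under several assumptions, not the authors' implementation: positions are confined to [0, 1], a hypothetical `mixture_ratio` decides which cats trace in each iteration, seeking mode picks the best candidate greedily instead of by the probability rule, and the loop stops early once the best fitness falls to a `threshold`.

```python
import numpy as np


def cso_minimize(fitness_fn, dim, n_cats=10, n_iter=50, mixture_ratio=0.2,
                 smp=5, cdc=0.8, srd=0.2, c=2.0, threshold=0.0, seed=0):
    """Simplified cat swarm optimization over the unit hypercube."""
    rng = np.random.default_rng(seed)
    pos = rng.random((n_cats, dim))                  # initial cat positions
    vel = rng.uniform(-0.1, 0.1, (n_cats, dim))      # initial velocities
    fit = np.array([fitness_fn(p) for p in pos])
    best_pos, best_fit = pos[fit.argmin()].copy(), fit.min()
    n_change = max(1, int(round(cdc * dim)))         # assumed: CDC = fraction mutated

    for _ in range(n_iter):
        tracing = rng.random(n_cats) < mixture_ratio  # per-cat mode flag
        for k in range(n_cats):
            if tracing[k]:
                # Tracing mode: accelerate toward the best cat so far.
                vel[k] += rng.random(dim) * c * (best_pos - pos[k])
                pos[k] = np.clip(pos[k] + vel[k], 0.0, 1.0)
            else:
                # Seeking mode: mutate copies and keep the best (greedy).
                cands = np.tile(pos[k], (smp, 1))
                for cand in cands:
                    d = rng.choice(dim, size=n_change, replace=False)
                    sign = rng.choice([-1.0, 1.0], size=n_change)
                    cand[d] *= 1.0 + sign * srd * rng.random(n_change)
                cands = np.clip(cands, 0.0, 1.0)
                cand_fit = np.array([fitness_fn(cnd) for cnd in cands])
                pos[k] = cands[cand_fit.argmin()]
            fit[k] = fitness_fn(pos[k])
            if fit[k] < best_fit:                    # remember the best cat
                best_pos, best_fit = pos[k].copy(), fit[k]
        if best_fit <= threshold:                    # early stop at threshold
            break
    return best_pos, best_fit


def toy_objective(x):
    """Stand-in for 1 - accuracy; minimum at x = 0.3 in every dimension."""
    return float(np.sum((x - 0.3) ** 2))


best_pos, best_fit = cso_minimize(toy_objective, dim=4)
```

In the hyperparameter setting, `toy_objective` would be replaced by a function that trains the ELM with the decoded hyperparameters and returns one minus its accuracy.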

Figure 3 shows the validation performance of the proposed model on the education SDG datasets. The training and testing accuracy are compared over 50 to 200 epochs. The training accuracy is greater than the testing accuracy: the training accuracy reached 98.5% and the testing accuracy reached 98.4% at 200 epochs on the education SDG datasets, since the Cat-optimized ELM has produced better results in prediction.

Figure 3. Handling the education SDG datasets to validate the proposed model's performance.

Figure 4. Handling the health care SDG datasets with the proposed model's validation performances.

Figure 5. Validation performance of the ELM in handling the health care SDG datasets.

Figure 6. Validation performance of the ELM in handling the education SDG datasets.

Figure 7. Performance comparison of ML methods for education SDG datasets.

Figure 8. Performance comparisons of ML methods for healthcare SDG datasets.

Table 1. The functions of the hyper parameter.

Table 2. Hyper parameters that are optimized are utilized to train the system in ELM.

Table 3. Mathematical formulae for the calculation of performance metrics.

Table 4. Performance analysis between the ML algorithms in handling education SDG datasets.