Modeling optical gap of cupric oxide nanomaterial semiconductor using hybrid intelligent method

Abstract Copper II oxide (CuO) semiconductor belongs to the compound of metal oxide with abundant uniqueness and features which facilitate its wider applicability. The nature of the optical band gap of this semiconductor strengthens its usage for many technological and industrial applications while chemical doping mechanisms through breaking of symmetry of the host semiconductor have proven successful for its energy gap tuning for meeting the desired demand. This work proposes hybrid particle swarm optimization-based support vector regression (PBSVR) as an effective intelligent algorithm for determining optical band gap using lattice parameters (distorted) as input predictors. The developed PBSVR model demonstrates low mean absolute error (MAE) of 0.287 eV, low root mean square error (RMSE) of 0.367 eV and high correlation coefficient (CC) of 90.3 % while validating on testing samples. PBSVR model performs better than three existing models in the literature which include stepwise regression model (SWR), extreme learning machine model with sigmoid function (ELM-IP-Sig) and sine function (ELM-IP-Sine). On the basis of MAE, the developed PBSVR model outperforms ELM-IP-Sig, ELM-IP-Sine and SWR models with performance improvement of 33.7%, 26.93% and 67.6%, respectively. The PBSVR model further investigates the influence of iron and aluminum on the semiconductor energy gap while the predicted optical band gaps agree excellently with the experimental optical gaps. The experimental stress circumvention potentials of the developed PBSVR model coupled with its superior performance over the existing models are of great importance in ensuring precise and quick characterization of CuO optical gap for desired applications.


Introduction
Semiconductor metal oxides such as copper II oxide (CuO) belong to the class of highly explored and meritorious semiconductors due to their low cost, wide and tunable energy gaps, the nonhazardous nature of their constituents and ease of fabrication (Bhuvaneshwari & Gopalakrishnan, 2016). At the nano-scale of its crystallites, additional characteristic features such as quantum size effects and large surface-to-volume ratio significantly enhance the optoelectronic properties for application in gas sensing, photodiodes and solar cells among others (Huang et al., 2014, 2016; Iqbal et al., 2017). Nanomaterials have specific surface properties as well as physical/chemical features linked to their sizes, such as thermal conductivity, electronic properties, chemical reactivity and mechanical strength (Auffan et al., 2009). As a result of these unique properties, this semiconductor has recently attracted significant attention from the scientific community. This attention is mainly focused on the ability of such materials to exhibit exceptional optical, catalytic and electronic properties only at nanoscale dimensions when compared with the bulk material (Mosquera et al., 2013). This class of semiconductors plays vital roles in the development of diverse technology-based materials such as gas sensors, solar cells and spintronics (Albert Manoharan et al., 2018).
Copper II oxide semiconductor is characterized by a large number of intrinsic vacancies when doped with foreign materials for optical feature improvement, and this method of property enhancement promotes the potential of the semiconductor in spintronics and gas sensing applications among others (Liu et al., 2011; Philipose et al., 2006; Xing et al., 2010; Yuan et al., 2009). The influence of dopants on the energy gap is modeled in this work using a hybrid intelligent model with distorted lattice parameters as descriptors.
Copper II oxide (also called cupric oxide) belongs to the class of p-type semiconductors with monoclinic crystal structure (Tran & Tuyen Nguyen, 2014). The crystal structure of the semiconductor shows co-ordination of each Cu atom with four oxygen atoms in a square planar arrangement. CuO has the lattice parameters a = 0.4684 nm, b = 0.3423 nm, c = 0.5129 nm, β = 99.54° and α = γ = 90° (Forsyth & Hull, 1991). The lattice features can be distorted and altered after foreign material inclusion and during fabrication processes. Extensive research has been conducted on transition element-doped CuO due to its potential applications in many areas of technology. For instance, CuO has been doped with Sn (Vomáčka et al., 2016), Cr (Bhuvaneshwari & Gopalakrishnan, 2016), Co (Baturay et al., 2019), Mn (N. Sharma et al., 2015) (Iqbal et al., 2017), Fe (Haq et al., 2017; Mohamed Basith et al., 2013; Pugazhendhi et al., 2018), La (Devi et al., 2017; Rodney et al., 2018), Li (Chand et al., 2014), Ni (Basith et al., 2014), Nd (Albert Manoharan et al., 2018), Ce (Ponnar et al., 2018), See (Sharma & Kumar Dutta, 2018), and Zr (Mersian et al., 2018). Similarly, deposition of nano-crystalline CuO thin films has been accomplished using a wide range of processes, including sol-gel (Danks et al., 2016; Yang et al., 2010), co-precipitation (Mukhtar et al., 2012), pulsed laser deposition (Chand et al., 2014), chemical vapor deposition, combustion synthesis, hydrothermal, solvothermal (Kaviyarasu et al., 2014), chemical reduction (Jose et al., 2016; Kaviyarasu et al., 2014), magnetron sputtering, spin coating (Shaikh et al., 2011), molecular beam epitaxy, spray pyrolysis (Sajeesh et al., 2010, 2012; Vijayalakshmi et al., 2008), electrodeposition (Desai et al., 2000) and thermal evaporation among others.
Several methods have also been employed for calculating the electronic properties of CuO. These include the Perdew-Burke-Ernzerhof functional within the Generalized Gradient Approximation (PBE-GGA) (Baruah et al., 2004), the Configuration Interaction by Perturbation Selected Iteratively (CIPSI) (Daoudi et al., 1999), Local Spin Density Approximation (LSDA), Plane Wave Pseudo Potential-Local Spin Density Approximation (PWPP-LSDA), Becke three parameters, Lee, Yang and Parr (B3LYP), and the Complete Active Space Self Consistent Field (CASSCF) (Mochizuki et al., 1991) methods. In this work, the lattice parameters distorted by dopant incorporation are related to the energy gap using the hybrid particle swarm optimization based support vector regression (PBSVR) algorithm. Novelties of the developed model include its superior performance over the existing models and its potential for precise as well as quick characterization of the CuO optical gap for desired applications such as water purification, pollutant degradation and nitrogen fixation among others (El et al., 2022; Menazea & Awwad, 2020; Menazea et al., 2022; Morsy et al., 2022; Okoye et al., 2023).
Support vector regression (SVR) is a supervised machine learning method based on structural risk minimization, with excellent potential for establishing relationships between descriptors and target (Rui et al., 2019; Vapnik, 1998). The support vector regression algorithm has its background in statistical learning theory and was initially developed for classification problems under the name support vector machines. SVR adapts well to small training sets while maintaining strong performance and overcomes the major drawbacks of neural networks (Esfandiarpour-Boroujeni et al., 2019). The algorithm conveniently approximates complex non-linear relationships with high precision, and this has widened its application domains in many areas (Akomolafe et al., 2021; Dodangeh et al., 2020; Owolabi & Amiruddin Abd Rahman, 2021; Owolabi et al., 2021; Parsa & Naderpour, 2021). The demonstrated uniqueness of SVR-based models is extended in the present work to energy gap estimation of copper II oxide nano-material semiconductor using distorted lattice parameters as descriptors. The precision of an SVR-based model depends strongly on the combination of the SVR hyper-parameters, which include the epsilon, the penalty factor and the kernel parameter. These parameters are tuned, selected and determined using a swarm-based algorithm.
The precision and accuracy of the developed particle swarm optimization-based support vector regression (PBSVR) are compared with the existing models using several evaluation parameters. The developed PBSVR outperforms the existing models such as the extreme learning machine intelligent model with sine activation function (ELM-IP-Sine), stepwise regression (SWR) model and extreme learning machine intelligent model with sigmoid activation function (ELM-IP-Sig) (Alqahtani, 2021) with performance improvement of 22.87%, 61.48% and 33.79%, respectively, when evaluated on testing data samples using root mean square error (RMSE) as comparison metric.

Mathematical formulation and background of the intelligent algorithms
This section presents the mathematical operational principles of the particle swarm optimization and support vector regression algorithms.

Description of support vector regression algorithm
Suppose that a set of lattice parameters ($P_k$) for doped copper II oxide nano-material semiconductors and their corresponding measured energy gaps $E_k^{\mathrm{measured}}$ are defined as $\{(P_k, E_k^{\mathrm{measured}})\}_{k=1}^{M}$ with $P_k \in \mathbb{R}^m$ for $M$ available copper II oxide semiconductors, where $k = 1, 2, \ldots, M$. The regression equation to be established using the support vector regression algorithm is contained in equation (1) (Bhagwat & Maity, 2012; Owolabi, 2019; Science & Hasheminejad, 2021).
Here χ and $E_k$ respectively represent the threshold coefficient (also known as the biasing parameter) and the estimated energy gap of doped copper II oxide semiconductor such that $\chi \in \mathbb{R}$ and $\omega \in \mathbb{R}^n$. Furthermore, the weight vector is represented by ω. The SVR algorithm aims at determining $E_k$ for every doped copper II oxide such that each determined energy gap deviates by at most ε from $E_k^{\mathrm{measured}}$. To attain this goal, equation (2) is subjected to minimization with consideration of the constraints presented in equation (3). Here the positive trade-off coefficient (the regularization factor) is represented by β while the slack variables are represented by $\psi_k^{*}$ and $\psi_k$ in equation (3). The slack variables are included to cater for data points that would otherwise violate the requirement that the deviation of the estimated energy gaps does not exceed epsilon (ε). L in equation (2) stands for the loss function with mathematical expression presented in equation (4) (Shamsah & Owolabi, 2020).
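The display equations referred to in this subsection follow the standard ε-insensitive SVR formulation. Under that assumption, and writing $E_k$ for the estimated gap and $E_k^{\mathrm{measured}}$ for the measured gap (notation assumed here), equations (1) to (4) can be sketched as below. The exact typesetting of the original equations may differ.

```latex
% (1) regression function
E_k = \langle \omega, P_k \rangle + \chi
% (2) objective to be minimized
\min_{\omega,\,\chi}\; \frac{1}{2}\lVert \omega \rVert^{2}
   + \beta \sum_{k=1}^{M} L\!\left(E_k^{\mathrm{measured}}, E_k\right)
% (3) constraints with slack variables
E_k^{\mathrm{measured}} - \langle \omega, P_k \rangle - \chi \le \varepsilon + \psi_k, \qquad
\langle \omega, P_k \rangle + \chi - E_k^{\mathrm{measured}} \le \varepsilon + \psi_k^{*}, \qquad
\psi_k,\ \psi_k^{*} \ge 0
% (4) epsilon-insensitive loss
L\!\left(E_k^{\mathrm{measured}}, E_k\right) =
\begin{cases}
0, & \lvert E_k^{\mathrm{measured}} - E_k \rvert \le \varepsilon \\
\lvert E_k^{\mathrm{measured}} - E_k \rvert - \varepsilon, & \text{otherwise}
\end{cases}
```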
With Lagrange multipliers ($\delta_k$ and $\delta_k^{*}$) implemented for solving the constrained optimization problem, the weight vector obtained is presented in equation (5).
With inclusion of the weight vector presented in equation (5), equation (1) can be modified as presented in equation (6).
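In the standard dual derivation of ε-insensitive SVR, the two relations just described usually take the following form. This is a sketch under that assumption, using the multipliers $\delta_k$ and $\delta_k^{*}$ named in the text; $E(P)$ for the estimated gap at a new descriptor vector $P$ is notation assumed here.

```latex
% (5) weight vector from the Lagrangian stationarity condition
\omega = \sum_{k=1}^{M} \left(\delta_k - \delta_k^{*}\right) P_k
% (6) regression function after substituting (5) into (1)
E(P) = \sum_{k=1}^{M} \left(\delta_k - \delta_k^{*}\right) \langle P_k, P \rangle + \chi
```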
Non-linearity is addressed within the SVR operational principle using the kernel trick, which allows mapping to a space of higher dimension, $\Phi : \mathbb{R}^m \to \mathbb{R}^{m+h}$ with $P \to \Phi(P)$. Equation (7) presents the final relation while the mathematical formulation of the employed Gaussian kernel function is presented in equation (8) (Murillo-Escobar et al., 2019). Here μ is the kernel parameter with which the structure of the high dimensional feature space is defined. The kernel parameter (μ), regularization factor (β) and epsilon (ε) need to be carefully tuned and selected so as to ensure a precise model. The optimization algorithm employed in this work is particle swarm optimization due to its fast convergence to global solution.
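As a rough illustration of the kernelized prediction described above, the sketch below evaluates a Gaussian kernel and the resulting kernel expansion in plain Python. The function names, the coefficient values and the specific kernel form exp(-μ‖p − q‖²) are assumptions made for illustration; the text only states that μ is the kernel parameter.

```python
import math

def gaussian_kernel(p, q, mu):
    """Gaussian kernel between two descriptor vectors.

    Assumed form: exp(-mu * ||p - q||^2). The source only names mu as
    the kernel parameter, so the exact placement of mu is a guess.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-mu * sq_dist)

def svr_predict(p, support_vectors, dual_coefs, bias, mu):
    """Kernel expansion: sum_k coef_k * K(P_k, p) + bias.

    coef_k stands for (delta_k - delta_k*) and bias for chi.
    """
    return bias + sum(
        coef * gaussian_kernel(sv, p, mu)
        for sv, coef in zip(support_vectors, dual_coefs)
    )

# Illustrative call with made-up coefficients (not fitted values).
undoped = [0.4684, 0.3423, 0.5129]  # a, b, c in nm, as quoted in the text
estimate = svr_predict(undoped, [undoped], [0.2], 1.5, mu=0.005)
```

Only training points with a non-zero coefficient (the support vectors) contribute to the sum, which is why the trained model needs to store just those points together with the bias.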

Particle swarm optimization (PSO) algorithm
PSO is a unique optimization method that obtains the global solution of optimization problems by mimicking the movement of organisms in a fish school or bird flock (Eberhart & Kennedy, 1995; Owolabi, 2023). The algorithm has demonstrated high efficiency in many real-life problems through initialization of some probable solutions with a defined dimension in line with the addressed problem. The initialized possible solutions are further optimized in an iterative process. Consider a searching procedure in which particle $t$ ($t = 1, 2, \ldots, \lambda$) is defined by a position vector $y_t \in \mathbb{R}^n$, velocity $\nu_t \in \mathbb{R}^n$ and best position $\sigma_t \in \mathbb{R}^n$ in an n-dimensional search space containing λ particles with randomized initial positions as well as velocities. Computation of the deviation through the objective function determines the best position $\sigma_t$ of each particle. The position of a particle with the smallest error so far is regarded as its current best position, while the global best position ρ is defined as the position with the smallest error among all $\sigma_t$. The global best position and the particle best position are employed in adjusting the position and the velocity of the particle at every iterative step of algorithm execution (Parsa & Naderpour, 2021). The algorithm is conditioned to stop after reaching the maximum number of iterations or attaining the defined minimum error. Equation (9) and equation (10) respectively update the velocity and position of the particles.
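In the usual PSO form, and with the coefficients defined in the sentence that follows, the velocity and position updates can be sketched as below. The iteration superscript $i$ and the placement of the velocity magnitude factor ϕ as an overall multiplier are assumptions, since the original display equations are not reproduced here.

```latex
% (9) velocity update
\nu_t^{\,i+1} = \phi \left[ \alpha\, \nu_t^{\,i}
   + \upsilon_1 \kappa_1 \left( \sigma_t - y_t^{\,i} \right)
   + \upsilon_2 \kappa_2 \left( \rho - y_t^{\,i} \right) \right]
% (10) position update
y_t^{\,i+1} = y_t^{\,i} + \nu_t^{\,i+1}
```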
Where ϕ = velocity magnitude factor, α = inertia weight coefficient, $\upsilon_1$ = first acceleration constant, $\upsilon_2$ = second acceleration constant, $\kappa_1$ = first random weight and $\kappa_2$ = second random weight. Each random weight lies between 0 and 1. The adopted velocity magnitude factor speeds up convergence. The influence of the previous velocity on the present one is determined by the inertia weight coefficient: a lower value of α favors local search (exploitation) while a larger value enhances global exploration. To achieve a balance between exploration and exploitation, a linearly decreasing inertia weight coefficient defined by equation (11) is applied.
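The update rules and the linearly decreasing inertia weight can be illustrated with a minimal sketch for a single particle. The default values (inertia bounds 0.9 and 0.4, acceleration constants 1.5, ϕ = 1) and the function names are assumptions chosen for illustration, not values reported in this work.

```python
import random

def inertia_weight(iteration, max_iter, w_max=0.9, w_min=0.4):
    """Linearly decreasing inertia weight (assumed form of equation (11))."""
    return w_max - (w_max - w_min) * iteration / max_iter

def update_particle(pos, vel, pbest, gbest, w, rng,
                    c1=1.5, c2=1.5, phi=1.0):
    """One velocity and position update for a single particle.

    pos, vel, pbest and gbest are lists of equal length; rng supplies
    the two random weights in [0, 1) for every dimension.
    """
    new_vel, new_pos = [], []
    for y, v, s, g in zip(pos, vel, pbest, gbest):
        k1, k2 = rng.random(), rng.random()
        v_next = phi * (w * v + c1 * k1 * (s - y) + c2 * k2 * (g - y))
        new_vel.append(v_next)
        new_pos.append(y + v_next)
    return new_pos, new_vel

# Example: one step for a particle in a 3-dimensional search space.
rng = random.Random(0)
w = inertia_weight(iteration=10, max_iter=100)
pos, vel = update_particle([5.0, 0.005, 0.005], [0.0, 0.0, 0.0],
                           [4.0, 0.004, 0.006], [3.0, 0.003, 0.007], w, rng)
```

Early iterations use a large inertia weight so particles keep moving across the search space; later iterations shrink it so the swarm settles around the best positions found.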

Computational method for algorithms hybridization
The description of the employed sample data is contained in this section. Furthermore, the employed computational methodologies are also presented.

Description of the employed dataset
Fifty-seven samples of copper II oxide nano-materials doped with a variety of external materials serve as the source of the dataset utilized for the simulation. The entire set of data was extracted from the literature (Albert Manoharan et al., 2018; Arfan et al., 2019; Babu et al., 2021; Basith et al., 2014; Chand & Kumar, 2018; Chaudhary et al., 2021; Mohamed Basith et al., 2013; Ponnar et al., 2018; Rodney et al., 2018; Sharma et al., 2017; Velliyan & Rajendran, 2021; Vimala Devi et al., 2017). The dataset contains the distorted lattice parameters along the three axes of the semiconducting crystal structure and their corresponding energy gaps. The introduced dopants frequently induce lattice distortion in the parent semiconductor due to the difference between the ionic radii of the dopants and that of the parent copper, the presence of impurity phases and oxygen vacancies created by the incorporated dopants (Elayaperumal et al., 2015). The results of the statistical analysis conducted on the employed data-samples are shown in Table 1.
The presented mean values for each of the predictors and the target give insight into the overall content of the employed data-samples while the minimum and maximum values indicate the range of the data-samples. The presented standard deviations measure the spread in the data-samples arising from experimental measurement, especially during dopant incorporation. The lattice parameter predictors are weakly correlated with the optical band gap, which clearly necessitates efficient non-linear models that can establish a link between energy gaps and (distorted) lattice parameters.

Computational strategies
The proposed PBSVR model for estimating the optical band gap of doped cupric oxide semiconductor was developed within the MATLAB computing environment. The energy gap estimation itself was performed by the support vector regression algorithm while the particle swarm optimization algorithm handles hyper-parameter tuning so as to avoid being trapped within local solutions. The cupric oxide lattice parameters (which serve as the descriptors) and the measured energy gaps (which serve as the target) were randomized and subsequently partitioned into training and testing sets in the ratio of 8:2. Since only fifty-seven data-points extracted from the literature are available for modeling and simulation, data-points from forty-six doped cupric oxide nano-semiconductors go to the training set while data-points from eleven doped cupric oxide semiconductors were assigned to the testing set. Due to the few available data-points, a test-set cross-validation technique was adopted for model optimization using the particle swarm optimization technique.
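The randomization and 8:2 partition can be illustrated with a short sketch. It is written in Python rather than MATLAB, and the seed and function name are hypothetical; the point is only to show how fifty-seven samples yield forty-six training and eleven testing points.

```python
import random

def split_indices(n_samples, train_fraction=0.8, seed=42):
    """Shuffle sample indices and split them into training and testing sets.

    The seed is an arbitrary illustrative choice; any fixed value makes
    the partition reproducible.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = round(n_samples * train_fraction)  # 57 * 0.8 = 45.6 -> 46
    return indices[:n_train], indices[n_train:]

train_idx, test_idx = split_indices(57)
print(len(train_idx), len(test_idx))
```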
The step-by-step computational description of the developed PBSVR model is as follows:
Step 1: Swarm and model parameters initialization: The parameters initialized include the inertia weight coefficient (α), acceleration constants ($\upsilon_1$ and $\upsilon_2$), swarm population size (λ), maximum iteration (T) and swarm search space. Since the parameters to be optimized include the regularization factor (β), epsilon (ε) and kernel parameter (μ), the upper bounds of the search space were respectively set at 1000, 0.009 and 0.009 while the lower bounds were respectively set at 1, 0.001 and 0.001. In order to ensure global stability of the developed model, random search procedures were carried out before finalizing the upper and lower bounds of the search space.
Step 2: Generation of randomized swarm position and velocity: Using the defined search space as the boundaries of probable solutions, swarm positions and velocities were generated randomly and afterwards updated as the algorithm evolves from generation to generation until the attainment of the global solution.
Step 3: Fitness computation: The fitness of each particle in the population was evaluated through computation of the root mean square error (RMSE) between the measured energy gaps and those estimated by the support vector regression algorithm incorporating that particle, for the testing phase of model development. The fitness computational details are as follows: (i) selection of a kernel function from Gaussian, sigmoid and polynomial functions for transferring data-points to a feature space of higher dimension, in which a linear model is constructed; (ii) incorporation of the swarm particle (whose fitness is to be determined) in the chosen kernel function together with the training set of data; (iii) training of the SVR algorithm with the incorporated particle and training dataset, after which the RMSE-training is saved together with the support vectors and the other performance measuring parameters; (iv) repetition of step (iii) using the testing set of data, with RMSE-testing also saved; (v) repetition of steps (i) to (iv) for the other kernel functions; (vi) saving of the RMSE-testing value for each swarm particle and ranking in ascending order. The swarm particle with the lowest value of RMSE-testing is the best model.
Step 4: Individual best position update: If the current position is represented as $\sigma_{\mathrm{current}}$, equation (12) is implemented at this stage of algorithm execution.
Step 5: Global best position update: Equation (13) is implemented for global position (ρ) update.
Step 6: Maximum number of particle check: proceed to the next step if t<λ and go back to step 3 if otherwise.
Step 7: Fitness function computation with global best position: implement $\rho_t$ for fitness function computation.
Step 8: Update final velocity and position: employ equation (9) and equation (10) to update the velocity and position of the particles in the swarm.
Step 9: Stopping conditions: If the same value of RMSE-testing is obtained for fifty consecutive iterations or the maximum iteration is attained, the algorithm is brought to a stop. The flowchart is presented in Figure 1.
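The bookkeeping in Steps 4, 5 and 9 can be sketched as follows. This is a minimal illustration assuming that fitness is the RMSE-testing value to be minimized; the function names, the tolerance argument and the data layout are hypothetical and are not taken from the MATLAB implementation.

```python
def update_bests(positions, fitness, pbest_pos, pbest_fit, gbest_pos, gbest_fit):
    """Steps 4 and 5: keep the lowest-error position per particle and overall.

    pbest_pos and pbest_fit are updated in place; the (possibly new)
    global best position and fitness are returned.
    """
    for t, (pos, fit) in enumerate(zip(positions, fitness)):
        if fit < pbest_fit[t]:       # individual best update
            pbest_pos[t], pbest_fit[t] = list(pos), fit
        if fit < gbest_fit:          # global best update
            gbest_pos, gbest_fit = list(pos), fit
    return gbest_pos, gbest_fit

def should_stop(gbest_history, iteration, max_iter, patience=50, tol=0.0):
    """Step 9: stop at max_iter or when the best error is unchanged
    over the last `patience` iterations."""
    if iteration >= max_iter:
        return True
    if len(gbest_history) < patience:
        return False
    recent = gbest_history[-patience:]
    return max(recent) - min(recent) <= tol

# Example with two particles described by (beta, epsilon, mu).
pbest_pos = [[10.0, 0.005, 0.005], [500.0, 0.002, 0.008]]
pbest_fit = [0.60, 0.45]
gbest_pos, gbest_fit = update_bests(
    [[20.0, 0.004, 0.006], [400.0, 0.003, 0.007]], [0.40, 0.50],
    pbest_pos, pbest_fit, [500.0, 0.002, 0.008], 0.45)
```

A tolerance of zero mirrors the "same value" wording of the stopping rule; a small positive tolerance would be a natural relaxation when errors are stored as floating-point numbers.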

Results and discussion
The predicted energy gaps using the PBSVR model are presented in this section. Comparison of the estimates of the existing models with the present model is also contained in this section.

Optimum hyper-parameters from particle swarm optimization algorithm
The results of hyper-parameter optimization using the particle swarm optimization method are presented in Figure 2. The number of particles in the swarm is varied from ten to two hundred so that the exploration and exploitation capacities of the algorithm reach a balance and local convergence is avoided. The hyper-parameters optimized include the regularization factor (β), epsilon (ε) and kernel parameter (μ). Figure 2(a) presents the optimization and convergence of the maximum error threshold epsilon at different swarm sizes. For ten particles in the swarm, the model converges to a high epsilon, which does not indicate a precise model. The search space is not well explored with so few particles; hence, the model is characterized by weak exploration ability. Increasing the number of particles in the swarm strengthens the exploration capacity of the algorithm and balances it with the exploitation strength, which allows the algorithm to fully exploit the global solution and show excellent convergence. The convergence of the developed PBSVR becomes insensitive to swarm population when more than fifty particles participate in global exploration. Swarm sizes of fifty, one hundred and two hundred show similar convergence. Similar behavioral patterns are demonstrated for kernel parameter optimization and error convergence presented in Figure 2(b) and Figure 2(c), respectively. The optimum values for each of the hyper-parameters are presented in Table 2.

Evaluation of model accuracy and comparison with the existing models
Performance of the developed PBSVR model during different phases of model development is presented in Figure 3. The performance was evaluated on the basis of root mean square error (RMSE), correlation coefficient (CC) and mean absolute error (MAE) between the measured and estimated energy gaps of cupric oxide nano-material semiconductor doped with foreign materials. The principal role of the incorporated dopants is to adjust and tune the energy band gap of CuO semiconductor for photocatalytic activity enhancement. The foreign materials employed for doping include cerium, copper and nickel among others.
The developed PBSVR model shows better performance in the testing phase than in the training phase, where the support vectors needed for future modeling are drawn. It was observed that the testing stage of PBSVR shows better performance over the training stage by 1.64%, 14.60% and 29.29% using CC, RMSE and MAE, respectively. Comparisons of the outcomes of the developed PBSVR model with existing models are presented in Figure 4. Figure 4(a) presents the comparison between the developed PBSVR model and the existing models on the basis of RMSE for the training set of data. Using this yardstick, the developed PBSVR model outperforms the existing ELM-IP-Sig (Alqahtani, 2021), ELM-IP-Sine (Alqahtani, 2021) and SWR (Alqahtani, 2021) models with performance improvement of 10.94%, 13.74% and 51.38%, respectively. On the basis of CC for the training set of data presented in Figure 4(b), the developed PBSVR outperforms the existing ELM-IP-Sig (Alqahtani, 2021), ELM-IP-Sine (Alqahtani, 2021) and SWR (Alqahtani, 2021) models with performance enhancement of 5.19%, 6.49% and 83.43%, respectively. Similar performance improvements of 30.25%, 33.09% and 71.30% were respectively obtained while comparing the developed PBSVR model with the existing ELM-IP-Sig, ELM-IP-Sine and SWR models on the basis of MAE presented in Figure 4(c). For the testing set of data presented in Figure 4(d) to Figure 4(f), the developed PBSVR model outperforms the existing ELM-IP-Sig, ELM-IP-Sine and SWR models with performance enhancement of 7.13%, 3.53% and 34.3%, respectively, on the basis of CC and 33.70%, 26.93% and 67.59%, respectively, on the basis of MAE. The value of each of the performance evaluation parameters and the performance improvement of the present model over the existing models are presented in Table 3.
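For reference, the three evaluation metrics and a percentage improvement can be computed as in the sketch below. The definition of improvement as a change relative to the existing model is an assumption, since the text does not state the formula, and the numbers in the example are made up for illustration rather than taken from Table 3.

```python
import math

def mae(measured, estimated):
    """Mean absolute error."""
    return sum(abs(m - e) for m, e in zip(measured, estimated)) / len(measured)

def rmse(measured, estimated):
    """Root mean square error."""
    return math.sqrt(
        sum((m - e) ** 2 for m, e in zip(measured, estimated)) / len(measured))

def cc(measured, estimated):
    """Pearson correlation coefficient."""
    n = len(measured)
    mean_m, mean_e = sum(measured) / n, sum(estimated) / n
    cov = sum((m - mean_m) * (e - mean_e) for m, e in zip(measured, estimated))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov / math.sqrt(var_m * var_e)

def improvement(existing, developed, lower_is_better=True):
    """Percentage improvement of the developed model over an existing one
    (assumed definition: change relative to the existing model's value)."""
    if lower_is_better:   # error metrics such as RMSE and MAE
        return (existing - developed) / existing * 100.0
    return (developed - existing) / existing * 100.0  # e.g. CC

# Made-up energy gaps in eV, for illustration only.
measured = [1.20, 1.45, 1.80, 2.10]
estimated = [1.25, 1.40, 1.95, 2.00]
scores = (mae(measured, estimated), rmse(measured, estimated), cc(measured, estimated))
```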

Investigation of the influence of different dopants on energy gap of cupric oxide using developed PBSVR model
The doping effects of terbium, nickel, iron, cerium and aluminum on the energy gap of cupric oxide are presented in Figure 5 together with experimental results. Incorporation of terbium (Tb) leads to defect formation in which variation in the level of scattered photons results in a blue shift in the absorption range of the spectrum. The estimated energy gap for terbium doped cupric oxide agrees well with the measured value (Vimala Devi et al., 2017). The observed change in the energy gap of cupric oxide due to terbium incorporation can be attributed to vacancy formation in the parent lattice structure (Tounsi et al., 2015). In the case of nickel doped cupric oxide presented in Figure 5, the closeness of the ionic radii of Cu 2+ and Ni 2+ leads to an insignificant difference between the lattice parameters of the doped and un-doped semiconductor, while cation vacancy formation due to charge variation contributes to the observed higher energy gap. The estimated energy gap using the developed PBSVR agrees with the measured value (Basith et al., 2014).
Other factors to which the observed high energy gap might be attributed include quantum confinement, size effect and oxygen stoichiometry (Chou et al., 2008). Cerium doped cupric oxide presented in Figure 5 shows a reduced energy gap due to the quantum size effect. The decrease in energy gap is associated with exchange interaction between electrons localized in the cerium dopant and the band electrons. The process of energy gap reduction due to cerium incorporation involves absorption of visible light by cupric oxide nano-particles with excitation of electrons from the valence to the conduction band, followed by interaction between conduction band electrons and the 4f electrons of the cerium dopant. The estimated energy gap for cerium doped cupric oxide agrees with the measured value (Ponnar et al., 2018). Doping of cupric oxide with iron creates shallow levels inside the band gap, which results in a red shift in the absorption edge. The experimentally measured energy gap for iron doped cupric oxide agrees with the value estimated using PBSVR (Kannaki et al., 2016). Aluminum dopants as presented in Figure 5 change the crystal symmetry due to vacancy and defect formation within the lattice sites, which subsequently leads to charge imbalance. The result of the developed PBSVR model for aluminum doped cupric oxide agrees with the measured value (Arfan et al., 2019). The comparison between the experimentally measured energy gaps and the estimated values is presented in Table 4. The absolute error for each of the doped semiconductors is also presented in the table. The details of the experimental preparation of the CuO samples are contained in the cited references.

Conclusion
The energy band gaps of cupric oxide nano-material semiconductor are estimated using a hybrid particle swarm optimization based support vector regression (PBSVR) model. The inputs to the PBSVR model are the lattice parameters distorted by vacancies and defects due to the incorporated dopants. The performance of the developed PBSVR model is compared with the existing models for cupric oxide energy gap estimation using various performance metrics. The developed PBSVR model demonstrates superior performance over the existing models with significant performance improvement. The developed PBSVR model is used to investigate the influence of terbium, nickel, iron, cerium and aluminum on the energy gap of cupric oxide semiconductor, and the obtained energy gaps agree well with the measured values. The precision as well as the superiority of the developed PBSVR model would be valuable in providing a quick and accurate approach to determining the energy gap of cupric oxide without experimental stress.

Figure 2. Convergence of hyperparameter during model optimization (a) convergence of epsilon at different swarm particle sizes (b) convergence of kernel parameter at different swarm particle sizes (c) model error convergence at different swarm particle sizes.
Figure 4. Performance evaluation and comparison between the present and existing models (a) training phase RMSE (b) training phase CC (c) training phase MAE (d) testing phase RMSE (e) testing phase CC (f) testing phase MAE.