A novel model for estimating the body weight of Pelibuey sheep through Gray Wolf Optimizer algorithm

ABSTRACT Weight prediction in live animals remains challenging. Several studies have tried to predict body weight in livestock from morphometric measurements; Schaeffer's model is one example. However, the fit of those models in small ruminants is not well covered. Therefore, a novel model to predict the weight of Pelibuey sheep through morphometric measurements and the Gray Wolf Optimizer algorithm is presented. The model calculates the volume of the specimen as a truncated cone and leaves density as a parameter to be estimated by the algorithm. Two alternative models were also built by optimizing the original Schaeffer's model. The models modified from the original Schaeffer's formula showed improvements of up to 22.61% in R-squared and decreases of up to 33.48% in RMSE. However, the truncated cone model gave the best estimates, with an RMSE of 2.57, an R-squared of 89.02%, and the lowest AIC. This represented a 25.13% improvement in R-squared and a 38.31% reduction in RMSE relative to the original Schaeffer's model. The model is expected to improve its efficiency with a larger sample of animals, and it is also intended to be implemented in animals of other proportions.


Introduction
Sheep farming is a livestock activity of great importance: according to the 2017 Census of Agriculture report (NASS 2017), the sheep market increased by 6.1% from 2012 to 2017. This census is reported every 5 years. In this context, it is very important to continue increasing production efficiency, but improvements through innovation and research are also required. For example, studying the characteristics of the carcass allows various parameters to be established, among which carcass yield (Juárez et al. 2018; Prache et al. 2021) and percentage of meat (Pavlov et al. 2018) stand out. These characteristics provide information about the quality of the final product; however, to obtain this type of information, it is first necessary to sacrifice the animals, which is not always possible.
The ability to predict the final weight of the beef at slaughter and the survival rate of heifers in advance can help producers to optimize their strategies for better production (Hakem et al. 2022).
Therefore, it would be very useful to carry out studies aimed at predicting the composition in live animals, without the need to sacrifice them.
The concept of using body measurements of animals to estimate their weight has been applied to various livestock including cattle, buffalo, chickens, pigs, sheep, fish, horses, and rabbits (Tscharke and Banhazi 2013).
Among the characteristics most widely considered by cattle weight studies are body length (BL), hip height (HH), height at withers (WH), heart girth (HG), and abdominal girth (AG) (Tebug et al. 2018; López-Carlos et al. 2010; Dohmen et al. 2022). These measurements are used to monitor animal performance through weight gain and to support research and genetic improvement in the area (Weber et al. 2020).
In this sense, many studies have tried to predict body weight (BW) through morphometric measurements (MM) (Yan et al. 2009; Gurgel et al. 2021a; Gurgel et al. 2021b; Iqbal et al. 2019), with the aim of doing so faster, without the need to take the animals to a weighing area, and avoiding the possible stress to which they may be exposed. Correlation between ante-mortem and post-mortem parameters has also been used as a practical tool to identify animals whose carcasses are of better quality before slaughter (Assan 2013).
Wangchuk et al. (2018) analysed different measurement techniques based on simple formulas that require MM. The techniques were the weighbridge, weighing tape, Rondo tape, Schaeffer's formula, Agarwal's formula, and the calculator method. It was concluded that Schaeffer's formula is the most reliable of all the techniques for estimating the live body weight of cattle, followed by the weighing tape.
The prediction of live weight of Somba cattle from linear body measurements was studied in Vanvanhossou et al. (2018). They proposed different polynomial regression models, and found that the allometric regression model using only chest girth (CG) is the best for predicting weight, regardless of age and sex.
An investigation to determine if farmers adequately estimate the live weight of cattle was conducted by Machila et al. (2008), since if the weight estimates are underestimated, then the drug doses administered will be insufficient when the farmers treat the animals.
Genetic and phenotypic correlations between BW and MM as cattle grow have been investigated (Kamprasert et al. 2019). It has been concluded that BW and MM in Brahman cattle can be genetically improved through a breeding program. Genetic correlations of up to 97% were found between BW and BL.
It is also common to find studies where artificial intelligence is used to assist decision-making in livestock farming. In order to detect more important parameters without having contact with the specimens, an artificial vision algorithm has been developed to collect body measurements of sheep efficiently and estimate the weight through different regression models (Lina Zhang et al. 2018). In the work (Sant'ana et al. 2021) a system for the prediction of the body mass of sheep through image processing and machine learning techniques is proposed. An automated computer vision system to create 3D models of Hereford cattle was designed by Ruchay et al. (2020) based on RGB-D data to obtain various measurements from the body without having contact with the specimen. Errors of 3% were obtained using a 90% confidence interval with respect to manual measurements. However, the new measurement systems are expensive and it becomes difficult to acquire them in rural areas (Tebug et al. 2018).
Other studies proposed nonlinear models, which may predict BW better but require a wide variety of parameters to be adjusted, such as those in Cano et al. (2015) and Maharani et al. (2017), where different nonlinear equations are compared to predict weight in cattle. Nonlinear functions such as Richards', Brody's, Bertalanffy's, Gompertz's and the logistic function are special cases of the four-parameter nonlinear function $y = A(1 - b^{-kt})^{M}$, where $y$ is the observed BW (kg) at age $t$ (days) and $A$, $b$, $k$ and $M$ are constants that are fixed depending on the conditions. The equations were fitted using spline modelling, and the Richards model was found to be the best one. In Mgbere and Olutogun (2002) nonlinear models were presented to estimate the weight of N'Dama cattle with respect to time. They showed that using nonlinear models makes it possible to reduce the number of descriptive parameters of cattle growth.
The previous works demonstrate the importance of weight estimation in live animals and give an overview of current models. However, the fit of those models in small ruminants is not well covered. Therefore, in this study a novel model for weight estimation of Pelibuey sheep through the use of MM and the Gray Wolf Optimizer (GWO) algorithm is presented. The model uses certain MM, specifically those most correlated with weight, calculates the volume of the specimen as a truncated cone, and leaves density as a parameter to be estimated by the algorithm. Two alternative models were also built by optimizing the original Schaeffer's model, so that their results could be compared with those of the truncated cone model. The number of parameters was kept as low as possible to remain within the practicality of commonly used empirical models. The truncated cone model presented the lowest RMSE and a higher $R^2$ than the original and modified Schaeffer's models. The proposed methodology can be implemented in other existing models that estimate weight in animals of other sizes.

Data collection and experimental procedure
The animals were handled according to the regulations for ethical animal experimentation of the División Académica de Ciencias Agropecuarias of the Universidad Juárez Autónoma de Tabasco (ID project PFI: UJAT-DACA-2015-IA-02). The experiment was performed at the Southeastern Center for Ovine Integration (17°78′ N, 92°96′ W; 10 masl). The data were collected from 56 non-pregnant and non-lactating Pelibuey ewes aged from 4 to 10 months with a mean body weight of 30.31 ± 7.83 kg. The animals for the experiment were placed in raised slatted-floor cages with a feeding group in a feedlot system. The experimental diet was a total mixed ration (80:20 concentrate to forage ratio) comprising ground maize, soybean meal, star grass hay, and a vitamin and mineral premix, and had a crude protein level of 16% dry matter (DM) (AFRC 1993).

Modified formulas and the truncated cone model
Schaeffer's formula is used to estimate the body weight of cattle, and it has been shown that for cattle close to half a ton the estimates deviate by around 5% from the real weight. Using other techniques, deviations can reach up to 32%, depending on the type of cattle (Wangchuk et al. 2018). Previously, the study by Topai and Macit (2004) showed a significant correlation of 86.7% between heart girth (HG) and body weight (BW), whereas the correlation between BL and BW was 53.1%. Wangchuk et al. (2018) showed that Schaeffer's formula provides a good estimate of body weight, followed by the weighing tape. Therefore, it was decided to use this formula to calculate the weight of Pelibuey sheep; see Equation (1). Subsequently, a modified version of this formula was used that includes a fitted coefficient in the denominator; it is named Schaeffer's modified 1 and can be seen in Equation (2). A second modified model, with coefficients on the powers of the variables, is named Schaeffer's modified 2 and can be seen in Equation (3). Finally, the main model proposed in this research uses the AG, the HG and the BTL to approximate the volume with a truncated cone (Figure 2), using a coefficient that approximates the density and adding a fit coefficient to the first expression. The proposed model, named the truncated cone model, can be seen in Equation (4).
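
As a rough illustration of the four model structures described above, the following Python sketch may help. It is not the authors' implementation: the original form assumes the commonly cited version of Schaeffer's formula (girth squared times length divided by 300, with inches and pounds), the assignment of the powers `a1` and `a2` to HG and BL is an assumption, and the names `density` and `offset` in the truncated cone function are hypothetical labels for the fit coefficients.

```python
import math

def schaeffer_original(hg, bl, denominator=300.0):
    """Commonly cited Schaeffer form: HG^2 * BL / 300 (inches in, pounds out).
    The unit handling used in the study is not stated in the text."""
    return hg ** 2 * bl / denominator

def schaeffer_modified_1(hg, bl, a0):
    """Same structure, with the denominator a0 left as a fitted coefficient."""
    return hg ** 2 * bl / a0

def schaeffer_modified_2(hg, bl, a0, a1, a2):
    """Fitted denominator a0 and fitted powers; which power belongs to which
    variable is an assumption of this sketch (a1 on HG, a2 on BL)."""
    return hg ** a1 * bl ** a2 / a0

def truncated_cone_volume(ag, hg, btl):
    """Volume of a truncated cone with end circumferences AG and HG and
    height BTL: V = h * (C1^2 + C1*C2 + C2^2) / (12 * pi)."""
    return btl * (ag ** 2 + ag * hg + hg ** 2) / (12.0 * math.pi)

def truncated_cone_weight(ag, hg, btl, density, offset=0.0):
    """Hypothetical weight model: offset + density * volume."""
    return offset + density * truncated_cone_volume(ag, hg, btl)
```

With `a1 = 2` and `a2 = 1`, `schaeffer_modified_2` reduces to `schaeffer_modified_1`, which is one way to see the second model as a generalization of the first.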

Model fitting using the Gray Wolf Optimizer
The parameters $a_0$, $a_1$, and $a_2$ are unknown constants that allow the model to be adapted to the morphometric characteristics of Pelibuey sheep. Schaeffer's formula is the empirical formula used as the baseline against which the proposed model is compared. It is adapted in Equations (1)-(3) with the parameters $a_n$ to determine the performance of the modified formulas and compare them with the model proposed in Equation (4). The model of Equation (4), based on the volume of a truncated cone, also contains unknown fit parameters. Artificial intelligence, in the form of a metaheuristic algorithm, was used to find the values of these parameters and obtain the best possible model. Genetic Algorithms (GA) are typically used as the standard metaheuristic algorithm. However, GA have a high number of specific search parameters, such as mutation probability or biological pressure, and the algorithm's performance can depend on the correct selection of these parameters. On the other hand, the Gray Wolf Optimizer (GWO) is a metaheuristic algorithm tested in multiple fields, with the advantage of requiring only a few specific search parameters. This provides a simpler and faster parameter search compared to Genetic Algorithms.
The GWO is a bio-inspired algorithm based on how wolves hunt, starting from an alpha wolf (best solution) and the next best solutions (beta, gamma, and omega wolves). The complete algorithm is summarized in Figure 3. The search parameters required are the number of search agents (wolves), the number of iterations in the parameter search, the fitness function and the limits of the search space (hunting territory). These parameters are summarized in Table 2.
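
The search procedure described above can be sketched in a few lines of Python. This is a minimal, generic version of the Gray Wolf Optimizer, not the code used in the study: the function name `gwo`, the default values of `n_wolves` and `n_iter`, and the toy fitness function are all assumptions, and the settings of Table 2 are not reproduced here. The four arguments correspond to the fitness function, the search-space limits, the number of search agents and the number of iterations.

```python
import random

def gwo(fitness, bounds, n_wolves=20, n_iter=200, seed=0):
    """Minimise fitness(position) inside the box bounds = [(low, high), ...].

    Returns (best_position, best_score). Needs n_wolves >= 3.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    leaders = []  # (score, position) of the three best solutions found so far

    def update_leaders():
        scored = [(fitness(w), w[:]) for w in wolves]
        return sorted(leaders + scored, key=lambda pair: pair[0])[:3]

    for it in range(n_iter):
        leaders = update_leaders()
        a = 2.0 * (1.0 - it / n_iter)  # shrinks linearly from 2 towards 0
        for w in wolves:
            for d in range(dim):
                total = 0.0
                for _, lead in leaders:
                    A = 2.0 * a * rng.random() - a  # step size and direction
                    C = 2.0 * rng.random()          # random weight on the leader
                    total += lead[d] - A * abs(C * lead[d] - w[d])
                lo, hi = bounds[d]
                # Move to the average of the three leader-guided positions,
                # clipped to the search-space limits.
                w[d] = min(max(total / len(leaders), lo), hi)

    leaders = update_leaders()  # score the final pack as well
    score, position = leaders[0]
    return position, score

# Toy usage: a shifted quadratic whose minimum is at (3, -1).
best, score = gwo(lambda p: (p[0] - 3) ** 2 + (p[1] + 1) ** 2,
                  bounds=[(-10, 10), (-10, 10)])
```

To fit a weight model, the fitness function would typically return an error measure such as the RMSE between measured and predicted weights for a candidate coefficient vector, with `bounds` giving the allowed range of each coefficient.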

Results
In total, seven morphometric parameters were measured and used for weight estimation. To summarize the correlations between BW and the MM, a correlation matrix was computed using Pearson's correlation coefficient (PCC) and the coefficient of determination ($R^2$); it is depicted in Figure 4.
Linear regressions were also performed to analyse each variable individually against BW. These results are shown in Table 3.
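
A single-variable analysis of this kind can be sketched as follows. The function name `simple_linear_fit` is hypothetical, the sample values are placeholders rather than data from the study, and dividing the squared residuals by n (rather than n - 2) for the RMSE is an assumption, since the text does not say which convention was used.

```python
import math

def simple_linear_fit(x, y):
    """Least-squares line y = intercept + slope * x, with PCC, R^2 and RMSE."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    pcc = sxy / math.sqrt(sxx * syy)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    # For a one-predictor linear regression, R^2 equals the squared PCC.
    return {"slope": slope, "intercept": intercept,
            "pcc": pcc, "r2": pcc ** 2, "rmse": rmse}

# Hypothetical placeholder values (one girth measurement in cm, BW in kg).
girth_cm = [68.0, 72.0, 75.0, 80.0, 84.0]
bw_kg = [22.5, 26.0, 29.5, 34.0, 38.5]
fit = simple_linear_fit(girth_cm, bw_kg)
```

Running the same function once per morphometric variable would give one row of slope, intercept, PCC, $R^2$ and RMSE per variable.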
According to Table 3, the best correlation and the lowest RMSE were shown by ASC, whereas BL performed worst on the same indicators. The RMSE ranged from 3.23 to 4.90, while the $R^2$ ranged from 60.10% to 82.60%. The GSC also showed a high fit and correlation. In some investigations (Dohmen et al. 2022; Kamprasert et al. 2019; López-Carlos et al. 2010; Tebug et al. 2018) the BTL showed much higher correlations with BW. Although in these linear regressions it was not the variable with the highest correlation, it was still included in the truncated cone model in Equation (4), since that parameter is the one defined as the height of the cone.
The parameters obtained for each model by the GWO algorithm, run with the values shown in Table 2, are summarized in Table 4. The estimated coefficient for the Schaeffer's modified 1 model in Table 4 changed very little from the default value of 300 in the original Schaeffer's formula. This small change alone produced a significant increase in the fit for these small ruminants. In the Schaeffer's modified 2 model, the coefficients of the powers change considerably with respect to the quadratic powers of the original Schaeffer's formula, as does the estimate of the denominator. This would magnify the contribution of the GC measurement.
The parameters obtained by GWO were used and evaluated in the proposed models. In total, four statistical indicators were computed: the root mean square error (RMSE), the coefficient of determination ($R^2$), the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). The weight error indicators for each model, obtained with the estimated coefficients, are shown in Table 5. Note that the AIC and BIC are not usual indicators in this type of research (Dohmen et al. 2022). The BIC of the truncated cone model was slightly higher than that of the Schaeffer's modified 1 and 2 models because it penalizes the number of additional parameters in the model, but how viable a model is depends on how all the evaluation parameters turn out. Globally, the truncated cone model presented the best statistical indicators. Additionally, the comparison between real and estimated weight for each sheep using the truncated cone model from Equation (4) is shown in Figure 5. The truncated cone model estimated the same mean body weight as the real one, 30.3089, while the standard deviations of the estimated and real weights were 7.3874 and 7.8299, respectively. The model underestimates approximately 44.6% of the data and overestimates the other 55.4% (see Figure 5), which indicates a balanced model.
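
The four indicators can be computed from measured and predicted weights as in the sketch below. The AIC and BIC expressions are the usual least-squares (Gaussian error) forms; the text does not state which variant was used, so this choice is an assumption, as are the function name `fit_indicators` and the argument `n_params` (the number of fitted coefficients).

```python
import math

def fit_indicators(observed, predicted, n_params):
    """RMSE, R^2, AIC and BIC for one fitted model.

    Assumes the least-squares forms AIC = n*ln(RSS/n) + 2k and
    BIC = n*ln(RSS/n) + k*ln(n); RSS must be greater than zero.
    """
    n = len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    mean_obs = sum(observed) / n
    tss = sum((o - mean_obs) ** 2 for o in observed)
    log_term = n * math.log(rss / n)
    return {
        "rmse": math.sqrt(rss / n),
        "r2": 1.0 - rss / tss,
        "aic": log_term + 2 * n_params,
        "bic": log_term + n_params * math.log(n),
    }

# Hypothetical placeholder weights (kg), not data from the study.
measured = [24.0, 28.5, 31.0, 35.5, 40.0]
estimated = [25.1, 27.9, 31.8, 34.6, 39.2]
indicators = fit_indicators(measured, estimated, n_params=2)
```

Under these forms, each additional coefficient adds 2 to the AIC but ln(n) to the BIC, which for n = 56 is about 4.03, so the BIC penalizes extra parameters more heavily.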
Finally, the error of the proposed model and the standard deviation of the error are evaluated and depicted in Figure 6. It can be seen that 75% of the experimental data fall within one standard deviation of the error in the truncated cone model, and that the variability of the data is similar above and below the standard deviation lines, which may be because the model underestimates and overestimates the weight in a balanced way.
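
Summaries of this kind (the share of animals inside the standard deviation band, and the split between over- and underestimates) could be computed as in the following sketch. It assumes that the band is centred on the mean error and that the population standard deviation is used; neither detail is stated in the text, and the function names are hypothetical.

```python
import statistics

def share_within_sd(observed, predicted, k=1.0):
    """Fraction of errors lying within k standard deviations of the mean error."""
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_err = statistics.fmean(errors)
    sd_err = statistics.pstdev(errors)  # population SD (an assumption)
    inside = sum(1 for e in errors if abs(e - mean_err) <= k * sd_err)
    return inside / len(errors)

def over_under_shares(observed, predicted):
    """Fractions of animals whose weight is overestimated and underestimated."""
    n = len(observed)
    over = sum(1 for o, p in zip(observed, predicted) if p > o)
    under = sum(1 for o, p in zip(observed, predicted) if p < o)
    return over / n, under / n
```

For a sample of 56 animals, the shares of 44.6% and 55.4% quoted above are consistent with 25 underestimated and 31 overestimated weights.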

Discussion
The earliest models (empirical formulas, tables, tapes, etc.) are calibrated for very specific cattle. Adjusting this type of model to Pelibuey samples requires new approximations of the parameters, because applying these models directly to Pelibuey sheep produces considerable increases in the error. An attempt was made to adjust the formulas used in the literature, but the error was still high. The constant in the denominator of Schaeffer's formula is empirical, so it was estimated to see whether the $R^2$ of the model would increase, which produced the Schaeffer's Modified 1 model. Simply adjusting the constant of the denominator gave an increase of 21.33% in $R^2$ and a reduction of 31.14% in the variability of the estimates relative to the original Schaeffer's model. This means that Schaeffer's Modified 1, which is much simpler and faster to use, can serve as a good alternative for this type of animal.
The model Schaeffer's Modified 2 showed an increase of 1.05% in the coefficient of determination and a reduction of 3.40% in the variability of the estimates with respect to Schaeffer's Modified 1. The truncated cone model showed an increase of 2.05% in the coefficient of determination and a reduction of 7.25% in the variability of the estimates with respect to Schaeffer's Modified 2.
This means that the truncated cone model improves data prediction by 25.13% and reduces variability by 38.31% compared to the original Schaeffer's model. Sales et al. (2019) proposed equations to estimate BW from body measurements in Cornigliese sheep using multiple regression analysis, carried out for males, females and the whole group. Their best model for the whole group obtains an $R^2$ of 93% but requires seven different measurement variables, while the truncated cone model that we propose uses three measurement variables and obtains an $R^2$ of 89%.
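
The cumulative figures of 25.13% and 38.31% can be related to the stepwise changes reported for each model with a short calculation. The sketch assumes that each reported percentage is a relative change with respect to the previous model, so that the steps compound multiplicatively; the text does not state how the totals were derived.

```python
def compound(percent_changes):
    """Chain relative changes given in percent; return the total change in percent."""
    factor = 1.0
    for pct in percent_changes:
        factor *= 1.0 + pct / 100.0
    return (factor - 1.0) * 100.0

# Stepwise changes reported in the text: original -> modified 1 -> modified 2
# -> truncated cone. Reductions are written as negative values.
r2_steps = [21.33, 1.05, 2.05]
rmse_steps = [-31.14, -3.40, -7.25]

r2_total = compound(r2_steps)
rmse_total = compound(rmse_steps)
```

Under this assumption the chained values come out within a few hundredths of a percentage point of the quoted 25.13% and 38.31%, and chaining only the first two steps gives values close to the 22.61% and 33.48% reported for the modified Schaeffer's models.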
Machine learning techniques are progressively emerging in precision farming. Meckbach et al. (2021) used a deep neural network to calculate the weight of pigs by providing the model with standardized depth images and the measured weight of more than 400 pigs. Weight was recorded over time and ranged from 20 to 133 kg. Although the weight range is much higher than that evaluated in our work, the average RMSE was 3.75%, which is very close to the values shown in this paper. This suggests that, if the sample size of this work were increased, the coefficient of determination could approach the 97% obtained with neural networks.
Various machine learning algorithms were used in Ruchay et al. (2022) to estimate the live weight of Hereford cows, reaching a maximum $R^2$ of 0.713 with a much larger sample and an average weight of 521.68 kg. Although their $R^2$ is lower than that shown in our research, it should be noted that they used thirteen morphometric parameters, each with its own variability. It has been found that the correlation between the body weight of sheep at different ages increases as days increase, although there are some maternal or genetic components that maintain a high correlation with weight at early ages (Wolc et al. 2011). In our work this type of correlation with time is not taken into account, but it is intended to add it in future work and to consider the non-linearity of weight over time.
In another study (Samperio et al. 2021), a system was used to estimate the weight of lambs by capturing 3D images, and errors of less than 6% were obtained. This represents a variation of the error of approximately 1.37 kg. Considering that data from 272 lambs were used and that the methodology is more difficult to apply, the approximations were lower, only 86%.
In contrast, Figure 6 shows that 75% of the data fall within one standard deviation of the error, and 97% fall within two standard deviations.
Several regression models for different breeds of sheep are shown in Sant'ana et al. (2021), with $R^2$ values ranging from 0.49 using simple and multiple linear regression to 0.945 using cubic regression. The coefficients of determination closest to those shown in our research, between 0.81 and 0.92, were obtained in several works (Gurgel et al. 2021a; Novoselec et al. 2020; Worku 2019; Kumar et al. 2017; Weber et al. 2020). The fit of our model is well above that shown in Sant'ana et al. (2021), who had an $R^2$ of 0.687 and a mean absolute error of 3.099 kg using machine learning techniques. It has also been shown that even vision techniques and machine learning can yield relatively low evaluation parameters, with a coefficient of determination of 0.747 or an average of around 0.88 (Dohmen et al. 2022). It is worth mentioning that those studies did not use sheep but bovines and pigs. Many of the results shown are below the evaluation parameters in our investigation.
Tebug et al. (2018) presented a table of many different regression models to predict live weight for a group of animals and its subgroups. The RMSE ranged from 0.7685 to 0.9301, and the best model to predict all groups of animals had an $R^2$ of 0.86 and an RMSE of 32.81, which corresponds to a mean of 11.02%. The model uses few predictor variables, as in our investigation. The algorithm presented in our investigation could be used to fit the coefficients of Tebug et al. (2018) and compare the results.
Although measurements from image processing of body shots and weight were well correlated in Lina Zhang et al. (2018), no model exceeded an $R^2$ of 80%. In this case, the partial least squares regression had the best correlation and lowest error, 0.7271 and 7.11 kg, respectively, and the support vector machine (SVM) model had an $R^2$ of 0.7938 and the lowest standard deviation, 5.49 kg.
In another work (Weber et al. 2020), computer vision techniques and regression algorithms were used to estimate weight from images of the dorsal area of Nellore cattle. The best correlation coefficient in that research was 0.75 using the bagging algorithm, and it is presumed that it could improve if height and volume were included in the experiment. Among the algorithms used, the RMSE ranged from 15.88 to 25.61 kg, although the weights analysed were around 500 kg, while in our investigation the average body weight was 30.31 kg. Computer vision was also used in Cominotte et al. (2020), where coefficients of determination ranged from 0.63 to 0.92 at different growth phases in a herd of Nellore steers, with an artificial neural network (ANN) giving the best prediction. This means that the proposed model, despite not using artificial vision or measurements extracted from processed images, produces a relatively high coefficient of determination.
Some methods for estimating the weight of cattle through artificial vision have proven to be more accurate than conventional methods (Tscharke and Banhazi 2013). However, this has led researchers to stop proposing simpler and more practical models that require a smaller sample and almost zero equipment cost, and that can be used where deep mathematical relationships with other parameters such as growth and feeding are not essential.
It is important to understand that calibrated bands used to estimate weight can overestimate or underestimate it, because the bands are calibrated for breeds from certain countries. This can cause the doses of medicine or drugs supplied to the animals to vary considerably; even the estimates made by farmers can deviate by 20% from the real weight (Machila et al. 2008).

Conclusions
In this study, a model was developed to estimate the weight of Pelibuey sheep using morphometric measurements and the GWO algorithm. The model consisted of estimating the volume of the specimen through a truncated cone and leaving the density as an estimation parameter of the algorithm. Schaeffer's empirical formula was used as a reference, and two optimized versions of it were calculated for these sheep.
Schaeffer's modified 1 and Schaeffer's modified 2 improved significantly on the original formula; however, the model that gave the best predictions was the truncated cone model.
The metaheuristic algorithm was always adaptable and adjustable, regardless of the weight estimation model used. The modified Schaeffer models showed a performance very similar to that of the truncated cone, although the latter model has an additional morphometric parameter (AC).
The model is expected to improve its efficiency with a larger sample of Pelibuey sheep. It also remains to test the truncated cone model in a sample of another type of livestock, preferably with different proportions.
The capture of morphometric measurements of animals could be improved using artificial vision techniques, for which there is a broader background for discussion. The presented prediction system could also be transferred in the future to a low-cost embedded system for use on farms, and the methodology could be applied to other existing models that estimate weight in animals of other sizes. Farmers and animal health workers could likewise benefit from livestock live weight training and information to improve health care in low-resource rural areas. We believe that the presented models can motivate other researchers to implement metaheuristic algorithms in the solution of precision livestock problems.

Nomenclature
The following abbreviations are used in this manuscript: