Analysis of uncertainty measure using unified hybrid censored data with applications

Entropy is a measure of the uncertainty of a random variable that reflects the expected amount of information. In this paper, estimation of the Shannon entropy of the Lomax distribution under unified hybrid censored data is considered. Both maximum likelihood and Bayesian estimation procedures are developed. The Bayesian estimators under the balanced squared error, balanced linear exponential, and balanced general entropy loss functions are derived. A Monte Carlo simulation study is designed to compare the accuracy of the different estimators with respect to specific measures. A real data analysis is carried out to illustrate the proposed estimators. In summary, the study found that the mean squared error values decrease as the sample size increases, and that in the majority of cases the Bayesian estimates under the balanced linear exponential loss function are more appropriate, in terms of the simulation outcomes, than those under the other loss functions.


Introduction
Due to funding and time constraints, in the majority of life-testing experiments it is preferable to stop the test before all of the items fail. Observations arising from such a situation are called censored samples, and there are numerous censoring strategies. The most common types of censoring are type I (T-I) and type II (T-II). In T-I censoring, the test is terminated at a predetermined censoring time; in T-II censoring, the test is terminated at a fixed number of failures. The hybrid censoring scheme (HCS) is a combination of the T-I and T-II censoring methods and can be characterized as follows.
In a life-testing setting, assume there are n identical items with independent and identically distributed lifetimes. Let X 1:n , X 2:n , . . . , X n:n be the ordered failure times of these items. The test is terminated when a prefixed number of items, 1 ≤ r ≤ n, out of n fail, or when a prefixed time T ∈ (0, ∞) is reached. Epstein [1] was the first to propose this hybrid censoring method, and it has two types: T-I HCS and T-II HCS.
The life-testing experiment under T-I HCS terminates at the random time T * 1 = min(x r:n , T). The disadvantage of the T-I HCS is that very few failures may occur up to the termination time T * 1 . To address this flaw, Childs et al. [2] suggested a new HCS known as the T-II HCS, which guarantees a fixed number of failures and has termination time T * 2 = max(x r:n , T). Although the T-II HCS guarantees a specified number of failures, observing them and completing the life test can take a long time, which is a disadvantage. These techniques were modified by Chandrasekar et al. [3], who introduced the generalized T-I HCS (GT-I HCS) and the generalized T-II HCS (GT-II HCS).
In the GT-I HCS, one fixes k, r ∈ {1, 2, . . . , n} with k < r and a time T ∈ (0, ∞). If the kth failure occurs before time T, then T * = min(x r:n , T); if the kth failure occurs after time T, then T * = x k:n . In the GT-II HCS, r ∈ {1, 2, . . . , n} and T 1 , T 2 ∈ (0, ∞) are chosen so that T 1 < T 2 . If the rth failure occurs before time T 1 , then T * = T 1 ; if the rth failure occurs between T 1 and T 2 , then T * = x r:n ; otherwise, T * = T 2 . Although these two censored sampling schemes outperform the previous ones, they still have flaws. Because there is only one pre-assigned time T in the GT-I HCS, we cannot guarantee r failures. The GT-II HCS may observe only a few failures up to the pre-determined time T 2 , and so it suffers from the same problem as the T-I HCS. To overcome the shortcomings of both schemes, Balakrishnan et al. [4] proposed a scheme that combines the GT-I HCS and GT-II HCS, referred to as the unified hybrid censoring scheme (UHCS).
Fix integers r, k ∈ {1, 2, . . . , n} and time points T 1 , T 2 ∈ (0, ∞) in this scheme. If the kth failure occurs before time T 1 , then T * = min(max(x r:n , T 1 ), T 2 ); if the kth failure occurs between T 1 and T 2 , then T * = min(x r:n , T 2 ); and if the kth failure occurs after T 2 , then T * = x k:n . Under this scheme, the test terminates by time T 2 if at least k failures have occurred by then; otherwise, it terminates at the kth failure time. See [5] for more details on how the difficulties of the previous schemes are resolved in this manner. As a result, we have the six cases indicated in Figure 1 under the UHCS:
Case 1: 0 < x k:n < x r:n < T 1 < T 2 ; the test ends at T 1 .
Case 2: 0 < x k:n < T 1 < x r:n < T 2 ; the test ends at x r:n .
Case 3: 0 < x k:n < T 1 < T 2 < x r:n ; the test ends at T 2 .
Case 4: 0 < T 1 < x k:n < x r:n < T 2 ; the test ends at x r:n .
Case 5: 0 < T 1 < x k:n < T 2 < x r:n ; the test ends at T 2 .
Case 6: 0 < T 1 < T 2 < x k:n < x r:n ; the test ends at x k:n .
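As a concrete illustration, the termination rule above can be sketched in a few lines of code; the helper below is our own, not part of the original paper's notation:

```python
def uhcs_termination(x, k, r, T1, T2):
    """Termination time T* of the unified hybrid censoring scheme.

    x      : ordered failure times x[0] <= ... <= x[n-1]
    k, r   : prefixed failure counts, 1 <= k < r <= n
    T1, T2 : prefixed time points, 0 < T1 < T2
    """
    xk, xr = x[k - 1], x[r - 1]
    if xk < T1:          # kth failure before T1 (covers Cases 1-3)
        return min(max(xr, T1), T2)
    elif xk < T2:        # kth failure between T1 and T2 (Cases 4-5)
        return min(xr, T2)
    else:                # kth failure after T2 (Case 6)
        return xk
```

With this rule, the six cases of Figure 1 fall out directly; for example, ordered failures (0.1, 0.2, 0.3, 0.4) with k = 1, r = 2, T1 = 0.5, T2 = 1.0 give Case 1 and termination at T1 = 0.5.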
Entropy is a key concept in statistics and information theory; it was first proposed in physics in connection with the second law of thermodynamics. Shannon [6] redefined it and introduced the idea of entropy into information theory to quantify the uncertainty of information. Cover and Thomas [7] built on Shannon's notion, defining the differential entropy (or continuous entropy) of a continuous random variable X with probability density function f(x) as
H(X) = −∫ f(x) ln f(x) dx. (1)
Entropy measurement is a crucial concept in a variety of fields; more entropy indicates that there is less information in the sample. Shannon entropy has many applications: see [8,9] in physics, and applications in other fields can be found in [10][11][12][13].
Entropy under hybrid censored data was studied by Morabbi and Razmkhah [14]. Abo-Eleneen [15] looked at the entropy and the best scheme under progressive T-II censoring. Cho et al. [16] used doubly GT-II HCS to estimate the entropy of the Rayleigh distribution. Liu and Gui [17] investigated entropy estimation for the Lomax distribution using maximum likelihood (ML) and Bayesian methods under generalized progressive hybrid censoring. Under multiple censored data, Hassan and Zaky [18] derived the ML estimator of the Shannon entropy of the inverse Weibull distribution. The Shannon entropy of the inverse Weibull distribution under progressive first-failure censoring was investigated by Yu et al. [19]. Bantan et al. [20] investigated entropy estimation of the inverse Lomax distribution via multiple censored data.
The study of inference problems associated with entropy measures has recently gained prominence. As previously stated, the UHCS may be able to overcome the restrictions of the GT-I HCS and GT-II HCS. The Lomax distribution, on the other hand, has sparked scholarly interest in the last decade due to its potential uses in a range of domains, including lifetime data prediction. We could not find any study on the Shannon entropy estimation problem using the UHCS in this regard. Consequently, the main purpose of this study is to derive ML and Bayesian estimators of the Shannon entropy of the Lomax distribution under the UHCS. The Bayesian estimator is obtained under three loss functions: the balanced squared error loss (BSEL) function, the balanced linear-exponential (BLINEX) loss function, and the balanced general entropy (BGE) loss function. The Markov chain Monte Carlo (MCMC) approach is employed to compute the Bayesian entropy estimators under the balanced loss functions. A numerical evaluation is used to assess the behaviour of the proposed estimates under the UHCS. Real data are analysed for further evidence, allowing the theoretical results to be confirmed.
This paper is organized as follows: the Shannon entropy of the Lomax distribution is derived in Section 2. The ML estimator of the Shannon entropy is presented in Section 3. Section 4 derives the Bayesian estimator of the entropy under various loss functions and covers the MCMC method for computing the Bayesian estimators. A simulation study and an application to real data are given in Sections 5 and 6, respectively. Section 7 concludes the article with some final remarks.

Lomax distribution
The Lomax (Lo) distribution is one of the important lifetime models. It has been useful in reliability and life-testing problems (see Hassan and Al-Ghamdi [21]). The Lo distribution was first introduced by Lomax [22] and is useful in a variety of domains, including actuarial science and economics. The Lo distribution has been applied to income and wealth data; see [23,24]. More details about applications of the Lo distribution can be found in [25,26].
A random variable X is said to have the Lo distribution with shape parameter α and scale parameter φ if its probability density function (pdf) is given by
f(x) = (α/φ)(1 + x/φ)^(−(α+1)), x > 0, α, φ > 0. (2)
The cumulative distribution function (cdf) of X is given by
F(x) = 1 − (1 + x/φ)^(−α), x > 0. (3)
Various studies about the Lo distribution can be found in the literature. Some recurrence relations between the moments of record values from the Lo distribution were studied by Balakrishnan and Ahsanullah [27]. On the basis of T-II censored data, Okasha [28] studied E-Bayesian estimation for the Lo distribution. Singh et al. [29] explored Bayesian estimation of the Lo distribution under T-II HCS using Lindley's approximation. Statistical inference of the Lo distribution using some information measures can be found in [30][31][32].
Substituting (2) into (1) gives the Shannon entropy of the Lo distribution:
H(x) = −∫ 0 ∞ f(x) ln f(x) dx = −∫ 0 ∞ f(x)[ln(α/φ) − (α + 1) ln(1 + x/φ)] dx.
Hence,
H(x) = −ln(α/φ) I 1 + (α + 1) I 2 , (4)
where I 1 = ∫ 0 ∞ f(x) dx and I 2 = ∫ 0 ∞ f(x) ln(1 + x/φ) dx. Since f(x) is a pdf,
I 1 = 1. (5)
To obtain I 2 , let u = 1 + x/φ, so that du = dx/φ and I 2 = ∫ 1 ∞ α u^(−(α+1)) ln u du; then, using integration by parts,
I 2 = 1/α. (6)
As a result, the Shannon entropy of the Lo distribution looks like this:
H(x) = ln(φ/α) + (α + 1)/α = ln φ − ln α + 1 + 1/α. (7)
As a function of the parameters α and φ, this is the required formulation of the Shannon entropy of the Lo distribution.
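The closed-form entropy of the Lo distribution, H(x) = ln φ − ln α + 1 + 1/α, is easy to check numerically. The sketch below (plain Python, with function names of our own choosing) compares it against a midpoint-rule evaluation of H = E[−ln f(X)] written as an integral over the quantile function:

```python
import math

def lomax_pdf(x, alpha, phi):
    # pdf (2): f(x) = (alpha/phi) * (1 + x/phi)^-(alpha+1)
    return (alpha / phi) * (1.0 + x / phi) ** (-(alpha + 1.0))

def lomax_quantile(u, alpha, phi):
    # Inverse of cdf (3): F^-1(u) = phi * ((1 - u)^(-1/alpha) - 1)
    return phi * ((1.0 - u) ** (-1.0 / alpha) - 1.0)

def entropy_closed_form(alpha, phi):
    # H(x) = ln(phi) - ln(alpha) + 1 + 1/alpha
    return math.log(phi) - math.log(alpha) + 1.0 + 1.0 / alpha

def entropy_numeric(alpha, phi, n=200_000):
    # H = E[-ln f(X)] = integral over u in (0,1) of -ln f(F^-1(u)), midpoint rule
    total = 0.0
    for i in range(n):
        u = (i + 0.5) / n
        total += -math.log(lomax_pdf(lomax_quantile(u, alpha, phi), alpha, phi))
    return total / n

print(entropy_closed_form(2.0, 1.5))  # ≈ 1.2123
print(entropy_numeric(2.0, 1.5))
```

The two values agree to several decimal places, which is a quick sanity check on the derivation of (7).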

Maximum likelihood estimation
Under the UHCS, the ML estimation of the Lo distribution is discussed in this section. Assume there are n identical items in a life-testing experiment. Let X 1:n , X 2:n , . . . , X n:n represent the ordered failure times of these items, with fixed integers r, k ∈ {1, 2, . . . , n}, where k < r < n, and time points T 1 < T 2 . Then the likelihood function of α and φ for the six cases of the UHCS is given by
L(x | α, φ) = (n!/(n − D)!) [∏ i=1 D f(x i:n )] [1 − F(c)]^(n−D), (8)
where D is the total number of failures up to the termination time c of the experiment; c equals T 1 , x r:n , T 2 , x r:n , T 2 and x k:n in Cases 1–6, respectively, and D 1 and D 2 denote the numbers of failures prior to T 1 and T 2 , respectively. Substituting (2) and (3) in (8), we obtain
L(x | α, φ) = E (α/φ)^D [∏ i=1 D (1 + x i:n /φ)^(−(α+1))] (1 + c/φ)^(−(n−D)α), (10)
where E = n!/(n − D)!. Taking the logarithm of both sides, we get
l(x | α, φ) = ln E + D ln α − D ln φ − (α + 1) Σ i=1 D ln(1 + x i:n /φ) − (n − D) α ln(1 + c/φ), (11)
where l(x | α, φ) is the logarithm of the likelihood function. Taking derivatives of (11) with respect to α and φ, we get
∂l/∂α = D/α − Σ i=1 D ln(1 + x i:n /φ) − (n − D) ln(1 + c/φ), (12)
and
∂l/∂φ = −D/φ + (α + 1) Σ i=1 D x i:n /(φ(φ + x i:n )) + (n − D) α c/(φ(φ + c)). (13)
To find the ML estimators of α and φ, set Equations (12) and (13) equal to zero and solve them. Therefore, Equation (12) can be written as
α̂ = D / [Σ i=1 D ln(1 + x i:n /φ) + (n − D) ln(1 + c/φ)], (14)
and setting Equation (13) to zero gives
D = (α + 1) Σ i=1 D x i:n /(φ + x i:n ) + (n − D) α c/(φ + c). (15)
Substituting (14) into (15) yields an equation in φ alone. Therefore, we can use an iterative procedure to compute the ML estimator φ̂ and then substitute it into (14) to find the ML estimator α̂. By the invariance property, the ML estimator of H(x), denoted by Ĥ(x), is
Ĥ(x) = ln φ̂ − ln α̂ + 1 + 1/α̂.
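Rather than iterating on the score equations directly, the log-likelihood (11) can be profiled: substituting (14) for α leaves a one-dimensional function of φ that can be maximized by a simple search. A minimal Python sketch, with helper names of our own invention (a golden-section search stands in for whatever iterative procedure the paper uses):

```python
import math

def profile_loglik(phi, xs, c, n):
    # Profile log-likelihood of phi after substituting alpha_hat(phi) = D/S(phi),
    # S(phi) = sum ln(1 + x_i/phi) + (n - D) ln(1 + c/phi); constant ln E dropped.
    D = len(xs)
    s_obs = sum(math.log(1.0 + xi / phi) for xi in xs)
    S = s_obs + (n - D) * math.log(1.0 + c / phi)
    return D * math.log(D / S) - D * math.log(phi) - D - s_obs

def mle_lomax_uhcs(xs, c, n, lo=1e-3, hi=1e3, iters=200):
    # Golden-section search on phi, then alpha from (14) and H(x) by invariance.
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iters):
        p, q = b - g * (b - a), a + g * (b - a)
        if profile_loglik(p, xs, c, n) < profile_loglik(q, xs, c, n):
            a = p
        else:
            b = q
    phi = 0.5 * (a + b)
    D = len(xs)
    S = sum(math.log(1.0 + xi / phi) for xi in xs) + (n - D) * math.log(1.0 + c / phi)
    alpha = D / S
    H = math.log(phi) - math.log(alpha) + 1.0 + 1.0 / alpha
    return alpha, phi, H
```

Here xs holds the D observed ordered failures and c is the termination time of the relevant UHCS case; for a complete sample, set c to the last failure so the censored term vanishes. The search assumes the profile likelihood is unimodal on the bracket, which is a practical assumption rather than a guarantee.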

Bayesian estimation
Here, we derive Bayesian estimators of the unknown parameters α, φ, and H(x) based on the BSEL, BLINEX, and BGE loss functions. The gamma distribution is a natural prior for this class of distributions, so we assume that α and φ are independently distributed with gamma(a 1 , b 1 ) and gamma(a 2 , b 2 ) priors, respectively. Then the priors of α and φ are
π 1 (α) ∝ α^(a 1 −1) e^(−b 1 α), α > 0, and π 2 (φ) ∝ φ^(a 2 −1) e^(−b 2 φ), φ > 0,
where the hyperparameters a 1 , a 2 , b 1 , b 2 > 0 are assumed constant and known. The joint prior density of α and φ is given by
π(α, φ) ∝ α^(a 1 −1) φ^(a 2 −1) e^(−b 1 α − b 2 φ). (18)
The posterior distribution is given by
π*(α, φ | x) = π(α, φ) L(x | α, φ) / ∫ 0 ∞ ∫ 0 ∞ π(α, φ) L(x | α, φ) dα dφ, (19)
where π(α, φ) is the joint prior distribution of the parameters α and φ, L(x | α, φ) is the likelihood function, and π*(α, φ | x) is the posterior distribution of the parameters.
From (10) and (18) we obtain the joint posterior density function as follows:
π*(α, φ | x) = E 1 ^(−1) α^(D+a 1 −1) φ^(a 2 −D−1) e^(−b 1 α − b 2 φ) [∏ i=1 D (1 + x i:n /φ)^(−(α+1))] (1 + c/φ)^(−(n−D)α), (20)
where E 1 is the normalizing constant given by
E 1 = ∫ 0 ∞ ∫ 0 ∞ α^(D+a 1 −1) φ^(a 2 −D−1) e^(−b 1 α − b 2 φ) [∏ i=1 D (1 + x i:n /φ)^(−(α+1))] (1 + c/φ)^(−(n−D)α) dα dφ. (21)
Collecting the terms in α, the joint posterior can be rewritten as
π*(α, φ | x) ∝ α^(D+a 1 −1) φ^(a 2 −D−1) e^(−b 2 φ) exp{−α [b 1 + Σ i=1 D ln(1 + x i:n /φ) + (n − D) ln(1 + c/φ)] − Σ i=1 D ln(1 + x i:n /φ)}. (22)
The marginal posterior distributions of α and φ are obtained by integrating (20) over the other parameter:
π*(α | x) = ∫ 0 ∞ π*(α, φ | x) dφ, (23)
and
π*(φ | x) = ∫ 0 ∞ π*(α, φ | x) dα. (24)
It is obvious from (22) that the posterior density function of α given φ is proportional to
α^(D+a 1 −1) exp{−α [b 1 + Σ i=1 D ln(1 + x i:n /φ) + (n − D) ln(1 + c/φ)]}. (25)
As a result, the posterior density function of α given φ is a gamma distribution with shape parameter (D + a 1 ) and scale parameter 1/[b 1 + Σ i=1 D ln(1 + x i:n /φ) + (n − D) ln(1 + c/φ)]. Consequently, samples of α can easily be generated using any gamma generating routine.
The posterior density function of φ given α can be written as
π*(φ | α, x) ∝ φ^(a 2 −D−1) e^(−b 2 φ) [∏ i=1 D (1 + x i:n /φ)^(−(α+1))] (1 + c/φ)^(−(n−D)α). (26)
Because this density cannot be reduced analytically to a well-known distribution, standard sampling methods cannot be used to sample from it directly. We therefore use the MCMC method to obtain the estimators under the following balanced loss functions.

Loss functions
Loss functions are classified according to symmetry into two main types: symmetric loss functions, such as the squared error loss function, and asymmetric loss functions, such as the entropy loss function. Both of these types are unbalanced loss functions.
Balanced loss functions are appealing because they combine the proximity of a given estimator δ to both a target estimator δ 0 and the unknown parameter θ being estimated (see [33]). According to Zellner's formulation, the balanced loss function is defined as
L ρ,ω,δ 0 (θ, δ) = ω ρ(δ 0 , δ) + (1 − ω) ρ(θ, δ), (27)
where 0 ≤ ω ≤ 1 is a weight coefficient, ρ(θ, δ) is an arbitrary unbalanced loss function, and δ 0 is a chosen prior (target) estimator of θ. When ω = 0, the Bayesian estimators under the balanced and unbalanced loss functions coincide.
In the BSEL function, ρ(θ, δ) = (δ − θ)^2, so (27) takes the form
L ω,δ 0 (θ, δ) = ω (δ − δ 0 )^2 + (1 − ω)(δ − θ)^2,
and the Bayesian estimator of H(x) in this case is given by
Ĥ BSEL (x) = ω δ 0 + (1 − ω) E[H(x) | x],
where E[H(x) | x] = ∫ 0 ∞ ∫ 0 ∞ H(x) π*(α, φ | x) dα dφ and E 1 is the normalizing constant given in (21).
If we choose ρ(θ, δ) = exp{q(δ − θ)} − q(δ − θ) − 1, we get the BLINEX loss function, where q ≠ 0 represents the shape parameter of the loss function; the behaviour of the BLINEX and BGE loss functions changes with the choice of q. The Bayesian estimator of H(x) in this case is
Ĥ BLINEX (x) = −(1/q) ln{ω e^(−q δ 0 ) + (1 − ω) E[e^(−q H(x)) | x]}.
If we choose ρ(θ, δ) = (δ/θ)^q − q ln(δ/θ) − 1, we get the BGE loss function, where q ≠ 0, and the Bayesian estimator of H(x) in this case is
Ĥ BGE (x) = {ω δ 0 ^(−q) + (1 − ω) E[H(x)^(−q) | x]}^(−1/q).
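Given MCMC draws of H(x), the three balanced-loss estimators reduce to simple averages over the posterior sample. A hedged sketch, with function names of our own and formulas following the standard balanced-loss forms:

```python
import math

def bayes_bsel(samples, delta0, w):
    # BSEL: weighted mix of the target estimator and the posterior mean
    post_mean = sum(samples) / len(samples)
    return w * delta0 + (1.0 - w) * post_mean

def bayes_blinex(samples, delta0, w, q):
    # BLINEX: -(1/q) ln( w e^{-q delta0} + (1-w) E[e^{-q H}] ), q != 0
    m = sum(math.exp(-q * h) for h in samples) / len(samples)
    return -(1.0 / q) * math.log(w * math.exp(-q * delta0) + (1.0 - w) * m)

def bayes_bge(samples, delta0, w, q):
    # BGE: [ w delta0^{-q} + (1-w) E[H^{-q}] ]^{-1/q}, q != 0, requires H > 0
    m = sum(h ** (-q) for h in samples) / len(samples)
    return (w * delta0 ** (-q) + (1.0 - w) * m) ** (-1.0 / q)
```

Here samples would be the retained MCMC draws of H(x), delta0 the target estimator (e.g. the MLE), and w the weight ω. As q approaches 0, both asymmetric estimators approach the BSEL one.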

MCMC approach
The MCMC method is used to generate samples from the posterior distributions and then compute the Bayesian Shannon entropy estimators under the balanced loss functions. There is a wide range of MCMC schemes; Gibbs sampling and the more general Metropolis-within-Gibbs samplers form a significant subclass of MCMC methods.
To draw samples from the posterior density functions and then compute the Bayesian estimators, we propose the following MCMC technique.
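One way to realize such a sampler (a sketch under our own naming, assuming a gamma full conditional for α as in (25) and a random-walk Metropolis step for the non-standard conditional (26) of φ):

```python
import math, random

def log_cond_phi(phi, alpha, xs, c, n, a2, b2):
    # Log of the full conditional of phi given alpha, eq. (26), up to a constant
    if phi <= 0:
        return -math.inf
    D = len(xs)
    val = (a2 - 1.0 - D) * math.log(phi) - b2 * phi
    val -= (alpha + 1.0) * sum(math.log(1.0 + xi / phi) for xi in xs)
    val -= (n - D) * alpha * math.log(1.0 + c / phi)
    return val

def gibbs_sampler(xs, c, n, a1, b1, a2, b2, n_iter=11000, burn=1000, step=0.2):
    D = len(xs)
    phi, alpha = 1.0, 1.0
    entropies = []
    for it in range(n_iter):
        # alpha | phi ~ gamma(D + a1, rate = b1 + S(phi)): direct draw, eq. (25)
        S = sum(math.log(1.0 + xi / phi) for xi in xs) \
            + (n - D) * math.log(1.0 + c / phi)
        alpha = random.gammavariate(D + a1, 1.0 / (b1 + S))
        # phi | alpha: random-walk Metropolis step on eq. (26)
        prop = phi + random.gauss(0.0, step)
        if math.log(random.random()) < (log_cond_phi(prop, alpha, xs, c, n, a2, b2)
                                        - log_cond_phi(phi, alpha, xs, c, n, a2, b2)):
            phi = prop
        if it >= burn:
            # Record the entropy draw via H(x) = ln(phi) - ln(alpha) + 1 + 1/alpha
            entropies.append(math.log(phi) - math.log(alpha) + 1.0 + 1.0 / alpha)
    return entropies
```

The retained draws can then be fed into the balanced-loss estimators of Section 4; the proposal step size and the chain length (11,000 iterations with 1000 burn-in, as in the paper's applications) are tuning choices.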

Simulation study
In this section, we present some simulation results mainly to compare the performances of the ML estimate (MLE) and Bayesian estimate (BE) of the Shannon entropy for Lo distribution under different losses, in terms of mean squared errors (MSEs).
(1) For given hyperparameters a 1 , b 1 , a 2 and b 2 , generate random values of α and φ from Equations (25) and (26).
(2) For given values of n and D, with the values of α and φ obtained in Step 1, generate random samples from the inverse cdf of the Lo distribution and then order them.
(3) The MLE of H(x) is calculated for given values of r, k, T 1 and T 2 under the six cases of the UHCS, as explained in Section 3.
(4) The BE of H(x) based on the BSEL, BLINEX, and BGE loss functions is obtained using the MCMC method for the same values of r, k, T 1 and T 2 , as discussed in Section 4.
(5) If θ̂ j is the estimate of θ based on sample j, where j = 1, 2, . . . , N, then the MSE over N samples is given by
MSE = (1/N) Σ j=1 N (θ̂ j − θ)^2.
The MSEs for all cases are calculated as shown in Tables 1 and 2.
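Steps (2) and (5) above are the only generic pieces of this recipe; they can be sketched as follows (function names are ours):

```python
import random

def lomax_sample(n, alpha, phi):
    # Step (2): inverse-cdf sampling, X = phi * ((1 - U)^(-1/alpha) - 1), then sort
    return sorted(phi * ((1.0 - random.random()) ** (-1.0 / alpha) - 1.0)
                  for _ in range(n))

def mse(estimates, true_value):
    # Step (5): mean squared error of an estimator over N simulated replications
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)
```

In the full simulation, lomax_sample feeds the UHCS censoring rule of Section 1 for each of the six cases, and mse is applied to the resulting ML and Bayes entropy estimates.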

Simulated results:
Here are some observations on the performance of the Shannon entropy estimates, as shown in Tables 1 and 2.
• The MSEs of the MLE and BE of the Shannon entropy decrease as n increases (Figure 2).
• The Bayes estimate Ĥ BLINEX (x) at q = −0.5 is superior to the other Bayes estimates for different values of n in terms of yielding more information, as seen in Figure 3. However, compared to the other Bayes estimates, the Bayes estimate Ĥ BGE (x) at q = 0.5 is the worst, yielding the least information.

Real life data
To demonstrate the proposed methodologies, we consider applications to two real data sets in this section. The first data set was provided by Nelson [35]; Abd Ellah [36] used the Kolmogorov-Smirnov test to assess the goodness of fit of the model, with a P-value of 0.749683 and a statistic value of 0.147477. The estimated pdf and cdf are shown in Figure 8. Now we examine what happens if the data are censored. From the uncensored data set, we generate six artificial UHCS samples, one for each case. In each case, we computed the ML and Bayesian estimates of the Shannon entropy. We used 11,000 MCMC samples and discarded the first 1000 values as burn-in, under the balanced loss functions (BSEL, BLINEX, BGE) with ω = 0.5. Since we have no prior information, we use a non-informative prior to compute the Bayes estimators, choosing a 1 = 0, b 1 = 0, a 2 = 0 and b 2 = 0, as shown in Table 3.
The second data set was obtained from a meteorological study by Simpson [37]. The data represent radar-evaluated rainfall from 52 south Florida cumulus clouds. The Lo distribution was successfully fitted to this data set using the Kolmogorov-Smirnov test, with a statistic value of 0.0825474 and a P-value of 0.841763. The estimated pdf and cdf are shown in Figure 9. Now we examine what happens if the data are censored. From the uncensored data set, we generate six artificial UHCS samples, one for each case. In each case, we computed the ML and Bayesian estimates of the Shannon entropy. We used 11,000 MCMC samples and discarded the first 1000 values as burn-in, under the balanced loss functions (BSEL, BLINEX, BGE) with ω = 0.5. Since we have no prior information, we use a non-informative prior to compute the Bayes estimators, choosing a 1 = 0, b 1 = 0, a 2 = 0 and b 2 = 0, as shown in Table 4. Figures 10 and 11 show the trace plots and histograms of the first 1000 MCMC outputs from the posterior distribution of H(x) for Case 1 under the UHCS for the first and second real data sets, respectively.
We note from these applications in Tables 3 and 4 that the amount of information obtained by Bayesian estimation is larger than that of the MLE, because the uncertainty in the Bayesian case is smaller than the uncertainty in the MLE case.
The BE based on the BLINEX and BGE loss functions at q = −0.5 yields a greater amount of information, whereas at q = 0.5 it yields less information because of its greater amount of uncertainty; this agrees with the result of the simulation study.

Conclusion and summary
In this paper, the maximum likelihood and Bayesian estimation of the Shannon entropy for the Lo distribution under the UHCS are considered. The Bayesian estimate of the Shannon entropy for the Lo distribution is obtained based on the balanced loss functions (BSEL, BLINEX, BGE). The MCMC method is employed to calculate the Bayesian estimators of the Shannon entropy under BSEL, BLINEX and BGE, which are compared in terms of their mean squared errors. An application to real data is provided.
Regarding the simulation results, we conclude that the mean squared errors of the ML and Bayesian estimates of the Shannon entropy decrease as the sample size increases. The entropy estimates via the BLINEX and BGE loss functions at q = −0.5 are preferred over the others, allowing for more information. The MLE and BE of the Shannon entropy are preferred when T 1 = 1.8, providing more information for fixed values of r, k, and T 2 . The BE carries a greater amount of information than the MLE, since the uncertainty in the BE is lower. In future work, one might study the problem of estimating other measures of uncertainty using the E-Bayesian approach under the UHCS. In this study, we used the MCMC method to calculate the Bayesian estimator of the Shannon entropy under the BSEL, BLINEX, and BGE loss functions; other approaches, such as the Tierney-Kadane and Lindley approximations, could be used alongside the MCMC approach under these or other types of loss functions.