Estimation of Population Mean by Using a Generalized Family of Estimators Under Classical Ranked Set Sampling

ABSTRACT Estimation of the population mean of a study variable Y loses precision when the data set is highly variable. Incorporating auxiliary information into the construction of an estimator under the ranked set sampling (RSS) scheme yields more efficient estimation of the population mean. In this paper, we propose an efficient generalized family of estimators for the finite population mean of the study variable under ranked set sampling, using information on an auxiliary variable. The bias and mean square error (MSE) of the proposed generalized family of estimators are derived. The conditions under which the proposed family is more efficient than its competitor estimators are also derived. The estimators are applied to a simulation study and to real-life data sets to compare their efficiency. We conclude that, as the correlation between the study and auxiliary variables increases, the proposed generalized family of estimators proves more efficient than its competitors for estimating the population mean of the study variable.


Introduction
In many situations of practical interest, mainly in environmental and ecological studies, the variable of interest, say Y, is not easily observable, in the sense that measurement may be expensive, time consuming, invasive or even destructive. Although data collection may be complex, ranking the potential sample units with respect to an available auxiliary variable is often relatively simple and comes at little or no additional cost. In situations where the variation in the study variable is high and the study variable is strongly correlated with the auxiliary variable, Ranked Set Sampling (RSS), proposed by McIntyre (1952), is more efficient than Simple Random Sampling (SRS) (Patil et al., 1993, 1994; Stokes, 1977).
The literature on RSS has grown rapidly, and several estimators originally conceived for SRS have been re-proposed to estimate the mean of the study variable by changing their sampling design to the RSS framework (Ali et al., 2021; Iqbal et al., 2020; Kadilar et al., 2009; Khan & Shabbir, 2016a, 2016b, 2016c; Mandowara & Mehta, 2013; Pelle & Perri, 2018; Samawi & Muttlak, 1996; Singh et al., 2014; Vishwakarma et al., 2017). Motivated by these studies, and in line with many other contributions present in the literature section of this article, we propose an efficient generalized family of estimators by changing the sampling design of the estimator proposed by Shahzad et al. (2019).

Notations under Ranked Set Sampling Design
Let Ω = {1, 2, ..., N} be a finite population of N units, Y the variable under study and X an auxiliary variable that is highly correlated with Y. Let $\mu_y$ and $\mu_x$ denote the population means of Y and X, respectively, $S_y^2$ and $S_x^2$ the variances, $C_y$ and $C_x$ the coefficients of variation, $\rho_{xy}$ the correlation coefficient between X and Y, $\beta_2(x)$ and $\beta_1(x)$ the kurtosis and skewness, and $C_{xy} = \rho_{xy} C_x C_y$. Let $(X_{j(i)}, Y_{j[i]})$ denote the pair formed by the i-th order statistic of X and the associated element of Y in the j-th cycle. Then the ranked set sample is, …
To obtain the biases and mean square errors, we consider the following notation under RSS:
$\tau_{x(i)} = \bar{x}_{(i)} - \mu_x$: deviation of the i-th cycle ranked mean from the population mean $\mu_x$.
$\tau_{y[i]} = \bar{y}_{[i]} - \mu_y$: deviation of the i-th cycle ranked mean from the population mean $\mu_y$.
To obtain the biases and mean square errors, we consider the following notation under SRS: (1.1)
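The sampling scheme behind the pairs $(X_{j(i)}, Y_{j[i]})$ can be sketched in code. The following is a minimal Python illustration rather than the authors' implementation: the function name `ranked_set_sample`, the set size `m`, the number of cycles `r`, and the synthetic population are all assumptions introduced here for illustration. In each cycle, m sets of m units are drawn without replacement, each set is ranked on X only, and the unit with the i-th smallest X in the i-th set is retained together with its associated Y.

```python
import numpy as np

def ranked_set_sample(x, y, m, r, rng):
    """Draw a ranked set sample of size n = m * r from paired arrays x, y.

    Hypothetical helper: ranking uses the auxiliary variable x only, so the
    retained y values are concomitants of the x order statistics.
    Requires m * m <= len(x).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xs = np.empty((r, m))  # xs[j, i] plays the role of X_{j(i)}
    ys = np.empty((r, m))  # ys[j, i] plays the role of Y_{j[i]}
    for j in range(r):
        # m sets of m distinct units for this cycle
        idx = rng.choice(x.size, size=m * m, replace=False).reshape(m, m)
        for i in range(m):
            row = idx[i]
            order = np.argsort(x[row])   # rank the i-th set on x
            pick = row[order[i]]         # unit holding the i-th smallest x
            xs[j, i] = x[pick]
            ys[j, i] = y[pick]
    return xs, ys

# Synthetic population (assumed values, for illustration only)
rng = np.random.default_rng(0)
N = 1000
x_pop = rng.normal(50.0, 10.0, size=N)
y_pop = 2.0 * x_pop + rng.normal(0.0, 5.0, size=N)

xs, ys = ranked_set_sample(x_pop, y_pop, m=4, r=5, rng=rng)
print("RSS mean of y:", ys.mean(), " population mean of y:", y_pop.mean())
```

With `m = 4` and `r = 5`, the sketch returns n = 20 measured units, although 80 units are ranked on X along the way. This is the trade that makes RSS attractive when ranking is cheap and measurement of Y is costly.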

Proposed Generalized Family of Estimators under RSS
Motivated by Shahzad et al. (2019), we propose the following generalized family of estimators under ranked set sampling, (3.1) where $w_1$ and $w_2$ are unknown constants and α, a, b, g, η, �, ϕ and λ are suitably chosen known constants.

Derivation of Bias and Mean Square Error
Rewriting the above estimator in terms of the "e" terms, up to the first order of approximation, we get, Subtracting $\mu_y$ from both sides, (3.2) For the bias, we take expectations on both sides of (3.2); the expression for the bias of $\Delta_{(RSS)G}$ is given as, For the MSE, we square both sides of (3.2) and take expectations; the expression for the MSE of $\Delta_{(RSS)G}$ is given as, To minimize the MSE, we obtain the optimum values of $w_1$ and $w_2$ as follows: Hence, the minimum bias and MSE are given by, (3.3) All of these quantities are sample observations, so we compute them to quantify the bias and MSE of our estimator for any given sample.

Efficiency Comparison
We derive the theoretical conditions under which our proposed generalized family of estimators is more efficient than its competitor estimators.

Simulation Study
Hypothetical data for the study variable (Y) and the auxiliary variable (X) are generated from a bivariate normal distribution with parameters, …
Table 3 shows that, when the correlation coefficient between X and Y equals 0.8 and n = 20, our proposed estimator is 366.52% more efficient than $\Delta_{(SRS)}$ …
Table 4 shows that, when the correlation coefficient between X and Y equals 0.9 and n = 20, our proposed estimator is 437.43% more efficient than $\Delta_{(SRS)sh}$. In the same situation, the proposed estimator is 210.49%, 193 …
Tables 1, 2, 3, 4 and 5 show that, as the sample size increases, the efficiency of the proposed estimators under the RSS design also increases compared with the estimator under the SRS design.
The results also reveal that, as $\rho_{xy}$ increases, the proposed estimator under RSS performs more efficiently than its competitor estimator under SRS (i.e. $\Delta_{(SRS)sh}$). Therefore, as the correlation coefficient between X and Y increases, the use of RSS becomes more appropriate than SRS.
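The qualitative trend reported above, that RSS gains over SRS as $\rho_{xy}$ grows, can be checked with a small Monte Carlo experiment. The sketch below is an illustration under stated assumptions, not a reproduction of the paper's tables: it compares the plain sample mean of Y under SRS and under RSS (not the proposed family, whose formula is not reproduced here), and it assumes a bivariate normal population with zero means and unit variances, set size m = 4 and r = 5 cycles (so n = 20). The names `rss_mean_y` and `relative_efficiency` are hypothetical.

```python
import numpy as np

def rss_mean_y(x, y, m, r, rng):
    """Sample mean of y from one ranked set sample (ranking on x only)."""
    total = 0.0
    for _ in range(r):
        idx = rng.choice(x.size, size=m * m, replace=False).reshape(m, m)
        for i in range(m):
            row = idx[i]
            # keep the y of the unit with the i-th smallest x in set i
            total += y[row[np.argsort(x[row])[i]]]
    return total / (m * r)

def relative_efficiency(rho, m=4, r=5, N=2000, reps=1000, seed=0):
    """Percent relative efficiency 100 * MSE(SRS mean) / MSE(RSS mean)."""
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]  # assumed: unit variances
    pop = rng.multivariate_normal([0.0, 0.0], cov, size=N)
    x, y = pop[:, 0], pop[:, 1]
    mu_y = y.mean()
    n = m * r
    srs = np.array([y[rng.choice(N, size=n, replace=False)].mean()
                    for _ in range(reps)])
    rss = np.array([rss_mean_y(x, y, m, r, rng) for _ in range(reps)])
    mse_srs = np.mean((srs - mu_y) ** 2)
    mse_rss = np.mean((rss - mu_y) ** 2)
    return 100.0 * mse_srs / mse_rss

for rho in (0.5, 0.8, 0.9):
    print(f"rho = {rho}: relative efficiency = {relative_efficiency(rho):.1f}%")
```

Values above 100 indicate that the RSS mean has the smaller empirical MSE. Because ranking is done on X, the gain depends on how well the X ranks order Y, which is why the ratio is expected to rise with the correlation.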

Real-Life Applications
To observe the performance of the estimators, we use the following real-life data sets. The populations are described below.
Population I [source: James et al. (2013)] The summary statistics are given below.
Y: Acceleration of automobiles
X: Engine horsepower of automobiles
Objective: To estimate the population mean of the acceleration of automobiles.
Table 6 shows that, for Population I, our proposed estimator is 363.74% more efficient than $\Delta_{(SRS)sh}$. In the same situation, the proposed estimator is 150.18%, 150.18%, 155.15% and 153.49% more efficient than $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$, $\Delta_{(RSS)mm4}$ and $\Delta_{(RSS)vz}$, respectively.
Table 7 shows that, for Population II, our proposed estimator is 418.12% more efficient than $\Delta_{(SRS)sh}$. In the same situation, the proposed estimator is 140.35%, 140.35%, 177.79% and 151.89% more efficient than $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$, $\Delta_{(RSS)mm4}$ and $\Delta_{(RSS)vz}$, respectively.
Table 8 shows that, for Population III, our proposed estimator is 149.93% more efficient than $\Delta_{(SRS)sh}$. In the same situation, the proposed estimator is 118.81%, 118.81%, 127.66% and 121.52% more efficient than $\Delta_{(RSS)mm1}$, $\Delta_{(RSS)mm2}$, $\Delta_{(RSS)mm4}$ and $\Delta_{(RSS)vz}$, respectively.
Table 9 shows that, for Population IV, our proposed estimator is 221.91% more efficient than $\Delta_{(SRS)sh}$. In the same situation, the proposed …

Conclusion
In this study, we proposed a generalized family of estimators under RSS to estimate the finite population mean, motivated by Shahzad et al. (2019). The biases and MSEs of the proposed estimators were derived up to the first order of approximation. The efficiency conditions for the proposed generalized estimator were also derived. The MSEs of all estimators were computed for the simulation study and the real-life data sets, and the results show that the proposed generalized family of estimators is more efficient than the competitor estimators under SRS and RSS. It may be concluded that, with an increase in sample size and $\rho_{xy}$, the proposed estimator under RSS performs more efficiently than its competitor estimator under SRS (i.e. $\Delta_{(SRS)sh}$).