Rate of complete second-order moment convergence and theoretical applications

ABSTRACT The purpose of this work is to present a novel mode of convergence, complete second-order moment convergence with a rate, which implies almost complete convergence and yields a sharper rate of convergence. Indeed, this mode is easier to establish and performs better than almost complete convergence for the nonparametric kernel estimators of the density function, the distribution function and the quantile function. A great advantage of the proposed approach is that fewer conditions are imposed on the kernel function, thanks to the use of the mean squared error expression.


Introduction
The rate of convergence is an important concept that describes the speed at which a sequence converges to its limit. In this setting, a fundamental question is how fast the convergence is; illuminating theoretical studies on the subject have been carried out, proposing important new results, algorithms and applications to address this issue. The purpose of this work is to present a novel mode of convergence, namely complete second-order moment convergence with a rate. The novelty lies in the introduction of the complete second-order moment convergence rate, and we prove several properties of this kind of convergence. Almost complete convergence is induced by second-order convergence in various contexts (see [1][2][3] and, more recently, Yu et al. [29]). The complete convergence concept was introduced by Hsu and Robbins [4] and has since been used by several authors, such as Gut and Stadtmüller [5], Gut [6,7], Li et al. [8], Sung [9], and Sung and Volodin [10]. The interest of such a notion lies in the fact that almost complete convergence (a.c.) implies almost sure convergence (a.s.) thanks to the Borel-Cantelli lemma. However, in some situations at least, it is much easier to obtain complete second-order moment convergence (c.s.m.) than almost complete convergence.
As a practical framework, rates of complete second-order moment convergence are established for the kernel estimators of the probability density, the distribution function and the quantile function, and we discuss the speed at which these estimators converge. First, in the context of estimating a probability density function, many methods have been proposed. The kernel method is among the most convenient of these, as it does not require choosing multiple parameters. Rosenblatt [11] pioneered the class of kernel density estimators, which depend on two ingredients, namely the kernel and the bandwidth. For this estimator, convergence in probability was established by Parzen [12]. Habbema et al. [13], Hall and Kang [14], Hall and Wand [15], Ghosh and Chaudhury [16] and Ghosh and Hall [17] can be consulted for various works on the subject, in particular the estimation of densities by classical kernels. In the case of independent observations, optimal rates of convergence to zero for the mean squared error and the bias of kernel estimators have been addressed by several authors under varying conditions on the kernel K and the density f. As a contribution, complete second-order moment convergence with a rate is introduced here for the first time, improving the rates of convergence for the bias and the mean squared error (MSE) of kernel density estimators. Complete convergence of the kernel density estimator is then achieved under weaker conditions on the density function than those proposed in the literature. As a consequence of the complete second-order moment convergence, almost complete convergence is obtained with a better rate.
Second, the proposed mode of convergence is applied to the kernel estimators of the distribution function and of the quantiles. Recall that Nadaraya [18] proposed the kernel estimator of the distribution function, while Parzen [19], following the framework of Nadaraya [18], constructed the kernel quantile estimators for which we establish the rate of almost complete convergence. To the best of our knowledge, this is a new result obtained from the proposed convergence mode. A great advantage of the proposed approach is that fewer conditions are imposed on the kernel function, thanks to the use of the mean squared error expression.

Complete second-order moment convergence with a rate
Throughout this paper, real-valued random variables are defined on a fixed probability space (Ω, A, P).
Let (U_n)_{n∈N} and (V_n)_{n∈N} be two sequences of real numbers, and assume that (V_n)_{n∈N} does not vanish from a certain rank. We say that (U_n)_{n∈N} is dominated by (V_n)_{n∈N} if there exist a real number M and an integer n_0 such that |U_n| ≤ M|V_n| for all n ≥ n_0, and we write U_n = O(V_n).

Definition 2.1:
A sequence (X_n)_{n∈N} of random variables is said to converge completely in second-order moment (c.s.m.) to the random variable X, with convergence rate 1/√U_n, if

∑_{n∈N} U_n E(X_n − X)² < ∞,

where (U_n) is a sequence of positive numbers. We denote this mode of convergence by X_n → X (c.s.m.) with rate 1/√U_n. The following theorem shows that if (X_n)_{n∈N} converges in complete second-order moment to X with a rate 1/√U_n, then it converges almost completely to X. Moreover, for β > −1, ε > 0 and the sequence (V_n)_{n∈N} = (n^β U_n)_{n∈N} satisfying ∑_{n∈N} n^β U_n E(X_n − X)² < ∞, we obtain

∑_{n∈N} n^β P(|X_n − X| > ε) ≤ ∑_{n∈N} n^β U_n E(X_n − X)² < ∞.
Proof: Suppose that ∑_{n∈N} U_n E(X_n − X)² < ∞. Since 1/√U_n is a rate of convergence, 1/√U_n → 0, which is equivalent to saying that for each ε > 0 there exists n_0 ∈ N such that, for all n ≥ n_0, 1/√U_n ≤ ε, that is, 1/ε² ≤ U_n. The Markov inequality then gives, for n ≥ n_0,

P(|X_n − X| > ε) ≤ E(X_n − X)²/ε² ≤ U_n E(X_n − X)²,

so that ∑_{n∈N} P(|X_n − X| > ε) < ∞, which is the almost complete convergence of (X_n) to X. The case of the sequence (V_n) = (n^β U_n) is handled in the same way. The following proposition gives some elementary calculus rules.
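As a purely numerical illustration of this implication, the sketch below takes a hypothetical model E(X_n − X)² = n⁻³ with rate sequence U_n = n (both choices are assumptions made for the example, not taken from the text) and checks that the Markov-type bound yields a finite tail series:

```python
import math

# Hypothetical model: E(X_n - X)^2 = n**-3 and U_n = n, so that
# sum_n U_n * E(X_n - X)^2 = sum_n n**-2 < infinity (c.s.m. holds).
def second_moment(n):      # assumed second moment E(X_n - X)^2
    return n ** -3

def U(n):                  # assumed rate sequence, 1/sqrt(U_n) -> 0
    return float(n)

eps = 0.1
# Smallest n0 with 1/sqrt(U_n) <= eps, i.e. 1/eps**2 <= U_n: here n0 = 100.
n0 = next(n for n in range(1, 10**6) if 1 / math.sqrt(U(n)) <= eps)

# Partial sums of the two series appearing in the proof.
csm_sum = sum(U(n) * second_moment(n) for n in range(1, 100_000))
tail_sum = sum(second_moment(n) / eps**2 for n in range(n0, 100_000))

# The Markov bound keeps the tail series below the c.s.m. series from n0 on,
# which gives the almost complete convergence of X_n to X.
print(n0, round(csm_sum, 4), round(tail_sum, 6))
```

Here the partial sums stay bounded, mirroring the finiteness argument of the proof.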
Proof: Part (1) is immediate from the corresponding inequality, and part (2) follows by direct computation. We now state two properties which are consequences of the previous calculus rules.
Proof: Follows directly from Proposition 2.1.

Theoretical applications
In this section, rates of complete second-order moment convergence are established for the kernel estimators of the probability density, the distribution function and the quantile function. The following remark is very important for establishing these rates: the convergence of the corresponding series implies that of ∑_{n≥1} U_n h_n⁴; indeed, since h_n → 0 as n → ∞, we have h_n < 1 from a certain rank, and the convergence is obtained.

Kernel density estimator
Let X_1, X_2, ..., X_n be independent and identically distributed copies of a random variable X with unknown continuous probability density function f. The kernel density estimator f̂_n of the unknown density f, defined by Parzen [12] and Rosenblatt [11], is given by

f̂_n(x) = (1/(n h_n)) ∑_{i=1}^n K((x − X_i)/h_n),

where (h_n)_{n≥1} is a sequence of positive numbers, usually called the bandwidth or smoothing parameter, and K is an integrable Borel measurable function satisfying K ≥ 0 and ∫_R K(x) dx = 1, called the kernel. Assume that the kernel K and the density f verify conditions (H1)-(H4). Proof: To prove (1), notice first that the mean squared error MSE of f̂_n can be decomposed as

MSE(f̂_n(x)) = Var(f̂_n(x)) + Bias(f̂_n(x))².

Hence, to prove (1), it suffices to show that ∑_{n≥1} U_n Var(f̂_n(x)) < ∞ and ∑_{n≥1} U_n Bias(f̂_n(x))² < ∞.
For the inequality (3), using condition (H2) on the kernel K, the variance term is controlled. On the other hand, for the inequality (4), using a Taylor series expansion of the function f about the point x up to order 3, together with (H3), (H1) and (H4), one obtains the bound on the bias, where θ is a real number between x and x − z h_n. Hence ∑_{n≥1} U_n Bias(f̂_n(x))² ≤ C ∑_{n≥1} U_n h_n⁴ < ∞. The proof is completed.
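For readers who want to experiment, a minimal sketch of the Parzen-Rosenblatt estimator f̂_n discussed above can be written as follows; the Gaussian kernel and the standard normal sample are illustrative choices, not prescribed by the text:

```python
import math
import random

def gaussian_kernel(u):
    """Standard Gaussian kernel K(u) (illustrative choice of K)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def kde(x, sample, h):
    """Parzen-Rosenblatt estimator: (1/(n h)) * sum_i K((x - X_i)/h)."""
    n = len(sample)
    return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (n * h)

random.seed(0)
n = 2000
h = n ** (-1 / 5)                      # bandwidth of the optimal order n^(-1/5)
sample = [random.gauss(0.0, 1.0) for _ in range(n)]

x = 0.0
true_f = gaussian_kernel(x)            # N(0,1) density at 0, about 0.3989
est = kde(x, sample, h)
print(est, abs(est - true_f))          # estimate and absolute error at x = 0
```

With this sample size the absolute error at the evaluation point is small, consistent with the MSE decomposition into variance of order 1/(n h_n) and squared bias of order h_n⁴.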

Remark 3.2:
If K is a symmetric, compactly supported kernel, we obtain the complete second-order moment convergence of f̂_n under (H4).
The choices of U_n and h_n are not arbitrary, because their expressions must be selected so that the convergence of the resulting series is ensured. In particular, the condition α > 2 is then always verified.

Corollary 3.1: Under (H1)-(H4), the almost complete convergence of f̂_n to f holds.
Proof: Take the optimal bandwidth h_n = n^{−1/5} and the rate sequence U_n = n h_n/(n^α (log n)²), which satisfies the inequality U_n < n h_n/log n, where n h_n/log n is the rate of almost complete convergence of the density kernel estimator. Combining the convergence conditions of the two series on the right-hand side, we obtain (1) for α > 1.
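The inequality U_n < n h_n/log n invoked above can be checked numerically; α = 2 below is an illustrative value satisfying α > 1, not one fixed by the corollary:

```python
import math

# With h_n = n**(-1/5) and U_n = n*h_n / (n**alpha * (log n)**2), the ratio
# U_n / (n*h_n / log n) equals 1 / (n**alpha * log n), which is below 1 and
# shrinks as n grows. alpha = 2 is an illustrative choice.
alpha = 2.0

def h(n):
    return n ** (-1 / 5)

def U(n):
    return n * h(n) / (n ** alpha * math.log(n) ** 2)

def ac_rate(n):
    return n * h(n) / math.log(n)

ratios = [U(n) / ac_rate(n) for n in (10, 100, 1000)]
print(ratios)   # each ratio is below 1 and decreases with n
```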

Kernel distribution function estimator
Let X_1, X_2, ..., X_n be independent and identically distributed copies of a random variable X with unknown continuous probability density f and distribution function F. The kernel distribution function estimator F̂_n, proposed by Nadaraya [18], is obtained by integrating the kernel density estimator f̂_n:

F̂_n(x) = (1/n) ∑_{i=1}^n H((x − X_i)/h_n),

where the function H is defined from the kernel K as H(x) = ∫_{−∞}^x K(t) dt. The function H is a cumulative distribution function because K is a probability density function. Assume that hypotheses (H5)-(H8) are satisfied. Then Theorem 3.2 states the complete second-order moment convergence of F̂_n to F.
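A minimal sketch of the Nadaraya estimator F̂_n, with the Gaussian kernel CDF as an illustrative choice of H, may look as follows:

```python
import math
import random

def H(u):
    """CDF of the Gaussian kernel: H(u) = integral of K over (-inf, u]."""
    return 0.5 * (1 + math.erf(u / math.sqrt(2)))

def kdf(x, sample, h):
    """Nadaraya estimator: F_hat_n(x) = (1/n) * sum_i H((x - X_i)/h)."""
    return sum(H((x - xi) / h) for xi in sample) / len(sample)

random.seed(1)
n = 1000
h = n ** (-1 / 3)                     # bandwidth of the optimal order for the CDF
sample = [random.gauss(0.0, 1.0) for _ in range(n)]

true_F = 0.5                          # standard normal CDF at 0
print(abs(kdf(0.0, sample, h) - true_F))
```

The estimator is itself a smooth distribution function, nondecreasing in x, which the standard empirical CDF shares but without smoothness.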
Proof: To prove (5), we use the same argument as for (1). First, for the bias, using integration by parts, the substitution (x − z)/h_n = y, a Taylor series expansion of the function F about the point x up to order 2, and (H1), (H3) and (H8), we obtain ∑_{n≥1} U_n Bias(F̂_n(x))² ≤ C ∑_{n≥1} U_n h_n⁴ < ∞.

Now, for the variance, using (H1), (H5)-(H8), integration by parts and the same substitution, we obtain a bound involving two constants C_1 and C_2. Since (h_n) converges to 0, for every ε > 0 there exists n_0 such that h_n ≤ ε for all n ≥ n_0; then hypotheses (H5) and (H7) are verified (see [20]). The MSE of the kernel distribution estimator F̂_n can also be obtained with the assumptions used by Azzalini [21]: for a kernel satisfying the above assumptions, the squared bias and the variance yield the MSE, so that under (H8) we obtain ∑_{n≥1} U_n MSE(F̂_n(x)) < ∞.
The next corollary gives a new rate of the almost complete convergence of F̂_n to F. The last series converges for α > 1, and the two right-hand-side series converge simultaneously if α > 5/3.

Kernel quantile function estimator
Let X_1, X_2, ..., X_n be independent and identically distributed copies of a random variable with absolutely continuous distribution function F, and denote by X_(1) ≤ X_(2) ≤ ... ≤ X_(n) the corresponding order statistics. The quantile function Q is defined as the left-continuous inverse of F, given by Q(p) = inf{x : F(x) ≥ p}, 0 < p < 1. A kernel quantile estimator Q̂_n, based on the Nadaraya [18] kernel distribution function estimator, is defined as Q̂_n(p) = F̂_n^{−1}(p) and is given by a weighted sum of the order statistics, where K is a density function and h_n → 0 as n → ∞.
Our convergence result is based on the expression of the MSE of the kernel quantile estimator given by Sheather and Marron [22] (Theorem 1, p. 5) when p lies in the interior of (0, 1), under the conditions that the kernel K is symmetric about 0 with compact support and that Q^{(2)} is continuous in a neighbourhood of p.
Proof: Building on Falk [23] and David [24], Sheather and Marron [22] give the expressions of the bias and variance of Q̂_n. Finally, we obtain the almost complete convergence and the complete second-order moment convergence of the kernel quantile estimator Q̂_n.
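As a sketch, a kernel quantile estimator of the weighted-order-statistics form can be implemented as below; the Gaussian kernel and the bandwidth order n^{−1/2} are assumptions made for this example, not values fixed by the text:

```python
import math
import random

def Phi(u):
    """CDF of the Gaussian kernel (illustrative choice)."""
    return 0.5 * (1 + math.erf(u / math.sqrt(2)))

def kernel_quantile(p, sample, h):
    """Weighted sum of order statistics:
    Q_hat_n(p) = sum_i X_(i) * [H((i/n - p)/h) - H(((i-1)/n - p)/h)]."""
    xs = sorted(sample)
    n = len(xs)
    return sum(xs[i] * (Phi(((i + 1) / n - p) / h) - Phi((i / n - p) / h))
               for i in range(n))

random.seed(2)
n = 2000
h = n ** (-1 / 2)          # assumed bandwidth order for the example
sample = [random.gauss(0.0, 1.0) for _ in range(n)]

print(kernel_quantile(0.5, sample, h))   # should be near the true median 0
```

The kernel weights concentrate around the order statistic of rank pn, so the estimator smooths the sample quantile rather than replacing it.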

Example 3.3:
In the same way as in Example 3.2, and using the same h_n and U_n, we obtain the analogous result.

Simulation study
In this section, to assess the performance of the new rate of convergence for a finite sample size, we carry out a simulation study. We give a visual impression of the quality of convergence by computing the corresponding MSE together with the value of the rate of complete second-order moment convergence and the value of the rate of almost complete convergence, based on samples drawn from two theoretical models: the Gamma kernel density estimator and the Laplace kernel density estimator developed by Khan and Akbar [25], inspired by Chen's idea [26]. In the second part, the Normal and Epanechnikov kernel distribution estimators are used with the optimal bandwidth and different sizes of normal and exponential samples to assess the performance of the new rate of convergence. Finally, with the Normal model, we compute the quantile (25%, 50%, 75%) estimates for the same sample sizes.

Kernel density MSE
We propose two schemes of kernel estimation, the Laplace and Gamma kernel density estimators, with optimal bandwidth h = n^{−1/5}. We conduct simulations of data sampled from Exponential and Gamma densities with sample sizes n = 100, 150, 200, 250, 300, 500, 800, 1000. The numerical results are summarized in Table 1.
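The MSE entries of such a table can be approximated by Monte Carlo along the following lines; the Gaussian kernel used here stands in for the Laplace and Gamma kernels of the study, and the replication count and evaluation point are assumptions made for the example:

```python
import math
import random

def gauss_k(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def kde(x, sample, h):
    """Kernel density estimate at x with bandwidth h."""
    return sum(gauss_k((x - xi) / h) for xi in sample) / (len(sample) * h)

def mc_mse(n, x=1.0, reps=200):
    """Monte Carlo MSE at x of the KDE built on Exp(1) samples, h = n**(-1/5)."""
    h = n ** (-1 / 5)
    true_f = math.exp(-x)                     # Exp(1) density at x
    err = 0.0
    for _ in range(reps):
        sample = [random.expovariate(1.0) for _ in range(n)]
        err += (kde(x, sample, h) - true_f) ** 2
    return err / reps

random.seed(3)
results = {n: mc_mse(n) for n in (100, 300, 1000)}
print(results)   # the MSE shrinks as the sample size grows
```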
The CSM convergence rate is more efficient in terms of speed towards zero. Even though the AC convergence rate is almost efficient, we observe the good behaviour of the new rate for both kernel models. According to the results in Table 1, the values of the CSMC rate of the kernel density estimator for the Laplace kernel (resp. Gamma kernel) are closer to the Exp MSE (resp. Gamma MSE) values than those of the ACC rate.

Kernel distribution MSE
In this case, the Normal and Epanechnikov kernel distribution estimators with optimal bandwidth h = n^{−1/3} are compared on normal samples, with sizes varying between 100 and 1000. The numerical results are summarized in Table 2. Here too, we notice the fast convergence speed of the CSM rate, which gives good results for the Normal kernel. From the results in Table 2, in both the Normal and Epanechnikov kernel distribution cases, the CSMC rate values of the kernel distribution estimator are closer to the normal MSE and Exp MSE values, respectively, than those of the ACC rate.

Kernel quantile MSE
Now we calculate the MSE of the normal kernel quantile estimator, with sizes varying between 100 and 1000. The numerical results are summarized in Table 3. We again observe that the CSMC rate values are closer to the kernel quantile estimators' MSE than the ACC rate values.
Eventually, we conclude that, in all cases, the CSMC rate of the kernel density, distribution and quantile estimators, gives better results than the ACC rate of the same estimators.

Real data analysis
Female infertility and BMI: This study aims to investigate the body mass index (BMI) of infertile women of childbearing age. We use data from 200 participants from the Ben Badis University Hospital Centre of Constantine, Algeria, in 2018.
The first step is to test the conformity of the sample with a normal distribution. The test gives D = 0.079226 with p-value 0.1623; since the p-value exceeds the usual significance levels, normality is not rejected, which justifies the use of the density and the distribution of the normal law. The second step is to estimate the density and distribution functions and to represent them graphically (Figure 1). The MSE between the real density and the normal kernel density estimate equals 0.003128583, the rate of CSM convergence is 0.0008729775 and that of AC convergence is 0.06541179. For the distribution, we obtain MSE = 0.003128583, a rate of CSM convergence of 0.0047592 and a rate of AC convergence of 0.1627624. For the same rates of convergence, the MSE of the quantiles is given by (25%, 0.0260281), (50%, 0.04186125), (75%, 0.058767206). Note that the real-data results support those of the simulation.
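The first-step conformity check can be reproduced in spirit as follows; since the BMI data are not available here, a synthetic normal sample of size 200 stands in for them, and the Kolmogorov-Smirnov distance to the fitted normal law is computed directly:

```python
import math
import random

def norm_cdf(x, mu, sigma):
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def ks_statistic(sample, mu, sigma):
    """Kolmogorov-Smirnov distance between the empirical CDF and N(mu, sigma)."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = norm_cdf(x, mu, sigma)
        d = max(d, abs((i + 1) / n - f), abs(i / n - f))
    return d

random.seed(4)
# Synthetic stand-in for the 200 BMI values (mean and spread are assumptions).
sample = [random.gauss(25.0, 4.0) for _ in range(200)]
mu = sum(sample) / len(sample)
sigma = (sum((x - mu) ** 2 for x in sample) / (len(sample) - 1)) ** 0.5
print(round(ks_statistic(sample, mu, sigma), 4))   # a small D supports normality
```

For a p-value one would consult the Kolmogorov distribution (or a statistics library); the statistic D alone already measures the distance to the fitted normal model.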

Conclusion
The present work proposed a new mode of convergence whose rate, obtained from the MSE, is much more efficient in the CSMC case. Indeed, the CSM convergence rate gives better results than that of AC convergence. The previous results indicate that this type of convergence, combined with kernel estimation, is a good alternative to almost complete convergence. It can be applied in any estimation problem that requires the study of the MSE, such as neural networks or the least squares method, and, following the suggestion, in the neutrosophic statistics developed by Smarandache [27] and Afzal et al. [28]. Moreover, this type of convergence might be applied to extend the law of large numbers.

Disclosure statement
No potential conflict of interest was reported by the authors.