Asymptotic properties of Kaplan–Meier estimator and hazard estimator for censored survival time with LENQD data

In this paper, we consider estimators of the distribution function and hazard rate for censored survival times. First, some properties and inequalities are established for linearly extended negative quadrant-dependent (LENQD) sequences as auxiliary results. Then, by applying these properties and inequalities, we investigate the strong consistency and strong representation of the Kaplan–Meier estimator and the hazard rate estimator with censored LENQD data. Under some mild conditions, we show that the rates of strong consistency are near $O(n^{-1/2}\log^{1/2} n)$ and also obtain strong representations with a remainder of order $O(n^{-1/2}\log^{1/2} n)$. The results established here extend and generalize the corresponding ones in the recent literature.

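Definition 1.1 (see Lehmann, 1966): The pair of random variables $(X, Y)$ is said to be negatively quadrant dependent (NQD) if, for all real numbers $x$ and $y$, $P(X \le x, Y \le y) \le P(X \le x)P(Y \le y)$.

Definition 1.2 (see Block et al., 1982): A finite family of random variables $\{X_1, \ldots, X_n\}$ is said to be negatively dependent (ND) if, for all real numbers $x_1, \ldots, x_n$, both inequalities $P(X_1 \le x_1, \ldots, X_n \le x_n) \le \prod_{i=1}^{n} P(X_i \le x_i)$ and $P(X_1 > x_1, \ldots, X_n > x_n) \le \prod_{i=1}^{n} P(X_i > x_i)$ hold. An infinite sequence $\{X_n, n \ge 1\}$ is said to be ND if every finite subsequence is ND.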

Definition 1.3 (see Joag-Dev & Proschan, 1983): A finite family of random variables $\{X_1, \ldots, X_n\}$ is said to be negatively associated (NA) if, for every pair of disjoint subsets $A$ and $B$ of $\{1, 2, \ldots, n\}$, $\mathrm{Cov}(f_1(X_i, i \in A), f_2(X_j, j \in B)) \le 0$ whenever $f_1(\cdot)$ and $f_2(\cdot)$ are coordinatewise increasing (or decreasing) and the covariance exists. An infinite sequence $\{X_n, n \ge 1\}$ is said to be NA if every finite subsequence is NA.
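Definition 1.4 (see Newman, 1984): A sequence of random variables $\{X_n, n \ge 1\}$ is said to be linearly negative quadrant dependent (LNQD) if, for any disjoint subsets $A, B \subset \{1, 2, \ldots\}$ and positive real numbers $\{r_1, r_2, \ldots\}$ or negative real numbers $\{r_1, r_2, \ldots\}$, the pair $\left(\sum_{i \in A} r_i X_i, \sum_{j \in B} r_j X_j\right)$ is NQD.

Definition 1.5 (see Liu, 2009): A finite family of random variables $\{X_1, \ldots, X_n\}$ is said to be extended negatively dependent (END) if there exists a constant $M \ge 1$ such that both inequalities $P(X_1 \le x_1, \ldots, X_n \le x_n) \le M \prod_{i=1}^{n} P(X_i \le x_i)$ and $P(X_1 > x_1, \ldots, X_n > x_n) \le M \prod_{i=1}^{n} P(X_i > x_i)$ hold for all real numbers $x_1, \ldots, x_n$. An infinite sequence $\{X_n, n \ge 1\}$ is said to be END if every finite subsequence is END.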
From the above-mentioned literature and definitions, we know that NQD, ND, NA, LNQD and END sequences are widely used in multivariate statistical analysis, reliability theory and survival analysis, and have received more and more attention recently. For example, Wang and Zhang (2006) provided a Berry-Esseen theorem for LNQD sequences; Ko et al. (2007) discussed the strong convergence and central limit theorem for weighted sums of LNQD sequences; Wang et al. (2010) established some exponential inequalities and used them to obtain complete convergence for LNQD sequences; Li et al. (2012) provided some inequalities for LNQD sequences and used them to investigate the asymptotic normality of the weighted function estimator of a regression function; Shen and Zhu (2015) studied the complete convergence of weighted sums of LNQD sequences and presented some complete convergence results for arrays of rowwise LNQD sequences; Hu and Jiang (2018) discussed the uniformly asymptotic normality of the sample quantile estimator for LNQD sequences; Hu and Wang (2020) investigated the Berry-Esseen bound of the wavelet estimator in a nonparametric regression model based on LNQD sequences.
Furthermore, we note that an END sequence is more general than an ND sequence in that it can reflect not only a negative dependence structure but also a positive one to some extent. Motivated by END and LNQD sequences, Li et al. (2023) put forth the definition of linearly extended negative quadrant dependence (LENQD) as follows.
Definition 1.6 (see Li et al., 2023): The pair $(X, Y)$ is said to be ENQD if there exists a dominating constant $M \ge 1$ such that both inequalities $P(X \le x, Y \le y) \le M P(X \le x)P(Y \le y)$ and $P(X > x, Y > y) \le M P(X > x)P(Y > y)$ hold for every real $x, y$. A sequence of random variables $\{X_n, n \ge 1\}$ is said to be EPNQD if, for all $i, j \ge 1$, $i \ne j$, the pair $(X_i, X_j)$ is ENQD. A sequence of random variables $\{X_n, n \ge 1\}$ is said to be LENQD with dominating constant $M \ge 1$ if, for any disjoint subsets $A, B \subset \{1, 2, \ldots\}$ and positive real numbers $\{l_1, l_2, \ldots\}$ or negative real numbers $\{l_1, l_2, \ldots\}$, the pair $\left(\sum_{i \in A} l_i X_i, \sum_{j \in B} l_j X_j\right)$ is ENQD with dominating constant $M$.

From Examples 1.1 and 1.2 and Remark 1.1 in Li et al. (2023), it is easily seen that independent random variables, as well as NA, END and LNQD random variables, are LENQD random variables. Therefore, LENQD sequences also have wide applications in multivariate statistical analysis, reliability theory and survival analysis.
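The two ENQD inequalities can be checked numerically for a discrete pair. The following is only an illustrative sketch: the function name `min_dominating_constant` and the toy distributions are assumptions made here, not objects from Li et al. (2023). It computes the smallest $M \ge 1$ for which both inequalities hold, using the fact that for a discrete pair it suffices to test thresholds at the support points.

```python
from itertools import product


def min_dominating_constant(pmf):
    """Smallest M >= 1 satisfying both ENQD inequalities for a discrete pair.

    pmf maps (x, y) -> P(X = x, Y = y). Hypothetical helper for illustration.
    """
    xs = sorted({x for x, _ in pmf})
    ys = sorted({y for _, y in pmf})
    m = 1.0
    for x, y in product(xs, ys):
        # Lower-orthant side: P(X <= x, Y <= y) <= M P(X <= x) P(Y <= y)
        p_x = sum(p for (a, _), p in pmf.items() if a <= x)
        p_y = sum(p for (_, b), p in pmf.items() if b <= y)
        p_xy = sum(p for (a, b), p in pmf.items() if a <= x and b <= y)
        if p_x * p_y > 0:
            m = max(m, p_xy / (p_x * p_y))
        # Upper-orthant side: P(X > x, Y > y) <= M P(X > x) P(Y > y)
        q_x = sum(p for (a, _), p in pmf.items() if a > x)
        q_y = sum(p for (_, b), p in pmf.items() if b > y)
        q_xy = sum(p for (a, b), p in pmf.items() if a > x and b > y)
        if q_x * q_y > 0:
            m = max(m, q_xy / (q_x * q_y))
    return m


# X = Y ~ Bernoulli(1/2): positively dependent, yet ENQD with M = 2.
comonotone = {(0, 0): 0.5, (1, 1): 0.5}
# Y = 1 - X: negatively dependent (NQD), so M = 1 suffices.
countermonotone = {(0, 1): 0.5, (1, 0): 0.5}

print(min_dominating_constant(comonotone))       # 2.0
print(min_dominating_constant(countermonotone))  # 1.0
```

The first toy pair shows how a dominating constant larger than 1 lets the definition accommodate some positive dependence, while the second is NQD in the sense of Definition 1.1.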
The Kaplan-Meier estimator is one of the most commonly used tools in survival analysis and has received considerable attention in the literature. For example, Liebscher (2002) derived the rates of uniform strong convergence for kernel density estimators and hazard rate estimators under right censoring based on a stationary strong-mixing sequence; Liang and Una-Alvarez (2009) studied the convergence of the Kaplan-Meier estimator based on stationary strong-mixing data; Wu and Chen (2013) established strong representation results for the Kaplan-Meier estimator with censored NA data. More recently, Shen and Wang (2016) investigated the strong convergence properties of the Kaplan-Meier estimator and hazard estimator based on censored NSD data; Anevski (2017) derived limit distribution results for the Kaplan-Meier estimator of the distribution function based on right-censored observations from a stationary time series; Li and Zhou (2019) and Li and Zhou (2020) investigated the Kaplan-Meier estimator and hazard estimator for censored WOD and END data, respectively; Ahmed and Flandre (2020) proposed a weighted Kaplan-Meier estimator to estimate the HIV-1 RNA reduction; Nematolahi et al. (2020) derived the asymptotic distribution of the Kaplan-Meier estimator under the PROS sampling design; Wu et al. (2022) investigated the rates of strong consistency and the strong representations for the Kaplan-Meier estimator and hazard estimator with censored WOD data.
However, the work mentioned above focuses mainly on strong-mixing, NA, NSD, END and WOD data. Noting that NA, END and LNQD imply LENQD, we naturally ask whether those results still hold, in theory, for the more comprehensive class of LENQD data. It is therefore of interest to study the Kaplan-Meier estimator and hazard rate estimator of survival time based on censored LENQD data. In this paper, we investigate the strong consistency and the strong representation of the Kaplan-Meier estimator and hazard rate estimator with censored LENQD data.
The remaining parts of this paper are structured as follows. Section 2 briefly provides some notations, properties and inequalities, while Section 3 is devoted to setting up some auxiliary results. Finally, we discuss and derive the strong consistency and strong representation of the Kaplan-Meier estimator and hazard estimator based on censored LENQD data in Section 4.

Some notations, properties and inequalities
To fix the notation used below, we begin with a brief review of the Kaplan-Meier estimator and the hazard rate estimator.
Let $\{X_1, \ldots, X_n\}$ be survival times with an unknown continuous distribution function $F(x) = P(X_i \le x)$ satisfying $F(0) = 0$, and let $\{Y_1, \ldots, Y_n\}$ be random censoring times with an unknown distribution function $G(y) = P(Y_i \le y)$ satisfying $G(0) = 0$. Suppose that $\{X_1, \ldots, X_n\}$ and $\{Y_1, \ldots, Y_n\}$ are mutually independent. Let the survival times $X_i$ be censored on the right by the censoring times $Y_i$, so that one can only observe $(T_i, \delta_i)$, where $T_i = X_i \wedge Y_i$ and $\delta_i = I(X_i \le Y_i)$. Here and thereafter, $a \wedge b$ means the minimum of $a$ and $b$, and $I(A)$ means the indicator function of an event $A$.
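As a small illustration of the observation scheme, the sketch below builds the observable pairs from survival and censoring times, assuming the usual right-censoring convention $T_i = X_i \wedge Y_i$ and $\delta_i = I(X_i \le Y_i)$. The function name `censor` and the exponential toy data are assumptions for illustration only.

```python
import random


def censor(survival_times, censoring_times):
    """Return the observable pairs (T_i, delta_i) under right censoring.

    T_i = min(X_i, Y_i); delta_i = 1 if the survival time is observed
    (X_i <= Y_i) and 0 if it is censored.
    """
    return [(min(x, y), int(x <= y))
            for x, y in zip(survival_times, censoring_times)]


# Toy data: independent exponential survival and censoring times.
rng = random.Random(0)
xs = [rng.expovariate(1.0) for _ in range(5)]
ys = [rng.expovariate(0.5) for _ in range(5)]
for t, delta in censor(xs, ys):
    print(f"T = {t:.3f}, delta = {delta}")
```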
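Denote the number of uncensored observations less than or equal to $t$ by $N_n(t) = \sum_{i=1}^{n} I(T_i \le t, \delta_i = 1)$ and the number of censored or uncensored observations greater than or equal to $t$ by $Y_n(t) = \sum_{i=1}^{n} I(T_i \ge t)$. The Kaplan-Meier estimator of $F(x)$ was proposed by Kaplan and Meier (1958) as follows: $\hat{F}_n(x) = 1 - \prod_{s \le x}\left(1 - \frac{dN_n(s)}{Y_n(s)}\right)$, where $dN_n(s) = N_n(s) - N_n(s-)$. It is further assumed that $F(x)$ has a density $f(x)$, and then the hazard rate $\lambda(x)$ can be written as $\lambda(x) = f(x)/\bar{F}(x)$, where $F(x) < 1$ and $\bar{F}(x) = 1 - F(x)$.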
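The distribution function and empirical distribution function of $\{T_n, n \ge 1\}$ are defined as $L(t) = P(T_1 \le t) = 1 - (1 - F(t))(1 - G(t))$ and $L_n(t) = n^{-1}\sum_{i=1}^{n} I(T_i \le t)$. The cumulative and empirical cumulative hazard rate functions are defined by $\Lambda(t) = \int_0^t \frac{dF^*(s)}{1 - L(s-)}$ and $\hat{\Lambda}_n(t) = \int_0^t \frac{dF_n^*(s)}{1 - L_n(s-)}$, where $F^*(t)$ and its empirical distribution function $F_n^*(t)$ are defined as $F^*(t) = P(T_1 \le t, \delta_1 = 1)$ and $F_n^*(t) = n^{-1}\sum_{i=1}^{n} I(T_i \le t, \delta_i = 1)$.

Noting that $N_n(t)$ is a step function and $dN_n(T_{(k)}) = \delta_{(k)}$, $k = 1, 2, \ldots, n$, we can easily rewrite the above estimators $\hat{\Lambda}_n(x)$ and $\hat{F}_n(x)$ as follows: $\hat{\Lambda}_n(x) = \sum_{i=1}^{n} \frac{\delta_{(i)} I(T_{(i)} \le x)}{n - i + 1}$ and $\hat{F}_n(x) = 1 - \prod_{i:\, T_{(i)} \le x}\left(1 - \frac{\delta_{(i)}}{n - i + 1}\right)$, where $T_{(1)} \le T_{(2)} \le \cdots \le T_{(n)}$ denotes the order statistics of $T_1, T_2, \ldots, T_n$, and $\delta_{(i)}$ is the concomitant of $T_{(i)}$.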
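The order-statistic form of the two estimators lends itself to a direct computation. The sketch below is a minimal illustration and not the authors' implementation. It assumes no ties among the observed times (reasonable when $F$ is continuous) and uses the standard forms in which the $i$-th ordered observation contributes $\delta_{(i)}/(n-i+1)$ to the cumulative hazard and a factor $1 - \delta_{(i)}/(n-i+1)$ to the survival product. The function name `km_and_cumhaz` is hypothetical.

```python
def km_and_cumhaz(obs):
    """Kaplan-Meier and cumulative hazard estimates at each ordered time.

    obs: list of (T_i, delta_i) pairs with no tied times (assumption).
    Returns a list of (T_(i), F_hat, Lambda_hat) tuples.
    """
    ordered = sorted(obs)  # sort by observed time T_(1) <= ... <= T_(n)
    n = len(ordered)
    surv, cumhaz, out = 1.0, 0.0, []
    for i, (t, delta) in enumerate(ordered, start=1):
        at_risk = n - i + 1              # Y_n(T_(i)) when there are no ties
        surv *= 1.0 - delta / at_risk    # product-limit update
        cumhaz += delta / at_risk        # increment of the cumulative hazard
        out.append((t, 1.0 - surv, cumhaz))
    return out


# Four observations, the second one censored.
sample = [(1.0, 1), (2.0, 0), (3.0, 1), (4.0, 1)]
for t, f_hat, lam_hat in km_and_cumhaz(sample):
    print(f"t = {t:.1f}  F_hat = {f_hat:.3f}  Lambda_hat = {lam_hat:.3f}")
```

A censored observation leaves both estimates unchanged at its own time but still reduces the size of the risk set for later times, which is why the jump at $t = 3$ is larger than it would be in an uncensored sample.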
Next, we give a basic property and an inequality of LENQD sequence, which are helpful in proving the theorems in Sections 3 and 4.

Some auxiliary results
We will establish two auxiliary results, which play an important role in the proof of the main results in Section 4.
Theorem 3.1: The union of independent sets of ENQD random variables is also ENQD.
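Proof: Let $X = (X_1, X_2)$ and $Y = (Y_1, Y_2)$ be two independent random vectors such that the pairs $(X_1, X_2)$ and $(Y_1, Y_2)$ are ENQD. We shall show that the pair of random vectors $((X_1, Y_1), (X_2, Y_2))$ is ENQD. Since $(X_1, X_2)$ and $(Y_1, Y_2)$ are both ENQD, there exist constants $M_1 \ge 1$ and $M_2 \ge 1$ such that, by the independence of $X$ and $Y$, for all real $x_1, x_2, y_1, y_2$,

$$P(X_1 \le x_1, Y_1 \le y_1, X_2 \le x_2, Y_2 \le y_2) = P(X_1 \le x_1, X_2 \le x_2)\,P(Y_1 \le y_1, Y_2 \le y_2) \le M_1 M_2\, P(X_1 \le x_1)P(X_2 \le x_2)P(Y_1 \le y_1)P(Y_2 \le y_2) = M_1 M_2\, P(X_1 \le x_1, Y_1 \le y_1)\,P(X_2 \le x_2, Y_2 \le y_2). \qquad (3)$$

Similarly, we can obtain that

$$P(X_1 > x_1, Y_1 > y_1, X_2 > x_2, Y_2 > y_2) \le M_1 M_2\, P(X_1 > x_1, Y_1 > y_1)\,P(X_2 > x_2, Y_2 > y_2). \qquad (4)$$

By relations (3) and (4) and Definition 1.6, we know that $(X_1, Y_1)$ and $(X_2, Y_2)$ are ENQD with dominating constant $M_1 M_2 \ge 1$. This completes the proof.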
Theorem 3.2: Let $\{X_n, n \ge 1\}$ be a sequence of LENQD random variables with a dominating coefficient $M \ge 1$, having an unknown distribution function $F(x)$ with bounded probability density function $f(x)$. Setting $F_n(x) = n^{-1}\sum_{i=1}^{n} I(X_i \le x)$ as the empirical distribution function, and taking $\tau_n = n^{-1/2}\log^{1/2} n$, then

Proof: By Lemma 2 of Yang (2003), let $\{x_{n,k}\}$ satisfy $F(x_{n,k}) = k/n$ for $n \ge 3$ and $k = 1, \ldots, n-1$, and then we obtain that sup

It is easy to see that $n\tau_n \to \infty$. Therefore, for all $n$ large enough and $\varepsilon \ge 4\sqrt{3}$, we have $2/n < \varepsilon\tau_n/2$. Hence, we can get that P sup

Then it follows from Lemma 2.1 that $\{\xi_{j,k}\}$ is still a sequence of LENQD random variables with $E\xi_{j,k} = 0$ and $|\xi_{j,k}| \le 2$. Taking $t = \varepsilon\tau_n/4$ and applying Lemma 2.2, we can obtain that

Thus, with relations (5) and (6), we obtain

Therefore, Theorem 3.2 holds.
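To get a feel for the scale $\tau_n = n^{-1/2}\log^{1/2} n$, one can compare it numerically with the sup-distance between an empirical distribution function and the true one. The sketch below is an illustration under stated assumptions rather than a check of the theorem: it takes the quantity of interest to be $\sup_x |F_n(x) - F(x)|$ (as the proof's use of Yang's lemma suggests), uses i.i.d. Uniform(0, 1) samples (independent variables are LENQD), and the helper names are made up here.

```python
import math
import random


def sup_distance_uniform(sample):
    """sup_x |F_n(x) - F(x)| when F is the Uniform(0, 1) distribution.

    The supremum is attained at an order statistic, just before or at a jump.
    """
    xs = sorted(sample)
    n = len(xs)
    return max(max(i / n - x, x - (i - 1) / n)
               for i, x in enumerate(xs, start=1))


def tau(n):
    """tau_n = n^{-1/2} * log^{1/2} n."""
    return math.sqrt(math.log(n) / n)


rng = random.Random(2023)
for n in (100, 1000, 10000):
    sample = [rng.random() for _ in range(n)]
    d = sup_distance_uniform(sample)
    print(f"n = {n:6d}  sup-distance = {d:.4f}  ratio to tau_n = {d / tau(n):.3f}")
```

For independent data the ratio stays bounded as $n$ grows; the interest of the LENQD setting is that a rate of the same order is claimed under dependence.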

Main results
We can obtain the strong consistency and strong representation for Kaplan-Meier estimator and hazard estimator based on censored LENQD data as follows.
Theorem 4.1: Let $\{X_n, n \ge 1\}$ and $\{Y_n, n \ge 1\}$ be two LENQD sequences with a dominating coefficient $M \ge 1$. Suppose $\{X_n, n \ge 1\}$ and $\{Y_n, n \ge 1\}$ are independent. Then, for any $0 < \tau < \tau_L$, and

Theorem 4.2: Under the assumptions of Theorem 4.1, we have where, for $j = 1, 2$,

Remark 4.1: By the auxiliary results in the previous section, we can prove Theorems 4.1 and 4.2, which extend the corresponding results of Wu and Chen (2013) from NA to LENQD, Li and Zhou (2020) from END to LENQD, Wu et al. (2022) from WOD to LENQD, and so on.

Proof of Theorem 4.1: Following the same lines as the proof of Equations (3.1) and (3.2) in Li and Zhou (2020), by straightforward calculations, we have and then, by Equations (11) and (12), for $0 < \tau < \tau_L$, we have Thus Equation (7) follows immediately.

Proof of Theorem 4.2: It is easily seen that

Firstly, for $J_1(x)$, noting that $F_n^*(t) = N_n(t)/n$ and $N_n(t)$ is a step function, we derive that

Secondly, we analyse $J_2(x)$. By dividing the interval [0 , and τ <τ , we have the following decomposition: where and

According to Lemma 2.1, we know that $\{\eta_{ik}\}$ and $\{\zeta_{ijk}\}$ are both LENQD with the same dominating constant $M$, and Therefore, from (18), (21) and (22), it follows that

Lemma 2.1 (see Li et al., 2023): (i) Let $(X, Y)$ be an ENQD pair of random variables with dominating coefficient $M \ge 1$. If $f$ and $g$ are both nondecreasing (or both nonincreasing) functions, then $(f(X), g(Y))$ is also an ENQD pair, and its dominating constant $M$ remains unchanged. (ii) Let $\{X_n, n \ge 1\}$ be an LENQD sequence with dominating coefficient $M \ge 1$, and let $f_n$ be all nondecreasing (or all nonincreasing) functions. Then $\{f_n(X_n), n \ge 1\}$ is an LENQD sequence, and the dominating constant $M$ remains unchanged.

Lemma 2.2 (Bernstein-type inequality; see Li et al., 2023): Let $\{X_n, n \ge 1\}$ be a sequence of LENQD random variables with a dominating coefficient $M \ge 1$, and $EX_n = 0$, $|X_n| \le d_n$ a.s. for each $n \ge 1$, where $\{d_n, n \ge 1\}$ is a sequence of positive constants. Assume that $t > 0$ satisfies $t \cdot \max_{1 \le i \le n} d_i \le 1$. Then for any $\varepsilon > 0$,