New methods to define heavy-tailed distributions with applications to insurance data

Heavy-tailed distributions play an important role in modelling data in actuarial and financial sciences. In this article, nine new methods are suggested to define new distributions suitable for modelling data with an heavy right tail. For illustrative purposes, a special sub-model is considered in detail. Maximum likelihood estimators of the model parameters are obtained and a Monte Carlo simulation study is carried out to assess the behaviour of the estimators. Furthermore, some actuarial measures are calculated. A simulation study based on these actuarial measures is done. The usefulness of the proposed model is proved empirically by means of two real data sets. Finally, Bayesian analysis and performance of Gibbs sampling for the data sets are also carried out.


Introduction
In many applied areas, particularly in finance and actuarial sciences, data are usually positive, and their distribution is unimodal hump shaped and extreme values yield tails which are heavier than those of the standard well-known distributions. Classical distributions are not flexible enough to cater such heavy-tailed data sets. For example, (i) the Pareto distribution, which is widely used in modelling financial data sets, does not provide a reasonable fit for many applications and (ii) the Weibull model covers better the behaviour of small losses, but fails to cover the behaviour of large losses; see Bhati and Ravi [1]. In such situation, heavy-tailed distributions are reliable and accurate candidate models to employ. For positive data, heavy-tailed distributions are those whose right tail probabilities are greater than the exponential one (see Beirlant et al. [2]), that is where F(x) is the cumulative distribution function (cdf). For further detail, see McNeil [3] and Resnick [4]. Due to the importance of the heavy-tailed distributions in actuarial practice, the actuaries are motivated to introduce new flexible distributions. In this regard, serious attempts have been made and still growing rapidly.
The new developments have been made through many different approaches such as (i) transformation of variables, (ii) composition of two or more distributions, (iii) compounding of distributions and (iv) finite mixture of distributions.
Recent studies of Eling [5] and Adcock et al. [6] identify that skew-normal and skew student t distributions are the best competitors as the skewed distributions adjust right-skewness and high kurtosis; for further detail see, Shushi [7] and Punzo [8]. However, insurance losses and financial risks take values on the positive real line and hence these skew class of distributions may not be appropriate as they are defined on R. In such situations, the transformation of variables, particularly the exponential transformation, has proven to be substantial; see, for example, Azzalini et al. [9]. Bagnato and Punzo [10] showed that the transformation approach is simple to use but most often the inference as well as computation of the other distributional characteristics become complicated.
Another promising approach for obtaining new flexible heavy-tailed families of distributions, which gives reasonably good fit for heavy-tailed losses, is the method of composition; see Paula et al. [11], Klugman et al. [12], Nadarajah and Abu Bakar [13], and Bakar et al. [14]. However, it should be noted that the new distributions obtained by the composition approach involve more than three parameters causing difficulties in the estimation process and computational efforts are required.
Another prominent approach is compounding of distributions to cater data modelling with unimodality, right-skewness and heavy tails [8,15,16]. However, the density obtained via this method may not have a closed form expression which makes the estimation more cumbersome as shown in Punzo et al. [15]. For a brief review about compounding of distributions, we refer to Tahir and Cordeiro [17].
Finite mixture models represent a further approach to define very flexible distributions which are also able to capture, for instance, multimodality of the underlying distribution [18][19][20]. The price to pay for this greater flexibility is a more complicated and computationally challenging inference.
Furthermore, Dutta and Perry [21] performed an empirical analysis of loss distributions and risk was estimated by different approaches such as Exploratory Data Analysis and other empirical approaches. These authors rejected the idea of using the exponential, gamma and Weibull models in modelling insurance losses due to the poor results. They concluded that "one would need to use a model that is flexible enough in its structure", and this encourages researchers to search for more flexible probability distributions providing greater accuracy in fitting heavy-tailed insurance losses.
Carrying out this branch of distribution theory, Alzaatreh et al. [22] defined the T-X family method to introduce new families of distributions. Let v(t) be the probability density function (pdf) of a random variable, say T, where T ∈ [m, n], −∞ ≤ m < n < ∞, and let W[F(x)] be a function of F(x) of a random variable, say X, satisfying the conditions given below: The cdf of the T-X family of distributions is defined by where W[F(x)] satisfies the conditions stated above. The pdf corresponding to (1) is For the contributed work based on the idea of T-X approach, we refer to Ahmad et al. [23]. Using the approach of T-X method, one can introduce new members of survival family [24] via the cdf whereF(x) = 1 − F(x) is the survival function of the baseline distribution.
Recently, Mahdavi and Kundu [25] proposed a new prominent approach for introducing statistical distributions via the cdf given by with additional parameter α. Under these premises, we are motivated to propose new families of distributions. Taking inspiration from (1)-(3), in this article, nine new families of distributions are proposed.
The paper is outlined as follows: new development to the heavy-tailed distributions is presented in Section 2. In Section 3, we define a special sub-model of the proposed family and provide plots of the density function. We derive some statistical properties in Section 4. Some characterizations results are provided in Section 5. Maximum likelihood estimation of the model parameters is addressed in Section 6. In the same section, the Monte Carlo simulation study is provided. Actuarial measures of the proposed method along with a simulation study are provided in Section 7. In Section 8, we provide two applications to real data to illustrate the importance of the new family. Bayesian analysis as well as a Gibbs sampling procedure for the considered data sets are discussed in Section 9. Finally, some concluding remarks are presented in Section 10.

New contribution to heavy-tailed distributions
As we mentioned in Section 1, the researchers are often in search of new heavy-tailed distributions suitable for modelling insurance losses. In this section, we propose some new useful methods to obtain heavy-tailed extended versions of the existing distributions.

New extended exponentiated-X family
Taking inspiration from (1), we introduce a new flexible class of distributions. The proposed class is called a new extended exponentiated-X (NEEx-X) family. Let T ∼ exp(1), then its cdf is given by The density function corresponding to (4) is If v(t) follows (5) and in (1), we define the cdf of the NEEx-X family given by where F(x; ξ) is the cdf of the baseline distribution depending on the parameter ξ ∈ R. The pdf of the NEEx-X family is For a = 1, the NEEx-X family gives the new extended-X(NE-X) family.

New type-I cosine exponentiated-X family
Again taking inspiration from (1), we introduce another new family of distributions, called new type-I cosine exponentiated-X(NTICEx-X) family. If v(t) follows (5) and setting in (1), we define the cdf of the NTICEx-X family via For a = 1, the NTICEx-X family gives the new type-I cosine-X(NTIC-X) family.

Type-I cosine exponentiated-X family
Taking inspiration from (2), we introduce a new method of proposing distributions. The proposed method is called type-I cosine exponentiated-X(TICEx-X) family of distributions. If v(t) follows (5) and setting in (2), we define the cdf of the TICEx-X family given by The density function corresponding to (8) is , For a = 1, the TICEx-X family gives the new Type-I cosine-X (TIC-X) family.

The exponent power exponentiated-X family
Take inspiration from (3), we introduce a new method of proposing probability distributions, called the exponent power exponentiated-X (EPEx-X) family of distributions. The cdf of the EPEx-X family is defined by the following expression The pdf corresponding to (9) is x ∈ R.
For a = 1, in (9), we introduce a sub-case of the EPEx-X family, called the exponent power-X (EP-X) family by cdf The pdf corresponding to (10) is For θ = 1, the EP-X family gives the new reduced exponent power-X (REP-X) family.

The exponent power-Weibull distribution: a new heavy-tailed distribution
In this section, we define a special sub-model of the exponent power-X family, called exponent power-Weibull (EP-W) distribution. Let F(x; ξ) be the cdf of the Weibull distribution given by F( x ≥ 0, where ξ = (α, γ ). Then, the cdf of the EP-W distribution has the following expression The pdf corresponding to (12) is given by

Statistical properties
In this section, we study some statistical properties of the EP-W distribution such as quantile function, moments and moment generating function (mgf). Some key advantages of the EP-W distribution are: • The mean and variance of the distribution are in explicit forms and simple to calculate. • The quantile function of the distribution has a closed form which makes it easier to generate random numbers.

Quantile function
Let X be the random variable follow the EP-W distribution with cdf (12). Then, the quantile function (qf) of X, say x u = Q(u) = G −1 (u), can be obtained by inverting (12) as follows where u ∈ (0,1). The qf of the EP-W distribution has a closed form expression. In particular, the first quartile, second quartile (median) and third quartile are obtained by substituting u = 0.25, 0.5 and 0.75 in (14), respectively.

Moments
Some of the most important features and characteristics of a model can be obtain through its moments. The rth moment of a random variable X, say μ / r , with pdf (13) is derived as On solving, we have where η = γ /θ. Using the definition of Beta type-II distribution, we have For r = 1, in (16), we get the first raw moment (mean) of the EP-W distribution given by Similarly, the second raw moment of the EP-W distribution is given by Now, the variance of the proposed distribution is obtained by using the following relation Using (17) and (18) in (19), we obtain the variance of the EP-W distribution. Furthermore, the mgf of the EP-W  distribution, M X (t), is given by The graphs of skewness and kurtosis of the EP-W distribution along with the corresponding contour plots are presented in Figures 2 and 3, respectively. From Figures 2 and 3, it is clear that for fixed values of θ, the skewness is decreased when α is increased. (ii) For fixed values of α, the skewness is increased when θ is increased. (iii) For fixed values of θ, the kurtosis is decreased when α is increased.

Characterization results
In designing a stochastic model for a particular data set, an investigator will be vitally interested to know if their selected model fits the necessary requirements. To this end, the investigator will rely on the characterizations of the selected distribution. Thus, the problem of characterizing a distribution is important in various fields and has recently attracted the attention of many researchers. Consequently, various characterization results have been reported in the literature. These characterizations have been established in different directions. This section is devoted to certain characterizations of the EP-X distribution based on a simple relationship between two truncated moments. Due to the nature of the cdf of this distribution, our characterizations may be the only possible ones. The first characterization result employs Theorem 1 of Glänzel [26], given in Appendix. As shown in Glänzel [27], this characterization is stable in the sense of weak convergence and can also be employed when the cdf does not have a closed form. We would like to mention that the goal of Theorem A.1 is to make η(x) as simple as possible.
Then X has density function (11) if and only if the function η, defined in Theorem A.1, is given by Proof: Suppose X is a random variable with density function (11), then we have Conversely, if η is of the above form, then and consequently Now, according to Theorem A.1, X has pdf (11).
Corollary 5.1: Suppose X : → R is a continuous random variable and q 1 (x) is as in Proposition 5.1. Then X has density function (11) if and only if there exist functions q 2 and η defined in Theorem A.1 satisfying the following first-order differential equation Corollary 5.2: The differential equation in Corollary 5.1 has the following general solution in which D is a constant. We like to mention that a set of functions satisfying the above first-order differential equation is given in Proposition 5.1 with D = 0. Clearly, there are other triplets (q 1 , q 2 , η) satisfying the conditions of Theorem A.1.

Maximum likelihood estimation and Monte Carlo simulation study
In this section, we use the maximum likelihood method to estimate model parameters and also provide a Monte Carlo simulations to assess the behaviour of these estimators.

Maximum likelihood estimation
In this sub-section, we obtain the maximum likelihood estimators (MLEs) of the model parameters of the EP-W distribution from complete samples only. Let x 1 , x 2 , , x n be an observed sample of size n obtained from (13). The corresponding log-likelihood function can be expressed as The partial derivatives of (21) with respect to the parameters (α, θ, γ ) are given, respectively, by and Solving simultaneously yields the MLEs (α,θ,γ ) of (α, θ , γ ).

Monte Carlo simulation study
In this sub-section, we perform a Monte Carlo (MC) simulation study with the objective to assess the behaviour      Table 1 and displayed graphically in Figures 4-7.

Actuarial measures
One of the most important tasks of financial and actuarial sciences institutions is to evaluate the exposure to market risk in a portfolio of instruments, which arise from changes in underlying variables such as prices of equity, interest rates or exchange rates. In this section, we calculate some important risk measures including value at risk (VaR), tail value at risk (TVaR), tail variance (TV) and tail variance premium (TVP) for the proposed distribution, which play a crucial role in portfolio optimization under uncertainty.

Value at risk
In the context of actuarial sciences, the VaR is widely used by practitioners as a standard financial market risk measure. It is also known as the quantile risk measure or quantile premium principle. The VaR is always specified with a given degree of confidence say q (typically 90%, 95% or 99%), and represent the percentage loss in portfolio value that will be equaled or exceeded only X per cent of the time. VaR of a random variable X is the qth quantile of its cdf, see Artzner [28]. If X follows the EP-W distribution, then its VaR is

Tail value at risk
Another important measure is TVaR, also known as conditional tail expectation (CTE) or tail conditional expectation (TCE), used to quantifies the expected value of the loss given that an event outside a given probability level has occurred. If X follows the EP-W distribution, then its TVaR is derived as Using (13) in (26), we get Let η = γ /θ, then from (27), we have Finally, we get

Tail variance
The TV is one of the most important actuarial measures which considers the tail variance beyond the VaR. The TV of a EP-W distributed random variable is Consider On solving, we get Using (28) and (30) in (29), we get the expression for the TV of the EP-W distribution.

Tail variance premium
The TVP is another important measure playing an essential role in insurance sciences. The TVP of the EP-W distributed random variable is where 0 < δ < 1. Using the expression (28) and (29) in (31), we get the tail variance premium of the proposed distribution.

Numerical study of the risk measures
In this sub-section, we provide numerical study of the VaR, TVaR, TV and TVP for the Weibull and EP-W distributions for different sets of parameters. The process is described below: • Random sample of size n = 100 are generated from the Weibull and EP-W distributions and parameters have been estimated via the maximum likelihood method. • 1000 repetitions are made to calculate the VaR, TVaR, TV and TVP for these distributions.
The numerical results of the risk measures are provided in Tables 2 and 3 and displayed graphically in Figures 8-11 corresponding to each table.    Table 3.  Table 3.  Table 4.  Table 4.
The simulation is performed for the Weibull and EP-W for the selected values of parameters. A model with higher values of the Risk measures is said to have a heavier tail. The simulated results provided in Tables 3 and 4 show that the proposed EP-W model has higher values of the risk measures than the traditional Weibull distribution. The simulation results are graphically displayed in Figures 8-11, which show that the proposed model has heavier tail than the Weibull distribution.

An application to a medical care insurance data set
In this section, we use two insurance data sets to illustrate the importance and flexibility of the proposed distribution. The comparison of the proposed distribution is made with a nested model, the Weibull distribution, and with some other non-nested models such as the generalize exponential (GE), Lomax, Burr, exponentiated Weibull (EW), generalized odd Burr III-Weibull (GOBXIII-W) and generalized log-Moyal (GLM) distributions. It is important to emphasize that the EW distribution is a popular model for analysing data in the applied areas; see Mudholkar and Srivastava [29]. The GE is another non-nested model and offers the characteristics of the Weibull and gamma distributions; see Gupta and Kundu [30]. The Lomax and Burr-XII (B-XII) distributions are prominent competitors for modelling large losses and offer a wide range of applications in financial sciences. The proposed EP-W model is also compared with the GLM distribution of Bhati and Ravi [1]. We also considered the fiveparameter non-nested GOBXIII-W distribution of Haq et al. [31]. The cdfs of the competing distributions are: Next, we consider certain analytical measures in order to verify which distribution fits better the considered data. These analytical measures include (i) discrimination measures such as Akaike information criterion (AIC), Bayesian information criterion (BIC), Hannan-Quinn information criterion (HQIC), Consistent Akaike information criterion (CAIC), and (ii) two other goodness of fit measures including Anderson-Darling (AD) test statistic and Cramer-von Mises test statistic. The discrimination measures are given by • The HQIC HQIC = 2k log (log (n)) − 2l, • The CAIC is where l denotes the log-likelihood function evaluated at the MLEs, k is the number of model parameters and n is the sample size. The considered goodness of fit measures are given by • The AD test statistic is given by where n is the sample size and x i is the ith observation in the sample, calculated when the data is sorted in ascending order. • The CM test statistic is given by For the optimization and calculation of the analytical measures, we use the optim()R-function with the argument method = "BFGS"; see Appendix.

Data 1: medical care insurances
Here, we illustrate the EP-W distribution by analysing a heavy-tailed data set representing medical care insurances. The data set is available at https://data.world/ datasets/insurance. The maximum likelihood estimates with the standard error (in parentheses) of the fitted models for the analysed data are presented in Table 4. The analytical measures of the EP-W and other considered models are provided in Table 5. Based on the above-mentioned measures, the proposed model fits the heavy-tailed medical care insurance data set better than the other models. In support of the results provided in Table 5, the estimated pdf and cdf of the proposed distribution are plotted in Figure 12. The PP and Kaplan-Meier survival plots are sketched in Figure 13. The proposed model may be an interesting alternative to the other existing models in the literature for modelling positively skewed heavy tail data. The practical application to the medical care insurance data shows    that the proposed distribution should be taken into account among other possible distributions for insurance data in which the properties of a heavy-tailed distribution are present.

Data 2: vehicle insurance losses
The second data set, available at https://data.world/ datasets/insurance, refers to vehicle insurance losses.
The competing models are also applied to these data. The estimates with the standard error (in parentheses)     of the model parameters are provided in Table 6. The analytical measures of the EP-W and other considered models are provided in Table 7. We can see that the EP-W model again outperforms all the fitted competitive models under these statistics. For the second data set, the fitted density and distribution functions of the EP-W model are plotted in Figure 14. We note that the EP-W distribution best captures the fitted pdf and cdf. The PP and Kaplan-Meier survival plots of the EP-W model are shown in Figure 15. From Figure 15, we can see that the proposed model is closely followed by the PP and Kaplan-Meier survival plots.

Bayesian estimation
Bayesian inference procedure has been taken into consideration by many statistical researchers, especially those in the field of the survival analysis and reliability engineering. In this section, a complete sample data is analysed through Bayesian point of view. We assume that the parameters α, γ and θ of EP-W distribution have independent prior distributions as   where a,b,c,d,e and f are positive. Hence, the joint prior density function is formulated as follow: α a−1 γ c−1 θ e−1 e −(bα+dγ +f θ) .
(32) In the Bayesian estimation the actual value of the parameter, may be adversely affected by the loss when choosing an estimator. This loss can be measured by a  function of the parameter and the corresponding estimator. Five well-known loss functions and associated Bayesian estimators and corresponding posterior risk are presented in Table 8. For more details see Calabria and Pulcini [32]. Next, we provide the posterior probability distribution for a complete data set. We define the function ϕ as The joint posterior distribution in terms of a given likelihood function L(data) and joint prior distribution π(α, γ , θ) is defined as Hence, we obtain the joint posterior density of the parameters α, γ and θ for the complete sample data by combining the likelihood function and the joint prior density (33). Therefore, the joint posterior density function is given by   where K is given by It is clear from Equation (34) that there is no closed form for the Bayesian estimators under the Five loss functions described in Table 8, so we suggest using a MCMC procedure based on 10,000 replicates to compute Bayesian estimators. The corresponding Bayesian estimates and posterior risk are provided in Table 9. Table 10 provides 95% credible and HPD intervals for each parameter of the EP-W distribution. The posterior samples extracted by using Gibbs sampling technique. Moreover, we provide the posterior summary plots in Figures 16-18. These plots confirm that the sampling process is of the prime quality and the convergence does occur. For the vehicle insurance losses, the Bayesian estimates along with corresponding the posterior risk are provided in Table 11. Table 12 provides 95% credible and HPD intervals for each parameter of the EP-W distribution. The posterior samples extracted by using Gibbs sampling technique. Corresponding to the vehicle insurance data set, the posterior summary plots are provided in Figures 19-21. These plots confirm that the sampling process is of the prime quality and the convergence does occur.

Concluding remarks
To cater data in financial and actuarial sciences, a number of methods to define heavy-tailed distributions have been proposed. In this regard, we further carried this branch of distribution theory and proposed nine new families of distributions. A three-parameter special model, called EP-W distribution, is studied in detail. Some mathematical properties along with certain characterizations are derived. Actuarial measures of the proposed model are also calculated and a simulation study has been conducted to show the usefulness of the proposed method in actuarial sciences. To prove the potential of the EP-W distribution, we considered two insurance data sets and compared its goodness of fit with other well-known distributions. The proposed distribution outclassed the competitive models by considering certain analytical measures. We hope that the new development will attract wider applications in the financial and insurance sciences and would be quite helpful for the new comers in the field of distribution theory.
Theorem A.1: Let ( , F , P) be a given probability space and let H = [d, e] be an interval for some d < e (d = −∞, e = ∞ might as well be allowed). Let X : → H be a continuous random variable with the distribution function F and let q 1 and q 2 be two real functions defined on H such that is defined with some real function η. Assume that q 1 , q 2 ∈ C 1 (H), η ∈ C 2 (H) and F is twice continuously differentiable and strictly monotone function on the set H. Finally, assume that the equation ηq 1 = q 2 has no real solution in the interior of H. Then F is uniquely determined by the functions q 1 , q 2 and η, particularly where the function s is a solution of the differential equation s = η q 1 /ηq 1 − q 2 and C is the normalization constant, such that H dF = 1.
We like to mention that this kind of characterization based on the ratio of truncated moments is stable in the sense of weak convergence (see, Glänzel [27]), in particular, let us assume that there is a sequence {X n } of random variables with distribution functions {F n } such that the functions q 1n , q 2n and η n (n ∈ N) satisfy the conditions of Theorem A.1 and let q 1n → q 1 , q 2n → q 2 for some continuously differentiable real functions q 1 and q 2 . Let, finally, X be a random variable with distribution F. Under the condition that q 1n (X) and q 2n (X) are uniformly integrable and the family {F n } is relatively compact, the sequence X n converges to X in distribution if and only if η n converges to η, where This stability theorem makes sure that the convergence of distribution functions is reflected by corresponding convergence of the functions q 1 , q 2 and η, respectively. It guarantees, for instance, the "convergence" of characterization of the Wald distribution to that of the Lévy-Smirnov distribution if α → ∞. A further consequence of the stability property of Theorem A.1 is the application of this theorem to special tasks in statistical practice such as the estimation of the parameters of discrete distributions. For such purpose, the functions q 1 , q 2 and, specially, η should be as simple as possible. Since the function triplet is not uniquely determined it is often possible to choose η as a linear function. Therefore, it is worth analysing some special cases which helps to find new characterizations reflecting the relationship between individual continuous univariate distributions and appropriate in other areas of statistics. In some cases, one can take q 1 (x) ≡ 1, which reduces the condition of Theorem A.
We, however, believe that employing three functions q 1 , q 2 and η will enhance the domain of applicability of Theorem A.1.