Stochastic modelling of earthquake interoccurrence times in Northwest Himalaya and adjoining regions

ABSTRACT This work concentrates on the distribution of earthquake interevent times in northwest Himalaya and its adjacent regions. We consider 12 time-dependent probability distributions for analysis. The maximum likelihood estimation and Fisher information matrix-based methods are used to estimate model parameters and their respective uncertainties. Results from three model selection criteria suggest that the best fit arises from exponentiated Weibull, exponentiated exponential, Weibull and gamma distributions. An intermediate fit comes from exponentiated Rayleigh, and lognormal distributions, whereas the remaining distributions exhibit a poor fit to the seismic interoccurrence times of present catalogue. Using exponentiated Weibull model, it is observed that the estimated cumulative probability of magnitude 6.0 or higher event in the northwest Himalaya reaches 0.92–0.95 after about 20–23 (2019–2022) years since the last event in 1999. The conditional probability, for an elapsed time of 19 years (i.e. 2018), reaches 0.90–0.95 after about 20–25 (2038–2043) years from now. A series of conditional probability curves is also presented to understand the recent and future earthquake hazard in the study region. This temporal evolution of seismic interevent times may provide important clues to the underlying physical mechanism of earthquake genesis in the northwest Himalaya region.


Introduction
Occurrence of earthquakes has been known to be a natural phenomenon in the northwest Himalaya and its adjoining regions since colonial times. Historical record of these earthquakes started with the settlement of people and their gradual involvement in understanding the process. The early documentation comes from newspapers, manuscripts, letters, diaries, and books (Oldham 1883;Milne 1911;Chandra 1992). The felt reports of Himalayan earthquakes dating back to the early nineteenth century have been utilized to estimate magnitudes and epicentral positions until the 1930s, when the instrumental recordings became the main source of earthquake data (Ellsworth et al. 2015). By neglecting some ambiguous information in historical records, one can obtain a complete earthquake catalogue of a specific magnitude for a particular region.
It is believed that earthquakes occur as a consequence of a complex nonlinear dynamical system in the brittle part of Earth's crust (Shcherbakov et al. 2005). Interactions among different spatial network of fault systems and temporal processes govern this nonlinear threshold dynamics. Despite this complex behaviour, one may consider earthquakes as a simple point process in time and space, by ignoring its temporal effect due to event duration and spatial effect due to rupture zones. Then it is possible to analyse various properties of this point process using suitable statistical models that represent the observed seismicity (Utsu 1984;Anagnos and Kiremidjian 1988). Apart from an immediate empirical earthquake hazard estimation of the study area, the best-fit statistical models may provide important insights to the long-term earthquake generation process involving loading and stressing condition of the fault system, deep fault creeping, post-seismic relaxation of fault-zone material properties, and earthquake interactions between low-frequency (slow-slip) and regular events (Hainzl et al. 2006;Shelly et al. 2007;Wu et al. 2013).
Probabilistic earthquake forecasting or binary earthquake prediction is not new. However, the challenge is to identify probability models that describe both physical and statistical characteristics of earthquake sequences. The exponential distribution, due to its Poissonian assumption, was the primary model in probabilistic earthquake interevent time analysis (Cornell 1968). However, time-independent exponential distribution contradicts to the elastic-rebound theory of earthquake mechanics. This leads to the development of a number of time-dependent seismic renewal models of which gamma, lognormal, and Weibull distribution became the leading ones (Utsu 1984;Ram 1997, 1999;Tripathi 2006;Yadav et al. 2008Yadav et al. , 2010bDikshit 2015a, 2015b). However, it was later observed that the distribution function or survival function of a gamma distribution with non-integer shape parameter is difficult to calculate. Similarly, it turns out that the additive reproductive (hereditary) property of lognormal and Weibull random variables does not exist (Johnson et al. 1995). Besides, gamma and Weibull models fail to offer non-monotone bathtub-type hazard shapes (Mudholkar and Srivastava 1993). Thus, to overcome these shortcomings of conventional models and to bridge the gap between physical and statistical models in seismology, a series of generalizations and modifications have taken place. Some notable examples are the generalized gamma distribution (Bak et al. 2002), triple exponential distribution (Kijko and Sellevoll 1981), heavy-tailed Pareto distribution (Kagan and Schoenberg 2001;Ferraes 2003), inverse Gaussian or the Brownian passage time distribution (Matthews et al. 2002;Pasari and Dikshit 2015b), Rayleigh distribution (Ferraes 2003), inverse Weibull (Frechet) distribution Dikshit 2014a, 2015a), and the most recent exponentiated group of distributions, such as the exponentiated exponential, exponentiated Rayleigh, and the exponentiated Weibull distributions (Pasari and Dikshit 2014b, 2015a, 2015b, 2018. Though these distributions, as a whole, serve most of the requirements of seismologists and earthquake professionals, it is still unclear which distribution is the globally best model and whether that one will remain as the best in future also. In view of the above discussion, the present study considers 12 probability models from timedependent and heavy-tailed distributions to provide a comprehensive statistical analysis of earthquake interoccurrence times (waiting times) in the northwest Himalaya. We utilize the maximum likelihood estimation (MLE) method for parameter estimation and the Fisher information matrix (FIM) based approach for uncertainty analysis. Using three goodness-of-fit tests, we identify the best-fit probability model(s) for the studied catalogue and use them for earthquake forecasting and seismic hazard analysis in the study region. Towards the end, we briefly discuss our results with other studies from geodetic (GPS) observations and palaeoseismological investigations for a fruitful discussion.

Study area
Our study area covers northwest part of the Himalayan mountain range, part of alluvial Indo-Gangetic Plains, and its surrounding regions bound by 28 N to 34 N and 74 E to 82 E (Figure 1). Enormous tectonic forces generated from persistent collision of Indian-Eurasian continental plates have triggered many destructive interplate earthquakes in the Himalayan foothills (Yin 2006). Most of these thrust-faulting, strike-slip, crustal (depth 10-30 km) to subcrustal (depth 50-80 km) events are caused by a sequence of east-west thrust faults, such as the Himalayan frontal thrust (HFT), main boundary thrust (MBT), main central thrust (MCT), and their branching faults (Pasari 2015 Ambraseys and Douglas 2004;Yadav 2009;Martin and Szeliga 2010;Arora et al. 2012;Rajendran et al. 2013;Gupta and Gahalaut 2014;Pasari 2015;Szeliga and Bilham 2017). Yet, it appears that these earthquakes are inadequate to accommodate 10-20 mm/yr plate convergence (observed geodetically) that has gradually accumulated to more than a 9-m slip in the present day (Jade et al. 2014;Pasari 2015). The increasing number of populations in the hilly regions and inadherence of building construction codes have further accelerated seismic vulnerability in the entire Himalayan arc and its adjacent cities including the National Capital Territory (NCT) of New Delhi (Bhatia et al. 1999;Iyenger and Ghosh 2004).

Geology and tectonics
The northwest Himalaya and its environs comprise a number of thrust faults (e.g. MCT, MBT, HFT) and their imbricate branching faults (e.g. Jawalamukhi thrust, Palampur thrust, Barsar thrust, Panjal thrust, and Nalagarh thrust), ridges (e.g. Delhi-Haridwar ridge), duns (e.g. Dehradun, Pinjore, and Kota dun), uplifts, anticlines (e.g. Janauri and Sarkaghat), synclines (e.g. Lambagraon), and lineaments. From north to south, the 8000 m to 250 m high study region covers a part of the Tethys Himalaya, Higher Himalaya, Lesser Himalaya, Siwalik, and the foreland basin of Indo-Gangetic Plains (Yin 2006;Kayal 2010). Three main geographical units, namely the Chamba-Kangra-Palampur-Sundarnagar region, the Nalagarh-Chandigarh-Shimla-Kinnaur region, and the Garhwal-Kumaun Himalayan region characterize the northwest Himalaya and its adjacent areas (Powers et al. 1998;Thakur et al. 2000). Tectonics of these units is associated with several structural discontinuities parallel or sub-parallel to the Himalayan trend (Thakur et al. 2000;Yin 2006). Intensive folding and thrusting that occurred in Cenozoic and Mesozoic periods are evident in NW-SE to E-W directions of Himalayan ranges (Yin 2006). The southernmost edge of the northwest Himalaya is demarcated by the HFT, which delineates the quaternary alluvial filled Indo-Gangetic plains and the complex folded-faulted upper Siwalik hills comprising molassic sediments of Early Pliocene and Early Pleistocene periods (Philip et al. 2011). Seismicity is very high in these areas as evident from a number of damaging earthquakes. In addition, the study region has a long history of moderate earthquakes, indicating a steady stress release in these regions. As a consequence, major portion of the study area has been listed in Zone V (assigned peak ground acceleration of 0.36 g) and Zone IV (assigned peak ground acceleration of 0.24 g) in the seismic zonation map of India (BIS 2002). Nevertheless, it may be noted that the present analysis focuses on empirical stochastic modelling of earthquake interevent times, and hence the effect of various physical seismotectonic processes of earthquake generation (e.g. field topography, geological and geomorphological parameters, and fault parameters) is not considered here (Utsu 1984;Shanker et al. 2007;Dikshit 2015a, 2015b).

Earthquake data
In the present analysis, a real, homogeneous, and time-complete earthquake catalogue of 29 independent strong-to-great earthquake events M W 6 ð Þspanning for the period 1803-1999 is considered. The epicentral locations, focal depths, magnitudes, and times of occurrences of these earthquakes are summarized in Table 1. It may be mentioned here that no strong event M W 6 ð Þ had occurred in the study region since the last event in March 1999. Thus, the present catalogue may be considered as the most updated catalogue of M W 6 events in the northwest Himalaya and its adjacent regions.
A number of national and international seismic databases, such as the International Seismological Centre (ISC), Indian Meteorological Department (IMD), National Disaster Management Authority (NDMA) of India, National Earthquake Information Centre (NEIC) of United States Geological Survey (USGS), Global Centroid Moment Tensor database (formerly known as the Harvard CMT database), and a few earlier published articles and reports (e.g. Quittmeyer and Jacob 1979;Oldham 1883;Milne 1911;Chandra 1992;Ambraseys 2000;Ambraseys and Jackson 2003;Ambraseys and Douglas 2004;Yadav 2009;Martin and Szeliga 2010;Arora et al. 2012;Rajendran et al. 2013;Gupta and Gahalaut 2014;Szeliga and Bilham 2017) are used to compile this catalogue. However, majority of the recent (after 1900) events are adopted from the Reviewed ISC Bulletin (April 1905(April -2013 and the ISC Bulletin (2014-March 2017) available on http://www.isc.ac.uk/iscbul letin/search/catalogue/. The historical (non-instrumental) earthquake events before 1905 were mainly obtained from the IMD and were verified with a few earlier published articles (Yadav 2009;NDMA 2010;Yadav et al. 2012a). It may be noted that the IMD catalogue, particularly prior to 1905, uses local magnitude M L ð Þ scale to describe the events. It is observed that the Himalayan earthquake events are not homogeneous (Yadav 2009;Yadav et al. 2012a;Pasari 2015). Therefore After homogenization, declustering of the 33 events M W 6 ð Þhas been carried out to remove the repeated and dependent events, such as foreshocks, aftershocks, and seismic clusters. We use a dynamic (earthquake size dependent) window-based spatio-temporal filtering algorithm modified by Uhrhammer (1986) from the initial development of Gardner and Knopoff (1974). As a conserva-  Chandra (1992). a Conversion to moment magnitude by Ambraseys (2000). b Conversion to moment magnitude by Ambraseys and Douglas (2004). c Moment magnitude values obtained from GCMT (Global Centroid Moment Tensor) catalogue.
tive scrutiny-based measure (e.g. Ford and Walter 2010), we add/subtract 60 days and 15 km to the time and distance given by the Uhrhammer (1986) relations so that the search radius is (2) and the time window is Based on the above criteria, we eliminate four dependent events (»12%) from the initial list of 33 events M W 6 ð Þfor the period 1803-1999, homogenized to moment-magnitude scale M W ð Þ by the Scordilis (2006) and Yadav et al. (2012a) procedure.
After homogenization and declustering, the time-completeness of the present catalogue of 29 events (Table 1) is examined using a magnitude-frequency-based visual cumulative method test (Mulargia and Tinti 1985). In this method, a linear fit (in a least-squares sense) is first obtained from the scatter diagram of time (year number) and cumulative number of M W 6 events. A catalogue of certain magnitude threshold will be designated to be time-complete if the trend of the data stabilizes to approximately a straight line. In other words, this cumulative straight-line approach assumes that earthquake rates and moment releases in a specified area are ultimately steady over sufficiently long time periods (Mulargia and Tinti 1985). For the present catalogue, it is observed that the graph between occurrence time and the cumulative number of events has a near-perfect linear relationship with an R-square value greater than 0.93 ( Figure 2). Therefore, the present catalogue (Table 1) as per the visual cumulative method test can be considered as time-complete.
It can be observed from Table 1 that the focal depths mostly range from 20 to »60 km, implying these events to be typical plane of detachment earthquakes as envisaged by Ni and Barazangi (1984).  Table 1) that occurred in the northwest Himalaya and its adjoining regions during 1803-1999.
Among the events listed here, the most disastrous one is the Kangra earthquake (M W 7:8; 00:50 UTC, 4 April 1905; epicentral location »32.64 N, 76.79 E) that killed more than 20,000 people and destroyed approximately 100,000 buildings in the epicentral area (Ambraseys and Bilham 2000). Besides, two strong earthquakes, namely the 19 October 1991 Uttarkashi earthquake (M W 6.8; maximum intensity IX in MSK scale; focal depth 12-15 km) and the 28 March 1999 Chamoli earthquake (M W 6.5; maximum intensity VIII in MSK scale) recently jolted northwest Himalaya. Extensive studies were conducted for these two earthquakes because of their notable connections to the conceptual models of plane of detachment earthquakes as well as their aftermath effects on the engineering construction of Tehri dam in Garhwal Himalaya (Kayal 2010;Arora et al. 2012).

Methods and results
Our methodology in this paper is three-folded: model description, parameter estimation, and model selection. In model description, we specify our assumptions and discuss various important properties of the studied distributions. In statistical inference, we utilize the MLE approach to estimate various parameters of these distributions. The associated model uncertainties in the form of estimated standard deviations and confidence bounds are obtained from the FIM-based method. Finally, in model selection and validation, we make use of three statistical tests, namely the maximum likelihood test with its improvement to Akaike information criterion (AIC), Kolmogorov-Smirnov (K-S) test, and Chi-square test to prioritize different models. The best-fit model(s) for the present catalogue will be used for earthquake hazard calculation and earthquake forecasting of the study region.

Assumptions and model description
Let T be a continuous and positive random variable that denotes interevent times between successive earthquakes (main shocks) above a threshold magnitude in a selected spatial region. Since the interevent times are calculated from earthquakes of all faults in a region, physical interactions among individual faults are ignored. Besides, we assume that all earthquakes in the present catalogue are identical and independently distributed (i.i.d.). We also disregard seismic effects or stress changes induced either from any nearby or remote earthquakes outside the boundary, or from any dependent event, such as foreshock, aftershock, or seismic clusters (Yadav 2009;Pasari 2015). With this set up, we assume that the interevent time T has the probability density function f t ð Þ, cumulative distribution function F t ð Þ, survival function S t ð Þ, hazard function h t ð Þ, and reversed hazard function r t ð Þ. Among these functions, h t ð Þ and r t ð Þ have the most significant importance in statistical seismology. Various shapes of these two functions characterize earthquake residual times (time to a potential event) whether decreasing or increasing for an increasing elapsed time (Matthews et al. 2002).
Corresponding to the random variable T, the probability density functions of the 12 different distributions, their possible domains and significance of different parameters are summarized in Table 2. It is observed that all distributions except Pareto model assume entire positive real line as their domain. In addition, we have specified domains for the shape parameter that governs the appearance of a distribution and the scale parameter that determines the spread of a distribution. Particularly, when shape parameter(s) takes unit value, all of gamma, Weibull, exponentiated exponential, and exponentiated Weibull distributions turns out to a single-parameter time-independent exponential distribution (Johnson et al. 1995). Similarly, it may be observed that exponential, Rayleigh, Weibull, exponentiated exponential and exponentiated Rayleigh (Burr Type X) distributions belong to the sub-families of the exponentiated Weibull distribution (Mudholkar and Srivastava 1993;Pasari and Dikshit 2018). In terms of hazard shapes, gamma, Weibull, and exponentiated exponential distributions offer monotone (increasing, decreasing, and constant) hazard functions, whereas the lognormal, inverse Gaussian, inverse Weibull, exponentiated Rayleigh, and exponentiated Weibull distributions provide monotone as well as non-monotone hazard shapes; Rayleigh distribution has only increasing hazard shapes, and the heavy-tailed Pareto distribution offers only decreasing hazard shapes (Johnson et al. 1995). In addition, the asymptotic nature t ! 1 ð Þof these The FIMs of exponentiated Weibull and exponentiated Rayleigh distributions are not calculated, as the elements of FIM of exponentiated Weibull distribution is not completely available while for the exponentiated Rayleigh distribution, FIM contains nonlinear implicit terms (Pal et al. 2006).
hazard functions may describe long-term earthquake behaviour in a seismic-prone region (Sornette and Knopoff 1997). Let t be the time beyond the last event and v be the residual time to a potential event. Then t is known whereas v is random. For a given t 0 < t < t ð Þ , our aim is to estimate v so that the earthquake occurs in the time window t; t þ v ð Þ : For this, we introduce another continuous random variable V corresponding to v. The random variables V and T are linearly related. Therefore, the conditional probability of an earthquake in t; t þ v ð Þ ; given that no earthquake occurred in t years since the last event in 1999, can be framed as Different values of t constitute a series of estimated conditional probability values coupled with various conditional probability curves (hazard curves). These hazard curves play important roles in decision making, city planning, structural health monitoring, seismic zonation, revision of building codes, scenario earthquake analysis, earthquake insurance program, and more importantly disaster planning and mitigation (SSHAC 1997;Working Group 2013;Pasari 2017).

Parameter estimation
We have used the MLE method for statistical inference. The MLE is an extremum consistent estimator and it allows assessing parametric uncertainties of the estimated model parameters Dikshit 2014b, 2015b). The MLE method estimates model parameters by maximizing the joint likelihood (log) function of observed sample values. Therefore, it requires solving a system of likelihood equations. While seeking a solution for a set of nonlinear likelihood equations, the initial values of the model parameters are usually obtained either from graphical estimations, such as Weibull probability plot, inverse Weibull probability plot, hazard plot, and lognormal probability plot or from a sub-family distribution. For example, the initial parameter values for the generalized family of distributions (say, exponentiated Weibull) may be obtained from its sub-family distributions (say, Weibull).
In statistical inference, emphasis should also be given on calculating exact uncertainties of the estimated parameters. However, due to the fact that actual distributions of the estimated parameters seldom are known, a surrogate approach based on the FIMs is used to quantify parametric uncertainties in the form of asymptotic standard deviations and confidence limits of the estimated model parameters (Johnson et al. 1995).
For a vector of parameters u ¼ u 1 ; u 2 ; . . . ; u p À Á ; the FIM I pÂp u ð Þ is defined (Hogg et al. 2005) as In the above expression, E and L T; u ð Þ denote the linear expectation operator of a random variable and the log-likelihood function of n sample data points t 1 ; t 2 ; t 3 ; :::; t n f g . The symmetric and positive semi-definite FIM I pÂp u ð Þ may also be calculated from the density function f t; u ð Þ, the hazard function h t; u ð Þ, and the reversed hazard function r t; u ð Þ as Although the random variables @ 6 @u ð Þlnf T; u ð Þ; @ 6 @u ð Þlnh T; u ð Þ, or @ 6 @u ð Þlnr T; u ð Þ in the above expression provide different first moments, they produce identically same second moments which constitute the elements of I pÂp u ð Þ. Now the Cramer-Rao lower bound theorem (Hogg et al. 2005) uses FIM to provide an estimated asymptotic covariance matrix Pû of the estimated parametersû À Á as Pû nIû À Á Â Ã À1 ;û denotes the maximum likelihood estimate of u. In addition, we can also find the 1 À d ð Þ% upper and lower asymptotic confidence limits of the estimated model parameters asû À z d 6 2 : ...;p is the diagonal elements of the covariance matrix and z d 6 2 is the critical value for a confidence level of 1 À d 2 À Á on the standard normal variate (Hogg et al. 2005). The FIMs of the studied distributions are summarized in Table 2. For the Pareto distribution, we provide exact uncertainties of the estimated parameters (Quandt 1966), whereas for the exponentiated Rayleigh and exponentiated Weibull distributions, we could not calculate parametric uncertainties due to the unavailability of explicit expressions for FIM elements (Pal et al. 2006). The exact variances of the estimated parameters of a Pareto distribution are given as (Quandt 1966) The estimated parameter values and their asymptotic uncertainties are summarized in Table 3. It may be noted that we have used equation solver package fsolve() of MATLAB (2010) for computations. Table 3 provides many insights to the distributional properties of earthquake interevent times in the northwest Himalaya and its adjoining regions. For instance, the estimated values of shape parameters in Weibull, gamma, and exponentiated exponential distributions become less than 1.0, revealing that the corresponding hazard functions have monotonically decreasing shape. Similar is for the exponentiated Weibull distribution, asb < 1 andbĝ < 1 (Pal et al. 2006). In contrast, the exponentiated Rayleigh distributionb < 0:5 À Á shows that the hazard functions of these distributions assume bathtub type shape (Pal et al. 2006). In addition, we observe that the first moment about origin (mean) and the second central moment (variance) for the heavy-tailed Pareto model do not exist, asb < 1 (Johnson et al. 1995).
Table 3 also shows that the asymptotic standard deviations of estimated model parameters numerically vary from 0.010423 (Pareto) to 3.714372 (inverse Gaussian). However, this does not directly relate to the amount of uncertainty in application of the underlying models. Instead, the propagation effect from these estimated parametric uncertainties should be assessed thoroughly to comment on the applicability of various models (Hogg et al. 2005).

Model selection
To compare the usefulness of different competitive distributions, we employ three model selection tests. These are the maximum likelihood test with its improvement to AIC, K-S test, and the Chisquare test.
The maximum likelihood test is derived from the MLE technique. In this test, we make use of log-likelihood values for model comparison. The model that produces the maximum likelihood value is considered to be the best-fit model (Johnson et al. 1995). However, this test is applicable to prioritize the models that have same number of model parameters. To rule out this a-priori assumption and to incorporate the relative cost of operations due to more number of model parameters, the AIC is formulated as AIC ¼ 2k À 2L; k and L denote the number of parameters and the log-likelihood value. As per this criterion, the most economical model is the one that produces the minimum AIC value (Johnson et al. 1995).
In comparison to AIC, the non-parametric K-S test sorts competing models according to their 'closeness' to a step-like empirical distribution function (EDF). Various forms of EDFs are available in literature (Johnson et al. 1995). The Monte Carlo simulations are usually employed in K-S criterion.
In the Chi-square test, the range of sample observations is first divided into k parts (may be equal or unequal) and then the corresponding Chi-square distances between observed and expected frequencies are calculated. The distribution with the minimum Chi-square distance is marked as the best model. It is observed that in Chi-square test, there is no theoretical method to select the number of class intervals. As a consequence, we calculate Chi-square values corresponding to five classes (<2, 2-4, 4-8, 8-12, >12). The results of different model selection tests are summarized in Table 4. Table 4 shows that the AIC values corresponding to the exponentiated exponential (167.82) and Weibull (167.84) models are very close (except second decimal digits) and also minimum among all AIC values. Thus, these two models can be tagged as the most economical ones in representing earthquake interevent times in the study region. Nevertheless, the AIC values for the gamma (168.36) and the exponentiated Weibull (170.16) distributions are also smaller in comparison to other AIC values. In the non-parametric K-S test, it is observed that the exponentiated Weibull model has the minimum K-S distance (0.0911), although the K-S distances for the exponentiated a For Pareto distribution, exact standard deviations sâ; sb À Á of the estimated parameters are calculated (Quandt 1966); the upper and lower confidence bounds are capped at 0 and 0.049315, as 0 < a < t i for all i. Similarly, for inverse Gaussian distribution, the lower confidence bound forb is capped at 0, asb > 0: b Asb < 1 in the exponentiated exponential distribution, the tri-gamma function c 0b À 1 À Á is not defined. Hence, the FIM for the exponentiated exponential distribution was not evaluated. c For the exponentiated Rayleigh and exponentiated Weibull distributions, parametric model uncertainties are not calculated, as their FIMs are not explicitly available (Pal et al. 2006). exponential (0.0951), Weibull (0.1054), and gamma (0.1177) distributions are not negligible. To analyse more on the overall fitness of these competitive distributions, we provide a number of K-S plots in Figure 3. In the Chi-square test, the x 2 value for the exponentiated Weibull distribution (0.5955) is again the least among all Chi-square values. From these three model selection criteria and associated K-S plots, three broad categories of distributions have emerged. The gamma, Weibull, exponentiated exponential, and the exponentiated Weibull distributions (the most-best fit) provide the best fit, while the exponentiated Rayleigh and the lognormal distributions offer intermediate fit, and the remaining models, that is, inverse Gaussian, inverse Weibull, Levy, Maxwell, Pareto, and Rayleigh distributions provide poor fit to the interevent times of the studied catalogue. Furthermore, as the non-parametric K-S criterion is free from any prior assumptions and possesses several advantages over other model selection criteria (Johnson    Hogg et al. 2005), we prioritize the best-fit distributions as per their minimum K-S distances: exponentiated Weibull (rank 1), exponentiated exponential (rank 2), Weibull (rank 3), and gamma (rank 4). However, we caution our readers that the absolute differences in K-S values are too insignificant to draw any stringent comment on the superiority of these competing models (Utsu 1984;Pasari 2015).

Earthquake hazard assessment
Knowing the best-fit models in the previous section, our next aim is to assess earthquake hazard to measure the extent of damages from seismic events. Ideally, earthquake hazard assessment requires reliable estimation of future earthquake size, location, and time of occurrence (Rundle et al. 2003). However, in this study, we measure earthquake hazards in terms of the mean interoccurrence time and conditional probability of future earthquakes. We also present a few earthquake hazard curves (obtained from conditional probability values) in Figure 4 to illustrate the projected hazard in the thickly populated northwest Himalaya and its adjoining regions. Using the best fit exponentiated Weibull model, the estimated mean interevent time for a magnitude 6.0 or higher event in the study region turns out to be approximately 7 §8 years, whereas the estimated cumulative probability reaches 0. (2038-2043) years from now (i.e. 2018) ( Table 5). In addition, a series of estimated conditional probability values for different combination of elapsed times (time elapsed since the last strong event in March 1999) and residual times (time to a next potential earthquake) are also calculated (Table 6) to understand the recent and future earthquake hazard in the study region. Associated conditional probability curves (hazard curves) are presented in Figure 4. It may be noted that in Table 5, we have provided a range of conditional probability values considering the parametric uncertainties in a 95% confidence limit (refer Table 3). However, due to the unavailability of parametric uncertainty for exponentiated exponential and the exponentiated Weibull parameters, we have provided their 'absolute' conditional probabilities. It is observed that the conditional probability values from different competitive models (gamma, Weibull, exponentiated exponential, and exponentiated Weibull distributions) are quite consistent and they converge to the highest probability for larger residual times (35-40 years).
The conditional probability curves in Figure 4 provide long-term earthquake forecasting of the study region. These emanated hazard curves have important roles in disaster planning and mitigation, seismic insurance, seismic zonation, and revision of building codes (SSHAC 1997).

Discussions
The northwest Himalaya and its adjacent regions suffer from a number of high-intensity interplate earthquakes (Kayal 2010;Arora et al. 2012). In this work, the statistical modelling of earthquake interevent times has been performed using 12 probability models, namely gamma, lognormal, Weibull, inverse Gaussian, inverse Weibull, Levy, Maxwell, Pareto, Rayleigh, exponentiated exponential, exponentiated Rayleigh, and exponentiated Weibull distributions. An updated and homogeneous earthquake catalogue comprising 29 independent events M W 6 ð Þduring the period 1803-1999 is utilized for this purpose. The present and future (projected) earthquake hazards have been assessed in terms of conditional probability of earthquakes in a specified time-space window.  The earthquake recurrence in different regions of the world has been studied by many researchers using various probability models (Utsu 1984;Ram 1997, 1999;Tripathi 2006;Yadav et al. 2008Yadav et al. , 2010bYazdani and Kowsari 2011;Pasari 2015). Utsu (1984) used four competitive probability models, namely Weibull, gamma, lognormal, and exponential distributions in Japan and surrounding regions. Five important probability models, namely exponential, lognormal, Pareto, Rayleigh, and gamma were used by Yazdani and Kowsari (2011) in the seismotectonic provinces of Iran. In India, Ram (1997, 1999) used lognormal, Weibull, gamma, and exponential probability distributions in the northeast Indian peninsula and Hindukush regions, and latter for the entire Indian subcontinent regions. They used earthquakes with threshold magnitude 7.0 (Parvez and Ram 1997), and 6 < m b < 7 (Parvez and Ram 1999), respectively. The Weibull and gamma distributions were observed to be the most representative models for the Indian subcontinent (Parvez and Ram 1999). In a similar effort, the probabilistic assessment of earthquake hazards in Gujarat and adjacent regions has been studied by Tripathi (2006) and later re-estimated by Yadav et al. (2008) and Pasari and Dikshit (2015b). Yadav et al. (2008) used Weibull, gamma, and lognormal models, and observed that the gamma model fits best to the Kachchh and Saurashtra regions, whereas the lognormal model fits best to the mainland Gujarat regions. In the northeast India and adjoining regions, the time to next earthquake has been estimated by Ram (1997), Yadav et al. (2010b), and Pasari and Dikshit (2015a). While Yadav et al. (2010b) observed that the gamma model is the most suitable model, Pasari and Dikshit (2015a), from a comprehensive list of 11 probability models, remarked that the gamma, exponentiated exponential, and Weibull models provide the best fit to the observed interoccurrence times. In the northwest Himalaya and its vicinity, including part of India, Pakistan, Hindukush, Pamirs, Mongolia, and Tien-Shan, remarkable efforts towards probabilistic assessment of earthquake hazard parameters are made by Parvez and Ram (1997), Shanker et al. (2007), Yadav (2009), Yadav et al. (2010a, 2012a, 2012b, 2013a, 2013b, 2015, and Chingtham et al. (2016). Shanker et al. (2007) used Gumbel's third asymptotic distribution of extreme value probability model in three seismotectonic zones in the Hindukush-Pamir northwest Himalaya region to assess time-dependent seismicity in these regions. Their results reveal that the most probable largest annual earthquakes in these three zones are approximately 6.0, 4.9, and 5.0, respectively. Later, Yadav et al. (2010a) studied regional time and magnitude predictable model in the same region, however, dividing it into 17 seismogenic sources. They estimated time-dependent conditional probabilities and observed that the time predictable model seems to be preferable over slip-predictable model (Yadav et al. 2010a;Chingtham et al. 2016). Analysis of time-dependent seismicity and earthquake hazard parameters (e.g. mean seismic activity rate, b-value of Gutenberg-Richter frequency-magnitude relationship, and upper-bound or maximum regional magnitude) in 28 seismogenic source zones in the northwest Himalaya has been performed in a series of works by Yadav et al. (2012aYadav et al. ( , 2012bYadav et al. ( , 2013aYadav et al. ( , 2013bYadav et al. ( , 2015 using classical and Bayesian statistics, Kijko-Sellevol method, and Gumbel III distribution. The spatial mapping of earthquake hazard parameters including annual and 100-years mode of earthquake magnitude in northwest Himalaya was undertaken by Yadav et al. (2012b). They observed high seismic hazard (low recurrence periods) in the Hindukush-Pamir Himalaya as well as in the Kangra and Kashmir regions (Chingtham et al. 2016). The above results not only depict some quantitative measure of seismicity, but also highlight the importance of proper engineering planning for inexorable earthquake events in the study region (Shanker et al. 2007;Working Group 2013;Pasari 2015).
The present study contributes to the hazard assessment endeavour by modelling the distribution of earthquake interevent times of magnitude 6 and higher events from an exhaustive statistical framework of 12 time-dependent and heavy-tailed probability distributions. The study region partially overlaps with some of the zones as defined in previous studies viz. Parvez and Ram (1997), Shanker et al. (2007), Yadav (2009), Yadav et al. (2010a, 2010b, 2012a, 2012b, 2013a, 2013b, 2015, and Chingtham et al. (2016). In particular, Parvez and Ram (1997) found that the lognormal model comparatively fits better to the Hindukush region, and the region has reached to the highest conditional probability level for an elapsed time of 23 years (i.e. 1997) since the last major event in 1974.
They have also inferred that this non-occurrence of a major event after 1974 could be due to the frequent occurrences of moderate-size earthquakes (17 earthquakes of magnitude greater than 6.0 and 4 earthquakes of magnitude greater than 6.4) in the Hindukush area (Parvez and Ram 1997). Similarly, on the basis of 43 years of data , Shanker et al. (2007) observed that the NW Himalayan region (25 -34 N, 73 -85 E) possesses most probable earthquake magnitude between 6.0 to 6.5 in next 10-25 years. In an another effort, Yadav et al. (2012a) estimated the return periods for earthquake with magnitude 6.0 in different zones of NW Himalaya and found that it broadly varies from 11-30 years. They remarked that the seismic hazard level from one zone to other varies significantly revealing the crustal heterogeneity and complex seismotectonic setting of the Himalayan region (Yadav et al. 2012a). Later, Yadav et al. (2012b) studied the spatial distribution of upperbound magnitude in different zones of NW Himalayan region and found that the maximum magnitude ranges from 6.4 to 8.0. Likewise, similar analysis including the most perceptible earthquake magnitude estimation was carried out in this direction by Yadav et al. (2013aYadav et al. ( , 2013bYadav et al. ( , 2015 and Chingtham et al. (2016) to refine the seismic hazards in the western-northwestern Himalayan region.
Our earthquake hazard assessment results from a comprehensive catalogue ranging from 1803-1999 and an updated set of probability models (including exponentiated exponential, exponentiated Weibull) are broadly consistent with the previous results (Parvez and Ram 1997;Shanker et al. 2007;Yadav et al. 2012aYadav et al. , 2012bYadav et al. , 2013aYadav et al. , 2013bYadav et al. , 2015. We found (using the best-fit exponentiated Weibull model) that the return period (one standard deviation above mean) of a magnitude 6.0 or higher event in the NW Himalayan region turns out to be about 15 years, and the cumulative probability of an earthquake reaches 0.92-0.95 after about 20-23 (2019-2022) years since the last strong Chamoli event in March 1999. In addition, knowing the time beyond the last strong event in March 1999 (elapsed time = 19 years), we have also calculated conditional probabilities using exponentiated Weibull distribution. These conditional probability values, for t ¼ 19; lie between 0.90-0.95 after about 20-25 (2038-2043) years from now (i.e. 2018). A series of other conditional probability values are also estimated (Table 6 and Figure 4) to depict future earthquake hazard scenario in the study region. The emanated conditional probability curves are useful in many engineering applications including disaster prevention and mitigation from scenario (hypothetical) earthquake studies and earthquake insurance premium calculations (Pasari 2015(Pasari , 2017. Apart from conditional probability of earthquakes for various projected residual times, for the first time we examine parametric uncertainties of the studied models from a surrogate FIM approach (Hogg et al. 2005;Dikshit 2015a, 2015b) to evaluate their efficiency and precision in practical applications. We used three popular goodness-of-fit tests to evaluate the model performances. It was observed that the newly introduced three-parameter exponentiated Weibull distribution provides the best representation to the present data of earthquake interevent times. The shape of the exponentiated Weibull distribution (Figure 3) agrees well to the gamma, Weibull, and exponentiated exponential; yet it is more generic and demands further investigations (Mudholkar and Srivastava 1993;Dikshit 2015b, 2018). The exponentiated Weibull density and hazard rate function, irrespective of small or large t, can be well approximated by a simple two-parameter Weibull model (Mudholkar and Srivastava 1993;Pasari and Dikshit 2018). Apart from the hazard function, the mean residual life (MRL) function and its asymptotic behaviour of the exponentiated Weibull model also characterizes earthquake hazard analysis in a large seismotectonic province (Pasari 2015;Pasari and Dikshit 2018). Therefore, the present study comprising the recently used exponentiated models illumines better insights to the hazard assessment and contributes to the development of innovative methods in earthquake hazard analysis in the northwest Himalaya and its adjacent regions.
As an alternative to the present analysis based on real-time interval between the triggering events, one may also use simulated earthquake catalogues to calculate the comparative chances of earthquake phenomenon in every search grid in the northwest Himalaya (Jordan 2006). Alarm-based earthquake forecasting techniques using pattern informatics (e.g. Rundle et al. 2003;Holliday et al. 2006;Lee et al. 2011), relative intensity (e.g. Zechar and Jordan 2008), or moment ratios (e.g. Talbi et al. 2013) are some of the other prominent ideas to be explored. These methods usually exhibit grids where earthquakes are most likely to appear in a near-future time period (Rundle et al. 2003). Several national and international agencies on statistical seismology, such as the Regional Earthquake Likelihood Models (RELM) testing group and the Collaboratory for the Study of Earthquake Predictability (CSEP) group have adopted these practices in their national earthquake forecasting curriculum (Jordan 2006;Schorlemmer et al. 2007). Nonetheless, these simulation and extrapolation based likelihood testing techniques require a strategic environment with precise information of crustal deformation rates and comprehensive seismotectonic data of small, moderate, or large dependent and independent historical events for all grids (Schorlemmer et al. 2007). For better understanding of earthquake processes and seismic hazard analysis, the northwest Himalaya region has also experienced a series of different approaches based on geology and geomorphology (e.g. Powers et al. 1998;Thakur et al. 2000), micro-seismicity (e.g. Kayal 2001Kayal , 2010Arora et al. 2012), archaeoseismology (e.g. Rajendran et al. 2013), GPS geodesy (e.g. Banerjee and Burgmann 2002;Ponraj et al. 2011;Jade et al. 2014;Pasari 2015), geophysical investigations (e.g. Koulakov et al. 2002;Caldwell et al. 2013), and palaeoseismic studies (e.g. Kumar et al. 2001;Philip et al. 2011). In particular, the geodetic investigation in the northwest Himalaya reveal an estimated slip rate of 11 § 2 mm/yr (Pasari 2015) and envisage one or more great earthquakes in near future (Banerjee and Burgmann 2002;Ponraj et al. 2011;Jade et al. 2014;Pasari 2015). Therefore, in time to come, we hope that the continuum efforts on sophisticated instrumentation and observational consequences will provide an improved hazard assessment strategy in the study region.

Conclusions
The present research brings out the following conclusions: (1) The studied probability distributions can be broadly classified into three groups based on their relative fitness to the present earthquake data. The best fit, in highest to lowest hierarchical order, comes from the exponentiated Weibull (rank 1), exponentiated exponential (rank 2), Weibull (rank 3), and gamma (rank 4) distributions. An intermediate fit comes from the exponentiated Rayleigh and lognormal models. The remaining models, namely inverse Gaussian, inverse Weibull, Levy, Maxwell, Pareto, and Rayleigh distributions fit poorly to the studied catalogue. (3) The empirical earthquake hazard analysis from exponentiated models may provide important clues to the underlying physical mechanism of earthquake genesis in the northwest Himalaya. These results can be combined with other studies to bring more insights to the seismic disaster planning and prevention of the study area.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This research work is partially supported by a fund received from Additional Competitive Research Grant (ACRG) at BITS Pilani, Pilani campus. Grant No: PLN/AD/2016-17/01