RETRACTED ARTICLE: Seismic site characterization in northeast India: a holistic GIS based approach

We, the Editors and Publisher of Geomatics, Natural Hazards and Risk have retracted the following article: Manik Das Adhikari, Woo-Young Jung, Ji-Myong Kim & Sang-Guk Yum, Seismic site characterization in northeast India: a holistic GIS based approach. Geomatics, Natural Hazards and Risk, 13, 2022, 538–567 DOI: 10.1080/19475705.2022.2029583 The authors acknowledge that the premise and some text and images published in this article were from work undertaken by others. We have been informed in our decision-making by our policy on publishing ethics and integrity and the COPE guidelines on retractions. The retracted article will remain online to maintain the scholarly record, but it will be digitally watermarked on each page as “Retracted”.


Introduction
Ground motion-triggered from a strong seismic excitation may get modulated due to the impedance contrast between the bedrock and the soil column. The resonance effect is induced by energy trapped between bedrock and soil damping caused by the sediment column at the site of interest. Thus, even for sites located at the same distance from the epicenter, the seismic motion and earthquake damage experienced during an earthquake can vary considerably. The amplitude of ground vibrations is proportional to the magnitude of the earthquake and the ground's solidity (compaction). The seismic energy may amplify by the ductile ground, resulting in strong ground vibrations that build-up on the surface (Siska et al. 2020). This is due to the seismic response of site-specific geotechnical properties of subsurface materials, such as the shear wave velocity (V S ) and the source impacts and propagation path effects. Thus, the site classification system has been proposed in almost all existing seismic design standards (BSSC 2001;BIS 2002;IBC 2006) to predict the site-specific seismic hazards caused by earthquakes. As a result, seismic site characterization maps are required at all earthquake management stages, including mitigation, preparedness, response, and recovery (Singh 2015).
The rapid population growth and migration of rural Indians to cities have resulted in the construction of buildings and infrastructure in urban areas. Building on sedimentary basins and rapid-unplanned urbanization has reduced the structural protection of buildings built in earthquake-prone areas. The significant effect of local site conditions on the ground shaking level during a large earthquake due to the nonlinear behavior of unconsolidated sediment deposits has been evidenced from the damage pattern of impending historical earthquakes in Northeast India. Therefore, the importance of site effect studies cannot be overstated, and must be considered in such areas. Destructive earthquakes in India, such as the 1897 Shillong earthquake (M w 8.7), the 1918 Srimangal earthquake (M w 7.6), the 1930 Dhubri earthquake (M w 7.1), the 1950 Assam earthquake (M w 8.7), the 2001 Bhuj earthquake (M w 7.6), the 2005 Kashmir earthquake (M w 7.8), and the 2011 Sikkim Earthquake (M w 6.9) have demonstrated that local soil conditions can play a primary role in destruction (Bansal et al. 2021). Numerous building collapses, road subsidences, and deaths in various parts of Sikkim during the 2011 earthquake have highlighted the importance of conducting a comprehensive site response analysis of Northeast India. Therefore, subsurface soil classification is one of the necessary works to reduce the potential damage caused by earthquake effects (Silahtar et al. 2020). The effects of soil types on ground motion response are the basis for site characterization studies. The dominant frequency is a key factor in regulating the amplification of ground motions. Due to resonance at a specific frequency related to the thickness and average shear-wave velocity of a layer, a significant reduction in seismic impedance may result in large amplification. Therefore, the surface geology and topography of the area play a major role in the magnitude of ground shaking caused by earthquakes (Singh 2015). The local site effect becomes critical in dictating the damage intensity in most buildings where the foundation does not extend to bedrock (Sitharam et al. 2015). Therefore, in an earthquake-prone region, a thorough investigation of site classification and amplification is crucial. Several studies have shown that the shear-wave velocity is an important factor in assessing ground motion (Borcherdt 1970;Stephenson et al. 2005;Kanli et al. 2006;Odum et al. 2013;Nath et al. 2014;Silahtar et al. 2020). According to Borcherdt (1994), the average shear-wave velocity of the uppermost 30 m of the soil segment (V s 30 ) is the most efficient site characterization value. In general, geophysical testing (i.e. Down-Hole, Cross-Hole, Spectral Analysis of Surface Waves (SASW), Multi-channel Analysis of Surface Waves (MASW)), and geotechnical investigation (i.e. borehole drilling) are reliable techniques for obtaining detailed site-specific information of dynamic and static soil properties, which are commonly used for the design and assessment of civil structures under seismic loadings (Karray and Hussien 2017). However, these investigations are limited to smaller areas and are not suitable for characterizing a larger hilly terrain. Several alternative approaches have been proposed in the technical literature (see e.g. Wald and Allen 2007;Ansal et al. 2010;Wills et al. 2015;Yong 2016;Parker et al. 2017;Foster et al. 2019;Karimzadeh et al. 2019), the majority of which are proxy-based. One of the most popular methods for regional analyses of site conditions has been recommended by Allen and Wald (2009). The approach relies on a topographic slope-V S 30 correlation by assuming the morphometrical characteristics of terrain as a lithology representative. More in detail, rock formations are generally indicated by steep slopes, while soft soils are indicated by nearly flat basins and stiff soils are indicated by intermediate slopes (Foster et al. 2019). Besides, sediment fineness, which is a proxy for lower shear velocity (Parker et al. 2017), should be linked to the slope. For example, steep, coarse alluvial fan material near the mountain front usually grades to finer deposits as it moves away from the mountain front. As a result, harder rock is more likely to be found on steeper slopes, resulting in faster V S . On the other hand, basin sediments and regolith are found on the flatter slope, resulting in slower V S . Thus, DEM data is widely used for regional seismic site characterization studies (Nath et al. 2013).
Northeast India is known as one of the world's six most active plate tectonic regions (Kayal 1998). The area is classified as the highest seismic risk zone by the Global Seismic Hazard Assessment Programme (GSHAP), with Peak Ground Acceleration (PGA) values in the range of 0.35-0.4 g. It places in seismic zone V on the seismic zonation map of India (BIS 2002), where the occurrence of earthquakes is frequent, as depicted in Figure 1. The overall effect of previous earthquakes modified the landform, suggesting that local site conditions substantially impact earthquake strength. It is well established that the basins can significantly amplify earthquake ground motion as peak ground motion, amplification, and shaking duration are all affected by the basin structure. The seismic response can vary within a short distance over minor changes in the geology and soil condition. Thus, seismic site classification is widely used to quantify the site effects and spectral amplification. The purpose of site characterization is to understand how a soil deposit responds during an earthquake based on the frequency of the base motion, the geometry, and the soil's material properties above the bedrock. It entails characterizing a site based on the existing properties of subsurface materials, and it's crucial for ensuring that the proposed design and construction are suitable, cost-effective, and safe. Thus, in this article, the motivation is to perform a holistic seismic site characterization study of Northeast India, which will fortitude the behavior of all aspects of a site in order to provide data for seismic hazard and risk analysis.

Seismotectonic setting of Northeast India
Northeast India is one of the most earthquake-prone territories in the world, located in a complex tectonic province exhibiting juxtaposition of the N-S trending Arakan Yoma belt and the E-W trending Himalaya. The region's background tectonic regimes can be summarized as Eastern Himalayas, Mishmi Block, Indo Myanmar arc, Brahmaputra valley, and Shillong Plateau (Nandy 2001). The underthrusting of the Indian plate with the Eurasian plate describes the Himalayan setting. The Indo-Myanmar subduction zone, which extends from the Bay of Bengal to the Mishmi block in N-S, NNE-SSW, and NE-SW directions, is seismically active. The Naga thrust and the Disang thrust are the two principal discontinuities in this zone. The seismic activity in the Bengal Basin may be related to intra-plate activity, while the seismic activity in the Tripura-Mizoram fold belt may be related to the plate-boundary activity. The Sylhet fault, the Oldham fault, the NE-SW Hail-Hakula lineament, the E-W trending Dauki fault, the NNW Tista fault, the Tuipui fault, and the Mat fault (Jaishi et al. 2014) are the major tectonic domains in this area as shown in Figure 1. It is observed that the overall impact of the previous earthquakes in the region viz. the 1869 Cachar Earthquake of M w 7.4, the 1897 Shillong Earthquake of M w 8.1, the 1930 Dhubri Earthquake of M w 7.1, the 1943 Assam Earthquake of M w 7.2, the 1947 Arunachal Pradesh Earthquake of M w 7.3, the 1950 Assam Earthquake of M w 8.7, the 2011 Sikkim earthquake of M w 6.9 and the 2016 Manipur Earthquake of M w 6.7 implies that the local site conditions significantly affect the intensity of an earthquake. A closer examination of the seismicity data reveals that the majority of the events occurred in the Eastern Himalaya, Meghalaya Plateau, Mikir Hills, Mishmi block, and Assam shelf tectonic domains. These tectonic domains have a diffused pattern of earthquake events with post-collisional intracratonic characteristics (Nandy 2001). The focal depths of most earthquakes in the Indo-Myanmar tectonic domain range from 70 to 200 km. The westerly convex N-S subduction zone of the Indian plate is defined by increased seismicity. Seismicity is sparse in the Eastern Himalaya, the Tethyan Himalaya, and the Yarlung Zangbo or Tsangpo suture zone. The majority of events occurs between the Main Boundary Thrust (MBT) and the Main Central Thrust (MCT), and is more concentrated in areas traversed by the transverse faults/ lineaments that run across the Himalayan trend.
The isoseismal map of the 1869 Cachar Earthquake of M w 7.4 exhibits MM intensity varying from V-VII in Northeast India and caused widespread damage from Dibrugarh to Manipur, Patna, and Kolkata. The Shillong Plateau and its nearby region were struck by an intraplate earthquake of maximum MM intensity X in 1897 with M w 8.1. The 1930 Dhubri earthquake of M w 7.1 was located at an epicentral distance of 3.9 km NNW of Dabigiri, Meghalaya, with a maximal intensity of IX. The earthquake did not cause any fatalities, but most of the buildings were damaged in Dhubri and the surrounding areas. An earthquake of M w 7.2 again hit the area close to the Shillong Plateau from the Kopili fault on 23rd October 1943, whose epicenter was located at 13.6 km east of Hojai in Assam. The intense shaking had caused fissures, the falling of trees, unevenness of the ground, and damage to buildings. The 1947 Arunachal earthquake of M w 7.3 occurred with a maximum intensity of V, where the released seismic moment was about 1.5 Â 1020 Nm with the cracks in the walls of Dibrugarh, Jorhat, and Tezpur. The 1950 Assam earthquake of M w 8.7 with a maximum intensity of XI is an overwhelming earthquake of the 20th century caused by continental collision rather than subduction whose epicenter was located in the Mishmi Hills; roughly 4800 people were killed, and there were also numerous aftershocks of magnitude 6.0 and above. The 2016 Imphal earthquake of M w 6.7 with a maximum intensity of VII, occurred on the Indo-Burman plate boundary, where the Indian and Burman plates converge in a northeasterly direction (3-4 cm/yr) (Dasgupta et al. 2000;Parameswaran and Rajendran 2016;Singh et al. 2017;Bora et al. 2022).
Further, to understand the seismic activity of the region, we have performed smoothened gridded seismicity analysis in the entire Northeast India. The technique of smoothened gridded seismicity modeling has established by Frankel (Frankel 1995) for activity rate distribution mapping for a specified threshold magnitude. Several workers had previously employed this technique (Frankel et al. 2002;Maiti et al. 2016). In this study, the study area is gridded at 0.1 intervals, with each grid point encompassing a cell of 0.1 Â0.1 that represents about 11 km 2 . We have considered the threshold magnitude of M w 3.5 and above for the analysis. Figure 2 depicts the seismic activity rate distribution for M w 3.5 threshold magnitudes at different hypocenter depths. From the smoothened seismicity analysis, the probable asperity zones can be identified. It is observed that at the threshold magnitudes of M w 3.5, patches of well clustered seismic activity rate indicating possible stress concentration are seen within the hypocentral depth range of 70-180 km in the Eastern Boundary subduction belt, which has been the source of the January 2016 Manipur earthquake of M w 6.7 and the April 2016 Myanmar earthquake of M w 6.9. Similar consideration of clustered seismicity as a proxy to stress concentration has aspired in the Shillong Plateau region believed to be the source of the 1897 Shillong earthquake of M w 8.1. Thus, the occurrence of these devastating earthquakes during the last hundred years and in the near future around Northeast India signifies how vulnerable the region towards seismic devastations.

Data and methods
Site response analysis is a cutting-edge seismic hazard analysis technique that uses the properties of soil profiles and the ground motion parameters of the base rock beneath the soil profiles to estimate ground motion parameters at the earth surface (Cramer 2003;Andrade and Borja 2006;Teague et al. 2018). Therefore, the elastic properties of near-surface materials, as well as their impacts on seismic wave propagation, are significant in earthquake engineering activities and earth science research. One of the most important factors responsible for amplifying earthquake motions is the increase in amplitude of soft sediments. The site effects, including ground motion amplification and modification of the ground motion frequency, have a considerable impact on the seismic design and vulnerability of structures (Borja and Sun 2007). The need for site characterization study for Northeast India requires immediate attention as it is one of the most devastating regions of the country, and anytime earthquakes can occur in this region. The geology and topography of the area play a major role in the magnitude of ground shaking caused by earthquakes. Thus, developing a regional seismic site characterization (SSC) map necessitates a substantial investment in geophysical and geotechnical data acquisition and analysis. As a result, SSC maps are rarely available in many seismically active areas. The time-averaged shear wave velocity of the top 30 m earth surface is a key parameter that is commonly used as an indicator of seismic site conditions, and is frequently used in SSC mapping (Shafique et al. 2018). Thus, the shear-wave velocity estimation is one of the main steps in site classification. It depends on the shear modulus and density of soil with a relation, Shear modulus ¼ Density Â V s 2 . Average V s are inversely correlated with earthquake intensity; therefore, high V s areas are at low susceptibility to seismic shaking. For instance, there will be little amplification of seismic waves at a bedrock site (high shear-wave velocity). On the other hand, in a sedimentary basin (low Vs), intense amplification is expected. Thus, the effective shear-wave velocity (V s 30 ) is an actual proxy to perform site classification as it is a good indicator of shallow subsurface condition and is directly related to the sediment stiffness. In addition, its spatial distribution helps in delineating the presence of soft sediments in the terrain. National Earthquake Hazard Reduction Program (NEHRP) and Uniform Building Code (UBC) have recommended five site classes or ground profiles (soil types) based on V s 30 with similar site response as given in Table 1. Geotechnical and geophysical tests are commonly used to classify soil overburden. However, these tests are limited to smaller areas and are not suitable for characterizing a larger hilly terrain. As a result, efforts have been made to determine geotechnical properties of subsurface materials using geology, geomorphology, and topography as proxies (Wald and Allen 2007;Allen and Wald 2009;Shafique et al. 2018). The control of underlying geological units on estimated V s 30 has already been observed by Wald and Allen (2007), Wills et al. (2015), and Parker et al. (2017). Shafiee and Azadi (2007) revealed that topographic slope has a major effect on V s 30 measurements, so they used topographic slope obtained from remote sensing data as a proxy for estimating V s 30 with global coverage. Accordingly, Nath et al. (2013), Yong et al. (2008), Shafique et al. (2012), Ahmad (2016), Kwok et al. (2018), and Heath et al. (2020) used topographic slope as a proxy for V s 30 mapping. In this study, a proxy-based seismic site characterization has been performed to determine the amplification characteristics in Northeast India. The topographic analysis provides an important tectonic and geomorphic context for the earthquake, and it could have significant implications for predicting the most likely locations of future events in the area (Kirby et al. 2008). The topographic slope-based approach is supported by two major factors i.e., (i) hardness, permeability, fracture spacing, and other physical properties of soil/rocks that are strongly expressed by Vs and can be linked to topographical gradients, and (ii) topographic slope information is available at a greater resolution than geological data, allowing for better detection of site differences (Wald and Allen 2007;Nath et al. 2013). There can be significant differences in seismic wave velocity between different rock types and between saturated and unsaturated soils, with a lot of overlap in the measured velocities; however, in topographic slope analysis, the steepest slope indicates higher V S , while nearly flat topography indicates lower V S with soil deposits (Nath et al. 2013).
Since there is neither a widely used direct relationship between slope and V s 30 nor any analytic formula that can emerge from the data, Wald and Allen (2007) characterized the relationship in terms of discrete steps in shear-wave velocity values tied to NEHRP V s 30 boundaries. To increase resolution, the NEHRP borders were further subdivided into smaller velocity windows. The topographic gradient at any site that falls within these boundaries will be assigned a V s 30 that defines the median value of the subdivided NEHRP boundaries. Hence, the approach can be useful for the regional level evaluation, and based on this approach, a preliminary site classification map of Northeast India has been developed. Consequently, various site characterization parameters viz. predominant period, amplification, and vulnerability index have been calculated based on the various empirical equations on each pixel of an image size. In the present study, we have used geotechnical data (reported by https://asdma. assam.gov.in/, Nath 2007;Jana et al. 2012;Sil and Sitharam 2013;Sitharam and Sil 2014;Haloi and Sil 2015;Pallav et al. 2015;Sil and Sitharam 2017;Biswas and Baruah 2017), strong ground motion data (recorded by Program for Excellence in Strong Motion Studies (PESMOS) of IIT Roorkee available at http://pesmos.in (last accessed Sept 2013) and Darjeeling-Sikkim Strong Motion Array (DSSMA) of IIT Kharagpur (2011 Sikkim earthquake and its aftershocks are available by https://www. moes.gov.in/content/strong-motion-data-records-september-2011-sikkim-earthquake; last accessed April 2019; Adhikari and Nath 2016; Adhikari 2018), and high-resolution Digital Elevation Model (ALOS PALSAR DEM, https://search.asf.alaska.edu/#/) for regional SSC mapping of Northeast India. The geotechnical data (e.g., soil descriptions and blow counts from standard penetration tests) and V S profiles are reported for various urban centers in Northeast India. These include data from Guwahati, Agartala, Silchar, Aizawl, Kohima, Imphal, Jorhat, and Shillong (Figure 3(a)) (e.g. https://asdma.assam.gov.in/, Nath 2007;Jana et al. 2012;Sil and Sitharam 2013;Sitharam and Sil 2014;Haloi and Sil 2015;Pallav et al. 2015;Biswas and Baruah 2017;Sil and Sitharam 2017). A compilation of measured V S 30 from different projects across Northeast India has been used in the present study (Figure 3(a, b)). In Guwahati city, the V S 30 distribution was computed from the SPT-N value and subsequent data analysis by Nath (2007). NEHRP site classes D (180-360 m/s) were observed in the city (Nath and Thingbaijam 2011). Sil and Sitharam (2013; carried out an MASW survey and geotechnical investigation to evaluate the shear-wave velocity distribution in Agartala and Aizwal. Pallav et al. (2015) and Mero et al. (2019) conducted site characterization in Imphal city and found V S 30 values ranging from 140-230 m/s for sites with different depositional conditions of sand, silt, clay, and shale. Jana et al. (2012) carried out a site response study in the Jorhat city based on geotechnical investigation and reported V S 30 in the range of 222-339 m/s for sites with various depositional conditions of sand, silt, clay, and sandy clay. In Kohima city, the NEHRP site class D and E reported by Imtikumzuk et al. (2015). The site characterization of Shillong city was reported by Biswas and Baruah (2017). Haloi The strong motion stations that have been considered in the present study are shown in Figure 1. These include the PESMOS networks with 45 stations and Darjeeling Sikkim Strong Motion Array (DSSMA) with 15 stations (Nath 2004, http://pesmos.in, last accessed Sept 2013https://www.moes.gov.in/). IIT Roorkee operated the PESMOS networks (currently operated by the National Centre of Seismology (NCS), New Delhi), including records of 25 earthquakes with magnitudes ranging from M W 3.9 to M W 6.9. On the other hand, DSSMA was installed by IIT Kharagpur to monitor earthquakes in the seismically active Darjeeling-Sikkim Himalayan region continuously. The site response at each strong-motion stations has been calculated using HVSR analysis of recorded events (25 three components recorded earthquakes available by PESMOS and 2011 Sikkim earthquake & its aftershocks recorded by DSSMA are available by https://www.moes.gov.in/content/strongmotion-data-records-september-2011-sikkim-earthquake) with a good signal-to-noise ratio greater than 3.0. The Shear wave phases have been picked up in all filtered acceleration time-series using windows of varying lengths, depending on the record, with each window chosen to include the strongest shaking. Thereafter, the windowed shear wave acceleration time series has been transformed into the frequency domain using fast Fourier transforms, and a 7-point smoothening algorithm has been used to smoothen the shear wave Fourier amplitude spectra. The Site Amplification Factor (HVSR ij (f k )) has been computed at each j th site for the i th event at the central frequency f k using the root mean square average of the amplitude spectra of the transverse (H ij (f k )j T ) and the radial (H ij (f k )j R ) components to the Fourier spectra of the vertical component V ij (f k ) (Lermo and Chavez-Garcia 1993).
The recorded earthquake events from this network have been used in previous studies (e.g., Pal et al. 2008;Adhikari and Nath 2016;Adhikari 2018;Sandhu et al. 2020). An example of recorded time series data and average spectral ratio based on all the recorded events of the same station is depicted in Figure 4.
Geological Survey of India (GSI) conducted an extensive microtremor survey in Gangtok (Mohan et al. 2014), Pakyong (Ibrahim et al. 2014), Kohima (Imtikumzuk et al. 2015), Jorhat (Jana et al. 2012), Itanagar (Imtikumzuk et al. 2017), Imphal (Mero et al. 2019), and Aizwal (Jana et al. 2013) city. On the other hand, Nath (2007) performed an extensive seismic noise survey in Guwahati city (https://asdma. assam.gov.in). Thereafter, the seismic vulnerability index (K g ) was reported based on the Nakamura (1997) approach to identify sites where seismic hazards and damage may be expected. In the present study, we compiled all the reported K g values and the one estimated from recorded strong motion data to understand the efficacy of topography gradient-based approach.
Finally, a holistic GIS model has been developed for site characterization in Northeast India by integrating site characterization attributes, that is, Site Classes, Site Amplification, Site Period, and Vulnerability index (Kg) as depicted in Figure 5. The SSC is achieved by using a multicriteria-based decision support tool formulated by Saaty (1990), referred to as an analytical hierarchical process (AHP). According to the layers' apparent contribution to the overall seismic site response, the corresponding weights, ranks, and theme attribute scores of each thematic layer are assigned. To create an SSC zonation map of Northeast India, all georeferenced thematic layers are combined using the GIS aggregation process. After that, the generic amplification curves for various SSC zones (i.e. Low Susceptible, Moderate Susceptible, and High Susceptible) have been generated from the recorded strong ground motion data based on the bootstrapping technique. The resultant SSC map and the generic site response curves may be used for new urbanization in Northeast India.

Topographic gradient as a proxy for seismic site classification
The geology and topography of the area play a significant role in the magnitude of ground shaking caused by earthquakes. The local site effect plays a major role in assessing damage severity in most buildings where the base does not extend to bedrock. V s 30 is one of the standard indicators for mapping seismic site conditions in most earthquake-prone countries' building codes. Thus, NEHRP site classifications protocol has proven practical applications for seismic hazard studies. The seismic site classification of Northeast India has been carried out using a topographic gradientbased approach.
The gradient of the topography is compliant with shear-wave velocity (V s 30 ), amplification, and the predominant period of surface soil. The original ALOS PALSAR digital elevation (12.5 Â 12.5 m) data generated by JAXA and the Japan Resources Observation System Organization (JAROS), downloaded from the Alaska Satellite Facility (ASF) has been used for site characterization. A hole-filling algorithm is used in the processing to provide smooth and accurate elevation surfaces for the entire region while removing no-data areas. The average change in elevation over the distance between the cell and its eight neighbors determines the steepest downhill descent. The slope function is a concept that corresponds to a plane to the z values of a 3 Â 3 cell neighborhood surrounding the processing or center cell. The average maximum technique is used to measure the slope value of this plane. The slope map (Figure 3a) for the study area has been created using the spatial analysis tool available in ArcGIS, based on the slope function described above. Allen and Wald (2009) proposed slope ranges corresponding to each NEHRP site class based on correlation studies performed in active and stable continental regions. The active continental region encompasses the entire Northeast India (Nath et al. 2013). Therefore, for the site characteristics evaluation in Northeast India we have used the correlation of an active continental region (Table 1). The average shear wave velocity map is produced by classifying slope ranges for each pixel size directly. It has been observed that the Basin sediments are deposited mainly in areas with relatively low gradients and are represented by low shear-wave velocity, while high-velocity materials are more likely to maintain a steep slope. Figure 6(a) depicts the correlations between measured V S 30 and topographic slope given by Allen and Wald (2009). The correlation has greatly improved when a broader range of NEHRP site classes B, C, D and E is considered. Using the international building code (IBC 2006), Northeast India has been classified into four major categories, including eight sub-classes as depicted in Figure 6(c). Figure 6(b) displays the logarithm's histogram of the ratio between the observed value and the corresponding V S 30 estimated from the topography gradient method. The logarithm residuals exhibit a good clustering at zero. On the other hand, a good match is found when topographic slope and surface shear wave velocity (V S 30 ) are compared. The topographic slope has been found to have a slightly higher V S 30 than the values (V S 30 ) collected from site-specific studies. These variations can be due to the different mapping resolutions in both the techniques. However, the findings are highly close, demonstrating the efficacy/applicability of this technique for estimating site effects in the absence of subsurface information. Following that, a simple equation relating the surface shear velocity (V s 30 ) and the slope gradient (S) for Northeast India has also been obtained, 4.2. Estimation of site response using Borcherdt (1994) and square root impedance (SRI) model Several models are proposed to determine spectral amplifications based on the V s , like the Borcherdt (1994) spectral amplification model and the square root-impedance model (SRI) as given by Joyner et al. (1981). In the literature, other models for predicting spectral amplification have been discussed (i.e. Holzer et al. 2005;Wald and Allen 2007;Sil and Sitharam 2017). The model is based on the principle of energy conservation. According to this theory, when a seismic wave moves from one medium to another, its energy remains unchanged, but it undergoes a transition. The amplification can be estimated by assuming horizontally stratified layers over a nonattenuating half-space (Haskell 1953). This method uses a simplified form of the impedance function to evaluate site response (Joyner et al. 1981). The defining characteristic of this method is that the material characteristics are average. The impedance ratio is determined using the following formula, where A 1 and A 2 are the seismic waves' amplitudes (S-waves); q 1 and q 2 are the materials density; and V 1 , V 2 , are the V s of the respective subsurface soil layers. In the present study, the general amplification factor (F) and spectral amplification factor have been calculated using the method suggested by Borcherdt (1994) and Joyner et al. (1981). Using the assumption that rock and soil densities are equal, the general amplification factor 'F' has been determined as the ratio of averaged hard rock V s to effective shear-wave velocity (V S 30 ) of soil site for each pixel, without taking groundmotion levels into account (Borcherdt 1994;Sil and Sitharam 2017;Karimzadeh et al. 2019),   (Fig. 14) with local topography variation.
where, ij are pixel coordinates, and V 0 is the average V S of rocky medium, that is, $1130 m/s. The assigned lower limit of V S of rock (type B) is 760 m/s, while the higher limit is 1500 m/s (IBC 2006), resulting in an average velocity of approximately 1130 m/s for the rocky medium. Based on the topography-derived V s 30 map, the general amplification factor (F) has been calculated for Northeast India as depicted in Figure 7(a). The acceleration-independent amplification factor (F) ranges between 1.4 and 5.8. The entire Lower Brahmaputra Basin, Agartala, Silchar, Tezpur, Siliguri, and Imphal City have a high capacity to amplify ground motions, as shown by the discrete distribution of amplification factor (Figure 7(a)).
Several researchers have used the fundamental site period (Ts), which corresponds to the first mode of vibration of the soil deposit, as one of the criteria for seismic site characterization (Molnar et al. 2020). The risk is mainly due to the ability to amplify earthquake motions in soft surface deposits. The fundamental resonant frequency (f 0 ) has been determined using the following equation if there is only one homogeneous layer (Kramer 1996;Ahmad 2016;Sil and Sitharam 2017) for each pixel (i, j).
where, H ij is the depth of soil layer (according to the NEHRP guidelines (IBC 2006), the depth of the soil layer is considered to be 30 m for a site response study), and V 30 sij is the average shear wave velocity of the soil layers. Alternatively, the fundamental period (T s ) is defined as where, T sij is the predominant period. The developed site period map presented in Figure 7(b) indicates that the highly elevated part of the region is associated with the lowest site period ranging from 0.15 to 0.24 s (equivalent to 4-7 Hz) due to the lesser overburden thicknesses. The site period varies from 0.24 to 0.67 s (equivalent to 1.4-4 Hz) in the other parts of the region due to the existence of relatively thick alluvial deposits. On the other hand, it varies from 0.15 to 0.33 s along the slope or transition zone from the hilly part to the floodplain, while predominant periods are longer in the lowland or floodplain areas where soil deposits are likely to be thicker and softer. Near-surface geologic materials have long been known to have a direct impact on ground vibrations. Therefore, the SRI model has been used to calculate the frequency-dependent amplification factor. The surface amplifications have been calculated using the impedance function corresponding to average shear wave velocity at the bedrock and surface levels. The frequency dependence amplification function (Shearer 1999;Brown et al. 2002) has been used in this experiment as, where, f is the characteristic site frequency (i.e., 1/T s ), A (f) is an amplification function, q r (density at the rock, kN/m 3 ), and q s (density at soil layers, kN/m 3 ) are taken as 22 and 16 kN/m 3 , respectively, whereas v r (V s at bedrock) is taken as 1130 m -s . Vs denotes the average shear wave velocity in the soil profile's upper 30-meter depth (V S 30 values obtained from the topographic gradient approach have been considered here). Following Nath et al. (2012), the kappa factor k 0 ¼ 0.0035 has been used for the analysis. Based on the NEHRP site class, the average V S 30 values (i.e. 270, 560, and 1130 m/s) of site class D, C, and B and the corresponding site periods (i.e. 0.44, 0.21, and 0.11 s) have been evaluated for determination of short period and long period amplification factor as depicted in Figure 8. The amplification factor for 100 Hz (0.01 s) has also been estimated for the study region (Figure 8(a)). The frequency dependence amplification factor (A f ) ranges from 1.0 to 3.0 for all the periods (0.11, 0.21, and 0.40 s), whereas in PGA, it ranges from 0.35 to 1.10. The amplification factor for PGA and short period and long period are essential for seismic hazard analysis.

Estimation of seismic vulnerability index (K g )
The seismic vulnerability indexes (K g ) give a chance to estimate the earthquake damage before the earthquake occurs (Mase and Sugianto 2021). After the 1989 Loma-Prieta Earthquake, the study offers pretty big ground deformations where K g values are more significant than 15 and no damages where K g values were minimal (Kang et al. 2021). In 1999, during the M w 7.7 Chi-Chi earthquake in Taiwan, large-scale liquefaction was also reported in regions of higher K g (> 10) (Huang and Tseng 2002). Choobbasti et al. (2015) reported that the threshold value of liquefaction potential is K g >5. The K g values in the liquefied areas are higher than those in the adjacent regions devoid of liquefaction . Previous studies (i.e. Nakamura 2000; Huang and Tseng 2002) had shown that the distribution of Kg very well with damage due to more recent earthquakes. Based on these observations, it is concluded that K g value is site-specific and designated as an index of vulnerability for that site. Thus, this value might be useful in demarcating vulnerable regions, which are susceptible to liquefaction. Nakamura (1997) suggested that both the soil's vulnerabilities and the earthquake damage could be estimated using the K g value before a destructive earthquake. The K g value is a parameter that is determined by soil's dynamic properties. This parameter can be used to assess the site's vulnerability under strong ground motion. The amplification factor and natural vibration period can be used to measure the seismic vulnerability index for both soil and structure. Therefore, this analysis's primary aim is to evaluate the K g values and their spatial distribution in the study region and the zones that may be impacted by an earthquake or the extent of liquefaction. Generally, the Horizontal to Vertical Spectral Ratio (HVSR) method is widely used to determine K g . However, in this study, the K g value has been obtained using a topography gradient-based approach. K g is derived from the resonant frequencies (f 0 ) and peak amplification factor (F) as, where F is the amplification factor and f o is the fundamental frequency (i.e. 1/T s ), T s is the predominant period. The general amplification (F) and predominant site period (Ts ij 30 ) calculated based on Borcherdt (1994) approach as discussed in section 4.2. Thereafter, the K g values have been estimated (Equation (8)) for respective sites to divide into discrete zones of weak surface covers. Figure 9(a) shows the vulnerability index (K g ) distribution, having values ranging from 0.3 to 22.7. It is observed that the Lower Brahmaputra Basin are mostly associated with higher K g . The K g values are bigger than 20 are found in NEHRP site class 'E', while site class 'D' associated with K g value in the range of 5-20 and site class B, & C associated with K g values < 5. High vulnerability indices (K g ) are associated with Imphal, Silchar, Agartala, Guwahati, Tezpur, Siliguri, and Jorhat (Figure 9a). The results show that the sites with K g > 10 are where possible structural damage can occur. According to the BSSC (2001), soil classification criteria based on V s 30 values, the Lower Brahmaputra Region is mostly of NEHRP classes D and E. These areas have relatively higher K g values (K g >10), while other types of the soil generally have K g values that are smaller than 10. The large amplification and low frequencies are common in the lower Brahmaputra basin. Since, the K g is an inherent index which shows the vulnerability of the site to earthquake shaking, areas with higher K g indicates seismically vulnerable areas which may be prone to seismic damage and need to be carefully considered during new urbanization.
Thereafter, a comparison has been performed between reported and estimated K g . Geological Survey of India had performed extensive microtremor measurements in the few urban regions in Northeast India to identify the predominant frequency and amplification of the ground-based on the Nakamura technique (i.e. Nath 2007;Jana et al. 2012Jana et al. , 2013Ibrahim et al. 2014;Mohan et al. 2014;Imtikumzuk et al. 2015Imtikumzuk et al. , 2017Mero et al. 2019). The HVSR method compares the horizontal components' ratio to the vertical components of the microtremor signal spectrum (Nakamura 2000). After that, K g values were calculated using the predominant frequency and amplification factor. On the other hand, in the present study, strong ground motion data recorded by PESMOS and DSSMA has also been used to determine the seismic vulnerability index. Figure 9(b) shows the histogram of the reported K g value of Northeast India. The logarithmic difference between K g obtained through microtremor observations and estimated based on the topography gradient approach for the different urban centers is depicted in Figure 10(a-h). On the other hand, Figure 10(i) presented the comparison between the K g value estimated based on the recorded strong motion data and the present one. It is observed that the estimated K g value is a good match which is located in the site class B and C or rugged topography, while inconstancy has been observed in the flat topography, which is formed by the sedimentary deposit or located in site class E and D (Akkaya 2020). It is also observed that the large uncertainty between observed and predicted K g of Guwahati, Kohima, and Figure 9. (a) K g distribution map of Northeast India. The higher K g values mostly appear in the lower Brahmaputra Basin. Subplot 'b' represents the histogram of reported Microtremor and Strong Motion observations and corresponding K g value obtained through the HVSR technique, while subplots 'c-e' shows the comparison between predicted K g through topography approach and in-situ-K g (reported) determine through HVSR technique.
Imphal city which is located in flat topography and mostly in site class D and E, which needed a further site-specific investigation. Figure 9(c-e) shows the comparison between K gT and K gM for Aizawl, Itanagar, and Gangtok city exhibited good consistency. Figure 10. Logarithmic difference between reported seismic vulnerability index (K gM ) obtained through HVSR analysis based on the Microtremor data and estimated seismic vulnerability index (K gT ) based on the topography gradient approach for (a) Gangtok, (b) Aizawl, (c) Guwahati, (d) Imphal, (e) Itanagar, (f) Jorhat, (g) Kohima, and (h) Pakyong, while subplot (i) represent the log different between K g value calculated from recorded strong motion data and K gT .

GIS-based integrated SSC model of Northeast India
Site effect and vulnerability of ground and structures is a crucial issue for earthquake disaster mitigation (Jena et al. 2021). Thus, the detailed microzonation of any city waiting for the earthquake is still needed (Nath et al. 2014). GIS-based micro-zone of seismic site characterization has been carried out previously by several researchers through a multi-criteria based decision support tool (Pal et al. 2008;Nath et al. 2014;Ahmad 2016;NCS 2016;Som et al. 2016). After confirming the ground's dynamic characteristics, in the present study, the SSC index of each pixel has been achieved, which makes it possible to estimate the weak points of the ground surface. Therefore, a holistic GIS-based seismic site characterization map (SSCM) has been generated through Multi-criteria based decision support tool formulated by Saaty (1990) referred to as Analytical Hierarchal Process (AHP). The associated features are ranked or scored within the SSC themes. The initial integral ranking, R j , is normalized to ensure that no layer exerts an influence beyond its determined weight using the following relation (Nath 2004), where, R j is the row score, R max and R min are the minimum and maximum scores of a particular layer.
In the present study ArcGIS 10.3 has been used for the purpose of thematic mapping through vector/raster layer generation and spatial analysis tool, that is, 'Raster Calculator'. The SSC themes, pertaining to the study region materialized as thematic layers on the GIS platform are (a) Site Class, (b) Site Amplification, (c) Site Period, and (d) Vulnerability Index (K g ). The weights and the ranks to each thematic layer and the feature ranks thereof are assigned values accordingly to the apparent contribution of the layers to the overall amplification characteristics and seismic hazard. For example, in NEHRP site class theme we have assigned ranking as 'low' for site class B to 'high' for site class E for severity to liquefaction index and considering site response point of view (Nath et al. 2014). Similarly, Site Amplification Factor and Site Period theme attributes are ranks considering amplification characteristics (Nath and Thingbaijam 2011). K g >15 indicate zones of weak surface covers, susceptible to liquefaction and seismically vulnerable and therefore is assigned higher rank. Table 2 presents the pair-wise comparison matrix for the respective themes and their normalized weights. To obtain a normalised pair-wise matrix, the weights of each factor are computed by adding all the ratios in the relative matrix column and then dividing each element in the matrix by its column total. Finally, the weighted matrix is constructed by dividing the sum of the normalised rows by the total number of criteria. The normalized ranks assigned to the features of each theme are listed in Table  3. The seismic site characterization (SSC) index is calculated as, where 'w' is the normalized weight of a theme and 'r' represents a feature's normalised rank in the theme. Thereafter, a post-classification filter is applied to the SSCM to reduce the highfrequency variance. The micro-zones of seismic site characterization map of Northeast India shown in Figure 11 exhibited three broad divisions with an SSC index (HI) defined as 0.66 < HI ! 1.0 for 'High' susceptible condition associated with the lower Brahmaputra region, Agartala, Silchar, Tezpur, Siliguri, Guwahati, and Imphal City; 0.33 < HI ! 0.66 for 'moderate' susceptible conditions; and 0.0 < HI ! 0.33 for 'Low' vulnerable conditions, in most parts of the elevated region. It is also observed that the High and Moderate zone are associated with K g >5 and expected ground deformation due to moderate to large earthquakes. On comparison of reported seismic intensity data (i.e. USGS Earthquake catalog, https://earthquake.usgs. gov/earthquakes/search/, Dasgupta et al. 2000;Hough et al. 2005;Martin and Szeliga 2010;Som 2014) with the SSCM, it has been observed that the 'High' and 'Moderate' zones are associated with MM intensity of V-XII thus indicating a satisfactory correlation between these proxies of hazard and seismic intensity. The obtained SCC zonation map of North-East India has been validated by correlating reported seismic intensity data with the SSC Index through ROC (Receiver Operating Curve). For AUC using the reported intensity data, the SSC model yielded AUC values of 0.834 ( Figure 12). Additionally, SSCM based generic study of the said attributes reveals that the soft sediments associated with low V S 30 and high T s exhibit low predominant frequency and high site amplification, for example, site class E/$F (V s 30 < 180 m/s; T s > 0.50 sec) shows lowest predominant frequencies within the range of 0.6-1.7 Hz and higher site amplification of 4.5. The K g values in the Lower Brahmaputra Basin are also higher indicated weak zones that may fail and liquefy during moderate to large earthquakes. The SSCM indicated that the significant seismic effect could occur in the entire Lower Brahmaputra river basin.

Generic site response modeling based on the strong ground motion data
In the field of seismic hazard assessment, site response analysis is regarded as a key research subject. Site amplification at each of the 60 Strong Motion Stations has been computed through Horizontal-to-Vertical Spectral Ratio (HVSR) analysis of recorded events with a good signal-to-noise ratio greater than 3.0. The site amplification spectrum derived by considering both the near-and far-field recorded earthquakes for the sites falling in each of the SSCM, that is, Low, Moderate, and High, are amalgamated and averaged out within ± one standard deviation to provide a generic site amplification spectra for each SSC zone. The generic site response curves for various  SSC zones have been developed for Northeast India (Figure 13). The grey curves represent a composite of site amplification spectra associated with each SSC index for the different earthquakes. In contrast, the bold red curve represents the generic site amplification of the corresponding SSC zone as an average spectrum, and dotted blue curves are associated ± one standard deviation. It is evident from Figure 13; the higher site amplification factor 4.5 is associated with very soft sediments of the High zone of SSCM. These generic site amplification curves can further utilized for the surface consistent ground motion synthesis and building codal provinces of Northeast India.

Correlation between topography and SSC attributes
The findings show how digital elevation model (DEM), geotechnical data, strong ground motion data, and reported microtremor data had been used to assess site-specific conditions and create a comprehensive seismic site characterization map for Northeast India. The slope-velocity model has been found to be a valuable tool for estimating site characterization attributes in terms of surface shear-wave velocity (V S 30 ), general amplification factor (F), predominant period (T s 30 ), spectral dependent amplification factor (A f ), and seismic vulnerability index (K g ). The results demonstrated the compatibility of the velocity-slope model with reported SSC attributes. The produced maps fit with the geomorphological and geological settings of Northeast India. A comparison of topography and SSC attributes, that is, velocity, amplification, and predominant period of Northeast India reveal a significant agreement to the area's morphometry. In the Brahmaputra and Tista river valleys, where shear velocity is low and quaternary sediments are thick, higher values of Figure 14. Topography, slope, velocity, amplification, and predominant period variations along (a) AA 0 and (b) BB 0 profile (location of both the profiles is shown in Figure 7(b)).
predominant period and amplification have been found. The profiles AA 0 and BB 0 have been found to accurately represent the surface geology and geomorphology ( Figure 14).

Conclusion
The SSC attributes and its micro-zone map are being proposed on a regional scale for the first time in Northeast India. Here, the general amplification factor (F) has been determined using the velocity contrast between the surface and bedrock. The present results show the predominant period of surface soil based on V s 30 data varies in the range of 0.15-0.67s. The acceleration-independent amplification factor (F) has been found to be changing in the range of 1.4-5.8. On the other hand, accelerationdependent A f varies from 1.0 to 3.0 for all the periods (0.11, 0.21, and 0.40 s), whereas in PGA, it varies from 0.35 to 1.10. The spatial distribution of K g value suggested that the lower Brahmaputra Basin is more prone to earthquake damage. The results of the amplification and fundamental frequency studies corroborate the findings. It is recommended that the High and Moderate zone of the SSCM should be avoided from new urbanisation and planning projects while the other parts of the region are relatively safe. Thus, the developed maps are critical for Northeast India's anti-seismic building codes. Proposed high-resolution SSC map can provide crucial information for the development of Smart Cities in Northeast India as intended by India's Federal Government. This information can also be used by architects and civil engineers to evaluate the risk of failure of existing structures and design future earthquake-resistant structures in the region. The approach could be further improved by using higher resolution digital elevation models (e.g. Quickbird, Cartosat-1, and Cartosat-2), and more site-specific geotechnical data/information which recover finer scale variations of derived SSC attributes.