An optimized hexagonal quadtree encoding and operation scheme for icosahedral hexagonal discrete global grid systems

ABSTRACT Although research on the discrete global grid systems (DGGSs) has become an essential issue in the era of big earth data, there is still a gap between the efficiency of current encoding and operation schemes for hexagonal DGGSs and the needs of practical applications. This paper proposes a novel and efficient encoding and operation scheme of an optimized hexagonal quadtree structure (OHQS) based on aperture 4 hexagonal discrete global grid systems by translation transformation. A vector model is established to describe and calculate the aperture 4 hexagonal grid system. This paper also provides two different grid code addition algorithms based on induction and coordinate transformation. We implement the transformation between OHQS codes and geographic coordinates through the , and coordinate systems. Compared with existing schemes, the scheme in this paper greatly improves the efficiency of the addition operation, neighborhood retrieval and coordinate transformation, and the coding is more concise than other aperture 4 hexagonal DGGSs. The encoding operation based on the coordinate system is faster than the encoding operation based on the induction and addition table. Spatial modeling based OHQS DGGSs are also provided. A case study with rainstorms demonstrated the availability of this scheme.


Introduction
With the development of technology in earth science, the era of big earth data has arrived (Guo et al. 2014;Guo 2017). To meet the challenges in transmitting, storing, processing, analyzing, managing, and sharing big earth data, advanced technology is required to make the data manageable and valuable . Discrete global grid systems (DGGSs) are gaining traction as data models for a digital earth framework designed for heterogeneous geospatial big data (Craglia et al. 2012;Goodchild 2018). Based on its infrastructure, which is superior to regular grids based on map projections (Bauer-Marschallinger, Sabel, and Wagner 2014;Lin et al. 2018), the DGGS will enable the integrated analysis of massive, multisource, multiresolution, multidimensional earth observation data (Yao et al. 2020).
DGGSs comprise a spatial reference system that uses a hierarchical tessellation of cells to partition and address the globe, which is considered a promising model for processing big earth data. The Open Geospatial Consortium (OGC) specified the core abstract specification and extension mechanisms for the DGGS (Purss et al. 2017). In the DGGS, the earth's surface is partitioned into a set of cell regions, where each region has a single point and unique code associated with it. Compared with traditional spatial data organization and application mode, the DGGS can avoid the large deformation caused by direct projection. The DGGS is more suitable for solving large-scale problems and efficient processing of multiresolution data because of its hierarchical and continuous structure. As a tool, the DGGS has been employed in multiple fields, such as geospatial data management (Shuang et al. 2016;Li et al. 2019;Li, Stefanakis, and Mcgrath 2020), ocean modeling (Lin et al. 2018), construction of global gazetteers (Adams 2017), global oil palm monitoring (Cheng et al. 2017), integrated environmental analytics (Robertson et al. 2020), and prediction of hate crimes (Jendryke and McClure 2021).
Compared with the DGGS based on the geographic coordinate system, the geodesic DGGS based on regular polyhedrons has equal-area cell regions and can avoid distortion from the equator to the poles, which has increased its popularity among many researchers (Sahr, White, and Jon Kimerling 2003). Different geodesic DGGSs may be characterized according to base polyhedrons, cell shape, aperture, cell indexing, and projections (Mahdavi- Amiri, Alderson, and Samavati 2015;Peterson 2016). The base polyhedrons used to approximate the earth can be partitioned into cells and projected onto a sphere by a projection method. The icosahedral Snyder projection is commonly used in geodesic DGGS research due to its equal-area property and low angular distortion (Snyder 1992;Kimerling et al. 1999;White 2000;Sahr 2008;Vince and Zheng 2009). In geodesic DGGSs, there are three common cell shapes: triangle (Dutton 1984(Dutton , 1999Iii, John, and Goldsman 2001;Zhao, Chen, and Li 2006), quadrilateral (Alborzi and Samet 2000;White 2000;Amiri, Bhojani, and Samavati 2013), and hexagon (Sahr, White, and Jon Kimerling 2003;Vince 2006;Vince and Zheng 2009;Tong et al. 2013;Wang et al. 2021). Compared with other cell shapes, hexagonal cells have advantages such as close packing and uniform adjacency, which are favorable for spatial analysis and data visualization (Sahr 2011;Ben et al. 2015). Three hierarchical spatial partitioning structures-apertures 3, 4, and 7 were utilized in hexagonal DGGSs. Most research on multiprecision hexagonal DGGSs has focused on apertures 3 and 4.
The core functionality of the DGGS is the indexing method that supports the rapid indexing of spatial data in the DGGS and the efficient calculation of spatial analysis. The indexing categorization of the DGGS includes hierarchy-based indexing, space-filling curve indexing, and axesbased indexing. Different indexing methods have different characteristics and can be converted among each other (Amiri, Samavati, and Peterson 2015). For hexagonal DGGSs, hierarchy-based indexing methods are primarily utilized in research and application. Sahr (2005Sahr ( , 2008Sahr ( , 2011 introduced the icosahedral modified generalized balanced ternary (MGBT) for indexing the DGGS. Peterson (Peterson 2011) proposed the PYXIS scheme to encode and index cells of the Icosahedral Snyder Equal Area Aperture 3 Hexagonal (ISEA3H) DGGS. Arithmetic and Fourier transforms for the PYXIS were analyzed by Vince and Zheng (Zheng 2007;Vince and Zheng 2009). Mocnik proposed an identifier scheme that encodes the geographic coordinates of the centroid of a cell on ISEA3H DGGS (Mocnik 2019). Two types of aperture 4hexagonal grid systems were combined to construct the hexagonal quaternary balanced structure (HQBS) (Tong et al. 2013). According to the characteristics of the aperture 4 hexagonal grid systems, Wang et al. designed the hexagon lattice quad tree (HLQT) structure Wang et al. 2019).
Compared with the aperture 3 hexagonal grid, the partition structure of the aperture 4 hexagonal grid is simple and consistent, and its encoding operation is more efficient. The Aperture 4 DGGS is usually based on quad-tree encoding, but coding gaps and unique attribution problems will arise in the process of quad-tree encoding. The encoding cannot tile the whole plane because hexagonal grid subdivision is not nested at different levels. The method adopted by the HLQT was to construct the first symbol in the third resolution to the tile plane. The HLQT code consisted of the first symbol and subsequent digits, which caused a mismatch between the first symbol and subsequent digits (Wang et al. 2019). The calculation of the first symbol was more complicated. The HQBS adopts quadtree encoding, which combines two types of aperture 4 hexagonal discrete grid systems into a hierarchy, leading to complicated operation rules and the situation in which the code 'regularization' fails and needs to be returned and recalculated (Tong et al. 2013). The discrete global grid encoding operation usually constructs addition tables and addition rules, which are complex to build and computationally complex (usually O(N 2 ), by means of induction. In practical applications, more emphasis is placed on the simplicity, usability and efficiency of encoding operations. The efficiency of the existing DGGS still cannot meet the actual needs. The significance of the global discrete grid is to provide a new management and analysis scheme for spatial data. However, current research hardly provides the expression methods and interfaces of traditional spatial data on the discrete global grid, which limits the application of the DGGS.
In this study, we establish the mathematical vector model of the aperture 4 hexagonal grid system, which is used to express and operate the hexagonal quadtree encoding, and design the encoding scheme of the Optimized Hexagonal QuadTree Structure(OHQS) by translation transformation. The scheme is extended to the icosahedron and sphere. In the encoding addition and neighborhood retrieval operations, in addition to constructing a set of addition rules similar to integer addition by means of induction, we also construct a set of addition rules based on ijk coordinate transformation. The efficiency of the two addition rules is compared. In the transformation algorithm between DGGS code and geographic coordinates, we calculate the coordinates of the center point of the grid by converting the code to the ij coordinate system. The location characteristics of the hexagonal center point in the ijk coordinate system and IJK coordinate transformation are employed to calculate and determine the hexagonal grid code where the point is located. The data conversion model from traditional spatial data (vector data and raster data) to a discrete global grid is also constructed. Multisource data, such as social media data and elevation data, are utilized to analyze the influences of the 'Beijing 7.16' rainstorm disaster.

Basic model of a dGGS
A geodesic DGGS is generally specified by a base polyhedron, orientation of the polyhedron relative to the earth, a hierarchical spatial partitioning method, projection, and assigning points to cells (Sahr, White, and Jon Kimerling 2003). To construct the spherical hexagonal grid system, first, nearly all existing hexagonal DGGSs should build a planar hexagonal grid system. Second, the polyhedral model is built based on the planar hexagonal grid system. Last, the polyhedral model is projected onto the sphere. In this study, we selected the Inverse Snyder Equal-Area Projection Aperture 4 Hexagon Discrete Global Grid System (ISEA4H DGGS) model, as shown in Figure 1, because aperture 4 hexagon hierarchy changes more regularly and the Icosahedral Snyder Equal-Area Projection has the most minor cell distortion (Kimerling et al. 1999;Mahdavi-Amiri, Alderson, and Samavati 2015).
To construct the DGGS, the icosahedron is selected to approximate the earth, and the spherical discretization based on the earth is converted to discretization on the icosahedron. We determine the position of the icosahedron relative to the sphere by setting the vertices at the poles. The Figure 1. Model of icosahedron DGGS icosahedron is expanded, as shown in Figure 2. Referring to the discretization and encoding scheme in previous studies, the initial discretization is established in each triangular face of the icosahedron (Sahr 2008;Vince and Zheng 2009;Jin et al. 2018;Wang et al. 2019;Wang et al. 2021). We use Hg to indicate a complete hexagon, and Hg-d to indicate a hexagon with a triangle deleted. The twodimensional coordinate axis xy is used to identify the direction of hexagonal subdivision and encoding, and then the icosahedron is divided into 20 and 12 Hg-d, as shown in Figure 3. Note that the  icosahedron is subdivided into 32 hexagons, and we just need to build a subdivision and encoding structure on each hexagon.

Design of optimized hexagonal quadtree structure encoding scheme
There are four types of hexagonal subdivisions in aperture 4 hexagonal DGGSs. As shown in Figure 4, the red hexagon representing parent cells is partitioned into four gray hexagons representing child cells. In this paper, the first structure (a) is selected as the basic hexagonal grid subdivision structure, which is symmetrical about the parent cell and has better geometric properties.
To facilitate the representation and calculation of the aperture 4 hexagonal DGGS, the lattice representing the center point of the planar hexagonal grid is defined to replace the grid. Vectors u and v in the two-dimensional Cartesian coordinate system are employed to describe the lattice. As illustrated in Figure 5 (right), the unit of the side length of the initial red hexagon is 1; then and v = 0, 3 √

.
The vector model of the hexagonal lattice is constructed by describing the hexagonal lattice with vectors. In every resolution of hexagonal grids, each hexagonal cell can be represented by the combination of u and v. Let P n denote the set of all lattices that are partitioned in the nth resolution of hexagonal grids; it can be expressed as follows:  For n ≥ 1, let u n = 1 2 n−1 u, v n = 1 2 n−1 v and L n be the linear combination of u n and v n . It is easy to determine that L n represents all lattices in the nth level vector space. L n = n 1 u n + n 2 v n ; n 1 , n 2 [ Z (2) Properties of P n . The following properties are easily obtained from the definitions of P n and L n and are fundamental theorems for the aperture 4 hexagonal DGGS: 1. for n ≥ 1, P n , L n 2. P n are nested: P 0 , P 1 , P 2 , · · · 3. The number of lattice points in P n is 4 n 4. For any n ≥ 0 and P i [ P n , there exists a unique d i [ 0, 1 2 i u, The mathematical proof of Property 4 is presented in Appendix (1). Property 4 can be used as follows to assign each cell in P n a label (address) that consists of a string of n digits from the set {0, 1, 2, 3, 4}. Equation (4) is referred to as the standard form for P i in P n . Let a i [ {0, 1, 2, 3, 4}. The string a = a 1 a 2 . . . .a n of integers in the standard form is referred to as the label of P i in P n , it is a one-to-one correspondence between the label and the hexagonal cell. Figure 6 shows the labels of P 1 and P 3 . As shown in Figure 6, quadtree encoding will cause encoding gaps and uniqueness problems, and cannot cover the entire spherical surface, so it needs to be adjusted. For the code in the lower right corner, the excess hexagonal grids can be expressed as: The set of hexagonal grids in the upper left corner can be expressed as: and then RL n = 〈 〉RL * n . The mathematical proof is provided in Appendy (2). The part of the grid that exceeds the lower right corner exactly corresponds to the part that is vacant in the upper left corner. Similarly, the part of the grid that exceeds the upper and lower left corners can also correspond to the empty part in the lower and upper right corners, respectively. Therefore, Figure 6. The labels of P 1 and P 3 we can construct an optimized hexagonal quadtree structure by means of translation transformation, as shown in Figure 7.

Rule of code addition
3.1.1 Addition based on induction A vector represents the center point of the hexagonal cell, and the corresponding code is constructed. The code records the distance and orientation of the hexagonal lattice relative to the origin lattice. Similar to vector addition, the parallelogram rule can be used to define the code addition operation. According to induction and verification, the coded addition satisfies the properties of the Abelian group.
The rules of code addition in the current study are similar to the addition rules for integers (Vince and Zheng 2009;Tong et al. 2013;Wang et al. 2019). The code of the nth resolution a = a 1 a 2 . . . .a n and b = b 1 b 2 . . . .b n , The calculation method of code addition is similar to that of decimal addition. The method starts from the last symbol on the right and calculates bit by bit to the first symbol on the left. The code addition operation can be decomposed into the sum of the corresponding codes for different additions. a The same process applies to the decimal addition operation rule. For all the code elements to be added and the calculation result code element of the current bit, a carry code element will be generated on the left side. The specific operation rules are expressed as follows: a n ⊕ b n = a n 0 (a n = b n ) a n (a n = 0, b n = 0) c . . . . . . c n (a n = b n , a n = 0, b n = 0, c = {0, a n , b n } In the calculation process, when a n = b n , a large number of carry symbols will be generated on the addition bit, and the number of addition operations on each resolution exponentially increases. To improve the calculation efficiency, the number of addition operations must be reduced. Analyzing the rule of addition, the optimization methods can be described as follows: (1) 0 can be ignored as a calculation code element. When the code elements are identical, only one carry code element is generated so that the same code element can be calculated first. Only three symbols {1, 2, 3} remain, they participate in the addition algorithm. (2) It is unnecessary to store every carry symbol but only count the number of symbols that appear in addition, which can save considerable calculation memory. c = {0, a n , b n } can be realized by the binary XOR (^) operation: c = a n^bn .
In the calculation process, when a n = b n , numerous carry symbols are generated. By adopting the method of counting the number of occurrences of symbols and by preferentially calculating the same symbols, the number of symbol calculations can be reduced Wang et al. 2019). Through the above steps, the computational efficiency of code addition can be significantly improved, and Algorithm 1 shows the process of code addition for the OHQS.
Step 1. Initialize two-dimensional array A[n][4] to record the number of every digit that participates in the calculation in digit position i (position of a i and b i ). Let A[n][4] record the digits of labels a and b first.
Step 2. For i = n to 2 (right to left), do.
The number of remaining digits is three at most. The remaining digits can be input in the calculation by Equation. (8).
, and then the first number of results can be calculated.

Addition based on ijk coordinate
Although the efficiency has been improved, Algorithm 1 still has difficulty meeting the needs of practical applications. We attempt to implement the addition by transformation between OHQS codes and ijk coordinates. The ijk coordinate can be utilized to represent the relative position of the hexagonal grid in the space, as shown in Figure 8. The coordinate axis divides the space into 3 quadrants, and the coordinates in each quadrant are represented by two adjacent coordinate axes. The ijk coordinates are always positive, which is easy to calculate. In this paper, the OHQS code is converted to ijk coordinates, and the addition operation is performed in three dimensions in the ijk coordinate system. The addition result is converted to a code.
We need to realize the transformation between OHQS codes and the ijk coordinates. The directions of the three coordinate axes in the ijk coordinate system are equivalent to u, v, and −u − v in the vector model of the hexagonal lattice. Formulas (1) and (2) show that the units of the i, j, and k coordinate axes of the nth resolution are 1 2 n−1 |u|, 1 2 n−1 |v| and 1 2 n−1 |−u − v|. Each hexagonal grid can be expressed as [r, (i, j, k)] in the ijk coordinate system, where r represents the resolution of the hexagonal grid. For each digit a r in the OHQS code, from Formula (3), we know that . We can obtain the correspondence between digits and ijk coordinates. Next, the transformation relationship between the OHQS code a(a 1 a 2 . . . .a n ) and the ijk coordinate can be obtained, and Algorithm 2 for converting the code to coordinates and Algorithm 3 for converting the coordinates to code can be constructed.
In the calculation of converting the encoding into ijk coordinates, first, we separately calculate the ijk values of each resolution in the order of encoding, and accumulate the results to obtain (i, j, k). Second, because the value of three coordinate axes will be generated in the calculation, it needs to be converted to [n, (i, j, k)], which is represented by two coordinate axes. Last, since the translation transformation of the encoding is performed by using Formulas (4) and (5), the translation transformation of the ijk coordinates also needs to be performed in the calculation. The mathematical proof of Formula (9) is provided Appendix (3).
where n represents the length of the code and w(a p+1 ) is given in Table 1.
Step 2. Convert (i, j, k) to [n, (i,j,k)], which is represented by two coordinate axes.
Step 3. Obtain the translation transformation of the ijk coordinates.
In the calculation of converting [n, (i, j, k)] to code, first, the offset vector of each resolution of the DGGS is sequentially calculated from the 1st resolution to the nth resolution. The offset vector refers to the offset value of each layer symbol relative to the previous layer in the vector mathematical model. The offset vector refers to the offset value of the digits of each resolution relative to the  Table 2, the value of the digit of each resolution is determined, and last all the digits are combined to form the OHQS code a 1 a 2 . . . .a n .
Step 2. Determine the r-th digit in the code according to the offset vector.
The addition algorithm based on ijk coordinate transformation utilizes Algorithm 2 and Algorithm 3. For the two codes a 1 a 2 . . . .a n and b 1 b 2 . . . .b n to be calculated, first, the two codes are converted to ijk coordinates [n, i 1 , j 1 , k 1 ] and [n, i 2 , j 2 , k 2 ]. Second, the addition operation is performed in three dimensions in the ijk coordinate system, and last the result of the operation is converted to the OHQS code.

Rule of adjacent grid retrieval
Based on the additional rules of the OHQS code, the adjacent grid retrieval operation of the hexagonal cell can be constructed. The six adjacent codes of the grid code can be obtained by adding adjacent codes of the center code on the OHQS. As shown in Figure 9, the six adjacent codes of the center code are 00 . . . n . The adjacent grid retrieval operation based on the ijk coordinate transformation is relatively simple. First, we use Algorithm 2 to convert the OHQS code to (i, j, k) coordinates. Second, we calculate the ijk coordinates of the six adjacent grids (i, j + 1, k), (i + 1, j + 1, k), (i + 1, j, k), (i + 1, j, k + 1), (i, j, k + 1), and (i, j + 1, k + 1), as shown in Figure 10. Last we use Algorithm 3 to convert the six ijk coordinates to codes.

Transformation between OHQS codes and geographic coordinate system
Although there are various storage formats, most existing spatial data are usually represented by geographic coordinates or projected coordinates. To facilitate the organization and expression of spatial data in the DGGS, it is necessary to construct the transformation between geographic coordinates and global discrete grid codes. Based on the characteristics of the OHQS encoding scheme, this paper designs and implements an efficient transformation algorithm between codes and geographic coordinates. The algorithm involves six coordinate systems: (1) OHQS codes that can be regarded as a one-dimensional coordinate system, (2) ij coordinate system, (3) ijk coordinate system, (4) IJK coordinate system, (5) two-dimensional cartesian coordinate system on the triangular surface of an icosahedron, and (6) geographic coordinate system. Compared with the ijk coordinate system, the ij coordinate system has one fewer coordinate axis. The origin point is located at the center of the origin hexagon, and the coordinate axes are aligned with the two sides of the origin hexagon. Each hexagonal grid can be uniquely determined by (i, j) coordinates which are integer values. Figure 11 illustrates the assignment of the coordinate address. The IJK coordinate system is different from the ijk coordinate system. Figure 12 show the IJK coordinate system. The coordinate value in the IJK coordinate system is the vertical projection of the lattice point on the coordinate axis, and can be negative.
The units of the I, J, and K coordinate axes of the nth resolution are 1 2 n |u|, 1 2 n |v| and

From codes to the geographic coordinate system
According to the grid code, we can calculate the geographic coordinates of the center point and six vertices of the grid where the code is located. Unlike the complex expansion of the HLQT, we directly implement the transformation from code to geographic coordinates through the ij coordinate system. This transformation algorithm can calculate the geographic coordinates of the center point. The other six vertices are identical. First, the OHQS codes are converted to the (i, j) coordinates, Second the (x, y) coordinates of the Cartesian coordinate system are obtained by the (i, j) coordinates. Last, the geographic coordinates (B, L) can be calculated by the Snyder equal-area projection. Figure 13 and Algorithm 5 show the transformation process from codes to geographic coordinates.
Algorithm 5: CodeToCoordinate(code) Input: One label a 1 a 2 . . . .a n Output: Geographic coordinate (B, L) of the center point of the hexagonal cell Step 1. Calculate (i, j).
where n denotes the length of the label and the results of v(a k+1 ) are given in Table 3.
Step Table 3. v corresponds to different code cells Step 3. Calculate (x, y).
where L denotes the length of the triangular surface of the icosahedron.
Step 4. (B, L) can be calculated by the Snyder equal-area projection.

From the geographic coordinate system to codes
If we obtain the geographic coordinate (B, L) of a point, the OHQS code at which the point is located can be obtained by a transformation algorithm. The HLQT uses the parallelogram rule to project the coordinates to two specific coordinate axes, calculates the corresponding component codes on the two axes, and then calculates the final code through component code addition. We realize the transformation from geographic coordinates to code through the mutual conversion of the IJK coordinate system, ij coordinate system and ijk coordinate system. First, the (B, L) coordinate is converted to the (x, y) coordinate by the inverse Snyder projection. Because there are errors in the direct process of converting the (x, y) coordinate to the (i, j) coordinate, second we introduce the IJK coordinate system to obtain the (i, j) coordinate. Last, we convert the (i, j) coordinate to the (i, j, k) coordinate, and the code a can be calculated by the coordinate transformation of Algorithm 3. Figure 14 and Algorithm 6 show the transformation process from geographic coordinates to codes. Output: Code a of the hexagonal cell at resolution n, which contains the point.
Step 1. The geographic coordinate (B, L) is transformed to the (x, y) coordinate.
Step 2. Calculate the IJK coordinate as follows: Step 3 where L denotes the length of the triangular surface of an icosahedron, and m indicates the side length of the hexagon in resolution n.
Step 4. For Step 5. Convert the (i, j) coordinate to the (i, j, k) coordinate and then use Algorithm 3 ijkToCode ([n, (i, j, k)]) to calculate the OHQS code.

Spatial data modeling on a discrete global grid system
The organization and expression of spatial data is an essential key issue in processing and application, and it must be carried out in a specific digital space. Plane-based digital space and data expression methods have been significantly restricted in processing global or large-scale massive spatial data. It is urgent to establish an effective data expression method in a suitable digital space with characteristics such as uniformity and multiple scales. The hexagonal global discrete grid system is just an alternative form of spherical digital space. As previously mentioned, the OHQS code of the global discrete grid constructed in this paper implicitly expresses the spatial position. Additionally, the OHQS code carries information such as the scale and accuracy of the data and has the characteristics of multiresolution and the potential ability to process multiscale data. This part mainly addresses the construction of traditional spatial data (vector data and raster data) in the global discrete grid system. Both raster data and discrete global grids are discretizations of space. Rectangular grids are employed in raster data to partition space on a plane, while regular polygons are utilized in discrete global grids to partition spherical space. It is necessary to establish a data conversion model from raster data to a discrete global grid. The first step is to determine the level of the hexagonal grid that corresponds to the raster data according to the resolution of the raster data and the area of the hexagon. Table 4 shows the area of the hexagonal grid on different DGGS levels. Let initial area a = 17, 002, 187.39080. The best level to raster data conversation is: where n denotes the best level for data conversation, and RX and RY denote the resolution of the raster data on the x axis and y axis, respectively. There are three methods to resample raster values to DGGS grids: nearest-neighbor interpolation, bilinear interpolation, and bicubic interpolation. Compared with other methods, bilinear interpolation can guarantee higher interpolation accuracy and calculation efficiency (Ma et al. 2021). Figure 15 shows a schematic of the interpolation operation between pixels and hexagons.
For vector data, the best DGGS resolution to express the vector data can be selected according to the positional accuracy of the vector data. To sample the vector data into discrete global grid systems, as shown in Figure 16, the Bresenham algorithm (Bresenham 1977) is used to implement a polyline vector data conversation. The scan line seed fill algorithm is employed to implement vector data conversation of a polygon. The value of the attribute of vector data is assigned to DGGS grids.

Results
Presently, the hexagonal DGGS encoding schemes that are most commonly utilized in research and applications are PYXIS, HQBS, and HLQT (Vince and Zheng 2009;Peterson 2011;Tong et al. 2013;Wang et al. 2018;Wang et al. 2019;Wang et al. 2021). Previous studies have demonstrated that the code addition efficiency of the HLQT is superior to that of the HQBS and PYXIS, and that the transformation efficiency of the HLQT is superior to that of HQBS. This paper compares the operation efficiency of the OHQS with that of the HLQT (Tong et al. 2013;Wang et al. 2018;Wang et al. 2021).
To verify the efficiency and advantages of the above algorithms, comparison experiments are designed to test the efficiency of the addition operation, adjacent cell retrieval, and transformation  Table 5. Ia n and Ja n are OHQS codes, and Ib n and Jb n are HLQT codes. We compress the OHQS code to obtain the decimal code. By comparison, it can be determined that the OHQS encoding is relatively simpler, that there is no need to use commas to separate the first symbol, and that the OHQS encoding can also be converted to decimal code to further compress the encoding.

Comparison of the efficiency of code addition and adjacent cell retrieval
The efficiency of addition and adjacent cell retrieval in the DGGS is associated with the encoding scheme and calculation algorithm, encoding resolution, and number of code operations. We test three operations of the HLQT and OHQS based on induction and the OHQS based on the ijk coordinate. In the addition algorithm test, 100000 pairs of codes are randomly selected from each resolution to realize the calculation. In the adjacent cell retrieval operation, 100000 codes are randomly chosen to obtain the six adjacent codes. The calculation efficiency from resolutions 5-12 was tested ten times, and the unit is ms. The efficiency of addition and adjacent cell retrieval results between OHQS and HLQT are shown in Figure 17 and Table 6.
In the addition operation of the DGGS, the efficiency of the OHQS based on induction is higher than that of the HLQT, and the efficiency of the OHQS based on ijk transformation is much higher Figure 16. Data conversation diagram of vector data than that of both. As the resolution of the DGGS increases, the efficiency of the OHQS based on ijk transformation is affected the least, followed by that of the OHQS based on induction, while the impact on the HLQT rapidly increases. In the adjacent grid retrieval operation, the efficiency of OHQS based on ijk transformation is twice that of the OHQS based on induction and four times that of the HLQT. With an increase in the resolution of the DGGS, the efficiency of adjacent grid retrieval of the OHQS based on ijk transformation suffers the least, while the HLQT suffers the most. The adjacent grid retrieval operation is built on six code additions. The adjacent operation time of the HLQT is 4 times that of the addition operation, and the adjacent operation time of the OHQS based on ijk transformation and induction is 2.5 times that of the addition operation. The results are analyzed as follows.
(1) The HLQT constructs addition tables and addition rules by means of induction, which is not only cumbersome but also computationally complex. Although it is reduced from O(2 N ) to   O(N 2 ) through a series of methods, the operation efficiency is still slow. In this paper, we construct a code addition and adjacent grid retrieval algorithm based on the ijk coordinate transformation. In Algorithm 2 and Algorithm 3 of the mutual conversion between codes and ijk coordinates, each digit is only operated once, and the computational complexity is O(N), which greatly improves the efficiency of the OHQS code addition operation.
(2) Unlike the HLQT, the first symbol is not defined in this paper. The set E = {0, 1,2,3,4,5,6,10,20,30,40,50,60,100,200,300,400,500,600,106,601,201,102,302,203,403,304,504,405,605,506,1000,2000,3000,4000,5000, 6000} of the first symbol in the HLQT has a total of 37 symbols (Wang et al. 2019). Since there is no addition table built for some of these symbols, the HLQT builds complex rules to avoid its addition. Therefore, in the calculation of the first symbol, the method of counting the occurrences of the symbol cannot be utilized for optimization. With an increase in resolution, the calculation time of the first symbol also rapidly increases. Compared with the HLQT, the efficiency of the addition operation of the OHQS is greatly improved. (3) The adjacent grid retrieval operation is realized by the addition of the code and the six adjacent codes of the central grid. Since the six adjacent codes of the central grid are simple, the efficiency of adjacent grid retrieval is high. However, the addition of the HLQT is greatly affected by the first symbol, so the efficiency is limited.

Transformation
The OHQS encoding scheme of the planar hexagonal grid system has been established and extended on the surface of the icosahedron. The ISEA4H DGGS was constructed by Snyder equal-area projection. Figure 18 shows the third and fourth levels of the ISEA4H DGGS extended from the OHQS planar hexagonal grid system to the sphere. For the OHQS DGGS, the transformation algorithm between geographic coordinates and codes was realized and compared with the HLQT encoding scheme.

Transformation between geographic coordinates and codes
An experiment is designed to test the transformation between geographic coordinates and codes. The geographic coordinate (B, L) was converted to codes in different DGGS resolutions. These codes obtained the geographic coordinates of the center point and six vertices of the hexagonal cells where the codes are located. These cells are displayed on the map. In this experiment, the center location of Beijing (39.9177 • N, 116.4126 • E) was selected and transformed to codes from resolutions 5-12 in the OHQS. The results are shown in Table 7 and Figure 19.

Comparison of the efficiency of transformation between geographic coordinates and codes
In this experiment, 1000 codes are randomly selected from each resolution to calculate the geographic coordinates of the center point and six vertices of the hexagonal cells, and 10000 geographic coordinates are randomly chosen to obtain the codes in different resolutions. The transformation efficiency of the OHQS and HLQT from resolution 5-12 was tested ten times, and the unit is ms. A comparison of the efficiency of transformation between OHQS and HLQT is shown in Figure 20 and Table 8. The results show that the average efficiency of the OHQS from geographic coordinates to the code conversion algorithm is approximately 2 times that of the HLQT algorithm. As the resolutions increase, the efficiency of the OHQS remains basically the same. The conversion efficiency of the OHQS from codes to geographic coordinates is slightly higher than that of the HLQT, both of which are not greatly affected by the increase in resolution. The results are analyzed as follows: (1) The transformation efficiency in this paper is less affected by resolutions, and the transformation from coordinates to codes remains roughly unchanged because only one step in the algorithm involves the operation of the resolution, such as step 1 of algorithm 2 and step 5 of algorithm 3. (2) The HLQT uses the parallelogram rule to project the coordinates onto two specific coordinate axes. The corresponding component codes on the two axes are obtained, and the final code is calculated by the addition of component codes. The HLQT still implements the transformation by the addition operation. We directly realize the transformation from geographic coordinates to code through the mutual conversion of the IJK coordinate system, ij coordinate system and ijk coordinate system. A substantial increase in efficiency has been achieved. (3) In the transformation from codes to geographic coordinates, the OHQS and HLQT are similar in efficiency, but the first symbol of the HLQT affects some computation time. Therefore, the transformation efficiency of the OHQS from code to geographic coordinates is slightly higher than that of the HLQT.

Multisource data fusion in the DGGS for rainstorm influences evaluation
Analyzing environmental processes such as hurricanes, wildfires, and rainstorms is ill-equipped for multisource geospatial data. DGGSs have been developed as a new solution for multisource spatial data processing and analysis. In this case, we use spatial data modeling on DGGSs to realize the fusion and analysis of multisource spatial data for the influence evaluation of the 'Beijing 7.16' rainstorm disaster. Figure 21 shows the process flow.  Figure 21. Schematic of the process flow of multisource data fusion on DGGS for rainstorm influence evaluation The multisource data are derived from analyzing the influences of the 'Beijing 7.16' rainstorm disaster. We obtained and processed social media data related to rainstorm disasters and extracted structured traffic impact information. In addition, there are elevation data, road network data, and climate data. Road condition information and public emotional information are point vector data; road network data is polyline vector data; and elevation data and average precipitation data of Beijing are raster data. We sampled all data into the 14th resolution grids using the method of spatial data modeling on a discrete global grid system. Next, a weight value is assigned to each influencing factor, and the comprehensive impact value of rainstorm disasters is calculated for each hexagonal grid. The natural breakpoint method is applied to divide each grid into five levels of rainstorm impact-highest risk, higher risk, moderate risk, lower risk and lowest risk-and the results are displayed. We effectively assessed the spatiotemporal distribution of rainstorm impacts in urban of Beijing on 16 and 17 July 2018, as shown in Figure 22.

Conclusion
This study establishes the vector model of the aperture 4 hexagonal grid system, and designs the encoding scheme of an optimized hexagonal quadtree structure (OHQS) by translation transformation. Compared with the complex first symbol of the HLQT and the mixture of two types of encoding of the HQBS, the quadtree encoding in this paper is more concise. In the code addition and adjacent grid retrieval operations, in addition to constructing a set of addition rules based on induction, we also constructed a set of addition rules based on ijk coordinate transformation. The results show that the OHQS encoding operation is more efficient and that the encoding operation based on the ijk coordinate system is faster than the encoding operation based on the induction and addition table. We realize the transformation from geographic coordinates to codes through the mutual conversion of the IJK coordinate system, ij coordinate system and ijk coordinate system, which is twice as efficient as the transformation based on code addition. These findings can provide an alternative for other types of DGGS operations. In addition, we also employed OHQS to realize traditional spatial data modeling on the DGGS, which provided the basis for multisource data fusion and analysis, and realized the impact analysis of the 'Beijing 7.16' rainstorm disaster. This study provides solutions for the application of the DGGS to traditional spatial data.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Appendices
(1) For any n ≥ 0 and P i [ P n , there exists a unique d i [ 0, 1 2 i u, Proof.