Revealing intra-urban hierarchical spatial structure through representation learning by combining road network abstraction model and taxi trajectory data

ABSTRACT The unprecedented urbanization in China has dramatically changed the spatial structure of its cities. With the proliferation of individual-level geospatial big data, previous studies have widely used the network abstraction model to reveal the underlying urban spatial structure. However, the construction of network abstraction models primarily focuses on the topology of the road network and does not consider individual travel flows along the road network. Individual travel flows reflect urban dynamics, which can further help in understanding the underlying spatial structure. This study therefore aims to reveal the intra-urban hierarchical spatial structure by integrating the road network abstraction model and individual travel flows. To achieve this goal, we 1) quantify the spatial interaction relatedness of road segments based on the Word2Vec model using large volumes of taxi trip data, 2) characterize the road network abstraction model according to the identified spatial interaction relatedness, and 3) implement a community detection algorithm to reveal sub-regions of a city. Our results reveal three levels of hierarchical spatial structure in the Wuhan metropolitan area. This study provides a data-driven approach to the investigation of urban spatial structure by identifying traffic interaction patterns on the road network, offering insights for urban planning practice and transportation management.


Introduction
The unprecedented urbanization in China has increasingly attracted the attention of urban planners and geographers who seek to explore the intra-city structure through spatial interactions (e.g. daily commuting) (Barbosa et al. 2018; He et al. 2020; Liu et al. 2015; Martin and Schuurman 2020). Urban spatial structure is defined as the abstraction of the distribution of geographical space in cities (Chen et al. 2019; Foley 1964). A city acts as a dynamic and connected system with spatial interaction flows of people between sub-regions (e.g. places and parcels) (Batty 2009). The resources and infrastructure of a city are allocated according to people's travel demands and behaviours, resulting in the formation of complex and multidimensional city structures (Liu et al. 2015; Jusup et al. 2022). Meanwhile, the rapid development of urban spatial structure has significantly affected urban dynamics and the intra-city travel patterns of a city's residents (Burger, van der Knaap, and Wall 2014; van Meeteren et al. 2016; Wu, Smith, and Wang 2021). Over the years, research efforts have been made to explore the impacts of city structure on the dynamic urban system (Yue et al. 2014). Revealing the underlying spatial structure can yield insights into the organization and distribution of the urban system, which is important for urban morphology, transportation, and the economy (Dokuz 2022; He et al. 2020; Liang and Kang 2021; Luo and MacEachren 2014; Schläpfer et al. 2021; Zhang et al. 2017).
With the proliferation of individual-level mobility data within cities, numerous complex network approaches have been implemented to reveal the underlying spatial structures within a city (Batty 2013; Estrada 2012; Lee and Kang 2015; Zhang et al. 2014; Zhong et al. 2014). For example, aggregated analysis models an entire city as a spatially embedded graph based on individual travel behaviours (De Montis, Caschili, and Chessa 2013; Liu et al. 2015; Yu 2018). However, the regional units (e.g. administrative districts and grid cells) in aggregated analysis have been shown to be large and abstract, and detailed human movement information may be lost when interaction relations between fine-grained spatial units, such as urban roads, are not considered (Liu et al. 2017; Zhu et al. 2017). Recently, network abstraction models built from the urban road network have frequently been applied to urban functional zone identification and spatial structure analysis (Hong and Yao 2019; Rodrigue, Comtois, and Slack 2016). In the network abstraction model, road intersections are usually represented as nodes and road segments as edges. The construction of a network abstraction model, however, primarily depends on the topology of the road network and does not consider the spatial interaction derived from individual travel flows. Integrating individual travel flows into the road network abstraction model can take advantage of the fine-grained spatial flow information among urban roads and provide new insights for exploring the underlying urban spatial structure.
Therefore, we propose a framework to reveal the underlying spatial structure within a city by integrating the topology-based network abstraction model and individual travel flows derived from taxi trajectory data. The remainder of this paper is organized as follows. Section 2 reviews the related literature on both urban spatial structure and the road network abstraction model. Section 3 introduces the proposed framework and methods. Sections 4 and 5 present a case study in the city of Wuhan and discuss the results. Section 6 concludes this study and presents our vision for future work.

Revealing city structure based on the spatially embedded graph
The rapid growth of map services and mobile positioning technology has provided a massive amount of emerging geo-tagged data (e.g. taxi trajectory data and mobile phone record data) in the past decade. These data sources lay the foundation for analysing intra-urban flows to address many research issues, such as detecting human mobility patterns (Liu et al. 2012; Luo, Gao, and Cassels 2018; Siła-Nowicka et al. 2016; Zhang et al. 2018; Zhu et al. 2018), assessing traffic indicators of the urban road network (Brauer, Mäkinen, and Oksanen 2021; Cui et al. 2016; Hu et al. 2021), revealing multi-level city structure (Gao et al. 2013; Liu et al. 2015; Liu, Gao, and Lu 2019; Zhong et al. 2014), and understanding urban land use (Liu et al. 2012, 2016; Hu et al. 2021; Zhang et al. 2020). Meanwhile, the spatially embedded graph (also called spatially embedded network) has been widely used to detect city structure with such human mobility data. In a spatially embedded graph, each geographical region is a node and the travel flow between two regions, derived from the human mobility data, is an edge. A community detection method is then used to divide the entire graph into sub-graphs, namely communities or modules (Chen, Xu, and Xu 2015; Fortunato and Hric 2016; Hric, Darst, and Fortunato 2014). Communities are sets of nodes that have strong inner connectivity and are sparsely connected to nodes of other communities. For example, Gao et al. (2013) empirically discovered the clustering structures of spatial-interaction communities by constructing two intertwined embedded networks, namely the network of phone-call interactions and the network of phone users' movements. Liu et al. (2015) found a two-level polycentric community structure in Shanghai with a grid cell-based embedded graph using taxi trip data. Therefore, significant community structures that represent the underlying sub-regions with strong internal travel flows can be revealed via the spatially embedded graph (Hong and Yao 2019; Liu et al. 2015).

Network abstraction model based on urban road network
Recently, numerous in-depth studies have been conducted to reveal city structures via a network abstraction model based on the urban road network. A road network, an artificial corridor in urban areas, plays an important role in shaping a city's traffic and functional structure (Burghardt et al. 2021; Hong and Yao 2019; Zhu et al. 2017). Inspired by the spatially embedded graph, the network abstraction model regards the urban road network as an embedded graph, in which the significant road nodes (e.g. intersections) are represented as nodes and the road segments are represented as edges. Hong and Yao (2019) empirically found a hierarchical structure of communities using a network abstraction model based on the OpenStreetMap (OSM) road network.
The spatially embedded graph models an entire city to explore the underlying city structure using individual travel flows, but its regional units are large and abstract. The network abstraction model primarily focuses on the inner topology of the road network, but does not consider the role of human movements along the road network. Previous research has argued that an appropriate integration of urban spatial structure and human mobility patterns offers great potential for investigating urban morphology and transport geography (de Andrade et al. 2021; Guo et al. 2021; Liu et al. 2016; Xu et al. 2016). However, previous studies tend to employ individual travel behaviours and the road network abstraction model separately, and the integration of the two needs further study. This work, therefore, aims to reveal the intra-urban spatial structure by combining the road network abstraction model and individual travel flows.

A critical analysis
In sum, many efforts have been made to reveal the urban spatial structure, but they have the following shortcomings:
• Studies that explore the underlying urban spatial structure using individual travel flows on spatially embedded graphs aggregate the flows to areal units, which may suffer from the modifiable areal unit problem (MAUP).
• The network abstraction model primarily focuses on the inner topology of the road network, but does not consider the role of human movements along the road network. Previous studies tend to employ individual travel behaviours and the road network abstraction model separately.
This research, therefore, aims to reveal the intra-urban spatial structure by combining the road network abstraction model and individual travel flows. We propose an integrated framework to reveal the underlying spatial structure within a city based on individual travel flows derived from taxi trajectory data. The contributions of this work are as follows:
• We propose an integrated framework for sensing the underlying hierarchical urban spatial structure, which benefits from both the network abstraction model and detailed human movement information at a fine-grained scale.
• We investigate the integration of the quantified spatial interaction relatedness of road segments into the urban road network to enrich urban road network-based GIS research.

Methodology
In this study, we propose a framework to explore hierarchical urban spatial structure using the urban road network and taxi trip data (Figure 1). First, we used a representation learning method, Word2Vec, to quantitatively measure the spatial interaction relatedness of road segments using taxi trajectory data. Then, we constructed a road network abstraction model based on the OSM road network and weighted the network using the quantified interaction relatedness. Finally, we employed the Infomap community detection method to reveal significant community patterns that represent the sub-regional patterns of a city. In addition, we used point-of-interest (POI) data to evaluate the spatial distribution of the detected regional structure of the city.

Measurement of spatial interaction relatedness of road segments based on the Word2Vec model
In this section, we introduced the Word2Vec representation learning model to quantify the spatial interaction relatedness of road segments on the urban road network using GPS-enabled taxi trajectory data.

Representing taxi trajectories by road nodes on road network
Taxis operate along the urban road network, and taxi trip routes contain valuable information about people's movement and traffic flow. Here, we present a map-matching-based approach that represents taxi trajectories along a road network using the road node as the spatial assembly and analytic unit.
In this study, a road node is defined as a point that represents connectivity between two road segments, including the starting point, the ending point, and the intersection of road segments. Specifically, we started by mapping taxi GPS records to urban roads via a fast map matching algorithm (Yang and Gidófalvi 2018), and we then represented each route as the sequence of consecutive matched road nodes (Figure 2).

Building traffic analytic semantic corpus
In recent years, geo-semantic analysis frameworks derived from the natural language processing (NLP) field have shown promise as tools for exploiting spatial relationships in urban areas based on big data (Cai 2021; Crivellari and Ristea 2021; Liu, Gao, and Lu 2019; Luo et al. 2019; Yao et al. 2017).
By analogizing traffic elements to NLP terms, geo-semantic analysis builds high-dimensional embedding vectors that quantitatively describe the traffic components and expose latent information in geographical data (Bengio, Courville, and Vincent 2013; Hu et al. 2021). Given a large semantic corpus, researchers can effectively train a language-based representation learning model and evaluate semantic representations or relationships in geo-semantic analysis.
In this study, based on the assumption that traffic interaction reflects the travel activities of urban residents and is intimately connected with the urban spatial structure, the entire study area was represented as a corpus for traffic analysis. In this corpus, taxi movement trajectories were analogized to NLP documents, while road nodes on the road network were analogized to the words in those documents. The objective of building the corpus is to capture fine-scale traffic interaction patterns along the urban road network and measure the spatial co-occurrence relationships between traffic nodes. A higher degree of traffic relatedness between two intersection nodes indicates that they are more likely to co-occur in the taxi movement trajectories on the road network.

Training word embedding representation model
Word embedding, which originates from the NLP domain, is one of the most popular representation learning techniques for representing the vocabulary of a document. For machine learning tasks, words must be represented meaningfully, which requires a numerical representation. Word embedding techniques address this problem, with algorithms such as Word2Vec enabling words to be expressed mathematically (Cai 2021; Liang and Kang 2021). The core principle behind word embedding techniques is that each word is represented by a D-dimensional real-valued vector, which reflects the multidimensional semantics of the word. The distance between two vectors quantifies their semantic similarity and relatedness (Mai et al. 2022). The word embedding technique represents each distinct word with a numerical vector learned by a self-supervised neural network. It captures the context of vocabulary words in a document and measures their semantic similarity or relatedness to other words.
Word2Vec is an open-source, state-of-the-art tool for word embedding proposed by Mikolov et al. (2013) that is easy to operate and highly scalable. Word2Vec trains a two-layer neural network that reconstructs the linguistic contexts of words to convert each word to a unique vector. In this study, the skip-gram-based Word2Vec model (Levy and Goldberg 2014) was introduced to obtain the numerical representation of intersections (as nodes) along the road network. Given a sequence of road nodes (a trajectory), the goal of the skip-gram-based Word2Vec model is to predict the context road nodes of a given target node within a sliding window. The objective is to maximize the average log-likelihood:
$$\frac{1}{n}\sum_{j=1}^{n}\ \sum_{-w \le k \le w,\ k \ne 0} \log p\left(v_{j+k} \mid v_j\right)$$
where $w$ is the training context window size, $n$ is the number of road nodes, and $v_{j+k}$ denotes a context road node of the target node $v_j$. The conditional probability can be estimated as:
$$p\left(v_o \mid v_t\right) = \frac{\exp\left(x_{v_o}^{\top} y_{v_t}\right)}{\sum_{c=1}^{C} \exp\left(x_{v_c}^{\top} y_{v_t}\right)}$$
where $x_{v_o}$ and $y_{v_t}$ are the context and target vector representations of road nodes and $C$ is the size of the semantic corpus.
We trained the Word2Vec model on the traffic analytic semantic corpus built from taxi trajectory data, reconstructed the spatial contexts of traffic elements, and finally obtained a real-valued vector representation for each road node.
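To make the sliding-window idea concrete, the following minimal sketch enumerates the (target, context) road-node pairs on which a skip-gram model would be trained. The trajectories, the node identifiers and the helper name `skipgram_pairs` are hypothetical and only serve as an illustration; the sketch shows how training samples arise from a trajectory, not the neural network or the tool used in this study.

```python
from collections import Counter


def skipgram_pairs(trajectory, window):
    """Yield (target, context) pairs for one trajectory (a list of road-node ids)."""
    for j, target in enumerate(trajectory):
        lo = max(0, j - window)                    # left edge of the sliding window
        hi = min(len(trajectory), j + window + 1)  # right edge (exclusive)
        for k in range(lo, hi):
            if k != j:                             # a node is not its own context
                yield target, trajectory[k]


# Hypothetical corpus: each "document" is one map-matched taxi trajectory.
corpus = [
    ["n1", "n2", "n3", "n4"],
    ["n2", "n3", "n5"],
]

# Count how often each (target, context) pair occurs across the corpus.
pair_counts = Counter(
    pair for trajectory in corpus for pair in skipgram_pairs(trajectory, window=1)
)
```

Pairs that occur frequently, such as nodes that follow each other on many trips, are the ones whose embedding vectors the model pulls closer together.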

Measuring spatial interaction relatedness
The urban road network carries the majority of human movement, and vehicle routes provide valuable information about human activity and traffic flow in the network. The relatedness between two road nodes (Liu et al. 2017), measured here by cosine similarity, is used to quantitatively investigate the inherent spatial interactions on a road network and reveal the intra-urban spatial structure. If two nodes in the road network have a strong traffic relatedness, they are more likely to occur in close proximity along taxi trajectories. A cosine similarity measurement is employed to quantify the traffic relatedness of the road nodes based on their embedding vectors, denoted as (Xu et al. 2021):
$$\mathrm{sim}\left(w_i, w_j\right) = \frac{w_i \cdot w_j}{\lVert w_i \rVert \, \lVert w_j \rVert}$$
where $\mathrm{sim}(w_i, w_j)$ is the cosine similarity between embedding vectors $w_i$ and $w_j$ and ranges between 0 and 1. A value of $\mathrm{sim}(w_i, w_j)$ approaching 1 indicates that road node $i$ is strongly correlated with road node $j$. It is worth noting that the obtained similarity serves as a topological measurement of urban roads and quantifies the functional relations of road interactions (Hong and Yao 2019).
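The relatedness measure can be sketched in a few lines. The embedding values and node identifiers below are made up for illustration, and `cosine_similarity` is a hypothetical helper rather than the implementation used in this study.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical 3-dimensional embeddings of three road nodes.
embeddings = {
    "n1": [1.0, 0.0, 1.0],
    "n2": [2.0, 0.0, 2.0],  # same direction as n1, so strongly related
    "n3": [0.0, 1.0, 0.0],  # orthogonal to n1, so unrelated
}

sim_12 = cosine_similarity(embeddings["n1"], embeddings["n2"])
sim_13 = cosine_similarity(embeddings["n1"], embeddings["n3"])
```

For arbitrary vectors the cosine lies in [-1, 1], so any negative values would need to be handled (for example clipped or rescaled) before the similarity is used as a non-negative edge weight.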

Road network abstraction model
Road network architectures in urban settings have been studied by analogy between urban road networks and graphs in the complex network discipline (Gao et al. 2013; Liu et al. 2015; Yao et al. 2021). With the emergence of OSM, researchers are better able to explore urban road networks and assess urban spatial interaction patterns. In this study, the road network abstraction model was constructed using OSM road data. Specifically, the road network was represented as a weighted directed graph $G(V, E, W)$, where each road node is a graph vertex in $V$, each road linking two adjacent nodes is an edge in $E$, the driving direction of the road gives the edge direction, and the spatial interaction relatedness of the road segment gives the edge weight in $W$. The construction of the road network abstraction model was implemented with the open-source tool OSMnx (Boeing 2017).
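As an illustration of the weighted directed graph described above, the sketch below builds a plain adjacency map from directed road segments and attaches a relatedness weight to each edge. It deliberately avoids OSMnx so that it stays self-contained; the segment list, the weights and the helper name `build_weighted_digraph` are hypothetical.

```python
def build_weighted_digraph(segments, relatedness):
    """Build an adjacency map {u: {v: weight}} from directed road segments.

    segments    -- iterable of (from_node, to_node) pairs in driving direction
    relatedness -- dict mapping (from_node, to_node) to a similarity weight
    """
    graph = {}
    for u, v in segments:
        graph.setdefault(u, {})[v] = relatedness[(u, v)]
        graph.setdefault(v, {})  # make sure nodes without outgoing edges are vertices too
    return graph


# Hypothetical data: a two-way road n1<->n2 and a one-way road n2->n3.
segments = [("n1", "n2"), ("n2", "n1"), ("n2", "n3")]
relatedness = {("n1", "n2"): 0.91, ("n2", "n1"): 0.78, ("n2", "n3"): 0.64}

graph = build_weighted_digraph(segments, relatedness)
```

Storing the two directions of a two-way road as separate edges allows them to carry different weights.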

Community detection using the Infomap algorithm
The community detection method attempts to divide graph vertices into a few subsets based on their interaction patterns. The resulting sets of vertices, commonly called communities or modules, are strongly linked internally and sparsely connected to the rest of the graph (Xie, Kelley, and Szymanski 2013). Based on the spatial interaction relatedness and topological connectivity of road nodes, the Infomap community detection algorithm is employed to reveal significant community patterns that represent the sub-regional patterns of a city. Among community detection algorithms, Infomap has emerged as a popular method in a wide range of applications owing to its strong performance and high availability (Liu et al. 2015; Yao et al. 2021).
The Infomap algorithm is an approach to network partitioning built on the minimum entropy principle and a random walk strategy. Given a graph, a network partition is a specific division of the nodes into modules, chosen according to an objective function. Infomap optimizes the objective function known as the map equation (Edler, Bohlin, and Rosvall 2017). Infomap uses Huffman coding (Huffman 1952) to describe each node in the network with a two-level description and minimizes the description length of a random walker's movements over all possible partitions of the network. Thus, the partition with the shortest description length best describes the community structure of the network in terms of network dynamics. More details about the Infomap algorithm can be found in Rosvall and Bergstrom (2008).
In this study, the Infomap algorithm was used to partition the urban road network into multiple layers. To detect hierarchical communities, the Infomap algorithm starts by partitioning the urban road network into modules (or communities) based on the flow of random walks. Each module is then treated as a new network, and the process is repeated recursively until no further meaningful partitions can be made. The resulting hierarchy of modules represents a nested set of communities at different scales or resolutions. The input of the Infomap algorithm is the weighted directed graph $G(V, E, W)$, and the output is the graph $G$ divided into hierarchical modules/subgroups.
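The description-length idea can be illustrated by evaluating the two-level map equation for a given partition. The sketch below is a simplification under stated assumptions: it treats the graph as undirected and weighted, so that a node's visit rate is simply its share of the total edge weight, whereas a directed graph such as the one used in this study requires visit rates estimated from the random walk itself. The toy graph and the function names are hypothetical, and the sketch only scores partitions; it does not search for the best one.

```python
import math


def plogp(x):
    """Return x * log2(x), with the convention 0 * log2(0) = 0."""
    return x * math.log2(x) if x > 0 else 0.0


def map_equation(edges, module_of):
    """Two-level description length (bits per step) of a partition.

    edges     -- list of (u, v, weight) for an undirected graph without self-loops
    module_of -- dict mapping each node to its module label
    """
    total = 2.0 * sum(w for _, _, w in edges)
    visit = {}      # visit rate of each node
    exit_rate = {}  # rate at which the walker leaves each module
    for u, v, w in edges:
        visit[u] = visit.get(u, 0.0) + w / total
        visit[v] = visit.get(v, 0.0) + w / total
        if module_of[u] != module_of[v]:
            for m in (module_of[u], module_of[v]):
                exit_rate[m] = exit_rate.get(m, 0.0) + w / total
    within = {}     # total visit rate of each module
    for node, p in visit.items():
        within[module_of[node]] = within.get(module_of[node], 0.0) + p
    q_total = sum(exit_rate.values())
    return (plogp(q_total)
            - 2.0 * sum(plogp(q) for q in exit_rate.values())
            - sum(plogp(p) for p in visit.values())
            + sum(plogp(exit_rate.get(m, 0.0) + p) for m, p in within.items()))


# Hypothetical toy network: two triangles joined by a single bridge edge.
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
         (2, 3, 1.0)]

one_module = map_equation(edges, {n: "A" for n in range(6)})
two_modules = map_equation(edges, {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"})
```

On this toy network the partition into the two triangles yields a shorter description than keeping all nodes in one module, which is the sense in which the shortest description identifies the community structure.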

Mixed land use indices
Inspired by Yue et al. (2017), three mixed-use indices, namely the Margalef Species Richness index, Shannon's entropy index, and Simpson's index of diversity, were introduced to measure the extent of mix/evenness in the distribution of land use types. The Margalef Species Richness index (Gotelli and Colwell 2011), hereafter the Richness index, is a species diversity index from the ecology domain and is used here to estimate the degree of diversity in the land use and POI context.
The Richness index for a specific region $i$ is defined as $R_i$:
$$R_i = \frac{S - 1}{\ln n_i}$$
where $S$ is the number of POI types and $n_i$ is the number of POIs within region $i$. Note that a lower value of $R_i$ indicates less diversity of POI types, suggesting a purer urban function of the division.
Shannon's entropy index (Brown et al. 2009), hereafter the Shannon index, reflects the degree of orderliness of the distribution of POI types and is defined as $H_i$:
$$H_i = -\sum_{j=1}^{S} p_j \ln p_j$$
where $p_j$ is the proportion of POIs of type $j$ in the total number of POIs of region $i$. Note that a lower value of $H_i$ indicates greater orderliness and less randomness in the distribution of POIs.
Simpson's index of diversity (Hunter and Gaston 1988; Simpson 1949) measures the concentration of the distribution of POIs and is defined as $D_i$:
$$D_i = \frac{\sum_{j=1}^{S} m_j \left(m_j - 1\right)}{n_i \left(n_i - 1\right)}$$
where $m_j$ is the total number of POIs of a particular type $j$ within region $i$ and $n_i$ is the total number of POIs within region $i$. Note that a higher value of $D_i$ indicates a greater degree of concentration in region $i$. The above three indices evaluate mixed land use in the POI context from different aspects, and can therefore be used to verify the advantage of our proposed method in delineating urban functional areas.
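The three indices can be computed from the list of POI type labels of one region, as in the minimal sketch below. It assumes commonly used forms of the indices: Margalef richness (S - 1) / ln n, Shannon entropy with natural logarithms, and Simpson's concentration form sum of m(m - 1) over n(n - 1). It also assumes that S counts the POI types present in the region. The function name and sample labels are hypothetical.

```python
import math
from collections import Counter


def mixed_use_indices(poi_types):
    """Return (richness, shannon, simpson) for one region's POI type labels."""
    counts = Counter(poi_types)
    n = sum(counts.values())  # total number of POIs in the region
    s = len(counts)           # number of POI types present (assumption)
    if n < 2:
        raise ValueError("at least two POIs are needed for these indices")
    richness = (s - 1) / math.log(n)
    shannon = -sum((m / n) * math.log(m / n) for m in counts.values())
    simpson = sum(m * (m - 1) for m in counts.values()) / (n * (n - 1))
    return richness, shannon, simpson


# Hypothetical regions: one functionally pure, one evenly mixed.
pure = mixed_use_indices(["school", "school", "school", "school"])
mixed = mixed_use_indices(["school", "mall", "factory", "station"])
```

Under these definitions the pure region has zero richness and entropy and a concentration of 1, while the evenly mixed region has higher richness and entropy and a concentration of 0.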

Study area and data collection
A case study was carried out in the city of Wuhan, China, which is known for its complex urban morphology and high rates of mixed land use. Because of Wuhan's rapid urbanization, new requirements for exploring and planning the urban spatial structure have emerged in recent years. The study area was selected from the downtown area of Wuhan, covering an area of 748.39 km² (Figure 3). The study area is characterized by a wide variety of growth and population density levels, and its landscapes are exceedingly diverse.
In this work, we used three kinds of geographic data, namely the urban road network, taxi GPS trajectories, and POI data, to conduct the experiments. Note that the coordinate system of all geospatial data was unified to the WGS geographic coordinate system.
• Urban road network. The primary road network data were obtained from OSM in January 2018. OSM is an open-source map service that gives users free and easy-to-access digital map data, and it is now the most popular and successful volunteered geographic information provider (Xu et al. 2019). The road data contain essential attribute information, including road name, road type, coordinate location (longitude and latitude) of road nodes, and topological connection information. The positional and topological accuracy of the road data in the study area is very high. After pre-processing procedures such as simplification and topological checking, 22,115 road segments and 14,715 nodes were extracted (Figure 4).

Embedding representation and spatial interaction relatedness of road nodes
The road network in the study area was abstracted as a weighted directed graph. Taking the road network graph as input, a skip-gram-based Word2Vec model was trained to obtain embedding representations. In view of the size of the graph and the computational costs, most parameters were set to recommended or default values. Specifically, for model training, the dimension of the road node representation was set to 128, the number of training epochs was set to 10, and the window size was set to 10. The Word2Vec model was trained using the representation learning tool gensim in Python (Rehurek and Sojka 2010).
Cosine similarity was then used to obtain the relatedness between the two road nodes of each road segment from their embedding vectors. Figure 6 shows the relationship between the average similarity and the distance between pairs of node embeddings. For each road node, we randomly selected 10% of all road nodes to calculate the average similarity and road distance between two nodes. We found that the topological correlation generally follows Tobler's first law of geography (Tobler 1970): near nodes are more related than distant nodes. Meanwhile, Figure 7 shows an example of the spatial distribution of the similarity of road nodes in a local area. The red node denotes the selected centre node. The deeper the blue colour, the higher the similarity between the centre node and its neighbour, implying a larger topological correlation. Nodes on roads with the same direction are more related than nodes on roads in the opposite direction, which indicates that the similarity of road nodes can be influenced by traffic interactions.

Hierarchical urban spatial structure
Based on spatial interaction relatedness and topological connectivity, the Infomap algorithm was employed to implement multilayered detection of communities, and the hierarchical urban spatial structure was then explored. Table 1 shows the statistical results of the three-level communities. At the top level, the entire urban road network in the study area is divided into three aggregated sub-regions. As depicted in Figure 8a, the division result is highly consistent with the historical urban spatial pattern, namely the Three Towns of Wuhan, which formed in the last century before the administrative regionalization reform in China (Wuhan 2021). The Three Towns of Wuhan are Wuchang, Hankou, and Hanyang, which were three independent towns located in three disjoint parts of the downtown area of Wuhan. Each town has its own development characteristics and patterns, with its own specialities in economy, culture and industry. This result indicates that the spatial structure of the Three Towns of Wuhan still has far-reaching effects on the contemporary urban spatial structure.
As shown in Figure 8b, more detailed structural information is detected in the second-level communities. The study area is divided into 22 communities or sub-regions with an average edge weight of 0.859. Most studies demonstrate the sub-regional structure of a city by referring to its administrative divisions (De Montis, Caschili, and Chessa 2013; Liu et al. 2015; Ratti et al. 2010). In the study area, most sub-regions are inconsistent with the district-level administrative division, indicating that the connectivity between districts differs from the underlying urban spatial structure formed by intra-city interactions. For example, Jianghan district, located in the downtown area of Wuhan, is mainly divided into three different regional areas, corresponding to three different functional areas (the Hankou railway station transportation area, the Wanda plaza commercial area, and the Jianghan Road shopping area) (as shown in Figure 8d). These three areas with different functions have distinct spatial interactions with each other. The administrative division structure is a static artefact that is partially arbitrary. Cities function as dynamic systems in which traffic flows play a significant role in connecting the discrete resources of a city into an integrated system. Our results support the same idea that spatial interaction ties together the sub-regions within a city.
To further validate our results and understand the spatial interaction patterns between the second-level communities, a chord diagram was used to visualize the actual traffic flows using the origins and destinations of taxi trips. As depicted in Figure 9, the frequency of intra-flows is greater than that of inter-flows between sub-regions, indicating that most taxi movements happen within the same sub-region. This result shows that the proposed method can effectively reveal the regional urban spatial structure and identify intra-city spatial interaction patterns. Figure 8c maps a more elaborate regional division pattern with 127 communities and an average edge weight of 0.864. These results reveal a fine-scale urban spatial structure in which each sub-region can reflect a specific urban functional area. We employed Thiessen polygons to estimate the influence area of each road node and determine the coverage area of each division. To further explore the characteristics of the functional areas, the term frequency-inverse document frequency (TF-IDF) method was used to identify urban functions from POI types. By weighting the POI types within each sub-region, the TF-IDF method stresses the POI types that are distinctive of a sub-region and decreases the weight of common POI types, thereby identifying the functional characteristics of each sub-region. A similar approach has been used in previous works (Gao, Janowicz, and Couclelis 2017; Hu et al. 2021; Liu et al. 2020); a detailed description of the TF-IDF algorithm can be found in Beel et al. (2016).
Figure 10 shows the spatial distribution of urban functional areas, and Table 2 shows the function descriptions. Cluster 1 mainly includes educational areas, such as universities. Cluster 2 consists of various kinds of comprehensive marketplaces. Cluster 3 includes most of the industrial areas located along the 3rd Ring Road. Cluster 4 includes business and commercial areas located near the Wuhan Central Business District (CBD). Cluster 5 consists of heavily mixed areas of residential blocks and commercial areas. Cluster 6 is dominated by transportation areas, together with a few business and industrial areas.
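The TF-IDF weighting of POI types can be sketched as follows, treating each sub-region as a document and each POI type as a term. This is a minimal sketch of one common TF-IDF variant (relative frequency multiplied by the logarithm of the inverse document frequency); the exact variant used in this study is not specified here, and the region names, POI labels and function name are hypothetical.

```python
import math
from collections import Counter


def tfidf_by_region(region_pois):
    """Return {region: {poi_type: tf-idf score}} for a dict of POI label lists."""
    n_regions = len(region_pois)
    doc_freq = Counter()  # number of regions in which each POI type appears
    for pois in region_pois.values():
        doc_freq.update(set(pois))
    scores = {}
    for region, pois in region_pois.items():
        counts = Counter(pois)
        total = sum(counts.values())
        scores[region] = {
            poi_type: (count / total) * math.log(n_regions / doc_freq[poi_type])
            for poi_type, count in counts.items()
        }
    return scores


# Hypothetical sub-regions and their POI type labels.
region_pois = {
    "A": ["university", "university", "restaurant"],
    "B": ["factory", "restaurant"],
    "C": ["mall", "mall", "restaurant"],
}

scores = tfidf_by_region(region_pois)
dominant = {region: max(s, key=s.get) for region, s in scores.items()}
```

A POI type that appears in every sub-region, such as restaurants in this toy data, receives a score of zero, so the dominant type of each sub-region is the one that distinguishes it from the others.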

Result verification
To evaluate the effectiveness of our proposed method, we implement two comparison experiments with different weights of each road segment within the urban road graph: • O-Infomap: Original Infomap method with all weight of edges set to 1; • D-Infomap: Distance-weighted Infomap method with weight set as Euclidean length of each road segment; • Our proposed method: Relatedness-weighted Infomap method with weight set as spatial interaction relatedness between traffic nodes.
Three hierarchical spatial division patterns were detected using the comparison experiments in a quantitative way.As suggested by Brown and Holmes (1971) and Noronha and Goodchild (1992), one classic approach to measure the performance of delineating urban functional regions is maximizing the interactions within the same region while minimizing the interactions between different regions.Inspired by Liu et al. (2019), we calculated the percentage of average frequency (PAF) of taxi flows between the second-level divisions within the same division over 7 days.A greater value of PAF suggests that people are more inclined to travel within the same division or sub-region, indicating a more significant spatial division and better capability of identifying spatial interaction patterns.As shown in Table 3 the values of PAF are higher than that of the other methods on most days, proving that our proposed method performs significantly better than other methods in identifying spatial interaction patterns.Moreover, we calculated multiply indices for mixed land use of the third-level divisions using POI types.We hypothesize that the mixing degree of urban functions in a specific sub-region reflects regional urban functional structure, thus verifying the validity of the division results.A lower mixing degree of urban functions suggests a more pronounced spatial division and is better capable of delineating urban functional areas.
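The PAF measure can be approximated as in the sketch below. Since its exact definition follows Liu et al. (2019) and is not reproduced in the text, the sketch assumes that PAF is the daily percentage of taxi trips whose origin and destination fall in the same division, averaged over the observed days. The trip records, division labels and function names are hypothetical.

```python
def intra_division_share(trips, division_of):
    """Fraction of trips whose origin and destination share a division."""
    if not trips:
        raise ValueError("at least one trip is required")
    same = sum(1 for origin, dest in trips if division_of[origin] == division_of[dest])
    return same / len(trips)


def paf(daily_trips, division_of):
    """Average daily percentage of intra-division trips (assumed PAF definition)."""
    shares = [intra_division_share(trips, division_of) for trips in daily_trips]
    return 100.0 * sum(shares) / len(shares)


# Hypothetical division labels of road nodes and two days of (origin, destination) trips.
division_of = {"a": 1, "b": 1, "c": 2, "d": 2}
daily_trips = [
    [("a", "b"), ("a", "c")],  # day 1: one of two trips stays within a division
    [("a", "b"), ("c", "d")],  # day 2: both trips stay within a division
]

paf_value = paf(daily_trips, division_of)
```

Evaluating the same trips against the divisions produced by each method then allows the methods to be compared on equal terms.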
Table 4 shows summary statistics of the three mixed-use indices for the different methods. Our proposed method achieves good results, with a lower average Richness index and a lower average Shannon index, indicating its advantage in preserving the purity of urban functions and the orderliness of the distribution of land-use types. However, Simpson's index of diversity shows no significant difference compared with the other methods. In summary, the quantitative comparison results from two different perspectives indicate the effectiveness of our proposed approach in revealing the hierarchical spatial structure of a city.
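As an illustration of the three mixed-use indices, the sketch below computes them from a list of POI type labels within one division. It uses the textbook definitions (richness as the number of distinct types, Shannon entropy with the natural logarithm, and Simpson's index of diversity as one minus the sum of squared shares); the study may use different variants, so these choices and the function name are assumptions.

```python
import math
from collections import Counter


def mixed_use_indices(poi_types):
    """Return (richness, shannon, simpson) for the POI types in one division."""
    counts = Counter(poi_types)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("division contains no POIs")
    shares = [c / n for c in counts.values()]
    richness = len(counts)                            # number of distinct types
    shannon = -sum(p * math.log(p) for p in shares)   # natural-log entropy
    simpson = 1.0 - sum(p * p for p in shares)        # index of diversity
    return richness, shannon, simpson


# A single-function division versus an evenly mixed one (hypothetical labels).
print(mixed_use_indices(["school"] * 4))
print(mixed_use_indices(["school", "shop", "school", "shop"]))
```

All three indices are lowest for a division with a single POI type, which is why lower averages are read as purer urban functions.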

Hierarchical urban spatial structure
Urban spatial structure is closely related to the urban travel patterns generated by the daily lives of urban dwellers. The distribution of various urban elements (or infrastructure) significantly affects people's travel, and their movement flows connect discrete parcels or areas into an integrated system. Through spatial interaction analysis, urban parcels or areas, such as grids, communities, or traffic analysis zones, can be joined according to the similarity of their functionality to obtain spatial interactions among aggregated regions (i.e. urban functional regions). By aggregating over multiple layers at different scales, a hierarchical urban spatial structure can be revealed (Hong and Yao 2019; Wu, Smith, and Wang 2021). For example, Liu et al. (2015) revealed a two-level hierarchical urban spatial structure by dividing [...] underlying spatial semantics of urban functional regions. Moreover, different hierarchical levels of urban regions associated with their functions could be dynamically drawn on a digital map (e.g. a tourist map or an urban planning map). Such a hierarchical semantic map could be beneficial for various map users (Gao 2017). For city planners, it is critical to answer questions such as 'where are the most dynamic regions in the city?' and 'where are the neighbourhoods that are changing the most?'. Moreover, the hierarchical urban spatial structure could reveal the regions that attract a lot of attention from the general public in an urban area. As a result, when limited resources are available for urban planning initiatives, these regions may be given a higher priority.

The spatial interaction relatedness and human mobility patterns within a city
There has been continued and sustained interest in extracting hidden information from human mobility patterns in urban studies. Previous studies generally extract temporal or spatial statistical characteristics (e.g. frequency statistics of the origins and destinations of individual-level trips) to determine an urban region within a city. Origins and destinations represent the travel purposes of people, but the intermediate trajectory information between origins and destinations is rarely used (Hu et al. 2021; Zheng et al. 2014; Zhu et al. 2017). Inspired by Hu et al. (2021), we introduced an analogy from urban elements to the NLP field, in which the urban area is regarded as a corpus, a travel trip as a sentence, and a road node as a word. A Word2Vec model was then employed to measure the spatial interaction relatedness between road nodes.
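To make the corpus analogy concrete, the sketch below treats each trip as a sentence of road-node identifiers and generates the (centre, context) training pairs that a skip-gram style Word2Vec model would consume. This is a minimal illustration: the skip-gram variant, the window size, and the node identifiers are assumptions, and no embedding is actually trained here.

```python
def skipgram_pairs(sentence, window):
    """Return (centre, context) pairs for one trip.

    sentence -- ordered list of road-node identifiers visited by the trip
    window   -- number of neighbouring nodes on each side used as context
    """
    pairs = []
    for i, centre in enumerate(sentence):
        lo = max(0, i - window)
        hi = min(len(sentence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((centre, sentence[j]))
    return pairs


# The "corpus" is the collection of all trips (hypothetical node identifiers).
corpus = [
    ["n1", "n2", "n3", "n4"],
    ["n2", "n3", "n5"],
]

training_pairs = [pair for trip in corpus for pair in skipgram_pairs(trip, window=1)]
print(len(training_pairs), training_pairs[:4])
```

Nodes that are repeatedly traversed together appear in many shared pairs, which is what drives their embedding vectors towards each other during training.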
Embracing the analogy from GIS elements to the NLP field, considerable research in recent years has demonstrated the advantages of word embedding technologies in GIScience from different perspectives. For example, Crivellari and Ristea (2021) first introduced the CrimeVec approach, based on word embedding technology, to understand the criminology of urban places. A Word2Vec model was used to create dense vectors of crime types based on their spatial-temporal distribution. Li et al. (2019) proposed a regionalization method for clustering and partitioning based on semantic trajectories extracted from call detail records (CDRs) of mobile phone signalling data. Zhang et al. (2020) proposed a Traj2Vec model to quantify trip trajectories as high-dimensional semantic vectors using mobile phone positioning data. They showed that cell towers with similar vectors are spatially closer to one another. Attempts have also been made to measure traffic interactions (Liu et al. 2017; Wu et al. 2020; Xu et al. 2020), delineate urban functional use (Huang et al. 2022; Niu and Silva 2021; Sun et al. 2021), and so on using word embedding models (Mai et al. 2022).
The spatial interaction relatedness has a unique advantage in extracting the underlying human mobility patterns. Compared with traditional origin and destination information, the spatial interaction relatedness extracts more refined characteristics by integrating the fixed urban road network structure with dynamic human movement trips. In this regard, the spatial interaction relatedness could be used to reveal movement flows and traffic states. For example, how do upstream or downstream road traffic flows affect neighbouring roads? We could employ the embedding representation and a cosine-based similarity measurement to quantify the impact.
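A cosine-based measurement of this kind can be sketched as follows. Given embedding vectors for road nodes, the code ranks candidate neighbouring nodes by their cosine similarity to a node of interest. The two-dimensional vectors, node identifiers, and function names are hypothetical; real embeddings would have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_related(node, embeddings, candidates):
    """Sort candidate nodes by decreasing similarity to `node`."""
    scores = [
        (other, cosine_similarity(embeddings[node], embeddings[other]))
        for other in candidates
        if other != node
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy embeddings (hypothetical values).
embeddings = {
    "n1": [1.0, 0.0],
    "n2": [2.0, 0.0],
    "n3": [0.0, 1.0],
    "n4": [-1.0, 0.0],
}
print(rank_related("n1", embeddings, ["n2", "n3", "n4"]))
```

Because cosine similarity ignores vector length, two nodes are judged related by the direction of their embeddings, that is, by the contexts they share, rather than by how often they occur.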

Conclusions
Revealing the underlying urban spatial structure from intra-urban spatial interaction patterns is one of the central themes for transport geographers. In this study, we employed the Word2Vec representation learning model to quantify fine-scale interaction relations of road segments in a road network using taxi trajectory data. We then utilized the Infomap community detection method to reveal significant sub-regional patterns in the Wuhan metropolitan area. By integrating a large volume of real taxi trajectories into the urban road network, we found a three-level hierarchical structure that indicates the division of urban space. Urban space is divided into three aggregated sub-regions at the top level, which shows great consistency with a previously known urban spatial pattern, the Three Towns of Wuhan. The second level reflects sub-regions with more intra-flow than inter-flow connections. The third level reveals a fine-scale spatial distribution of urban functions. We further assessed the advantage of our proposed method via two comparison experiments. Our work considers the entire city as a connected system using human travel flows on the urban road network. Our method and results can provide traffic managers and urban planners with a better understanding of traffic patterns on the road network and the regional structure of a city. Furthermore, we regard this research as a first step towards detecting spatial interaction communities based on trajectory data and the road network abstraction model. Further research can be conducted to improve our proposed framework and compare it with state-of-the-art methods if more detailed urban geographical data and open data sources become available. Future work is also anticipated to collect multimodal travel data (e.g. subway, biking, and bus trips) to support a more comprehensive investigation of spatial interaction in cities.

Figure 1. The flowchart of the proposed representation learning framework for extracting hierarchical intra-urban spatial structure.

Figure 2. An example of mapping a raw taxi route to an intersection node sequence along the road network.
• GPS-enabled taxi trajectories: As an important form of transportation, GPS-enabled taxis are not constrained by fixed routes or schedules and provide flexible, wide-ranging trajectory data in urban regions with high accuracy and fewer privacy concerns (Cui, Wu, and He 2022; Zheng et al. 2011). The taxi GPS records used in this study were collected from May 9 to May 15, 2015 in Wuhan and contain the movement tracks of more than 10,000 taxis. The original taxi trajectory dataset is essentially a collection of GPS track points. Each point includes basic driving data such as taxi number, timestamp, coordinate location, speed, direction, and status (vacant or occupied). The sampling interval of the GPS track is about 50 s. Using the occupied-status attribute, passenger-carrying trajectories are retrieved from the original GPS track points. A consecutive taxi movement route is composed of the pick-up location, the drop-off location, and several intermediate GPS records.
• POIs: The POI data was collected from an open-source data platform, Peking University Research Data (State Information Center 2017). This data covers the period from July 1, 2017 to September 30, 2018. A total of 404,721 POIs were extracted in the study area (Figure 5). Each POI has the following essential attributes: POI identity number, name, hierarchical types, address, and coordinate location. Hierarchical types are organized into three levels, i.e. top-level, second-level, and third-level. From the top level to the third level, the POI descriptions become increasingly detailed. In this study, the second-level category of POI types is used to identify urban functions and evaluate effectiveness, balancing the trade-off between the mixture of urban land use and the complexity of semantic computation. POI data is mainly used to calculate the mixed land-use indices and identify the urban functions in the hierarchical city structure analysis.
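The retrieval of passenger-carrying trajectories from raw GPS points, described for the taxi data above, can be sketched as follows. The sketch assumes each record is a tuple (taxi_id, timestamp, x, y, occupied) and treats every maximal run of consecutive occupied records of one taxi as a trip; the field layout, the function name, and the rule of discarding single-point runs are illustrative assumptions rather than the study's actual preprocessing.

```python
from itertools import groupby


def occupied_trips(points):
    """Split GPS records into passenger-carrying trips.

    points -- iterable of (taxi_id, timestamp, x, y, occupied) tuples
    Returns a list of trips; each trip is a time-ordered list of records,
    whose first and last records approximate pick-up and drop-off.
    """
    ordered = sorted(points, key=lambda p: (p[0], p[1]))
    trips = []
    # Consecutive records with the same taxi and the same status form a run.
    for (_taxi_id, occupied), run in groupby(ordered, key=lambda p: (p[0], p[4])):
        run = list(run)
        # Assumption: a usable trip needs at least a pick-up and a drop-off.
        if occupied and len(run) >= 2:
            trips.append(run)
    return trips


# Toy records sampled every 50 s (hypothetical values).
points = [
    ("t1", 0, 0.0, 0.0, False),
    ("t1", 50, 1.0, 0.0, True),
    ("t1", 100, 2.0, 0.0, True),
    ("t1", 150, 3.0, 0.0, False),
    ("t1", 200, 4.0, 0.0, True),
    ("t2", 0, 0.0, 1.0, True),
    ("t2", 50, 0.0, 2.0, True),
]
for trip in occupied_trips(points):
    print(trip[0][0], [p[1] for p in trip])
```

Each extracted trip would then be map-matched to the road network to obtain the intersection node sequence illustrated in Figure 2.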

Figure 3. Our study area in Wuhan, which covers the downtown area of the city (within the third ring road). (a) Administrative districts of Wuhan city; (b) Satellite remote sensing image of the study area. The bottom images show examples of diverse landscapes: (1) industrial area, (2) residential area, (3) commercial area and (4) educational area.

Figure 4. Data schema of the road network in the study area. (a) Road segments and (b) road nodes.

Figure 5. (a) Kernel density estimation of POIs (unit: count per km²); (b) spatial distribution of POIs near Jianghan Road, a famous commercial area in downtown Wuhan. Note that different colours indicate types of POIs.

Figure 6. Relationship between the average similarity of node embeddings and the distance between two nodes.

Figure 7. An example of embedding vectors in a road network graph: (a) road networks; (b) embedding vectors of road nodes; (c) spatial distribution of cosine similarities.

Figure 8. The spatial distribution of hierarchical communities. Note that the solid black lines indicate the administrative boundaries. (a) Top-level communities; (b) second-level communities; (c) third-level communities; (d) and (e) indicate the second-level divisions and third-level divisions of Jianghan District, respectively.

Figure 9. The chord diagram of traffic flows using origins and destinations of taxi movements. Note that the numbers in the left diagram and the right map are consistent and indicate the identifiers of the sub-regions. (a) The chord diagram; (b) the spatial distribution of sub-regions. The blue lines indicate the sub-region boundaries generated by the Thiessen polygon algorithm.

Figure 10. The spatial distribution of urban functional areas based on the third-level division.

Table 1. Comparison of different communities.

Table 2. Urban functional areas and example locations.

Table 3. The percentage of average frequency (PAF) of taxi flows between the second-level divisions within the same division over seven days.

Table 4. Indices for mixed land use of the third-level divisions.