Region-aware neural graph collaborative filtering for personalized recommendation

ABSTRACT Personalized recommender systems have been widely deployed in various scenarios to enhance user experience in response to the challenge of information explosion. In particular, personalized recommendation models based on graph structures have advanced greatly in predicting user preferences. However, geographical region entities, which reflect the geographical context of items, have not been utilized in previous works, leaving room for improvement in personalized recommendation. This study proposes a region-aware neural graph collaborative filtering (RA-NGCF) model, which introduces geographical regions to improve the prediction of user preferences. The approach first characterizes the relationships among users, items, and regions with a user-item-region graph. Then, a neural network model for the region-aware graph is derived to capture the higher-order interactions among users, items, and regions. Finally, the model fuses region and item vectors to infer user preferences. Experimental results on a real-world dataset show that introducing region entities improves the accuracy of personalized recommendations. This study provides a new approach for optimizing personalized recommendation as well as a methodological reference for leveraging geographical regions in spatial applications.


Introduction
In the era of 'information overload,' acquiring and recommending information on the basis of individual preferences is crucial. Emerging personalized recommendation systems help users obtain information effectively from various kinds of big data and help commercial companies attract more customers to increase their business interests. Various personalized recommendation models have been deployed in a wide variety of fields, including e-commerce, advertising, and tourism (Yin et al. 2017; Guo et al. 2022). However, given the high uncertainty of human activities, it is highly challenging to accurately predict users' personal preferences.
Owing to their high interpretability and excellent performance, collaborative filtering (CF) models have attracted an increasing body of research on personalized recommendation. CF models assume that users with similar historical behaviors have similar preferences (He et al. 2017; He et al. 2018), and thus learn the representation vectors of users and items from historical visits. They then predict a user's preference for each item with the two types of learned vectors. For example, matrix factorization (MF) (Koren, Bell, and Volinsky 2009; Rendle 2010) is a straightforward implementation of CF. It obtains representation vectors of both users and items by decomposing the co-occurrence matrix of users and items, then predicts the user's preference for items according to the similarity between user and item vectors. From the perspective of the user-item interaction graph, these models learn the vectors of users and items from their one-hop neighbors and then predict user preferences.
Inspired by graph neural networks (GNNs), which achieve great performance in modeling higher-order features of graph nodes, recent CF models introduce GNNs to capture higher-order interactions between users and items (Wu et al. 2021). For example, neural graph collaborative filtering (NGCF) (Yang et al. 2020), a GNN-based model, makes a significant improvement in recommendation accuracy. However, these GNN-based CF models do not sufficiently incorporate the geo-entities associated with items to enhance model performance. These geo-entities, such as the geographical regions in which items are located, reflect the geographical context of items and play an increasingly important role in user activities (Liu and Seah 2015; Liu et al. 2019). Since these models ignore the contribution of geographic entities, there is room for further improvement in predicting user preferences.
To fill this gap, this study proposes a region-aware neural graph CF model that introduces geographical region entities to improve recommendation accuracy. The model first characterizes the spatial relationship between items and regions by constructing a user-item-region graph to represent interactions among users, items, and regions. Then, a region-aware GNN is designed to synthesize features among nodes. Finally, the model predicts user preferences by combining region and item vectors. The contributions of this research are summarized as follows:
(1) This study highlights the importance of item-related geographical entities in personalized recommendation, and proposes to introduce region entities to improve personalized recommendations.
(2) A novel model, region-aware neural graph collaborative filtering (RA-NGCF), is developed to leverage geographical regions for improving CF-based methods. The model incorporates the region features of items into a graph and formulates a strategy to fuse region and item features. This strategy offers a new way of integrating geographic information into intelligent spatial applications.
(3) The experimental results show that the proposed model outperforms baseline models, which suggests that introducing geographical region entities is effective.
The remainder of this paper is organized as follows. Section 2 reviews the related works. Section 3 elaborates on RA-NGCF. Section 4 details the experimental setting and reports the results of the experiment. Afterward, Section 5 discusses factors that may affect the performance of RA-NGCF. Finally, Section 6 concludes the study.

Sequence-based personalized recommendations
User visit history can be organized as a kind of trajectory sequence, and an increasing number of studies have focused on predicting user preferences by adapting classical sequence models. For example, according to items that users have visited recently, a moving Markov sequence model (Gambs, Killijian, and Del Prado Cortez 2012) is proposed to predict items that a user is likely to visit in the future. Considering that a recurrent neural network (RNN) and its variants can better capture features than Markov models in sequence data, spatial-temporal recurrent neural networks (ST-RNN) and spatio-temporal gated network (STGN) models extend RNNs to predict users' preferences by introducing time and geographical distance (Liu et al. 2016; Zhao et al. 2022). In addition, bidirectional LSTM (BiLSTM) is combined with CNN to predict the regions that users are likely to visit (Bao et al. 2021).
Recent research has demonstrated that introducing the geographical features of items can effectively improve recommendation results (Zhang et al. 2019b; Botangen et al. 2020). Improvements include using kernel density methods to personalize individual distributions (Zhang and Chow 2013; Zhang, Chow, and Li 2015), incorporating the emotional and geographical attributes of items in the representation of items, and predicting users' behavior based on Gaussian processes.
However, the above models still have limitations in effectively capturing higher-order and implicit interactions between users and items, which are significant in capturing in-depth implicit preferences (Wang et al. 2019).

Graph-based personalized recommendations
Personalized recommendation models based on graph structure have been growing rapidly in recent years. According to the method of embedding nodes in the graph, related research can be categorized into two groups, namely, models based on random walk strategies and models based on GNNs.
For models based on random walk strategies (Baluja et al. 2008; Jiang et al. 2018), the walk is applied to the item graph or user graph with a certain transition probability, thus capturing the implicit preferences between users and items. More recent embedding models learn a mapping that embeds nodes in a low-dimensional vector space by encoding the graph structure. Typically, they first generate a sequence of users or items with a random walk strategy and then follow the Skip-gram model (Mikolov et al. 2013) to learn distributed representations of users and items (Shi et al. 2019). LECF predicts the existence probability of an edge from the connections among edges and is able to capture complex relationships in the data (Xiao et al. 2022).
Inspired by the superior performance of GNNs in natural language processing and computer vision tasks, new GNN-based methods have been proposed for personalized recommendation tasks. A major advantage of GNNs is that they can capture high-order interaction information between nodes through hierarchical encoding (Kipf and Welling 2017; Zhang et al. 2019). Currently, most existing GNN-based approaches are derived from bipartite graphs (He et al. 2017; Yang et al. 2018; Nikolakopoulos and Karypis 2019), in which nodes represent users and items, and edges represent interactions between users and items. For example, as a representative GNN, a graph convolutional network (GCN) is used to learn the latent relationships in user-item graphs (van den Berg, Kipf, and Welling 2018). That study uses only one convolutional layer to capture the connections between users and items, which cannot reveal the synergistic relationships in higher-order connections. Moreover, the spectral convolution operation of GCN is suggested to discover all possible connections between user-item pairs in the spectral domain by decomposing the adjacency matrix of the graph (Zheng et al. 2018). The high computational complexity of this decomposition makes it difficult to support large-scale recommendation scenarios. Considering that multiple hidden attribute relationships exist between the edges of the bipartite graph formed by users and items, a model is proposed to decompose the edges of the bipartite graph in terms of attributes and weigh them according to the nodes. The decomposition is followed by a reorganization based on the weights of the attributes. Then, a vector representation of each node is obtained.
To better capture the higher-order neighborhood information between users and items, a neural graph CF (NGCF) algorithm is designed, which recursively propagates embeddings over the graph based on higher-order connectivity (Wang et al. 2019). LightGCN learns user and item embeddings by linearly propagating features on the user-item interaction graph; furthermore, it uses the weighted sum of the embeddings learned in all layers as the final embedding. To highlight the user's basket intent, the user-item bipartite graph is extended to a user-basket-item graph. In addition, the dual graph enhanced embedding neural network (DG-ENN) is proposed to optimize user and item embeddings with attribute graphs and a collaborative graph (Guo et al. 2021).
Although the above models can effectively capture the interaction information between users and items, they ignore the geographical context of items, which is closely related to user activities (Figure 1).

Personalized recommendations with geolocation
With the popularity of smart wear and electronic mobile devices, location-based services are becoming an integral part of people's lives. A growing number of Internet applications are recording users' experiences by collecting location information. Recent studies have revealed that human activities are influenced by geographic space. In addition, these activities have been observed to follow a certain spatial pattern.
By introducing geographical data, we can better infer a user's preferences and accurately predict the items and locations that users would like to visit (Yuan, Raubal, and Liu 2012; Huang and Wong 2015). A growing body of work has paid attention to the location information of users and items (Zhang and Chow 2013; Zhang, Chow, and Li 2015; Zhao et al. 2020), which helps improve the accuracy of recommendations. However, the above works still fail to capture higher-order interaction information between users and items while also utilizing item locations. To address this, location-aware neural graph CF (LA-NGCF) introduces the spatial distances of items into a GNN to capture higher-order interaction information. Nevertheless, determining how to use spatial context, such as geographical regions, to improve the performance of graph-based CF models remains a challenging yet promising task.
The above models that are closely related to this study are compared in Table 1, including GNN-based CF models without geolocation (Section 2.2) and with geolocation (the LA-NGCF model), and the proposed model (RA-NGCF). Compared with LA-NGCF, which incorporates the spatial distances of items into the GNN model, this paper uses geographical region entities to capture the implicit spatial preferences of users.

Region-aware neural graph collaborative filtering
This section introduces the proposed model, region-aware neural graph CF model (RA-NGCF). As shown in Figure 2, RA-NGCF consists of three components: building the user-item-region graph, aggregating information between neighbor nodes, and predicting user preferences.
The first component is used for characterizing relationships among users, items, and regions by building a user-item-region graph based on visit history and spatial location. The second component aims to learn the features of each node by propagating and learning from its neighbor nodes with a GNN. The third component predicts user preferences by fusing the vectors of users, items, and regions.

Building user-item-region graph
The model first defines the geographical regions of items in order to characterize the relationships among users, items, and regions, as these regions can represent spatial features. In previous works, two types of regional definition strategies have been widely used to divide spatial areas into regions. One straightforward method is to divide the whole area into equally sized grids. However, the spatial patterns of human activity reflected by the spatial features of items are typically not grid-like, so grid regions are not well suited to representing user activities. An alternative strategy is to use administrative divisions, such as zip codes or census tracts. Compared with grid regions, administrative regions generally have high intra-regional connectivity because people are closely connected within a region. The selection of regional definition strategies is discussed in Section 5.3. After determining the region corresponding to each item, the relationships among users, items, and regions are constructed as shown in Figure 3. In this study, these relationships are represented by a graph in which users, items, and regions are represented by three types of nodes. The graph is defined as $G_{uir} = (U \cup I \cup R, W^{ui}, W^{ir})$, where $U$ denotes the set of user nodes, $I$ denotes the set of item nodes, and $R$ denotes the set of region nodes (Algorithm 1, line 1). In addition, $W^{ui}$ is the binary weight matrix of the edges between users and items, while $W^{ir}$ is the binary weight matrix of the edges between items and regions. $W^{ui}$ represents the historical preferences between users and items: the element $W^{ui}_{a,b}$ is set to 1 if user $a$ has visited item $b$; otherwise, it is set to 0 (Algorithm 1, lines 3-7). $W^{ir}$ represents the relationships between items and regions: $W^{ir}_{a,b}$ is set to 1 if item $a$ is located in region $b$; otherwise, it is set to 0 (Algorithm 1, lines 8-12).
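The construction of the two binary matrices can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the paper's Algorithm 1: the function name `build_user_item_region_graph`, the input format (a list of user-item visit pairs and a dictionary from item to region), and the use of nested lists instead of sparse matrices are all hypothetical choices made for readability.

```python
def build_user_item_region_graph(visits, item_region):
    """Build the binary matrices W_ui and W_ir as nested lists.

    visits: iterable of (user_id, item_id) pairs (assumed input format).
    item_region: dict mapping item_id -> region_id, e.g. a zip code.
    Every visited item is assumed to appear in item_region.
    """
    visits = list(visits)
    users = sorted({u for u, _ in visits})
    items = sorted(item_region)
    regions = sorted(set(item_region.values()))

    # Map each identifier to a row or column index.
    u_idx = {u: k for k, u in enumerate(users)}
    i_idx = {i: k for k, i in enumerate(items)}
    r_idx = {r: k for k, r in enumerate(regions)}

    # W_ui[a][b] = 1 if user a has visited item b, else 0.
    w_ui = [[0] * len(items) for _ in users]
    for u, i in visits:
        w_ui[u_idx[u]][i_idx[i]] = 1

    # W_ir[a][b] = 1 if item a is located in region b, else 0.
    w_ir = [[0] * len(regions) for _ in items]
    for i, r in item_region.items():
        w_ir[i_idx[i]][r_idx[r]] = 1

    return users, items, regions, w_ui, w_ir
```

Because each item lies in exactly one region, every row of `w_ir` contains a single 1. At the scale of the dataset used later, a sparse matrix format would be the practical choice.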

Aggregating neighbor node information
The user-item-region graph consists of two types of edges: edges between user and item (UIE) and edges between item and region (IRE). The UIEs represent interactions between users and items, and IREs represent interactions between regions and items. Interactions between users and regions are not explicitly described in the graph. Considering that user activities usually present a spatial pattern, the interactions between users and regions are captured by propagating information on the three types of nodes.
The initial representations of users, items, and regions are formalized as Equations (1)-(3). In the user-item-region graph, an item node is connected to both user and region nodes, so it can directly learn region features from its neighbor region node and user features from its neighbor user nodes. Then, the two kinds of learned features are propagated to the neighbor nodes. The node vectors are updated by three aggregators, namely the item aggregator, the user aggregator, and the region aggregator, which update item node vectors, user node vectors, and region node vectors, respectively. The aggregators are illustrated in Figure 4.
Item Aggregator. Item nodes are directly connected to two types of nodes, namely, user nodes and region nodes. The UIE represents interactions between items and users, thus implicitly reflecting user preferences. The IRE represents the relationships between items and regions, thus capturing the geographical characteristics of items. To fuse these two types of information during propagation, Equation (4) is designed to fuse the information collected from neighboring nodes.
where $\otimes$ denotes element-wise multiplication. $N_u$, $N_i$, and $N_r$ denote the numbers of directly connected neighbor nodes of users, items, and regions, respectively. $m^{i}_{u,r}$ is the vector that an item collects from its neighbor nodes. It contains two types of information: the original information in the neighbor nodes and the information that the neighbor nodes obtained from their own neighbor nodes.
$W_1(e_u + e_r)$ denotes the feature information of user nodes and region nodes collected by items from neighbor nodes. $W_2(e_i \otimes e_u + e_i \otimes e_r)$ contains two parts: $e_i \otimes e_u$ denotes the interaction information between the item and the user, from which the user's interest preferences can be exploited, and $e_i \otimes e_r$ denotes the interaction information between the item and the region, which propagates the region feature into item vectors. $W_1, W_2 \in \mathbb{R}^{d' \times d}$ are two trainable parameter matrices, and $d'$ denotes the dimension of the aggregated neighbor node vectors. After collecting feature information from the neighbor nodes, the original item vector representation can be updated with the received information, as given in Equation (5), where LeakyReLU() is the activation function and $m^{i}_{i}$ is the preserved feature information, which is calculated as $m^{i}_{i} = W_3 e_i$. The process of the item vector update can be divided into two steps. The first step is to collect feature information from neighbor nodes, and the second step is to aggregate the information collected from these nodes. In the collection process, both the information of the neighbor nodes and the information of the item's interaction with them are added.
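The two steps of the item update can be sketched in plain Python as follows. This is an illustrative sketch rather than the paper's exact Equations (4) and (5): the normalization coefficient is passed in as a hypothetical `norm` argument because its exact form is not recoverable here, the LeakyReLU slope of 0.2 is an assumption, summing the messages over an item's user neighbors is an assumption, and the helper names are invented for the example.

```python
def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def leaky_relu(x, slope=0.2):
    """Element-wise LeakyReLU; the slope value is an assumption."""
    return [v if v > 0 else slope * v for v in x]


def item_message(e_i, e_u, e_r, w1, w2, norm=1.0):
    """Collect step: norm * (W1(e_u + e_r) + W2(e_i*e_u + e_i*e_r))."""
    features = [u + r for u, r in zip(e_u, e_r)]
    interactions = [i * u + i * r for i, u, r in zip(e_i, e_u, e_r)]
    part1 = matvec(w1, features)
    part2 = matvec(w2, interactions)
    return [norm * (a + b) for a, b in zip(part1, part2)]


def update_item(e_i, user_neighbors, e_r, w1, w2, w3, norm=1.0):
    """Aggregate step: LeakyReLU(W3 e_i + sum of collected messages)."""
    total = matvec(w3, e_i)  # preserved self information m_i^i = W3 e_i
    for e_u in user_neighbors:
        message = item_message(e_i, e_u, e_r, w1, w2, norm)
        total = [t + m for t, m in zip(total, message)]
    return leaky_relu(total)
```

With identity matrices for `w1`, `w2`, and `w3`, an item vector `[1, 2]`, one user neighbor `[1, 0]`, and region vector `[0, 1]`, the collected message is `[2, 3]` and the updated item vector is `[3, 5]`.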
In the process of information aggregation, the characteristics of the item are preserved, and the information collected from neighbor nodes is fused, which can reduce the possibility of overfitting when the node information propagates deeply through multiple layers. After the information collection and aggregation from the neighbor nodes, the vector representation of the item contains both the region feature of the item and the preferences of users. Then, the region feature in items is propagated to user nodes, thus enabling the user vectors to become aware of the region feature of the items. At the same time, the user preference information in items is propagated to the region vectors so that they can capture the preferences of users.
User Aggregator. In the built user-item-region graph, both user nodes and region nodes have only one type of neighbor node, namely item nodes. The update process of both user and region vectors is similar to that of item vectors and consists of two steps: information collection from neighbor nodes and information aggregation. The information that user vectors collect from neighbor nodes is derived as Equation (6), where $m^{u}_{i}$ represents the information that users collect from their neighbor nodes. The aggregation process is defined as Equation (7), where $e_u^{(1)}$ is the representation vector of the user node after aggregating information from directly connected neighbor nodes, in which $m^{u}_{u}$ is used to preserve its own feature.
Region Aggregator. The process of region nodes collecting information from neighboring nodes can be formalized as Equation (8), where $m^{r}_{i}$ is the feature information that regions collect from their neighbor items, which represents the characteristics of region $r$. For region $r$, its features can be calculated as Equation (9), where $m^{r}_{r}$ represents the feature information of the region itself, and $e_r^{(l)}$ is the representation of the region vector after the $l$-th iteration.
At this point, the region vector contains the feature information of regions as well as the user preference information learned from items.
This study designs a three-layer GNN for information propagation and concatenates the vectors of each layer to obtain the final vector representations of users, items, and regions as follows.
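For illustration, one plausible form of the layer-wise concatenation is sketched below. It assumes, following the common NGCF-style formulation, that the initial vector $e^{(0)}$ is concatenated together with the outputs of the three propagation layers; the concatenation symbol $\Vert$ is a notational assumption, and the expressions are a sketch rather than a reproduction of the paper's Equations (10)-(12).

```latex
\begin{aligned}
e_u^{*} &= e_u^{(0)} \,\Vert\, e_u^{(1)} \,\Vert\, e_u^{(2)} \,\Vert\, e_u^{(3)},\\
e_i^{*} &= e_i^{(0)} \,\Vert\, e_i^{(1)} \,\Vert\, e_i^{(2)} \,\Vert\, e_i^{(3)},\\
e_r^{*} &= e_r^{(0)} \,\Vert\, e_r^{(1)} \,\Vert\, e_r^{(2)} \,\Vert\, e_r^{(3)}.
\end{aligned}
```

Under this assumption, the final user, item, and region vectors share the same dimension, which is what allows them to be fused element-wise and scored with a dot product.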

Predicting and optimizing
Personalized recommendation models based on CF usually compute a score from user vectors and item vectors to infer and predict user preferences. In this study, users' preferences can be inferred from three types of vectors: user vectors, item vectors, and region vectors. In the user-item-region graph, each item node is connected to exactly one region node; that is, all item nodes in the same area are connected to the same region node. Thus, a region vector should reflect the common characteristics of the items in the region. RA-NGCF fuses the region vectors and item vectors, and the fused item vectors are then used to predict user preferences. The fusing process is performed with Equation (13).
where $e^{*}_{i-r}$ denotes the fused vector representation of items and $\oplus$ denotes the element-wise addition of vectors. Besides addition, multiplication or subtraction can be applied to fuse the two vectors. These alternatives are compared experimentally in Section 5.1.
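The three fusion operations mentioned above can be sketched in a few lines of Python. This is a minimal illustration: the function name `fuse` and the operation labels `"add"`, `"sub"`, and `"mul"` are hypothetical, and the vectors are plain lists rather than tensors.

```python
def fuse(e_item, e_region, op="add"):
    """Fuse an item vector with its region vector element-wise."""
    if len(e_item) != len(e_region):
        raise ValueError("item and region vectors must have the same length")
    if op == "add":  # element-wise addition, the default in this study
        return [a + b for a, b in zip(e_item, e_region)]
    if op == "sub":  # element-wise subtraction
        return [a - b for a, b in zip(e_item, e_region)]
    if op == "mul":  # element-wise multiplication
        return [a * b for a, b in zip(e_item, e_region)]
    raise ValueError(f"unknown fusion operation: {op}")
```

For example, `fuse([1, 2], [3, 4])` gives `[4, 6]`, while the `"mul"` variant gives `[3, 8]`.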
Then, the vector dot product is calculated to score the similarity between user vectors and fused item vectors, as defined in Equation (14). Compared with cosine similarity, the dot product reduces the computational cost. Given that parameter regularization and vector normalization are applied in the implementation of the model, the dot product and cosine similarity are consistent in determining the relative similarity between items and a given user.
where $\hat{y}_{(u,i-r)}$ is the final preference score of user $u$ for item $i$. The BPR loss function (Rendle et al. 2009) is applied as the optimization objective of the model, and the loss function of RA-NGCF is defined as Equation (15), where $i-r$ denotes a positive case and $j-r$ denotes a negative case. Finally, we summarize the training procedure of the proposed RA-NGCF model in Algorithm 2.
Algorithm 2 Training of RA-NGCF
Require: User-item-region graph G = (V, E) and node features obtained via Eq. 1, Eq. 2 and Eq. 3
while not converged do
    Update item, user and region embeddings via Eq. 5, Eq. 7 and Eq. 9, respectively
    Obtain the final user, item and region embeddings via Eq. 10, Eq. 11 and Eq. 12, respectively
    Sample a batch S_T from the training set
    Compute the optimization objective L via Eq. 15 on batch S_T
    Update the aggregator parameters W in Eq. 4-Eq. 9
end while
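The scoring and loss computation used in the training loop can be sketched as follows. This is a hedged illustration of dot-product scoring and the standard pairwise BPR objective, not the paper's exact Equation (15): the function names `dot_score` and `bpr_loss` are hypothetical, the loss is averaged over the batch (an assumption), and any regularization term is omitted.

```python
import math


def dot_score(e_user, e_fused_item):
    """Preference score as the dot product of user and fused item vectors."""
    return sum(a * b for a, b in zip(e_user, e_fused_item))


def bpr_loss(pos_scores, neg_scores):
    """Mean of -ln(sigmoid(pos - neg)) over paired positive/negative scores.

    Uses the identity -ln(sigmoid(x)) = ln(1 + exp(-x)), evaluated in a
    form that avoids overflow for large |x|.
    """
    losses = []
    for pos, neg in zip(pos_scores, neg_scores):
        x = pos - neg
        if x >= 0:
            losses.append(math.log1p(math.exp(-x)))
        else:
            losses.append(-x + math.log1p(math.exp(x)))
    return sum(losses) / len(losses)
```

The loss shrinks as positive items are scored further above negative ones: equal scores give ln 2 per pair, and a large positive margin drives the loss towards zero.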

Experiments and results
This section introduces the details of the dataset selection for our experiments (including dataset processing methods), evaluation metrics, and parameter settings.

Dataset
The dataset used in this study is built from the 'Yelp Dataset Challenge 2019' downloaded from https://www.yelp.com/dataset/. The dataset contains data collected from Yelp, including user descriptions, user check-ins, user reviews, and item descriptions. We plotted the spatial distribution of items in the dataset, as presented in Figure 5. As seen from the figure, the spatial distribution of items is not uniform, revealing significant spatial heterogeneity. In addition, the POIs visited by users are spatially clustered, which suggests that introducing geographical regions should help predict users' preferences.
In the following experiments, we use the user ID in the user's check-in file and the ID of the item visited by the user. We also look up the corresponding postal code in the item files according to the item IDs. The postal code is used to identify the spatial region in which each item is located. The downloaded raw data form a user review dataset in which users' check-ins are very sparse. Consequently, users with fewer than 20 check-ins and items with fewer than 20 visits are removed, and the descriptive statistics of the dataset after processing are shown in Table 2.
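The filtering step can be sketched as follows. The sketch rests on assumptions the text does not settle: it applies the threshold repeatedly until no more records are removed (a single pass is the other reading), it counts every record, including repeated visits, and the function name `filter_sparse` and the pair-based input format are hypothetical.

```python
from collections import Counter


def filter_sparse(visits, min_count=20):
    """Drop users and items with fewer than min_count records.

    visits: iterable of (user_id, item_id) pairs (assumed input format).
    Filtering repeats until stable, because removing an item can push a
    user below the threshold and vice versa.
    """
    visits = list(visits)
    while True:
        user_counts = Counter(u for u, _ in visits)
        item_counts = Counter(i for _, i in visits)
        kept = [
            (u, i)
            for u, i in visits
            if user_counts[u] >= min_count and item_counts[i] >= min_count
        ]
        if len(kept) == len(visits):
            return kept
        visits = kept
```

With `min_count=2`, the records `[("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "b"), ("u3", "a")]` lose only the single record of user `u3`.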
As shown in Table 2, the built dataset contains 24,357 users and 18,544 items, which are located in 2,656 zip code regions. To further observe the distribution of items across regions, the number of items in each region is counted and plotted in Figure 6. The figure shows that the number of items varies widely across regions. In addition, 211 regions contain more than 10 items each, with a total of 14,063 items, thus accounting for more than 75% of all items. The remaining 4,481 items are located in 2,445 regions. This outcome suggests that the items follow a pronounced long-tail distribution. This distribution pattern indicates that user activities are highly localized, and the region feature should be valued in personalized recommendation tasks.

Experiment setting
Baselines. Three classical CF models are employed as our baselines, i.e. MF (Rendle et al. 2009), NeuMF (He et al. 2017) and NGCF (Wang et al. 2019). Here MF is a matrix factorization model optimized by the Bayesian personalized ranking (BPR) loss function, NeuMF is an advanced neural CF model, and NGCF is a GNN-based CF model that captures higher-order information between the user and item.
Evaluation metrics. To evaluate the performance of the proposed model, this study takes recall@K, ndcg@K, and hit@K as evaluation metrics. Recall@K measures the proportion of the POIs a user actually visited that appear among the top K recommended POIs. Hit@K indicates whether any relevant item is retrieved within the top K positions of the recommendation list. Ndcg@K measures the relative order of positive and negative items within the top K of the ranking list (Wang et al. 2019). In the experiments, K is set to 20, the same value used in the NGCF work (Wang et al. 2019).
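For concreteness, the three metrics can be sketched for a single user as follows. The definitions below are the common binary-relevance variants and are an assumption about the exact formulas used in the experiments; the function names are hypothetical, and `relevant` is assumed to be a non-empty set of held-out items.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top k."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def hit_at_k(ranked, relevant, k):
    """1.0 if at least one relevant item appears in the top k, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: DCG of the list divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(pos + 2)
        for pos, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg
```

Per-user values are typically averaged over all test users to obtain the reported recall@20, hit@20, and ndcg@20.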
Implementation details. The proposed model is implemented with Python and TensorFlow (Abadi et al. 2016). In the experiments, the size of all user, item, and region vectors is set to 64, and the vectors are randomly initialized with the Xavier initializer (Glorot and Bengio 2010). We divide the dataset into three parts, namely a training set, a testing set, and a validation set, with a ratio of 8:1:1. The proposed model is trained for 400 epochs on the training set.
Figure 6. The distribution of the number of items in each region.

Results
Five models, including three baseline models and two variants of the proposed model, are evaluated on the dataset to examine the performance of the proposed model. The experimental results are reported in Table 3, where RA-NGCF denotes the proposed model (Wang et al. 2019). A-NGCF is a variant of RA-NGCF in which only the scores of user vectors and item vectors are taken into consideration in the recommendation prediction. That is, A-NGCF does not fuse item and region vectors, skipping Equation (13) in the user preference prediction component. As shown in Table 3, RA-NGCF achieves the best performance. Although A-NGCF does not combine region node vectors with item node vectors to predict user preferences, it still achieves better performance than the strongest baseline, NGCF. This is because the user-item-region graph involves region nodes, so the process of node information propagation captures both the interaction information between users and items and the geographical features between items and regions.
The performance of the proposed model, RA-NGCF, has been improved, as indicated by three evaluation metrics, namely, recall@20, hit@20, and ndcg@20. The most significant improvement was observed in the metric of recall@20, which was 5.76% higher than the strongest baseline model. The improvements on hit@20 and ndcg@20 are 2.84% and 3.91%, respectively. Compared with A-NGCF, the results suggest that the geographical characteristics of items in RA-NGCF are effectively propagated with the user-item-region graph. In general, we argue that region entities are important in the personalized recommendation task. We also believe that incorporating the region feature as attribute information into the vector representation of items is a promising approach to improving the performance of the model.
To observe the contribution of the region feature in the proposed model further, we randomly select some regions in the experiments. Then, we test the prediction accuracy of the proposed model and the baseline model and plot the Recall@20 values of selected regions in Figure 7.
In Figure 7, the NGCF results are marked in red, and the prediction accuracies fluctuate greatly between regions, which may be because the NGCF model does not take into account the region features of the items. The accuracies may also fluctuate because the items are not uniformly distributed over the regions, thus resulting in spatially heterogeneous model performance. Hence, the NGCF model performs well in certain regions and poorly in others. The results of the RA-NGCF model are shown in green, and its performance is well balanced. This suggests that the introduction of region nodes makes the model's performance more balanced across regions.

Fused representation vectors of regions and items
The proposed model predicts user preferences by scoring user vectors against fused item vectors, where the fused item vectors are obtained by fusing raw item vectors with region vectors. As mentioned in Section 3.3, several operations can be used to combine item and region vectors. In this section, experiments conducted on several widely used vector fusion methods are presented, and the results are listed in Table 4. As shown in the table, the subtraction operation achieves the best results, the addition operation the second best, and the multiplication operation the worst. We use the most common operation, addition, to fuse the two types of vectors in the other experiments of this study.

Experiments on sparse data
To further illustrate the importance of regional information in the recommendation task, we conduct experiments on datasets with different sparsity levels, and the results are shown in Table 5. Specifically, we randomly select items from the dataset described in Section 4 and then randomly remove visit records from the training set. The percentage in Table 5 indicates the proportion of randomly selected items omitted. As shown in the table, the sparser the dataset is, the worse both the NGCF model and the RA-NGCF model perform. The RA-NGCF model always outperforms the NGCF model, which indicates that RA-NGCF is robust and that fusing regional information can improve the performance of the model.

Grid-based regions
As most spatial applications encounter the modifiable areal unit problem (MAUP) (Xiong et al. 2019), the performance of the proposed model may be affected by different region division strategies. This section examines the impact of region division strategies on the performance of the model. We perform experiments with the two region definition methods mentioned in Section 3.1 and list the results in Table 6. For grid-based regions, two cell sizes, 500*500 and 1000*1000, are selected, and their results are marked as grid (500*500) and grid (1000*1000), respectively. Zip code denotes regions defined by postal codes, which are adopted in several spatial applications.
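Assigning an item to a grid-based region can be sketched as follows. The sketch assumes that item locations are given in projected planar coordinates expressed in the same unit as the cell size, since the unit of the 500*500 and 1000*1000 cells is not stated here; the function name `grid_region_id` and the choice of the lower-left corner as the grid origin are hypothetical.

```python
import math


def grid_region_id(x, y, x_min, y_min, cell_size):
    """Return (column, row) of the square grid cell containing (x, y).

    x_min and y_min give the lower-left corner of the study area, and
    cell_size is the side length of one cell in the same unit as x and y.
    """
    col = math.floor((x - x_min) / cell_size)
    row = math.floor((y - y_min) / cell_size)
    return col, row
```

For instance, a point at (1250, 499) falls in cell (2, 0) with a cell size of 500 and in cell (1, 0) with a cell size of 1000, so changing the cell size regroups items into different regions.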
As shown in Table 6, different region types affect the performance of the proposed model. The best prediction performance is obtained when the spatial area is divided into 500*500 grid regions. The proposed method also suffers from the MAUP issue (Wong 2009), and the grid size affects the prediction accuracy when the grid-based method is applied. We argue that if the regions are too large, each region will contain too many items, and the items may be too 'noisy.' Hence, the regional nature of the items is not captured well. Meanwhile, if the regions are too small, the model will have too many regions, which greatly increases the number of nodes in the model. All results are superior to those of the baseline models mentioned in Section 4, thus indicating that the model is robust.
Regions based on administrative areas are more conducive to understanding human activities and are expected to be more easily integrated with existing spatial applications. The other experiments in this study therefore define regions with a widely used administrative area, the zip code region.

Conclusions
By introducing geographical entities, this study develops an optimized personalized recommendation model, RA-NGCF. The proposed model first uses the user-item-region graph to characterize the relationships among users, items, and regions. Then, it projects items into geographical regions and designs a GNN to propagate the features of users, items, and regions. Experimental results show that RA-NGCF outperforms baseline models. Specifically, the introduction of the geographical regions of items not only improves the overall effectiveness of the model but also alleviates the impact of the uneven spatial distribution of items. This study provides a new approach for optimizing personalized recommendation and a methodological reference for integrating geographical information into spatial applications.
A challenge of this study is that some regions have few items, so the proposed model cannot sufficiently capture the characteristics of those regions. More machine learning techniques, such as transfer learning strategies, can be introduced into the proposed model to improve recommendation accuracy in those regions. In addition, the proposed model faces the challenge of the modifiable areal unit problem (MAUP), i.e. different spatial scale levels and region shapes may affect the performance of the proposed model, and this challenge deserves further consideration.