Multiscale spatially varying coefficient modelling using a Geographical Gaussian Process GAM

Abstract This paper proposes a novel spatially varying coefficient (SVC) regression through a Geographical Gaussian Process GAM (GGP-GAM): a Generalized Additive Model (GAM) with Gaussian Process (GP) splines parameterised at observation locations. A GGP-GAM was applied to multiple simulated coefficient datasets exhibiting varying degrees of spatial heterogeneity and out-performed the SVC brand-leader, Multiscale Geographically Weighted Regression (MGWR), under a range of fit metrics. Both were then applied to a Brexit case study and compared, with MGWR marginally out-performing GGP-GAM. The theoretical frameworks and implementation of both approaches are discussed: GWR models calibrate multiple models whereas GAMs provide a full single model; GAMs can automatically penalise local collinearity; GWR-based approaches are computationally more demanding; MGWR is still only for Gaussian responses; MGWR bandwidths are intuitive indicators of spatial heterogeneity. GGP-GAM calibration and tuning are also discussed and areas of future work are identified, including the creation of a user-friendly package to support model creation and coefficient mapping, and to facilitate ease of comparison with alternate SVC models. A final observation that GGP-GAMs have the potential to overcome some of the long-standing reservations about GWR-based regression methods and to elevate the perception of SVCs amongst the broader community.


Introduction
A standard linear regression seeks to model the relationship between a response variable and a series of predictor variables.Via ordinary least squares (OLS), it estimates a single set of unknown regression coefficients for each predictor (and intercept) together with errors that are assumed independent and identically distributed with mean zero and a uniform variance (usually denoted r 2 ).A standard OLS regression CONTACT Alexis Comber a.comber@leeds.ac.uk assumes that the errors have a Gaussian distribution.For non-Gaussian response variables (such as integer count data), the OLS regression can be replaced with a Generalized Linear Model (GLM) (Nelder and Wedderburn 1972;McCullagh and Nelder 2019) whose coefficients are typically estimated via maximum likelihood (ML).For greater flexibility in capturing non-linear relationships between the response and predictors, a GLM can be extended by a Generalized Additive Model (GAM) (Hastie and Pregibon 2017) where each regression relationship can be modelled non-linearly, as an arbitrary function, say through splines or some other smoothing tool.OLS regression and classic forms of GLM assume fixed (constant) coefficient processes.However, in reality, the relationship between the response and predictor variables in a geographical context may change with observation location, potentially due to unaccounted-for local factors or due to spatial heterogeneity in the process being examined.In this context, spatially varying coefficient (SVC) models provide an alternative to 'whole map' or global regressions that may incorrectly assume spatial stationarity in regression coefficients (Openshaw 1996).GAM smoothing functions allow for varying (non-constant) coefficient processes in attribute-space.In geographic space, spatial settings for the GAM smoothing functions are possible leading to the characterisation of response-to-predictor relationships changing across space, i.e. a SVC model.
The most popular SVC model is Geographically Weighted Regression (GWR) (Brunsdon et al. 2010) which has been subject to a number of developments including Multiscale GWR (MGWR) (Yang 2014;Fotheringham et al. 2017) which is now recommended as the default GWR (Comber et al. 2023).Interestingly these can be viewed in terms of a GAM (Brunsdon et al. 1999;Yu et al. 2020), where coefficients are estimated from data subsets falling under distance-weighted kernels around any spatial location via a weighted least squares algorithm for GWR and an iterative back-fitting algorithm for MGWR.
The focus of this study is only for multiscale SVC models, that allow each predictor-toresponse relationship to operate at its own spatial scale.MGWR is the most popular SVC model (Comber et al. 2021(Comber et al. , 2023)).Other widely applied multiscale SVC models include Bayesian Gaussian Process (GP) models that use co-kriging (via a linear Model of Coregionalisation (LMC)) (Bayes-GP) to generate unknown coefficients that are assumed spatially co-dependent stochastic processes (Gelfand et al. 2003;Finley 2011;Kim and Lee 2017;Finley and Banerjee 2020) and Eigenvector Spatial Filtering (ESF) with Moran coefficients (ESF-MC) (Murakami and Griffith 2015;Murakami et al. 2017) that extends the deterministic ESF model (Griffith 2008) with random effects to model stochastic spatial processes.These SVC constructs can be shown to have a direct relationship to each other (Murakami et al. 2017), where comparisons with MGWR can be found in the simulated studies of Wolf et al. (2018) and Murakami et al. (2019) for Bayes-GP and ESF-MC, respectively.Such comparative studies concluded that no-one multiscale SVC model could be considered significantly better than another, when jointly considering model accuracy and precision, computational overheads, and ease of application for the novice user.More recently, multiscale SVC models include a model that uses triangulation with bivariate spline estimators (Mu et al. 2018), a GP-based model that uses a frequentist approach with ML estimation (Dambon et al. 2021), and a GLM-based model that uses a reduced-rank spline methodology (Fan and Huang 2022), but at the time of writing community take up for these models has been limited.
For this study, a novel SVC model is proposed as an alternative, not only to MGWR, but also to Bayes-GP, ESF-MC and the more recent SVC models listed above: a Geographical Gaussian Process GAM (GGP-GAM), in which GP splines are parameterised at observation locations.The combination of a GAM with GP splines constitutes the key advance.The main theoretical limitation with MGWR (and GWR) is that its rudimentary kernel-based parameterisation means that only a collection of local models is used to capture and quantify the non-stationary processes (re-using data each time), whereas Bayes-GP, ESF-MC and GGP-GAM each provide a single, global model formulation that is still non-stationary in nature (e.g.Finley (2011); Wolf et al. (2018)).This has led some to argue that MGWR (and GWR) is best suited to exploratory rather than inferential analyses (Wheeler and Calder 2007;Finley 2011;Dambon et al. 2021).However, the practical consequences of this theoretical limitation can in part, be quantified and are often marginal (Li et al. 2020;Yu et al. 2020).The GGP-GAM is evaluated through a simulation experiment with MGWR as comparison, and also through an empirical case study with UK Brexit vote data.
Common challenges across all multiscale SVC models, include: (i) sensitivity to the parameterisation of the spatial smoothing function (e.g.sensitivity to the location and number of knots in GGP-GAM, sensitivity to kernel and bandwidth form in MGWR, ensuring a positive-definite fit for the LMC in Bayes-GP), (ii) collinearity in the predictors (e.g. for instances of local collinearity when not present globally), (iii) use as a spatial predictor, (iv) computational burden, and (v) ease of application for the novice user together with intuitive understandings of process spatial heterogeneity from model outputs.Each of these challenges are discussed for the proposed GGP-GAM model in relation to MGWR only.
Comparison with MGWR is important as currently such studies exist only for basic GWR, where all non-stationary relationships are naively assumed to operate at the same spatial scale.For example, P� aez et al. (2011) and Fotheringham and Oshan (2016) examined GWR's sensitivity to local collinearity; Harris et al. (2010) and Harris et al. (2011) for GWR's use as a spatial predictor, including hybrids with kriging; Li et al. (2019) and Lu et al. (2022) describe computational solutions to implementations of GWR.Whilst MGWR has some benefits to dealing with local collinearity (Harris 2019), its value as a spatial predictor is unclear (Lu et al. 2019) and its underlying back-fitting algorithm is computationally expensive (Fan and Huang 2022).Both GWR and MGWR also suffer from an absence of a generic framework for any form of Geographically Weighted model (Comber et al. 2022), whereas frameworks for GAMs are relatively mature meaning that issues such as those for collinearity are more easily resolved through various penalised model options in GAM R packages.

GAMs
Generalized Additive Models (GAMs) are general in that they can handle outputs with many types of distributions and not just linear relationships, polynomial or not (Wood 2006;Fahrmeir et al. 2022).They are additive and because they generate multiple model terms which are added together to generate predictions.Methodologically, GAMs offer a middle ground between simple but interpretable linear models and complex but opaque machine learning approaches, where it is difficult to understand how one variable relates to an outcome.Thus GAMs provide an intuitive approach to fit relatively complex relationships in data with complex interactions and non-linearities, and the outputs provide easily understood measures of the relationship between predictor and response variables and how the outcome is modelled.They are able to predict well from complex systems, to quantify relationships and to make inferences about these relationships, for example about what variables are important and at what range of values are they most influential.GAMS can perform as well or better than most machine learning models and they are relatively fast.
GAMs typically model non-linear relationships using splines.Other approaches to modelling can also fit into a GAM framework, but in many situations splines are the building blocks of GAMs.A spline can be represented as a linear combination of functions called � basis functions � .Each of these is assigned a coefficient and these are linearly combined to generate the predictions of outcome (ŷ).Basis functions can be single or multi-dimensional in terms of predictor variables.Thus, the form of a GAM is composed of linear sums of multi-dimensional basis functions and this is what allows complex relationships to be modelled.
Splines can be of different forms depending on the problem and data (see below) and need to include an appropriate degree of 'wiggliness' to model non-linear relationships.This is determined by the spline smoothing parameter.If this is too small then the model will overfit the data, resulting in too many bases and too much noise in the function.If it is too large then the functions will be overly simple and fail to represent the complexity of the relationship.Thus a core problem to resolve in GAMs is to determine the smoothing parameters and balance overfitting versus capturing the complex in the relationship.This is done automatically in the GAM functions of the 'mgcv' R package (Wood 2015) used in this work.
Some earlier work has combined GAMs with location.Kelsall and Diggle (1998) used GAMs to estimate covariates with a discrete smooth (spline) for location.However they did not look at the interactions of individual covariates over space as was done by Kammann and Wand (2003), who combined geostatistical approaches with a GAM by merging kriging, cast as a linear mixed model, with an additive model in order to account for non-linearity.However, this was only done for the intercept and for prediction rather than process understanding.In recent work, Fan and Huang (2022) proposed a SVC model via a GLM with reduced-rank thin-plate splines.In these, the rotations of the covariates are not changed and the splines are fitted with fewer coefficients than there are data to smooth using an Eigen-decomposition for coefficient (rank) reduction.An alternative approach to SVC modelling is to consider the predictor-to-response relationship over space as a GP.Here a GP is a random process over functions whose terms can be modelled using GP splines within a GAM.In this work, a low rank GP spline, parameterised with location was used rather than the more standard thin-plate spline.This is because GPs are good for specifying autocorrelation in spatially-varying random functions and GP-based smoothing using observations at specific locations should detect any spatial trends in the data.

A Geographical Gaussian Process GAM for SVC modelling
GAMs provide a method for calibrating regression models with unspecified functions of the predictor variables, of the form: where z j may be a scalar or a vector.These can be extended such that each f j ðz j Þ is a linear regression coefficient on another scalar predictor x j : and z is a vector of locations then this specifies a SVC model: One way of specifying aðzÞ � � � f m ðzÞ is that each function is generated from a GP and each function estimate is an a posteriori estimate of a GP with a zero mean.
Importantly in the context of this work, GPs also have a covariance function: These control the smoothness of f m ðzÞ, such that the more j m ðdÞ decreases as d increases, the smoother f m ðzÞ tends to be.In this sense, these are similar to models based on kriging (that similarly use covariance functions via semivariograms) and similar to MGWR, as the covariance function for each f m ðzÞ is individually calibrated to optimise model fit.Thus a key task in the GAM, in the context of SVCs, is to estimate parameters in each j m ðdÞ and so estimate f m ðzÞ: A GAM uses smooth functions of the predictor variables in which the values of y are assumed to have an exponential family distribution: where hð:Þ, �ð:Þ, Tð:Þ and Að:Þ are known functions, and h is a vector of parameters.This very flexible form can represent a wide range of commonly used distributions such as Gaussian, Poisson, Gamma, or Binomial with a known number of trials.If where f is the function being modelled, then in GAMs a space of functions, or basis, is chosen of which f is some element, rather than more specifically assuming y to be some linear function of x.This allows the basic formula above to be expanded: where each j j is a basis function of the transformed x and the c are the corresponding regression coefficient estimates.One example of a basis is a GP basis.If there are n distinct geographical locations in the data set, knowing the locations and the covariance function j allows the variance-covariance function of the values of b j in each location to be found, giving a variance-covariance matrix R.This can be translated into a set of n basis vectors j j ðxÞ (Hefley et al. 2017), and the GAM can be calibrated in this way.Thus, in contrast to standard linear models, the predictors in a GAM include smooth non-linear functions of some or all of the covariates, which allows for non-linear relationships between the predictors and the target variable.
It is this local fitting that starts to hint at how GAMs can be used with geographic data, if the non-linear relationships are constructed over geographical space.This can be done by constructing splines that represent the GP parameterised by location as well as attribute space.Specifically a GP spline with location can be specified for each covariate j, although because the GP has a mean zero, a further constant a j is added as an offset, creating a spatially varying coefficient b j ðzÞ: In the standard 2D case, z i ¼ ðu i , v i Þ: This is the Geographical Gaussian Process GAM (GGP-GAM) described in this paper.

A simulation case study
Simulated spatial data sets with varying degrees of regression relationship heterogeneity were used to examine the performance of the GGP-GAM and to compare that with the performance of a standard MGWR.The simulated data were created in a similar way to Fotheringham et al. (2017) and used subsequently by others (e.g.Fan and Huang (2022)), with the aim of simulating the coefficient estimates (b j ðzÞ's) for Equation (8): Three coefficient surfaces were created over a 25 � 25 regular square grid.Each of the surfaces was assigned to the coefficients of the three predictor variables as follows: The values for predictors x 1 and x 2 were generated from a normal distribution in the range [0, 1] (i.e Nð0:50, 0:17Þ) and � from normal distribution in the range [0, 0.25] (i.e.� � Nð0:04, 0:11Þ), and 100 surfaces were generated.The values of the response y can be found directly from the simulated coefficients, the simulated predictor variables and the simulated error term.The simulated true regression coefficient surfaces are shown in Figure 1, which the analysis will seek to estimate using GGP-GAM and MGWR.
As discussed above, GP splines can be used within a GAM model.The mgcv R package (Wood 2015) was used to construct the GAM with GP splines with a GP smooth and for the simulation data, the spatial locations were extracted from (u, v) respectively.In this package, a parameter which controls the degree of smoothing of the data (via the GP's correlation function) is optimised and as such potentially indicates the locally varying nature of the coefficient estimate in a similar way to MGWR bandwidths.The GPs modelled in the GAM function all have a mean of zero, so for each covariate an extra fixed offset term is added along with the spatially smoothed terms.All of the MGWR models were fitted using the GWmodel R package (Lu et al. 2014;Gollini et al. 2015).
The simulated coefficient surfaces for b 0 , b 1 , b 2 in Figure 1 and the 100 simulated values of x 1 , x 2 and � were used to create a 100 sets of predictor variables over the 625 surface points, y.These values of true y, x 0 , x 1 and x 2 were then used as inputs to the GGP-GAM model (i.e at 625 located observations with 4 fields) to generate coefficient estimates.For comparison, a MGWR was also undertaken with the same data.The spatially located coefficient estimates for both models were retained and used to generate measures of coefficient accuracy by comparing the model estimated coefficients with true ones in Figure 1, generating R 2 for b 1 and b 2 (as b 0 is stationary R 2 cannot be computed) and RMSE and MAE for b 0 , b 1 , b 2 .An assessment of model fit for predicted y (ŷ) and true y for GGP-GAM and MGWR is found using AIC.This was done for each of the 100 sets of simulated y, x 0 , x 1 and x 2 .
The results are shown in Figure 2.Under each fit measure, (AIC, RMSE, MAE, R 2 ) the GPP-GAM generates more accurate estimates of the true coefficients than MGWR, with the difference in fit measures increasing with increasing degrees of spatial heterogeneity.The distributions of AIC values indicate the GGP-GAM to have consistently lowered AIC to that found with MGWR, indicating improved model parsimony and fit with GGP-GAM.This is also shown visually using an example set of b values from the 100 sets of coefficients estimated by the GGP-GAM and by the MGWR to recreate the surfaces for b 0 , b 1 , b 2 .Figure 3 shows the coefficient surfaces estimated from the 4th set of simulations, with the same shading breaks as Figure 1.The better performance of the GGP-GAM in estimating the true bs is clearly demonstrated.
A final comparison concerns the estimated scale of process spatial heterogeneity.In a MGWR model this is indicated by the optimised parameter-specific bandwidths that explicitly describe the scale of the predictor-to-response relationship.In a GGP-GAM, this is indicated through the smoothing parameters (SPs) for each spline.Table 1 summarises the MGWR adaptive bandwidths and the GGP-GAM spline SPs for the 100 models using simulated data.What is interesting is the homogeneity of the bandwidths identified under the MGWR models and heterogeneity of the spline SPs values identified under the GGP-GAMs.However, this is related to the nature of the GAM SPs which are parameters in a covariance function, and thus not equivalent to a bandwidth.Here the GGP-GAM models were constructed using the mgcv package defaults.The result was that the number of knots, k, which defines the basis dimension in a spline, was automatically set at 33 in each GGP-GAM.Normally, k is iteratively determined (S.Wood 2015), as done in the empirical analysis below.Thus, the heterogeneity of the spline SPs in Table 1 is probably a result of under-fitting.This point is returned in the Discussion section.

Data
A spatial analysis of the factors associated with the 2016 referendum on EU membership (Brexit) was used to empirically illustrate the proposed GGP-GAM approach and to compare it with MGWR.Census and voting data were obtained from the parlitools R package (Odell 2020).This includes data on voting for the 632 Westminster parliamentary constituencies in Great Britain (i.e.excluding those in Northern Ireland, for which some data is missing).Information on the data and the package can be found at https://docs.evanodell.com/parlitools/.The voting data were linked to socio-   (Odell 2020).The response variable and the UK regions for reference are shown over both spatial structures in Figure 4.The predictor variables over the hexagonal cartograms (hex-bins) are shown in Figures 4 and 5.The GGP-GAM and MGWR spatial analyses used the original untransformed data and the coordinates from the OSGB projected data, with the results displayed using the hex-bins.

A GGP-GAM analysis
Spatially varying coefficient models with the GGP-GAM were undertaken using the OSGB projected parliamentary constituency in Figure 4.The geometric centroids of each parliamentary constituency were extracted to generate X and Y (Easting and Northing) variables in kilometres.A GGP-GAM was specified.Recall that the splines optimise a parameter which controls the degree of smoothing and that the GPs modelled in the GAM function all have a mean of zero, so for each covariate an extra fixed offset term was added along with the spatially smoothed terms.In this case the GGP-GAM was fully tuned by specifying a sufficiently high number of knots through the k parameter, in contrast to the GGP-GAMs applied to the simulated data above for which the defaults were used (k ¼ 33).The value of k sets an upper limit on effective degrees of freedom associated with GP spline smooth and thus the splines bases.Higher values of k also increase computation time.However the choice of k is difficult to determine algorithmically or to optimise and a typical approach is to investigate different values of k and adjust up or down as need.As Simon Wood notes in the mgcv R package manual under the choose.khelp page, '[the] exact choice of k is not generally critical: it should be chosen to be large enough that you are reasonably sure of having enough degrees of freedom to represent the underlying 'truth' reasonably well, but small enough to maintain reasonable computational efficiency' (Wood 2015).Here, the GGP-GAM was specified with a k ¼ 60.This was large enough to accommodate the GGP-GAM parametric and spline parameters, and to ensure sufficient degrees of freedom.A summary of the fixed parametric coefficient estimates, the smooth terms of the GGP splines for the GGP-GAM model and the distribution of the spatial varying coefficient estimates are shown in Tables 3-5, respectively.The GGP-GAM fixed (parametric) coefficient estimates in Table 3 can be interpreted as linear terms in the model, in a similar way as the results of a standard linear regression model.Here, only the Intercept and the covariate for the election 2015 turn out (Turn out ( 2015)) are globally significant.
Table 4 summarises the GGP-GAM smooth terms (GP splines) and the spline SPs.The SPs determine the 'wiggliness' of the SVCs and are controlled by the number of knots (k).Together these control the characterisation of the response-to-predictor relationships over space.The spline SPs are parameters in a covariance function that penalise the complexity of the model and k defines the basis dimension in a spline (also referred to as a smooth), and are generally indicative of the degree of variation in the predictor to response spatial relationship.Table 4 indicates that the SP is    Table 5 summarises the distribution of the GGP-GAM SVCs.Most flip between being positive in some constituencies and negative in others, except for the Intercept, Over 65s (%) and Transport & utilities employment (%), which are consistently positively associated with the leave vote.However the interpretation of these should be alongside the interpretation of the fixed parameters and the spline smooth terms: the Intercept is significant globally and locally; Turn out ( 2015) is globally significant with Degree qualification (%) and UK born (%) significant locally.
It also possible to map the spatial variation in the coefficient estimates generated by the GGP-GAM as in Figure 6.The spatial trends in the coefficients show a number of spatial patterns associated with different population characteristics and the leave vote share.The 2015 Turn out is positive associated in the East; Over 65s is positively associated except in the South West; Transport and utilities employment is positively associated except in the South East and London; Leisure and hospitality is negatively associated except in Scotland and the South West; Manufacturing is positively associated except on the East, East Midlands and North East; Degree and Bad Health are negatively associated nearly everywhere; and Born UK is positively associated except in the South East, London and Scotland.So overall, for some covariates, there are some distinct differences between Scotland, the East, South East (including London) and the parts of the South West with the rest of the country, but in different directions for different covariates, suggesting how individual factors were interacting in different ways in these areas in relation to the leave vote.

A MGWR analysis
Finally it also is possible to compare the GGP-GAM results with those from a MGWR.Summaries of model fit and accuracy are shown in Table 6.Comparing the diagnostics of AIC, adjusted R 2 , and MAE for the GGP-GAM and MGWR models indicates that the MGWR model exhibits marginally better metrics: broadly they seem to be performing as well as each other.Table 7 summarises the MGWR SVCs and the parameter-specific adaptive bandwidths.It is instructive to examine the variation in these and to compare with the GGP-GAM SVCs in Table 5. Comparing the SVC inter-quartile ranges indicates that both models exhibit similar levels of variation for many of the covariates, in terms of their sign and range, except for the following: � Transport & utilities employment (%): negative in MGWR, positive in GGP-GAM.� Leisure & hospitality employment (%): flips sign in MGWER, negative in GGP-GAM.� Bad Health (%): flips sign in MGWER, negative in GGP-GAM.� UK born (%): positive in MGWR, flips sign in GGP-GAM.
This suggest that in general the inference (understanding) supported by each model is largely similar when the associations between the target variable and individual covariates are considered, but with some potentially important differences as described above.

Discussion and conclusion
Spatially varying coefficient (SVC) models explicitly accommodate process spatial nonstationarity, where statistical relationships expressed using regression coefficient estimates are allowed to vary with location.SVCs provide an explicit representation of process spatial heterogeneity and can be mapped to describe how and where statistical relationships vary.These in turn provide insights into variations in the underlying socio-economic, environmental or other kind of geographical process, and support the spatial 'detective work' that is core activity in much spatial and geographical analysis.This paper proposes an alternative SVC based on GAMs with Gaussian Process splines parameterised with location -a Geographical GP GAM (GGP-GAM).It was applied to 100 simulated data generated from coefficients with known spatial heterogeneities and was shown to out-perform Multiscale Geographically Weighted Regression (MGWR) under a range of fit metrics.Then both models were applied to an empirical case study of the UK Brexit referendum.Here the MGWR model was shown to (marginally) outperform the GGP-GAM.Comparisons of the SVCs arising from both models indicated similar levels of variation for many of the covariates, with few differences in the predictor-to-response relationships.The correlations between the GGP-GAM and MGWR coefficients are summarised in Table 8 with two covariates indicating different relationships.In general, however, this reinforces an emerging theme from the SVC literature that alternative SVC approaches generate results that are much more similar than they are different, indicating a general coherence across this family of models and supporting SVC modelling overall.The GGP-GAM covariates were mapped and these indicated some strong regional differences in their relationship with the leave vote, for specific variables in different parts of the study area, when compared to the rest of the country (Figure 6).These are reflected the spatial variation in the associations between socio-economic factors the process and the Brexit leave vote.
In terms of historic frequency of use, the SVC model 'brand leader' is Geographically Weighted Regression.One could say that SVC modelling is to 'vacuum cleaner' as GWR is to 'Hoover'.More specifically Multiscale GWR is becoming the approach of choice.The increase in GWR and MGWR applications over recent years (as documented in Comber et al. (2022)) has in part been driven by the conceptual simplicity of GWR and MGWR: they generate a collection of local regression models, which are readily understood by both expert and non-expert users.However there are number of conceptual limitations to GWR-based approaches.Coefficients are estimated as though calibrating multiple models rather than a single one, whereas, for example, GP-based and ESF-based approaches within GLM or GAM frameworks provide a full single data model.Additionally, GW-based regressions may be sensitive to local collinearity even when this is not a problem across the whole dataset (Wheeler and P� aez 2010;P� aez et al. 2011), although tools for GWR have been developed to compensate for this (see Gollini et al. (2015)).Finally, kernel based approaches like GWR and particularly MGWR are computationally demanding, as they require a series of optimisation problems to be solved to determine the individual scales of predictor-toresponse relationships (Fan and Huang 2022).
In contrast, the proposed GGP-GAM has a rich and more comprehensive theoretical underpinning through its use of GP splines set within a GAM framework that has much greater flexibility than MGWR (especially as many GWR extensions have not been developed for MGWR).For example, GAM frameworks are relatively mature, and GAM functions commonly provide options for penalized terms for handling collinearity, robust options for treating outliers, and options for heteroskedastic or autocorrelated error terms.Further, MGWR is still only for Gaussian responses, despite being developed in 2014, whereas GGP-GAM can be used for a variety of responsetypes and has much stronger theoretical background to support such 'within framework' extensions than GWR/MGWR.The one area where MGWR has an inferential advantage is in the way that the scales of predictor-to-response variable relationships are reported.In MGWR, the parameter specific bandwidths provide an explicit and intuitively understood indication of process spatial heterogeneity, which GGP-GAM outputs currently lack.The nearest similar measure is the GGP-GAM spline SPs as in Table 1 but these describe parameters in a covariance function whose stability is linked to the specification of the knots parameter, k, and are not equivalent to a bandwidth.They do however provide a measure of the spatial complexity of the predictor to response relationship.Wolf et al. (2018) compared MGWR and multiscale Bayesian SVC (BSVC) models and sought to unpick the conceptual differences between the bandwidths reported by the two approaches.Specifically, they were interested in expounding how the different bandwidth distances indicated process 'scale' and identifying some important ontological differences between them.Their summary was that the MGWR bandwidths describe the local heterogeneity of predictor to response relationships and BSVC bandwidths (ranges) provided a composite measure of predictor to response relationships heterogeneity combined with local variation.Understanding the scale of spatial variation in this way is a key geographic consideration.However, as with BSVC, weights in GGP-GAM will not decay uniformly from a calibration point as in MGWR, since accounting for spatial correlation in a BSVC or a GGP-GAM leads to desirable weighting effects related to the data's spatial configuration (Chiles and Delfiner 1999).In this respect, the spatial configuration of a dataset will directly affect any comparison between a GGP-GAM and MGWR (see also Harris et al. (2011) in the context of GWR versus kriging).Future work will explore these issues in depth and how to characterise the link between the spline SPs through the GP covariance function in order to increase the interpretability of process scale through the GGP-GAM.Specifically, it will seek to connect the spline SPs to an inference of geographic scale.A final observation is the potential for SVCs via a GGP-GAM to overcome some of the long-standing reservations about GWR-based regression methods and to elevate the perception of SVCs amongst the broader community.Typically, GWR studies are presented as informal, exploratory analyses, and thus of implicitly lower status, supporting informal inferences of observed spatial processes.The mature GAM framework offers a potential opportunity to elevate this class of work and maybe incentivise greater mainstream use of SVCs.Future work will provide a more thorough demonstration of how GGP-GAMs and their outputs can be explored, tuned and reported, potentially using the diagnostic tools proposed in Sachdeva et al. (2022).
There are a number of other areas of further work that will be undertaken.The immediate next steps are to explore how the GGP-GAM handles highly complex spatial heterogeneities and apply it to larger datasets.Then a wrapper package for undertaking GGP-GAMs and extracting the coefficients at each locations will be constructed to sit above the mcgv package.Currently, a number of operations are needed to set up the GGP-GAMs and to extract the outputs in a form useful to a person interpreting a SVC model.Although R code can be written to do this, providing a simpler interface for frequent use would greatly increase accessibility.It is anticipated that there will be a number of further methodological developments.First, investigations will be undertaken to determine optimal spline smoothing parameters, which control the smoothing of the data and as such indicates the locally varying nature of the coefficient, and the number of knots used to ensure sufficient degrees of freedom across the data and the splines.In this paper, defaults were used, and the results are encouraging.However this suggests that potentially even better results could be achieved if the rubrics for tuning the GGP-GAP could be established.Second, further work will investigate how to provide explicit descriptions of the scale of predictor-to-response variable relationships.An approximation of these can be extracted from the semivariograms of the GP spline smoothing parameters which can be constructed for each GP spline.However, there is some uncertainty in the interpretation of the semivariograms in the mcgv package, in which the GGP-GAMs were constructed.Third, GAMs have options for penalised terms in order to handle predictor variable collinearity and selection.Investigations are needed to determine how and whether this needs to be tested for and then specified.Fourth, the wrapper package will include options to deal with outliers and dependent errors (i.e.heteroskedascity, autocorrelation) will also be investigated for GGP-GAMs.There are also a number of areas of SVC model development, particularly in the temporal domain.GP splines can be parameterised with time as well as as location.This suggests opportunities for methodological entrancement to current Geographically and Temporally Weighted Regression (GTWR) approaches (Huang et al. 2010;Fotheringham et al. 2015) and multiscale GTWR (Wu et al. 2019), which effectively determine parameter (q) to link space and time.These, and other methodological opportunities, are possible because of the considerable developments that have taken place around GAMs in the temporal, spatial and spatio-temporal domains.
In conclusion, this paper demonstrates for the first time, the formulation and application of a GAM with GP splines calibrated via observation location in a multiscale spatially varying coefficient model.The Geographical Gaussian Process GAM (GGP-GAM) was applied to 100 simulated datasets with known variations in spatial heterogeneity, and were shown consistently to perform better than MGWR.GGP-GAM was applied to a Brexit case study to demonstrate how process GPP-GAMs could be used to understand how and where relationships (and therefore processes) vary spatially.The results were compared with MGWR which was shown to perform marginally better.A number of areas of further work have been identified, including the development of a user-friendly wrapper to the GGP-GAM functions that will incorporate spline

Figure 1 .
Figure 1.The simulated regression coefficient surfaces with zero, low and high spatial heterogeneity.

Figure 2 .
Figure 2. Evaluation of the accuracy of the GGP-GAM and MGWR regression coefficient estimates, when compared to the true coefficients, and with the distribution of AIC values for model fit.

Figure 3 .
Figure 3.The estimated regression coefficients from a single fit of the GGP-GAM and MGWR models, shaded using the same range as Figure 1.

Figure 4 .
Figure 4.The Brexit leave vote majority over parliamentary constituencies with the UK regions, under an OSGB projection (left) and recast into hexagons (right).

Table 1 .
Beecham et al. (2020)dwidths (BW) and GGP-GAM spline smoothing parameters (SP).economicfactors that were associated with the Leave vote as described inBeecham et al. (2020)and contained in the parlitools R package.A number of variables were chosen to illustrate the SVC model using a GGP-GAM relating to structural economic variables, age, and employment sector.These are summarised in Table2.The constituency areas with an OSGM projection (https://epsg.io/27700) are available from the UK Office of National Statistics 1 and the hexagonal cartograms are contained in the parlitools R package

Table 2 .
The variables used to construct the GGP-GAM of leave voting rates, for each parliamentary constituency.

Table 3 .
The fixed (parametric) coefficients for the GGP-GAM.

Table 4 .
The smooth terms of the GGP-GAM model, with the Spline smoothing parameters.and 10 −5 for the other covariates.The effective degrees of freedom (edf in the table) can be interpreted as an indicator of the non-linearity of the relationship of the response (leave vote) with the predictor variables -i.e. the effective degrees of freedom would have a value of ~1 if the model penalized the smooth term to a near linear relationship.Higher edf values indicate increasing non-linearity in the relationship over a 2-dimensional space defined by (X, Y).All edf values are less than k indicating that it is adequately specified.The p-values relate to the splines defined over geographic space and their significance can be interpreted as indicating whether they vary locally.The p-values indicate that the Intercept, Degree qualification (%) and UK born (%) are significant locally.The insignificant p-values can be interpreted as indicating that although there are effects, these do not vary locally.

Table 6 .
Comparison of the model fit diagnostics for the GGP-GAM and MGWR models.

Table 7 .
The distributions of the MGWR spatially varying coefficient estimates and the adaptive bandwidths.

Table 8 .
The correlations between the GGP-GAM and MGWR spatially varying coefficient estimates.