Regional policy and the role of interregional trade data: policy simulations with a model for Norway

ABSTRACT Point data observations are often used to calibrate computable general equilibrium (CGE) models; however, results may be impacted by calibration of an Armington trade specification in a regional CGE (R-CGE) model. This paper calibrates an Armington trade specification with three differently estimated interregional trade data sets. It estimates interregional trade with one survey and two non-survey methods. The resulting three different trade data sets are each used to calibrate REMES, an R-CGE model for Norway. Two regional policy reforms are simulated with the three model versions to analyze the sensitivity of regional manufacturing sector output to trade data estimates. The results show that the trade data estimation method used for calibration significantly affects regional sector output results. Policy analysts and developers should be aware that calibrating an Armington trade specification with differently estimated interregional trade data may have a substantial influence on model results, and hence, on ex-ante and ex-post conclusions on policy impacts.


INTRODUCTION
Regional computable general equilibrium (R-CGE) models have grown in popularity for assessing impacts of various regional policies (e.g., Brandsma, Kancs, Monfort, & Rillaers, 2015; Horridge & Wittwer, 2008; Törmä, 2008; Vandyck & Van Regemorter, 2014). These models are based on their national counterpart, the CGE model. Data availability and quality are major issues for operational R-CGE models, and calibration from a balanced data set is the commonly used method to specify and parameterize these models. Calibration is advantageous in operational models where data availability is limited. However, this comes at a cost: model results may be sensitive to the choice of data, for example, the choice of base year. The main data input for calibrating CGE models is the social accounting matrix (SAM). The SAM describes the circular flow of income between economic agents (e.g., industries, households and government) and, by assumption, shows a snapshot of an economy in equilibrium.
The data for creating national SAMs are available for most countries, e.g., GTAP (Badri Narayanan & McDougall, 2015) and EXIOBASE (Wood et al., 2015). Moreover, it is possible to specify a numerical CGE model with a SAM on the basis of one-year observations. A major criticism against (R-)CGE models is empirical weakness due to parameterization based on calibration from point data. In particular, data used for calibrating R-CGE models should not include atypical data values, especially for parts of the data set important for the analysis (Partridge & Rickman, 1998). One way of dealing with this criticism is to perform a sensitivity analysis, where the quality of the point data is considered. The present paper addresses sensitivity by measuring how much variation we see in outputs of an R-CGE model where calibration of the Armington trade specification is performed with three different (but each internally consistent) data sets for interregional trade. The higher the variation in results between differently calibrated model versions, the higher the sensitivity.
The objective of this paper is to examine to what extent the inferences from an R-CGE model are robust to the choice of various estimated interregional trade data. We apply two exogenous reforms in an R-CGE model in order to analyze sensitivity. The sensitivity evaluation is performed on regional manufacturing output, the largest sector for which the most detailed data are available.
The paper is organized as follows. The next section overviews how sensitivity generally has been addressed in CGE models. Data and the interregional trade estimation are presented in the third section. The fourth section presents the features of the Norwegian R-CGE model, REMES (Werner, Johansen, Perez-Valdes, & Stokka, 2015), used in the sensitivity analysis and calibration of the Armington trade specification. In the fifth section, a calibrated parameter sensitivity analysis (CPSA) is performed using three differently calibrated variants of REMES. We conclude our findings in the sixth section.

CALIBRATION AND SENSITIVITY IN THE CGE MODELS
In the CGE literature, elasticities (of substitution) rather than calibrated parameters from a data set are considered to be the exogenous parameters most crucial for the model results. This is because the model's elasticities are often collected from the literature and can therefore be very uncertain, while the calibrated parameters have a more solid empirical foundation in data (Dawkins, 2005). For that reason, most sensitivity analyses have been performed on elasticities. Three methods that are typically applied for sensitivity analysis in CGE models are (Wigle, 1991): limited sensitivity analysis (LSA), conditional systematic sensitivity analysis (CSSA) and systematic sensitivity analysis (SSA). In particular, LSA is often used for sensitivity analysis of R-CGE models (Partridge & Rickman, 1998; Rutherford & Törmä, 2010). In LSA, only the exogenous parameters considered most important for the simulation results are varied between self-chosen values. In CSSA, rather than choosing values, the exogenous parameters used in the sensitivity analysis are often sampled from econometric studies in the literature or, in some cases, taken from estimates for the particular work in question (Harrison, 1984). As in LSA, other parameters considered less relevant to the model outcomes are held fixed. SSA checks simultaneous sensitivity by varying several of the exogenous parameters. Performing an SSA may lead to computational challenges since a huge number of model runs must be performed. We see that, for example, Monte Carlo filtering (Mary, Phimister, Roberts, & Santini, 2013) and Gaussian quadrature approximation (Channing & Pearson, 1998) have been used as approximation techniques to reduce the number of model runs in partial sensitivity analyses.
However, our concern here is not sensitivity in model results caused by varying the exogenous elasticities, but rather sensitivity caused by varying the estimation methods used to obtain the desired level of detail in the data used for calibrating parameters in the model. A CPSA is performed by Roberts (1994), who uses SAM data from five different years, 1986-90, for calibration of a CGE model for Poland. The sensitivity was analyzed for a policy shock consisting of a 10% increase in government expenditure. Roberts concluded that the results were quite robust to the data year. On the other hand, it was also argued that all years represented were in a rather stable economic period for Poland. Dawkins (2005) developed a methodology to analyze the sensitivity of both exogenously given parameters (elasticities) and calibrated parameters (i.e., from the SAM). We find similar examples in Elliott, Franklin, Foster, Munson, and Loudermilk (2012), who performed both SSA and CPSA on a multinational CGE model, comparing the sensitivity with regard to the parameters calibrated on the base-year data, different levels of economic and geographical aggregation, substitution, and Armington trade elasticities. Using a Monte Carlo experiment, they found greater sensitivity to uncertainty in the elasticity of substitution parameters than to uncertainty in the base-year (calibrated) data as the projection period increased. Another conclusion is that sensitivity varies depending on which output variable is analyzed.
All reviewed papers show how one can conduct a sensitivity analysis in a CGE modelling framework in order to pinpoint the model assumptions that are crucial for the interpretation of the model's results. However, in a regional context, data issues may be especially important for the final model results. Regional models, in many cases, suffer from a lack of good data sources and, in some cases, missing data must be constructed. Hence, in the sensitivity analysis in this paper, we focus on variation of the constructed interregional trade data set used for parameter calibration of the Armington trade function used in the CGE model. Without knowing the uncertainty related to weak data quality, e.g., in interregional trade data, it will be hard to convince policy-makers to use such complex models and to have confidence in the results. Alternatively, less complex approaches to evaluating regional policy are more likely to be preferred by policy-makers.
The role of this data set has previously been investigated for both input-output models and SAM models (Robinson & Liu, 2006; Sargento, Ramos, & Hewings, 2012). However, to the best of our knowledge, this has not been done in a regional CGE model. We test the sensitivity of simulation results in the R-CGE model REMES.¹ An introduction to the data sources and how the interregional trade data are estimated now follows.

INTERREGIONAL TRADE DATA
The national SAM used is based on supply-use tables from Statistics Norway (SSB) for the year 2014. Regional data on production, intermediates and consumption are used to disaggregate the national SAM into regional SAMs and are based on data for 2010 (SSB, 2014). We represent nine regions with regional SAMs in the R-CGE model; for convenience, the regions are named R1-R9 (Figure 1). R9 is the Continental Shelf and lies outside the map shown in Figure 1. Table 1 shows the regional output shares of the different industries we represent in the model. Regions R1 and R3 are the largest in the model with respect to population and total output. R1 is the capital region of Norway; region R3 is the centre of the land-based oil and gas-related industry. Region R1 has the largest output for all industries except for primary and oil and gas. The output share of oil and gas production in region R3 in Table 1 is 14%, which is significantly higher than for the other land-based regions. The primary industry is largest in region R7 due to a strong fishery sector.
Regions R1, R3, R4 and R6 are all urban; of these, R4 and R6 have a smaller total output than the other two, as shown in the second-to-last row of Table 1. The distribution of population within the nine regions is shown in the bottom row of Table 1. When comparing the area sizes of all regions in Figure 1 with the population shares in Table 1, we can observe that regions R2, R5, R7 and R8 are the least densely populated. The urban/rural dimension of the regions is reflected in the benchmark levels of payroll taxes.² To counter centralization, Norway has a regionally differentiated payroll (RDP) tax, with lower tax rates for rural regions. One of the policy reform shocks we analyze in this paper concerns these RDP taxes.
Establishing the interregional trade data sets
In order to have a complete data set for the R-CGE model, the regional SAMs have to be connected with a not-yet-estimated interregional data set. (The disaggregation of the national SAM to the regional SAMs does not give the interregional trade flows.) The structure of the interregional data that we want to estimate is shown in Table 2. The known data values are the Sum values in the last columns. These Sum values are available from the regional SAMs. Both the inter- and intra-regional trade flows (the latter along the diagonal) are unknown and must be estimated, where $XO^m_{ij}$ is the interregional trade flow of good $m$ from region $i$ to region $j$, and $XO^m_{ii}$ is the intra-regional trade along the diagonal.
In the following, the asterisk in $XO^m_{*i}$ and $XD^m_{i*}$ denotes a summation over the missing index. Overlined symbols $\overline{XO}^m_{ij}$/$\overline{XD}^m_{ij}$ are exogenously given (i.e., known) parameters. REMES has different price sets for producers and buyers of final and intermediate products. Therefore, traded products ($XO^m_{ij}$/$XD^m_{ij}$) are valued in two price sets: for producers, $XO^m_{ij}$, and for consumers, $XD^m_{ij}$. This is the reason why the Sum totals for rows and columns in Table 2 are not equal, and why the Row-sums in the last column in Table 2 are equal to $XD^m_{i*}$. This is the sum of the estimated interregional trade data ($\sum_i XO^m_{ij}$), including the ad valorem trade and transport margin and the net product tax. In REMES, taxes are included in the destination value of products ($XD^m_{ij}$).³

Equations (1)-(3) denote the macroeconomic balancing constraints that have to be fulfilled when estimating the interregional trade. Equations (1) and (2) ensure that the interregional trade flows equal the total supply and demand of the product for each region, while equation (3) captures differences in price levels for producers and buyers.

In order to assess their impact on simulation results, the interregional trade data set is estimated with three different methods often used in regional CGE models (Brandsma et al., 2015; Horridge & Wittwer, 2008; Ivanova, Kancs, & Stelder, 2010; Potters, Conte, Kancs, & Thissen, 2014). The three methods are denoted as the basic method (BM), gravity method (GM) and survey entropy method (SEM).
The BM is a non-distance and non-survey-based method (the regional trade is spread out based on production-consumption shares). It estimates the origin-destination (O-D) matrix based on the information shown in Table 2, not using any additional information about trade patterns or spatial connections between regions. Equations (4) and (5) determine the interregional trade in producer and buyer prices respectively.
In equation (4), the trade (in producer prices) $XO^m_{ij}$ is calculated as the total output (in producer prices) of region $i$ multiplied by the consumption in region $j$ as a share of total consumption (in buyer prices).
In equation (5), the trade (in buyer prices) $XD^m_{ij}$ is calculated as the total demand (in buyer prices) of region $j$ multiplied by the output of region $i$ as a share of total output (in producer prices). As a result, each region consumes the same mix of domestic imports, with regional origin shares equal to each region's share in domestic production.

Table 2. Intra- and interregional trade flows of products $m$ from region $i$ to region $j$.
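The BM allocation described for equations (4) and (5) can be illustrated with a minimal sketch. The function name `basic_method` and the three-region totals are hypothetical and are not taken from the paper's data; the sketch only follows the verbal description above and checks that the resulting flows reproduce the known row and column totals.

```python
def basic_method(output, demand):
    """Spread trade by production-consumption shares (verbal form of eqs (4)-(5)).

    output[i]: total output of region i in producer prices (known row total).
    demand[j]: total demand of region j in buyer prices (known column total).
    """
    n = len(output)
    total_output = sum(output)
    total_demand = sum(demand)
    # Eq. (4): output of region i times region j's share of total consumption.
    xo = [[output[i] * demand[j] / total_demand for j in range(n)] for i in range(n)]
    # Eq. (5): demand of region j times region i's share of total output.
    xd = [[demand[j] * output[i] / total_output for j in range(n)] for i in range(n)]
    return xo, xd


# Hypothetical totals for three regions (illustrative only, not paper data).
output = [100.0, 60.0, 40.0]   # producer prices
demand = [110.0, 70.0, 40.0]   # buyer prices (includes margins and taxes)
xo, xd = basic_method(output, demand)

# Producer-price flows add up to each region's output,
# buyer-price flows add up to each region's demand.
row_sums = [sum(row) for row in xo]
col_sums = [sum(xd[i][j] for i in range(3)) for j in range(3)]

# Every destination j buys the same mix of origins: output[i] / total output.
origin_shares = [[xd[i][j] / demand[j] for j in range(3)] for i in range(3)]
```

In this sketch every column of `origin_shares` is identical (0.5, 0.3, 0.2), which is the "same mix of domestic imports" property noted above and the reason the BM shows no distance or trade-preference pattern.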
The GM uses a gravity formulation with distances. It is based on the idea that trade volumes between regions depend on the distance between them (travel distance, cultural distance, etc.).
In the GM, we estimate a distance data set (measured in travel distance, km) between the regions, $Dist_{ij}$. The regional distance data set ($Dist_{ij}$) is created from the municipality-to-municipality distance data weighted by each municipality's relative population size in 2013 for both origin and destination regions.⁴ Equations (7)-(9) introduce a standard doubly constrained gravity model. The weights $s_1$ and $s_2$, together with a distance decay parameter $s_3$, are set equal to 1, as in the basic gravity formulation from Wilson (1967). The GM is a heuristic model; no explicit objective function is minimized. We choose starting values⁵ for $A^m_i$ and $B^m_j$ and recalculate $XD^m_{ij}$, $A^m_i$ and $B^m_j$ until their values stabilize and we have converged to a solution.

The third method, SEM, uses a cross-entropy formulation with survey transport data as a proxy for interregional trade. Cross-entropy methods aim to minimize the distance from the estimate for the unknown distribution, the posterior, to a known distribution, the prior. This prior distribution could be survey data or other trade flow information for some region-product combinations, and can be exploited to improve the quality of interregional trade estimations.
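Returning to the GM, the iterative recalculation of the balancing factors can be sketched as follows. This is a generic doubly constrained gravity iteration in the spirit of Wilson (1967), not the REMES implementation: the function name `gravity_method`, the starting values of 1, the convergence tolerance and the three-region numbers are all assumptions made for illustration, and origin and destination totals are assumed to be expressed in the same price set so that they sum to the same value.

```python
def gravity_method(origin, dest, dist, s1=1.0, s2=1.0, s3=1.0,
                   tol=1e-12, max_iter=10000):
    """Doubly constrained gravity flows T_ij = A_i * B_j * O_i^s1 * D_j^s2 / Dist_ij^s3."""
    n = len(origin)
    # Unbalanced gravity term for every origin-destination pair.
    g = [[origin[i] ** s1 * dest[j] ** s2 / dist[i][j] ** s3 for j in range(n)]
         for i in range(n)]
    a = [1.0] * n  # assumed starting values for the row balancing factors
    b = [1.0] * n  # assumed starting values for the column balancing factors
    for _ in range(max_iter):
        # Rescale rows so that flows out of region i sum to origin[i] ...
        a_new = [origin[i] / sum(b[j] * g[i][j] for j in range(n)) for i in range(n)]
        # ... then rescale columns so that flows into region j sum to dest[j].
        b_new = [dest[j] / sum(a_new[i] * g[i][j] for i in range(n)) for j in range(n)]
        change = max(abs(x - y) for x, y in zip(a_new + b_new, a + b))
        a, b = a_new, b_new
        if change < tol:  # balancing factors have stabilized
            break
    return [[a[i] * b[j] * g[i][j] for j in range(n)] for i in range(n)]


# Hypothetical inputs (not paper data); within-region distances are short.
origin = [100.0, 60.0, 40.0]
dest = [90.0, 70.0, 40.0]
dist = [[10.0, 100.0, 300.0],
        [100.0, 15.0, 200.0],
        [300.0, 200.0, 20.0]]
trade = gravity_method(origin, dest, dist)
gm_row_sums = [sum(row) for row in trade]
gm_col_sums = [sum(trade[i][j] for i in range(3)) for j in range(3)]
```

With short within-region distances, the diagonal entries of `trade` come out larger than a pure share-based allocation would give, which mirrors the higher intra-regional trade estimates discussed later for the GM.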
We use a transport survey data set in the SEM. It originates from SSB and was further improved by the Norwegian Institute of Transport Economics (TØI), which added more products to the data set (Hovi, Caspersen, & Grue, 2015; Hovi & Jean-Hansen, 2003). The transport survey covers flows of goods in monetary values in Norway between the supply side (manufacturers, importers and wholesalers) and the destination users (intermediate products used in manufacturing and service industries, export, wholesale and retailers). The survey covers 39 groups of transported goods among 20 counties in Norway, which are further aggregated to the products and regions of the R-CGE model. This data set has a distribution that we use in the estimation in the SEM by creating a so-called prior. Priors $X^m_{ij}$ are defined using the survey data as $XO^m_{ij}$ and $XD^m_{ij}$ in equations (10)-(11). Next, these priors are used in the objective in equation (12). The SEM minimizes the sum of squared differences between priors and posteriors. For trade flows where no survey data are available, we use no prior information in the SEM.⁶ In these cases, the objective in equation (12) is minimized only over trade flows where priors are available. Trade flows without priors are determined by the model within the limits imposed by the row and column totals in Table 2.

Estimated regional trade data
The estimated intra- and interregional trade data for the manufacturing product from these three methods are presented in Table A3 in the supplemental data online. As expected, the BM estimates for trade data are rather spread out among regions without a clear pattern reflecting any trade preferences. The GM estimates for intra-regional trade are higher than the BM estimates due to the low within-region distances (see Table A4 online). The distance effect also causes higher trade values between neighbouring regions than in the BM.
That geographical distance is an important determinant of actual trade flows is reflected by the GM and SEM data. In the GM, the distances are considered directly; in the SEM, they are considered indirectly via transport costs. A notable difference between the SEM and the two others is that it captures some observed trade data. One example is the high trade values between R3 and R9 in the SEM. Region R3 is the land-based oil region and R9 is the ocean-based oil region (see output shares in Table 1); this trade connection is reflected nicely by the SEM.
The SEM relies on the available transport data. If there is no transport between two regions in the original survey data (our priors), generally there will be no trade flow in our (estimated) final SEM trade data set. In fact, the SEM produces six trade links with zero trade for the manufacturing product, as shown in Table A3 in the supplemental data online. However, only three of these six had a prior with value zero: all three are interregional exports to R9. The other three zero-trade flows are merely a result of the minimization approach in equation (12).
Since the SEM makes use of actual trade flow data insofar as they are available, we believe it will perform best and choose it as the reference method for comparison. Our belief is based on the idea that using available data, and more detailed data, will result in more accurate simulation results. Having the SEM as the reference method creates a contrast between an expensive data-intensive method and cheaper ad-hoc methods.

REMES: AN R-CGE MODEL FOR NORWAY
REMES includes a national government and regional households, representative product-producing firms, and product-transporting agents (cf., Shoven & Whalley, 1984). The economic agents maximize utility or profit. Agents typically have nested utility or production functions. Perfect competition is assumed for all factor input and output markets. For the macroeconomic closure, we assume that capital and labour are mobile among industries, but not between regions. We limit the regional mobility of labour and capital in order to isolate the impacts of the changes in trade flow data in the experiments conducted in this paper. The foreign exchange rate is fixed (numeraire of the model) and the current account balance changes in response to imports and exports from abroad. The national government has fixed real expenditures and fixed tax rates, while transfers to households are determined residually. Households have fixed savings rates, which determine private savings.

Figure 2 illustrates the nesting structure of the production function in an industry. Production of products is performed by each sector. A representative production agent produces a homogeneous product according to a nested elasticity of substitution production function, combining intermediate products (marked up with product taxes and trade and transport margins), labour (marked up for payroll tax), capital and investment. The value of the parameter $s$ in each nest defines the functional form of the constant elasticity of substitution (CES) function. In the top nest, there is a Leontief technology relationship ($s = 0$): no substitution between Materials and Production factors. Furthermore, in the composite Production factors nest we have Cobb-Douglas production functions ($s = 1$). This allows industries to substitute between Labour and Capital. A Leontief technology relationship is assumed between Capital and Investments in the composite Capital nest.
Producers distinguish between intermediate products produced in their own region, other domestic regions and products imported from other countries (rest of world, ROW). The lower dashed part to the left of Figure 2 shows the origins of the different goods: this is where the interregional trade data set enters the picture and is used for the calibration of value shares in the Armington function. This approach is in line with the theory of Armington (1969) when substituting between products from Own region, Other regions or Imports from the ROW. Bilgic, King, Lusby, and Schreiner (2002) argue that interregional trade should be more price sensitive compared with international trade due to higher price-related restrictions for international trade compared with domestic trade. As a consequence, international import elasticities ($s_5$) should generally be lower than regional import elasticities ($s_6$).
Elasticities determine how strongly a policy change will affect key macroeconomic variables. However, to keep the impact on simulations as transparent as possible, we assume equal values of the other elasticities for all products, industries and regions. The values used are indicated in Figure 2. We now turn to how we calibrate the Armington trade agents with the interregional trade data.
Parameterizing the Armington trade specification with the estimated data sets
Equations (13)-(20) show the equivalent of the REMES code for the lower dashed part on the left in Figure 2. Table 3 gives a description of the variables in the equations. Equations (13)-(16) show how quantities (Q) are calibrated with data from the regional SAMs and the interregional trade estimation. The symbols S, S1, S2 and S3 denote the different nests of the traded Armington composite goods; indices $i$, $j$ represent regions, while $m$ represents the respective goods of the model. These equations show how the intra- and interregional trade data $XO^m_{ij}$ enter the calibration of the Armington function from the lower nests S3 and S2 to the upper level, S (Figure 2). The import quantities from ROW are taken from the regional SAMs. Equations (17)-(20) show price equations of the different nests of the Armington function. In the lower nests, S2 and S3, we assume a Leontief relationship between interregional trade products and trade and transport margins. In nests S1 and S, we assume a CES function.
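To illustrate why the calibrated value shares matter, the following sketch uses a single-level CES price index in calibrated share form. It is a simplification and not the nested REMES specification: the function names, the benchmark flows and the elasticity value of 2 are hypothetical. The point is only that two share sets can both replicate the benchmark (composite price of 1 at benchmark prices) and still respond differently to the same price shock.

```python
def value_shares(flows):
    """Benchmark value shares of each origin in a region's purchases."""
    total = sum(flows.values())
    return {origin: value / total for origin, value in flows.items()}


def ces_price(shares, prices, sigma):
    """CES composite price in calibrated share form (requires sigma != 1)."""
    inner = sum(theta * prices[origin] ** (1.0 - sigma)
                for origin, theta in shares.items())
    return inner ** (1.0 / (1.0 - sigma))


SIGMA = 2.0  # hypothetical Armington elasticity, not a REMES value

# Hypothetical purchases of one buying region under two trade estimates.
flows_a = {"R1": 50.0, "own": 30.0, "other": 20.0}   # large imports from R1
flows_b = {"R1": 15.0, "own": 65.0, "other": 20.0}   # small imports from R1

benchmark_prices = {"R1": 1.0, "own": 1.0, "other": 1.0}
shocked_prices = {"R1": 1.1, "own": 1.0, "other": 1.0}  # 10% price rise in R1

shares_a = value_shares(flows_a)
shares_b = value_shares(flows_b)

# Both calibrations replicate the benchmark composite price of 1 ...
p0_a = ces_price(shares_a, benchmark_prices, SIGMA)
p0_b = ces_price(shares_b, benchmark_prices, SIGMA)
# ... but the same shock raises the composite price more where R1's share is larger.
p1_a = ces_price(shares_a, shocked_prices, SIGMA)
p1_b = ces_price(shares_b, shocked_prices, SIGMA)
```

Under these assumed numbers the composite price rises by roughly 4.8% with the first share set and roughly 1.4% with the second, which is the mechanism behind the sensitivity explored in the following sections.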
At this step, the Armington trade specification is populated with one of three different data sets for interregional trade data (see the third section). We now have three differently calibrated model versions ready to be evaluated with relevant regional policy simulations.

CALIBRATED PARAMETER SENSITIVITY ANALYSIS (CPSA)
Any calibrated CGE model will replicate the benchmark equilibrium, i.e., the SAM that was used to calibrate it. Our three different, but internally consistent, interregional trade data sets all replicate the same benchmark equilibrium from the SAM; only the interregional trade flows differ. To perform a sensitivity analysis, a policy shock must be simulated with the differently calibrated CGE model versions. Next, the sensitivity of the results (i.e., the variations in the newly calculated equilibria) with respect to the method used for interregional trade data estimation can be explored. Both a regional value added tax (VAT) reform and a regionally differentiated payroll (RDP) tax reform are relevant policy simulations for the regional authorities in Norway because both are used in regional policy today. In particular, since 1975, the Norwegian authorities have spent much effort on balancing regional development with RDP taxes. The idea of RDP taxes is that they lower labour costs for industry in the periphery, which gives incentives to recruit more workers. Creating and maintaining jobs should increase the attractiveness of living in rural areas and reduce urbanization. The initial levels of the RDP taxes for different regions are shown in Table A1 in the supplemental data online. VAT is also regionally differentiated in Norway. The island of Svalbard (Spitsbergen, in the Arctic region), which is the main part of region R9, does not have any VAT at all.
The RDP tax in Norway is regularly evaluated to determine whether it fulfils policy ambitions. In these processes, alternative instruments are suggested, for example, subsidies on transport cost. The RDP simulation is one of our two policy simulations. An alternative is more regional VAT differentiation. We therefore analyze a reform with a different VAT in region R1 (the region with the largest share of total output and population; Table 1) and describe this policy simulation in the next section. This choice reflects the region that authorities might target with a (relatively) higher VAT to counter urbanization.
A second ambition of the analysis is to investigate whether the magnitude of the two policy reforms affects the sensitivity of the results. Therefore, we define a parameter d to vary the direction and relative magnitude of the policy shock.
The CPSA is implemented in the following steps:

(1) Perform the two regional policy reforms as addressed above:
• VAT reform: vary net product taxes for one product in one region. This changes prices of the interregionally traded products directly (Figure 2).⁷
• Payroll tax reform: change regional payroll tax levels. This affects the price of interregionally traded products indirectly through a change in the price of labour (Figure 2).⁸

(2) Choose values ($d$) to scale the benchmark tax levels in Tables A1 and A2 in the supplemental data online for simulating two different regional policy tax reforms. This allows one to analyze how the magnitude of the shock affects sensitivity of the CGE model results. The scaling levels have been chosen based on historical tax level variation. For example, the highest recent change in VAT in Norway was a reduction of farming product taxes from 24% to 12% in 2001.⁹ We analyze tax variations in both directions: from a 50% reduction to a 50% increase in initial tax levels, with a step interval of 5%. This approach implies that we run all three model versions for 20 different counterfactual scenarios for both the VAT reform and the payroll tax reform: 120 model runs in total.
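The scenario count in step (2) can be verified with a short enumeration. The labels for reforms and methods and the helper `scaled_rate` are hypothetical names used only for this sketch; the grid itself follows the description above (shocks from -50% to +50% in 5% steps, with the zero shock excluded because it is the benchmark).

```python
from itertools import product

# Shock magnitudes d: -50% to +50% in 5% steps, excluding the benchmark (d = 0).
shock_levels = [d / 100 for d in range(-50, 55, 5) if d != 0]

reforms = ["VAT", "payroll"]      # the two policy reforms
methods = ["BM", "GM", "SEM"]     # trade data sets used for calibration

# One model run per (reform, method, shock) combination.
runs = list(product(reforms, methods, shock_levels))


def scaled_rate(benchmark_rate, d):
    """Scale a benchmark tax rate by the relative shock d (e.g. -0.5 halves it)."""
    return benchmark_rate * (1.0 + d)


n_scenarios = len(shock_levels)   # counterfactual scenarios per reform and model version
n_runs = len(runs)                # total number of model runs
halved_vat = scaled_rate(0.24, -0.5)  # the largest negative shock applied to a 24% rate
```

Here `n_scenarios` is 20 and `n_runs` is 120, matching the counts in the text, and the largest negative shock applied to a 24% rate reproduces the 12% level from the historical example.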
We refer to a set of simulation outputs (e.g., wage levels, output prices and gross domestic product (GDP)) for different types of policy shock, shock magnitudes and different interregional trade data sets, following the general notation:

$$Y_{s,d,t}, \quad d \in D, \quad t \in \{\mathrm{BM}, \mathrm{GM}, \mathrm{SEM}\} \qquad (21)$$

where $s$ indexes the set of policy reforms; $D$ is the set of shock magnitudes for said reforms; and $t$ identifies the interregional trade set used for calibration. In the following, we present one separate analysis for each policy reform; therefore, we remove the index $s$ when the reform type is clear from the context. As mentioned above, in this sensitivity analysis we focus on sensitivity in manufacturing output. Hence, equation (21) only covers manufacturing output in the present analysis. The benchmark equilibrium values of the manufacturing output available from the initial SAM data set are referred to as $Y_0$. $y_{d,t}$ denotes the percentage change in $Y_{d,t}$ from the initial value (equation (22a)); $Y_t$ is the mean value over all tax policy shock magnitudes $d$; $\Delta y_{d,t}$ is the difference between results obtained using the BM or the GM and the SEM (equation (22b)). $S_t$ is a summary measure of the sensitivity in the results that considers all levels of $d$: the mean absolute difference of the BM or the GM compared with the SEM, relative to the output with the SEM (equation (22c)).

The following discussion of sensitivity focuses on the output of the Manufacturing sector in regions R1-R9. Table 4 lists the two regional policy simulations with respect to region, type of reform and sector/product, which are performed in the sensitivity analysis.
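The sensitivity measures described above can be sketched in a few lines. The exact normalization used in equation (22c) is not recoverable from the text, so this sketch assumes that the mean absolute difference is divided by the mean absolute percentage change under the SEM; that choice, the function names and the numbers are assumptions for illustration only.

```python
def pct_change(level, benchmark):
    """Percentage change of an output level from its benchmark value (cf. eq. (22a))."""
    return 100.0 * (level - benchmark) / benchmark


def sensitivity(levels, benchmark, reference="SEM"):
    """Summary sensitivity per method, in percent (one assumed reading of eq. (22c)).

    levels[t] holds the output levels Y_{d,t}, one entry per shock magnitude d,
    in the same order for every method t.
    """
    y = {t: [pct_change(v, benchmark) for v in vals] for t, vals in levels.items()}
    ref = y[reference]
    # Assumed normalizer: mean absolute percentage change under the reference method.
    mean_abs_ref = sum(abs(v) for v in ref) / len(ref)
    result = {}
    for t, vals in y.items():
        if t == reference:
            continue
        # Difference to the reference method for each shock (cf. eq. (22b)).
        diffs = [a - b for a, b in zip(vals, ref)]
        mean_abs_diff = sum(abs(x) for x in diffs) / len(diffs)
        result[t] = 100.0 * mean_abs_diff / mean_abs_ref
    return result


# Hypothetical output levels for two shock magnitudes (not results from the paper).
Y0 = 100.0
levels = {
    "SEM": [98.0, 102.0],
    "BM": [97.0, 103.0],
    "GM": [97.5, 102.5],
}
S = sensitivity(levels, Y0)
```

With these made-up numbers the sketch gives a sensitivity of 50% for the BM and 25% for the GM; a value above 100% would mean that the deviation from the SEM result exceeds the size of the SEM effect itself.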

Effects of policy reforms on manufacturing output
We start with a presentation of the results from the tax reforms on manufacturing output. Typically, the two policy shocks in Table 4 will give different market effects in a CGE model. For example, the VAT reform, which is a variation of net product taxes on the manufacturing product in region R1, will affect the output price of the manufacturing product in R1. Since the manufacturing product is traded to other regions, both as an intermediate and as a final product, other regions are also affected by this policy shock. Owing to the tax change, the relative price of the manufacturing product produced in region R1 changes and buyers will substitute between the manufacturing product from R1 and the manufacturing product from other regions. For the commodity market to clear, the activity of the manufacturing sector must be adjusted in R1. Furthermore, since the CGE model assumes inelastic supply of capital and labour, capital and labour prices also have to adjust in order to achieve market clearance in the factor market.

Table 4. Regional policy simulations: (1) VAT reform, with results in Figure 3 and Figure A1 in the supplemental data online; (2) payroll tax reform in regions R1-R7, all industries, with results in Table 5.

The final effect of the VAT reform on the regional manufacturing output in different regions when all markets in the CGE model have cleared is shown in Figure 3. It shows the final results from the reform with the three differently calibrated versions of the CGE model. Higher/lower VAT in region R1 decreases/increases the manufacturing outputs in all regions. Naturally, we see large effects in region R1 (the shocked region), but also significant effects in other regions due to regional interdependencies.
The effect on outputs of the payroll tax reform is shown in Figure 4. Owing to the changes in payroll taxes, higher/lower prices on labour for the industries will decrease/increase the manufacturing output. We see that this reform affects the manufacturing output less than the VAT reform.

Sensitivity in manufacturing output
We analyze the sensitivity of the results from the previous section by: (1) visually comparing the results of $Y_{d,t}$ (defined in equation (22a)) in Figures 3 and 4; and (2) examining the sensitivity in Table 5 relative to the size of the effect in the SEM, which was defined as $S_t$ in equation (22c).

[Figures 3 and 4: manufacturing output for the simulations in Table 4 (the y-axis shows the percentage change from the initial benchmark value).]

In summary, we observe the following effects:
• There is a general pattern towards higher sensitivity in the VAT reform than in the RDP reform.
• The manufacturing output is more sensitive with the BM than the GM.
• A stronger policy shock leads to higher relative sensitivity in the manufacturing sector output. (The effect does not scale linearly.) We see some similar patterns in the sensitivity for the payroll taxes.
• The manufacturing output for some regions is more sensitive to the interregional trade data set than others.
Causes of regional variation in the sensitivity
A natural starting point for the analysis is to assess the sensitivity of the CGE model outputs caused by differences in the interregional trade data for the VAT reform. For this reform we investigate the sensitivity in light of the calibrated data in Table A3 in the supplemental data online. When comparing sensitivities, we explain why some regions are more sensitive to variations in the interregional trade data set than others. We analyze this sensitivity by visually comparing the vertical distances in Figure 3 from the SEM to the GM and the BM for all values of $d$ for the VAT reform. Under this reform, the price that regions pay for importing the manufacturing product is affected. This directly affects the import level and how much a region imports from region R1. Regions R6-R9 show more sensitivity in the manufacturing output: R6-R8 especially for the BM. We discuss how this sensitivity may be caused by differences in the calibrated interregional trade data by inspecting export levels of the manufacturing product from R1 to R6 and R3, where R6 is an example of a region with high sensitivity and R3 is an example of one with low sensitivity.
The interregional trade estimates in Table A3 in the supplemental data online show that region R6's imports of the manufacturing product from R1 in the BM (10,266) are over three times those in the SEM (3223). The import estimate based on the GM, 4594, is much closer to the SEM result. Hence, it follows that when the model is calibrated using BM trade estimates, the impact of a VAT reform in region R1 on region R6 will be larger (so we see large variation between the lines in Figure 3 for region R6). The value of S_t in Table 5 for R6 for the VAT reform is 68.6% for the BM compared with only 30.2% for the GM.
On the other hand, region R3 is an example of a region with low sensitivity in the results. The lines for BM and the SEM in Figure 3 are virtually on top of each other. The same is true of GM for negative shock values, while the line for the GM lies slightly but noticeably higher only for the largest positive shock values. When we inspect the export levels in Table A3 in the supplemental data online from R1 to R3, we see that they are almost equal between the BM (49,673) and the SEM (49,106), whereas for the GM they are about half (24,763) of the value; this explains the sensitivity in the results.
These examples show that the sensitivity is traceable back to the differences in interregional trade data sets in the case of a one-sector one-region VAT reform. For the payroll tax reform, we cannot easily trace the regional sensitivity variations back to differences in the calibrated interregional data sets. Because all regions are shocked simultaneously, the effects interact (even) more, making it more difficult to identify causal relationships and to explain specific differences in manufacturing output variations.
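The comparison of trade estimates for R6 and R3 can be reproduced with a few lines of arithmetic. The following is a minimal sketch that uses only the four Table A3 figures quoted above; the dictionary layout and the helper name `ratio_to_sem` are illustrative assumptions and are not part of REMES.

```python
# Manufacturing flows from R1 as quoted in the text (Table A3, supplemental data).
# Units are those of Table A3; only the ratios to the SEM benchmark are used here.
flows_from_r1 = {
    "R6": {"SEM": 3223, "GM": 4594, "BM": 10266},
    "R3": {"SEM": 49106, "GM": 24763, "BM": 49673},
}


def ratio_to_sem(flows):
    """Divide each non-survey estimate by the SEM estimate (hypothetical helper)."""
    sem = flows["SEM"]
    return {method: value / sem for method, value in flows.items() if method != "SEM"}


# Ratio of each non-survey estimate to the SEM estimate, per region.
ratios = {region: ratio_to_sem(flows) for region, flows in flows_from_r1.items()}

for region, by_method in ratios.items():
    for method, ratio in by_method.items():
        print(f"{region} {method}/SEM = {ratio:.2f}")
```

A ratio far from 1 marks a region and method for which the calibrated trade share departs strongly from the survey-based benchmark. For R6 the BM ratio exceeds 3, whereas for R3 the BM ratio is close to 1 and the GM ratio is about one half.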

Role of estimation method (t), reform (s), shock direction and intensity (d) on sensitivity
What are the implications of estimation method, reform, direction and the magnitude of the shock on sensitivity?
- Estimation method (t): the GM gives results with less sensitivity than the BM. In particular, Table 5 shows less extreme variation in results with the GM than with the BM when comparing S_t line by line. However, there are regions where the BM output is closer to the SEM. For example, in region R1, the BM provides results closer to those of the SEM in both reforms.
- Reform (s): the fifth section showed that output was less affected by the payroll tax reform than by the VAT reform. For example, Y_SEM in region R1 was only 0.5% in the payroll tax reform but 2.4% in the VAT reform. This was expected since a VAT reform affects prices directly, whereas payroll taxes affect them indirectly. However, our sensitivity analysis is not concerned with the level of Y_SEM, but rather with the levels of S_BM and S_GM. When we compare the vertical distances in the results from the two reforms in Figures 3 and 4, the sensitivity seems lower for the reform with variation in payroll taxes. The values of S_t in Table 5 show that this does not hold for every region, but sensitivity is generally lower in the payroll tax reform results. For example, the highest value of S_t is 35% (R8) in the payroll tax reform, while it is 209% (R9) in the VAT reform.
- Shock intensity and direction (d): Figure 4 shows that, for large negative payroll tax shocks, the output produced with the model calibrated from the SEM is smaller in magnitude than that produced with the BM and the GM. We also see that for shocks with a tax increase of 20-40%, the manufacturing output changes from the SEM in regions R2 and R5 have a different sign than those based on the BM or the GM. While the VAT reform sensitivity is more or less symmetrical and similar in order of magnitude in both directions of d, the sensitivity for the payroll tax shock shows that increasing taxes results in larger sensitivity than reducing taxes.
Finally, Figure 3 shows relatively larger deviations for larger shock magnitudes for some regions in the results for all three methods (i.e., the effect of the shock is not linear with the magnitude of the shock). This is especially clear in regions R7-R8 (and R9, but this region is not comparable with the mainland regions), shown in Table A1 in the supplemental data online, where the differences in effects for the VAT reform are plotted.
The previous section indicated how sensitivity is partly traceable back to differences in the interregional trade data. This section further explored how sensitivity is affected by the estimation method, type of reform and the magnitude of the shock given the reform.

Measuring sensitivity in monetary terms
The above sensitivity analyses illustrate that the different estimation methods cause variation in the simulation results. How large are the consequences of such a variation in monetary terms when performing policy simulations? To illustrate this, we give an example from the payroll tax reform with a 50% reduction in the payroll taxes. In region R1, the region with the lowest sensitivity but the highest industrial output, this would induce a growth in manufacturing output, causing increases of 4956, 5681 and 4286 million Norwegian kroner (mln NOK; see note 10) when using models SEM, GM and BM respectively. The impact predicted by the GM is 14% higher and by the BM 13% lower than by the SEM, and the difference between the lowest and highest value is about 1.3 bln NOK (€135 mln). These differences illustrate the importance of trade parameter calibration and, hence, of the estimation method used. Based on our policy simulations, we conclude that the R-CGE model REMES (and likely other macroeconomic models with similar assumptions) shows high sensitivity to different interregional trade data sets. Modellers opting for a specific method to calibrate an Armington trade specification must expect that this will affect simulation results significantly.
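The monetary comparison for region R1 can be retraced as follows. This is a minimal sketch computed only from the three output increases quoted above; the variable and function names are illustrative assumptions, and because the text reports rounded values, the computed numbers may not match them exactly.

```python
# Increases in R1 manufacturing output (mln NOK) for a 50% payroll tax reduction,
# as quoted in the text for the three calibrations.
impact_mln_nok = {"SEM": 4956, "GM": 5681, "BM": 4286}


def pct_deviation_from_sem(impacts):
    """Percentage deviation of each non-survey result from the SEM benchmark."""
    sem = impacts["SEM"]
    return {
        method: 100 * (value - sem) / sem
        for method, value in impacts.items()
        if method != "SEM"
    }


deviations = pct_deviation_from_sem(impact_mln_nok)

# Gap between the largest and smallest predicted impact, in mln NOK.
spread_mln_nok = max(impact_mln_nok.values()) - min(impact_mln_nok.values())

for method, dev in deviations.items():
    print(f"{method}: {dev:+.1f}% relative to SEM")
print(f"Spread between highest and lowest impact: {spread_mln_nok} mln NOK")
```

Expressing each result as a deviation from the SEM mirrors the paper's use of the SEM as the benchmark for comparison; the spread shows what the choice of estimation method amounts to in money terms for a single region and a single shock.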

Generalization of the findings to stakeholders
The relatively slow start of regional CGE models was due to their being data intensive compared with their counterparts, such as input-output models or other econometric methods used for evaluating regional policy (Partridge & Rickman, 1998). One data set generally not reported by national statistical bureaus is interregional trade data.
This paper suggests that the results from a regional CGE model will generally report the same sign on changes in outputs from policy reforms, regardless of how this data set is estimated. However, the magnitude of the changes in outputs may differ substantially. Thus, the results should motivate CGE modellers to be diligent when calibrating a trade function with missing input data such as interregional trade. Other stakeholders, in particular regional authorities, should improve the coverage of interregional trade data. For example, analysis of large-scale payment data might prove worthwhile, but may face privacy concerns.
The methodology used in this paper is easily replicable to other regional CGE models; however, the generalization of findings in sensitivity is to some degree limited to the experiment conducted. For example, this analysis was limited to sensitivity in the output from the manufacturing sector in REMES. We suspect that some patterns and observations are linked to the REMES model structure. In particular, the following issues require more attention: elasticities and nesting structure; aggregation of regions, products and industries; macro-closure of the R-CGE model; and the type of policy reform.

CONCLUSIONS
Numerical general equilibrium models are often criticized for using a calibration of model parameters based on point observations. We are interested in how the calibration of an Armington trade specification affects policy simulation outcomes in a regional CGE model. We estimate the interregional trade data set for seven aggregated products between nine regions in Norway with three different estimation methods. One estimation method uses partial survey data (SEM), which is used as the benchmark for comparison; the other two (non-survey) estimation methods are a distance/gravity-based method (GM) and a basic method (BM) that uses no geographical information. We have investigated the sensitivity of manufacturing sector output results with respect to different estimation methods for regional trade data sets using a CPSA. Concerning our main research question, to what extent inferences from R-CGE models are affected by different interregional trade data sets, we conclude the following.
We use two different tax policy shocks to perform a sensitivity analysis on regional manufacturing output using an R-CGE model calibrated with the different interregional trade data sets. When we take all shocks and the different data sets into consideration, we observe many intuitive effects in manufacturing output levels. However, applying these shocks to the R-CGE model (changes in factor and output taxes of different magnitude in one or all regions) results in a rather high sensitivity of the regional manufacturing output, and we observe that the magnitude of the shock matters more for the sensitivity than the type of the shock. Sensitivity is lower in the multi-regional labour factor tax reform than in a one-region, one-product tax reform of similar magnitude. In the latter reform, regions whose interregional imports of the taxed product vary widely across the three estimation methods experience large sensitivity in output.
First, how much does the calibration of an Armington trade specification matter when calibrating and using an R-CGE model? The analysis shows some rather large variations in policy impact on sector output. In monetary terms, the effects vary by hundreds of millions of NOK. If a policy analysis is concerned with broader policy instruments (e.g., covering multiple products, industries and regions), the specific estimation method used is likely to have less impact on the final R-CGE model results.
Second, does our analysis provide generalizable recommendations with respect to the preferred estimation method of interregional trade in R-CGE models? As discussed, transport survey data typically suffers from limitations such as missing data for services, conversion issues from physical units to monetary values, overestimation of trade from regions with transport platforms, and problems with double-counting of wholesalers (Sargento et al., 2012). However, even when flawed, the SEM, by using available data to the extent possible, is best able to reproduce the real interregional trade pattern. It implicitly captures hurdles (cultural barriers, mountain ranges, etc.) and reflects encouraging factors (traditions, trade hubs). If survey data is lacking, our findings support using the GM over the BM. We have not analyzed to what extent different estimation methods can be combined for different products in order to improve the overall quality of the regional trade data set. Possibly, the interregional trade data of services (with no prior information about this product in the SEM) could be better estimated with the BM than the GM, because trade of services is less distance dependent.

NOTES
1 REMES is programmed in the Mathematical Programming System for General Equilibrium analysis (MPSGE; GAMS, 2015), a modelling package in GAMS. See Werner et al. (2015) for a full model documentation.
2 See Table A1 in the supplemental data online for these benchmark tax levels.
3 TrMarg_m and NetPrtax_m are the ad valorem trade and transport margin and the net product tax on the different products m respectively, which the buyer of final and intermediate products has to pay. These values are determined from the national SAM.
4 Alternatively, we could have weighted the Dist_ij with employment data, as these data also are a sufficiently good data source at the municipality level in Norway. However, we have used population data.
Here, m denotes Norwegian municipalities and i denotes the regions defined in the model; M is the set of municipalities; and Dist m m is the input distance matrix. We use population data for the year 2013 (Pop_m) to weight each distance within the region. Finally, a distance data set between the regions is calculated in (6b). For region R9, we do not have any distance data available. Therefore, we use a fixed-distance parameter from and to this region. See Table A4 in the supplemental data online for the calculated distances: Dist m m · Share m, m