Concrete agglomeration benefits: do roads improve urban connections or just attract more people?

Michiel Gerritse and Daniel Arribas-Bel

ABSTRACT Cities with more roads are more productive. However, it can be unclear whether roads increase productivity directly, through improved intra-urban connections, or indirectly, by attracting more people. Our theory suggests that population responses may obscure the direct connectivity effects of roads. Indeed, conditional on population size, highway density does not affect productivity in a sample of US metropolitan areas. However, when exploiting exogenous variation in urban populations, we find that highway density improves agglomeration benefits: moving from the 50th to the 75th percentile of highway density increases the productivity-to-population elasticity from 2% to 4%. Moreover, travel-based measures outperform population size as a measure of agglomeration externalities.


INTRODUCTION
Workers earn more in cities with more highways (Figure 1). When a city builds a highway, its citizens become better connected; and better connections help to exploit agglomeration externalities. However, highways also attract new citizens who, in turn, also boost agglomeration effects. Can the higher wage be explained by better connections, or by larger population size? Both can make workers more productive. This paper shows that the role of infrastructure that connects citizens may be larger than a regular comparison of city productivity and infrastructure suggests.
Cities offer benefits of agglomeration. Urban environments offer their citizens easy access to other peers and to potential jobs, and they offer firms more potential sellers and buyers, and more peers to learn from. The benefits of connecting and interacting thanks to the proximity cities afford have long been recognized in economics (Marshall, 1890), but also play a central role in urban planning (Jacobs, 1961) and, more recently, in social physics (Bettencourt, 2013).
However, not all citizens are equally well connected within a city. Even within cities, some locations are easier to reach than others due to better road access, congestion or distance. In theory, easier travel and interaction within a city should extend the benefits of living in a large agglomeration (Behrens, Mion, Murata, & Südekum, 2017; Lucas & Rossi-Hansberg, 2002). As noted below, empirical evidence suggests that driving, commuting, job search and information flows deteriorate with distance and travel effort, even within the same city. Thus, the benefits of sharing knowledge, indivisibilities and thick markets spread around more easily when the urban infrastructure is effective.
As a consequence, one would expect the structure and spatial organization of a city to affect its productivity. The metropolitan areas of Houston (TX) and of Washington (DC) are similar in population size, for instance, but the share of commuters using public transport is more than six times higher in Washington. Does this affect the way in which knowledge spreads? Similarly, New York (NY) has few employment centres while Los Angeles (CA) has many (Arribas-Bel & Sanz-Gracia, 2014). And the population of San Francisco's (CA) metropolitan area is smaller than that of Atlanta (GA), but its road density is almost twice as high. Can San Francisco's road density compensate for its smaller population?
It is not easy to test whether cities with more roads are more productive, because cities with more roads typically attract more population, and population makes cities productive, too. The productive benefits of roads may be obscured if roads lead the city population to grow. Indeed, Duranton and Turner (2012) document that in the United States employment grows faster in cities with more interstate highway-kilometres. Infrastructure relocates people: the construction of highways suburbanized cities in the United States (Baum-Snow, 2007) and Spain (Garcia-López, Holl, & Viladecans-Marsal, 2015), among others. In China, railroads and radial and ring roads have decentralized the population, but also production (Baum-Snow, Brandt, Henderson, Turner, & Zhang, 2017). US cities also have larger populations if the structure of their transport network is efficient (e.g., connectivity, circuitry, 'treeness'; Levinson, 2012). Similarly, public transit increases the density of employment in the city centre, allowing residents to move outward (Chatman & Noland, 2014). As the population moves in response to infrastructure, it is hard to disentangle from standard correlations whether population scale or the quality of internal urban connections causes urban productivity to grow.
We formalize these ideas to understand the respective role of population size and transport infrastructure in urban productivity. Our model is related to that of Duranton and Turner (2015), as it allows citizens to choose their exposure to urban benefits, depending on the ease of travel (in addition to relocating between cities), and delivers two key insights. First, the extent of agglomeration effects is best measured by citizens' engagement in urban interaction, captured in their travel effort. The city's population size is a less precise proxy for agglomeration economies. Second, the model suggests that if population moves in accordance with the spatial economic equilibrium, the productive effects of infrastructure such as roads are understated in a regular regression. The reason is that population is larger near better infrastructure, so it seemingly accounts for the productivity.
The empirical results we present suggest that the role of highways in generating productivity is larger than can be concluded from naive regressions. Baseline ordinary least squares (OLS) estimates provide no evidence that roads have an impact on urban productivity. In line with our theory, we exploit exogenous variation in population that plausibly has not responded to highway infrastructure. Eliminating the population response to highway density differences, we find that increased highway density significantly increases the productivity-to-city size elasticity. Correspondingly, internal travel efforts rather than the size of the population explain urban productivity.
Our findings have two implications. First, they help to understand the benefits of improving urban design and infrastructure. The evaluation of infrastructure investment often assumes that increased connectedness, or 'effective density', improves agglomeration benefits (Graham, 2007a). That is an indirect but important argument in cost-benefit analysis for infrastructural investments, sometimes called 'wider economic benefits'. Our results show that statistical associations between agglomeration benefits and infrastructure likely show a downward biased image of infrastructure benefits at the urban level. Second, the results are consistent with the idea that population moves in accordance with the urban circumstances. In other words, the results suggest the existence of an urban spatial equilibrium. They also engage with a literature that studies how city scale determines urban outcomes such as productivity, pollution or crime (Bettencourt, 2013). Our paper, by contrast, shows that city size responds to productivity and transport, reversing the logic of scaling.
The paper is organized as follows. The next section briefly motivates the potential role of urban structure in agglomeration effects. The theoretical section presents a model of an urban production externality, paired with decisions to travel inside the city and migrate in and out of the city. The model provides predictions to test, in particular on how any effects of roads can be examined. The paper then examines these predictions for a sample of US metropolitan areas.

SPATIAL STRUCTURE AND AGGLOMERATION ECONOMIES IN CITIES
Workers in larger cities are more productive. Much of the evidence for that size-productivity relationship shows a positive elasticity between wages and the population size of a city. The elasticity is often estimated to be up to 5% (Melo, Graham, & Noland, 2009). 1 Such a 'scale elasticity' persists even when eliminating alternative explanations, such as sorting of more talented workers into larger cities (Behrens, Duranton, & Robert-Nicoud, 2014; Combes, Duranton, & Gobillon, 2008). The elasticity of productivity with respect to a city's population size suggests that only the size of the city matters, although recent estimates suggest that density matters, too (Puga, 2010). The urban economics literature at large, however, suggests that the structure and internal organization of cities also matter: various urban externalities act 'with different strengths, among different agents, at different distances' (Anas, Arnott, & Small, 1998, p. 1459). Here we argue that, considering their microfoundations, several forms of agglomeration economies must depend on the ease of internal urban interactions.
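To make the magnitude of such a scale elasticity concrete, the following minimal sketch converts a constant elasticity into the wage change implied by a given change in city size. It assumes a log-linear relation between wages and population; the function name `wage_gain` and the doubling scenario are illustrative and are not taken from the cited studies.

```python
def wage_gain(elasticity: float, size_ratio: float) -> float:
    """Proportional wage change implied by a constant elasticity.

    Assumes ln(wage) = const + elasticity * ln(population), so multiplying
    population by size_ratio multiplies wages by size_ratio ** elasticity.
    """
    return size_ratio ** elasticity - 1.0


# Doubling city size under the upper-end elasticity of 0.05 mentioned in the text.
gain = wage_gain(0.05, 2.0)
print(f"{gain:.1%}")  # prints 3.5%
```

Under these assumptions, even the upper-end elasticity translates a doubling of population into a wage premium of only a few per cent, which is why differences in the elasticity across cities are economically meaningful.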
Cities allow the interaction required for workers to learn, which is one of the more prominent agglomeration benefits (Duranton & Puga, 2004). Glaeser and Gottlieb (2009, p. 983) stress 'the role that density can play in speeding the flow of ideas'. In cities, returns to education and experience are higher (De la Roca & Puga, 2017; Heuermann, Halfdanarson, & Suedekum, 2010). There is also more job churning, allowing knowledge to be carried from one firm to another. The transfer of knowledge, especially embodied knowledge, is limited by workers' travel. As mentioned by Puga (2004, p. 2098), '[learning] involves interactions with others and many of these interactions have a "face-to-face" nature'. For firms, the peers that use related knowledge and are likely to spawn usable ideas and innovations are more usually found in larger cities. Co-location is important in this case, too; as Glaeser, Kallal, Scheinkman, and Shleifer (1992, p. 1127) famously put it, 'intellectual breakthroughs must cross hallways and streets more easily than oceans and continents'.
Consistently, recent evidence suggests that the spatial extent of agglomeration economies is limited. The benefits of co-location may decline rapidly within kilometres or fewer (e.g., Arauzo-Carod & Viladecans-Marsal, 2009; Arzaghi & Henderson, 2008; Rosenthal & Strange, 2003). Andersson, Klaesson, and Larsson (2016) show that population density is not relevant beyond neighbourhood scale. These studies suggest that the benefits of agglomeration are severely impeded by distance or travel, even if they are agnostic about the exact mechanics of the benefit. The productive effects of employment masses near a worker's job location fade within kilometres, as do firms' productive effects of co-location with peers.
Good connections inside the city plausibly foster learning. Easier travel allows for larger and more extended social networks. Social interactions increase with the size and density of the social network (Helsley & Zenou, 2014). 2 Increased social interaction improves the scope for learning. Patents, a more formal measure of learning, also occur at higher rates in cities (Jaffe, Trajtenberg, & Henderson, 1993). By all measures, formal knowledge does not travel far either. Kerr and Kominers (2015) show that even within Silicon Valley, patenting relations cover a limited distance: although most locations in the Bay Area patent a lot, individual links in patents are unlikely to span the width of the Bay Area.
Easy travel throughout the city may also improve matching on labour markets. Larger cities see both the quality and the chances of worker-job matches increase. There are more workers in a permittable range of commuting costs, and markets are thicker. That leads workers and firms to accept matches of high quality (for an extensive overview, see Zenou, 2009). The willingness to commute decreases when travel costs to work are higher (e.g., Persyn & Torfs, 2015;Van Ommeren & Fosgerau, 2009), but cities offer more potential jobs within a given commuting time (Angel & Blei, 2016). One would expect that between equally large cities, the one with the most efficient infrastructure allows workers to reach more potential jobs. Effectively, the labour market is thicker if more jobs can be reached in the same commuting time. Similar arguments might be made for goods transport, which fosters trade within cities (Holmes, 1999). However, models of agglomeration based on trade, like the New Economic Geography, tend to focus on trade between cities (e.g., Parr, Hewings, Sohn, & Nazara, 2002).
Urban spatial structure influences the interactions between its inhabitants, too. Urban planners have long contended that polycentricity affects commuting patterns (e.g., Giuliano & Small, 1993). Duranton and Turner (2015) show that increases in density show little impact on driving, suggesting that when given the chance, inhabitants exploit the larger scale that infrastructure offers rather than minimize their travel time. A popular conjecture is that cities are more productive if they have conducive land-use patterns, especially patterns that allow high density (Henderson, Venables, Regan, & Samsonov, 2016).
Roads have substantial impact on the organization of cities, making it plausible that they affect urban productivity. There is evidence US cities with more highways see higher employment growth (Duranton & Turner, 2012) and attract more firms (Chandra & Thompson, 2000). At the same time, highway expansions have allowed jobs and residents to decentralize at lower costs, leading to suburbanization (Baum-Snow, 2010). Their effect on travel costs influences urban economic outcomes: while larger road capacity increases employment, congestion of those roads reduces it (Hymel, 2009). Increasing the length of the network also increases its use. Duranton and Turner (2011) show that a 1% increase in the number of highway-kilometres within a city leads to a 1% increase in driving. These results suggest that easier travel is partially offset by the increased number, length of trips or new residents who use the system. There are also several residential and trip choices residents make that depend on the quality and density of the road network of the city. For instance, travel speed is lower in centralized cities, and those without ring roads (Couture, Duranton, & Turner, 2016). As road network characteristics vary, the frictions of interaction change and presumably, the extent of agglomeration externalities is affected.
The evidence on productivity gains from other infrastructure is more circumstantial. Fallah, Partridge, and Olfert (2011) develop a measure of sprawl at the metropolitan level. Using OLS as well as instrumental variables (IV) estimation, they conclude that there is a negative link between the particular urban structure of sprawl and labour productivity in the United States. Garcia-López and Muñiz (2013) use the Barcelona Metropolitan Region in Spain over the period 1986-2001 to study the effects of the appearance and evolution of urban sub-centres on specialization and economic growth, suggesting that the organization of the city in multiple centres affects its growth. Fernald (1999) shows an alternative argument: state-level road investments increase productivity most in vehicle-intensive industries, suggesting that roads have a causal productivity effect. Zheng (2007) shows that transport connections to other cities increase productivity. While this relates to our paper in the focus on productive effects of transport, Zheng considers 'borrowed agglomeration' from other cities while we focus on internal agglomeration effects.
Our goal in this paper, however, is to evaluate whether infrastructure helps a city to offer agglomeration benefits. Accordingly, our conjecture is that access inside cities matters for the benefit of living and working in that city. An efficient commuting network increases the available job opportunities within acceptable commuting costs. For a given city size, easy commuting should therefore increase the quality of job market matches and the flows of employee-embodied knowledge. Similarly, firms that find more firms of a similar nature within a given range of transport costs may copy more knowledge, find more suitable upstream and downstream partners, and share larger infrastructural benefits.

A STRUCTURAL MOTIVATION
The above literature provides plentiful clues that urban organization affects agglomeration benefits. To analyze them, we formalize the interaction inside cities to describe urban externalities. This section presents a model that clarifies the relationship between the structure of a city and its population size, thus helping to guide an empirical exploration of the links with urban productivity. We build on the work of Duranton and Turner (2015), who are interested in identifying the effects of urban form on driving. Our starting point is the idea that citizens choose their interactions inside the city, and therefore the exposure to benefits from agglomeration.
Thus, a distinguishing feature of our approach is that citizens trade off the costs of interacting across space with its productive benefits. Better infrastructure reduces travel costs and leads citizens to interact more. This way, the relevant dimension of urban externalities is not the size of the urban population as much as the amount of interaction within that urban population. The aggregate interaction is a composite of the population size of the city, and the spatial frictions between them.
Our structural strategy delivers two key messages. First, it is well possible that the equilibrium number of people living in a city adapts to the quality of the city's infrastructure, thus making it difficult to disentangle the productivity effects of infrastructure and population. Second, a worker may benefit from only a subset of the other workers, even if those benefits are proportional to the total population size of the city. The model suggests that the extent of agglomeration economies might be better measured by how much a worker travels on average than by the total population size.

Production externalities and travel choices
Workers have a job inside the city and can travel to the location of other workers. Our assumption is that if a worker spends more time at other workers' locations (and possibly in many different locations), he/she becomes more productive through an externality. Workers produce and consume a freely traded numeraire good. Their utility depends on wages $w$, (money metric) travel costs $T$ and the local land rent $r$ through an indirect utility function, in which the parameter $\tau$ determines the elasticity of substitution between wages and travel costs. The effective distance to another worker's location $i$ is $\theta_i$, and the total time spent on travelling to that location is proportional to the number of trips (e.g., the number of days a worker makes that trip), so the worker spends $\theta_i T_i$ in time travelling to location $i$. The total travel costs are the aggregate of all individual trips:

$$T = \sum_i \theta_i T_i,$$

where the worker chooses how much time to spend in every location, $T_i$, and $\theta_i$ is a measure of travel friction inside the city. A worker's productivity is the product of a nominal productivity $a$ and the increases in productivity that follow from spending time at other locations. This is our formalization of the externality: workers who spend more time at different locations inside the city become more productive. In the wage rate, the parameter $\varepsilon$ determines the elasticity of substitution of different workers' locations in productivity spilling over.
In our formalization of the agglomeration externality, we only intend to reflect that a person spending time at different locations will become more productive. It is not our intention to model one of the micro-mechanisms put forward in the above literature. However, there are many channels consistent with the idea that access to many locations makes a worker more productive. For instance, T i could reflect time spent socially or collaborating with people in different locations; it could represent the effort of looking for a job; or performing different jobs.
The ratio of first-order conditions for travelling to locations $i$ and $j$ implies an optimality condition. Using the optimality condition in the total travel time definition suggests that travel time to a particular location depends on the bilateral travel time, relative to the travel time to all other locations. Inserting the travel time for every individual location into the expression for the wage and simplifying yields a wage that is a function of travel times to all other locations. If other locations are easily accessible, the worker spends more time absorbing the production externality, and he/she becomes more productive. The equilibrium wage rate allows for an elasticity of substitution between different locations. If $\varepsilon$ is high, workers become most productive by spreading their time over different locations. Lower values of $\varepsilon$ allow workers to learn from visiting only a few locations (only the closest, for instance). 3

The above wage rate covers many different structures of access inside the city. However, to distil our main argument (and to keep the results tractable) we follow Duranton and Turner (2015) and assume a symmetrical city in which workers travel to each other's locations at equal costs. 4 This allows one to express the wage rate (and average productivity) in terms of the city size $N$ (the number of workers travelled to) and the average travel time between two locations in the city, $\theta$. The resulting expression for the wage rate is already close to a standard expression of a Marshallian externality because it relates city size to productivity. However, there are two additional elements: higher distance frictions $\theta$ inside the city hamper the flow of knowledge, and workers endogenously choose how much time to expose themselves to the externality, $T$. Under symmetry of travel, we can also introduce an elementary congestion effect.
The worker considers a trip's travel time $\theta$ as given, but suppose it is in fact a function of the infrastructural capacity that determines free-flow travel time $\theta_f$, and of congestion from the number of users $N$ with exponential parameter $\varphi$. The travel time is then:

$$\theta = \theta_f N^{\varphi}.$$

Individuals do not take into account the effect of their travel choice on aggregate travel time between two locations.
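The congestion externality can be illustrated numerically. The sketch below assumes a multiplicative form in which trip time equals the free-flow time multiplied by the number of users raised to the congestion exponent, which is one reading of the description above; the function name `travel_time` and all numbers are hypothetical.

```python
def travel_time(theta_free: float, n_users: float, phi: float) -> float:
    """Trip time under the assumed form theta = theta_free * n_users ** phi."""
    return theta_free * n_users ** phi


# Hypothetical city: free-flow trip time 1.0 and congestion exponent 0.1.
before = travel_time(1.0, 1_000_000, 0.1)
after = travel_time(1.0, 2_000_000, 0.1)

# Factor by which every trip lengthens when the number of users doubles.
# Each individual treats this factor as given when choosing how much to travel.
slowdown = after / before
print(round(slowdown, 3))
```

With a constant exponent the slowdown depends only on the proportional change in users, not on the initial city size, which is what keeps the model log-linear.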
Workers optimize their travel time to exploit the agglomeration benefit. They become more productive by spending time in other locations, but dislike spending time travelling. Using the expression for wages in the indirect utility function to identify the returns to travel, the first-order condition for travel determines the equilibrium travel time. Considering workers' choices to expose themselves to the production externality, the resulting wage rate is a standard Marshallian expression, except that the city structure plays a role because it determines internal travel time. However, the original expression for the wage rate has another implication: optimal travel behaviour responds to internal travel frictions, too. As the initial expression for the production externality suggests, productivity depends on the effective time of interaction. The time of interaction, in turn, is determined by the ease of travel inside the city. Taking travel time as the behavioural result, the extent of the externality can be expressed as a function of travel times exclusively. This expression carries a key point of the model. It shows that the exploitation of the agglomeration benefit is captured by how much a city's inhabitants choose to travel. That choice is driven by the number of possible destinations (the size of the city) and by how costly it is to reach each destination (the quality of the infrastructure). Thus, conditional on the worker's travel behaviour, the population size is not necessarily relevant.

Spatial equilibrium
Workers may choose a city to live in, like they choose to travel inside the city. In this model, we assume workers can move freely across cities. In the spatial equilibrium, workers have no incentive to move: cities other than the one they live in offer no higher prospective utility. The indirect utility of living in a city, with the endogenous wages and travel times substituted, is simply a function of how many people live in the city, the average quality of the urban structure and the prevalent land rents. We assume that each city has competitive suppliers of land, with a given supply elasticity $\eta$. The demand for land is unit-elastic, and a fraction $\alpha$ of the budget is spent on land. With $N$ citizens demanding land, the rent that clears the land market follows from equating the city-level demand function to the inverse supply function for land; c is used as a positive parametric constant in that expression. With equilibrium on the land market, the spatial equilibrium condition can be defined. Using the equilibrium land rent and the expression for average travel time in the indirect utility function, and simplifying, gives the utility in a city in terms of two parametric constants, a and b. Note that a is negative if $\eta$, the housing supply elasticity, is low enough. If housing supply is sufficiently inelastic ($\eta < (1 + \varepsilon)\tau/(1 + \tau)$), utility is downward-sloping in the number of inhabitants in the city, a requirement for a stable internal spatial equilibrium. Parameter b is negative: higher average travel frictions always reduce utility. Similarly, a is negative if $\varphi$ is large enough: then the congestion effects of more population discourage immigration. The indirect utility function (equation 16) raises the second main point of the model. If migration is possible, differences in potential utility between the cities are eliminated.
In a log-linear world (as most regressions of agglomeration economies assume), population movement may perfectly compensate for infrastructural differences between cities. In the indirect utility function, the population size and the average infrastructural quality of the city are iso-elastic. The trade-off between population size and internal city frictions suggests that in spatial equilibrium, the relation between infrastructural quality and population size in a cross-section of cities has a constant elasticity: the elasticity of population size with respect to the distance friction in the city $\theta$ is constant, because a and b are constants. Given $a < 0$ and $b < 0$, the elasticity is negative: everything else equal, more spatial friction inside the city is associated with lower population. The magnitude of the effect depends on the value that individuals attach to travelling ($\tau$), the benefits of spending time in different locations ($\varepsilon$) and the housing elasticity ($\eta$). If the supply elasticity of housing is low, few new citizens enter the city if internal travel frictions fall (while overall travel may still rise). Note also that in our expression of the production externality, changes in infrastructure $\theta_f$ may lead to increases in productivity along with longer travel times, if strong congestion effects $\varphi$ are paired with low housing supply elasticity. Thus, it is possible that (latent) demand for travel compensates the time gains from improvements in road capacity (Duranton & Turner, 2011); or that road investments lead to growth of the number of residents (Baum-Snow, 2010; Duranton & Turner, 2012) without necessarily changing travel times. The constant elasticity between infrastructure quality and population size also affects the earlier prediction that population and infrastructure determine productivity (equation 10).
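The constant elasticity can be made explicit with a short derivation. The sketch below assumes, in line with the iso-elastic description above, that equilibrium utility can be written as $V = \kappa N^{a} \theta^{b}$ and that migration equalizes utility at a common level $\bar{V}$; the symbols $\kappa$ and $\bar{V}$ are illustrative and are not taken from the original equations.

```latex
\begin{align}
\ln \bar{V} &= \ln \kappa + a \ln N + b \ln \theta \\
\Rightarrow \quad \ln N &= \frac{\ln \bar{V} - \ln \kappa}{a} - \frac{b}{a} \ln \theta \\
\frac{\partial \ln N}{\partial \ln \theta} &= -\frac{b}{a} < 0
  \quad \text{if } a < 0 \text{ and } b < 0 .
\end{align}
```

Under these assumptions the cross-sectional relation between log population and log friction is exactly linear, which is the source of the collinearity concern in a log-linear regression.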
If the spatial equilibrium holds, the log of population may perfectly adapt to infrastructure, leaving no discernible role for infrastructure itself. We elaborate on that econometric problem more extensively in the empirical section.

Infrastructural effects
The set-up above has implications for the interpretation of externalities. When analyzing the benefits of connectivity inside the city due to a good road network, there are two main conclusions to be drawn. First, measures of infrastructural quality may suffer from severe simultaneity issues. These endogeneity issues are not of the classical kind, in which the independent variable of interest (infrastructural quality) may respond to productivity. Rather, the variable (log) population may perfectly adapt to infrastructure, if the spatial equilibrium condition holds. To clarify this, a standard Mincer productivity equation can be considered instead as a system of two equations in the above model, where the first is the logarithmic version of the externality and the second is the logarithmic version of the spatial equilibrium condition. If the spatial equilibrium condition holds, the two equations can be combined into a reduced form. In this context, the coefficient of a 'naive' regression of log wages on log population and infrastructural measures may thus suffer from collinearity between the population and infrastructural measures. The coefficient on population reflects a direct population size effect as well as the association between infrastructural quality and population. This problem does not need to occur; our model suggests it might occur, under specific parameterizations and if the log-linear approximations are accurate.
The second prediction of our stylized formulation of the externality is that travel time inside the city matters for the extent of the scale externality. Workers choose their travel based on how many citizens inhabit the city as well as how long it takes to reach them. When taking the model at face value, conditional on travel times, population size does not explain the externality. The expression for the externality in logs is derived by inserting the equilibrium travel time (equation 9) in the productivity term (equation 10). Importantly, the relation between travel time and productivity holds whether or not the spatial equilibrium condition holds.

Data
We examine the predictions regarding the measurement of infrastructure effects in a cross-section of United States' metropolitan areas in the year 2010. First, as workers in the United States are mobile compared with those in other countries, the spatial equilibrium outlined above might be relevant. Second, one of our contributions is in providing a novel methodology to study the effects of urban spatial structure, and the United States provides a good backdrop to evaluate our results because much of the related literature has focused on this country, in good part due to the relative availability of good-quality data.
To estimate city-level productivity, we exploit microdata from the American Community Survey (ACS; accessible through the Public Use Microdata Series, PUMS). The survey provides individual information on a 1% sample of the population, including wages, education, race, sex, age and information on commutes at the Public Use Microdata Area (PUMA) level, a bespoke unit of analysis created for the dataset. Our measures of urban structure rely on highways in each metropolitan area, part of the interstate highway system (Duranton & Turner, 2012). These data are widely accepted and, importantly, have had convincing IV strategies proposed. We also use physical attributes of cities, such as internal elevation measures, accessed from Nunn and Puga (2012). Additionally, we use the 1920s' population (Duranton & Turner, 2012) and banking data from the census (accessed through PUMS).

Empirical strategy
The most important outcome in our analysis is urban labour productivity. We identify productivity from individual wage data. In a competitive labour market, a worker's wage rate reflects his/her marginal productivity. This may be due in part to age, training and industry, but also to location. To isolate location-specific productivity estimates, we first estimate a Mincer regression at individual (worker) level:

$$\ln w_{ir} = c + \beta X_{ir} + \sum_r \alpha_r D_r + \varepsilon_{ir} \qquad (22)$$

where the logarithm of the wage $\ln w_{ir}$ for individual $i$ in metropolitan region $r$ is regressed on a constant term $c$, a set of personal characteristics $X_{ir}$, a set of metropolitan region fixed effects $\alpha_r$ and an i.i.d. error term $\varepsilon_{ir}$. $X_{ir}$ includes age and gender as well as education level, race and sector fixed effects. Equation (22) captures the contribution to productivity of observed personal characteristics. In addition, the metropolitan fixed effects $\alpha_r$ absorb level differences in the wages of individuals who live in the same metropolitan region. In other words, the estimates of $\alpha_r$ represent specific city premia on wages, 'cleaned' from worker-specific characteristics that we can observe. 5

Our theory suggests that population might adapt to the quality of urban infrastructure. Here, we use exogenous variation in the population to check for differential effects of population for cities varying in highway density. To allow the scale elasticity of productivity to population to vary with the quality of infrastructure, we add an interaction term to a standard regression for scale elasticity:

$$\alpha_r = \gamma_1 \log \mathrm{pop}_r + \gamma_2 \log \mathrm{HD}_r + \gamma_3 \log \mathrm{HD}_r \times \log \mathrm{pop}_r + u_r,$$

where $\alpha_r$ is the city-level productivity (wage premium) shifter and $\mathrm{HD}_r$ is city $r$'s highway density (kilometres of highway per km$^2$). The coefficient $\gamma_3$ captures the interaction effect: it allows the scale elasticity to vary with highway density. The coefficients of interest may not be identified correctly if population follows urban structure.
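The two-step procedure described above can be sketched on simulated data. This is a minimal illustration rather than the estimation code used in the paper: the data are synthetic, the worker controls are reduced to two hypothetical variables (`age`, `college`), the coefficient values in `true_gamma` are assumed, and the regressors are centred so that the main effects apply to the average city.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated, hypothetical data: 40 cities with 400 workers each.
n_city, n_per = 40, 400
log_pop = rng.normal(13.0, 1.0, n_city)
log_hd = rng.normal(-2.0, 0.5, n_city)
lp = log_pop - log_pop.mean()   # centred log population
lh = log_hd - log_hd.mean()     # centred log highway density
true_gamma = np.array([0.03, 0.01, 0.02])   # assumed gamma_1, gamma_2, gamma_3
premium = (true_gamma[0] * lp + true_gamma[1] * lh
           + true_gamma[2] * lh * lp + rng.normal(0.0, 0.005, n_city))

city = np.repeat(np.arange(n_city), n_per)
age = rng.uniform(20.0, 60.0, city.size)
college = rng.integers(0, 2, city.size).astype(float)
log_wage = (2.0 + 0.01 * age + 0.3 * college + premium[city]
            + rng.normal(0.0, 0.2, city.size))

# Step 1: Mincer regression with a full set of city dummies (no separate constant).
dummies = np.zeros((city.size, n_city))
dummies[np.arange(city.size), city] = 1.0
X1 = np.column_stack([age, college, dummies])
coef1, *_ = np.linalg.lstsq(X1, log_wage, rcond=None)
premium_hat = coef1[2:]   # estimated city wage premia

# Step 2: regress the premia on log population, log highway density and their product.
X2 = np.column_stack([np.ones(n_city), lp, lh, lh * lp])
gamma_hat, *_ = np.linalg.lstsq(X2, premium_hat, rcond=None)
print(np.round(gamma_hat[1:], 3))
```

In this simulation population does not respond to highway density, so plain least squares in the second step is adequate; the identification concern raised in the text arises precisely when that assumption fails.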
Following our model given above, a compensating differential in the spatial equilibrium may cause a log-linear relationship between population and infrastructure. Thus, we obtain a two-equation system of the city-level productivity regression and the population-infrastructure interaction, as in equation (19). For instance, if in equilibrium $\log \mathrm{pop}_r = \delta \log \mathrm{HD}_r$, the joint system implies that population and highway density are collinear, and the conditioning effects of infrastructure (indeed, any effects of infrastructure) cannot be recovered. One strategy is to use the model's prediction that in equilibrium average trip times are collinear with city population size (equation 21). We explore that in a robustness check.
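Our main empirical strategy, however, is to isolate exogenous variation in population. Suppose that the city population $\log \mathrm{pop}_r$ is a (multiplicative) composite of a given, historical population ($\log \mathrm{pop}^h_r$), and the relative population adaptations to infrastructure to satisfy the spatial equilibrium ($\log \widetilde{\mathrm{pop}}_r$), so that:

$$\log \mathrm{pop}_r = \log \mathrm{pop}^h_r + \log \widetilde{\mathrm{pop}}_r.$$

The estimating equation could then be written as:

$$\alpha_r = \gamma_1 \log \mathrm{pop}^h_r + \gamma_2 \log \mathrm{HD}_r + \gamma_3 \log \mathrm{HD}_r \times \log \mathrm{pop}^h_r + \left[\gamma_1 \log \widetilde{\mathrm{pop}}_r + \gamma_3 \log \mathrm{HD}_r \times \log \widetilde{\mathrm{pop}}_r\right].$$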
According to our model's predictions on population location choices (equation 20), an OLS regression might not provide unbiased estimates of $\gamma_2$ and $\gamma_3$. To address this, we effectively treat the term in brackets (containing the population adaptations to infrastructure) as a measurement error and subsume it in the model's error term. An IV regression with an instrument that is related to current population $\log \mathrm{pop}_r$ through its historical component $\log \mathrm{pop}^h_r$, but not to the population-adaptation component, can then identify effects of infrastructure, because the variation in the adaptation component is not used for identification. Intuitively, if we only use exogenous variation in the population, we identify effects of highway density without considering the variation in population that responded to highway presence.
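The identification argument can be illustrated with a small two-stage least squares sketch on simulated data. Everything here is an assumption made for illustration: the data-generating process (an adaptation component `v` that is correlated with the productivity error), the coefficient values, the demeaned logs and the function name `two_stage_least_squares`. It is not the authors' estimation code.

```python
import numpy as np

def two_stage_least_squares(y, X_exog, X_endog, Z_excl):
    """2SLS: project all regressors on the instrument set, then run OLS."""
    Z = np.column_stack([X_exog, Z_excl])              # full instrument set
    X = np.column_stack([X_exog, X_endog])             # all regressors
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]   # first-stage fits
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]

rng = np.random.default_rng(3)
n = 5000                                   # hypothetical number of cities
log_hd = rng.normal(0.0, 1.0, n)           # demeaned log highway density
log_pop_h = rng.normal(0.0, 1.0, n)        # demeaned log historical population
v = rng.normal(0.0, 0.5, n)                # population adaptation shock
log_pop = log_pop_h + 0.5 * log_hd + v     # population follows infrastructure
u = -0.1 * v + rng.normal(0.0, 0.02, n)    # error correlated with adaptation
g1, g2, g3 = 0.04, 0.01, 0.02              # assumed true coefficients
premium = g1 * log_pop + g2 * log_hd + g3 * log_hd * log_pop + u

X_exog = np.column_stack([np.ones(n), log_hd])
X_endog = np.column_stack([log_pop, log_hd * log_pop])
Z_excl = np.column_stack([log_pop_h, log_hd * log_pop_h])

iv = two_stage_least_squares(premium, X_exog, X_endog, Z_excl)
ols = np.linalg.lstsq(np.column_stack([X_exog, X_endog]), premium, rcond=None)[0]
# Coefficient order in both vectors: [const, log_hd, log_pop, interaction].
```

Under these assumptions the OLS coefficient on log population is biased by the adaptation component, while the IV estimate, which uses only historical variation, stays close to the assumed value.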
In this context, it is important to note the consequences of using an invalid instrument. If instrument exogeneity is violated, the instrument is associated with endogenous variation in population. Following our theory, if we fail to identify exogenous population variation, our estimate will incorporate a population response to highway density differences. Thus, an endogenous instrument would bias the results against finding an effect of highways. In other words, using an invalid instrument would preclude us from recovering a potential direct productivity effect of infrastructure.
For an instrument, we require variation in population that is not the result of the current-day highway network. The first instrument is obvious, and often used: the log of population for each metropolitan statistical area (MSA) in the 1920 census (Combes, Duranton, Gobillon, & Roux, 2010). As a second instrument, we consider the historical degree of banking penetration in each MSA in 1920. In 1920, mortgages were only provided by banks operating regionally, so that variation in housing finance was large between cities. Most infrastructure (in particular the highways we use in the following empirical analysis) was government financed and built later, so that 1920 bank penetration likely causes residential variation. We use the deposits per head in each city from the 1920 census.

RESULTS
Table 1 shows the regressions explaining productivity from population size and the road network. The estimating equation is the model's prediction (equation 10), or its empirical form (equation 19). The first column shows an OLS regression of the log wage premium on logs of highway density, population and their interaction. It shows that log population is significant in explaining urban productivity, but highway density is not. The interaction coefficient is not significant either, suggesting no modulating role of infrastructure.
The linear interaction between roads and population might show no effects because it does not precisely reflect the functional form of the actual interaction effect. To explore more flexible functional forms, we calculate sample quantiles for highway density and allow the effect of population on the wage premium to vary over five (column 2) or 10 quantiles (column 3), with the first quantile as the reference case. We find little statistically significant variation in the coefficients across quantiles, and the magnitude of the deviations is small. The absence of significant modulating effects of infrastructure appears to be robust to the choice of functional form.
The fourth column shows the same interaction regression, but with historical or 'deep lag' instruments for population. The instruments are the log of 1920 MSA population, and that variable interacted with highway density. The results show that the coefficients for highway density and its interaction with log population are significantly different from zero. Note that this occurs despite slightly larger standard errors, as the magnitude of the coefficients grows (IV might yield less precise estimates than OLS). The estimates imply productivity effects of highway density, when log population is instrumented for. The coefficients cannot be interpreted in isolation, as there is an interaction coefficient, which shows that the scale effects vary with highway density. For the median highway density, the productivity to scale elasticity is 2.0% (p = 0.09); while at the 25th percentile of highway density, the effect is statistically not different from zero; at the 75th percentile, the elasticity is 4.0% (p = 0.00); and at the 90th percentile, the elasticity is 5.2% (p = 0.00) (see note 6). As a measure of instrument relevance, we report the Kleibergen-Paap Lagrange multiplier (LM) test, which is robust to the fact that the instruments are correlated among themselves. The first-stage regressions for the IV results in columns (4-7) are reported in Table C1 in Appendix C in the supplemental data online.
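As a reading aid, the percentile-specific elasticities follow from differentiating the interaction specification with respect to log population. The short derivation below is a sketch that uses only the elasticities quoted in the text; $\gamma_1$ and $\gamma_3$ denote the population and interaction coefficients, and $\mathrm{HD}_{50}$ and $\mathrm{HD}_{75}$ are shorthand introduced here for the median and 75th-percentile highway densities.

```latex
\frac{\partial \alpha_r}{\partial \log \mathrm{pop}_r}
  = \gamma_1 + \gamma_3 \log \mathrm{HD}_r ,
\qquad
\gamma_3 \left( \log \mathrm{HD}_{75} - \log \mathrm{HD}_{50} \right)
  = 0.040 - 0.020 = 0.020 .
```

In words, the two-percentage-point rise in the scale elasticity between the median and the 75th percentile is the interaction coefficient multiplied by the corresponding difference in log highway density.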
We focus on highway density (road length divided by surface area) as a scale-independent measure for the ease of transportation to other locations in a city. The related literature investigates urban growth with cities' aggregate highway length (Duranton & Turner, 2012). In the logarithmic specification, the difference between log road-kilometres and log road density is the log city surface. For comparison, column (5) adds the log surface area of the MSA to the baseline regression. When controlling for city surface area, the results attenuate slightly, but are qualitatively similar.
Table notes: Robust standard errors are given in parentheses. FE, fixed effects; IV, instrumental variables; LM, Lagrange multiplier; OLS, ordinary least squares. ***p < 0.01, **p < 0.05, *p < 0.1.
We also report the results of IV regressions using bank deposits per capita as an instrument, in addition to the historical population level. An additional instrument provides more variation to identify the interaction effect, and it allows us to test the over-identifying restrictions. The results are shown in Table 1, column (6). The motivation to use historical bank deposits as an alternative instrument is that regional variation in bank access in 1920 is strongly related to residential choices, but less correlated with later infrastructural investments (especially the highways that were financed federally). The results in column (6) are similar to, if slightly stronger than, those from the instrumentation based on historical population alone. The Kleibergen-Paap test shows significant instrument relevance. The Hansen J-test, permitted by the additional instrument, gives no evidence against the over-identifying restrictions.
One might also argue that highway construction itself is endogenous. We do not consider that a first-order effect in most regressions, as we are interested in the change in moderating effects of highways: not the level but the interaction effect. Nevertheless, such endogeneity may affect the rest of the regression. To check our results, we additionally exploit the 1947 highway plan as an established instrument for highways (Duranton & Turner, 2012) in column (7). The three terms (highway density, population and their interaction) are instrumented by the historical population, the highway density in the 1947 highway plan, and the interactions of historical population and bank density with the 1947 plan highway density. It is the same regression as in column (6), but with highway density instrumented with its 1947 planned density. Instrumenting for another variable puts more demands on the data, but the results are very similar. The instruments are relevant and there is no evidence of over-identification. The interaction coefficient is statistically significant and similar to the regression in column (6), if somewhat stronger. Figure 2 illustrates the difference in conclusions from the regular regression and the IV regression of column (7). The dashed line at 0.038 is the unmoderated effect of log population on the log wage premium (productivity), based on the IV estimates (column 9 of Table 1). With a regular OLS regression (column 1 of Table 1; shown in grey in Figure 2), log highway density shows no moderating effect on the scale elasticity. For no value of highway density does the elasticity differ significantly from the average, unmoderated effect (the dashed line). The interpretation of the IV regression (column 7 of Table 1; shown in maroon in Figure 2) is different. The scale elasticity of population to productivity grows in the highway density.
Over the lower range of highway density, the scale elasticity is not significantly different from zero, but it is statistically different from zero from around -2 log density. The scale elasticity is also statistically significantly higher than the average scale elasticity if the log highway density is over 1. Similar conclusions hold for the other IV regressions reported in Table 1. These findings are consistent with the idea that a larger population increases agglomeration benefits if the internal infrastructure allows more interaction (see note 7).

Interpreting the size of indirect effects
Our theory suggests that there may be substantial direct productivity effects of highways. Taking the theoretical results at face value, we can provide a back-of-the-envelope estimate of the direct and indirect effects of differences in highway density (i.e., direct increases in interaction between incumbent population versus the extra population that highways attract).
We provide an estimate of the share of direct effects by comparing two cases: the case where there is only a productivity effect, and the case where there is a productivity effect and a spatial equilibrium. In the latter case, we allow population to change with road density changes, and consequently we allow the higher productivity to attract more population. Effectively, we compare causal productivity effects (based on the IV regression in Table 1, column 7) with a productivity effect allowing population to adapt (according to the estimates in column 1). We detail our calculation in Appendix A in the supplemental data online. The coefficient estimates imply that the direct productivity-inducing effect of highways accounts for around 24% of the total effect (the total effect includes population responses to highways; see Appendix A in the supplemental data online). Informally, improving highways yields a significant effect through increased connectivity for the people who already live in the city, but the effect roughly quadruples in size because the highway and its wage effects also attract migrants and increase scale. We do note that this estimate is based on the theoretical results.
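The "roughly quadruples" statement is simple arithmetic on the 24% direct share. The sketch below makes it explicit. The function name `decompose` is hypothetical, and the Appendix A calculation that produces the 24% figure is not reproduced here.

```python
def decompose(direct_share):
    """Split a total effect (normalised to 1) into direct and indirect parts."""
    if not 0 < direct_share <= 1:
        raise ValueError("direct_share must lie in (0, 1]")
    indirect_share = 1.0 - direct_share
    multiplier = 1.0 / direct_share   # total effect relative to the direct effect
    return indirect_share, multiplier

# 0.24 is the direct share reported in the text.
indirect_share, multiplier = decompose(0.24)
print(f"indirect share: {indirect_share:.2f}, total/direct: {multiplier:.2f}")
# prints "indirect share: 0.76, total/direct: 4.17"
```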

Robustness checks
First, to corroborate the population instruments, we also report the regular Mincer regression with only the level of population instrumented. Column (8) of Table 1 shows a wage regression with historical population as an instrument. Column (9) shows the regression with historical population and bank deposits per capita as instruments, which permits testing the over-identifying restrictions. The bank-based instrument only slightly attenuates the estimated coefficient (from 0.040 to 0.038). A Sargan test gives no evidence against the over-identifying restrictions. The estimated elasticities in columns (1), (8) and (9) are consistent with most other literature, which reports agglomeration elasticities around 4% (e.g., see the meta-analysis by Melo et al., 2009). Altogether, the performance of our historical instruments seems in line with the literature identifying agglomeration effects.
Second, instead of isolating exogenous cross-sectional population variation, one might discard potential sources of cross-sectional endogeneity altogether. The identification relies on the argument that cross-sectional population variation may be endogenous, but changes over time might reflect the spatial equilibrium less perfectly. If so, one would expect statistically significant interaction effects under identification based on time variation. Using variation over time helps to rule out potentially confounding unobserved variables, such as features of geography that affect productivity as well as highway density.
We report the results of a panel regression as a robustness check in Appendix D in the supplemental data online. It is based on 1983-2003 variation in highway density (Duranton & Turner, 2012) and corresponding wage variation in the census. The results imply that the scale elasticity is estimated at around 4%, whether using time variation or pooled variation. The interaction of highway density and population is insignificant in explaining productivity when identified from cross-sectional and time variation. However, when ruling out cross-sectional variation using fixed effects, the interaction effect is statistically significant and very close to the IV estimates. That is consistent with the idea that cross-sectional variation may reflect the confounding effect of urban population with the direct road effects.
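The logic of ruling out cross-sectional variation with fixed effects can be sketched with a within transformation on simulated data. This is an illustration under stated assumptions, not the Appendix D specification: it uses a single hypothetical regressor `x` rather than the highway-population interaction, a made-up time-invariant city trait `c`, and arbitrary sample sizes.

```python
import numpy as np

rng = np.random.default_rng(2)
n_cities, n_periods, b_true = 200, 3, 0.04   # hypothetical panel dimensions

# A time-invariant city trait c raises both the regressor and the outcome.
c = rng.normal(0.0, 1.0, (n_cities, 1))
x = c + rng.normal(0.0, 1.0, (n_cities, n_periods))
y = b_true * x + 0.1 * c + rng.normal(0.0, 0.01, (n_cities, n_periods))

def slope(x, y):
    """Slope of a simple OLS of y on x with a constant."""
    xd, yd = x - x.mean(), y - y.mean()
    return float((xd * yd).sum() / (xd * xd).sum())

# Pooled estimate mixes cross-sectional and time variation.
pooled = slope(x.ravel(), y.ravel())

# Within transformation: subtract each city's own time mean (city fixed effects).
x_w = x - x.mean(axis=1, keepdims=True)
y_w = y - y.mean(axis=1, keepdims=True)
within = slope(x_w.ravel(), y_w.ravel())
```

In this simulation the pooled slope absorbs the city trait, while the within slope uses only changes over time and stays close to the assumed coefficient.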

Alternative model prediction: travel time
An alternative prediction of the model is that the externality is captured by the travel time (effort in interaction) rather than the population size of the city. This is reflected in the model's result that conditional on travel times, population does not explain agglomeration externalities in equilibrium (equation 11). The intuition is that the externality relies on aggregate interaction, which is the product of the possibilities to interact (population size) and the ease of interacting (travel costs per kilometre). The result suggests that cities of equal population size do not have similar productivities if citizens of one city travel more.
To test whether interactions matter in addition to population size, we nest the log population and the actual travel times as competing explanations for productivity. The resulting regression is statistically similar to the above regressions. However, the expected coefficients are different: instead of identifying a role for infrastructure, we expect the role of population to diminish when we enter a theory-driven measure of interaction in the regression. Statistically, our model suggests that there is collinearity between the average travel times and population size, because population size is one of the parameters that determine how much a worker chooses to travel to absorb externalities.
Our proxy for internal travel times is the average reported commuting time between PUMA areas within a city, weighted by the size of the commuting flow (from the ACS definitions). Reiterating, an OLS regression of the metropolitan wage premia (detailed above) on the log of metropolitan population suggests a scale elasticity slightly under 4%, which is consistent with most of the other literature on agglomeration externalities (first column of Table 2). Table 2 also shows the results of combining log population and log commuting times as explanatory variables. Column (2) suggests that once commuting times are controlled for, population plays a far smaller role in determining urban productivity. The coefficient of log population on urban wage premia falls by 55%. The coefficients on population and travel-to-work times represent elasticities, but their empirical relevance is hard to compare: while one city's population may be 50-fold the population of another city, travel times do not have such proportional variation. Therefore, we also report the beta coefficients: how many standard deviations (SDs) the log wage premium changes in expectation when the independent variable changes by one sample SD. These suggest that the effect of population is sizable in isolation (0.47 SD, column 1 beta). However, the role of population is smaller conditional on travel times, while the variation in travel time yields a substantial effect (0.35 SD, column 2 beta).
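Beta coefficients rescale each slope by the ratio of the regressor's sample SD to the outcome's sample SD. The sketch below shows one way to compute them. The data are simulated and the function name `beta_coefficients` is hypothetical; none of the numbers correspond to Table 2.

```python
import numpy as np

def beta_coefficients(X, y):
    """Standardised (beta) coefficients from an OLS of y on X plus a constant."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    b = np.linalg.lstsq(design, y, rcond=None)[0][1:]   # drop the constant
    return b * X.std(axis=0, ddof=1) / y.std(ddof=1)

# Hypothetical data: log population varies far more than log commuting time.
rng = np.random.default_rng(1)
log_pop = rng.normal(13.0, 1.2, 300)
log_commute = 2.5 + 0.1 * log_pop + rng.normal(0.0, 0.1, 300)
premium = 0.02 * log_pop + 0.3 * log_commute + rng.normal(0.0, 0.05, 300)

X = np.column_stack([log_pop, log_commute])
betas = beta_coefficients(X, premium)   # SD change in premium per SD change in x
```

The same numbers result from regressing the z-scored outcome on the z-scored regressors, which is why the coefficients are comparable across variables with very different spreads.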
Clearly, as argued by our urban model, population size may be endogenous, and so may be commuting times. To investigate whether a possible simultaneity bias affects the result, we instrument both variables with a number of instruments suggested and tested by the literature. As before, we use the log of 1920 population in the metropolitan area, assuming that it affects the current-day population but not current wages directly. In addition, we exploit exogenous variation in the Duranton and Turner (2012) measure of the 1947 highway plan, and we use two physical geography measures to exploit additional exogenous variation. First, the elevation range is an arguably exogenous determinant of the difficulty of city expansions, as well as the presumed speed on the infrastructure network. Second, we use the yearly number of cooling days, which may both make the city less attractive to its citizens directly, and pose problems for its internal transport. The first-stage regressions are reported in Appendix C in the supplemental data online.
The IV result suggests that population does not explain urban productivity once intra-city commuting times are controlled for. The coefficient for commuting times is statistically significant and large (informally, a sample SD increase in equilibrium travel times raises productivity by roughly 1 SD). Jointly, our instruments seem relevant. As there may be correlation between the individual instruments, we employ an Anderson canonical correlations test, which suggests our instruments are relevant (p < 0.07; the individual F-tests for both instrumented variables are significant beyond the fourth decimal). The Sargan test gives no evidence against the over-identifying restrictions, so that our instruments do not seem to be correlated with the second-stage regression error; that is, the exogeneity requirements appear to be met. In unreported regressions, we have dropped individual instruments from the instrument set, but that affects neither the relevance and exogeneity tests nor our estimates much.
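For readers unfamiliar with the over-identification test, the sketch below computes a Sargan statistic as n times the uncentred R-squared from regressing the 2SLS residuals on all instruments. It is an illustration on simulated data, not the paper's test output. The four instruments and two endogenous regressors only mirror the counts described above, and the closed-form p-value used here holds only when the test has exactly two degrees of freedom.

```python
import math
import numpy as np

def sargan(y, X_exog, X_endog, Z_excl):
    """Sargan statistic and degrees of freedom for a 2SLS regression."""
    n = len(y)
    Z = np.column_stack([X_exog, Z_excl])
    X = np.column_stack([X_exog, X_endog])
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    b = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    resid = y - X @ b                           # residuals use the actual regressors
    fitted = Z @ np.linalg.lstsq(Z, resid, rcond=None)[0]
    r2 = (fitted @ fitted) / (resid @ resid)    # uncentred R-squared
    return n * r2, Z_excl.shape[1] - X_endog.shape[1]

rng = np.random.default_rng(4)
n = 2000
Z_excl = rng.normal(size=(n, 4))                # four hypothetical instruments
v = rng.normal(size=(n, 2))
x1 = Z_excl @ np.array([1.0, 0.5, 0.3, 0.2]) + v[:, 0]
x2 = Z_excl @ np.array([0.2, 0.3, 0.5, 1.0]) + v[:, 1]
u = 0.5 * v[:, 0] + 0.5 * v[:, 1] + rng.normal(size=n)
y = 1.0 + 0.04 * x1 + 0.3 * x2 + u              # instruments are valid here
y_bad = y + 0.5 * Z_excl[:, 0]                  # first instrument enters directly

X_exog = np.ones((n, 1))
X_endog = np.column_stack([x1, x2])
stat_ok, df = sargan(y, X_exog, X_endog, Z_excl)
stat_bad, _ = sargan(y_bad, X_exog, X_endog, Z_excl)
p_ok = math.exp(-stat_ok / 2)                   # chi-squared tail, valid for df = 2 only
```

When an instrument affects the outcome directly, as in `y_bad`, the statistic becomes large and the restrictions are rejected. A small statistic, as reported in the text, gives no evidence against instrument exogeneity.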
Interpreted as a structural estimate of our model, these coefficients suggest that the actual interaction is determined by the cost of travel, as well as the multitude of potential travel destinations. The resulting choice is travel time, which incorporates the (population) size of the city as well as its internal travel frictions. From our theory, the insignificance of the coefficient for log population thus points to a role of travel frictions inside the city.

CONCLUSIONS
Agglomeration benefits thrive with interaction between citizens. Efficient infrastructure, such as a good road network, increases the effective proximity of citizens, and should increase the benefits of population agglomeration. However, the benefits of good infrastructure might be hard to identify if the urban population moves when infrastructure changes.
We develop a stylized model of travel and migration choices in cities that exhibit localized agglomeration externalities. Agglomeration externalities occur when citizens travel to other locations inside the city. Their travel choice is the result of the number of people in the city and the ease of travelling. Apart from internal travel choices, citizens may migrate. Migration obscures the effects of travel infrastructure on agglomeration benefits. Citizens locate where roads are good (and travel is easy), so it becomes difficult to distinguish the productive effects of good roads from the productive effects of population size. We show that in the log-linear model that typically motivates studies of agglomeration effects, the population response may perfectly absorb any effects of infrastructure.
We test the model's predictions in a cross-section of US cities. We find little evidence of productive effects of highways in cities when we control for population levels. However, when we exploit variation in population that is arguably unrelated to infrastructure, we find that highway density does moderate agglomeration effects: cities with denser highway networks have substantially larger returns from agglomeration. The differences in returns may be sizable: the productivity-to-city size elasticity is around 2% at median highway density (approximately Buffalo, NY) but varies from 0% to 4% over the interquartile range of highway density (i.e., approximately from Grand Rapids, MI, to Santa Cruz, CA). Using our estimates as structural parameters in our theory suggests that roughly one-quarter of wage increases associated with denser highway networks are due to better connected citizens; and three-quarters are due to the fact that more people reside where highways are better, thus increasing the scale effect per se. The model's second discriminating prediction, that travel times and not population per se explain agglomeration externalities, also finds support in the data. Our results hold with different instrumentations and in time variation as well as cross-sectional variation. We use fairly established instruments for population, and test for their exogeneity. Nevertheless, omitted variables related to population may still bias our estimates. In future research, longer time variation in infrastructure data may solve this issue. Similarly, other infrastructure, not considered here, may affect productivity: railroads lead to population movements, and efficient public transit may foster interactions (e.g., Baum-Snow et al., 2017; Chatman & Noland, 2014). Vice versa, our road measure may also impact other relevant outcomes. For example, changes in road network densities could lead to a change in transport mode choices (Bento, Cropper, Mobarak, & Vinha, 2005).
Our results may explain why infrastructural effects play a seemingly small role in generating productivity at urban levels. In the broader literature, the estimated effects of infrastructure differ markedly, depending on the scale of the analysis (e.g., urban versus project basis), which makes it hard to draw policy conclusions from the academic literature (Banister & Thurstain-Goodwin, 2011; Organisation for Economic Co-operation and Development (OECD), 2008; Vickerman, 2007). Our results may help reconcile the differences, and caution that regressions in cross-sections of cities may easily understate the productive benefits of local infrastructure or transport investments.
Our results also square with more circumstantial evidence on infrastructure's productive effects. They are in line with earlier evidence on firm productivity (instead of worker productivity) for cities (Eberts & McMillen, 1999) or regional evidence (Kelejian & Robinson, 1997). Several recent observations in urban economics are consistent with our conclusion. The role of the spatial equilibrium is closely in line with the theoretical observation that the elasticity of population with respect to commuting costs is equal to 1 (Duranton & Puga, 2013, p. 6). Changes in commuting costs may be fully accommodated by growth in the city population. Employment is also larger in cities that have more highways (Duranton & Turner, 2012). Duranton and Turner (2011) find that adding roadway lanes to interstate highways increases the vehicle-kilometres travelled proportionally, suggesting increased travel demand absorbs the (congestion-reducing) benefits of infrastructure improvements. Given that population, employment and travel behaviour respond to infrastructure, it may not be surprising that productivity responds to infrastructure, too. However, our aim is to demonstrate that, consequently, the effect of infrastructure is not recovered in a naive analysis; and to provide ways to recover it.

DISCLOSURE STATEMENT
No potential conflict of interest was reported by the authors.

SUPPLEMENTAL DATA
Supplemental data for this article can be accessed at https://doi.org/10.1080/00343404.2017.1369023

NOTES
1. The effect and size of the agglomeration elasticity has been extensively researched. Additionally, recent approaches extend this view to consider local access and different (see Graham, 2007b; Redding & Turner, 2014; and the articles discussed in this section).
2. Although there are not necessarily more private social connections (Brueckner & Largey, 2008).
3. This could easily be extended to workers learning more from other workers who are more productive, so that destinations are further differentiated. However, while this matters a lot for the welfare conclusions about internal travel, it turns out not to matter for the motivation of the empirical specification.
4. An obvious alternative for the symmetry assumption is the organization in a monocentric city. An earlier version of this paper reached similar conclusions with workers living around a central business district (CBD), with an externality on labour supply. It is available from the authors upon request.
5. The results from the Mincer regressions can be obtained from the authors upon request.
6. We report below a visual interpretation of the interaction based on column (7).
7. The regression's interaction between population and highway density alternatively can be interpreted as the conditional effect of infrastructure, given the population. As that interpretation is very related to Figure 2, we present it in Figure B1 in Appendix B in the supplemental data online. Figure B1 suggests that the elasticity of the city wage premium with respect to infrastructure rises in city size, and is significantly different from zero for larger cities.