An alternative approach to estimating agglomeration and productivity using geography, demography and evidence from satellite imagery

ABSTRACT In this article we combine novel geographical, demographic and satellite data with an instrumental variables (IVs) estimation approach to tackle a thorny set of data and econometric issues. Studies of agglomeration effects on economic productivity at the city scale have been growing in popularity in both the literature and the interests of urban policymakers in recent years. Despite growing sophistication in econometric approaches, underlying weaknesses in the quality and availability of data at the city scale have hindered efforts to unravel the endogeneity issues, or circularity between city scale, amenity and productivity. We set out both to extend the agglomeration literature to the case of Australia – which hitherto has been lightly studied in this field – and to take advantage of Australia’s unique geography and reliance on migration to drive population change. We also draw on satellite data measuring night-time luminosity as a proxy for economic productivity. We demonstrate the construction of convincing and robust IVs that proxy for population at the city scale. Our econometric estimates indicate that a 1% increase in city populations causes productivity to be enhanced by 0.24–1.70% on average.


INTRODUCTION
In recent years there has been a resurgence of interest in the concept of agglomeration economies, and in empirical testing strategies designed to uncover the strength and direction of such effects, together with the circumstances in which they may and may not hold. The advantages to economic productivity that arise due to city scale and/or density are argued to be potentially wide-ranging, including competitive advantages that arise when firms are able to access larger markets, internal advantages to the firm such as access to more highly trained labour supply and lowering costs through accessing established supply chains (Puga, 2010). Positive agglomeration effects also occur over space, such that concentrations of activity affect outcomes in nearby urban neighbourhoods, cities and/or regions (Combes & Gobillon, 2015). It has also been argued that agglomeration effects are unlikely to be linear and may be subject to threshold effects. This gives rise to the possibility that negative agglomeration effects arise at high concentrations of economic activity (Wheeler, 2003).
Questions about agglomeration effects generate interest further afield than the economics literature. They are pertinent to urban and regional growth strategies because decisions about where to encourage and concentrate economic activity impact on productivity, hence the overall economic output of a nation. Agglomeration economies are arguably more important than ever to the urban planning debate in the wake of the Covid-19 pandemic given the imperative to rebuild economic output rapidly in order to repay the record sovereign debts incurred to pay for economic stimulus measures. In some countries, as in Australia where economic growth has been predicated on significant overseas migration for decades (McDonald & Temple, 2010), the debate is particularly important.
As we shall see in the next section, the empirical evidence on agglomeration economies that exists is heavily based on the North American (particularly US) experience and, to a lesser extent, European cities. Very little is known about the applicability, existence or scale of agglomeration effects outwith these geographical contexts. The case of Australia is particularly interesting given its enormous scale of 7.692 million km², modest population of 25 million in 2019 (Australian Bureau of Statistics (ABS), 2016), and heavy concentration of economic activity in a small number of major capital cities (Kelly & Donegan, 2014). Distances between concentrations of economic activity in Australia tend to be very large, and it is not clear if or how this affects the relationship between city scale and economic productivity. By extending the empirical analysis to the case of Australian cities, we make a novel contribution to the literature.
In this article we deliberately depart from the econometric-led approaches to untangling endogeneity that have become dominant in the economics literature. Australia's geographical scale has an analytically attractive property because numerous climatic zones are present in a single country. Climate conditions are an important determinant of individuals' decisions to migrate both to and within a country (Jordan, 2007). People particularly prefer to reside in cities with more liveable climates such as mild, warm and cool climates (Albouy & Lue, 2015; Cragg & Kahn, 1997). Climate conditions are pre-determined and exogenous (Dell et al., 2014). Additionally, Australia's population expansion after 1995 is due to overseas migrant intakes, and the number of overseas migrants who arrived in Australia is positively influenced by migrant quotas and the number of visas issued (see the third section below for more details). As only the federal government of Australia has authority to set migrant quotas or issue visas, changes in quotas and the number of visas issued can create exogenous variation in Australian city populations. Therefore, we combine city climates with quotas or visa issuance to construct new instrumental variables (IVs) for city population to overcome the endogeneity and reverse causality estimation issues. To the best of our knowledge, this has not been attempted elsewhere in the literature.
A further empirical innovation is our use of information on satellite-recorded night luminosity data to reveal the concentration of human activity in urban areas, and to act as a proxy for urban productivity. As productivity is positively related to (real) gross domestic product (GDP) and income levels that are shown to be well approximated by night-time luminosity (Pinkovskiy & Martin, 2016), there is a logical expectation that a more economically productive city will be more lit. It has also been shown that cities with larger populations of skilled people are more brightly illuminated at night (Dingel et al., 2021). Many previous empirical studies (see Table A1 in Appendix A in the supplemental data online) have employed micro- (or survey) and macro-data to measure productivity. However, such approaches suffer from errors induced through measurement and/or self-reporting (Kjellsson et al., 2014). To address this issue, our study follows Henderson et al. (2012) in the use of impersonally collected night-time luminosity data. Unlike micro- (survey) or macro-data as used in many previous studies, satellites collect data impersonally and this helps to mitigate the problem of measurement errors in productivity data.

ADDING GEOGRAPHICAL APPROACHES TO THE STUDY OF AGGLOMERATION ECONOMIES
The theoretical basis for expecting interactions between urban scale (or density) and economic productivity is well rehearsed elsewhere in the literature. Briefly, Marshall (1920) set out three broad categories of agglomeration economies: labour pooling, shared input markets and technological spillovers. Although these principal categories have since been refined, with additional arguments added, they form the basis for much of the empirical work that has been published in this vein. More recently, Glaeser's (2008) general equilibrium model has been widely used and replicated in modelling the impact of urban economic development policy. Elsewhere, Glaeser and Gottlieb (2009) have argued that attracting workers to an area will boost local productivity through human capital spillovers, but they also note that low-income workers are attracted to urban areas with greater employment opportunities. Indeed, they make numerous cautionary points about the adoption of agglomeration economics in policy, including the fact that such effects are likely to be non-linear, difficult to quantify and subject to threshold effects. In addition, it is very difficult to determine whether gains through agglomeration benefits arising from shifting economic activity to one location more than offset losses experienced elsewhere.
Several recent studies have examined the relationship between employment density and labour productivity (e.g., Brulhart & Mathys, 2008; Ciccone & Hall, 1996; Morikawa, 2011), while others have examined the impact on other proxies for productivity, such as patent intensity (Carlino et al., 2007). Wheeler (2001) focused on the first of Marshall's categories (labour pooling) and found that the effects of larger markets on productivity are greater for higher skilled workers. Strange (2003, 2008) examine information spillovers and find that the benefits of spatial concentration fall quickly with distance, but that labour market pooling and shared input benefits dissipate more slowly. Baldwin et al. (2010), based on Canadian data, also found that knowledge spillovers are bounded at a distance of 10 km from a plant. Significantly, this study also found that spillover effects are non-uniform between industrial sectors, adding to the complexity.
A summary of past empirical studies is shown in Table A1 in Appendix A in the supplemental data online. In our survey of the literature, we found a heavy reliance on survey-based and/or macro-datasets to measure (or proxy) productivity. However, we note that such data are subject to measurement errors (e.g., Holland & King, 1979; Johnson et al., 2013; Kjellsson et al., 2014). A second limitation is that many of the studies surveyed focused either on a subset of economic sectors or on a single economic sector. While this simplifies the analysis, it nevertheless introduces the risk that important interactions between economic sectors are missed, giving rise to misleading results.
In order to address the endogeneity issue of estimating the causal effect of city populations on productivity, some studies have followed Ciccone and Hall (1996) and Combes et al. (2010) in using historical population levels and soil fertility, respectively, as IVs for city populations (e.g., Combes et al., 2011; Mion & Naticchioni, 2009; see also Echeverri-Carroll & Ayala, 2011). As these instruments are time invariant, only cross-sectional regression approaches may be used. Consequently, time-invariant city characteristics such as locations and land sizes that are correlated with population and productivity are omitted in the estimations. The result is that the estimates of the effect of city population on productivity may remain biased when such instruments are chosen.
In this study we investigate the link between city population size and productivity in 293 Australian 'cities' (defined as local government areas (LGAs) with populations over 10,000) between 2001 and 2013. It should be noted that the Australian definition of 'city' is more akin to the concept of an urban neighbourhood. By focusing on LGAs with 10,000 or greater population, our definition is similar to that of significant urban areas (SUAs) in Australia.
Aside from the contributions to the literature outlined above, we make two major empirical innovations to address the data and estimation issues that affect previous studies reliant on survey or macro-data. First, we deploy high-resolution satellite imagery night-time luminosity data from the US Air Force Defense Meteorological Satellite Program (DMSP) to proxy city productivity. As outlined in the introduction, GDP can be approximated by night-time luminosity (Henderson et al., 2012; Pinkovskiy & Martin, 2016). Additionally, Mellander et al. (2015) show that night-time luminosity is a good proxy for economic activities. With over 25 billion grid cells recorded in the DMSP's global system since 1992, the satellite-recorded light data represent a finely spatially grained, time-varying dataset that can proxy for the intensity of economic activities from all sectors. This choice of data has the additional advantage that we do not need to restrict the analysis to a subset of economic sectors that are arguably a good proxy for total productivity.
Our second innovation is the development of new IVs for city populations with temporal-spatial exogenous variations that are suitable for panel data analysis. We combine the following arguments and evidence to construct our instruments. First, we acknowledge that overseas migrant intake, rather than new births, has been the most significant component of population expansion in Australia since 1995. Second, we note that annual new migrant intakes are controlled by the federal government of Australia through setting permanent resident quotas and issuing visas annually. Third, we take account of the fact that climatic conditions help to explain differences in population levels between cities. Specifically, cities with more liveable climates (as defined in the third section below) tend to be more populous.
The remainder of the paper is structured as follows. In the next section we examine the role of immigration and climate in annual change to Australian cities' populations. The fourth section introduces our estimation approach, and the fifth section presents the data sources used in the study. Subsequent sections examine the results and robustness checks, and suggest future directions for studies of city productivity that combine geographical and econometric approaches.

OVERSEAS MIGRATION, GEOGRAPHY AND POPULATION AGGLOMERATION IN AUSTRALIA
Australia's population grew significantly between 1980 and 2019, from 15 million to 25 million (ABS, 2016). The average rate of growth was over 1.5% during this period, which is the second highest among the Organisation for Economic Co-operation and Development (OECD) countries. There are two contributing factors to population growth in Australia. Figure 1 demonstrates that new births added around 60,000 people each quarter to Australia between 1982 and 2006, and the addition modestly increased to 78,000 in 2017. About 50,000 overseas migrants arrived in Australia each quarter in 1982, and quarterly arrivals increased significantly to 138,000 in 2017. Figure 2 shows that, on average between 1982 and 1995, 54% of population expansion came from new births and 46% from new overseas migrant intakes. The picture was different in 2017, when 65% of the new population came from new overseas migrants, whereas 35% was from births. It is therefore clear that overseas migration rather than new births is the most important contributing factor to Australia's population growth after 1995.
The federal government of Australia promoted immigration-led population growth after 1995. This is because the federal government sets permanent resident quotas and issues visas annually to determine the number of overseas migrants arriving in Australia each year to fulfil the country's economic and labour market needs (Productivity Commission, 2016). For example, the federal government increased quotas and issued more visas to address a perceived labour supply shortage after 1995 (Spinks, 2010). These changes are shown in Figures 3 and 4. Australia's population began to grow more significantly after the federal government decided to increase permanent resident quotas and to issue more visas annually. The positive association between Australia's population and quotas or visas issued is reflected in the upward trends in log(population), log(quotas) and log(visa issuance) in Figures 3 and 4. The correlation between Australia's population and quotas or visas issued is over 0.82.
The positive relationship between overseas migrant intakes and population can also be shown at the city level. For instance, we selected four Australian capital cities (i.e., Sydney, Melbourne, Brisbane and Adelaide) and plotted the natural logarithm of their population levels against the log of permanent resident quotas and visa issuance from 2001 to 2016 in Figures 5 and 6. We can observe the upward trends in these cities, implying that there is a positive link between city-level population and the permanent resident quotas or visa issuance. The correlation coefficients are over 0.8, which suggests that the positive link between selected city population and quotas or visas issued is strong.
City populations and overseas migrants in Australia are distributed unevenly across climate conditions. Cities with more liveable climate conditions tend to be more highly populated (Albouy & Stuart, 2014). In Australia, the Bureau of Meteorology (BoM) developed seven climate zones. The zones we include are: Zone 1: Hot humid summer and warm winter climate; Zone 2: Warm humid summer and mild winter climate; Zone 3: Hot dry summer and warm winter climate; Zone 4: Hot dry summer and cool winter climate; Zone 5: Warm temperate climate; Zone 6: Mild temperate climate; and Zone 7: Cool temperate climate. Table 1 shows that the cities with a warm, mild or cool climate are much more populous than those with a hot or hot, dry summer climate.

ESTIMATION METHODOLOGY
In this section we set out our methodology for estimating the influence of city population on economic productivity using our novel approach. We begin with ordinary least squares (OLS) regression, then go on to discuss the estimation issues in this approach. To establish a causal relationship between population and productivity, and that the relationship is robust, we propose our IVs for Australian city populations and employ them in two-stage least squares (2SLS) and reduced-form regressions.

OLS regression
Equation (1) shows the OLS regression that relates the natural log of light intensity for city i in state s at year t, our proxy for the annual city productivity level, to the natural log of city population.
To control for the probable impact of educational attainment on productivity, we include the log number of people with a bachelor's degree or above (bachelorp). The variable greatercapital is a dummy variable equal to 1 if city i is in a greater capital area and 0 otherwise. Including this variable aims to control for the pattern in which population and economic activities are geographically clustered in the greater capital areas. μ i and μ t are the city and year fixed effects, respectively, which absorb all time-invariant city-specific characteristics and all time-varying (observed and unobserved) macroeconomic shocks that affect productivity. μ s is the state fixed effect, and we interact it with the year fixed effect (i.e., μ s × μ t ) to control for any factors that vary across states and years (e.g., annual gross state product (GSP) or state-level unemployment). Lastly, to account for potential spatial clustering, we cluster the idiosyncratic error term (ν is,t ) at the city level.
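For reference, the specification described in this subsection can be written out as below. This is a reconstruction from the variable definitions in the text, not a verbatim copy of equation (1): the constant c and the coefficient λ follow the notation of the second-stage equation, the lag k = 1 or 2 follows the lagged population terms reported in the results, and δ as the coefficient on the education control is an assumed label.

```latex
% Reconstructed sketch of equation (1); \delta is an assumed symbol.
\log(\mathit{lights}_{is,t}) = c
  + \beta_i \log(\mathit{population}_{is,t-k})
  + \lambda\,\mathit{greatercapital}_i
  + \delta \log(\mathit{bachelorp}_{is,t})
  + \mu_i + \mu_t + \mu_s \times \mu_t + \nu_{is,t},
  \qquad k = 1, 2
```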

Estimation issues in OLS regression
Even though we include control variables to reflect educational attainment, greater capital city and fixed effects, OLS regression cannot identify β i , the elasticity of productivity with respect to city population, as a result of the following issues. First, other determinants of productivity that are correlated with population are contained in ν is,t . These factors could be city housing prices and amenities (Albouy, 2016). Hence, the OLS regression is subject to omitted variable bias. To further examine the problem of endogeneity in the OLS regression we conduct Durbin-Wu-Hausman tests and present the results in Tables A2 and A3 in Appendix B in the supplemental data online. The results suggest that city population is endogenous to the error term in equation (1). Second, people prefer to work in cities with higher wage rates. This means that the OLS regression is susceptible to self-selection bias or reverse causality issues.

Two-stage least squares (2SLS) regression
To overcome the estimation issues in OLS regression, we employ an IV strategy for the 2SLS regression. Specifically, we interact two exogenous variables that are correlated with city populations. The first variable is climate, which varies across cities, and can be considered to be both predetermined and exogenous (Dell et al., 2014; Roos, 2005). Jordan (2007) and Albouy and Lue (2015) argued that areas with a mild, warm or cool climate are more favourable for habitation than those with hot or hot, dry summers (see also Cragg & Kahn, 1997; Albouy et al., 2013). As we have shown, Australian cities with one of these 'favourable' climates for habitation are more populous than those with a hot or hot, dry summer climate (see Table 1 for details). Thus, we construct a dummy variable (favourableclimate is ) that equals 1 if city i in state s has a mild, warm or cool climate (i.e., the city is in Zone 2, 5, 6 or 7) and 0 if it has a hot or hot, dry summer climate (i.e., the city is in Zone 1, 3 or 4). The second variable is the time-varying permanent resident quotas and number of visas issued. The rationale is that overseas migrant intakes rather than new births are the major contributor to population expansion after 1995 in Australia. The annual number of overseas migrants to arrive in Australia is controlled by the Australian federal government by setting permanent resident quotas and issuing visas annually. Therefore, we can employ annual quotas and visas issued to approximate yearly city population well (see the third section for more details). More importantly, as Australian cities do not have the authority to set quotas or to issue visas, the (log of) lagged quotas and number of visas issued (i.e., log(quota t-k ) and log(visasissued t-k ), with k = 1 or 2) are exogenous to a city's productivity (as proxied in this article by its night-time luminosity).
Our IVs combine climate and quotas or visa issuance, that is, favourableclimate is × log(quota t-k ) and favourableclimate is × log(visasissued t-k ). They enter the first-stage regression, equation (2), to predict the yearly city population; ρ i and τ i are the elasticities of city population with respect to climate and quotas/visas issued.
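The construction of the interaction instrument can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the quota figures are hypothetical, and only the zone grouping (Zones 2, 5, 6 and 7 as favourable) and the lags k = 1 or 2 come from the text.

```python
import math

# BoM zones the text treats as favourable for habitation (mild, warm or cool).
FAVOURABLE_ZONES = {2, 5, 6, 7}


def favourable_climate(zone: int) -> int:
    """Return 1 for a favourable climate zone, 0 for hot or hot, dry summer zones."""
    if zone not in range(1, 8):
        raise ValueError(f"unknown climate zone: {zone}")
    return 1 if zone in FAVOURABLE_ZONES else 0


def build_instrument(zone: int, year: int, quota_by_year: dict, k: int = 1) -> float:
    """Compute favourableclimate x log(quota lagged k years) for one city-year."""
    if k not in (1, 2):
        raise ValueError("the text uses lags k = 1 or 2")
    return favourable_climate(zone) * math.log(quota_by_year[year - k])


# Hypothetical quota figures, for illustration only (not official data).
quotas = {2009: 100_000, 2010: 120_000, 2011: 150_000}

mild_city = build_instrument(zone=6, year=2011, quota_by_year=quotas, k=1)
hot_city = build_instrument(zone=3, year=2011, quota_by_year=quotas, k=1)
print(mild_city, hot_city)
```

Because the dummy is zero for hot-climate cities, the instrument varies over time only for cities in favourable zones, which is the source of the combined temporal and spatial variation described above. The same function applies to visa issuance by passing a visa series in place of the quota series.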
Equation (3) is the second-stage regression in which the predicted yearly-city population (i.e., the fitted values for log(population) is,t−k ) as obtained in the first-stage regression can act as exogenous variation in explaining productivity. To summarize the effect of city population instrumented by climate and quotas/visas issued on productivity:
$$\log(\mathit{lights}_{is,t}) = c + \alpha_i \widehat{\log(\mathit{population})}_{is,t-k} + \lambda\,\mathit{greatercapital}_i + \log(\mathit{bachelorp})_{is,t}$$
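The two-stage logic can be illustrated with a deliberately stripped-down sketch: one instrument, one endogenous regressor, no controls and no fixed effects. The data below are hypothetical and the helper names are assumptions made for illustration; the sketch is not the specification estimated in this article.

```python
from statistics import mean


def ols(x, y):
    """Return (intercept, slope) of a simple least-squares fit of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def two_stage_least_squares(z, x, y):
    """First stage: fit x on z. Second stage: fit y on the fitted values of x."""
    a0, a1 = ols(z, x)
    x_hat = [a0 + a1 * zi for zi in z]
    return ols(x_hat, y)


# Hypothetical city-year observations, for illustration only.
z = [0.0, 0.0, 11.5, 11.7, 11.9, 0.0]          # favourable climate x log(lagged quota)
log_pop = [9.4, 9.6, 10.8, 11.1, 11.5, 9.3]    # log(population)
log_lights = [1.1, 1.3, 2.6, 3.0, 3.4, 1.0]    # log(night-time luminosity)

intercept, elasticity = two_stage_least_squares(z, log_pop, log_lights)
print(round(elasticity, 3))
```

Running the second stage by hand in this way gives the 2SLS point estimate, but the standard errors it would imply are not valid because they ignore the first-stage estimation step; dedicated IV routines correct for this and also handle clustering.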

Reduced-form regression
Equation (4) is the reduced-form regression. This model aims to directly estimate the effect of our proposed IVs (i.e., climate and quotas/visa issuance) on city productivity (as proxied by night-time luminosity).
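A useful way to read the reduced form is through its link to the other two regressions: with a single instrument and no further covariates, the reduced-form slope equals the first-stage slope multiplied by the IV estimate. The sketch below checks that identity on hypothetical data; the variable names and numbers are assumptions for illustration only, and the simplification omits the controls and fixed effects used in this article.

```python
from statistics import mean


def slope(x, y):
    """Slope from a simple least-squares fit of y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical city-year observations, for illustration only.
z = [0.0, 0.0, 11.5, 11.7, 11.9, 0.0]          # favourable climate x log(lagged quota)
log_pop = [9.4, 9.6, 10.8, 11.1, 11.5, 9.3]    # log(population)
log_lights = [1.1, 1.3, 2.6, 3.0, 3.4, 1.0]    # log(night-time luminosity)

first_stage = slope(z, log_pop)       # effect of the instrument on population
reduced_form = slope(z, log_lights)   # effect of the instrument on luminosity
implied_iv = reduced_form / first_stage

print(round(first_stage, 4), round(reduced_form, 4), round(implied_iv, 4))
```

A reduced-form coefficient that is clearly different from zero is therefore direct evidence that the instrument moves the outcome, independent of how the first stage is specified.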

DATA
Our dataset is a panel of Australian cities defined as LGAs with an annual population over 10,000 between 2001 and 2013, which can be understood as SUAs in Australia. This is a slightly different definition to that used by the ABS, which is any Statistical Area Level 2 (SA2) with a population over 10,000. The data for city population and the number of people with a bachelor's degree or more are from the ABS (2001, 2006 and 2011 censuses, with extrapolation for intervening years). The city boundaries are defined by the Australian Statistical Geography Standard (ASGS) (2016). The data for annual permanent resident quotas and visas issued are from the Department of Home Affairs, and the visas issued include four types of applicants: permanent residents, working holiday, students and long-term business holders. The summary statistics are shown in Table 2.
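The text does not specify how census counts are carried to non-census years. One simple possibility, shown here purely as an assumption, is linear interpolation between census years with linear extrapolation of the last inter-census trend beyond 2011. The function name and the population figures are hypothetical.

```python
def fill_years(census: dict, first: int, last: int) -> dict:
    """Fill every year in [first, last] from census counts.

    Assumed method (not stated in the text): linear interpolation between
    consecutive census years, and linear extrapolation of the nearest
    inter-census trend outside the census range.
    """
    years = sorted(census)
    if len(years) < 2:
        raise ValueError("need at least two census years")
    filled = {}
    for year in range(first, last + 1):
        lo, hi = years[0], years[1]
        # Move to the latest census pair whose start is not after `year`.
        for a, b in zip(years, years[1:]):
            if year >= a:
                lo, hi = a, b
        rate = (census[hi] - census[lo]) / (hi - lo)
        filled[year] = census[lo] + rate * (year - lo)
    return filled


# Hypothetical census counts for one city, for illustration only.
census_counts = {2001: 10_000, 2006: 12_000, 2011: 15_000}
panel = fill_years(census_counts, first=2001, last=2013)
print(panel[2003], panel[2011], panel[2013])
```

Other choices, such as constant growth rates between censuses, would give slightly different intervening values, so the interpolation rule is worth stating explicitly in any replication.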

Climate zone data
To account for the variation in city climate in Australia, we use climate zone data developed by the BoM that categorizes all Australian cities into one of seven climate zones on the basis of a city's historical climate conditions (i.e., humidity, temperature and precipitation levels between 1961 and 1990). As noted above, we consider seven instead of eight zones here. The BoM determines that cities belong to Zone 2 (warm, humid summer and mild winter climate) if their average annual temperature, humidity and precipitation levels are 18-24°C, 50-70% and 1000-2000 mm, respectively, between 1961 and 1990. If cities' average annual temperature, humidity and precipitation levels in this period are 30-36°C, 30-40% and 300-400 mm, respectively, they belong to Zone 3 (hot, dry summer and warm winter). To save space, we do not set out here the full details of humidity, temperature and precipitation levels for the remaining zones. However, they can be found at BoM.

Satellite recorded city light data
The raw data for night-time luminosity have been recorded by satellites from the DMSP using Operational Linescan System sensors. These satellites detect luminosity emitted into space at every location on Earth between 20:30 and 22:00 hours local time. Researchers at the National Oceanic and Atmospheric Administration's (NOAA) National Geophysical Data Centre (NGDC) then process these raw data (i.e., removing observations for locations with natural lights, forest fires and obscurity due to cloud cover over the Earth's surface) and produce yearly light intensity values between 0 (i.e., the dimmest) and 63 (i.e., the brightest) for each 30 arc-second grid cell (or approximately 1 × 1 km at the equator) of the Earth. The light data are available from 1992 to 2013. To construct the yearly light intensity for Australian cities, we take the following steps. First, we need to address the top-coding (or truncation) issue in the light data. This is because cities in developed countries such as Australia are likely to have a light intensity over 63 (Storeygard, 2016). To overcome this issue, we use the data that are free from the top-coding issue provided by Bluhm and Krause (2018). Second, we sum the yearly light intensity from all the grid cells within a city and compute its average to represent that city's yearly light intensity level. For example, if there are 10 grid cells in the city of Sydney and the light intensity is 1 for each cell in 2010, the sum of night-time light intensity for Sydney in 2010 is 10 and the light intensity (or the average from all cells) in 2010 would be 1. We follow this method to compute the annual night-time luminosity for the rest of Australia's cities.
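The aggregation step in the Sydney example can be written as a short function. This is a minimal sketch: the function name is hypothetical, and it assumes the grid cells falling inside a city boundary have already been extracted into a plain list of yearly intensities.

```python
def city_light_intensity(cell_values):
    """Return (sum, mean) of yearly light intensity over a city's grid cells."""
    if not cell_values:
        raise ValueError("a city needs at least one grid cell")
    total = sum(cell_values)
    return total, total / len(cell_values)


# The worked example from the text: 10 grid cells, each with intensity 1.
sydney_2010 = [1] * 10
total, average = city_light_intensity(sydney_2010)
print(total, average)  # 10 1.0
```

Using the mean rather than the sum keeps the measure comparable across cities of very different land area, since a physically larger city contains more grid cells.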
Night-time luminosity in Australia rises significantly over time and varies dramatically between cities. To illustrate, we can compare the night-time light intensity in Australia in 1992 (Figure 7a) with that in 2013 (Figure 7b). In addition, we can see that the increase in night light during this period mainly comes from the coast and nearby inland areas. We can also see that luminosity is quite different between the inland (or Outback) and coastal cities within a given year. To illustrate, compare Figure 8a with Figure 8b to see that Alice Springs (an inland city) is much dimmer than Newcastle (a coastal city) in 2013.

RESULTS
In this paper we note that many previous studies (23 of 27 summarized in Table A1 in Appendix A in the supplemental data online) have used survey/micro- or macro-data to measure economic productivity in cities. In our above discussion we noted that there are two principal problems with this approach. First, such data are affected by measurement and self-reporting errors (e.g., Bollinger & Hirsch, 2013; Bound et al., 2001; Holland & King, 1979; Johnson et al., 2013; Kjellsson et al., 2014). Second, the time series for variables such as wages, value added and income that are used to measure productivity tend to be very short. Taking Australian cities as an example, wage and income data are only available from 2013 to 2017 from the ABS. We build on recent studies that have employed satellite-recorded night-time luminosity data to circumvent these issues (e.g., Chen & Nordhaus, 2011; Henderson et al., 2012; Storeygard, 2016). These studies argue that luminosity data are collected impersonally, which avoids measurement and self-reporting error. Most importantly, Dingel et al. (2021) argue that the brightness of cities at night can effectively approximate their productivity (including accounting for educational attainment and skill levels).
To illustrate the utility of luminosity data as a proxy for productivity in Australia, we regress the log of value added, GSP and wage on the log of night-time luminosity at the state and city levels in different time periods and present the results in Table 3, columns (1-5). The results suggest that brighter states (or cities) at night are more productive in terms of having greater value added and GSP and higher wages.
Having established that night-time luminosity is a good proxy for productivity, the remainder of this section now steps through the three estimation stages, beginning with the results obtained from the OLS regression.

OLS estimates
Columns (I) and (II) of Table 4 show that the OLS estimates of city population on productivity, approximated by night-time luminosity, are between 0.48 and 0.56 when log(population is,t-1 ) and log(population is,t-2 ) are considered separately in equation (1) without any control variables (i.e., greatercapital i and log(bachelorp is,t )). After adding the control variables, the elasticity changes slightly to between 0.37 and 0.46, as shown in columns (III) and (IV) of Table 4. This could be interpreted to mean that adding the control variables in the OLS regression did not substantially affect the estimated effects of population on productivity.
If city scale (higher levels of population) increases economic productivity through agglomeration effects, as many empirical studies claim, our OLS estimates of log(population is,t -1 ) and log(population is,t -2 ) in the different model specifications summarized in Table 4 should be positive. Our mixed results here could be due to the estimation issues of endogeneity and reverse causality, as discussed above.

2SLS estimates
As we noted in the last section, the OLS results cannot be assumed robust because the estimation approach does not address the endogeneity or reverse causality that almost certainly exists between city scale, urban amenity and productivity. In our review of the literature, it is clear that migration both to and within a country is driven by a range of factors including individuals' pursuit of higher wages and greater liveability. It has also been shown that higher levels of amenity and more liveable climate conditions are associated with higher levels of population in cities. Without controlling for these influences on individuals' migration and relocation decisions, it is not possible to robustly estimate the relationship between city scale (population) and economic productivity. For this reason, we introduce two IVs to represent the endogenous variable (annually observed city population). These IVs enter the 2SLS regression. The first IV combines climate conditions with permanent resident quotas, and their 2SLS estimates are presented in Table 5.
The first-stage estimates in the lower panel of Table 5 suggest that city population is positively related to favourable climate conditions for residence and permanent resident quotas. Specifically, column (I) suggests that if the federal government increases the quotas by 1% at year t - 1, the population of cities with a favourable climate for residence (i.e., cities with a mild, warm or cool climate) would grow by 0.16% more than that of cities without it (i.e., cities with a hot or hot, dry summer climate). The estimated impact becomes 0.12% after the control variables are included in column (III). If the 1% increase occurs at year t - 2, our first-stage estimation without control variables in column (II) suggests that the population would expand by 0.14% more on average for cities with a favourable climate than for those without it. With the control variables, the estimated impact would be 0.10% in column (IV). Our IVs are plausibly strong. This is because (1) all the first-stage estimates in columns (I-IV) of Table 5 are statistically significant at the 1% level; (2) all the chi-squared p-values for the Kleibergen-Paap LM statistics are 0.00, suggesting that our IVs are highly correlated with city population; and (3) all the Kleibergen-Paap Wald F-statistics are greater than the Stock and Yogo critical value, which allows us to reject the null hypothesis of weak IVs. The second-stage estimates in Table 5 show that the elasticity of night-time luminosity with respect to city population, instrumented by climate conditions and permanent resident quotas, is between 1.15 and 1.73 across columns (I-IV). This suggests that city population agglomeration increases productivity. In addition, the estimates suggest that this effect is stronger for cities in greater capital areas, and for cities with a higher concentration of human capital (measured by people with a bachelor's degree or above).

Table 3. Relationship between lights and productivity in Australia.
[Table body not recovered. Surviving column headers: (1) log(value added); (2) log(GSP).]
Note: Robust standard errors clustered at the city level are shown in parentheses. *Significant at 10%; **significant at 5%; and ***significant at 1%. Value added refers to gross value added by all industries at current prices. Gross state product (GSP) is measured by chain volume. Wages are weekly means for adults working full time. Column (4) is for all Australian cities; and column (5) is for cities with populations of 10,000 or greater.

Table 6 sets out the 2SLS estimation results when city population is instrumented by the combined effects of favourable climate conditions for residence and visa issuance. The lower panel presents the first-stage estimates, where the combined effects on population are 0.13-0.19 in five different model specifications (columns I-V). The statistical significance of all the IV estimates, together with their Kleibergen-Paap LM and Wald F-statistics, indicates that our IVs are 'valid' and 'strong' in the first-stage regression, following the same approach applied and explained for Table 5. The second-stage estimates in Table 6 show that the estimated effects of city population on night-time luminosity are 1.10-1.79 in five model specifications, which are similar to the results in Table 5.
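The weak-instrument comparison described above can be sketched in simplified form. The example below computes the conventional homoskedastic first-stage F-statistic for a single excluded instrument (which equals the squared t-statistic) and compares it with 16.38, the Stock and Yogo critical value for 10% maximal IV size with one endogenous regressor and one instrument. This is a stand-in for, not a reproduction of, the cluster-robust Kleibergen-Paap statistics reported in Tables 5 and 6; the simulated data, the coefficient values and the function name `first_stage_f` are hypothetical.

```python
import numpy as np

def first_stage_f(x_endog, z):
    """Homoskedastic first-stage F for one excluded instrument (F = t^2)."""
    n = len(x_endog)
    Z = np.column_stack([np.ones(n), z])
    coef, *_ = np.linalg.lstsq(Z, x_endog, rcond=None)
    resid = x_endog - Z @ coef
    sigma2 = resid @ resid / (n - Z.shape[1])   # residual variance
    cov = sigma2 * np.linalg.inv(Z.T @ Z)       # classical OLS covariance
    t_stat = coef[1] / np.sqrt(cov[1, 1])
    return coef[1], t_stat ** 2

# --- Hypothetical simulated data (not the data used in this article) ---
rng = np.random.default_rng(1)
n = 5_000
favourable = rng.integers(0, 2, n)              # 1 = favourable climate
log_quota = rng.normal(size=n)                  # stand-in for lagged log quotas
z = favourable * log_quota                      # instrument: climate x quota
log_pop_strong = 0.5 * z + rng.normal(size=n)   # instrument is relevant
log_pop_weak = 0.01 * z + rng.normal(size=n)    # instrument is nearly irrelevant

# Stock-Yogo critical value: 10% maximal IV size, 1 endogenous, 1 instrument
STOCK_YOGO_10PCT = 16.38
_, f_strong = first_stage_f(log_pop_strong, z)
_, f_weak = first_stage_f(log_pop_weak, z)
for label, f in [("strong", f_strong), ("weak", f_weak)]:
    verdict = "reject weak IV" if f > STOCK_YOGO_10PCT else "cannot reject weak IV"
    print(f"{label}: F = {f:.1f} -> {verdict}")
```

With clustered or heteroskedastic errors the classical F-statistic is not appropriate, which is one reason a robust statistic such as the Kleibergen-Paap Wald F is reported instead.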

Reduced-form estimates
Finally, to estimate the direct impact of our IVs on productivity (as proxied by night-time light intensity), we estimate the reduced-form regression shown as equation (4) and report the estimates in Tables 7 and 8. Our reduced-form estimates indicate that cities with a [...]
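The link between the reduced form and the 2SLS estimates can be made concrete with a small simulated example. When there is one endogenous regressor and one instrument, the reduced-form coefficient (outcome on instrument) equals the first-stage coefficient multiplied by the second-stage coefficient, so the IV estimate can be recovered as their ratio. The sketch below checks this identity numerically; the data, coefficient values and variable names are hypothetical assumptions, and control variables are omitted.

```python
import numpy as np

def slope(y, x):
    """OLS slope of y on x in a regression that includes an intercept."""
    x_c = x - x.mean()
    return float(x_c @ (y - y.mean()) / (x_c @ x_c))

# --- Hypothetical simulated data (not the data used in this article) ---
rng = np.random.default_rng(2)
n = 10_000
z = rng.integers(0, 2, n) * rng.normal(size=n)   # instrument: climate x quota
amenity = rng.normal(size=n)                     # unobserved confounder
log_pop = 0.5 * z + 0.8 * amenity + rng.normal(size=n)
log_lights = 1.2 * log_pop + 0.5 * amenity + rng.normal(scale=0.5, size=n)

first_stage = slope(log_pop, z)          # effect of instrument on population
reduced_form = slope(log_lights, z)      # direct effect of instrument on lights
iv_ratio = reduced_form / first_stage    # indirect least squares estimate

# The same number obtained the two-stage way: regress lights on fitted population
log_pop_hat = log_pop.mean() + first_stage * (z - z.mean())
iv_two_stage = slope(log_lights, log_pop_hat)

print(f"reduced form: {reduced_form:.3f}")
print(f"first stage x IV estimate: {first_stage * iv_two_stage:.3f}")
```

A positive reduced-form coefficient in this setting therefore carries the same sign information as the second-stage estimate, without relying on the first-stage scaling.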

CONCLUSIONS
In recent years there has been a resurgence of interest in the study of agglomeration economic effects centred on the economics literature, but with offshoots of interest in the regional science, urban geography, applied economics and housing economics literature. The study of agglomeration is important from a number of policy perspectives, and that importance has risen steeply as governments grapple with questions about how to rebuild economies and rebalance urban and regional economies in the aftermath of the Covid-19 pandemic. This study examined the relationship between population and productivity in Australian cities between 2001 and 2013. We tackled several empirical issues and set out to make three new contributions to the literature. First, we have extended the agglomeration economics analytical framework to Australia. We would argue that this is a welcome development given the heavy concentration of previous studies on the North American context. Australia has a unique set of geographical circumstances which lead us to question whether the concept of agglomeration economies is even applicable in the Australian context. The geographical extent, sparsity, small population and large distances between cities make it very different contextually to the ambit of most of the rehearsed literature.
Second, we have adopted an entirely different empirical strategy than the econometrics-led approaches that have come to dominate the economics literature in the study of agglomeration effects. To do this, we exploited Australia's unique and diverse geography, proxied by climate zone, to account for the endogeneity between city scale, population change and economic productivity. We also capitalized on Australia's unique reliance on overseas migration as the driver of annual population expansion. By combining information from geographical and demographic processes in this way, we were able to construct convincing and robust IVs that proxy for yearly-city populations. This breaks the endogeneity and allows us to arrive at robust estimates of agglomeration effects for Australian cities.
The analysis demonstrates that the relationship between city scale and productivity is volatile and unreliable when estimated using OLS. Our first-stage estimation results show that the instruments are valid and strong in predicting yearly-city populations. Our second-stage estimates indicate that a 1% increase in Australian city populations causes Australian cities' productivity to increase by 0.97-1.78% on average. The positive link between city population and productivity is reinforced in the reduced-form estimates. Our 2SLS and reduced-form estimation results remain robust after conducting a series of robustness checks. Thus, we would argue that the analysis in this article has shown that a combination of geographical and demographic data, set within an IVs approach, is effective in tackling the econometric estimation issues that arise from short time series and productivity data paucity that have plagued previous studies.
Finally, our study used high-resolution satellite imagery, specifically night-time luminosity, to proxy for city-scale economic productivity. This approach avoids measurement errors such as frame deficiency and response/non-response bias that are present in the survey data used in previous studies. We have shown that areas with higher levels of night-time light are more economically productive in Australia. In other words, night-time luminosity is an effective proxy for city-scale economic productivity, and it accounts for change over time, spatial differences at the city level and differences arising from regional effects (particularly coastal versus inland differences).

NOTES
1 Measurement errors include frame deficiency and response and non-response bias. For more examples, see Bound et al. (2001) and Bollinger and Hirsch (2013).
2 The definition of SUA here is slightly different to that defined by the ABS as any Statistical Area Level 2 (SA2) with a population over 10,000.
3 These visas are issued for permanent and temporary overseas migrants. For more information, see the data section of this article.
4 See https://www.oecd-ilibrary.org/economics/oecd-factbook-2013/population-growth-rates_factbook-2013-table4-en/.
5 Overseas migrants in this study refer to the sum of temporary and permanent residents of Australia. Temporary residents include international students and business long-stay residents; but they exclude tourists.
6 To compute the percentage of new births (or overseas migration) in Australian population expansion each year, we divide the annual new births (or new overseas migrants) by the annual total new populations (i.e., the sum of new births and overseas migrants). For example, if new births are 10 and new overseas migrants are 20 in 2002, the total new population is 30. The percentage of new births in new population is about 33% in 2002.
7 Supply shortages are mainly in the fields of medical, information technology (IT), engineering and construction trade workers. The federal government temporarily slashed the quotas and visas issued to respond to the economic slowdown in the Global Financial Crisis in 2008-09. For more details, see Productivity Commission (2016).
8 These climate zone data are developed by BoM to assist the Australian Building Codes Board (ABCB) to regulate the building and construction industry (see https://www.abcb.gov.au/Resources/Tools-Calculators/Climate-Zone-Map-Australia-Wide). The BoM developed eight zones for the ABCB. One of these (Alpine) is not included in this study as there are very few areas with this climate in Australia. As all the cities with this climate have a predominantly cool temperate climate, we reassign them to the cool temperate climate category.
9 For more information about these data, see https://www.homeaffairs.gov.au/research-and-statistics/statistics/visa-statistics/live/migration-program/.
10 The BoM developed these data to assist the ABCB to regulate the building and construction industry; for these data, see https://www.abcb.gov.au/Resources/Tools-Calculators/Climate-Zone-Map-Australia-Wide/.
11 See http://www.bom.gov.au/jsp/ncc/climate_averages/climate-classifications/index.jsp?maptype=tmp_zones#maps/.
12 For these data, see https://ngdc.noaa.gov/eog/dmsp/downloadV4composites.html/.
13 Bluhm and Krause (2018) provided evidence to show that the distribution of lights data from the DMSP is a Pareto distribution, and had corrected the top-coding issue based on this evidence. For the DMSP's light data free from the top-coding issue, see https://www.lightinequality.com/.

DISCLOSURE STATEMENT
No potential conflict of interest was reported by the authors.