Demand pull or supply push? Metro-level analysis of start-ups in the United States

Abstract This paper examines factors related to higher regional start-up activity. Two hypotheses are formulated to explain start-ups: the demand-pull hypothesis argues that the amount, growth and density of aggregate demand will stimulate start-ups in any sector; and the supply-push hypothesis argues that factors including high-tech industry concentrations, patent generation, industrial and university research activities, and government funding will stimulate high-tech start-ups. Both hypotheses support the importance of human capital factors, such as a highly educated or skilled workforce and thick labour markets. The paper incorporates these various measures and employs cross-sectional multivariate analysis of start-up rates in all sectors and in high-tech sectors in 366 metropolitan areas as defined by the US Census Bureau in 2009. Overall, very strong support is found for the demand-pull hypothesis, but only modest support for the supply-push hypothesis, which provides substantial caveats for public policy to promote start-up activities.


INTRODUCTION
Entrepreneurship is an important element in regional economic development because it is a fundamental driver of the economy. Long ago, Schumpeter (1926, 1950) argued that the source of economic development is entrepreneurs who bring about creative destruction. More recently, Fritsch (2008) elaborated four economic functions of entrepreneurship: (1) improving efficiency by competing with existing companies; (2) providing a greater variety of goods and services; (3) amplifying innovations by creating new markets; and (4) accelerating industrial and structural change. In this paper we study two indicators of entrepreneurship across US regions: the overall business start-up rate and the high-tech business start-up rate. We amplify Fritsch's second and third points: business starts across all sectors could provide greater variety of commodities within metropolitan areas, whereas start-ups with significant growth potential could implement innovations and create new

LITERATURE REVIEW AND FRAMEWORK
Two major schools of thought explain the variations of entrepreneurship rates at the metropolitan level. While some overlap exists, each school has a distinct scope of inputs and outputs concerning entrepreneurship. We review each school closely in this section: knowledge spillover theory and agglomeration theory, and identify three sets of factors worth analyzing.

Knowledge spillover theory
Current consideration of the regional variation in entrepreneurship rates comes from the literature on knowledge spillovers (Acs & Armington, 2006;Acs et al., 2013;Doran, McCarthy, & O'Connor, 2016;Fritsch, 2002). The roots of this literature go back over two decades. Based on higher observed research productivity and citations, Jaffe, Trajtenberg, and Henderson (1993) and Jaffe, Fogarty, and Banks (1998), as well as Griliches (1992), argued that researchers benefit from the proximity of other researchers, thereby realizing knowledge spillovers within geographically proximate areas. Subsequent studies focused on the inputs required for research and innovation-intensive activities and modelled various forms of private corporate and university research activities, commercial applications of new knowledge (i.e., patents), and highly skilled and tech-oriented human capital (Anselin, Varga, & Acs, 1997;Arauzo-Carod & Manjón-Antolín, 2012;Audretsch, Hülsbeck, & Lehmann, 2012;Feldman & Florida, 1994;McCann, 2007;Simon & Nardinelli, 2002).
An underlying assumption of knowledge spillover theory applications is that high-tech firms primarily represent growth entrepreneurship, that is, innovative firms with significant growth potential (Acs & Armington, 2006;Acs et al., 2013). These high-tech start-ups then should be influenced by specialized inputs in the knowledge production function. Important factors include the concentration of existing high-tech industries, often measured with location quotients (LQs), the level of patent production, regional research activity, primarily research at area universities, and the amount and concentration of skilled human capital. Furthermore, it is not just the presence of skilled labour but the mechanism by which knowledge spills over when workers move between jobs and industries, taking their accumulated skills and tacit knowledge with them (Audretsch & Feldman, 2004). Moretti (2012) identified this mechanism operating in 'thick' labour markets. This literature identifies candidate supply-push factors that may generate start-ups.
Considering knowledge spillover theory, we note that some studies of entrepreneurship overvalue high-tech sectors and undervalue agglomeration. In fact, empirical evidence suggests that the growth of employment or productivity in high-tech sectors has been slower than the overall economy since the 1990s (Hecker, 2005;Kask & Sieber, 2002). Furthermore, among high-growth companies, high-tech sectors comprise less than one-quarter (Brown & Mason, 2017;Motoyama, 2015). Thus, we examine entrepreneurship across all sectors of the economy and move to the second school of thought that considers agglomeration economies, which affect external economies of urbanization and localization.

Agglomeration, urbanization and localization
Agglomeration economies are observed in metropolitan areas, enhancing innovation and productivity, as city size and density increase (Alañón-Pardo & Arauzo-Carod, 2013;Duranton & Kerr, 2015;Hindle, 2010;Jofre-Monseny, Marín-López, & Viladecans-Marsal, 2011;Puga, 2010;Sedgley & Elmslie, 2004). Two influential works in the mid-20th century narrowed the focus to localization economies. First, Losch (1954) deeply examined localization economies, which are external to a firm but internal to an industry within a geographical region. Second, Chinitz (1961) added the importance of industrial structure and dominance, which sparked a series of studies to test the effect on aggregate regional growth of the region's average firm size (Drucker, 2011;Drucker & Feser, 2012;Evans, 1986;Norton, 1992) and the diversity of regional industries (Glaeser, Kallal, Scheinkman, & Shleifer, 1992;Henderson, 1986, 2003). These studies of localization economies have led to the continuing interest in industry clusters as the major focus of local economic development practice in the United States (Porter, 1998). Yet fundamental aspects of urbanization economies have received less attention, at times treated as control variables in empirical work.
We embrace the classic work of Alfred Marshall and recognize the benefits of co-location that enable sharing pools of highly skilled labour, customers, specialized suppliers and ancillary services (Marshall, 1898, pp. 346-356). In addition to the diversity of skill sets that large pools of labour offer, Marshall also emphasized the potential heterogeneity of aggregate demand that large markets offer. As population grows, more highly differentiated markets provide opportunities for new businesses to serve new market niches; it is not just the sheer size or the composition of industry that matters. Aggregate demand grows with market size and has the power to spark entrepreneurship in the regional economy. These demand-pull factors address the drivers of start-ups in all sectors. Here, urbanization economies that increase with the size, growth and density of the local market are expected to be most influential, emphasizing the importance of aggregate market demand on start-up rates. Jacobs (1961, p. 151) placed special emphasis on density, arguing that higher density 'created effective economic pools'. An important assumption here is that most of these start-ups will serve the local market, and therefore the survivors will have growth potential limited by the size of the local market.
Both schools of thought reference factors related to skilled human capital either as part of the knowledge production function or as augmenting agglomeration economies.
Finally, a related set of studies noted the roles of foreign-born immigrants because they have higher rates of entrepreneurship than the native population. Literature exists about ethnic enclaves (Waldinger, 1986;Waldinger, Aldrich, & Ward, 1990), high-skill immigrants in Silicon Valley (Saxenian, 1999, 2006), and inclusion of immigrants in the entrepreneurial index (Fairie, 2010, 2014). Foreign-born workers also contribute to the skilled workforce in Silicon Valley (Saxenian, Motoyama, & Quan, 2002).
Examining the two hypotheses about start-up activities based on these two schools of thought is important for several reasons. First, the results will help us understand whether supply-push factors that may be associated with high-tech start-ups are also associated with the start-up rate in all sectors. We need to keep in mind that not all metropolitan statistical areas (MSAs) have significant economic activity in high-tech sectors. Will we find that supply-push factors are associated with start-up activities in all sectors, high-tech and non-high-tech alike? Second, the currently dominant practice relies heavily on supply-push factors, assuming they will promote entrepreneurship. For example, the federal government funds research and development (R&D) through the National Institutes of Health (NIH) and Small Business Innovation Research (SBIR) programmes (Audretsch, Link, & Scott, 2002;Cooper, 2003;Deaton, 2003;Lerner, 2000). Research universities generate cutting-edge technologies, develop science parks and establish technology-transfer offices to commercialize basic science (Degroof & Roberts, 2004;Mayer, 2007;Plosila, 2004;Sa, Geiger, & Hallacher, 2008). Patents and licences are often used to measure such bridging activities (AUTM, 2012, 2016). An important initial question is: are supply-push factors correlated with start-up rates?

MEASURES AND DATA SOURCES
Two dependent variables help test the two schools of thought discussed above: the start-up rate pertaining to businesses in all sectors and the start-up rate in high-tech sectors. The high-tech sectors here include information and communication technology (ICT), health and life sciences technology, and advanced manufacturing. As noted, we expect that new firms primarily serving the local market will dominate start-up rates in all sectors. High-tech businesses are more likely to incorporate innovations that generate products traded in larger markets outside the metropolitan area. Knowledge spillover theory suggests that these new firms have strong growth potential (growth entrepreneurship) and are likely to become part of the metropolitan area's export base. The following 10 factors serve as the explanatory variables drawn directly from the literatures cited above:
• Supply-push: knowledge spillovers, research activity and high-tech industries.
• Demand-pull: aggregate demand, market growth and population density.
• Both: skilled human capital, high-tech labour pools, labour market thickness and foreign born.
For supply-push, we measured the commercialization of new knowledge and knowledge spillovers with patents per capita across the metropolitan areas. We measured research activity as the presence of Research I universities, research funding at these universities, NIH grants received per capita and SBIR grants per capita. The latter three measures of research funding are highly correlated. We test NIH and SBIR grants per capita in separate models because some SBIR funds are disbursed through the NIH. Localization economies are expected to impact high-tech start-up rates more than overall start-up rates. These external economies should increase with more high-tech firms. We used the LQ of employment in high-tech industries as computed by the Milken Institute to reflect the existing concentration of firms in high-tech sectors. This measure shows the relative concentration of high-tech covering high-tech manufacturing and specialized services in software, telecommunications, computer and architectural design, engineering, and medicine (DeVol, Klowden, Bedroussian, & Yeo, 2009). These sectors approximate but do not replicate the sectors included in the high-tech start-up rate measure.
The demand-pull hypothesis primarily depends on the size, growth and density of demand. We measured each metropolitan area's total personal income to reflect the level of demand from the Woods and Poole Database, recent population growth to pick up market expansion from the American Community Survey (ACS), and population density within 1 mile of city hall (US Census Bureau, 2012). The 1-mile density measure better reflects agglomeration than citywide population density, as the political boundaries of cities vary substantially.
We recognized the importance of skilled human capital by including the per cent of adults with college degrees or higher, calculated from the education variable (EDUC) in the ACS. This percentage may be higher in research-oriented metropolitan areas. We measured high-tech labour pools and labour market thickness in the following manner.
Following Chapple, Markusen, Schrock, Yamamoto, and Yu (2004), we defined high-tech labour as workers in Science, Technology, Engineering, and Math (STEM) occupations. The three major categories are 15: Computer and mathematical, 17: Architecture and engineering, and 19: Life, physical and social science. For each occupation group, the labour pool measure we computed accounts for two dimensions: the relative importance of the occupation group within the metropolitan area and its absolute importance as a share of total employment in that occupation for all metropolitan areas. We computed the LQ to indicate the relative importance of the occupation group in each metropolitan area.
To account for the absolute size of the occupation group in each metropolitan area we completed the following tasks. First, we established four size ranges based on the total number of workers in all metropolitan areas in that occupation group. The ranges we set were (1) the average number in the group for all 366 metropolitan areas or fewer; (2) the range from this average to the number of workers representing 0.5% of the US total; (3) the range from 0.5% to 1.0% of the US total; and (4) the range over 1.0% of this total. After some experimentation, we assigned the following values to each of these four ranges: 1.0, 1.25, 1.5 and 2.0. For example, if a metropolitan area had the average number of workers or fewer, its score was 1.0. If a metropolitan area's share of workers was more than 1% of the total, its score was 2.0. We then multiplied this score by the LQ for that occupation group in each metropolitan area. Finally, we added the weighted LQs for the three occupation groups to compute one overall measure for each area. Although this measure is complex, it is one way to combine relative importance and absolute share of total in one measure. We did not separate and include these two variables because of the high anticipated correlation between the size variable, such as log of employment, and total personal income, our most important demand-pull indicator.
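The weighting procedure described above can be illustrated with a minimal sketch. The function names, the employment counts and the treatment of range boundaries (values exactly on a threshold fall into the lower range) are assumptions made for illustration, as is the choice to use the same reference totals for both the LQ and the size ranges; the four weights and the three occupation groups are taken from the text.

```python
# Sketch of the size-weighted labour pool measure; helper names and the
# handling of boundary values are assumptions, not the authors' code.
STEM_GROUPS = ("15", "17", "19")  # occupation major groups named in the text


def location_quotient(area_occ, area_total, ref_occ, ref_total):
    """Occupation share in the area divided by its share in the reference total."""
    return (area_occ / area_total) / (ref_occ / ref_total)


def size_weight(area_occ, ref_occ, n_areas=366):
    """Return 1.0, 1.25, 1.5 or 2.0 according to the four size ranges."""
    mean_occ = ref_occ / n_areas  # average number of workers per metropolitan area
    if area_occ <= mean_occ:
        return 1.0
    if area_occ <= 0.005 * ref_occ:
        return 1.25
    if area_occ <= 0.010 * ref_occ:
        return 1.5
    return 2.0


def labour_pool_score(area, ref, n_areas=366):
    """Sum of size-weighted LQs over the three STEM occupation groups."""
    score = 0.0
    for group in STEM_GROUPS:
        lq = location_quotient(area[group], area["total"], ref[group], ref["total"])
        score += size_weight(area[group], ref[group], n_areas) * lq
    return score


# Made-up employment counts for one metropolitan area and the reference total.
area = {"total": 1_000_000, "15": 20_000, "17": 5_000, "19": 40_000}
ref = {"total": 100_000_000, "15": 2_500_000, "17": 2_000_000, "19": 1_000_000}

score = labour_pool_score(area, ref)
print(round(score, 2))
```

In this illustration a group that is both over-represented and large in absolute terms (group 19) dominates the score, which is the intended effect of combining relative and absolute importance in one measure.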
The measure of labour market thickness was less complicated. We measured labour market turnover, which is the reallocation of workers losing and then regaining employment. We assumed that the higher the turnover, the thicker the labour market. We used the indicator of turnover rate defined by the US Census Bureau (2013) found in the Quarterly Workforce Indicator database: the sum of all hires in one quarter plus all separations in the next quarter divided by twice the total employment in the earlier quarter.
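The turnover indicator defined above reduces to a single ratio. The sketch below restates it in code; the function name and the quarterly figures are hypothetical and serve only to show the arithmetic.

```python
def turnover_rate(hires_q, separations_next_q, employment_q):
    """(hires in quarter t + separations in quarter t+1) / (2 * employment in quarter t)."""
    return (hires_q + separations_next_q) / (2.0 * employment_q)


# Hypothetical quarterly counts for one metropolitan area.
rate = turnover_rate(hires_q=12_000, separations_next_q=10_000, employment_q=100_000)
print(rate)
```

Dividing by twice the employment base averages the hiring and separation flows, so the result can be read as a per-quarter share of the workforce that turns over.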
For foreign-born, we measured foreign-born adults as a percentage of the adult population, calculated from the birthplace in the ACS. We considered using the percentage of foreign-born adults with a college education as a better indicator of the supply of potential entrepreneurs, but using both college-educated adults and foreign-born college-educated adults in the same models resulted in collinearity. Thus, we used the per cent foreign-born adults in the final models.
We employed two further explanatory variables: (1) the number of adjacent metropolitan areas within 50 km; and (2) the census division in which the metropolitan area is located. Metropolitan areas ideally represent mutually exclusive 'commuter sheds'. However, census-defined metropolitan areas follow county boundaries. Therefore, some cross-commuting is likely, and its amount may increase with the number of MSAs adjacent to any metropolitan area. We wanted to recognize this influence by counting the number of adjacent MSAs. We used 50 km because we think this distance is reasonable when considering the likelihood of cross-commuting.
Each area is located in one of nine census divisions: Northeast, Mid-Atlantic, East Midwest, West Midwest, South Atlantic, East South Central, West South Central, Mountain or Pacific. West Midwest served as the base case with the other eight regions specified as dummy variables. Table 1 lists the variables, measures and data sources in this analysis. There are two dependent variables and 13 explanatory variables. We attempted to measure all explanatory and dependent variables as close to 2010 as possible. Excluding the eight dummy variables, we had an array of 14 variables for 366 metropolitan areas, 5124 data points in all.
The descriptive statistics for the variables are shown in Table 2. The dummy variables pertain to eight of nine census regions where the West Midwest region containing 30 MSAs is the base case, as noted above.
Correlations between 11 explanatory variables, excluding the metropolitan proximity and regional dummies, are mostly low, but there are some modestly correlated variables, such as population per square mile and Research I universities (0.74), and patents per capita and high-tech LQ (0.62). High-tech labour pool is correlated with high-tech LQ (0.75), college completion rate (0.63) and patents per capita (0.60). We will closely test the collinearity in the next section.

EMPIRICAL RESULTS
We used two models to test the hypotheses. The four human capital-related variables may be important in either model. We expected the three 'pure' demand-pull factors to explain the variation in the start-up rates for businesses in all sectors, but not in high-tech sectors. Conversely, we expected the four 'pure' supply-push factors to explain the variation in start-up rates for businesses in high-tech sectors only.
The models can be specified as follows:
\[ y = a + \sum_{i=1}^{12} b_i x_i + \sum_{j=1}^{8} d_j D_j + e \]
where y is either the business start-up rate in all sectors or the natural log of the high-tech start-up rate; a is the intercept term for each model; b_1 to b_12 are the coefficients for the 11 explanatory variables and the adjacent metropolitan areas variable, one set for each model; x_1 to x_11 are the 11 demand-pull or supply-push factors used in each model; x_12 is the adjacent metropolitan area variable in each model; d_1 to d_8 are the coefficients for the regional dummy variables D_1 to D_8, entered as 0-1 values depending on location in eight of the nine regions; and e is the error term for each model. The distribution for start-up rates in all sectors was symmetrical, but the distribution for high-tech start-ups was right skewed. The log transformation for high-tech start-ups made the distribution more symmetrical, allowing us to apply ordinary least squares (OLS) analysis. For each model, we ran the variance inflation factor (VIF) test for collinearity and the Breusch-Pagan/Cook-Weisberg test for heteroskedasticity. The tests indicated that collinearity did not threaten either model. The average VIF for the explanatory variables was 2.86 with individual scores ranging from 1.5 to 5.7. However, both models were heteroskedastic, which is not surprising given some of the variable ranges shown in Table 2. We calculated robust standard errors to address heteroskedasticity, which increases standard errors and lowers levels of significance. We considered only the 1% and 5% levels of significance, as explained below.
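The estimation steps described above (OLS, a VIF check for collinearity and heteroskedasticity-robust standard errors) can be sketched as follows. This is an illustration on synthetic data, not a replication: the text does not state which robust estimator was used, so the HC1 correction here is an assumption, and the function names and the two made-up regressors are hypothetical.

```python
import numpy as np


def ols_robust(X, y):
    """OLS coefficients with HC1 robust standard errors (HC1 is an assumption).

    X must already contain a column of ones for the intercept.
    """
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    # "Meat" of the sandwich: sum over i of e_i^2 * x_i x_i'
    meat = X.T @ (X * (resid ** 2)[:, None])
    cov = xtx_inv @ meat @ xtx_inv * n / (n - k)
    return beta, np.sqrt(np.diag(cov))


def vif(X_no_const):
    """VIF per column: 1 / (1 - R^2) from regressing that column on the others."""
    n, p = X_no_const.shape
    out = []
    for j in range(p):
        target = X_no_const[:, j]
        others = np.column_stack([np.ones(n), np.delete(X_no_const, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        centred = target - target.mean()
        r2 = 1.0 - (resid @ resid) / (centred @ centred)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)


# Synthetic data: 366 "areas", two regressors, heteroskedastic errors.
rng = np.random.default_rng(0)
n = 366
x = rng.normal(size=(n, 2))
noise = rng.normal(size=n) * 0.5 * (1.0 + np.abs(x[:, 0]))
y = 1.0 + 2.0 * x[:, 0] - 0.5 * x[:, 1] + noise

X = np.column_stack([np.ones(n), x])
beta, robust_se = ols_robust(X, y)
vifs = vif(x)
print(beta, robust_se, vifs)
```

For the high-tech model, the dependent variable would be log-transformed before estimation, as the text describes; the sketch omits that step because the synthetic outcome is already roughly symmetric.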
These cross-sectional models cannot generate causal inferences; they can only reveal associations between the dependent and explanatory variables. Yet, it is worth considering whether the start-up rates could influence the explanatory variables (endogeneity). We think this is unlikely for almost all of the explanatory variables. There is a chance that start-up rates around 2010 could influence population growth rates between 2006 and 2010. There is greater likelihood of interaction between start-up rates and labour turnover. For example, employees losing employment may withdraw from the labour market to start firms. Having said this, our objective here is to employ a model that indicates correlations, not causality.
With those points in mind, we examined the following:
• Adjusted R² to indicate whether the models explain a reasonable amount of variation in the dependent variables.
• The significance of the explanatory variables at the 5% or 1% level.
• The direction of association between significant explanatory variables and start-up rates, not specific coefficient values.
The regression results for the two models are presented in Table 3. The models explain a reasonable amount of variation: 45% for all start-ups and 61% for high-tech start-ups.
With respect to statistical significance and direction of relationships, we first look at the pure demand-pull variables. All three, total personal income, population growth and density, are highly significant in the model for all start-ups. This result provides strong support for the demand-pull hypothesis. The only caveat is that population density has a negative coefficient, probably resulting from its very high correlation with total personal income (0.86). Equally important is the result that total personal income and population growth are also significant in the model for high-tech start-ups.
As hypothesized, the pure supply-push variables (patents, Research I universities, NIH or SBIR funding, and high-tech LQ) are not significant in the model for all business start-ups. In the high-tech starts model, however, only high-tech LQ is significant. It makes sense that high-tech start-up rates are higher where high-tech sectors are more prominent. These results provide only modest support for the supply-push hypothesis.
The remaining four human capital variables pertain to both hypotheses with caveats. Two are significant in the all sector model and three in the high-tech one: per cent foreign-born adults and labour turnover associate with starts in all sectors. Our interpretation is that foreign-born adults directly contribute by starting new businesses. Higher labour turnover rates may generate potential entrepreneurs from among displaced workers. The three human capital factors significant in the model for high-tech start-ups are per cent adults with a college education or higher, high-tech labour pools and, again, labour turnover. The significance of college graduates represents skilled human capital for high-tech start-ups. It is equally straightforward with high-tech labour pools and the availability of more specialized talent. Higher labour turnover rates comport with thicker labour markets that associate also with high-tech start-ups.
Per cent foreign born is not statistically significant for high-tech start-ups. Our interpretation is that the largest foreign-born group is Hispanic, whereas other immigrant groups, such as Indians or Chinese, are less prominent. Hispanic immigrants appear more likely to start locally oriented businesses that are not in high-tech sectors.
Apart from population density in the model for all start-ups, all significant variables have positive associations with the dependent variables as expected. Among the regional dummy variables, there are three statistically significant relationships. The dummies for the high-growth South Atlantic region are positive and significant in both models. The dummy for the Mountain region has a positive shift parameter in the high-tech starts model.
Although adjusted R² is lower, the model for start-up rates in all sectors has five of seven demand-pull related variables significantly associated and none of the four pure supply-push factors associated. Although three of four human capital-related variables are significant in the high-tech start-ups model, only one of four pure supply-push variables is significant, which simply indicates that the higher presence of high-tech sectors is associated with higher high-tech start-up rates. Neither the presence of Research I universities, NIH or SBIR research activity nor patents are associated. All these variables have relatively strong positive correlations with the high-tech LQ, college-educated adults and high-tech labour pools, which are all significant in the high-tech start-up model.
In sum, high-tech start-ups are associated with specialized talent (high-tech labour pools) and skilled labour (college-educated adults) possibly to meet their initial staffing needs. Higher labour turnover would enable rapid hiring to expand or replace workers. High-tech start-ups could also benefit from established high-tech firms that can potentially spawn spinoff opportunities, offer mentorship, become customers or provide other forms of support.
Overall, then, we believe we have very strong support for the demand-pull hypothesis with five of seven variables significant in the all sector model and a different set, but another five of seven variables in the high-tech sector model. We find limited evidence for the supply-push hypothesis as the only significant variable is the high-tech industry concentration. Having no pure supply-push variable significant in the model for all starts provides additional support for the demand-pull hypothesis.

CONCLUSIONS
Overall, metropolitan areas with larger, growing, denser markets as well as more foreign-born and higher labour turnover have higher start-up rates in all sectors. For high-tech start-up rates, we find strong correlations with the same factors of larger and growing markets, as well as the high-tech LQ, labour pool, and college-educated population, but no correlation with other supply-push factors: patents, Research I universities, and NIH or SBIR spending.
We believe that our findings may have several policy implications subject to further research. As mentioned above, our cross-sectional analysis does not address causality, but statistical insignificance simply and clearly means no relationship between variables. Given these results, would it not be reasonable to revisit the currently dominant supply-push approach, which, for example, treats research as the raw material stoking entrepreneurial activity? Spending for science parks, university technology transfer offices and federal-government research funding through the NIH, SBIR or the National Nanotechnology Initiative (NNI) is often justified on the grounds of promoting greater economic competitiveness (McCray, 2005;Motoyama, Appelbaum, & Parker, 2011). Our results point to the need for more definitive research on whether such spending will yield greater overall start-up activity or even high-tech start-up activity. Note that the federal government annually spends roughly US$20 billion for NIH, US$2 billion for SBIR and another US$2 billion for NNI. This spending may achieve important purposes for scientific development, but may not translate into greater economic competitiveness at least as measured by two types of start-up activities. Having said this, it is possible that these research inputs may correlate with start-up rates with time lags, though there is little variation over time about the presence of Research I universities. Future research should address whether and how long those time lags may exist between research-related factors and start-up rates.
In contrast to the presence of Research I or research-active universities, the basic educational role of universities, producing college-educated adults, is statistically related to high-tech start-ups. This echoes past studies that found universities contributing to start-up rates not directly but indirectly, as suppliers of human capital (Faggian & McCann, 2006;Motoyama & Mayer, 2017). Further research should test whether all universities, not just Research I universities, are important contributors to entrepreneurship promotion.
We found that high-tech industry concentrations are associated with high-tech start-up rates, which ties back to agglomeration theory operating through localization economies. Further research should investigate the policy of promoting high-tech start-ups where high-tech sectors are present in the region. Initiatives designed to stimulate high-tech start-ups in metropolitan areas without high-tech sectors may not be promising.
Finally, although all start-ups may be acknowledged as entrepreneurship, start-ups that can become major companies serving larger markets (growth entrepreneurship) deserve more nurturing. Their success should eventually increase local demand that should subsequently induce start-ups that serve the local market. Therefore, future research should address which regions appear to foster 'growth' entrepreneurship using more sophisticated models to ferret out relationships. Although policy implications must await further research, this study of demand-pull and supply-push factors conducted for all metro areas in the United States is an important first step.