An empirical investigation of socio-economic impacts of agglomeration economies in major cities of Punjab, Pakistan

Abstract Agglomeration economies are the external benefits earned from the clustering of industries and people in cities. The study assumes that unbridled clustering of population in emerging urban agglomerations turns economies into diseconomies. This study empirically investigates the heterogeneous socioeconomic impacts of agglomeration economies in selected cities of Punjab, Pakistan, from 1998 to 2018, using the Pooled Mean Group and Mean Group techniques of panel ARDL. Agglomeration economies are measured by population density, number of registered factories, employment size, and housing in the cities of Punjab. The study constructed four indices of socioeconomic conditions using principal component analysis: an education index, a healthcare index, a water & sanitation index, and an economic conditions index. The findings reveal that high population density, unemployment, and costly housing put pressure on educational and healthcare facilities and on sanitation and waste management in the cities of Punjab, Pakistan. The study suggests that policy makers and urban planners develop short-term and long-term policies and development plans for villages and secondary cities to uplift the wellbeing of the local population. In addition, cities need to decentralize for sustainable development and management.


PUBLIC INTEREST STATEMENT
Generally, population and industrial clustering in cities creates benefits for individuals and for the economy as a whole. However, in developing economies, poor economic infrastructure and a rapid rise in population size create socio-economic and administrative challenges in absorbing this pressure. The study investigates the short-run and long-run socio-economic impacts of population clustering in the big cities of Punjab, Pakistan. It tests the impact of population clustering (agglomerations) over 21 years (1998 to 2018). The findings reveal that high population density, unemployment, and costly housing are putting pressure on educational and healthcare facilities, on sanitation and waste management, on the economic conditions of households, and on city administration. Policy makers are advised to develop both short-term and long-term plans to manage population pressure on big cities and to strengthen the administrative infrastructure of those cities.

Introduction
Agglomeration economies 1 are a major characteristic of urbanization, reflecting the structural transformation 2 of a region's economy. Globally, countries benefit from urbanization by nurturing productivity through the clustering of enterprises and people in their cities. The concentration of economic and social activities, together with migration, is a usual phenomenon of big cities, and both agglomeration economies and congestion forces surface as urbanization proceeds. The process of urbanization therefore poses huge socio-economic, administrative, infrastructural, energy, transport, and environmental problems for the developing world (Arif et al., 2019; Rana & Bhatti, 2018). Estimates suggest that the urban population of the world will increase by 2.5 billion by 2050 (United Nations, 2019). Correspondingly, the urban population of South Asia is projected to increase by 250 million by 2030 (World Bank, 2016). Population density, employment size, and labor and land prices are the major drivers of agglomeration economies in cities (Glaeser, 2010). Knowledge spillovers, availability of skilled labor, market access, and good infrastructure are documented as the microfoundations of agglomeration economies (Rosenthal & Strange, 2001; Glaeser, 2010). In the long run, successful urbanization goes along with socioeconomic benefits that spill beyond the urban boundary. In developing economies, however, these positive trends are weakened by the rising pressure of urban population growth on infrastructure, land, housing, basic services, and the environment. Statistics illustrate that around 130 million South Asian residents live in slums and squatter settlements and are deprived of basic infrastructure and services (World Bank, 2016). These slum settlements develop into peri-urban areas around the city boundary, rapidly modifying the city landscape and human activities (Arif & Gupta, 2018). These undesired negative effects, which impede economic, social, and environmental development, are termed over-agglomeration in the economic literature (Kaya & Koc, 2019). Moreover, access to information and telecommunication has reduced the importance of agglomeration economies (Guiliano, Kang, & Yuan, 2019).
Pakistan has a total land area of 770,880 square km and a population of 220,892,340 people, which is 2.83% of the world's population. It ranks as the fifth most populous country in the world, with 35.1% urban dwellers. Pakistan is the most urbanized South Asian economy, with 80 million people living in cities (UNDP, 2018). Over the past few decades, the urban population has been growing at 3.3% annually as a consequence of structural transformation and migration to urban areas (Rana & Bhatti, 2018) (Appendix-I).
The Punjab province is the most urbanized region of South Asia and receives the largest number of migrants from all around Pakistan (Appendix-II), with a consistent demographic shift towards urban areas and cities (Rana & Bhatti, 2018). Punjab alone contributes 50% of the GDP. Lahore, Faisalabad, Gujranwala, Rawalpindi, Multan, and Islamabad are the six cities of Punjab that are among the ten largest Pakistani cities 3 (Appendix-III). These cities cumulatively contribute 55% of the GDP (MGI, 2012 4 ). Moreover, the ten major cities' share in GDP has increased to 78%, with a concentration of the industrial and services sectors, and they contribute 95% of federal tax revenue (UNDP, 2018). This positive growth attracts more people to migrate towards cities. Over time, the cities are expanding disproportionately in area and in population size (Kanwal et al., 2015; Kugelman, 2013; Rana & Bhatti, 2018) (see map in Appendix-IV). As per the urban cluster analysis of Punjab cities, Gujranwala is growing at 8.13% and is expected to be the second largest city in 2040. Faisalabad would be the next largest city in terms of urban expansion (4.02%). More interestingly, the medium cities of Narowal (12.17%) and Muridke (9.4%) would be the fourth and fifth largest settlements respectively, and these are expected to be as big as Faisalabad (The Urban Unit, 2018).
As cities grow rapidly, the costs of basic utilities, administrative services, and security arrangements also rise. This affects the sustainability of urban development and necessitates urban growth strategies that emphasize a shift towards the development of secondary cities (Ahmed & Ishrat, 2020; Malik et al., 2020).
The present study treats cities as urban agglomerations and investigates the socioeconomic impacts of agglomeration economies in five major cities of Punjab, Pakistan (Lahore, Faisalabad, Gujranwala, Rawalpindi, and Multan 5 ). The data were collected for 21 years (1998 to 2018) from different published sources of the Pakistan Bureau of Statistics. The pooled mean group (PMG) and mean group (MG) techniques of the panel ARDL approach were used to analyze the relationship between the variables of agglomeration economies and socioeconomic conditions. The research attempts to fill a gap in the empirical literature by providing evidence of both positive and negative socioeconomic impacts of agglomeration economies.
The study is organized as follows. Section 2 reviews the literature on factors associated with agglomeration economies. Section 3 provides the theoretical background of the study. Section 4 details the model, methodology, data sources, and study area. Section 5 presents the results and discussion, and Section 6 gives the conclusions and recommendations of the study.

Literature review
Modern cities attract physical and human capital investments from their surroundings, while weaker economies are depressed by the flight of such capital. For this reason, urban areas hold a major competitive advantage in global labor and capital markets. Social infrastructure, e.g., education, healthcare, leisure, culture, entertainment, and transportation, attracts people from neighboring areas, who cluster in cities (Ovsiannikova et al., 2018; Sun et al., 2018).
Housing is a vital issue in the developing world, where around 70% of the urban population lives in informal settlements (Malik et al., 2020; UN-Habitat, 2016). In South Asia, the primary cities 6 and secondary cities 7 of India, Pakistan, Bangladesh, and Sri Lanka face various urban challenges, e.g., population pressure on infrastructure, housing, basic services, and the environment. Small 8 and medium 9 cities have a positive impact on economic growth, whereas the impact of primary cities remains statistically insignificant across countries (Deb, 2017; World Bank, 2016).
In Pakistan, rising migration towards urban areas is increasing various macroeconomic and socioeconomic problems, such as urban poverty, overpopulation, environmental pollution, deprivation of education and health, poor sanitation, over-crowded housing, congested traffic, road accidents, and crime (Afzal et al., 2018; Latif & Yu, 2020; A. U. Khan et al., 2016). Other urban growth challenges include the conversion of farmland into residential schemes, squatter settlements, deficient services, and the unavailability of clean drinking water (A. A. Khan et al., 2014). On the other side, the push factors of migration towards cities include internal war, insecurity, natural disasters, expensive agriculture, unequal landholdings, and an oppressive lifestyle (Kugelman, 2013). A large number of rural residents have moved to the outskirts of cities, which have undergone structural changes resulting in high density and pressure on infrastructure, resources, and urban land (Government of Pakistan [GOP], 2015).
According to a recent labor force survey, the urban labor force is 31.5 percent of the total labor force, and the province of Punjab holds the highest share, accounting for 57 percent of the total. However, urban labor force participation in Punjab and KPK declined, owing to a high dependency ratio and rate of migration, between -02 and 2013-14 (PES, 2018). This low urban labor force participation and increased migration show a sluggish expansion of the urban economic base that ultimately negates the benefits of urbanization (SPDC, 2016). The unbridled urban growth of Lahore has brought a variety of urban environmental nuisances, i.e., untreated industrial and municipal waste, uncollected solid waste, traffic congestion, and vehicle exhaust, which pose serious health risks to city dwellers (N. Y. Khan et al., 2012). Other cities of Punjab, namely Faisalabad, Gujranwala, Multan, and Rawalpindi, also face challenges of inadequate infrastructure and urban management capacity. Water supply, sanitation, and waste management services are unreliable, and the aquifers are over-exploited. Congested roads, poor transport, and weak traffic management constrain urban mobility. Housing facilities are also inadequate for the growing population. This widespread under-performance in service delivery and inadequate infrastructure affect living conditions and limit business growth, reducing the productive potential of cities (PES, 2018-19).
In 2008, about one-half of the populations of Punjab and Sindh provinces were living in cities, while the figures for Balochistan and KPK were less than 24 percent and 17 percent respectively. Lahore, the provincial capital of Punjab, experienced arbitrary population growth from 6.32 to 11.12 million between 1998 and 2017. Its population density rose from 3,566 persons per square kilometer to 6,606 in 2018, challenging administrative and civic facilities. The city is being further expanded at the cost of productive agricultural land, an issue that the city district government has not resolved. Urban planners need to develop an integrated plan for the infrastructural and socioeconomic development of the city (Rana & Bhatti, 2018).
Previous studies suggest that urbanization in Pakistan has had a dual impact on the development of the economy: it encouraged workers to move from agriculture to the services sector, but it also caused problems for migrants by depriving them of basic needs (Awan & Iqbal, 2010). In addition, urban agglomeration encourages trade in goods between producers and consumers, and a new middle class of over 100 million people is emerging and providing a skilled workforce (Hassan et al., 2012). Furthermore, population size, road density, and a technically trained labor pool have promoted industrial agglomerations in Pakistan (Burki & Khan, 2011). Agglomerations of localized and urbanized economies have a strong impact on the formation of new firms, their scale of operation, employment, enrollment levels, and social inclusion. However, this impact varies with firm size (Azhar & Adil, 2016; Chaudhry & Haroon, 2015).
Around the globe, studies have investigated the micro-foundations of agglomeration economies (Rosenthal & Strange, 2001; Glaeser, 2010), the benefits of industrial clustering in cities, and the economies and diseconomies associated with agglomerations. In the case of Pakistan, however, this area has not been explored much. A few empirical studies have examined the effects of agglomeration economies on the formation of new firms, on the efficiency of manufacturing industries, and on firms' turnover (Burki & Khan, 2011; Haroon, 2013; Nasir, 2017). Furthermore, two studies identified a positive impact of district agglomerations on firm efficiency, social inclusion, and economic output in the case of Punjab, Pakistan (Azhar & Adil, 2016; Chaudhry & Haroon, 2015). The developmental progress of the cities of Punjab shows that their environmental conditions are poorer than their economic and social conditions. Moreover, none of the cities has attained the position of a sustainable city, which reflects the poor condition of infrastructure in the major cities of Punjab (Ghalib et al., 2017). Neglect of the development of secondary cities is one of the causes of over-urbanization and environmental degradation in big cities and metropolitan areas, although secondary cities can play a fundamental role in attaining sustainable development at the local, regional, and national levels in Pakistan (Kalwar et al., 2016). Urban unsustainability, in the form of economic imbalances, high rents, and air and noise pollution, poses serious challenges to sustainable development and requires urban resilience and community engagement (Latif & Yu, 2020).

Theoretical foundations of the study
The theoretical foundation of agglomeration economies (AEs) originates from Marshall's concept of scale economies. As an effect of agglomeration, a city generates both economies and diseconomies of scale through its physical growth (Marshall, 1920). Myrdal (1957), Hirschman (1958), and Kaldor (1970) discuss the classical issues of convergence and divergence in a dynamic spatial economy. These studies hold that diverging tendencies dominate converging ones in a growing market economy. The theory of AEs puts forward that firms gain positive externalities from the spatial clustering of economic activities. These benefits arise from intra- and inter-industry clustering, referred to as localization and urbanization economies 10 (Fujita et al., 1999; Fujita & Thisse, 2002; Melo et al., 2009). The mobility of goods, people, and capital enhances this divergence, whereas the mobility of knowledge has received little attention. It is also observed that producers of goods and services are sensitive to urban congestion through its impact on business costs and productivity (Weisbrod et al., 2003).
However, Solovian growth theory (1957) shows that incomes converge under constant returns, and that this convergence is accelerated by any sort of mobility. Thus, in these theories, information mobility is ignored from the beginning. Krugman (1991) developed the theory of the New Economic Geography (NEG) to explain the formation of various economic agglomerations in geographical space. The NEG theory presents agglomeration benefits and location choices in a formal, general equilibrium framework through the interaction between transportation costs and scale economies. The earlier literature provides evidence supporting NEG theory (Breinlisch, 2006; Hansen, 2005; Head & Mayer, 2004; Redding, 2009). NEG is used to describe the spatial agglomeration of industry and population and the associated economic performance (Fan & Scott, 2003). The main features of NEG models are product differentiation, increasing returns to scale, and transport costs, which together create economic externalities. These three building blocks, along with either factor mobility or intermediate inputs, reinforce cumulative causation and agglomeration. However, both the orthodox and heterodox schools of economics have criticized the monopolistic modelling logic of NEG. These critiques focus on the immeasurability of some concepts of increasing returns in NEG frameworks, the static nature of some of its assumptions, the specific focus on the representative firm, the presence of pecuniary economies, and the absence of human capital and technological spillovers as externalities (McCann & Oort, 2019). In addition, balanced growth theory describes urban structure as the margin that reduces increasing returns so as to yield constant returns to scale in the aggregate, which is sufficient to deliver balanced growth (Hansberg & Wright, 2007). This theory produces Zipf's Law to describe the city-size distribution under certain assumptions (Arshad et al., 2019), where Zipf's Law for cities is described as a striking pattern of agglomerations (Gabaix, 1999).

Model specification
This section specifies the model relating socio-economic conditions (SEC) 11 to the independent variable, agglomeration economies (AE), as given in Equation (1). The study used population density, number of registered firms, employment size of firms, and housing as proxies for agglomeration economies.
The model is then expanded into four equations, (2) to (5), in which the dependent variable, socioeconomic conditions (SEC), is represented by four different indices: the education index (EI), healthcare index (HI), water & sanitation index (WSI), and economic conditions index (ECI). In each of these equations, agglomeration economies (AE) are measured by population density in persons per square kilometer (DEN), number of factories (FAC), size of employment in factories (EMP), and housing (HOU).
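The displayed equations (1) to (5) did not survive in this text. A minimal reconstruction from the variable definitions above might read as follows; the linear form, the city and year subscripts i and t, and the coefficient symbols are assumptions made for illustration and are not taken from the original.

```latex
\begin{align}
SEC_{it} &= f(AE_{it}) \tag{1} \\
EI_{it}  &= \alpha_{0} + \alpha_{1} DEN_{it} + \alpha_{2} FAC_{it} + \alpha_{3} EMP_{it} + \alpha_{4} HOU_{it} + \varepsilon_{1,it} \tag{2} \\
HI_{it}  &= \beta_{0} + \beta_{1} DEN_{it} + \beta_{2} FAC_{it} + \beta_{3} EMP_{it} + \beta_{4} HOU_{it} + \varepsilon_{2,it} \tag{3} \\
WSI_{it} &= \gamma_{0} + \gamma_{1} DEN_{it} + \gamma_{2} FAC_{it} + \gamma_{3} EMP_{it} + \gamma_{4} HOU_{it} + \varepsilon_{3,it} \tag{4} \\
ECI_{it} &= \delta_{0} + \delta_{1} DEN_{it} + \delta_{2} FAC_{it} + \delta_{3} EMP_{it} + \delta_{4} HOU_{it} + \varepsilon_{4,it} \tag{5}
\end{align}
```

Each of the four index equations uses the same set of agglomeration proxies, so any difference in results across indices comes from the dependent variable rather than from the regressors.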

Methodology
The panel data techniques of pooled OLS, fixed effects, and random effects are considered inappropriate for non-stationary data. These are static models that impose common slopes and variances. Time-series and cross-sectional effects can only be captured by including dummy variables, which reduces the degrees of freedom (Baltagi, 2008). Furthermore, in the presence of endogeneity, i.e., correlation between the regressors and the error term, fixed effects parameter estimates become biased (Campos & Kinoshita, 2008). Similarly, the random effects model has the shortcoming of time invariance. Static panel estimators therefore cannot separate short-run from long-run relationships (Loayza & Ranciere, 2006). These models also assume homogeneity of the coefficient on the lagged dependent variable (Holly & Raissi, 2009). Thus, static panel techniques are unable to estimate dynamic models.
In PMG, the error variances are allowed to be heterogeneous across individual units. The efficiency, consistency, and validity of PMG require a long-run relationship among the variables, which means that the coefficient of the error correction term must be negative and not less than minus 2. Moreover, the residuals of the error correction model should be serially uncorrelated, and the explanatory variables should be treated as exogenous.
Conventionally, a long-run relationship among variables is taken to exist only if all variables are integrated of the same order, whereas the panel ARDL can be applied to variables with different orders of integration (Phillips & Hansen, 1990). The ARDL regression, through pooled mean group estimation, provides consistent estimators whether the variables of the model are integrated of order one, I(1), or of order zero, I(0) (Pesaran et al., 1999). Like the time-series ARDL method, the PMG estimator is not applicable if any variable in the model is integrated of order two, I(2) (Pesaran et al., 1999). For this reason, unit root tests are applied before the estimation technique is selected.
The panel ARDL is an autoregressive distributed lag model with p lags of the dependent variable and q lags of the explanatory variables. The ARDL specification is constructed in equations (6), (7), (8), and (9), one for each index; in equation (6), for example, EI is the dependent variable and DEN, FAC, EMP, and HOU represent the set of explanatory variables. The parametric error correction form for each index is represented by equations (10) to (13).
Here, π and ω are the short-run coefficients of the lagged dependent and independent variables, respectively, while ψ is the long-run coefficient. In addition, φ, the coefficient of the error correction term, represents the speed of adjustment of the dependent variables (EI, HI, WSI, ECI) towards the long-run equilibrium level following changes in the independent variables. A negative and significant value of φ (φ < 0) is evidence of cointegration between the independent and dependent variables. A larger absolute value of φ indicates faster convergence towards the long-run equilibrium. A positive value (φ > 0) indicates that no stable long-run relationship exists among the variables. Thus, the long-run coefficient (ψ) and the speed of adjustment (φ) are both important in the estimation of the model.
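Equations (6) to (13) are not reproduced in this text. As a hedged sketch, the ARDL(p, q) specification and its error correction form for the education index could be written as below, following the general layout of Pesaran et al. (1999). The symbols π, ω, ψ, and φ are those defined above; λ, δ, μ, and the stacked regressor vector X are notational assumptions introduced here, and the equation numbers are assigned by analogy with the text.

```latex
\begin{align}
EI_{it} &= \sum_{j=1}^{p} \lambda_{ij}\, EI_{i,t-j}
         + \sum_{j=0}^{q} \delta_{ij}'\, X_{i,t-j} + \mu_{i} + \varepsilon_{it},
\qquad X_{it} = (DEN_{it},\, FAC_{it},\, EMP_{it},\, HOU_{it})' \tag{6} \\
\Delta EI_{it} &= \varphi_{i}\left( EI_{i,t-1} - \psi_{i}'\, X_{it} \right)
         + \sum_{j=1}^{p-1} \pi_{ij}\, \Delta EI_{i,t-j}
         + \sum_{j=0}^{q-1} \omega_{ij}'\, \Delta X_{i,t-j} + \mu_{i} + \varepsilon_{it} \tag{10}
\end{align}
% Mapping between the two forms:
\begin{align*}
\varphi_{i} = -\Bigl(1 - \sum_{j=1}^{p} \lambda_{ij}\Bigr), \qquad
\psi_{i} = \frac{\sum_{j=0}^{q} \delta_{ij}}{1 - \sum_{k=1}^{p} \lambda_{ik}}, \qquad
\pi_{ij} = -\sum_{m=j+1}^{p} \lambda_{im}, \qquad
\omega_{ij} = -\sum_{m=j+1}^{q} \delta_{im}
\end{align*}
```

The analogous equations for HI, WSI, and ECI would replace EI with the respective index. With the maximum lag of one, the two sums of lagged differences in the error correction form are empty, leaving only the error correction term, the intercept, and the disturbance.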
To estimate the equations of the study, the pooled mean group (PMG) and mean group (MG) techniques are applied. The PMG method is flexible: it allows unit-specific variation in the short-run coefficients while constraining the long-run coefficients to be common across units. The alternative MG method of estimating the error correction form of the ARDL is applied for comparison (Pesaran & Smith, 1995). Like PMG, MG can be applied when the data are non-stationary and the parameters are heterogeneous across individual units/groups. The MG estimator allows both the long-run and the short-run slope parameters to vary across individual units.
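The MG idea can be illustrated with a minimal sketch: each unit is estimated separately and the unit-specific coefficients are then averaged, with a standard error based on their cross-unit dispersion. The city coefficients below are hypothetical values chosen for illustration, not estimates from this study, and the function name is an assumption.

```python
from math import sqrt
from statistics import mean


def mean_group(unit_coefs):
    """Mean Group estimate: the simple average of unit-specific coefficients.

    The standard error is based on the cross-unit dispersion of the
    estimates: var = sum((b_i - b_bar)^2) / (N * (N - 1)).
    """
    n = len(unit_coefs)
    if n < 2:
        raise ValueError("at least two units are needed")
    b_bar = mean(unit_coefs)
    var = sum((b - b_bar) ** 2 for b in unit_coefs) / (n * (n - 1))
    return b_bar, sqrt(var)


# Hypothetical long-run coefficients on population density, one per city.
city_coefs = {
    "Lahore": -0.30,
    "Faisalabad": -0.10,
    "Gujranwala": -0.20,
    "Rawalpindi": -0.25,
    "Multan": -0.15,
}

estimate, std_err = mean_group(list(city_coefs.values()))
print(f"MG estimate = {estimate:.3f}, standard error = {std_err:.4f}")
```

PMG differs in that the long-run coefficient is estimated jointly under the restriction that it is the same for every city, which requires maximum likelihood estimation and is not shown here.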
The choice between PMG and MG can be tested with the Hausman test. If the long-run homogeneity restrictions are valid, the MG estimator is inefficient and the PMG estimator is more efficient than MG.
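A rough sketch of the Hausman comparison is given below. It computes the statistic from the difference between the MG and PMG long-run coefficients and the difference between their covariance matrices, then compares it with a chi-square critical value. All coefficient and covariance values are hypothetical, the covariance matrices are taken as diagonal only to keep the example short, and NumPy is assumed to be available.

```python
import numpy as np


def hausman(b_mg, b_pmg, v_mg, v_pmg):
    """Hausman statistic H = d' (V_mg - V_pmg)^(-1) d, with d = b_mg - b_pmg.

    MG is consistent under both hypotheses; PMG is efficient only if the
    long-run coefficients are truly homogeneous (the null hypothesis).
    Returns the statistic and its degrees of freedom.
    """
    d = np.asarray(b_mg, dtype=float) - np.asarray(b_pmg, dtype=float)
    v = np.asarray(v_mg, dtype=float) - np.asarray(v_pmg, dtype=float)
    stat = float(d @ np.linalg.solve(v, d))
    return stat, d.size


# Hypothetical long-run coefficients on DEN, FAC, EMP, HOU (not the paper's estimates).
b_mg = [-0.25, 0.10, 0.40, 0.30]
b_pmg = [-0.20, 0.12, 0.35, 0.28]
# Hypothetical covariance matrices of the two estimators.
v_mg = np.diag([0.010, 0.004, 0.020, 0.008])
v_pmg = np.diag([0.006, 0.002, 0.012, 0.004])

stat, df = hausman(b_mg, b_pmg, v_mg, v_pmg)
CRIT_5PCT_DF4 = 9.488  # 5% critical value of the chi-square distribution with 4 d.o.f.
preferred = "PMG" if stat < CRIT_5PCT_DF4 else "MG"
print(f"H = {stat:.4f}, df = {df}, preferred estimator: {preferred}")
```

A statistic below the critical value means that homogeneity of the long-run coefficients is not rejected, in which case PMG would be preferred; a statistic above it would point to MG.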

Data description and sources
This paper attempts to estimate the short-run and long-run socio-economic impacts of agglomeration economies using annual data from five major cities of Punjab, Pakistan, over a period of 21 years (1998 to 2018). The data have been collected from different published sources of the Pakistan Bureau of Statistics (PBS), including the Development Statistics of the Punjab, the Multiple Indicator Survey (MICS), the Census of Manufacturing Industries (CMI), and the compendium of environment. The cities are selected on the basis of their population size (as urban agglomerations). In contrast to developed countries, time-series data at the city level are very limited in Pakistan. The motivation for taking data from 1998 to 2018 was to observe the urban agglomerations, as cities have become densely populated during the last few years. In the empirical analysis, most of the data are taken in numbers and some in percentages, as available from the Bureau of Statistics, Government of Pakistan. The total number of observations is 105.

Sample area
Undoubtedly, Karachi, the capital city of Sindh province, is the most populous city of Pakistan. However, Sindh has only two big cities (Karachi and Hyderabad), whereas Punjab, the most populous province, has six. Urbanization is more evident in Punjab, which is the most industrialized province and contributes the major share of the national GDP. It is an important economic hub in Pakistan. The province is bordered by the other three provinces, i.e., Balochistan, Sindh, and Khyber Pakhtunkhwa, which makes inter-provincial movement easier. Punjab is known for its comparative prosperity and has the lowest poverty rate among all provinces. Therefore, the study selected five major cities of Punjab province. Figure 1 shows the map of the study area. Islamabad is excluded because it is the federally administered capital city of Pakistan and does not fall under the provincial administration of Punjab.
In Punjab, 40% of people live in urban areas. According to the World Bank, the problems associated with urbanization include education, health, water and sanitation, poor housing quality and affordability, and transportation. According to existing studies, these problems are most apparent in the five major cities of Punjab, i.e., Lahore, Faisalabad, Gujranwala, Rawalpindi, and Multan. Over the previous twenty years (1998 to 2017), these cities have become urban agglomerations, with population increases of 116.32% in Lahore, 59.49% in Faisalabad, 48.84% in Rawalpindi, 78.10% in Gujranwala, and 56.33% in Multan (UNICEF, 2020).
Source (Figure 1): Urban Atlas, the Urban Unit, Lahore (2019)

Results and discussion
This section presents and discusses the empirical results of the study.

Principal component analysis
The study used principal component analysis (PCA) to develop four indices. The method originated in the works of Pearson (1901) and Hotelling (1933) and was updated by Jolliffe (1986). Following Kaiser's (1960) rule for retaining components on the basis of their eigenvalues, the four selected indices of Education, Healthcare, Water & Sanitation, and Economic Conditions are presented in Appendix-V.
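The index construction can be sketched roughly as follows, assuming NumPy is available. The indicators are standardised, the eigenvalues of their correlation matrix are computed, components with eigenvalues above one are retained (the Kaiser rule), and the retained component scores are combined into a single index. The indicator values are hypothetical, and weighting the retained components by their eigenvalue shares is an assumption made for illustration; the study does not state its weighting scheme here.

```python
import numpy as np


def pca_index(data):
    """Composite index from the principal components kept by the Kaiser rule.

    Returns the eigenvalues (descending), the number of retained components,
    and the index (one value per observation).
    """
    x = np.asarray(data, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)   # standardise each indicator
    corr = np.corrcoef(z, rowvar=False)                # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)            # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                               # Kaiser rule: eigenvalue > 1
    scores = z @ eigvecs[:, keep]                      # component scores
    weights = eigvals[keep] / eigvals[keep].sum()      # assumed weighting scheme
    return eigvals, int(keep.sum()), scores @ weights


# Hypothetical education indicators for six city-years:
# enrolment (thousands), number of schools (hundreds), teachers (hundreds).
indicators = [
    [50, 10, 100],
    [55, 11, 108],
    [60, 13, 121],
    [62, 13, 119],
    [70, 15, 140],
    [75, 17, 150],
]

eigvals, n_kept, index = pca_index(indicators)
print("eigenvalues:", np.round(eigvals, 3), "| components kept:", n_kept)
print("index:", np.round(index, 3))
```

The sign of a principal component is arbitrary, so in practice the index may need to be flipped so that higher values mean better conditions.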

Summary statistics and correlation coefficients
The summary statistics and the correlation coefficients among the independent and dependent variables are given in Appendices VI and VII.

Unit root tests
It is necessary to pre-test the stationarity of the data series before proceeding to the econometric estimation of the model. For this, the study ran the Levin-Lin-Chu and Im-Pesaran-Shin unit root tests for all variables and found that the variables are integrated of mixed order, i.e., I(0) and I(1). Pesaran et al. (2001) suggest that with variables integrated of order I(0) and I(1), the pooled mean group and mean group techniques are appropriate for estimating the model. However, these techniques are inappropriate if a data series is integrated of order 2 or higher. Moreover, when applying the PMG and MG techniques, the dependent variables of the models must be integrated of order I(1). For this reason, we also ran the unit root tests (Levin-Lin-Chu and Im-Pesaran-Shin) for the four indices (EI, HI, WSI, and ECI) to confirm that the PMG and MG techniques are appropriate for the econometric analysis of the study data. The findings of the unit root tests for all variables are provided in Appendix-VIII.
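To illustrate the logic of the panel unit root pre-test, the sketch below computes a Dickey-Fuller t-statistic for each unit and averages them, which is the raw ingredient of the Im-Pesaran-Shin statistic. It is a simplified illustration on simulated data, assuming NumPy is available: it omits lagged differences, the standardisation step, and the critical values of the actual test, and it does not reproduce the study's calculations.

```python
import numpy as np


def df_tstat(y):
    """t-statistic on rho in the regression diff(y)_t = a + rho * y_{t-1} + e_t."""
    y = np.asarray(y, dtype=float)
    dy, ylag = np.diff(y), y[:-1]
    X = np.column_stack([np.ones_like(ylag), ylag])    # intercept and lagged level
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])    # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return float(beta[1] / np.sqrt(cov[1, 1]))


def t_bar(panel):
    """Average of the unit-by-unit Dickey-Fuller t-statistics."""
    return float(np.mean([df_tstat(series) for series in panel]))


# Simulated panel: 5 hypothetical cities observed for 21 years.
rng = np.random.default_rng(0)
shocks = rng.normal(size=(5, 21))
stationary_panel = shocks                    # white noise, I(0)
random_walk_panel = shocks.cumsum(axis=1)    # accumulated shocks, I(1)

print("t-bar, stationary panel: ", round(t_bar(stationary_panel), 2))
print("t-bar, random-walk panel:", round(t_bar(random_walk_panel), 2))
```

A strongly negative average points towards stationarity in levels, while a value closer to zero is consistent with a unit root; a series that still shows a unit root after first differencing would be I(2) and would rule out the PMG and MG estimators.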

Pooled mean group and mean group estimates
To estimate the long-run and short-run impacts of agglomerations on socioeconomic conditions in the cities, we apply both the PMG and the MG estimators, with a maximum lag of one selected on the basis of the Schwarz Bayesian Criterion. Initially, four indices were constructed from the socioeconomic variables using principal component analysis (PCA): the education index, health index, water & sanitation index, and economic conditions index. The study then estimated the impact of agglomeration economies on these four indices using equations (6) to (9) and equations (10) to (13) respectively. Table 1 shows the results of the PMG and MG estimators in four columns (1 to 4). Column 1 reports the PMG and MG estimates for the Education Index, column 2 for the Healthcare Index, column 3 for the Water & Sanitation Index, and column 4 for the Economic Conditions Index. The Hausman test is used to assess the hypothesis of poolability of the long-run coefficients and to choose between PMG and MG. The hypothesis is accepted for the education and economic conditions indices, so the PMG estimator is considered efficient and preferable to the MG estimator, and the PMG results are interpreted for these indices. For the healthcare and water & sanitation indices, MG is preferred over PMG, so only the MG results are interpreted for these two indices.
In the Education Index model, population density, taken individually, shows an insignificant relationship in the long run but a significant positive relationship in the short run. Possibly, in the short run, more people in high-density areas enroll in schools and colleges, and the number of educational institutions also increases to accommodate the larger population in cities. In a favorable learning atmosphere, people concentrate in the neighborhood, which reduces travel time and the crime rate (Sun et al., 2018). However, the number of factories has a negative and highly significant relationship with education in the long run. Employment in factories has a significant positive impact on the education index in the long run. This implies that with more employment opportunities, educational attainment increases among the population in cities. However, with more labor absorption, a high rate of migration of educated labor towards cities is also expected in the long run.
This result is in line with previous studies confirming localization economies of knowledge diffusion, buyer-supplier networks, and a skilled labor pool (Burki & Khan, 2011; Haider & Badami, 2010; Hassan et al., 2012). The demand for housing also has a significant positive relationship with education in the long run. Educated people move to cities for better living conditions (Azhar & Adil, 2016). Overall, the results show that AEs have a significant long-run impact on the education index. The error-correction term (ECT) coefficient for the education index is significantly negative (−0.19), which shows that the deviation from the long-run equilibrium is corrected by about 19% in one year. The value lies within the dynamically stable range for the PMG estimator. Both the Hausman and ECT values confirm that PMG is preferred for the Education Index.
The Healthcare Index consists of variables on the provision of healthcare services in the cities. Again, the ECT estimate is significantly negative, with a magnitude of −0.42. In addition, under the Hausman test results, MG is preferred over the PMG approach, so the interpretation of this model centers on the MG results only. In the long run, the coefficients of population density and employment in factories show a significantly negative impact on health conditions in the cities. This shows that high population density puts pressure on the availability of healthcare services and on overall health conditions in the cities. High population density in an area also contributes to disease prevalence and mental health issues among citizens (N. Y. Khan et al., 2012).
For the water & sanitation index, the ECT coefficient is significantly negative, showing that the deviation from the long-run equilibrium is corrected by 55% in one year. In the long run, population density has a significantly negative impact on water & sanitation conditions, which indicates the pressure of high population density on available water resources and municipal services. The unavailability of clean drinking water, inadequate disposal of residential and industrial wastes, vector-borne diseases, and vehicle exhaust are problems of cities, along with poverty (A. A. Khan et al., 2014). The number of factories and employment also show a significant negative impact on the water & sanitation index. The consumption of natural resources, pressure on infrastructure, uncollected solid waste, and traffic congestion pose serious health risks in the long run (N. Y. Khan et al., 2012; GOP, 2014). In Pakistani cities, around 40% of citizens have access to a safe water supply and nearly 80% have access to sanitation services, but full and safe treatment and disposal of wastewater is almost nonexistent (World Bank, 2019). The coefficient of housing indicates a significant positive impact on the water & sanitation index: as the demand for housing increases, the demand for water & sanitation services also rises in the long run. The results are in line with other studies, as the increased number of settlements around water bodies is a major reason for stress on aquatic sources. Total wastewater discharges in Pakistan are recorded as 7,590 million cubic meters annually, of which 30% comes from industries and 70% from domestic discharges. Both of these discharges are expected to double by 2025. Previously, only 1% of urban wastewater was treated in Pakistan, and the remainder flowed into rivers without any treatment, contaminating the aquatic sources (United Nations Environment Programme [UNEP], 2013).
The Economic condition-Index consists of variables measuring the economic status of people in the cities. The significantly negative value of the ECT shows that deviation from the long-run equilibrium is corrected by 46% in one year. The ECT and Hausman test values favor the PMG model for the Economic condition-index. Among the agglomeration variables, population density has a significantly negative impact on economic condition in the long-run. A high population density challenges the economic, administrative and civic amenities of cities, and causes socioeconomic problems (Afzal et al., 2018; Imran et al., 2013). The number of factories also shows a negative impact on economic condition. Along with this, the coefficient of employment in factories also has a significantly negative relationship with economic conditions, in both the short-run and the long-run. The negative sign of these variables may indicate unemployment in cities. Rising unemployment and inflation in cities increase urban poverty (A. U. Khan et al., 2016). However, the variable of housing shows a significant positive impact, which may indicate the development of informal housing and the rising number of housing schemes around the cities (Malik et al., 2020; UN-Habitat, 2016).
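The reported ECT values can also be read as adjustment speeds. The following is a minimal sketch, not part of the study's estimation: it assumes a simple first-order adjustment in which the remaining gap to the long-run equilibrium shrinks by the factor (1 + ECT) each year. The function name `half_life_years` and the dictionary keys are hypothetical labels chosen for illustration.

```python
import math

def half_life_years(ect: float) -> float:
    """Years until half of a deviation has been corrected.

    Assumption: the gap shrinks by the factor (1 + ect) each year,
    which requires -1 < ect < 0.
    """
    if not -1.0 < ect < 0.0:
        raise ValueError("expected an ECT strictly between -1 and 0")
    return math.log(0.5) / math.log(1.0 + ect)

# ECT magnitudes as reported in the text for three of the indices.
reported_ect = {
    "healthcare": -0.42,
    "water_sanitation": -0.55,
    "economic_condition": -0.46,
}

for name, ect in reported_ect.items():
    print(f"{name}: half-life of about {half_life_years(ect):.2f} years")
```

Under this assumption, the half-lives come out at roughly 1.3, 0.9 and 1.1 years respectively, so the water & sanitation-index returns to equilibrium fastest.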
The overall findings of the study show that high population density has a significant negative impact on healthcare, water & sanitation and economic conditions of the cities in the long-run. The rising number of factories has a significant negative impact on all socioeconomic variables (i.e. the indices of education, healthcare, water & sanitation, and economic conditions) in the cities in the long-run. The variable of employment in factories has a significantly negative impact on the healthcare, water & sanitation, and economic condition indices, but a positive impact on the education index. The availability of housing has a positive impact on all indices in the long-run for the cities of Punjab, Pakistan. These findings confirm the hypothesis of the study that, alongside the positive impacts of agglomeration economies, negative impacts are also affecting the socioeconomic conditions of the cities of Punjab, Pakistan.

Conclusion and recommendations
The main objective of the present study was to investigate the socioeconomic impacts of agglomeration economies in the major cities of Punjab, Pakistan. For this purpose, the five most populous cities of Punjab province (i.e. Lahore, Faisalabad, Gujranwala, Rawalpindi and Multan) were selected for study. Data were collected for the 21 years from 1998 to 2018 to check the long-run impacts of agglomeration economies on the socioeconomic conditions of Punjab's cities. Initially, four indices were constructed for the socioeconomic variables using the PCA technique. Afterwards, the PMG and MG techniques for panel data were applied to all four indices to analyze short-run and long-run impacts. The findings of this research show that the concentration of industries and population, along with its positive economies, creates socioeconomic challenges for cities in emerging economies such as Pakistan. This study attempts to describe the relationship between city size and the availability and provision of socioeconomic infrastructure. The results indicate that unbridled population pressures towards cities and uncontrolled, non-progressive growth in the number of factories are creating multiple challenges for the existing social, economic and administrative infrastructure, which reduce the ability of cities to address negative externalities.
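To illustrate how a composite index can be built with PCA, the sketch below scores each observation on the first principal component of standardized indicators. It is an assumption-laden illustration, not the study's actual procedure: the function names, the use of power iteration on the correlation matrix, the choice of the first component only, and the placeholder numbers are all hypothetical.

```python
import math

def standardize(rows):
    """Rescale each column to zero mean and unit sample standard deviation.
    Assumes every column varies (non-zero standard deviation)."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(k)]
    return [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]

def first_component(z, iterations=500):
    """Approximate the leading eigenvector of the correlation matrix
    by power iteration, starting from equal positive weights."""
    n, k = len(z), len(z[0])
    corr = [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(k)]
            for a in range(k)]
    w = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        v = [sum(corr[a][b] * w[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in v))
        w = [x / norm for x in v]
    return w

def pca_index(rows):
    """Index value per row: projection of standardized data on the first component."""
    z = standardize(rows)
    w = first_component(z)
    return [sum(zi * wi for zi, wi in zip(r, w)) for r in z]

# Placeholder indicator values (rows = observations, columns = indicators).
# These numbers are invented for illustration and are not the study's data.
indicators = [
    [10.0, 200.0, 3.0],
    [12.0, 220.0, 3.5],
    [15.0, 260.0, 4.1],
    [11.0, 210.0, 3.2],
    [18.0, 300.0, 5.0],
]
index = pca_index(indicators)
print([round(value, 3) for value in index])
```

Because the placeholder indicators move together, the row with the highest values on every indicator receives the highest index score. With real data, the sign and interpretation of the component would need to be checked against the loadings.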
The private provision of infrastructure can reduce negative externalities, but it is cost-effective only when the city population is low or institutions are strong; public provision, however, can be more cost-effective in bigger cities. The authors are of the opinion that the government needs to take measures to control unbridled urbanization through integrated and sustainable initiatives of socioeconomic and infrastructure development, along with capacity building, institutional strengthening and development control measures within a legal and regulatory framework. In addition, the decentralization of large cities and the development of secondary cities should be adopted as an effective urban management tool. The development of regional plans and local infrastructure plans that ensure all public amenities, infrastructure facilities and administrative services is also suggested for small towns and villages, to control out-migration towards big cities. The study further suggests future research on urban public service delivery.

ORCID ID: http://orcid.org/0000-0002-9562-4291
M. Riaz Akbar 3, ORCID ID: http://orcid.org/0000-0001-8028-0791
1 The Superior University, Lahore, Pakistan.
2 Lahore College for Women University, Lahore, Pakistan.

Notes
1. Agglomeration economies (AEs) are the unintended benefits that occur with the clustering of firms and people together in an area. These economies arise by improving productivity and job creation, specifically in the manufacturing and services sectors.
2. Structural transformation is simply moving from low-productivity areas like agriculture to high-productivity and value-added areas like industry.
3. Ten large cities (Karachi, Lahore, Faisalabad, Gujranwala, Multan, Rawalpindi, Hyderabad, Peshawar, Quetta and Islamabad), each with a population of over one million and growing at a rate of over 3% annually, are emerging urban agglomerations (Appendix-A).
4. Urban World: Cities and the Rise of the Consuming Class. McKinsey Global Institute (MGI), 2017.
5. The study excluded Islamabad from the empirical analysis because it is the federally administered capital city of Pakistan. The analysis is therefore run on five cities of Punjab instead of six.
6. Primary cities (with large metropolitan areas).
7. Secondary cities (with population over five million).
8. Small cities (with population of less than 300,000 inhabitants).
9. Medium cities (with 5 million or fewer inhabitants).
10. Localization economies occur when an increased size of industry in a city leads to increased productivity. However, urbanization economies occur when a change in the size of a city leads to an increase in productivity (Marshall, 1920).
11. The details of the construction of the variables of the study are given in Appendix-III.
12. Government statistics of migration towards cities are not available.

Disclosure statement
The authors have no conflict of interest.

Source: Authors' calculation. Notes: The asterisks *** and ** denote significance at 1% and 5%, respectively.

Appendix-VII: Correlation Coefficients of Dependent and Independent Variables
Source: Authors' calculation