Urbanization and poverty in Sub-Saharan Africa: evidence from dynamic panel data analysis of selected urbanizing countries

Abstract Urbanization in Sub-Saharan Africa (SSA) is generally highlighted as a puzzle that deviates from the stylized facts in the literature. Using data from a panel of 29 urbanizing countries in SSA from 1985 to 2019, the study employs the two-step system generalized method of moments to investigate the effect of urbanization on the Poverty Headcount ratio and the Poverty Gap. The estimated urbanization elasticities of poverty indicate that, at growth rates, a 1 percentage point increase in the urbanization rate induces decreases of 0.04 and 0.05 (0.07 and 0.09) percentage points in the Poverty Headcount ratio and the Poverty Gap, respectively, in the short-run (long-run). Similarly, at levels, a 1 percent increase in the urbanization level induces decreases of 0.22 and 0.32 (0.60 and 0.68) percent in the Poverty Headcount ratio and the Poverty Gap, respectively, in the short-run (long-run). These results consistently show a stronger effect of urbanization on the depth of poverty than on the incidence of poverty. The findings reappraise the literature on the urbanization of poverty in SSA and provide a nuanced understanding of the effect of urbanization on the different classes of poverty measures. Nevertheless, the poverty reduction potential of urbanization is not automatic and requires substantial investment in public infrastructure to be realized.


Introduction
The first target of Sustainable Development Goal 1 (SDG1) is to end all forms of extreme poverty worldwide by 2030 (United Nations, 2015c). The potential of the urbanization process towards attaining this foremost SDG is widely recognized (Christiaensen & Weerdt, 2017; Glaeser, 2013; World Bank, 2009). Over the past century, urbanization has been acknowledged as one of the most important demographic mega-trends and the primary determinant of the spatial distribution of the global population. Sustainable urbanization (SDG11) is also closely connected to the economic, social, political and environmental dimensions of sustainable development (Rudd et al., 2018; United Nations, 2015c).
The stylized fact in the urban economics and development literature is that the urbanization process, through agglomeration and scale economies, induces significant increases in income and/or consumption for a large number of both rural and urban inhabitants by creating higher-productivity, and correspondingly higher-paying, non-farm employment opportunities in both urban and rural areas (Collier, 2017; Collier & Venables, 2017; Gollin, 2018; World Bank, 2009). This has been the experience of the old urbanizations of Europe and North America and the new urbanization of Asia, which were associated with the industrial revolution and the agricultural green revolution, respectively, and which led to rapid economic growth and reductions in inequality and poverty (Gollin et al., 2016, 2021; Henderson & Kriticos, 2017).
However, the urbanization process in SSA is largely seen to deviate from the stylized facts in the literature because of its association with growing inequality and worsening poverty (Castells-Quintana & Wenban-Smith, 2020; Collier, 2006; Glaeser & Henderson, 2017). For instance, SSA is the only region in the world that experienced substantial growth in the number of extreme poor, from 277.5 million in 1990 to 413.3 million in 2015 (World Bank, 2018). In particular, 3 of the top 5 countries that accounted for 50% of the world's extreme poor in 2015 are in SSA, namely Nigeria, the Democratic Republic of Congo and Ethiopia, and these are forecast to be the top 3 countries by 2030 (World Bank, 2018). Further, extreme poverty is projected to increase owing to income losses induced by the COVID-19 pandemic in the large informal sector of many SSA countries (UN-Habitat, 2020; World Bank, 2022).
As extreme poverty increasingly becomes an SSA burden, it is rightly recognized that it is in this region that the battle to reduce global extreme poverty to less than 3% by 2030 will be won or lost (World Bank, 2018, 2019). Therefore, in piecing together the poverty puzzle, the potential of the urbanization process for poverty reduction in SSA has become a key research focus and policy priority (Rudd et al., 2018; UN-Habitat, 2016).
Generally, the urbanization-poverty nexus in SSA has been described largely as a puzzle and highlighted variously as urbanization without growth (Fay & Opal, 2000), urbanization of poverty (Ravallion et al., 2007), pathological urbanization (Annez & Buckley, 2009), poor country urbanization (Glaeser, 2013) and dysfunctional urbanization (Collier & Venables, 2017). However, these popular perceptions which are extrapolated through a comparison of the urbanization experience in SSA with Europe, North America and Asia show that an understanding of the urbanization process and its economic ramifications in SSA is nascent (Glaeser & Henderson, 2017;Turok & McGranahan, 2013).
Furthermore, the paucity of literature on the poverty reduction effect of urbanization in SSA is evidenced by the relatively limited number and range of studies on the subject. To our knowledge, only a few recent studies (Castells-Quintana & Wenban-Smith, 2020; Christiaensen & Weerdt, 2017; Mahumane & Mulder, 2022) focus exclusively on the region and/or countries within SSA. Moreover, these studies mainly focus on a single measure of poverty and do not compare the effect of urbanization on different poverty measures.
This study contributes to the literature in three ways. First, it aims to address the knowledge gap on the urbanization-poverty nexus in SSA. Second, it reappraises the urbanization-poverty puzzle in SSA. Third, it provides a nuanced understanding of the effect of urbanization on the different classes of poverty measures, namely the Poverty Headcount ratio ($P_0$) and the Poverty Gap ($P_1$), to aid policy focus in SSA. The study employs the system generalized method of moments (SYS-GMM) to estimate and compare the urbanization elasticities for the Poverty Headcount ratio (incidence of poverty) and the Poverty Gap (depth of poverty) at both levels and growth rates, in order to ascertain which effect is stronger in the short-run, in the long-run, or in both.
The rest of the paper is organized as follows. The related literature is reviewed in Section 2. The data sources, definitions and empirical strategy are discussed in Section 3. The results of the study are presented and discussed in Section 4. The conclusions and recommendations for policy consideration are presented in Section 5.

Related literature
Generally, the spatial distribution of poverty worldwide shows two main distinctive patterns. Firstly, poverty is overwhelmingly a rural phenomenon (Nguyen, 2014;World Bank, 2011;World Bank & IMF, 2013). For instance, the global incidence of poverty in rural areas is 17.2% as compared to 5.3% in the urban areas and despite the increasing share of poverty in urban areas, caused mainly by the poor being the most rapidly urbanizing segment of the population, it will not be until the middle of the century that the rural and urban shares of poverty will converge (McGranahan, 2017;Ravallion et al., 2007).
Secondly, the incidence of poverty declines steadily from rural areas to smaller towns and cities to metropolitan areas (Ferre et al., 2012;Lanjouw & Marra, 2018;Tripathi, 2013b;World Bank & IMF, 2013). This poverty city-size gradient results from the lower per-capita provision of public infrastructure and basic services in smaller towns and cities relative to big cities and metropolitan areas (Castells-Quintana & Wenban-Smith, 2020). Also, the rural poor overwhelmingly migrate to nearby smaller towns and cities thereby resulting in declining per-capita access to basic public services (World Bank & IMF, 2013).
In general, the impact of urbanization on poverty can be categorized into two rounds of effects. The first-round effects occur in urban areas and manifest in several ways. One is the provision of employment opportunities in urban areas for the usually abundant low-skilled and unskilled labour from rural areas at comparatively higher levels of productivity and remuneration (Christiaensen & Weerdt, 2017; Liddle, 2017; UN-Habitat, 2016). Two, the rural poor now living in urban areas are able to access essential public services and infrastructure such as education, electricity, healthcare, potable water, sanitation, housing, transport, capital and others required to improve living standards, which are not adequately and affordably supplied in rural areas (Liddle, 2017; UN-Habitat, 2020; World Bank, 2009). Three, surrounding rural areas provide a market for urban products (Da Mata et al., 2015) and supply a significant proportion of urban food needs and cooking fuel such as fuel wood and charcoal (Broto et al., 2020; Mahumane & Mulder, 2022).
The second-round impact of urbanization on poverty occurs in rural areas through several channels. One, improved urban-rural linkages result in an increased urban market for rural products, leading to increased rural income and agricultural productivity via specialization and scale economies (Emran & Shilpi, 2012; UN-Habitat, 2016, 2020). Two, urbanization induces increased rural non-farm employment opportunities, which are associated with higher returns to labour and a lower incidence of poverty than rural agriculture (Deichmann et al., 2009; Fafchamps & Shilpi, 2005; Foster & Rosenzweig, 2004). Three, remittances from urban to rural areas increase rural income and consumption (Cali & Menon, 2013; UN-Habitat, 2016). Four, return migration by those who have acquired capital and skills in urban areas increases the productivity of the rural economy (UN-Habitat, 2020; World Bank, 2009).
The results from several empirical studies confirm the poverty reduction effects of urbanization. In their study on the urbanization of poverty for 87 developing countries covering a period beginning in 1993, Ravallion et al. (2007) found that, of the 5.2% decline in aggregate poverty during the period, urbanization accounted for 1.04%. The study by Nguyen (2014) in Vietnam over the period 2006-2008 showed that a 1% increase in urbanization raised rural households' per-capita income and per-capita consumption expenditure by 0.54% and 0.39%, respectively, and reduced the rural household poverty rate by 0.17%. Also, the study by Datt and Ravallion (2009) in India from 1951 to 2006 showed that the poverty reduction potential of urbanization is unmatched by any productivity increase in the rural sector. The study found that the poverty reduction impact of urban economic growth far exceeded that of rural economic growth for all three classes of FGT poverty measures at the national, urban and rural levels.
The study by Tripathi (2013a) for 52 large Indian cities with 750,000 or more inhabitants between 1950 and 2025 found that urban economic growth significantly reduces growth in the urban poverty headcount ratio. In a similar study using data from the 61st Round of the Indian National Sample Survey, Tripathi (2013b) found that a large urban population and higher city economic growth each induce a reduction in all three FGT classes of poverty measures.
In SSA, the longitudinal study by Christiaensen and Weerdt (2017) in Tanzania between 1991 and 2010 found extreme poverty to be virtually non-existent among city migrants, compared with 16% for town migrants, 30% for off-farm migrants and 42% for non-migrant rural farmers. Altogether, the average income of migrants to cities increased by 206%, compared with 36% for non-migrant rural farmers.
Additionally, several recent studies indicate a non-linear effect of urbanization on poverty. The study by Ha et al. (2021) in Vietnam using data from 2006 to 2016 showed a U-shaped effect of urbanization on the poverty headcount ratio, with estimated urbanization thresholds of 43.68% and 40.19% in the static and dynamic models, respectively. Also, the study by Wang et al. (2022) on the effect of urbanization on rural and urban poverty, using data from up to 19 provinces in China from 2000 to 2017, found a U-shaped relationship for the poverty headcount, poverty gap and poverty intensity in both rural and urban areas. Furthermore, the study by Mahumane and Mulder (2022) on the effect of urbanization on household energy poverty in Mozambique between 2003 and 2015 showed that the effect on energy consumption poverty is U-shaped and that on energy expenditure poverty is N-shaped.

Data
The data for the study are sourced from three main online databases, including the Penn World Tables Version 10. Three data sampling criteria are adopted. First, the study follows Henderson (2003a) and adopts an urbanization criterion, which restricts the sample to the 38 countries that urbanized positively throughout the study period. Second, in line with prior literature (Ferre et al., 2012; Henderson et al., 2013; UN-DESA, 2019a), a population criterion is employed, which retains only the 34 countries with at least 300,000 inhabitants in 1960. The rationale for this criterion is that urban agglomeration economies are far less pronounced in countries with smaller populations. Third, a data availability and quality criterion restricts the sample to 29 countries. In line with prior literature, the data are sub-divided into five-year intervals to purge the variables of wide short-term fluctuations and cyclical effects (Brülhart & Sbergami, 2009; Castells-Quintana, 2017; Chauvin et al., 2017; Fay & Opal, 2000; Henderson, 2000; Sulemana et al., 2019) as well as to capture sufficient variation (Henderson, 2003a, 2003b). The Foster et al. (1984) class of decomposable poverty measures (FGT), covering the Poverty Incidence ($P_0$) and the Poverty Gap ($P_1$), is used to measure, respectively, the breadth and depth of poverty. Table A presents the definitions, expected signs and data sources for the variables of the study.
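To make the two poverty measures concrete, the following minimal sketch computes the FGT index for the headcount (alpha = 0) and the poverty gap (alpha = 1) from a list of incomes. The function name `fgt`, the income values and the poverty line are hypothetical and chosen only for illustration; the study itself takes both indices from published country-level data.

```python
def fgt(incomes, poverty_line, alpha):
    """Foster-Greer-Thorbecke index P_alpha for a list of incomes.

    alpha = 0 gives the headcount ratio (share of people below the line);
    alpha = 1 gives the poverty gap (average shortfall as a share of the line).
    """
    if poverty_line <= 0:
        raise ValueError("poverty_line must be positive")
    if not incomes:
        raise ValueError("incomes must not be empty")
    total = 0.0
    for y in incomes:
        if y < poverty_line:
            # Normalized shortfall of a poor person; non-poor contribute zero.
            shortfall = (poverty_line - y) / poverty_line
            total += shortfall ** alpha
    return total / len(incomes)


# Hypothetical daily incomes and poverty line (illustration only).
incomes = [0.5, 1.0, 1.5, 3.0, 4.0]
line = 2.0
p0 = fgt(incomes, line, 0)  # incidence (breadth) of poverty
p1 = fgt(incomes, line, 1)  # depth of poverty
```

In this toy sample three of five people are below the line, so the headcount measures how many are poor, whereas the gap also reflects how far below the line they fall.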

Descriptive statistics
The summary statistics of the key variables as presented in Table 1 show considerable variations within and among countries. Noteworthy, the Poverty Headcount ratio (Poverty Gap) ranges from a minimum of 3% (1%) to 95% (65%) with a mean value of 54% (24%). Also, urbanization level (rate) with a mean of 34% (2%) ranges from a minimum of 5% (0.03%) to a maximum of 89% (12%).

Empirical model
The study empirically investigates both the short-run and long-run effects of urbanization on poverty in SSA. The urban economics and new economic geography literature considers the existence of a large variety of agglomeration economies to be the most important feature of the urban spatial economy (Fujita et al., 2003). Consequently, the study follows prior studies (Castells-Quintana, 2017; Fay & Opal, 2000; Henderson & Kriticos, 2017; Nguyen & Nguyen, 2018) and adopts an urbanization variable as a proxy for urban agglomeration economies. In particular, the proportion of a country's population living in areas described as cities by national statistics (urbanization level) and the change in the urbanization level (urbanization rate) are used as alternative proxy measures of urban agglomeration economies, one at a time.
In line with the standard approach in the literature, where both initial conditions and interaction effects are considered (Bourguignon, 2003; Christiaensen et al., 2013; Fosu, 2009; Kalwij & Verschoor, 2007), a Cobb-Douglas expenditure function is specified (Equation 1). The hypothesized relationship in Equation 1 is that the poverty index of country $i$ over period $t$, $P_{it}$, is a function of the urbanization rate (level) $U_{it}$; a vector of control variables $K_{it}$; and the set of interaction terms $R_{it}$. The initial levels of per-capita GDP and inequality and the changes in per-capita GDP and inequality are used as the set of control variables (Dollar et al., 2016; Dollar & Kraay, 2002; Fosu, 2017; Kanbur, 2005). For the interaction terms, the level of urbanization is interacted with the levels of per-capita GDP and inequality (Wang et al., 2022). Figure 1 presents the analytical framework of the study.
The Cobb-Douglas functional specification of Equation 1 is chosen because it is easily log-transformed to obtain the urbanization elasticity parameters for estimation. The log-linearization provides additional estimation benefits. First, it transforms the non-linear equation into a linear model, so that the parameters can be estimated using linear regression methods and interpreted easily. Second, the log-transformation reduces skewness in the data, which may be caused by outliers that could bias the estimated results. Third, it tends to reduce heteroscedasticity, bringing the error terms closer to being homoscedastic, uncorrelated and normally distributed.
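For illustration, a Cobb-Douglas specification consistent with the definitions of $P_{it}$, $U_{it}$, $K_{it}$ and $R_{it}$ could be written as follows. This is a sketch rather than the study's exact Equation 1: the scale constant $A$, the indices $j$ and $m$, and the exponents $\gamma_j$ and $\delta_m$ are assumed notation, and only $\beta_1$ is named in the text.

```latex
% Illustrative Cobb-Douglas form (A, \gamma_j, \delta_m are assumed notation)
\[
P_{it} = A \, U_{it}^{\beta_1} \prod_{j} K_{j,it}^{\gamma_j} \prod_{m} R_{m,it}^{\delta_m}
\]
% Taking natural logarithms makes the model linear in its parameters
\[
\ln P_{it} = \ln A + \beta_1 \ln U_{it}
           + \sum_{j} \gamma_j \ln K_{j,it}
           + \sum_{m} \delta_m \ln R_{m,it}
\]
```

Under this form each exponent is directly an elasticity, which is why the multiplicative specification is convenient for estimating the urbanization elasticity of poverty.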
Accordingly, the natural logarithm is taken on both sides of Equation 1 and the result is rewritten in dynamic form to yield a first-order autoregressive [AR(1)] model to be estimated (Equation 2), where $i = 1, \ldots, N$ indexes countries and $t = 1, \ldots, T$ indexes periods.
The random disturbance term $\mu_{it}$ in the dynamic panel data (DPD) model of Equation 2 follows a one-way error component model of the form:
$$\mu_{it} = \upsilon_i + \varepsilon_{it} \qquad (3)$$
where $\upsilon_i$ denotes the country-specific effects and $\varepsilon_{it}$ is the usual stochastic error term. Equation 3 is a random model in which the error terms satisfy $\upsilon_i \sim \mathrm{IID}(0, \sigma^2_{\upsilon})$ and $\varepsilon_{it} \sim \mathrm{IID}(0, \sigma^2_{\varepsilon})$ and are independent of each other, such that $E(\upsilon_i) = 0$, $E(\varepsilon_{it}) = 0$ and $E(\upsilon_i \varepsilon_{it}) = 0$. Also, the explanatory variables $X_{it}$ in Equation 2 are all orthogonal to the error terms $\upsilon_i$ and $\varepsilon_{it}$ for all $i$ and $t$, such that
$$E(X_{it}\,\upsilon_i) = E(X_{it}\,\varepsilon_{it}) = 0 \qquad (4)$$
Since both the dependent variable and the main independent variable in Equation 2 are in natural logarithms, the coefficient of the main independent variable, namely $\beta_1$, is the urbanization elasticity of poverty.
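As a hedged sketch of what the dynamic specification in Equation 2 may look like, the block below writes the log-transformed model with a lagged dependent variable. Lower-case letters denote natural logarithms; $\alpha$ (the coefficient on the lagged poverty index) and $\beta_1$ follow the names used in the text, while the coefficient vectors $\gamma$ and $\delta$ and the disturbance symbol $\mu_{it}$ are assumed notation.

```latex
% Illustrative AR(1) dynamic panel form (\gamma, \delta, \mu_{it} are assumed notation)
\begin{align*}
p_{it} &= \alpha \, p_{i,t-1} + \beta_1 u_{it} + \gamma' k_{it} + \delta' r_{it} + \mu_{it},
        \qquad i = 1, \ldots, N, \quad t = 1, \ldots, T \\
\mu_{it} &= \upsilon_i + \varepsilon_{it} \\
\beta_1 &= \frac{\partial \ln P_{it}}{\partial \ln U_{it}}
        \quad \text{(urbanization elasticity of poverty)}
\end{align*}
```

The last line restates why a log-log specification delivers an elasticity: the coefficient measures the percent change in the poverty index associated with a one percent change in urbanization, holding the other regressors fixed.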

The case for the generalized method of moments
The application of the GMM methodology for this study is based on four principal reasons. First, the primary condition for the use of GMM exists since the number of countries (N = 29) is considerably higher than the number of time periods in each cross section (T = 7). Thus N >T. Second, the poverty indices are persistent. In particular, the correlation between the Poverty Headcount ratio (Poverty Gap) and its first lag is 0.8713 (0.8627) which is significant at 1% level. These coefficients are above the threshold level of 0.8000 required to establish the persistence of a variable (Asongu & Acha-Anyi, 2019;Tchamyou & Asongu, 2017). Third, the GMM preserves the cross-country variations in the panel data.
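The persistence check described above can be sketched as a pooled correlation between each observation and its own first lag within a country. The panel values and helper names below are hypothetical; the 0.80 threshold is the rule of thumb quoted in the text, and the sketch does not compute the significance level.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lag_correlation(panel):
    """Correlate each value with its first lag, pairing only within a country."""
    current, lagged = [], []
    for series in panel.values():
        for t in range(1, len(series)):
            current.append(series[t])
            lagged.append(series[t - 1])
    return pearson(current, lagged)


def is_persistent(panel, threshold=0.80):
    """Rule-of-thumb persistence flag based on the first-lag correlation."""
    return lag_correlation(panel) > threshold


# Hypothetical headcount ratios (%) for three countries over five periods.
panel = {
    "A": [70.0, 68.0, 65.0, 61.0, 58.0],
    "B": [40.0, 41.0, 38.0, 35.0, 33.0],
    "C": [55.0, 52.0, 50.0, 49.0, 45.0],
}
r = lag_correlation(panel)
persistent = is_persistent(panel)
```

Pairs are formed within each country so that the last observation of one country is never treated as the lag of the first observation of the next.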
Fourth, there is a problem of endogeneity in Equation 2: since $p_{it}$ is a function of $\upsilon_i$, $p_{i,t-1}$ is also a function of $\upsilon_i$, and the lagged regressor $p_{i,t-1}$ is therefore correlated with the disturbance term $\mu_{it}$. GMM addresses this endogeneity issue in several ways. It mitigates the effects of unmeasured, time-invariant, country-specific unobserved heterogeneity (Asongu et al., 2020). It also accounts for simultaneity in the explanatory variables by using lagged values of the dependent variable and the regressors as instruments, in differences or in both differences and levels (Bond & Windmeijer, 2002; Brülhart & Sbergami, 2009; Tchamyou et al., 2019). GMM also uses the orthogonality conditions to obtain efficient and consistent estimates even when heteroskedasticity of arbitrary form is present (Baum et al., 2003).
To illustrate the GMM procedure, consider Equation 2 in levels, written in a general form as Equation 5, where $p_{it}$ represents the dependent variable and $x_{it}$ the right-hand-side variables of Equation 2, with $a_0$ and $\beta_0$ being parameters. The difference GMM (DIF-GMM) involves taking the first difference of Equation 5 (Equation 6), which can be rewritten using the difference operator $\Delta$ (Equation 7). First differencing eliminates the country-specific effects term $\upsilon_i$, whose presence may otherwise result in incorrect model specification. However, the differenced lagged dependent variable remains correlated with $\Delta\varepsilon_{it}$, so it has to be instrumented. The system GMM (SYS-GMM) was proposed to address the weak-instrument problem of the DIF-GMM by combining instruments in first differences and levels (Bowsher, 2002; Judson & Owen, 1999; Roodman, 2009a). The GMM procedure also addresses serial correlation and endogeneity through the use of sufficient lags of the dependent variable and the first-differenced errors (Arellano & Bond, 1991; Arellano & Bover, 1995; Blundell & Bond, 1998).
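The differencing step can be sketched as follows. This is an illustrative reconstruction under stated assumptions rather than the study's exact Equations 5 to 7: $a_0$ is treated here as an intercept, $x_{it}$ collects all right-hand-side variables (including the lagged dependent variable), and the disturbance is the one-way error component $\upsilon_i + \varepsilon_{it}$.

```latex
% Illustrative derivation (treating a_0 as an intercept is an assumption)
\begin{align*}
p_{it} &= a_0 + \beta_0 x_{it} + \upsilon_i + \varepsilon_{it}
       && \text{levels, cf. (5)} \\
p_{it} - p_{i,t-1} &= \beta_0 \left( x_{it} - x_{i,t-1} \right)
       + \left( \upsilon_i - \upsilon_i \right)
       + \left( \varepsilon_{it} - \varepsilon_{i,t-1} \right)
       && \text{first difference, cf. (6)} \\
\Delta p_{it} &= \beta_0 \, \Delta x_{it} + \Delta \varepsilon_{it}
       && \text{difference operator, cf. (7)}
\end{align*}
```

The time-invariant terms $a_0$ and $\upsilon_i$ cancel in the second line, which is the sense in which first differencing removes the country-specific effects.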

Choosing between the difference and system GMM
In choosing between the DIF-GMM and SYS-GMM, the study follows the methodology outlined by Bond (2002). It involves estimating Equation 2 using pooled OLS, Fixed Effects (FE) and DIF-GMM and comparing the respective estimates of $\alpha$, the coefficient on the lagged dependent variable. The OLS and FE estimates are considered, respectively, an upper bound and a lower bound. Since the lagged dependent variable is expected a priori to be positively correlated with the disturbance term $\mu_{it}$, OLS will bias the estimate of $\alpha$ upward whereas FE will bias it downward, so a consistent estimate of the true parameter should lie within or close to this range (Bond, 2002; Roodman, 2009b).
The results from the alternative estimations of Equation 2, with the Poverty Headcount ratio and the Poverty Gap as the respective dependent variables and for both the rates and levels of urbanization, are presented in Table 2. In the table, the coefficients of the lagged dependent variables from the DIF-GMM1 estimations are closer to those of the FE estimations, implying that the DIF-GMM estimator is biased downward and hence that the SYS-GMM estimator is preferable in all cases.
Furthermore, in line with the convention in most applied work using GMM estimations, this study estimates Equation 2 and interprets the results using the two-step system GMM (SYS-GMM2). Several simulation studies have shown the efficiency gains from using SYS-GMM2, including controlling for heteroskedasticity and cross-correlation (Bond & Windmeijer, 2002; Roodman, 2009a, 2009b; Windmeijer, 2005).
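The bounds comparison can be sketched as a small decision rule. The function name and the coefficient values below are hypothetical, and the rule simply encodes the reading used in this section: a DIF-GMM estimate that sits nearer the FE lower bound than the OLS upper bound is taken as a sign of downward bias, which favours SYS-GMM.

```python
def prefer_system_gmm(alpha_ols, alpha_fe, alpha_dif):
    """Return True when the DIF-GMM estimate looks biased towards the FE bound.

    alpha_ols: pooled OLS estimate of the lagged-dependent coefficient (upper bound)
    alpha_fe:  fixed-effects estimate (lower bound)
    alpha_dif: difference-GMM estimate being assessed
    """
    if alpha_fe > alpha_ols:
        raise ValueError("expected the FE estimate not to exceed the OLS estimate")
    distance_to_fe = abs(alpha_dif - alpha_fe)
    distance_to_ols = abs(alpha_dif - alpha_ols)
    # Closer to (or below) the FE bound suggests weak instruments in differences.
    return distance_to_fe <= distance_to_ols


# Hypothetical coefficients for illustration only (not taken from Table 2).
decision = prefer_system_gmm(alpha_ols=0.90, alpha_fe=0.45, alpha_dif=0.48)
```

A stricter variant could require the estimate to fall at or below the FE value; the nearest-bound comparison is used here because it mirrors the wording of the text.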

GMM specification tests
Following the convention in the literature, the study employs two main standard GMM specification tests. These are the serial correlation tests, namely the first-order [AR(1)] and second-order [AR(2)] tests, and the test of the validity of instruments (Arellano & Bond, 1991). Generally, using lagged variables as moment conditions can lead to bias owing to the possibility of over-fitting the endogenous regressors (Baltagi, 2005). Consequently, the study follows Bond (2002) and Roodman (2009a, 2009b) and uses both the Sargan test and the Hansen test as complementary tests of full instrument validity as well as of the structural specification of the model. Additionally, the collapsed-instrument approach of Roodman (2009a, 2009b) is adopted to account for cross-sectional dependence (Asongu & Acha-Anyi, 2019; Tchamyou et al., 2019) and to prevent the instrument proliferation that weakens the Hansen J-test (Andersen & Sørensen, 1996; Bowsher, 2002).
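How these tests are read can be sketched as a simple checklist over reported p-values. The dictionary keys, the function name and the example p-values are hypothetical, and the 5% level is an assumption; in each case the specification is supported when the null hypothesis (no second-order serial correlation, or valid instruments) is not rejected.

```python
def check_gmm_diagnostics(p_values, level=0.05):
    """Flag which post-estimation GMM checks pass at the given level.

    p_values must contain 'ar2', 'sargan', 'hansen' and 'diff_hansen'.
    A check passes when its null hypothesis is NOT rejected (p > level).
    """
    required = ("ar2", "sargan", "hansen", "diff_hansen")
    missing = [name for name in required if name not in p_values]
    if missing:
        raise KeyError(f"missing p-values: {missing}")
    flags = {name: p_values[name] > level for name in required}
    # Computed from the four individual checks before the summary key is added.
    flags["all_passed"] = all(flags.values())
    return flags


# Hypothetical p-values for illustration only (not taken from the results tables).
report = check_gmm_diagnostics(
    {"ar2": 0.41, "sargan": 0.22, "hansen": 0.18, "diff_hansen": 0.35}
)
```

The AR(1) test is left out of the checklist because first-order serial correlation in the differenced errors is expected by construction.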

GMM identification, simultaneity, and exclusion restrictions
Fundamental to the GMM strategy is the identification, simultaneity and exclusion restrictions. First, identification involves defining the three variable categories, namely the dependent variable, the endogenous explanatory variables and strictly exogenous variables (Asongu et al., 2020;Tchamyou et al., 2019). The outcome variables are the Poverty Headcount ratio and the Poverty Gap. The identified strictly exogenous variables are years whereas the explanatory variables, namely urbanization (rate and level) and the control variables are the endogenous variables (Asongu & Acha-Anyi, 2019;Tchamyou, 2019). Implicitly, the strictly exogenous variables are assumed to affect the outcome variable through the endogenous variables (Asongu et al., 2020).
Second, the issue of simultaneity, such as the inclusion of both the urbanization term ($u_{it}$) and the poverty term ($p_{it}$) in Equation 2, is addressed through the instrumentation process of the SYS-GMM (Arellano & Bover, 1995; Bond & Windmeijer, 2002; Roodman, 2009b). Third, the exclusion restrictions involve checking the validity of a subset of instruments (Roodman, 2009a, 2009b). Within the GMM strategy, the Difference in Hansen Test is applied, and the validity of the exclusion restrictions is confirmed when the null hypothesis in relation to the instrumental variables is not rejected (Asongu & Acha-Anyi, 2019; Asongu et al., 2020).

Correlations among key variables
The partial correlation matrix among the key variables is reported in Table 3. The correlation coefficient between the urbanization level and the Poverty Headcount ratio (Poverty Gap) is −0.543 (−0.431) and significant at the 1% level, which is consistent with a poverty reduction effect of urbanization. Similarly, the correlation between GDP per-capita and the Poverty Headcount ratio (Poverty Gap) is −0.603 (−0.495) and significant at the 1% level. The correlation coefficients between the Gini indices and the poverty indices are all positive, although only that between the Gini Index and the Poverty Gap is significant at the 1% level.
Furthermore, the scatterplots in Figures 2 and 3 show strong negative correlations between the level of urbanization and the poverty indices in SSA. This is consistent with the poverty reduction effect of urbanization. However, because these graphical results include no control variables and no diagnostic tests, little inference is drawn from them.

Effects of urbanization on poverty
The results for the SYS-GMM2 estimations of Equation 2 are presented in Tables 4, 5 and 6. All estimations were conducted at the 95% confidence level, and the maximum lag length of the variables and instruments is restricted to three (3), which according to the simulation studies of Bowsher (2002) maximizes the power of the Sargan test. Four criteria are used to evaluate the validity of each estimation (Asongu et al., 2020; Tchamyou et al., 2019). The first is the autocorrelation test. The AR(1) statistic has the preferred negative sign and, most importantly, the AR(2) statistic is not significant in any estimation. This implies that the lagged terms of the respective dependent variables used as instruments are exogenous and therefore valid instruments. It also confirms the appropriateness of the DPD models used for this study. The second is the test of full instrument validity. None of the p-values associated with the Sargan and Hansen tests of over-identifying restrictions is significant, which confirms the validity of the full instrument set used in each estimation. In particular, the p-values for the J-statistics are within the generally acceptable range of 0.10-0.60, with that for the Poverty Gap being within the "Goldilocks range" of 0.10-0.25 (Roodman, 2009a). The third is the test of the validity of instrument subsets. None of the p-values for the Difference in Hansen Test is significant, which confirms the exogeneity of the instrument subsets used. Fourth, the overall significance of each regression, as indicated by the F-test statistic, holds at the 1% level. The results from these standard diagnostic tests confirm the validity of the structural specifications and moment conditions used in estimating Equation 2.
Notes: */**/*** indicate significance at, respectively, the 10%/5%/1% levels. The standard errors for the estimated parameters are in parentheses. For the F, AR(1), AR(2), Sargan, Hansen and Difference in Hansen tests, the p-values are in parentheses. The panel data cover the period 1985-2019 and the variables are calculated over 5-year intervals. Also, (1) = short-run estimates; (2) = long-run estimates.
The estimated results from Tables 4 and 5 indicate the poverty reduction effect of urbanization in SSA. One, as indicated by the urbanization elasticities of poverty, the poverty reduction effect of urbanization is stronger in both magnitude and significance for the level of urbanization than for the rate of urbanization for the same poverty index. For example, for $P_1$ in Table 5, the estimated short-run and long-run urbanization level (rate) elasticities are −0.32 (−0.05) and −0.68 (−0.09), significant at the 1% (5%) level in both cases. Two, the poverty reduction effect of urbanization is stronger in the long-run than in the short-run for the same poverty index. For instance, for $P_0$ in Table 4, the long-run (short-run) magnitudes of the urbanization elasticity variables ln(Urbanization level) and ln(Urbanization rate) are, respectively, −0.60 (−0.22) and −0.07 (−0.04). A similar observation pertains to $P_1$ in Table 5. These elasticities imply that the poverty reduction effect of urbanization amplifies with time.
Three, in general, both the growth rate and the initial level of per-capita GDP have significant poverty reduction effects. In particular, from Table 5, the coefficients of the variables ln(per-capita GDP growth) and ln(Initial per-capita GDP) are negative and significant in both the short-run and the long-run for $P_1$. These results are in line with the literature and specifically support the findings of Bourguignon (2003), Dollar et al. (2016), Dollar and Kraay (2002) and Fosu (2009, 2017b) that a high level of per-capita GDP and/or a high growth rate of per-capita GDP is a boon to poverty reduction. Moreover, the general significance of the variable Squared(per-capita GDP growth) confirms the existence of a non-linear relationship between GDP per-capita and poverty. Further, the (absolute) magnitude of the growth elasticity of poverty, ln(per-capita GDP growth), increases with time. For $P_1$ in Table 5, it increases from −1.48 (−1.13) in the short-run to −2.67 (−2.36) in the long-run for the urbanization rate (level).
Four, the results generally confirm the deleterious effect of income inequality on poverty. In particular, the variable ln(Inequality growth) is significant throughout for $P_1$ in Table 5. However, the results for the initial level of inequality, although carrying the expected positive coefficients, are significant only for $P_0$ in the long-run. On the whole, these results support the findings of Fosu (2009, 2017) and Kalwij and Verschoor (2007) that initial and/or growing inequality hurts poverty reduction efforts, and they contrast with the findings of Dollar and Kraay (2002) and Dollar et al. (2016) that growth in the income of the poor is uncorrelated with both the initial level of and the growth in income distribution. Furthermore, the variable Squared(Inequality growth), being generally significant for both $P_0$ and $P_1$ in Tables 4 and 5, confirms the non-linear relationship between inequality and poverty.
Five, the respective roles of GDP per-capita and Inequality levels in moderating the effect of urbanization on poverty are as expected. The significance of respective positive and negative coefficients of the interaction effects variables, namely (Urbanization level*per-capita GDP) and (Urbanization level*Inequality Level) in both Tables 4 and 5 show that the poverty reduction effect of urbanization is amplified by the level of GDP per-capita and attenuated by the level of Inequality. The former results confirm the synergistic complementary relationship between the spatial agglomeration of economic activities and economic growth.
Six, time effects are significant and increase in (absolute) magnitude for both poverty indices, a result that corroborates the generally observable increase in the poverty reduction effects of the significant variables in the long-run. Table 6 presents a summary of the urbanization elasticities of poverty estimated from Equation 2. Estimations at growth rates indicate that a 1 percentage point increase in the urbanization rate induces a decrease of 0.04 (0.05) and 0.07 (0.09) percentage points in the Poverty Headcount (Poverty Gap) in the short-run and the long-run, respectively. Similarly, estimations at levels indicate that a 1 percent increase in the urbanization level induces a decrease of 0.22 (0.32) and 0.60 (0.68) percent in the Poverty Headcount (Poverty Gap) in the short-run and the long-run, respectively. Clearly, urbanization has a stronger effect in reducing the depth of poverty ($P_1$) than the incidence of poverty ($P_0$) in both the short-run and the long-run. Furthermore, the poverty reduction effects of urbanization at both growth rates and levels are far more pronounced in the long-run than in the short-run.
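To give a sense of scale for the level elasticities, the sketch below applies them to the sample means reported in the descriptive statistics (a Poverty Headcount ratio of 54% and a Poverty Gap of 24%). Using the sample means as a baseline and the linear approximation for a small change are assumptions made for illustration, and the function name is hypothetical.

```python
def apply_elasticity(baseline, elasticity, pct_change_in_driver):
    """Approximate the new value after a small percent change in the driver.

    An elasticity of -0.22 means a 1% rise in the driver lowers the outcome
    by about 0.22% of its baseline value (first-order approximation).
    """
    return baseline * (1.0 + elasticity * pct_change_in_driver / 100.0)


# Sample means from the descriptive statistics, used as an illustrative baseline.
MEAN_HEADCOUNT = 54.0  # Poverty Headcount ratio, percent
MEAN_GAP = 24.0        # Poverty Gap, percent

# Urbanization-level elasticities reported in the text.
LEVEL_ELASTICITY = {
    ("P0", "short"): -0.22,
    ("P0", "long"): -0.60,
    ("P1", "short"): -0.32,
    ("P1", "long"): -0.68,
}

# Effect of a 1% increase in the urbanization level on each index.
headcount_short = apply_elasticity(MEAN_HEADCOUNT, LEVEL_ELASTICITY[("P0", "short")], 1.0)
headcount_long = apply_elasticity(MEAN_HEADCOUNT, LEVEL_ELASTICITY[("P0", "long")], 1.0)
gap_short = apply_elasticity(MEAN_GAP, LEVEL_ELASTICITY[("P1", "short")], 1.0)
gap_long = apply_elasticity(MEAN_GAP, LEVEL_ELASTICITY[("P1", "long")], 1.0)
```

Because the elasticities are percent changes, the larger proportional response of the Poverty Gap does not imply a larger change in percentage points, since its baseline is lower than that of the headcount.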

Summary and conclusions
The study investigated the poverty reduction effect of urbanization for a panel of 29 urbanizing countries in SSA from 1985 to 2019. The study employed the SYS-GMM2 to estimate the growth rates and levels of urbanization elasticities of poverty. The results show that urbanization within the selected SSA countries has a significant effect in reducing both the incidence of poverty (Poverty Headcount ratio) and depth of poverty (Poverty Gap) with the latter effect being consistently stronger than the former at both growth rates and levels in the short-run and long-run. Overall, the findings of this study reappraise the literature on the urbanization of poverty in SSA as well as provide a nuanced understanding of the effect of urbanization on the different class of poverty measures.
The findings of this study have several policy implications. First, because of its potential for poverty reduction, policy makers in SSA should fully embrace urbanization rather than adopt partial exclusionary measures to prevent it. Second, the full benefits of the urbanization process cannot be reaped automatically. This calls for long-term urban planning and substantial investment in the provision of urban public infrastructure and services such as roads, water, health, education, telecommunication and others that are mostly lacking in the newly emerging and contiguous urban areas in SSA. Third, promoting (sustainable) urbanization must be made part and parcel of the process of nurturing economic growth and eradicating poverty in SSA. Fourth, to successfully manage urbanization and its economic consequences in SSA, there is a need for continuous policy coordination across national and sub-regional borders. Fifth, promoting sustainable urbanization in SSA requires legal provision for, and effective enforcement of, private property rights over the land and buildings that constitute the urban built environment.
An obvious weakness of this study is its limited scope. For instance, the stylized facts of the spatial distribution of poverty worldwide show a declining incidence from rural areas to smaller towns and cities to metropolitan areas; however, urban poverty in many SSA countries is disproportionately concentrated in the largest cities (World Bank, 2011; World Bank & IMF, 2013). This phenomenon, which was not examined in this study, presents an avenue for future research.

Funding
The authors received no direct funding for this research.