Transit-oriented development and gentrification: a systematic review

ABSTRACT The last two decades have witnessed a growing trend towards transit-oriented development (TOD) as a critical approach for achieving sustainable mobility. However, some analysts and community activists have expressed concerns that TOD could induce gentrification and potential concomitant low-income group displacements. This paper presents a systematic review of 35 quantitative research-based studies presenting evidence on gentrification outcomes resulting from transit-based interventions, published between 2000 and 2018. To our knowledge, this is the first systematic review on this topic and thus provides a useful synthesis of current empirical evidence on transit-induced gentrification. Although there is some evidence supporting the transit-induced gentrification hypothesis, methodological flaws render many of the studies’ conclusions highly questionable. The findings suggest that gentrification is more closely associated with existing local dynamics, built environment attributes, and accompanying policies than transit-oriented development. In its critical analysis of research approaches, this paper warns that the incorporation of several sources of bias into study designs may engender a number of misinterpretations, thus ultimately leading to misguided conclusions and policies.


Introduction
Since the 1987 Brundtland Commission report and the 1992 Rio Conference, the sustainability concept has been widely used as a policy guide to develop strategies for more reasonable uses of renewable resources. Meanwhile, awareness has been growing of the contribution of urban transport and mobility to major environmental externalities, and demand management programmes have striven to reduce the mobility footprint through modal shifts and the reduction of automobile dependence (Banister, 2008).
Transit-oriented development (TOD) is among the most popular interventions for reducing the mobility footprint, thus making it a critical component of smart growth and new urbanism. Relying on compact, transit-oriented growth patterns around train, light-rail or metro stations, TOD aims to promote a modal shift and reduce automobile dependence, while also enhancing neighbourhoods' liveability, which is a multidimensional and rather vague concept encompassing healthy, safe, and comfortable environments, quality and aesthetics of public spaces, as well as expanded social and economic opportunities (for a comprehensive definition, see Southworth, 2003). However, TOD might also induce gentrification and concomitant low-income group displacement, and some researchers and public advocates have expressed concerns regarding the social costs and fairness of sustainability-driven approaches (Rayle, 2015; Revington, 2015). Since TOD creates conditions for real estate investment, land values are expected to increase, thus potentially leading to restrictions for low-income groups with regard to accessing housing and maintaining their residential locations. In this manner, the improved local economics might engender additional displacement cases and the replacement of low-income families by better-off households, which raises serious equity concerns.
However, little is known about the actual equity- and gentrification-related outcomes resulting from TOD initiatives, and limited empirical information has been published on housing equity issues related to transit in a broader sense. A comprehensive evidence-based synthesis of the links between transit and gentrification is needed to inform policy and improve practice. Accordingly, we have undertaken a systematic review of the quantitative literature on TOD distributional outcomes to answer one central question: is there evidence that TOD contributes to neighbourhood ascent and the displacement of low-income groups? We carefully searched and reviewed a total of 35 studies published between 2000 and 2018. This paper presents a narrative synthesis of the available empirical evidence.
Transit-oriented development: principles, benefits, and caveats

Rationale and benefits of TOD
Growing awareness of the contribution of urban transport and mobility to major environmental externalities such as pollution, energy dependency, land consumption, congestion, and public health challenges has paved the way to a clear political objective. This objective is founded on a set of practices with two primary and closely related purposes based on densities and travel behaviours (Banister, 2008): (i) to promote a modal shift, reduce automobile dependence, and distances travelled; as well as (ii) to create liveable, meaningful, opportunity-inducing, socially stimulating, and inclusive neighbourhoods. Among multiple combinations and strategies, TOD has been increasingly implemented worldwide, particularly around light-rail transit (LRT) and bus rapid transit (BRT). TOD initiatives are frequently advocated as powerful tools for leveraging transit use and reducing car use while simultaneously triggering local development and quality of life improvements in otherwise declining communities. TOD can be defined as a type of urban development that aggregates a mixed-use, pedestrian-friendly, densely built environment around a public transit station (Litman, 2017). Transit-rich neighbourhoods (TRN) and transit-adjacent development (TAD) convey a broader meaning whereby the transport and development components of transit-served areas are not integrated into an explicitly coherent process. This review focusses on both strictly-defined TOD operations and more general TRN: first because they are not explicitly distinguished in many existing studies, thus resulting in a vague threshold; and second because doing so serves our objective of assessing contemporary effects arising from the presence of varying transport nodes, regardless of their year of initiation.
The incorporation of smart growth and new urbanism principles into TOD has spurred an extensive body of literature that falls into three main broad categories, the first of which draws upon the idea that TOD is beneficial and should be supported by accompanying policies. The question of how to materialise TOD has engendered better knowledge of the factors favouring or hampering implementation, which range from political, institutional, and instrumental factors to funding schemes and public acceptance (Banister, 2008). The second group addresses the sustainability-related benefits arising from TOD initiatives, the most frequently assessed dimensions of which include travel behaviour and transit ridership, residential location choices, the quality of public spaces, and land use (Lin & Jen, 2009).
The third literature category draws upon a long-standing neo-classical approach of analysing the capitalisation effects of public expenditures on infrastructures by using before-after approaches, hedonic models, and repeat-sales methods to separate transit proximity from other potential explaining factors (Debrezion, Pels, & Rietveld, 2007). It is fairly well established that proximity to transit has a positive effect on land and property values, although land use impacts might be highly context-specific: such effects might depend upon the marginal utility of the new transit line, which in turn could depend upon the pre-existence of other modes and the competitiveness of the new line with them; they also might vary according to the general economic, political, and institutional conditions of the metropolitan area the transit line is embedded in; and they might ultimately produce negative externalities.

The hypothesis of TOD-related gentrification
Although the vast majority of the aforementioned studies do not engage with affordability issues, some researchers have elaborated on equity concerns and utilised prices as a proxy of gentrification effects (Immergluck & Balan, 2018), whereas others have focussed on transportation and housing costs and related affordability issues. In recent years, researchers and policy advocates have argued that TOD interventions could result in gentrification and the eventual displacement of low-income groups (Cappellano & Spisto, 2014; Jones & Ley, 2016; Kahn, 2007; Rayle, 2015). For the purposes of this study, we define "gentrification" as a broad upgrading process whereby a neighbourhood's socio-economic composition changes to a greater degree than that of nearby areas over a relatively short time period, as wealthy and highly skilled workers proportionally increase by outbidding poorer residents for housing (Brown, 2016).
Why would gentrification occur in TOD areas or other transit-rich neighbourhoods? TOD initiatives involve and trigger (re)investment processes that can change spatial patterns, urban visual settings, and accessibility levels. Newly-built developments or housing rehabilitation can trigger declines in housing affordability, upward social filtering, and displacements. Location theory can predict such outcomes, and neo-classical approaches and the new economic geography have converged to highlight the critical roles of commuting costs and the locational advantages conveyed by transport nodes. Two complementary views of gentrification provide a useful framework for interpreting TOD- and transit-induced neighbourhood ascent. The consumption-side perspective identifies individuals' preferences and the attractiveness of urban life as primary drivers of urban regeneration and investments. Accordingly, TOD initiatives would encourage individuals seeking vibrant and liveable neighbourhoods. In contrast, the Marxist-derived "rent gap" theory explains capital investments and subsequent gentrification based on the higher underlying location land value in otherwise declining areas. Although this process normally occurs regardless of the presence or absence of a transit line, it has been argued that enlargements of pre-existing rent gaps might result from the opening of new lines and/or local TOD development projects (Revington, 2015). As the rent gap expands, real estate investments increase as they become more profitable, thus leading to declining affordability and the consequent up-filtering of households.
Neoliberalisation and entrepreneurial forms of public management play critical roles in explanations deriving from changing planning paradigms (Revington, 2015). In contrast to the aforementioned gentrification theories underlining private agents' micro-decisions, changing planning paradigms rather emphasise the increasing promotion of private capital attraction and deliberate gentrification by public authorities (Nilsson & Delmelle, 2018). Environmental rationales have been fully internalised as a positive asset by economic players and public stakeholders. Through the double rationale of creating attractive neighbourhoods and reducing car use, TOD interventions are precisely at the crossroads between two of the competing dimensions of sustainability, which might result in the obfuscation of social factors. Such policies can therefore be seen (at least partially) as a form of competition-driven activity framed in globalisation processes. To succeed amidst global competition, cities seek to improve their image branding and develop recognisable identities by conveying vibrant, attractive, and mixed-use neighbourhoods that promote environmental sustainability and a local-scale sense of quality of life; however, such images are undermined by conditions of socio-spatial inequity and the persistence of minorities and vulnerable residents in central areas.
TOD interventions are intended to maximise ridership (from the perspective of transit authorities and operators) and property tax revenues (for local governments) while simultaneously addressing the high costs related to requirements such as changing zoning and regulations, coordination with transit agencies, public space design and the provision of local amenities (place-making). TOD areas are designed to attract investments that are essentially directed by private-led developments that need to be capitalised through the production of dwellings orientated to upper income households (Cappellano & Spisto, 2014). In addition to the production of well-designed, attractive and walkable public spaces, newly-built housing units tend to attract one-person households and young professionals (Rayle, 2015), not necessarily merely due to the presence of transit, but also because of a set of attributes associated with the built and social environment, including increased density, land-use mix, lifestyle services and amenities, green areas, and public open spaces (Chatman, 2013).
That being said, the debate on whether TOD initiatives contribute to or create conditions leading to gentrification remains unresolved. On the one hand, low-income groups might continue to dominate the neighbourhoods in many transit-served areas, as public transit stations might discourage wealthier people from moving in due to congestion-related conditions, shortages of large and comfortable apartments, poor parking possibilities, and crime. Moreover, even with increases in housing costs due to in-migrations of wealthier residents, the lower transport costs deriving from newly built stations could offset diminishing affordability issues in other areas, thus resulting in relatively low combined housing and transport costs (Litman, 2017). Even in the case of a relative increase in absolute numbers of better-off residents, more vulnerable groups might be able to remain in the area and exploit the improved accessibility of jobs and other urban resources and opportunities. Therefore, the resulting increase in wages might not be the result of filtering and displacements, but rather an outcome of general improvements in economic conditions.
On the other hand, endogeneity issues might counterbalance the view of gentrification as a by-product of newly built transit. Smart growth and new urbanism approaches often contribute to a modern and progressive image of the cities that adopt them, and they are frequently utilised as a way to attract skilled workers and jobs, even if gentrifiers themselves hardly use public transit (Danyluk & Ley, 2007). In some cases, neighbourhoods where social upgrading is already underway are more publicly visible and could become a preferential target for urban design and new urbanism-inspired initiatives. The terms of the causative relationship would then be reversed, as TOD would actually be a consequence (albeit also potentially serving as a reinforcing factor), rather than an initial cause of gentrification. In other cases, transit is expanded explicitly for the purpose of better serving poor neighbourhoods; however, these newly transit-rich areas become attractive and susceptible to gentrification (Rayle, 2015). As Deka (2017) observed, transit-rich areas are often characterised by larger proportions of renters, who are more likely to be displaced than owners. In such cases, the previous social composition would then be a confounding factor in that it contributed to both the emergence of a TOD initiative and the resulting gentrification process.

The still-debated undesirability of gentrification
Thus far, gentrification has been predominantly interpreted as a process with negative outcomes, which raises concerns regarding the potential displacement of low-income groups and the rise or intensification of local conflicts due to housing burdens, increasing food retailer prices, and the loss of a sense of community (Clagett, 2015). Although methodological constraints have limited the collection of evidence concerning the fate and location of displacees, many fears have been expressed in the literature. For example, displaced workers' jobs might be at risk due to the loss of the geographical connection to their workplace, and this problem could be exacerbated by the likely rise in transportation costs associated with increased travel distances. Displacement has been very difficult to demonstrate, and the new residential locations chosen by displacees are to a large extent unknown (Chapple et al., 2009). Displaced persons might accept more expensive, precarious, or overcrowded housing. They might suffer from negative psychological effects from the threat of displacement (Twigge-Molecey, 2014), move to the urban fringe and become more car-dependent, experience reduced access to services and amenities, and/or end up living in less health-supportive built environments (Cole, Garcia Lamarca, Connolly, & Anguelovski, 2017). Thus, newly built transit lines might ultimately fail to provide accessibility benefits to those who are most in need of them.
This could also lead to deviations from the objective to increase ridership and to financial sustainability issues. First, low-income groups are the most important user demographic, and as the hypothesis of transit-related displacements proposes, those who most need transit would lose their access to stations, thus resulting in diminishing transit use among poorer demographics. Second, better-off households moving into TOD-served areas would own more vehicles than the previous, displaced residents, particularly if on- and off-street parking remains available (Chatman, 2013). Although wealthier residents often shift to non-car modes after relocating to a TOD area, their presence would not offset the revenue losses resulting from the removal of the poorest households because they often choose cycling over public transit (Danyluk & Ley, 2007).
The "positive gentrification" perspective holds that the influx of wealthier residents into low-income neighbourhoods does not necessarily result in the marginalisation of less well-off households (Chaskin & Joseph, 2013). Rather than being displaced, low-income households living in relatively segregated poor neighbourhoods might benefit from an increase in income-based diversity. Living in neighbourhoods where contact with better-off households is favoured has been associated with better education outcomes for low-income children (Chetty, Hendren, & Katz, 2016). Similarly, local economies could be enhanced by gentrification, as the availability of new, better quality services and goods positively impacts quality of life for existing households.
The specificities of this debate are beyond the scope of this study. Rather, what needs to be assessed first is whether and to what extent transit proximity, and particularly TOD interventions, contribute to neighbourhood gentrification. Only then can policies be designed to reduce such risks. The next section provides an overview of the review protocol before the main findings are presented.

Review protocol
Although there is no consensus on the best manner in which to conduct a systematic review (Higgins & Green, 2008), there is fair agreement that careful attention should be devoted to issues of reliability and bias among the selected studies (Higgins & Green, 2008; Rutter, Francis, Coren, & Fisher, 2013). Systematic reviews are primarily utilised in social, behavioural, medical, and economics research, and methods of data extraction and quality appraisal sometimes differ across fields. In order to fit our review to the topic, we used a self-constructed data extraction tool derived from several sources (Higgins & Green, 2008; Rutter et al., 2013).

Search strategy and data sources
Two search stages were implemented in this review. Search A used a range of electronic databases, namely ScienceDirect, Web of Science™, the Transportation Research Board's Transport Research International Documentation (TRID) database, Google Scholar, Worldcat, opengrey.eu, and opendoar.org. Additionally, we manually searched issues published in 2018 through a selection of relevant journals from among the top 40 rankings of the 2016 SCImagoJR index of Geography and Planning, Transport, and Urban Studies publications. The following combination of search terms was used (with some syntactic variants, depending upon the database): (i) "transit-oriented development" or "metro" or "subway" or "underground" or "rail" or "station" or "bus rapid transit" or "transit infrastructure" or "transit line*" or "transit-rich" or "transit infrastructure*" or "transport* infrastructure*" or "public transport*" or "public transit" or "compact cit*" and (ii) gentrification or "low-income" or displacement or "social upgrading" or "declining affordability" or "neighbo*rhood change" or "hous* pric*" or "hous* valu*" or "land pric*" or "land valu*" or "property pric*" or "property valu*". After selecting the studies, data were extracted onto a spreadsheet, which consisted of 52 fields divided into seven categories (Table 1).
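The two-part structure of the search string (any transit term combined with any outcome term) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the helper names `or_group` and `build_query` are hypothetical, the term lists are abbreviated from the full strategy above, and the uppercase OR/AND operator syntax is an assumption, since the exact syntax varied by database.

```python
# Hypothetical helper names; the term lists are abbreviated from the
# search strategy described in the text, not the complete lists.
TRANSIT_TERMS = [
    '"transit-oriented development"', '"metro"', '"subway"',
    '"bus rapid transit"', '"transit line*"', '"public transit"',
]
OUTCOME_TERMS = [
    'gentrification', '"low-income"', 'displacement',
    '"neighbo*rhood change"', '"hous* pric*"', '"land valu*"',
]

def or_group(terms):
    """Join terms with OR and wrap the whole group in parentheses."""
    return "(" + " OR ".join(terms) + ")"

def build_query(group_a, group_b):
    """Combine two OR-groups with AND, mirroring parts (i) and (ii)."""
    return or_group(group_a) + " AND " + or_group(group_b)

# Assumed operator syntax (uppercase OR/AND); real databases differ.
query = build_query(TRANSIT_TERMS, OUTCOME_TERMS)
print(query)
```

Keeping the two groups as separate lists makes it straightforward to adapt the joining syntax to each database without retyping the terms.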

Inclusion criteria
Studies were considered eligible if they satisfied a set of pre-determined criteria: (i) quantitative studies published in English from 2000 to 2018 (restricted to peer-reviewed journals and conference proceedings, dissertations, working papers, research reports, and governmental research); (ii) studies using neighbourhood social upgrading and/or displacement of low-income groups and minorities as outcome measures of interest; (iii) studies using any people-related variables as a measure of neighbourhood change (investigations based uniquely on land and property values were not considered eligible).
Qualitative studies were not included because our aim was to find evidence of causal relationships between transit lines and gentrification. Although we recognise that qualitative methods can provide site-specific causal explanations through richer data, and that residents' perceptions are as important as quantitative measures for fully understanding gentrification processes and their experienced consequences, such approaches are only rarely used to make causal claims involving non-subjective outcomes. Control of potential biases and observation of regularities have been long-standing issues limiting the ability of qualitative approaches to promote causal inferences and generalisability (Maxwell, 2004). Indeed, as discussed later in this paper, the few existing qualitative and/or perception-based studies were generally guided by slightly different research questions than those shaping the current investigation. Finally, from a more practical perspective, quality appraisal designs applied to quantitative and qualitative studies would differ substantially, thus generating comparability challenges. Nonetheless, the main findings of such papers will be mentioned, as they provide useful insights on the process of (or counterbalancing) transit-induced gentrification.

Quality appraisal
Quality appraisal is a fundamental step in any systematic review (Higgins & Green, 2008), as the calibre of the examined studies can vary significantly, thus potentially influencing the reliability of the conclusions they inform (Rutter et al., 2013). Quality appraisal is also the most difficult task due to a high degree of unavoidable subjectivity in the choice of criteria defining quality as well as during coding (Higgins & Green, 2008); however, subjectivity can be minimised at both steps. First, what defines quality can be based on internal validity, defined as the susceptibility to and minimisation of potential bias in the study appraised; and on external validity, defined as the generalisability of the study's findings (Higgins & Green, 2008). The latter is normally better used in behavioural or medical sciences than in context-sensitive geographical studies. Second, subjectivity at the coding step can be minimised by a double-independent coding approach in which disagreements are solved through discussion between raters (Higgins & Green, 2008), as well as by discussing how methodological issues might influence the review findings. No study was excluded from the review based on quality. Criteria for quality appraisal were based on the SCIE guidelines (Rutter et al., 2013), albeit a modified version adapted for transport-land-use studies (Table 2). Higgins and Green (2008) recommended that several sources of bias be assessed; however, their list is focussed on healthcare research, whereas some distinct criteria are required for evaluating social science and transport studies. In the absence of such a well-established set of guidelines, a list of potential sources of bias was defined and assessed for each study, which resulted in a bias avoidance score ranging from 0 (totally unreliable) to 1 (totally reliable) (Table 2). The bias avoidance score was calculated as follows: BAS = 1 − [Sum of scores/(13 − "Non-Applicable" answers)].
Item ratings were attributed independently by two coders and disagreements were resolved through discussion. The quality appraisal was only considered in the context of a set of chosen indicators and was not intended to serve as a judgement of the global intrinsic quality of the studies themselves. A necessarily arbitrary cut-off point would therefore be methodologically wrong and impair our attempt to minimise subjectivity by keeping all of the information available.

Study selection
After conducting Search A and removing duplicates, 5,712 studies were identified (Figure 1) and assessed against the inclusion criteria. Most of the studies (nearly 88%) were actually completely off-topic and located only because some of the search keywords happened to also appear in those papers. All records were examined by a single reviewer, and a random sample comprising 20% of the records was independently assessed by a second reviewer, as recommended by Rutter et al. (2013). The interrater reliability was considered strong (Cohen's kappa = .87). The findings of the review are presented below through a narrative synthesis approach.

Table 2. Quality appraisal criteria and item scores.
Are control groups used for comparisons?
    Through methodology: distance as a variable (2)
    Non-transit served equivalent neighbourhoods (2)
    Control corridor group with other infrastructure (1.5)
    TOD vs non-TOD station areas (1)
    Small sample of non-transit served neighbourhoods (1)
    Entire wide-area, city, or urban area (0.5)
    Transit-served compared between them (0)
    No control group (0)
For every question below: Yes (1), Partially (0.5), No (0), Unclear (0), Not applicable (0)
    Is endogeneity accounted for?
    Is spatial autocorrelation taken into account?
    Are spillover effects accounted for?
    Is the influence of other infrastructures (interferences) accounted for?
    Does the study determine if the study area was a gentrifiable one in the first place?
    Is the built environment taken into account?
    Is the transit performance/quality/connectivity taken into account?
    Is time taken into account?
    Is the choice of distance type and threshold shown to be reliable?
For both questions below: Yes (1), Partially (0.5), No (0), Unclear (0)
    Are robustness and sensitivity tests described?
    Does the paper discuss the quality of the analysis?

The studies were predominantly undertaken in the United States (n = 31, 88.6%), although a few were also conducted in Canada, Colombia, Taiwan, India, and England (n = 1 in each country) (Supplemental file 01). The identification of only a single European study was rather surprising considering the recent prominence of social concerns in the continent's transport literature and that many European cities have undertaken transit-oriented development projects alongside regeneration policies over the last two decades. There are some possible (albeit not entirely convincing) reasons to explain such lack. First, the popularity of TOD initiatives involving both transportation and urban space regeneration operations is significantly less in other countries than in the US, where such projects are constructed in less saturated areas where there is still developable land. In contrast, European urban areas tend to be more intensively built with little room for such expansive initiatives, and public transit is more ubiquitous in European cities, whereas it is widely viewed as a scarcer resource in the US. A counterpoint here is that transit in Europe is not as pervasive as it might seem; even more importantly, there are numerous non-US-based studies dedicated to land values around transit stations. Second, the English language restriction might have caused a selection bias. However, (i) there are many English-written works on TOD and mobility-related equity in a high diversity of countries; (ii) countries such as Australia, Ireland and the UK, and Canada, where distributional outcomes from transit issues are quite popular in the literature, were expected to have more studies than were actually identified. Third, the availability of fine data (e.g. at the census block level) involving a sufficiently large time period is fairly uncommon worldwide, and very few countries can provide accurate data that enable TOD-related gentrification analysis. Nonetheless, none of these points seem sufficient to explain why only five out of 35 studies are not US-based.
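As a side note on the screening step described above, the inter-rater statistic reported there (Cohen's kappa) compares the observed agreement between two raters with the agreement expected by chance. The following Python sketch shows the standard calculation for two raters. The function name `cohens_kappa` and the include/exclude decisions are hypothetical and made up for illustration; they are not the review's screening data and do not reproduce the reported value.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label proportions.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)

# Made-up screening decisions for ten records (illustration only).
reviewer_1 = ["include", "exclude", "exclude", "exclude", "include",
              "exclude", "exclude", "exclude", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "exclude",
              "exclude", "exclude", "exclude", "exclude", "exclude"]

kappa = cohens_kappa(reviewer_1, reviewer_2)
print(round(kappa, 3))
```

Because most records in a broad search are excluded by both raters, raw percentage agreement tends to look high, which is why a chance-corrected statistic is the more informative figure.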

Summary of selected studies
A total of 40 urban areas were analysed across the selected studies. Among these, 23 cities (all in the US) that were analysed in at least three different studies represent 85.6% of the cumulative universe of examined cities (pairs cities/studies = 132). Los Angeles was the most frequently analysed city (n = 10), followed by Dallas, Denver, and Portland (9), Atlanta and San Francisco (7), Baltimore (6), and San Diego and Washington, DC (5).
Light-rail transit was the most frequent exclusive focus of the studies (40%, n = 14), and it was included in an additional nine multi-mode studies. The other modes have been much less analysed: bus rapid transit (BRT) was present in five studies (including two in which it was the exclusive focus); subway stations were included in ten studies (exclusively in five); and commuter railways were analysed in ten studies (exclusively in three). Other modes included bus, streetcar, cable car, and trolley bus. Most studies were focussed on only a single mode (71%, n = 25), whereas fewer analysed two modes (four studies), three modes (two studies), or more than three modes (four studies).
Based on our review of the 35 papers, it is difficult to distinguish between specific TOD operations and broader transit-related analyses. The development and expansion of light-rail transit lines does not necessarily mean that public space treatment, urban regeneration and/or real estate operations will also be implemented by local authorities, and the studies seldom mentioned when such operations were included.

Description of the studies
Transit outcomes measured as reflections of gentrification varied widely across studies (Figure 2), although some were common to nearly all papers. For instance, income (77% of the studies, 88% if poverty indicators are included in this category), ethnicity (54%), and education (51%) were the most frequently applied measures, followed by house price (49%, 52% if land price is included), tenure (34%), and age and migration (both 26%). More rarely addressed measures were occupation, rent prices, household size and type, and car ownership (<25%). Data on evictions and home conversions, which are difficult to obtain, were used in only one study (Chapple, Loukaitou-Sideris, Chatman, Waddell, & Ong, 2017). The same applies to data on property turnover (Dawes, 2017).
The number of indicators used varies from one (in six studies) to 13 (in one study). Just over half of the studies (54%, n = 19) used more than three indicators (median = 4).
The study designs were also rather heterogeneous (Supplemental file 02). Some investigations were based on a single approach, whereas others utilised two or three different and complementary methods. A cumulative number of 51 methods (i.e. pairs category-study) was obtained by grouping the study designs into seven main categories (before-after comparison, bivariate analysis, analysis of variance, clustering methods, survival analysis, difference-in-difference analysis, and regression models). Because they are among the least reliable methods on the list, before-after comparisons (2%) and bivariate analysis (4%) were unsurprisingly rare. Survival analysis was used in one study (Grube-Cavers & Patterson, 2015), which showed the second highest bias avoidance score. Much more common were difference-in-difference approaches, which accounted for 39% of the total. These approaches were based on comparisons of change in a variable or set of variables during a given period between transit-rich neighbourhoods versus control group neighbourhoods, which could include equivalent neighbourhoods or an entire county or metro area. The reliability of such approaches depended on how the control groups were defined, and which variables were included. A third of the studies relied on regression models, specifically ordinary least squares (OLS) models, logit and probit models, and spatial lag models. Regression analyses were conducted in 37% of the investigations and were notably employed by four of the five studies with the highest bias avoidance scores: Bardaka, Delgado, and Florax (2018) used a spatial lag model; Brown (2016) estimated an OLS model; Chapple et al. (2017) relied upon several logit and OLS models; and Pathak, Wyczalkowski, and Huang (2017) computed a fixed-effect 2SLS regression model.
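The logic of the difference-in-difference comparison described above can be sketched in a few lines of Python: the change observed in control neighbourhoods is subtracted from the change observed in transit-rich neighbourhoods. This is a minimal sketch of the basic two-group, two-period estimator only. The function names and the income figures are hypothetical, made up for illustration, and not drawn from any of the reviewed studies, which typically add covariates and statistical tests.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

def difference_in_differences(treated_before, treated_after,
                              control_before, control_after):
    """Basic 2x2 estimator: change in treated group minus change in controls."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change

# Made-up median household incomes (thousands), one value per neighbourhood.
transit_before = [40.0, 42.0, 38.0]   # transit-rich areas, before the line opens
transit_after = [50.0, 52.0, 48.0]    # same areas, after
control_before = [41.0, 39.0, 40.0]   # comparison areas, before
control_after = [46.0, 44.0, 45.0]    # comparison areas, after

estimate = difference_in_differences(
    transit_before, transit_after, control_before, control_after
)
print(estimate)
```

The estimate is only as credible as the control group: if the comparison neighbourhoods would not have followed the same trend as the transit-rich ones in the absence of transit, the subtraction does not isolate a transit effect, which is the concern raised in the text about how control groups were defined.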
Finally, the timeframes considered for analysing transit-caused gentrification were also highly variable and not always entirely clear. Only 43% of the studies (n = 15) provided complete information on the years of the data that were used and/or the years of intervention (TOD or lines opening). Periods of analysis ranged from <0 (anticipated effects of TOD) to 45 years (minimum mean 5.1 years, maximum mean 17.8 years); each study might have a minimum and a maximum time span, particularly if examining several lines with varying opening years.

Quality appraisal: risk of bias assessment
The quality of the selected studies is rather variable. One of the most striking results to emerge from the data was the high risk of bias (Supplemental file 03). Only two of the 35 studies selected demonstrated low risk of bias (Bardaka et al., 2018; Pathak et al., 2017), whereas 10 studies (29%) exhibited moderate risk of bias (scoring between 0.5 and 0.75), and a large majority of 23 studies (66%) indicated serious risk of bias (<0.5). More than half of the studies failed to account for any source of bias. The most considered sources of bias were gentrifiability (49%) and control groups (34%). Only 54% of the papers discussed the quality of the analysis, and robustness was assessed in a mere 29% of the studies. Despite a wide range of analytical time periods across and within studies, only 34% of them took time into consideration. Only 20% of the studies considered endogeneity and the attributes of the built environment, spillover effects were considered in 17% of the studies, and 14% made room for spatial autocorrelation. The type of distance, the potential influence of other infrastructures, and the performance of the lines examined were almost never included in the analyses (<10%). Finally, a distinction was rarely made between new and existing stations, and most of the studies did not mention any public interventions on the surrounding built environments. Figure 3 presents an overview of potential bias in the selected studies. Four observations can be drawn here: (i) bias avoidance scores above 0.5 appear only from 2015 onward, which indicates some progress in recent years; (ii) in 14% of the studies, gentrification was not the main focus of analysis (as expected, these generally received lower scores); (iii) only ten studies scored above 0.5; (iv) the type of paper influences the bias avoidance score, as journal papers were generally more reliable.
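The risk-of-bias categories reported above can be expressed as a simple threshold rule on the bias avoidance score. The sketch below is illustrative only. The text gives the bands for serious (<0.5) and moderate (0.5 to 0.75) risk. Treating scores above 0.75 as low risk, and 0.75 itself as moderate, is an assumption. The function names are hypothetical.

```python
def risk_of_bias(score):
    """Map a bias avoidance score (0 to 1) to a risk category.
    The 'low' band (> 0.75) is an assumption; the text states only
    the serious (< 0.5) and moderate (0.5 to 0.75) bands."""
    if score < 0.5:
        return "serious"
    if score <= 0.75:
        return "moderate"
    return "low"


def share(count, total):
    """Percentage of studies in a category, rounded to a whole number."""
    return round(100 * count / total)


# Category counts as reported in the review (35 studies in total).
counts = {"low": 2, "moderate": 10, "serious": 23}
total = sum(counts.values())
shares = {category: share(n, total) for category, n in counts.items()}
print(shares)
```

The rounded shares reproduce the 29% and 66% reported for the moderate and serious categories.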

Data analysis and synthesis
Our core question was whether these studies detected TOD or transit-induced gentrification. Based on the above, it seemed obvious that the answer must be related to the quality of the studies, the specific outcomes measured, and their geographical scope (Supplemental file 04).
By disaggregating the results by city, we were able to identify consistencies and divergences in interpretations across studies. As stated above, among a total of 40 cities analysed across the studies, 23 were addressed in at least three different papers. For each study/city pair, we determined whether gentrification was detected (noted "Y" in the following lines), not detected ("N"), or considered variable or unclear ("V"). Due to the high degree of heterogeneity among methods and indicators of gentrification, we did not distinguish between levels of gentrification intensity (for instance, between strong, moderate, and weak gentrification processes). One study did not provide results across cities in a disaggregated manner (Pollack, Bluestone, & Billingham, 2010); therefore, 12 pairs were [...] 2N), Boston, and Washington, DC (2Y, 1V). Higher bias avoidance scores (Figure 4) were associated with support for the gentrification hypothesis. Of the twelve studies scoring at least 0.5 on the bias avoidance scale, six identified some evidence for the gentrification hypothesis (Bardaka et al., 2018; Brown, 2016; Chapple et al., 2017; Heilmann, 2018; Hess, 2018; Lin & Chung, 2017). Based on income, education and house price measurements, Bardaka et al. (2018) were able to attribute socioeconomic changes to LRT in predominantly low-income downtown neighbourhoods in Denver. Chapple et al. (2017) detected signs of gentrification in San Francisco and Los Angeles downtown areas as evinced in the loss of affordable housing and low-income households, as well as higher rates of in- and out-migration, better educated neighbourhood profiles, higher housing prices and displacements. In Los Angeles, Brown (2016) detected gentrification patterns in neighbourhoods with lower rents, educational attainment, median household incomes and higher proportions of renter-occupied housing.
Heilmann (2018) found increases in neighbourhood income in Dallas census tracts that received rail access compared with neighbourhoods that were promised access but did not due to funding cuts. In Seattle, Hess (2018) detected increased percentages of Whites and decreased percentages of minorities near light-rail stations, whereas in Taipei, Lin and Chung (2017) identified higher population migration, percentages of college graduates, floor areas and house prices near transit stations, mainly in the inner city. Four of the highest-ranked studies exhibited some variability across cities: Grube-Cavers and Patterson (2015) detected evidence of gentrification in Toronto and in Montréal, but not in Vancouver. Nilsson and Delmelle (2018) identified gentrification in Pittsburgh, but not in Baltimore, Buffalo, Denver, Houston, and St. Louis. Rochester (2016) did not detect any signs of gentrification along the Portland Eastside Blue and the Westside Blue Lines but found strong evidence around the Yellow Line. In their 14-city study, Baker and Lee (2017) found high variability in gentrification outcomes, even reporting counter-gentrification in some cases (Portland, Los Angeles, Buffalo). They confirmed gentrification in Denver and San Francisco, also identified by Bardaka et al. (2018) and Chapple et al. (2017), and they highlighted the importance of local and regional contexts on transit impacts. Finally, two of the highest-ranked investigations concluded that there were no signs of gentrification. Dong (2017), referring to Portland's oldest rail transit line, nevertheless suggested that there might be a lengthy time lag for gentrification. In Atlanta, Pathak et al. (2017) showed that census tracts with access to public bus transit actually have a higher proportion of low-income households than tracts without bus access.
When considering only these 12 highest-ranked studies, the city-specific outcomes remain highly uncertain. Only San Francisco (gentrification detected) and Baltimore (no signs of gentrification) provided consistent results (albeit across only two studies). In the remaining cities, such as Denver, Dallas, Los Angeles, Pittsburgh, Portland, San Diego, and St. Louis, findings vary across the studies.
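The city-level comparison can be sketched as a tally of study/city pairs coded "Y", "N", or "V". The example below uses only the Denver and San Francisco pairs stated in the text and is not the full data set of the review. The function names, and the rule that a city counts as consistent when at least two studies share a single outcome code, are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# (study, city, outcome) pairs taken from the text; illustrative subset only.
pairs = [
    ("Bardaka et al. (2018)", "Denver", "Y"),
    ("Baker and Lee (2017)", "Denver", "Y"),
    ("Nilsson and Delmelle (2018)", "Denver", "N"),
    ("Chapple et al. (2017)", "San Francisco", "Y"),
    ("Baker and Lee (2017)", "San Francisco", "Y"),
]


def tally_by_city(study_city_pairs):
    """Count outcome codes (Y/N/V) per city."""
    tallies = defaultdict(Counter)
    for _study, city, outcome in study_city_pairs:
        tallies[city][outcome] += 1
    return tallies


def format_tally(counter):
    """Render a tally in the compact form used in the text, e.g. '2Y, 1N'."""
    return ", ".join(f"{counter[code]}{code}" for code in "YNV" if counter[code])


def is_consistent(counter):
    """Assumed rule: at least two studies, all with the same outcome code."""
    return sum(counter.values()) >= 2 and len(counter) == 1


summary = {
    city: (format_tally(counter), is_consistent(counter))
    for city, counter in tally_by_city(pairs).items()
}
for city, (tally, consistent) in summary.items():
    print(f"{city}: {tally} ({'consistent' if consistent else 'mixed'})")
```

Under these assumptions, San Francisco appears as consistent and Denver as mixed, in line with the narrative summary.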
Among the five identified qualitative and/or perception-based studies, none explicitly sought to objectively detect gentrification near transit stations. Jones and Ley (2016) showed that residents' recognition of the gentrification process along the Vancouver SkyTrain corridor depended upon its intensity. In Minneapolis-St. Paul, Guthrie (2012a, 2012b) demonstrated the variability in minorities' perceptions of transit-induced neighbourhood change, whereby frequent transit users generally had more positive perceptions. Likewise, Moore (2015) indicated some variability among residents, ranging from positive perceptions of improved neighbourhoods to fears of displacement. Finally, Sandoval (2018), and Sandoval and Herrera (2015) focussed on mobilisations against the threat of gentrification and found that such grassroots actions contributed to reshaping TOD redevelopment projects and maintaining some control over activists' places of residence.

Evidence, challenges and recommendations
This paper contributes to a better understanding of potential adverse effects of otherwise well-intentioned transit initiatives in two ways. It is the first systematic review on this topic and thus provides a useful synthesis of current empirical evidence on transit-induced gentrification. Moreover, by critically analysing the associated research approaches, this paper warns that the incorporation of several sources of bias into study designs can cause many misinterpretations of the data, ultimately leading to misguided conclusions and policies.
Overall, the results of this review suggest that proximity to transit may indeed contribute to gentrification. Although this finding reinforces the concerns of many equity advocates (Rayle, 2015; Revington, 2015), the low number of fairly reliable studies hinders solid conclusions. Rather, the high variability in findings and the relevance of local contexts might suggest that gentrification is more closely associated with existing local dynamics, built environment attributes, and accompanying policies. Among studies with significant bias issues, evidence of transit-induced gentrification might well be a false positive leading to incorrect determinations. Thus far, the empirical evidence does not provide truly conclusive guidelines for transit development policies.
Further work is therefore required to formally establish a causal relationship for assessing the existence of TOD-induced gentrification. A first step could be to enlarge geographical scope by conducting more comparative studies in different contexts outside the US. Next, although a lack of available data recognisably hinders the incorporation of displacements into gentrification-related studies (Chapple et al., 2009), investigation into this issue is likely to complement current approaches based on changes in indicators.
Multiple sources of bias need to be accounted for in future research in order to prevent potentially misleading conclusions. Specifically, control groups for quasi-experimental designs should be accurately chosen and the criteria for the inclusion of neighbourhoods should be clear. Larger geographical scales such as entire metropolitan or county areas should be avoided. Transit line performance, the influence of other infrastructures, timeframes, and endogeneity issues should be considered as potential factors when assessing the impact of transit services on local communities.
Above all, local context matters. Not only should consideration of local attributes of the built environment be incorporated into research methods, but the role of local governments in establishing mitigation policies should also be more clearly measured. Other elements with undeniable effects on the accuracy and reliability of findings should be introduced, such as affordability policies, the implementation of urbanistic regulations in regeneration operations, the status of the presumed TOD area (for instance, is it a real TOD initiative, or simply a transit-adjacent development, or even a mere unchanging built environment adjacent to an already existing line?), the capacities and resources of the planning agencies, the quality of planning, and other factors usually associated with land-use planning appraisal.

Limitations of the review
This review has some limitations that call for caution in interpreting the findings. First, there is a potential selection bias due to our decision to include only English-language papers in the search, which may have led to relevant studies being excluded. Secondly, the quality appraisal was based on rating scores that could be considered arbitrary. In contrast to health science and economic reviews (Higgins & Green, 2008), there is no established checklist for systematic reviews of transport-related studies. Some criteria have been more valued than others; for example, the existence and reliability of the control group criterion was attributed a maximum value of two points, whereas other criteria elaborating on robustness and performing sensitivity tests were each attributed only 0.5 points. The resulting ratings are therefore sensitive to the choice of rating scales, and they should be interpreted with caution. As we noted, subjectivity is unavoidable in appraisal procedures, and this issue has been treated by (i) describing the method used and providing the data to ensure reproducibility; and (ii) discussing how the outcomes and methodological flaws influence the review findings.
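The sensitivity of the ratings to the weighting scheme can be illustrated with a minimal sketch. It uses only the three criteria whose maximum points are stated above (control group: 2 points; robustness and sensitivity tests: 0.5 points each). The actual appraisal involved further criteria that are not reproduced here. The normalisation to a 0 to 1 scale, the function name, and the example study are hypothetical.

```python
def bias_avoidance_score(ratings, weights):
    """Weighted share of points obtained, normalised to the 0-1 range.
    `ratings` holds the fraction (0 to 1) of each criterion satisfied."""
    earned = sum(weights[criterion] * ratings[criterion] for criterion in weights)
    return earned / sum(weights.values())


# Maximum points per criterion as stated in the text (subset of criteria only).
stated_weights = {"control_group": 2.0, "robustness": 0.5, "sensitivity_tests": 0.5}
# Alternative scheme for comparison: every criterion valued equally.
equal_weights = {criterion: 1.0 for criterion in stated_weights}

# Hypothetical study: reliable control group, no robustness or sensitivity checks.
study = {"control_group": 1.0, "robustness": 0.0, "sensitivity_tests": 0.0}

score_stated = bias_avoidance_score(study, stated_weights)
score_equal = bias_avoidance_score(study, equal_weights)
print(f"stated weights: {score_stated:.2f}, equal weights: {score_equal:.2f}")
```

On this reduced set of criteria, the same hypothetical study falls above 0.5 under the stated weights and below 0.5 under equal weights, which shows how the choice of scale can shift a study between risk categories.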
Finally, we attempted to overcome the limitations frequently associated with narrative syntheses by following existing guidance and providing multiple quantitative and methodological tools as necessary for a full comprehension of the findings. Again, our appraisal of the studies is not a judgement of their intrinsic quality, but rather merely an evaluation based on a limited set of indicators. Thus, the fact that a study has a low score does not necessarily indicate that the study is fundamentally bad, and this review should not discourage researchers from examining those studies.

Conclusion
After almost three decades of supporting a nearly idealised TOD approach for enhancing local communities while promoting a modal shift contributing to the reduction of gas consumption, transit-induced gentrification has recently emerged as a matter of concern. This paper aimed to review empirical evidence of such trends near public transit stations.
Current evidence is partial and inconclusive. The paucity of existing studies is an obvious limitation that is expected to fade over time, as 20 of the 35 studies selected were published in the past three years (including 13 of the 17 peer-reviewed journal papers). Methodological flaws are the main hindrance to the reliability of the current literature on TOD-induced gentrification. Nevertheless, the few reliable studies identified in this review tend to support the gentrification hypothesis, although spatial variability across urban areas and continental contexts might affect the generalisability of the results. Future research should therefore enlarge the geographical scope and produce more robust methods of distributional outcomes evaluation by accounting for the potential sources of bias identified in this review.