Identifying Human Mobility Patterns using Smart Card Data

Human mobility is subject to collective dynamics that are the outcome of numerous individual choices. Smart card data, which originated as a means of facilitating automated fare collection, has emerged as an invaluable source for analyzing human mobility patterns. A variety of clustering and segmentation techniques have been adopted and adapted for applications ranging from passenger demand market segmentation to the analysis of urban activity locations. In this paper we provide a systematic review of the state-of-the-art on clustering public transport users based on their temporal or spatial-temporal characteristics, as well as studies that use the latter to characterize individual stations, lines or urban areas. Furthermore, a critical review of the literature reveals an important distinction between studies focusing on the intra-personal variability of travel patterns and those concerned with the inter-personal variability of travel patterns. We synthesize the key analysis approaches and, based on this synthesis, identify and outline the following directions for further research: (i) predictions of passenger travel patterns; (ii) decision support for service planning and policy evaluation; (iii) enhanced geographical characterization of users' travel patterns; (iv) from demand analytics towards behavioral analytics.


Introduction
Human mobility is subject to collective dynamics that are the outcome of numerous individual choices. Even though individual needs and travel choices exhibit large variability and the urban and regional areas in which they are embedded are subject to great diversity, there is evidence to suggest that human mobility manifests some recurring features throughout history and across geographies (Ahmed and Stopher 2014). The growing availability of disaggregate mobility data enables the analysis of temporal and spatial patterns and the link between microscopic behaviour and resulting aggregate flows (Schläpfer et al. 2021). Geo-location traces are increasingly available from mobile phone and GPS data, social media posts and travel-related records such as automated fare collection (AFC) records (see Barbosa et al. 2018), giving rise to a plethora of studies analysing human mobility patterns and the decomposition thereof in the past decade. In this paper we provide a systematic review of the state-of-the-art on identifying human mobility patterns from smart card data and propose an outlook for addressing persisting challenges.
Smart card transactions offer a unique and rich source of passively collected data that enable the analysis of individual travel patterns. Pelletier et al. (2011) reviewed the technological advancements making the analysis of smart card data possible, followed by a review of strategic, tactical and operational applications of smart card data in public transport. Another relevant work is the one by Yue et al. (2014), who reviewed trajectory-based travel behaviour studies, where smart card data is covered as one of the emerging sources enabling the analysis of human mobility patterns, replacing travel diaries and stated preferences data. Their main focus is on the properties of different trajectory data categories. In a recent review, Hussain et al. (2021) examined recent developments in using smart card data for the purpose of Origin-Destination matrix estimation. The latter, as well as related inference algorithms that are performed in order to estimate travel destinations, vehicles boarded, transfer locations and home-zones, is a prerequisite for some of the research devoted to identifying human mobility patterns based on smart card data.
In the last decade, extensive research attention has been devoted to the identification and classification of mobility patterns from smart card data. Researchers have adopted and adapted a variety of clustering and segmentation techniques for a series of applications ranging from passenger demand market segmentation to the analysis of urban activity locations. The objective of this paper is to provide a systematic review of this body of knowledge, synthesize the key analysis approaches and offer a research agenda for addressing the remaining gaps.
Our search strategy was based on applying combinations of 'Smart card data' and one or more of the following keywords when performing the bibliometric search: 'Clustering', 'Travel patterns', 'Mobility patterns', 'User segmentation', 'Spatial', 'Temporal' and 'Spatiotemporal'. The databases of Scopus and Google Scholar were used to identify all relevant literature. The search, last performed on June 29, 2022, resulted in 698 papers of which 56 papers were identified as directly relevant for the scope of this paper. Some of those excluded performed aggregate statistical analysis of mobility patterns rather than identifying distinctive ones. Other papers excluded from this review focused on inference algorithms, modes other than public transport, aggregate analysis of statistics such as OD matrix estimation, ridership, service evaluation and travel time distributions, and the statistical distributions thereof.
All of the studies included in our analysis were published in the last decade, and the vast majority (84%) were published in the last five years, from 2017 onwards. The selected papers were then categorized based on whether they analyse individual mobility patterns or network and urban analytics. Based on the papers reviewed, an additional distinction was made between studies that characterise individual mobility patterns based exclusively on their temporal patterns and studies that also consider spatial features as part of the pattern identification.
The remainder of the paper is organised as follows. The following section, Section 2, provides a synthesis of the literature concerned with identifying and characterising travellers' mobility patterns whereas Section 3 reviews studies that use the latter to characterise individual stations, lines or urban areas. We then offer a research agenda in Section 4, outlining key promising directions for future research.

Characterising Individual Mobility Patterns
Human mobility patterns are characterised by their temporal and spatial features. Most of the past work devoted to analysing individual mobility patterns using smart card data has focused on their temporal characteristics and is reviewed in the following sub-section 2.1. Thereafter, in sub-section 2.2, we turn to reviewing works that have explicitly considered both spatial and temporal aspects of user travel patterns as input to the market segmentation analysis.

Temporal patterns of individual users' travel
Table 1 provides a summary of the 22 studies analysing temporal user travel patterns, including their main aim, the variable(s) which are subject to analysis, the clustering technique employed, contextual information used in the analysis and in the interpretation of the results, and the key features of the case study application.
A critical review of the literature reveals an important distinction, only seldom made explicit by the studies themselves, between studies focusing on the intra-personal variability of travel patterns and those concerned with the inter-personal variability of travel patterns. The former aim at classifying individuals in terms of the stability of the temporal features of their travel patterns whereas the latter aim at segmenting users based on the characteristics of their regular temporal patterns.
Studies focusing on intra-personal stability have used various temporal features to analyse travel stability. The latter was investigated in terms of departure times (Kieu et al. 2015a, Manley et al. 2018), by the number of trips made on each day of the week (Deschaintres et al. 2019) or a combination thereof (Moradi and Trepanier 2020). A couple of studies have specifically looked into identifying so-called 'extreme' travellers defined by the percentage and frequency of trips in certain time periods or of certain duration (Long et al. 2016) or in terms of the number of days travelled per week, travel frequency per day of the week and temporal differences between daily trips (Cui and Long 2019).
Segmenting users based on their inter-personal variability can reveal similarities and differences in their temporal travel habits. Several studies have investigated aggregate indicators of travel frequency such as overall frequency of use and frequency of travel characteristics such as travelling by train and performing transfers (Kieu et al. 2018) or number of travel days (Egu and Bonnel 2020). Since weeks are considered to be the fundamental unit of recurrent travel schedules, the number of trips per day of the week has often been considered an important feature for market segmentation (Viallard et al. 2019), especially in combination with the starting hour (Briand et al. 2017, El Mahrsi et al. 2017, Liu and Cheng 2020, Cats and Ferranti 2022a), resulting in an hour-by-hour weekly travel profile per user which is then subject to clustering. User segmentation can then also be used for predicting the travel frequency per user per time-of-day as a conditional probability (Yang et al. 2018). Individual travel diaries allow for a specific focus on the timing of (certain) activities such as boarding times when travelling to and back from work (Ji et al. 2019), the start times as well as duration of activities (Medina 2018) and the activity sequence structure (Goulet-Langlois et al. 2016, Lei et al. 2020).
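As a minimal sketch of the hour-by-hour weekly travel profile described above, the following Python snippet converts one user's tap-in timestamps into a normalised 7 x 24 vector of trip shares. The function name weekly_profile, the ISO 8601 input format and the normalisation to shares are illustrative assumptions and are not taken from any of the studies reviewed.

```python
from datetime import datetime


def weekly_profile(tap_in_times):
    """Return a 168-element list (7 days x 24 hours) of trip shares.

    tap_in_times: iterable of ISO 8601 strings (assumed input format).
    Index = weekday * 24 + hour, with Monday = 0.
    """
    counts = [0] * (7 * 24)
    for stamp in tap_in_times:
        t = datetime.fromisoformat(stamp)
        counts[t.weekday() * 24 + t.hour] += 1
    total = sum(counts)
    # Normalise so that users with different trip volumes are comparable.
    return [c / total for c in counts] if total else counts


# Hypothetical tap-ins of one card over two weeks.
taps = [
    "2022-06-06T08:05:00",  # Monday, 08h
    "2022-06-06T17:40:00",  # Monday, 17h
    "2022-06-13T08:12:00",  # Monday, 08h
    "2022-06-18T11:30:00",  # Saturday, 11h
]
profile = weekly_profile(taps)
```

Each user is thereby represented by a fixed-length vector, which is the form of input that distance-based clustering techniques typically require.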
Smart card data facilitates the construction of individual travel diaries, which in turn enable the investigation of travel patterns amongst specific user groups. Many of the studies reviewed have analysed how the temporal patterns identified vary amongst users with different fare product types using the latter as a contextual post-analysis variable, some of which contain information on different user groups such as school pupils, higher education students, and retired or older users (Kieu et al. 2015a, Briand et al. 2017, El-Mahrsi et al. 2017, Deschaintres et al. 2019, Egu and Bonnel 2020). A single study has specifically focused on a selected user group, namely on the variability of trip frequency per hour by older travellers (Liu et al. 2021). In several studies, personal information available from smart card registrations has also been used, for at least a sample of users included in the analysis (Goulet-Langlois et al. 2016, Liu et al. 2021) or from a matching household survey (Long et al. 2016). However, in most cases no socio-demographic data is available for individual card holders. Notwithstanding, by inferring card holders' place of residence from their travel patterns, one can link users to zonal socio-demographic characteristics available from census data such as ethnicity, employment and income (Liu and Cheng 2020, Cats and Ferranti 2022a).
Clustering or segmentation techniques aim at identifying mutually exclusive and collectively exhaustive subsets of the population so as to maximize the intra-group similarities and minimize the inter-group similarities. The definition of the (dis)similarity metric is therefore crucial. In the context of user segmentation, the two most common clustering techniques are k-means (see Likas et al. 2003) and agglomerative hierarchical (see Rokach and Maimon 2005) clustering. The former seeks the best partitioning of the dataset given a pre-defined number of clusters, k, with the distance metric pertaining to the centre of the cluster. A variant thereof, k-medoids, selects an actual data point as the centre of each cluster. Agglomerative hierarchical techniques follow a bottom-up approach with iterative merging of groups of observations based on their similarity. While hierarchical approaches are more computationally expensive, they allow for a greater variety of distance specifications, their output contains more information on relations between potential clusters, and they do not require the a-priori specification of the number of clusters. An additional technique used in several studies for clustering users based on the similarity between subsequent trips is DBSCAN (Khan et al. 2014), which relies on user-specified parameter values. A model-based approach such as a Gaussian mixture model offers a latent profile assignment with probabilistic assignment of members to clusters, which is especially attractive in the case of continuous variables (e.g. Briand et al. 2017).
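To make the alternating assignment and centre-update logic of k-means concrete, a minimal pure-Python sketch is given below. It assumes squared Euclidean distance and, as a simplification, seeds the centres with the first k observations; the two user features in the example are hypothetical. The reviewed studies would normally rely on an established library implementation rather than code of this kind.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100):
    """Lloyd's algorithm; the first k points seed the centres (a simplification)."""
    centres = [list(p) for p in points[:k]]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centre.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centres[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: no membership changed
        labels = new_labels
        # Update step: each centre moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster is empty
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres


# Hypothetical user features: (share of peak-hour trips, share of weekend trips).
users = [(0.9, 0.05), (0.2, 0.6), (0.85, 0.1), (0.15, 0.7), (0.8, 0.0)]
labels, centres = k_means(users, k=2)
```

A k-medoids variant would replace the mean in the update step with the member that minimises the total distance to the other members of its cluster, so that every centre is an observed user.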
Case study applications analysed smart card data from cities in Australia (Brisbane, Sydney), Canada (Gatineau, Montreal), China (Beijing, Nanjing, Shenzhen), Denmark (Copenhagen), France (Lyon, Rennes), Japan (Shizuoka), Singapore, Spain (Tarragona), Sweden (Stockholm) and the United Kingdom (London). Most applications have considered metro systems, bus systems or a combination thereof, with few studies extending to other urban and suburban modes of public transport. Smart card data from most metro systems worldwide and from the East Asian bus systems analysed by studies included in this review contain both tap-in and tap-out data. Note that for most approaches taken in the analysis of temporal travel patterns no information regarding travel destination is needed per se, since boarding time (available from tap-in transactions) is the prime variable of interest used for constructing travel profiles that are then subject to clustering.
Few of the studies examining temporal travel patterns have also considered spatial elements such as the share of regular OD journeys (Kieu et al. 2015a) or information on travel modes and specific services (Manley et al. 2018). In addition, a number of studies have considered spatial attributes as contextual variables, such as station-related attributes (Manley et al. 2018), inferred home location (Kieu et al. 2018, Liu and Cheng 2020, Cats and Ferranti 2022a), inferred home and job locations (Long et al. 2016), land use information (Lei et al. 2020) and spatial entropy (Briand et al. 2017).

Spatiotemporal patterns of individual users' travel
Table 2 provides a summary of user segmentation studies that have identified spatiotemporal travel patterns. Similarly to the main divide in the analysis of temporal travel patterns, studies focusing on both spatial and temporal characteristics have looked into either intra-personal or inter-personal variability. In this context, the former considers not only the extent to which users travel at the same times over the course of the analysis period (be it defined in terms of frequency per week, days of the week and/or time of the day) but also the locations between which they have travelled.
In an attempt to identify commuters, studies focusing on intra-personal variability have considered variables such as the number of active travel days and the similarity of first boarding times, stops and route sequences (Ma et al. 2013) or the prevalence of most frequent origin-destination and the respective boarding times and routes (Ma et al. 2017). A broader perspective, not limited to commuting and most commonly performed trips, considers the distribution of trips over daily time periods and the most often used ODs and stations (Kaewkluengklom et al. 2021), or adds greater detail in the form of the overall sequence of activity locations and their respective durations per day (Goulet-Langlois et al. 2017, He et al. 2021). A similar approach served the opposite goal in identifying irregular (undirected, potentially suspicious) travellers based on travel frequency per time window and station (Wang et al. 2019).
Studies focusing on inter-personal variability group users based on the characteristics of their regular spatiotemporal travel patterns (rather than the extent to which those are stable). Three studies have been identified in this category, all of which focus on special user groups. Gutierrez et al. (2020) examined the combination of an array of temporal (travel frequency and number of travel days) and spatial (number of locations visited and routes used, prominence and identity of most visited locations) characteristics of tourists' travel patterns. Wang et al. (2019) investigated the activity spaces of vulnerable travellers' groups and employed to this end indicators related to both temporal (frequency, time entropy) and spatial (distance, radius and shape of activity space, place entropy) travel characteristics. Pieroni et al. (2021) focused on low-income workers in the context of a metropolis with large income gaps and contrasted their travel patterns with those of high-income ones.
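Indicators such as place entropy and the radius of an activity space can be computed directly from a user's sequence of visited locations. The sketch below assumes two common formulations, namely Shannon entropy over the shares of visits per location and the radius of gyration around the centroid of visited points in planar coordinates. The reviewed studies may define these indicators differently, and the stop labels and coordinates are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Entropy (in bits) of the empirical distribution of the given labels."""
    counts = Counter(labels)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def radius_of_gyration(points):
    """Root-mean-square distance of visited points from their centroid.

    points: list of (x, y) in a projected coordinate system, e.g. metres.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)


# Hypothetical alighting stops of one user and their planar coordinates.
visits = ["A", "B", "A", "B"]
coords = {"A": (0.0, 0.0), "B": (6.0, 8.0)}
place_entropy = shannon_entropy(visits)
radius = radius_of_gyration([coords[s] for s in visits])
```

The same entropy function yields a time entropy when applied to, for instance, the hours of the day at which trips start instead of the stops visited.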
A snowball search identified one additional study which regressed the extent to which users tend to travel using the same route against user, OD and journey characteristics, albeit without seeking to identify patterns (Kim et al. 2017).
Two techniques are dominant when integrating spatial information into the clustering of individual users: k-means (Ma et al. 2013, Yang et al. 2018, Kaewkluengklom et al. 2021, Wang et al. 2019, Zhang et al. 2021a, Pieroni et al. 2021) and DBSCAN (Ma et al. 2013, Ma et al. 2017, Wang et al. 2019). Some exceptions exist for studies using a diverse array of travel features which use a Gaussian mixture model (Gutierrez et al. 2020) or hierarchical clustering (He et al. 2021), as well as a study identifying irregular users based on percentile cut-off of an entropy-based metric (Goulet-Langlois et al. 2017).
Of the ten studies identified which analysed the spatiotemporal characteristics of individual mobility patterns using smart card data, three have been conducted for the case of Beijing and the remaining seven, with one study each, for Brisbane, Gatineau, London, Sao Paulo, Shizuoka, Tarragona and Wuhu.
A special class of studies, not concerned with the identification of user segments yet concerned with their individual travel patterns, is the one focusing on physical encounters and thereby contact networks and their potential consequences, in particular in the context of virus spreading. All these works construct a contact network based on the inferred passenger trajectories. Sun et al. (2013) analysed the statistical properties of the topological indicators, including the clustering coefficient, of the resulting physical encountering network for bus users in Singapore. Liu et al. (2020) searched for the co-existence of passengers at stations and on-board vehicles using DBSCAN. Their application considered the metro network of Shenzhen, and they analysed the frequency and duration of encounters among pairs of passengers. Qian et al. (2021) analysed the dynamics of virus spreading in contact networks and the effectiveness of control strategies, namely vaccination and quarantine, using contact networks constructed from the metro systems of Guangzhou, Shanghai and Shenzhen in China.
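The construction of a contact network from inferred passenger trajectories can be sketched as follows. Two passengers are linked here whenever they were on board the same vehicle run during overlapping time intervals; this linking rule, the record layout and the identifiers are assumptions made for illustration and do not reproduce the procedure of any particular study. The local clustering coefficient is included as an example of a topological indicator.

```python
from itertools import combinations


def contact_network(trips):
    """Build an undirected encounter graph as {passenger: set(neighbours)}.

    trips: list of (passenger, vehicle_run, board_time, alight_time);
    times are minutes since midnight (assumed record layout).
    """
    graph = {p: set() for p, _, _, _ in trips}
    for (p, run_p, on_p, off_p), (q, run_q, on_q, off_q) in combinations(trips, 2):
        same_run = run_p == run_q
        overlap = min(off_p, off_q) - max(on_p, on_q)
        if p != q and same_run and overlap > 0:
            graph[p].add(q)
            graph[q].add(p)
    return graph


def clustering_coefficient(graph, node):
    """Share of a node's neighbour pairs that are themselves connected."""
    neighbours = graph[node]
    if len(neighbours) < 2:
        return 0.0
    linked = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return linked / (len(neighbours) * (len(neighbours) - 1) / 2)


# Hypothetical trips on two vehicle runs.
trips = [
    ("u1", "bus7_run1", 480, 500),
    ("u2", "bus7_run1", 490, 510),
    ("u3", "bus7_run1", 495, 505),
    ("u4", "bus7_run1", 505, 520),
    ("u1", "bus9_run4", 1020, 1040),
]
graph = contact_network(trips)
```

The pairwise comparison scales quadratically with the number of trips, so a system-wide application would first group trips by vehicle run or station before searching for overlaps.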
It is evident that the number of studies that consider the spatial characteristics of travel patterns in the identification of user segments is small compared with the body of literature devoted to temporal aspects of human travel using smart card data. Furthermore, all of the relevant studies identified in this review have analysed spatial features in conjunction with temporal ones, rather than analysing spatial aspects in isolation. In line with the scope of this review, studies that analysed the spatiotemporal characteristics of ridership flows or origin-destination matrices are not included in this analysis. The limited research effort devoted to the consideration of spatial features as part of user segmentation arguably stems from its greater complexity as compared to temporal aspects. The latter can easily be discretized and are universal, and therefore the transferability of the associated feature definition and clustering techniques is fairly straightforward. In contrast, spatial features are subject to local variations and are often labelled, their discretization is not trivial and the findings cannot be easily made transferable to other contexts. Notwithstanding, there is much room for further research in this domain as elaborated in Section 4.

Spatiotemporal Travel Patterns for Network and Urban Analytics
In the previous section we reviewed the literature on clustering public transport users based on the temporal or spatial-temporal characteristics of their individual travel patterns. Another, related, body of literature has sought to identify and characterise clusters of (public transport) network elements, most often stops/stations but in a few cases service lines, based on the spatial-temporal characteristics of their respective users, as well as the urban area in which the networks are embedded.
Table 3 presents a summary of studies that performed network and urban analytics based on spatiotemporal travel patterns derived from passenger data. The objectives of the 24 studies included in this review vary greatly. Pioneering studies in the field sought to identify key activity centres using passenger flow data (Roth et al. 2011, Kim et al. 2014, Cats et al. 2015). Those have been followed by a series of studies that investigated the relation between the results of the station clustering and land-use patterns and places of interest in proximity to these stations or characteristics of the respective zones (Kim et al. 2017, Zhang et al. 2018, Zhao et al. 2019, Gan et al. 2020, Kim 2020, Zhuang et al. 2020). Recent studies have extended this to the analysis of how urban structure has evolved over time (Wang et al. 2021, Zhang et al. 2021b). Another stream of works has focused on the generation of origin-destination matrices by aggregating stops into demand zones (Kieu et al. 2015b, Luo et al. 2017), characterising the demand patterns associated with individual stations (Zhong et al. 2015, El-Mahrsi et al. 2017, Wang et al. 2017, Tang et al. 2018, Li et al. 2020, Zhou et al. 2022, Park et al. 2022), and clustering stations (Cats and Ferranti 2022b) or lines (Wang et al. 2020, Yap et al. 2019) that exhibit homogeneous activity patterns.
With the exception of the latter group focusing on service lines, which is aimed at supporting service planning, all other studies have used individual stations or sets of stations in proximity (i.e. areas) as the unit of analysis subject to clustering. All of the abovementioned studies have utilized origin-destination flow information derived from smart card data in their analysis with the exception of one study relying on boarding volumes only (El Mahrsi et al. 2017) and two studies considering boarding and alighting flows (Cats et al. 2015, Gan et al. 2020), presumably due to data availability limitations.
Given the dynamic character of network and urban centre characteristics, it is not surprising that the vast majority of studies have considered the temporal properties of passenger flow distribution in their analysis. Most studies have integrated information on the temporal characteristics of the flows analysed as part of the clustering process (Cats et al. 2015, El Mahrsi et al. 2017, Kim et al. 2017, Tang et al. 2018, Zhao et al. 2019, Gan et al. 2020, Kim 2020, Li et al. 2020, Zhuang et al. 2020, Wang et al. 2021, Zhang et al. 2021b, Cats and Ferranti 2022b). In addition, several studies have investigated how the results of their clustering vary for different time periods (Luo et al. 2017, Wang et al. 2017, Yap et al. 2019, Wang et al. 2020).
Methods deployed for the clustering of stops (or sets of stops) and lines include k-means or variants thereof (Kim et al. 2017, Luo et al. 2017, Tang et al. 2018, Zhao et al. 2019, Gan et al. 2020, Cats and Ferranti 2022b, Zhou et al. 2022, Yong et al. 2021), variants of hierarchical methods (Roth et al. 2011, Kim et al. 2014, Cats et al. 2015, Kim 2020), DBSCAN (Kieu et al. 2015b, Yap et al. 2019), Poisson or Gaussian mixture models (El-Mahrsi et al. 2017, Wang et al. 2021), affinity propagation (Wang et al. 2017, Zhuang et al. 2020) and a discriminative functional mixture model (Park et al. 2022). In addition, a number of studies adopted community detection techniques (Zhong et al. 2015, Yap et al. 2019, Zhang et al. 2019, Wang et al. 2020, Zhang et al. 2021b). The latter, which have not been adopted by any of the studies performing user segmentation reported in sections 2.1 and 2.2, are graph-based techniques which partition a network into communities by grouping together nodes that are densely connected, with nodes belonging to different communities being only sparsely connected. The strength or distance of a connection is indicated by the selected link labelling.
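The notion of densely connected groups that are only sparsely connected to each other is often quantified by a quality function. The sketch below computes Newman's modularity for a given partition of a weighted, undirected station graph. Modularity is assumed here purely for illustration, since the community detection studies reviewed may rely on other criteria or algorithms, and the station labels and flows are hypothetical.

```python
def modularity(edges, community):
    """Newman modularity of a partition of a weighted undirected graph.

    edges: dict {(u, v): weight} with each undirected link listed once.
    community: dict {node: community label}.
    """
    m = sum(edges.values())  # total link weight
    internal = {}            # weight of links inside each community
    strength = {}            # summed node strengths per community
    for (u, v), w in edges.items():
        cu, cv = community[u], community[v]
        strength[cu] = strength.get(cu, 0.0) + w
        strength[cv] = strength.get(cv, 0.0) + w
        if cu == cv:
            internal[cu] = internal.get(cu, 0.0) + w
    return sum(
        internal.get(c, 0.0) / m - (strength[c] / (2 * m)) ** 2
        for c in strength
    )


# Hypothetical symmetrised passenger flows between four stations.
flows = {("A", "B"): 10, ("C", "D"): 10, ("B", "C"): 1}
split = {"A": 0, "B": 0, "C": 1, "D": 1}
merged = {"A": 0, "B": 0, "C": 0, "D": 0}
q_split = modularity(flows, split)
q_merged = modularity(flows, merged)
```

In this toy network the two-community partition scores higher than the single-community one, reflecting the weak flow between stations B and C. A community detection algorithm would search over partitions for a high-scoring one rather than evaluate a given partition.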
Largely reflecting the geographical distribution of the research groups that have conducted related research, studies have analysed urban and system structure for cities from Australia (Brisbane), China (Beijing, Nanjing, Shanghai, Chongqing), France (Rennes), the Netherlands (The Hague), Singapore, South Korea (Seoul), Sweden (Stockholm) and the United Kingdom (London). The majority of studies have utilised (only) metro data whereas several studies included all public transport modes present in the case study area to best reflect the underlying travel demand patterns, including in areas which might be underserved by the metro.
Given the scope of this review, the discussion above and Table 3 are limited to studies that identify patterns in how passenger flows differ across the system for the purpose of either network or urban analytics. The bibliometric search yielded several adjacent studies that are either grounded in complex system theory or in urban planning. Studies within the realm of the former investigate statistical properties of travel flows across the network, such as the stability of how travel patterns vary over days for the networks of London, Singapore and Beijing (Zhong et al. 2016), or calculate several measures of urban diversity based on the temporal pattern of flows per spatial unit (Sulis et al. 2018). Examples of the latter include the analysis of year-on-year change in the job to worker ratio per station in the Beijing metro network (Huang et al. 2019) and the identification of employment areas in Beijing and the evolution thereof based on temporal characteristics of incoming and outgoing flows (Huang et al. 2021). Note that these studies, unlike the studies included in Table 3, use descriptive statistics to observe spatial variations.

Research agenda
As is evident from the synthesis of the literature provided in sections 2 and 3, there is an increasing body of research that has either characterised individual travel patterns or performed network and urban analytics based on the associated travel patterns. Based on the review of the literature several on-going trends in this research domain can be identified:
• Specific user groups. There is increasing interest in investigating travel patterns for particular target segments which exhibit distinctive travel patterns ranging from tourists (Gutierrez et al. 2020) and older travellers (Liu et al. 2021) to lower income individuals (Pieroni et al. 2021).
• Enriching smart card data with socio-demographic data. This has been accomplished by means of including personal information available from card registrations (Goulet-Langlois et al. 2016, Liu et al. 2021) or a matching household survey (Long et al. 2016). Alternatively, zonal socio-demographic data was linked by inferring the home-zone location of card holders (Liu and Cheng 2020, Cats and Ferranti 2022a).
• Evolution of travel patterns. The availability of smart card data over the course of a substantial time period allows for the analysis of changes in observed travel patterns in terms of identified market segments (Briand et al. 2017, Viallard et al. 2019), station clusters (Huang et al. 2021) or underlying urban characteristics (Zhang et al. 2021b).
Based on our critical review of the state-of-the-art, we identify four knowledge gaps below and outline related directions for further research.
• Predictions of passenger travel patterns. The availability of system-wide passenger data opens avenues for the development of schemes aimed at the prediction of passenger travel patterns. Such predictions can facilitate service design adjustments as well as supply- and demand-management measures to better cater for or steer anticipated flows. A couple of studies included in this review already do so specifically for travel frequency per user (Yang et al. 2018) and the daily pattern of arriving passengers at stations (Park et al. 2022).
The future development of prediction schemes can be structured along two axes: short-term vs. long-term and disaggregate vs. aggregate. The former distinction pertains to the time horizon for which the prediction is made, whether it concerns within-day downstream conditions measured in minutes to hours or considering a timespan of days or even months. Short-term predictions pose requirements on computational efforts and possibly also on the real-time availability of smart card data whereas long-term predictions can rely on offline applications. Several recently deployed fare collection systems enable the real-time availability of smart card data, albeit those are still far from common practice. The distinction between disaggregate vs. aggregate pertains, respectively, to whether the unit of analysis that is subject to prediction relates to trip characteristics of each individual traveller or to the travel pattern characteristics which are the outcome of collective dynamics, such as boarding, alighting and on-board passenger flows.
All four combinations of short- and long-term, disaggregate and aggregate predictions are relevant as future research directions and can facilitate respective applications and interventions by service providers. Several pioneering studies proposed a logistic regression model for predicting the next trip to be performed by an individual (Zhao et al. 2018) and an elasticity model for short-term aggregate predictions (van Oort et al. 2019).
• Decision support for service planning and policy evaluation. The analysis of travel patterns should ultimately inform planners in making decisions ranging from strategic and tactical to operational planning. This can be further facilitated by fusing long- and short-term predictions of travel patterns. While most of the studies included in this review have alluded to potential relevance, none of them has made an explicit link to service planning tasks, with the exception of Yap et al. (2019) and its relevance for service coordination.
There is thus clearly a missing link between the development and application of demand, urban and network clustering analysis on one hand and service design and decision support on the other hand, which is key for the value proposition of the literature reviewed here. Future research should bridge this gap by making explicit how the techniques and insights developed based on smart card data analytics can translate into service improvements. The latter can range from ex-ante policy assessment (see Kholodov et al. 2021 for an example concerned with price elasticities), network and capacity allocation adjustments and even the design of flexible services (see Qiu et al. 2019 for an original contribution in this direction) on one hand to real-time management such as information provision and control measures on the other hand. In particular, real-time control strategies will greatly benefit from advancements in short-term passenger flow predictions. Another promising direction is the development of fare products and fare schemes (see Halvorsen et al. 2020 for a relevant framework) based on the market segmentation performed using smart card data. For example, changes in travel patterns induced by the COVID pandemic crisis and work-from-home habits call for the development of new subscription models which can be devised to cater for changes in spatiotemporal user clusters.
• Enhanced geographical characterisation of users' travel patterns. The analysis of individual mobility patterns and the related user clustering and market segmentation have so far largely focused on their temporal features (section 2.1) or the stability and variability of travel destinations (section 2.2). However, there is a lack of knowledge on the geographical properties of individual travellers' mobility, which could be gained by characterising spatial features of the locations visited based on disaggregate longitudinal data. For the latter to be made available it is essential to comply with rigid privacy protocols due to the sensitivity of sequential geo-location data. Moreover, the aforementioned trend of enriching smart card data with socio-demographic data has been observed only for temporal user clustering but is yet to enter studies focusing on spatial clustering. The analysis thereof will enable identifying the extent to which different user groups visit different parts of the urban area and, when analysed in the aggregate, the composition of travellers visiting various zones and the extent to which various destinations are visited by people with different backgrounds. Community detection techniques may be applied for user segmentation with the goal of identifying groups of travellers that share similar geographical features of their mobility patterns.
In the analysis of spatial characteristics it is advisable to extract features that can be generalised to other contexts, such as the density, diversity and geographical scope of destinations visited, rather than focus on labelled ones that pertain to local features (e.g. location names). Given the significant role that features of the local urban environment are likely to play in determining the spatial characteristics of travellers' mobility, there is a need to gain more knowledge on how those are manifested in diverse settings. While these variables also vary within any given city, they may be confounded and the range of values might be limited. A cross-city comparison will allow conclusions to be drawn on relevant determinants such as urban form, density, modal competition, land-use and demographics.
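As an illustration of what such context-independent spatial features could look like, the following minimal sketch computes three of them from one user's sequence of visited stations: the number of distinct destinations, the Shannon entropy of the visit frequencies (a possible proxy for diversity) and the radius of gyration (a possible proxy for geographical scope). The function names, the input layout and the use of planar coordinates in kilometres are assumptions made for this example only; they are not taken from any of the reviewed studies.

```python
import math
from collections import Counter


def destination_entropy(visits):
    """Shannon entropy (bits) of the visit-frequency distribution.

    0 means all trips end at one location; higher values mean that
    visits are spread more evenly over more destinations.
    """
    counts = Counter(visits)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def radius_of_gyration(visits, coords):
    """Root-mean-square distance of visits from their centroid.

    `coords` maps a location id to hypothetical projected (x, y)
    coordinates in km; each visit is weighted equally, so frequently
    visited locations pull the centroid towards them.
    """
    points = [coords[v] for v in visits]
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mean_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n
    return math.sqrt(mean_sq)


def spatial_features(visits, coords):
    """Bundle label-free spatial features for one user."""
    return {
        "n_destinations": len(set(visits)),
        "entropy_bits": destination_entropy(visits),
        "radius_of_gyration_km": radius_of_gyration(visits, coords),
    }


# Hypothetical example: a user alternating between two stations 2 km apart.
example_coords = {"A": (0.0, 0.0), "B": (2.0, 0.0)}
example_visits = ["A", "B", "A", "B"]
features = spatial_features(example_visits, example_coords)
```

Because none of these quantities depends on location names, feature vectors of this kind could in principle be compared across cities or used as input to a user clustering step.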
From demand analytics towards behavioural analytics. While the identification and characterisation of human mobility patterns offers remarkable demand analytics insights, it does not advance our knowledge on the underlying determinants of travellers' behaviour. A separate large body of research is devoted to the estimation of travel choice determinants from smart card data, primarily limited to the estimation of route choice models and the impact of crowding (e.g. Hörcher et al. 2017, Yap et al. 2020). There is great potential in marrying these two research streams so as to support the development of behavioural models for users with distinctive mobility patterns, for example by means of estimating latent class models that are informed by the user clustering results.
Obtaining behavioural insights will also be stimulated by segmenting mobility patterns based on a combination of travel features that provide additional information on users' experience. A couple of recent examples making advancements in this direction include the identification of central travellers for virus spreading by means of the simultaneous consideration of the distance travelled, the number of fellow travellers one has been exposed to and the degree of exploration (El Shoghri et al. 2020) and the neural network approach taken for the joint consideration of travellers' departure and arrival time, travel location, travel distance and point of interest at the destination (Li et al. 2021).
Smart card data, which originated and has been designed as a means of facilitating automated fare collection, has clearly emerged as an invaluable source for analysing human mobility patterns. Fusing smart card data with other sources of geo-location data will potentially allow analysts to consider additional aspects of human mobility beyond those performed by means of passenger transportation. Moreover, for the research analysing mobility patterns using smart card data to become an integral part of service design and decision making and offer insights that go beyond describing current travel patterns and related user segments, future research must combine methodological advancements in data science, machine learning and clustering techniques with knowledge in travel behaviour, demand forecasting and transport planning.

Table 1: Summary of studies clustering user temporal travel patterns

Table 2: Summary of studies clustering user spatial-temporal travel patterns

Table 3: Summary of studies clustering network elements based on related travel patterns