Application of the trajectory error matrix for assessing the temporal transferability of OBIA for slum detection

ABSTRACT High temporal and spatial-resolution imageries are a valuable data source for slum monitoring. However, the transferability of OBIA methods across space and time remains problematic, due to the complexity of the term “slum”. Hence, transparency is important when analysing the transferability of OBIA methods for slum mapping. Our research developed a framework for measuring the temporal transferability of OBIA methods employing the trajectory error matrix (TEM). We found relatively low trajectory accuracies indicating low temporal transferability of OBIA methods for slum monitoring using point-based assessment methods. However, the analysis of change needs to be combined with an analysis of the certainty of this change by considering the context of the change to deal with common problems such as variations of the viewing angles and uncertainties in producing reference data on slums.


Introduction
The rapid expansion of cities has had adverse impacts, particularly in the Global South. Disparities between urban and rural areas drive people to move to cities in search of a better life. Unfortunately, governmental institutions often fail to provide serviced land with basic facilities to the growing urban population (Centre on Housing Rights and Evictions, 2008; Ooi & Phua, 2007). High land prices in cities force low-income households to settle in areas that were not planned for development (i.e. slums) with sub-standard facilities (Centre on Housing Rights and Evictions, 2008).
Although the reduction of slums is part of the global agenda (United Nations, 2014), updated information regarding the dynamics of slums (e.g. growth and clearance) is scarcely available (Shoko & Smit, 2013). Commonly employed approaches for data collection on slums, i.e. survey-based methods, have their limitations (Kohli, Sliuzas, Kerle, & Stein, 2012). For instance, due to the high temporal dynamics of slums, such data might be obsolete when they are used (Hofmann, Strobl, Blaschke, & Kux, 2008).
The use of satellite images offers an opportunity to capture the dynamics of slums since they can provide spatially consistent data with both high spatial detail and temporal frequency (Hofmann et al., 2008). However, detecting slums from satellite images is challenging, as slums often share similar surface materials with other urban features (Kohli, Warwadekar, Kerle, Sliuzas, & Stein, 2013; Kuffer, Pfeffer, & Sliuzas, 2016). Hence, slum detection methods that rely on spectral information alone should be avoided (Jain, 2007), and additional criteria, e.g. shape, texture and density, should be considered to avoid misclassifications (Kohli et al., 2013). Object-Based Image Analysis (OBIA) has this potential compared to classical pixel-based methods, as it can incorporate contextual information of objects (Ebert, Kerle, & Stein, 2009). However, OBIA shows limited transferability, i.e. methods developed for a specific image cannot be easily applied to another image, another city or even another part of the same city (Hofmann et al., 2008). To improve transferability, Hofmann et al. (2008) suggested using an ontology as a basis for slum detection. Consequently, Kohli et al. (2013) developed the generic slum ontology (GSO) to assist slum detection by providing a comprehensive characterisation of slums in an image. The GSO builds on one of the five indicators for slum households, i.e. the durable housing indicator from the definition specified by UN-Habitat (2009). It describes slum characteristics at three spatial levels, i.e. environs, settlement and object level (Kohli et al., 2012). The GSO was developed as a generic framework for slums, but it requires adaptation before it can be applied to a local context (Kohli et al., 2013). Moreover, slums change over time (e.g. Kim, Hensley, Yun, & Neumann, 2016; Kit & Lüdeke, 2013), which makes multi-temporal information crucial to monitor slum dynamics and to analyse the impact of slum-related policies (Patel, Crooks, & Koizumi, 2012).
The OBIA approach relies on a ruleset that operationalises slum indicators of the GSO. The aim of a rule-based classification is to increase the classification accuracy of urban features, particularly to minimise the confusion among classes with similar spectral information (Bouziani, Goita, & He, 2010). A ruleset is a set of rules that employs various characteristics to distinguish slum and non-slum areas. However, due to the need for local adaptation of the GSO, ruleset adaptations are inevitable (Anders, Seijmonsbergen, & Bouten, 2015; Kohli et al., 2013; Tiede, Lang, Hölbling, & Füreder, 2010). Therefore, it is crucial to describe these adaptations transparently (e.g. when applying the ruleset to a different image) to ensure the objectivity of the ruleset (Anders et al., 2015).
Various studies have discussed the transferability of OBIA rulesets for slum detection (Hofmann, Blaschke, & Strobl, 2011; Kohli et al., 2013) by comparing mapped slum locations using a specific or adapted ruleset across different locations (i.e. spatial transferability). However, to the best of our knowledge, no study has so far focused on measuring the temporal transferability of OBIA rulesets for slum detection. This study addresses that gap as follows. First, we develop a framework for measuring the temporal transferability of OBIA-based rulesets for slum detection. Second, we implement the framework for post-classification-based change detection of multi-temporal Pleiades imageries. Third, we discuss and evaluate the usage of our framework. Finally, we provide conclusions in terms of the applicability of the trajectory error matrix (TEM) for analysing temporal dynamics of slums.

Development of the transferability framework
This section consists of two parts. First, we discuss various frameworks used for measuring transferability in different domains. Second, we discuss the process of implementing the framework for measuring temporal transferability of OBIA rulesets for slum detection.

Comparison of different frameworks for measuring temporal transferability
Transferability refers to the capability of a method to provide comparable results with a minimum of adaptations when applied to different image conditions (Kohli et al., 2013). Temporal transferability is the transferability of a model developed for images collected at one point in time to images of the same area collected at another point in time (Fox, Daly, Hess, & Miller, 2014). In our study, temporal transferability therefore refers to the ability of a ruleset to produce comparable results when applied to an image from a different date.
In the OBIA domain, researchers commonly measure transferability on the basis of the ruleset adaptations required to obtain similar accuracies (e.g. Anders et al., 2015; Hamedianfar & Shafri, 2015; Tiede et al., 2010). For mapping slums, Hofmann et al. (2011) developed a framework for measuring the robustness of OBIA rulesets based on the need for threshold adaptations to obtain similar results in different images. Kohli et al. (2013) analysed the transferability of a ruleset that was based on the GSO and locally adapted for one image subset by applying it to other subsets and comparing the accuracy results.
Thus, in most frameworks, the accuracy achieved with minimum adaptations is the key parameter for measuring transferability, i.e. the more similar the accuracy of different classification results, the more transferable the ruleset. The most common methods for accuracy assessment are the error matrix and the kappa coefficient. However, these methods are designed for single-temporal thematic mapping (Li & Zhou, 2009). Therefore, Macleod and Congalton (1998) proposed a change detection error matrix, which modifies the standard error matrix to assess the accuracy of land cover change detection. Li and Zhou (2009) proposed the TEM, which is more suitable for analysing multi-temporal images than the change detection error matrix (which can be optimally employed for only two observations). The TEM is a modification of the change detection error matrix of Macleod and Congalton (1998) that utilises land cover change trajectories (Crews-Meyer, 2001; Mertens & Lambin, 2000; Petit, Scudder, & Lambin, 2001).
Although Li and Zhou (2009) used a small number of classes and images (i.e. four classes in three different temporal images, and five classes in two different temporal images), they dealt with a large number of land cover change trajectories. A high number of possible trajectories can make the implementation difficult, as the TEM then has too many columns and rows. Therefore, they grouped the possible trajectory combinations into six confusion sub-groups (shown in Table 1) to reduce the complexity of the change detection analysis. We therefore argue that the TEM with the six confusion sub-groups of Li and Zhou (2009) is the most suitable framework for assessing temporal transferability, as it can utilise multi-temporal images and also reduces the complexity of the confusion matrix.

Implementation of the temporal transferability framework
We adopt the TEM in four steps to calculate the accuracy of change detection. First, we develop the land cover/use change trajectories and the reference data. Second, we determine the confusion sub-group in the TEM for each sample (see Table 1). Third, we count the members of each sub-group. Fourth, we compute the accuracy indices.
Regarding the first step, several approaches can be used for change detection, e.g. post-classification, image differencing, unsupervised change detection (Leichtle, Geiß, Wurm, Lakes, & Taubenböck, 2017). In post-classification change detection, which is commonly employed in urban remote sensing (Hussain, Chen, Cheng, Wei, & Stanley, 2013), each image is first classified separately, either using pixel-based techniques or OBIA; then the classification results are compared to indicate changes (Macleod & Congalton, 1998). In general, post-classification change detection is less sensitive to radiometric variations among different temporal images (Mas, 1999). This approach is more appropriate to our study since we use an OBIA-based method for slum detection. A similar approach is to conduct OBIA-based classification followed by pixel-based post-classification for change detection (see Zhou, Troy, & Grove, 2008). We specifically focused on change detection between slum and non-slum areas across different temporal images.
In the second step, we distinguished six confusion sub-groups, as shown in Table 1. In S1, reference data and image classification agree that no change occurred, and slum or non-slum areas are correctly classified. For instance, a sample is correctly classified as slum, and both reference and image classification indicate that the area remains a slum. In S2, reference data and image classification indicate the same change, e.g. both indicate that the area changed from slum to non-slum. In S3, reference data and image classification both indicate no change, but the class assigned by the image classification is incorrect. In S4, the reference data indicate no change, but the image classification detects a change. In S5, the reference data indicate that the sample changed, but the image classification does not show a change. Last, in S6, both reference and image classification indicate changes, but with different trajectories, e.g. the reference data show a change from slum to non-slum, but the image classification shows a change from non-slum to slum. Figure 1 illustrates the difference between the sub-groups (S1-S6).
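The assignment of a sample to a sub-group can be sketched in a few lines of Python. This is a minimal illustration of the six definitions above, not the authors' implementation; the function name `tem_subgroup` and the string labels are hypothetical.

```python
def tem_subgroup(reference, classified):
    """Return the TEM sub-group label ("S1".."S6") for one sample.

    Both arguments are sequences of per-year class labels, for example
    ("slum", "slum", "non-slum") for 2013, 2014 and 2015.
    """
    reference = tuple(reference)
    classified = tuple(classified)
    if not reference or len(reference) != len(classified):
        raise ValueError("trajectories must be non-empty and of equal length")

    # A trajectory "changes" when it contains more than one distinct class.
    ref_changed = len(set(reference)) > 1
    cls_changed = len(set(classified)) > 1

    if not ref_changed and not cls_changed:
        # Both say "no change": S1 if the class is right, S3 otherwise.
        return "S1" if reference == classified else "S3"
    if not ref_changed and cls_changed:
        return "S4"  # change detected where none occurred
    if ref_changed and not cls_changed:
        return "S5"  # real change missed by the classification
    # Both say "change": S2 if the trajectories match, S6 otherwise.
    return "S2" if reference == classified else "S6"
```

For example, a reference trajectory (slum, slum, slum) classified as (non-slum, non-slum, non-slum) falls into S3: the no-change status is correct, but the class is not.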
In the third step, we compared the classification results with the reference data. For this purpose, we generated 300 random points, which were assigned classes based on visual image interpretation using Google Street View and ground knowledge. Each of the 300 points thus has a reference class for each of the three years from 2013 to 2015 (i.e. 900 reference labels). We used this information as input to determine the change trajectory and the sub-group (S1-S6) of each sample.
Regarding the fourth step, Li and Zhou (2009) proposed two groups of accuracy measures, i.e. overall accuracy and accuracy difference. Overall accuracy is measured by two indices, the trajectory overall accuracy (A_T) and the change/no change accuracy (A_C/N). A_T is the ratio of correct cases (correct classification and correct change trajectory) to the total number of samples (Equation (1)).
Meanwhile, the change/no change accuracy (A_C/N) counts every sample whose change or no-change status agrees between the reference and the image classification, regardless of the assigned class or trajectory (Equation (2)).
The higher the value of A_T, the more reliable the change detection result. The accuracy difference indicates how well A_C/N represents the accuracy of individual trajectories. To measure accuracy differences, Li and Zhou (2009) proposed two indices, i.e. the overall accuracy difference (OAD) and the accuracy difference of individual class (ADIC). OAD is the difference between A_C/N and A_T (Equation (3); Li & Zhou, 2009). A high OAD indicates that A_C/N overstates the trajectory accuracy, because many samples with a correct change or no-change status have an incorrect class or trajectory.
Meanwhile, ADIC measures accuracies for individual trajectories and considers both no-change and change classes. Equation (4) measures the no-change trajectory (ADIC_N), and Equation (5) the change trajectory (ADIC_C).
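A high value of A_C/N does not necessarily imply a high accuracy for an individual trajectory: if the ADIC value is low, many samples with a correct change or no-change status still have an incorrect class or trajectory.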
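Equations (1) to (5) are not reproduced in this text. The following is a reconstruction from the verbal definitions above and from the sub-groups S1 to S6, offered as a sketch rather than as a quotation of Li and Zhou (2009). The forms of ADIC_N and ADIC_C in particular are assumptions, chosen because they agree with the interpretation given in the results (the share of no-change or change cases that is also correctly classified).

```latex
\begin{align}
A_T &= \frac{S_1 + S_2}{\sum_{i=1}^{6} S_i} \times 100\% \tag{1}\\
A_{C/N} &= \frac{S_1 + S_2 + S_3 + S_6}{\sum_{i=1}^{6} S_i} \times 100\% \tag{2}\\
OAD &= A_{C/N} - A_T \tag{3}\\
ADIC_N &= \frac{S_1}{S_1 + S_3} \times 100\% \tag{4}\\
ADIC_C &= \frac{S_2}{S_2 + S_6} \times 100\% \tag{5}
\end{align}
```

Here S_i denotes the number of samples in sub-group i. Under these forms, OAD equals the share of samples in S3 and S6, i.e. samples whose change or no-change status is right but whose class or trajectory is wrong.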

Methodology: implementation of the TEM
This section consists of two parts. First, we describe the study area, the data and image pre-processing to ensure comparison of multi-temporal images. Second, we locally adapt the GSO based on interviews with local experts and develop a ruleset to detect slums in the study area across multi-temporal Pleiades imageries (2013-2015).

Study area, data and pre-processing
To demonstrate the applicability of our framework, we chose Jakarta, Indonesia as a study area. Jakarta is a highly dynamic urban area with more than 30 million inhabitants within its metropolitan area (Demographia, 2016). Recently, the government of Jakarta started a programme for mapping all slums across the country, to meet the national policy target of zero slums by 2019 (UN-Habitat, 2014). The programme will include slum upgrading but also the relocation of slum dwellers at a massive scale (Figure 2). Presently, slum mapping is based on ground surveys using a complex set of indicators (Ministry of Public Works and Housing (Kemen PUPR), 2016). This results in inconsistent maps (produced by local surveys) and in a dramatic increase in the slums reported by municipalities hoping to acquire national funds for upgrading (Leonita, 2018). Obtaining timely and reliable information on the dynamics of slums is crucial for a consistent slum database that allows the policy implementation to be monitored.
However, slum mapping in Indonesia employing morphological indicators in remotely sensed imageries is challenging. Informal areas, locally called kampungs, dominate residential areas in Jakarta. These areas appeared more than 50 years ago when the planning institutions were not yet established (Rukmana, 2008). A kampung is a spontaneous urban settlement that has grown organically without planning guidance or provision of services (Sihombing, 2014). Kampungs are not necessarily slums, although they may share morphological characteristics with slums. Nowadays, many kampung residents have formal ownership of their land, and they often come from middle-income households (see Sihombing, 2007, 2014).
We employed multi-temporal Pleiades images with standard-ortho bundles for the years 2013-2015. The images have a spatial resolution of 0.5 m for red, green, blue and near infra-red bands. We selected an image for each year with less than 10% cloud cover. Since multi-temporal data was used, variations of the atmosphere during this period could affect the outcome. Therefore, image pre-processing, i.e. haze and sun angle correction, was done to ensure consistent input for the ruleset.

Local adaptations and slum detection
Topic-focused interviews (Groenendijk & Dopheide, 2003) were conducted with local experts from different institutions and backgrounds (i.e. central government, local government, NGO, consultant) to determine the local characteristics of slums. We selected subsets within the Tebet District of South Jakarta (Figure 3) for three reasons. First, Tebet District comprises various urban land uses, namely high-class residential, central business district (CBD), transportation hub and slums. Second, slums are often located on the riverbanks of this district. Third, different typologies of slums exist in the district, e.g. slums located on the riverbanks, near railroads and close to the CBD. Thus, the selected area is complex, and local experts exhibited a high level of uncertainty when visually delineating slum boundaries there, as shown in a previous study (Pratomo, Kuffer, Martinez, & Kohli, 2017). Across the three years used in this study, slums do not show substantial changes. This provides a suitable and very complex test case to assess whether the TEM captures the low dynamics in the area and to analyse the temporal transferability of the ruleset.
The six local experts who were interviewed had different perceptions of slum characteristics (Table 2). For some features, the experts showed high agreement, i.e. located on riverbank, small building size, irregular pattern and poor roof materials. Meanwhile, two characteristics were mentioned by only one expert from the local government, i.e. located close to high-class residential areas or the CBD, and built on illegal land. The tenure status (i.e. built on illegal land) cannot be directly detected from the satellite imagery. However, experts from the local government argued that slum and non-slum kampungs can only be distinguished by this characteristic.
We translated the real-world definitions of slums given by the local experts into image-based features by using several visual elements, e.g. tone, shape, size, association and texture. For instance, slums have an unorganised layout, which relates to the object geometry, i.e. the shape of slums. We also used association to describe the relationship between objects and other spatial features. For instance, slums are often found on riverbanks; thus, rivers can be associated with slums. Regarding tenure characteristics (Table 2), we used ancillary data (e.g. socio-economic, tenure status) as demonstrated by Kohli et al. (2013) or Netzband (2010). We also used the official land-use planning map, obtained from the Government of Jakarta. According to the discussions with the local experts, the government owns open spaces (e.g. public parks, riverbanks), and it is prohibited to build on them. Therefore, these data could be used as a proxy indicator for the tenure status, and illegal settlements could be identified by overlaying the boundaries of open spaces on the imageries. A comparison of the real-world and image domain characteristics of slums can be seen in Table 3.
We developed the OBIA ruleset according to image domain characteristics (Table 3) in Trimble eCognition version 9.2.1. Slum detection was conducted in two stages, segmentation and classification. Various algorithms can be used for the segmentation, e.g. chessboard, quad-tree-based, contrast filter, contrast split, multi-resolution, multi-threshold (Trimble Germany GmbH, 2015). For slum detection, multiresolution segmentation (MRS) is the most widely used algorithm (Kohli et al., 2012; Kuffer, Barros, & Sliuzas, 2014), as it can produce homogeneous objects from different types of data (Baatz & Schäpe, 2000). The implementation of MRS depends on the Scale Parameter (SP) (Drǎguţ, Tiede, & Levick, 2010), which controls the heterogeneity and size of an image object (Baatz & Schäpe, 2000). SP values are often selected by trial and error (Whiteside, Boggs, & Maier, 2011), which decreases the robustness of the approach (Arvor, Durieux, Andrés, & Laporte, 2013). Hence, we employed the Estimation of Scale Parameter (ESP) tool, developed by Drǎguţ et al. (2010), to optimise the selection of the SP based on the rate of change in the local variance (ROC-LV).
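The idea behind ROC-LV can be sketched as follows, assuming the local variance (LV) of the objects has already been measured at a series of increasing scale parameters. The rate of change is taken here as the percentage change of LV from one scale level to the next, and local peaks are flagged as candidate scales. This is an illustration only: the function names and the simple peak rule are assumptions, and the sketch does not reproduce the ESP tool.

```python
def roc_lv(local_variance):
    """Percentage rate of change of local variance between consecutive levels.

    The i-th returned value compares level i + 1 with level i.
    """
    rates = []
    for previous, current in zip(local_variance, local_variance[1:]):
        rates.append((current - previous) / previous * 100.0)
    return rates


def candidate_scales(scales, local_variance):
    """Return the scale parameters at which ROC-LV forms a local peak."""
    rates = roc_lv(local_variance)
    peaks = []
    for i in range(1, len(rates) - 1):
        # rates[i] belongs to scales[i + 1], the upper level of the pair.
        if rates[i] > rates[i - 1] and rates[i] > rates[i + 1]:
            peaks.append(scales[i + 1])
    return peaks


# Hypothetical LV values for six scale levels (not measured data).
example_scales = [10, 20, 30, 40, 50, 60]
example_lv = [100.0, 110.0, 115.0, 126.5, 128.0, 129.0]
print(candidate_scales(example_scales, example_lv))
```

A peak in ROC-LV suggests a scale level at which objects match a meaningful structure in the image; the analyst still has to decide which peak fits the target objects.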
Following the segmentation process, we employed a two-level classification, which started with background removal followed by slum detection. First, we focused on extracting the background classes with smooth segmentation (MRS, SP = 5), i.e. vegetation, railroad, road and river. To assist the classification of the background classes, we used ancillary data from Open Street Map (OSM) to extract roads, railroads and rivers. The normalised difference vegetation index (NDVI) was used for detecting vegetation. Second, we calculated the SP using the ESP tool. For the complete process of developing the ruleset, refer to Figure 4. This ruleset started with a smooth segmentation for background removal and was followed by coarser segmentation at Level 2. In this level, we implemented the SP obtained from ESP for each image. In addition, we implemented different classification thresholds for each image.
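The vegetation rule mentioned above rests on the standard NDVI formula, (NIR - Red) / (NIR + Red). A minimal sketch for a single pixel or object mean is given here; the threshold of 0.3 is a hypothetical value for illustration and is not taken from the ruleset (Table 4).

```python
VEGETATION_THRESHOLD = 0.3  # hypothetical value, not from the paper


def ndvi(nir, red):
    """Normalised difference vegetation index for one pixel or object mean."""
    total = nir + red
    if total == 0:
        # Convention chosen for this sketch to avoid division by zero.
        return 0.0
    return (nir - red) / total


def is_vegetation(nir, red, threshold=VEGETATION_THRESHOLD):
    """Label an object as vegetation when its NDVI exceeds the threshold."""
    return ndvi(nir, red) > threshold
```

Because NDVI is a band ratio, it is less affected by overall brightness differences between acquisition dates than raw band values, although the threshold may still need adjustment per image.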
Threshold values were fine-tuned by trial and error for the indices that are affected by atmospheric variations (i.e. colour), to obtain comparable results from the ruleset across different years (Table 4). Meanwhile, we kept the other indicators constant, i.e. association, shape, size (Table 3).

Results
In this section, we apply OBIA across multi-temporal Pleiades imageries (2013-2015) to classify slums. Further, based on the classified slums, we calculate accuracies of the change detection and the temporal transferability according to the TEM framework.

Object-based classification of slums
Following the steps in Figure 4, we classified the image into six classes, i.e. vegetation, built-up, roads, river, slums and railroad. Figure 5 shows how slums change over the 3 years in the study area. In general, slums follow the railroad and river. In principle, slums are found at the same locations across the years, but their boundaries differ.
Note to Table 2 (partially recovered): (4) and (5) are local housing consultants (Pratomo et al., 2017).

Change detection analysis
To conduct the post-classification change detection, first, the classification (Figure 5) is reclassified into binary images, reducing the number of classes from six to two. Thus, we kept the slum class and merged the other classes into a background class (Table 5), assigning a unique class value for each year. Second, the binary classification results from 2013 to 2015 are combined into a composite map (Figure 6). Since we worked with three images and two classes (i.e. slum and non-slum), the number of possible trajectory combinations was two to the power of three, i.e. eight trajectories.
Following the new classification (Table 5), eight change trajectories were obtained (i.e. 111, 112, 121, 122, 211, 212, 221 and 222). For instance, a pixel labelled 111 remains classified as non-slum between 2013 and 2015. Meanwhile, a pixel labelled 121 was labelled as non-slum in 2013, changed into slum in 2014 and changed again into non-slum in 2015. Figure 6(b) shows that most of the area did not change (87.5%).
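The construction of the trajectory codes can be sketched as follows, assuming each year's binary map is available as a flat list of pixel values with 1 for non-slum and 2 for slum. The function names are hypothetical, and the grouping into "improved", "worsened" and "temporary" is an assumption based on the first and last year; the per-year class values of Table 5 may differ from this encoding.

```python
from collections import Counter

NON_SLUM, SLUM = 1, 2


def trajectory_code(labels):
    """Concatenate per-year class values into a code such as 112."""
    code = 0
    for label in labels:
        if label not in (NON_SLUM, SLUM):
            raise ValueError("labels must be 1 (non-slum) or 2 (slum)")
        code = code * 10 + label
    return code


def trajectory_shares(map_2013, map_2014, map_2015):
    """Percentage of pixels per trajectory code for three equally sized maps."""
    codes = [trajectory_code(t) for t in zip(map_2013, map_2014, map_2015)]
    counts = Counter(codes)
    return {code: 100.0 * n / len(codes) for code, n in sorted(counts.items())}


def direction(code):
    """Group a trajectory code by its first and last year (assumed grouping)."""
    digits = [int(d) for d in str(code)]
    if len(set(digits)) == 1:
        return "no change"
    if digits[0] == SLUM and digits[-1] == NON_SLUM:
        return "improved"
    if digits[0] == NON_SLUM and digits[-1] == SLUM:
        return "worsened"
    return "temporary"  # e.g. 121 or 212: same class in first and last year
```

With three dates and two classes, `trajectory_code` can only produce the eight codes listed above.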
Among the change trajectories, 212 covers the largest share of the area (3.2%), followed by 112 (3.0%). Based on the eight possible trajectories, the size of the area that improved and worsened can be determined. In the improved area, the status changed from slum to non-slum (i.e. 211 and 221); this accounts for 22.6% of total changes. The worsened area, where the status changed from non-slum to slum, accounts for 38.7% of total changes. Thus, considering the proportion of changes, the results indicate that more area worsened than improved between 2013 and 2015.

Transferability measurements
By comparing the reference data and results from the post-classification change detection, we obtained the number of samples for each sub-group in TEM, as can be seen in Table 6.
In total, 74.3% of the samples were categorised as S1, i.e. correctly detected as no change with the correct land cover class, followed by 15.3% as S4, where the image classification indicated a change but no change occurred on the ground. From the results in Table 6, we computed the accuracy indices of Equations (1) to (5) (Table 7).
For the trajectory overall accuracy (A_T), we obtained 74.7%, which means that 74.7% of the samples have a correct classification with a correct change trajectory. A_T is lower than the overall classification accuracy (OA) because A_T also considers the correctness of the change trajectory. For the change/no change accuracy (A_C/N), we obtained 83.7%, which is higher than A_T. The reason is that A_C/N only considers the correctness of the change detection, without considering the correctness of the classification or of the change trajectory. For instance, a point with incorrect classifications still counts as correct in A_C/N as long as it is correctly detected as a change, either from slum to non-slum or vice versa. For the OAD, we obtained 9.0%, i.e. A_C/N exceeds A_T by nine percentage points. The high value of OAD indicates that although we have successfully determined the change and no-change status, some of the trajectories do not match the reference data.
For ADIC_N, we obtained 89.6%, the highest of the five measures (Equations (1) to (5)). The value indicates that the majority of "no change" cases can be identified from the correct classification results. Meanwhile, for ADIC_C, we obtained 50%, which means that half of all change cases can be identified from the correct classification.
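The five indices can be computed from the sub-group counts as sketched here. The formulas follow the verbal definitions in the framework section, and the ADIC forms (S1 / (S1 + S3) and S2 / (S2 + S6)) are assumptions. The example counts are illustrative: S1 = 223 and S4 = 46 follow from the reported 74.3% and 15.3% of 300 samples, while the other four counts are assumptions chosen so that the indices match the reported percentages. They are not values read from Table 6.

```python
def tem_indices(s1, s2, s3, s4, s5, s6):
    """Return the TEM accuracy indices (in percent) from sub-group counts."""
    total = s1 + s2 + s3 + s4 + s5 + s6
    if total == 0:
        raise ValueError("at least one sample is required")
    a_t = 100.0 * (s1 + s2) / total
    a_cn = 100.0 * (s1 + s2 + s3 + s6) / total
    return {
        "A_T": a_t,
        "A_C/N": a_cn,
        "OAD": a_cn - a_t,
        # None when a denominator is zero, i.e. the index is undefined.
        "ADIC_N": 100.0 * s1 / (s1 + s3) if (s1 + s3) else None,
        "ADIC_C": 100.0 * s2 / (s2 + s6) if (s2 + s6) else None,
    }


# Illustrative counts for 300 samples (partly assumed, see lead-in).
indices = tem_indices(s1=223, s2=1, s3=26, s4=46, s5=3, s6=1)
for name, value in indices.items():
    print(name, round(value, 1))
```

With so few samples in S2 and S6, ADIC_C rests on a very small denominator, so a single sample can shift it by tens of percentage points.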

Discussion
We have demonstrated the usage of our framework for measuring the temporal transferability of the employed OBIA ruleset for slum detection. The relatively low value of the trajectory overall accuracy (A_T) indicates that our OBIA ruleset has a moderate to low transferability for change detection. We argue that the problems with temporal transferability have several causes. First, the overall classification accuracy obtained for individual images was low (i.e. 65% for 2015). However, similar accuracy levels have also been reported by other OBIA-based slum mapping studies (e.g. Hofmann et al., 2011; Kohli, Stein, & Sliuzas, 2016; Kuffer et al., 2014). The low accuracies could be due to difficulties in formulating a crisp definition of slums that can be translated into an OBIA ruleset. As shown in Table 2, local experts disagree on slum characteristics in the study area, which was also shown in recent studies by Kohli et al. (2016) and Pratomo et al. (2017). This causes uncertainties when generating reference data. For instance, many high-density areas that are detected in the images as slums cannot be categorised as slums on the ground (see, e.g. Figure 7). In addition, using morphological characteristics in imagery, it is difficult to precisely identify the boundaries between slum and non-slum areas, particularly in transition zones. Therefore, location-based information is very important (Taubenböck, Kraff, & Wurm, 2018); we included such information in our ruleset (e.g. proximity to river), which improved the separability. Second, since we conducted the change detection analysis with a post-classification method, any error made in the classification stage propagated into the change detection. The variations in the viewing angle of the multi-temporal images resulted in misclassification of changes, as also indicated by Leichtle et al. (2017). Figure 8 shows an example of an area that did not change on the ground but was detected as a change in the image.
Although the trajectory overall accuracy (A_T) is moderate (75%), the high value of ADIC_N shows that our OBIA ruleset has a relatively high transferability when applied to multi-temporal imageries. Almost 84% of the samples were correctly identified as change or no change (A_C/N = 83.7%).
A fundamental limitation of this research is that the TEM-based assessment rests on whether an individual point changed or not. This makes the assessment vulnerable to changes in imaging conditions (e.g. viewing angles, shading effects). To support the monitoring of slum upgrading programmes, a more aggregated, contextual assessment would be required. Thus, in addition to the point itself, one would need to assess whether its neighbourhood (context) also confirms the change (Figure 9). This would remove a large amount of "noise" (e.g. caused by different viewing angles) and would improve the certainty that a location really changed. Figure 9 shows the relationship between the amount of contextual change and certainty. For example, in the case of a pixel, when all the surrounding pixels confirm the change, the certainty of the change is high, whereas when none of the surrounding pixels changed, the certainty is low. In the case of objects, the topological relationship between the changed object and its neighbouring objects needs to be assessed to analyse the certainty of the change. Thus, in order to inform policy development and indicate changes in a certain area, a sufficient contextual threshold needs to be defined. This would need to be combined with a decision on the minimum context to be considered (e.g. directly neighbouring pixels/objects or a larger context with a minimum change area).
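For the pixel case, such a contextual certainty could be sketched as the share of the eight surrounding pixels that also changed. This is one possible operationalisation of the idea in Figure 9, not a method applied in this study; the function names and the threshold of 0.5 are hypothetical.

```python
MIN_CONTEXT = 0.5  # hypothetical contextual threshold


def change_certainty(changed, row, col):
    """Share of the 8-neighbourhood of (row, col) that also changed.

    `changed` is a rectangular list of lists of booleans (True = change
    detected). Returns None when the pixel itself is not flagged as changed.
    """
    if not changed[row][col]:
        return None
    n_rows, n_cols = len(changed), len(changed[0])
    neighbours = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the pixel itself
            r, c = row + dr, col + dc
            if 0 <= r < n_rows and 0 <= c < n_cols:
                neighbours.append(changed[r][c])
    if not neighbours:
        return 0.0  # a 1x1 map offers no context to confirm the change
    return sum(neighbours) / len(neighbours)


def is_confirmed_change(changed, row, col, threshold=MIN_CONTEXT):
    """True when the pixel changed and enough neighbours confirm it."""
    certainty = change_certainty(changed, row, col)
    return certainty is not None and certainty >= threshold
```

An isolated changed pixel receives a certainty of 0 and would be treated as noise, while a pixel inside a block of changed pixels receives a certainty of 1. Pixels at the map edge are evaluated against the neighbours that exist.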
In our study, we considered a relatively short time span of three years. This choice reflects the fast changes in Jakarta, where slum-related policies are being implemented to upgrade or resettle slums. In further research, our method could be tested in other contexts to capture changes over longer time spans.

Conclusions
Our study has developed a framework to quantify the temporal transferability of an OBIA ruleset by adopting the TEM. We demonstrated the usability of our framework by employing multi-temporal Pleiades imageries over 3 years (2013-2015). Two sources of uncertainty occurred, potentially causing the relatively low trajectory overall accuracy (A_T). First, the fuzzy boundaries and definition of slums caused difficulties when generating reference data for slum and non-slum areas. The local context aggravated this problem, as kampungs are informally developed, high-density areas but not necessarily slums. Second, variations in the viewing angle of the images led to misclassification of changes. These results demonstrate that a pixel-based accuracy assessment that does not include the neighbourhood (context) of the change produces a noisy change analysis.
Therefore, further research should quantify the various sources of temporal uncertainty and add a contextual certainty analysis. In the context of Indonesia, particularly in Jakarta, the government has set a target to achieve zero slums by 2019; this requires monitoring the implementation of such slum reduction policies. For this purpose, the TEM is a suitable tool to quantify trajectory accuracies in multi-temporal VHR imageries. However, it would need to be combined with an assessment of the certainty of change. This would allow policymakers to be aware of the correctness of change trajectories and of the certainty in measuring slum dynamics.