Harmonization of land-cover data to assess agricultural land transformation patterns in the peri-urban Spanish Mediterranean Huertas

ABSTRACT Most of the peri-urban areas in European cities are characterized by a mix of rural and urban uses. Despite being sprawled areas, they provide opportunities for improving green connectivity at a multiscale level, between urban green spaces and natural or agricultural peripheral extensions. Several land monitoring services, at both national and European levels, have become key tools for analyzing and diagnosing the transformation patterns and dynamics of these areas. However, the accuracy of available datasets is typically not adequate for approaching their spatial complexity. This research proposes a methodology to improve precision by combining land use datasets and applies it to a specific case study, the peri-urban Spanish Mediterranean Huertas, highly valued agricultural and cultural landscapes under intense urban pressure. Findings reveal that this method detects and resolves inaccuracies and is easily replicable in different spatial contexts, making it an effective tool for decision-making processes.


Introduction. LULC in the peri-urban areas
Land use and land cover (LULC) changes are a central issue within environmental land policies regarding sustainability (Banzhaf et al., 2017; Lai et al., 2017). Literature on the topic is extensive, and several authors agree that spatiotemporal transformation processes are difficult to measure because of the different speed and magnitude of those LULC changes (Borgogno-Mondino et al., 2015; Lambin & Geist, 2006; Salem et al., 2020). Specifically, the transition zones between urban tissue and the peri-urban territory are complex areas, generally perceived as a chaotic mixture of land uses distributed in highly fragmented land plots, which are under urban metropolitan pressure (Meeus & Gulinck, 2008). Furthermore, there is an increasing process of environmental degradation and abandonment of traditional productive activities, such as agriculture. This process has a significant impact on land use and has become a concern widely addressed by academic researchers and administrations (Levers et al., 2018).
For decades, the integration of urban and rural areas has been outlined as a relevant issue by the European Commission (European Environment Agency, 2020c), driving several initiatives and projects such as RURBAN (Partnership for sustainable urban-rural development; European Union, 2012) or the H2020 project REPAiR (REsource Management in Periurban Areas: Going Beyond Urban Metabolism; TU Delft (coord), 2020), among others. These examples give evidence of the interest in and relevance of creating a place-based development strategy in the interface of peri-urban areas (De Falco et al., 2019). Additionally, they show the advantage of further enhancing the connectivity between urban green infrastructure and peripheral areas, providing new opportunities for these peri-urban areas to act as providers of ecosystem services.
Land cover datasets are one of the basic sources of information for land use change research, including multidisciplinary and multiscale studies. The analysis of LULC changes provides both a quantitative and a qualitative description of the spatiotemporal transformation process (Antrop, 2004). Examples include a) the identification of factors and causes of these dynamics of change (Costanza & Ruth, 1998; Ruiz-Martinez et al., 2020), b) the impact assessment of these transformations from an ecological perspective (Botequilha Leitão & Ahern, 2002), and c) the prediction of future use scenarios (Minetos & Polyzos, 2009) to better target land use policy formulation (Dwyer, 2011; Pickard & Meentemeyer, 2019), mainly connected with green infrastructure conceptualization (Benton-Short et al., 2019).
Importantly, existing research is developing different approaches and methods to tackle these peri-urban fringe imbalances and to determine local factors. Indeed, in Europe, some of the current research topics based on LULC dataset analysis focus on: i) the study of land-use intensity with the definition of a new comprehensive analytical framework (Erb et al., 2013; Tang et al., 2020; Zeng et al., 2005), ii) the identification and mapping of high nature value farmland by combining several datasets (Bonato et al., 2019), iii) the definition of landscape types using CORINE Land Cover (Vizzari et al., 2018), iv) the assessment of total energy input per hectare (Rega et al., 2020), or v) the definition of land use indicators based on neural network and self-organizing map approaches (Van der Zanden et al., 2016).
However, peri-urban agricultural (PUA) areas are mostly misrepresented in existing approaches due to the weak accuracy of LULC datasets in these territories, caused by the diversity and small scale of the changes. This is a critical limitation because these two features are key to better understanding the spatial configuration, which determines not only the perception of these areas but also the possibilities of defining new planning strategies (Tuanmu & Jetz, 2014). Although the precise characterization of PUA areas is essential to improve policies and planning objectives based on evidence, current fragmentation indicators have significant weaknesses in the spatial unit of analysis that may explain changes in these areas (Morán Alonso et al., 2017).
This research assesses whether combining different LULC datasets helps to increase their accuracy in PUA areas. If so, it would be possible to provide new data that help define urban planning strategies in these areas with more detailed information about the urban and rural dynamics. The method has been specifically tested on three Spanish Mediterranean Huertas as a case study. Huertas are ancestral, highly productive agricultural landscapes, and they represent an important environmental, cultural, and productive asset, which is under urban pressure. This innovative method contributes to better identifying and visualizing changes in the structure of the territory, considering land fragmentation. Moreover, based on a layer-overlapping system, this study provides a working tool that intertwines the most widely used land-monitoring datasets among those currently available. Therefore, it contributes to data harmonization on different factors of the operational area and, as a result, to reducing significant uncertainties revealed in the analysis of current land cover datasets.

Peri-urban Mediterranean Huertas as a PUA benchmark
Huerta is defined as one of the thirteen acknowledged European landscapes, as stated in the Dobris Report developed by the European Environment Agency (1998). Indeed, it is considered as a unique heritage asset and a key component of the cultural identity, together with other traditional Mediterranean agricultural areas (CESE, 2005). In the European Mediterranean basin, Huertas are intimately linked to traditional urban settlements' location and growth. Nowadays, these PUA areas conflict with new urban activities, which are often accompanied by an environmental decline due to the loss and fragmentation of agricultural land plots (Font, 2004;Garcia-Marin et al., 2020;Vallés-Planells et al., 2020;Verdú-Vázquez et al., 2021).
This research explores the peri-urban Spanish Mediterranean Huertas as a case study. They are located in three different areas along the Mediterranean basin (Figure 1): 1) Huerta de Zaragoza (38,061.6 ha), in the metropolitan area of the homonymous city (Zazo, 2010); 2) L'Horta de Valencia (23,129.1 ha), located in the metropolitan area of Valencia city (Romero & Melo, 2016); and 3) Huerta de Murcia-Alicante (83,491.9 ha), which extends over the metropolitan area of Murcia city (Cánovas-Molina et al., 2021) and the polynuclear area of southern Alicante province (García-Mayor, 2017). To date, Huerta areas have been comprehensively studied as separate cases (Temes & Moya, 2016), but there is still little research addressing these areas from an intertwined and multiscale approach (Martí & García-Mayor, 2020).

LULC datasets
Three land-monitoring datasets have been selected to evaluate land cover classification and spatiotemporal analysis of the PUA landscape in the Mediterranean: 1) CORINE Land Cover (CLC) as a European scale LULC reference map (European Environment Agency, 2020a), 2) SIOSE (Information System of Land cover/Land use in Spain) as the national level LULC map (Urbana & Ministerio de Transportes, 2020), and 3) SIGPAC (Geographic Information System of Agricultural Parcels of Spain) as the Spanish regional level dataset, developed to assist different agents in the application of the EU Common Agricultural Policy for agricultural land subsidies (Ministerio de Agricultura, 2015). Table 1 shows the basic features of the selected datasets, reflecting substantial differences in scale and mapping accuracy among all three, with SIGPAC being the most accurate and having the largest scale, 1:5,000. Although the SIGPAC dataset provides deeper detail about agricultural plots, it gathers little about urban uses. In contrast, the SIOSE and CLC compilations include more information about urban uses and dynamics at different scales, 1:25,000 and 1:100,000, respectively (Büttner et al., 2017; Urbana & Ministerio de Transportes, 2020), but little about small-scale LULC changes in the agricultural tissue.
Each dataset is generated using different methodologies. At the European level, CLC has traditionally been generated from satellite image photointerpretation, but since 2006, some countries have generated the information from generalization techniques using more detailed thematic maps (Hazeu et al., 2016). In the Spanish case, since 2012, CLC has been updated by applying generalization techniques from SIOSE and complemented with photointerpretation. The temporal analysis varies depending on each dataset: CLC covers from 1990 to present time, while SIOSE and SIGPAC provide data since 2005. All of them are regularly updated at different time intervals: CLC every six years, SIOSE every three years, and SIGPAC annually.
Additionally, each dataset's classification system is a key point for understanding the method definition: CLC has a hierarchical classification system that uses unique codes, organized in different grouping levels; SIOSE has an object-oriented system in which each polygon is defined by a homogeneous land coverage, as a result of different single coverages in the proportions expressed in the dataset (Equipo técnico Nacional SIOSE, 2018); and SIGPAC defines a classification of unique codes for agricultural uses per plot that are declared annually by farmers (García, 2016). All these differences imply variations in the accuracy and consistency of the dataset results. Table 2 reports the comparison of the agricultural and artificial area extensions in PUA Huertas between the CLC and SIOSE datasets, using their first level of cover aggregation. The oldest and newest updates of these datasets that match in time are used (2006-2005 and 2018-2015) to explore land cover changes.
In the PUA Huertas analyzed, the spatial analysis shows a great heterogeneity of farmland, but neither dataset (CLC or SIOSE) reflects dwellings' occupancy, which is one of the specific spatial patterns of these areas. Scattered dwellings are included as a generic 'artificial surface', mainly when they are near the consolidated urban tissue, or as 'agricultural coverage' when there is a greater discontinuity and crop plots are prevalent (Figure 2). The lack of accuracy in the CLC dataset for analyzing fallow or abandoned farmland is pointed out by some authors (Levers et al., 2018), and others highlight the existence of inconsistencies between CLC and SIOSE or show that data provided by the CLC and SIOSE datasets are very similar at the national level (García-Álvarez & Camacho Olmedo, 2017). Consistent with previous research, this study finds an increased accuracy in assessing land use changes when combining multiscale datasets, because more nuances and greater differences can be determined at the local and regional level (Olazabal & Bellet, 2018).
All of the above justifies the selection of SIOSE and SIGPAC as the most suitable datasets to address the harmonization process, with the aim of increasing accuracy in the definition of the PUAs' spatial patterns (Figure 2).

Harmonization process applied to PUA Huertas
The method proposes a harmonization of the SIOSE and SIGPAC datasets in the PUA Huertas (Figure 3) by exploring the classification provided by both datasets and the coverage correspondence between the attributes 'SIOSE_CODE' from the SIOSE dataset (Instituto Geográfico Nacional, 2016) and 'USO_SIGPAC' from the SIGPAC dataset. Specifically, this exploration is carried out in a three-step process:
• First, the spatial transformation among dataset polygons is performed, obtaining a new polygon layer as an intersection of both datasets. This new layer comprises the predominant coverage from SIOSE and the use in SIGPAC. Additionally, to refine the data, SIGPAC polygons smaller than 15 m² have been removed from the imputation model (Ministerio de Hacienda y Administraciones Públicas, 2013).
• Second, the two attributes ('SIOSE_CODE' and 'USO_SIGPAC') are compared, and those polygons where the values do not correspond to the same land coverage are identified.
• Third, and finally, a reclassification of the coverages is performed for those polygons previously identified. In this step, a semantic translation is performed based on a visual interpretation of these polygons from orthophotography. As a result, a classification for PUA Huertas is set up according to SIOSE nomenclature.
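The second step, together with the 15 m² filter from the first step, can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the authors' GIS workflow: the `Piece` record, the `SIGPAC_BROAD` lookup, and the rule that SIOSE codes starting with 1 are artificial and those starting with 2 are agricultural are hypothetical simplifications inferred from the codes quoted in this article (100, 111, 121, 200; CA, ED, ZU, CI, FY, TH). The geometric intersection itself is assumed to have been done beforehand in a GIS.

```python
from dataclasses import dataclass

MIN_SIGPAC_AREA_M2 = 15.0  # threshold quoted in the text

# Hypothetical, partial lookup from SIGPAC use codes to a broad class.
SIGPAC_BROAD = {
    "CA": "artificial",    # roads
    "ED": "artificial",    # buildings
    "ZU": "artificial",    # urban areas
    "CI": "agricultural",  # citrus
    "FY": "agricultural",  # fruit trees
    "TH": "agricultural",  # Huerta
}


def siose_broad(code: str) -> str:
    """Assumed rule: 1xx codes are artificial, 2xx codes are agricultural."""
    if code.startswith("1"):
        return "artificial"
    if code.startswith("2"):
        return "agricultural"
    return "other"


@dataclass
class Piece:
    """One polygon of the intersected layer (hypothetical record layout)."""
    piece_id: int
    sigpac_area_m2: float  # area of the source SIGPAC polygon
    siose_code: str        # value of 'SIOSE_CODE'
    uso_sigpac: str        # value of 'USO_SIGPAC'


def flag_mismatches(pieces):
    """Drop tiny SIGPAC polygons, then keep pieces whose two codes disagree.

    SIGPAC codes missing from the lookup are treated as 'unknown' and are
    therefore always flagged, so they also reach the visual review step.
    """
    kept = [p for p in pieces if p.sigpac_area_m2 >= MIN_SIGPAC_AREA_M2]
    return [
        p for p in kept
        if SIGPAC_BROAD.get(p.uso_sigpac, "unknown") != siose_broad(p.siose_code)
    ]


example = [
    Piece(1, 500.0, "200", "CA"),  # road inside SIOSE agricultural land
    Piece(2, 10.0, "200", "ED"),   # below 15 m2, removed before comparison
    Piece(3, 800.0, "200", "CI"),  # both datasets agree: agricultural
    Piece(4, 300.0, "111", "CI"),  # citrus inside SIOSE artificial cover
]
flagged = flag_mismatches(example)
print([p.piece_id for p in flagged])  # [1, 4]
```

The flagged pieces are the candidates for the third step, where the final category is decided by visual interpretation of orthophotography rather than by any automatic rule.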

Results
This new dataset, PUA Huertas, includes an attribute for the classification coverage that is based on the SIOSE categories. However, this attribute takes into account the harmonized polygons (those with disparities between the SIOSE and SIGPAC datasets) and includes new categories, according to SIGPAC information and visual interpretation. Considering this fact, the results in this section are presented in relation to two aspects: first, the analysis of how the harmonized dataset addresses the detected discrepancies and, second, the results of combining qualitative and quantitative analyses for assessing LULC changes in PUA Huertas from 2005 to 2015.

Dataset category harmonization: the qualitative perspective
The spatial complexity of PUA areas is reflected in the mismatches and aggregation inconsistencies detected in the identification and classification of small areas, mainly within the SIOSE dataset. In this study, discrepancies between land-use dataset classifications have served as the clue for developing the harmonization (Table 3). Although the SIGPAC dataset focuses on agricultural plots, in this method it also helps to reclassify some artificial soils, such as 'road' (code CA) or 'building categories' (code ED), that were included in 'agricultural land' by SIOSE. This is because the larger, more detailed scale of SIGPAC provides better accuracy for assessing agricultural land transformation. Other land coverages require a visual analysis, contrasting the SIOSE dataset with orthophotography, such as those reclassified as 'unbuilt land' (code 121). These areas were initially considered as 'agricultural plots' (code 200) in the SIOSE dataset because the urban transformation was not completed, and as 'urban areas' (code ZU) in the SIGPAC dataset because the agricultural use had been abandoned (Figure 4a). One of the most relevant findings is related to isolated buildings scattered across the agricultural land, which have required greater attention and a deeper visual review. Generally, these constructions are classified in SIOSE as 'artificial covers' (code 100), either as 'unbuilt land' (code 121) or 'other constructions' (code 111; Table 3). However, the proposed method has made it possible to recover the agricultural use of some of these specific areas (Figure 4b). Indeed, although the concentration of buildings causes SIOSE to classify them as 'artificial covers', the application of this method reveals that some areas retain their agricultural uses, specifically 'citrus' (codes CF and CI), 'other woody crops' (codes FS, FV, and FL), and 'vineyards' (codes VF and VI).

* Category IV has been incorporated into those predefined by SIOSE.
The SIGPAC 'unproductive' category (code IM) has also facilitated the identification of plots with agricultural plastic technologies (Figure 4c). In these cases, a new category, not previously considered by SIOSE, has been generated for 'greenhouse and plasticulture crops' (code IV). This category provides a more accurate land classification of those SIGPAC 'greenhouses' and SIOSE 'arable land' (code TA). Figure 5 summarizes the total harmonized coverages obtained as a result of the application of this GIS-based method. This figure provides an overview of three different criteria: location of the PUA (left column), land coverages that have been reclassified according to SIGPAC codes into artificial and agricultural areas (middle column), and year from the retrieved data source that has been analyzed (right column).

Outcomes of combining qualitative and quantitative analyses for assessing LULC changes in PUA Huertas
A closer look at the results, in correspondence with Table 3, reveals interesting issues to consider. First, the reclassification of land coverages into 'artificial and other land uses' (11,772.20 ha in 2005 and 14,167.70 ha in 2015) is significantly higher than the reclassification of land coverages into 'agricultural uses' (2,304.30 ha in 2005 and 2,316.20 ha in 2015). This means that more than 80% (83.63% in 2005 and 85.95% in 2015) of the harmonization has focused on artificial and other land use areas, and less than 20% (16.37% in 2005 and 14.05% in 2015) has focused on agricultural areas. Second, considering the PUA study cases, the applied methodology has reclassified close to 10% and 11% of the total studied area in 2005 and 2015, respectively. These changes have been included in the new harmonized PUA Huertas dataset.
With respect to artificial and other land use coverages, 'unproductive land' (code IM) and 'roadways' (code CA) represent the categories with the most inaccuracies detected in land use classification. As Table 3 [...] polygons made them negligible to be identified by the SIOSE dataset. It is worth noting that PUA Huertas harmonized 4,577.6 ha (1,578.5 ha in 2005 and 2,999.1 ha in 2015) of 'urban areas' in SIGPAC (code ZU), reaching 15.0% of the total reclassification, which were initially considered as crops by the SIOSE dataset. However, findings reveal that they constitute unbuilt soils in urbanized areas that also lack any agricultural use.
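The shares quoted above follow directly from the reclassified areas. The short check below is a sketch that recomputes them; the dictionary layout and function names are illustrative, and the total study area is an assumption taken as the sum of the three Huerta extents given in the case-study description (38,061.6 + 23,129.1 + 83,491.9 ha).

```python
# Reclassified areas in hectares, as reported in the text.
reclassified_ha = {
    2005: {"artificial_and_other": 11772.20, "agricultural": 2304.30},
    2015: {"artificial_and_other": 14167.70, "agricultural": 2316.20},
}

# Assumption: the studied area is the sum of the three Huerta extents.
TOTAL_STUDY_AREA_HA = 38061.6 + 23129.1 + 83491.9


def class_shares(by_class):
    """Percentage of the reclassified area that falls in each class."""
    total = sum(by_class.values())
    return {name: round(100 * area / total, 2) for name, area in by_class.items()}


def share_of_study_area(by_class):
    """Percentage of the whole study area that was reclassified."""
    return 100 * sum(by_class.values()) / TOTAL_STUDY_AREA_HA


for year, by_class in reclassified_ha.items():
    print(year, class_shares(by_class), round(share_of_study_area(by_class), 1))
# 2005 {'artificial_and_other': 83.63, 'agricultural': 16.37} 9.7
# 2015 {'artificial_and_other': 85.95, 'agricultural': 14.05} 11.4
```

Under that assumption, the reclassified fraction of the study area comes out at roughly 9.7% and 11.4%, consistent with the "close to 10% and 11%" stated in the text.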
Regarding the agricultural land coverage, the study has detected a large extension of plots with isolated buildings located within them, frequently considered in the SIOSE dataset as artificial coverage without agricultural use. However, the real characterization of these plots corresponds to 'fruit trees' (code FY), 'citrus orchards' (code CI), and 'Huerta' (code TH). These results represent an important key point for performing an accurate landscape characterization.
Additionally, the method introduces greater precision in the measurement of land-use transformations in the PUA Huertas territory in the 2005-2015 decade. Results reveal that between 2005 and 2015, the agricultural land-use area was reduced by 2.6%, to 75,244.3 ha. This amended extension represents 52.0% of the total PUA Huertas. Meanwhile, the artificial surface increased by 3.4% during the same decade, reaching 34.9% (50,422.70 ha) of the total area (Table 4 and Figure 6). If the land-cover dataset harmonization proposed in this article had not been considered, the artificial surface would have decreased by 6.8%, according to the SIOSE dataset, or increased by less than 1%, according to the CLC dataset (Table 2). These values differ from the real agricultural land transformation patterns that can be recognized in field work, which underlines the relevance of the proposed GIS-based method and the importance of the results achieved.
Between 2005 and 2015, Huerta de Zaragoza suffered the greatest loss of agricultural area (6.5%), while the Huerta de Murcia-Alicante and L'Horta de Valencia case studies show losses in the range of 1% to 2% of agricultural coverage. However, the increase of artificial land has a similar growth ratio in the three case studies, ranging from 3.2% to 4.0%. Exploring the subcategories that make up the agricultural coverage reveals important insights. 'Tree crops' other than 'citrus trees' (code LFN in the PUA Huertas categories, based on SIOSE nomenclature), rice crops, and 'olive groves' (code LOL) have increased in cultivated area, in contrast with the decrease of 'citrus trees' (code LFC) and other arable crops. Land cover changes over the 2005-2015 decade are represented as a whole in Figure 6. Changes in specific categories within the agricultural land cover are also shown, with increments and decrements represented by solid and dashed lines, respectively.

Discussion
In developing this method, several specific issues related to the original LULC datasets and the harmonization process needed to be addressed. Recent literature accounts for some of these issues from a general perspective (Nedd et al., 2021; Yang et al., 2017). However, from a more specific perspective, considering the specificities of the PUA Huerta landscape, this study covers the following three issues.
The first one is related to the inaccuracies of LULC classifications in areas where artificial and agricultural uses coexist. In this PUA Huerta landscape, the problem is aggravated by the highly mixed and fragmented plot system. The proposed method enables a better exploration of the dynamics of change than the one provided by the SIOSE or SIGPAC datasets when they are evaluated individually (Bonato et al., 2019; Borgogno-Mondino et al., 2015).
The second one focuses on the semantic inconsistency of the original datasets (Baudoux et al., 2021). This semantic inconsistency is revealed in each of the datasets. For instance, a similar mix of agricultural plots and sprawled dwellings is categorized differently in the SIOSE dataset of Huerta de Murcia-Alicante depending on whether it falls in the Murcia or the Alicante area (different regional administrative boundaries). Although it is a continuous Huerta, the dataset has been developed by different teams, and semantic inconsistencies arise (Ros Sempere & García Martín, 2016). This fact provides enough evidence to include a visual interpretation as a complementary stage of the harmonization process. The detection of these inconsistencies in categorizations reinforces the need to perform dataset harmonization before developing studies comprising different regional or international geographical areas. Indeed, the current project to update SIOSE is considering the harmonization of different geographic datasets to improve the accuracy and reduce the maintenance costs of LULC datasets (Delgado Hernández et al., 2017).
The third and last issue that this study has addressed is related to the spatial transformations of the original datasets. As demonstrated, given the 'micro' size of land use changes in PUA Huertas, the accuracy of available datasets (SIOSE or CLC) is not sufficient for approaching the spatial complexity of these areas. Consequently, other resources, such as the Geographic Information System of Agricultural Parcels, are needed to evaluate the spatial transformation of these PUA Huertas, as they increase the spatial accuracy of available datasets.
Additionally, the applied methodology has made it possible to complement the data, adding nuances at multiscale levels. For example, in relation to agricultural uses, the SIGPAC database provides specific and updated information about agricultural lands under automated irrigation, manual irrigation, and rainfed cultivation. These details are essential for studying and assessing water resource needs and management, determining the potential of cultivated agricultural land that is adjacent to urban land uses, or detecting rural roads that might strategically provide accessibility at the local scale, connecting rural with urban tissues.
Considering the abovementioned issues, the proposed method contributes to a better understanding of the current spatial organization of these peri-urban areas and, moreover, supports policy and decision-making in territorial planning and multiscale strategies, such as local (integration of spaces), urban (strategic network connectivity, service distribution), or metropolitan (ecological connectors, identification of strategic opportunities).

Conclusions
Compared to the CLC broadly used in previous works, this study has confirmed that the harmonization of these two databases, SIOSE and SIGPAC, provides a more refined LULC identification. This is even clearer in highly fragmented areas such as the peri-urban fringes where agricultural plots and urban tissues meet, and the harmonization of databases has facilitated a better comprehension of their spatial complexity. The method has also allowed us to identify the oversimplification of the mixture of artificial and agricultural uses in these PUA Huertas. Accordingly, this information could serve as a guideline for reviewing the classification criteria of SIOSE and of other land cover datasets.
This methodology is an effective exploration system to support decision-making processes in designing landscape and defining environmental policies. In fact, the potential of this methodological proposal lies in its use as an urban-planning tool to better approach the peri-urban spatial context. Moreover, it can be easily applied at the European level, considering that the SIOSE and SIGPAC datasets have equivalents at the national level in almost all European countries, such as the German DeCOVER, the Austrian LISA, or the Portuguese COS 2007. This fact also implies the possibility of establishing cross-country analyses, thus providing a European framework for assessing planning challenges at the peri-urban fringe, which is one of the next lines of research.
As previously explained, the relevance of studying PUA Huertas has been confirmed since they are environmentally strategic spaces located in metropolitan areas under high urban pressure, similar to other Italian and Greek Huerta strongholds identified by the Dobris Report. Moreover, any agricultural space in a peri-urban location could be studied by following the proposed method, mainly when seeking to intertwine urban expectations and environmental aspirations with accuracy and to consider nuances otherwise difficult to detect in these complex peri-urban areas.
These PUA areas emerge as multifunctional places in which the existing conflicts in the interface between the urban fringe and the agriculture plots can be transformed into an opportunity, for example, the development of an urban green infrastructure in connection with peri-urban environments, as recommended by the European Common Environmental policy. Therefore, the development of a spatial analysis tool based on existing databases offers an opportunity for a more effective spatial planning in the peri-urban fringe.