Ad-hoc combination and analysis of heterogeneous and distributed spatial data for environmental monitoring – design and prototype of a web-based solution

ABSTRACT With regard to a multi-dimensional and multi-faceted implementation of the vision of a Digital Earth, capabilities to combine and analyze heterogeneous spatial data sources on the web are becoming increasingly important. In this article, an online system is conceptualized and implemented to facilitate spatial data analysis and decision making specifically for environmental applications. It supports a dynamic search and binding of suitable geoprocessing functionality with respect to the given input data and target description. Geoprocessing patterns are used to create an application-oriented abstraction layer on top of generic geoprocessing services available on the web. The application scenario is the determination and quality assessment of water body structures. For this use case, authoritative data, remote sensing imagery and citizen science data are combined to gain a comprehensive picture of the various spatial, temporal and thematic aspects influencing the quality of inland waters. The prototypical implementation makes use of open standards to facilitate the integration with existing spatial data infrastructures.


Introduction
The multitude of spatial data sources that are increasingly openly available on the web plays a significant role in shaping the Digital Earth. Besides national Spatial Data Infrastructures (SDI) providing mostly administrative data, this diversity encompasses the various data offerings from remote sensing and especially includes voluntarily collected spatial data, commonly denoted as Volunteered Geographic Information (Goodchild 2007). On the downside, the plethora of distributed, mostly disconnected data sources still makes spatial data analysis and decision making a tedious task. Thus, the fusion of heterogeneous spatial data sources in a dynamic and responsive manner is required across different scales, geographic borders, organizations and domains. The vision of a Digital Earth aims to overcome data-centric infrastructures, such as those driven by GEOSS or the various governmental SDI initiatives, toward an open framework that effectively supports collaboration and participation. It especially highlights the need for dynamic and quality-aware processes adapting to actual user and application needs (Bernard et al. 2014).
The demand for online geoprocessing has already been identified by many GIS software vendors, offering comprehensive functionality via specialized server products, such as ArcGIS Server or FME Server. This usually comes at the cost of vendor lock-in with limited interoperability and access. However, as argued by Goodchild et al. (2012), ' … access to scientifically grounded information about the planet's future is a basic human right; … it should be available to all, independent of national policies and strategies, and in a form that is readily understood and absorbed'. The establishment and maintenance of open and interoperable geoprocessing functionality on the web is thus considered an important means to transpose the wealth of spatial data available on the web into useful spatial information.
A promising example for the future application of online geoprocessing is the support for ad hoc reporting on the status in reaching the United Nations Sustainable Development Goals (SDG) using a set of well-defined indicators (Economic and Social Council 2015). Taking water-related SDG indicators as an example, the application developed in the course of this article relates to the determination of water body structures in support of the sustainable management of inland water resources, in particular the 'Percentage of bodies of water with good ambient water quality' (Economic and Social Council 2015, indicator 6.3.2). Information on the quality of water bodies is of great value to both administration and citizens, especially in areas prone to flooding or environmental hazards. The overall goal is to achieve free and instantaneous access to information on water quality in a comprehensive, interoperable and flexible manner, preferably on a global scale. This goes hand in hand with a transition from supply-oriented to demand-oriented information infrastructures. This also means shifting from traditional, partly long-term reporting cycles, resulting in static monitoring reports and maps, to providing means for an ad hoc combination and integration of distributed, most up-to-date data sources, allowing for a transparent and ideally reproducible analysis of the current environmental status and changes. Possible data sources include both field and remote sensing data provided by governmental agencies, research institutions or citizen science. However, their availability, accessibility and quality vary widely by geographical location. Therefore, indicators on water quality should best be deducible from any data source, but are expected to be improvable by the combination of multiple sources.
The web-based solution for the ad hoc analysis and combination of distributed spatial data sources described in this article is intended to assist decision makers in gaining a comprehensive view on the various spatial, temporal and thematic aspects of a real-world phenomenon. It is prototypically implemented for the determination, enrichment and update of indices linked to the ecological status of rivers. While Section 2 briefly reviews the state-of-the-art for both existing river monitoring strategies and geoprocessing on the web, Section 3 focuses on the core objectives of this article: (a) a process suggestion system for the analysis and combination of heterogeneous spatial data sources on the web, (b) the enrichment of geoprocessing capabilities in current SDIs by the application of geoprocessing patterns and (c) the specification and formal description of data and process constraints. In Section 4, the prototypical implementation of the system is described and evaluated. The conclusion in Section 5 summarizes the contributions and poses further research challenges toward achieving the vision of Digital Earth.

State-of-the-art
The determination and improvement of the ecological quality of water bodies is often associated with common conservation objectives (Pedroli et al. 2002; de Nooij et al. 2004) and the risk and potential impact of flood events (Disse and Engel 2001; Hooijer et al. 2004; Reinhardt et al. 2011; Thieken et al. 2016). Regular monitoring and reporting cycles for the ecological water quality have been implemented around the globe. Prominent legal foundations include the European Water Framework Directive 1 (WFD) and the Clean Water Act 2 in the United States. The final ecological rating of a water body is usually derived from a number of biological, hydromorphological and physicochemical measurements. However, as shown in Raven et al. (2002), Šípek, Matoušková, and Dvořák (2010) and Karrasch et al. (2015) for the hydromorphological assessment of rivers, results may significantly vary with the underlying classification strategy and scale.
In addition to field observations, remote sensing has a long tradition and will continue to play a significant role in the regular monitoring of water bodies (Palmer, Kutser, and Hunter 2015). This is particularly true for areas with less advanced monitoring programs and structures and accordingly low administrative data coverage. For the determination of hydromorphological parameters, various sensor types have already been applied, including optical sensors (Karrasch et al. 2015), radar (Klemenjak et al. 2012) and airborne laser scanning (Demarchi, Bizzi, and Piégay 2016). In addition, remote sensing can be used to monitor certain chemical and biological characteristics (Palmer, Kutser, and Hunter 2015). Hence, it is no surprise that the combination of in situ and remote sensing data is considered very important (Schaeffer et al. 2013).
In addition and complementary to existing field and remote sensing data on water quality, citizen participation and low-cost sensors are gaining increasing attention as a means to complement regular monitoring in the spatial and temporal dimension (Dworak et al. 2005; Lowry and Fienen 2013; Buytaert et al. 2014). This is mainly due to the fact that local characteristics of water bodies are, on the one hand, very important for implementing reasonable strategies on biodiversity conservation and flood risk reduction (Blöschl, Viglione, and Montanari 2013), but, on the other hand, hard or costly to capture and maintain in the traditional manner.
For the analysis and combination of the above-mentioned data sources on the web in an ad hoc fashion, a service-based solution seems to be most promising. Standardized geoprocessing on the web has already been promoted by a series of initiatives and projects. First application scenarios using the OGC WPS standards were developed by Friis-Christensen et al. (2007), Kiehle, Greve, and Heier (2007) and Lanig et al. (2008). A corresponding research agenda was set up by Brauner et al. (2009). Since then, various aspects of service-based geoprocessing have been further developed by researchers, including the wrapping of legacy software (Müller, Bernard, and Brauner 2010; Hinz et al. 2013), service orchestration (Müller, Bernard, and Brauner 2010; Bensmann et al. 2014; Meek, Jackson, and Leibovici 2016) and semantic service descriptions (Fitzner, Hoffmann, and Klien 2011; Müller 2015; Hofer et al. 2017a). However, despite those efforts, standardized and open geoprocessing on the web is still considered to be in its early stages, with a particular need for institutional and community engagement (Hofer 2015). Moreover, mediating services, comprising pre-processing, connective operations and post-processing, are required for workflow generation, in particular for handling heterogeneous data sources (Hofer et al. 2017a).
Besides the generic geoprocessing functionality offered by established WPS frameworks, for example, buffer computation, line simplification or basic raster operations, the majority of online platforms for geoprocessing are theme-specific (Hofer 2015). This underlines the demand for pragmatic developments, offering solutions to actual real-world questions and problems. This requires means to abstract from technical service descriptions and provide well-defined, reusable and application-oriented functional descriptions instead. One approach to achieve this is the use of geoprocessing patterns, which are descriptions of common processes within a certain domain with only a loose coupling to the underlying software or service infrastructure (Brauner 2015; Wiemann 2017).

An information system for the analysis and comparison of spatial data
Today's governmental SDI, as well as volunteered, commercial or scientific initiatives, already offer a vast amount of spatial data on the web. In addition, numerous software tools provide processing functionality related to the determination of water quality. However, the composition of geoprocessing workflows for the analysis or combination of spatial data usually requires specialized software and expert knowledge. Therefore, the main target of the information system proposed here is to assist users in finding and executing appropriate workflows that fulfill their particular need for information.

Application scenario: assessing the quality of water bodies
With respect to the determination and assessment of the quality of water bodies, three potential stakeholders are identified: the responsible governmental authority, the decision maker and the regular citizen (Figure 1), each with different requirements and objectives. The authority is primarily bound to legislation, for example the national implementation of the EU Water Framework Directive in the EU member states, and is required to satisfy the imposed reporting obligations. The decision maker, for instance an environmental planner, a risk manager or an assurance officer, uses existing data to extract most up-to-date information for decision making, for example, in the case of flood events, environmental damage or to calculate insurance rates. Finally, citizens are often interested in gathering information on the water quality in their area, for example, for personal interest or educational purposes (Higgins et al. 2016).
There are three input data sources from which information on water body quality can be derived: data provided by governmental authorities, remote sensing data and data obtained from citizen science projects. Administrative data mainly refer to data from land surveying and environmental authorities as well as data recorded in compliance with legal reporting activities. The recording and provision of those datasets are required by law, and the datasets are usually quality-checked, consolidated and maintained over the long term. However, they also come with long update cycles. Remote sensing data refer to both airborne and satellite imagery and are provided by either the relevant competent authorities or private aviation and aerospace companies. Depending on mission, flight path and measurement configuration, remote sensing image data come at various spatial, radiometric, spectral and temporal resolutions. The third input, data from citizen science, is much more versatile in terms of content, structure, quality and resolution than the previous two. However, this versatility makes it a promising complementary source of information, especially for local characteristics of the environment.
Besides the existence of spatial data, the actual access to raw data is required to allow for spatial data processing. This requires well-defined interfaces and data specifications as well as the willingness to actually provide data on the web. For the public benefit, this should preferably be achieved by the facilitation of open data initiatives, but may as well include proper systems for authentication, authorization and accounting for commercial data sources.

A suggestion system for the analysis and combination of heterogeneous spatial data sources
Appropriate methods for the analysis of water quality based on single feature objects (e.g. rivers, dams, flood plains, built-up areas, catchments) essentially depend on their spatial representation. For feature objects with either point, curve or surface geometry, valuable information can be read from the attributes attached to them. This comprises previously determined characteristics of an object, such as conducted measurements on the hydromorphological or the bio-chemical quality. Additional information can be extracted directly from the feature geometry. Whereas this information is limited to a single coordinate for points, a curve representation of a river centerline can be used to derive important information on the hydromorphological quality of a river, most notably the river sinuosity (Karrasch et al. 2015). The longitudinal profile of a river can be derived if three-dimensional coordinates are provided. In the case of a surface representation of a river, it is possible to measure the river width and its variation (Karrasch and Hunger 2016). Moreover, the river centerline and the structure of the riverbank line can be extracted. Considering a continuous representation of a river, such as extracted from remote sensing imagery, radiometric information can be used to estimate certain bio-chemical characteristics, such as phytoplankton and water clarity (Olmanson, Brezonik, and Bauer 2013). In addition to feature objects directly representing the water body, features in its direct proximity provide information about environmental conditions that potentially affect the water quality. For example, land use or land cover information gives indications on the quality and quantity of pollutant entries, for example from agriculture or mining industries (Smith and Owens 2014).
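To illustrate how a hydromorphological parameter can be derived from a curve geometry alone, the following minimal sketch computes river sinuosity as the ratio of centerline length to the straight-line distance between its endpoints. This is the commonly used definition and is an assumption here; it is not necessarily the exact procedure applied by Karrasch et al. (2015). The function names and the sample coordinates are hypothetical, and planar coordinates in a projected reference system are assumed.

```python
import math


def polyline_length(coords):
    """Sum of Euclidean segment lengths along a 2D polyline."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def sinuosity(coords):
    """Centerline length divided by the straight-line distance between its endpoints."""
    if len(coords) < 2:
        raise ValueError("a centerline needs at least two vertices")
    straight = math.dist(coords[0], coords[-1])
    if straight == 0:
        raise ValueError("start and end point coincide; sinuosity is undefined")
    return polyline_length(coords) / straight


# Hypothetical centerline in a projected CRS (units: metres)
centerline = [(0, 0), (3, 4), (6, 0)]
print(round(sinuosity(centerline), 3))
```

A perfectly straight channel yields a value of 1; larger values indicate a more meandering course.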
If multiple input features are selected, either within or across spatial datasets, there are two options for further processing. First, the analysis processes described above can be applied either sequentially or in parallel to the selected features; second, the combination of selected features can exploit relationships between them. This is exemplarily shown in Table 1, with the identification of suitable processes again relying on the spatial representation of input features.
Besides the spatial representation, the semantics of the input data ultimately determines the meaningfulness of a combined analysis. This is demonstrated by Stasch et al. (2014) for the applicability of geostatistical analysis processes depending on the inherent characteristics of a represented phenomenon. In Scheider and Tomko (2016), this is solved by comprehensive semantic annotations that are used to reason on the applicability of certain operations. Accordingly, semantic information plays a decisive role, for example, for identifying the kind of phenomenon (e.g. river, pipeline, center or flow line), the underlying representational model (invariant/object or continuous/field), the applied attribute value scales (nominal, ordinal, interval or ratio) or the kind of measurement (discrete or integral). However, explicit formalized semantics are rarely found for existing spatial datasets; consolidated description formalisms and encoding mechanisms for these semantic aspects are still lacking. The same applies to the description and handling of spatial and thematic accuracies, especially on the feature level. This leaves a certain degree of responsibility for the selection of appropriate data and processes as well as for the interpretation of the results to the user.
For the combination of spatial data on water body structures, there are a number of frequently used processes relevant to the quantification and assessment of quality characteristics. Distance measurements are quite common, for example, between a river and a source of environmental pollution in order to estimate its potential impact on the water quality. Another is the determination of the topological relationship between features, which can, for example, be used to select objects affected by a certain water level, or rather its surface representation, in a flood scenario. A combination of features with raster data from remote sensing can be applied to determine certain characteristics of the water body based on their particular spectral signature. The combination with digital elevation models is of particular interest to derive longitudinal profiles, catchment areas or objects prone to flooding. If the input data represent the same phenomenon at different timestamps, a change detection can be conducted for multi-temporal analysis, for example, of the area covered by water or the surrounding land cover types. This may as well be done for more than two inputs, which can provide valuable insights into the spatiotemporal dynamics and variability of water body structures (Karrasch et al. 2015).
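As a simple illustration of such a multi-temporal comparison, the sketch below cross-tabulates two binary water masks of the same grid, which yields a small change matrix of stable and changed cells. The masks, their encoding (1 = water, 0 = land) and the function name are assumptions made for this example; a real workflow would first have to classify and co-register the underlying imagery.

```python
def change_matrix(before, after):
    """Cross-tabulate two equally sized binary water masks (1 = water, 0 = land)."""
    if len(before) != len(after) or any(
        len(r0) != len(r1) for r0, r1 in zip(before, after)
    ):
        raise ValueError("masks must share the same grid dimensions")
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for row0, row1 in zip(before, after):
        for v0, v1 in zip(row0, row1):
            counts[(v0, v1)] += 1  # key: (state at t0, state at t1)
    return counts


# Hypothetical masks for two timestamps
mask_t0 = [[0, 1, 1], [0, 1, 0]]
mask_t1 = [[0, 1, 0], [1, 1, 0]]
changes = change_matrix(mask_t0, mask_t1)
print(changes)
```

The entries (0, 1) and (1, 0) count cells that gained or lost water coverage between the two timestamps.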
In comparison to administrative and remote sensing data sources, the majority of citizen science data is characterized as irregularly distributed point-based observations addressing specific aspects of a water body and its surrounding area. This may include width and water level measurements, the sighting of a certain species, a classification of surrounding land use or potential floating trash.
Depending on its quality and information content, citizen science data can be used to complement and validate existing data sources. As an example, Comber et al. (2013) report on the successful application of citizen science in a comprehensive ground truthing campaign for land cover data.
In summary, capabilities for both the analysis and the combination of data are required in order to achieve a comprehensive view on a targeted feature, that is, a water body, including its current, past and potential future state. Whereas the analysis primarily addresses the processing of inherent characteristics of a feature, such as its geometry and thematic properties, the comparison exploits existing relationships to other features, either within or across datasets. Thus, the overall workflow to be implemented consists of four steps (Figure 2):
(1) The selection of input features (≥1) by the user.
(2) The identification of suitable processes for further processing.
(3) The selection and parametrization of a desired process.
(4) The execution of that process.
These steps can be iterated with different input data and processes until the user decides that his or her objectives are fulfilled. Hereinafter, the focus is on the second and the third step of the sequence and primarily addresses guidance mechanisms to identify and correctly apply suitable processing functionality to support the generation of meaningful results. Accordingly, the most important constraint for the workflow creation is the selected input data (Figure 2).
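The second step of this sequence, the identification of suitable processes for a given selection, can be sketched as a simple filter over a process registry. The registry entries, their names and their constraints below are purely hypothetical and only consider the number and spatial representation of inputs; the system described in this article uses richer descriptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessInfo:
    name: str
    min_inputs: int
    max_inputs: int
    geometry_types: frozenset  # accepted spatial representations


# Hypothetical registry; names and constraints are illustrative only
REGISTRY = [
    ProcessInfo("sinuosity", 1, 1, frozenset({"curve"})),
    ProcessInfo("width_variation", 1, 1, frozenset({"surface"})),
    ProcessInfo("distance", 2, 2, frozenset({"point", "curve", "surface"})),
]


def suggest(selected_geometry_types, registry=REGISTRY):
    """Keep processes whose input count and geometry types fit the selection."""
    n = len(selected_geometry_types)
    return [
        p.name
        for p in registry
        if p.min_inputs <= n <= p.max_inputs
        and all(g in p.geometry_types for g in selected_geometry_types)
    ]


print(suggest(["curve"]))
print(suggest(["curve", "point"]))
```

The selected process would then be parametrized and executed (steps 3 and 4), and its result can be fed back into step 1.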

Data and operation constraints
Upon the selection of input data by the user, suitable operations for further processing should be identified in order to assist the user. This suggestion system can utilize data and operation constraints, which are specified by an operation description. In order to successfully perform an operation, all relevant constraints must be fulfilled. Data constraints can be evaluated based on the metadata of inputs during workflow composition or based on the actual feature objects at runtime. Examples for the first are constraints on the data domain, the data and geometry type, the spatial reference system or the spatiotemporal extent. The latter includes required computational resources, processing timeouts or the structural consistency of input data. However, whereas metadata constraints can be evaluated upon workflow creation and accordingly be addressed by user interaction, violations of a runtime constraint are hard to mitigate and ultimately result in the termination of the workflow. On the other hand, operation constraints define input states or processing environments that are required by certain operations. Those can also be evaluated either during workflow composition, for example, based on a set of input metadata elements, or during workflow execution, for example, based on specified computational resources or processing timeouts. For the first type of constraints, that is, constraints that can be evaluated during workflow composition, a rule-based formalism and common examples are described by Hofer et al. (2017a), but are not yet implemented.
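The distinction between constraints evaluated during workflow composition and those evaluated at runtime can be illustrated with the following sketch. It is not the rule-based formalism of Hofer et al. (2017a); the metadata keys, the feature limit and all names are assumptions chosen for illustration. Metadata violations are collected so that they can be reported to the user, whereas a runtime violation raises an exception and thereby terminates the workflow.

```python
class ConstraintViolation(Exception):
    """Raised when a runtime constraint cannot be satisfied."""


def check_metadata(metadata, required_crs, allowed_geometry):
    """Composition-time check: return a list of human-readable problems."""
    problems = []
    if metadata.get("crs") != required_crs:
        problems.append(f"CRS {metadata.get('crs')} does not match {required_crs}")
    if metadata.get("geometry") not in allowed_geometry:
        problems.append(f"geometry type {metadata.get('geometry')} is not supported")
    return problems


def check_runtime(features, max_features):
    """Runtime check: a violation terminates the workflow."""
    if len(features) > max_features:
        raise ConstraintViolation(
            f"{len(features)} features exceed the limit of {max_features}"
        )


# Hypothetical input metadata as it might be read from a service description
meta = {"crs": "EPSG:4326", "geometry": "curve"}
print(check_metadata(meta, required_crs="EPSG:25833", allowed_geometry={"curve"}))
```

Returning a list rather than failing on the first problem allows a client to present all unmet constraints at once and to offer remedies, for instance a coordinate transformation.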
For service-based spatial data processing using the WPS standard, a number of data constraints can already be evaluated based on the mandatory process description document, including the number, cardinality and size of inputs as well as the range of supported data formats. However, the default description addresses neither relationships between inputs nor semantic restrictions. Therefore, we propose to express additional constraints within the generic profile of a process, which can be referenced by the process description (Müller 2015; OGC 2015). Whereas a generic WPS client would simply ignore that profile, a profile-aware client is able to parse, interpret and evaluate data and operation constraints formulated therein. A corresponding example, specifying the need for a shared spatial extent of inputs in a feature matching process, is shown in Listing 1. It specifies a condition (overlap) that must be evaluated and a subject (bounding box), for which the condition must hold true to satisfy the constraint. To facilitate usability and interoperability, each definition and term links to a corresponding Semantic Web description. As shown by Hofer, Papadakis, and Mäs (2017b), such descriptions can later be exploited for the interactive search and suggestion of required operations during workflow creation.
Listing 1. Generalized process description example with constraint on a shared spatial extent.
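Independently of how the constraint is encoded in the process profile, its evaluation by a profile-aware client reduces to a bounding box intersection test. The following sketch shows one possible evaluation; the tuple layout, the function names and the decision to treat touching edges as overlapping are assumptions of this example and are not prescribed by the profile.

```python
from itertools import combinations


def bboxes_overlap(a, b):
    """a and b are (min_x, min_y, max_x, max_y) in the same CRS.

    Touching edges count as overlap in this sketch.
    """
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def shared_extent_satisfied(bboxes):
    """True if every pair of input bounding boxes overlaps."""
    return all(bboxes_overlap(a, b) for a, b in combinations(bboxes, 2))


# Hypothetical extents of two inputs to a feature matching process
reference = (0.0, 0.0, 10.0, 10.0)
target = (5.0, 5.0, 15.0, 15.0)
print(shared_extent_satisfied([reference, target]))
```

Both inputs must be expressed in the same spatial reference system before the test is meaningful, which is itself a metadata constraint that can be checked beforehand.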

Application patterns for geoprocessing on the web
In addition to geoprocessing services offering generic or specific functionality on the web, geoprocessing patterns can be used to describe more abstract application-oriented workflows, that is, a meaningful orchestration of services (Brauner 2015; Wiemann 2017). If applied on the web, they combine a human-readable description of a domain-specific problem to be solved with a workflow composed of links to an underlying service infrastructure (Figure 3). In the first instance, those links point at process descriptions that need to be resolved into processing services upon execution of the pattern by a service orchestration engine. There are two ways to do this: first, a process description can directly link to an implementing service instance; second, a well-defined description of both interface and functionality, such as proposed by Fitzner, Hoffmann, and Klien (2011) and Müller (2015), can be used to find and bind suitable implementing services within a processing service registry. However, the reproducibility of results, for both single services and patterns, must be assured by either formal specification (Fitzner, Hoffmann, and Klien 2011) or convention (Müller 2015). The creation of processing patterns requires extensive domain knowledge and is thus a task for domain experts. Following the descriptions of Brauner (2015) and Wiemann (2017), typical characteristics of a geoprocessing pattern on top of a service infrastructure can be summarized as follows:
- The behavior is similar to high-level services, with the exception that it is not an instance, but a description of a processing workflow; accordingly, it does not depend on actual processing capabilities of its creator or provider.
- For the creation, registration, search and invocation of a geoprocessing pattern, a standardized set of metadata elements is required to describe its functional principle as well as constraints on the data inputs and outputs.
- Each pattern may include other patterns in order to solve well-defined domain-specific subtasks; generic patterns are not excluded per se, but are not expected to be of use to domain users.
- It can be exchanged across the network and executed by a workflow engine at an appropriate position, for example, close to the input data (moving code) or close to the referenced processing services (moving data).
- For each task, a range of applicable patterns may exist, each with different advantages and disadvantages; a user rating of a pattern should be based on its performance and its appropriateness with respect to previous applications of that pattern.
- It must be subject to regular functional evaluations, with particular focus on the availability and accessibility of underlying service instances.
Since the external view on a pattern is similar to that on a geoprocessing service, it is reasonable to assume that a similar description, that is, the same set of metadata elements and profiles described in the previous section, can be applied. However, additional elements are required to describe workflow components and interactions. To allow for a standardized description, encoding, exchange and interpretation of the workflow encapsulated by a geoprocessing pattern, it is appropriate to apply a workflow description language for this purpose. At this point, the use of the Business Process Model and Notation (BPMN) standard is suggested, because it specifies both a human-readable graphical representation and a machine-readable XML encoding. However, executing such a workflow requires a customized BPMN engine that supports the invocation of OGC web services (Meek, Jackson, and Leibovici 2016; Wiemann 2017).
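To give an impression of the machine-readable side of BPMN, the following sketch parses a small BPMN 2.0 document with the Python standard library and lists its service tasks in execution order by following the sequence flows. The workflow content, task names and identifiers are hypothetical, and the sketch assumes a strictly linear workflow without gateways; binding the tasks to WPS instances, which requires the customized engine mentioned above, is not covered.

```python
import xml.etree.ElementTree as ET

BPMN_NS = {"bpmn": "http://www.omg.org/spec/BPMN/20100524/MODEL"}

# Hypothetical pattern workflow; task names and ids are illustrative only
BPMN_XML = """<definitions xmlns="http://www.omg.org/spec/BPMN/20100524/MODEL"
    id="defs" targetNamespace="http://example.org/patterns">
  <process id="river_surface" isExecutable="true">
    <startEvent id="start"/>
    <serviceTask id="t1" name="classify_water"/>
    <serviceTask id="t2" name="vectorize_surface"/>
    <endEvent id="end"/>
    <sequenceFlow id="f1" sourceRef="start" targetRef="t1"/>
    <sequenceFlow id="f2" sourceRef="t1" targetRef="t2"/>
    <sequenceFlow id="f3" sourceRef="t2" targetRef="end"/>
  </process>
</definitions>"""


def service_task_order(xml_text):
    """Follow sequence flows from the start event and list service task names."""
    process = ET.fromstring(xml_text).find("bpmn:process", BPMN_NS)
    names = {
        t.get("id"): t.get("name")
        for t in process.findall("bpmn:serviceTask", BPMN_NS)
    }
    successor = {
        f.get("sourceRef"): f.get("targetRef")
        for f in process.findall("bpmn:sequenceFlow", BPMN_NS)
    }
    current = process.find("bpmn:startEvent", BPMN_NS).get("id")
    order, seen = [], {current}
    while current in successor:
        current = successor[current]
        if current in seen:  # guard against cyclic flows
            break
        seen.add(current)
        if current in names:
            order.append(names[current])
    return order


print(service_task_order(BPMN_XML))
```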
To illustrate the structure and application of geoprocessing patterns, Figure 4 shows an example related to the determination of water body structures. It shows three different patterns for the same objective, which is the estimation of a river surface representation from either remote sensing imagery, the river centerline or both. To achieve this, a number of generic and domain-specific processing services are selected and composed into meaningful workflows. As can be seen from the figure, a data constraint on the spatial representation of input data decides on the applicability of a certain pattern. Moreover, the process constraint on a shared spatial extent described in Listing 1 can be applied to the second pattern, as it should take two spatially overlapping data inputs.

Implementation and application of the information system
As a proof-of-concept, the core of the information system has been implemented in support of the ad hoc analysis and combination of spatial data on the web. 3 It primarily uses OGC standards for client-service and service-service communication and, thus, aligns with current SDI developments. However, due to the current lack of available and accessible spatial data and, in particular, geoprocessing services on the web, a testing environment has been set up, comprising a number of useful services with respect to the application scenario. The Free State of Saxony in Germany was selected as a test area, due to a number of readily available spatial data services and datasets related to the quality of water bodies. The following datasets are registered in the client for visual analysis and potential further processing:
- Environmental data provided by the LfULG (Saxon State Office for Environment, Agriculture and Geology), comprising WFS data for major catchment areas, river geometries and areas prone to flooding as well as an existing official classification for water body structures dated 2007, which was downloaded and set up as a WFS using GeoServer. 4
- Data from land surveying, including a digital elevation model at a resolution of 5 m and a number of orthoimages, both obtained from the GeoSN (Saxon State Office for Land Surveying) and set up as a WCS using GeoServer.
- Preprocessed remote sensing data collected by the LANDSAT satellite system (cf. Karrasch et al. 2015) as well as CORINE Land Cover data provided by the Copernicus Programme, both set up as a WCS, again using GeoServer.
- Field observations for certain biological, hydromorphological and physicochemical parameters describing the quality of water bodies collected by school classes in the course of the citizen science project COBWEB (Citizen Observatory Web, Higgins et al. 2016).
- Data collected by volunteers contributing to the OpenStreetMap 5 project, obtained via the Overpass API, which provides read-only access to the database together with a custom query language; the Overpass API is a prominent example of open, well-established, but non-standardized interface specifications frequently applied on the web.
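As an illustration of the custom query language of the Overpass API, the following sketch builds an Overpass QL query for OpenStreetMap ways tagged as rivers within a bounding box and the corresponding request URL. The endpoint shown is one public Overpass instance, and the bounding box is only a rough approximation of the test area; both are assumptions of this example and do not reflect the configuration of the prototype.

```python
from urllib.parse import urlencode

# One public Overpass instance; other endpoints can be substituted
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def river_query(south, west, north, east, timeout=25):
    """Overpass QL for OSM ways tagged waterway=river inside a bounding box."""
    bbox = f"{south},{west},{north},{east}"  # Overpass order: south, west, north, east
    return f'[out:json][timeout:{timeout}];way["waterway"="river"]({bbox});out geom;'


def request_url(query):
    """Encode the query as the 'data' parameter of a GET request."""
    return f"{OVERPASS_URL}?{urlencode({'data': query})}"


# Rough, illustrative bounding box around Saxony (WGS 84 degrees)
query = river_query(50.17, 11.87, 51.69, 15.04)
print(request_url(query))
```

The request itself is intentionally not sent here; the resulting URL can be fetched with any HTTP client, and the JSON response contains the way geometries requested by `out geom`.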
In addition to the spatial data services set up to test and evaluate the information system, geoprocessing capabilities are required to facilitate the ad hoc analysis and combination of the provided spatial data. Thus, a number of geoprocessing services are registered:
- The demonstration instance of the 52n WPS framework 6 offers basic geoprocessing functionality to the system; the service provides a number of basic operations, for example, buffer and coordinate transformation, as well as processes from GRASS GIS and the SEXTANTE library, which are encapsulated by the WPS framework.
- Existing functionality for the matching and fusion of spatial data developed in Wiemann and Bernard (2016) is reused and set up as geoprocessing services using the 52n WPS framework.
- In order to address domain-specific functionality, a number of processes related to the quality of water bodies are developed and provided via the WPS interface, again using the 52n WPS framework; these comprise sinuosity measurements (cf. Karrasch et al. 2015) and the determination of the chemical status of a water body based on the measurement of relevant substances.
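A client that registers such services typically starts by requesting the process description document of each offered process. The sketch below builds a WPS 1.0.0 DescribeProcess request in key-value-pair encoding. The endpoint URL is hypothetical, and the process identifier is used for illustration only; neither is taken from the prototype.

```python
from urllib.parse import urlencode


def describe_process_url(endpoint, identifier, version="1.0.0"):
    """Build a WPS DescribeProcess request in key-value-pair (KVP) encoding."""
    params = {
        "service": "WPS",
        "version": version,
        "request": "DescribeProcess",
        "identifier": identifier,
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and illustrative process identifier
url = describe_process_url(
    "https://example.org/wps/WebProcessingService",
    "org.n52.wps.server.algorithm.SimpleBufferAlgorithm",
)
print(url)
```

The returned XML document lists the inputs and outputs of the process together with their cardinality and supported formats, which is the information needed for the constraint evaluation described in Section 3.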
In addition to the above-mentioned spatial data and geoprocessing services, there are three further components of the information system (Figure 5):
• The pattern registry is a website that provides the capabilities to upload, search and download geoprocessing patterns; each pattern is composed of a process description, which basically follows the WPS process description document, and a BPMN file with a formal description of the pattern workflow; single workflow components are directly linked to implementing WPS instances.
• An orchestration service is used to execute the geoprocessing patterns; it parses the BPMN workflow description, composes the required processing service instances, executes the workflow and finally returns the result; the service offers a WPS interface and is implemented using the 52n WPS framework.
• The client application is the main entry point for users interested in the determination of the quality of water bodies and supports the visualization and the retrieval of spatial data via OGC services; the client is connected to the pattern registry and the orchestration service and thus allows for the execution of previously designed geoprocessing patterns.
Figure 5. Main components of the implemented information system.
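How an orchestration service might derive an execution order from a BPMN workflow description can be sketched as follows. This is a simplified illustration, not the implemented service: the embedded pattern, its task names and identifiers are hypothetical, and the workflow is assumed to be strictly linear (no gateways or parallel branches).

```python
import xml.etree.ElementTree as ET

# Namespace of the BPMN 2.0 model elements.
BPMN_NS = {"bpmn": "http://www.omg.org/spec/BPMN/20100524/MODEL"}

# Hypothetical pattern; task names and identifiers are illustrative only.
PATTERN_XML = """
<definitions xmlns="http://www.omg.org/spec/BPMN/20100524/MODEL">
  <process id="pattern">
    <startEvent id="start"/>
    <serviceTask id="t1" name="buffer"/>
    <serviceTask id="t2" name="intersect"/>
    <endEvent id="end"/>
    <sequenceFlow id="f1" sourceRef="start" targetRef="t1"/>
    <sequenceFlow id="f2" sourceRef="t1" targetRef="t2"/>
    <sequenceFlow id="f3" sourceRef="t2" targetRef="end"/>
  </process>
</definitions>
"""


def task_order(xml_text):
    """Return the service task names in execution order (linear flows only)."""
    root = ET.fromstring(xml_text)
    process = root.find("bpmn:process", BPMN_NS)
    names = {
        task.get("id"): task.get("name")
        for task in process.findall("bpmn:serviceTask", BPMN_NS)
    }
    successor = {
        flow.get("sourceRef"): flow.get("targetRef")
        for flow in process.findall("bpmn:sequenceFlow", BPMN_NS)
    }
    current = process.find("bpmn:startEvent", BPMN_NS).get("id")
    order, visited = [], {current}
    # Follow the sequence flows from the start event until no successor is left.
    while current in successor:
        current = successor[current]
        if current in visited:
            raise ValueError("cyclic workflow is not supported by this sketch")
        visited.add(current)
        if current in names:
            order.append(names[current])
    return order


steps = task_order(PATTERN_XML)
```

In a full orchestration each returned task would be bound to its implementing WPS instance and invoked in turn, with the output of one step passed as input to the next.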
When accessing the web client application (Figure 6), a basemap is shown for orientation. Additional overlays can be selected by the user as input for geoprocessing. To this end, spatial data services are selected from either the initial list of data services registered in the client or a user-defined WFS endpoint. Moreover, additional WPS instances and processing patterns can be registered to broaden the range of available functionality. However, only those processes and patterns are shown that fulfill the constraints set by the selected input data, for example, concerning the number, format or spatial representation.
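The filtering of processes against the selected input data can be sketched as a simple constraint check. The data model below (`Dataset`, `InputConstraint`, `Process`) is hypothetical and covers only the number and spatial representation of inputs; the two example constraints follow the sinuosity and zonal statistics processes described in the application scenario.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    kind: str      # "vector" or "raster"
    geometry: str  # "point", "line", "polygon", or "" for rasters


@dataclass(frozen=True)
class InputConstraint:
    kind: str
    geometry: str
    min_count: int


@dataclass(frozen=True)
class Process:
    name: str
    constraints: tuple


def is_applicable(process, selection):
    """A process is applicable if every input constraint is met by the selection."""
    for constraint in process.constraints:
        matches = sum(
            1
            for dataset in selection
            if dataset.kind == constraint.kind
            and dataset.geometry == constraint.geometry
        )
        if matches < constraint.min_count:
            return False
    return True


def suggest(processes, selection):
    """Return the names of all processes offered for the current selection."""
    return [p.name for p in processes if is_applicable(p, selection)]


PROCESSES = [
    Process("sinuosity", (InputConstraint("vector", "line", 1),)),
    Process(
        "zonal statistics",
        (InputConstraint("vector", "polygon", 1), InputConstraint("raster", "", 1)),
    ),
]

rivers = Dataset("rivers", "vector", "line")
flood_plains = Dataset("flood plains", "vector", "polygon")
land_cover = Dataset("land cover", "raster", "")

offered_without_raster = suggest(PROCESSES, [rivers, flood_plains])
offered_with_raster = suggest(PROCESSES, [rivers, flood_plains, land_cover])
```

With only the two vector layers selected, the sketch offers the sinuosity process alone; adding the raster layer makes the zonal statistics process available as well.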
Upon selection of a process or pattern by the user, a pop-up window is used to parameterize the process or pattern, respectively. Whereas feature inputs are chosen from the active map overlays, literal inputs are listed separately and requested from the user. Subsequently, processes can be invoked and patterns sent to the connected orchestration engine. The result of a process is temporarily stored on the application server. It can be added as an overlay and accordingly be used for visualization and further processing. If the result is not feature data, for example, in the case of literal outputs or feature relations, it is provided as a download.

Application for the determination of water body structures
The goal of the developed information system is to support users in fulfilling their particular information needs with respect to the characterization of water bodies. The offered processes and patterns essentially depend on the number and type of datasets selected by the user. Thus, if all data and operation constraints of a process or pattern are fulfilled by the user's data selection, it is offered for execution. Hereinafter, an application is described for each of the stakeholder groups shown in Figure 1: the governmental authority, the decision maker and the regular citizen.
The first application targets governmental authorities with reporting obligations on the ecological quality of water bodies, for example, in the context of the EU WFD. The river sinuosity, basically an indicator of the river's curvature obtained by dividing the curve length by the base length, is chosen as an important hydromorphological parameter with significant influence on the potential biological status of a river. The implemented sinuosity process takes a single geometry, that is, the spatial representation of a river, as input and returns information on the sinuosity for each point on the line, considering various possible segment lengths (Karrasch et al. 2015). Thus, technically, the process is only constrained by the selection of at least one feature with linear geometry type. The service response contains a link to the result image, which can be downloaded and used for a detailed inspection and interpretation of the river sinuosity. The corresponding inputs and output of the application are depicted in Figure 7.
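The basic sinuosity ratio can be illustrated with a short sketch. It computes a single value for a whole polyline, taking the base length as the straight-line distance between the first and last vertex; this is a simplification and not the implemented process, which evaluates the sinuosity per point for various segment lengths (Karrasch et al. 2015). Planar coordinates in a projected reference system are assumed.

```python
import math


def sinuosity(points):
    """Return curve length divided by straight-line distance between the endpoints."""
    if len(points) < 2:
        raise ValueError("a line needs at least two vertices")
    # Curve length: sum of the lengths of all consecutive segments.
    curve_length = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    # Base length: direct distance from the first to the last vertex.
    base_length = math.dist(points[0], points[-1])
    if base_length == 0:
        raise ValueError("sinuosity is undefined for a closed line")
    return curve_length / base_length


straight = sinuosity([(0, 0), (1, 0), (2, 0)])
meander = sinuosity([(0, 0), (3, 4), (6, 0)])
```

A straight channel yields a value of 1, and the value grows as the course meanders; in the example the second line is 10 units long over a base length of 6 units.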
The second application is targeted at decision makers dealing with information on urban areas potentially affected by flood events. This could be used, for example, for the estimation of real estate values or insurance rates in areas prone to flooding. As input data, existing flood plains for each river are taken and combined with CORINE Land Cover data. For that purpose, the zonal statistics are computed for each flood plain and appended as attributes to the original dataset. The result accordingly contains, for each of the CORINE classes, the number of pixels lying within a flood plain. The result is returned to the user for visualization and further analysis. The respective inputs and the schematic output are shown in Figure 8. Technically, the zonal statistics process is offered to a user if at least one vector feature with polygon geometry type and one raster dataset are selected.
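The per-class pixel count at the heart of the zonal statistics can be sketched in a few lines. The sketch assumes that the flood plain polygon has already been rasterized to a mask aligned with the land cover grid; the small grid and the chosen class codes are illustrative values, not data from the test area.

```python
from collections import Counter


def zonal_class_counts(class_raster, zone_mask):
    """Count the pixels of each land cover class that fall inside the zone."""
    counts = Counter()
    for class_row, mask_row in zip(class_raster, zone_mask):
        for land_cover_class, inside in zip(class_row, mask_row):
            if inside:
                counts[land_cover_class] += 1
    return dict(counts)


# Illustrative 3 x 3 land cover grid using CORINE-style class codes.
land_cover = [
    [112, 112, 211],
    [211, 511, 511],
    [211, 211, 511],
]

# Hypothetical rasterized flood plain: 1 marks pixels inside the polygon.
flood_plain_mask = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 1],
]

counts = zonal_class_counts(land_cover, flood_plain_mask)
```

The resulting dictionary corresponds to the attributes appended to one flood plain feature, with one entry per land cover class present in the zone.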
The third application addresses regular citizens who take observations alongside a certain river and are interested in comparing related observations. Therefore, a geoprocessing pattern with four consecutive processes is designed: first, the considered river is selected by a closest distance operation; second, a buffer is calculated around the river centerline; third, all observations within that buffer are assigned to the corresponding river; and fourth, each observation is connected with available upstream and downstream observations. Those links are appended as attributes and can be used to traverse and compare observations alongside the selected river. The pattern is offered if a river network, that is, a set of directed linear geometries, and a set of point-based observations are selected by a user. A schematic output for a river network and a number of sampled observations is depicted in Figure 9.
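The four steps of this pattern can be sketched in plain Python. Several assumptions are made for illustration: the river is selected as the one closest to a user-supplied reference point, the buffer is expressed as a maximum distance to the centerline rather than as a constructed polygon, each river is a single polyline digitized from upstream to downstream, and all names and coordinates are hypothetical.

```python
import math


def locate(point, line):
    """Return (distance to the line, chainage of the nearest point along it)."""
    best = (math.inf, 0.0)
    travelled = 0.0
    for (ax, ay), (bx, by) in zip(line, line[1:]):
        dx, dy = bx - ax, by - ay
        seg_len = math.hypot(dx, dy)
        if seg_len == 0:
            t = 0.0
        else:
            # Projection parameter, clamped to the segment.
            t = ((point[0] - ax) * dx + (point[1] - ay) * dy) / seg_len**2
            t = max(0.0, min(1.0, t))
        px, py = ax + t * dx, ay + t * dy
        dist = math.hypot(point[0] - px, point[1] - py)
        if dist < best[0]:
            best = (dist, travelled + t * seg_len)
        travelled += seg_len
    return best


def link_observations(rivers, observations, reference, buffer_radius):
    # Step 1: select the river closest to the reference point.
    river_id = min(rivers, key=lambda rid: locate(reference, rivers[rid])[0])
    line = rivers[river_id]
    # Steps 2 and 3: keep observations within the buffer distance of the centerline.
    assigned = []
    for obs_id, xy in observations.items():
        dist, chainage = locate(xy, line)
        if dist <= buffer_radius:
            assigned.append((chainage, obs_id))
    # Step 4: order along the flow direction and link neighbours.
    ordered = [obs_id for _, obs_id in sorted(assigned)]
    links = {}
    for i, obs_id in enumerate(ordered):
        links[obs_id] = {
            "river": river_id,
            "upstream": ordered[i - 1] if i > 0 else None,
            "downstream": ordered[i + 1] if i + 1 < len(ordered) else None,
        }
    return links


rivers = {"A": [(0, 0), (10, 0)], "B": [(0, 20), (10, 20)]}
observations = {"o1": (2, 1), "o2": (8, -1), "o3": (5, 0.5), "o4": (5, 19)}
links = link_observations(rivers, observations, reference=(5, 2), buffer_radius=3)
```

In this example river A is selected, observation `o4` falls outside the buffer, and the remaining observations are chained in the order `o1`, `o3`, `o2` from upstream to downstream.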
The conceptualized and implemented process suggestion system is primarily based on specified data and process constraints, which are embedded in processing profiles. This combination is considered a good starting point for the further development and provision of a harmonized set of process descriptions, in particular for workflow composition. The development of geoprocessing patterns complements this approach by providing an additional abstraction to complex application-oriented workflows. In the current stage of development, the application has been tested by a number of domain experts in a controlled environment. However, it is planned to open the application to a broader audience to facilitate the analysis and combination of spatial data on the web, with particular focus on the monitoring of the ecological quality of water bodies.

Conclusion
The information system developed in this article provides the opportunity to achieve a comprehensive view of a spatial domain, with particular focus on water quality. It not only represents an approach to access the wealth of spatial data on the web, but also provides capabilities to derive actual information from that data. It allows for the integration of distributed spatial data and geoprocessing services in an ad hoc manner, without the need for comprehensive GIS knowledge and resources. The generic infrastructure can be adapted to serve other domains for the general validation, enrichment and combination of spatial data on the web.
Whereas the collection of appropriate spatial data within a domain is already well supported by existing SDI, capabilities for geoprocessing still need to be enhanced. This particularly concerns the semantic description of data and processes, which plays a significant role in the determination and composition of suitable processing workflows. In the presented approach, a significant proportion of the required logic is implemented by domain experts creating domain-specific geoprocessing patterns based on a number of underlying processing services. Thus, a regular user familiar with the domain is still capable of performing more or less complex workflows. However, future work should not only focus on additional geoprocessing functionality, but also on the creation and description of reasonable geoprocessing patterns that provide well-established and frequently applied functionality to a larger group of users. Once a sufficient number of patterns is available, a rating mechanism seems reasonable to further specify their applicability to a specific problem. Moreover, a number of benchmarks should be established to regularly test the accessibility, performance and quality of existing processes and patterns.
The current prototype focuses on a small geographic area with good data coverage. However, to be universally applicable, and ultimately support the determination of global water quality indices for the evaluation of the UN SDGs, an extension to other areas with less data coverage is intended. This especially requires approaches to derive the same information from a wide range of different data sources. It also poses additional challenges for the valorization of citizen support on the technical, organizational, legal and social level.