Customizable process design for collaborative geographic analysis

ABSTRACT Collaborative geographic analysis can lead to better outcomes but requires complicated interactions among participants, support resources and analytic tools. A process expression with explicit structure and content can help coordinate and guide these interactions. For different geographic problems, the structure and content of collaborative geographic analysis are generally distinct. Since the process structure embodies the pathway of problem-solving and the process content contains the information flow and internal interactions, both the structure and the content of the process expression must be clarified during process customization. However, relevant studies concerning the collaborative geographic analysis process mainly focus on the process structure, which remains a “black box” in terms of the process content, especially the internal interactions. Therefore, this article designs a customizable process expression model that takes both process structure and content into account and proposes a corresponding process customization method for collaborative geographic analysis. Additionally, a support method for geographic analysis process implementation is also provided. To verify the feasibility and capability, these methods were implemented in a prototype system, and a case study on traffic noise assessment was conducted. The results suggest that the proposed strategy can effectively improve geographic analysis by customizing processes, guiding participants, performing interactions, and recording operations throughout the process.


Introduction
Geographic analysis can help address various geographic problems by different means, including data analysis, model construction, geographic simulation, and decision making. To address complicated problems, especially interdisciplinary problems, diverse resources and knowledge must be combined during geographic analysis (Voinov et al. 2016; Usón, Klonner, and Höfle 2016; Chen et al. 2020, 2021a). For this purpose, collaboration-supported geographic analysis (namely, collaborative geographic analysis) is needed, and such collaboration can help different participants (especially experts) share ideas, data, and even analytic models to co-analyze geographic environments and better understand the mechanisms underlying these environments (Lü 2011; Basco-Carrera et al. 2017; Dubey et al. 2021).
Collaborative geographic analysis of a specific geographic problem usually involves a series of interlinked interactions, such as data importing, parameter setting, and model invoking. Therefore, participants must perform these interactions in an orderly and coordinated manner, which requires participants to understand the relevant process structure and interaction context. A well-expressed and understandable process can help improve collaborative geographic analysis (Balram and Dragićević 2006). In particular, the expression of the process can benefit (1) participants in reaching consensus on the selection of geographic analysis pathways, (2) newcomers by providing instructions for collaborative process implementation, and (3) problem-solving practices by improving their transparency and reliability (Singleton, Spielman, and Brunsdon 2016; Bandaragoda et al. 2019; Zare et al. 2020).
For different geographic problems, collaborative geographic analysis processes are generally distinct in terms of both structure and content. Therefore, the process of collaborative geographic analysis must be customized adaptively according to the specific problem (Zare et al. 2020; Chen et al. 2021a). Process customization is generally implemented by clarifying the process structure and content (Held and Blochinger 2009). The process structure can embody the implementation logic of geographic analysis; the process content usually involves the information flow and internal interactions. Compared to the process structure, the process content (especially the internal interactions) has not been described well by existing studies, leading to the existence of a "black box" in geographic problem-solving. Therefore, it is difficult for participants to discover and understand the progress and details of geographic analysis; thus, the efficiency of collaboration might decrease.
To improve collaborative geographic analysis, this study aims to design a novel process model and a corresponding process customization method for collaborative geographic analysis. The process model can describe the process structure and content via a unified method, and the process customization method can be utilized to define process implementation logic, control information flow, and record interactions based on the process models. This customizable process can provide participants with an explicit and transparent process to improve collaboration in geographic problem solving.

Literature review
Collaborative geographic analysis provides an effective solution to geographic problems, and the associated process attracts a great deal of attention due to its capacity to help clarify different tasks, guide participants and prepare resources during collaborative geographic analysis. Therefore, many studies have focused on process cognition and customization, as well as collaborative geographic analysis environments.

Process cognition and customization
A clear and understandable process can help guide participants, prepare resources and improve outcomes during geographic problem solving. Therefore, many studies have attempted to clarify such processes (Zhang et al. 2013; Scolobig, Thompson, and Linnerooth-Bayer 2016; Hassenforder et al. 2016; Cradock-Henry et al. 2020). These studies have usually regarded the process of collaborative geographic analysis as a combination of a series of activities (stages, phases, or steps) (Balram and Dragićević 2006; von Korff et al. 2010; Voinov et al. 2018; Halbe, Pahl-Wostl, and Adamowski 2018; Ma et al. 2021). Throughout the entire process, these activities can be carried out for different purposes, including data processing, data visualization, model construction, and model evaluation (Jakeman, Letcher, and Norton 2006; Elsawah et al. 2017; Badham et al. 2019; Ma et al. 2021). Therefore, these activities can be the basic unit of organization for various interactions (e.g. communication, resource sharing, data editing, and model running) during the collaborative geographic analysis process.
To provide an implementation pathway for geographic analysis, a customization method for different processes is needed. The scientific workflow is a typical method that can combine different activities to provide a geographic analysis process (Reuillon, Leclaire, and Rey-Coyrehourcq 2013; Wachowicz et al. 2016; Kruiger et al. 2021). This method has been applied in many different domains, including forest growth prediction (Ma et al. 2019), solar radiation modeling (Radosevic et al. 2020), and water contamination monitoring. To describe the structure and content of the process, different scientific workflow expression methods have been introduced, such as the Business Process Execution Language (BPEL) (Tan et al. 2018) and Petri-Net (Ren et al. 2018). Through the use of these methods, the process of geographic analysis can be customized by defining the combination relationships and determining the data flows among different computation activities (nodes, services or tasks) (De Luca, Silva, and Modica 2021). However, scientific workflow-based process customization methods are mainly utilized for the computation process. For collaborative geographic processes that involve many different interactions (e.g. communication and problem analysis), it remains difficult to describe both the process structure and the content well via a unified customization method.

Collaborative geographic analysis environments
During the process of collaborative geographic analysis, diverse interactions must be performed, which requires the support of appropriate environments (systems or platforms). These environments can provide online workplaces where participants can use the appropriate resources and tools to perform interactions together, for example, collaborative geographic information systems (CGISs) (Sun and Li 2016; Jelokhani-Niaraki 2019), collaborative virtual geographic environments (CVGEs) (Chen et al. 2012; Lin et al. 2013; Zhu et al. 2016), and SWATShare (Rajib et al. 2016). To improve the capabilities of collaborative geographic analysis environments, many researchers have focused on different aspects of this context, including data sharing, model sharing, and real-time interaction (Gan et al. 2020; Zhang et al. 2021b; Butt, Mahmood, and Raza 2018). Meanwhile, different projects, such as the Community Surface Dynamics Modeling System (CSDMS) (Peckham, Hutton, and Norris 2013; Peckham and Goodall 2013) and Open Geographic Modeling and Simulation Systems (OpenGMS) (Chen et al. 2021a), have also provided considerable data and model resources for geographic collaboration. To adapt to changing demands throughout the different activities of the process, resources and tools need to be well-coordinated and provided in these environments. For this purpose, different environments have been developed in which an explicit process expression exists (Jelokhani-Niaraki 2018; Bandaragoda et al. 2019; Xu et al. 2019). In these environments, geographic analysis processes are mainly provided by two approaches. The first approach is to offer the given process expression of geographic analysis for a specific domain (Almoradie, Cortes, and Jonoski 2015; Zhang et al. 2021a). However, the processes in these types of environments are usually exclusive to the target domain.
The other approach is to support users in customizing processes to address different problems (Held and Blochinger 2009; Chawanda et al. 2020). Through this approach, an appropriate process can be generated to suit the particular problem and the required method of addressing it, which can help participants focus on detailed targets and tasks during the geographic analysis process. At present, many collaborative geographic analysis environments that allow for process customization are largely based on the underlying idea of the scientific workflow (Palomino, Muellerklein, and Kelly 2017; Yue et al. 2019; Chen et al. 2021b). Nevertheless, for customizable processes, the linking relationships of activities are usually complicated; thus, the coordination of relevant resources and tools is a key issue that needs to be addressed.
Based on existing achievements, this study focuses on the demand for process customization in collaborative geographic analysis. The study introduces a customizable process design strategy that can achieve process customization and provide implementation support for the customized process. Based on this strategy, participants can first customize the geographic analysis process to include an explicit process structure, controlled information flows, and detailed interaction records; then, the entire process can be implemented with the support of a process-oriented collaborative environment. Therefore, this strategy can help participants, especially a variety of experts, conduct collaborative geographic analysis in an explicit and transparent manner.

Conceptual framework
According to one view of the problem-solving process, process customization is usually achieved as problem cognition deepens (Buzan 1995; Nyerges, Roderick, and Avraam 2013). A hierarchical and progressive structure is therefore more suitable for the process customization involved in collaborative geographic analysis. Accordingly, a conceptual framework for the customizable process design strategy is designed as shown in Figure 1.
In the proposed strategy, the collaborative geographic analysis process is driven by two categories of activities. The first category is analysis-performing activities (Figure 1a). For such activities, participants can share their resources (e.g. data and documents) and utilize tools (e.g. modeling tools and cognitive tools) to conduct interactions for specific targets, including data processing and geographic simulation. The second category is process-constructing activities (Figure 1b). In these activities, participants can share ideas, establish new child activities, and determine the geographic analysis process. Likewise, child activities feature two categories with the same abilities.
These activities, including both parent and child activities, can form a hierarchical structure, as shown in Figure 1c. This hierarchical structure embodies the different cognitive levels of a given geographic problem. For example, participants in the first hierarchy typically focus on an entire geographic problem. As the understanding of the geographic problem deepens, it is seen that the individual problem needs to be addressed in several steps. Thus, participants can create child activities to pursue the target of each step, forming the second hierarchy. Furthermore, these targets might also be divided into several subtargets that require more child activities to help the participants concentrate on specific subtargets and make full use of their expertise.
These activities can be linked to support various geographic analysis scenarios, for instance, outcome comparison and iterative optimization. The links depicted by the lines with arrows in Figure 1c define the information flow, including the resource information and participant information. However, the information flow should be followed conditionally. Without such conditions, some unqualified outputs might be conveyed to other activities, and participants from other activities might perform interactions that are not allowed. Therefore, constraints and controls are introduced, which are represented by gates in Figure 1.
In this conceptual framework, there are four basic elements: participants, resources, tools, and operations. These elements can be used together to record different interactions during the collaborative geographic analysis process.
(1) The joint actions of participants toward the same goals can effectively promote collaborative performance (Patel, Pettitt, and Wilson 2012), because participants can proactively generate ideas, share resources, and utilize tools to perform operations, which is the key to collaborative geographic analysis. During the collaboration process, participants usually have specific roles and responsibilities (Xu et al. 2018). Therefore, it is necessary to understand the behaviors of participants and to provide corresponding authority and instructions to help them work toward collaboration.
(2) Resources can provide initial information to assist problem recognition and the understanding of the geographic environment (Ma et al. 2021; Li et al. 2021). In collaborative geographic analysis, the relevant resources include data, algorithm code, descriptive documents, and even introductory multimedia resources. In particular, data resources can supply considerable information pertaining to the inversion, simulation, and prediction of geographic phenomena. Because adequate resources are the premise of successful geographic problem solving, the sharing and managing of resources must be supported.
Figure 1. The conceptual framework of the customizable process design strategy for collaborative geographic analysis. There are two categories of activities in the strategy, and they can be used to customize the hierarchical process for geographic analysis.
(3) Tools can help users achieve their goals more easily and reduce the social distance between distributed participants in the collaboration (Patel, Pettitt, and Wilson 2012). Geographic analysis usually involves the use of various types of tools, such as communication tools, cognitive tools, and quantitative tools, to support a variety of activities, including data processing, data visualization, and process prediction (Voinov et al. 2018; Li et al. 2019). Thus, to support collaborative geographic analysis, more tools should be made available throughout the geographic analysis process.
(4) Operations from participants can result in various interactions. By clarifying the operations, interactions in collaborative geographic analysis can be adequately recorded and understood. Therefore, process customization for collaborative geographic analysis must consider questions about operations, for example, who uses tools and resources, which tools and resources are used, when the operation occurs, and what results are obtained via the operations.

Collaborative geographic analysis-oriented process customization
According to the conceptual framework, a hierarchical description model and a protocol-based activity linking method were designed. In the process model, four elements (participants, resources, tools, and operations) can be used to record the detailed interactions involved.

A hierarchical process expression model
The hierarchical process expression model is designed to represent the hierarchical process structure and express the process content. As shown in Figure 2, the process expression model is constructed as a node-document structure. The nodes represent geographic analysis activities, and the documents record the statuses of four elements: the contributions of participants, the use of resources and tools, and the details of the operations.
The attribute design of the activity nodes and documents can support the implementation of the node-document structure. The activity node has five categories of attributes, namely, ID, Type, ParentActivity, ChildActivities, and BasicInfo (Figure 3). The ID attribute is the unique identification information of an individual activity and can be used to link it to other activities and documents. The Type attribute indicates whether the current activity is a process-constructing activity or analysis-performing activity. The ParentActivity and ChildActivities attributes record the IDs of the parent activity and the child activities, respectively, and are used to construct the hierarchical structure. BasicInfo is a category of attributes that contains a series of descriptive attributes (e.g. name, description, contained resources, support tools, and created time). In contrast, the activity document has two attributes, ID and Content. The ID attribute has the same identification information as the corresponding activity node, and the Content attribute is employed to store the string-formatted activity document.
Figure 2. The structure of the hierarchical process expression model. In the model, each activity has a corresponding activity document; thus, they can provide both the basic information and internal interaction information of one activity.
According to the node positions in the process expression model, these activities can be viewed as root activities, middle activities, and leaf activities. Root activities are usually process-constructing activities, but they can also be analysis-performing activities if no child activities are needed. Because middle activities have several child activities, they are always process-constructing activities; leaf activities do not have any child activity, and thus, they are analysis-performing activities. In addition, each activity has a unique document to describe the corresponding collaboration scenarios. Therefore, two types of activity documents exist, and they are designed separately for process-constructing activities and analysis-performing activities.
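The node attributes and position rules described above can be summarized in a minimal Python sketch. The five field names follow the attribute categories named in the text; the class name `ActivityNode`, the string values used for `Type`, and the helpers `position` and `is_consistent` are assumptions introduced only for illustration and are not taken from the prototype system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumed string values for the Type attribute.
PROCESS_CONSTRUCTING = "process-constructing"
ANALYSIS_PERFORMING = "analysis-performing"


@dataclass
class ActivityNode:
    """One activity node; fields mirror the five attribute categories."""
    ID: str
    Type: str
    ParentActivity: Optional[str] = None  # ID of the parent; None for a root
    ChildActivities: List[str] = field(default_factory=list)
    BasicInfo: Dict[str, str] = field(default_factory=dict)


def position(node: ActivityNode) -> str:
    """Classify a node as root, middle, or leaf from its parent/child links."""
    if node.ParentActivity is None:
        return "root"
    return "middle" if node.ChildActivities else "leaf"


def is_consistent(node: ActivityNode) -> bool:
    """Check the type rule for a fully constructed process: activities with
    children are process-constructing, those without are analysis-performing."""
    if node.ChildActivities:
        return node.Type == PROCESS_CONSTRUCTING
    return node.Type == ANALYSIS_PERFORMING


root = ActivityNode("a0", PROCESS_CONSTRUCTING, None, ["a1"], {"name": "Noise assessment"})
leaf = ActivityNode("a1", ANALYSIS_PERFORMING, "a0", [], {"name": "Data processing"})
```

Under these assumptions, a root activity without children passes the check only as an analysis-performing activity, which matches the exception stated for root activities.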
The activity document is constructed by using extensible markup language (XML), and the elements in the collaboration scenarios are represented by XML nodes. Furthermore, there are some discrepancies between the two types of activity documents used to describe the process-constructing activity and analysis-performing activity.
In the document for the process-constructing activity, the root node of the XML document refers to the current activity, and its attributes include the ID, name, description, and type information. The root node contains five nodes, the Participants node, ResourceCollection node, OperationRecords node, ChildActivities node, and ActivityDependencies node, as shown in Figure S1.
(1) Under the Participants node, each person engaged in the collaboration is described by a Person node, which records the information of the user, including the e-mail, name, role, and state. In particular, the "role" attribute defines the user role undertaken in the collaborative geographic analysis. There are four levels of user roles (manager, core member, normal member, and visitor) to support different operation authorities. The "state" attribute has two options ("in" and "out") to indicate whether the user is still involved in the collaboration.
(2) The ResourceCollection node can have several Resource nodes. The Resource node has six attributes: "id," "name," "type," "provider," "href," and "state." The "type" attribute explains the type of resource. In collaborative geographic analysis, the resource types include data, parameters (file-organized parameters), models (e.g. executable files and model codes), documents (e.g. reports, record documents, and technical documents), papers, images, and even videos, which can facilitate geographic analysis to different extents. The "provider" attribute records the user who provides the resource, and it can link to the specific person in the Participants node. In addition, the "href" attribute provides a hyperlink to access a resource in the web environment, and the "state" attribute indicates whether this resource is accessible or removed.
(3) The OperationRecords node contains three different types of Operation nodes to describe process-related operations, activity-related operations, and communication-related operations. Specifically, the "type" attribute explains the type of operation, and the "behavior" attribute describes the detailed operations that users perform in terms of the activity and process. For example, the creation of a child activity is defined as a "create" behavior in the "activity" type of operation; modifying the geographic analysis process is a "modify" behavior in the "process" type of operation. In the "communication" type of the Operation node, the "resRef" attribute can link to a Resource node, which describes the communication records. In addition, both the "operator" attribute and the PersonRef node can link to a Person node and indicate that the user performed the operation.
(4) The ChildActivities and ActivityDependencies nodes are designed to support the construction of the geographic analysis process. The ChildActivities node contains several Child nodes to describe the child activities of the current activity. The ActivityDependencies node can explain the linking relationships of these child activities through the Relation node.
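As an illustration of the document layout listed in items (1) to (4), the following Python sketch assembles a small process-constructing activity document with the standard `xml.etree.ElementTree` module. The five child node names, the Person, Resource, Operation, Child, and Relation node names, and the quoted attribute names come from the text. Everything else is an assumption: the root tag `Activity`, the exact attribute spellings on Person (`email`), the `id` attribute on Child, the `from` and `to` attributes on Relation, the value `accessible`, and the use of an e-mail address as the key that links `provider` and `operator` to a Person node.

```python
import xml.etree.ElementTree as ET

# Root node: the current activity with its ID, name, description, and type.
activity = ET.Element("Activity", {
    "id": "a0", "name": "Noise assessment",
    "description": "Root activity", "type": "process-constructing",
})

# (1) Participants: one Person node per engaged user.
participants = ET.SubElement(activity, "Participants")
ET.SubElement(participants, "Person", {
    "email": "alice@example.org", "name": "Alice",
    "role": "manager", "state": "in",
})

# (2) ResourceCollection: Resource nodes with the six listed attributes.
resources = ET.SubElement(activity, "ResourceCollection")
ET.SubElement(resources, "Resource", {
    "id": "r1", "name": "chat-log", "type": "document",
    "provider": "alice@example.org", "href": "https://example.org/r1",
    "state": "accessible",
})

# (3) OperationRecords: here, the creation of a child activity.
records = ET.SubElement(activity, "OperationRecords")
ET.SubElement(records, "Operation", {
    "type": "activity", "behavior": "create", "operator": "alice@example.org",
})

# (4) ChildActivities and ActivityDependencies.
children = ET.SubElement(activity, "ChildActivities")
ET.SubElement(children, "Child", {"id": "a1"})
ET.SubElement(children, "Child", {"id": "a2"})
deps = ET.SubElement(activity, "ActivityDependencies")
ET.SubElement(deps, "Relation", {"from": "a1", "to": "a2", "protocol": "p1"})

# The string form is what the Content attribute of the activity document would hold.
document = ET.tostring(activity, encoding="unicode")
```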
The document for the analysis-performing activity has four XML nodes under the root node, namely, the Participants node, ResourceCollection node, ToolBox node, and OperationRecords node. The Participants node and ResourceCollection node are the same as those in the process-constructing activity document, whereas the other nodes differ (Figure S2).
(1) The ToolBox node is a collection of a series of Tool nodes, and it represents the tools used in the geographic analysis. The Tool node contains seven attributes, namely, "id," "name," "type," "function," "provider," "href," and "state." Notably, the "type" attribute represents the type of tool, such as a data processing tool, data visualization tool, or geographic simulation tool. The "function" attribute defines the detailed capabilities of the tool. For example, the function can be "slope calculation" or "format conversion from GeoJSON to Shapefile." The other attributes are the same as the corresponding attributes in the Resource node.
(2) The OperationRecords node contains four types of Operation nodes to record resource-related, tool-related, communication-related, and analysis-related operations. The nodes for resource- and tool-related operations indicate the behaviors applied to resources and tools (e.g. the uploading, sharing, and removal of resources and tools) through the "behavior" attribute. They can also be linked to a specific Person node, Resource node, and Tool node through the "operator," "resRef," and "toolRef" attributes, respectively. Compared with the process-constructing activity document, the communication-related Operation node has an extra "tool" attribute that indicates which tool is used for communication. Moreover, the analysis-related Operation node records the purpose and the tool of an analysis operation. This node also contains ResRef nodes and PersonRef nodes to explain the operation in detail. The ResRef node has three types to represent the possible parameters, inputs, and outputs. The PersonRef node is designed to describe the engaged cooperators and can be linked to a Person node for detailed user information.
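To show how the references in an analysis-related Operation node might be resolved, the sketch below parses a small, hypothetical analysis-performing activity document and turns each analysis operation into readable names. The node names and the quoted attributes follow the text; the root tag `Activity`, the `purpose` and `tool` attribute spellings on the analysis Operation, the `idRef` attribute on ResRef, the `email` attribute on Person and PersonRef, and the `type` values `parameter`, `input`, and `output` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

DOC = """
<Activity id="a1" name="Data processing" type="analysis-performing">
  <Participants>
    <Person email="alice@example.org" name="Alice" role="manager" state="in"/>
    <Person email="bob@example.org" name="Bob" role="core member" state="in"/>
  </Participants>
  <ResourceCollection>
    <Resource id="r1" name="dem.tif" type="data" provider="alice@example.org"
              href="https://example.org/r1" state="accessible"/>
    <Resource id="r2" name="slope.tif" type="data" provider="bob@example.org"
              href="https://example.org/r2" state="accessible"/>
  </ResourceCollection>
  <ToolBox>
    <Tool id="t1" name="SlopeTool" type="data processing tool"
          function="slope calculation" provider="alice@example.org"
          href="https://example.org/t1" state="accessible"/>
  </ToolBox>
  <OperationRecords>
    <Operation type="analysis" purpose="derive slope" tool="t1"
               operator="bob@example.org">
      <ResRef type="input" idRef="r1"/>
      <ResRef type="output" idRef="r2"/>
      <PersonRef email="alice@example.org"/>
    </Operation>
  </OperationRecords>
</Activity>
"""


def describe_analysis(doc: str) -> list:
    """Resolve each analysis-related Operation into readable names."""
    root = ET.fromstring(doc)
    people = {p.get("email"): p.get("name") for p in root.iter("Person")}
    resources = {r.get("id"): r.get("name") for r in root.iter("Resource")}
    tools = {t.get("id"): t.get("function") for t in root.iter("Tool")}
    summaries = []
    for op in root.iter("Operation"):
        if op.get("type") != "analysis":
            continue
        # Group referenced resources by the three ResRef types.
        refs = {"parameter": [], "input": [], "output": []}
        for ref in op.findall("ResRef"):
            refs[ref.get("type")].append(resources[ref.get("idRef")])
        summaries.append({
            "operator": people[op.get("operator")],
            "cooperators": [people[p.get("email")] for p in op.findall("PersonRef")],
            "function": tools[op.get("tool")],
            "inputs": refs["input"],
            "outputs": refs["output"],
        })
    return summaries
```

A record resolved this way answers the questions raised for the operations element: who used which tool on which resources, and what results were obtained.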

A protocol-based process customization method
To customize the geographic analysis process, each activity must be linked with others for problem solving. In particular, activity dependencies should be defined when linking these activities. The linking relationships and information flow among different activities must be identified. Therefore, a protocol-based activity linking method is designed to link different collaborative activities.

Design of the activity linking protocol
The key to the protocol-based activity linking method is the activity linking protocol, which defines the linking relationships of activities and the flow of information in detail. In the collaborative geographic analysis process, the involved information usually concerns participants and resources.
The activity linking protocol first clarifies the linking relationships and identifies the prerequisite activities and the subsequent activities. There are four types of linking relationships: the sequence, the branch, the merger, and the loop, as shown in Figure 4. In the sequence relationship, at least two geographic analysis activities are connected one after another. The branch relationship requires several subsequent activities that depend on the same prerequisite activity. The merger relationship is formed by at least three activities but has only one subsequent activity. The loop relationship is similar to the sequence relationship, but the last activity is also a prerequisite for the first activity.
According to the defined linking relationships, the output resources of a prerequisite activity can be used by its subsequent activities, but unqualified resources, such as data with incorrect formats or inappropriate geographic references, should be excluded. Therefore, a constraint method is designed in the protocol to control the resource information flow. This constraint method involves six items: resource types, file formats, temporal and spatial scales, geographic references, units of values, and resource concepts, as shown in Figure 5. Only resources that meet all of these constraints can be conveyed to subsequent activities. For instance, when participants make a protocol and stipulate that only data type resources in the "Shapefile" and "NetCDF" formats can be accepted by the next activity, other types of resources and files in other formats are filtered out.
Participants in the prerequisite activity can also be allowed to enter other activities in line with the linking relationships in the protocol, but it is unreasonable for all participants to join the subsequent activities without any constraints. To control entrance and prevent unauthorized persons from accessing sensitive data, the flow of participant information must be implemented conditionally. In the protocol, this constraint is implemented on the basis of three items: the roles, domains, and organizations of users (Figure 5). If participants meet the role, domain and organization constraints, they will be allowed to join the subsequent activities. For example, if the role is set to "Expert" and the domain is set to "sensitivity analysis" in the protocol, a user with an expert role and substantial experience in sensitivity analysis can enter the subsequent activity without the approval of the activity managers.
To describe these items of resource and participant constraints, condition tags are utilized. One constraint item can have several different condition tags. If a specific participant or resource matches one of the condition tags, this participant or resource meets the constraint. The information flow of resources (or participants) is allowed only if the resources (or participants) meet all resource (or participant) constraints. In addition, a constraint item may have no condition tags, in which case all participants and resources satisfy that constraint.
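The matching rule for condition tags (any tag within an item, all items together, and an empty item accepting everything) can be written as a short Python sketch. The function `meets`, the dictionary layout, and the item keys such as `format` and `domain` are hypothetical; the tag values reuse the examples given in the text.

```python
from typing import Dict, List, Set

# A constraint set maps each constraint item to its condition tags.
Constraints = Dict[str, Set[str]]


def meets(description: Dict[str, List[str]], constraints: Constraints) -> bool:
    """Return True if the described resource or participant may flow onward.

    An item with no tags is satisfied by everything; otherwise at least one
    described value must match one of the item's condition tags. Every item
    must be satisfied for the flow to be allowed.
    """
    for item, tags in constraints.items():
        if not tags:
            continue  # no condition tags: the constraint is always met
        if not tags.intersection(description.get(item, [])):
            return False
    return True


# Six resource items and three participant items, as listed in the protocol.
resource_constraints: Constraints = {
    "type": {"data"},
    "format": {"Shapefile", "NetCDF"},
    "scale": set(), "reference": set(), "unit": set(), "concept": set(),
}
participant_constraints: Constraints = {
    "role": {"Expert"},
    "domain": {"sensitivity analysis"},
    "organization": set(),
}

roads = {"type": ["data"], "format": ["Shapefile"], "scale": ["urban"]}
report = {"type": ["document"], "format": ["PDF"]}
expert = {"role": ["Expert"], "domain": ["hydrology", "sensitivity analysis"]}
```

With these example values, `meets(roads, resource_constraints)` and `meets(expert, participant_constraints)` hold, whereas `meets(report, resource_constraints)` does not because the report fails both the type and the format items.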

Implementation of the protocol-based linking method
The protocol-based linking method can be employed in process customization. The implementation of the activity linking method includes protocol generation, protocol management, and the control of information flow.
Since the linking relationships of activities can usually be clarified according to the experiences of participants and the results of discussions, the most important step for protocol generation is to set the condition tags for each constraint of the information flow. The condition tags can be determined according to the metadata (and other descriptive information) of existing participants and resources in the prerequisite activities. For example, several different scales of data can be involved in an activity, including the "global," "national," and "urban" scales. Participants can obtain these different scales and select the "national" scale as the constraint for data transmission. Therefore, the setting of constraints requires a detailed description of resources and participants. For this purpose, the activity document is extended at the Resource node and the Person node, as shown in Figure 6. In the Resource nodes, a series of Metadata nodes are added to describe the resource. These Metadata nodes can indicate the attributes of format, scale, reference, unit, and concept. The Person node is also supplied with a Domain node and Organization node to provide the personal information of domains and organizations. Notably, the Domain node and the Organization node can be used repeatedly because experts might have expertise in more than one domain, and they can also be employed by different organizations. The extension of activity documents can supply reasonable options for setting constraints in the protocol. While participants are setting the protocol, they can automatically obtain the extended information of resources and persons and can set the information as condition tags.
Figure 6. Extensions of the activity document. The metadata node, domain node, and organization node are added to the activity document to provide extra information on resources and participants.
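The step of obtaining candidate condition tags from the extended activity document can be illustrated with a short Python sketch. The Metadata, Domain, and Organization node names follow the text; representing the kind of metadata with a `type` attribute and storing values as element text are assumptions, as are the function name `candidate_tags` and the sample content.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DOC = """
<Activity id="a1">
  <Participants>
    <Person email="bob@example.org" name="Bob" role="Expert" state="in">
      <Domain>hydrology</Domain>
      <Domain>sensitivity analysis</Domain>
      <Organization>Example University</Organization>
    </Person>
  </Participants>
  <ResourceCollection>
    <Resource id="r1" name="roads" type="data" state="accessible">
      <Metadata type="format">Shapefile</Metadata>
      <Metadata type="scale">urban</Metadata>
    </Resource>
    <Resource id="r2" name="landcover" type="data" state="accessible">
      <Metadata type="format">NetCDF</Metadata>
      <Metadata type="scale">national</Metadata>
    </Resource>
  </ResourceCollection>
</Activity>
"""


def candidate_tags(doc: str) -> dict:
    """Collect the distinct values that could be offered as condition tags."""
    root = ET.fromstring(doc)
    tags = defaultdict(set)
    for res in root.iter("Resource"):
        tags["type"].add(res.get("type"))
        for meta in res.findall("Metadata"):
            tags[meta.get("type")].add(meta.text)
    for person in root.iter("Person"):
        tags["role"].add(person.get("role"))
        # Domain and Organization may repeat for one person.
        for dom in person.findall("Domain"):
            tags["domain"].add(dom.text)
        for org in person.findall("Organization"):
            tags["organization"].add(org.text)
    return {item: sorted(values) for item, values in tags.items()}
```

Participants setting a protocol could then be offered, for example, the scales found in the prerequisite activity and pick one of them as the constraint for data transmission.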
When a protocol is formulated, it must be stored and managed, as shown in Figure 7. The formulated protocol is stored in the protocol repository, which is a collection of protocols used in the entire geographic analysis process. Based on the protocol repository, protocols can be conveniently managed, and the information of linking relationships and constraints can be easily queried. Additionally, the protocol will be recorded in the activity document to record the operations of process construction. The ActivityDependencies node in the activity document contains several Relation nodes that can record the flow direction between child activities. By means of the "protocol" attribute, the Relation node can query a specific protocol from the protocol repository to obtain detailed constraint information.
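A minimal Python sketch of this lookup is given below, assuming the protocol repository is a mapping from protocol IDs to stored protocols. The ActivityDependencies and Relation node names and the "protocol" attribute come from the text; the `from` and `to` attributes, the repository layout, and the function `constraints_between` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Protocol repository: protocol ID -> stored protocol (hypothetical layout).
repository = {
    "p1": {
        "relation": "sequence",
        "activities": ["a1", "a2"],
        "resource_constraints": {"format": ["Shapefile", "NetCDF"]},
        "participant_constraints": {"role": ["Expert"]},
    },
}

DEPENDENCIES = """
<ActivityDependencies>
  <Relation from="a1" to="a2" protocol="p1"/>
</ActivityDependencies>
"""


def constraints_between(deps_xml, source, target, repo):
    """Follow a Relation node's "protocol" attribute into the repository."""
    for rel in ET.fromstring(deps_xml).iter("Relation"):
        if rel.get("from") == source and rel.get("to") == target:
            return repo[rel.get("protocol")]
    return None  # no recorded flow in this direction
```

Keeping only the protocol ID in the activity document, as sketched here, leaves the detailed constraints in one place, so that a change to a protocol does not require rewriting the documents that refer to it.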
After the protocol is generated and the process is customized, the information flow of resources and participants can be controlled. Specifically, since each protocol can represent the dependencies of several activities, all protocols in the protocol repository can define an activity-driven directed cyclic graph. When the state of a resource or participant is changed (for example, in the generation of the geographic simulation results), the method will first check the connectivity of the directed cyclic graph and identify the activities that will be influenced. Then, the method will judge the constraints to update the resources or change the authority of participants in the influenced activities. Thus, resource and participant control over the entire geographic analysis process can be implemented.
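The connectivity check can be sketched in Python as follows. Each protocol is reduced here to a relationship type and an ordered list of activities; the edge rules follow the definitions of the four linking relationships, while the function names, the tuple layout, and the activity names are assumptions. The sketch covers only the identification of influenced activities; the constraint judgment would then be applied along each traversed link.

```python
from collections import defaultdict, deque


def edges_for(relation, activities):
    """Expand one protocol into (prerequisite, subsequent) edges."""
    if relation == "sequence":
        return list(zip(activities, activities[1:]))
    if relation == "loop":  # a sequence whose last activity feeds the first
        return list(zip(activities, activities[1:])) + [(activities[-1], activities[0])]
    if relation == "branch":  # first activity is the shared prerequisite
        return [(activities[0], a) for a in activities[1:]]
    if relation == "merger":  # last activity is the single subsequent one
        return [(a, activities[-1]) for a in activities[:-1]]
    raise ValueError(f"unknown relation: {relation}")


def build_graph(protocols):
    """Combine all protocols in the repository into one directed graph."""
    graph = defaultdict(set)
    for relation, activities in protocols:
        for src, dst in edges_for(relation, activities):
            graph[src].add(dst)
    return graph


def influenced(graph, changed):
    """Breadth-first search for activities reachable from the changed one.

    The visited set keeps the search finite even when loops are present.
    """
    seen, queue = set(), deque([changed])
    while queue:
        for nxt in graph.get(queue.popleft(), ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


protocols = [
    ("sequence", ["prepare", "simulate"]),
    ("branch", ["simulate", "map", "report"]),
    ("loop", ["map", "review"]),
]
graph = build_graph(protocols)
```

In this example, a change in the "simulate" activity influences "map", "report", and "review", whereas a change in "report" influences no other activity.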

Support method for process implementation
To support process-based geographic analysis, it is important to provide a collaborative environment that can help participants access geographic analysis resources and tools to work together. Therefore, resources should be well managed and coordinated in the environment. In addition, to help participants perform operations, including data analysis, geographic modeling and simulation, various geographic analysis tools can be integrated into the environment.

Design of the collaborative geographic analysis workspace
In an open web environment, convenient resource sharing, rapid knowledge access, and reliable distributed operations are available and have greatly promoted the development of collaborative geographic analysis (Palomino, Muellerklein, and Kelly 2017; Yue et al. 2019; Wang et al. 2020; Sun and Li 2016). Thus, a web-based workspace is needed for the implementation of collaborative geographic analysis. However, a single geographic analysis workspace does not seem appropriate for supporting process-based geographic analysis because it cannot distinguish among the participants, resources, and tools involved in different activities of the process.
In this study, the geographic analysis environment is composed of a series of workspaces that correspond to each activity of the customizable process. For the two categories of geographic analysis activity, two types of workspaces are prepared:
(1) The workspace of the process-constructing activity has three main functions: communication, activity linking, and resource collection. In this type of workspace, participants can communicate with each other and discuss the geographic analysis pathway. Based on the communication results, participants can link different child activities using the protocol-based activity linking method. In addition, they can collect related resources for further geographic analysis in the child activities.
(2) The workspace of the analysis-performing activity also has three main functions: resource access, tool use, and operation management. In this analysis-oriented workspace, participants can access the necessary resources and collaborative tools. With the support of the resources and tools, participants can perform different operations to analyze and address geographic problems. Moreover, the operations that are recorded in the activity document can be shown in the workspace. This workspace allows participants to create geographic analysis tasks to organize and manage various operations. Thus, operations can be organized by means of different geographic analysis tasks according to their purpose, and other operations that are not managed by a task are viewed as temporary operations.

Coordination of geographic analysis resources
The geographic analysis resources that can be accessed for collaboration are provided by three main sources: shared online resources, uploaded offline resources, and analysis-generated resources. To coordinate these resources for collaborative geographic analysis, the proposed strategy provides a common resource repository for each activity, as shown in Figure 8. Based on the common resource repository, participants can access the needed resources in the workspace and apply the acquired resources to perform interactions. Simultaneously, each participant is provided with a private resource repository that can store personal resources online.
In the practice of geographic analysis, participants can upload resources to the common resource repository directly and share online resources from their personal repositories to the common repository. In addition, due to the hierarchical structure of activities, an activity can inherit geographic analysis resources from its parent activity. When an activity is linked with other activities by the activity linking protocol, the resources in the prerequisite activities can also be transferred to the current activity under the resource constraints that are defined in the protocol. Although some factors might lead to the failure of resource transfer, such as an inappropriate resource constraint definition or incomplete resource metadata, participants can also manually apply to obtain these resources from the prerequisite activities. Therefore, resources from different sources can converge for collaborative geographic analysis.
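The convergence of resources from private repositories and parent activities can be illustrated with the short Python sketch below. The `Activity` class, the helper functions, and the file names are hypothetical; in particular, treating inheritance as transitive along the whole parent chain is an assumption of this sketch, not a documented behavior of the prototype system.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Activity:
    name: str
    parent: Optional["Activity"] = None
    common_repository: set = field(default_factory=set)


def share_to_common(private_repository, activity, resource):
    """Share a resource from a participant's private repository to the activity."""
    if resource not in private_repository:
        raise KeyError(f"{resource!r} is not in the private repository")
    activity.common_repository.add(resource)


def accessible_resources(activity):
    """Own common repository plus everything inherited along the parent chain."""
    collected = set()
    node = activity
    while node is not None:
        collected |= node.common_repository
        node = node.parent
    return collected


# Made-up activities and file names for illustration only.
root = Activity("traffic noise assessment", common_repository={"study_area.geojson"})
simulation = Activity("traffic noise simulation", parent=root)
private_repository = {"roads.osm", "notes.txt"}

share_to_common(private_repository, simulation, "roads.osm")
visible = accessible_resources(simulation)
```

Resources transferred from prerequisite activities under protocol constraints, or obtained by manual application, would simply be added to the same common repository.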

Design of service-based tools for process implementation
Geographic analysis tools have specific functions that can be used to support analysis-performing activities. During the geographic analysis process, once an analysis-performing activity is created, geographic analysis tools are automatically supplied for this activity in accordance with the corresponding purposes. Therefore, various collaborative tools with different functions (especially data processing and geographic simulation) must be developed and provided for geographic analysis.
However, for complex geographic problems, the development of collaborative tools usually requires considerable repetitive programming work, which is not a feasible way of supplying numerous geographic analysis tools. Thus, it is necessary to create tools using existing resources. The development of service-oriented architecture (SOA), which has promoted the sharing and reuse of geographic analysis models, can also provide opportunities to utilize existing web services to create collaborative tools (Nativi, Mazzetti, and Geller 2013; Walker and Chapra 2014; Zhu and Yang 2019).
The services utilized to create tools are mainly taken from the geographic analysis model service container and the data service container provided by the OpenGMS team, which aims to facilitate broader participation and exploration of geographic modeling and simulation. These services are accompanied by detailed descriptive information, including inputs, parameters, and outputs.
Two collaborative tool templates can be used to create service-based tools. These tool templates are designed to access the model services from the model service container and the data processing services from the data service container. The model services are accompanied by a model description language (MDL) document that introduces the information of the model input, output, and parameters (Yue et al. 2016; Wen et al. 2017; Zhang et al. 2019, 2020). This information can be employed using the model tool template to form the user interface (UI) of the model tool. Additionally, the data processing services that can address data conversion and visualization also provide input and output information (Wang et al. 2020). Thus, the data processing tool template can generate tool components for setting data and invoking services. The tool templates can also record user operations and synchronize these operations to other users for collaborative geographic analysis. Based on the templates, the execution result of the tools can be stored in the resource repository for further geographic analysis.
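To make the template idea concrete, the following Python sketch shows one way a template could derive UI fields from a service description and record operations for synchronization. It is a minimal sketch under stated assumptions: the dictionary layout is a hypothetical stand-in and does not reproduce the actual MDL schema, and `build_tool_form`, `OperationLog`, and the example service are invented for illustration.

```python
def build_tool_form(description):
    """Turn a service description into a flat list of UI field specifications."""
    fields = []
    for item in description.get("inputs", []):
        # Inputs are chosen from the resource repository.
        fields.append({"name": item["name"], "widget": "resource-picker"})
    for item in description.get("parameters", []):
        widget = "number" if item.get("type") in ("int", "float") else "text"
        fields.append({"name": item["name"], "widget": widget,
                       "default": item.get("default")})
    return fields


class OperationLog:
    """Records user operations and forwards each one to subscribers."""

    def __init__(self):
        self.records = []
        self.subscribers = []

    def record(self, user, action, payload):
        entry = {"user": user, "action": action, "payload": payload}
        self.records.append(entry)
        for callback in self.subscribers:
            callback(entry)  # e.g. push the operation to other participants
        return entry


# Hypothetical description of an interpolation service.
description = {
    "name": "idw_interpolation",
    "inputs": [{"name": "sample_points"}],
    "parameters": [
        {"name": "power", "type": "float", "default": 2.0},
        {"name": "output_name", "type": "string", "default": "result"},
    ],
    "outputs": [{"name": "interpolated_raster"}],
}

form = build_tool_form(description)
log = OperationLog()
received = []                       # stands in for another participant's client
log.subscribers.append(received.append)
log.record("user_1", "set_parameter", {"power": 3.0})
```

Because the form is generated from the description, a new service could be wrapped as a tool without writing a dedicated interface, which is the motivation given above for using templates.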

A prototype system based on the customizable process design strategy
To customize the process of collaborative geographic analysis, the proposed strategy is applied to develop a prototype system (https://geomodeling.njnu.edu.cn/PExploration). Figure 9 illustrates the architecture of the prototype system, which contains the supporting repositories, functional modules, and applications for geographic analysis.
There are six repositories in Figure 9 that provide supporting resources for collaborative geographic analysis. These supporting resources include created activities, generated activity documents, activity linking protocols, engaged users, collected analysis resources, and prepared collaborative tools. These supporting resources are stored in several repositories based on MongoDB, which is a flexible document-oriented database that is suitable for storing the heterogeneous data involved in the geographic analysis process (Banker et al. 2016). Specifically, the repositories of activities, activity documents, and protocols store the information used to customize the geographic analysis process. The user repository provides information concerning geographic analysis participants, such as roles, domains, and organizations. The analysis resource repository and collaborative tool repository supply geographic analysis resources and tools for addressing geographic problems.
Based on these supporting repositories, the prototype system provides seven main functional modules that offer capabilities pertaining to managing activities, tools, and resources, controlling user authorities, linking activities, performing interactions, and supporting collaboration. The activity management module allows participants to manage (create/remove/query/update) the two categories of activities. When an activity is created (or changed), this module generates (or updates) the corresponding activity document. The activity linking module provides functions to help participants establish protocols and link activities. The interaction performing module includes a workspace where participants can conduct co-analysis using various resources and tools. The resource management module is responsible for resource collection and sharing; it can manage different types of resources for convenient application. The tool management module can be employed to create a tool using web services, in addition to performing the functions of updating and sharing. The user authority control module can not only manage the participant information but can also control user authorities in the context of collaborative interactions. In the collaborative geographic analysis, all collaboration-related functions, for example, state synchronization and message transmission, are provided by the collaboration support module.
For a geographic problem that requires knowledge and resources in different fields, participants with different backgrounds need to gather to address the problem. On the basis of these functional modules, participants can customize the process to improve collaborative geographic analysis in four stages. (1) They can first discuss the geographic analysis method of addressing the problem and clarify potential targets, necessary resources and required geographic models for the method. Once they come to an agreement regarding the method, the appropriate activities can be determined. Then, participants can create different activities according to the targets to prepare the resources and models and can customize the geographic analysis process based on the hierarchical description model and the protocol-based activity linking method. (2) The customized process can supply explicit instructions for collaboration. According to the customized process, participants can be instructed and can select specific activities for collaboration. (3) Subsequently, participants can work together in the workspace corresponding to the activity to engage in collaborative activities, such as problem discussion, data processing, and geographic simulation. (4) Simultaneously, information concerning the geographic analysis process, including the pathway of geographic analysis and the users, resources, tools, and operations involved, is recorded to improve transparency and reliability. Notably, these four stages are not conducted sequentially. For example, after participants receive guidance and begin interaction, they can still modify and optimize the customized process. The recording process is also not an independent stage; it is conducted alongside the other stages to record the process customization, participant change, and geographic analysis operations.
The use of the prototype system is shown in Figure S3. This figure shows an example of activity linking based on a customizable process design strategy. While linking activities, participants can select the target activities and link them by establishing the protocol. Figure S4 depicts an example of a collaborative scenario in the workspace.

Background
Given the acceleration of urbanization and the rise in car ownership, exposure to traffic noise is increasing. Therefore, individuals pay attention to whether their community is exposed to traffic noise pollution. As an approach to geographic analysis, traffic noise assessment can play an important role in identifying traffic noise impacts on a community. Therefore, several participants decided to analyze the traffic noise status of their communities, and the traffic noise assessment was conducted via a collaborative geographic analysis system.

Participants, resources and tools
There were different participants in this experiment who were geographically distributed in different regions of Nanjing City and had different expertise backgrounds. For example, one participant was an expert in traffic noise modeling, two participants were familiar with urban infrastructure, and others were local residents who engaged in this collaboration as stakeholders. These participants decided to share their resources and knowledge to perform this experiment.
The resources utilized in this experiment were mainly road data, urban building data, and sound barrier data that were provided by participants familiar with urban infrastructure. The data formats were OpenStreetMap (OSM) and GeoJSON, and the geographic reference of the data was the World Geodetic System 1984 (WGS84). The resources were shared on the prototype system; thus, participants could use them in the collaborative traffic noise assessment.
Several real-time online tools could be accessed on the prototype system to support the collaboration in this experiment. Some of the tools were provided by the system, including a communication tool, a mind map tool, a map editing tool, and a traffic noise data preparation and visualization tool. Some key tools for data processing and traffic noise simulation, such as a traffic noise simulation tool, a data interpolation tool (inverse distance weighted interpolation), reprojection tools (for GeoTIFF and Shapefile), and a data conversion tool (for converting from GeoJSON data to Shapefile data), were created based on OpenGMS model services and data processing services. Detailed information about these tools is listed in Table 1.

Process customization
For the implementation of traffic noise assessment, participants first created a root activity that provided a collaborative workspace for all participants. In the root activity, participants discussed and determined an overall route with 4 targets: identifying the problem, selecting models, assessing traffic noise, and optimizing urban planning to improve traffic noise. Thus, they created 4 activities to correspond to these targets. With the exception of the problem identification activity, participants realized that they needed more child activities to complete these targets and coordinate the participants, resources, and tools. For example, in the model selection activity, several participants familiar with traffic noise models first needed to find an appropriate model, and then needed to help other participants learn to use the model. Consequently, they created several child activities for the traffic noise model selection activity, traffic noise simulation activity, and urban planning activity, as shown in Figure 10.
During the experiment, participants discussed the process structure and collaboratively linked these activities to customize the entire process for traffic noise assessment. Specifically, participants utilized the protocol-based activity linking method to define the activity relationships and control the information flow. For instance, participants used four activity linking protocols to link the child activities of the collaborative traffic noise simulation activity, as shown in Figure 11. These four protocols determined the linking relationships among activities and the constraints on resource and participant flows. As an example, Protocol 1 determined the sequence relationship between the data collection activity and the data preprocessing activity. This protocol provided the participant constraints, including "Role: Manager; Expert" and "Domains: Traffic noise modeling; Urban infrastructure;" the resource constraints included "Types: Data," "Formats: OSM; GeoJSON; Shapefile," "References: WGS84; Beijing54 (Beijing 54 Coordinate System)," and "Concepts: Road data; Building data; Sound barrier data." Therefore, for the resources collected in the data collection activity, only road data, building data, and sound barrier data employing one of the OSM, GeoJSON, or Shapefile formats and using the geographic reference data of WGS84 or Beijing54 could be conveyed to the subsequent activities for further data processing. After the activity linking interactions in the four process-constructing activities, the process for traffic noise assessment was customized, as shown in Figure S5.
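The constraints of Protocol 1 can be written down as data and applied as a filter, as in the Python sketch below. The constraint values are those quoted above; the dictionary layout, the sample resources and participants, and the rule that a participant must match both the role constraint and at least one domain are assumptions made for illustration.

```python
# Constraint values taken from Protocol 1 in the case study.
protocol_1 = {
    "participant": {
        "roles": {"Manager", "Expert"},
        "domains": {"Traffic noise modeling", "Urban infrastructure"},
    },
    "resource": {
        "type": {"Data"},
        "format": {"OSM", "GeoJSON", "Shapefile"},
        "reference": {"WGS84", "Beijing54"},
        "concept": {"Road data", "Building data", "Sound barrier data"},
    },
}


def resource_passes(resource, constraints):
    """Every constrained attribute must carry one of the allowed values."""
    return all(resource.get(key) in allowed for key, allowed in constraints.items())


def participant_passes(person, constraints):
    """Assumed rule: allowed role AND at least one allowed domain."""
    return (person["role"] in constraints["roles"]
            and bool(set(person["domains"]) & constraints["domains"]))


# Made-up resources in the data collection activity.
collected = [
    {"name": "roads", "type": "Data", "format": "OSM",
     "reference": "WGS84", "concept": "Road data"},
    {"name": "buildings_raster", "type": "Data", "format": "GeoTIFF",
     "reference": "WGS84", "concept": "Building data"},
    {"name": "barriers", "type": "Data", "format": "GeoJSON",
     "reference": "WGS84", "concept": "Sound barrier data"},
    {"name": "noise_model", "type": "Model", "format": None,
     "reference": None, "concept": None},
]
# Made-up participants.
people = [
    {"name": "modeling_expert", "role": "Expert", "domains": ["Traffic noise modeling"]},
    {"name": "resident", "role": "Stakeholder", "domains": []},
]

conveyed = [r["name"] for r in collected
            if resource_passes(r, protocol_1["resource"])]
admitted = [p["name"] for p in people
            if participant_passes(p, protocol_1["participant"])]
```

In this sketch, only the OSM road data and the GeoJSON barrier data would be conveyed to the data preprocessing activity; the GeoTIFF raster fails the format constraint and the model fails the type constraint.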

Implementation of collaborative traffic noise assessment
To complete the experiment, participants used resources and tools to implement the collaborative traffic noise assessment following the customized process (part of the process of this case study can be retrieved from https://www.youtube.com/watch?v=-Dvji2ZQf1c):
(1) In the problem identification activity, participants used the communication tool to discuss the purpose of this geographic analysis. For example, participants established the geographic analysis target for assessing the traffic noise status of three different regions (Figure 12a). Region A was a community in the Hexi CBD of Nanjing, Region B was an area next to a highway, and Region C was a territory located in the old town.
(2) In the traffic noise model selection activity, participants utilized the mind map tool to introduce and evaluate several traffic noise models in the two child activities. The traffic noise modeling expert introduced the RLS-90 model, the Federal Highway Administration (FHWA) model, and the Calculation of Road Traffic Noise (CoRTN) model (Rajakumara and Gowda 2008). Since the RLS-90 model features a higher degree of calculation accuracy and can support better traffic noise simulation in the steep slope area of roads compared with other models, it was selected as the simulation tool. In addition, participants learned the data requirements of the RLS-90 model with the help of the modeling expert, and they conducted the traffic simulation with the demo data. The simulation result is shown in Figure 12b.
Figure 10. Activities in the collaborative traffic noise assessment experiment.
Figure 11. Applications of the protocol-based activity linking method. In the collaborative traffic noise simulation activity of this experiment, the participants formulated four protocols to link the seven child activities.
(3) In the collaborative traffic noise simulation activity, participants collected various data in the data collection activity, including road data, building data, and sound barrier data. These data were simplified and separated into three regions with the help of the map editing tool and the data conversion tool (Figure 12c). Participants were also divided into three groups to simulate the traffic noise in the three regions separately. After the simulation, the results were converted to GeoTIFF data in the data postprocessing activity. The participants conducted visualization and further analysis in the impact analysis of traffic noise activity, and the simulation results are shown in Figure S6. According to the results, Region A is in a noisier environment, which might be caused by the higher traffic flow in the CBD. In Region B, the highway is the main factor impacting the noise environment in the region. Due to the narrower road and denser residential area in the old town, Region C has a better noise environment.
(4) In the urban planning activity, participants attempted to discover whether changes in urban planning could improve the sound environments. Therefore, they iteratively edited the road and barrier data to simulate traffic noise environments. By comparing the different results, participants analyzed the influences of urban planning changes (Figure 12d).

Discussion of the experiment
During this experiment, participants first worked together to conduct a traffic noise assessment on the prototype system. After engaging in discussions concerning their overall approach, the participants created four activities and their child activities, and they customized the entire traffic noise assessment process based on the process expression model and the activity linking protocol. The process can guide participants to perform specific interactions (e.g. data processing, traffic noise simulation, and result visualization) according to their experience and expertise. These interactions during the process can also be recorded to improve transparency and reliability, as shown in Figure S7.
The implementation of this case study demonstrates a guided and transparent process of collaborative traffic noise assessment. In particular, the customized process can instruct participants with different backgrounds to work together based on a well-defined process structure and information flow; meanwhile, the interactions for traffic noise assessment implementation are recorded to help participants better understand the process, which can support further process tracing and solution optimization.
As a consequence, in comparison to noncollaborative geographic analysis, this strategy can more successfully enable users to share knowledge and resources (e.g. knowledge of traffic noise modeling, as well as the required road data and building data) and establish agreement on the traffic noise assessment process. Moreover, existing collaborative geographic analysis methods, such as CVGE (Chen et al. 2012; Lin et al. 2013; Zhu et al. 2016) and collaborative GIS (Sun and Li 2016; Jelokhani-Niaraki 2019), usually focus on support environments and implementation methods to help participants work together. In comparison, this study focuses on the process of collaborative geographic analysis, and it provides an explicit and transparent traffic noise assessment process to support the guiding of participants, the controlling of information flows, and the recording of interactions. For example, the clear process structure led the traffic noise modeling experts to appropriate activities (e.g. the model determination activity and traffic noise simulation activity) and coordinated the relevant resources; the interaction records showed the implementation details of the traffic noise simulation and helped participants understand the context.

Conclusions and future work
Collaborative geographic analysis can help address geographic problems, but an effective mechanism is still needed to support collaboration throughout the entire geographic analysis process. For example, multiuser-engaged and interaction-intensive collaboration requires clear process expression, control, and recording, and process implementation requires the support of a resource- and tool-coordinated collaborative environment. Therefore, this study focuses on process customization and process-based geographic analysis implementation. The study proposes a novel customizable process design strategy for collaborative geographic analysis. By means of this strategy, explicit processes for collaborative geographic analysis can be customized for different geographic problems, and participants can engage in appropriate activities and collaboratively perform interactions. Moreover, this strategy allows the entire process to be recorded, including the pathway and detailed interactions, thereby making the geographic analysis process more transparent and reliable.
Although these efforts can improve geographic analysis, some limitations have not been overcome. For example, the use of activity documents and the support of complicated interactions require further improvement.
(1) To prepare an explicit and transparent process for collaborative geographic analysis, participants require a certain amount of time. Therefore, the question of how to reuse the customized process must still be addressed. In this study, the process expression model can record considerable information concerning the practice of collaborative geographic analysis. At present, this information is retained only as an archive for understanding the process and is not fully utilized, although it can provide valuable experience with respect to solving similar geographic problems. To help participants customize the process more easily, a reuse method that can allow users to extract valuable information (e.g. the use of geographic analysis models and the processing of model data) and develop a problem-solving template must be researched. Based on such a template, users could easily acquire a well-expressed process to solve geographic problems.
(2) Without the support of a capable collaborative engine, the prototype system cannot be used in larger-scale collaboration, so the approach cannot yet be widely adopted. Throughout the entire geographic analysis process, there are diverse forms of interactions, such as activity linking, model parameter setting, data editing, and simulation execution. To support collaboration among geographically distributed participants, these interactions must be synchronized. Therefore, a robust and efficient collaboration support engine is needed. Such an engine should have the capabilities of supporting real-time communication, addressing operational conflicts and monitoring abnormal operations.