Using modular 3D digital earth applications based on point clouds for the study of complex sites

This article discusses the use of 3D technologies in digital earth applications (DEAs) to study complex sites. These are large areas containing objects with heterogeneous shapes and semantic information. The study proposes that DEAs should be modular, have multi-tier architectures


Introduction
Over the last decade, the use of 3D technologies in digital earth applications (DEAs) has grown considerably. The need to study complex areas in virtual 3D environments has been identified by many, for example, Basanow et al. (2008), Krüger and Meinel (2008), Stoter et al. (2011), von Schwerin et al. (2013), Zhang et al. (2014), Hu et al. (2015) and Hunter et al. (2015). These works propose solutions that place objects in large 3D spatial contexts, such as buildings in cities or structures in archaeological sites. However, these solutions are usually stand-alone applications with limited sharing capabilities or are relatively difficult to reuse since they are not developed as Free and Open Source Software (FOSS).
As indicated by Craglia et al. (2012), the future of DEAs in general moves away from a single system infrastructure towards multiple, highly connected infrastructures. These infrastructures should be openly accessible, serving multiple users with various backgrounds and skill levels. For these developments to thrive and to turn this vision into reality, both organizational and technological challenges need to be addressed (Craglia et al. 2012; Goodchild et al. 2012).
This article contributes to addressing these technological challenges, and aims to provide technological insights for the development of future DEAs. We believe that the future of DEAs is in modularity, i.e. in creating DEAs that are composed of multiple applications. These applications integrate state-of-the-art technologies, can interface with each other, and, if possible, are developed as FOSS. Moreover, following the recommendation by Craglia et al. (2012), we propose that such DEAs have multi-tier architectures to serve, through various interfaces, multiple users with different aims and technological skill levels. In multi-tier architectures, the functions for data management are physically separated from those for data analysis and data visualization.
We focus on the use of 3D technologies in DEAs for the study of relatively large and complex sites. These sites are areas containing objects that have heterogeneous shapes and that are enriched with semantic information.
We propose to use point clouds as the basis for the 3D representations of the digital earth. Point clouds are more reliable than other detailed 3D representations, such as polygonal models, which require interpolation or modeling methods for their generation. If any interpolation or modeling method is used in a 3D digital earth representation, users will be inclined to interpret the representation as fact, whereas reality might be different. The reliability of the 3D digital earth representation is crucial in 3D DEAs that require the highest accuracy in measurements and analysis.
As a case study, this article uses the archaeological project Mapping the Via Appia (www.mappingtheviaappia.nl; Mols, Moormann, and Pelgrom 2013; de Kleijn et al. 2015; de Kleijn, de Hond, and Martinez-Rubi 2016). This project studies two miles of the Via Appia Antica containing over 2000 archaeological objects and structures of interest. For this project, we have developed a modular 3D DEA. It is mostly based on FOSS components, so our efforts can be replicated and improved upon by others. The technological characteristics of the solution are in line with a Spatial Data Infrastructure (SDI) approach. In fact, we have used this term in an article published in the journal Digital Applications in Archaeology and Cultural Heritage (de Kleijn, de Hond, and Martinez-Rubi 2016). In that article, we discussed the implications of the 3D DEA specifically for the fields of archaeology and architectural history. From a more technical perspective, the term SDI does cover the approach presented; however, the consensus on what defines an SDI also entails organizational aspects (Hendriks, Dessers, and van Hootegem 2012; de Kleijn et al. 2014). Since this study does not address the organizational aspects, we have decided to use the term DEA instead.
For the development of any DEA in general, and of the Via Appia 3D DEA in particular, we propose to follow a technological workflow with four components: (i) data acquisition and processing, (ii) data management, (iii) data analysis, and (iv) data visualization. This workflow must be seen as an iterative process, which is depicted in Figure 1. The major direction of development flows clockwise, building the next component based on the outcome of the previous one. However, in reality challenges are encountered that require direct feedback between the various components. For the components of the workflow, we identify sets of technological challenges that arise when using 3D technologies in the development of DEAs for the study of complex sites. These challenges are rather generic and shared across domains. Since in the context of this research the analysis and visualization components are, from a technological perspective, strongly related to each other, we decided to discuss them as one. Thus, the set of technological challenges becomes:
• Data acquisition and processing. In order to obtain a reliable reality-based 3D digital earth representation (point cloud), the study area has to be measured and the objects of interest have to be identified. The identification of an object consists of the assignment of a unique identifier and the registration of its 3D position within the study area. Once identified, specific data for the objects of interest need to be added and related, i.e. semantic information and additional data such as pictures, paintings and alternative 3D representations (e.g. polygonal meshes). Some of these additional 3D data may be manually generated instead of acquired by sensors. In our case study, for instance, archaeological reconstructions are generated with 3D modeling software. Additionally, in some cases the generation of a complete 3D digital earth representation is not possible or the data cannot be collected at once. In our case study, for example, the archaeological objects of interest need to be cleaned before detailed reality-based models can be obtained. The process of acquiring and processing data therefore needs to be dynamic, allowing the 3D digital earth representation to be updated at different moments in time using various methods.
• Data management. A system is required to deal with 3D digital earth representations which may be incomplete or consist of multiple parts. The system has to deal with different types of data, their object-oriented nature or division and their dynamic acquisition process. Moreover, data can also be updated. To allow multiple users to access the data at the same time, the data must be stored in an online infrastructure. Additionally, in order to perform queries based on the spatial location and the structured semantic information, the data must be clearly structured so that they can be handled by database and spatial querying tools.
• Data analysis and data visualization. In order to make 3D DEAs useful for users, tools for knowledge extraction are essential. Users need to be able to visualize and analyze the study area and the objects of interest while making use of the semantic and spatial query functionality provided by the data management system. This enables users to identify and explore in the virtual 3D environment the objects that fulfill their selection criteria.
The remainder of the article has the following structure. In Section 2 a short description of the Via Appia case study is provided. Sections 3-5 extend and take up the challenges in the components of the technological workflow and present the implementation details of the developed 3D DEA for the case study. Section 6 lists the conclusions of this article and discusses future work. Finally, the appendix contains additional implementation details of the developed 3D DEA.

Case study: Mapping the Via Appia
The Via Appia Antica, which runs from Rome south to Brindisi (Italy), is known as Europe's first paved long-distance road. The road originates from Roman times, was constructed from the late 4th century BC onwards and had important commercial, military, religious and funerary functions. After the Roman period, the road fell into decay until it was reshaped as an archaeological park at the end of the 19th century. This setting of more than 2300 years of history encapsulates a complex multi-layered landscape containing remains of funerary monuments, villas, farmsteads, small industries and sanctuaries from antiquity, but also structures built on top of the monuments in the medieval period and reconstructions from the 19th century, which are nowadays considered to be of doubtful archaeological quality.
Within the Mapping the Via Appia project the area between the 5th and the 6th mile of the Via Appia Antica is studied. In this area there are over 2000 archaeological objects of interest, ranging from fragments and blocks ex situ to large in situ ancient ruins. Note that within the project we also refer to sites of interest, which are small areas that contain one or more objects of interest. The main aim of the project is to reconstruct how the structures alongside the road have changed over time. One of the biggest challenges is to analyze, in the large 3D spatial context, the more than 2000 objects scattered around the road.
Systems for the 3D study of archaeological areas have already been proposed in Forte et al. (2012), Dell'Unto et al. (2013) and von Schwerin et al. (2013); however, they present one or more of the following limitations: they are limited to small areas, they cannot handle semantic information or they can only use oversimplified 3D models.

Data acquisition and processing
In the data acquisition and processing component, we have identified three technological challenges. The first challenge regards the methods to obtain reality-based 3D digital earth representations. The second concerns the identification of the objects of interest and the attachment of their semantic information and other additional data. Finally, the third challenge deals with the level of automation that is required in order to foster efficient data acquisition and processing. Sections 3.1-3.3 elaborate on the identified challenges and Section 3.4 presents our solutions for the case study.

Reality-based 3D digital earth representation
When obtaining a reality-based 3D digital earth representation, one must choose which type of 3D representation is going to be used, and which 3D scanning technologies and acquisition methods are going to be employed.

Types of 3D representation
All methods to capture reality-based 3D representations are based on point clouds. There are two options to represent reality in 3D: to use derived models or to use point clouds directly.
• Derived models, such as polygonal meshes, usually present closed virtual surfaces (without holes). However, interpolation or modeling methods are required to create these surfaces (e.g. with the Poisson surface reconstruction as described in Kazhdan, Bolitho, and Hoppe 2006). By using interpolation or modeling methods, we are prone to model the surfaces unrealistically. This can be problematic and lead to significant errors when performing measurements in the 3D virtual scene.
• The direct use of point clouds provides the most reliable representation of reality. The representation is what was sensed, without interpolation or modeling errors. Measurements in point clouds are more reliable than those in derived models. However, there can be interpretation issues caused by the discrete nature of the data (holes, point density variation, etc.).
Both options have advantages and disadvantages, but in this study we focus on 3D DEAs in which the reliability of the measurements in the 3D models is crucial. Therefore, we propose to use point clouds directly rather than derived models.

3D scanning technologies and acquisition methods
There are several 3D scanning technologies; the most popular ones are Lidar techniques and photogrammetric methods. The former are based on measuring distances using lasers, and the latter are based on acquiring images from different viewing angles which are combined and processed into point clouds. Photogrammetric methods are sometimes referred to as Structure from Motion (SfM) techniques. For a more detailed comparison of Lidar and photogrammetric techniques, we refer the reader to Baltsavias (1999) and Gil et al. (2013). In addition to the choice of technology, there is also the choice of acquisition method, for example, aerial, car-mounted and terrestrial.
Since the different 3D scanning technologies and the various acquisition methods have their own limitations, the combination of multiple scanning technologies and acquisition methods is often required in order to get a proper 3D digital earth representation of a relatively large and complex area. The most obvious limitation is that an optical sensor can only scan surfaces that are visible from its location. For example, sensors mounted on cars will not be able to scan roofs or surfaces of buildings and structures not visible from the road. Furthermore, when scanning distant surfaces the areas which are parallel to the line of sight of the sensors cannot be scanned. For the scan of a city, aerial sensors will not be able to scan most of the facades of buildings. Scanning exclusively at short range when covering large areas is usually not feasible due to the amount of time it would take. The solution that we propose is to use acquisition methods that allow the whole area to be covered from a relatively large distance, and that are thus limited to what is seen from the sensor. Then, these scans are combined with short-range scans of the missing parts that are relevant for the study. The various scans can be done with the same or with different 3D scanning technologies. However, when integrating 3D data from different sensors one should be aware of issues with differently calibrated scales and the use of different spatial reference systems.

Object identification and attachment of semantic and additional data
There are multiple methods for automatic or semi-automatic object detection or extraction, such as the ones presented in Bae, Belton, and Lichti (2007), Richter, Behrens, and Döllner (2013), Oude Elberink and Kemboi (2014), Yang, Xu, and Yao (2014) and Xiao et al. (2015). Although these studies have made significant contributions targeting specific types of objects (e.g. trees, cars, buildings, etc.), the current solutions and algorithms are poorly suited to automatically detecting objects with heterogeneous shapes in large and complex areas. In some cases, automatic approaches can assist the users by highlighting or suggesting certain shapes. However, this becomes difficult in areas which also contain a high number of uninteresting objects that also have heterogeneous shapes. The most feasible strategy is therefore to use manual identification either in the real world or in the 3D digital earth representations.
After identifying the objects of interest, assigning unique identifiers and registering their 3D positions, the data describing the characteristics of the objects of interest need to be attached. Regarding this semantic information (or attributive data), we propose to use a structured data format. In this manner, the data can be imported into database systems whose rich query functionality can be exploited for advanced semantic analysis.
In addition, for each object of interest other types of data may need to be integrated. This includes the scans required to complement the 3D digital earth representation of the area as described in Section 3.1, and possibly their repetition at different moments in time. Other types of data that may need to be integrated are pictures, paintings, 3D models (polygonal meshes) of reconstructions of the objects, etc.

Automation
This challenge concerns efficiency issues that may occur when manual intervention is required during the acquisition and processing of data. For example, Section 3.1 pointed out that the combination of multiple datasets from various scanning technologies and methods is often required to obtain a proper 3D digital earth representation. For the combination, these datasets need to be aligned and resized into a common spatial reference system. If the process for the alignment and the resizing of each dataset requires user interaction, this will significantly affect the efficiency and required time of the whole process. This can be critical when the number of datasets to combine becomes high. Therefore, we propose to use methods that minimize the manual user intervention as much as possible.

Via Appia implementation
This subsection explains how the described challenges have been addressed in the case study. To obtain a georeferenced reality-based 3D point cloud of the study area, Fugro's DRIVE-MAP system (http://www.drive-map.eu/) is used. This technology uses a car with a Lidar scanner combined with a differential GPS and a photo camera. Although this point cloud contains a scan of the entire study area, it is limited to the surfaces visible from the road. To identify the different sites and objects of interest, their 2D georeferenced footprints are collected in the field using a differential global positioning system (DGPS) manufactured by Topcon (https://www.topconpositioning.com/). In order to complement the DRIVE-MAP dataset and to obtain a complete 3D digital earth representation of the study area, we use SfM. This method requires multiple pictures of each site of interest from different viewing angles and, based on patterns in the pictures, it generates a point cloud for each site. Additionally, a polygonal mesh can be generated from each point cloud. Since the number of sites of interest is relatively high, we have developed an automatic image-based modeling tool chain (Drost, Spaaks, and Maassen 2016) by integrating FOSS tools developed by other researchers, concretely SIFT (Lowe 2004), Bundler (Snavely, Seitz, and Szeliski 2006) and CMVS/PMVS2 (Furukawa et al. 2010; Furukawa and Ponce 2010). However, the photogrammetric methods used produce point clouds that have only a relative scale and are not referenced to the earth's surface.
In order to align the point clouds of the sites generated using SfM with the georeferenced point cloud acquired by DRIVE-MAP (referred to as background point cloud), an automatic registration (alignment) tool has been developed (Attema et al. 2016). The tool uses the footprints, the DRIVE-MAP point cloud and the SfM point clouds as input. For each SfM point cloud of a site, the registration is performed in the following way: first, the registration tool applies the DBSCAN algorithm (Ester et al. 1996) to filter out noise and select the densest parts of the point cloud. In most cases, this corresponds to the object of interest since it is at the focal point of all the images; second, the SfM point cloud is roughly situated in the DRIVE-MAP reference system using the footprint; third, the scale is estimated by using known image features in the SfM point cloud (range rods whose length is known); finally, the generalized iterative closest point algorithm (Segal, Haehnel, and Thrun 2009) is used to refine the alignment. Even though the photogrammetric and registration methods usually require a high degree of manual intervention, our developed tools have managed to automate the processing in approximately 50% of the cases.
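The rough placement and scale estimation steps described above can be illustrated with a minimal sketch. It is not the registration tool of Attema et al. (2016): the function names, the use of the mean of the footprint vertices as target position and the `target_z` parameter are assumptions made for illustration, and the DBSCAN filtering and generalized ICP refinement are omitted.

```python
import math


def centroid(points):
    """Mean position of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def estimate_scale(rod_end_a, rod_end_b, known_length):
    """Scale factor from a range rod: real length / length measured in the SfM cloud."""
    measured = math.dist(rod_end_a, rod_end_b)
    if measured == 0:
        raise ValueError("rod end points coincide")
    return known_length / measured


def rough_register(points, scale, footprint_xy, target_z):
    """Scale the cloud about its centroid and move it onto the footprint.

    Assumption: the target position is the mean of the footprint vertices
    (not the area centroid) at a caller-supplied height target_z.
    """
    cx, cy, cz = centroid(points)
    fx = sum(x for x, _ in footprint_xy) / len(footprint_xy)
    fy = sum(y for _, y in footprint_xy) / len(footprint_xy)
    return [((x - cx) * scale + fx, (y - cy) * scale + fy, (z - cz) * scale + target_z)
            for x, y, z in points]


# Hypothetical numbers: a 2 m rod measures 0.5 units in the SfM point cloud.
scale = estimate_scale((0.0, 0.0, 0.0), (0.0, 0.0, 0.5), 2.0)
footprint = [(9.0, 19.0), (11.0, 19.0), (11.0, 21.0), (9.0, 21.0)]
placed = rough_register([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], scale, footprint, 5.0)
```

In this sketch the result is only a starting position; a fine registration step would still have to resolve the rotation and the remaining offset.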
Semantic information (or attributive data) is gathered by field observations for the various sites and objects. This information consists of archaeologically and architecturally relevant information such as type of material, decoration type, dating, condition, etc. 3D archaeological reconstructions (polygonal meshes) are created with well-known 3D modeling software tools such as 3ds Max, SketchUp, or Blender. These reconstructions show the archaeological interpretation of how the different sites might have looked in antiquity and how they possibly transformed in later periods. Additionally, contemporary and historical pictures, drawings and paintings derived from archival research are collected.
An overview of the data acquisition and processing for the Via Appia 3D DEA is depicted in Figure 2. Initially, the default ('raw') formats of the different types of data as obtained from the acquisition methods and registration tools are used. For the point cloud data this is the LAS format, OBJ/PLY for the polygonal meshes and PNG/JPEG for the pictures. The attributive data of sites and objects are collected by the archaeologists in the field using Microsoft Access and the MDB file format, while the 2D footprints are provided as an ESRI Shapefile.
In order to visualize and analyze the data with the developed client applications, which will be elaborated in Section 5.3, the point clouds, polygonal meshes and pictures are converted to specific formats used by those tools. This is done in the last phase of the data processing stage.

Data management
In the data management component of the workflow, we have identified four technological challenges. The first challenge concerns how to manage a 3D digital earth representation that is the combination of multiple point clouds that, additionally, can be updated. The second challenge regards the integration of the multiple point clouds with other types of data. The third challenge deals with finding the most suitable architecture to provide multi-user support. Finally, the fourth challenge is to provide analytic and semantic functionality. Sections 4.1-4.4 elaborate on the identified challenges and Section 4.5 shows the approach we took to address the challenges in the Via Appia case study.

Multiple point clouds
In the discussion of data acquisition and processing, we stated that multiple reality-based datasets (point clouds) need to be integrated in order to generate a complete and reliable 3D digital earth representation of the study area. There are studies (van Oosterom et al. 2015a; Martinez-Rubi et al. 2015a) that deal with how to manage single point clouds with many points. However, the difference is that the current research deals with multiple relatively small point clouds. We propose to keep the various point clouds separated instead of merging them for two reasons: (i) it facilitates manual registration (the tools to re-align and re-scale the 3D models sometimes fail) and (ii) it allows new scans of the objects to be added easily.

Data integration
In order to enhance data exploration and be able to access the available data per object, the multiple point clouds that form the 3D digital earth representation have to be related to other types of data in a single infrastructure. Examples of other types of data are metadata, pictures or models derived from the point clouds such as polygonal meshes. For the integration of the data, we propose to use a relational database management system (RDBMS). The relational aspect of an RDBMS is crucial to establish the relationship between the multiple point clouds, other 3D models and other types of data. However, current database solutions do not provide efficient support to visualize 3D data such as point clouds directly from the database, nor are the available visualization tools compatible with database systems as data back end. As identified in van Oosterom et al. (2015b), databases with point cloud functionality lack efficient level-of-detail (LoD) support, which is required for visualization. Therefore, we propose to use a hybrid data management system (HDMS) that keeps the data in the most suitable format for the tools that have to use them while still using a database to handle the data relations. In the database only the references (i.e. file locations) of the data are stored.
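The idea of storing only references in the database can be sketched as follows. This is an illustrative assumption, not the schema of the case study: the table and column names are hypothetical, and the standard-library `sqlite3` module stands in for a server-side RDBMS so that the example is self-contained.

```python
import sqlite3

# Hypothetical schema: the database holds relations and file locations,
# while the point clouds, meshes and pictures themselves stay on disk.
SCHEMA = """
CREATE TABLE site (site_id INTEGER PRIMARY KEY);
CREATE TABLE data_item (
    item_id   INTEGER PRIMARY KEY,
    site_id   INTEGER NOT NULL REFERENCES site(site_id),
    kind      TEXT NOT NULL CHECK (kind IN ('pointcloud', 'mesh', 'picture')),
    file_path TEXT NOT NULL UNIQUE
);
"""


def open_catalog():
    """Create an in-memory catalog with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def register_item(conn, site_id, kind, file_path):
    """Record the location of one file and relate it to a site."""
    conn.execute("INSERT OR IGNORE INTO site (site_id) VALUES (?)", (site_id,))
    conn.execute(
        "INSERT INTO data_item (site_id, kind, file_path) VALUES (?, ?, ?)",
        (site_id, kind, file_path),
    )


def items_for_site(conn, site_id, kind=None):
    """Return the file references of a site, optionally filtered by data type."""
    sql = "SELECT file_path FROM data_item WHERE site_id = ?"
    params = [site_id]
    if kind is not None:
        sql += " AND kind = ?"
        params.append(kind)
    rows = conn.execute(sql + " ORDER BY file_path", params)
    return [row[0] for row in rows]


catalog = open_catalog()
register_item(catalog, 162, "pointcloud", "site_162/scan_a.las")
register_item(catalog, 162, "pointcloud", "site_162/scan_b.las")
register_item(catalog, 162, "mesh", "site_162/reconstruction.obj")
clouds = items_for_site(catalog, 162, kind="pointcloud")
```

Because each scan is a separate row, a new scan of the same site is added with one more `register_item` call, without touching the existing files.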

Multi-user architecture
In order to allow users to simultaneously access and change the data from different locations, the interaction must be synchronized. We therefore propose a two-tier architecture with the data centralized in a cloud-based server. The concept of storing and disseminating large point cloud datasets making use of cloud computing resources was already suggested by Kodde (2010). The client layer of the proposed architecture contains the front-end applications where the users can interact with the system (see details in Section 5.3). For the analysis and visualization, the client applications download copies of the data from the server. Ideally this is done on demand, so that not all data are downloaded but only those required by the user. Any change to the data made by the users, like adding new data, must be consistently synchronized from the server to all clients.
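One possible way to decide what to download on demand is to compare a listing of the server files with the local copy. The sketch below assumes that both sides can provide a mapping from file path to checksum; the manifest structure and the function name are hypothetical and do not describe the synchronization used in the case study.

```python
def files_to_download(server_manifest, local_manifest, wanted_prefixes):
    """Return the server files that the client needs and does not yet have.

    server_manifest, local_manifest: dict mapping file path -> checksum.
    wanted_prefixes: path prefixes of the data the user asked for.
    """
    needed = []
    for path, checksum in sorted(server_manifest.items()):
        if not any(path.startswith(prefix) for prefix in wanted_prefixes):
            continue  # not requested by the user, so it is not downloaded
        if local_manifest.get(path) != checksum:
            needed.append(path)  # missing locally or changed on the server
    return needed


server = {
    "site_1/scan.las": "a1",
    "site_1/photo.png": "b2",
    "site_2/scan.las": "c3",
}
local = {"site_1/scan.las": "a1", "site_1/photo.png": "old"}
pending = files_to_download(server, local, ["site_1/"])
```

Here only the changed picture of the requested site is fetched; the unchanged scan and the data of the other site are left alone.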

Analytic and semantic functionality
Beyond the interfaces to visualize the 3D data, including identified objects with attributes as well as other additional data, it is necessary to include analytic functionality to query the dataset based on the geospatial locations. However, and most importantly, it is also required to enable semantic queries, i.e. to retrieve only certain objects that match a user-defined criterion based on the semantic attributes of the objects. We therefore propose to import the semantic information into the RDBMS in order to exploit its SQL query functionality.
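A query that combines a semantic criterion with a spatial one might look like the following sketch. The table, its columns and the sample rows are invented for illustration, and `sqlite3` with a plain bounding-box comparison stands in for a spatially enabled database, which would normally offer dedicated spatial predicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical object table: one semantic attribute and a 2D position.
conn.execute(
    "CREATE TABLE object ("
    "object_id INTEGER PRIMARY KEY, site_id INTEGER, material TEXT, x REAL, y REAL)"
)
conn.executemany(
    "INSERT INTO object VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, "marble", 5.0, 5.0),
        (2, 10, "tuff", 6.0, 4.0),
        (3, 11, "marble", 50.0, 60.0),
    ],
)


def select_objects(conn, material, bbox):
    """Objects of a given material inside bbox = (min_x, min_y, max_x, max_y)."""
    min_x, min_y, max_x, max_y = bbox
    rows = conn.execute(
        "SELECT object_id FROM object "
        "WHERE material = ? AND x BETWEEN ? AND ? AND y BETWEEN ? AND ? "
        "ORDER BY object_id",
        (material, min_x, max_x, min_y, max_y),
    )
    return [row[0] for row in rows]


marble_nearby = select_objects(conn, "marble", (0.0, 0.0, 10.0, 10.0))
```

The returned identifiers can then be used by a client to decide which objects to load or highlight in the 3D scene.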

Via Appia implementation
This subsection explains how the challenges related to data management have been addressed for the case study. The developed 3D DEA for Mapping the Via Appia has a two-tier architecture where the (Linux) server uses an HDMS to manage the data (Figure 3). The multiple point clouds and other related data for the various sites of interest are kept separated in their raw formats but also in the formats required by the front-end applications that have to use them (see Section 5.3). They are stored in directory structures and their references (file locations) are imported into the RDBMS component of the HDMS, which is implemented in PostgreSQL-PostGIS and is called viaappiadb. In the database we also import the semantic information (site/object attributes) and the 2D footprints. Appendix 1 contains more technical details of the viaappiadb, concretely of its entity-relationship diagram.
All manipulations of the data in the directory structures as well as in the database are performed using a set of Python scripts (Martinez-Rubi et al. 2016) and must be done while connected to the server. This ensures data consistency within the described system. The scripts allow adding and deleting data, exchanging the data in the various formats, updating the database with changes in the directory structures and updating the database with new semantic information and 2D footprints.
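Updating the database with changes in the directory structures amounts to comparing the files on disk with the references already stored. The following sketch shows one way this comparison could be done; it is not taken from the scripts of Martinez-Rubi et al. (2016), and the function names, the extension list and the directory layout are assumptions.

```python
import os
import tempfile

# Assumed set of data file extensions, based on the raw formats named in the text.
DATA_EXTENSIONS = (".las", ".obj", ".ply", ".png", ".jpg", ".jpeg")


def scan_data_files(root):
    """Collect the data files under root as paths relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(DATA_EXTENSIONS):
                full = os.path.join(dirpath, name)
                found.add(os.path.relpath(full, root).replace(os.sep, "/"))
    return found


def reconcile(known_refs, root):
    """Return (references to add, references to remove) for the database."""
    on_disk = scan_data_files(root)
    known = set(known_refs)
    return sorted(on_disk - known), sorted(known - on_disk)


# Small demonstration with a temporary directory instead of real data.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "site_1"))
    for name in ("scan.las", "notes.txt"):
        open(os.path.join(root, "site_1", name), "w").close()
    to_add, to_remove = reconcile({"site_2/old.las"}, root)
```

A management script could then insert the references in `to_add` and delete those in `to_remove` in a single transaction, so that the database never points at files that no longer exist.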
Figure 4 illustrates the two-tier client/server architecture of the 3D DEA. As previously mentioned, the server contains the HDMS with the master copy of the data and the PostgreSQL-PostGIS database. The client layer of the architecture contains the front-end applications that run on the computers of the users. These applications require the generation of configuration files, which is also done on the server. Depending on the application we use the Xenon library (Maassen et al. 2015) or an NGINX HTTP server to download copies of the data required for the visualization and the configuration files to the user's computer. The former is used to synchronize a copy of a data folder on the user's computer with the master copy on the server, while the latter is used to download copies of specific files from the server to the user's computer. Additionally, connections to the database can be made from the client applications. This allows exploiting the analytic and semantic functionality of the system. Section 5.3 provides more information on the developed client applications, including additional details regarding the data synchronization with the server.

Data analysis and data visualization
For the development of tools for data analysis and data visualization, we have identified two technological challenges. The first challenge encompasses the visualization of relatively large complex sites composed of multiple point clouds, polygonal meshes and pictures. The second challenge regards the provision of functionality to perform spatial and semantic analysis in a virtual 3D environment. Sections 5.1 and 5.2 elaborate on the identified challenges and Section 5.3 shows how we addressed the challenges for the case study.

3D visualization
In the past few years various desktop visualization applications for large point clouds have been developed, e.g. Wimmer and Scheiblauer (2006), de Haan (2010), Richter and Döllner (2010), Günther et al. (2013) and Richter, Discher, and Döllner (2015). However, these applications tend to be limited. One of the main issues is that these applications need the data, which can easily add up to hundreds of gigabytes, to be stored locally. This is problematic since users need to have sufficient disk space available on their local machines. Additionally, this may be aggravated when dealing with datasets that have multiple versions.
An alternative to desktop applications are web applications, which are gaining popularity thanks to the recent developments in web 3D visualization. Renderers using WebGL have become available for the visualization of point cloud data over the web, for example Plasio (http://plas.io/), Unity (https://unity3d.com/) and Potree (http://potree.org/), the latter based on the previous work by Wimmer and Scheiblauer (2006) and extended by Schütz and Wimmer (2015). These options are compatible with the possibility to host the data remotely. However, currently they tend to be experimental in nature, limiting their performance or requiring plug-ins to be installed. Work has been done on extending the capabilities to visualize massive point clouds over the web (Martinez-Rubi et al. 2015b). Nonetheless, the focus in these studies was on visualizing single (massive) point clouds, rather than multiple small ones.
An additional limitation that affects both desktop and web applications is that they usually target a specific type of data, for example only point cloud data, and they normally do not include functionality to combine these with other data types.

Analytic and semantic functionality
In addition to the visualization of multiple types of data, the client applications must also have analytic and semantic functionality. This includes measurement tools and a visual interface to the semantic information stored in the data management layer of the centralized cloud-based server as proposed in Section 4.4. The challenge here is to combine analytic and semantic functionality with the visualization. Thus, the client-side applications must have a connection with the cloud-based server and must allow the user to select which data are visualized, based on spatial and semantic criteria.
Since no solutions exist that simultaneously solve the challenges described in the previous and current subsections, we propose to use easy-to-extend solutions to which the missing functionality to address the challenges can be added.

Via Appia implementation
In this subsection we explain how the challenges related to data analysis and data visualization have been addressed for the case study. Within the project we explored two front-end applications: a desktop application and a web application. They are both based on existing solutions that have been extended to address the described challenges as much as possible. The next subsections provide more details about their features. How the tools developed relate to the user requirements has been discussed in de Kleijn, de Hond, and Martinez-Rubi (2016). Table 1 offers a summary.

Desktop Windows application
The developed desktop Windows application is based on OpenSceneGraph (http://www.openscenegraph.org/) and requires the point clouds, polygonal meshes and pictures to be converted to the OpenSceneGraph binary format. This conversion has already been done as part of the data acquisition and processing workflow. A snapshot of the application is depicted in Figure 5.
This application requires a copy of the converted data to be downloaded from the server to the user's computer. The same application performs the download, using the Xenon library, before launching the 3D viewer. The application requires a configuration file, which is generated from the database and also downloaded by Xenon.
As mentioned in Section 4.5, any modification to the data is made at the server using the developed Python management scripts. The only exception is that users can change the position and the scale of the visualized objects in this application. In this case the data in the database at the server are modified by remotely executing one of the management scripts, which receives the edited location. The desktop application does this automatically, using the Xenon library, when the user quits the tool.
This tool is targeted at advanced users, is fully featured and very versatile. However, it is the only component of the Via Appia 3D DEA which is not FOSS. The reason is that it is a modified version of an existing commercial tool. In addition to the functionality already described, the application has the following features: (i) it shows the different types of data (i.e. point clouds, polygonal meshes and pictures) in the same 3D scene; (ii) it contains measurement tools to measure distances and volumes as well as labeling tools to add visual labels to the 3D scene; and (iii) it provides a query constructor tool, which allows users to execute custom queries directly in the database to select sites and objects based on attributive data and to visualize the results in the 3D scene.
Limitations of the desktop application are that the data need to be synchronized with the server, which means that significant disk space must be available on the user's computer. Furthermore, since users found the installation of the desktop application, its interface and its functionality complicated, the desktop application is mostly used by IT-experienced users and is not usable by the entire targeted user group of the Via Appia 3D DEA.

Web viewer
The web application (van Meersbergen et al. 2016) is based on Potree (http://potree.org/) to display the multiple point clouds. The Potree basic renderer has been adapted to be able to display multiple point clouds in the same 3D scene with the option to select which ones are visible. The web application also uses 3DHOP (http://vcg.isti.cnr.it/3dhop/), developed by Potenziani et al. (2015), to display the polygonal meshes. This requires the point clouds to be converted to the Potree format and the polygonal meshes to be converted to the Nexus format used by 3DHOP. This conversion has already been done as part of the data acquisition and processing workflow.
Figure 6 depicts a snapshot of the web viewer with the point clouds being displayed using Potree in the main scene and the polygonal meshes being displayed using 3DHOP in a different window.
The web viewer does not require a copy of any directory structure before its execution. Like the desktop viewer, it requires a configuration file generated from the database. The configuration file and the Potree and Nexus converted data required for the visualization are provided to the web viewer through the NGINX HTTP server.
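To make the idea of a configuration file generated from the database more concrete, the following Python sketch groups file references by site and writes them out as JSON. It is only a sketch under stated assumptions: the row layout, the JSON structure, the file names, the build_viewer_config function and the example.org URL are all hypothetical, and the actual configuration format of the Via Appia viewers is not described here.

```python
import json


def build_viewer_config(rows, base_url):
    """Group (site_id, data_type, relative_path) rows into a per-site config.

    data_type must be "potree" (point clouds) or "nexus" (meshes); both the
    row layout and the resulting structure are assumptions for illustration.
    """
    base = base_url.rstrip("/")
    config = {"sites": {}}
    for site_id, data_type, rel_path in rows:
        if data_type not in ("potree", "nexus"):
            raise ValueError(f"unsupported data type: {data_type}")
        site = config["sites"].setdefault(str(site_id), {"potree": [], "nexus": []})
        # Store a URL that the HTTP server is assumed to expose for this file.
        site[data_type].append(f"{base}/{rel_path}")
    return config


# Hypothetical rows, as they might be returned by a database query.
rows = [
    (1, "potree", "potree/site_1/cloud.js"),
    (1, "nexus", "nexus/site_1/mesh.nxs"),
    (2, "potree", "potree/site_2/cloud.js"),
]

config = build_viewer_config(rows, "https://example.org/data/")
print(json.dumps(config, indent=2, sort_keys=True))
```

Because the viewer only reads URLs from such a file, the directory structures can be reorganized on the server without changing the client, as long as the configuration is regenerated.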
The web viewer is suited for all types of users. The interface is experienced as easy to use, and since it does not require additional installations it is easy to distribute. In addition to displaying the point clouds and polygonal meshes, the web viewer provides a search bar and a site information box.
Limitations of the web viewer are that the different types of data cannot be visualized in the same 3D scene and pictures cannot be visualized at all; only thumbnails are used for the sites in the site information box. The web application does not allow moving and scaling the visualized data and it does not contain a direct database connection for complex custom site and object queries. However, this is not considered to be an issue since the desktop application provides these features. Note that these limitations are due to lack of time for implementation, not to any technical limitation.
In addition to Potree and 3DHOP, the viewer also uses OpenLayers for the 2D orientation minimap and AngularJS as its JavaScript framework. Therefore, all the components of this tool, including the web viewer itself, are FOSS.

Conclusions
This article contributes to the concept of 3D digital earth by identifying, describing and addressing the challenges that arise when using 3D technologies in DEAs to study complex sites. These contain objects that have heterogeneous shapes and that are enriched with structured semantic information.
The article aims to provide technological insights for the development of future 3D DEAs. It states that the future of DEAs is in modularity, i.e. in creating DEAs that combine multiple applications. These applications integrate state-of-the-art technologies, interface between each other, and are, ideally, FOSS. It also suggests that the DEAs have multi-tier architectures to simultaneously serve multiple users with different skills and goals. For DEAs requiring high reliability in the 3D measurements, it proposes to use point clouds directly as 3D digital earth representations instead of derived models such as polygonal meshes. For the development of DEAs, it proposes to follow a workflow that consists of (i) data acquisition and processing, (ii) data management, (iii) data analysis and (iv) data visualization.
As a case study, this article discusses the Mapping the Via Appia project, for which we have developed a specific modular 3D DEA. The 3D DEA has a multi-tier architecture and combines a set of components that can be re-used individually or as a whole for other projects, as almost all the developed tools are FOSS. This targets not only other archaeology projects but also other domains where the digital earth is studied in 3D.
The article presents methods to create a data acquisition and processing workflow to obtain and process different types of data. It suggests combining long-range and short-range capturing methods to obtain more complete 3D digital earth representations. Additionally, the article identifies the need to automate the processes as much as possible.
We propose to use an HDMS, which combines a database and directory structures, to handle the data. The point clouds and the other data are kept in the directory structures in the various formats used by the client applications. Their references (file locations) are imported into the database. This is done to provide the most efficient access to the client applications. We suggest keeping the multiple point clouds obtained from the various acquisition methods separated, to ease the refinement of the registration of various point clouds and to ease the addition of new sites and objects. The database also contains the semantic information of the sites and objects. This approach gives the possibility to query data based on semantic and spatial selections. All the data are stored and managed in a cloud-based server which is part of the two-tier server/client architecture of the system. This allows multiple researchers to visualize and analyze the data simultaneously from different locations.
The client applications download copies of the data on demand. The article explores the current alternatives in visualization and analytic tools for client applications. Due to the lack of solutions with the required functionality, i.e. the visualization of all collected 3D data combined with a visual interface to the semantic and analytic functionality, it suggests choosing easy-to-extend options to which the missing functionality can be added. For the case study, two different front-end applications have been developed and presented. The first one exploits the mature state of desktop-based 3D applications to provide a tool which is rich in functionality. The second one benefits from the recent improvements in 3D web visualization technologies, such as Potree and 3DHOP, to provide an easy-to-use 3D web viewer. By having multiple clients for various purposes, the proposed system is able to target users with different levels of IT expertise.
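The way an HDMS database can link semantic information to file locations may be sketched as follows. This is a minimal illustration that uses an in-memory SQLite database in place of the PostgreSQL database of the case study; the table names loosely follow the ITEM and ITEM_ATTR tables and the POTREE category of the viaappiadb, whereas the column names, the paths and the helper functions are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the server database (an assumption made so
# that the sketch is self-contained; the case study uses PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (item_id INTEGER PRIMARY KEY, background INTEGER NOT NULL);
CREATE TABLE item_attr (
    item_id INTEGER PRIMARY KEY REFERENCES item(item_id),
    description TEXT
);
CREATE TABLE potree_file (
    item_id INTEGER REFERENCES item(item_id),
    abs_path TEXT
);
INSERT INTO item VALUES (1, 0), (2, 0), (3, 1);
INSERT INTO item_attr VALUES (1, 'tomb with basalt blocks');
INSERT INTO potree_file VALUES
    (1, '/data/potree/site_1'),
    (2, '/data/potree/site_2'),
    (3, '/data/potree/background');
""")


def potree_paths_matching(conn, word):
    """Return file locations of site point clouds whose description has word."""
    rows = conn.execute(
        """SELECT p.abs_path
           FROM item i
           JOIN item_attr a ON a.item_id = i.item_id
           JOIN potree_file p ON p.item_id = i.item_id
           WHERE i.background = 0 AND a.description LIKE ?
           ORDER BY p.abs_path""",
        (f"%{word}%",),
    ).fetchall()
    return [row[0] for row in rows]


def sites_without_attributes(conn):
    """Return ids of sites that have no attribute record yet."""
    rows = conn.execute(
        """SELECT i.item_id
           FROM item i
           LEFT JOIN item_attr a ON a.item_id = i.item_id
           WHERE i.background = 0 AND a.item_id IS NULL
           ORDER BY i.item_id"""
    ).fetchall()
    return [row[0] for row in rows]


print(potree_paths_matching(conn, "basalt"))
print(sites_without_attributes(conn))
```

The database only stores references, so the point clouds themselves stay in the directory structures in the formats that the client applications read directly.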
The proposed workflow for the development of DEAs is an iterative process where each component learns from every other. Each new iteration of the process brings new challenges that need addressing. From that perspective, this article must be seen as a first step. This paper has identified and addressed the challenges that arose in the first iterations of the process. Therefore, future steps will be to follow the described iterative process and, in parallel, to keep track of new software developments, with a focus on FOSS, in these rapidly changing technologies. The described methodologies are reusable in other applications and domains, where we expect that they will produce new generic challenges that need to be solved.

Figure 2. Data acquisition and processing flow chart with the different types of data (point clouds, polygonal meshes, pictures, 2D footprints and sites/objects attributes) and their various formats, i.e. the raw formats and the different formats required by the client applications. *For sites for which the automatic registration tool does not work, manual registration tools are implemented.

Figure 3. Schematic overview of the HDMS using directory structures (file-based data structures) to store the data in various formats and a database to store metadata of the directory structures (location of the files) as well as the 2D footprints and the semantic information (attributes of the sites and objects). The latter is imported into the database after a conversion from Microsoft Access to PostgreSQL using Bullzip Access to PostgreSQL (http://www.bullzip.com/products/a2p/info.php).

Figure 4. Two-tier client-server architecture of the Via Appia 3D DEA.

Figure 5. Snapshot of the desktop Windows application depicting the graphical user interface showing a reconstruction mesh (de Hond 2014) on top of the background point cloud. The user interface depicts the tab with the database query constructor tool.
The search bar allows the user to focus easily on a particular site. The site information box shows an overview of the attributive data of a site. The search bar can also be used to perform simple queries which are based only on a sample of the full attributive data, for example to search for the sites whose description contains the word 'basalt'.

Figure 6. Snapshot of the web viewer showing a point cloud of a selected site on top of the background point cloud, a mesh in a different 3D scene and the site information box with attributive data of the selected site.

Figure A1 depicts the entity-relationship diagram of the viaappiadb database implemented as part of the HDMS of the Via Appia 3D DEA (see Section 4.5).

Figure A1. Entity-relationship diagram of the viaappiadb, showing the most relevant tables and table columns as well as the direct connections between the tables from the different categories. The table ITEM contains the physical entities under study, and two types are distinguished. First, a site item (from now on site) is used for an archaeological site of interest and, second, a background item (from now on background) is used to define the whole study area as a single entity. An item can contain multiple parts or objects, which are stored in the table ITEM_OBJECT (note that each archaeological structure of the study area is considered an item itself and not an item_object of the background). The ATTR category contains the attributive data of the items and their item_objects. The RAW category contains the locations of the point clouds, polygonal meshes and pictures in their raw file formats, while the POTREE, NEXUS and OSG categories contain the file locations of the converted data. The POTREE category contains point cloud data and the NEXUS category contains mesh data, which are used by the web viewer application. On the other hand, the OSG category contains point clouds, polygonal meshes and pictures, which are used by the desktop Windows application. This application also allows updating the position and the scale of the objects in the 3D scene (which is not possible in the web viewer), thus a table to store the position of moved and scaled objects is required and has been included. Note that some of the relationships are '1 to 0..n' or '1 to 0..1' (with black points) instead of the usual '1 to 1..n'. This is to illustrate that some items and item_objects may have entries in some tables but not in others (e.g. it is possible to have a site in the ITEM table which has no entry in the attributes table (ITEM_ATTR) because attributive information of that site has not been collected yet).