Propagation of semantic information between orthophoto and 3D replica: a H-BIM system for the north transept of Pisa Cathedral

Abstract This contribution proposes a methodological approach for the transfer of annotations between orthophotos and 3D digital heritage models, relying on a mesh-based/point-based representation. The workflow exploits 2D/3D projective relations and involves the identification, propagation, modelling and tiling of virtual models of architectural heritage. Referring to the significant case study of Pisa Cathedral, the method is tested to ensure an informative continuum between 2D media and 3D representations, in terms of morphology, geometry and semantic enrichment. At first, a high-resolution orthophoto is created to support studies related to conservation and restoration, e.g. to highlight degradation patterns and materials as well as to distinguish cracks, frescoed surfaces and decorations. Then, the information is translated from the 2D support to a virtual 3D mockup: this step is essential to ensure a complete understanding of the architectural heritage object, which can thus be studied in its entirety, considering its morphological complexities. The proposed approach provides a more effective system for the transfer and exchange of semantic information from high-resolution orthophotos to semantically rich 3D models, which can be fundamental even in view of the construction of Heritage-Building Information Modeling (H-BIM) environments.


Introduction
The present contribution illustrates the construction of a building information management system to handle documentation and conservation activities on the north transept of Pisa Cathedral, subject of a complex restoration action. The proposed methodological strategy relies on the transfer of the available information on the heritage artifact (e.g. in terms of state of conservation and degradation phenomena) from bi-dimensional (2D) digital orthophotos to mesh-based / point-based three-dimensional (3D) models derived from surveying.
Nowadays, digital methods for the documentation and visualization of architectural heritage are countless and widely proven, as confirmed by the wide variety of tools and devices that make it possible to virtually represent each artefact or site in two or three dimensions. Reality-based surveying techniques, by combining the radiometric precision of Structure from Motion (SfM) photogrammetry with the metric precision of the laser scanner, make it possible to faithfully reproduce the geometry, shapes and colours of the built environment (Bevilacqua et al. 2018). Despite this, the community of Cultural Heritage (CH) experts and practitioners still lacks digital tools enabling the use of the different digital data produced (whether a 2D computer-aided drawing, a photo, or a 3D digital twin) as effective means in support of restoration, conservation, and documentation activities.
Previous studies (Croce et al. 2019; 2020; Ponchio et al. 2020) proved that reality-based models obtained from surveying can be enriched through the so-called annotation mechanism, intended as the operation of a) selecting a region of the digital model and b) associating that region with descriptive attributes related, e.g., to the results of historical analysis, to cognitive studies, to diagnostic and chemical investigations, to the sampling of materials and so forth. In this way, the available information on a certain asset can be illustrated, archived, accessed, and updated starting from its reality-based survey, inserting knowledge derived from cross-disciplinary study domains directly on the digital replica through the annotation process. Indeed, the information mapping procedure makes it possible to document with effective 2D/3D visual media, e.g., material changes (Pocobelli et al. 2018) and alterations (Stefani et al. 2014), degradation levels and cracks on surfaces (Bacci et al. 2019), as well as conservation states (Apollonio et al. 2018; Bruno and Roncella 2019) and temporal evolutions.
To date, orthophotos, i.e. orthorectified images extracted from reality-based surveying, are among the most frequently used media for the insertion of annotations: since they are metrically correct two-dimensional representations, the annotation process is rather simple (Fabiani et al. 2016). However, the use of 2D representations implies the production of multiple sheets of information, each one identifying, e.g., a single section or elevation of a building. This hinders the cross-checking and transfer of data between multiple annotated drawings and causes a dispersion of knowledge.
For these reasons, 2D information mapping alone is not sufficient for a complete understanding of the architectural object in its morphological completeness and complexity. The products of preventive conservation studies, in addition to being supported by the graphical representation via orthophotos, should be made accessible and amendable through a more structured and intelligible representation, such as that of a 3D environment, which takes into account the mutual relationships between the different elements represented and always preserves the connection of 2D/3D information.
Even though current Heritage-Building Information Modeling (H-BIM) platforms have been developed for this purpose (Murphy et al. 2009; López et al. 2018), they are far from ensuring the correct transfer and automatic propagation of information between orthophotos and 3D replicas, for the orderly structuring of semantic-aware data over 3D representations. In order to fill this research gap, and thus avoid possible alterations or dispersion of relevant information produced within the context of heritage studies, this paper discusses a more effective approach for the transfer of 2D/3D digital annotations and illustrates the results achieved on the specific case study of the north transept of Pisa Cathedral.
The proposed method is based on the propagation of semantic annotations between reality-based models, derived from the surveying, and orthophotos. The thematic mapping of the conservation state, prior to any restoration intervention, is performed originally over the 2D media; subsequently, the inserted information is transferred to the digital model, to be directly visualized and accessed over a mesh-based / point-based 3D representation. The proposed strategy can be validated on other case studies involving valuable architectural heritage assets, subject to integrated recovery projects.

Related work
Nowadays, digital technologies are increasingly deployed in the CH field. Virtual representations and information management systems offer new opportunities towards the reasoned and structured use of digital replicas of existing heritage objects. At the same time, reality-based survey techniques make it possible to metrically represent reality as it is, returning the existing object by photos, range data, CAD drawings and maps, or even by the integration of these techniques (Manferdini and Remondino 2010; Bevilacqua et al. 2018).
In the field of architectural restoration, these developments are also seen in the perspective of performing thematic mappings by directly leveraging the digital twin: geometric and visual data can indeed be associated with the bundle of complementary knowledge derived from analyses, tests and inspections performed over the studied object (Alshawabkeh 2020).
The connection of the merely visual and metric representation with structured knowledge-related information is performed thanks to the mechanism of the semantic annotation. The latter can be inserted on the digital media through manual selection techniques, or even through semi-automated or automated procedures (Croce et al. 2020;Scalas et al. 2020). The associated level of information can result more or less structured, depending on whether the annotation is performed, respectively, by tags, attributes, relations and ontologies (Andrews et al. 2012).
In (Croce et al. 2020), the different semantic annotation pipelines are also classified on the basis of the type of digital media where annotations are performed. A distinction can thus be made between: (a) 2D approaches, (b) 3D approaches, and (c) hybrid 2D/3D approaches, respectively.
2D approaches are traditionally the most widespread. They are based on the annotation of a two-dimensional digital representation leveraging the perspective projection (photos, archival images) or the orthogonal projection (orthophotos, CAD drawings). In the field of restoration and preventive conservation, a remarkable achievement in this regard is the SiCAR digital system, promoted by the Italian Ministry of Cultural Heritage and Activities and Tourism (MiBACT) and presented in (Fabiani et al. 2016); it is conceived as an open source and online platform that enables the digital mapping over orthophotos, and that is dedicated to professionals and Public Administrations in support of heritage restoration operations.
Although such a 2D approach paves the way for new perspectives of rethinking the documentation of restoration activities by making use of digital images, it is not sufficient and adequate to describe the complex and multidisciplinary nature of the studies that revolve around a heritage object. The sole use of orthophotos, in fact, requires the production of numerous digital documents for the information input (at least as many as the number of orthophotos produced), with ensuing scattering and possible loss or alteration of data. This also causes a loss of reference to the whole architectural asset or building, that is no more considered in its entirety and morphological complexity, thus provoking a spatial reference gap (Messaoudi et al. 2018) between the different 2D documents being annotated.
To address this problem, an explicit reference to the three-dimensional representation of the object of study needs to be introduced. That is why, in most recent digital information systems, the semantic annotation is no longer limited to 2D data, but is extended to the 3D digital twin. 3D models can be annotated by connecting data to a selection of points, (poly-)lines or polygons over the digital replica (Scopigno et al. 2011). Many existing approaches exploit mesh models obtained from reality-based acquisitions. Suffice it to say that platforms such as Sketchfab (https://sketchfab.com/) and Potree (http://potree.org/), born to allow 3D model publishing and sharing between a community of web users, offer basic and user-friendly tools for the insertion of 3D digital tags, which represent the most trivial form of annotation.
As for the specific domain of CH, 3D approaches are more and more exploited: (Galantucci and Fatiguso 2019) adopted annotations on point clouds and textured polygonal meshes to characterize cracks or features induced by material loss in historic buildings' surfaces. (Apollonio et al. 2018) proposed a web-based information system to support the restoration of Neptune's Fountain in Bologna, for a more direct liaison between the on-site inspections and the reasoning on the digital replica. (Garozzo et al. 2017) leverage 3D data to store historical buildings' documentation, by embedding over the digital model specific information on documentary sources, images, decay and deformation evidence as well as on decorative elements. Other works by (Hunter and Gerber 2010; Serna et al. 2012; Boutsi et al. 2019) rely on ontologies to formalize a shared and univocal method for the insertion of annotations within 3D heritage data.
Another noticeable approach to the topic relies on the use of Building Information Modeling systems, applied to CH. Following the work by (Murphy et al. 2009; Dore and Murphy 2013) on the application of BIM-based reasoning to this domain, BIM platforms such as Autodesk Revit or ArchiCAD have constituted a preferred medium to link the geometric representation of artifacts with cross-disciplinary knowledge. (Angulo-Fornos and Castellano-Román 2020) use annotations of this kind to manage all the information generated in preventive conservation actions on a heritage building, at different levels of detail and visualization. (Pocobelli et al. 2018) use H-BIM to construct a weathering forecasting model and to provide information on possible surface degradation patterns, bringing as a significant example the Jewel Tower in London. (Bacci et al. 2019; Malinverni et al. 2019) leverage a parametric representation to build a 3D thematic map illustrating material decay and related restoration interventions. Annotations are displayed as multi-category tags and they are sorted according to a relational database schema.
However, among the different digital annotation systems proposed, hybrid approaches are the most valuable ones: indeed, they do not limit the inclusion of information to a certain type of model to the detriment of another, but rather they allow the transfer of annotations between different media, i.e. between 2D images or drawings and 3D digital models.
Among them, the web-based information system NUBES (Stefani et al. 2014) exploited UV mapping to automatically project annotations made on 2D textures to the 3D scene. More recently, the collaborative reality-based platform Aïoli (http://aioli.cloud), leveraging photogrammetric acquisitions, has made it possible to link annotations made on images to 3D point clouds.
Hybrid approaches have a fundamental advantage in restoration applications, where the continuous connection between different study products and outputs is required, and the transfer of information between descriptive sheets, photos, 3D models and CAD drawings is demanded for a continuous updating, retrieval and archiving of documentary data. However, a complete information management system should be provided with a solid structure of data access, storage and retrieval, as is the one provided by H-BIM platforms. Starting from the state-of-the-art on this topic, the methodology proposed in this work combines the advantages of hybrid approaches, in terms of 2D/3D information transfer, with the more logical information structuring typical of H-BIM environments.

The north transept of Pisa Cathedral
The proposed methodological approach is tested on the relevant case study of the north transept of Pisa Cathedral, in which the production of orthophotos is explicitly requested as a fundamental means of support, to document (annotate) degradation states and preventive conservation interventions.
The Pisa Cathedral (Italy) is a masterpiece of the Pisan Romanesque style; it is located in Piazza dei Miracoli, near the well-known leaning tower. Founded in 1064, the complex is a testimony to the prestige reached by the Maritime Republic of Pisa at its apogee.
The Cathedral was built in two phases, respectively linked to the architects Buscheto, who designed the original layout with a basilica body of five naves, a transept with three naves and a dome on the cross, and Rainaldo, who planned the extension of the building, as well as the façade with black and white marble inlays and with a persistent use of reused materials recovered from monuments of the Roman age. The complex, which has been subject to repeated restoration campaigns over time, is enriched by elements recalling oriental architectures, traceable in the decorative components and in the elliptical plan of the dome, of Moorish inspiration.
The latest restoration action was put into practice by the Opera Primaziale Pisana, the lay-ecclesiastical institution created for the management and preservation of the complex. Restoration activities were conceived in a series, based on a subdivision into successive intervention lots, so as to allow the execution of the operations while maintaining the Cathedral open to visitors.
In 2018, the group in charge of maintenance and restoration requested a survey of the outer north side of the transept of the Cathedral, to be returned at a scale of 1:20 (Figure 1), as a preliminary step towards the definition and documentation of the recovery interventions.

Preliminary survey activities
The external surfaces of the transept covered by the survey campaign carried out for the Opera Primaziale Pisana have a total area of about 1700 m², including, respectively: the East and West walls (about 500 m² each), the North wall, corresponding to the area of the transept including the apse (about 400 m²), and the side walls corresponding to the naves (about 300 m²). The top of the cathedral is about 30 m above the ground, while the roofs of the transept are on average 20 m above the ground, for a width of 24 m.
The metric survey required the use of integrated survey techniques, which included, respectively (Caroti and Piemonte 2020):
Terrestrial Laser Scanner (TLS) survey. The TLS survey was performed via a Leica Geosystems C10 ScanStation, set to a scan density of 8 points per cm². As there were no surrounding areas located at sufficient elevation to allow for a more complete description of the higher parts of the transept's façade, the scans were acquired mostly at ground level. The only exceptions were two scans, performed from two access points located on the outside of the tambour supporting the Cathedral dome.
Ground-based photogrammetric survey. The photographic campaign was carefully arranged, in order to obtain a restitution of the colour information that was as homogeneous as possible in terms of hue, saturation, brightness and exposure. It was performed via a Nikon D850 digital SLR camera, with a CMOS sensor of 35.9 x 23.9 mm capable of a pixel resolution of 8256 x 5504 px.
The photographs were acquired at a camera-to-object distance of about 55-65 m, with a lens of fixed focal length f = 200 mm, guaranteeing a Ground Sampling Distance (GSD) between 1.2 mm and 1.4 mm. For some images taken at a closer distance, an f = 50 mm lens was used.
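As a quick plausibility check of the figures above, the GSD of a pinhole camera can be estimated from the pixel pitch, the object distance and the focal length. The sketch below is only illustrative: the function name is hypothetical, and it assumes a fronto-parallel surface and ignores lens distortion. With the sensor width, image width and focal length quoted in the text, it gives values of roughly 1.2 mm and 1.4 mm at 55 m and 65 m.

```python
def ground_sampling_distance(sensor_width_mm, image_width_px, focal_mm, distance_m):
    """Return the GSD in millimetres per pixel for an ideal pinhole camera."""
    pixel_pitch_mm = sensor_width_mm / image_width_px   # physical size of one pixel
    distance_mm = distance_m * 1000.0
    return pixel_pitch_mm * distance_mm / focal_mm


# Values reported in the text: 35.9 mm sensor width, 8256 px image width, f = 200 mm
gsd_near = ground_sampling_distance(35.9, 8256, 200.0, 55.0)
gsd_far = ground_sampling_distance(35.9, 8256, 200.0, 65.0)
print(f"GSD at 55 m: {gsd_near:.2f} mm; at 65 m: {gsd_far:.2f} mm")
```

The same function can be reused for any other camera and distance, provided the focal length is known.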
The colorimetric information was corrected by providing a homogeneous reference across the different lighting conditions encountered during the photogrammetric acquisitions. This was achieved by thoroughly planning the time of day and the acquisition modes, so that the appearance of objects would not be altered by cast shadows or by changing lighting conditions. In addition, every 30 minutes, a photo containing a ColorChecker Color Rendition Chart was included, to adjust white balance and thus control the exposure and brightness levels of each photograph.
Unmanned Aerial Vehicle-Borne Photogrammetric Survey (UAV-based survey). It was acquired by a DJI Phantom 4 Pro. This UAV system was equipped with a DJI FC6310 camera (13.2 x 8.8 mm CMOS sensor), capable of a resolution of 4864 x 3648 pixels, and ensuring an average GSD of about 2.4 mm at an average UAV-to-object distance of 8 m. Sequences of photographs in the vertical direction were acquired. The GSD is in this case coarser than the one calculated for ground-based photogrammetry; moreover, despite the fast exposure time set for the interval timer shooting (1/800 s), micro-blurs still appeared, thus affecting the quality and precision of the photographs. For these reasons, the UAV-based photogrammetry was only considered for the integration of parts that were missing or not plainly visible in the ground-based survey, as the latter featured better acquisition results.

Methods
The proposed methodological approach begins with the creation of 2D or 3D digital study models, which are intended to provide basic graphic support for the management and coordination of restoration operations. The first step is indeed the acquisition, starting from the reality-based surveying, of metrically controllable data, derived through laser scanning or photogrammetric surveying. These pre-processing operations are illustrated in Subsection 4.1.
Subsequently, the different stages of the data processing workflow can be divided as follows:
1. 3D model reconstruction and High Resolution (HR) orthophoto generation;
2. Semantic mapping of the model (e.g. in terms of depiction of degradation elements, description of materials, identification of interventions to be performed, or even classification of architectural components);
3. Reconstruction of projective relations between orthophoto and 3D model;
4. Discretization and tiling of the 3D model, with identification of significant elements (points) to which information can be associated (mesh-based representation);
5. Transfer of annotations between model and orthophoto, and transition to point-based annotations (point-based representation);
6. Information archival, retrieval and update via point-based annotations, in a continuous protocol of input and output allowing for a more complete heritage documentation system.
The overall flowchart of the proposed methodology is provided in Figure 2. The description of each step of the proposed approach is provided in the next subsections.

Pre-processing of digital data for HR ortho-photo generation
A preliminary phase of inspection and direct contact with the survey object makes it possible to accurately plan the type of survey to be performed and the instrumentation to be employed in the study of each architectural asset. This decision is influenced each time by the characteristics of the object itself, which may vary depending on the dimensions, the shape, the building's constructive complexity, etc. As for the case study of the transept of Pisa Cathedral, TLS and photogrammetric surveys were chosen, so as to combine the geometric accuracy of the former with the colorimetric precision of the latter (Bevilacqua et al. 2018), and the acquisitions were carefully prepared to maximize the uniformity of lighting and shadow conditions. Then, with regard to the homogenization of coordinate systems, the TLS point cloud provided the metric reference with which to associate the radiometric correctness (i.e., in terms of colour) of the photogrammetric survey.
The processing of the survey data and the integration between the two systems provided a 3D model of the study object, a digital replica that was used as a reference to the construction of the information system.
More in detail, the alignment of the UAV-based and ground-based acquisitions respectively was performed via the software Agisoft Metashape. At first, the camera internal and external orientation parameters of each photogram were calculated, separately processing every wall of the transept and obtaining for each side a dense point cloud.
Then, the insertion of Ground-Control Points (GCPs), detected at a first approximation on the TLS referenced cloud, provided the basis for the proper scaling and georeferencing of the photogrammetric dense point clouds. In particular, the roto-translation matrix with scale factor was calculated via a cloud-to-cloud matching algorithm, and resulted in an overall precision of the order of 4 mm. The coordinates of the exported GCPs were thus recalculated and imported back, so that every project was framed in a common reference system. This methodology was applied separately to both the ground-based and the UAV-based photogrammetric projects and finally yielded a textured mesh of the transept.
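To make the notion of a roto-translation with scale factor concrete, the following sketch estimates such a transformation from matched point pairs and reports the residual. It is not the cloud-to-cloud matching used in the survey: it swaps in the closed-form least-squares solution by Umeyama on already matched points, and the function names and GCP coordinates are hypothetical.

```python
import numpy as np


def estimate_similarity(src, dst):
    """Least-squares scale s, rotation R, translation t with dst ~ s * R @ src + t."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    cov = dst_c.T @ src_c / len(src)               # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:   # guard against a reflection
        S[2, 2] = -1.0
    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t


def rms_residual(src, dst, s, R, t):
    """Root-mean-square distance between the transformed src points and dst."""
    mapped = s * (np.asarray(src, dtype=float) @ R.T) + t
    diff = mapped - np.asarray(dst, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


# Hypothetical GCPs (metres) in the photogrammetric frame
gcp_photo = np.array([[0, 0, 0], [10, 0, 0], [0, 8, 0], [0, 0, 20], [5, 4, 12]], dtype=float)
# Synthetic TLS coordinates built from a known rotation, scale and shift
angle = np.radians(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle), np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
gcp_tls = 1.5 * gcp_photo @ R_true.T + np.array([100.0, 50.0, 3.0])

s, R, t = estimate_similarity(gcp_photo, gcp_tls)
residual = rms_residual(gcp_photo, gcp_tls, s, R, t)
print(f"scale = {s:.4f}, RMS residual = {residual:.2e} m")
```

On real data the residual would not vanish; it is the quantity to compare against the precision of the order of 4 mm reported above.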
Starting from this output, an ortho-mosaic was generated, by adequately considering each resulting model. To this end, the accuracy required by the desired scale of representation (1:20) has demanded the logical subdivision of the north façade of the transept into blocks, intended as areas of the model lying approximately in the same plane and which could therefore be processed together in the ortho-rectification process. In parts assimilable to planar elements, the mesh model derived from the point cloud has been replaced with theoretical planes in order to improve precision and accuracy of the HR orthophoto. The procedure has been previously documented and suitably illustrated in .
The subdivision of the building elevation section in such homogeneous regions is shown in Figure 3: red, green and blue areas highlight regions with planar development located at increasing distance from the camera location, while yellow areas indicate curved elements that have been treated separately, by making use of cylindrical projections.
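The cylindrical projection used for the curved elements can be pictured as unrolling the surface onto the development of a cylinder. The sketch below is a minimal illustration under strong assumptions that are not stated in the text: a vertical cylinder axis, a known axis position and a known radius. The function name and the numeric values are hypothetical.

```python
import math


def unroll_on_cylinder(point, axis_xy, radius):
    """Map a 3D point near a vertical cylinder to (arc length, height) on its development."""
    x, y, z = point
    cx, cy = axis_xy
    theta = math.atan2(y - cy, x - cx)   # angle around the vertical axis, in (-pi, pi]
    return radius * theta, z


# Hypothetical apse: vertical axis through (0, 0), radius 6 m
quarter_turn = unroll_on_cylinder((0.0, 6.0, 10.0), (0.0, 0.0), 6.0)
print(quarter_turn)
```

Because the angle wraps at ±π, the seam of the development should be placed on the side of the cylinder that is not surveyed.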
For the correction and matching of the generated orthophotographs, Adobe Photoshop was used as image editing software to perform slight adjustments of any remaining cast shadows and/or image overlaps. Then, as a final step for the generation of the HR orthophoto, the restorers requested that the image equalization schemes be modified in order to highlight surface discontinuities or degradation phenomena due to the particular oxidization or alteration of the stone ashlars. The orthophoto was then ultimately modified by adjusting the hue, saturation and brightness of its parts, to meet these requirements.

Semantic mapping and reconstruction of 2D/3D projective relationships
Once the 3D study model and the high-resolution orthophotos are generated, the thematic mapping represents an essential phase of the working protocol on the cultural asset, enabling the link of the purely visual representation of the surveyed elements with attributes related, for instance, to the state of conservation, to the degradation of the surfaces or to the definition of the recovery interventions, as well as to any other semantic-aware source of information.
This phase is thus conceived as tightly correlated to the complementary analyses carried out on the asset, entailing the insertion of such information within a digital representation; it is aimed at the provision of a graphical and numerical tool for data display, extraction and recording, which otherwise would be restricted to the isolated observation of several reports and various spreadsheets.
In other words, the mapping operation represents the first stage towards the definition of an information model that serves as a repository of inter-related graphic entities associated with texts, images and documentary resources. This is achieved, as previously discussed in Sections 1 and 2, thanks to the annotation mechanism; herein, visual representations are inserted in the form of 2D annotations, directly on the digital orthophotos.
As for the proposed strategy, the orthophoto is imported into the free-form modelling software McNeel Rhinoceros, and it is in this environment that the annotations are performed, resorting to closed polylines or curves and associating each time the inserted data to distinct information layers.
However, in contrast to current methods solely relying on the annotation of the orthophoto, the following approach aims at transferring the information from the 2D support to the 3D model, which can thus constitute an unambiguous reference for the exchange and retrieval of information and in such a way as not to limit the information to the representation in two dimensions only, which would require the use of different non-connected documents.
Moreover, it is worth noting that the present contribution illustrates a manual-tracing procedure for the creation of semantic annotation, but it is possible to envisage the extension of the proposed approach also to the case of reality-based annotations obtained with more automatic procedures, such as, by way of example, those obtained by exploiting Artificial Intelligence for the classification of 2D/3D heritage data (Grilli and

Annotation transfer and 3D information model
The transition from the 2D media to a 3D representation is based on the recovery of projective relationships connecting orthophotos and digital model. This operation is achieved by referring to the basic principles of descriptive geometry and by extracting appropriate 3D elements to which information can suitably be associated.
The purpose of phases 3 to 6 of the procedure (illustrated at the beginning of Section 4) is thus to create a 3D reference model, valid once and for all, which can be used every time it is necessary to update, recover, validate, or amend specific information relating to a given heritage monument. To accomplish this, it is assumed that descriptive data are inserted considering a suitable set of representative points of this reference model, to which information is associated each time.
The different phases of this process are summarized in the scheme of Figure 4, which displays the example of annotation transfer related to regions of the material 'white marble': starting from the thematic mapping of the orthophoto, the annotations are first transferred to the mesh as projected curves or polylines; then, the mesh, chosen as the unique reference environment for the information management system, is segmented through an appropriate operation of discretization (subdivision into tiles). This operation is performed by choosing a minimum size of the tiles of the mesh, depending on the minimum surface area covered by restoration works. In so doing, it is possible to shift to a representation based on Surface Information Points (SIPs) identified over a source model, each one identifying a specific mesh tile, to which information related to the thematic mapping of each tile is linked. Finally, the interpolation with the curves projected onto the model allows the information to be transferred to the representative surface points, displaying the annotations in both 2D and 3D.
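The final step, in which each SIP inherits the annotations of the region it falls in, can be sketched as a point-in-polygon test. The sketch rests on a simplifying assumption: for an orthogonal projection, checking whether a SIP lies inside a curve projected onto the mesh is treated as equivalent to checking whether the SIP, projected back onto the orthophoto plane, lies inside the original 2D polygon. Occlusions along the projection direction are ignored, and all names and coordinates are hypothetical.

```python
def point_in_polygon(pt, polygon):
    """Ray-casting test: True if the 2D point lies inside the closed polygon."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):                       # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def label_sips(sips_uv, annotations):
    """For each SIP, list the labels of the annotation regions that contain it."""
    return [[label for label, poly in annotations if point_in_polygon(uv, poly)]
            for uv in sips_uv]


# Hypothetical data: one annotated region and two SIPs in orthophoto-plane metres
annotations = [("white marble", [(0, 0), (2, 0), (2, 1), (0, 1)])]
sips_uv = [(0.5, 0.5), (2.5, 0.5)]
labels = label_sips(sips_uv, annotations)
print(labels)
```

A SIP may belong to several regions at once (for instance a material and a decay pattern), which is why a list of labels is returned for each point.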
In order to streamline the procedure, these stages of the workflow are carried out solely with the software MeshLab (for the discretization of the mesh model) and McNeel Rhinoceros (for the remaining steps).

HR ortho-photo
The ortho-mosaic mapping, after the processing of surveying data (Subsection 4.1), resulted for the north transept of Pisa Cathedral in a two-fold output: on the one hand, an HR orthophoto of the object of study, in its entirety (Figure 5); on the other hand, the development of the apse, obtained by a cylindrical projection, which better described, in 2D, the curved surfaces of this part of the transept (Figure 6).

Semantic mapping
The thematic mapping of the surfaces is initially performed on the HR orthophoto of the transept: for restoration-related purposes, the semantic description of the object is achieved by defining regions of interest describing, respectively, materials, areas affected by degradation phenomena, conservation interventions made necessary for preventive maintenance works (Figure 7).
As concerns the detection of alterations, damage and decay, reference is made to the classification provided by the International Council on Monuments and Sites (ICOMOS), which constitutes a common and widely recognized effort towards a terminological convention for defining stone deterioration patterns of heritage artifacts (Cartwright et al. 2008). In detail, cracks, features induced by material loss, discoloration, deposit and biological colonization affected the surfaces of the considered case study (Figure 7ii). For each category, an appropriate conservation strategy has been put into practice and documented accordingly (Figure 7iii). The mapping over the digital orthophoto is performed with annotations in the form of closed curves or polylines: a multi-segment polyline can include line and arc segments (Figure 8), while the curve is generated by control points (Figure 9).
Each annotation class is managed as a separate layer through the layer command.
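The layer-based organisation of annotations can be mirrored by a small data structure, which also makes it possible to derive quantities such as the mapped area per class. The sketch below is a hypothetical illustration, not the Rhinoceros data model: it assumes that every closed curve has already been sampled into straight polyline vertices expressed in metres.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Annotation:
    layer: str       # annotation class, e.g. a material or a deterioration pattern
    vertices: list   # closed polyline as [(x, y), ...] in orthophoto metres

    def area(self):
        """Enclosed area in square metres, computed with the shoelace formula."""
        pts = self.vertices
        twice = sum(x1 * y2 - x2 * y1
                    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]))
        return abs(twice) / 2.0


def area_by_layer(annotations):
    """Sum the mapped area of every layer."""
    totals = defaultdict(float)
    for ann in annotations:
        totals[ann.layer] += ann.area()
    return dict(totals)


# Hypothetical regions on two layers
mapped = [
    Annotation("discoloration", [(0, 0), (2, 0), (2, 1), (0, 1)]),
    Annotation("discoloration", [(3, 0), (4, 0), (3, 1)]),
    Annotation("biological colonization", [(0, 2), (1, 2), (1, 3), (0, 3)]),
]
totals = area_by_layer(mapped)
print(totals)
```

Areas measured on the orthophoto are projected areas; on curved or inclined surfaces they underestimate the true extent, which is one more reason to carry the annotation over to the 3D model.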

Reconstruction of projective relationships between orthophoto and 3D model
Once the multi-layer annotation phase is accomplished for the 2D orthophoto, the information is transferred to the 3D model. In so doing, the position of the orthomosaic in relation to the model is manually reconstructed; the ortho-photo is placed in the 3D space by identifying an average generation plane and the projection direction. This is done by rotating and translating the orthophoto until its position coincides with that of the 3D model projection.
In detail, projection lines are used to relocate the 2D drawing in space and make the elements represented on the orthophotos coincide as closely as possible with the corresponding elements of the 3D model. This operation is repeated for each type and direction of projection originally used to build the orthophoto.
For the orthophoto obtained by planar projection (Figure 5), the direction orthogonal to the middle plane of the transept facade is considered.
It has to be noted that, by preserving the roto-translation matrix, it is always possible to recover and therefore preserve the position of the orthophoto with respect to the model: this ensures that, if in the future it is required to update the thematic mapping of information, one can keep track of the position occupied by the 2D annotation in space, and therefore transfer new annotations or update the previous ones from the ortho-photo onto the 3D model. Once the orthophoto is correctly arranged in space, the relationships between parallel and perspective projections can be exploited in order to reconstruct the position of each annotation in 3D.
Curves and polylines representing the contours of each annotation are thus projected onto the 3D model, by pulling their 2D representation toward the mesh. The approach is applied both to the case of plane (Figure 10) and cylindrical projections (Figure 11).
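The placement and projection steps described above can be illustrated with a minimal sketch for the planar case. It is not the tool chain used in this work: it assumes that the orthophoto placement is stored as a 4x4 roto-translation matrix, that the mesh is available as a plain list of triangles, and that annotation vertices are pulled onto the mesh by ray casting along the projection direction (Möller-Trumbore intersection). All names (`apply_rigid`, `ray_triangle`, `project_annotation`, `PLACEMENT`, `WALL`) and the sample data are hypothetical.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def apply_rigid(matrix, point):
    """Apply a 4x4 roto-translation (nested lists, row-major) to a 3D point."""
    x, y, z = point
    return tuple(matrix[r][0] * x + matrix[r][1] * y + matrix[r][2] * z + matrix[r][3]
                 for r in range(3))

def ray_triangle(origin, direction, triangle, eps=1e-9):
    """Distance along the ray to the triangle (Moller-Trumbore), or None if missed."""
    v0, v1, v2 = triangle
    e1, e2 = sub(v1, v0), sub(v2, v0)
    h = cross(direction, e2)
    a = dot(e1, h)
    if abs(a) < eps:              # ray parallel to the triangle plane
        return None
    f = 1.0 / a
    s = sub(origin, v0)
    u = f * dot(s, h)
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = f * dot(direction, q)
    if v < 0.0 or u + v > 1.0:
        return None
    t = f * dot(e2, q)
    return t if t >= 0.0 else None

def project_annotation(vertices_2d, placement, direction, triangles):
    """Place 2D annotation vertices in space, then pull them onto the mesh."""
    projected = []
    for u, v in vertices_2d:
        origin = apply_rigid(placement, (u, v, 0.0))
        hits = []
        for triangle in triangles:
            t = ray_triangle(origin, direction, triangle)
            if t is not None:
                hits.append(t)
        if hits:
            t = min(hits)         # first surface met along the projection line
            projected.append(tuple(origin[i] + t * direction[i] for i in range(3)))
        else:
            projected.append(None)
    return projected

# Hypothetical data: orthophoto plane placed at y = -10, flat wall at y = 2.
PLACEMENT = [[1.0, 0.0, 0.0, 0.0],
             [0.0, 0.0, -1.0, -10.0],
             [0.0, 1.0, 0.0, 0.0],
             [0.0, 0.0, 0.0, 1.0]]
WALL = [((0.0, 2.0, 0.0), (4.0, 2.0, 0.0), (4.0, 2.0, 4.0)),
        ((0.0, 2.0, 0.0), (4.0, 2.0, 4.0), (0.0, 2.0, 4.0))]
```

Storing `PLACEMENT` alongside the annotations is what allows later updates of the thematic mapping to be re-projected without repeating the manual alignment. A cylindrical projection would require a different mapping from orthophoto coordinates to rays and is not covered by this sketch.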

3D Model tiling and definition of surface information points
The mesh tessellation procedure, for the identification of the SIPs, is performed in the software MeshLab by means of a uniform resampling operation, previously tested in (Angulo-Fornos and Castellano-Román 2020). Through this command, the mesh is subdivided into tiles of regular dimensions, subject to a minimum size of the constructed cells. In the presented case, the minimum cell size is set to 5 cm x 5 cm, taken as the threshold below which a restoration intervention is not considered significant. The result is an orderly structured system of faces and vertices (Figures 12 and 13). It should be noted that this approach relies on existing tools that largely automate the simplification and decimation of the mesh vertices and faces: the resampling is indeed performed by building a uniform volumetric representation (meshlab.net). A more rigorous, but certainly more time-consuming, approach would require dividing the mesh according to the individual elements that constitute the façade of the building: this approach, which is being studied as future work of this research, would be the result of a semantic segmentation, understood as a reasoned structuring of the mesh guided, e.g., by the distinction into main architectural components.
In any case, from the resulting resampled mesh, the SIPs are extracted: these points, corresponding to the vertices of the mesh faces, are used as representative elements of each model tile (Figure 14), and it is to them that semantic information is intended to be associated.
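As a rough illustration of how one representative point per tile can be obtained, the following sketch groups mesh vertices into a regular grid with a 5 cm cell and keeps one centroid per occupied cell. This is a simplified stand-in (vertex clustering on a grid), not the uniform volumetric resampling performed by MeshLab; the function name `extract_sips` and the assumption that coordinates are expressed in metres are hypothetical.

```python
import math
from collections import defaultdict

def extract_sips(vertices, cell_size=0.05):
    """Return one representative point (centroid) per occupied grid cell.

    vertices  : iterable of (x, y, z) tuples, assumed to be in metres
    cell_size : side of the cubic cell, 0.05 m as in the presented case
    """
    cells = defaultdict(list)
    for vertex in vertices:
        # Integer cell index along each axis
        key = tuple(math.floor(c / cell_size) for c in vertex)
        cells[key].append(vertex)
    sips = {}
    for key, points in cells.items():
        n = len(points)
        sips[key] = tuple(sum(axis) / n for axis in zip(*points))
    return sips
```

Keeping the cell index as dictionary key gives each SIP a stable identifier to which annotation layers can later be attached.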

Annotation transfer between ortho-photo and 3D model
For the association of annotation information to SIPs, a visual programming language is exploited. The representative points of each annotation layer are extracted by means of a graphical algorithm, starting from the 3D closed curves and polylines obtained at the end of Subsection 5.3.
The core of the algorithm is the Point In Curves function, which tests points for containment in closed curves. Given as input the 3D coordinates of the SIPs and the closed 3D annotation regions, the algorithm returns the Point/Region relationship as an index value equal to 0 (= outside), 1 (= coincident) or 2 (= inside). As an output, the list of points with index value 1 or 2 is extracted (Figure 15). Figures 16 and 17 display the result of the application of the constructed algorithm, related to the annotation of the discoloration and deposit phenomenon 'encrustation'. The procedure is repeated for every annotation layer and finally achieves the point-based connection between 2D and 3D representations. An information model is thus finalized, in which the layers corresponding to each annotation class can be visualized and switched on or off. This also enables interpolation of the various information entered, and the eventual extraction of computation tables linked, for example, to the temporal evolution of degradation or to queries related to the types of interventions performed.
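The containment test with its three index values can be sketched in plain Python as follows. This is an illustrative re-implementation under simplifying assumptions, not the component of the visual programming environment used here: SIPs and annotation regions are assumed to be already expressed in the same 2D coordinates (e.g. those of the orthophoto plane), and regions are approximated by closed polygons. The names `containment` and `select_sips` are hypothetical.

```python
def _on_segment(p, a, b, tol=1e-9):
    """True if point p lies on segment a-b within the tolerance."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if abs(cross) > tol:
        return False
    along = (p[0] - a[0]) * (b[0] - a[0]) + (p[1] - a[1]) * (b[1] - a[1])
    length_sq = (b[0] - a[0]) ** 2 + (b[1] - a[1]) ** 2
    return -tol <= along <= length_sq + tol

def containment(p, polygon, tol=1e-9):
    """Return 0 (outside), 1 (coincident with the boundary) or 2 (inside)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        a, b = polygon[i], polygon[(i + 1) % n]
        if _on_segment(p, a, b, tol):
            return 1
        # Ray casting: count edges crossed by a horizontal ray towards +x
        if (a[1] > p[1]) != (b[1] > p[1]):
            x_cross = a[0] + (p[1] - a[1]) * (b[0] - a[0]) / (b[1] - a[1])
            if x_cross > p[0]:
                inside = not inside
    return 2 if inside else 0

def select_sips(sips, regions):
    """Keep the points whose index is 1 or 2 for at least one region of a layer."""
    return [p for p in sips if any(containment(p, r) >= 1 for r in regions)]
```

Running `select_sips` once per annotation layer mirrors the layer-by-layer repetition of the procedure described above.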

Discussion
The methodological approach illustrated above in relation to the case of the Pisa Cathedral is intended to be applied in collaboration with the actors of the process of restoration and preventive conservation of this cultural asset, in order to create a shared data management system following the logic of modern heritage-building information modelling platforms.
Analytical and descriptive sheets become an integral part of the process of digital heritage documentation, while being stored, managed, queried, and accessed through 3D visualization graphics. The digital model thus becomes a reference source, semantically rich, to which to associate the variety of available and usable information over time, avoiding the dispersion or loss of relevant data.
The reference model consists of a mesh, obtained from the reality-based survey (with integration of TLS, ground-based and UAV-based photogrammetry) and appropriately tessellated, on which representative elements are duly identified. It is precisely to these representative elements, identified as SIPs, that the different pieces of information, including the semantic mapping over HR orthophotos, are associated and organized according to a multi-layered structure. The knowledge-related data can be isolated from time to time or, vice versa, interpolated and combined depending on the desired queries, by acting on the visibility of individual layers and thus changing the configuration of the different model views.
Indeed, the operation of model tessellation and the reference to SIPs make it possible to overcome a limitation of current H-BIM approaches, in which information can only be associated with a whole architectural component (e.g. wall, column, floor elements etc.). Conversely, the proposed strategy relies on a point-based representation to characterize the descriptive attributes of specific parts of the surfaces rather than of whole objects per se. This makes the approach better suited to restoration operations, where the mapping of smaller elements requires a finer scale of detail.
Moreover, the use of SIPs makes it possible to transfer more directly the information mapped on 2D supports, by graphically depicting 2D objects and their related spatial relationships on the 3D model. This expedient has the advantage of combining the ease of graphically annotating images with the clarity and completeness of 3D graphics.
As a quantitative analysis for the verification of the proposed approach, we compared, on the simplified mesh model, a point-based annotation performed manually with that resulting from the automatic procedure, so as to verify, at the same scale of representation and subdivision of the mesh, the correspondence between the detected SIPs. Figure 18 shows the comparison between manual (Figure 18a) and automatic (Figure 18b) annotations, respectively, performed for an upper portion of the transept elevation, where a validation set has been manually set up.
In particular, the dark red lines in Figure 18b identify the mesh faces that have been excluded from selection in the case of automatic annotation: in total, 34 of the 222 faces relative to the annotation of Figure 18 are false negative polygons, i.e. polygons that should belong to the annotation but have been excluded from it. This test, repeated on further portions of the model to assess the validity of the approach, quantified the portion of misclassified faces of the mesh at about 15% on average.
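The comparison between manual and automatic annotations can be expressed as a simple set operation on face identifiers. The sketch below assumes that the 222 faces are those of the manual reference annotation and that faces are identified by integer indices; the function name `false_negative_rate` is hypothetical.

```python
def false_negative_rate(manual_faces, automatic_faces):
    """Share of manually annotated faces missed by the automatic procedure."""
    manual = set(manual_faces)
    if not manual:
        raise ValueError("the manual reference annotation is empty")
    missed = manual - set(automatic_faces)
    return len(missed) / len(manual)

# Counts reported for Figure 18: 34 faces missed out of 222 (face ids are placeholders)
rate = false_negative_rate(range(222), range(34, 222))
print(f"{100 * rate:.1f}%")
```

With the counts reported for Figure 18, the ratio 34/222 is about 15.3%, in line with the average stated above.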
It can be observed that the major limitations of the proposed approach concern transition points and seamlines dividing one architectural element from another, e.g. a flat wall from a column or even a single ashlar from another. It is thus reasonable to expect that a remeshing of the initial model, if performed by considering a semantic segmentation into main architectural components, would distinguish the individual elements of the mesh more appropriately, in such a way that faces, vertices and edges are associated with a single architectural element. This aspect is the subject of ongoing work and would improve the result of the automatic transfer.
However, in some cases, given the particular conformation of the base model or the presence of noise (as can be seen, e.g., in Figure 12), the mesh obtained from reality-based surveying may not allow for a correct subdivision of the surfaces; in this regard, the approach could be more robust and widely applicable when taking into account a more ordered distribution of the mesh, referred for example to an ideal model formed by geometric primitives. As such, it would be necessary to segment the model in a semantic way, by marking the main and recurrent architectural components (walls, columns, capitals, moldings and so on) and defining for each of them an appropriate tessellation procedure, based on architectural criteria (Figure 19).
Figure 19. Semantic segmentation of the model with identification of classes of typological elements. Future work will consider the model resampling based on these primitive architectural components.
In any case, the 3D model is always to be intended as a univocal reference, valid a priori and defined once and for all, with which the information is associated. The applied strategy thus makes it possible to refer the descriptive data (which can be textual, graphical, alphanumeric …) each time to representative elements of this metric digital model.
In the future, any information to be associated with the model can be inserted by referring again to this representation, be it a mesh or a primitive-based mockup. In other words, if new information about the asset becomes available, regardless of the refinement and type of representation used to describe it, it can be linked to the source model defined a priori. Vice versa, any information to be retrieved on the 3D model can be associated to its original descriptive sheet (such as the orthophoto), by appropriately exploiting inverse projective relations.
In so doing, a protocol of input and output of semantic data is established that allows in both directions the graphic and quantitative control of relevant information.
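For the planar case, the inverse direction of this protocol (from the 3D model back to the orthophoto) can be sketched by inverting the stored roto-translation. The example assumes that the placement is a rigid 4x4 matrix whose local z axis is aligned with the projection direction, so that dropping the third local coordinate yields the orthophoto coordinates; the names `invert_rigid`, `to_orthophoto` and `PLACEMENT` are hypothetical and the matrix is sample data only.

```python
def apply_rigid(matrix, point):
    """Apply a 4x4 roto-translation (nested lists, row-major) to a 3D point."""
    x, y, z = point
    return tuple(matrix[r][0] * x + matrix[r][1] * y + matrix[r][2] * z + matrix[r][3]
                 for r in range(3))

def invert_rigid(matrix):
    """Invert a rigid transform: rotation R^T, translation -R^T t."""
    rot_t = [[matrix[c][r] for c in range(3)] for r in range(3)]
    t = [matrix[r][3] for r in range(3)]
    t_inv = [-sum(rot_t[r][c] * t[c] for c in range(3)) for r in range(3)]
    rows = [rot_t[r] + [t_inv[r]] for r in range(3)]
    return rows + [[0.0, 0.0, 0.0, 1.0]]

def to_orthophoto(point_3d, placement):
    """Map a 3D point back to orthophoto (u, v), discarding the depth."""
    u, v, _depth = apply_rigid(invert_rigid(placement), point_3d)
    return (u, v)

# Hypothetical placement: orthophoto plane at y = -10, facing the +y direction.
PLACEMENT = [[1.0, 0.0, 0.0, 0.0],
             [0.0, 0.0, -1.0, -10.0],
             [0.0, 1.0, 0.0, 0.0],
             [0.0, 0.0, 0.0, 1.0]]
```

Because the transform is rigid, its inverse requires only a transpose and one matrix-vector product, so annotations made directly on the 3D model can be reported on the original 2D sheet at negligible cost.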
Regarding the handling, management, and consultation of the integrated H-BIM model, it is essential to provide a univocal data entry system. In this contribution, the formal specification of a building taxonomy has been shown explicitly only for the case of the identification of stone alteration patterns, referring to the ICOMOS illustrated glossary. However, in the perspective of a more general and extended use of the source model, the definition of an unambiguous methodology of information entry, also in terms of vocabulary and specific taxonomies (Messaoudi et al. 2018;Roussel et al. 2019), is a fundamental aspect that has to be deepened in view of sharing the 3D model among the different actors involved in the restoration process. Nonetheless, it is envisaged to define a single register of information management also based on a system of access keys, in such a way that visualizing and accessing, or modifying and systematically updating the information are operations that can be performed by this or that stakeholder, depending on his/her respective position in the heritage documentation process.

Conclusions and future developments
In this study, the implementation of a methodological approach for the propagation of information between orthophoto and 3D replica was presented, with reference to the construction of a H-BIM system for the north transept of Pisa Cathedral.
The semantic enrichment represents a fundamental moment of connection and exchange between the different points of view resulting from the multidisciplinary studies on the asset, starting from original survey data acquired via TLS and photogrammetry. In the broader context of heritage preventive conservation actions, the proposed strategy defines an information management protocol that has at its centre the 3D model of an object or site and that is based on the mutual exchange of information between 2D and 3D annotations.
The proposed approach, applied here to the Pisa Cathedral, can suitably be extended and tested on other case studies of historical buildings and cultural heritage assets in which it is necessary to organize the information related, e.g., to state of the art, material and degradation survey and/or restoration interventions.
The 3D digital twin is assumed as univocal a priori reference for the association of the different descriptive attributes, appropriately tessellated and described by relevant elements identified over the surface (SIPs). Once mapped over a 2D media, semantic annotations can be transferred to the 3D model by means of appropriate 2D/3D projection relationships and by referring to a point-based representation. The approach to such a thematic mapping is illustrated in this work with reference to the generation of a HR orthophoto; however, the method for data transfer could suitably be extended to images and computer-aided drawings, as well as to other 3D models derived from different surveys or performed at different times.
With these expedients, it is possible to bring together, in a common dialogue and exchange process, the different actors of the heritage conservation and documentation process, supporting them with protocols of access, visualization and amendment of data. For the same reason, the inclusion of the constructed model in a broader context, e.g. encompassing the entire Pisa Cathedral or the monuments of Piazza dei Miracoli, is an envisaged development of the research. In this regard, successive levels of subdivision of the reference model can be envisioned, based on the type of semantic information to be associated with it and, consequently, to the desired degree of detail.
Finally, as mentioned above (Section 4.2), the most recent developments in Artificial Intelligence applied to cultural heritage are contributing to render the mechanism of semantic annotation increasingly automatic, and this further raises the need for interpretation, retrieval and exchange of semantically rich digital models. The extension of this approach to images and models derived from semantic segmentation approaches can be envisaged. Current developments of this research also go in this direction.