Progress and perspectives of point cloud intelligence

ABSTRACT With the rapid development of reality capture methods, such as laser scanning and oblique photogrammetry, point cloud data have become the third most important data source, after vector maps and imagery. Point cloud data also play an increasingly important role in scientific research and engineering in the fields of Earth science, spatial cognition, and smart cities. However, how to acquire high-quality three-dimensional (3D) geospatial information from point clouds has become a scientific frontier, for which there is an urgent demand in the fields of surveying and mapping, as well as geoscience applications. To address these challenges, the field of point cloud intelligence has emerged. This paper summarizes the state of the art of point cloud intelligence with regard to acquisition equipment, intelligent processing, scientific research, and engineering applications. For this purpose, we refer to a recent project on the hybrid georeferencing of images and LiDAR data for high-quality point cloud collection, as well as a current benchmark for the semantic segmentation of high-resolution 3D point clouds. These projects were conducted at the Institute for Photogrammetry of the University of Stuttgart, which was initially headed by the late Prof. Ackermann. Finally, the development prospects of point cloud intelligence are summarized.


Introduction
The European Geospatial Industry Outlook Report 1 released in 2018 added three-dimensional (3D) scanning as one of the four major branches of the traditional geospatial industry (Global Navigation Satellite System (GNSS) and positioning, geographic information systems (GIS) and spatial analytics, Earth observation, and 3D scanning), and predicted that the 3D scanning market will become the fastest-growing of the four major areas, which could lead to rapid development in smart cities, intelligent transportation, global mapping, etc. With unparalleled advantages over vector maps and imagery, point cloud data (X, Y, Z, A) have become the third most important spatio-temporal data source after vector maps and imagery. 3D point cloud data are the major source of 3D geographic information and play an irreplaceable role in the accurate description of 3D spaces. How to obtain 3D geographic information quickly and accurately has become a fundamental task in the field of mapping and geographic information (Deren, Jun, and Zhenfeng 2018; Deren 2017). With the rapid development of sensors, semiconductors, and unmanned platforms, reality capture equipment for point cloud big data, represented by laser scanning and oblique photogrammetry, has made great progress in stability, accuracy, ease of use, and intelligence. With this technological advancement, a range of multi-platform/multi-resolution equipment has been developed, including spaceborne, manned/unmanned airborne, vehicle-based, ground-based, backpack, and handheld equipment, which facilitates the acquisition of point cloud big data. The International Society for Photogrammetry and Remote Sensing (ISPRS) has set up a working group on point cloud processing. Meanwhile, the industry has also focused on point cloud processing (e.g. 
the annual International LiDAR Mapping Forum (ILMF) 2 ). However, the industry still cannot meet the demands for the intelligent processing and application of point cloud big data. Point cloud intelligence was born to build a bridge between point cloud big data and scientific research, as well as engineering applications. It is a scientific means of realizing the 3D representation of entities with structure and function from point cloud big data, with the core tasks of point cloud quality enhancement, intelligent 3D information extraction, and on-demand 3D reconstruction. Point cloud intelligence is also a scientific method and tool for scientific research and engineering applications, such as geoscience, information science, smart cities, etc. This paper focuses on the principles of point cloud intelligence, and highlights the research progress and trends from three aspects: 1) point cloud big data acquisition; 2) intelligent processing; and 3) engineering applications. Finally, we give an outlook on the important development directions of point cloud intelligence.

Data acquisition
Acquisition equipment is developing rapidly, encompassing not only the active acquisition equipment represented by laser scanning, but also the passive acquisition equipment represented by oblique photogrammetry. In terms of carrier platforms, multiple platforms now exist, from space to ground, including spaceborne platforms, manned/unmanned airborne platforms, vehicle platforms, ground-based platforms, and portable platforms.
Laser scanning equipment acquires point cloud data by integrating Global Positioning System (GPS)/Inertial Measurement Unit (IMU) hardware and scanners of differing performance on different platforms, to jointly solve for the laser emitter position, attitude, and distance to the target. For spaceborne laser scanning, the ICESat and ICESat-2 satellites were launched by NASA in 2003 and 2018, respectively, and the ZiYuan-3 02 satellite (Xinming, Jiyi, and Guoyuan 2018) was launched by China in 2017. Manned/unmanned airborne laser scanning systems and vehicle laser scanning systems have been the major solutions of Riegl, Optech, Hexagon, and others. Recently, Chinese companies such as Leador Spatial, Surestar, Hi-Target, CHC Navigation, and South Survey have launched a series of airborne, vehicle-based, and backpack laser scanning systems whose performance can rival that of the former systems. A laser bathymetric system measures the depth of water by emitting lasers at two different wavelengths, in the blue and green bands, to obtain the underwater topography. As a result, laser bathymetric systems play an important role in marine mapping and underwater measurement.
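The joint solution of position, attitude, and range described above reduces, in its simplest form, to the direct georeferencing equation: a world point is the platform position plus the rotated body-frame vector of the range measurement and lever arm. The sketch below is a simplified illustration with hypothetical values; attitude is reduced to yaw only, and boresight calibration is omitted.

```python
import numpy as np

def rotation_z(yaw):
    """Rotation about the vertical axis (simplified attitude: yaw only)."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def georeference(range_m, beam_dir, platform_pos, yaw, lever_arm):
    """Direct georeferencing of a single laser range measurement.

    World point = platform position + R(attitude) @ (range * beam direction
    + lever arm). A real system uses full roll/pitch/yaw from the IMU plus
    boresight calibration; this sketch keeps only yaw for brevity.
    """
    beam_dir = np.asarray(beam_dir, dtype=float)
    beam_dir /= np.linalg.norm(beam_dir)      # unit pointing vector, body frame
    body_vec = range_m * beam_dir + np.asarray(lever_arm, dtype=float)
    return np.asarray(platform_pos, dtype=float) + rotation_z(yaw) @ body_vec

# A nadir-looking beam from a platform 100 m above ground:
p = georeference(range_m=100.0, beam_dir=[0, 0, -1],
                 platform_pos=[10.0, 20.0, 100.0], yaw=0.0, lever_arm=[0, 0, 0])
print(p)  # the beam hits the ground at (10, 20, 0)
```

In practice, the GNSS/IMU trajectory and the mounting parameters are themselves estimated and refined, which is exactly what the strip adjustment and hybrid georeferencing methods discussed later address.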
Over the last decade, oblique photogrammetry has developed rapidly. During the flight, the ground is observed from directly above and from tilted directions at the same time, capturing the target texture from the top as well as from different angles. Processing by professional software (such as Match-T from Trimble Inpho, SURE from nFrames, PixelGrid, etc.) can be used to generate dense image point clouds with color information, which complement laser scanning point clouds and have been widely used in 3D urban reconstruction. In addition, with consumer-grade depth cameras, close-range 3D point cloud data can be acquired through a structured light camera, Time-Of-Flight (TOF) camera, or binocular camera. Commercial products have now been released, such as Apple Prime Sense, 3 Microsoft Kinect-1, 4 Intel RealSense, 5 ZED, 6 and Bumblebee. 7 With the increasing demand for the granularity and connotation of geospatial information, the content of point cloud data has moved from mainly geometric information to simultaneous geometric, spectral, and textural information (e.g. multispectral laser scanning systems (Virtanen et al. 2017)). In terms of the scanning mode, spindle-type scanning has given way to optical phased array/single-photon LiDAR, which has wide application potential in remote sensing and is becoming a future trend of active Earth observation (Li et al. 2018). In terms of the acquisition platform, specialized equipment has given way to diversified consumer-level intelligent equipment. As sensors become smaller, lighter, and less expensive, consumer-grade, portable, and integrated smart scanning equipment is now booming (Hemin, Ruofei, and Donghai 2019; Li et al. 2019). The U.S. 
Defense Advanced Research Projects Agency (DARPA) has developed an autonomous collaborative scanning system in which ground and air robots scan unknown environments, supported by Simultaneous Localization And Mapping (SLAM) technology and robot control planning, which greatly reduces the human cost and solves the problem of the inability to operate in hazardous and special environments (Kelly et al. 2006). In the photogrammetry and remote sensing field, a current trend is the joint acquisition, georeferencing, and interpretation of optical imagery and LiDAR data (Haala et al. 2022; Koelle et al. 2021).
The diversity of point cloud acquisition platforms and methods leads to huge differences, and even conflicts, in the sampling granularity, quality, and expression of point clouds. Platform-oriented point cloud processing methods cannot effectively fuse multi-platform point clouds to achieve complementarity. Thus, there is an urgent need to develop point cloud intelligence, to provide scientific decisions and means for the intelligent understanding of the scenes described by point cloud big data.

Methodology
For the intelligent processing of point cloud data, the current algorithms mainly focus on data quality enhancement (denoising, shape completion, etc.), registration, segmentation, surface reconstruction, etc., to provide the data basis for subsequent engineering applications. In this section, we summarize the key techniques of intelligent point cloud processing from the above aspects.

Denoising
Due to the limitations of the acquisition equipment and the influence of the acquisition environment, the point cloud data obtained by a 3D scanner are often accompanied by a large amount of noise and outliers (Roveri et al. 2018), which bring additional challenges to the intelligent understanding of the point cloud scene. To address this problem, more and more researchers have begun to study how to restore clean point cloud data from noisy point cloud data. After reviewing the previous research, we categorize the point cloud denoising methods into two types: optimization-based and deep learning-based.
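As a minimal illustration of how outliers are typically stripped before denoising proper, the sketch below implements the classic statistical outlier removal filter (mean k-nearest-neighbor distance thresholding, in the style popularized by libraries such as PCL and Open3D); the data and parameters here are illustrative, not from any cited method.

```python
import numpy as np

def statistical_outlier_removal(points, k=8, std_ratio=2.0):
    """Keep a point if its mean distance to its k nearest neighbors is within
    `std_ratio` standard deviations of the global mean of that quantity.

    Brute-force O(n^2) distances for clarity; real pipelines use a k-d tree.
    """
    pts = np.asarray(points, dtype=float)
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    knn_mean = np.sort(d, axis=1)[:, :k].mean(axis=1)  # mean k-NN distance
    threshold = knn_mean.mean() + std_ratio * knn_mean.std()
    return pts[knn_mean <= threshold]

rng = np.random.default_rng(0)
cloud = rng.normal(size=(200, 3)) * 0.01           # dense cluster near the origin
cloud = np.vstack([cloud, [[5.0, 5.0, 5.0]]])      # one far-away outlier
clean = statistical_outlier_removal(cloud, k=8)
print(len(cloud), "->", len(clean))                # 201 -> 200
```

The isolated point has a much larger mean neighbor distance than the cluster points and is removed, while all cluster points survive.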

Optimization-based denoising methods
The optimization-based methods eliminate noise by defining an optimization target. For example, Alexa et al. (2003) proposed using simple geometric functions to approximate the least-squares surface, and then defined an optimal projection operator to project points onto the surface. However, this approach is susceptible to outliers. Avron et al. (2010) proposed using sparse regularization to solve the optimization problem and reconstruct normals, and then updated the coordinates of the noise points according to the reconstructed normals. Subsequently, Schoenenberger, Paratte, and Vandergheynst (2015) used a graph structure to capture the structural information of point cloud data by defining nodes and edges, and then applied a graph Laplacian filter and other filters for the denoising. Zaman, Wong, and Ng (2017) approximated the noisy point cloud by modeling the distribution of points, using techniques such as kernel density estimation. However, the biggest problem of these optimization-based methods is their heavy reliance on prior knowledge, such as the basic geometric structure of the point cloud or assumptions about the noise distribution.
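The graph-based filtering idea above can be sketched as iterative Laplacian smoothing on a k-nearest-neighbor graph. This is a simplified stand-in for the cited methods, not any author's exact algorithm; the noisy plane is synthetic.

```python
import numpy as np

def knn_laplacian_smooth(points, k=6, lam=0.5, iters=10):
    """Iterative graph Laplacian smoothing on a k-NN graph.

    Each point is pulled toward the centroid of its k nearest neighbors, a
    discrete version of the Laplacian filtering used by graph-based denoisers.
    Brute-force neighbor search for clarity.
    """
    pts = np.asarray(points, dtype=float).copy()
    for _ in range(iters):
        d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)
        nbr = np.argsort(d, axis=1)[:, :k]         # k nearest neighbor indices
        centroids = pts[nbr].mean(axis=1)
        pts = (1.0 - lam) * pts + lam * centroids  # move toward local centroid
    return pts

# Noisy samples of the plane z = 0:
rng = np.random.default_rng(1)
xy = rng.uniform(-1, 1, size=(300, 2))
noisy = np.column_stack([xy, rng.normal(scale=0.05, size=300)])
smoothed = knn_laplacian_smooth(noisy)
print(np.abs(smoothed[:, 2]).mean() < np.abs(noisy[:, 2]).mean())  # True
```

The averaging suppresses the independent per-point noise in z, but it also illustrates the weakness noted in the text: the same averaging would flatten genuine fine detail, which is why prior knowledge about structure matters.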

Deep learning-based denoising methods
With the rapid development of deep learning, researchers have begun to use this data-driven approach to deal with noisy point clouds. For example, Roveri et al. (2018) proposed PointProNets, which introduced a neural network into the point cloud denoising task for the first time. PointProNets takes sparse and noisy point cloud data as input. The point cloud is first divided into multiple patches, and these patches are then projected onto height maps. The final clean point cloud is then obtained by back-projection to 3D coordinates. In this method, part of the point cloud information is lost due to the projection, which limits the denoising capability of the network. Subsequently, PointCleanNet (Rakotosaona et al. 2020) was proposed, which uses PointNet (Qi et al. 2017) as the backbone to directly perform convolution operations on unstructured point cloud data, and realizes the denoising task by predicting the offset of a noisy point back to the surface of the clean point cloud. In view of the difficulty of obtaining the ground truth of 3D point clouds, Hermosilla, Ritschel, and Ropinski (2019) proposed an unsupervised point cloud denoising method named Total Denoising, which directly regresses the approximate surface from the distribution of the noisy point cloud. However, Luo and Hu (2020) argued that PointCleanNet does not directly restore the real surface, leading to suboptimal denoising results. They therefore proposed DMRDenoise, which uses differentiable pooling to downsample the input; the manifold of the point cloud is estimated and the final output is obtained by resampling. This approach can achieve good results, but due to the introduction of the down-sampling scheme, it can easily result in a loss of details. Pistilli et al. 
(2020) proposed GPDNet, which effectively improves the robustness of the network by using a graph neural network. Recently, Luo and Hu (2021) proposed a new score-based point cloud denoising algorithm, which uses the gradient ascent technique to iteratively move points toward the underlying surface by estimating the score of each point. Chen et al. (2022) proposed RePCD-Net, a multi-scale point cloud denoising scheme based on a Recurrent Neural Network (RNN), which iteratively processes point clouds in different denoising stages, reaching the current state-of-the-art level. However, due to the lack of prior knowledge, it is very hard for the deep learning based approaches to distinguish between detailed structure and noise, resulting in excessive smoothing.

Future works
Possible future research directions include: 1) Fusion of prior information. Point cloud denoising is a restoration task analogous to image restoration, and is essentially an ill-posed problem with a one-to-many relationship. Combining optimization-based and deep learning-based methods to integrate prior information into the model is the key to improving performance. 2) Realistic noise modeling. Most of the existing methods assume that the noise obeys a Gaussian distribution. However, in a real scene, the noise of a point cloud is often composed of multiple distributions, which limits the effectiveness of the existing methods. 3) Large-scene application. Most of the existing methods are limited to object-level point clouds and cannot be directly applied to large-scale scenarios.

Completion
Point clouds obtained by 3D scanning devices are often incomplete, and thus require completion before downstream tasks. Given a partial point cloud observation, the goal of point cloud completion is to recover the complete 3D shape.

Traditional shape completion methods
The traditional point cloud completion methods can be grouped into two categories: geometry-based methods and alignment-based methods, which respectively utilize the geometric attributes of the objects (Berger et al. 2014; Davis et al. 2002; Hu, Fu, and Guo 2019; Mitra, Guibas, and Pauly 2006) and the retrieval of complete structures from a database (Felzenszwalb et al. 2010; Gupta et al. 2015; Han and Zhu 2008; Li et al. 2015). However, these methods cannot robustly generalize to cases of complex 3D surfaces with large missing parts.

3D shape completion with pairwise supervision
Researchers began to leverage deep learning based methods for 3D shape completion with the development of deep neural network models. Early works focused on 3D voxel grids (Dai, Ruizhongtai Qi, and Nießner 2017; Wang et al. 2017), but were limited by the computational cost, which increases cubically with the shape resolution.
On the other hand, since PointNet (Qi et al. 2017) and its subsequent studies (Qi et al. 2017; Wang et al. 2019) solved the disorder problem of point cloud data, many point cloud based methods have emerged in recent years.
The Point Completion Network (PCN) (Yuan et al. 2018) was the first point cloud based deep neural completion network. It uses an encoder similar to PointNet (Qi et al. 2017) to extract a global feature with several Multi-Layer Perceptrons (MLPs) and a max pooling operation, and then employs a decoder to infer the complete point cloud from the global feature.
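The permutation-invariant encoder that PCN borrows from PointNet can be sketched in a few lines; the weights below are random placeholders rather than trained parameters, and the layer sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical weights of a two-layer shared MLP (3 -> 16 -> 32), applied per point.
W1, b1 = rng.normal(size=(3, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 32)), np.zeros(32)

def pointnet_encode(points):
    """PointNet-style encoder: shared MLPs followed by max pooling.

    The per-point MLP is applied identically to every point, and max pooling
    over the point dimension yields a global feature that is invariant to the
    input ordering -- the key to handling unordered point sets.
    """
    h = np.maximum(points @ W1 + b1, 0.0)   # shared MLP layer 1 (ReLU)
    h = np.maximum(h @ W2 + b2, 0.0)        # shared MLP layer 2 (ReLU)
    return h.max(axis=0)                    # symmetric max pool -> global feature

cloud = rng.normal(size=(128, 3))
feat = pointnet_encode(cloud)
shuffled = cloud[rng.permutation(len(cloud))]
print(np.allclose(feat, pointnet_encode(shuffled)))  # True: order-invariant
```

Because max pooling is a symmetric function, any permutation of the input rows yields exactly the same global feature, which PCN's decoder then expands into a complete point cloud.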
More recent works have tried to preserve the observed geometric details from the local features of incomplete inputs, following a coarse-to-fine strategy. The NSFA method (Zhang, Yan, and Xiao 2020) reconstructs the missing parts separately. VRC-Net (Pan et al. 2021) is based on a variational framework that leverages the relationship between structures during the completion process. PMP-Net (Wen et al. 2021) accomplishes the completion task by learning point moving paths. SnowflakeNet (Xiang et al. 2021) introduces snowflake point deconvolution with a skip-transformer for point cloud completion. ASFM-Net (Xia et al. 2021) employs an asymmetrical Siamese feature matching strategy and an iterative refinement unit to generate complete shapes.
There are also some recent networks that use a voxel-based completion process.For example, GRNet (Xie et al. 2020) is based on a gridding network for dense point reconstruction, and VE-PCN (Wang, Ang, and Lee 2021) develops a voxel-based network for point cloud completion by leveraging edge generation.

3D shape completion without pairwise supervision
On the other hand, the paired ground truth of real scans is difficult to obtain, so a few studies on unpaired shape completion have been conducted.
The Amortized Maximum Likelihood (AML) approach (Stutz and Geiger 2018) uses the maximum likelihood method to measure the distance between complete and incomplete point clouds in the latent space. Pcl2Pcl (Chen, Chen, and Mitra 2019) pretrains two auto-encoders and directly learns the mapping from partial shapes to complete shapes in the latent space. Cycle4Completion (Wen et al. 2021) introduces two cycle transformations to establish the geometric correspondence between incomplete and complete shapes in both directions. ShapeInversion (Zhang et al. 2021) incorporates a well-trained Generative Adversarial Network (GAN) as an effective prior for shape completion. Cai et al. (2022) established a unified and structured latent space to achieve geometric consistency between partial and complete shapes and accurate shape completion.

Future works
Following the above review, future works will likely focus on several directions: 1) The existing completion networks still struggle to preserve fine details, especially thin structures such as wires. 2) Real-time completion is important for driverless vehicles and other applications; the general deep learning based methods for point cloud completion are faster than the traditional methods. 3) The current generalization ability of these methods is not sufficient, and it is difficult to extend them to scene completion.

Registration
Estimating the geometrical transformations of unaligned point clouds, which is known as point cloud registration, is a fundamental task for many downstream fields, such as 3D reconstruction (Izadi et al. 2011), autonomous driving (Caesar et al. 2020; Zeng et al. 2018), and augmented reality and virtual reality (AR/VR) (Gao et al. 2019; Zhang, Dai, and Sun 2020). The existing registration methods can be grouped into pairwise and multi-view methods (Dong et al. 2020).

Multi-view registration
Compared with pairwise registration, multi-view registration has drawn less attention, especially in the deep learning based field, and can be classified into sequential registration and joint registration algorithms (Dong et al. 2020; Gojcic et al. 2020; Grisetti et al. 2011). The main challenge for both categories lies in building a robust registration path or scene graph and resolving the ambiguous cases arising from pairwise registrations.
Sequential strategies such as minimum spanning tree based methods (Kruskal 1956; Zhu et al. 2016; Yang et al. 2016), shape growing based methods (Mian, Bennamoun, and Owens 2006; Ge and Hu 2020), and hierarchical merging based methods (Dong et al. 2018; Tang and Feng 2015) merge the scans incrementally by local pairwise registrations, following the recovered registration path. However, a general drawback of the sequential methods is the accumulated error from the pairwise registrations.
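A minimum spanning tree based registration path can be illustrated as follows; the pairwise cost matrix is hypothetical, standing in for a measure such as one minus the overlap ratio between two scans.

```python
import numpy as np

def registration_path_mst(cost):
    """Recover a registration path as a minimum spanning tree (Prim's algorithm).

    `cost[i, j]` is a hypothetical pairwise registration cost between scans i
    and j; the MST links all scans through the most reliable pairwise
    registrations, as in MST-based sequential strategies.
    """
    n = len(cost)
    in_tree = [0]
    edges = []
    while len(in_tree) < n:
        best = None
        # find the cheapest edge crossing from the tree to a new scan
        for i in in_tree:
            for j in range(n):
                if j not in in_tree and (best is None or cost[i][j] < best[2]):
                    best = (i, j, cost[i][j])
        edges.append((best[0], best[1]))
        in_tree.append(best[1])
    return edges

# Four scans; scan 3 overlaps well only with scan 2:
cost = np.array([[np.inf, 0.2, 0.9, 0.9],
                 [0.2, np.inf, 0.3, 0.8],
                 [0.9, 0.3, np.inf, 0.1],
                 [0.9, 0.8, 0.1, np.inf]])
print(registration_path_mst(cost))  # [(0, 1), (1, 2), (2, 3)]
```

Each MST edge is then solved by a pairwise registration, which makes the drawback stated above concrete: the errors of the edges along a path accumulate toward the leaves.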
Thus, joint registration methods have been proposed, based on global energy optimization (Zhou, Park, and Koltun 2016; Theiler, Wegner, and Schindler 2015; Ge, Hu, and Wu 2019), motion averaging algorithms (Govindu 2004; Shih, Chuang, and Yu 2008; Arrigoni, Rossi, and Fusiello 2016), and graph synchronization (Gojcic et al. 2020; Huang et al. 2019), although these methods do tend to be afflicted by local minima over a large search space.

Hybrid georeferencing of LiDAR and image data
In order to compute the respective 3D points from LiDAR range measurements, the sensor platform typically integrates a GNSS/IMU unit to provide the required position and attitude of the system. If suitable calibration of the sensor system is achieved, 3D point accuracies at the centimeter level are feasible (Liu et al. 2021a). Typically, LiDAR strip adjustment further improves the trajectory while minimizing the remaining offsets between the point clouds of different flight strips in overlapping areas (Cramer et al. 2018; Ressl, Kager, and Mandlburger 2008; Shan and Toth 2018). For this purpose, a sophisticated calibration procedure is applied, while algorithms such as the well-known ICP algorithm (Besl and McKay 1992) minimize the discrepancies within the overlapping area of the flight strip pairs. By extending this traditional LiDAR strip adjustment with additional observations from the bundle adjustment of image blocks, the so-called hybrid georeferencing of LiDAR and aerial images becomes feasible (Glira, Pfeifer, and Mandlburger 2016). Usually, bundle block adjustment is used to estimate the respective camera parameters from the corresponding pixel coordinates of overlapping images, while the object coordinates of these tie points are a byproduct. Within the hybrid orientation approach, the tie points' object coordinates from automatic aerial triangulation are re-used to establish the correspondences between the LiDAR data and the image block, while the resulting discrepancies are minimized within a global adjustment procedure. In this respect, hybrid orientation adds additional observations, and thus constraints, to LiDAR strip adjustment. During hybrid adjustment, both the laser scanner and camera can be fully re-calibrated by estimating their interior calibration and mounting parameters (lever arm, boresight angles). Furthermore, the systematic measurement errors of the flight trajectory can be corrected individually for each flight strip. Figure 1 gives an example of the 
combination of LiDAR points with photogrammetric tie points during the hybrid adjustment of Unmanned Aerial Vehicle (UAV) imagery and LiDAR data. In this project, which was aimed at the ultra-high-precision collection of 3D point clouds for deformation monitoring, dense 3D point clouds at sub-centimeter accuracies were collected using a UAV platform (Haala et al. 2022).
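The ICP algorithm referenced above (Besl and McKay 1992) can be sketched in its basic point-to-point form. This is illustration only, on synthetic grid data with a small known misalignment; real strip adjustment adds robust weighting, point-to-plane metrics, and trajectory modeling.

```python
import numpy as np

def icp_point_to_point(src, dst, iters=10):
    """Minimal point-to-point ICP: alternate nearest-neighbor matching with
    the closed-form SVD (Kabsch) solution for the rigid transform."""
    R, t = np.eye(3), np.zeros(3)
    cur = np.asarray(src, dtype=float).copy()
    for _ in range(iters):
        d = np.linalg.norm(cur[:, None, :] - dst[None, :, :], axis=-1)
        match = dst[d.argmin(axis=1)]                        # nearest neighbors
        mu_s, mu_d = cur.mean(axis=0), match.mean(axis=0)
        H = (cur - mu_s).T @ (match - mu_d)                  # cross-covariance
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])   # guard vs. reflection
        R_step = Vt.T @ D @ U.T
        t_step = mu_d - R_step @ mu_s
        cur = cur @ R_step.T + t_step
        R, t = R_step @ R, R_step @ t + t_step
    return R, t

# Synthetic target: a regular grid, misaligned by a small known rigid motion.
g = np.arange(-1.0, 1.01, 0.5)
dst = np.array(np.meshgrid(g, g, g)).reshape(3, -1).T
a = 0.05
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a), np.cos(a), 0.0],
               [0.0, 0.0, 1.0]])
src = (dst - [0.02, 0.01, 0.0]) @ Rz   # so that dst = src @ Rz.T + [0.02, 0.01, 0]
R, t = icp_point_to_point(src, dst)
print(np.allclose(src @ R.T + t, dst, atol=1e-6))  # True
```

Because the misalignment is small relative to the grid spacing, the nearest-neighbor correspondences are correct from the first iteration and the closed-form step recovers the exact transform; in strip adjustment the same minimization runs over the overlapping areas of flight strip pairs.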

Future works
From the above literature review, the possible future directions include: 1) designing (deep) features that are efficient, distinct, generalizable, and robust to noise, point density, and random transformations; 2) designing multi-view registration strategies that are robust to error accumulation and local minima; 3) designing fully differentiable end-to-end deep pairwise or multi-view registration models balancing accuracy and efficiency; and 4) designing robust and efficient algorithms with high generalization capability for the registration of cross-modal multi-temporal point clouds collected in different ways and at different times.

Segmentation
Point cloud semantic/instance segmentation, as a hot and rapidly developing research topic, connects original remote sensing data to high-level scene understanding. Typically, point cloud segmentation methods learn the point distribution patterns from annotated datasets and make their predictions accordingly. In this section, the previous works are roughly categorized into the following two groups.

Traditional machine learning based methods
Since the popularization of laser scanning, researchers have made great progress in segmenting large-scale point clouds by following the traditional machine learning paradigm (Ma et al. 2018; Che, Jung, and Olsen 2019; Yang et al. 2015). For these methods, the key problem is how to define the feature calculation units and design appropriate feature descriptors, so that a machine learning point classifier can be trained accordingly (Dong et al. 2017; Landrieu et al. 2017; Zheng, Wang, and Xu 2016; Guan et al. 2016). For example, Weinmann et al. (2015) presented a versatile per-point classification framework and analyzed various neighborhood units, feature descriptors, and feature combinations, to identify the most discriminative features. Alternatively, Yu et al. (2016) clustered points into supervoxels as homogeneous point groups and mapped each group to a contextual visual word. Considering the point-region-instance feature hierarchy, Yang et al. (2017) incorporated multiple levels of features and contextual information. Li et al. (2019) further achieved component-level road object semantic segmentation, using a machine learning classifier for pole attachment recognition. Although these machine learning based segmentation methods have contributed a lot to various applications, especially road object extraction, the over-reliance on handcrafted features limits their generalization performance in complex scenes (Zhou et al. 2022).
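The covariance-based (eigenvalue) features analyzed in per-point classification frameworks such as that of Weinmann et al. (2015) can be sketched as follows; the planar test patch is synthetic, and a real pipeline computes these descriptors per neighborhood rather than per cloud.

```python
import numpy as np

def eigen_features(points):
    """Covariance-based shape descriptors for a point neighborhood.

    From the eigenvalues l1 >= l2 >= l3 of the 3D structure tensor,
    linearity, planarity, and scattering describe whether the local
    geometry is line-like, plane-like, or volumetric.
    """
    pts = np.asarray(points, dtype=float)
    cov = np.cov(pts.T)
    l1, l2, l3 = sorted(np.linalg.eigvalsh(cov), reverse=True)
    return {"linearity": (l1 - l2) / l1,
            "planarity": (l2 - l3) / l1,
            "scattering": l3 / l1}

# A thin planar patch: large spread in x/y, tiny spread in z.
rng = np.random.default_rng(7)
plane = np.column_stack([rng.uniform(-1, 1, size=(500, 2)),
                         rng.normal(scale=0.01, size=500)])
f = eigen_features(plane)
print(f["planarity"] > max(f["linearity"], f["scattering"]))  # True
```

A classifier such as a random forest is then trained on vectors of such handcrafted descriptors, which is exactly the step that deep learning based methods replace with learned features.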

Deep learning based methods
Instead of treating the feature description and classification as separate steps, deep learning based point cloud segmentation methods train a neural network both to encode the point features and to make predictions (Guo et al. 2020; Li et al. 2020). Point clouds are unorganized and irregular, so designing elegant point cloud encoders or effective backbones has always been a major research topic in deep learning based semantic segmentation (Qi et al. 2017; Graham, Engelcke, and Van Der Maaten 2018; Thomas et al. 2019; Zhao et al. 2021; Boulch 2020; Zhu et al. 2021; Xu et al. 2020). The RandLA-Net method achieves efficient semantic segmentation for very large scale point clouds, based on its random sampling and feature aggregation strategy (Hu et al. 2020). Xu and Lee (2020) exploited and modeled the implicit relationship between labeled and unlabeled points, and thus proposed to segment point clouds by weakly supervised learning. Furthermore, point cloud semantic instance segmentation requires more discriminative features (Wang et al. 2018; Yang et al. 2019; Jiang et al. 2020; Han et al. 2020). The Associatively Segmenting Instances and Semantics (ASIS) method associates the semantic and instance segmentation tasks and encourages them to cooperate with each other and exploit more information, leading to simultaneous semantic and instance label prediction (Wang et al. 2019). Hou et al. (2021) investigated pre-training based on both point pairs and spatial contexts, and accordingly achieved instance segmentation with limited annotation.

Future works
With research in 3D computer vision and computer graphics flourishing, point cloud segmentation solutions face new challenges and opportunities. 1) Novel point cloud network designs can gain inspiration from the traditional methods, including the over-segmentation strategy or explicit geometric feature description. 2) The ability to learn in semi- or self-supervised manners should be further exploited, since annotating full point cloud labels is very laborious and expensive. 3) Jointly segmenting point clouds and reconstructing 3D models has been explored as a promising scene understanding domain, because these tasks help the networks to holistically learn and leverage appearance, geometric, and semantic information. An example of the current focus and state-of-the-art in the semantic segmentation of 3D point clouds is the Hessigheim 3D (H3D) benchmark (Koelle et al. 2021). This benchmark provides labeled high-resolution 3D point clouds as training and test data. Furthermore, textured meshes, as well as multiple epochs, are additionally available.

Surface reconstruction
Given an unstructured point cloud set P obtained by a 3D sensing device, the target of surface reconstruction is to recover the underlying continuous surface S. Considering that obtaining a continuous surface from discrete point clouds is an ill-posed problem, it is necessary to add appropriate regularization to recover the surface S. According to the kind of prior behind the regularization, the existing methods can be categorized into triangulation-based methods, implicit methods, and deep learning-based methods.

Triangulation-based methods
These methods (Cohen-Steiner and Da 2004; Bernardini et al. 1999; Edelsbrunner and Shah 1994; Mostegel et al. 2017; Labatut, Pons, and Keriven 2009; Boissonnat 1984; Jancosek and Pajdla 2014) represent the local surface by triangular planes under a piecewise-linear assumption, which directly produces a mesh result. In general, these methods first generate a triangle candidate set from the observed point set P and then select the optimal subset to form the final surface. Edelsbrunner and Shah (1994) proposed first constructing tetrahedra via the Delaunay triangulation method (Boissonnat 1984), and then utilizing a graph-cut algorithm to classify the inside/outside property of each tetrahedron, so that the surface can be represented by the triangles shared between inside and outside tetrahedra. Furthermore, the greedy Delaunay algorithm (Cohen-Steiner and Da 2004) adopts a greedy strategy with a topological constraint for selecting the subset of triangles sequentially. In addition, the Ball-Pivoting Algorithm (BPA) (Bernardini et al. 1999) rolls balls of various radii over P to generate the triangular faces of the reconstructed surface.
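The piecewise-linear surface idea can be illustrated in the gridded 2.5D special case, where the triangle candidates are trivial; irregular point sets require the Delaunay or ball-pivoting machinery cited above, so this sketch is a simplification for illustration only.

```python
import numpy as np

def grid_triangulation(nx, ny):
    """Triangle connectivity for an nx-by-ny grid of 2.5D points.

    Each grid cell is split into two triangles, yielding a piecewise-linear
    surface directly over the points. This is the gridded special case; for
    irregular scans the candidate triangles come from Delaunay triangulation
    (Boissonnat 1984) or ball pivoting (Bernardini et al. 1999).
    """
    tris = []
    for r in range(ny - 1):
        for c in range(nx - 1):
            i = r * nx + c
            tris.append((i, i + 1, i + nx))           # lower-left triangle
            tris.append((i + 1, i + nx + 1, i + nx))  # upper-right triangle
    return np.array(tris)

# A 3x3 height field -> 4 cells -> 8 triangles:
x, y = np.meshgrid(np.arange(3.0), np.arange(3.0))
z = np.sin(x) * np.cos(y)
verts = np.column_stack([x.ravel(), y.ravel(), z.ravel()])
faces = grid_triangulation(3, 3)
print(len(verts), len(faces))  # 9 8
```

The vertex/face pair produced here is exactly the mesh representation discussed later in the paper: a graph of vertices, edges, and faces with explicit adjacency.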

Implicit methods
Under the assumption of global or local smoothness, these methods (Levin 1998; Hoppe et al. 1992; Kazhdan and Hoppe 2013; Atzmon and Lipman 2020a, 2020b; Carr et al. 2001; Kazhdan, Bolitho, and Hoppe 2006; Baorui et al. 2021; Gropp et al. 2020) utilize a continuous function to represent the implicit field (e.g. the Signed Distance Function (SDF) or an indicator function), and the final surface can be extracted by marching cubes (Lorensen and Cline 1987). Specifically, the Implicit Moving Least Squares (IMLS) method (Levin 1998) utilizes a moving least squares scheme to represent the local surface from the given oriented point cloud under a local smoothness prior. In addition, Carr et al. (2001) adopted radial basis functions to represent the global implicit function, which assumes that the surface satisfies the property of global smoothness. Furthermore, the Screened Poisson Surface Reconstruction (SPSR) method (Kazhdan and Hoppe 2013) combines the global and local properties to solve the Poisson equation, which can achieve high-fidelity reconstruction results. Other recent works (Atzmon and Lipman 2020a, 2020b; Baorui et al. 2021; Gropp et al. 2020) have explored neural implicit functions to reconstruct accurate surfaces directly from raw point clouds.
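A minimal implicit field, in the spirit of the signed distance construction of Hoppe et al. (1992), assigns each query point the signed distance to the tangent plane of its nearest oriented sample; the sphere samples below are synthetic, and a full pipeline would evaluate this field on a grid and run marching cubes on the zero level set.

```python
import numpy as np

def signed_distance(query, points, normals):
    """Signed distance to the tangent plane of the nearest oriented sample.

    f(x) = n_i . (x - p_i) for the nearest sample p_i. The surface is the
    zero level set, which marching cubes (Lorensen and Cline 1987) would
    extract in a full reconstruction pipeline.
    """
    q = np.asarray(query, dtype=float)
    i = np.argmin(np.linalg.norm(points - q, axis=1))  # nearest sample point
    return float(np.dot(normals[i], q - points[i]))

# Oriented samples of the unit sphere: position and normal coincide.
rng = np.random.default_rng(9)
dirs = rng.normal(size=(2000, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
points, normals = dirs, dirs

outside = signed_distance([0.0, 0.0, 2.0], points, normals)
inside = signed_distance([0.0, 0.0, 0.5], points, normals)
print(outside > 0, inside < 0)  # True True
```

The sign convention (positive outside, negative inside) is what allows the surface to be recovered as the zero crossing of the field.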

Deep learning-based methods
Differing from the aforementioned methods, these methods (Peng et al. 2021; Mostegel et al. 2017; Erler, Guerrero, and Ohrhallinger 2020; Mescheder et al. 2019; Jiang et al. 2020; Genova et al.; Chen and Zhang 2019; Liao, Donne, and Geiger 2018; Liu et al. 2021b; Peng et al. 2020) try to leverage a data prior from existing 3D datasets, which enables the reconstruction of an accurate surface, even from point clouds with high noise or from partial inputs. Some of the early works (Erler, Guerrero, and Ohrhallinger 2020; Genova et al.; Peng et al. 2020) proposed learning a global shape prior encoded by a neural network, but these methods have difficulty in generalizing to unseen objects or large scenes. To enhance the generalization of the neural network, the following works (Jiang et al. 2020; Park et al. 2019) proposed learning a local prior instead. Furthermore, Point2Surf (Peng et al. 2020) uses a patch-based method to achieve more accurate prediction. Meanwhile, some methods (Peng et al. 2021; Erler et al.) attempt to combine the learned prior with traditional methods. For example, IMLSNet (Erler et al.) utilizes a neural network to predict the point-wise offset and normal for the given noisy point clouds, and utilizes traditional IMLS (Levin 1998) to reconstruct the surface. In general, the learning-based methods can obtain promising results for noisy and partial point clouds, but it is still an open question as to how an optimal learning scheme utilizes the data prior.

Combining LiDAR and multi-view-stereo
While photogrammetry and LiDAR originally developed as competing techniques, recent systems have exploited their complementary properties for 3D object reconstruction. Thus, the integrated capture and evaluation of airborne images and LiDAR data can generate 3D point clouds of a very high quality, with increased completeness, reliability, robustness, and accuracy. As a matter of principle, Multi-View Stereo (MVS) point clouds feature a high-resolution capability. Since the accuracy of MVS point clouds directly corresponds to the ground sampling distance, suitable image resolutions even enable accuracies in the sub-centimeter range. However, MVS requires object points to be visible in at least two images. Complex 3D structures are likely to violate this condition, which severely hampers their reconstruction. Problems may specifically occur for objects in motion, such as vehicles and pedestrians, or in very narrow urban canyons, due to occlusions. In contrast, a single LiDAR measurement is sufficient for point determination, due to the polar measurement principle of LiDAR sensors. The lower requirements on visibility are advantageous for the reconstruction of complex 3D structures, urban canyons, or objects that change their appearance rapidly when seen from different positions. Another advantage of LiDAR is the potential to measure multiple responses of the reflected signal, which enables the penetration of semi-transparent objects such as vegetation. However, LiDAR data do not provide color information.
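The claim that MVS accuracy scales with the ground sampling distance can be made concrete with the standard normal-case stereo error model, σ_Z = Z²/(B·f)·σ_px; the sketch below uses illustrative UAV-type numbers, not values from the project described later.

```python
def stereo_depth_precision(z, baseline, focal_px, sigma_px=0.5):
    """Normal-case stereo error model: sigma_Z = Z^2 / (B * f) * sigma_px.

    z         -- object distance / flying height [m]
    baseline  -- stereo base between exposures [m]
    focal_px  -- focal length expressed in pixels
    sigma_px  -- image matching precision [pixels]
    """
    return z**2 / (baseline * focal_px) * sigma_px

def ground_sampling_distance(z, focal_px):
    """GSD: footprint of one pixel on the ground [m]."""
    return z / focal_px

# Illustrative UAV flight: 50 m height, 20 m base, focal length of 8000 pixels
sigma_z = stereo_depth_precision(50.0, 20.0, 8000.0)   # ~0.8 cm depth precision
gsd = ground_sampling_distance(50.0, 8000.0)           # ~0.6 cm per pixel
```

With these (assumed) parameters, the expected depth precision is on the order of the GSD, i.e. sub-centimeter, which is consistent with the statement above; halving the flying height improves both figures further.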
The advantages of combining MVS with LiDAR point measurements are demonstrated in Figure 2. The left part depicts the textured mesh generated from MVS only, while the hybrid mesh integrating MVS points and LiDAR points is depicted on the right. So far in this paper, our investigations have been based on 3D point clouds as unordered sets of points. In contrast, Figure 2 depicts a 3D mesh as an alternative representation. Such meshes are graphs consisting of vertices, edges, and faces that provide explicit adjacency information. The main differences between meshes and point clouds are the availability of high-resolution textures and the reduced number of entities. The meshes presented in Figure 2 were generated with the SURE software from nFrames (Glira, Pfeifer, and Mandlburger 2019), based on data captured from a UAV platform (Haala et al. 2022). As is visible, the incorporation of LiDAR points enhances the reconstructed 3D data substantially. For example, the top of the church and the vegetation show more geometric detail when LiDAR data are integrated.
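The explicit adjacency that distinguishes a mesh from an unordered point set can be illustrated with a minimal sketch (a hypothetical helper, not the data model of any particular software): triangle faces imply an undirected edge set and per-vertex neighborhoods, which a bare point cloud does not carry.

```python
from collections import defaultdict

def build_adjacency(faces):
    """Derive explicit adjacency from triangle faces.

    faces -- iterable of (i, j, k) vertex-index triples.
    Returns (edges, neighbors): the undirected edge set and,
    per vertex, the set of vertices it shares an edge with.
    """
    edges = set()
    neighbors = defaultdict(set)
    for i, j, k in faces:
        for a, b in ((i, j), (j, k), (k, i)):
            edges.add((min(a, b), max(a, b)))   # store edges undirected
            neighbors[a].add(b)
            neighbors[b].add(a)
    return edges, neighbors

# Two triangles sharing the edge (1, 2): a minimal "mesh"
faces = [(0, 1, 2), (1, 3, 2)]
edges, neighbors = build_adjacency(faces)
# 5 unique edges; vertex 1 is adjacent to vertices 0, 2, and 3
```

Operations such as texture mapping or geometric smoothing rely on exactly this kind of connectivity, which is why the meshed representation in Figure 2 is more than a visualization convenience.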

Applications
Point cloud intelligence has achieved good results in 3D information extraction and modeling, and has been widely used in scientific research and engineering, such as geospatial informatics research, underground space development and utilization, smart cities, new basic surveying and mapping, and infrastructure health monitoring, as shown in Figure 3.
In geo-information science, point cloud intelligence can accurately portray the 3D morphological structure of vegetation, glaciers, islands, and the surrounding underwater terrain, providing vital support for global forest accumulation, biomass estimation, global glacial material balance, marine economic development/management, and sea defense security.
In terms of smart cities and "realistic 3D China", point cloud intelligence is playing an increasingly important role in fine urban management, 3D change detection, urban security analysis, etc. Through the intelligent integration of data, structure, and function as a whole, the 3D geometric information, rich semantic information, and accurate spatial relationships between indoor and outdoor, above and below ground, and above and below water can be expressed in an integrated manner, to achieve on-demand multi-level-of-detail modeling and provide a comprehensive all-space, dynamic, and static information guarantee for complex cities.
For the comprehensive development and utilization of underground spaces, point cloud intelligence can provide support for digital construction, Building Information Modeling (BIM), underground disaster detection, early warning, etc. It can also be used to establish an all-digital underground space infrastructure with the dynamic convergence of Internet of Things (IoT) data, supporting scientific management and decision-making across the whole life cycle of underground spaces, from comprehensive planning and construction projects to process supervision, status and full life cycle databases, and refined whole-life-cycle management.
In terms of the health monitoring of major infrastructure, point cloud intelligence can provide accurate and effective 3D information for power line safety monitoring (safety distances, etc.), road surface health monitoring (collapse, damage, etc.), bridge and tunnel deformation monitoring, etc. Through the refined modeling of key structures, as well as the precise identification of multiple targets and the calculation of spatial relationships, it provides a guarantee for the safe operation of infrastructure.
In terms of automatic driving, point cloud intelligence is the core support for real-time moving target detection and localization, real-time obstacle avoidance, and High-Definition (HD) map production. Laser scanning for obstacle avoidance has become a standard for automatic driving, and the accurate extraction of HD map elements enables automatic driving by providing users with accurate and intuitive 3D location information, as well as precise path planning and control strategies that exceed the sensor capabilities.
In terms of digital heritage protection and inheritance, point cloud intelligence can provide systematic scientific support, from data collection to refined reconstruction, through high-precision digital reconstruction, virtual restoration, and the networked dissemination of cultural heritage. This significantly improves the efficiency of cultural heritage protection and enriches the expression of the results, such as the splicing of fragments, 3D model reconstruction, and restoration.

Outlook
The rapid development of sensors, semiconductors, the IoT, and delivery platforms is continuously improving the efficiency and quality of point cloud big data acquisition, as well as reducing the cost of data collection, allowing the physical world to be digitized more efficiently in 3D. However, the data volume is increasing exponentially, which poses great challenges for the storage, management, computation, and analysis of point cloud big data. Fortunately, emerging technologies such as edge computing, deep learning, and artificial intelligence can provide more opportunities for point cloud intelligence.
The era of large-scale urban scene point clouds and global fine-scale point clouds is coming, and point cloud intelligence, as the scientific support for the intelligent processing and analysis of point cloud big data (the third most important type of basic data after vector maps and imagery), will further develop in the following directions: 1) the development of storage and updating mechanisms for point cloud big data, to provide basic support for the efficient utilization of point cloud data; 2) the establishment of industrial and national standards for point cloud 3D information extraction and modeling for new basic mapping, to serve the construction of "realistic 3D China" and natural resource monitoring; 3) the creation of object-oriented deep learning networks for point cloud big data, based on artificial intelligence, to transform point cloud processing from the current point-by-point classification to the integration of object classification and boundary extraction for the accurate understanding of 3D scenes; and 4) the development of intelligent equipment that integrates collection, processing, and serving as one, to serve the health management of major infrastructure (such as power grids, high-speed rail, etc.). It is believed that, in the future, with the support of artificial intelligence and deep learning, point cloud intelligence will not only enable the fine reconstruction of the 3D real world through real-time integration with IoT data, but will also support Earth science application research, smart cities, etc., with more scientific decision-making.

Conclusion
This paper has presented a contemporary survey of the state-of-the-art of point cloud intelligence, with regard to the theoretical methods, the key techniques of intelligent processing, and the major engineering applications. We have analyzed the equipment and methods of mainstream point cloud data acquisition and the advanced algorithms for the key technologies involved in intelligent point cloud processing. A comprehensive classification and an analysis of the merits and demerits of these methods have been presented, with the potential research directions also being highlighted. Finally, based on the summary of point cloud intelligence in scientific research and engineering applications, the future development directions of point cloud intelligence have been discussed.

Notes on contributors
Bisheng Yang received the B.S. degree in engineering survey, the M.S. degree, and the Ph.D. degree in photogrammetry and remote sensing from Wuhan University, China, in 1996, 1999, and 2002, respectively. From 2002 to 2006, he held a post-doctoral position at the University of Zurich, Switzerland. Since 2007, he has been a Professor with the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing (LIESMARS), Wuhan University, where he is currently the vice-director of LIESMARS. His main research interests comprise 3-D geographic information systems, urban modeling, and digital cities. He was a Guest Editor of the ISPRS Journal of Photogrammetry and Remote Sensing, and Computers & Geosciences.

Norbert Haala is an Associate Professor at the Institute for Photogrammetry, the University of Stuttgart, where he is responsible for lectures in the field of photogrammetric image processing. His research interests include virtual city models and image-based 3-D reconstruction.

Zhen Dong is a professor at the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing (LIESMARS), Wuhan University. He received his B.E. and Ph.D. degrees in Remote Sensing and Photogrammetry from Wuhan University in 2011 and 2018, respectively. His research interests lie in the field of 3D computer vision, particularly 3D reconstruction, scene understanding, and point cloud processing, as well as their applications in intelligent transportation systems, digital twin cities, urban sustainable development, and robotics.

Figure 1. LiDAR points colored by reflectance and photogrammetric tie points (white) as additional observations for hybrid strip adjustment.

Figure 3. Point cloud intelligence for scientific research and major engineering applications.