Applications of the robust estimation method BaySAC in indoor point cloud processing

Abstract Based on Bayesian theory and RANSAC, this paper applies the Bayesian Sampling Consensus (BaySAC) method, using convergence evaluation of hypothesis models, to indoor point cloud processing. We implement a conditional sampling method, BaySAC, which always selects the minimum number of required data points with the highest inlier probabilities. Because the primitive parameters calculated from different inlier sets should be convergent, this paper presents a statistical testing algorithm on a histogram of candidate model parameters to compute the prior probability of each data point. Moreover, the probability update is implemented using a simplified Bayes' formula. The performances of the BaySAC algorithm with the proposed prior probability determination strategies and of the RANSAC framework are compared using real data-sets. The experimental results indicate that the more outliers the data contain, the greater the gain in computational efficiency of our proposed algorithm over RANSAC. The results also indicate that the proposed statistical testing strategy can determine sound prior inlier probabilities independent of the choice of hypothesis model.


Introduction
A 3D model of an indoor environment includes models of the indoor space, floors, walls, and the objects inside rooms. Such 3D models play an increasingly important role in context-aware and robotics applications such as automatic route tracking and object detection. Range cameras (e.g. the Kinect) have proven to be applicable sensors for 3D indoor modeling (Henry et al. 2012; Khoshelham and Elberink 2012; Camplani, Mantecon, and Salgado 2013). Data processing for 3D model reconstruction often involves estimating the parameters of a model (e.g. the geometric model of an indoor feature in point cloud fitting, or the rigid transformation model in point cloud registration). Because point clouds and images acquired from an indoor scene always contain considerable noise, robust estimation of the model parameters is necessary. RANdom SAmple Consensus (RANSAC) (Fischler and Bolles 1981) is a well-regarded technique for the segmentation and fitting of laser scanning data, because it has proven capable of handling data in which more than 50% of the points are outliers.
R-RANSAC (Matas and Chum 2004) was proposed to increase the speed of model parameter estimation by randomizing the hypothesis evaluation process, because many erroneous model parameters arising from contaminated samples are evaluated in RANSAC. Schnabel, Wahl, and Klein (2007) improved the efficiency of RANSAC through local point selection and a simplified score function. Torr and Zisserman (2000) applied the robust estimation method Maximum Likelihood Estimation SAmple Consensus (MLESAC) to identify best-fitting roof models in a model-driven manner. Torr and Davidson (2003) presented the IMPortance sampling and random SAmple Consensus (IMPSAC) method, which employs a hierarchical resampling algorithm. Li et al. (2014) improved the RANSAC strategy by defining a prior energy function to optimize the sampling strategy. Normal-coherence CC-RANSAC (NCC-RANSAC) performs a normal coherence check before the RANSAC process to remove data points whose normal directions contradict the fitted plane (Qian and Ye 2014). İmre and Hilton (2015) proposed an order-statistics-based RANSAC that obviates the noise-free data assumption. Xu et al. (2016) proposed a weighted RANSAC approach with weight functions designed in terms of the difference in error distribution between proper and improper plane hypotheses.
A conditional sampling method, Bayesian Sampling Consensus (BaySAC) (Botterill, Mills, and Green 2009), was presented to always select the minimum number of required data points with the highest inlier probabilities as the hypothesis set, for the purpose of reducing the number of iterations needed to find a good model. However, Botterill, Mills, and Green (2009) admitted that degenerate configurations incorrectly assumed to contain outliers could cause the sampling strategy to fail.
To improve the robustness and applicability of the original BaySAC method, Kang et al. (2014) optimized the BaySAC algorithm by presenting a model-free statistical testing algorithm on candidate model parameters to compute the prior probability of each data point. In this paper, we apply that method to the processing of indoor point clouds, i.e. point cloud registration and fitting.

Optimized BaySAC
In the BaySAC algorithm, a hypothesis set is only tried if, at that moment, it is the most likely one to be correct, as determined by its inlier probability. BaySAC assumes independence between the inlier probabilities of the data points in the same hypothesis set. First, the minimum number of required data points with the highest inlier probabilities is selected as the hypothesis set. After a hypothesis set is evaluated, the inlier probabilities of its data points are updated using Bayes' rule. The hypothesis set evaluation is then repeated with the new inlier probabilities. Although determining and updating the probabilities also takes some time, this strategy can remarkably reduce the number of iterations needed to find a good model and hence reduces the computational cost. Compared with RANSAC, the key properties of BaySAC are the determination and updating of the inlier probabilities of the data points.
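As a concrete illustration of this loop, the following Python sketch fits a 2-D line with BaySAC-style sampling. It is our own minimal sketch, not the paper's implementation: the function name, the fixed demotion factor of 0.5, and the failure test are illustrative assumptions.

```python
def baysac_line_fit(points, probs, n_iters=20, inlier_tol=0.05, fail_ratio=0.5):
    """Sketch of the BaySAC loop for 2-D line fitting (illustrative).

    points: list of (x, y) tuples; probs: per-point prior inlier
    probabilities.  Returns the best (slope, intercept) found and the
    final probabilities."""
    probs = list(probs)
    best_model, best_count = None, -1
    for _ in range(n_iters):
        # BaySAC sampling: take the minimal set (two points for a line)
        # with the HIGHEST current inlier probabilities.
        order = sorted(range(len(points)), key=lambda i: probs[i])
        idx = order[-2:]
        (x1, y1), (x2, y2) = points[idx[0]], points[idx[1]]
        if abs(x2 - x1) < 1e-12:          # degenerate sample, skip
            continue
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        # Evaluate the hypothesis: count points consistent with the line.
        k = sum(1 for (x, y) in points
                if abs(y - (slope * x + intercept)) < inlier_tol)
        if k > best_count:
            best_count, best_model = k, (slope, intercept)
        # Probability update (simplified stand-in for Bayes' rule): demote
        # the members of a hypothesis set that explains few points, so the
        # next iteration tries a different high-probability sample.
        if k / len(points) < fail_ratio:
            for i in idx:
                probs[i] *= 0.5
    return best_model, probs
```

With uniform priors the first sample is arbitrary, as in RANSAC; once informative priors are available, the sorted selection makes the search deterministic and typically far shorter.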

Determination of the prior inlier probabilities of data points
The statistical testing process proposed by Kang et al. (2014) is generic and can be applied to any BaySAC problem. It uses a histogram to dynamically evaluate the convergence of the hypothesis parameter sets during the hypothesis testing process. Figure 1 shows a histogram for evaluating the convergence of the planar parameters in point cloud fitting. The horizontal and vertical axes of Figure 1 denote the angle between the normal vector n of a plane and the horizontal plane, and the perpendicular distance ρ from the origin, respectively. The upright axis represents the convergence degree of each set of parameters. Different convergent clusters of hypothesis sets appear in the histogram. For each convergent cluster of parameter sets, we select the oldest parameter set as the reference point. The more hypothesis sets converge to a cluster, the more likely it is that the reference parameter set of that cluster is correct. Therefore, the convergence degree of a cluster is introduced to evaluate this likelihood. The convergence degree is the percentage of hypothesis sets that converge to a parameter-set cluster; it is calculated by dividing the number of hypothesis sets in the cluster by the total number of hypothesis sets.
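The convergence-degree computation can be sketched as follows. The (angle, ρ) parameterization matches Figure 1, but the use of fixed-width histogram cells as clusters and the bin widths themselves are illustrative assumptions.

```python
import math
from collections import defaultdict

def convergence_degrees(param_sets, angle_bin=5.0, rho_bin=0.5):
    """Bin hypothesis plane parameters (angle in degrees, distance rho)
    into a 2-D histogram.  For each cell, return its convergence degree
    (cell count / total number of hypothesis sets) together with the
    OLDEST parameter set in the cell, which serves as the cluster's
    reference point."""
    total = len(param_sets)
    cells = defaultdict(list)
    for angle, rho in param_sets:
        key = (math.floor(angle / angle_bin), math.floor(rho / rho_bin))
        cells[key].append((angle, rho))
    return {key: (len(members) / total, members[0])
            for key, members in cells.items()}
```

For example, three near-identical plane hypotheses and one stray hypothesis yield one cell with convergence degree 0.75 whose reference point is the first of the three, and one cell with degree 0.25.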
In this paper, the geometric model of an indoor feature in the fitting of point clouds and the rigid transformation model in point cloud registration are used as the hypothesis models.
When the degree of convergence of a cluster in the distribution of parameter solutions reaches a predefined threshold, the first hypothesis set in that cluster is used to determine the prior inlier probabilities of the data points according to Equation (1), where P_i denotes the prior probability of point i, D_i is the distance between point i and the fitted primitive, and m represents the predefined threshold for outlier identification, set to five times the point precision.
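A distance-based prior of this kind can be sketched as follows. The linear fall-off is an assumed form standing in for Equation (1); the exact expression is given in Kang et al. (2014).

```python
def prior_inlier_probability(distance, m):
    """Map the distance D_i from point i to the fitted primitive to a
    prior inlier probability (assumed form, not the paper's exact
    Equation (1)).  Points at or beyond the outlier threshold m receive
    probability 0; within m, the probability falls off linearly with
    distance."""
    if distance >= m:
        return 0.0
    return 1.0 - distance / m
```

For instance, with a point precision of 0.01 m, the text sets m = 0.05 m; a point lying on the primitive then gets prior 1.0 and a point 0.025 m away gets 0.5.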

Probability updating
After the prior inlier probability of each data point is determined, Equation (2) is employed to update the inlier probabilities during consecutive iterations (Kang et al. 2014), where I is the set of all inliers, H_t is the hypothesis set of n data points used in iteration t of the hypothesis testing process, P_{t−1}(i∈I) and P_t(i∈I) denote the inlier probability of data point i at iterations t−1 and t, respectively, k is the number of points consistent with the model during a test, and D is the total number of data points.
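One way to realize such an update is a direct application of Bayes' rule that treats k/D, the fraction of points consistent with the tested model, as the likelihood term. This is an assumed simplification for illustration, not necessarily the paper's exact Equation (2).

```python
def update_inlier_probability(p_prev, k, D):
    """Bayes-rule update for a data point in the tested hypothesis set
    H_t (assumed form standing in for Equation (2)).

    p_prev: P_{t-1}(i in I); k: points consistent with the model in this
    test; D: total number of data points.  k/D is used as the likelihood
    that a member of H_t is an inlier, and the posterior is renormalised
    over the inlier/outlier alternatives."""
    likelihood = k / D
    numerator = p_prev * likelihood
    return numerator / (numerator + (1.0 - p_prev) * (1.0 - likelihood))
```

A hypothesis set that explains most of the data raises the probabilities of its members (e.g. p = 0.5 with k/D = 0.8 updates to 0.8), while a poorly supported set lowers them, which is what steers the sampling in subsequent iterations.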
Figure 2 shows the flow chart of the optimized BaySAC (taking point cloud registration as an example). The process starts with a RANSAC strategy, which randomly chooses an initial data-set from the candidate points. Meanwhile, the proposed convergence evaluation is implemented iteratively using each newly calculated hypothesis parameter set. During each iteration, we update the degrees of convergence of all candidate parameter sets in terms of the newly calculated hypothesis parameter set. If the highest degree of convergence exceeds the predefined threshold (CD_T in Figure 2), the corresponding parameter set is used to determine the prior inlier probabilities of the data points using Equation (1), and the BaySAC strategy is then activated. If the number of iterations RANSAC needs is very small, the highest degree of convergence may fail to exceed the threshold before the RANSAC process ends; in this case, BaySAC degrades to RANSAC. Once the BaySAC strategy is activated, the hypothesis testing process is implemented as the procedure explained in Section 2.2.
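The switching logic of this control flow can be sketched as a loop over hypothesis tests. All function names here (hypothesise, evaluate, convergence_degree) are illustrative placeholders supplied by the caller, not functions from the paper.

```python
def run_hybrid(hypothesise, evaluate, convergence_degree, cd_threshold,
               max_iters):
    """Sketch of the RANSAC-to-BaySAC switching loop (illustrative).

    hypothesise() draws a random minimal sample and returns its parameter
    set; evaluate(params) returns the inlier count of that hypothesis;
    convergence_degree(history) returns the highest convergence degree in
    the parameter histogram so far.  While that degree stays at or below
    cd_threshold the loop behaves as plain RANSAC; once it is exceeded,
    the prior inlier probabilities would be derived from the winning
    cluster and BaySAC sampling would take over (here we only record the
    iteration at which the switch happens)."""
    history, best, switched_at = [], (None, -1), None
    for t in range(max_iters):
        params = hypothesise()
        history.append(params)
        count = evaluate(params)
        if count > best[1]:
            best = (params, count)
        if switched_at is None and convergence_degree(history) > cd_threshold:
            switched_at = t   # priors would be computed here; BaySAC starts
    return best[0], switched_at
```

If max_iters is reached with switched_at still None, the run has effectively degraded to plain RANSAC, mirroring the small-iteration case described above.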

Experimental results
To demonstrate the application of the method proposed by Kang et al. (2014), we conducted experiments with three sets of indoor point clouds, acquired by a SwissRanger 3D camera, a Kinect 2.0, and a RIEGL LMS-Z620 laser scanner, respectively. The SwissRanger data-set consists of two point clouds sampling the indoor environment of a building (Data-set I). The Kinect 2.0 data-set comprises a registered point cloud from 61 scans captured in a room with an average shift of 0.4 m between the scanning centers (Data-set II). The LMS-Z620 data-set was captured in an underground parking garage (Data-set III).

Point cloud registration
The performance of the BaySAC-CONV algorithm in point cloud registration was evaluated on Data-set I in terms of both registration accuracy and computational efficiency. Correspondences between different scans were identified using reflectance images (as shown in Figure 3). The registration accuracies of RANSAC and BaySAC-CONV were evaluated using the average distance between inlier correspondences after registration. Figure 4 shows the five correspondences identified as check points. Table 1 lists the registration accuracies in terms of the average distance between the selected correspondences after registration. BaySAC-CONV achieved higher accuracy (23.3 mm) than plain RANSAC (50.2 mm).
Hypothesis set evaluation is an iterative process, and therefore the computational efficiency of the proposed strategies is evaluated in terms of the number of iterations and the computation time. As described in Section 2.3, the BaySAC-CONV strategy consists of two parts, i.e. the random part (BaySAC-CONV-Random) and the BaySAC part. The random part consists of the iterations implemented with a RANSAC strategy, during which the inlier probabilities of the different correspondences are determined. As shown in Table 2, the ratio between the number of inliers and the number of possible correspondences in Data-set I reaches 81%. The number of iterations for RANSAC varies from 6 to 18, while the total number of iterations for BaySAC-CONV varies from 2 to 11. The average consumed time of the two strategies is listed in Table 3. The random part of the time consumed by BaySAC-CONV is spent determining the prior inlier probabilities of the data points, which is the core of BaySAC-CONV. Once the prior inlier probabilities are determined, BaySAC remarkably reduces the computational time needed to find a good model compared with plain RANSAC. For instance, in Table 3, BaySAC-CONV spends 5 ms to find a good transformation for Data-set I, whereas plain RANSAC costs 21 ms.

Fitting of point clouds
We used the RANSAC and BaySAC-CONV methods for planar primitive fitting of Data-sets II and III. The fitting results are shown in Figures 5 and 6, respectively. To validate the accuracy of the plane fitting, we selected correct points that were not involved in the fitting as check points, computed the distances from the check points to the fitted plane, and derived the median error. We also evaluated the performance of the different algorithms in terms of accuracy and computational cost. Table 4 lists the median errors and fitting times of the planar primitive fitting on Data-sets II and III. The fitting accuracy of the BaySAC-CONV algorithm does not differ considerably from that of the RANSAC algorithm, while the efficiency of BaySAC-CONV is higher.
Data-set III contains many planar features, and therefore we analyze the point cloud in one region (Figure 6). Figure 6 includes images of the analyzed region. Compared with the results on the original point cloud, finer planar primitives were detected owing to the simplification preserving the feature points, which enhanced the prominence of small-sized features after simplification (highlighted with the white arrows in the "before" parts of Figures 6(a) and (b)).

Conclusions
In this paper, an optimized BaySAC algorithm is applied to the processing of indoor point clouds; the algorithm incorporates a strategy for determining the prior probabilities from the statistical characteristics of the deterministic mathematical model used for hypothesis testing. The performances of the BaySAC algorithm with the proposed prior probability determination strategy and of the RANSAC framework are compared on real data-sets in terms of computational efficiency and accuracy.
The experimental results indicate that the optimized BaySAC algorithm can achieve better performance than RANSAC in both accuracy and computational efficiency. The more outliers the data contain, the greater the gain in computational efficiency of the optimized BaySAC algorithm over RANSAC. The results also indicate that the statistical testing strategy can determine sound prior inlier probabilities independent of the choice of hypothesis model. In the future, this method will be extended to the fitting of complex features, which will further increase its adaptability.