A regression model-based method for indoor positioning with compound location fingerprints

ABSTRACT This paper proposes and evaluates an estimation method for indoor positioning. The method combines location fingerprinting and dead reckoning differently from conventional combinations. It uses compound location fingerprints, which are composed of radio fingerprints at multiple points of time, that is, at multiple positions, and the displacements between them estimated by dead reckoning. To avoid the errors that accumulate in dead reckoning, the method uses short-range dead reckoning. The method was evaluated using 16 Bluetooth beacons installed in a furnished student room measuring 11 × 5 m. The Received Signal Strength Indicator (RSSI) values of the beacons were collected at 30 measuring points, which were points at the intersections on a 1 × 1 m grid with no obstacles. A compound location fingerprint is composed of RSSI vectors at two points and a displacement vector between them. Random Forests (RF) was used to build regression models to estimate positions from location fingerprints. The root mean square error of position estimation was 0.87 m using 16 Bluetooth beacons. This error is lower than that obtained with a single-point baseline model, where a feature vector is composed of only RSSI values at one location. The results suggest that the proposed method is effective for indoor positioning.


Introduction
To localize a position in an indoor environment, wireless-based methods and dead reckoning methods are widely used (Liu et al. 2007; Pei et al. 2016; Kjaergaard 2007). Trilateration methods estimate the position on the basis of the signal strength from wireless beacons or transmitters (Liu et al. 2006). Triangulation methods take into account the angle of signals as well as the signal strength (Liu et al. 2010; Shan and Yum 2007). Both methods are affected by multipath propagation, especially in indoor environments. Errors in position estimation with both methods can be large in indoor environments, although very short pulses using ultrawideband technology can reduce the errors. Complex multipath and non-line-of-sight propagation are still a problem indoors (Xiao et al. 2015).
Fingerprinting methods are also used for indoor position estimation (Kjaergaard 2007). They are based on a radio fingerprint, or a feature vector, which is composed of the strength of the signals from multiple transmitters. Multipath propagation attenuates or intensifies signal strength indoors in a complicated manner. Fingerprinting methods can utilize the local increase and decrease in signal strength as features, and their estimation errors are regarded as relatively small.
However, these methods have been used mainly for area estimation or for finding a reference point.
Dead reckoning methods are used to estimate a relative position from a reference point (Jimenez et al. 2009). Dead reckoning estimates a relative offset, that is, a displacement vector, usually on the basis of the double integration of acceleration. Because measurement errors are integrated along with the signal, position estimation errors accumulate and become larger as integration time progresses.
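The growth of dead reckoning error with integration time can be illustrated with a minimal sketch. It assumes a one-dimensional case, a simple Euler integration scheme, and a hypothetical constant accelerometer bias of 0.05 m/s^2 sampled at 100 Hz; none of these values come from the experiment in this paper.

```python
def dead_reckon(accels, dt):
    """Double-integrate 1-D acceleration samples into a displacement (Euler scheme)."""
    velocity = 0.0
    displacement = 0.0
    for a in accels:
        velocity += a * dt             # first integration: acceleration -> velocity
        displacement += velocity * dt  # second integration: velocity -> displacement
    return displacement


# Hypothetical scenario: the device is at rest, so the true displacement is zero,
# but the sensor reports a constant bias (assumed value) at every sample.
bias = 0.05   # m/s^2, assumed
dt = 0.01     # s, i.e. 100 Hz sampling, assumed

error_1s = dead_reckon([bias] * 100, dt)    # drift after 1 s of integration
error_10s = dead_reckon([bias] * 1000, dt)  # drift after 10 s of integration
```

Under these assumptions the drift is roughly 0.025 m after 1 s but roughly 2.5 m after 10 s, that is, it grows with the square of the integration time. This is why restricting dead reckoning to short intervals keeps the displacement error small.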
Combinations of fingerprinting and dead reckoning are often used to locate a reference point by fingerprinting and then to measure the displacement by dead reckoning (Chang et al. 2015; Seitz et al. 2010). They are useful but do not solve the accumulation of estimation errors caused by dead reckoning. Therefore, they generally need other mechanisms to compensate for the accumulation.
In this paper, we propose a method that combines fingerprinting and dead reckoning differently from the conventional methods of combination. The proposed method estimates a position on the basis of compound location fingerprints, which are combinations of radio signals, or fingerprints, at multiple points of time (or positions), and the displacements between them.
A short moving distance between closely spaced times (points) can suppress the accumulated estimation errors caused by dead reckoning. Using fingerprints at multiple times (points) can increase the size of the feature vectors without increasing the number of transmitters. We believe that dead reckoning and fingerprinting can complement each other in this combination and thus improve the estimation accuracy.

Proposed method
Fingerprinting in the proposed method is based on a combination of fingerprints of radio wave strength at multiple times and the displacement vectors between them.
The proposed method can be divided into three steps: (1) collecting data, (2) building a regression model, and (3) estimating a position. Steps (1) and (2) are preparations for step (3). The first two steps are also called the "offline training phase," and step (3) is also called the "online position determination phase." Assume that beacons $B_l$ ($1 \le l \le L$) are installed in a target area, where $L$ is the total number of beacons.

Step for collecting data
The data collection step is the same as the measurement in conventional fingerprinting methods. A data collector measures the Received Signal Strength Indicators (RSSIs) from all beacons at all points in a measurement point set $C = \{ c_i \mid 1 \le i \le N \}$, where $c_i$ are known and true coordinates and $N$ is the number of measurement points. In this paper, we focus on a simple coordinate system, that is, a two-dimensional plane, although the proposed method can easily be applied to a three-dimensional space.
Let $r_{i,k}$ ($1 \le i \le N$, $1 \le k \le K$) denote an RSSI vector, that is, a conventional fingerprint, on the $k$-th measurement at coordinates $c_i$, where measurement is repeated $K$ times at each location.
An RSSI vector set $R_i$ at $c_i$ is expressed as

$$R_i = \{ r_{i,k} \mid 1 \le k \le K \}$$

and the total set $R(C)$ of RSSI vectors at the points in a measurement point set $C$ is also represented as follows:

$$R(C) = \bigcup_{i=1}^{N} R_i$$

Step for building regression model
In the step for building a regression model, a nonlinear regression model is built whose explanatory variable and objective variable are respectively a compound location fingerprint vector and a position. A compound location fingerprint $g_{i,k,j,m}$ is composed of an RSSI vector $r_{i,k} \in R_i$ ($1 \le k \le K$), an RSSI vector $r_{j,m} \in R_j$ ($1 \le m \le K$), and a displacement $d_{i,j} = c_i - c_j$ between coordinate $c_i$ and coordinate $c_j$.
A compound fingerprint $g_{i,k,j,m}$ and a compound fingerprint set $G_{i,j}$ are respectively expressed by

$$g_{i,k,j,m} = [r_{i,k}, r_{j,m}, d_{i,j}] \tag{3}$$

and

$$G_{i,j} = \{ g_{i,k,j,m} \mid 1 \le k \le K,\ 1 \le m \le K \} \tag{4}$$

where $[a, b, c]$ concatenates vectors $a$, $b$, and $c$ into one vector. A compound fingerprint set $G(C)$, which is made of RSSI vectors and displacements at points in a measurement point set $C$, is represented as follows:

$$G(C) = \bigcup_{1 \le i \le N,\ 1 \le j \le N} G_{i,j}$$

The model $M_G$ inputs a compound location fingerprint $f$ and outputs an estimated position $\hat{p}$. This is expressed as follows:

$$\hat{p} = M_G(f)$$
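The construction of the compound fingerprint set can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the experiment: the function name `build_compound_fingerprints` and the toy RSSI values are hypothetical, and the choice of $c_i$ as the regression target is an assumption inferred from the estimation step, where the position at the later time is the quantity to be estimated.

```python
from itertools import product


def build_compound_fingerprints(rssi_sets, coords):
    """Compose compound fingerprints [r_ik, r_jm, d_ij] for all pairs of RSSI vectors.

    rssi_sets[i] is the list R_i of RSSI vectors measured at coords[i].
    Returns (features, targets); the target of each fingerprint is c_i (assumed).
    """
    features, targets = [], []
    n = len(coords)
    for i, j in product(range(n), repeat=2):                     # all ordered point pairs, i == j included
        d_ij = [a - b for a, b in zip(coords[i], coords[j])]     # d_ij = c_i - c_j
        for r_ik, r_jm in product(rssi_sets[i], rssi_sets[j]):   # all (k, m) combinations
            features.append(list(r_ik) + list(r_jm) + d_ij)      # concatenation [r_ik, r_jm, d_ij]
            targets.append(tuple(coords[i]))
    return features, targets


# Toy data with made-up values: N = 2 points, K = 2 measurements, L = 3 beacons.
coords = [(0.0, 0.0), (1.0, 0.0)]
rssi_sets = [
    [[-60, -72, -80], [-61, -70, -79]],  # R_1, measured at c_1
    [[-66, -65, -83], [-67, -66, -82]],  # R_2, measured at c_2
]
features, targets = build_compound_fingerprints(rssi_sets, coords)
```

With $N$ points and $K$ measurements per point this yields $(NK)^2$ feature vectors of length $2L + 2$ in the two-dimensional case, so the feature vector grows without adding beacons. The resulting `features` and `targets` lists could then be passed to any multi-output regressor, for example a Random Forest implementation.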

Step for estimating position
In this step, a position is estimated on the basis of the built regression model with a measured compound location fingerprint. A user measures RSSIs at times $t_q$ and $t_r$. A displacement vector between time $t_q$ and $t_r$ is also estimated with dead reckoning. Figure 1 describes the relationship between the compound location fingerprints and their elements. Let $s_l(t)$ be an RSSI from beacon $B_l$ and $S(t) = [s_1(t), s_2(t), \ldots, s_L(t)]$ be a signal strength vector at time $t$.
A displacement vector $D(t_r, t_q)$ from time $t_q$ to time $t_r$ is estimated with dead reckoning. If $(t_r - t_q)$ is small, $D(t_r, t_q) \approx P(t_r) - P(t_q)$, where $P(t)$ is a position vector, that is, coordinates at time $t$, although $P(t_r)$ and $P(t_q)$ are unknown. The goal of the proposed method is to estimate $\hat{P}(t_r)$ precisely. Compound location fingerprint $F(t_r, t_q)$ is composed of $S(t_r)$, $S(t_q)$, and $D(t_r, t_q)$, which is expressed as follows:

$$F(t_r, t_q) = [S(t_r), S(t_q), D(t_r, t_q)]$$

A compound location fingerprint $F(t_r, t_q)$ is used to estimate a position at time $t_r$, that is $\hat{P}(t_r)$, on the basis of the model built in the step for building a regression model. In other words, $\hat{P}(t_r) = M_G(F(t_r, t_q))$. Figure 2 illustrates an example of a compound location fingerprint.
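The online step can be sketched as follows. To keep the example self-contained, a nearest-neighbor lookup is used as a stand-in for the regression model $M_G$; it is not the Random Forests model used in this paper. The helper names and all numeric values are hypothetical.

```python
import math


def compose_fingerprint(s_r, s_q, d_rq):
    """F(t_r, t_q) = [S(t_r), S(t_q), D(t_r, t_q)] as one flat feature vector."""
    return list(s_r) + list(s_q) + list(d_rq)


def nearest_neighbor_model(train_features, train_targets):
    """Stand-in for M_G: returns the target of the closest training fingerprint."""
    def predict(f):
        best = min(range(len(train_features)),
                   key=lambda n: math.dist(train_features[n], f))
        return train_targets[best]
    return predict


# Made-up training fingerprints for two points 1 m apart, with L = 2 beacons.
train_features = [
    [-60, -75, -70, -65, -1.0, 0.0],  # RSSI at (0,0), RSSI at (1,0), d = (0,0) - (1,0)
    [-70, -65, -60, -75, 1.0, 0.0],   # RSSI at (1,0), RSSI at (0,0), d = (1,0) - (0,0)
]
train_targets = [(0.0, 0.0), (1.0, 0.0)]
model = nearest_neighbor_model(train_features, train_targets)

# Online step: RSSI vectors at t_q and t_r plus a dead-reckoned displacement
# D(t_r, t_q) that is slightly off from the true (1.0, 0.0).
f = compose_fingerprint(s_r=[-69, -66], s_q=[-61, -74], d_rq=[0.9, 0.1])
p_hat = model(f)  # estimate of P(t_r)
```

In this toy case the query is closest to the second training fingerprint, so the estimate is the point (1.0, 0.0). The displacement components act as two additional features alongside the RSSI values, which is how the short-range dead reckoning result enters the estimate.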

Experiment
An experiment was conducted to evaluate the proposed method. We measured the RSSIs from beacons at multiple points and prepared the training data and validation data. We then built a regression model on the basis of the training data and estimated the position on the basis of the regression model with the validation data.
The goal of the experiment was to compare the proposed method with a baseline method that uses a regression model based on simple fingerprints at one measurement, that is, feature vectors composed of only the RSSI values at one location.

Experimental environment and devices
The proposed method was evaluated in a student room with the dimensions of 11 × 5 m and with many pieces of furniture and windows. The room was located in a reinforced concrete building. The origin of the coordinate system was set to one corner of the room. The shorter and the longer sides of the floor of the room were, respectively, the X and Y axes shown in Figure 3. Sixteen Bluetooth beacons were placed in the room, as shown in Figure 3. We attached 12 beacons to the ceiling and the remaining 4 to the walls between the ceiling and the floor. We used 16 MyBeacons whose specifications are shown in Table 1.
Measurement was performed with a Huawei P10 with the operating system Android 7.0. The specifications of the device are shown in Table 2. The device was mounted at a height of 1.35 m on a stand with casters so that it would be easy to change the location of the device. Figure 4 shows the measurement tool. We developed a measurement application program to measure Bluetooth RSSIs.

Experimental procedure
In the step for collecting data, we measured the RSSI values of the Bluetooth beacons 10 times at 30 measuring points, which were points of intersections on a 1 × 1 m grid with no obstacles, as shown in Figure 3. The measuring point set is referred to as C1. The total number of valid pieces of data in the collected data set $R(C1)$ was 300. We divided $R(C1)$ into two parts, training data set $R_t$ and validation data set $R_v$, before composing the compound location fingerprints so that the training and validation compound fingerprints would not contain the same fingerprint elements, which could affect the results. The compound fingerprints for training $G_t$ were composed of fingerprint elements in $R_t$, and those for validation $G_v$ were composed of fingerprint elements in $R_v$.
In the step for building a regression model, Random Forests (RF) (Breiman 2001; Breiman et al. 2001) was used to build regression models as it was effective for position estimation in previous experiments (Takayama et al. 2018). In particular, the proposed method with the RF outperformed the baseline method with the RF and the Support Vector Regression (SVR) method (Cortes and Vapnik 1995). We built the regression models using various numbers of beacons. Beacon bNN in Figure 3 represents a beacon with the identifier number NN.
Beacons from b01 to bMM are used where the number of beacons is MM. For example, when the number of beacons is 3, three beacons b01, b02, and b03 are used.
In the step for estimating a position, the evaluation in this paper used displacements calculated from the collected data. These displacements were used to compose compound location fingerprints in the validation data set.

Evaluation criterion
We evaluated two methods with respect to the estimation distance error and accuracy.
A Root Mean Squared Error (RMSE) is used for estimation distance error. The RMSE is a standard deviation of prediction errors, or the residuals.
Let $p$, $\hat{p}$, $|\hat{p} - p|$, and $\varepsilon$ be, respectively, a true position, an estimated position, the distance between $p$ and $\hat{p}$, and a permissible tolerance of error distance. If $|\hat{p} - p| \le \varepsilon$, $\hat{p}$ is a correct estimate. The accuracy of estimation is the ratio of the number of correct estimates to the number of total estimates.
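The two criteria can be computed as in the following sketch. It assumes that the RMSE is taken over the Euclidean distance errors between true and estimated positions; the function names and the sample positions are hypothetical.

```python
import math


def distance_errors(true_positions, estimated_positions):
    """Euclidean distance between each true position and its estimate."""
    return [math.dist(p, p_hat)
            for p, p_hat in zip(true_positions, estimated_positions)]


def rmse(errors):
    """Root mean squared error over the distance errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def accuracy(errors, tolerance):
    """Ratio of estimates whose distance error is within the tolerance."""
    return sum(1 for e in errors if e <= tolerance) / len(errors)


# Made-up positions in meters; the distance errors are 0.5, 0.0, 1.0, and 2.0 m.
true_positions = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0), (3.0, 1.0)]
estimated_positions = [(0.0, 0.5), (1.0, 0.0), (2.0, 2.0), (3.0, 3.0)]

errors = distance_errors(true_positions, estimated_positions)
error_rmse = rmse(errors)
accuracy_half_meter = accuracy(errors, tolerance=0.5)
accuracy_one_meter = accuracy(errors, tolerance=1.0)
```

Evaluating `accuracy` over a range of tolerance values gives a cumulative accuracy curve. Note that the RMSE is dominated by the largest errors, whereas the accuracy at a given tolerance only counts how many estimates fall within that tolerance.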

Basic evaluation
We first evaluated the proposed method, referred to as $P(C1)$, and the baseline method, referred to as $B(C1)$, on the basis of regression models by 10-fold cross-validation. Figure 5 shows the RMSEs at the intersections (C1) of the 1 × 1 m grid. The RMSE of the position estimation in the proposed method with the RF using 16 beacons was 0.87 m. The RMSE of the position estimation of the baseline method using 16 beacons was 1.01 m. The RMSE in the proposed method is smaller than that in the baseline model. Figure 6 shows the cumulative accuracy of the two methods $B(C1)$ and $P(C1)$ with different numbers of beacons. Figure 6 shows that at a tolerance distance of 0.5 m using 16 beacons, the cumulative accuracy of the proposed method was 40.9%, which was better than that of the baseline method by 14.9 percentage points (pp). Similarly, the cumulative accuracy of the proposed method using six beacons was 33.8%, which was better than that of the baseline method by 16.2 pp.
Figure 3. Layout of target room.
Figure 4. Measurement tool.
The results of the basic evaluation show that the proposed method using compound location fingerprints improved the RMSE and the accuracy compared to the baseline method using the RSSI vectors at one point.

Accuracy at unknown points
We also evaluated estimation accuracy at unknown points in the proposed method on the basis of regression models by 10-fold cross-validation. We built regression models based on a data set collected at the intersections on a 2 × 2 m grid (10 times at 12 measurement points). The intersection points are referred to as C2, as shown in Figure 3. We then validated an in-between data set whose locations are at the intersections of the 1 × 1 m grid but not at the intersections of the 2 × 2 m one. The location points are referred to as CX. In other words, $CX = C1 - C2$. The in-between points can be considered to be unknown locations whose data are not used in model building.
Training data set $G_t(C2)$ was composed from 90% of the RSSI vectors collected at C2 (10 times at 12 measurement points), that is, the size of $G_t(C2)$ is 11,664 (= 108 × 108). The regression models are referred to as Model C2 and were validated with validation data set $G_v(C2)$ and in-between data set $G_v(CX)$. The number of compound location fingerprints in $G(CX)$ was 32,400 (= 180 × 180). Each fold in 10-fold cross-validation has a validation set whose size is 468, that is, 144 (= 12 × 12) in $G_v(C2)$ and 324 (= 18 × 18) in $G_v(CX)$. In comparison, each fold in the baseline method has a training set whose size is 108 in $R_t(C2)$ and a validation set whose size is 12 in $R_v(C2)$ and 18 in $R_v(C1)$.
The RMSEs for Model C2 are shown in Figure 7. The RMSE of the position estimation of the proposed method with the RF using 16 beacons was 1.55 m, which is smaller than that of the baseline method using 16 beacons, 1.70 m. Figure 8 shows the cumulative accuracy for Model C2. Let $Q(D_t, D_v, w, t)$ be the cumulative accuracy for the model built with training data set $D_t$ and validated with validation data set $D_v$, where $w$ is the number of beacons and $t$ is a tolerance distance in meters.
At a tolerance distance of 0.8 m using 16 beacons, $Q(G_t(C2), G_v(CX), 16, 0.8)$ was 29.3%, which was better than that of the baseline method by 3.8 pp. Similarly, at a tolerance distance of 1.0 m using six beacons, $Q(G_t(C2), G_v(CX), 6, 1.0)$ was 33.8%, which was better than that of the baseline method by 4.4 pp.
The evaluation of the RMSE at the unknown points showed that the RMSE at in-between points of the proposed method with the RF was better than that of the baseline method using models built on more data. This suggests that the proposed method with the RF is effective for precisely estimating a position.
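The data set sizes quoted in this section follow from pairing every RSSI vector with every RSSI vector in the same split. The short check below reproduces them; it assumes that each fold holds out one tenth of the RSSI vectors before the compound fingerprints are composed, and the variable names are hypothetical.

```python
def compound_count(n_rssi_vectors):
    """Number of compound fingerprints formed from all ordered pairs of RSSI vectors."""
    return n_rssi_vectors * n_rssi_vectors


points_c1 = 30                     # intersections of the 1 x 1 m grid
points_c2 = 12                     # intersections of the 2 x 2 m grid
points_cx = points_c1 - points_c2  # in-between points, CX = C1 - C2
k = 10                             # measurements per point
folds = 10

vectors_c2 = points_c2 * k         # 120 RSSI vectors at C2
vectors_cx = points_cx * k         # 180 RSSI vectors at CX

sizes = {
    "G_t(C2)": compound_count(vectors_c2 * (folds - 1) // folds),  # 108 training vectors per fold
    "G(CX)": compound_count(vectors_cx),                           # all 180 in-between vectors
    "G_v(C2)": compound_count(vectors_c2 // folds),                # 12 validation vectors per fold
    "G_v(CX)": compound_count(vectors_cx // folds),                # 18 validation vectors per fold
}
validation_per_fold = sizes["G_v(C2)"] + sizes["G_v(CX)"]
```

The quadratic growth means that the proposed method trains on 11,664 compound fingerprints per fold, whereas the baseline method trains on the 108 underlying RSSI vectors.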

Conclusions
We proposed a method for estimating position that is based on non-linear regression models with a unique combination of fingerprinting and dead reckoning. The evaluation results showed that the proposed method was better than the baseline method based on a single fingerprint, suggesting that it can be effective in indoor environments.

Notes on contributors
Noritaka Osawa received his B.S., M.S., and PhD degrees in information science from the University of Tokyo in 1983, 1985, and 1988, respectively. After working for a software company, the University of Electrocommunications, and the National Institute of Multimedia Education, he is now with Chiba University as a professor.