Indoor WLAN localization via adaptive Lasso Bayesian inference and convex optimization

Abstract There has been growing interest in indoor positioning system technologies due to the important role of real-time indoor positioning services in modern applications such as security services and emergency healthcare. Many large companies such as Apple, Microsoft, and Google have researched location-based services (LBS), as they are key for network optimization and extensive computing applications. Multiple techniques have been proposed using fingerprinting-based location methods due to their ability to obtain accurate results within several meters. However, their major drawback is that the received signal strength (RSS) fluctuates with time and with changes in the environment, giving the RSS a multimodal distribution. Thus, in this paper, we establish a framework consisting of k-means with symmetrical Hölder divergence, a statistical measure that encapsulates both the Cauchy-Schwarz divergence and the skew Bhattacharyya divergence, to measure dissimilarities among signals that have multivariate distributions. In other words, the traditional k-means is extended to a meta-algorithm to detect the cluster to which the RSS belongs. Our second contribution is a hierarchical Bayesian model based on the adaptive Lasso criterion that recovers sparse signals and optimizes the accuracy of indoor location estimation by solving an ℓ1-minimization problem. The experimental results show that the proposed system substantially improves localization accuracy compared to traditional fingerprinting-based localization methods.


PUBLIC INTEREST STATEMENT
There has been growing interest in indoor positioning system technologies due to the important role of real-time indoor positioning services in modern applications such as security services and emergency healthcare. Many large companies such as Apple, Microsoft, and Google have researched location-based services (LBS), as they are key for network optimization and extensive computing applications. Multiple techniques have been proposed using fingerprinting-based location methods due to their ability to obtain accurate results within several meters. However, their major drawback is that the received signal strength (RSS) fluctuates with time and with changes in the environment, giving the RSS a multimodal distribution. Thus, many algorithms have been proposed to improve localization accuracy compared to traditional fingerprinting-based localization methods.

Introduction
Recently, location-based services (LBS) have garnered great interest due to the strong drive of location-based technologies (McGuire et al., 2005). They are widely used for healthcare monitoring, emergency personnel navigation, personalized information delivery, network management and security, and context awareness. The Global Positioning System (GPS) and Global Navigation Satellite System (GNSS) (Patterson et al., 2003) are superior for outdoor positioning, but cannot provide indoor positioning for two reasons: they require a direct line of sight between the user and the satellite, and they cannot identify the floor inside a building. Researchers have proposed many approaches for indoor localization, which can be divided into two categories: (1) non-radio-based positioning such as ultrasonic techniques, infrared (IR) waves, visible light, magnetic field exploitation, and inertial systems; and (2) radio-based positioning techniques such as ultra-wide band, radiofrequency identification, ZigBee-based techniques, Bluetooth-based techniques, frequency modulation techniques, and IEEE 802.11 wireless local area network (WLAN)-based techniques. IEEE 802.11 WLANs are widely adopted for indoor positioning because they can be found in almost every building and need no extra infrastructure, in contrast to many other technologies that have high maintenance costs and require a large number of transceivers. IEEE 802.11 WLANs operate in the 2.4 GHz (IEEE 802.11b/g) and 5 GHz (IEEE 802.11a) unlicensed bands. However, other technologies transmit simultaneously in these bands, which can lead to interference and distortion.
Indoor positioning approaches for WLAN fall into two main categories: (1) lateration methods, which consist of direction of arrival, angle of arrival, time of arrival, and time difference of arrival techniques; and (2) RSS exploitation (fingerprinting-based localization) (Abdullah & Abdel-Qader, 2016; Abdullah et al., 2016a). However, lateration methods suffer from poor localization accuracy in indoor environments. One study (B. Wang et al., 2015) reported a localization distance error of about 24.73 ft in a typical office environment 200 ft long and 80 ft wide; this inaccuracy has two main causes. First, the distance estimation is based on a propagation model, which is not accurate due to the complexity of the indoor environment and the non-line-of-sight problem, which causes multipath signals. Second, the localization distance error can increase significantly if one or more access points (APs) are not correctly estimated (Abdullah & Abdel-Qader, 2018; Ji et al., 2006).
Fingerprinting radio map techniques do not need additional infrastructure, as they use the APs already present in indoor environments. Building and using the fingerprinting radio map consists of two phases: the offline and online phases. During the offline phase, a set of predefined points referred to as landmarks or reference points (RPs) is determined in the area of interest (AoI). At each RP, a survey is conducted and the RSS values of the available APs in the AoI are recorded for a specific time interval to generate a database of RSS values. During the online phase, the user collects the RSS values at their location and sends them to the server, where specific algorithms are applied to find similar measurements in the radio map fingerprints and use the matching RPs to estimate the user's location (Abdullah et al., 2016b).
In this paper, the RSS fingerprinting database was recorded in four different directions during the offline phase. The k-means symmetrical Hölder divergence was then applied to the fingerprinting database to group the RPs into clusters, each represented by a cluster head. During the online phase, the similarity between the online measurement and each cluster head was computed, and the cluster with the smallest distance was selected as the coarse location of the object (Abdullah, 2018). Each cluster contains a subset of RPs that confines the location of the object, called the region of interest (ROI) (Abdullah et al., 2017). In contrast to traditional indoor localization techniques, which select the RPs and the APs based on the fingerprinting database, three different schemes were evaluated to select the APs: the random selection scheme, the strongest APs scheme, and the Fisher criterion. A comparison of the three schemes showed that the random selection scheme performed best. After that, a novel framework that combines Bayesian inference with the adaptive Least Absolute Shrinkage and Selection Operator (LASSO) was proposed for fine localization. It formulates the user location as a sparse position vector, which is found by identifying the most similar RPs and minimizing the dissimilarity between the online measurement and the fingerprinting database. Table 1 summarizes the notation of this work.

Related work
Significant advances have been made in the area of wireless localization using RF signals. RSS maps were employed by Radar, in which the RSS values of the APs were recorded with their locations. A subsequent work (Feng et al., 2009) enhanced the performance of Radar by exploiting RSS properties such as scattering and multipath. In one study (Orr & Abowd, 2000), an approach was proposed to reduce the localization distance error by estimating the conditional probability of different locations from nine APs, and then employing the speed of the user's mobile device to refine the estimate. Kung et al. (2009) studied the effect of the RSS power of the APs on the localization distance error and categorized the APs as "good", representing the APs with maximum power, and "de-emphasizing", representing the APs that play a bad role. In one study (Youssef & Agrawala, 2005), the accuracy was significantly improved by using an autoregressive model that captures the autocorrelation of RSS samples from the same AP. The RP was created based on the autocorrelation of the APs, and then the variance and mean were estimated based on that RP. In most previous works, object location has been estimated using a probabilistic framework with an interpolation method. Chai and Yang (2007) proposed a hidden Markov model (HMM) that utilizes the trace of the user to develop a patch on the radio map. The proposed algorithm exploited the motion constraint to further improve the accuracy of the radio map. A multivariate Gaussian model-based algorithm was then used to estimate the location of the object. In general, probabilistic approaches have better accuracy than deterministic approaches, as the latter provide only a single point estimate of the user's location.
To improve accuracy and reduce complexity, Milioris et al. (2014) used a multivariate Gaussian model, exploiting the mean vector and covariance matrix and employing them in the Kullback-Leibler divergence. Their method needs a single iteration, whereas the HMM-based localization algorithm needs multiple iterations to converge. Recently, compressive sensing (CS) has motivated researchers to apply it in many emerging applications; nevertheless, CS-LPS is still at an early stage. Orr and Abowd (2000) proposed a CS framework that takes advantage of spatial sparsity by using a constrained ℓ1 norm over a learned dictionary, taking the average RSS measurement as a signature for the WLAN. The dictionary was established by concatenating the cells, and the measurement from the unknown cell was projected onto the dictionary to form the measurement vector. Several studies (Zhao et al., 2017) have proposed a novel approach that leverages the sparsity of the RSS measurement. These techniques open a new era for deterministic approaches by dividing the RSS measurement into a subset of small vectors containing the indices of each nonzero measurement. The object location can thus be estimated by locating the sparse position, solving the ℓ1-minimization between the RSS measurement of the online phase and the RSS measurements of the offline phase. Kalistatov (2019) investigated wireless video monitoring of traffic flows in magnetic transport construction by calculating the radio channel, and identified the advantages and disadvantages of the infrastructure. Astaneh and Gheisari (2018) reviewed cognitive networks and their routing parameters, showed that static spectrum allocation is inefficient because of the lack of spectrum, and compared routing metrics to analyze the challenges in multi-route and one-way routing. X. Wang et al. (2020) investigated passenger flow on urban roads and how the weather affects the daily ridership rate. Čokl et al. (2019) investigated the multimodal signals of stink bugs transmitted through the substrate and air, and how they can affect aggregation pheromones as uni- or multicomponent signals.

Indoor positioning system
Fingerprint data collection is the main task of an indoor positioning system (IPS) in the offline phase. A person holding a WLAN-enabled mobile phone collects RSS data at different locations from various APs. This process was performed at the College of Engineering and Applied Sciences at Western Michigan University (Kalamazoo, MI, USA). It was observed that the RSS data have a multimodal distribution for many reasons, such as multipath propagation and scattering. For confirmation, an experiment was performed in which the signal-to-noise ratio (SNR) of a single AP was collected for 35 minutes in a long corridor using a mobile robot. It was found that the SNR value varied by 10 dBm, as shown in Figure 1.

IPS data collection
The body of the person holding the mobile phone, as well as passing pedestrians, obstructs the signal, so the radio map fingerprint is recorded in four directions, 45°, 135°, 225°, and 315°, over a window of time samples. The time sample is denoted as $\psi^{(o)}_{i,j}(t)$, where $(o)$ represents the direction of the orientation and $t$ is the time sample. For each of the four directions, the average and variance of the samples were calculated to generate the radio map, with the average given by $\bar{\psi}^{(o)}_{i,j} = \frac{1}{q}\sum_{t=1}^{q}\psi^{(o)}_{i,j}(t)$, where $t = 1, \ldots, q$ and $q$ is the number of time samples, $i = 1, \ldots, L$, $j = 1, \ldots, N$, $L$ is the number of APs, and $N$ is the number of RPs.
$\Delta^{(o)}_{i,j}$ is the variance for AP $i$ at RP $j$ with orientation $(o)$, so the radio map of the offline phase is the $L \times N$ matrix $\Psi^{(o)}$ whose $(i, j)$ entry is $\bar{\psi}^{(o)}_{i,j}$.
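The offline averaging step can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the array layout (orientations, time samples, APs, RPs), the function name `build_radio_map`, the synthetic RSS values, and the use of the unbiased sample variance are all assumptions.

```python
import numpy as np

def build_radio_map(rss):
    """Average RSS time samples into a radio map.

    rss: array of shape (O, q, L, N) holding RSS in dBm for
         O orientations, q time samples, L APs and N RPs (assumed layout).
    Returns the per-orientation mean and sample variance, each (O, L, N).
    """
    rss = np.asarray(rss, dtype=float)
    mean = rss.mean(axis=1)            # time average per AP, RP, orientation
    var = rss.var(axis=1, ddof=1)      # unbiased variance (an assumption)
    return mean, var

rng = np.random.default_rng(0)
# Hypothetical survey: 4 orientations, 10 time samples, 3 APs, 5 RPs.
samples = -60.0 + 4.0 * rng.standard_normal((4, 10, 3, 5))
radio_map, variance = build_radio_map(samples)
```

Each slice `radio_map[o]` is then an L x N matrix that plays the role of the fingerprinting database for orientation `o`.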

Clustering based on k-means Hölder divergence
In this section, the symmetric Hölder divergence is introduced to calculate the dissimilarity between two distributions p and q. Hölder divergences are built from a bi-parametric inequality lhs(p, q) ≤ rhs(p, q), where lhs and rhs denote the left-hand side and right-hand side, respectively, by taking the log-ratio gap: $D(p : q) = -\log\frac{\mathrm{lhs}(p, q)}{\mathrm{rhs}(p, q)} \ge 0$. The relationship between the divergence families is illustrated in Figure 2.
The Hölder divergence is parameterized by the conjugate exponents $\alpha$ and $\beta$, which satisfy $\frac{1}{\alpha} + \frac{1}{\beta} = 1$, and by $\gamma > 0$: $D^{H}_{\alpha,\gamma}(p : q) = -\log\frac{\int p(x)^{\gamma/\alpha} q(x)^{\gamma/\beta}\,dx}{\left(\int p(x)^{\gamma}\,dx\right)^{1/\alpha}\left(\int q(x)^{\gamma}\,dx\right)^{1/\beta}}$.
The symmetrized Hölder divergence is: $S^{H}_{\alpha,\gamma}(p : q) = \frac{1}{2}\left(D^{H}_{\alpha,\gamma}(p : q) + D^{H}_{\alpha,\gamma}(q : p)\right)$. To improve the IPS accuracy, we propose k-means combined with the symmetrized Hölder divergence. The k-means algorithm was first introduced by Lloyd in 1957 to solve vector quantization problems: seeds are chosen from the data points as initial centroids, and each point is associated with the closest center. The centers are then updated at each iteration until the difference between successive iterations falls below a threshold value. In general, k-means uses the squared Euclidean distance to calculate the dissimilarity between the center and the other points. The center $c_i$ of cluster $C_i$ is defined as follows: $c_i = \frac{1}{|C_i|}\sum_{x \in C_i} x$, where $|C_i|$ denotes the cardinality of $C_i$.
Given the data of the fingerprinting radio map database, k-means Hölder divergence was performed based on $S^{H}_{\alpha,\gamma}(p : q)$ (Nielsen et al., 2017). The cost function of k-means Hölder divergence is $E = \sum_{i=1}^{n} S^{H}_{\alpha,\gamma}(p_i : c_{l_i})$, where $p_i$ are the $n$ data points, $c_{l_i}$ are the cluster centers, and $l_i \in \{1, \ldots, L\}$ are the cluster labels.
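The clustering step can be sketched as follows for discrete, strictly positive vectors such as normalized RSS histograms. This is a sketch under stated assumptions, not the authors' code: the discrete form of the divergence, the farthest-first seeding, the arithmetic-mean center update, and the names `holder_divergence`, `sym_holder` and `holder_kmeans` are all illustrative choices.

```python
import numpy as np

def holder_divergence(p, q, alpha=2.0, gamma=1.0):
    """Discrete Hölder divergence D(p:q) as a log-ratio gap (inputs must be > 0)."""
    beta = alpha / (alpha - 1.0)          # conjugate exponent: 1/alpha + 1/beta = 1
    lhs = np.sum(p ** (gamma / alpha) * q ** (gamma / beta))
    rhs = np.sum(p ** gamma) ** (1.0 / alpha) * np.sum(q ** gamma) ** (1.0 / beta)
    return -np.log(lhs / rhs)

def sym_holder(p, q, alpha=2.0, gamma=1.0):
    """Symmetrized divergence: average of D(p:q) and D(q:p)."""
    return 0.5 * (holder_divergence(p, q, alpha, gamma)
                  + holder_divergence(q, p, alpha, gamma))

def holder_kmeans(X, k, alpha=2.0, gamma=1.0, iters=50):
    X = np.asarray(X, dtype=float)
    # Farthest-first seeding from the data points (an assumption; any seeding works).
    centers = [X[0]]
    while len(centers) < k:
        d = np.array([min(sym_holder(x, c, alpha, gamma) for c in centers) for x in X])
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    labels = None
    for _ in range(iters):
        dist = np.array([[sym_holder(x, c, alpha, gamma) for c in centers] for x in X])
        new_labels = dist.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break                          # assignments stopped changing
        labels = new_labels
        for i in range(k):
            members = X[labels == i]
            if len(members) > 0:
                centers[i] = members.mean(axis=0)   # arithmetic-mean center
    return labels, centers

# Hypothetical normalized RSS histograms forming two groups.
X = np.array([[0.80, 0.10, 0.10], [0.70, 0.20, 0.10], [0.75, 0.15, 0.10],
              [0.10, 0.10, 0.80], [0.10, 0.20, 0.70], [0.15, 0.10, 0.75]])
labels, centers = holder_kmeans(X, k=2)
```

With `alpha = 2` and `gamma = 2` the divergence reduces to the Cauchy-Schwarz divergence. The divergence is projective, so rescaling a vector does not change its distance to the others.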

Model of sparse recovery modification
Conventional fingerprinting techniques have high computational complexity; nevertheless, CS has introduced a new methodology in wireless localization with high accuracy and acceptable run time. In previous works, the online location has been assumed to coincide with one of the RPs of the radio map. Wireless localization with sparse information can then be formulated as finding an indicator vector of the position, which can be recast as a one-sparse vector recovered by an $\ell_1$-minimization problem. The sparse recovery in wireless indoor localization can be represented as: $y = \Phi\Psi\theta + \varepsilon$, where $\theta = [0, 0, 1, \ldots, 0]^T$ represents the sparse location vector, in which 1 indicates the index of the nearest RP; $\Phi$ represents the selection matrix of APs from the fingerprinting database $\Psi$; $\varepsilon$ represents the error vector; and $y$ represents the RSS measurement of the online phase from the selected APs. In general, CS does not take the measurement noise $\varepsilon$ into the calculation, and the problem is underdetermined because the size of $\theta$ is greater than the size of $y$. Nevertheless, $\theta$ has only one nonzero entry, corresponding to one RP location, and can be recovered by using convex optimization: $\hat{\theta} = \arg\min_{\theta}\|\theta\|_1$ subject to $y = \Phi\Psi\theta$, where $\|\theta\|_1$ is the $\ell_1$-norm of $\theta$. Still, this optimization problem does not account for the measurement error $\varepsilon$.
Many algorithms have been proposed to deal with measurement error, such as basis pursuit (Jin et al., 2010) and greedy algorithms (Zou, 2006). Previous techniques combined a fine localization method with offline clustering and other optimization algorithms. In general, offline clustering requires the number of clusters to be fixed a priori, which forces the data to be divided into a specific number of clusters, and only one or a few of them will be chosen even if the user lies between them. Moreover, the matrices have to obey two conditions to obtain optimal results: first, the coherence between Ψ and Φ has to be small; second, the product of Ψ and Φ has to be orthogonal (Alhamzawi & Ali, 2018). Nevertheless, orthogonality cannot be completely achieved since the product is not a square matrix.
In general, the main goal of indoor localization is to estimate the location of the object from WLAN signals by using the RSS readings of the online phase. Essentially, we are looking for an algorithm that can map the online readings onto the fingerprinting database. The sparse vector obtained by minimization has many zeros at the less probable RPs, while the match between the radio map fingerprint and the online measurement is quite precise, with small residuals. Hence, we propose a convex formulation that combines the $\ell_1$ and $\ell_2$ norms as $\hat{\theta} = \arg\min_{\theta}\|y - \Phi\Psi\theta\|_2^2 + \lambda\left(\alpha\|\theta\|_1 + (1 - \alpha)\|\theta\|_2^2\right)$, where $0 \le \alpha \le 1$ adjusts the balance between Lasso and ridge regression and $\lambda$ is a complexity parameter. Ridge regression tends to shrink the coefficients of correlated fingerprinting database entries through $\|\theta\|_2^2$. If $\alpha$ is set to 0, the above equation reduces to ridge regression, while as $\alpha$ increases from 0 to 1 for a given $\lambda$, the sparsity of the solution increases monotonically toward the LASSO solution.
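A combined ℓ1 and ℓ2 penalty of this kind can be minimized with proximal gradient descent (ISTA). The sketch below is illustrative only: the objective scaling, the solver, the synthetic Gaussian radio map, the unit-norm column scaling, and every name in it are assumptions rather than the authors' implementation.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * |.|, applied elementwise."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def elastic_net(A, y, lam=0.1, alpha=0.9, iters=3000):
    """Minimise ||y - A theta||_2^2 + lam*(alpha*||theta||_1 + (1-alpha)*||theta||_2^2)."""
    ridge = lam * (1.0 - alpha)
    # Step size 1/Lipschitz constant of the smooth part of the objective.
    step = 1.0 / (2.0 * (np.linalg.norm(A, 2) ** 2 + ridge))
    theta = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = 2.0 * (A.T @ (A @ theta - y)) + 2.0 * ridge * theta
        theta = soft_threshold(theta - step * grad, step * lam * alpha)
    return theta

rng = np.random.default_rng(0)
num_aps, num_rps, true_rp = 20, 30, 7
Psi = rng.standard_normal((num_aps, num_rps))   # hypothetical normalised radio map (APs x RPs)
selected = np.arange(15)                        # hypothetical subset of 15 APs
Phi = np.eye(num_aps)[selected]                 # AP selection matrix
A = Phi @ Psi
A = A / np.linalg.norm(A, axis=0)               # unit-norm columns (toy-data assumption)
y = A[:, true_rp] + 0.01 * rng.standard_normal(len(selected))

theta_hat = elastic_net(A, y)
estimated_rp = int(np.argmax(np.abs(theta_hat)))
```

Setting `alpha=0` leaves only the ridge term, while `alpha=1` gives the plain Lasso; the largest entry of `theta_hat` indicates the RP that best explains the online measurement.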

Bayesian adaptive Lasso
The $\ell_1$-norm penalty is proportional to the size of each signal component, which leads to suboptimal solutions. Alhamzawi and Ali (2018) proposed assigning different weighting coefficients of the $\ell_1$ penalty to the different entries of the sparse vector $\gamma$. A Lagrangian argument was then used to improve the accuracy, and the adaptive Lasso is expressed as: $\hat{\gamma} = \arg\min_{\gamma}\|y - \Phi\Psi\gamma\|_2^2 + \sum_{i}\mu_i|\gamma_i|$, where $\mu_i > 0$ is the weight of the $i$-th entry. Since the Lasso and adaptive Lasso solutions are both sparse, choosing suitable penalization parameters leads to better performance. Themelis et al. (2012) developed a hierarchical Bayesian model through independent Laplace priors.
In conclusion, the hierarchical Bayesian model with adaptive Lasso places a multivariate Laplace distribution, with independent entries, as the prior of $\gamma$: $p(\gamma \mid \mu) \propto \prod_{i}\exp(-\mu_i|\gamma_i|)$. The MAP estimator is defined as: $\hat{\gamma}_{\mathrm{MAP}} = \arg\max_{\gamma}\, p(\gamma \mid y, \mu)$.
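The effect of entry-specific weights can be illustrated with a reweighted Lasso, in which entries that were small in the previous solution receive a larger penalty. This is only a deterministic sketch of the adaptive Lasso idea, not the hierarchical Bayesian inference itself; the reweighting rule `lam / (|g| + eps)`, the ISTA solver, the synthetic data, and all names are assumptions.

```python
import numpy as np

def weighted_lasso(A, y, weights, iters=2000):
    """Minimise ||y - A g||_2^2 + sum_i weights[i] * |g_i| with ISTA."""
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)
    g = np.zeros(A.shape[1])
    for _ in range(iters):
        v = g - step * 2.0 * (A.T @ (A @ g - y))
        g = np.sign(v) * np.maximum(np.abs(v) - step * weights, 0.0)
    return g

def adaptive_lasso(A, y, lam=0.05, eps=1e-3, rounds=3):
    """Reweighted Lasso: small entries get large weights in the next round."""
    weights = np.full(A.shape[1], lam)       # first round is a plain Lasso
    g = np.zeros(A.shape[1])
    for _ in range(rounds):
        g = weighted_lasso(A, y, weights)
        weights = lam / (np.abs(g) + eps)    # hypothetical reweighting rule
    return g

rng = np.random.default_rng(1)
true_rp = 7
A = rng.standard_normal((15, 30))            # hypothetical 15 APs x 30 RPs
A = A / np.linalg.norm(A, axis=0)            # unit-norm columns (toy-data assumption)
y = A[:, true_rp] + 0.01 * rng.standard_normal(15)

gamma_hat = adaptive_lasso(A, y)
estimated_rp = int(np.argmax(np.abs(gamma_hat)))
```

Compared with a single uniform penalty, the reweighting shrinks the dominant entry less while pushing the near-zero entries harder toward zero.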

Maximum likelihood approach using adaptive Lasso Bayesian inference
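In general, Bayesian inference depends on the posterior distribution of the unknown parameters given the measurement $y$. In the maximum likelihood approach, the marginal likelihood is maximized by integrating out $w$ and maximizing with respect to the other unknown parameters. The posterior distribution of $w$ can therefore be expressed as a multivariate Gaussian. By using Bayes' law, $p(y, \gamma, \lambda, \beta)$ is obtained by integrating $w$ out of $p(w, y, \gamma, \lambda, \beta)$, i.e., $p(y, \gamma, \lambda, \beta) = \int p(w, y, \gamma, \lambda, \beta)\,dw$, and is then maximized.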

Experimental results
In this section, the proposed algorithm is evaluated experimentally on data from a real environment. Since a large number of smartphones run the Linux-based Android operating system, the RSS data of the fingerprinting database were collected using a Samsung S5 mobile phone running Android 4.4.2, through an Android application that uses the built-in Wi-Fi chip. To create the fingerprinting database, the MAC address and RSS value were recorded within a time sample frame from 1 s to 100 s at 84 RPs in four directions on a grid of 1 m. A total of 47 active APs were detected over the layout shown in Figure 3. Furthermore, to evaluate the performance of the proposed algorithms, the online data were collected in different environments at different times of day at 65 unknown locations used as test points.
Most previous work used a uniform granularity of RPs. Wi-Fi signals were not designed for localization: although a dense grid of RPs can lead to low localization error, it has a high computational cost and great redundancy. On the other hand, due to the 1-sparse vector property, a dense grid of RPs is beneficial for sparsity-based algorithms and can lead to a high-precision positioning system. The proposed algorithms are evaluated using the average Euclidean distance between the estimated location of each test point and its true location, known as the Mean Average Error (MAE).
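The evaluation metric can be written down directly. The function name and the example coordinates below are hypothetical; only the definition (mean Euclidean distance between estimated and true positions) comes from the text.

```python
import numpy as np

def mean_localization_error(estimated, true):
    """Average Euclidean distance (in the units of the coordinates)."""
    estimated = np.asarray(estimated, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(np.mean(np.linalg.norm(estimated - true, axis=1)))

# Two hypothetical test points with coordinates in metres.
estimated_xy = [[0.0, 0.0], [3.0, 4.0]]
true_xy = [[0.0, 0.0], [0.0, 0.0]]
mae = mean_localization_error(estimated_xy, true_xy)
```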

A clustering-based approach for improving the accuracy of LPS
The performance of sparse recovery localization depends strongly on the size of the fingerprinting database and the number of APs used in the measurements. For instance, the number of measurements in a CS-based localization method has to be on the order of log Ñ to allow sparse recovery of the positioning vector. Nevertheless, the fingerprinting database is correlated, which demands a large number of measurements and a large number of APs for sparse recovery; this is considered a major problem in LPS. This problem is alleviated in adaptive Lasso Bayesian inference because it has its own scheme for recovery from correlated data. Another approach to enhance the localization quality is to cluster the fingerprinting database by using k-means Hölder divergence, since clustering the database decreases the number of RPs that have to be processed by the proposed technique. Figure 4 compares k-means Hölder divergence with traditional k-means in terms of localization distance error against the number of APs; k-means Hölder divergence outperforms traditional k-means. The proposed algorithms with clustering show a lower distance error with a smaller number of APs. For instance, the lowest localization distance error for adaptive Lasso Bayesian inference was obtained with 22 APs, which shows that clustering can improve the localization quality with a small number of APs. Figure 5 shows the localization distance error (m) of adaptive Lasso Bayesian inference and CS as the number of APs increases. When the number of APs is low, the difference in localization distance error is large, which shows the high impact of clustering on adaptive Lasso Bayesian inference and CS. The best results were obtained when 22 or more APs were used.

AP selection methods
AP selection can be performed in either the online or the offline phase. If it is performed in the offline phase, the AP selection depends only on the fingerprinting database, regardless of the online measurement. Such a selection may fail badly if the characteristics of the online environment differ from those of the offline environment. Hence, AP selection during the online phase uses either implicit or explicit characterization. With implicit characterization, the AP selection exploits both the offline and online phases, while with explicit characterization, the AP selection uses only the online measurements. This process is formulated as follows: each row of the selection matrix is $\phi_i = [0, \ldots, 1, \ldots, 0]$, $i = 1, \ldots, L'$, where $\Phi$ is the AP selection matrix for the subset of APs $\Gamma' \subseteq \Gamma$, and $L' \le L$ is the cardinality of $\Gamma'$.
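A selection matrix of this form has one row per selected AP, with a single 1 in the column of that AP. The helper name `selection_matrix` and the example values are hypothetical.

```python
import numpy as np

def selection_matrix(selected, num_aps):
    """Build the L' x L matrix whose i-th row picks AP selected[i]."""
    selected = np.asarray(selected, dtype=int)
    phi = np.zeros((len(selected), num_aps))
    phi[np.arange(len(selected)), selected] = 1.0
    return phi

rss_all = np.array([-40.0, -55.0, -70.0, -85.0])   # hypothetical RSS from L = 4 APs
Phi = selection_matrix([2, 0], num_aps=4)          # keep APs 2 and 0 (L' = 2)
rss_selected = Phi @ rss_all
```

Multiplying the radio map by the same matrix keeps only the rows of the selected APs, so the online vector and the database stay aligned.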

Strongest APs
Early techniques advocate selecting APs by their signal strength in the fingerprinting database as well as in the online phase. The intuition is that the APs with the strongest RSS have the best coverage and the most accurate measurements. Nevertheless, this assumption may not be a suitable selection criterion (Feng et al., 2012).
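Selecting the strongest APs amounts to ranking the RSS readings and keeping the top entries. The sketch below ranks by the online reading only, which is one possible reading of the scheme; the function name and the values are hypothetical.

```python
import numpy as np

def strongest_aps(online_rss, k):
    """Return the indices of the k APs with the highest RSS (dBm), sorted."""
    order = np.argsort(online_rss)[::-1]   # strongest first
    return np.sort(order[:k])

online_rss = np.array([-70.0, -40.0, -90.0, -55.0])   # hypothetical readings
chosen = strongest_aps(online_rss, k=2)
```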

Fisher criterion
The Fisher criterion is a metric that quantifies the discrimination capacity of each AP across the fingerprinting database by considering the stability of the AP across the RPs. It uses the statistical properties of the RPs, relying on the performance of the APs during the offline and online phases, from which a score is calculated for each AP. An AP receives a smaller score when it has a higher variance, which indicates that the AP is less reliable. Nevertheless, the Fisher criterion may lead to faulty measurements if an AP appears in the fingerprinting database but does not appear in the online phase (Youssef et al., 2003).
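One common form of the Fisher score divides the spread of an AP's mean RSS across the RPs by the sum of its per-RP variances, so that a noisier AP scores lower. The sketch below uses that form as an assumption, since it may differ from the exact expression used in this work; the function names and the numbers are hypothetical.

```python
import numpy as np

def fisher_scores(mean_rss, var_rss):
    """Score each AP: between-RP spread of the mean over summed within-RP variance.

    mean_rss, var_rss: arrays of shape (L, N) for L APs and N RPs.
    """
    ap_mean = mean_rss.mean(axis=1, keepdims=True)
    between = ((mean_rss - ap_mean) ** 2).sum(axis=1)
    within = var_rss.sum(axis=1)
    return between / within

def top_aps(scores, k):
    """Indices of the k highest-scoring APs, sorted."""
    return np.sort(np.argsort(scores)[::-1][:k])

# AP 0 separates the RPs well and is stable; AP 1 is flat and noisy.
mean_rss = np.array([[-40.0, -60.0, -80.0],
                     [-60.0, -60.0, -61.0]])
var_rss = np.array([[1.0, 1.0, 1.0],
                    [4.0, 4.0, 4.0]])
scores = fisher_scores(mean_rss, var_rss)
best = top_aps(scores, k=1)
```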

Random selection
In contrast to the other two schemes, in this scheme the APs are selected arbitrarily, regardless of their performance. The computational complexity is lower than that of the other two schemes because no variance needs to be calculated: a selection matrix picks a different subset of APs at each run, and the subset with the highest performance is then selected. Figure 6 compares the different AP selection schemes in terms of localization distance error with adaptive Lasso Bayesian inference, where the x-axis represents the number of APs and the y-axis the localization distance error. The best results were obtained with the random scheme. The performance of the Fisher criterion and the strongest APs scheme became very close as the number of APs increased. The best results were obtained when 22 APs were used. Thus, we conclude that not only the number of APs but also the AP selection scheme affects the performance of the proposed system.
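The random scheme can be sketched as repeated random draws of an AP subset, keeping the draw with the best score. The scoring callback is hypothetical: in practice it would be a localization error measured on held-out data, and the toy score used here only makes the example self-contained.

```python
import numpy as np

def random_ap_selection(num_aps, k, score, runs=20, seed=0):
    """Draw `runs` random subsets of k APs and keep the lowest-scoring one."""
    rng = np.random.default_rng(seed)
    best_subset, best_score = None, np.inf
    for _ in range(runs):
        subset = np.sort(rng.choice(num_aps, size=k, replace=False))
        s = score(subset)                  # lower is better (e.g. distance error)
        if s < best_score:
            best_subset, best_score = subset, s
    return best_subset, best_score

# Toy score standing in for a localization error (purely illustrative).
toy_score = lambda subset: float(np.sum(subset))
subset, best = random_ap_selection(num_aps=10, k=3, score=toy_score)
```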

Online computing and prior work comparison
The fine estimation of the object location depends strongly on the number of APs used in the proposed algorithm. Different techniques and numbers of APs were used with our proposed algorithms to estimate the localization distance error. The results were compared with different prior techniques, such as the CS-based localization method (Feng et al., 2012), Kernel Density Estimation (KDE) (Youssef et al., 2003), and Weighted KNN (WKNN) (Themelis et al., 2012). Our implementation of CS-based localization contained all the steps implemented in (Han et al., 2015) except the coarse localization schemes, which were replaced by offline clustering. Figure 7 shows the localization distance error of the prior algorithms compared with the proposed algorithm: KDE and WKNN have the highest localization distance error, while the lowest localization error was obtained with the proposed algorithm at 22 APs. The localization error remained high with CS-based localization even when a large number of APs were used. Adaptive LASSO showed better performance when a large number of APs were used. Figure 8 depicts the cumulative distribution function (CDF) of the localization error for the proposed algorithm along with some well-known localization algorithms. Our proposed algorithm surpasses the other schemes, with an error of around 0.9 m at the 90th percentile, whereas CS-based localization exceeds 1.2 m at the 90th percentile.
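The CDF comparison can be reproduced from a list of per-test-point errors. The sketch below computes an empirical CDF and reads off the error at a given level; the function names and the error values are hypothetical and are not the measurements reported above.

```python
import numpy as np

def empirical_cdf(errors):
    """Sorted errors and the fraction of test points at or below each one."""
    x = np.sort(np.asarray(errors, dtype=float))
    cdf = np.arange(1, len(x) + 1) / len(x)
    return x, cdf

def error_at_level(errors, level=0.9):
    """Smallest error whose empirical CDF value reaches `level`."""
    x, cdf = empirical_cdf(errors)
    idx = int(np.searchsorted(cdf, level - 1e-12))   # tolerance for float rounding
    return float(x[min(idx, len(x) - 1)])

# Hypothetical localization errors in metres for ten test points.
errors_m = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0]
x, cdf = empirical_cdf(errors_m)
p90 = error_at_level(errors_m, level=0.9)
```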

Conclusion
WLAN LPS has gained great attention due to the existing infrastructure, low-cost positioning, and ease of implementation. WLAN fingerprinting-based localization methods have become very popular due to their strong performance in real environments. WLAN propagation is a very complex phenomenon in which the signal is distorted by scattering, reflection, and multipath, so traditional techniques such as AOA and TOA do not perform well. To reduce the effect of multipath signals and time variation, we used a symmetrical Hölder divergence clustering method. The fingerprinting radio map was created in CEAS, and many algorithms were used to investigate the localization distance error, such as CS, kNN, and PNN. We proposed a novel hierarchical Bayesian model based on the adaptive Lasso criterion and the type-II maximum likelihood methodology. The results showed that the localization distance error was less than 1 m. For accurate results, we used different AP selection schemes to reduce the effect of the multimodal distribution of the RSS signal. Furthermore, the adaptive Lasso was used to obtain the best results with a small number of RPs, which demonstrates the feasibility of using adaptive Lasso to obtain the location of the object from a small number of RPs. We are currently investigating how to localize the object within a smaller cluster, with a more stable localization mechanism and fewer APs. In addition, new algorithms are being built for more accurate results and for a robust system that can maintain and update the database automatically, without manual work.