LoRaWAN-implemented node localisation based on received signal strength indicator

Long Range Wide Area Network (LoRaWAN) provides desirable solutions for Internet of Things (IoT) applications that require hundreds or thousands of actively connected devices (nodes) to monitor the environment or processes. In most cases, the location of the devices plays a critical role. In this regard, the physical characteristics of the communication channel can be leveraged to provide a feasible and affordable node localisation solution. This paper presents an evaluation of the performance of LoRaWAN Received Signal Strength Indicator (RSSI)-based node localisation in a sandstorm environment. The authors employ two machine learning algorithms, Support Vector Regression (SVR) and Gaussian Process Regression (GPR), which turn the high variance of RSSI caused by the frequency hopping feature of LoRaWAN to advantage, creating unique signatures representing different locations. In this work, the RSSI features are used as input location fingerprints to the machine learning models. The proposed method reduces node localisation complexity compared to GPS-based approaches whilst provisioning more extensive connection paths. Furthermore, the impact of the LoRa spreading factor and kernel function on the performance of the developed models has been studied. Experimental results show that the SVR-enhanced fingerprint yields the most significant improvement in node localisation performance.


| INTRODUCTION
The decreasing cost and increasing processing capabilities of computing and communication technologies have fuelled the exponential increase in the number of interconnected devices, commonly referred to as the Internet of Things (IoT) [1]. The deployment of extensive IoT implementations is more often than not subject to the fundamental design constraints of limited resources in terms of low power consumption and low processing capabilities. The availability of Low-Power Wide Area Network (LPWAN) technologies has provisioned the characteristics aligned with the needs of these applications. Amongst the range of options, LoRa [2][3][4] has been widely adopted owing to an advantageous combination of features: low-cost and low power consumption with long range wireless connectivity.
Accurate node localisation is central to many beneficial applications within extensive IoT networks [5,6]. The bulk of existing applications and services harness mature Global Navigation Satellite Systems such as the Global Positioning System (GPS) [7] or Global Navigation Satellite System [8]. Although these platforms provide accurate location estimations, their implementations are relatively expensive and, more importantly in the context of IoT, prohibitively power hungry [9]. For instance, a GPS receiver consumes between 30 and 50 mA while acquiring a fix, which can take tens of seconds [10], largely attributable to the necessary exchanges of data. A more aligned approach is to develop network-based localisation techniques, which harness a suitable parameter inherent to data transmission as the foundation. This paper details the development of such a scheme, taking advantage of the flexibility provisioned by the LoRa spread spectrum modulation scheme.
Inherent features of radio signals that characterise wireless networks can be used to determine the location of a node in the network [13,14]. Signal propagation is dynamic, and modelling the environment in which electromagnetic signals propagate is challenging. Measurable quantities such as the Received Signal Strength Indicator (RSSI) [15,16] vary with the position of the transmitters and are time dependent. A number of reported network-based location estimations based on RSSI indicate that deterministic solutions lack accuracy because of the temporal dependence of the measurements.
Here, in order to maintain simplicity of implementation whilst meeting the needs of extensive low power deployments, a technique is developed and evaluated that extends reported research relying on the inherent relationship between RSSI and physical distance; this relationship is in turn the basis for estimating node location from the RSSI value of a new, unknown node. We evaluate the proposed procedure for the estimation of node location in a sandstorm environment through a field trial, using LoRaWAN as an example technology. The aim of the evaluation is threefold. First, we aim to demonstrate the feasibility of transformed RSSI-based node localisation using machine learning algorithms. We do that by showing that estimation based on the transformed RSSI (RSSI ratio) outperforms the estimation of node location based on the absolute RSSI benchmark. Second, we aim to demonstrate the impact of Spreading Factor (SF) and kernel function on the estimation of node location. This is done by using different SFs to gather different datasets and different kernels to evaluate the kernelised algorithms. The results demonstrate the impact of these parameters on the performance of node localisation models. Third, we aim to demonstrate the consistency of the best performing technique across different scenarios by applying it to a public dataset (the Antwerp dataset).
The physical locations and relative elevations of nodes with respect to receivers within the operational environment are central to the solution. Thus, the approach adopts a 'fingerprinting' methodology that models transmission for the network under the environment that governs the coverage area; in this case, the 'fingerprint' is established from the comprehensive mapping of RSSI values. A number of node localisation techniques developed using RSSI-based fingerprinting for the estimation of node location in LoRaWAN and SigFox settings have been reported and summarised in Table 1. The base models have been enhanced through the use of machine learning in order to increase node location accuracy.
In Ref. [13], the fingerprint was enhanced through K-Nearest Neighbour methods to achieve a mean distance error of 689 m (Sigfox) and 398 m (LoRaWAN) using 84 and 68 base stations, respectively. However, neither the locations of the base stations nor the SF used in the development of the LoRaWAN-based technique was provided. The authors of Ref. [14] focussed on the development of an outdoor parking positioning system for a restricted coverage area (340 × 340 m) utilising 4 LoRaWAN base stations transmitting at an SF of 7. The Maximum Likelihood method achieved a mean distance error of 24 m. Gaussian Process Regression (GPR)-based fingerprinting for localisation [17] achieved a mean distance error of 25 m in a campus outdoor area (150 × 250 m), utilising 10 LoRaWAN base stations transmitting at an SF of 12. The latter studies focussed on solutions for relatively modest outdoor coverage areas.
In summary, previous studies have used empirical RSSI measurements for node localisation in moderate coverage areas. However, transforming RSSI measurements into average RSSI ratios between pairs of gateways as the input fingerprint, combined with suitable kernel functions and a high SF, can be harnessed to optimise and improve the estimation of node location in more extensive coverage areas (in the order of kilometres).
The remainder of the paper is organised as follows: Section 2 details the data gathering infrastructure, measurement methodology, the coverage area under consideration and environmental conditions; Section 3 describes the establishment of the 'fingerprint'; Section 4 details the enhancement to the fingerprinting technique owing to the application of two kernel-based machine learning techniques, Support Vector Regression (SVR) and Gaussian Process Regression (GPR); Section 5 presents an evaluation of the node location performance of the proposed approaches; finally, Section 6 draws conclusions.

| MEASUREMENT METHODOLOGY
Measurements were carried out in Jazan City in Saudi Arabia to capture the radio propagation of LoRa nodes in sandstorm conditions. Figure 1 shows the map illustrating the location of the gateways, deployed across a semi-uniform grid given that the terrain is characterised by buildings and natural obstacles such as trees. Gateways/receivers are positioned on the outskirts of Jazan City on four elevated structures, with their respective elevations provided in Table 2. The four gateways (Figure 1; black circles) were located at points 4-7 km around the coverage area containing the transmitter nodes at varying locations. Although only four gateways were used to demonstrate the proposed method, it can easily be scaled to any number of gateways.
Gateways are placed at elevated positions to extend the range of the network that would otherwise be impaired due to buildings and natural obstacles. The transmitter node is fixed when taking measurements and moved between grid points within the coverage area. Measurements were taken from 150 locations (viz. grid points) using different SFs. The distance between grid points is approximately 100 m. The closest measurement is taken at a distance of 4 km and the furthest 7 km. 20 RSSI packets are recorded from each measurement location at different SFs, referred to as SF9, SF10, SF11 and SF12, respectively; a total of 3000 measurements were acquired for each SF. Each packet comprised GPS location coordinates as a payload with gateways issuing an acknowledgement on the successful receipt of the payload. The measured RSSIs at each gateway were uploaded to The Things Network (TTN) server along with the payload information.
The data acquisition system consisted of four LoRaWAN transceiver gateways and a transmitter, accessing the Internet through laptops. The gateways comprise iC880A-SPI LoRaWAN 868 MHz concentrators connected to a Wi-Fi enabled host (Raspberry Pi 3 Model B SBC platform with 16 GB micro-SD card), fitted with a 2 dBi SubMiniature Version A (SMA) antenna, and are housed in acrylonitrile butadiene styrene (ABS) enclosures with mains electrical power supply. The enclosures are designed to guarantee operation between −5°C and +55°C, meeting the requirements of the operational environmental conditions. Gateways are the data collectors of the architecture, utilising 868 MHz channels for data transmission. Packets can be received from different nodes with different SFs, on up to 8 channels in parallel. Gateways are also equipped with an external control microprocessor, and an RPi 3 unit is connected to the IMST (Institute for Mobile & Satellite Communication Technology) concentrator via the Serial Peripheral Interface (SPI) bus. The RPi 3 is Wi-Fi enabled and connected to 4G connectors in order to receive and transmit data to the server ('TTN' server). The transmitter node is a Sodaq One v2 LoRaWAN device with a 3 dBi 868 MHz antenna, connected to a GPS module (u-blox EVA-7M). The node consists of an RN2483 transceiver with 14 dBm transmission power and a bandwidth of 125 kHz, powered by an 800 mAh lithium battery.

| Testbed environment
Two testbeds were designed for this measurement campaign to capture the radio propagation of LoRa nodes in sandstorm conditions for node localisation as a function of SF. In the first, RSSI measurements were taken to characterise the propagation of LoRa nodes in a sandstorm environment as compared to clear sky. Figure 2 shows the two environmental conditions: clear sky and sandstorm in the city of Jazan. In this testbed, measurements were taken at locations positioned 100 m away from each other up to 3 km. At each location, the transmitter sent more than 10 packets, which were received by the gateway at an SF of 7. The gateway was placed on the roof of a stationary car (approximately 2 m above the ground). The location of the transmitter node was taken with reference to the gateway. The transmitter node was placed in a car (approximately 1 m above ground level) and moved to the pre-defined locations until all measurements were taken. In the second, measurements were taken to validate the proposed node localisation technique. In this case, four gateways are located on the outskirts of the urban area, and the transmitters were located in the rural environment as shown in Figure 1. The propagation path between the test area and the gateways is characterised by buildings of different elevations (9-30 m). In fact, the experimental environment (sandstorm) in this work can be termed a semi-urban environment.
Both sets of measurements were acquired during the monsoon winds in the months of July and August. The wind speed is the most important environmental factor that impacts signal propagation in this context. It is reasonable to expect that as the strength of the wind increases, the density of the perturbed dust and sand particles increases and the impact on the propagation of the radio signals becomes more significant and time dependent. Table 3 presents the weather conditions in the months of July and August, when measurements were taken. The most challenging season is characterised by dust, high temperature and humidity. Apart from the climatic factors, the radio environment is also characterised by trees and buildings, which can create challenges.

| LoRa transmitter distance estimation
In order to evaluate the performance of the LoRa link for long range transmission and transmitter distance estimation for device location in a sandstorm environment, the Two Ray Ground Reflection Model is used to estimate the transmitter-receiver distance from known measured LoRaWAN RSSI. The two-ray ground reflection model is used in this work because it provides better prediction at long distances compared to other ready-made models [6]. The average RSSI values are used as representative samples to estimate the distance between the transmitter and the receiver. Figure 3 shows the variation in the actual and estimated distances with respect to measured RSSI. As can be seen, there is significantly more attenuation of the signal strength in sandstorm conditions compared to clear sky. However, in both situations, the signal attenuation appears to plateau within a certain range of greater distances. For clear conditions, the RSSI sensitivity value is approximately −106 dBm at 600 m and greater, and for sandstorm conditions, the RSSI sensitivity value is approximately −112 dBm at 900 m and greater. Consequently, a large proportion of estimated distances for clear and sandstorm conditions 'bunch up' at 600-700 m and 900-1000 m, respectively.
However, in sandstorm condition, the model could produce inaccurate estimates even at shorter distances of 200 m. The estimated distances later bunch up and are significantly greater or less than actual distances with significant error. In clear condition, reasonable estimates are obtained only up to 600 m after which the estimated distances seem to cluster around 650 m, which is far less than the actual distances. It can be concluded that the use of ready-made propagation model with RSSI measurements to determine distances, and hence position of LoRa devices under clear sky condition is grossly inadequate for the long-range application of LoRaWAN for IoT. Therefore, we will investigate the use of location fingerprint for node location in sandstorm environment in the next section.
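As an illustration of the distance estimation step described above, the following is a minimal sketch of the far-field two-ray ground reflection model and its inversion. The default transmit power (14 dBm), antenna gains (3 dBi and 2 dBi) and antenna heights (1 m and 2 m) are taken from the hardware and testbed description, but using them together in this way, ignoring cable losses, and the function names themselves are assumptions made for illustration rather than the authors' implementation.

```python
import math


def two_ray_rssi_dbm(d_m, pt_dbm=14.0, gt_dbi=3.0, gr_dbi=2.0, ht_m=1.0, hr_m=2.0):
    """Received power (dBm) predicted by the far-field two-ray model.

    Pr = Pt + Gt + Gr + 20*log10(ht*hr) - 40*log10(d), all in dB units.
    """
    return (pt_dbm + gt_dbi + gr_dbi
            + 20.0 * math.log10(ht_m * hr_m)
            - 40.0 * math.log10(d_m))


def two_ray_distance_m(rssi_dbm, pt_dbm=14.0, gt_dbi=3.0, gr_dbi=2.0, ht_m=1.0, hr_m=2.0):
    """Invert the model: estimated distance (m) for a measured RSSI (dBm)."""
    gain_db = pt_dbm + gt_dbi + gr_dbi + 20.0 * math.log10(ht_m * hr_m)
    return 10.0 ** ((gain_db - rssi_dbm) / 40.0)


if __name__ == "__main__":
    for rssi in (-90.0, -100.0, -106.0, -112.0):
        print(f"RSSI {rssi:7.1f} dBm -> {two_ray_distance_m(rssi):8.1f} m")
```

Because the model loses 40 dB per decade of distance, once the measured RSSI flattens near the receiver sensitivity the inverted distance stops growing as well, which is consistent with the 'bunching' behaviour reported above.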

| DATA PREPARATION AND FINGERPRINTS
During the experiment, 20 packets were transmitted from each of the 150 designated locations and were expected to be received by the LoRaWAN gateways deployed in the vicinity of the experimental environment. The vector of absolute RSSI values received by the LoRaWAN gateways is used to develop node localisation models. The average RSSI of the 20 received packets at each grid location represents the fingerprint of that location. Figure 4 shows the RSSI pattern at various points in the radio environment for each of the LoRa gateways. The figures reveal the complexity of the radio environment (sandstorm), which does not fit any well-known propagation model. The complexity of the signal attenuation with distance is a result of noise and distortions. Where an RSSI value is missing from an observation, we substitute it using the mean imputation method [18,19], which increases the amount of information that can be used and hence improves the performance of the node localisation models (as discussed in Section IV). The procedure for mean imputation is as follows:
• Separate each group of 20 packets by location.
• Examine all data for each gateway.
• If all RSSI data at a specific gateway are lost for a given location (referred to as 'Monotone'), replace the missing RSSI values with a specified value.
• If only some data at a specific gateway are missing for a given location (referred to as 'Non-Monotone'), calculate the mean of the measured (non-null) RSSI values of $G_j$ for that location, where $G_j$ denotes the gateway number, and replace the missing values with this mean.
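The imputation procedure above can be sketched as follows. This is an illustrative outline only: the data layout (one list of per-gateway RSSI values per packet, with `None` marking a missing reading), the function name and the fill value used for the 'Monotone' case are all assumptions, since the text states only that "a specified value" is used.

```python
from statistics import mean

# Hypothetical placeholder for the "specified value" used in the Monotone case.
MONOTONE_FILL_DBM = -200.0


def impute_location(packets, fill_value=MONOTONE_FILL_DBM):
    """Mean-impute missing RSSI values for the packets of ONE location.

    packets: list of rows, one per packet; each row holds one RSSI value
             per gateway, with None marking a missing reading.
    Returns a new list of rows; the input is left unchanged.
    """
    n_gateways = len(packets[0])
    imputed = [list(row) for row in packets]
    for j in range(n_gateways):
        observed = [row[j] for row in packets if row[j] is not None]
        # Non-Monotone: some readings exist, so use their mean.
        # Monotone: the gateway heard nothing here, so use the fill value.
        replacement = mean(observed) if observed else fill_value
        for row in imputed:
            if row[j] is None:
                row[j] = replacement
    return imputed


if __name__ == "__main__":
    packets = [[-100.0, None, -110.0],
               [-104.0, None, None]]
    print(impute_location(packets))
```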
In a challenging radio environment, characterised by reflections and obstructions such as the one under consideration, the dynamic variations of absolute RSSI values with time introduce noise in the fingerprints and may impair the performance of node localisation. In this paper, we propose to derive robust fingerprints by taking ratios of RSSI values between gateway pairs in order to mitigate the variations in absolute RSSI values. Assume $G = \{g_1, g_2, \ldots, g_n\}$ is the set of gateways deployed in the area under consideration, and $L = \{l_1, \ldots, l_m\}$ represents the reference node locations. The location feature space, $l_i$, can then be represented by gateways and measured absolute RSSI values $r \in R$, where $R = \{r_1, r_2, \ldots, r_n\}$. The RSSI ratio is defined at each location for a unique pair of gateways. The received signal strength ratio for the gateways $g_i$ and $g_j$ can be computed for a measurement taken at location $l = [(g_i, r_i); (g_j, r_j)]$ as in Equation (1).
With $i < j$ for uniqueness, where $r$ is the absolute RSSI value. The total number of samples of the RSSI ratio $(g_i, g_j)$ is 3000 × 8 (6 columns for the RSSI ratios between pairs of gateways and 2 for location coordinates).
The mean of the RSSI ratios for each location is computed as given in Equation (2).
where $g_{i,j}$ denotes the number of unique pairs of gateways that measure the signal strength of the node at location $l_i$. Mean RSSI ratios will be used in the subsequent analysis. The proposed node localisation technique is shown in Figure 5. It is important to note that in machine learning, separate datasets are needed to train and validate the model. Here, the RSSI_Ratios/location data collected during the experiment from 150 locations is randomly divided into training and test sets. A total of 120 × 8 (the RSSI_Ratios between pairs of gateways) randomly selected RSSIs with reference locations are used for training the models and the 30 × 6 remaining RSSI_Ratios without reference locations are used to validate the developed models.
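The fingerprint construction and the 120/30 split described above can be sketched as follows. The sketch assumes that the ratio for a gateway pair is the plain quotient $r_i / r_j$ with $i < j$, and that the per-location fingerprint is the average of each pairwise ratio over that location's packets; the function names and the fixed random seed are illustrative choices, not details taken from the paper.

```python
import random
from itertools import combinations
from statistics import mean


def rssi_ratios(rssi):
    """Ratios r_i / r_j for every gateway pair with i < j.

    Four gateways give C(4, 2) = 6 ratios per packet.
    """
    return [rssi[i] / rssi[j] for i, j in combinations(range(len(rssi)), 2)]


def location_fingerprint(packets):
    """Average each pairwise ratio over all packets from one location."""
    per_packet = [rssi_ratios(p) for p in packets]
    return [mean(column) for column in zip(*per_packet)]


def split_locations(n_locations=150, n_train=120, seed=0):
    """Randomly split location indices into training and test sets."""
    indices = list(range(n_locations))
    random.Random(seed).shuffle(indices)
    return indices[:n_train], indices[n_train:]


if __name__ == "__main__":
    packets = [[-101.0, -108.0, -112.0, -97.0],
               [-103.0, -107.0, -115.0, -99.0]]
    print(location_fingerprint(packets))
    train_idx, test_idx = split_locations()
    print(len(train_idx), len(test_idx))
```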

| Support Vector Regression
Support Vector Regression (SVR) [16,21,22], dedicated to regression problems, is a variant of the well-known Support Vector Machine (SVM) technique. SVR uses the same principle as SVM [23,24] for classification, mapping the data into a high dimensional feature space using non-linear transformations; linear regression is then executed in this space. Kernel functions perform the non-linear transformation of the data into the higher dimensional feature space that then enables the linear separation. Effectively, linear regression in a high dimensional space corresponds to non-linear regression in the low-dimensional input space [25]. Invariably, regression methods derive a function, say $f(x)$, with the least deviation between predicted and observed output for all training data. Further, SVR minimises the influence of the error in the observed data by establishing boundary margins around the hyperplane, outside which data is not considered for regression. The prediction becomes challenging given that the SVR output is a real number. Consequently, a tolerance margin, epsilon, is set in approximation to the SVM.
Two basic types of SVR are applied, epsilon-SVR and nu-SVR [26][27][28], differentiated by the manner in which their parameters are managed. The main attribute of SVR is the use of a non-linear kernel transformation to map the input variables into a feature space such that the relation with the output variable becomes linear in the transformed space. Second, SVR's excellent generalisation capabilities result from the use of non-linear kernels with good approximation. Also, SVR does not suffer from the local minima problem because it has a convex optimisation formulation. It can better handle small-sample, non-linear and high-dimensional problems. The linear case of SVR is modelled as given in Equation (3).

AQEEL ET AL.
SVR can be formulated as a convex optimisation as given in Equation (4).
ε is the acceptable deviation of estimated locations from the actual location. An implicit assumption is that the function $f(x)$ can approximate all input pairs $(x_i, y_i)$ with precision ε, that is, it is assumed that the optimisation is feasible. Therefore, in order to accommodate errors, slack variables $\xi_i, \xi_i^*$ are introduced to cope with the otherwise infeasible optimisation constraints given in Equation (4) [29], where the constant $C > 0$ determines the degree to which deviations larger than ε are tolerated, with $l$ being the number of samples, as in Equation (5).
$\xi_i, \xi_i^*$ are the slack variables that allow localisation errors to exist up to the value of $\xi_i$ and $\xi_i^*$ without degrading performance. $C$ is the box constraint, a positive numeric value that controls the penalty imposed on data points that lie outside the ε margin and helps to prevent overfitting.
The dot product of the input vectors can be replaced with their non-linear transformation, the kernel function, represented by $k(x_i, x)$, to form the non-linear solutions given in Equation (8).
Kernel functions make SVR applicable to both linear and non-linear approximations. SVRs yield an acceptable generalisation performance as only the support vectors are used for prediction, and they are based on structural risk minimisation, which seeks to minimise the generalisation error rather than the training error [31].
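For reference, the standard textbook form of ε-SVR that matches the description above is sketched below. This is the formulation commonly found in the SVM literature and is assumed, not confirmed, to correspond to the equations cited in this section; the weight vector $w$, bias $b$ and Lagrange multipliers $\alpha_i, \alpha_i^*$ are conventional symbols introduced here for illustration.

```latex
% Linear model
f(x) = \langle w, x \rangle + b

% Soft-margin primal problem with slack variables
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^*} \quad & \tfrac{1}{2}\lVert w \rVert^{2}
      + C \sum_{i=1}^{l} \left( \xi_i + \xi_i^{*} \right) \\
\text{subject to} \quad
  & y_i - \langle w, x_i \rangle - b \le \varepsilon + \xi_i, \\
  & \langle w, x_i \rangle + b - y_i \le \varepsilon + \xi_i^{*}, \\
  & \xi_i,\ \xi_i^{*} \ge 0, \qquad i = 1, \ldots, l
\end{aligned}

% Kernelised (non-linear) solution
f(x) = \sum_{i=1}^{l} \left( \alpha_i - \alpha_i^{*} \right) k(x_i, x) + b
```

Only training points with $\alpha_i - \alpha_i^* \neq 0$ (the support vectors) contribute to the final sum, which is why prediction depends on a subset of the training data.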

| Gaussian Process Regression
The Gaussian Process (GP) is a probabilistic kernel-based technique that has been applied in many practical problems including estimation, classification, prediction, and prognosis due to its advantage of being flexible, probabilistic, and nonparametric [17,26]. A GP can model any system or process according to a normal or Gaussian distribution, where the mean and covariance function depend on the training data; the process is a collection of random variables with a joint Gaussian distribution [32]. Thus, any function sample has a Gaussian distribution defined by its mean function $m(x)$ and covariance function $k(x, x')$. The model assumes that the output is a realisation of a GP with joint probability density function as given in Equation (9).
Here, the GP method is applied to a regression problem. Assume $X = [x_1, x_2, \ldots, x_N]$ represents $N$ six-dimensional RSSI ratio input vectors, and the corresponding outputs are $y = [y_1, y_2, \ldots, y_N]$, representing the dual location coordinates. When a new input vector $x_*$ is given, the goal is to predict the corresponding output $y_*$ (unknown location coordinates). The spatial relationship between the input variable and the expected output can be modelled as a GP by Equation (10).

where φ is a function parameterised by the vector $W$; ϵ is the noise caused by perturbations, modelled as a Gaussian distribution $\mathcal{N}$ with zero mean and variance $\sigma_n^2$. The prior probability on $y$ is given by Equation (11).
where $E$ is the mean function, and cov is the covariance function. The distribution with the new input can be expressed by the function in Equation (12).
The covariance vector between the new input $x_*$ and the training inputs can be written as $k_*$. The prediction can be presented by Equations (13) and (14).

TABLE 8 Kernel functions (columns: Kernel, Formula; listed kernel: Radial basis function (RBF))
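For orientation, the standard GP regression predictive equations from the literature are sketched below. They assume a zero-mean prior and are the textbook form, which is assumed rather than confirmed to match the prediction equations cited above; $K$ denotes the $N \times N$ covariance matrix of the training inputs and $I$ the identity matrix, both conventional symbols introduced here.

```latex
% Joint prior over training outputs y and the test output y_*
\begin{bmatrix} y \\ y_* \end{bmatrix}
\sim \mathcal{N}\!\left( 0,\;
\begin{bmatrix}
  K + \sigma_n^{2} I & k_* \\
  k_*^{\top} & k(x_*, x_*)
\end{bmatrix} \right)

% Predictive mean and variance for the new input x_*
\bar{y}_* = k_*^{\top} \left( K + \sigma_n^{2} I \right)^{-1} y

\operatorname{var}(y_*) = k(x_*, x_*)
  - k_*^{\top} \left( K + \sigma_n^{2} I \right)^{-1} k_*
```

In the localisation setting, the predictive mean gives the estimated coordinate and the predictive variance gives a measure of confidence in that estimate, one GP per output coordinate.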

| EXPERIMENTAL RESULTS
The experimental results are presented in this section. The offline measurements were taken in the suburban region of Jazan City in Saudi Arabia. The testbed considered is an environment characterised by sandstorms, tall buildings, masts and towers. The testbed area was divided into a semi-uniform grid with side

| Parameter tuning
The hyperparameters associated with the machine learning algorithms impact the overall performance of the models; tuning them is therefore central to optimising accuracy. Hyperparameters are tuned for each dataset, namely the RSSI ratios of SF9, SF10, SF11 and SF12; the optimal model hyperparameters are unique to a single dataset. A random search method is used to select the optimal parameters of the epsilon-SVR, nu-SVR and GPR algorithms: a grid of hyperparameter values is established, and random combinations of the values are selected to train the model. Moreover, for SVR, the hyperparameter C (the regularisation constant), epsilon and, for nu-SVR, nu are also optimised using the same methodology. For GPR, the only hyperparameter to be tuned is alpha. Some kernels, such as 'Matern', have optimised parameters. The summary of the optimal parameters used in each algorithm for each dataset is given in Table 4, Table 5, and Table 6.

FIGURE 6 Cumulative Distribution Function (CDF) for epsilon_SVR models using combined kernels

FIGURE 7 Cumulative Distribution Function (CDF) for Nu_SVR models using combined kernels
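The random search described above can be sketched generically as follows. The grid values, the toy scoring function and all names are hypothetical stand-ins: in practice the score would be a validation metric, such as the median localisation error of a model trained with the sampled hyperparameters.

```python
import math
import random


def random_search(grid, score_fn, n_iter=20, seed=0):
    """Sample n_iter random combinations from `grid`; keep the lowest score.

    grid:     dict mapping hyperparameter name -> list of candidate values
    score_fn: callable taking a dict of hyperparameters and returning a
              score to minimise (e.g. a validation error)
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_iter):
        params = {name: rng.choice(values) for name, values in grid.items()}
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical grid for epsilon-SVR; the values are illustrative only.
GRID = {"C": [0.1, 1, 10, 100], "epsilon": [0.01, 0.1, 1.0]}


def toy_score(params):
    """Stand-in for 'train the model, return its validation error'."""
    return abs(math.log10(params["C"]) - 1) + abs(params["epsilon"] - 0.1)


if __name__ == "__main__":
    print(random_search(GRID, toy_score, n_iter=20))
```

Random search evaluates only a sample of the grid, so its cost is set by `n_iter` rather than by the grid size, at the price of possibly missing the best combination.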

| Impact of transformed RSSI features
To evaluate the impact of the transformed fingerprints (RSSI ratio), the SVR methods are first evaluated using absolute RSSI values for all spreading factors (SF9, SF10, SF11, and SF12). Table 7 shows the statistical performance of the models when absolute RSSI data is used. The median localisation error using absolute RSSI features with SF12 is 336 m. On the other hand, epsilon_SVR with SF11 using RSSI_ratio provides the best median localisation error of 303 m as shown in Table 7, enhancing precision by 28.8% over using absolute RSSI with SF11. The transformed data (RSSI_Ratio) are thus shown to improve the accuracy of node localisation. We believe that this improvement arises because the average RSSI ratio reduces the noise in the absolute RSSI data.

| Impact of kernel functions
In machine learning, a kernel is used to transform linearly inseparable data to linearly separable data. In effect, kernel functions compute similarities between samples in the data. A range of kernel functions are used in the establishment of SVR- and GPR-enhanced localisation models. In addition, different kernels are combined in order to further investigate the effect of kernel functions on the performance of the models. The kernels used in the evaluation are given in Table 8 [32][33][34].
RSSI ratio data and the corresponding location coordinates are used as training inputs to the algorithm. Whilst the data used for training remained constant, the kernel function was varied in order to test its impact on performance. Results for each algorithm, epsilon-SVR, nu-SVR and GPR with combined kernel functions are shown in Table 9, Table 10 and Table 11. It is evident that the combined kernel functions outperformed the commonly used kernels on the same dataset for all three algorithms. More specifically, the Rational Quadratic + Matern kernel has the lowest median error of 303 m with the epsilon-SVR algorithm; in other words, the model locates the node with error less than 303 m for 50% of the time. The median location error is 309 m for
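To make the idea of combining kernels concrete, the sketch below implements common textbook forms of three kernels and sums two of them. The exact formulas and parameter values used in the paper's kernel table are not reproduced here, so the unit length scales, the Matern smoothness of 3/2 and the function names are assumptions for illustration.

```python
import math


def sq_dist(x, y):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def rbf(x, y, length_scale=1.0):
    """Radial basis function kernel: exp(-||x - y||^2 / (2 l^2))."""
    return math.exp(-sq_dist(x, y) / (2.0 * length_scale ** 2))


def rational_quadratic(x, y, length_scale=1.0, alpha=1.0):
    """Rational quadratic kernel: (1 + ||x - y||^2 / (2 a l^2))^(-a)."""
    return (1.0 + sq_dist(x, y) / (2.0 * alpha * length_scale ** 2)) ** (-alpha)


def matern32(x, y, length_scale=1.0):
    """Matern kernel with smoothness 3/2 (an assumed choice)."""
    r = math.sqrt(3.0 * sq_dist(x, y)) / length_scale
    return (1.0 + r) * math.exp(-r)


def combined(x, y):
    """Sum of two valid kernels, which is itself a valid kernel."""
    return rational_quadratic(x, y) + matern32(x, y)


if __name__ == "__main__":
    a = [1.02, 0.95, 1.10, 0.93, 1.08, 1.16]  # illustrative 6-D fingerprints
    b = [1.00, 0.97, 1.05, 0.96, 1.04, 1.09]
    print(rbf(a, b), combined(a, b))
```

Each kernel returns 1 for identical inputs and decays as the fingerprints move apart; summing kernels lets the model capture similarity at more than one length scale.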

| Impact of spreading factor
The impact of SF on the performance of the models developed is evaluated using average RSSI ratios at different SFs (9, 10, 11 and 12) as input fingerprints to SVR and GPR. The results presented in Table 9, Table 10 and Table 11 indicate that higher spreading factors (SF11 and SF12) yielded improved node localisation performance compared to lower spreading factors (SF9 and SF10). Models derived from SF11 and SF12 data produce the highest level of consistency irrespective of the combined kernels used. More specifically, epsilon-SVR at SF11 provides a median error of 303 m, a reduction of about 33% compared to the performance at SF9 (453 m). The significant improvement at higher SF could be attributed to the quality of the data collected. It has been observed in the reported experiment that the quality of data is a function of the SF used. Whilst we experienced significant loss of packets at SF9 and SF10, there was little or no loss of data at SF11 and SF12. Shadowing and reflections are more likely to impact reception at low SF values. It should be noted that at higher SFs, latency is a consideration as the transfer of packets is subject to significant delays. However, the trade-off between latency and accuracy in this application may be a design option.
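The latency side of this trade-off follows from the LoRa symbol duration, $T_{sym} = 2^{SF} / BW$, which doubles with each step in SF. The short sketch below evaluates it for the spreading factors used here at the 125 kHz bandwidth stated for the node; the function name is illustrative, and full packet time-on-air (preamble, header, coding rate) is deliberately not modelled.

```python
def symbol_time_ms(sf, bandwidth_hz=125_000):
    """LoRa symbol duration in milliseconds: 2**SF / BW."""
    return (2 ** sf) / bandwidth_hz * 1000.0


if __name__ == "__main__":
    for sf in (9, 10, 11, 12):
        print(f"SF{sf}: {symbol_time_ms(sf):.3f} ms per symbol")
```

Moving from SF9 to SF12 therefore lengthens each symbol by a factor of eight.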

| Accuracy
Here, the accuracy of the models is measured as the average Haversine distance between the estimated and true location of a node, as given in Equation (15):

$$d = 2r \arcsin\left(\sqrt{\sin^{2}\left(\frac{\varphi - \varphi_0}{2}\right) + \cos(\varphi_0)\cos(\varphi)\sin^{2}\left(\frac{\lambda - \lambda_0}{2}\right)}\right) \qquad (15)$$

where $r$ is the radius of the Earth, $\varphi_0$ is the latitude of the real location, $\varphi$ is the latitude of the estimated location, $\lambda_0$ is the longitude of the real location, and $\lambda$ is the longitude of the estimated location.
RSSI ratio data and the optimised kernel are used in order to evaluate and compare the performance of the three algorithms. RSSI ratio features (120 × 6) and their corresponding location coordinates were used to train the algorithms; the data from the remaining 30 locations were used as test data. The overall performance of the three models is captured by CDFs of the localisation error as shown in Figure 6, Figure 7 and Figure 8. Each model provides a localisation accuracy with a median error of less than 400 m. Epsilon-SVR has the lowest median error of 303 m compared to 309 and 317 m for nu-SVR and GPR, respectively. SVR outperformed the GPR model in terms of the overall accuracy.
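The error metric above can be computed as in the following sketch. The mean Earth radius of 6371 km and the function names are assumptions introduced for illustration; the coordinates in the usage example are arbitrary and are not measurements from the study.

```python
import math
from statistics import median

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in metres


def haversine_m(lat0, lon0, lat, lon):
    """Great-circle distance in metres between (lat0, lon0) and (lat, lon).

    Inputs are in decimal degrees.
    """
    phi0, phi = math.radians(lat0), math.radians(lat)
    dphi = phi - phi0
    dlam = math.radians(lon - lon0)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi0) * math.cos(phi) * math.sin(dlam / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def median_error_m(true_points, estimated_points):
    """Median Haversine error over paired (lat, lon) tuples."""
    errors = [haversine_m(t[0], t[1], e[0], e[1])
              for t, e in zip(true_points, estimated_points)]
    return median(errors)


if __name__ == "__main__":
    truth = [(16.900, 42.550), (16.905, 42.560), (16.910, 42.570)]
    guess = [(16.902, 42.551), (16.904, 42.563), (16.913, 42.569)]
    print(f"median error: {median_error_m(truth, guess):.1f} m")
```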

| Analysis of Antwerp dataset
To further demonstrate the feasibility and consistency of the developed method for LoRaWAN localisation, we explore a public dataset of LoRaWAN messages obtained in the city centre of Antwerp. It holds 123,529 messages which were collected over a 3-week period. City of Things hardware and a Firefly X1 GPS receiver were mounted on 20 cars of Antwerp's postal service, which drove around in the city centre while continuously acquiring the current latitude and longitude of the car as well as the Horizontal Dilution of Precision (HDOP) of the GPS signal. The acquired location information is sent in a LoRaWAN message via the IM880B-L radio module. With HDOP, messages with poor GPS signal quality could be removed. Information stored in the dataset includes 68