A Comprehensive Overview of k-Anonymous Location Privacy Protection Scheme Based on Alt-Geohash Coding

A Comprehensive Overview of k-Anonymous Location Privacy Protection Scheme Based on Alt-Geohash Coding

Introduction

The rapid development of sensor technology, mobile smart devices, and wireless communication has led to the widespread adoption of Location-Based Services (LBS). Users can leverage smart devices to query nearby points of interest such as restaurants, hospitals, schools, or parks, enhancing convenience in daily life. However, while enjoying these benefits, users also face significant privacy risks. LBS requires users to submit real-time location data, and untrusted LBS providers (LSPs) may collect and analyze this information without user consent, potentially exposing personal identities and sensitive location details.

To mitigate these risks, various location privacy protection techniques have been proposed, with k-anonymity being one of the most widely used methods. The core idea of k-anonymity is to conceal a user’s real location within a set of k-1 dummy locations, ensuring that an attacker cannot identify the true location with a probability higher than 1/k. However, existing k-anonymity-based solutions often suffer from high computational overhead, reliance on trusted third-party anonymization servers, and vulnerability to inference attacks by adversaries with background knowledge.

This article introduces a novel k-anonymous location privacy protection scheme based on Alt-Geohash coding (KLPPS-AGC), which addresses these challenges by improving retrieval efficiency, eliminating dependence on centralized anonymization servers, and enhancing resistance against inference and homogeneity attacks.

Background and Challenges

Privacy Risks in LBS

When users request LBS, they must send location-based queries to potentially untrusted LSPs. This process introduces several privacy concerns:

  1. Time Overhead – Traditional k-anonymity techniques require extensive retrieval of historical location data to construct candidate sets. As the anonymity degree (k) increases, the time required for retrieval grows significantly, reducing service responsiveness.

  2. Dependence on Third-Party Anonymization – Many existing solutions rely on trusted anonymization servers to process user locations. However, fully trusted third parties are rare in practice, and these servers may become single points of failure or data leakage.

  3. Inference Attacks – Attackers with background knowledge (e.g., map data) can filter out improbable dummy locations (e.g., rivers or uninhabited areas) and narrow down the user’s true location. Additionally, clustering-based attacks can identify concentrated dummy locations, further compromising privacy.

Key Objectives

To overcome these challenges, KLPPS-AGC is designed with the following goals:

  1. Efficient Retrieval – Utilize Alt-Geohash encoding to reduce dimensionality and accelerate historical data retrieval.
  2. Decentralized Anonymization – Perform anonymization directly on user devices, eliminating reliance on third-party servers.
  3. Enhanced Privacy Against Inference – Optimize candidate sets using historical query probabilities and spatial dispersion to resist background knowledge attacks.

Methodology

System Architecture

KLPPS-AGC operates in a decentralized framework consisting of three main components:

  1. Wi-Fi Access Points (APs) – Provide historical query probability data for grid-based regions.
  2. User Devices – Perform local anonymization, including location generalization, Alt-Geohash encoding, and candidate set filtering.
  3. LBS Providers – Respond to anonymized queries without accessing raw user locations.

Key Algorithms

  1. Location Generalization

The algorithm begins by determining whether the user is in a densely or sparsely populated area. Based on this classification, it generates random offsets to expand the user’s coordinates into a generalized region.

• Urban Areas (Dense Population) – Small random offsets (0 to 0.01 degrees) are applied to maintain query precision.

• Rural Areas (Sparse Population) – Larger offsets (0 to 0.025 degrees) enhance privacy by broadening the anonymization range.

This approach prevents attackers from deducing the exact user location based on symmetry or predictable patterns.

  1. Alt-Geohash Encoding

Geohash is a widely used method for converting two-dimensional coordinates into compact alphanumeric strings. KLPPS-AGC extends this concept by incorporating altitude (height) information, resulting in Alt-Geohash codes.

• Geohash Encoding – The latitude and longitude intervals are iteratively divided, with each iteration assigning binary values (0 or 1) based on position relative to midpoints. These binary sequences are interleaved and converted into Base32 strings.

• Altitude Encoding – The user’s height is normalized and appended to the Geohash string, ensuring that dummy locations share similar elevation characteristics.

This encoding enables efficient prefix-based retrieval of historical locations while maintaining three-dimensional spatial consistency.

  1. Reverse Retrieval for Candidate Set Construction

Using the Alt-Geohash code, the system retrieves historical locations matching the user’s encoded region. If insufficient candidates are found, the search expands to adjacent grid cells.

• Initial Retrieval – Locations with identical Alt-Geohash codes are prioritized.

• Expanded Search – If fewer than 4k-4 candidates are found, neighboring grids are scanned to ensure adequate anonymization.

This method significantly reduces retrieval time compared to traditional distance-based searches.

  1. Location Filtering for Secure Anonymization

The candidate set undergoes two optimization steps to enhance privacy:

  1. Query Probability Filtering – The map is divided into 10×10 grids, each assigned a historical query probability. Dummy locations are selected to match the user’s grid probability, making them indistinguishable from the real location.
  2. Spatial Dispersion Optimization – Using the Haversine formula, the algorithm iteratively selects locations that maximize the area of the polygon formed with the user’s position. This ensures high spatial dispersion, mitigating clustering attacks.

Finally, the user’s real location is randomly inserted into the filtered set to produce the final anonymous set.

Performance Analysis

Privacy Protection Effectiveness

  1. Resistance to Inference Attacks – By selecting dummy locations with similar query probabilities, KLPPS-AGC ensures that attackers cannot distinguish the real location based on historical data. The k-anonymity guarantee further limits identification probability to 1/k.

  2. Resistance to Homogeneity Attacks – Spatial dispersion optimization prevents dummy locations from clustering, making it difficult for attackers to isolate the true location via geometric analysis.

Practicality and Efficiency

  1. Reduced Time Overhead – Alt-Geohash encoding enables near-instantaneous retrieval of candidate locations, outperforming traditional methods that rely on computationally expensive distance calculations.
  2. Decentralized Processing – By performing anonymization locally, the scheme eliminates third-party risks and reduces communication overhead.

Comparative Advantages

Experiments using the Geolife dataset demonstrate that KLPPS-AGC achieves:

• Higher Location Entropy – Compared to existing methods like GLPS and LPPS-AGC, KLPPS-ACG generates anonymous sets with superior entropy, indicating stronger privacy.

• Improved Spatial Dispersion – The Haversine-based optimization ensures dummy locations are widely distributed, unlike random or distance-constrained approaches.

• Lower Computational Cost – The Alt-Geohash retrieval mechanism reduces processing time, even with large datasets.

Conclusion

KLPPS-ACG presents a robust solution for protecting location privacy in LBS applications. By leveraging Alt-Geohash encoding, decentralized processing, and multi-criteria optimization, it addresses critical limitations in existing k-anonymity techniques. The scheme ensures efficient data retrieval, resistance to inference and homogeneity attacks, and practical usability without relying on trusted third parties.

Future work may extend this approach to continuous query scenarios, where temporal correlations between locations could introduce additional privacy risks.

doi.org/10.19734/j.issn.1001-3695.2024.03.0165

Was this helpful?

0 / 0