Points-of-Interest Recommendation Based on Spatial-Temporal Enhancement of Sequence Graph and Geographical Relationships
Introduction
With the rapid development of location-based social networks (LBSNs), platforms such as Foursquare and Dianping have emerged, providing users with personalized Points-of-Interest (POI) recommendation services. POIs refer to locations such as restaurants, hotels, tourist attractions, or any place marked with geographical coordinates. POI recommendation aims to analyze users’ movement patterns to suggest locations they may find interesting, playing a crucial role in location-based advertising, food delivery, and other applications.
Existing POI recommendation methods can be categorized into traditional approaches, sequence-based methods, and deep learning-based techniques. Early research primarily relied on collaborative filtering and similarity-based methods, which often overlooked the influence of geographical distance between POIs. Sequence-based methods leverage historical check-in data to model user behavior using matrix factorization and Markov chains, but they heavily depend on continuous user interactions and lack interpretability. Recent advancements in deep learning have introduced neural networks to capture implicit relationships in user behavior, with recurrent neural networks (RNNs) and graph neural networks (GNNs) proving particularly effective in modeling sequential and structural patterns.
Despite these advancements, current methods still face two major limitations: (1) insufficient exploration of geographical features, particularly the complex topological relationships between POIs, and (2) failure to incorporate sequential information into spatial preferences, leading to incomplete modeling of user behavior. To address these challenges, this paper proposes a novel POI recommendation model called STESGGR (Spatial-Temporal Enhancement of Sequence Graph and Geographical Relationships).
Methodology
Overview
The STESGGR model integrates geographical and sequential features to enhance POI recommendation performance. The framework consists of four main components:
- Geographical Graph Construction – POI locations are used to build a geographical graph, where nodes represent POIs and edges denote proximity relationships.
- Spatial-Temporal Enhanced Sequence Graph – User check-in sequences are transformed into a graph structure enriched with temporal and spatial interval information.
- Feature Extraction with Attention Mechanisms – Graph convolutional networks (GCNs) and gated graph neural networks (GGNNs) extract geographical and sequential features, respectively, while attention mechanisms refine user preferences.
- Commonality Learning Framework – A joint optimization strategy aligns geographical and sequential representations to improve recommendation accuracy.
Geographical Graph Construction
The geographical graph captures the spatial relationships between POIs. Each node represents a POI, and an edge connects two POIs when they lie within a predefined distance threshold. Edge weights decay exponentially with distance (the inverse exponential of the distance), so closer POIs have stronger connections. This structure models the geographical influence on user preferences, since people tend to visit nearby locations.
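The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the weight form `exp(-d)` is one reading of "inverse exponential of the distance", the helper names `haversine_km` and `build_geo_graph` are hypothetical, and the sample coordinates are made up. The 1 km default matches the threshold reported in the hyperparameter analysis.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def build_geo_graph(coords, threshold_km=1.0):
    """Return {(i, j): weight} for every POI pair within threshold_km of each other."""
    edges = {}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = haversine_km(coords[i], coords[j])
            if d <= threshold_km:
                w = math.exp(-d)  # assumed weight form: closer POIs get larger weights
                edges[(i, j)] = edges[(j, i)] = w  # undirected graph, store both directions
    return edges

# Three hypothetical POIs: the first two are a short walk apart, the third is far away.
pois = [(40.7580, -73.9855), (40.7590, -73.9845), (40.6782, -73.9442)]
graph = build_geo_graph(pois)
```

Only the first two POIs end up connected here. The pairwise loop is quadratic in the number of POIs, so a spatial index would be the natural replacement for city-scale data.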
To extract high-order geographical relationships, a GCN is applied to propagate information across the graph. The GCN aggregates features from neighboring nodes, allowing the model to learn topological patterns. A multi-head attention mechanism further refines these features by identifying the most relevant POIs based on user behavior.
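For reference, the widely used GCN propagation rule is given below. The summary does not state which GCN variant STESGGR uses, so this should be read as the standard formulation rather than the paper's exact layer. Here $A$ is the weighted adjacency matrix of the geographical graph, $H^{(l)}$ holds the POI features at layer $l$, $W^{(l)}$ is a learnable weight matrix, and $\sigma$ is a nonlinearity.

```latex
\tilde{A} = A + I, \qquad
\tilde{D}_{ii} = \sum_{j} \tilde{A}_{ij}, \qquad
H^{(l+1)} = \sigma\!\left( \tilde{D}^{-1/2}\, \tilde{A}\, \tilde{D}^{-1/2}\, H^{(l)}\, W^{(l)} \right)
```

Stacking $k$ such layers lets each POI aggregate information from POIs up to $k$ hops away, which is what "high-order" refers to.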
Sequence Graph with Spatial-Temporal Enhancement
User check-in sequences are represented as a directed graph where nodes are POIs and edges indicate consecutive visits. Unlike traditional sequence models, this approach explicitly incorporates spatial and temporal intervals between check-ins. For example, short time gaps and small distances between visits suggest strong local preferences, while longer intervals may indicate shifts in user interests.
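A small Python sketch of this construction follows. It assumes each check-in is a `(poi_id, timestamp, (lat, lon))` tuple and attaches the temporal and spatial gap to every directed edge. The function names, the edge dictionary layout, and the sample history are all hypothetical, and the equirectangular distance is a simplification chosen for brevity.

```python
import math
from datetime import datetime

def planar_km(a, b):
    """Rough equirectangular distance in km; adequate for short hops within a city."""
    mean_lat = math.radians((a[0] + b[0]) / 2)
    dx = math.radians(b[1] - a[1]) * math.cos(mean_lat)
    dy = math.radians(b[0] - a[0])
    return 6371.0 * math.hypot(dx, dy)

def build_sequence_graph(checkins):
    """checkins: time-ordered list of (poi_id, timestamp, (lat, lon)).

    Returns a directed edge list. Each edge carries the temporal gap (hours)
    and the spatial gap (km) between two consecutive check-ins.
    """
    edges = []
    for (p1, t1, c1), (p2, t2, c2) in zip(checkins, checkins[1:]):
        edges.append({
            "src": p1,
            "dst": p2,
            "dt_hours": (t2 - t1).total_seconds() / 3600.0,
            "dist_km": planar_km(c1, c2),
        })
    return edges

# Hypothetical check-in history for one user.
history = [
    ("cafe", datetime(2024, 5, 1, 8, 0), (40.7580, -73.9855)),
    ("office", datetime(2024, 5, 1, 8, 30), (40.7590, -73.9845)),
    ("gym", datetime(2024, 5, 1, 18, 30), (40.7484, -73.9857)),
]
seq_edges = build_sequence_graph(history)
```

In this toy history the cafe-to-office edge has a short gap in both time and space, while the office-to-gym edge is longer on both counts, which is the kind of contrast the interval information is meant to expose.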
The sequence graph is processed using a GGNN, which effectively captures sequential dependencies through gated recurrent units (GRUs). Spatial and temporal interval matrices are embedded into the graph to enhance the representation of user movement patterns. Similar to the geographical graph, an attention mechanism is applied to emphasize significant transitions in the sequence.
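The standard GGNN node update, on which this kind of module is typically based, is shown below. How STESGGR injects the spatial and temporal interval matrices into the update is not specified in this summary, so the equations cover only the plain gated update. $A_{v:}$ denotes the rows of the sequence-graph adjacency matrix for node $v$, $h_v^{(t)}$ is the state of node $v$ after step $t$, and all $W$, $U$, and $b$ terms are learnable parameters.

```latex
\begin{aligned}
a_v^{(t)} &= A_{v:}\,\big[h_1^{(t-1)}, \dots, h_n^{(t-1)}\big]^{\top} + b \\
z_v^{(t)} &= \sigma\!\big(W_z\, a_v^{(t)} + U_z\, h_v^{(t-1)}\big) \\
r_v^{(t)} &= \sigma\!\big(W_r\, a_v^{(t)} + U_r\, h_v^{(t-1)}\big) \\
\tilde{h}_v^{(t)} &= \tanh\!\big(W_o\, a_v^{(t)} + U_o\,(r_v^{(t)} \odot h_v^{(t-1)})\big) \\
h_v^{(t)} &= \big(1 - z_v^{(t)}\big) \odot h_v^{(t-1)} + z_v^{(t)} \odot \tilde{h}_v^{(t)}
\end{aligned}
```

The update gate $z$ controls how much of the aggregated neighbor message replaces the old state, and the reset gate $r$ controls how much of the old state feeds the candidate $\tilde{h}$.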
Commonality Learning Framework
Since geographical and sequential features influence user behavior differently, a commonality learning framework is introduced to align their representations. This framework minimizes the divergence between the two feature spaces, ensuring that they complement each other. By jointly optimizing geographical and sequential embeddings, the model achieves a more robust understanding of user preferences.
Click-Through Rate (CTR) Prediction
The final recommendation is generated by combining the geographical and sequential representations through a multilayer perceptron (MLP). A sigmoid activation on the output gives the predicted probability that a user visits a target POI. The training objective sums a recommendation loss (cross-entropy) and the commonality learning loss, with the latter scaled by a tunable weight β.
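The shape of this objective can be illustrated with a short Python sketch. The summary does not give the exact form of the commonality loss, so `commonality_gap` below uses a mean squared difference purely as a stand-in. All function names and the sample inputs are hypothetical.

```python
import math

def sigmoid(x):
    """Map a raw score (logit) to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one prediction p in (0, 1) and one label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def commonality_gap(geo_vec, seq_vec):
    """Stand-in divergence: mean squared difference (an assumption, not the paper's loss)."""
    return sum((g - s) ** 2 for g, s in zip(geo_vec, seq_vec)) / len(geo_vec)

def total_loss(logits, labels, geo_vec, seq_vec, beta=0.1):
    """Recommendation loss plus beta times the commonality term."""
    rec = sum(bce(sigmoid(z), y) for z, y in zip(logits, labels)) / len(labels)
    return rec + beta * commonality_gap(geo_vec, seq_vec)

# Two hypothetical samples: one visited POI (label 1) and one unvisited POI (label 0).
loss = total_loss(logits=[2.0, -1.0], labels=[1, 0],
                  geo_vec=[0.2, 0.4], seq_vec=[0.2, 0.0], beta=0.1)
```

Setting `beta=0` recovers plain cross-entropy training, which corresponds to dropping the commonality term from the objective.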
Experiments
Datasets
Experiments were conducted on five real-world datasets from Foursquare, covering check-ins in Tokyo (TKY), New York (NYC), Calgary (CAL), Phoenix (PHO), and Singapore (SIN). These datasets vary in size, with Tokyo having the longest average sequence length (250.2) and Phoenix the shortest (13.3).
Baselines
The proposed STESGGR model was compared against several state-of-the-art methods:
- Sequence-based models: DIN and DIEN, which use attention mechanisms and GRUs for sequential modeling.
- Graph-based models: SR-GNN, NGCF, and LightGCN, which leverage graph structures for recommendation.
- Geographical models: GeoIE, LSTPM, and DisenPOI, which incorporate spatial information into recommendations.
Evaluation Metrics
Performance was measured using AUC (Area Under the ROC Curve) and Logloss (Cross-Entropy Loss), standard metrics for CTR prediction tasks.
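Both metrics are simple to compute from labels and predicted probabilities. The sketch below uses the pairwise definition of AUC (the fraction of positive-negative pairs ranked correctly, with ties counted as half) and the usual clipped log loss. The function names and the toy data are illustrative only.

```python
import math

def auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def logloss(labels, scores, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(labels, scores):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

# Toy example: two visited POIs (label 1) and two unvisited POIs (label 0).
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.3, 0.4, 0.6]
print(auc(y_true, y_prob))      # 0.75: three of the four positive-negative pairs are ordered correctly
print(logloss(y_true, y_prob))  # roughly 0.574
```

Higher AUC is better and lower Logloss is better, which is why the results report AUC improvements alongside Logloss reductions.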
Results
STESGGR outperformed all baselines across all datasets. The improvements in AUC ranged from 1.2% to 2.7%, while Logloss reductions varied between 3.2% and 12.4%. Notably, the model achieved the highest gains in Tokyo (2.2% AUC, 12.4% Logloss) and New York (2.7% AUC, 5.4% Logloss), demonstrating its effectiveness in diverse urban environments.
Ablation Study
To validate the contributions of each component, ablation studies were conducted by removing:
- The geographical graph module (STESGGR-GEO).
- The spatial-temporal enhancement module (STESGGR-ST).
- The sequence graph module, including its spatial-temporal enhancement (STESGGR-ST&SEQ).
- The commonality learning framework (STESGGR-Lcom).
Results confirmed that each component significantly contributes to the model’s performance. The absence of geographical or sequential features led to notable declines in accuracy, while removing the commonality learning framework weakened the alignment between the two feature spaces.
Hyperparameter Analysis
The impact of key hyperparameters was investigated:
- Commonality Learning Weight (β): Optimal values varied by dataset (0.1 for TKY, 1 for NYC), with excessive values causing overfitting.
- Distance Threshold (d): A threshold of 1 km provided the best balance between local relevance and computational efficiency.
Case Study
A visualization of user trajectories in New York showed that STESGGR's recommendations aligned more closely with actual visit patterns than those of DisenPOI. The model identified nearby POIs that matched users' historical preferences, which supports its practical applicability.
Conclusion
This paper introduced STESGGR, a novel POI recommendation model that integrates spatial-temporal sequence graphs with geographical relationships. By leveraging GCNs and GGNNs, the model effectively captures both topological and sequential patterns in user behavior. The introduction of a commonality learning framework further enhances feature alignment, leading to superior recommendation accuracy.
Experiments on five real-world datasets confirmed STESGGR’s superiority over existing methods, particularly in handling sparse data and diverse urban settings. Future work will explore the impact of recurrent movement patterns and contrastive learning techniques to further refine geographical and sequential representations.
doi.org/10.19734/j.issn.1001-3695.2024.08.0298