Dual Intent Session-Based Recommendation Integrating Multi-Layer Graphs and Classification Information

Session-based recommendation (SBR) has emerged as a critical approach in modern recommender systems, addressing the challenge of predicting user preferences based on short-term interaction sequences. Traditional recommendation systems often rely on long-term user profiles and historical data, which may not be available in many real-world scenarios. SBR systems overcome this limitation by focusing on the immediate context of user interactions within a session, making them particularly valuable for e-commerce platforms, media streaming services, and other applications where user behavior is transient and anonymous.

The evolution of SBR methodologies has progressed through several stages. Early approaches used Markov chains and item-based collaborative filtering. These methods captured immediate transitions between items but struggled to model longer-range preferences and complex item relationships. Recurrent neural networks (RNNs) marked a significant advance by modeling sequential patterns in user behavior. However, RNN-based approaches process a session strictly in order, which limits their ability to capture complex, non-sequential item relationships.

Recent years have witnessed the growing dominance of graph neural networks (GNNs) in SBR systems. GNN-based approaches have demonstrated superior performance by explicitly modeling the intricate transition relationships between items within sessions. These methods construct graph representations of user sessions, where items serve as nodes and transitions between them as edges. The graph structure allows for more effective capture of both local and global patterns in user behavior, leading to more accurate recommendations.

Despite these advancements, current SBR systems still face several significant challenges. First, many existing approaches fail to fully exploit the rich information available across different levels of session data. Some methods focus exclusively on intra-session information, while others incorporate either session-level relationships or global item transitions, but rarely all three simultaneously. This partial utilization of available data limits the system’s ability to comprehensively understand user intent.

Second, the aggregation of information across sessions often introduces redundant or irrelevant data. When establishing connections between sessions through shared items, many approaches indiscriminately include all related sessions without considering the potential noise introduced by weakly related interactions. This can lead to information overload and make it more difficult to discern genuine user preferences from incidental patterns.

Third, the integration of auxiliary information, such as item categories or metadata, remains underdeveloped in many SBR systems. While some approaches incorporate such additional data, they often fail to effectively combine it with the core session features, resulting in suboptimal utilization of potentially valuable contextual information.

To address these limitations, we propose a novel Dual Intent Session-Based Recommendation model that Integrates Multi-layer Graphs and Classification information (SRIMC). Our approach fundamentally rethinks how session data is represented and processed, introducing several innovative components that work in concert to provide more accurate and robust recommendations.

The foundation of our model lies in its multi-layer graph architecture, which systematically captures information at three distinct but complementary levels. At the most granular level, we construct local session graphs that represent the sequence of interactions within individual sessions. These graphs preserve the immediate context of user behavior, capturing the direct transitions between items that often reveal short-term preferences and immediate intent.
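As a minimal sketch of the local session graph described above, the following builds a directed, weighted graph from a single session. The function name `build_session_graph` and the choice to weight edges by transition count are illustrative assumptions, not details confirmed by the model description.

```python
from collections import Counter


def build_session_graph(session):
    """Build a directed graph for one session (illustrative helper).

    nodes: unique items in first-seen order.
    edges: Counter mapping (src, dst) to the number of times dst
           directly followed src inside the session.
    """
    nodes = list(dict.fromkeys(session))        # de-duplicate, keep order
    edges = Counter(zip(session, session[1:]))  # consecutive item pairs
    return nodes, edges


nodes, edges = build_session_graph(["a", "b", "c", "b", "c"])
for (src, dst), count in sorted(edges.items()):
    print(f"{src} -> {dst}: {count}")
```

Repeated items collapse into a single node, so a revisit such as b, c, b, c shows up as a heavier edge rather than as extra nodes.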

Moving beyond individual sessions, we introduce a session relationship graph that connects different sessions based on shared items. This intermediate layer allows the model to identify patterns and similarities across multiple user sessions, potentially revealing more stable preferences or common behavioral sequences. To address the issue of information redundancy in session relationships, we implement a novel sparsification technique that selectively prunes less meaningful connections while maintaining the graph’s overall connectivity. This approach helps focus attention on the most relevant inter-session relationships.
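The sparsification technique itself is not specified here, so the sketch below swaps in a simple stand-in: sessions that share items are linked with a Jaccard-overlap weight, and each session keeps only its `k` strongest links. The names `jaccard` and `build_sparse_session_graph`, the Jaccard weighting, and the top-k rule are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap between two sessions, measured on their item sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_sparse_session_graph(sessions, k=1):
    """Link sessions that share items, then keep each session's k strongest links.

    Top-k pruning is a hypothetical stand-in for the sparsification step.
    Returns {(i, j): weight} with i < j.
    """
    # Candidate edges: every pair of sessions with at least one shared item.
    weights = {}
    for i, j in combinations(range(len(sessions)), 2):
        w = jaccard(sessions[i], sessions[j])
        if w > 0:
            weights[(i, j)] = w

    kept = set()
    for i in range(len(sessions)):
        incident = [(w, edge) for edge, w in weights.items() if i in edge]
        incident.sort(key=lambda t: (-t[0], t[1]))  # strongest first, stable ties
        kept.update(edge for _, edge in incident[:k])
    return {edge: weights[edge] for edge in sorted(kept)}


sessions = [["a", "b", "c"], ["b", "c", "d"], ["c", "x", "y", "z"], ["q"]]
graph = build_sparse_session_graph(sessions, k=1)
print(graph)
```

Under this rule every session that shares an item with some other session retains at least one link, while weak links between already well-connected sessions (here, sessions 1 and 2) are dropped. It does not by itself guarantee that the whole graph stays connected.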

At the broadest level, we construct a global item graph that captures transition patterns across the entire dataset. This comprehensive view enables the model to learn general item associations that may not be apparent within individual sessions or small groups of sessions. By considering item relationships at this scale, the system can identify broader patterns of user behavior and item affinity that transcend individual sessions.
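A global item graph of this kind can be sketched by aggregating transitions over every session in the dataset. In the sketch below, `build_global_item_graph` and its `window` parameter (how many steps ahead still count as a transition) are hypothetical, and raw counts are used as edge weights purely for illustration.

```python
from collections import defaultdict


def build_global_item_graph(sessions, window=1):
    """Aggregate directed item-to-item transition counts over all sessions.

    An item is linked to each of the next `window` items in its session.
    Returns a nested mapping graph[src][dst] -> count.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for i, src in enumerate(session):
            for dst in session[i + 1 : i + 1 + window]:
                if src != dst:  # ignore self-loops from immediate repeats
                    graph[src][dst] += 1
    return graph


sessions = [["a", "b", "c"], ["b", "c", "d"], ["a", "b"]]
graph = build_global_item_graph(sessions, window=1)
print({src: dict(dsts) for src, dsts in graph.items()})
```

Unlike the per-session graph, the counts here accumulate across sessions, so an edge such as a to b reflects how often that transition occurs in the dataset as a whole.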

The information extracted from these three graph layers is processed through specialized GNN architectures tailored to each level’s characteristics. The local session graph employs attention mechanisms to weight the importance of different item transitions within a session. The session relationship graph utilizes convolutional operations to propagate information across connected sessions. The global item graph incorporates factorized representations to disentangle different aspects of item relationships. The outputs from these three processing streams are then combined to form what we term the α-intent representation, which encapsulates the structural patterns learned from the multi-layer graph framework.
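To make the idea of propagating information across connected sessions concrete, the sketch below performs one parameter-free mean-aggregation step over a graph. This is a simplified stand-in for a learned graph convolution, not the layer used in the model; the function name `propagate` and the dictionary-based inputs are assumptions.

```python
def propagate(embeddings, neighbors):
    """One mean-aggregation step over a graph (simplified stand-in for a GNN layer).

    embeddings: {node: list of floats}
    neighbors:  {node: list of neighbouring nodes}
    Each node's new vector is the average of its own vector and its neighbours'.
    """
    updated = {}
    for node, vec in embeddings.items():
        group = [vec] + [embeddings[n] for n in neighbors.get(node, [])]
        updated[node] = [sum(column) / len(group) for column in zip(*group)]
    return updated


embeddings = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [4.0, 4.0]}
neighbors = {0: [1], 1: [0]}  # session 2 has no linked sessions
print(propagate(embeddings, neighbors))
```

Linked sessions move toward each other in embedding space while an isolated session is left unchanged. A learned layer would additionally apply trainable weights and a nonlinearity.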

Complementing this graph-based approach, we introduce a second pathway for capturing user intent that focuses on categorical information and session characteristics. Recognizing that item metadata and session properties can provide valuable context beyond interaction sequences, we develop a β-intent learning mechanism that systematically combines classification information with session length features. This component employs a Bayesian framework with β-distribution priors to model the joint probability of user preferences based on both item categories and session characteristics.

The β-intent mechanism operates by first creating a unified representation that combines item category information with the length of the session in which the item appears. This fusion captures how different types of items tend to appear in sessions of varying lengths, potentially indicating different patterns of user interest or purchase behavior. The Bayesian framework then allows for principled estimation of user preference distributions based on these combined features.
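The exact Bayesian formulation is not given here, so the following is only a sketch under stated assumptions. It pairs each item category with a session-length bucket and estimates an interaction probability for that pair using a Beta prior, whose posterior mean after observing `hits` successes in `trials` observations is (a + hits) / (a + b + trials). The bucket threshold, the binary `clicked` label, the Beta(1, 1) prior, and both function names are hypothetical.

```python
from collections import defaultdict


def length_bucket(session_length, threshold=3):
    """Coarse session-length feature; the threshold is an arbitrary assumption."""
    return "short" if session_length <= threshold else "long"


def beta_posterior_means(examples, a=1.0, b=1.0):
    """Estimate P(interaction | category, length bucket) with a Beta(a, b) prior.

    examples: iterable of (category, session_length, clicked), clicked in {0, 1}.
    Returns {(category, bucket): posterior mean}.
    """
    hits = defaultdict(int)
    trials = defaultdict(int)
    for category, session_length, clicked in examples:
        key = (category, length_bucket(session_length))  # fused feature
        hits[key] += clicked
        trials[key] += 1
    return {key: (a + hits[key]) / (a + b + trials[key]) for key in trials}


examples = [
    ("shoes", 2, 1),
    ("shoes", 3, 1),
    ("shoes", 2, 0),
    ("books", 8, 1),
]
for key, mean in sorted(beta_posterior_means(examples).items()):
    print(key, round(mean, 3))
```

The prior keeps estimates for rarely seen category and length combinations away from 0 or 1, which is one reason a Bayesian treatment suits sparse session data.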

A key innovation in our approach is the attention mechanism applied to the β-intent representations, which dynamically adjusts the influence of different items based on their position in the session and the strength of their categorical associations. This ensures that more recent interactions and more strongly categorized items receive appropriate emphasis in the final recommendation process.
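The effect of such an attention mechanism can be illustrated with a fixed scoring rule: each item's logit is its category-association score plus a bonus that grows with recency, and a softmax turns the logits into weights. In the actual model these scores would be learned; the additive form, the `recency_bias` parameter, and the function names below are assumptions made for this sketch.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(category_scores, recency_bias=0.5):
    """Weight session items by category strength and position (illustrative rule).

    category_scores[i] is the score of the i-th item in the session;
    later positions receive a larger recency bonus.
    """
    last = len(category_scores) - 1
    logits = [
        score + recency_bias * (pos - last)
        for pos, score in enumerate(category_scores)
    ]
    return softmax(logits)


def pool(vectors, weights):
    """Weighted sum of item vectors, giving one session-level vector."""
    dims = range(len(vectors[0]))
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in dims]


weights = attention_weights([1.0, 1.0, 1.0])
print([round(w, 3) for w in weights])
print(pool([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], weights))
```

With equal category scores the weights increase toward the end of the session, so the most recent interaction dominates the pooled representation.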

The α and β intents are not treated as independent signals but are carefully integrated through a learned fusion mechanism. This combination allows the model to benefit from both the rich structural patterns captured by the multi-layer graphs and the contextual understanding provided by the categorical and session-length features. The fusion process is adaptive, automatically adjusting the relative influence of each intent type based on the characteristics of the current session and the broader context.
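One common way to realize an adaptive fusion of this kind is a scalar gate computed from both representations. The sketch below assumes that form, gate = sigmoid(w_α · α + w_β · β + bias), with the fused vector given by gate · α + (1 − gate) · β. The gate design, the parameter names, and the function `fuse_intents` are illustrative assumptions rather than the model's confirmed mechanism, and the weights that would normally be learned are passed in by hand.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fuse_intents(alpha, beta, w_alpha, w_beta, bias=0.0):
    """Blend two intent vectors with a scalar gate (hypothetical fusion rule).

    Returns (gate, fused) where fused = gate * alpha + (1 - gate) * beta.
    """
    score = (
        sum(w * a for w, a in zip(w_alpha, alpha))
        + sum(w * b for w, b in zip(w_beta, beta))
        + bias
    )
    gate = sigmoid(score)
    fused = [gate * a + (1.0 - gate) * b for a, b in zip(alpha, beta)]
    return gate, fused


alpha = [1.0, 0.0]  # structural intent from the graph layers
beta = [0.0, 1.0]   # contextual intent from category and session length
print(fuse_intents(alpha, beta, w_alpha=[0.0, 0.0], w_beta=[0.0, 0.0]))
print(fuse_intents(alpha, beta, w_alpha=[2.0, 0.0], w_beta=[0.0, -2.0]))
```

Because the gate depends on the inputs, the balance between the two intents changes from session to session instead of being fixed globally.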

To validate the effectiveness of our approach, we conducted extensive experiments across five diverse real-world datasets spanning different domains and interaction patterns. These datasets include e-commerce transaction logs, music listening records, and general web interaction data, ensuring that our evaluation covers a wide range of potential application scenarios.

Our experimental results demonstrate consistent and significant improvements over existing state-of-the-art methods across all datasets. The performance gains are particularly notable in scenarios with sparse interaction data or when dealing with new items, where traditional approaches often struggle. The incorporation of categorical information through the β-intent mechanism proves especially valuable in cold-start situations, where limited interaction data makes structural patterns harder to discern.

Detailed analysis of the model components through ablation studies confirms the importance of each architectural element. The multi-layer graph structure shows clear benefits over approaches that consider only a subset of the information levels. Similarly, the β-intent mechanism contributes substantially to prediction accuracy, particularly in datasets where item categories provide strong signals about user preferences.

The practical implications of our work are substantial. By more effectively utilizing the available data at multiple levels and combining structural and contextual information, our model can provide more accurate recommendations while requiring no additional user information beyond the interaction session itself. This makes it particularly suitable for privacy-sensitive applications or scenarios where user profiles are unavailable.

Furthermore, the model’s ability to handle new items through categorical information addresses a common challenge in recommendation systems: it can mitigate the cold-start problem and enable faster integration of new inventory in e-commerce settings. Pruning redundant session relationships also reduces the number of connections the model must process, which should lower computational cost relative to approaches that process all possible connections indiscriminately.

Looking forward, there are several promising directions for extending this work. One avenue is to explore more sophisticated ways of combining the α and β intents beyond the current learned fusion, for example weighting schemes that adapt to session characteristics as the session unfolds. Another is to expand the types of auxiliary information incorporated into the β-intent mechanism, such as temporal patterns or item popularity metrics.

Additionally, while our current implementation focuses on single-domain recommendations, the underlying architecture could potentially be adapted for cross-domain scenarios by extending the graph representations to encompass multiple item types or interaction spaces. This could open up new applications in areas like multimedia recommendation or multi-category retail platforms.

In conclusion, the proposed SRIMC model integrates multi-layer graph representations with categorical information and addresses three limitations of existing approaches: partial use of multi-level session data, redundancy in cross-session aggregation, and weak integration of auxiliary information. The resulting framework captures user intent more comprehensively and, in our experiments, outperforms existing methods across five diverse datasets, which suggests strong potential for real-world deployment in a range of recommendation scenarios.

doi:10.19734/j.issn.1001-3695.2024.08.0323
