Multi-Behavior Recommendation Integrating Self-Attention and Contrastive Learning

Introduction

Modern internet platforms have become increasingly diverse and personalized, encompassing e-commerce sites, music streaming services, news recommendation systems, and more. However, the exponential growth of online content has led to information overload, making it difficult for users to discover relevant items. Recommendation systems address this challenge by analyzing user preferences and suggesting items that align with their interests. Among various recommendation techniques, collaborative filtering (CF) remains one of the most widely used approaches, leveraging historical user-item interactions to infer preferences.

Despite their success, traditional recommendation models often struggle with cold-start scenarios—where new users or items have insufficient interaction data. Multi-behavior recommendation has emerged as a promising solution, utilizing auxiliary behaviors (e.g., browsing, adding to cart) to enhance predictions for target behaviors (e.g., purchasing). However, existing multi-behavior models face two key limitations: (1) they fail to balance the optimization between auxiliary and target behaviors, leading to suboptimal performance, and (2) they do not account for personalized behavioral dependencies, causing interest drift in recommendations.

To address these challenges, this paper introduces SACL (Multi-Behavior Recommendation Integrating Self-Attention and Contrastive Learning), a novel framework that leverages self-attention and contrastive learning to improve recommendation accuracy. SACL constructs independent interaction views for different behavior types, employs graph neural networks to extract user-item relationships, and uses contrastive learning to capture shared user preferences across behaviors. Additionally, a self-attentive multi-behavior weighting network dynamically adjusts loss weights to mitigate noise from auxiliary behaviors.

Background and Related Work

Multi-Behavior Recommendation

Early multi-behavior recommendation models relied on matrix factorization or sampling strategies to incorporate auxiliary behaviors. While these methods improved performance, they lacked the ability to model complex dependencies between behaviors. Recent advancements have focused on learning behavior-specific embeddings and modeling behavioral relationships. For instance, MGNN aggregates multi-behavior features from different graph structures, while NMTR connects behavior predictions in a cascaded manner within a neural collaborative filtering framework.

Graph-based approaches, such as MBGCN and GHCF, utilize graph convolutional networks to capture high-order behavioral dependencies. MBGMN integrates meta-learning to enhance behavioral representations. Despite their effectiveness, these models often suffer from imbalanced optimization, where auxiliary behaviors dominate training, reducing target behavior prediction accuracy.

Attention Mechanisms in Recommendation

Attention mechanisms have been widely adopted to model behavioral dependencies. MATN employs a Transformer-based encoder to capture semantic relationships between behaviors, while DIPN uses hierarchical attention to learn intra- and inter-behavior dependencies. KHGT introduces multi-head attention to distinguish different behavioral contributions. These methods improve recommendation quality but still struggle with personalized behavioral weighting.

Contrastive Learning in Recommendation

Contrastive learning has gained traction for its ability to enhance representation learning. CML applies contrastive learning to align behavior-specific user representations, improving auxiliary behavior utilization. However, existing contrastive approaches do not fully exploit behavioral relationships or mitigate noise from irrelevant auxiliary behaviors.

Methodology

Problem Definition

Given a dataset with users U and items I, SACL constructs interaction graphs for K behavior types, where each behavior k has an adjacency matrix A_k. The goal is to generate user and item embeddings e_u and e_i, perform contrastive learning between auxiliary and target behaviors, and optimize model parameters using a self-attentive weighting network.
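This setup can be sketched concretely. The snippet below builds one binary user-item adjacency matrix per behavior type; the user/item counts, behavior names, and interaction pairs are all hypothetical toy values, not data from the paper.

```python
# Hypothetical toy setting: 4 users, 5 items, K = 2 behavior types
# (e.g. "browse" as an auxiliary behavior, "purchase" as the target).
num_users, num_items = 4, 5

# Observed (user, item) interaction pairs per behavior type.
interactions = {
    "browse":   [(0, 1), (0, 2), (1, 2), (2, 4), (3, 0)],
    "purchase": [(0, 2), (2, 4)],
}

def build_adjacency(pairs):
    """Build a dense 0/1 user-item adjacency matrix A_k."""
    A = [[0.0] * num_items for _ in range(num_users)]
    for u, i in pairs:
        A[u][i] = 1.0
    return A

adjacency = {k: build_adjacency(p) for k, p in interactions.items()}
assert adjacency["purchase"][0][2] == 1.0  # user 0 purchased item 2
```

In practice these matrices would be sparse and far larger; the dense lists here only illustrate the per-behavior graph construction.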

Multi-Behavior Graph Convolutional Network

SACL begins by building separate interaction graphs for each behavior type. A lightweight graph convolutional network (GCN) propagates information across these graphs, aggregating node embeddings to capture multi-behavior features. The GCN employs weighted summation to combine embeddings from different layers, ensuring that nearby neighbors contribute more to the final representation.
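The propagation-and-combination step can be sketched as follows. This is a minimal LightGCN-style sketch under the assumption of symmetric normalization and decaying per-layer weights; the specific normalization and weight schedule are illustrative, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
num_users, num_items, dim, num_layers = 4, 5, 8, 3

# Hypothetical binary user-item adjacency for one behavior type.
A = rng.integers(0, 2, size=(num_users, num_items)).astype(float)

# Symmetric degree normalization (lightweight GCN: no feature
# transforms or nonlinearities between layers).
d_u = np.maximum(A.sum(axis=1), 1.0)
d_i = np.maximum(A.sum(axis=0), 1.0)
A_norm = A / np.sqrt(d_u)[:, None] / np.sqrt(d_i)[None, :]

e_u = rng.standard_normal((num_users, dim)) * 0.1  # layer-0 user embeddings
e_i = rng.standard_normal((num_items, dim)) * 0.1  # layer-0 item embeddings

# Propagate across the bipartite graph and keep every layer's output.
u_layers, i_layers = [e_u], [e_i]
for _ in range(num_layers):
    e_u, e_i = A_norm @ e_i, A_norm.T @ e_u
    u_layers.append(e_u)
    i_layers.append(e_i)

# Weighted summation over layers: decaying weights 1/(l+1) let
# nearby neighbors contribute more to the final representation.
weights = [1.0 / (l + 1) for l in range(num_layers + 1)]
final_u = sum(w * e for w, e in zip(weights, u_layers)) / sum(weights)
final_i = sum(w * e for w, e in zip(weights, i_layers)) / sum(weights)
print(final_u.shape)  # (4, 8)
```

SACL would run this propagation once per behavior graph, yielding behavior-specific embeddings for the contrastive stage.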

Multi-Behavior Contrastive Learning

To enhance auxiliary behavior utilization, SACL applies contrastive learning between target and auxiliary behaviors. Positive pairs consist of the same user’s embeddings across different behaviors, reflecting shared preferences, while negative pairs consist of different users’ embeddings. The contrastive loss, based on InfoNCE, maximizes agreement between positive pairs and minimizes similarity between negative pairs. This approach strengthens behavioral relationships and reduces noise from irrelevant auxiliary behaviors.
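The InfoNCE objective over two behavior views can be sketched like this; the temperature value and embedding shapes are illustrative assumptions.

```python
import numpy as np

def info_nce(z_target, z_aux, tau=0.2):
    """InfoNCE loss over user embeddings from two behavior views.

    Positive pairs: the same user's embedding in the target and an
    auxiliary behavior view (the diagonal); negatives: all other
    users in the batch (off-diagonal entries).
    """
    # L2-normalize so dot products become cosine similarities.
    z_t = z_target / np.linalg.norm(z_target, axis=1, keepdims=True)
    z_a = z_aux / np.linalg.norm(z_aux, axis=1, keepdims=True)
    sim = z_t @ z_a.T / tau                 # pairwise similarities
    sim -= sim.max(axis=1, keepdims=True)   # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))      # -log p(positive | row)

rng = np.random.default_rng(1)
z_purchase = rng.standard_normal((8, 16))   # hypothetical target view
z_browse = z_purchase + 0.05 * rng.standard_normal((8, 16))  # aligned aux view
loss = info_nce(z_purchase, z_browse)
# Well-aligned views yield a much lower loss than random pairings.
print(loss < info_nce(z_purchase, rng.standard_normal((8, 16))))
```

Minimizing this loss pulls each user's representations across behaviors together while pushing different users apart, which is the agreement/dissimilarity trade-off described above.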

Self-Attentive Multi-Behavior Optimization

A key innovation in SACL is the self-attentive weighting network, which dynamically adjusts loss weights based on behavioral dependencies. The network consists of two components:

  1. Meta-Knowledge Encoder: Encodes user behavior features into meta-knowledge, capturing personalized behavioral preferences. Two encoding strategies are used: one scales contrastive loss values, while the other directly combines loss values with user embeddings.
  2. Self-Attentive Weighting Network: Uses scaled dot-product attention to compute behavior-specific weights. The network projects meta-knowledge into query, key, and value matrices, applying a nonlinear transformation to generate final weights. These weights balance auxiliary and target behavior contributions during training.
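The weighting network's forward pass can be sketched as scaled dot-product attention over per-behavior meta-knowledge vectors. The projection matrices here are randomly initialized stand-ins for learned parameters, and normalizing the K output weights to sum to 1 is an assumption of this sketch.

```python
import numpy as np

rng = np.random.default_rng(2)
K, d = 4, 16  # K behavior types, meta-knowledge dimension

# Hypothetical meta-knowledge: one vector per behavior, e.g. user
# embeddings combined with that behavior's contrastive loss value.
meta = rng.standard_normal((K, d))

# Stand-ins for learnable query/key/value projections.
W_q, W_k, W_v = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

Q, Kmat, V = meta @ W_q, meta @ W_k, meta @ W_v
attn = softmax(Q @ Kmat.T / np.sqrt(d))  # scaled dot-product attention
context = attn @ V                       # behavior-aware representations

# Nonlinear transform down to one loss weight per behavior.
w_out = rng.standard_normal((d, 1)) * 0.1
weights = softmax(np.tanh(context @ w_out).ravel())
print(weights.shape, round(float(weights.sum()), 6))  # (4,) 1.0
```

During training, these K weights would scale the per-behavior loss terms, letting the model down-weight noisy auxiliary behaviors per user.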

Model Training

SACL optimizes parameters using a three-stage training strategy:

  1. Initial Training: Jointly trains the GCN and weighting network on the full dataset.
  2. Weight Network Fine-Tuning: Updates the weighting network using meta-knowledge embeddings.
  3. Final Optimization: Adjusts GCN parameters using the refined weights.

This staged approach accelerates convergence and improves recommendation accuracy.
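A minimal, runnable caricature of this schedule is shown below. The "models" are single scalars and the losses are toy quadratics standing in for the GCN and weighting network; only the alternating three-stage structure mirrors the strategy above.

```python
# Toy stand-ins: gcn_param for the GCN, weight_param for the
# weighting network; both have optimum 1.0 under these toy losses.
gcn_param, weight_param, lr = 5.0, 2.0, 0.1

for epoch in range(50):
    # Stage 1: initial joint training (GCN step under current weight).
    gcn_param -= lr * 2 * weight_param * (gcn_param - 1.0)

    # Stage 2: fine-tune only the weighting network.
    weight_param -= lr * 2 * (weight_param - 1.0)

    # Stage 3: final optimization of the GCN with the refined weight.
    gcn_param -= lr * 2 * weight_param * (gcn_param - 1.0)

print(round(gcn_param, 2), round(weight_param, 2))  # both approach 1.0
```

The real model alternates these stages over mini-batches with the full contrastive and recommendation losses; the point of the sketch is only the ordering of updates.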

Experiments

Datasets

Experiments were conducted on two real-world datasets:
• Tmall: Contains four behaviors (browse, favorite, add-to-cart, purchase) from an e-commerce platform.

• IJCAI-Contest: Includes similar behaviors (click, favorite, add-to-cart, purchase) from a retail system.

Baselines

SACL was compared against several state-of-the-art models:
• Single-Behavior Models: NGCF, LightGCN.

• Heterogeneous Graph Models: HGT, HeCo.

• Multi-Behavior Models: NMTR, MBGCN.

• Attention-Based Models: MATN, KHGT.

• Personalized Models: EHCF, CML, DPT.

Evaluation Metrics

Performance was measured using:
• Hit Rate (HR@10): Proportion of correctly recommended items in the top-10 list.

• Normalized Discounted Cumulative Gain (NDCG@10): Ranking quality considering item relevance and position.
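Both metrics are standard and easy to state in code for the common leave-one-out setting, where each user has a single held-out ground-truth item:

```python
import math

def hit_rate_at_k(ranked_items, ground_truth, k=10):
    """HR@k: 1 if the held-out item appears in the top-k list, else 0."""
    return float(ground_truth in ranked_items[:k])

def ndcg_at_k(ranked_items, ground_truth, k=10):
    """NDCG@k with one relevant item: 1/log2(rank + 1) if hit, else 0."""
    for rank, item in enumerate(ranked_items[:k], start=1):
        if item == ground_truth:
            return 1.0 / math.log2(rank + 1)
    return 0.0

# Toy example: the target item 42 is ranked 3rd by the model.
ranked = [7, 13, 42, 5, 99, 1, 2, 3, 4, 6]
print(hit_rate_at_k(ranked, 42))        # 1.0
print(round(ndcg_at_k(ranked, 42), 4))  # 1/log2(4) = 0.5
```

Reported scores are these per-user values averaged over all test users.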

Results

SACL outperformed all baselines, achieving average improvements of 10% in HR and 14% in NDCG. Key findings include:
• Multi-behavior models (e.g., MBGCN) surpassed single-behavior models, confirming the value of auxiliary behaviors.

• Attention-based models (e.g., KHGT) outperformed non-attention models, highlighting the importance of behavioral dependency modeling.

• SACL’s contrastive learning and self-attentive weighting provided significant gains over personalized models like CML and DPT.

Ablation Study

Removing key components degraded performance:
• Without Contrastive Learning (w/o MCL): Reduced ability to align behavioral representations.

• Without Meta-Knowledge Encoding (w/o MKE): Weakened personalized weighting.

• Without Weighting Network (w/o AWN): Led to imbalanced behavior optimization.

Hyperparameter Analysis

• GCN Layers: Three layers achieved optimal performance by balancing local and global information.

• Embedding Dimension: A dimension of 128 provided the best trade-off between expressiveness and efficiency.

Case Study

A case study of one user's interaction sequence illustrates SACL's advantage: traditional models relying solely on purchase data mispredicted the user's next item, whereas SACL leveraged auxiliary behaviors (e.g., browsing) to recommend it correctly.

Conclusion

SACL addresses critical challenges in multi-behavior recommendation by integrating contrastive learning and self-attentive weighting. The model effectively balances auxiliary and target behaviors while capturing personalized preferences. Experimental results validate its superiority over existing methods. Future work will explore Transformer architectures for finer-grained behavioral dependency modeling.

doi.org/10.19734/j.issn.1001-3695.2024.07.0289
