Cooperative Computation Offloading Method Based on Boosting Prioritized Experience Replay
Introduction
The rapid development of mobile cellular networks has led to millions of terminal devices connecting to base station servers, generating massive amounts of data that must be processed through distributed cloud computing (DCC). However, traditional DCC architectures rely heavily on centralized data processing and storage, resulting in significant transmission delays and increased network load. These limitations make it difficult to meet the low-latency requirements of applications such as virtual reality, augmented reality, and autonomous driving. To address these challenges, mobile edge computing (MEC) has emerged as a promising solution. MEC deploys edge servers closer to users, reducing the distance between terminal devices and servers, thereby minimizing computation latency and energy consumption while alleviating network congestion.
MEC relies on task scheduling methods to determine the order in which tasks execute and on task offloading methods to decide where tasks are processed. Computation offloading methods that jointly address both decisions therefore play a crucial role in MEC performance.
Existing computation offloading approaches often fail to account for different task queuing conditions at terminal devices and edge servers, leading to inaccuracies in latency estimation. Moreover, reinforcement learning (RL)-based offloading methods typically use temporal difference (TD) error-based experience replay, which may introduce estimation bias and reduce the accuracy of offloading decisions. To overcome these limitations, this paper proposes COOPERANT, a cooperative computation offloading method based on Boosting prioritized experience replay.
Background and Related Work
Task Scheduling in Computation Offloading
Traditional task scheduling methods often handle homogeneous tasks using a first-come-first-served (FCFS) approach, neglecting the varying priorities of latency-sensitive and energy-sensitive tasks. Additionally, existing methods do not consider the different queuing behaviors of tasks arriving at terminal devices and edge servers, leading to inaccurate latency predictions.
Recent research has introduced improvements by considering heterogeneous task characteristics. For example, some studies have incorporated priority-based scheduling in unmanned aerial vehicle (UAV) swarms to optimize computation delays. Others have developed multi-source fluid queue models to estimate task queuing delays more accurately. However, these approaches still lack a comprehensive consideration of dynamic task priorities and queuing behaviors in MEC environments.
Task Offloading Techniques
Early research on task offloading relied on heuristic algorithms such as genetic algorithms (GA), ant colony optimization (ACO), and whale optimization algorithms (WOA). While these methods can optimize system overhead to some extent, they require centralized decision-making entities, limiting their adaptability in dynamic environments.
Recent advancements have leveraged multi-agent deep reinforcement learning (MADRL) to enable autonomous decision-making in terminal devices. However, conventional MADRL methods often use random or TD error-based experience replay, which may not effectively prioritize high-value historical experiences. This limitation reduces the efficiency of learning and leads to suboptimal offloading decisions.
COOPERANT Methodology
System Model
COOPERANT operates in a mobile cellular network edge computing scenario with multiple terminal devices and edge servers. Each terminal device has limited computational resources and must offload complex tasks to edge servers deployed near base stations. The system aims to minimize task computation latency and energy consumption while optimizing resource utilization.
Task Queuing Models
To address the limitations of existing queuing models, COOPERANT introduces two novel queuing models:
- Terminal Device Queuing Model (M/M/1/∞/∞/Priority):
• Tasks arrive at terminal devices following a Poisson process.
• The model employs a preemptive priority-based scheduling mechanism, where high-priority tasks (e.g., latency-sensitive tasks) can interrupt lower-priority tasks.
• This approach ensures that critical tasks are processed with minimal delay.
- Edge Server Queuing Model (M/M/S/∞/∞/FCFS):
• Tasks arriving at edge servers are processed in a non-preemptive FCFS manner.
• The model accounts for multiple servers and dynamically adjusts resource allocation based on task arrival rates.
These models enable more accurate estimation of task response times in dynamic environments.
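The response-time estimates these two models support correspond to standard queueing-theory results. The expressions below are a sketch of those textbook formulas under the stated assumptions (exponential service at rate μ, per-class Poisson arrivals); the paper's exact derivation may differ.

```latex
% Mean response time of a priority-k task at a terminal device
% (M/M/1, preemptive-resume, \rho_i = \lambda_i/\mu,
% \sigma_k = \sum_{i \le k} \rho_i, smaller k = higher priority):
T_k = \frac{1/\mu}{(1 - \sigma_{k-1})(1 - \sigma_k)}

% Mean waiting and response time at an S-server edge queue (M/M/S, FCFS),
% via the Erlang C probability of queueing, offered load a = \lambda/\mu:
C(S, a) = \frac{\dfrac{a^S}{S!} \cdot \dfrac{1}{1 - a/S}}
               {\displaystyle\sum_{n=0}^{S-1} \frac{a^n}{n!}
                + \dfrac{a^S}{S!} \cdot \dfrac{1}{1 - a/S}},
\qquad
W_q = \frac{C(S, a)}{S\mu - \lambda}, \qquad T = W_q + \frac{1}{\mu}.
```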
Computation Offloading Optimization
COOPERANT formulates the task offloading problem as a joint optimization model, where the objective is to maximize the system’s reward while minimizing latency and energy consumption. Key components include:
- Local Computation vs. Offloading:
• Terminal devices decide whether to process tasks locally or offload them to edge servers.
• The offloading decision considers factors such as task size, computational requirements, and channel conditions.
- Transmission and Energy Consumption:
• The model accounts for transmission delays and energy costs when offloading tasks to edge servers.
• It optimizes bandwidth allocation and transmission power to reduce overhead.
- Reward Function:
• The reward function balances latency reduction and energy savings.
• Penalties are imposed if task completion exceeds predefined latency or energy thresholds.
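A minimal sketch of a reward with this shape, where the weights `w_t` and `w_e`, the thresholds `t_max` and `e_max`, and the fixed penalty are illustrative assumptions rather than the paper's exact formulation:

```python
def reward(latency, energy, t_max, e_max, w_t=0.5, w_e=0.5, penalty=1.0):
    """Higher reward for lower latency/energy; penalize threshold violations.

    All weights and thresholds here are hypothetical placeholders.
    """
    # Normalized weighted cost: smaller latency/energy -> reward closer to 0.
    r = -(w_t * latency / t_max + w_e * energy / e_max)
    if latency > t_max:
        r -= penalty  # latency constraint violated
    if energy > e_max:
        r -= penalty  # energy constraint violated
    return r
```

Structuring the reward as a negated, normalized cost plus hard violation penalties lets the same scalar trade off both objectives while still sharply discouraging threshold breaches.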
Boosting Prioritized Experience Replay
Traditional RL methods use TD errors to prioritize experiences, which may introduce noise and bias. COOPERANT introduces a novel Boosting-based prioritized experience replay mechanism:
- Experience Weighting:
• Historical experiences are weighted based on their contribution to reward maximization.
• High-value experiences (e.g., those leading to low-latency or low-energy solutions) are assigned higher weights.
- Boosting Mechanism:
• The algorithm constructs weak learners (base classifiers) to evaluate the importance of experiences.
• Misclassified experiences (those with high errors) receive increased weights in subsequent iterations.
- Sampling Strategy:
• Experiences are sampled with probabilities proportional to their weights.
• This ensures that high-value experiences are used for training more frequently, accelerating convergence.
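The three steps above can be sketched as a small replay buffer. The AdaBoost-style update (multiplying each weight by the exponential of its error, then renormalizing) is an illustrative assumption about the Boosting mechanism, not the paper's exact rule:

```python
import math
import random


class BoostingReplayBuffer:
    """Sketch of a weight-proportional experience replay buffer."""

    def __init__(self):
        self.experiences = []  # (state, action, reward, next_state) tuples
        self.weights = []      # one sampling weight per stored experience

    def add(self, experience, weight=1.0):
        self.experiences.append(experience)
        self.weights.append(weight)

    def reweight(self, errors, lr=1.0):
        # Experiences the current weak learner handles poorly get larger
        # weights (AdaBoost-style exponential update), then normalize.
        self.weights = [w * math.exp(lr * e)
                        for w, e in zip(self.weights, errors)]
        total = sum(self.weights)
        self.weights = [w / total for w in self.weights]

    def sample(self, k):
        # Draw k experiences with probability proportional to weight.
        return random.choices(self.experiences, weights=self.weights, k=k)
```

Under this scheme, repeatedly misjudged experiences dominate the sampling distribution, so training batches concentrate on the transitions the learner currently values or handles worst.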
Multi-Agent Deep Reinforcement Learning
COOPERANT employs a MADRL framework where each terminal device acts as an independent agent. The framework includes:
- Actor-Critic Architecture:
• Actor networks generate offloading decisions based on local observations.
• Critic networks evaluate the quality of actions and guide policy updates.
- Centralized Training with Decentralized Execution:
• Agents share experiences during training but act independently during execution.
• This approach balances collaboration and autonomy.
- Network Updates:
• The algorithm uses soft updates to stabilize training.
• Target networks are periodically synchronized with main networks to reduce variance.
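The soft update described above is the standard Polyak-averaging rule used in actor-critic methods such as MADDPG. A minimal sketch, with parameters represented as plain lists of floats and `tau` chosen illustratively (deep RL frameworks apply the same rule per tensor):

```python
def soft_update(target_params, main_params, tau=0.01):
    """Polyak averaging: target <- tau * main + (1 - tau) * target.

    Small tau means the target network trails the main network slowly,
    which stabilizes the learning targets used by the critic.
    """
    return [tau * m + (1.0 - tau) * t
            for t, m in zip(target_params, main_params)]
```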
Experimental Evaluation
Simulation Setup
COOPERANT was evaluated in a simulated MEC environment with multiple terminal devices and edge servers. Key parameters included:
• Terminal Devices: 4 devices with varying computational capabilities.
• Edge Servers: 10 servers, each with multiple transmission channels.
• Task Characteristics: Tasks with different sizes, priorities, and latency/energy constraints.
Performance Comparison
COOPERANT was compared against several baseline methods, including:
• Heuristic Algorithms: GA, ACO, and WOA.
• RL-Based Methods: MADDPG (random experience replay), TD-MADDPG (TD error-based replay), and MAPPO (on-policy RL).
Results
- Queuing Delay Reduction:
• COOPERANT reduced terminal device queuing delays by up to 71% compared to FCFS models.
• Edge server queuing delays were reduced by up to 66%.
- Latency and Energy Savings:
• COOPERANT achieved lower average latency (19.5%–47.5% reduction) and energy consumption (36.3%–81.4% reduction) than baseline methods.
- Convergence Speed:
• The Boosting-based experience replay accelerated convergence, with COOPERANT reaching stability 25%–40% faster than other RL methods.
- Scalability:
• The method maintained performance across varying task arrival rates, demonstrating robustness in dynamic environments.
Conclusion
COOPERANT presents a novel approach to computation offloading in MEC environments by integrating Boosting-based prioritized experience replay with MADRL. The proposed method addresses key limitations of existing techniques, including inaccurate queuing models and suboptimal experience replay mechanisms. Experimental results demonstrate significant improvements in latency reduction, energy efficiency, and convergence speed compared to state-of-the-art methods.
Future research will explore extensions to scenarios involving both cooperative and competitive interactions among terminal devices, further enhancing the adaptability of computation offloading strategies.
DOI: 10.19734/j.issn.1001-3695.2024.08.0313