Introduction
The integration of unmanned aerial vehicles (UAVs) into future communication networks has attracted significant attention because of their flexibility, high mobility, and rapid deployment. UAVs support applications such as military reconnaissance, environmental monitoring, and disaster response. Among these, real-time video transmission in scenarios such as forest fire monitoring is especially demanding because it requires both low latency and high quality of experience (QoE). UAV-assisted mobile edge computing (MEC) has emerged as a promising solution: the UAV acts as both a relay station and an edge server, processing video data before transmitting it onward.
This article explores a UAV-assisted MEC system that optimizes video task offloading by jointly managing device association, transmission power allocation, video transcoding strategy, and the UAV flight trajectory. The system aims to minimize latency while maintaining high video quality, using a deep reinforcement learning (DRL) approach based on the Soft Actor-Critic (SAC) algorithm. In simulation, the proposed framework achieves higher system utility than the DDPG, TD3, and random baselines and converges stably.
System Architecture and Challenges
UAV-Assisted MEC Framework
The UAV-assisted MEC system consists of multiple mobile devices (MDs) equipped with cameras, a UAV-mounted MEC server, and a ground base station (GBS). The MDs capture video footage in remote or hazardous areas, such as forest fire zones, and offload the raw video data to the UAV for processing. The UAV performs video transcoding—reducing the data size by adjusting the bitrate—before forwarding the compressed video to the GBS for further analysis.
Key components of the system include:
- Device Association: The UAV selects one MD per time slot to serve, ensuring efficient resource allocation.
- Communication Links: Two primary channels exist—UAV-to-MD for data collection and UAV-to-GBS for forwarding processed video.
- Video Transcoding: A computationally intensive task performed on the UAV to balance latency and video quality.
- Trajectory Optimization: The UAV dynamically adjusts its flight path to maintain strong communication links with MDs and the GBS.
Challenges in UAV-Assisted Video Offloading
- Latency Constraints: Real-time video transmission demands minimal delays, requiring optimized data offloading, transcoding, and relay processes.
- Energy Limitations: UAVs operate on limited battery power, necessitating energy-efficient flight paths and computation strategies.
- Dynamic Environments: MD mobility, obstacles, and fluctuating channel conditions introduce unpredictability, requiring adaptive decision-making.
- QoE Requirements: Video quality must remain high despite compression, necessitating intelligent bitrate selection.
Problem Formulation and Optimization
System Model
The system operates in discrete time slots, with the UAV hovering at fixed positions to serve MDs sequentially. The optimization problem focuses on maximizing system utility, defined as a weighted combination of latency reduction and QoE enhancement.
Latency Model
System latency comprises three components:
- Data Offloading Delay: Time taken for MDs to transmit raw video to the UAV.
- Transcoding Delay: Processing time on the UAV to compress video data.
- Data Relay Delay: Time for the UAV to forward processed video to the GBS.
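The three delay components above can be sketched as a simple sum. This is a minimal illustration, not the paper's exact model: it assumes a Shannon-capacity link rate and a cycles-per-bit transcoding cost, and every function and parameter name here is hypothetical.

```python
import math

def link_rate_bps(bandwidth_hz, tx_power_w, channel_gain, noise_power_w):
    """Achievable rate of one link (assumed Shannon-capacity model)."""
    snr = tx_power_w * channel_gain / noise_power_w
    return bandwidth_hz * math.log2(1.0 + snr)

def total_latency_s(raw_bits, transcoded_bits, uplink_bps, relay_bps,
                    cycles_per_bit, cpu_hz):
    """Sum of offloading, transcoding, and relay delays for one video task."""
    t_offload = raw_bits / uplink_bps                  # MD -> UAV
    t_transcode = raw_bits * cycles_per_bit / cpu_hz   # assumed CPU-cycle model
    t_relay = transcoded_bits / relay_bps              # UAV -> GBS
    return t_offload + t_transcode + t_relay
```

Under this sketch, a lower target bitrate shrinks `transcoded_bits` and therefore the relay delay, which is the latency side of the quality trade-off.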
Energy Consumption Model
UAV energy expenditure includes:
- Transcoding Energy: Proportional to computational load.
- Communication Energy: Power used for data transmission.
- Flight and Hovering Energy: Determined by UAV speed and hovering duration.
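As a rough sketch of how these three terms might be combined, the function below assumes a common dynamic-power model for computation (energy proportional to CPU cycles times the square of the clock frequency) and constant propulsion power while flying at a given speed or hovering. The coefficient `kappa` and all other names and values are illustrative assumptions, not taken from the source.

```python
def uav_energy_j(raw_bits, cycles_per_bit, cpu_hz, kappa,
                 tx_power_w, relay_time_s,
                 fly_power_w, fly_time_s, hover_power_w, hover_time_s):
    """Total UAV energy for one time slot, in joules."""
    # Assumed computation model: kappa * (CPU cycles) * f^2.
    e_transcode = kappa * raw_bits * cycles_per_bit * cpu_hz ** 2
    # Transmission energy: power times time spent relaying to the GBS.
    e_comm = tx_power_w * relay_time_s
    # Assumed constant propulsion power at the chosen speed and while hovering.
    e_motion = fly_power_w * fly_time_s + hover_power_w * hover_time_s
    return e_transcode + e_comm + e_motion
```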
QoE Model
Video quality is measured as the logarithm of the ratio of the target bitrate to the minimum acceptable bitrate, so QoE grows with bitrate but with diminishing returns, and it is zero when the target bitrate equals the minimum.
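One plausible reading of this definition is shown below. The natural logarithm is an assumption (the source does not state the base), and the function name is hypothetical.

```python
import math

def qoe(target_bitrate, min_bitrate):
    """Log-ratio QoE: 0 at the minimum bitrate, growing with diminishing returns."""
    if target_bitrate < min_bitrate:
        raise ValueError("target bitrate is below the minimum acceptable bitrate")
    return math.log(target_bitrate / min_bitrate)  # log base is an assumption
```

With this form, doubling the bitrate always adds the same QoE increment, so the first increases above the minimum matter most relative to their latency cost.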
Optimization Objectives
The primary goal is to maximize system utility by optimizing:
- Device Association: Selecting which MD to serve in each time slot.
- Transmission Power: Allocating power for MD-to-UAV and UAV-to-GBS links.
- Video Transcoding Strategy: Choosing optimal bitrates to balance quality and latency.
- UAV Trajectory: Planning flight paths to minimize distance and energy consumption.
Constraints include UAV battery limits, MD mobility, and communication obstacles.
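The sketch below illustrates how a trajectory decision and two of these constraints could be enforced in a single time slot: the speed is capped, the UAV is kept inside the service area, and a move is accepted only if the battery budget allows it. The clipping strategy and all names are assumptions made for illustration.

```python
import math

def step_uav(x, y, speed, heading, dt, area_side_m, v_max):
    """Advance the UAV one slot, capping speed and clipping to the area."""
    speed = min(max(speed, 0.0), v_max)
    new_x = min(max(x + speed * math.cos(heading) * dt, 0.0), area_side_m)
    new_y = min(max(y + speed * math.sin(heading) * dt, 0.0), area_side_m)
    return new_x, new_y

def within_battery(energy_used_j, step_energy_j, battery_j):
    """True if spending step_energy_j keeps total use within the battery budget."""
    return energy_used_j + step_energy_j <= battery_j
```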
SAC-Based Deep Reinforcement Learning Algorithm
Markov Decision Process (MDP) Formulation
The optimization problem is modeled as an MDP, where the UAV acts as an agent interacting with the environment. Key elements include:
State Space
The state at each time slot includes:
- Positions of MDs and the UAV.
- Obstruction status between UAV-MD and UAV-GBS links.
- Remaining video task sizes and UAV battery levels.
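A state with these fields could be flattened into a normalized vector before being fed to the networks. The layout, field names, and normalization constants below are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Observation:
    uav_xy: Tuple[float, float]
    md_xy: List[Tuple[float, float]]
    md_blocked: List[bool]        # obstruction on each UAV-MD link
    gbs_blocked: bool             # obstruction on the UAV-GBS link
    remaining_bits: List[float]   # unfinished video task per MD
    battery_j: float

def to_vector(obs, area_side_m, max_bits, max_battery_j):
    """Flatten an observation into a list of values scaled to [0, 1]."""
    vec = [obs.uav_xy[0] / area_side_m, obs.uav_xy[1] / area_side_m]
    for x, y in obs.md_xy:
        vec += [x / area_side_m, y / area_side_m]
    vec += [1.0 if blocked else 0.0 for blocked in obs.md_blocked]
    vec.append(1.0 if obs.gbs_blocked else 0.0)
    vec += [bits / max_bits for bits in obs.remaining_bits]
    vec.append(obs.battery_j / max_battery_j)
    return vec
```

For four MDs this layout yields a 20-dimensional state: 2 UAV coordinates, 8 MD coordinates, 5 obstruction flags, 4 task sizes, and 1 battery level.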
Action Space
The UAV’s actions involve:
- Selecting an MD to serve.
- Adjusting flight speed and direction.
- Setting transmission power levels.
- Choosing video transcoding bitrates.
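SAC policies typically emit continuous values squashed into [-1, 1], so a mixed action like this one has to be decoded into discrete and continuous decisions. The mapping below is one possible scheme under that assumption; the six-component layout, the uniform binning of discrete choices, and all names are hypothetical rather than the paper's design.

```python
import math

def decode_action(raw, num_mds, v_max, md_p_max, uav_p_max, bitrates):
    """Map 6 values in [-1, 1] to (MD index, speed, heading, powers, bitrate)."""
    # Rescale each component to [0, 1], clamping out-of-range inputs.
    unit = [min(max((a + 1.0) / 2.0, 0.0), 1.0) for a in raw]
    # Discrete choices: split [0, 1] into equal bins (assumed scheme).
    md_index = min(int(unit[0] * num_mds), num_mds - 1)
    bitrate = bitrates[min(int(unit[5] * len(bitrates)), len(bitrates) - 1)]
    # Continuous choices: linear scaling to the physical range.
    speed = unit[1] * v_max
    heading = unit[2] * 2.0 * math.pi
    md_power = unit[3] * md_p_max
    uav_power = unit[4] * uav_p_max
    return md_index, speed, heading, md_power, uav_power, bitrate
```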
Reward Function
The reward balances latency reduction and QoE improvement, encouraging the UAV to make decisions that enhance overall system performance.
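A minimal sketch of such a reward, assuming a linear weighted utility and a fixed penalty when the battery constraint is violated. The weights, the penalty value, and the function name are illustrative assumptions; the source does not give the exact form.

```python
def step_reward(latency_s, qoe_value, w_latency=0.5, w_qoe=0.5,
                battery_violated=False, penalty=10.0):
    """Weighted utility: higher QoE raises it, higher latency lowers it."""
    utility = w_qoe * qoe_value - w_latency * latency_s
    # Assumed handling of constraint violations: subtract a fixed penalty.
    return utility - penalty if battery_violated else utility
```

The two weights set the trade-off: a larger `w_latency` pushes the agent toward lower bitrates and shorter flights, while a larger `w_qoe` favors higher bitrates.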
Soft Actor-Critic (SAC) Algorithm
SAC is an off-policy actor-critic DRL algorithm that maximizes the expected reward together with the policy entropy, which promotes exploration and robustness. Key features include:
- Policy and Q-Networks: SAC uses separate networks: Q-networks estimate the value of state-action pairs, and a policy network selects actions.
- Entropy Regularization: The entropy term rewards diverse actions, which helps the policy avoid converging prematurely to a suboptimal solution.
- Adaptive Temperature Coefficient: Automatically adjusts exploration levels during training.
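The adaptive temperature can be illustrated with the standard SAC temperature objective, J(alpha) = E[-alpha * (log pi(a|s) + H_target)], where H_target is a target entropy. The sketch below takes one plain gradient step on log(alpha); the learning rate, the batch of log-probabilities, and the function name are assumptions, and a real implementation would use an automatic-differentiation framework.

```python
import math

def update_log_alpha(log_alpha, log_probs, target_entropy, lr):
    """One gradient-descent step on J(alpha) with respect to log(alpha)."""
    alpha = math.exp(log_alpha)
    # Batch mean of (log pi(a|s) + H_target); positive when the policy's
    # entropy estimate is below the target.
    mean_term = sum(lp + target_entropy for lp in log_probs) / len(log_probs)
    grad = -alpha * mean_term          # dJ / d log(alpha)
    return log_alpha - lr * grad
```

When the policy's entropy falls below the target, the step raises alpha and strengthens the exploration bonus; when entropy exceeds the target, alpha shrinks.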
Training Process
- Experience Replay: Stores past interactions to break temporal correlations.
- Network Updates: Q-networks and policy networks are iteratively refined using gradient descent.
- Target Network Soft Updates: Ensures stable training by slowly synchronizing target and main networks.
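Two of these mechanisms are simple enough to sketch without a deep learning framework: a fixed-capacity replay buffer with uniform sampling, and a soft (Polyak) target update applied to flat parameter lists. The class and function names, and the representation of parameters as plain lists of floats, are simplifying assumptions.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity store of transitions; the oldest are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the temporal correlation between transitions.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)

def soft_update(target_params, main_params, tau):
    """Move each target parameter a fraction tau toward the main network."""
    return [(1.0 - tau) * t + tau * m
            for t, m in zip(target_params, main_params)]
```

A small `tau` (for example 0.005, a commonly used value) makes the target networks change slowly, which stabilizes the Q-value targets.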
Performance Evaluation
Simulation Setup
Experiments were conducted in a 200m × 200m area with one UAV, four MDs, and a fixed GBS. Key parameters included:
- UAV flight height: 100m.
- Maximum UAV speed: 20 m/s.
- Video task sizes: 3.5–4 MB per MD.
- Channel bandwidth: 1 MHz.
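For reproduction purposes, the listed parameters can be collected in a single configuration object. The class and field names are hypothetical, and the conversion of MB to bits assumes decimal megabytes; only the numeric values come from the setup above.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class SimConfig:
    area_side_m: float = 200.0
    num_uavs: int = 1
    num_mds: int = 4
    uav_height_m: float = 100.0
    uav_max_speed_mps: float = 20.0
    task_size_mb_min: float = 3.5
    task_size_mb_max: float = 4.0
    bandwidth_hz: float = 1e6

    def task_size_bits(self, size_mb):
        # Assumes 1 MB = 10**6 bytes = 8 * 10**6 bits.
        return size_mb * 8e6

    def max_slant_distance_m(self):
        """Longest possible UAV-to-ground distance within the area."""
        diagonal = self.area_side_m * math.sqrt(2.0)
        return math.hypot(diagonal, self.uav_height_m)
```

With these values the worst-case link, from a UAV above one corner to a device at the opposite corner, is 300 m long.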
Results and Analysis
Convergence Analysis
SAC-UNCO, the proposed SAC-based algorithm, converged stably after 300 training episodes and achieved higher system utility than the baseline algorithms (DDPG, TD3, and RANDOM).
Latency and QoE Comparison
- Task Volume Variations: SAC-UNCO reduced latency by 9–36% compared to baselines while improving QoE by 7–47%.
- Bandwidth Impact: At 1 MHz bandwidth, SAC-UNCO achieved 11–36% lower latency and 11–47% higher QoE.
- Flight Height Effects: Higher UAV altitudes increased latency due to longer communication distances, but SAC-UNCO maintained superior performance.
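The flight-height effect follows from link geometry, which the sketch below illustrates. It assumes a free-space line-of-sight model in which channel gain falls with the square of the slant distance, combined with a Shannon-capacity rate; the reference gain, noise power, and function name are illustrative assumptions rather than the paper's channel model.

```python
import math

def offload_delay_s(task_bits, altitude_m, horizontal_m, bandwidth_hz,
                    tx_power_w, gain_at_1m, noise_w):
    """Offloading delay for a UAV at a given altitude and horizontal offset."""
    slant_m = math.hypot(altitude_m, horizontal_m)
    # Assumed free-space model: gain decays with distance squared.
    snr = tx_power_w * gain_at_1m / (slant_m ** 2 * noise_w)
    rate_bps = bandwidth_hz * math.log2(1.0 + snr)
    return task_bits / rate_bps
```

Raising the altitude lengthens the slant distance, lowers the signal-to-noise ratio and the rate, and so lengthens the offloading delay for the same task.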
Energy Efficiency
The algorithm effectively managed UAV energy consumption, ensuring operations remained within battery constraints.
Conclusion
The proposed UAV-assisted MEC system, powered by the SAC-UNCO algorithm, addresses critical challenges in real-time video task offloading. By jointly optimizing device association, power allocation, transcoding strategies, and flight trajectories, the system achieves significant latency reductions and QoE improvements. The SAC-based approach excels in dynamic environments, offering robust convergence and adaptability.
Future research could extend this framework to multi-UAV scenarios, further enhancing scalability and resilience in complex monitoring applications.
DOI: 10.19734/j.issn.1001-3695.2024.09.0293