Coordinated Slicing and Admission Control using Multi-Agent Deep Reinforcement Learning

Bibliographic Details
Published in: IEEE eTransactions on network and service management Vol. 20; no. 2; p. 1
Main Authors: Sulaiman, Muhammad; Moayyedi, Arash; Ahmadi, Mahdieh; Salahuddin, Mohammad A.; Boutaba, Raouf; Saleh, Aladdin
Format: Journal Article
Language: English
Published: New York IEEE 01-06-2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Description
Summary: 5G Cloud Radio Access Networks (C-RANs) facilitate new forms of flexible resource management, such as dynamic RAN function splitting and placement. Virtualized RAN functions can be placed at different sites in the substrate network based on resource availability and slice constraints. Due to limited resources in the substrate network and variability in the revenue of slices, the Infrastructure Provider (InP) must perform network slicing strategically, accepting or rejecting slice requests to maximize long-term revenue. In this paper, we propose to use multi-agent Deep Reinforcement Learning (DRL) to jointly solve the problems of network slicing and slice Admission Control (AC). Multi-agent DRL combined with reward shaping is a promising choice that is well suited to problems where multiple distinct tasks have to be performed optimally. The proposed DRL approach can learn the dynamics of slice-request traffic and effectively address these joint problems. We compare multi-agent DRL to approaches that use: (i) simple heuristics to address the problems, and (ii) DRL to address either slicing or AC. Our results show that the proposed approach achieves up to 30% and 5.18% gain in long-term InP revenue when compared to approaches (i) and (ii), respectively. Additionally, we show that multi-agent DRL is preferable to a single-agent DRL approach for the joint problems in terms of convergence time and InP revenue. Finally, we evaluate the robustness of the trained agents in scenarios that differ from training, such as different arrival rates and real dynamic traffic patterns.
ISSN: 1932-4537
DOI: 10.1109/TNSM.2022.3222589