Multi-agent deep reinforcement learning for user association and resource allocation in integrated terrestrial and non-terrestrial networks
Published in: Computer networks (Amsterdam, Netherlands : 1999), Vol. 231, p. 109827
Main Authors: , ,
Format: Journal Article
Language: English
Published: Elsevier B.V., 01-07-2023
Summary: Integrating the terrestrial network with non-terrestrial networks to provide radio access, as anticipated in beyond-5G networks, calls for efficient user association and resource allocation strategies. In this work, a weighted-sum single-objective optimization problem is formulated for the integrated terrestrial and non-terrestrial network that maximizes the total network data rate while minimizing mobility-induced handoffs and prioritizing service provisioning for mission-critical users. The problem's complexity is reduced by solving it in two phases: user association followed by resource distribution. Several proposed approaches to the user association sub-problem rely on a central node that requires nearly complete information, which may not be available in real time. In contrast, this paper proposes a centralized-training, distributed-execution multi-agent dueling double deep Q network (MA3DQN) solution, in which each user collects channel state and access-node loading information and makes an association decision that accounts for its quality-of-service requirements. The algorithm's performance is validated through comparison with the genetic algorithm (GA), integer linear programming (ILP), a heuristic approximation-based solution, the greedy approach, and the random user association (RUA) algorithm; the multi-agent deep Q network solution is also simulated as an additional benchmark. Simulation results reveal that as the number of users in the network increases, the data rate achieved by MA3DQN is within 0.48% and 0.4% of that achieved by the GA and ILP, respectively, and outperforms all other algorithms. Notably, the proposed MA3DQN algorithm attains the best running time, a 99.9% gain over the GA, which is the slowest among the algorithms characterized by polynomial worst-case time complexity. In addition, the MA3DQN approach maintains a handoff probability of zero, unlike the ILP, approximation-based, greedy, and RUA solutions. (An illustrative sketch of the per-user agent follows the record below.)
ISSN: 1389-1286, 1872-7069
DOI: 10.1016/j.comnet.2023.109827
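
The summary above describes a per-user dueling double deep Q-network agent trained centrally but executed in a distributed manner, with each user observing only its own channel state and access-node loading. As a rough illustration (not the authors' exact formulation), the weighted-sum objective can be read as maximizing something like α·Σ_u R_u − β·Σ_u H_u + γ·Σ_{u∈M} R_u, where R_u is user u's data rate, H_u counts its mobility-induced handoffs, M is the set of mission-critical users, and α, β, γ are weights; all of this notation is assumed for illustration. The sketch below shows what such a per-user agent could look like in PyTorch; the state layout, reward, replay-buffer size, network widths, and hyperparameters are likewise illustrative assumptions rather than details taken from the paper.

```python
# Hedged sketch of a per-user dueling double DQN association agent (PyTorch).
# State contents, reward shaping, and all hyperparameters are assumptions for
# illustration, not the paper's exact MA3DQN configuration.
import random
from collections import deque

import torch
import torch.nn as nn
import torch.nn.functional as F


class DuelingQNet(nn.Module):
    """Dueling architecture: shared trunk, then value and advantage streams."""

    def __init__(self, state_dim, num_access_nodes, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)                      # V(s)
        self.advantage = nn.Linear(hidden, num_access_nodes)   # A(s, a)

    def forward(self, state):
        h = self.trunk(state)
        v = self.value(h)
        a = self.advantage(h)
        # Subtracting the mean advantage keeps the V/A decomposition identifiable.
        return v + a - a.mean(dim=-1, keepdim=True)


class UserAgent:
    """One agent per user: observes local channel state and access-node load,
    and picks an access node (the association action)."""

    def __init__(self, state_dim, num_access_nodes, gamma=0.99, lr=1e-3):
        self.online = DuelingQNet(state_dim, num_access_nodes)
        self.target = DuelingQNet(state_dim, num_access_nodes)
        self.target.load_state_dict(self.online.state_dict())
        self.optim = torch.optim.Adam(self.online.parameters(), lr=lr)
        self.gamma = gamma
        self.buffer = deque(maxlen=10_000)
        self.num_actions = num_access_nodes

    def act(self, state, epsilon=0.1):
        # Distributed execution: each user decides from its own observation only.
        if random.random() < epsilon:
            return random.randrange(self.num_actions)
        with torch.no_grad():
            q = self.online(torch.as_tensor(state, dtype=torch.float32))
        return int(q.argmax().item())

    def remember(self, s, a, r, s_next, done):
        self.buffer.append((s, a, r, s_next, done))

    def learn(self, batch_size=64):
        # Sample a minibatch of stored transitions and do one gradient step.
        if len(self.buffer) < batch_size:
            return
        batch = random.sample(self.buffer, batch_size)
        s, a, r, s_next, done = map(
            lambda x: torch.as_tensor(x, dtype=torch.float32), zip(*batch)
        )
        a = a.long()
        q_sa = self.online(s).gather(1, a.unsqueeze(1)).squeeze(1)
        with torch.no_grad():
            # Double DQN: the online net selects the next action,
            # the target net evaluates it, reducing Q-value overestimation.
            a_next = self.online(s_next).argmax(dim=1, keepdim=True)
            q_next = self.target(s_next).gather(1, a_next).squeeze(1)
            target = r + self.gamma * (1.0 - done) * q_next
        loss = F.smooth_l1_loss(q_sa, target)
        self.optim.zero_grad()
        loss.backward()
        self.optim.step()

    def sync_target(self):
        # Periodically copy online weights into the target network.
        self.target.load_state_dict(self.online.state_dict())
```

In this kind of setup the dueling head separates the state value from the per-access-node advantages, so an agent can still rank association choices when the overall state value dominates, while the double-DQN target keeps the bootstrapped values from being systematically overestimated. Centralized training would amount to collecting each user's transitions (and, if desired, shaping the reward with network-wide terms such as total rate or handoff penalties) before running the local `learn` updates, with execution remaining fully distributed through `act`.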