Multi-agent deep reinforcement learning for user association and resource allocation in integrated terrestrial and non-terrestrial networks
Published in: Computer networks (Amsterdam, Netherlands : 1999), Vol. 231, p. 109827
Main Authors: , ,
Format: Journal Article
Language: English
Published: Elsevier B.V., 01-07-2023
Summary: Integrating the terrestrial network with non-terrestrial networks to provide radio access, as anticipated in beyond-5G networks, calls for efficient user association and resource allocation strategies. In this work, a weighted-sum single-objective optimization problem is formulated for the integrated terrestrial and non-terrestrial network that maximizes the total network data rate while minimizing mobility-induced handoffs and prioritizing service provisioning for mission-critical users. The problem's complexity is reduced by solving it in two phases: user association followed by resource distribution. Several proposed approaches to the user association sub-problem rely on a central node that requires nearly complete information, which may not be available in real time. In contrast, this paper proposes a centralized-training, distributed-execution multi-agent dueling double deep Q network (MA3DQN) solution, in which each user collects channel state and access-node loading information and makes an association decision that accounts for its quality-of-service requirements. The algorithm's performance is validated through comparison with the genetic algorithm (GA), integer linear programming (ILP), a heuristic approximation-based solution, the greedy approach, and the random user association (RUA) algorithm; the multi-agent deep Q network solution is also simulated as an additional benchmark. Simulation results reveal that as the number of users in the network increases, the data rate achieved by MA3DQN is within 0.48% and 0.4% of that achieved by the GA and ILP, respectively, and outperforms all other algorithms. Notably, the proposed MA3DQN algorithm attains the best running time, a 99.9% gain over the GA, which is the slowest among the algorithms characterized by polynomial worst-case time complexity. In addition, the MA3DQN approach maintains a handoff probability of zero, unlike the ILP, approximation-based, greedy, and RUA solutions. (An illustrative sketch of the per-user agent follows the record below.)
ISSN: 1389-1286, 1872-7069
DOI: 10.1016/j.comnet.2023.109827
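
The summary above describes a per-user dueling double deep Q-network agent trained centrally but executed in a distributed manner, with each user observing only its own channel state and access-node loading. As a rough illustration (not the authors' exact formulation), the weighted-sum objective can be read as maximizing something like α·Σ_u R_u − β·Σ_u H_u + γ·Σ_{u∈M} R_u, where R_u is user u's data rate, H_u counts its mobility-induced handoffs, M is the set of mission-critical users, and α, β, γ are weights; all of this notation is assumed for illustration. The sketch below shows what such a per-user agent could look like in PyTorch; the state layout, reward, replay-buffer size, network widths, and hyperparameters are likewise illustrative assumptions rather than details taken from the paper.

```python
# Hedged sketch of a per-user dueling double DQN association agent (PyTorch).
# State contents, reward shaping, and all hyperparameters are assumptions for
# illustration, not the paper's exact MA3DQN configuration.
import random
from collections import deque

import torch
import torch.nn as nn
import torch.nn.functional as F


class DuelingQNet(nn.Module):
    """Dueling architecture: shared trunk, then value and advantage streams."""

    def __init__(self, state_dim, num_access_nodes, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)                      # V(s)
        self.advantage = nn.Linear(hidden, num_access_nodes)   # A(s, a)

    def forward(self, state):
        h = self.trunk(state)
        v = self.value(h)
        a = self.advantage(h)
        # Subtracting the mean advantage keeps the V/A decomposition identifiable.
        return v + a - a.mean(dim=-1, keepdim=True)


class UserAgent:
    """One agent per user: observes local channel state and access-node load,
    and picks an access node (the association action)."""

    def __init__(self, state_dim, num_access_nodes, gamma=0.99, lr=1e-3):
        self.online = DuelingQNet(state_dim, num_access_nodes)
        self.target = DuelingQNet(state_dim, num_access_nodes)
        self.target.load_state_dict(self.online.state_dict())
        self.optim = torch.optim.Adam(self.online.parameters(), lr=lr)
        self.gamma = gamma
        self.buffer = deque(maxlen=10_000)
        self.num_actions = num_access_nodes

    def act(self, state, epsilon=0.1):
        # Distributed execution: each user decides from its own observation only.
        if random.random() < epsilon:
            return random.randrange(self.num_actions)
        with torch.no_grad():
            q = self.online(torch.as_tensor(state, dtype=torch.float32))
        return int(q.argmax().item())

    def remember(self, s, a, r, s_next, done):
        self.buffer.append((s, a, r, s_next, done))

    def learn(self, batch_size=64):
        # Sample a minibatch of stored transitions and do one gradient step.
        if len(self.buffer) < batch_size:
            return
        batch = random.sample(self.buffer, batch_size)
        s, a, r, s_next, done = map(
            lambda x: torch.as_tensor(x, dtype=torch.float32), zip(*batch)
        )
        a = a.long()
        q_sa = self.online(s).gather(1, a.unsqueeze(1)).squeeze(1)
        with torch.no_grad():
            # Double DQN: the online net selects the next action,
            # the target net evaluates it, reducing Q-value overestimation.
            a_next = self.online(s_next).argmax(dim=1, keepdim=True)
            q_next = self.target(s_next).gather(1, a_next).squeeze(1)
            target = r + self.gamma * (1.0 - done) * q_next
        loss = F.smooth_l1_loss(q_sa, target)
        self.optim.zero_grad()
        loss.backward()
        self.optim.step()

    def sync_target(self):
        # Periodically copy online weights into the target network.
        self.target.load_state_dict(self.online.state_dict())
```

In this kind of setup the dueling head separates the state value from the per-access-node advantages, so an agent can still rank association choices when the overall state value dominates, while the double-DQN target keeps the bootstrapped values from being systematically overestimated. Centralized training would amount to collecting each user's transitions (and, if desired, shaping the reward with network-wide terms such as total rate or handoff penalties) before running the local `learn` updates, with execution remaining fully distributed through `act`.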