Deep Reinforcement Learning for Maximizing Downlink Spectral Efficiency in Non-Stationary RIS-Aided Multiuser-MISO Systems
Conference: European WIRELESS 2025 - 30th European Wireless Conference
10/27/2025 - 10/29/2025 at Sophia Antipolis, France
Proceedings: European Wireless 2025
Pages: 6
Language: English
Type: PDF
Authors:
Zhang, Haoze; Huang, Xiang; Guan, Zhangyu; Chen, Rong-Rong; Farhang, Arman; Ji, Mingyue
Abstract:
We investigate the problem of maximizing the overall Spectral Efficiency (SE) in a Reconfigurable Intelligent Surface (RIS)-aided Multi-User Multiple-Input Single-Output (MU-MISO) downlink system by jointly optimizing the beamforming at the Base Station (BS) and the phase shifts of the RIS. To address this highly non-convex optimization challenge, we propose a Deep Reinforcement Learning (DRL) framework based on the Deep Deterministic Policy Gradient (DDPG) algorithm. The DRL agent interacts with the communication environment through trial-and-error learning, receiving rewards that reflect the quality of its actions under continuously changing states. One advantage of the proposed scheme is its ability to handle the non-stationary conditions of the MU-MISO environment efficiently. This capability is achieved through a carefully designed, richly structured state representation that captures detailed information from both the current and previous time steps. Additionally, we introduce a dual-normalization network structure to promote stable learning and effective exploration during training. The DRL agent is trained with an off-policy actor-critic method that leverages an experience replay buffer and soft-updated target networks to maintain stable convergence in the continuous action space. Simulation results under a 3GPP propagation environment demonstrate that the proposed scheme achieves better SE performance than several state-of-the-art benchmarks.
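For readers unfamiliar with the training mechanics the abstract mentions, the following is a minimal, generic DDPG sketch showing the off-policy actor-critic loop with an experience replay buffer and soft-updated target networks. It is not the authors' implementation: the state/action dimensions, network sizes, exploration noise, and hyperparameters are illustrative assumptions, and the paper's dual-normalization structure and RIS-specific state design are omitted.

```python
# Minimal DDPG sketch: off-policy actor-critic with experience replay and
# soft-updated target networks (all sizes/hyperparameters are assumptions).
import copy
import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 64, 16   # assumed sizes (e.g., stacked channel
                                 # features -> BS beamformer + RIS phases)
GAMMA, TAU = 0.99, 0.005         # assumed discount factor and soft-update rate

actor = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(),
                      nn.Linear(128, ACTION_DIM), nn.Tanh())
critic = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 128), nn.ReLU(),
                       nn.Linear(128, 1))
actor_target, critic_target = copy.deepcopy(actor), copy.deepcopy(critic)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
replay = deque(maxlen=100_000)   # experience replay buffer of transitions

def soft_update(target, online, tau=TAU):
    # Polyak averaging: theta_target <- tau*theta_online + (1-tau)*theta_target
    for tp, p in zip(target.parameters(), online.parameters()):
        tp.data.mul_(1.0 - tau).add_(tau * p.data)

def train_step(batch_size=64):
    if len(replay) < batch_size:
        return
    s, a, r, s2 = (torch.stack(x)
                   for x in zip(*random.sample(replay, batch_size)))
    with torch.no_grad():
        # Bootstrapped target from the slowly moving target networks
        y = r + GAMMA * critic_target(torch.cat([s2, actor_target(s2)], dim=1))
    # Critic update: regress Q(s, a) toward the target y
    critic_loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=1)), y)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()
    # Actor update: ascend the critic's estimate of Q(s, actor(s))
    actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
    soft_update(actor_target, actor)
    soft_update(critic_target, critic)

# Illustrative interaction step with placeholder (random) environment data:
s = torch.randn(STATE_DIM)
a = (actor(s) + 0.1 * torch.randn(ACTION_DIM)).detach()  # exploration noise
replay.append((s, a, torch.tensor([0.5]), torch.randn(STATE_DIM)))  # (s,a,r,s')
train_step()
```

The small soft-update rate keeps the target networks slowly varying, which stabilizes the bootstrapped critic targets; this is the property the abstract credits for stable convergence in the continuous action space.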

