An Ensemble Optimization Algorithm Based on MADDPG
                  Conference: ISCTT 2022 - 7th International Conference on Information Science, Computer Technology and Transportation
                  05/27/2022 - 05/29/2022 at Xishuangbanna, China              
Proceedings: ISCTT 2022
Pages: 4
Language: English
Type: PDF
            Authors:
                          Zhou, Ruiyu (School of Management, Guangdong University of Technology, Guangzhou, Guangdong Province, China)
                      
              Abstract:
              Multi-Agent Deep Deterministic Policy Gradient (MADDPG) is a multi-agent deep reinforcement learning algorithm that was first published more than four years ago. It has gradually become a classic multi-agent algorithm and one of the basic algorithms that many beginners learn. However, MADDPG sometimes needs many episodes before the reward rises significantly, and the reward reached within a given number of episodes is sometimes not high enough. This paper therefore proposes an algorithm that uses an ensemble method to optimize MADDPG. The basic idea is to train multiple models simultaneously, compare the rewards obtained by all models, and use the model with the highest reward to train all models at that stage. Tests in different environments provided by the Multi-Agent Particle Environment (MPE) show that this ensemble optimization algorithm based on MADDPG is effective.
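One possible reading of the train, compare, and select loop described in the abstract can be sketched as follows. This is a minimal sketch and not the paper's implementation. It assumes that using the best model "to train all models" means every ensemble member continues from a copy of the best member's parameters, which the abstract does not confirm. The names `ensemble_round`, `toy_train_step`, and `toy_reward` are hypothetical, and plain parameter lists stand in for the MADDPG actor and critic networks and for MPE rollouts.

```python
import copy
import random
from typing import Callable, List, Tuple

Params = List[float]  # toy stand-in for a full set of MADDPG network weights


def ensemble_round(
    models: List[Params],
    train_step: Callable[[Params], Params],
    evaluate: Callable[[Params], float],
) -> Tuple[List[Params], List[float], int]:
    """Train every member, score each one, then share the best member."""
    trained = [train_step(m) for m in models]
    rewards = [evaluate(m) for m in trained]
    best = max(range(len(trained)), key=lambda i: rewards[i])
    # Assumption: sharing means each member restarts from a copy of the best.
    shared = [copy.deepcopy(trained[best]) for _ in trained]
    return shared, rewards, best


def toy_train_step(params: Params, rng: random.Random) -> Params:
    """Hypothetical update: a random perturbation instead of a gradient step."""
    return [p + rng.gauss(0.0, 0.1) for p in params]


def toy_reward(params: Params) -> float:
    """Hypothetical reward: higher when the parameters are closer to 1.0."""
    return -sum((p - 1.0) ** 2 for p in params)


if __name__ == "__main__":
    rng = random.Random(0)
    models = [[0.0, 0.0] for _ in range(4)]
    for stage in range(5):
        models, rewards, best = ensemble_round(
            models, lambda p: toy_train_step(p, rng), toy_reward
        )
        print(stage, best, round(max(rewards), 4))
```

In a real setting, `train_step` would run a block of MADDPG updates and `evaluate` would average episode returns in an MPE scenario. Other sharing rules, such as sharing only the replay data of the best member, are equally compatible with the abstract's wording.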


