Task-Allocation-Driven Multi-Agent Reinforcement Learning for Cooperative Evasion Guidance of High-Speed Aerial Vehicles
DOI:
https://doi.org/10.65904/3083-3450.2026.02.02

Keywords:
Intelligent Aeronautical Systems, Multi-agent Cooperative Guidance, Autonomous Flight Decision-making, Adaptive Control, Real-time Trajectory Management

Abstract
To address the cooperative guidance and control problem of multi-agent systems in complex dynamic environments, this paper proposes an intelligent cooperative maneuvering guidance strategy built on role assignment. High-speed vehicles are divided into Supportive Agents and Primary Mission Agents: under this role-cooperative design, Supportive Agents maneuver actively to divert external disturbances away from critical paths, while Primary Mission Agents, subject to a safety guarantee, make autonomous decisions that ensure accurate achievement of the terminal mission objectives. Building on the multi-agent Soft Actor-Critic (SAC) framework, the paper presents an improved CD-MASAC (Curriculum-Driven Multi-Agent Soft Actor-Critic for Robust Cooperative Guidance Under Target Constraints) algorithm, in which a curriculum learning strategy and a dynamic learning-rate adjustment mechanism significantly improve training efficiency and convergence stability under complex constraints. In addition, a control loop that outputs the desired axial velocity is designed: by adjusting the flight speed in real time, it fully exploits the vehicle's variable-speed capability, satisfying terminal trajectory constraints while reducing the energy consumed by maneuvering and improving flight sustainability. Simulation results show that the proposed strategy achieves strong robustness and high control accuracy under significant environmental uncertainty, providing a general guidance and control scheme for future highly autonomous aerial systems.
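The abstract's two training-side ideas (curriculum-driven difficulty progression with a dynamic learning rate, and a fixed role split into Primary Mission and Supportive Agents) can be illustrated with a minimal sketch. This is not the paper's implementation: the stage names, thresholds, the geometric learning-rate decay, and the fixed role split are all illustrative assumptions standing in for the task-driven allocation described in the paper.

```python
from dataclasses import dataclass

@dataclass
class CurriculumStage:
    name: str                # stage label, e.g. "easy" (illustrative)
    disturbance_scale: float # environment difficulty applied at this stage
    success_threshold: float # rolling success rate required to advance

class CurriculumScheduler:
    """Curriculum learning sketch: advance to a harder stage once the
    rolling success rate clears the current stage's threshold, and shrink
    the learning rate at each advance (hypothetical decay schedule)."""

    def __init__(self, stages, base_lr=3e-4, decay=0.5):
        self.stages = stages
        self.idx = 0
        self.base_lr = base_lr
        self.decay = decay

    @property
    def stage(self):
        return self.stages[self.idx]

    def lr(self):
        # Dynamic learning-rate adjustment: smaller steps at harder stages.
        return self.base_lr * (self.decay ** self.idx)

    def update(self, rolling_success_rate):
        # Advance one stage when the agent team is reliably succeeding.
        if (rolling_success_rate >= self.stage.success_threshold
                and self.idx < len(self.stages) - 1):
            self.idx += 1
            return True
        return False

def assign_roles(agent_ids, n_primary):
    """Split the team into Primary Mission Agents and Supportive Agents.
    A fixed prefix split is used here purely for illustration; the paper's
    allocation is task-driven."""
    primary = set(agent_ids[:n_primary])
    return {a: ("primary" if a in primary else "supportive")
            for a in agent_ids}

# Example usage: three stages of increasing disturbance, four vehicles.
stages = [CurriculumStage("easy", 0.2, 0.80),
          CurriculumStage("medium", 0.6, 0.85),
          CurriculumStage("hard", 1.0, 1.00)]
scheduler = CurriculumScheduler(stages)
scheduler.update(0.90)                 # success rate 0.90 >= 0.80: advance
roles = assign_roles(["v1", "v2", "v3", "v4"], n_primary=1)
```

In an actual MASAC training loop, `scheduler.lr()` would be fed to the actor/critic optimizers each episode and `stage.disturbance_scale` would parameterize the simulated environment; the role labels would select each agent's reward shaping.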
License
Copyright (c) 2026 Dong Zhao, Chida Liu, Can Liu, Jianguo Liu, Jingfan Guo, Tian Yan (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
