Indexed by EI and Scopus
Chinese Core Journal
Zeng Yuling, Hao Yuqing, Yu Ying, Wang Qingyun. Formation control for multi-unmanned vehicles via deep reinforcement learning. Chinese Journal of Theoretical and Applied Mechanics, 2024, 56(2): 460-471. DOI: 10.6052/0459-1879-23-255

FORMATION CONTROL FOR MULTI-UNMANNED VEHICLES VIA DEEP REINFORCEMENT LEARNING

  • Received Date: June 19, 2023
  • Accepted Date: December 26, 2023
  • Available Online: December 27, 2023
  • Published Date: December 27, 2023
  • This work addresses multi-agent formation control by applying the DDQN (double deep Q-network) deep reinforcement learning algorithm to a system of multiple unmanned vehicles. Consensus control is combined with an accompanying configuration to model and simplify the formation control problem. The state space is built from relative distance and relative velocity, so the control inputs do not depend on global information; the action space consists of nine major motion directions; and the reward functions are based on relative distance and relative velocity. The work covers the design of the neural network architecture, network training, and the development of a motion simulation environment. The trained controller can be applied directly to formation tasks for underactuated unmanned vehicles with nonholonomic constraints. It is a model-free control approach that requires only motion data rather than a precise model. The controller is verified through extensive motion simulations covering multiple formations, positions, and trajectories, as well as formation transformation, switching communication, and communication failures, and it performs effectively in all of these scenarios. Finally, the strategy for the initial stage of formation is optimized by defining waiting and starting conditions, which reduces control energy consumption; the optimization is validated through motion simulations and comparison.
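The ingredients named in the abstract (a nine-direction action space, a reward built from relative distance and relative velocity, and a DDQN update) can be illustrated with a minimal sketch. This is not the authors' implementation. The action encoding (one "hold" action plus eight compass directions), the reward weights `w_d` and `w_v`, and the function names `reward` and `ddqn_target` are all assumptions made for illustration; the abstract gives none of these details. The only part that is standard is the Double DQN target, in which the online network selects the next action and the target network evaluates it.

```python
import math

# ASSUMPTION: "nine major motion directions" is encoded here as one hold
# action plus eight unit vectors at 45-degree spacing. The source does not
# specify the encoding.
ACTIONS = [(0.0, 0.0)] + [
    (math.cos(k * math.pi / 4), math.sin(k * math.pi / 4)) for k in range(8)
]


def reward(rel_pos, rel_vel, w_d=1.0, w_v=0.5):
    """Hypothetical reward: negative weighted sum of the relative-distance
    and relative-velocity magnitudes (weights are illustrative only)."""
    return -(w_d * math.hypot(*rel_pos) + w_v * math.hypot(*rel_vel))


def ddqn_target(r, q_online_next, q_target_next, gamma=0.99, done=False):
    """Double DQN target value for one transition.

    q_online_next: Q-values of the next state from the online network.
    q_target_next: Q-values of the next state from the target network.
    """
    if done:
        return r
    # The online network chooses the greedy next action ...
    a_star = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # ... and the target network supplies the value of that action.
    return r + gamma * q_target_next[a_star]


if __name__ == "__main__":
    # State uses only relative quantities, so no global position is needed.
    r = reward(rel_pos=(3.0, 4.0), rel_vel=(0.0, 0.0))
    y = ddqn_target(r, [0.1, 0.9, 0.3], [5.0, 2.0, 7.0], gamma=0.5)
    print(len(ACTIONS), r, y)
```

Decoupling action selection from action evaluation is what distinguishes this target from the plain DQN target, which would take the maximum of `q_target_next` directly and tends to overestimate action values.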