Markov Game Based 3D Path Planning for Palletizing Robot

China Mechanical Engineering ›› 2012, Vol. 23 ›› Issue (7): 851-855.

Liu Jiufu; Chen Kui; Su Qingqin; Liang Juanjuan; Wang Zhisheng

Nanjing University of Aeronautics and Astronautics, Nanjing, 210016

Online: 2012-04-10 Published: 2012-04-13
Supported by:
National Natural Science Foundation of China (No. 60674100); Special Research Project of the Fundamental Research Funds of Nanjing University of Aeronautics and Astronautics (No. NS2010069)

Abstract

To cope with the complex application environments and numerous uncertain conditions faced by a palletizing robot, a path-planning method for a multi-joint palletizing robot based on Markov games is presented. First, the robot's range of motion is set according to the actual working environment, frequently occurring movement combinations are selected as the robot's basic behavior set, and the reward obtainable in each situation is specified. The reward of each joint is then updated with a multi-agent Q-learning algorithm, and the movement combination corresponding to the maximum reward is solved for inversely. Restricting the behavior set to a subset of movement combinations reduces the coordination required among the joints and lowers the complexity of the algorithm. Simulations plot the motion trajectory under the best movement combination, as well as the 3D trajectories in an obstacle-free environment and in an environment containing a spherical obstacle, and the trajectory errors are determined. Finally, experiments verify that the multi-agent Q-learning algorithm effectively controls the coordinated motion of the joints and that the errors of the actual motion remain within the allowed range, meeting practical requirements.

Key words: palletizing robot, multi-joint robot, multi-agent system, Markov game, Nash equilibrium
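
The learning loop outlined in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: it uses a single shared Q-table over a reduced set of movement combinations (a cooperative simplification that skips any Nash-equilibrium computation), and the grid size, reward values, action set, and obstacle geometry are all assumptions chosen for illustration. Each update follows the standard tabular rule Q(s, a) <- Q(s, a) + alpha * (r + gamma * max Q(s', a') - Q(s, a)).

```python
import random

# Hypothetical reduced behavior set: each entry is one combined move of three
# joints, expressed here as a step (dx, dy, dz) of the end effector on a grid.
ACTIONS = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (0, 1, 1), (1, 0, 1)]
SIZE = 5                           # workspace is a SIZE^3 grid (assumed)
START, GOAL = (0, 0, 0), (4, 4, 4)
CENTER, RADIUS = (2, 2, 2), 1.0    # spherical obstacle (assumed geometry)


def blocked(cell):
    """True if the cell lies inside the spherical obstacle."""
    return sum((c - o) ** 2 for c, o in zip(cell, CENTER)) <= RADIUS ** 2


def step(state, action):
    """Apply a movement combination; return (next_state, reward, done)."""
    nxt = tuple(s + a for s, a in zip(state, action))
    if any(c < 0 or c >= SIZE for c in nxt) or blocked(nxt):
        return state, -10.0, False   # leaving the range or colliding is penalized
    if nxt == GOAL:
        return nxt, 100.0, True
    return nxt, -1.0, False          # small cost per move favors short paths


def train(episodes=3000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    n = len(ACTIONS)
    q = {}  # maps (state, action index) -> estimated value, default 0.0
    for _ in range(episodes):
        state = START
        for _ in range(50):
            if rng.random() < epsilon:
                a = rng.randrange(n)
            else:
                a = max(range(n), key=lambda i: q.get((state, i), 0.0))
            nxt, reward, done = step(state, ACTIONS[a])
            best_next = 0.0 if done else max(q.get((nxt, i), 0.0) for i in range(n))
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Read back the movement combinations with the highest learned value."""
    state, path = START, [START]
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path
```

Calling `greedy_path(train())` returns the sequence of grid cells visited when the highest-valued movement combination is chosen at every step. Limiting `ACTIONS` to a handful of combinations mirrors the abstract's point that a reduced behavior set keeps the learning problem small.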

CLC Number: TP18

LIU Jiufu, CHEN Kui, SU Qingqin, LIANG Juanjuan, WANG Zhisheng. Markov Game Based 3D Path Planning for Palletizing Robot[J]. China Mechanical Engineering, 2012, 23(7): 851-855.
