China Mechanical Engineering ›› 2012, Vol. 23 ›› Issue (7): 851-855.


Markov Game Based 3D Path Planning for Palletizing Robot

Liu Jiufu; Chen Kui; Su Qingqin; Liang Juanjuan; Wang Zhisheng

  1. Nanjing University of Aeronautics and Astronautics, Nanjing, 210016
  • Online: 2012-04-10 Published: 2012-04-13
  • Supported by:
    National Natural Science Foundation of China (No. 60674100)

基于Markov对策的码垛机器人三维路径规划

刘久富;陈魁;苏青琴;梁娟娟;王志胜   

  1. 南京航空航天大学,南京,210016
  • Supported by:
    National Natural Science Foundation of China (No. 60674100); Fundamental Research Funds of Nanjing University of Aeronautics and Astronautics, Special Research Project (No. NS2010069)

Abstract:

Palletizing robots work in complex environments with many uncertain conditions. To address this, a path-planning method for a multi-joint palletizing robot based on Markov games is presented. First, the robot's range of motion is set according to the actual working environment, frequently occurring movement combinations are selected as the robot's basic behavior set, and the reward obtainable in each situation is specified. The reward of each joint is then updated with a multi-agent Q-learning algorithm, and the movement combination corresponding to the maximum reward is solved for inversely. Restricting the choice to a subset of movement combinations reduces the coordination required among the joints and lowers the complexity of the algorithm. Simulations plot the motion trajectory under the best movement combination, as well as the 3D trajectories in an obstacle-free environment and in an environment containing a spherical obstacle, and the trajectory errors are determined. Finally, experiments verify that the algorithm coordinates the movements of the joints effectively and keeps the motion errors within the allowed range, which meets the application requirements.

Key words: palletizing robot, multi-joint robot, multi-agent system, Markov game, Nash equilibrium
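
The update loop summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 5x5x5 grid workspace, the three axis agents standing in for joints, the eight-element behavior set, the reward values, and all function names are assumptions made for illustration. It also assumes that every agent receives the same team reward, in which case the joint action that maximizes the summed Q-values is a Nash equilibrium of the stage game, so the equilibrium step reduces to a maximization.

```python
import random

import numpy as np

N = 5                                        # cells per axis (hypothetical workspace)
START, GOAL = (0, 0, 0), (4, 4, 4)
OBST_CENTER, OBST_RADIUS = (2, 2, 2), 1.0    # spherical obstacle (assumed size)
N_AGENTS = 3                                 # one agent per controlled axis

# Restricted behavior set: 8 of the 27 possible joint actions (assumed choice).
# Component i of each tuple is the move chosen by agent i.
ACTIONS = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (-1, 0, 0),
           (0, -1, 0), (0, 0, -1), (1, 1, 0), (1, 1, 1)]


def in_obstacle(cell):
    """True if the cell centre lies inside the spherical obstacle."""
    dist_sq = sum((c - o) ** 2 for c, o in zip(cell, OBST_CENTER))
    return dist_sq <= OBST_RADIUS ** 2


def step(state, action):
    """Apply a joint action; return (next_state, team_reward, done)."""
    nxt = tuple(s + a for s, a in zip(state, action))
    if any(c < 0 or c >= N for c in nxt):
        return state, -1.0, False            # workspace limit: stay in place
    if in_obstacle(nxt):
        return state, -10.0, False           # collision: stay in place, penalty
    if nxt == GOAL:
        return nxt, 100.0, True
    return nxt, -1.0, False                  # ordinary move costs one step


def joint_greedy(Q, state):
    """Index of the joint action maximizing the summed per-agent Q-values."""
    return int(np.argmax(Q[(slice(None),) + state].sum(axis=0)))


def train(episodes=2000, alpha=0.5, gamma=0.9, eps=0.2, max_steps=100, seed=0):
    """Each agent keeps its own Q-table over joint actions and updates it."""
    rng = random.Random(seed)
    Q = np.zeros((N_AGENTS, N, N, N, len(ACTIONS)))
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))      # explore
            else:
                a = joint_greedy(Q, state)           # exploit
            nxt, reward, done = step(state, ACTIONS[a])
            a_next = joint_greedy(Q, nxt)            # equilibrium joint action
            for i in range(N_AGENTS):
                r_i = reward                         # shared reward (assumption)
                v_next = 0.0 if done else Q[(i,) + nxt + (a_next,)]
                idx = (i,) + state + (a,)
                Q[idx] += alpha * (r_i + gamma * v_next - Q[idx])
            state = nxt
            if done:
                break
    return Q


def greedy_path(Q, max_steps=50):
    """Read out the movement combinations with the best learned reward."""
    state, path = START, [START]
    for _ in range(max_steps):
        state, _, done = step(state, ACTIONS[joint_greedy(Q, state)])
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

Restricting `ACTIONS` to eight combinations instead of all 27 mirrors the abstract's point that a reduced behavior set lowers the coordination burden and the size of each Q-table. With distinct per-joint rewards, the maximization in `joint_greedy` would have to be replaced by a genuine equilibrium computation.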

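Abstract:

Because palletizing robots operate in fairly complex environments with many uncertain conditions, an algorithm based on Markov games is used to plan paths for a multi-joint palletizing robot. The robot's range of motion is first set according to the actual working environment, and frequently occurring action combinations are chosen as the basic behavior set of the robot's motion. The reward that may be obtained in each situation is given, the reward of each joint is updated according to the multi-agent Q-learning algorithm, and the action combination corresponding to the maximum reward is solved for inversely. Choosing only part of the action combinations reduces the coordination among the joints and lowers the complexity of the algorithm. Simulation plots the trajectory under the best action combination, together with the 3D trajectories when the robot's environment is obstacle-free and when a spherical obstacle is placed in it, and the trajectory error is determined. Finally, experimental verification shows that the multi-agent Q-learning algorithm can effectively control the coordinated motion of the joints and that the error of the actual motion is within the allowed range, which meets the requirements of use.

Key words: palletizing robot, multi-joint robot, multi-agent system, Markov game, Nash equilibrium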
