
[1] Hou Yonghong, Ye Xiufeng, Zhang Liang, et al. A UAV Human Robot Interaction Method Based on Deep Learning[J]. Journal of Tianjin University, 2017, (09): 967-974. [doi:10.11784/tdxbz201608033]

A UAV Human Robot Interaction Method Based on Deep Learning

Journal of Tianjin University (Natural Science Edition) [ISSN: 0493-2137 / CN: 12-1127/N]

Volume:
Issue:
2017(09)
Pages:
967-974
Section:
Electrical Automation and Information Engineering
Publication Date:
2017-09-22

Article Info

Title:
A UAV Human Robot Interaction Method Based on Deep Learning
Article Number:
0493-2137(2017)09-0967-08
Author(s):
Hou Yonghong (侯永宏)1, Ye Xiufeng (叶秀峰)1, Zhang Liang (张亮)2,3, Li Zhaoyang (李照洋)1, Dong Jiarong (董嘉蓉)1
1. School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China;
2. Tianjin Key Laboratory of Advanced Electrical Engineering and Energy Technology, Tianjin 300387, China;
3. School of Electrical Engineering and Automation, Tianjin Polytechnic University, Tianjin 300387, China
Keywords:
human robot interaction; stereo vision; deep learning
CLC Number:
TP391.4
DOI:
10.11784/tdxbz201608033
Document Code:
A
Abstract:
Traditionally, interacting with an unmanned aerial vehicle (UAV) required specialized equipment and well-trained operators. To make human-robot interaction with UAVs easier, a gesture-based human robot interaction (HRI) method based on stereo vision and deep learning was proposed in this paper. Firstly, the depth map was extracted using stereo vision; by applying a tracking algorithm and setting a depth threshold, the pilot was split from the background, yielding a depth image containing only the pilot. Secondly, a sequence of depth images was processed and overlaid to convert the video into a colored texture image that carries both temporal and spatial information. The colored texture images were then learned and classified by a deep convolutional neural network implemented with Caffe, and UAV control commands were generated according to the classification results. The proposed method works both indoors and outdoors, with an effective range of up to 10 m. It simplifies UAV control and is of great significance for the popularization of UAVs and the extension of their fields of application.
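
To make the pipeline described in the abstract more concrete, the sketch below shows one plausible way to compute a disparity (depth) map from a rectified stereo pair, isolate the pilot with a depth threshold inside a tracked bounding box, and stack a short depth-map sequence into a single pseudo-colored texture image for CNN classification. This is a minimal illustration using OpenCV and NumPy; the function names, the fixed disparity band, and the mapping of the early/middle/late sub-sequences to the R/G/B channels are assumptions made for illustration, not the authors' exact implementation.

```python
# Illustrative sketch only: parameters and the channel encoding are assumptions,
# not the exact method of the paper.
import cv2
import numpy as np

# Semi-global block matching, a standard stereo method (cf. ref. [18]).
stereo = cv2.StereoSGBM_create(minDisparity=0, numDisparities=96, blockSize=9)

def disparity_map(left_gray, right_gray):
    """Disparity map from a rectified stereo pair (larger disparity = closer)."""
    return stereo.compute(left_gray, right_gray).astype(np.float32) / 16.0

def segment_pilot(disp, bbox, band=8.0):
    """Keep only pixels whose disparity is close to the tracked person's depth."""
    x, y, w, h = bbox                       # bounding box from any person tracker
    roi = disp[y:y + h, x:x + w]
    center = np.median(roi[roi > 0])        # typical disparity of the pilot
    out = np.zeros_like(disp)
    out[y:y + h, x:x + w] = np.where(np.abs(roi - center) < band, roi, 0.0)
    return out

def texture_image(depth_seq, size=(224, 224)):
    """Collapse a depth-map sequence into one 3-channel 'texture' image:
    the early, middle and late thirds of the clip are reduced to motion-energy
    maps and written to the R, G and B channels, so a single still image
    carries both spatial and temporal information."""
    channels = []
    for part in np.array_split(np.stack(depth_seq), 3):   # assumes >= 2 frames per third
        energy = np.abs(np.diff(part, axis=0)).sum(axis=0)  # frame-to-frame motion
        energy = cv2.normalize(energy, None, 0, 255, cv2.NORM_MINMAX)
        channels.append(cv2.resize(energy.astype(np.uint8), size))
    return cv2.merge(channels)   # this image is what the CNN would classify
```

In the described system, the resulting texture image would be fed to the Caffe-trained convolutional network, and the predicted gesture class would then be mapped to a flight command, for example published to the flight controller through ROS (refs. [20-21]).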

References:

[1] Téllez-Guzmán J J, Gomez-Balderas J E, Marchand N, et al. Velocity control of mini-UAV using a helmet system [C]//Workshop on Research, Education and Development of Unmanned Aerial Systems. Cancun, Mexico, 2015.
[2] Vincenzi D A, Terwilliger B A, Ison D C. Unmanned aerial system(UAS) human-machine interfaces:New paradigms in command and control [J]. Procedia Manufacturing, 2015, 3(Suppl 1):920-927.
[3] Lupashin S, Hehn M, Mueller M W, et al. A platform for aerial robotics research and demonstration:The flying machine arena [J]. Mechatronics, 2014, 24(1):41-54.
[4] Mantecón T, del Blanco C R, Jaureguizar F, et al. New generation of human machine interfaces for controlling UAV through depth-based gesture recognition [C]//SPIE Defense + Security. International Society for Optics and Photonics. Baltimore, USA, 2014:90840C-1-90840C-11.
[5] Pfeil K, Koh S L, LaViola J. Exploring 3d gesture metaphors for interaction with unmanned aerial vehicles [C]//Proceedings of the 2013 International Conference on Intelligent User Interfaces. Santa Monica, USA, 2013:257-266.
[6] Naseer T, Sturm J, Cremers D. Followme:Person following and gesture recognition with a quadrocopter [C]//2013 IEEE/RSJ International Conference on Intelligent Robots and Systems. Tokyo, Japan, 2013:624-630.
[7] Monajjemi V M, Wawerla J, Vaughan R, et al. HRI in the sky:Creating and commanding teams of UAVs with a vision-mediated gestural interface [C]// IEEE International Conference on Intelligent Robots and Systems. Tokyo, Japan, 2013:617-623.
[8] Costante G, Bellocchio E, Valigi P, et al. Personalizing vision-based gestural interfaces for HRI with UAVS:A transfer learning approach[C]//2014 IEEE/RSJ International Conference on Intelligent Robots and Systems. Chicago, USA, 2014:3319-3326.
[9] Li W, Zhang Z, Liu Z. Action recognition based on a bag of 3d points [C]//2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops. San Francisco, USA, 2010:9-14.
[10] Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks [C]//Advances in Neural Information Processing Systems. Lake Tahoe, USA, 2012:1097-1105.
[11] Taylor G W, Fergus R, LeCun Y, et al. Convolutional learning of spatio-temporal features[M]. Berlin:Springer Heidelberg, 2010.
[12] Ji S, Xu W, Yang M, et al. 3D convolutional neural networks for human action recognition [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(1):221-231.
[13] Karpathy A, Toderici G, Shetty S, et al. Large-scale video classification with convolutional neural networks [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA, 2014:1725-1732.
[14] Simonyan K, Zisserman A. Two-stream convolutional networks for action recognition in videos [C]//Advances in Neural Information Processing Systems. Montreal, Canada, 2014:568-576.
[15] Shi Y, Zeng W, Huang T, et al. Learning deep trajectory descriptor for action recognition in videos using deep neural networks[C]// 2015 IEEE International Conference on Multimedia and Expo. Turin, Italy, 2015:1-6.
[16] Wang P, Li W, Gao Z, et al. Action recognition from depth maps using deep convolutional neural networks [J]. IEEE Transactions on Human-Machine Systems, 2015, 46(4):498-509.
[17] Zhang K, Zhang L, Liu Q, et al. Fast visual tracking via dense spatio-temporal context learning[M]. Cham:Springer International Publishing, 2014:127-141.
[18] Konolige K. Small Vision Systems:Hardware and Implementation [M]. London:Springer, 1998:203-212.
[19] Yang X, Zhang C, Tian Y L. Recognizing actions using depth motion maps-based histograms of oriented gradients [C]//Proceedings of the 20th ACM International Conference on Multimedia. Nara, Japan, 2012:1057-1060.
[20] Jia Y, Shelhamer E, Donahue J, et al. Caffe:Convolutional architecture for fast feature embedding [C]//Proceedings of the ACM International Conference on Multimedia. Berkeley, USA, 2014:675-678.
[21] Quigley M, Conley K, Gerkey B, et al. ROS:An open-source robot operating system [C]// ICRA Workshop on Open Source Software. Menlo Park, USA, 2009:1-6.

Similar References:

[1] He Mingxia, Ning Fuxing, Li Meng. Man-Machine Interactive Mode Based on Capturing Hand's Moving[J]. Journal of Tianjin University, 2011, (05): 430.

Memo

Received: 2016-08-20; Revised: 2016-09-29.
Biography: Hou Yonghong (born 1968), male, Ph.D., associate professor, houroy@tju.edu.cn.
Corresponding author: Zhang Liang, liangzhang@tjpu.edu.cn.
Supported by the National Natural Science Foundation of China (No. 61571325) and the Key Projects of the Science and Technology Support Program of Tianjin, China (No. 15ZCZDGX00190 and No. 16ZXHLGX00190).
Last Update: 2017-09-10