[1] Ji Zhong, Jiang Junjie. Video Summarization Based on Decoder Attention Mechanism[J]. Journal of Tianjin University, 2018, (10):1023-1030. [doi:10.11784/tdxbz201801077]

Video Summarization Based on Decoder Attention Mechanism

References:

[1] Wang Juan, Jiang Xinghao, Sun Tanfeng. Review of video abstraction[J]. Journal of Image and Graphics, 2014, 19(12):1685-1695 (in Chinese).
[2] de Avila S E F, Lopes A P B. VSUMM:A mechanism designed to produce static video summaries and a novel evaluation method[J]. Pattern Recognition Letters, 2011, 32(1):56-68.
[3] Furini M, Geraci F, Montangero M, et al. STIMO:Still and moving video storyboard for the web scenario[J]. Multimedia Tools and Applications, 2010, 46(1):47-69.
[4] Kuanar S K, Panda R, Chowdhury A S. Video key frame extraction through dynamic delaunay clustering with a structural constraint[J]. Journal of Visual Communication and Image Representation, 2013, 24(7):1212-1227.
[5] Wu J, Zhong S H, Jiang J, et al. A novel clustering method for static video summarization[J]. Multimedia Tools and Applications, 2017, 76(7):1-17.
[6] Ji Z, Zhang Y Y, Pang Y W, et al. Hypergraph dominant set based multi-video summarization[J]. Signal Processing, 2018, 148:114-123.
[7] Demir M, Bozma H I. Video summarization via segments summary graphs[C]//IEEE International Conference on Computer Vision. Santiago, Chile, 2016:1071-1077.
[8] Ji Zhong, Fan Shuaifei. Video summarization with hypergraph ranking[J]. Acta Electronica Sinica, 2017, 45(5):1035-1043 (in Chinese).
[9] Panda R, Kuanar S K, Chowdhury A S. Scalable video summarization using skeleton graph and random walk[C]//International Conference on Pattern Recognition. Stockholm, Sweden, 2014:3481-3486.
[10] Mei S, Guan G, Wang Z, et al. Video summarization via minimum sparse reconstruction[J]. Pattern Recognition, 2015, 48(2):522-533.
[11] Panda R, Das A, Roy-Chowdhury A K. Video summarization in a multi-view camera network[C]//International Conference on Pattern Recognition. Cancun, Mexico, 2016:2971-2976.
[12] Ji Z, Ma Y R, Pang Y W, et al. Query-aware sparse coding for multi-video summarization[EB/OL]. https://arxiv.org/abs/1707.04021, 2017.
[13] Gong B, Chao W L, Grauman K, et al. Diverse sequential subset selection for supervised video summarization[C]//Advances in Neural Information Processing Systems. Montreal, Canada, 2014:2069-2077.
[14] Zhang K, Chao W, Sha F, et al. Summary transfer:Exemplar-based subset selection for video summarization[C]//IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA, 2016:1059-1067.
[15] Gygli M, Grabner H, van Gool L. Video summarization by learning submodular mixtures of objectives[C]//IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA, 2015:3090-3098.
[16] Li X, Zhao B, Lu X. A general framework for edited video and raw video summarization[J]. IEEE Transactions on Image Processing, 2017, 26(8):3652-3664.
[17] Zhang K, Chao W L, Sha F, et al. Video summarization with long short-term memory[C]//European Conference on Computer Vision. Amsterdam, Netherlands, 2016:766-782.
[18] Potapov D, Douze M, Harchaoui Z, et al. Category-specific video summarization[C]//European Conference on Computer Vision. Zurich, Switzerland, 2014:540-555.
[19] Lee Y J, Ghosh J, Grauman K. Discovering important people and objects for egocentric video summarization[C]//IEEE Conference on Computer Vision and Pattern Recognition. Providence, USA, 2012:1346-1353.
[20] Sutskever I, Vinyals O, Le Q V. Sequence to sequence learning with neural networks[C]//Advances in Neural Information Processing Systems. Montreal, Canada, 2014:3104-3112.
[21] Ma Y F, Lu L, Zhang H J, et al. A user attention model for video summarization[C]//ACM Conference on Multimedia. Juan les Pins, France, 2002:533-542.
[22] Ejaz N, Mehmood I, Baik S W. Efficient visual attention based framework for extracting key frames from videos[J]. Signal Processing:Image Communication, 2013, 28(1):34-44.
[23] Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate[C]//International Conference on Learning Representations. San Diego, USA, 2015:1-15.
[24] Meng F, Lu Z, Wang M, et al. Encoding source language with convolutional neural network for machine translation[C]//Annual Meeting of the Association for Computational Linguistics. Beijing, China, 2015:20-30.
[25] Chopra S, Auli M, Rush A M. Abstractive sentence summarization with attentive recurrent neural networks[C]//Annual Meeting of the Association for Computational Linguistics. Berlin, Germany, 2016:93-98.
[26] Xu K, Ba J, Kiros R, et al. Show, attend and tell:Neural image caption generation with visual attention[C]//International Conference on Machine Learning. Lille, France, 2015:2048-2057.
[27] Yao L, Torabi A, Cho K, et al. Describing videos by exploiting temporal structure[C]//IEEE International Conference on Computer Vision. Santiago, Chile, 2015:4507-4515.
[28] Venugopalan S, Xu H, Donahue J, et al. Translating videos to natural language using deep recurrent neural networks[C]//Annual Meeting of the Association for Computational Linguistics. Baltimore, USA, 2014:1494-1504.
[29] Li Y, Merialdo B. Multi-video summarization based on Video-MMR[C]//International Workshop on Image Analysis for Multimedia Interactive Services. Desenzano del Garda, Italy, 2010:1-4.
[30] Mahasseni B, Lam M, Todorovic S. Unsupervised video summarization with adversarial LSTM networks[C]//IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA, 2017:1-10.
[31] Gygli M, Grabner H, Riemenschneider H, et al. Creating summaries from user videos[C]//European Conference on Computer Vision. Zurich, Switzerland, 2014:505-520.
[32] Yang H, Wang B, Lin S, et al. Unsupervised extraction of video highlights via robust recurrent auto-encoders[C]//IEEE International Conference on Computer Vision. Santiago, Chile, 2015:4633-4641.
[33] Song Y, Vallmitjana J, Stent A, et al. TVSum:Summarizing web videos using titles[C]//IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA, 2015:5179-5187.
[34] Zhao B, Xing E P. Quasi real-time summarization for consumer videos[C]//IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA, 2014:2513-2520.
[35] Shao L, Zhu F, Li X. Transfer learning for visual categorization:A survey[J]. IEEE Transactions on Neural Networks & Learning Systems, 2015, 26(5):1019-1034.

Memo

Received: 2018-01-22; Revised: 2018-03-13.
About the author: Ji Zhong (born 1979), male, Ph.D., associate professor.
Corresponding author: Ji Zhong, jizhong@tju.edu.cn.
Funding: Supported by the National Natural Science Foundation of China (No. 61472273 and No. 61771329).

Last Update: 2018-10-10