MLENet: Multi-Level Extraction Network for video action recognition
Authors: Wang, Fan; Li, Xinke; Xiong, Han; Mo, Haofan; Li, Yongming
Addresses: [Li, Xinke] Chongqing Med Univ, Coll Med Informat, Chongqing 400016, Peoples R China; [Wang, Fan; Li, Xinke; Xiong, Han; Mo, Haofan; Li, Yongming] Chongqing Univ, Sch Microelect & Commun Engn, Chongqing 400044, Peoples R China
Corresponding Author: Li, XK (corresponding author), Chongqing Med Univ, Coll Med Informat, Chongqing 400016, Peoples R China.
Source: PATTERN RECOGNITION
ESI Subject Category: ENGINEERING
WOS Number: WOS:001248252800002
JCR Quartile: Q1
Impact Factor: 7.5
Year: 2024
Volume: 154
Issue:
Start Page:
End Page:
Document Type: Article
Keywords: Action recognition; Spatio-temporal; Temporal feature refinement extraction module; Motion information; Optical flow guided feature
Abstract: Human action recognition is a well-established task in the field of computer vision. However, accurately representing spatio-temporal information remains a challenge due to the complex interplay between human actions, video timing, and scene changes. To address this challenge and improve the efficiency of temporal modeling in videos, we propose MLENet, a novel approach that eliminates reliance on contextual data and the need for laborious optical flow extraction. MLENet incorporates a Temporal Feature Refinement Extraction Module (TFREM) that utilizes Optical Flow Guided Features to enhance attention to local deep detail information. This refinement significantly strengthens the network's capacity for feature learning and expression. Moreover, MLENet is trained end-to-end, facilitating seamless integration into existing frameworks. Additionally, the model adopts a temporal segmentation structure for sampling, effectively reducing redundant information and improving computational efficiency. Compared with existing video-based action recognition models that require optical flow or other modalities, MLENet achieves substantial performance gains while requiring fewer inputs. We validate the effectiveness of the proposed approach on benchmark datasets, including Something-Something V1 & V2, UCF-101, and HMDB-51, where MLENet consistently outperforms state-of-the-art models.
Funding Agencies:
Funding Text:
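Note: The abstract above states that TFREM builds on Optical Flow Guided Features (OFF), which approximate motion cues directly from feature maps instead of precomputed optical flow. The sketch below is a minimal illustration of the standard OFF formulation (spatial gradients plus a temporal feature difference); it is an assumption-based example, not the authors' released code, and the function name `off` and the Sobel-based gradient choice are hypothetical.

```python
# Minimal sketch of Optical Flow Guided Features (OFF) on two consecutive
# feature maps, assuming the standard OFF formulation. Not MLENet's code.
import torch
import torch.nn.functional as F

def off(feat_t: torch.Tensor, feat_t1: torch.Tensor) -> torch.Tensor:
    """Concatenate spatial gradients of feat_t with the temporal difference
    to feat_t1, mirroring the terms of the optical-flow constancy equation
    at the feature level. Inputs are shaped (N, C, H, W)."""
    n, c, h, w = feat_t.shape
    # Depthwise Sobel kernels approximate dF/dx and dF/dy per channel.
    sobel_x = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                           device=feat_t.device).view(1, 1, 3, 3).repeat(c, 1, 1, 1)
    sobel_y = sobel_x.transpose(2, 3)
    grad_x = F.conv2d(feat_t, sobel_x, padding=1, groups=c)
    grad_y = F.conv2d(feat_t, sobel_y, padding=1, groups=c)
    # The frame-to-frame feature difference plays the role of dF/dt.
    grad_t = feat_t1 - feat_t
    return torch.cat([grad_x, grad_y, grad_t], dim=1)  # (N, 3C, H, W)

# Usage on dummy features from two adjacent sampled frames:
f_t, f_t1 = torch.randn(2, 64, 28, 28), torch.randn(2, 64, 28, 28)
print(off(f_t, f_t1).shape)  # torch.Size([2, 192, 28, 28])
```

Because these gradients are computed from features already produced by the backbone, motion information comes at the cost of a few convolutions rather than a separate optical-flow pipeline, which is consistent with the abstract's claim of end-to-end training without laborious flow extraction. The "temporal segmentation structure for sampling" mentioned in the abstract refers to sparse, segment-wise frame sampling (one frame drawn per equal-length video segment), which keeps the input small and reduces redundancy across nearby frames.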