MLENet: Multi-Level Extraction Network for video action recognition

Full author names: Wang, Fan; Li, Xinke; Xiong, Han; Mo, Haofan; Li, Yongming

Author addresses: [Li, Xinke] Chongqing Med Univ, Coll Med Informat, Chongqing 400016, Peoples R China; [Wang, Fan; Li, Xinke; Xiong, Han; Mo, Haofan; Li, Yongming] Chongqing Univ, Sch Microelect & Commun Engn, Chongqing 400044, Peoples R China

Corresponding author: Li, XK, Chongqing Med Univ, Coll Med Informat, Chongqing 400016, Peoples R China.

Source: PATTERN RECOGNITION

ESI subject category: ENGINEERING

WOS accession number: WOS:001248252800002

JCR quartile: Q1

Impact factor: 7.5

Year: 2024

Volume: 154

Issue:

Start page:

End page:

Document type: Article

Keywords: Action recognition; Spatio-temporal; Temporal feature refinement extraction module; Motion information; Optical flow guided feature

Abstract: Human action recognition is a well-established task in the field of computer vision. However, accurately representing spatio-temporal information remains a challenge due to the complex interplay between human actions, video timing, and scene changes. To address this challenge and improve the efficiency of temporal modeling in videos, we propose MLENet, a novel approach that eliminates contextual data and the need for laborious optical flow extraction. MLENet incorporates a Temporal Feature Refinement Extraction Module (TFREM) that utilizes Optical Flow Guided Features to sharpen attention on fine-grained local details in deep features. This refinement significantly enhances the network's capacity for feature learning and expression. Moreover, MLENet is trained end-to-end, facilitating seamless integration into existing frameworks. Additionally, the model adopts a temporal segmentation structure for sampling, effectively reducing redundant information and improving computational efficiency. Compared to existing video-based action recognition models that require optical flow or other modalities, MLENet achieves substantial performance gains while requiring fewer inputs. We validate the effectiveness of the proposed approach on benchmark datasets, including Something-Something V1 and V2, UCF-101, and HMDB-51, where MLENet consistently outperforms state-of-the-art models.
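For context, the abstract references two previously published techniques: Optical Flow Guided Features (OFF; Sun et al., CVPR 2018) and sparse temporal segment sampling in the style of Temporal Segment Networks. The sketch below illustrates both under stated assumptions; the tensor shapes, Sobel-based gradient kernels, and function names are illustrative choices, not MLENet's actual TFREM implementation.

# Illustrative sketch only; not the authors' code (see assumptions above).
import torch
import torch.nn.functional as F

def sample_segments(num_frames: int, num_segments: int) -> list:
    """TSN-style sparse sampling: split the video into equal-length segments
    and take the centre frame of each, dropping redundant adjacent frames."""
    seg = num_frames / num_segments
    return [int(seg * i + seg / 2) for i in range(num_segments)]

def optical_flow_guided_feature(feat_t, feat_t1):
    """Compute OFF from feature maps (N, C, H, W) of two adjacent frames:
    stack the spatial gradients of the features with their temporal
    difference, following the brightness-constancy constraint applied at
    the feature level."""
    n, c, h, w = feat_t.shape
    sobel_x = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
    kx = sobel_x.reshape(1, 1, 3, 3).repeat(c, 1, 1, 1).to(feat_t)
    ky = sobel_x.t().reshape(1, 1, 3, 3).repeat(c, 1, 1, 1).to(feat_t)
    grad_x = F.conv2d(feat_t, kx, padding=1, groups=c)  # horizontal gradient
    grad_y = F.conv2d(feat_t, ky, padding=1, groups=c)  # vertical gradient
    grad_t = feat_t1 - feat_t                           # temporal gradient
    return torch.cat([grad_x, grad_y, grad_t], dim=1)   # (N, 3C, H, W)

# Usage: pick frame indices, then compute OFF between adjacent sampled frames.
indices = sample_segments(num_frames=120, num_segments=8)  # [7, 22, 37, ...]
f_t, f_t1 = torch.randn(2, 64, 28, 28), torch.randn(2, 64, 28, 28)
off = optical_flow_guided_feature(f_t, f_t1)               # (2, 192, 28, 28)

Because OFF is built from feature gradients rather than pixel-level optical flow, it captures motion cues without the separate, laborious flow-extraction step the abstract refers to.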

Funding agencies:

Funding text: