Spatio-Temporal-based Multi-level Aggregation Network for Physical Action Recognition
Computer Science and Information Systems, Volume 21 (2024) no. 4

See the article record from the source Computer Science and Information Systems website

This paper introduces a spatio-temporal-based multi-level aggregation network (ST-MANet) for action recognition. ST-MANet uses the correlations between different spatial positions and between different temporal positions on the feature map to capture long-range spatial and temporal dependencies, respectively. From these correlations it generates spatial and temporal attention maps that assign different weights to features at different spatial and temporal locations. In addition, a multi-scale behavior recognition framework is proposed that models various visual rhythms while capturing multi-scale spatiotemporal information. A spatial diversity constraint is then proposed that encourages the spatial attention maps at different scales to focus on distinct areas. This places greater emphasis on the spatial information unique to each scale, so that the multi-scale features incorporate more diverse spatial information. Finally, ST-MANet is compared with existing approaches and achieves high accuracy on three datasets.
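The abstract does not give the exact formulation, so the following is only a minimal sketch of the two ideas it names: an attention map derived from correlations between positions, and a penalty that discourages attention maps at different scales from overlapping. Everything here is an assumption made for illustration: the function names (`attention_map`, `reweight`, `diversity_penalty`), the use of dot products with a softmax as the correlation measure, and the use of pairwise cosine similarity as the diversity term. It also assumes that the attention maps from different scales have been resampled to a common length. It is not ST-MANet's actual implementation.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_map(features):
    """Return one weight per position, based on position-to-position correlations.

    `features` is a list of N positions, each a list of C channel values.
    Positions may be flattened spatial locations (H*W) or frames (T), so the
    same routine serves as a spatial or a temporal attention sketch.
    """
    n = len(features)
    # Correlation (dot product) between every pair of positions.
    corr = [[sum(a * b for a, b in zip(features[i], features[j]))
             for j in range(n)] for i in range(n)]
    # Row-wise softmax: how position i spreads its attention over all j.
    affinity = [softmax(row) for row in corr]
    # Attention received by position j, averaged over all query positions i.
    return [sum(affinity[i][j] for i in range(n)) / n for j in range(n)]


def reweight(features, weights):
    """Scale the feature vector at each position by its attention weight."""
    return [[w * x for x in pos] for pos, w in zip(features, weights)]


def diversity_penalty(maps):
    """Sum of pairwise cosine similarities between attention maps.

    Assumed stand-in for a spatial diversity constraint: the value is lower
    when the maps concentrate on different positions.
    """
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm

    return sum(cosine(maps[i], maps[j])
               for i in range(len(maps))
               for j in range(i + 1, len(maps)))


if __name__ == "__main__":
    # Toy feature map: 3 positions, 2 channels each (hypothetical values).
    toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    weights = attention_map(toy)
    print(weights)
    print(reweight(toy, weights))
    print(diversity_penalty([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]))
```

In a training setup, a term like `diversity_penalty` would typically be added to the classification loss with a small weighting coefficient. The abstract does not state how the constraint is weighted, so that choice is left open here.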
Keywords: Action recognition, spatial and temporal attention, multi-level aggregation network
@article{CSIS_2024_21_4_a31,
     author = {Yuhang Wang},
     title = {Spatio-Temporal-based {Multi-level} {Aggregation} {Network} for {Physical} {Action} {Recognition}},
     journal = {Computer Science and Information Systems},
     publisher = {mathdoc},
     volume = {21},
     number = {4},
     year = {2024},
     url = {http://geodesic.mathdoc.fr/item/CSIS_2024_21_4_a31/}
}
TY  - JOUR
AU  - Yuhang Wang
TI  - Spatio-Temporal-based Multi-level Aggregation Network for Physical Action Recognition
JO  - Computer Science and Information Systems
PY  - 2024
VL  - 21
IS  - 4
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/CSIS_2024_21_4_a31/
ID  - CSIS_2024_21_4_a31
ER  - 
%0 Journal Article
%A Yuhang Wang
%T Spatio-Temporal-based Multi-level Aggregation Network for Physical Action Recognition
%J Computer Science and Information Systems
%D 2024
%V 21
%N 4
%I mathdoc
%U http://geodesic.mathdoc.fr/item/CSIS_2024_21_4_a31/
%F CSIS_2024_21_4_a31
Yuhang Wang. Spatio-Temporal-based Multi-level Aggregation Network for Physical Action Recognition. Computer Science and Information Systems, Volume 21 (2024) no. 4. http://geodesic.mathdoc.fr/item/CSIS_2024_21_4_a31/