Spatio-Temporal-based Multi-level Aggregation Network for Physical Action Recognition
Computer Science and Information Systems, Volume 21 (2024), no. 4.

See the article record from the source: Computer Science and Information Systems website

This paper introduces a spatio-temporal-based multi-level aggregation network (ST-MANet) for action recognition. ST-MANet uses the correlations between different spatial positions and between different temporal positions on the feature map to capture long-range spatial and temporal dependencies, respectively. From these correlations it generates spatial and temporal attention maps that assign different weights to features at different spatial and temporal locations. In addition, a multi-scale behavior recognition framework is proposed that models various visual rhythms while capturing multi-scale spatiotemporal information. A spatial diversity constraint is then proposed that encourages the spatial attention maps at different scales to focus on distinct areas. Each scale thus emphasizes the spatial information unique to it, so the multi-scale features incorporate more diverse spatial information. Finally, ST-MANet is compared with existing approaches and achieves high accuracy on three datasets.
Keywords: Action recognition, spatial and temporal attention, multi-level aggregation network
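
The two ideas in the abstract, attention weights derived from correlations between positions and a diversity constraint across scales, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `position_attention` and `diversity_penalty`, the scaled dot-product correlation, the softmax normalization, and the cosine-similarity form of the penalty are all assumptions chosen for illustration. Positions may be read as either spatial locations or time steps, and the attention maps passed to `diversity_penalty` are assumed to be non-zero and already resampled to a common length.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def position_attention(feats):
    """Attention weights from pairwise correlations between positions.

    feats: list of N feature vectors (one per spatial or temporal position).
    Returns (weights, reweighted); weights has length N and sums to 1.
    Assumption: correlation = scaled dot product, as in common self-attention.
    """
    n, c = len(feats), len(feats[0])
    scale = math.sqrt(c)
    # Correlation between every pair of positions (N x N).
    corr = [
        [sum(a * b for a, b in zip(feats[i], feats[j])) / scale for j in range(n)]
        for i in range(n)
    ]
    # Each row says how strongly position i attends to every position j.
    affinity = [softmax(row) for row in corr]
    # Average attention received by each position gives one weight per position.
    weights = [sum(affinity[i][j] for i in range(n)) / n for j in range(n)]
    # Reweight the features at each position by its attention weight.
    reweighted = [[w * v for v in row] for w, row in zip(weights, feats)]
    return weights, reweighted


def diversity_penalty(maps):
    """Mean pairwise cosine similarity between attention maps of different scales.

    maps: list of at least two non-zero attention maps of equal length.
    A lower value means the scales attend to more distinct areas.
    Assumption: this cosine form is a stand-in, not the paper's constraint.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    pairs = [(i, j) for i in range(len(maps)) for j in range(i + 1, len(maps))]
    return sum(cosine(maps[i], maps[j]) for i, j in pairs) / len(pairs)
```

In a training setup, a penalty of this kind would typically be added to the classification loss with a small weight, so that attention maps overlapping across scales are discouraged; the weighting is likewise an assumption and is not given in the abstract.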
@article{CSIS_2024_21_4_a31,
     author = {Yuhang Wang},
     title = {Spatio-Temporal-based {Multi-level} {Aggregation} {Network} for {Physical} {Action} {Recognition}},
     journal = {Computer Science and Information Systems},
     publisher = {mathdoc},
     volume = {21},
     number = {4},
     year = {2024},
     url = {http://geodesic.mathdoc.fr/item/CSIS_2024_21_4_a31/}
}
TY  - JOUR
AU  - Yuhang Wang
TI  - Spatio-Temporal-based Multi-level Aggregation Network for Physical Action Recognition
JO  - Computer Science and Information Systems
PY  - 2024
VL  - 21
IS  - 4
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/CSIS_2024_21_4_a31/
ID  - CSIS_2024_21_4_a31
ER  - 
%0 Journal Article
%A Yuhang Wang
%T Spatio-Temporal-based Multi-level Aggregation Network for Physical Action Recognition
%J Computer Science and Information Systems
%D 2024
%V 21
%N 4
%I mathdoc
%U http://geodesic.mathdoc.fr/item/CSIS_2024_21_4_a31/
%F CSIS_2024_21_4_a31
Yuhang Wang. Spatio-Temporal-based Multi-level Aggregation Network for Physical Action Recognition. Computer Science and Information Systems, Volume 21 (2024), no. 4. http://geodesic.mathdoc.fr/item/CSIS_2024_21_4_a31/