Abstract
Owing to its resilience to visual noise and viewpoint variations, skeleton-based analysis has become a cornerstone of human action recognition research. Despite its practical significance, existing methodologies often rely on single-stream skeletal representations, which fail to capture the full complexity of action features. This study introduces Latent Features for Human Action Recognition (LFHAR), a novel architecture designed to overcome these limitations by utilizing diverse spatio-temporal latent representations for improved feature extraction. The approach applies graph-based transformations to individual skeletal frames in temporal sequences, then arranges the derived graph features into spatio-temporal matrices. Evaluation on standard datasets demonstrates the stability and invariance characteristics of the LFHAR architecture. The method yields substantial performance improvements, with accuracy gains of 2.7% on the NTU-RGB+D 60 dataset and 2.1% on the NTU-RGB+D 120 dataset, confirming its effectiveness in improving human action recognition.
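The per-frame graph transform and stacking step described above can be sketched as follows. This is a minimal illustration, not the paper's exact LFHAR formulation: the normalized-adjacency graph convolution, the 5-joint chain skeleton, and the channel sizes are all illustrative assumptions.

```python
import numpy as np

def graph_conv(X, A, W):
    """One illustrative graph-convolution step on a single skeletal frame.

    X: (joints, in_channels) joint coordinates; A: (joints, joints) adjacency;
    W: (in_channels, out_channels) learnable projection (random here).
    """
    A_hat = A + np.eye(A.shape[0])             # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))   # row-normalize by degree
    return np.maximum(D_inv @ A_hat @ X @ W, 0.0)  # aggregate, project, ReLU

def spatio_temporal_features(frames, A, W):
    """Apply the graph transform to each frame, then stack the per-frame
    graph features into a spatio-temporal feature tensor."""
    return np.stack([graph_conv(X, A, W) for X in frames])

# Toy data: 4 frames, 5 joints in a chain, 3-D coordinates, 8 latent channels.
rng = np.random.default_rng(0)
A = np.zeros((5, 5))
for i in range(4):                 # chain skeleton: joint i <-> joint i+1
    A[i, i + 1] = A[i + 1, i] = 1.0
frames = rng.normal(size=(4, 5, 3))
W = rng.normal(size=(3, 8))

feats = spatio_temporal_features(frames, A, W)  # shape (4, 5, 8)
```

Stacking along the temporal axis yields a (frames, joints, channels) tensor, the kind of spatio-temporal matrix a downstream GCN or classifier head could consume.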
First Page: 103
Last Page: 109
References
- Sun, Z., Ke, Q., Rahmani, H., Bennamoun, M., Wang, G., Liu, J. (2022). Human action recognition from various data modalities: A review. IEEE Trans. Pattern Anal. Mach. Intell.
- Ahmad, T., Jin, L., Zhang, X., Lai, S., Tang, G., Lin, L. (2021). Graph convolutional neural network for human action recognition: A comprehensive survey. IEEE Trans. Artif. Intell., 2 (2), 128-145.
- Cheng, K., Zhang, Y., He, X., Chen, W., Cheng, J., Lu, H. (2020). Skeleton-based action recognition with shift graph convolutional network. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 183-192.
- Xin, W., Liu, Y., Liu, R., Miao, Q., Shi, C., Pun, C.-M. (2023). Auto-learning-gcn: An ingenious framework for skeleton-based action recognition. Chinese Conference on Pattern Recognition and Computer Vision, Springer, 29-42.
- Liu, R., Liu, Y., Wu, M., Xin, W., Miao, Q., Liu, X., Li, L. (2025). SG-CLR: Semantic representation-guided contrastive learning for self-supervised skeleton-based action recognition. Pattern Recognit., 162, 111377.
- Marakhimov, A.R., Khudaybergenov, K.K. (2025). Softmax Regression with Multi-Connected Weights. Computers.
- Aouaidjia, K., Zhang, C., Pitas, I. (2025). Spatio-temporal invariant descriptors for skeleton-based human action recognition. Inf. Sci., 700, 121832. doi: 10.1016/j.ins.2024.121832.
- Tu, Z., et al. (2022). MaxViT: Multi-axis vision transformer. European Conference on Computer Vision, Springer, 459-479.
- Wang, J., et al. (2014). Cross-view action modeling, learning and recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2649-2656.
- Wang, P., et al. (2016). Action recognition based on joint trajectory maps using convolutional neural networks. Proceedings of the 24th ACM International Conference on Multimedia, 102-106.
Recommended Citation
Marakhimov, Avazjon; Khudaybergenov, Kabul; and Mominov, Zakhriddin
(2025)
"SKELETON-BASED HUMAN ACTION RECOGNITION USING SPATIO-TEMPORAL LATENT FEATURES WITH A GCN MODEL,"
Chemical Technology, Control and Management: Vol. 2025: Iss. 6, Article 11.
DOI: https://doi.org/10.59048/2181-1105.1725
Included in
Complex Fluids Commons, Controls and Control Theory Commons, Industrial Technology Commons, Process Control and Systems Commons