A Review on Action Recognition and Action Prediction of Human(s) using Deep Learning Approaches

Syed Abdussami; Nagendraprasad S.; Shivarajakumara K.; Sanjeet Singh; A. Thyagarajamurthy

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 177 - Number 20

Year of Publication: 2019

10.5120/ijca2019919605

Syed Abdussami, Nagendraprasad S., Shivarajakumara K., Sanjeet Singh, A. Thyagarajamurthy. A Review on Action Recognition and Action Prediction of Human(s) using Deep Learning Approaches. International Journal of Computer Applications. 177, 20 (Nov 2019), 1-5. DOI=10.5120/ijca2019919605

@article{ 10.5120/ijca2019919605,
  author = { Syed Abdussami and Nagendraprasad S. and Shivarajakumara K. and Sanjeet Singh and A. Thyagarajamurthy },
  title = { A Review on Action Recognition and Action Prediction of Human(s) using Deep Learning Approaches },
  journal = { International Journal of Computer Applications },
  issue_date = { Nov 2019 },
  volume = { 177 },
  number = { 20 },
  month = { Nov },
  year = { 2019 },
  issn = { 0975-8887 },
  pages = { 1-5 },
  numpages = {9},
  url = { https://ijcaonline.org/archives/volume177/number20/31012-2019919605/ },
  doi = { 10.5120/ijca2019919605 },
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

Abstract

Human action recognition and prediction are active research topics in computer vision, and they contribute significantly to anomaly detection. Many researchers work in this field, and many new algorithms have been proposed in recent decades. This paper reviews eight such approaches, each proposed in a separate research paper. Compared with their counterparts for still images (2D CNNs for visual recognition), 3D CNNs are considered less efficient because of limitations such as the high training complexity of spatio-temporal fusion and a large memory cost. The first paper therefore proposes MiCT (Mixed Convolutional Tube, for videos), which combines 2D and 3D CNNs to reduce training time. In the second paper, the glimpse sequences in each frame correspond to interest points in the scene that are relevant to the classified activities. Unlike the second paper, the third presents a novel method that recognizes human actions as the evolution of pose estimation maps. The fourth paper presents a model for long-term prediction of pedestrians from on-board observations. The fifth paper attempts to recognize human rights violations using deep convolutional neural networks. The sixth paper uses a convolutional LSTM to detect violent videos. The seventh paper introduces a new Two-Stream Inflated 3D ConvNet (I3D) that is based on 2D ConvNet inflation. The eighth paper embeds a new temporal transition layer (TTL), which models variable temporal convolution kernel depths, into a 3D CNN to form T3D (Temporal 3D ConvNets). Transferring knowledge from a pre-trained 2D CNN to a 3D CNN reduces the number of training samples required for 3D CNNs.
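To make the idea of 2D ConvNet inflation mentioned in the abstract more concrete, the following is a minimal sketch in plain Python, not the code of any of the reviewed papers. It assumes the commonly described bootstrapping scheme in which a 2D kernel is repeated along the temporal axis and rescaled by the temporal depth, so that a 3D filter applied to a static clip gives the same response as the 2D filter applied to a single frame. The function names (`inflate_kernel`, `response_2d`, `response_3d`) and the toy kernel and patch values are hypothetical and chosen only for illustration.

```python
def inflate_kernel(kernel_2d, depth):
    """Inflate a 2D kernel (list of rows) into a 3D kernel with `depth` slices.

    Assumption: each temporal slice is the 2D kernel divided by `depth`,
    so the summed response over a static clip matches the 2D response.
    """
    if depth < 1:
        raise ValueError("depth must be >= 1")
    return [[[w / depth for w in row] for row in kernel_2d] for _ in range(depth)]


def response_2d(patch, kernel_2d):
    """Response of a 2D kernel on one same-sized image patch (no sliding)."""
    return sum(
        p * w
        for patch_row, kernel_row in zip(patch, kernel_2d)
        for p, w in zip(patch_row, kernel_row)
    )


def response_3d(clip, kernel_3d):
    """Response of a 3D kernel on a clip with one patch per temporal slice."""
    return sum(response_2d(frame, k) for frame, k in zip(clip, kernel_3d))


# Hypothetical toy data: a 3x3 edge-like kernel and a 3x3 image patch.
kernel = [[1.0, 0.0, -1.0], [2.0, 0.0, -2.0], [1.0, 0.0, -1.0]]
patch = [[3.0, 1.0, 0.0], [4.0, 1.0, 1.0], [5.0, 2.0, 2.0]]

# A "static video": the same patch repeated over four frames.
static_clip = [patch] * 4
kernel_3d = inflate_kernel(kernel, depth=4)

out_2d = response_2d(patch, kernel)
out_3d = response_3d(static_clip, kernel_3d)
```

On the static clip, `out_3d` should agree with `out_2d` up to floating-point error, which is the property that lets a 3D network start from pre-trained 2D weights instead of random initialization.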

References

Y. Zhou, X. Sun, Z. Zha and W. Zeng, "MiCT: Mixed 3D/2D Convolutional Tube for Human Action Recognition," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, 2018, pp. 449-458.
F. Baradel, C. Wolf, J. Mille and G. W. Taylor, "Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, 2018, pp. 469-478.
M. Liu and J. Yuan, "Recognizing Human Actions as the Evolution of Pose Estimation Maps," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, 2018, pp. 1159-1168.
A. Bhattacharyya, M. Fritz and B. Schiele, "Long-Term On-board Prediction of People in Traffic Scenes Under Uncertainty," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, 2018, pp. 4194-4202.
G. Kalliatakis, S. Ehsan, M. Fasli, A. Leonardis, J. Gall and K. McDonald-Maier, "Detection of Human Rights Violations in Images: Can Convolutional Neural Networks help?," 2016.
S. Sudhakaran and O. Lanz, "Learning to detect violent videos using convolutional long short-term memory," 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Lecce, 2017, pp. 1-6.
J. Carreira and A. Zisserman, "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset," 2017, pp. 4724-4733, doi: 10.1109/CVPR.2017.502.
A. Diba, M. Fayyaz, V. Sharma, A. H. Karami, M. M. Arzani, L. Van Gool and R. Yousefzadeh, "Temporal 3D ConvNets: New Architecture and Transfer Learning for Video Classification," 2017.
Z. Tu, W. Xie, Q. Qin, R. Poppe, R. Veltkamp, B. Li and J. Yuan, "Multi-stream CNN: Learning representations based on human-related regions for action recognition," Pattern Recognition, vol. 79, pp. 32-43, 2018, doi: 10.1016/j.patcog.2018.01.020.

Index Terms

Computer Science

Information Sciences

Keywords

CNN, SVM, MiCT, Glimpse Clouds, Two-stream Bayesian Encoder-Decoder, Pose estimation, Heat Maps, ConvLSTM, Two-stream 3D CNN, TTL, T3D