A Pyramidal Layered HMM for Multiview Human Behavior Recognition in Asynchronous Video Streams

International Journal of Computer Applications
© 2014 by IJCA Journal
Volume 96 - Number 7
Year of Publication: 2014
Amir Farid Aminian Modarres
Mohsen Soryani

Amir Farid Aminian Modarres and Mohsen Soryani. Article: A Pyramidal Layered HMM for Multiview Human Behavior Recognition in Asynchronous Video Streams. International Journal of Computer Applications 96(7):34-40, June 2014. Full text available. BibTeX

	author = {Amir Farid Aminian Modarres and Mohsen Soryani},
	title = {Article: A Pyramidal Layered HMM for Multiview Human Behavior Recognition in Asynchronous Video Streams},
	journal = {International Journal of Computer Applications},
	year = {2014},
	volume = {96},
	number = {7},
	pages = {34-40},
	month = {June},
	note = {Full text available}


Features extracted from a multiview video stream form a special case of a multi-sensor observation sequence. If the sensors are not synchronous, the features observed in the different views are not aligned with one another, which causes difficulties in classification applications. This paper proposes a new hidden Markov model architecture, the pyramidal layered hidden Markov model, to handle this situation. Each view stream is decoded separately in the bottom layer, and the aligned decoded symbols are then fused in the top layer. The structure and algorithms of the new model are introduced and then applied to human behavior recognition in multiview video sequences. By combining the information collected from all views of a multiview human action recognition system, one expects the recognition rate to increase and problems such as occlusion to be mitigated. Several experiments are reported. The results show high performance, about 93.8% on average, in multiview human behavior recognition, as well as improved accuracy compared with similar methods. The results are also compared with other contributions on three different multiview behavior datasets.
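The two-layer idea in the abstract (decode each view separately, align the decoded symbols, then fuse them in a top layer) can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not specify the alignment or fusion method, so the nearest-index resampling in `resample`, the joint-symbol encoding in `fuse`, and all model parameters below are assumptions made for illustration only. The bottom layer uses standard Viterbi decoding and the top layer scores the fused sequence with the standard forward algorithm.

```python
import math

NEG_INF = float("-inf")

def _log(x):
    return math.log(x) if x > 0 else NEG_INF

def viterbi(obs, start_p, trans_p, emit_p):
    """Most likely state path of a discrete HMM (log domain)."""
    n = len(start_p)
    score = [_log(start_p[s]) + _log(emit_p[s][obs[0]]) for s in range(n)]
    back = []  # back[t-1][s] = best predecessor of state s at time t
    for o in obs[1:]:
        prev, step, score = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + _log(trans_p[r][s]))
            step.append(best)
            score.append(prev[best] + _log(trans_p[best][s]) + _log(emit_p[s][o]))
        back.append(step)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for step in reversed(back):
        state = step[state]
        path.append(state)
    return path[::-1]

def resample(seq, length):
    """ASSUMED alignment step: nearest-index resampling to a common length."""
    return [seq[min(len(seq) - 1, (i * len(seq)) // length)] for i in range(length)]

def fuse(aligned, n_states):
    """ASSUMED fusion step: encode one symbol per view as a joint symbol."""
    fused = []
    for symbols in zip(*aligned):
        code = 0
        for s in symbols:
            code = code * n_states + s
        fused.append(code)
    return fused

def forward_loglik(obs, start_p, trans_p, emit_p):
    """Log-likelihood of obs under a discrete HMM (scaled forward algorithm)."""
    n = len(start_p)
    alpha = [start_p[s] * emit_p[s][obs[0]] for s in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[r] * trans_p[r][s] for r in range(n)) * emit_p[s][obs[t]]
                     for s in range(n)]
        total = sum(alpha)
        if total == 0:
            return NEG_INF
        loglik += math.log(total)
        alpha = [a / total for a in alpha]
    return loglik

def recognize(view_obs, bottom_models, top_models, length):
    """Bottom layer: per-view decoding. Top layer: score the fused symbols."""
    decoded = [viterbi(o, *m) for o, m in zip(view_obs, bottom_models)]
    aligned = [resample(d, length) for d in decoded]
    fused = fuse(aligned, len(bottom_models[0][0]))  # assumes equal state counts
    scores = {label: forward_loglik(fused, *m) for label, m in top_models.items()}
    return max(scores, key=scores.get), fused

# Toy, hand-picked parameters (hypothetical, not taken from the paper).
bottom = ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.1, 0.9]])
top = {
    "rise": ([0.9, 0.1], [[0.8, 0.2], [0.1, 0.9]],
             [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]]),
    "fall": ([0.9, 0.1], [[0.8, 0.2], [0.1, 0.9]],
             [[0.1, 0.1, 0.1, 0.7], [0.7, 0.1, 0.1, 0.1]]),
}
# Two views of the same event, recorded with different numbers of frames.
views = [[0, 0, 1, 1], [0, 0, 0, 1, 1, 1]]
label, fused = recognize(views, [bottom, bottom], top, length=4)
print(label, fused)
```

In this sketch the two views have different lengths, standing in for unsynchronized cameras. Because each view is decoded on its own timeline before anything is combined, the misalignment only has to be resolved at the symbol level, which is the motivation the abstract gives for the layered design.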

