DOI: 10.5120/8834-3052
Karm Veer Singh and Anil K. Tripathi. Emotion based Contextual Semantic Relevance Feedback in Multimedia Information Retrieval. International Journal of Computer Applications 55(15):38-49, October 2012.
BibTeX
@article{key:article,
  author  = {Karm Veer Singh and Anil K. Tripathi},
  title   = {Emotion based Contextual Semantic Relevance Feedback in Multimedia Information Retrieval},
  journal = {International Journal of Computer Applications},
  year    = {2012},
  volume  = {55},
  number  = {15},
  pages   = {38-49},
  month   = {October},
  note    = {Full text available}
}
Abstract
Every query a user issues to find relevant information carries a semantic concept and its associated contexts, but identifying these semantics and contexts in the query and conveying them to a multimedia information retrieval (MIR) system is a major challenge that has yet to be tackled effectively. To exploit the context associated with a semantic concept and thereby improve retrieval of potentially relevant information, we propose Emotion based Contextual Semantic Relevance Feedback (ECSRF) to learn, refine, discriminate, and identify the current context present in a query. We further investigate: (1) whether multimedia attributes (audio and speech along with visual features) can be used to work out the current context of a user's query and to reduce search space and retrieval time; (2) whether adding affective features (spoken emotional words together with facial expressions) for identifying and discriminating emotions increases overall retrieval performance in terms of precision, recall, and retrieval time; and (3) whether increasing the discriminating power of the classifier used for query perfection improves search accuracy with lower retrieval time. We introduce an Emotion Recognition Unit (ERU) that comprises a customized 3D spatiotemporal Gabor filter to capture spontaneous facial expressions and an emotional word recognition system (a combination of phonemes and visemes) to recognize spoken emotional words. The classifier algorithms GMM, SVM, and CQPSB are compared within the ECSRF framework to study the effect of increasing the classifier's discriminating power on retrieval performance. Our observations suggest that predicting contextual semantic relevance is feasible and that the ECSRF model benefits from these additional affective features and stronger classifiers, increasing a MIR system's retrieval efficiency and contextual perception.
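To make the ERU's visual front end more concrete, the sketch below shows one plausible way to build the kind of 3D spatiotemporal Gabor filtering the abstract describes, in the spirit of the motion-energy models of Adelson and Bergen and the spatiotemporal Gabor filters of Petkov and Subramanian cited in the references. It is a minimal NumPy/SciPy illustration, not the authors' implementation: the function names, parameter values, and the quadrature-pair energy pooling are assumptions made for this example.

```python
# Minimal sketch of a 3D spatiotemporal Gabor filter bank for facial motion
# energy, in the spirit of the ERU described in the abstract. Not the authors'
# implementation; all names and parameter values are illustrative assumptions.
import numpy as np
from scipy.ndimage import convolve


def gabor_3d(size=9, sigma=2.0, tau=2.0, wavelength=4.0,
             theta=0.0, speed=1.0, phase=0.0):
    """Build a 3D (t, y, x) Gabor kernel: Gaussian envelope times a drifting cosine.

    theta : spatial orientation of the carrier (radians)
    speed : preferred motion speed in pixels per frame along theta
    """
    half = size // 2
    rng = np.arange(-half, half + 1)
    t, y, x = np.meshgrid(rng, rng, rng, indexing="ij")   # axes match (frames, rows, cols)
    x_theta = x * np.cos(theta) + y * np.sin(theta)       # coordinate along the carrier
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2) - t**2 / (2.0 * tau**2))
    carrier = np.cos(2.0 * np.pi / wavelength * (x_theta - speed * t) + phase)
    kernel = envelope * carrier
    return kernel - kernel.mean()                         # zero mean: flat regions give ~0


def motion_energy(video, orientations=(0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Filter a (frames, rows, cols) grayscale volume; return per-orientation energy maps."""
    volume = np.asarray(video, dtype=float)
    responses = []
    for theta in orientations:
        # Quadrature pair (phases 0 and pi/2) gives a phase-invariant energy response.
        even = convolve(volume, gabor_3d(theta=theta, phase=0.0), mode="nearest")
        odd = convolve(volume, gabor_3d(theta=theta, phase=np.pi / 2), mode="nearest")
        responses.append(np.sqrt(even**2 + odd**2))
    return np.stack(responses)  # shape: (n_orientations, frames, rows, cols)


if __name__ == "__main__":
    clip = np.random.rand(16, 64, 64)   # stand-in for a short grayscale face clip
    energy = motion_energy(clip)
    print(energy.shape)                 # (4, 16, 64, 64)
```

Pooling such energy maps over facial regions of interest would give one plausible expression feature stream for the classifier comparison (GMM, SVM, CQPSB) described in the abstract, with the spoken-word branch (phonemes and visemes) contributing a separate stream.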
References
- Kankanhalli MS, Rui Y (2008) Application Potential of Multimedia Information Retrieval. Proceedings of the IEEE 96 (4)
- Hardoon DR, Taylor JS, Ajanki A, Aki KP, Kaski S (2007) Information retrieval by inferring implicit queries from eye movements. In Eleventh International Conference on Artificial Intelligence and Statistics
- Kelly D, Teevan J (2003) Implicit feedback for inferring user preference: a bibliography. SIGIR Forum 37(2): 18–28.
- Rui Y, Huang S (2000) Optimizing learning in image retrieval. In IEEE Proceedings of Conference on Computer Vision, pp 236-243
- Puolamaki K, Salojarvi J, Savia E, Simola J, Kaski S (2005) Combining eye movements and collaborative filtering for proactive information retrieval. In SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, ACM, pp 146-153
- Aytar Y, Orhan OB, Shah M (2007) Improving semantic concept detection and retrieval using contextual estimates. ICME
- Salojarvi J, Puolamaki K, Kaski S (2005) Implicit Relevance Feedback from Eye Movements. Artificial Neural Networks: Biological Inspirations ICANN 2005, Springer, 3696
- Urban J, Jose J (2007) Evaluating a workspace's usefulness for image retrieval. Journal of Multimedia Systems 12(4-5): 355-373
- Limbu DK, Connor A, Pears R, MacDonell S (2006) Contextual Relevance Feedback in Web Information Retrieval. Information Interaction in Context, ACM, pp 138-143
- Lang PJ, Greenwald MK, Bradley MM, Hamm AO (1993) Looking at pictures: affective, facial, visceral, and behavioral reactions. Psychophysiology, 30 (3): 261-273
- Park JS, Eum KB, Shin KH, Lee JW (2003) Color Image Retrieval Using Emotional Adjectives. Korea Information Processing Society, B, 10-B(2): 179-188
- Yoo HW, Cho SB (2004) Emotion-based Video Scene Retrieval using Interactive Genetic Algorithm. The Korean Institute of Information Scientists and Engineers, 10 (6): 514-528
- Arapakis I, Konstas I, Jose JM (2009) Using Facial Expressions and Peripheral Physiological Signals as Implicit Indicators of Topical Relevance. In SIGIR '09: Proceedings of the 32nd annual international conference on Research and development in information retrieval, ACM, 2009
- Ekman P (2003) Emotions Revealed: Recognizing Faces and Feelings to Improve Communication and Emotional Life. Times Books, 2003
- Pantic M, Rothkrantz L (2000) Expert system for automatic analysis of facial expression. Image and Vision Computing Journal, 18(11): 881-905
- Salway A, Graham M (2003) Extracting information about emotions in films. In: Proceedings of ACM Multimedia '03
- Chang YJ, Heish CK, Hsu PW, Chen YC (2003) Speech-Assisted Facial Expression Analysis and Synthesis for Visual Conferencing System. Proceedings of ICME, pp 111-529
- Hayamizu S, Tanaka K, Ohta K (1988) A Large Vocabulary Word Recognition System Using Rule Based Network Representation of Acoustic Characteristic Variations. IEEE, 1988
- Chang YJ, Heish CK, Hsu PW, Chen YC (2003) Speech Assisted Facial Expression Analysis and Synthesis for Virtual Conferencing Systems. IEEE, 2003
- Lu L, Zhang HJ, Ziang H (2002) Content Analysis for Audio Classification and Segmentation. IEEE Transactions on Speech and Audio Processing, 10(7): 505-515
- Zheng F, Zhang G, Song Z (2001) Comparison of Different Implementations of MFCC. J. Computer Science & Technology, 16(6): 582-589
- Zhang Y, Togneri R, Alder M (1997) Phoneme-Based Vector Quantization in a Discrete HMM Speech Recognizer. IEEE Transactions on Speech and Audio Processing, 5(1): 26-32
- McKenzie P, Alder M (1994) Initializing the EM algorithm for use in Gaussian mixture modeling. In Proc. Pattern Recognition, 1994
- Foo SW, Lian Y, Dong L (2004) Recognition of Visual Speech Elements Using Adaptively Boosted Hidden Markov Models. IEEE Transactions on Circuits and Systems for Video Technology, 14(5): 693-705
- Zhou XS, Huang ST (2000) Image retrieval: feature primitives, feature representation, and relevance feedback. IEEE Workshop on Content-based Access of Image and Video Libraries, pp 10-13
- Mokhtarian F, Abbasi S (2002) Shape similarity retrieval under affine transform. Pattern Recognition, 35: 31-41
- Xu R, Wunsch D (2005) Survey of clustering algorithms. IEEE Transactions on Neural Networks, 16(3): 645-678
- Jolion JM (2001) Feature similarity. In Principles of Visual Information Retrieval, M. S. Lew, Ed. Springer-Verlag, 122-162
- Juang BH, Rabiner L R (1991) Hidden Markov Models for Speech Recognition. Technometrics, 33 (3): 251-272
- Tou JT, Gonzalez RC (1974) Pattern Recognition Principles. Addison-Wesley Publishing Company, Inc., 1974
- Singh KV and Tripathi AK (2012) Contextual Query Perfection by Affective Features Based Implicit Contextual Semantic Relevance Feedback in Multimedia Information Retrieval. IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 5, No 3, pp 191-202
- Morishima S, Ogata S, Murai K, Nakamura S (2002) Audio-visual speech translation with automatic lip synchronization and face tracking based on 3D head model. In Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, 2: 2117-2120
- Silsbee PL, Bovik AC (1996) Computer lipreading for improved accuracy in automatic speech recognition. IEEE Trans. Speech Audio Processing, 4: 337–351
- Owens E, Blazek B (1985) Visemes observed by hearing impaired and normal hearing adult viewers. J. Speech Hear. Res., 28: 381-393
- Oard DW, Kim J (2001) Modeling information content using observable behavior. 2001
- Sebe N, Lew M S, Sun Y, Cohen I, Gevers T, Huang TS (2007) Authentic facial expression analysis. Image Vision Computing 25 (12): 1856-1863
- Bagherian E, Wirza R, Rahmat OK (2008) Facial feature extraction for face recognition: a review. IEEE, 2008
- Adelson EH, Bergen JR (1985) Spatiotemporal energy models for the perception of motion. Journal of the Optical Society of America A, 2(2): 284-299
- Petkov N, Subramanian E (2007) Motion detection, noise reduction, texture suppression, and contour enhancement by spatiotemporal Gabor filters with surround inhibition. Biological Cybernetics, 97(5-6): 423-439
- Lyons M, Akamatsu J, Kamachi SM, Gyoba J (1998) Coding Facial Expressions with Gabor Wavelets. Proceedings, Third IEEE International Conference on Automatic Face and Gesture Recognition, IEEE Computer Society, pp 200-205
- Jing F, Li M, Zhang L, Zhang HJ, Zhang B (2003) Learning in region based image retrieval. Proceedings of the International Conference on Image and Video Retrieval (CIVR 2003), 206-215
- Guo GD, Jain AK, Ma WY, Zhang HJ (2002) Learning similarity measure for natural image retrieval with relevance feedback. IEEE Trans. Neural Networks, 13(4): 811-820
- Zhang L, Liu F, Zhang B (2001) Support Vector Machine Learning for Image Retrieval. International Conference on Image Processing, 7-10
- Rocchio J. Relevance feedback in information retrieval. In: Salton G, Ed., The Smart Retrieval System—Experiment in Automatic Document Processing, Prentice-Hall, Englewood Cliffs, NJ, pp 313-323
- Bishop CM (1995) Neural Networks for Pattern Recognition. Oxford University Press, Oxford, UK
- Zhang L, Lin FJ, Zhang B (2001) A neural network based self-learning algorithm of image retrieval. Chinese Journal of Software, 12(10): 1479-1485
- Vapnik V (1995) The Nature of Statistical Learning Theory. Springer-Verlag, New York, NY, USA