Research Article

Machine Learning based approach for Human Trait Identification from Blog Data

by Saurabh Saxena, Chandra Mani Sharma
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 48 - Number 10
Year of Publication: 2012
Authors: Saurabh Saxena, Chandra Mani Sharma
10.5120/7384-0005

Saurabh Saxena, Chandra Mani Sharma. Machine Learning based approach for Human Trait Identification from Blog Data. International Journal of Computer Applications. 48, 10 (June 2012), 17-22. DOI=10.5120/7384-0005

@article{ 10.5120/7384-0005,
author = { Saurabh Saxena and Chandra Mani Sharma },
title = { Machine Learning based approach for Human Trait Identification from Blog Data },
journal = { International Journal of Computer Applications },
issue_date = { June 2012 },
volume = { 48 },
number = { 10 },
month = { June },
year = { 2012 },
issn = { 0975-8887 },
pages = { 17-22 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume48/number10/7384-0005/ },
doi = { 10.5120/7384-0005 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:43:43.472503+05:30
%A Saurabh Saxena
%A Chandra Mani Sharma
%T Machine Learning based approach for Human Trait Identification from Blog Data
%J International Journal of Computer Applications
%@ 0975-8887
%V 48
%N 10
%P 17-22
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Emotions form a major part of a person's personality. Emotional intelligence (EI) is the ability to identify, assess, and control the emotions of oneself, of others, and of groups. Written expression reflects its author's personality, so various personality traits can be determined through analysis of what a person writes. This paper proposes a novel technique for identifying human traits from an author's written expression. The technique is based on supervised machine learning and uses a Support Vector Machine to classify the personality of a writer into five categories: highly extrovert, highly introvert, low introvert, low extrovert, and ambivert. Experiments on real-world blog data demonstrate that the proposed technique can determine the personality traits of a writer with accuracy and speed. We have also implemented a PHP-based online system that reads the contents of a blog and automatically predicts the personality of its writer.
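
The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which features, kernel, or training procedure the paper uses, so the bag-of-words features, the bias-free linear SVM trained by Pegasos-style subgradient descent, the one-vs-rest scheme, and every function name below are assumptions made for illustration. The example posts are invented and are not data from the paper. The sketch is written in Python rather than PHP, for brevity.

```python
import random
from collections import Counter

# The five personality classes named in the abstract.
LABELS = ["highly extrovert", "highly introvert", "low introvert",
          "low extrovert", "ambivert"]


def featurize(text, vocab):
    """Relative term frequencies over a fixed vocabulary (assumed features)."""
    counts = Counter(text.lower().split())
    total = sum(counts.values()) or 1
    return [counts[word] / total for word in vocab]


def train_binary_svm(X, y, epochs=50, lam=0.01, seed=0):
    """Linear SVM without a bias term, fitted by Pegasos-style subgradient
    descent on the regularised hinge loss. Labels in y are +1 or -1."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Regularisation shrink: w <- (1 - eta * lam) * w = (1 - 1/t) * w
            w = [wj * (1.0 - 1.0 / t) for wj in w]
            if margin < 1.0:  # example violates the margin: hinge-loss step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def train_one_vs_rest(texts, labels):
    """Train one binary SVM per class; returns the vocabulary and weights."""
    vocab = sorted({word for text in texts for word in text.lower().split()})
    X = [featurize(text, vocab) for text in texts]
    weights = {}
    for label in LABELS:
        y = [1 if current == label else -1 for current in labels]
        weights[label] = train_binary_svm(X, y)
    return vocab, weights


def predict(text, vocab, weights):
    """Return the class whose SVM gives the highest decision score."""
    x = featurize(text, vocab)
    scores = {label: sum(wj * xj for wj, xj in zip(w, x))
              for label, w in weights.items()}
    return max(scores, key=scores.get)


# Invented toy posts, one per class. NOT data from the paper.
TOY_POSTS = [
    ("party friends dancing crowd tonight", "highly extrovert"),
    ("solitude silence journal thoughts", "highly introvert"),
    ("quiet evening reading tea", "low introvert"),
    ("lunch colleagues chat weekend", "low extrovert"),
    ("sometimes social sometimes alone", "ambivert"),
]

vocab, weights = train_one_vs_rest([text for text, _ in TOY_POSTS],
                                   [label for _, label in TOY_POSTS])
print(predict("dancing with friends", vocab, weights))
```

On real blog data one would more likely rely on an established SVM library and richer stylometric features; the hand-written trainer here only keeps the sketch free of external dependencies.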

Index Terms

Computer Science
Information Sciences

Keywords

Machine Learning, Soft Computing, Support Vector Machine, Author Trait Identification, Online System