Research Article

KDSSF: A Graph Modeling Approach

by Muhammad Naeem, Sohail Asghar
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 33 - Number 4
Year of Publication: 2011
Authors: Muhammad Naeem, Sohail Asghar
10.5120/4010-5690

Muhammad Naeem, Sohail Asghar. KDSSF: A Graph Modeling Approach. International Journal of Computer Applications. 33, 4 (November 2011), 31-37. DOI=10.5120/4010-5690

@article{ 10.5120/4010-5690,
author = { Muhammad Naeem and Sohail Asghar },
title = { KDSSF: A Graph Modeling Approach },
journal = { International Journal of Computer Applications },
issue_date = { November 2011 },
volume = { 33 },
number = { 4 },
month = { November },
year = { 2011 },
issn = { 0975-8887 },
pages = { 31-37 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume33/number4/4010-5690/ },
doi = { 10.5120/4010-5690 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:19:18.689831+05:30
%A Muhammad Naeem
%A Sohail Asghar
%T KDSSF: A Graph Modeling Approach
%J International Journal of Computer Applications
%@ 0975-8887
%V 33
%N 4
%P 31-37
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In recent years, data mining applications have proved readily extensible to social science fields such as mass communication and religious studies. Traditional approaches to such work have not adequately considered the hidden semantics between documents. In this study, we show that text mining can be applied to classify social figures such as politicians and religious leaders. The classification is based on text mining of speeches delivered by these figures, who are well-known personalities; the speeches were collected from their official websites. Our text classification is based on tf.idf weighting followed by cosine and Jaccard similarity. To improve the results on discerning features, we designed a hash-based graph modeling technique, the Knowledge Discovery System for Social Figures (KDSSF), built on a dictionary of synonyms. In the comparative analysis of speeches, we focused not on finding optimal matches but on the overall classification of social figures in any domain of interest. Preliminary experiments indicate that including hash-based graph modeling can significantly improve classification results.
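
The similarity measures named in the abstract can be illustrated with a minimal Python sketch. It is not the KDSSF implementation: the tokenizer, the example speeches, the `SYNONYMS` table, and all function names are assumptions made for illustration, and the synonym lookup is only a simple stand-in for the paper's dictionary-based hash graph model, whose details the abstract does not give.

```python
import math
import re
from collections import Counter

# Hypothetical synonym table: maps a word to a canonical form.
# The real system uses a synonym dictionary; this entry is illustrative only.
SYNONYMS = {"lower": "cut"}

def tokenize(text):
    """Lowercase, keep alphabetic runs, and map synonyms to one canonical word."""
    return [SYNONYMS.get(t, t) for t in re.findall(r"[a-z]+", text.lower())]

def tfidf_vectors(docs):
    """Build sparse tf.idf vectors (dicts) for a list of token lists."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # document frequency: count each term once per doc
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def jaccard(a, b):
    """Jaccard similarity between the term sets of two token lists."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Made-up example speeches, not data from the paper.
speeches = [
    "We must cut taxes and grow the economy",
    "Lower taxes help the economy grow",
    "Faith and prayer guide the community",
]
docs = [tokenize(s) for s in speeches]
vecs = tfidf_vectors(docs)

print("cosine(0,1):", cosine(vecs[0], vecs[1]))
print("cosine(0,2):", cosine(vecs[0], vecs[2]))
print("jaccard(0,1):", jaccard(docs[0], docs[1]))
print("jaccard(0,2):", jaccard(docs[0], docs[2]))
```

In this toy corpus the two economic speeches score higher against each other than against the religious one under both measures, and mapping "lower" to "cut" adds a shared term that the raw tokens would not have had.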

Index Terms

Computer Science
Information Sciences

Keywords

Graph Modeling
Term Frequency
Inverse Document Frequency
Social Sciences
Synonyms