Research Article

Article:Query based Text Document Clustering using its Hypernymy Relation

by S.Vijayalakshmi, Dr.D.Manimegalai
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 23 - Number 1
Year of Publication: 2011
Authors: S.Vijayalakshmi, Dr.D.Manimegalai
10.5120/2855-3667

S.Vijayalakshmi, Dr.D.Manimegalai. Article:Query based Text Document Clustering using its Hypernymy Relation. International Journal of Computer Applications. 23, 1 (June 2011), 13-16. DOI=10.5120/2855-3667

@article{ 10.5120/2855-3667,
author = { S.Vijayalakshmi, Dr.D.Manimegalai },
title = { Article:Query based Text Document Clustering using its Hypernymy Relation },
journal = { International Journal of Computer Applications },
issue_date = { June 2011 },
volume = { 23 },
number = { 1 },
month = { June },
year = { 2011 },
issn = { 0975-8887 },
pages = { 13-16 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume23/number1/2855-3667/ },
doi = { 10.5120/2855-3667 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:09:47.164744+05:30
%A S.Vijayalakshmi
%A Dr.D.Manimegalai
%T Article:Query based Text Document Clustering using its Hypernymy Relation
%J International Journal of Computer Applications
%@ 0975-8887
%V 23
%N 1
%P 13-16
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Text can be clustered in an unsupervised manner. In this paper, text document clustering is performed based on a query and its semantic relations. The method uses hypernymy to identify these relations, which are detected using WordNet. WordNet acts as background knowledge for the query and provides its synonymous terms. This paper proposes a new term-document matrix, called the Query based document vector model, which is constructed from a two-term query and the hypernyms of its terms. The results show that our new measure, Cluster Accuracy, is significantly better for evaluating cluster quality, and that better results are obtained.
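The idea of a query-restricted term-document matrix can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the weighting scheme or the clustering step, so the sketch assumes raw term counts and replaces the WordNet lookup with a small hand-written table. The names `TOY_HYPERNYMS`, `expand_query`, and `query_document_matrix` are hypothetical and were chosen for illustration only.

```python
import re
from collections import Counter

# Hypothetical stand-in for WordNet hypernym lookups (hand-written, not real WordNet data).
TOY_HYPERNYMS = {
    "car": ["vehicle"],
    "apple": ["fruit"],
}


def expand_query(query_terms, hypernyms=TOY_HYPERNYMS):
    """Return the query terms followed by their hypernyms, without duplicates."""
    features = []
    for term in query_terms:
        for candidate in [term, *hypernyms.get(term, [])]:
            if candidate not in features:
                features.append(candidate)
    return features


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def query_document_matrix(documents, query_terms):
    """Build one row per document, with one column per query term or hypernym.

    Assumption: cells hold raw occurrence counts; the paper may weight them differently.
    """
    features = expand_query(query_terms)
    rows = []
    for doc in documents:
        counts = Counter(tokenize(doc))
        rows.append([counts[f] for f in features])
    return features, rows


docs = [
    "The car is a fast vehicle.",
    "An apple is a sweet fruit, and this apple is red.",
    "The weather is sunny today.",
]
features, matrix = query_document_matrix(docs, ["car", "apple"])
for row in matrix:
    print(dict(zip(features, row)))
```

Restricting the columns to the query terms and their hypernyms keeps each document vector short, and a document unrelated to the query (the third one here) maps to an all-zero row. The resulting rows could then be passed to any standard clustering algorithm.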

Index Terms

Computer Science
Information Sciences

Keywords

Clustering, Noun, WordNet, Query based document vector model, Hypernymy, Accuracy