Research Article

Time Improving Policy of Text Clustering Algorithm by Reducing Computational Overheads

by Mamta Gupta, Anand Rajavat
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 123 - Number 5
Year of Publication: 2015
Authors: Mamta Gupta, Anand Rajavat
10.5120/ijca2015905320

Mamta Gupta, Anand Rajavat. Time Improving Policy of Text Clustering Algorithm by Reducing Computational Overheads. International Journal of Computer Applications. 123, 5 (August 2015), 15-19. DOI=10.5120/ijca2015905320

@article{ 10.5120/ijca2015905320,
author = { Mamta Gupta and Anand Rajavat },
title = { Time Improving Policy of Text Clustering Algorithm by Reducing Computational Overheads },
journal = { International Journal of Computer Applications },
issue_date = { August 2015 },
volume = { 123 },
number = { 5 },
month = { August },
year = { 2015 },
issn = { 0975-8887 },
pages = { 15-19 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume123/number5/21955-2015905320/ },
doi = { 10.5120/ijca2015905320 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:11:51.618164+05:30
%A Mamta Gupta
%A Anand Rajavat
%T Time Improving Policy of Text Clustering Algorithm by Reducing Computational Overheads
%J International Journal of Computer Applications
%@ 0975-8887
%V 123
%N 5
%P 15-19
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

As the amount of text data stored in computer repositories grows every day, a reliable way to group or classify text documents is needed more than ever. Clustering introduces a form of organization to the data and can also highlight significant patterns and trends. Document clustering is used in many fields, such as data mining and information retrieval. This paper presents the results of an experimental study of some common document clustering techniques. In particular, we compare the two main approaches to document clustering: agglomerative hierarchical clustering (Modified BIRCH) and partitional clustering (K-means). By comparing the two algorithms, we attempt to identify the clustering technique best suited to producing high-quality clusters of real-world documents.
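
To make the partitional side of the comparison concrete, the following is a minimal sketch of K-means over TF-IDF document vectors, written in plain Python. It is not the authors' implementation and does not cover Modified BIRCH; the toy documents, the smoothed IDF formula, the cosine-based (spherical) assignment step, the fixed seed documents, and all function names are assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def normalise(vec):
    """Scale a sparse vector (dict of term -> weight) to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else dict(vec)


def tfidf_vectors(docs):
    """Build unit-length TF-IDF vectors from whitespace-tokenised documents."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption; other weightings are possible).
        vec = {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
               for t, c in tf.items()}
        vectors.append(normalise(vec))
    return vectors


def dot(a, b):
    """Dot product of two sparse vectors; equals cosine for unit vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def kmeans(vectors, seeds, max_iter=20):
    """K-means with cosine similarity; `seeds` are initial centroid indices."""
    centroids = [vectors[i] for i in seeds]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document goes to its most similar centroid.
        new_labels = [max(range(len(centroids)),
                          key=lambda c: dot(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:  # converged: assignments stopped changing
            break
        labels = new_labels
        # Update step: each centroid becomes the normalised sum of its members.
        for c in range(len(centroids)):
            total = defaultdict(float)
            for v, label in zip(vectors, labels):
                if label == c:
                    for t, w in v.items():
                        total[t] += w
            if total:
                centroids[c] = normalise(total)
    return labels


docs = [
    "stock market shares trading prices",
    "market prices rise as shares rally",
    "football match ends with late goal",
    "goal scored in football cup match",
]
vectors = tfidf_vectors(docs)
labels = kmeans(vectors, seeds=(0, 2))
print(labels)  # [0, 0, 1, 1]
```

Fixed seeds keep the toy run deterministic; a timing comparison of the kind the paper reports would wrap the clustering call in a timer and run the second algorithm on the same vectors.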

Index Terms

Computer Science
Information Sciences

Keywords

Document clustering, BIRCH, K-means, Support Vector Model, Matrix Representation