Time Improving Policy of Text Clustering Algorithm by Reducing Computational Overheads

Mamta Gupta; Anand Rajavat

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 123 - Number 5

Year of Publication: 2015

Authors: Mamta Gupta, Anand Rajavat

DOI: 10.5120/ijca2015905320

Mamta Gupta, Anand Rajavat. Time Improving Policy of Text Clustering Algorithm by Reducing Computational Overheads. International Journal of Computer Applications. 123, 5 (August 2015), 15-19. DOI=10.5120/ijca2015905320

@article{10.5120/ijca2015905320,
  author = {Mamta Gupta and Anand Rajavat},
  title = {Time Improving Policy of Text Clustering Algorithm by Reducing Computational Overheads},
  journal = {International Journal of Computer Applications},
  issue_date = {August 2015},
  volume = {123},
  number = {5},
  month = {August},
  year = {2015},
  issn = {0975-8887},
  pages = {15-19},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume123/number5/21955-2015905320/},
  doi = {10.5120/ijca2015905320},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T23:11:51.618164+05:30
%A Mamta Gupta
%A Anand Rajavat
%T Time Improving Policy of Text Clustering Algorithm by Reducing Computational Overheads
%J International Journal of Computer Applications
%@ 0975-8887
%V 123
%N 5
%P 15-19
%D 2015
%I Foundation of Computer Science (FCS), NY, USA

Abstract

As the volume of text stored in computer repositories grows every day, a reliable way to group or classify text documents is needed more than ever. Clustering introduces a form of organization to the data and can also highlight significant patterns and trends. Document clustering is used in many fields, such as data mining and information retrieval. This paper presents the results of an experimental study of some common document clustering techniques. In particular, we compare the two main approaches to document clustering: agglomerative hierarchical clustering, represented by Modified BIRCH, and partitional clustering, represented by K-means. By comparing the two algorithms, we attempt to identify an appropriate clustering technique for producing high-quality clusterings of real-world documents.
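To make the partitional side of this comparison concrete, the following is a minimal sketch of K-means over TF-IDF document vectors, using cosine similarity. It is not the authors' implementation and does not cover Modified BIRCH. The whitespace tokenizer, the smoothed IDF formula, the fixed seed documents, and all function names (`tfidf_vectors`, `normalise`, `cosine`, `kmeans`) are assumptions made for illustration only.

```python
import math
from collections import Counter

# Toy corpus (illustrative only, not data from the paper).
DOCS = [
    "clustering groups similar text documents",
    "document clustering organizes text documents",
    "football team wins the league match",
    "the team lost the football match",
]


def normalise(vec):
    """Scale a sparse vector (term -> weight) to unit L2 length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else dict(vec)


def tfidf_vectors(docs):
    """Return one unit-length sparse TF-IDF vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumed variant) keeps every weight positive.
        vec = {
            t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
            for t, c in tf.items()
        }
        vectors.append(normalise(vec))
    return vectors


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (a dot product)."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def kmeans(vectors, seed_indices, max_iter=20):
    """Cluster vectors; centroids start as copies of the seed documents."""
    centroids = [dict(vectors[i]) for i in seed_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document goes to its most similar centroid.
        new_labels = [
            max(range(len(centroids)), key=lambda j: cosine(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the normalised sum of its members.
        for j in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if not members:
                continue  # keep the old centroid if a cluster is empty
            total = Counter()
            for v in members:
                for t, w in v.items():
                    total[t] += w
            centroids[j] = normalise(total)
    return labels


if __name__ == "__main__":
    print(kmeans(tfidf_vectors(DOCS), seed_indices=(0, 2)))
```

With documents 0 and 2 as seeds, the two clustering-related documents and the two football-related documents fall into separate clusters, because the two groups share no terms. Fixed seeds keep the run reproducible; a real comparison would typically repeat K-means over several initializations and measure running time alongside cluster quality.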

Index Terms

Computer Science

Information Sciences

Keywords

Document clustering, BIRCH, K-means, Support Vector Model, Matrix Representation