Performance based Analysis and Comparison of Multi-Algorithmic Clustering Techniques

Rajesh N. Phursule; P. C. Bhaskar

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 45 - Number 4

Year of Publication: 2012

Authors: Rajesh N. Phursule, P. C. Bhaskar

10.5120/6770-9056

Rajesh N. Phursule, P. C. Bhaskar. Performance based Analysis and Comparison of Multi-Algorithmic Clustering Techniques. International Journal of Computer Applications. 45, 4 (May 2012), 40-44. DOI=10.5120/6770-9056

@article{ 10.5120/6770-9056,

author = { Rajesh N. Phursule, P. C. Bhaskar },

title = { Performance based Analysis and Comparison of Multi-Algorithmic Clustering Techniques },

journal = { International Journal of Computer Applications },

issue_date = { May 2012 },

volume = { 45 },

number = { 4 },

month = { May },

year = { 2012 },

issn = { 0975-8887 },

pages = { 40-44 },

numpages = {9},

url = { https://ijcaonline.org/archives/volume45/number4/6770-9056/ },

doi = { 10.5120/6770-9056 },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Journal Article

%1 2024-02-06T20:36:45.643487+05:30

%A Rajesh N. Phursule

%A P. C. Bhaskar

%T Performance based Analysis and Comparison of Multi-Algorithmic Clustering Techniques

%J International Journal of Computer Applications

%@ 0975-8887

%V 45

%N 4

%P 40-44

%D 2012

%I Foundation of Computer Science (FCS), NY, USA

Abstract

Clustering documents by word similarity and then searching the text is a major search procedure, widely used for large document collections. Documents can be clustered with many algorithms, such as Nearest Neighbor, K-Means, Hierarchical, and Graph Theoretic clustering [4] [5] [7]. Measuring the performance of these algorithms, in terms of space complexity and execution time, and the quality of their search output, in terms of accuracy and redundancy, is a necessary study [3]. This paper measures the performance of the Nearest Neighbor, K-Means, and Hierarchical agglomerative clustering algorithms on text documents and compares them in terms of space complexity, execution time, accuracy, and redundancy. In particular, the input text document is preprocessed and converted into a document graph represented as a matrix. That document graph is then converted into a relation matrix, which gives the relation (similarity score) between every pair of nodes on a scale from 0 to 1 [2]. The implementation of the three clustering algorithms (Nearest Neighbor, K-Means, and Hierarchical agglomerative) and their results on documents are discussed.
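The pipeline outlined in the abstract (preprocess, build a relation matrix with scores from 0 to 1, then cluster) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how similarity scores are computed, so cosine similarity over term counts is swapped in here, and the stopword list, the `threshold` parameter, and all function names are hypothetical. Only the Nearest Neighbor step is shown.

```python
import math
import re
from collections import Counter

# Illustrative stopword list (an assumption; the paper's list is not given).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "is", "to"}


def tokenize(text):
    """Preprocess: lowercase, keep alphabetic words, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity of two term-count Counters, clamped to [0, 1]."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return min(1.0, dot / norm) if norm else 0.0


def relation_matrix(texts):
    """Pairwise similarity scores between all text units (assumed: cosine)."""
    vectors = [Counter(tokenize(t)) for t in texts]
    n = len(vectors)
    return [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]


def nearest_neighbor_clusters(sim, threshold):
    """Threshold-based nearest neighbor clustering.

    Each item joins the cluster of its most similar earlier item when that
    similarity reaches the threshold; otherwise it starts a new cluster.
    """
    labels = []
    for i in range(len(sim)):
        best = max(range(i), key=lambda j: sim[i][j], default=None)
        if best is not None and sim[i][best] >= threshold:
            labels.append(labels[best])
        else:
            labels.append(max(labels, default=-1) + 1)
    return labels


docs = [
    "K-means clustering of text documents",
    "Hierarchical clustering of text documents",
    "Wireless sensor network security",
]
sim = relation_matrix(docs)
labels = nearest_neighbor_clusters(sim, threshold=0.5)
```

The same relation matrix could feed K-Means or Hierarchical agglomerative clustering, which would let the three algorithms be compared on identical input.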

References

Sholom Weiss, Brian White, and Chidanand Apte, "A Lightweight Document Clustering", IBM T. J. Watson Research Centre, NY 10598, USA.
Ramkrishna Varadrajan and Vagelis Hristidis, "A System for Query Specific Document Summarization", Florida International University.
Michael Steinbach, George Karypis, and Vipin Kumar, "A Comparison of Document Clustering Techniques", University of Minnesota, Technical Report #00-034.
A. K. Jain, M. N. Murthy, and P. J. Flynn, "Data Clustering: A Review", Michigan State University, Indian Institute of Science, and The Ohio State University.
B. King, "Step-wise Clustering Procedures", J. Am. Stat. Assoc. 69, 86–101, 1967.
M. R. Anderberg, "Cluster Analysis for Application", Academic Press, Inc., New York, NY, 1973. Augustson, J.
Abracos and G. Pereira-Lopes, "Statistical methods for retrieving most significant paragraphs in newspaper articles", ACL/EACL Workshop on Intelligent Scalable Text Summarization, 1997.
S. Agrawal, S. Chaudhuri, and G. Das, "DBXplorer: A System For Keyword-Based Search Over Relational Databases", ICDE, 2002.
E. Amitay and C. Paris, "Automatically Summarizing Web Sites - Is there any way around it?", CIKM, 2000.
H. H. Chen, J. J. Kuo, and T. C. Su, "Clustering and Visualization in a Multi-Lingual Multi-Document Summarization System", ECIR, 2003.
G. Erkan and D. R. Radev, "LexRank: Graph-based centrality as salience in text summarization", JAIR, 2004.
J. Goldstein, M. Kantrowitz, V. Mittal, and J. Carbonell, "Summarizing text documents: Sentence selection and evaluation metrics", ACM SIGIR, 1999.
C. Y. Lin, "Improving Summarization Performance by Sentence Compression - A Pilot Study", IRAL, 2003.
D. Cutting, D. Karger, J. Pedersen, and J. Tukey, "Scatter/Gather: a Cluster-based Approach to Browsing Large Document collections", ACM SIGIR, 1992.
J. Hartigan and M. Wong, "A k-means clustering algorithm", Applied Statistics, 1979.
A. El-Hamdouchi and P. Willet, "Comparison of Hierarchic Agglomerative Clustering Methods for Document Retrieval", The Computer Journal, Vol. 32, No. 3, 1989.

Index Terms

Computer Science

Information Sciences

Keywords

Analysis and Comparison of K-Means, Nearest Neighbor, Agglomerative Hierarchical, Document Graph, Clustering Algorithm