Review of Clustering Techniques for Finding the Similarity in Articles

Usha Rani; Shashank Sahu

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 155 - Number 6

Year of Publication: 2016

Authors: Usha Rani, Shashank Sahu

DOI: 10.5120/ijca2016912329

Usha Rani, Shashank Sahu. Review of Clustering Techniques for Finding the Similarity in Articles. International Journal of Computer Applications. 155, 6 (Dec 2016), 32-35. DOI=10.5120/ijca2016912329

@article{ 10.5120/ijca2016912329,
author = { Usha Rani and Shashank Sahu },
title = { Review of Clustering Techniques for Finding the Similarity in Articles },
journal = { International Journal of Computer Applications },
issue_date = { Dec 2016 },
volume = { 155 },
number = { 6 },
month = { Dec },
year = { 2016 },
issn = { 0975-8887 },
pages = { 32-35 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume155/number6/26610-2016912329/ },
doi = { 10.5120/ijca2016912329 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article
%1 2024-02-07T00:00:34.024658+05:30
%A Usha Rani
%A Shashank Sahu
%T Review of Clustering Techniques for Finding the Similarity in Articles
%J International Journal of Computer Applications
%@ 0975-8887
%V 155
%N 6
%P 32-35
%D 2016
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Clustering is an important technique in data mining. It groups items into clusters such that items in the same cluster are more similar to one another than to items in other clusters. The aim of document clustering is to partition a given set of documents into clusters such that the documents within each cluster are more similar to one another than to the documents in other clusters. This paper reviews clustering techniques, which fall mainly into two groups: hierarchical and partitional clustering.
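
As an illustration of the two families named in the abstract, the following minimal Python sketch clusters four toy documents. It is not taken from the paper: the whitespace tokenizer, the cosine similarity measure, average linkage for the hierarchical variant, the naive "first k documents" centroid initialization for the partitional variant, and all function names are assumptions chosen only to keep the example short.

```python
import math
from collections import Counter

def tf_vector(text):
    """Bag-of-words term-frequency vector (assumed whitespace tokenizer)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def agglomerative(docs, k):
    """Hierarchical (bottom-up) clustering: repeatedly merge the two most
    similar clusters, using average linkage, until k clusters remain."""
    vecs = [tf_vector(d) for d in docs]
    clusters = [[i] for i in range(len(docs))]

    def link(c1, c2):
        total = sum(cosine(vecs[i], vecs[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        pairs = ((x, y) for x in range(len(clusters))
                 for y in range(x + 1, len(clusters)))
        x, y = max(pairs, key=lambda p: link(clusters[p[0]], clusters[p[1]]))
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]

def kmeans(docs, k, iters=10):
    """Partitional clustering: a k-means variant that assigns each document
    to its most cosine-similar centroid, then recomputes the centroids."""
    vecs = [tf_vector(d) for d in docs]
    centroids = [Counter(v) for v in vecs[:k]]  # naive init: first k docs
    labels = None
    for _ in range(iters):
        new_labels = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                      for v in vecs]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [vecs[i] for i, lab in enumerate(labels) if lab == c]
            if members:
                # Summed counts suffice: cosine similarity ignores scale.
                centroids[c] = sum(members, Counter())
    return labels

docs = [
    "stock market prices fall",
    "football match final score",
    "stock prices rise market",
    "football team wins match",
]
print(agglomerative(docs, 2))  # [[0, 2], [1, 3]]
print(kmeans(docs, 2))         # [0, 1, 0, 1]
```

Both variants group the two finance documents together and the two football documents together. The hierarchical version returns clusters as lists of document indices, while the partitional version returns one cluster label per document; real document collections would normally need TF-IDF weighting, stop-word removal, and a better centroid initialization than this sketch uses.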

Index Terms

Computer Science

Information Sciences

Keywords

Clustering, Hierarchical clustering, Partitional clustering