Phrase based Clustering Scheme of Suffix Tree Document Clustering Model

Anoop Kumar Jain; Satyam Maheshwari

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 63 - Number 10

Year of Publication: 2013

DOI: 10.5120/10504-5273

Anoop Kumar Jain, Satyam Maheshwari. Phrase based Clustering Scheme of Suffix Tree Document Clustering Model. International Journal of Computer Applications. 63, 10 (February 2013), 30-37. DOI=10.5120/10504-5273

@article{10.5120/10504-5273,
  author = {Anoop Kumar Jain and Satyam Maheshwari},
  title = {Phrase based Clustering Scheme of Suffix Tree Document Clustering Model},
  journal = {International Journal of Computer Applications},
  issue_date = {February 2013},
  volume = {63},
  number = {10},
  month = {February},
  year = {2013},
  issn = {0975-8887},
  pages = {30-37},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume63/number10/10504-5273/},
  doi = {10.5120/10504-5273},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T21:14:01.647866+05:30
%A Anoop Kumar Jain
%A Satyam Maheshwari
%T Phrase based Clustering Scheme of Suffix Tree Document Clustering Model
%J International Journal of Computer Applications
%@ 0975-8887
%V 63
%N 10
%P 30-37
%D 2013
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Document clustering is a challenging and active research area in search engine research. Most existing document clustering techniques represent each document by a group of keywords and cluster the documents on those keywords. Document clustering arises from the information retrieval domain: it seeks a grouping of a set of documents such that documents in the same cluster are similar and documents in different clusters are dissimilar. Information retrieval plays an important role in data mining by extracting the information relevant to a user request; it examines file contents and identifies their similarity, and its performance is measured using precision and recall. In this paper we propose a phrase-based clustering scheme built on the Suffix Tree Document Clustering (STDC) model. The proposed algorithm uses the STDC model to represent documents accurately and to measure the similarity between them. Compared with other existing methods, this method of clustering reduces grouping time and improves similarity accuracy.
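The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea behind phrase-based (suffix-tree style) clustering, not the authors' STDC implementation. It assumes whitespace tokenization and a bounded phrase length, and it enumerates word phrases directly instead of building an actual suffix tree; the function names `phrases`, `shared_phrases`, and `phrase_similarity` are hypothetical. Each phrase shared by two or more documents plays the role of a base cluster, and document similarity is taken here as the Jaccard overlap of phrase sets, which is an assumption made for illustration.

```python
from collections import defaultdict


def phrases(text, max_len=4):
    """Return every contiguous word phrase of 1..max_len words in text.

    These are the prefixes of each word-level suffix, i.e. the strings a
    word-level suffix tree would index (truncated to max_len words).
    """
    words = text.lower().split()
    found = set()
    for start in range(len(words)):
        stop = min(start + max_len, len(words))
        for end in range(start + 1, stop + 1):
            found.add(tuple(words[start:end]))
    return found


def shared_phrases(docs, min_docs=2, max_len=4):
    """Map each phrase to the set of document indices that contain it,
    keeping only phrases found in at least min_docs documents."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for phrase in phrases(text, max_len):
            index[phrase].add(doc_id)
    return {p: ids for p, ids in index.items() if len(ids) >= min_docs}


def phrase_similarity(text_a, text_b, max_len=4):
    """Jaccard overlap of the two documents' phrase sets (assumed measure)."""
    pa, pb = phrases(text_a, max_len), phrases(text_b, max_len)
    union = pa | pb
    return len(pa & pb) / len(union) if union else 0.0
```

For example, calling `shared_phrases(["cat ate cheese", "mouse ate cheese too", "cat ate mouse too"])` groups the first two documents under the phrase `("ate", "cheese")`. A real suffix tree would produce the same groupings without the `max_len` cutoff and in time linear in the total text length.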

Index Terms

Computer Science

Information Sciences

Keywords

Clustering Techniques, Document Clustering, Phrase Merging, Suffix Tree