Efficient Clustering Approach using Statistical Method of Expectation-Maximization

P. Srinivasa Rao; K. Sivarama Krishna; Nagesh Vadaparthi; S. Vani Kumari

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 46 - Number 12

Year of Publication: 2012

DOI: 10.5120/6958-9305

P. Srinivasa Rao, K. Sivarama Krishna, Nagesh Vadaparthi, S. Vani Kumari. Efficient Clustering Approach using Statistical Method of Expectation-Maximization. International Journal of Computer Applications. 46, 12 (May 2012), 1-7. DOI=10.5120/6958-9305

@article{ 10.5120/6958-9305,
  author = { P. Srinivasa Rao and K. Sivarama Krishna and Nagesh Vadaparthi and S. Vani Kumari },
  title = { Efficient Clustering Approach using Statistical Method of Expectation-Maximization },
  journal = { International Journal of Computer Applications },
  issue_date = { May 2012 },
  volume = { 46 },
  number = { 12 },
  month = { May },
  year = { 2012 },
  issn = { 0975-8887 },
  pages = { 1-7 },
  numpages = {9},
  url = { https://ijcaonline.org/archives/volume46/number12/6958-9305/ },
  doi = { 10.5120/6958-9305 },
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T20:39:31.849049+05:30
%A P. Srinivasa Rao
%A K. Sivarama Krishna
%A Nagesh Vadaparthi
%A S. Vani Kumari
%T Efficient Clustering Approach using Statistical Method of Expectation-Maximization
%J International Journal of Computer Applications
%@ 0975-8887
%V 46
%N 12
%P 1-7
%D 2012
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Clustering is the task of grouping the objects in a dataset by similarity. The literature on clustering presents several algorithms for obtaining effective clusters. Among the existing techniques, hierarchical clustering is one of the most widely preferred. Although many algorithms exist, K-Means stands out for hierarchical clustering. Nevertheless, K-Means has a number of limitations, such as its dependence on the initialization of its parameters. To overcome this limitation, we propose using the Expectation-Maximization (E-M) algorithm. K-Means is implemented with the cosine similarity measure, and E-M with a Gaussian Mixture Model. The proposed method has two steps. In the first step, K-Means and E-M are combined to partition the input dataset into several smaller sub-clusters. In the second step, the sub-clusters are merged successively based on a maximized Gaussian measure.
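
As an illustration of the first step only, the following is a minimal sketch of one way K-Means with cosine similarity could seed E-M for a Gaussian Mixture Model. It is not the authors' implementation. All function names are hypothetical, and the farthest-first initialization, the diagonal covariances, and the variance floor are assumptions made to keep the example short and deterministic. The merging step is not shown, since the abstract does not define the Gaussian measure it maximizes.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(len(vectors[0]))]

def farthest_first(points, k):
    """Assumed initialization: start at points[0], then add the least similar point."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(min(points, key=lambda p: max(cosine_similarity(p, c) for c in centroids)))
    return centroids

def cosine_kmeans(points, k, iters=20):
    """K-Means that assigns each point to the centroid with the highest cosine similarity."""
    centroids = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [max(range(k), key=lambda j: cosine_similarity(p, centroids[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean(members)
    return labels

def log_gauss(p, mu, var):
    """Log density of a diagonal-covariance Gaussian at point p."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v) for x, m, v in zip(p, mu, var))

def em_gmm(points, labels, k, iters=30, floor=1e-6):
    """E-M for a diagonal GMM, started from the hard K-Means assignments."""
    n, dim = len(points), len(points[0])
    resp = [[1.0 if labels[i] == j else 0.0 for j in range(k)] for i in range(n)]
    for _ in range(iters):
        # M-step: re-estimate weights, means, and variances from the responsibilities.
        weights, mus, variances = [], [], []
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            mu = [sum(r[j] * p[d] for r, p in zip(resp, points)) / nj for d in range(dim)]
            var = [max(sum(r[j] * (p[d] - mu[d]) ** 2 for r, p in zip(resp, points)) / nj, floor)
                   for d in range(dim)]
            weights.append(nj / n)
            mus.append(mu)
            variances.append(var)
        # E-step: recompute responsibilities in log space for numerical stability.
        for i, p in enumerate(points):
            logs = [math.log(weights[j]) + log_gauss(p, mus[j], variances[j]) for j in range(k)]
            top = max(logs)
            exps = [math.exp(v - top) for v in logs]
            total = sum(exps)
            resp[i] = [e / total for e in exps]
    return weights, mus, variances, resp

# Toy data (made up for illustration): two groups pointing in different directions.
points = [[1.0, 0.1], [0.9, 0.2], [1.1, 0.15], [0.95, 0.05],
          [0.1, 1.0], [0.2, 0.9], [0.15, 1.1], [0.05, 0.95]]
labels = cosine_kmeans(points, k=2)
weights, mus, variances, resp = em_gmm(points, labels, k=2)
```

Starting E-M from the K-Means partition rather than from random parameters is the design choice this sketch is meant to show: the first M-step already has sensible means and variances, so the soft assignments in `resp` refine the hard labels instead of replacing them.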

Index Terms

Computer Science

Information Sciences

Keywords

K-means, Expectation-maximization, Gaussian Mixture Model, Clustering, Similarity Measure