Clustering Algorithms for Huge Datasets: A Mathematical Approach

Shyam Mohan J. S.; Shanmugapriya P.

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 181 - Number 49

Year of Publication: 2019

DOI: 10.5120/ijca2019918724

Shyam Mohan J. S., Shanmugapriya P. Clustering Algorithms for Huge Datasets: A Mathematical Approach. International Journal of Computer Applications. 181, 49 (Apr 2019), 58-62. DOI=10.5120/ijca2019918724

@article{ 10.5120/ijca2019918724,
  author = { Shyam Mohan J. S. and Shanmugapriya P. },
  title = { Clustering Algorithms for Huge Datasets: A Mathematical Approach },
  journal = { International Journal of Computer Applications },
  issue_date = { Apr 2019 },
  volume = { 181 },
  number = { 49 },
  month = { Apr },
  year = { 2019 },
  issn = { 0975-8887 },
  pages = { 58-62 },
  numpages = {9},
  url = { https://ijcaonline.org/archives/volume181/number49/30494-2019918724/ },
  doi = { 10.5120/ijca2019918724 },
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-07T01:09:38.293789+05:30
%A Shyam Mohan J. S.
%A Shanmugapriya P.
%T Clustering Algorithms for Huge Datasets: A Mathematical Approach
%J International Journal of Computer Applications
%@ 0975-8887
%V 181
%N 49
%P 58-62
%D 2019
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Identifying clusters in huge datasets is useful for discovering the attributes of a particular dataset and thereby provides insights for effective decision making. In our previous work, we proved the concept of clustering algorithms for huge datasets theoretically by applying small computations to the available datasets. In this paper, we extend that work by applying mathematical calculations to the datasets in order to prove the correctness of our previous work. The proposed method is applied to various datasets, and the K-Means algorithm is proved mathematically. The experimental calculations performed on various clustering algorithms show that our approach provides a new idea for clustering techniques that can be applied to any number of huge and complex datasets.
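For readers unfamiliar with the algorithm named in the abstract, the following is a minimal sketch of standard K-Means (Lloyd's algorithm) and of the within-cluster sum of squared distances that it reduces. It is not the authors' method or their calculations, which this page does not describe. The function names `kmeans` and `sse`, the sample points, and the choice of initial centroids are all illustrative assumptions.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, centroids, max_iter=100):
    """Standard Lloyd iteration (illustrative sketch, not the paper's procedure)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point is assigned to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:  # no centroid moved, so the result is stable
            break
        centroids = new_centroids
    return labels, centroids


def sse(points, labels, centroids):
    """Within-cluster sum of squared Euclidean distances (the K-Means objective)."""
    return sum(dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


# Hypothetical toy data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels, centroids, sse(points, labels, centroids))
```

Passing the initial centroids explicitly keeps the sketch deterministic; in practice the result of K-Means depends on initialization, so a real run on a large dataset would typically try several starting points.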

References

Jain, Anil K., M. Narasimha Murty, and Patrick J. Flynn. "Data clustering: a review." ACM Computing Surveys (CSUR) 31, no. 3 (1999): 264-323.
Senthilnath, J., S. N. Omkar, and V. Mani. "Clustering using firefly algorithm: performance study." Swarm and Evolutionary Computation 1, no. 3 (2011): 164-171.
Kanungo, Tapas, David M. Mount, Nathan S. Netanyahu, Christine D. Piatko, Ruth Silverman, and Angela Y. Wu. "An efficient k-means clustering algorithm: Analysis and implementation." IEEE Transactions on Pattern Analysis and Machine Intelligence 24, no. 7 (2002): 881-892.
Shyam Mohan J. S. and Shanmugapriya P. "Clustering of Huge Datasets using Machine Intelligence Techniques." IJCA, Vol. 181, No. 18, September 2018.
Robson L. F. Cordeiro et al. "Clustering Very Large Multi-dimensional Datasets with MapReduce." ACM KDD'11, August 21–24, 2011, San Diego, California, USA.
Dongkuan Xu et al. "A Comprehensive Survey of Clustering Algorithms." Springer, Ann. Data. Sci., DOI 10.1007/s40745-015-0040-1.
Max Bodoia. "MapReduce Algorithms for k-means Clustering."
Nivranshu Hans et al. "Big Data Clustering Using Genetic Algorithm On Hadoop MapReduce." International Journal of Scientific & Technology Research, Volume 4, Issue 04, April 2015, ISSN 2277-8616.
Sreedhar et al. "Clustering large datasets using K means modified inter and intra clustering (KMI2C) in Hadoop." Journal of Big Data, DOI 10.1186/s40537-017-0087-2, Springer, 2017.

Index Terms

Computer Science

Information Sciences

Keywords

Machine Intelligence, Clustering Algorithms