Research Article

PPreDeConStream: A Parallel Version of PreDeConStream Algorithm

by Reza Tashvighi, Alireza Bagheri
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 154 - Number 10
Year of Publication: 2016
Authors: Reza Tashvighi, Alireza Bagheri
10.5120/ijca2016912235

Reza Tashvighi, Alireza Bagheri. PPreDeConStream: A Parallel Version of PreDeConStream Algorithm. International Journal of Computer Applications. 154, 10 (Nov 2016), 7-12. DOI=10.5120/ijca2016912235

@article{ 10.5120/ijca2016912235,
author = { Reza Tashvighi and Alireza Bagheri },
title = { PPreDeConStream: A Parallel Version of PreDeConStream Algorithm },
journal = { International Journal of Computer Applications },
issue_date = { Nov 2016 },
volume = { 154 },
number = { 10 },
month = { Nov },
year = { 2016 },
issn = { 0975-8887 },
pages = { 7-12 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume154/number10/26525-2016912235/ },
doi = { 10.5120/ijca2016912235 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:59:52.274874+05:30
%A Reza Tashvighi
%A Alireza Bagheri
%T PPreDeConStream: A Parallel Version of PreDeConStream Algorithm
%J International Journal of Computer Applications
%@ 0975-8887
%V 154
%N 10
%P 7-12
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Clustering is one of the major techniques in data mining. Clustering of data streams has drawn attention in the past few years because of their ever-growing presence. Data streams add challenges to clustering, such as limited time, limited memory, and the need for one-pass processing. Further, discovering clusters of arbitrary shape is important in data stream applications. At present, only a few clustering techniques exist for data streams in multidimensional spaces, and they rely on projected (or subspace) clustering. Therefore, the task of projected clustering (or subspace clustering) has to be defined. PreDeConStream is a density-based algorithm for clustering high-dimensional data streams. In this paper, PPreDeConStream is presented as a parallel version of the PreDeConStream algorithm in the shared memory model. The theoretical and experimental results show that PPreDeConStream offers nearly linear speedup while keeping the other advantages of PreDeConStream.

References
  1. E. Ikonomovska, S. Loskovska and D. Gjorgjevik, "A Survey of Stream Data Mining," In Proc. the 8th National Conference, pp. 9-25, 2007.
  2. A. Amini, T. Y. Wah and H. Saboohi, "On Density-Based Data Streams Clustering Algorithms: A Survey," Journal of Computer Science and Technology, vol. 29, no. 1, pp. 116-141, 2014.
  3. F. Cao, M. Ester, W. Qian and A. Zhou, "Density-Based Clustering over an Evolving Data Stream with Noise," In Proc. the 2006 SIAM Conference on Data Mining, pp. 328-339, 2006.
  4. C. C. Aggarwal and C. K. Reddy, Data Clustering Algorithms and Applications, Chapman & Hall, 2014.
  5. A. Ntoutsi, A. Zimek, T. Palpanas, P. Kroger and H.-P. Kriegel, "Density-based Projected Clustering over High Dimensional Data Streams," Proceedings of the 2012 SIAM International Conference on Data Mining, pp. 987-998, 2012.
  6. C. Bohm, K. Kailing, H.-P. Kriegel and P. Kroger, "Density Connected Clustering with Local Subspace Preferences," Data Mining, 2004. ICDM '04. Fourth IEEE International Conference, pp. 27-34, 2004.
  7. M. Hassani, A. Tarakji, L. Georgiev and T. Seidl, "Parallel Implementation of a Density-Based Stream Clustering Algorithm Over a GPU Scheduling System," Trends and Applications in Knowledge Discovery and Data Mining, pp. 441-453, 2014.
  8. "The OpenCL Specification Version: 2.0 Document Revision: 26," Khronos OpenCL Working Group, 2014.
  9. M. Hassani, P. Spaus, M. M. Gaber and T. Seidl, "Density-Based Projected Clustering of Data Streams," Scalable Uncertainty Management, vol. 7520, pp. 311-324, 2012.
  10. J. A. Silva, E. R. Faria, R. C. Barros and J. P. Gama, "Data Stream Clustering: A Survey," ACM Computing Surveys (CSUR), vol. 46, no. 1, pp. 1-37, 2013.
  11. R. Biglari and A. Bagheri, "PPreDeCon: A Parallel version of Preference Density Connected Clustering Algorithm," International Journal of Computer Applications (IJCA), vol. 107, no. 1, pp. 22-26, 2014.
  12. "OpenMP Application Program Interface Version 4.0," July 2013. [Online]. Available: www.openmp.org. [Accessed 17 10 2015].
  13. "KDD Cup 1999 Data," The UCI KDD Archive, 28 October 1999. [Online]. Available: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html. [Accessed 2 8 2015].
  14. N. Gershenfeld and A. Weigend, "The Santa Fe Time Series Competition Data," Addison-Wesley, 1994. [Online]. Available: http://www-psych.stanford.edu/~andreas/Time-Series/SantaFe.html#setB. [Accessed 1 11 2015].
Index Terms

Computer Science
Information Sciences

Keywords

Clustering, data stream algorithms, parallel algorithms, microcluster, density-based clustering, shared memory model.