Research Article

Bucketization based Flow Classification Algorithm for Data Stream Privacy Mining

by G. Kesavaraj, S. Sukumaran
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 81 - Number 12
Year of Publication: 2013
Authors: G. Kesavaraj, S. Sukumaran
10.5120/14063-2245

G. Kesavaraj, S. Sukumaran. Bucketization based Flow Classification Algorithm for Data Stream Privacy Mining. International Journal of Computer Applications. 81, 12 (November 2013), 13-18. DOI=10.5120/14063-2245

@article{ 10.5120/14063-2245,
author = { G. Kesavaraj and S. Sukumaran },
title = { Bucketization based Flow Classification Algorithm for Data Stream Privacy Mining },
journal = { International Journal of Computer Applications },
issue_date = { November 2013 },
volume = { 81 },
number = { 12 },
month = { November },
year = { 2013 },
issn = { 0975-8887 },
pages = { 13-18 },
numpages = {6},
url = { https://ijcaonline.org/archives/volume81/number12/14063-2245/ },
doi = { 10.5120/14063-2245 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:55:52.515017+05:30
%A G. Kesavaraj
%A S. Sukumaran
%T Bucketization based Flow Classification Algorithm for Data Stream Privacy Mining
%J International Journal of Computer Applications
%@ 0975-8887
%V 81
%N 12
%P 13-18
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In recent years, data mining has played a major role in managing huge volumes of data and deriving useful information from them. As data is generated in ever larger quantities, it must be processed at a rate that keeps pace with its growth, yet extracting meaningful information from a continuous stream of data is difficult. Data-stream mining is a method for discovering important information in large amounts of historical data. To identify useful information, continuous data streams are classified. Current approaches to classifying data streams rely on supervised learning algorithms, which must be trained on labeled data. Manual labeling of data is usually both expensive and time consuming. As a result, where massive amounts of data arrive at high speed, labeled data may be very sparse. Only a limited amount of training data may therefore be available for constructing the classification models, which tends to produce poorly trained classifiers. To overcome this issue, this work presents a novel technique that builds a classification model from both unlabeled instances and a small number of labeled instances. The model is built using the Flow Classification Algorithm (FCA). Before classification, the correlated attributes in each record set are grouped using a bucketization technique. The FC algorithm is able to judge internally, on a set of labeled data, whether the quality of the models updated from them is sufficient for using unlabeled records or whether more labeled records are needed for classification. The proposed FC technique is evaluated experimentally against its counterparts in terms of execution time, classification accuracy and security. The performance metrics show that the security level of the proposed FCA technique is 10-15% higher than that of existing work.
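
The abstract does not specify how correlated attributes are grouped into buckets. As an illustration only, and not the authors' FCA implementation, the following minimal Python sketch assumes numeric attributes, Pearson correlation as the correlation measure, and a greedy grouping rule. The function names, the 0.8 threshold, and the sample records are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant column is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def bucketize_attributes(records, threshold=0.8):
    """Greedily group attribute names into buckets.

    An attribute joins the first bucket whose first member it correlates
    with (absolute Pearson correlation >= threshold); otherwise it starts
    a new bucket. The threshold is an illustrative choice, not a value
    taken from the paper.
    """
    names = list(records[0].keys())
    columns = {name: [r[name] for r in records] for name in names}
    buckets = []
    for name in names:
        for bucket in buckets:
            if abs(pearson(columns[name], columns[bucket[0]])) >= threshold:
                bucket.append(name)
                break
        else:
            buckets.append([name])
    return buckets


# Hypothetical records: age and income rise together, zip_digit does not.
records = [
    {"age": 25, "income": 30, "zip_digit": 7},
    {"age": 35, "income": 42, "zip_digit": 1},
    {"age": 45, "income": 50, "zip_digit": 8},
    {"age": 55, "income": 61, "zip_digit": 2},
]
print(bucketize_attributes(records))  # [['age', 'income'], ['zip_digit']]
```

In a streaming setting the correlations would have to be maintained incrementally over a window rather than recomputed on a full record set; the sketch ignores that for brevity.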

Index Terms

Computer Science
Information Sciences

Keywords

Flow classification, bucketization, frequent item sets