Research Article

From Perturbation Data, Regenerate of Data in Matlab

by Fehreen Hasan, Niranjan Singh
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 14 - Number 7
Year of Publication: 2011
Authors: Fehreen Hasan, Niranjan Singh
10.5120/1898-2529

Fehreen Hasan, Niranjan Singh. From Perturbation Data, Regenerate of Data in Matlab. International Journal of Computer Applications. 14, 7 (February 2011), 7-10. DOI=10.5120/1898-2529

@article{ 10.5120/1898-2529,
author = { Fehreen Hasan and Niranjan Singh },
title = { From Perturbation Data, Regenerate of Data in Matlab },
journal = { International Journal of Computer Applications },
issue_date = { February 2011 },
volume = { 14 },
number = { 7 },
month = { February },
year = { 2011 },
issn = { 0975-8887 },
pages = { 7-10 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume14/number7/1898-2529/ },
doi = { 10.5120/1898-2529 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:02:44.501706+05:30
%A Fehreen Hasan
%A Niranjan Singh
%T From Perturbation Data, Regenerate of Data in Matlab
%J International Journal of Computer Applications
%@ 0975-8887
%V 14
%N 7
%P 7-10
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The main contribution of this paper is an algorithm that accurately reconstructs the community joint density from perturbed multidimensional stream data. Once the joint density has been reconstructed, any statistical question about the community can be answered from it. Many earlier efforts have addressed reconstruction of the community distribution. This work is motivated by information privacy, which has become one of the most important issues today. We consider several techniques for masking data by random distortion, including uniform and Gaussian noise, applied to the data in order to protect it. After applying data recovery techniques, we examine the distribution of the recovered data. Our task is to determine whether the distributions of the original and recovered data are close to each other regardless of the nature of the noise applied. We use an ensemble clustering method to reconstruct the initial data distribution. The algorithms are implemented in MATLAB, the “language of choice in industrial world”.
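
The perturb-then-reconstruct workflow described in the abstract can be illustrated with a small sketch. It is written in Python using only the standard library, and it does not implement the paper's ensemble clustering method. Instead it assumes the iterative Bayesian reconstruction in the style of Agrawal and Srikant (reference 4), applied to a one-dimensional histogram. The data set, bin range, noise scale, and all function names (`reconstruct`, `demo`, and so on) are hypothetical choices made for illustration.

```python
import math
import random


def gaussian_pdf(x, sigma):
    """Density of zero-mean Gaussian noise with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def uniform_pdf(x, half_width):
    """Density of uniform noise on [-half_width, half_width]."""
    return 1.0 / (2.0 * half_width) if abs(x) <= half_width else 0.0


def histogram(values, lo, hi, bins):
    """Normalized histogram; out-of-range values are clamped to the edge bins."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        k = min(max(int((v - lo) / width), 0), bins - 1)
        counts[k] += 1
    return [c / len(values) for c in counts]


def reconstruct(perturbed, noise_pdf, lo, hi, bins=20, iters=60):
    """Estimate the original bin probabilities from perturbed values.

    Assumed technique: iterative Bayesian updates over bin midpoints,
    using the known noise density (not the paper's clustering method).
    """
    width = (hi - lo) / bins
    mids = [lo + (k + 0.5) * width for k in range(bins)]
    # Likelihood of each observation under each candidate bin, computed once.
    like = [[noise_pdf(w - m) for m in mids] for w in perturbed]
    p = [1.0 / bins] * bins  # start from a flat estimate
    for _ in range(iters):
        new_p = [0.0] * bins
        for row in like:
            post = [lik * pk for lik, pk in zip(row, p)]
            total = sum(post)
            if total > 0.0:  # skip observations that no bin can explain
                for k in range(bins):
                    new_p[k] += post[k] / total
        norm = sum(new_p)
        p = [v / norm for v in new_p]
    return p


def l1_distance(p, q):
    """Sum of absolute differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(p, q))


def demo(seed=0, n=1500, sigma=10.0):
    """Perturb hypothetical data with Gaussian noise, then reconstruct."""
    rng = random.Random(seed)
    # Hypothetical original data: a two-component mixture on [0, 100].
    original = [
        rng.gauss(25.0, 3.0) if rng.random() < 0.5 else rng.gauss(60.0, 5.0)
        for _ in range(n)
    ]
    perturbed = [x + rng.gauss(0.0, sigma) for x in original]
    true_hist = histogram(original, 0.0, 100.0, 20)
    naive_hist = histogram(perturbed, 0.0, 100.0, 20)
    recon = reconstruct(perturbed, lambda d: gaussian_pdf(d, sigma), 0.0, 100.0, 20)
    return l1_distance(true_hist, naive_hist), l1_distance(true_hist, recon), recon
```

Calling `demo()` returns the L1 distance between the original and perturbed histograms, the L1 distance between the original and reconstructed histograms, and the reconstructed bin probabilities. To try uniform distortion instead, generate the noise with `rng.uniform(-h, h)` and pass `lambda d: uniform_pdf(d, h)` as the noise density.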

References
  1. University of California, Berkeley (2000). How much information? http://www.sims.berkeley.edu/research/projects/how-much-info/internet.html
  2. Joseph Turow, 2003, Americans and Online Privacy, The System Is Broken. http://www.asc.upenn.edu/usr/jturow/internet-privacy-report/36-page-turow-version-9.pdf
  3. Alexandre Evfimievski. Randomization in Privacy Preserving Data Mining
  4. R. Agrawal and R. Srikant. Privacy-preserving data mining. 2000. ACM Press.
  5. H. Kargupta, S. Datta, Qi Wang and Krishnamoorthy Sivakumar. Random Data Perturbation Techniques and Privacy Preserving Data Mining.
  6. D. Agrawal and C. C. Aggarwal. On the design and quantification of privacy preserving data mining algorithms.
  7. A. Weingessel, E. Dimitriadou, K. Hornik. An Ensemble Method for Clustering. DSC 2003 Working Papers.
  8. S. Evfimievski. Randomization techniques for privacy preserving association rule mining.
  9. E. Dimitriadou, A. Weingessel, K. Hornik. Voting in Clustering and Finding the Number of Clusters.
  10. E. Dimitriadou, A. Weingessel, K. Hornik. A Voting-Merging Clustering Algorithm.
  11. Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise.
  12. Agrawal, D., Aggarwal, C.C.: On the design and quantification of privacy preserving data mining algorithms. In: Proc. of ACM SIGMOD. (2001) 247–255
  13. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society B39 (1977) 1–38
  14. Bishop, C.M.: Pattern Recognition and Machine Learning. Springer, New York (2006)
  15. Kelsey, J., Schneier, B., Wagner, D., Hall, C.: Cryptanalytic attacks on pseudorandom number generators. In: FSE ’98: Proceedings of the 5th International Workshop on Fast Software Encryption, London, UK, Springer-Verlag (1998) 168–188
Index Terms

Computer Science
Information Sciences

Keywords

Perturbation Data, Regenerate of Data, distribution reconstruction, information privacy, random distortion, recovered data