Research Article

Privacy Preserving Data Mining: Techniques, Classification and Implications - A Survey

by Alpa Shah, Ravi Gulati
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 137 - Number 12
Year of Publication: 2016
Authors: Alpa Shah, Ravi Gulati
10.5120/ijca2016909006

Alpa Shah, Ravi Gulati. Privacy Preserving Data Mining: Techniques, Classification and Implications - A Survey. International Journal of Computer Applications. 137, 12 (March 2016), 40-46. DOI=10.5120/ijca2016909006

@article{ 10.5120/ijca2016909006,
author = { Alpa Shah and Ravi Gulati },
title = { Privacy Preserving Data Mining: Techniques, Classification and Implications - A Survey },
journal = { International Journal of Computer Applications },
issue_date = { March 2016 },
volume = { 137 },
number = { 12 },
month = { March },
year = { 2016 },
issn = { 0975-8887 },
pages = { 40-46 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume137/number12/24331-2016909006/ },
doi = { 10.5120/ijca2016909006 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:38:13.342600+05:30
%A Alpa Shah
%A Ravi Gulati
%T Privacy Preserving Data Mining: Techniques, Classification and Implications - A Survey
%J International Journal of Computer Applications
%@ 0975-8887
%V 137
%N 12
%P 40-46
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Privacy has become crucial in knowledge-based applications, and properly integrating individual privacy is essential for data mining operations. Privacy-preserving data mining is important for sectors such as healthcare, pharmaceuticals, research, and security service providers, to name a few. Privacy Preserving Data Mining (PPDM) techniques fall into three main categories: perturbation, secure sum computation, and cryptography-based techniques. Generalized solutions involve a tradeoff between privacy preservation and information loss. The authors present an extensive survey of PPDM techniques and their classification, and give preliminary guidance on which technique to use in specific scenarios.

Index Terms

Computer Science
Information Sciences

Keywords

PPDM, Perturbation, Cryptography, SMC, Randomization, Condensation, Anonymization