Privacy Preserving Data Mining: A Comprehensive Survey

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2017
Ritika Lohiya, Ankita Mandowara, Rushabh Raolji

Ritika Lohiya, Ankita Mandowara and Rushabh Raolji. Privacy Preserving Data Mining: A Comprehensive Survey. International Journal of Computer Applications 161(6):30-38, March 2017.

BibTeX

	author = {Ritika Lohiya and Ankita Mandowara and Rushabh Raolji},
	title = {Privacy Preserving Data Mining: A Comprehensive Survey},
	journal = {International Journal of Computer Applications},
	issue_date = {March 2017},
	volume = {161},
	number = {6},
	month = {Mar},
	year = {2017},
	issn = {0975-8887},
	pages = {30-38},
	numpages = {9},
	url = {},
	doi = {10.5120/ijca2017913220},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}


Privacy preserving data mining (PPDM) has emerged because organizations make heavy use of data to extract knowledge [1]. Big data applications mine knowledge from centralized as well as distributed data. Preserving the privacy of data has become critical because of malicious users and societal concerns, and it is now crucial to maintain a balance between ensuring privacy and extracting knowledge. The area remains an active research domain because no single technique has yet been shown to outperform all the others in privacy preserving data mining. Privacy preservation techniques are classified into categories such as data modification, data distribution, data hiding and data encryption. Performance is evaluated using criteria such as information loss, computational overhead and data utility. Data modification techniques mainly add errors to the data or to the results, which degrades the accuracy of the data mining algorithm. Where critical analysis of data is required, cryptographic approaches have been adopted; they incur no loss of information but add computation and communication overhead. Cryptographic PPDM includes homomorphic encryption, Shamir's secret sharing scheme, oblivious transfer and many other cryptographic techniques. Challenges in this area include high computational and communication cost. Finally, the most advanced concept, functional encryption, is covered in the context of privacy preservation. Functional encryption provides a higher level of security as well as privacy for data: it allows only the output of a function to be learned, without revealing anything else.
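The trade-off described in the abstract, where data modification loses information while cryptographic approaches stay exact at the price of extra computation and communication, can be illustrated with a minimal Python sketch. It is not a protocol taken from the survey: the function names (`make_shares`, `reconstruct`, `secure_sum`, `perturbed_sum`), the field prime, and the Gaussian noise model are all assumptions chosen for illustration. The sketch computes a sum over private values in two ways, once with Shamir's (t, n) secret sharing and once with additive noise.

```python
import random

# Illustrative field size (assumption): the Mersenne prime 2**31 - 1.
# Secrets and their sum must stay below this value.
PRIME = 2_147_483_647


def make_shares(secret, threshold, num_parties, rng):
    """Split `secret` into `num_parties` Shamir shares; any `threshold` rebuild it."""
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret % PRIME] + [rng.randrange(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_parties + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares):
    """Recover the secret by Lagrange interpolation at x = 0 over GF(PRIME)."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


def secure_sum(private_values, threshold, rng):
    """Each party shares its value; parties add the shares they hold locally."""
    n = len(private_values)
    all_shares = [make_shares(v, threshold, n, rng) for v in private_values]
    summed = []
    for k in range(n):
        x = all_shares[0][k][0]
        y = sum(party_shares[k][1] for party_shares in all_shares) % PRIME
        summed.append((x, y))
    # Shares of a sum are sums of shares, so interpolation yields the exact total.
    return reconstruct(summed[:threshold])


def perturbed_sum(private_values, noise_scale, rng):
    """Data modification: each value is masked with Gaussian noise before summing."""
    return sum(v + rng.gauss(0.0, noise_scale) for v in private_values)


if __name__ == "__main__":
    values = [120, 75, 240]  # hypothetical private inputs of three parties
    print("secret-shared sum:", secure_sum(values, 2, random.Random(0)))
    print("perturbed sum:   ", perturbed_sum(values, 5.0, random.Random(0)))
```

For the three hypothetical inputs, `secure_sum` returns exactly 435, whereas `perturbed_sum` returns a value near 435 that depends on the noise drawn. The secret-shared variant needs every party to send a share to every other party, which is where the communication overhead mentioned in the abstract comes from.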


  1. Ontario. Information and Privacy Commissioner, and Ann Cavoukian. Data mining: Staking a claim on your privacy. 1997.
  2. Liu Yu, Dapeng L, et al. Survey of research on anonymization technology in data publication. Computer Application, pp. 2361-2364, 2009.
  3. Verykios, Vassilios S., et al. "State-of-the-art in privacy preserving data mining." ACM SIGMOD Record 33.1 (2004): 50-57.
  4. R. Agrawal and S. Ramakrishanan. Privacy preserving data mining. ACM SIGMOD Record, 2004.
  5. Zhou Shui-Geng, Li Feng, Tao Yu-Fei, Xiao-Kui. Privacy Preservation in Database Applications: A Survey. Chinese Journal of Computers, 2009.
  6. Yan Zhao, Ming Du, Jiajin Le, Yongcheng Luo. A Survey on Privacy Preserving Approaches in Data Publishing. First International Workshop on Database Technology and Applications, 2009.
  7. A. Shamir. How to share a secret. Communications of the ACM, 22(11):612-613, November 1979.
  8. A. C. Yao. Protocols for secure computations (extended abstract). In 23rd Annual Symposium on Foundations of Computer Science. IEEE, 1982.
  9. J. Benaloh. Dense probabilistic encryption. Citeseer .ist . /benaloh94dense.html, 1994.
  10. Oliveira, Stanley RM, and Osmar R. Zaiane. "Privacy preserving frequent itemset mining." Proceedings of the IEEE international conference on Privacy, security and data mining-Volume 14. Australian Computer Society, Inc., 2002.
  11. P. Paillier. Public-key cryptosystems based on composite degree residuosity classes. In Advances in Cryptology - EUROCRYPT'99, pages 223-238. Springer, 1999.
  12. Xiaolin Z. and Hongjing B. Research on privacy preserving classification data mining based on random perturbation. National conference of Information. Vol. 1, No. 1, 2010.
  13. Kamakhi P. and Vinnaiya Babu. Preserving privacy and sharing the data using classification on perturbed data. IJSCE. Vol. 2, No. 3, 2010.
  14. Jagannathan, Geetha, and Rebecca N. Wright. "Privacy-preserving distributed k-means clustering over arbitrarily partitioned data." Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining. ACM, 2005.
  15. Kantarcioglu, Murat, and Chris Clifton. "Privacy preserving distributed mining of association rules on horizontally partitioned data." IEEE Transactions on Knowledge and Data Engineering 16.9 (2004): 1026-1037.
  16. Patel, Sankita, Sweta Garasia, and Devesh Jinwala. "An Efficient Approach for Privacy Preserving Distributed K-Means Clustering Based on Shamir's Secret Sharing Scheme." Trust Management VI. Springer Berlin Heidelberg, 2012.
  17. Kantarcioglu, Murat, Jaideep Vaidya, and C. Clifton. "Privacy preserving naive bayes classifier for horizontally partitioned data." IEEE ICDM workshop on privacy preserving data mining. 2003.
  18. Vaidya, Jaideep, and Chris Clifton. "Privacy-preserving k-means clustering over vertically partitioned data." Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2003.
  19. Samarati, Pierangela. "Protecting respondents' identities in microdata release." Knowledge and Data Engineering, IEEE Transactions on 13.6 (2001): 1010-1027.
  20. Duan, Yitao, and John F. Canny. "Practical Private Computation and Zero-Knowledge Tools for Privacy-Preserving Distributed Data Mining." SDM. 2008.
  21. Friedman, Arik, Assaf Schuster, and Ran Wolff. "k-Anonymous decision tree induction." Knowledge Discovery in Databases: PKDD 2006. Springer Berlin Heidelberg, 2006. 151-162.
  22. Blanton, Marina. "Achieving full security in privacy-preserving data mining." Privacy, Security, Risk and Trust (PASSAT), 2011 IEEE Third International Conference on Social Computing (SocialCom). IEEE, 2011.
  23. Yang, Bin, et al. "Collusion-resistant privacy-preserving data mining." Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2010.
  24. Xu, Zhuojia, and Xun Yi. "Classification of privacy-preserving distributed data mining protocols." Digital Information Management (ICDIM), 2011 Sixth International Conference on. IEEE, 2011.
  25. Fung, Benjamin CM, Ke Wang, and Philip S. Yu. "Anonymizing classification data for privacy preservation." Knowledge and Data Engineering, IEEE Transactions on 19.5 (2007): 711-725.
  26. Fang, Weiwei, and Bingru Yang. "Privacy preserving decision tree learning over vertically partitioned data." Computer Science and Software Engineering, 2008 International Conference on. Vol. 3. IEEE, 2008.
  27. Dasseni, Elena, et al. "Hiding association rules by using confidence and support." Information Hiding. Springer Berlin Heidelberg, 2001.
  28. Lin, Zhenmin, and Jerzy W. Jaromczyk. "Privacy preserving two-party k-means clustering over vertically partitioned dataset." Intelligence and Security Informatics (ISI), 2011 IEEE International Conference on. IEEE, 2011.
  29. Slavkovic, Aleksandra B., Yuval Nardi, and Matthew M. Tibbits. "'Secure' Logistic Regression of Horizontally and Vertically Partitioned Distributed Databases." Data Mining Workshops, 2007. ICDM Workshops 2007. Seventh IEEE International Conference on. IEEE, 2007.
  30. Xiao, Ming-Jun, et al. "Privacy preserving id3 algorithm over horizontally partitioned data." Parallel and Distributed Computing, Applications and Technologies, 2005. PDCAT 2005. Sixth International Conference on. IEEE, 2005.
  31. Xiao, Ming-Jun, et al. "Privacy preserving C4.5 algorithm over horizontally partitioned data." Grid and Cooperative Computing, 2006. GCC 2006. Fifth International Conference. IEEE, 2006.
  32. Inan, Ali, et al. "Privacy preserving clustering on horizontally partitioned data." Data and Knowledge Engineering 63.3 (2007): 646-666.
  33. Pang, Liaojun, et al. "A verifiable (t, n) multiple secret sharing scheme and its analyses." Electronic Commerce and Security, 2008 International Symposium on. IEEE, 2008.
  34. Aggarwal, Charu C., and Philip S. Yu. "A condensation approach to privacy preserving data mining." Advances in Database Technology - EDBT 2004. Springer Berlin Heidelberg, 2004. 183-199.
  35. Reza, M., and Somayyeh Seifi. "Classification and Evaluation the PPDM Techniques by using a data Modification-based framework." IJCSE, Vol. 3, No. 2, Feb (2011).
  36. Pinkas, Benny. "Cryptographic techniques for privacy-preserving data mining." ACM SIGKDD Explorations Newsletter 4.2 (2002).
  37. Pedersen, Thomas Brochmann, Yücel Saygın, and Erkay Savaş. "Secret sharing vs. encryption-based techniques for privacy preserving data mining." (2007).
  38. Taylor, Ronald C. "An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics." BMC Bioinformatics 11.Suppl 12 (2010): S1.
  39. Naveed, Muhammad, et al. "Controlled Functional Encryption." Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security. ACM, November 2014. Beimel, Amos, et al. "Non-Interactive Secure Multiparty Computation." Advances in Cryptology - CRYPTO 2014. Springer Berlin Heidelberg, 2014. 387-404.
  40. Agrawal, Shashank, et al. "Function Private Functional Encryption and Property Preserving Encryption: New Definitions and Positive Results." IACR Cryptology ePrint Archive 2013 (2013): 744.
  41. Attrapadung, Nuttapong, and Benoît Libert. "Functional encryption for public-attribute inner products: Achieving constant-size ciphertexts with adaptive security or support for negation." J. Mathematical Cryptology 5.2 (2012): 115-158.
  42. Barbosa, Manuel, and Pooya Farshim. "Delegatable homomorphic encryption with applications to secure outsourcing of computation." Topics in Cryptology - CT-RSA 2012. Springer Berlin Heidelberg, 2012. 296-312.
  43. Gorbunov, Sergey, Vinod Vaikuntanathan, and Hoeteck Wee. "Functional encryption with bounded collusions via multi-party computation." Advances in Cryptology - CRYPTO 2012. Springer Berlin Heidelberg, 2012. 162-179.
  44. Boneh, Dan, Amit Sahai, and Brent Waters. "Functional encryption: Definitions and challenges." Theory of Cryptography. Springer Berlin Heidelberg, 2011. 253-273.
  45. Yang, Xiaoyuan, Weiyi Cai, and Ping Wei. "Multiple-authority-keys CP-ABE." Communication Software and Networks (ICCSN), 2011 IEEE 3rd International Conference on. IEEE, 2011.
  46. Lewko, Allison, et al. "Fully secure functional encryption: Attribute-based encryption and (hierarchical) inner product encryption." Advances in Cryptology - EUROCRYPT 2010. Springer Berlin Heidelberg, 2010. 62-91.