Hybrid Perturbation Technique using Feature Selection Method for Privacy Preservation in Data Mining

Praveena Priyadarsini; M. L. Valarmathi; S. Sivakumari

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 58 - Number 2

Year of Publication: 2012

Authors: Praveena Priyadarsini, M. L. Valarmathi, S. Sivakumari

DOI: 10.5120/9257-3427

Praveena Priyadarsini, M. L. Valarmathi, S. Sivakumari. Hybrid Perturbation Technique using Feature Selection Method for Privacy Preservation in Data Mining. International Journal of Computer Applications. 58, 2 (November 2012), 34-41. DOI=10.5120/9257-3427

@article{10.5120/9257-3427,
  author = {Praveena Priyadarsini and M. L. Valarmathi and S. Sivakumari},
  title = {Hybrid Perturbation Technique using Feature Selection Method for Privacy Preservation in Data Mining},
  journal = {International Journal of Computer Applications},
  issue_date = {November 2012},
  volume = {58},
  number = {2},
  month = {November},
  year = {2012},
  issn = {0975-8887},
  pages = {34-41},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume58/number2/9257-3427/},
  doi = {10.5120/9257-3427},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T21:01:32.403436+05:30
%A Praveena Priyadarsini
%A M. L. Valarmathi
%A S. Sivakumari
%T Hybrid Perturbation Technique using Feature Selection Method for Privacy Preservation in Data Mining
%J International Journal of Computer Applications
%@ 0975-8887
%V 58
%N 2
%P 34-41
%D 2012
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Privacy preservation in data mining is the area of data mining that seeks to safeguard sensitive information from unsolicited or unsanctioned disclosure, thereby protecting individual data records and their privacy. Data perturbation is a privacy preservation technique that adds noise to, or multiplies noise with, the original data. It performs anonymization based on the data type of the sensitive data. Generalization is a technique in which quasi-identifier values are replaced by more general terms. In this paper, privacy protection is applied to high-dimensional datasets such as Adult and Census. The attributes are ranked using the information gain feature subset selection method, and the high-ranking attributes that carry sensitive information are set as the quasi-identifiers of the datasets. A hybrid perturbation technique is used to perturb the categorical and numeric attributes of both datasets, and the utility of the datasets is measured by accuracy on data mining functionalities. Data distortion is measured using maintenance of Rank of Features (CK) between the original and perturbed datasets. Experimental results show that the utility of the perturbed datasets is comparable with that of the original datasets and that the Census dataset has a CK value comparable to that of the Adult dataset.
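
The pipeline outlined in the abstract (rank attributes by information gain, perturb numeric attributes with noise, generalize categorical attributes, then compare attribute ranks) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the toy records, the generalization hierarchy, the Gaussian noise model, the decade binning of ages, and in particular the reading of CK as the fraction of attributes that keep the same information gain rank position after perturbation. The abstract does not give the paper's exact CK formula or perturbation parameters.

```python
import math
import random
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Class entropy minus class entropy conditioned on one discrete attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def rank_attributes(columns, labels):
    """Attribute names ordered from highest to lowest information gain."""
    return sorted(columns,
                  key=lambda name: information_gain(columns[name], labels),
                  reverse=True)

def add_noise(values, scale, rng):
    """Additive perturbation for a numeric attribute (Gaussian noise is an assumption)."""
    return [v + rng.gauss(0, scale) for v in values]

def generalize(values, hierarchy):
    """Replace each categorical value by its more general term, if one is defined."""
    return [hierarchy.get(v, v) for v in values]

def ck_score(rank_original, rank_perturbed):
    """Assumed CK: share of rank positions holding the same attribute in both rankings."""
    same = sum(a == b for a, b in zip(rank_original, rank_perturbed))
    return same / len(rank_original)

def to_decades(ages):
    """Discretize ages into decades so information gain can be computed on them."""
    return [int(a // 10) for a in ages]

# Hypothetical toy records loosely shaped like the Adult dataset.
rng = random.Random(0)
labels = [">50K", ">50K", "<=50K", "<=50K"]
ages = [45, 52, 23, 30]
education = ["Masters", "Doctorate", "HS-grad", "11th"]
sex = ["F", "M", "F", "M"]
hierarchy = {"Masters": "Graduate", "Doctorate": "Graduate",
             "HS-grad": "School", "11th": "School"}

original = {"education": education, "age": to_decades(ages), "sex": sex}
perturbed = {
    "education": generalize(education, hierarchy),      # categorical: generalization
    "age": to_decades(add_noise(ages, 2.0, rng)),       # numeric: additive noise
    "sex": sex,                                         # left unperturbed
}

rank_before = rank_attributes(original, labels)
rank_after = rank_attributes(perturbed, labels)
print("rank before:", rank_before)
print("rank after: ", rank_after)
print("CK (assumed definition):", ck_score(rank_before, rank_after))
```

Under this reading, a CK of 1.0 means the perturbation left the attribute ranking untouched, while lower values indicate that the ranking shifted. Utility would be checked separately, for example by training the same classifier on the original and perturbed data and comparing accuracy.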

Index Terms

Computer Science

Information Sciences

Keywords

Data mining, Privacy preservation, perturbation, generalization, utility, classifications, clustering, maintenance of Rank of Features