Research Article

A Comparison on Performance of Data Mining Algorithms in Classification of Social Network Data

by P.Nancy, Dr.R.Geetha Ramani
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 32 - Number 8
Year of Publication: 2011
Authors: P.Nancy, Dr.R.Geetha Ramani
10.5120/3927-5555

P.Nancy, Dr.R.Geetha Ramani. A Comparison on Performance of Data Mining Algorithms in Classification of Social Network Data. International Journal of Computer Applications. 32, 8 (October 2011), 47-54. DOI=10.5120/3927-5555

@article{ 10.5120/3927-5555,
author = { P.Nancy and Dr.R.Geetha Ramani },
title = { A Comparison on Performance of Data Mining Algorithms in Classification of Social Network Data },
journal = { International Journal of Computer Applications },
issue_date = { October 2011 },
volume = { 32 },
number = { 8 },
month = { October },
year = { 2011 },
issn = { 0975-8887 },
pages = { 47-54 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume32/number8/3927-5555/ },
doi = { 10.5120/3927-5555 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:18:40.642652+05:30
%A P.Nancy
%A Dr.R.Geetha Ramani
%T A Comparison on Performance of Data Mining Algorithms in Classification of Social Network Data
%J International Journal of Computer Applications
%@ 0975-8887
%V 32
%N 8
%P 47-54
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Data mining, the analysis step of the Knowledge Discovery in Databases (KDD) process, is a relatively young, interdisciplinary field of computer science. It is the process of discovering or extracting new patterns from large data sets using methods from statistics and artificial intelligence. It is commonly used in marketing, surveillance, fraud detection, and scientific discovery, and is now gaining ground in social networking. Almost anything on the Internet is open to intensive data mining. Social media covers all aspects of the social side of the Internet that allow us to get in contact and share information with others, and to interact with any number of people anywhere in the world. This paper uses the dataset “Social side of the Internet” from the Pew Research Center. The research explores the impact of the Internet on social group activities using data mining techniques. The original dataset contains 162 attributes, which is very large, so the essential attributes required for the analysis are selected by a feature reduction method. The selected attributes were applied to data mining classification algorithms such as RndTree, ID3, K-NN, C-RT, CS-CRT, C4.5, and CS-MC4. The error rates of these classification algorithms were compared to identify the most effective algorithm for this dataset.
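
The comparison described in the abstract, where several classifiers are ranked by error rate, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy two-attribute records, the simplified k-NN and majority-class predictors, and all function names are hypothetical and are not the paper's dataset, tool, or implementation. The error rate is taken to be the fraction of held-out records that a classifier labels incorrectly.

```python
# Illustrative only: toy data and simplified classifiers,
# not the dataset or the tooling used in the paper.
from collections import Counter


def error_rate(predict, rows):
    """Fraction of rows whose predicted label differs from the true label."""
    wrong = sum(1 for features, label in rows if predict(features) != label)
    return wrong / len(rows)


def make_knn(train, k=3):
    """Return a k-NN predictor that votes among the k closest training rows."""
    def predict(x):
        nearest = sorted(
            train,
            key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)),
        )[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    return predict


def make_majority(train):
    """Return a baseline predictor that always outputs the most common label."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda x: majority


# Hypothetical records: (attribute values, class label).
train = [((1, 1), "yes"), ((1, 2), "yes"), ((2, 1), "yes"), ((2, 2), "yes"),
         ((8, 8), "no"), ((8, 9), "no"), ((9, 8), "no")]
test = [((1.5, 1.5), "yes"), ((9, 9), "no"), ((2, 2.5), "yes"), ((7.5, 8), "no")]

candidates = {
    "k-NN (k=3)": make_knn(train, k=3),
    "majority baseline": make_majority(train),
}

# Compare held-out error rates and pick the classifier with the lowest one.
error_rates = {name: error_rate(clf, test) for name, clf in candidates.items()}
best = min(error_rates, key=error_rates.get)

for name, rate in error_rates.items():
    print(f"{name}: error rate = {rate:.2f}")
print("lowest error rate:", best)
```

On a real dataset the same loop would run over each candidate algorithm after feature reduction, ideally with cross-validation rather than a single held-out split.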

References
  1. Fayyad, U., Piatetsky-Shapiro, G., and Smyth, P. From Data Mining to Knowledge Discovery: An Overview. In Fayyad, U., Piatetsky-Shapiro, G., Smyth, P., and Uthurusamy, R. (eds.), Advances in Knowledge Discovery and Data Mining, MIT Press, 1-36, Cambridge, 1996
  2. Report on “Social side of the Internet” http://pewinternet.org/Reports/2011/The-Social-Side-of-the-Internet.aspx. This website provides a report with detailed information about the social side of the Internet.
  3. Tanagra Data Mining tutorials, http://data-mining-tutorials.blogspot.com/ This website provides detailed information on the basics of Data Mining Algorithms
  4. Dr. Varun Kumar, Luxmi Verma, “Binary Classifiers for Health Care Databases: A Comparative Study of Data Mining Algorithms in the Diagnosis of Breast Cancer,” in IJCST Vol. 1, Issue 2, December 2010
  5. Desouza, K.C. (2001) Artificial intelligence for healthcare management. In Proceedings of the First International Conference on Management of Healthcare and Medical Technology. Enschede, Netherlands: Institute for Healthcare Technology Management.
  6. D. E. Brown, V. Corruble, and C. L. Pittard. A comparison of decision tree classifiers with backpropagation neural networks for multimodal classification problems. Pattern Recognition, 26:953-961, 1993.
  7. J. Catlett. Megainduction: Machine Learning on Very Large Databases. PhD Thesis, University of Sydney, 1991.
  8. M. James. Classification Algorithms. John Wiley, 1985.
  9. T. Cover and P. Hart. Nearest neighbor pattern classification. IEEE Trans. Information Theory, 13:21-27, 1967.
  10. Fayyad, Usama; Gregory Piatetsky-Shapiro, and Padhraic Smyth (1996). "From Data Mining to Knowledge Discovery in Databases". Retrieved 2008-12-17.
  11. Fayyad, U. Data Mining and Knowledge Discovery: Making Sense Out of Data. IEEE Expert, v. 11, no. 5, pp. 20-25, October 1996. Exclusive Ore Inc. The Exclusive Ore Internet Site, http://www.xore.com, 1999.
  12. K. Cios, W. Pedrycz, and R. Swiniarski. Data Mining Methods for Knowledge Discovery. Boston: Kluwer Academic Publishers, 1998
  13. W. Ressom, Rency S. Varghese, Zhen Zhang, Jianhua Xuan, and Robert Clarke. Classification algorithms for phenotype prediction in genomics and proteomics. Frontiers in Bioscience, 2008.
Index Terms

Computer Science
Information Sciences

Keywords

Knowledge discovery in databases, data mining, surveys