Research Article

Comparison of Naive Bayesian and K-NN Classifier

by Deepak Kanojia, Mahak Motwani
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 65 - Number 23
Year of Publication: 2013
DOI: 10.5120/11228-6545

Deepak Kanojia, Mahak Motwani. Comparison of Naive Bayesian and K-NN Classifier. International Journal of Computer Applications 65, 23 (March 2013), 40-45. DOI=10.5120/11228-6545

@article{10.5120/11228-6545,
  author     = {Deepak Kanojia and Mahak Motwani},
  title      = {Comparison of Naive Bayesian and K-NN Classifier},
  journal    = {International Journal of Computer Applications},
  issue_date = {March 2013},
  volume     = {65},
  number     = {23},
  month      = {March},
  year       = {2013},
  issn       = {0975-8887},
  pages      = {40-45},
  numpages   = {6},
  url        = {https://ijcaonline.org/archives/volume65/number23/11228-6545/},
  doi        = {10.5120/11228-6545},
  publisher  = {Foundation of Computer Science (FCS), NY, USA},
  address    = {New York, USA}
}
%0 Journal Article
%A Deepak Kanojia
%A Mahak Motwani
%T Comparison of Naive Bayesian and K-NN Classifier
%J International Journal of Computer Applications
%@ 0975-8887
%V 65
%N 23
%P 40-45
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In this paper, the k-nearest neighbor (K-NN) and naïve Bayesian classifiers are compared on subsets of features established with a sequential feature selection method. Four categories of subsets are used in the experiments, namely life and medical science transcripts, arts and humanities transcripts, social science transcripts, and physical science transcripts, to classify the data and to show that the K-NN classifier is competitive with the naïve Bayesian classifier. The classification performance of the K-NN classifier is far better than that of the naïve Bayesian classifier when the learning parameters and the number of samples are small, but as the number of samples increases, the naïve Bayesian classifier outperforms the K-NN classifier. On the other hand, the naïve Bayesian classifier is much better than the K-NN classifier when computational demand and memory requirements are considered. This paper demonstrates the strength of the naïve Bayesian classifier for classification and summarizes some of the most important developments in naïve Bayesian and k-nearest neighbor classification research. Specifically, the issues of posterior probability estimation, the link between naïve Bayesian and K-NN classifiers, the learning and generalization trade-off in classification, feature variable selection, and the effect of misclassification costs are examined. The purpose is to provide a synthesis of the published research in this area and to stimulate further research interest and effort in the identified topics.
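The experimental setup can be reproduced in outline with off-the-shelf tools. The Python sketch below is a minimal illustration, not the authors' code: it assumes scikit-learn, a synthetic dataset in place of the transcript corpora, k = 5 neighbors, and an 8-feature subset, and it wraps a greedy forward sequential feature selector around each classifier before scoring on held-out data.

from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the transcript data (assumption: the paper's
# experiments used text-derived feature vectors instead).
X, y = make_classification(n_samples=400, n_features=40, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=0)

for name, clf in [("naive Bayes", GaussianNB()),
                  ("K-NN (k=5)", KNeighborsClassifier(n_neighbors=5))]:
    # Greedy forward selection of an 8-feature subset, wrapped around
    # the classifier under test, then refit on the selected subset.
    model = make_pipeline(
        SequentialFeatureSelector(clf, n_features_to_select=8,
                                  direction="forward"),
        clf,
    )
    model.fit(X_train, y_train)
    print(f"{name}: test accuracy = {model.score(X_test, y_test):.3f}")

With few training samples a run like this tends to mirror the trade-off discussed above: K-NN is competitive or better on accuracy, while naive Bayes is far cheaper at prediction time because it stores only per-class summary statistics rather than the whole training set.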

Index Terms

Computer Science
Information Sciences

Keywords

Naïve Bayesian classifier, K-NN classifier, classification, ensemble methods, feature variable selection, learning and generalization, misclassification costs, k-means clustering