Research Article

Towards the new Similarity Measures in Application of Machine Learning Techniques on Agriculture Dataset

by Bhagirath Parshuram Prajapati, Dhaval R. Kathiriya
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 156 - Number 11
Year of Publication: 2016
Authors: Bhagirath Parshuram Prajapati, Dhaval R. Kathiriya
DOI: 10.5120/ijca2016912571

Bhagirath Parshuram Prajapati, Dhaval R. Kathiriya. Towards the new Similarity Measures in Application of Machine Learning Techniques on Agriculture Dataset. International Journal of Computer Applications. 156, 11 (Dec 2016), 38-41. DOI=10.5120/ijca2016912571

@article{ 10.5120/ijca2016912571,
author = { Bhagirath Parshuram Prajapati and Dhaval R. Kathiriya },
title = { Towards the new Similarity Measures in Application of Machine Learning Techniques on Agriculture Dataset },
journal = { International Journal of Computer Applications },
issue_date = { Dec 2016 },
volume = { 156 },
number = { 11 },
month = { Dec },
year = { 2016 },
issn = { 0975-8887 },
pages = { 38-41 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume156/number11/26757-2016912571/ },
doi = { 10.5120/ijca2016912571 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:02:23.502063+05:30
%A Bhagirath Parshuram Prajapati
%A Dhaval R. Kathiriya
%T Towards the new Similarity Measures in Application of Machine Learning Techniques on Agriculture Dataset
%J International Journal of Computer Applications
%@ 0975-8887
%V 156
%N 11
%P 38-41
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

k-Nearest Neighbor (kNN) is a simple and effective classification method. Its primary idea is to calculate the distance from a query point to every classified data point and to choose the class that occurs most often among the k closest neighbors. Euclidean distance and cosine similarity are the most common choices of similarity metric. Apart from these two, various other similarity measures are available and are used to calculate similarity in an n-dimensional vector space model for classification. Similarity calculation is a complex operation whose computation time grows as the vector dimension increases. Hence this paper explores the usefulness of nine different similarity measures in kNN and presents their experimental results on an agriculture dataset. We also compared the time required to finish the classification task and concluded that I-divergence takes the least time among the measures compared.
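The kNN procedure described in the abstract can be sketched as a majority vote over the k nearest points, with the similarity measure passed in as a function. The following is only a minimal illustration, not the authors' implementation: the function names, the toy data, and the particular form of I-divergence (the generalized Kullback-Leibler form, which assumes strictly positive feature values) are assumptions, since the abstract does not give the formulas used in the paper.

```python
import math
from collections import Counter


def euclidean(a, b):
    # Square root of the summed squared coordinate differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    # Largest absolute coordinate difference.
    return max(abs(x - y) for x, y in zip(a, b))


def i_divergence(a, b):
    # Assumed form: generalized Kullback-Leibler (I-) divergence,
    # sum(x * log(x / y) - x + y). Requires strictly positive components.
    return sum(x * math.log(x / y) - x + y for x, y in zip(a, b))


def knn_classify(query, points, labels, k, distance):
    # Rank all labeled points by distance to the query, then take a
    # majority vote among the k closest ones.
    ranked = sorted(zip(points, labels), key=lambda pl: distance(query, pl[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated classes in two dimensions.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.8), (4.9, 5.1)]
labels = ["A", "A", "A", "B", "B", "B"]

for measure in (euclidean, manhattan, chebyshev, i_divergence):
    print(measure.__name__, knn_classify((1.1, 1.0), points, labels, 3, measure))
```

Passing the measure as a parameter keeps the voting logic identical across measures, so any difference in running time comes from the distance computation itself. This sketch makes no attempt to reproduce the timing results reported in the paper.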

Index Terms

Computer Science
Information Sciences

Keywords

Euclidean, Manhattan, Minkowski, Canberra, Chebyshev, Cosine, Correlation, Chi-square, I-divergence