Call for Paper - November 2017 Edition
IJCA solicits original research papers for the November 2017 Edition. Last date of manuscript submission is October 20, 2017. Read More

A Survey on Nearest Neighbor Search Methods

Print
PDF
International Journal of Computer Applications
© 2014 by IJCA Journal
Volume 95 - Number 25
Year of Publication: 2014
Authors:
Mohammad Reza Abbasifard
Bijan Ghahremani
Hassan Naderi
10.5120/16754-7073

Mohammad Reza Abbasifard, Bijan Ghahremani and Hassan Naderi. Article: A Survey on Nearest Neighbor Search Methods. International Journal of Computer Applications 95(25):39-52, June 2014. Full text available. BibTeX

@article{key:article,
	author = {Mohammad Reza Abbasifard and Bijan Ghahremani and Hassan Naderi},
	title = {Article: A Survey on Nearest Neighbor Search Methods},
	journal = {International Journal of Computer Applications},
	year = {2014},
	volume = {95},
	number = {25},
	pages = {39-52},
	month = {June},
	note = {Full text available}
}

Abstract

Nowadays, the need to techniques, approaches, and algorithms to search on data is increased due to improvements in computer science and increasing amount of information. This ever increasing information volume has led to time and computation complexity. Recently, different methods to solve such problems are proposed. Among the others, nearest neighbor search is one of the best techniques to this end which is focused by many researchers. Different techniques are used for nearest neighbor search. In addition to put an end to some complexities, variety of these techniques has made them suitable for different applications such as pattern recognition, searching in multimedia data, information retrieval, databases, data mining, and computational geometry to name but a few. In this paper, by opening a new view to this problem, a comprehensive evaluation on structures, techniques and different algorithms in this field is done and a new categorization of techniques in NNS is presented. This categorization is consists of seven groups: Weighted, Reductional, Additive, Reverse, Continuous, Principal Axis and Other techniques which are studied, evaluated and compared in this paper. Complexity of used structures, techniques and their algorithms are discussed, as well.

References

  • A. Andoni, Nearest Neighbor Search - the Old, the New, and the Impossible, Ph. D. dissertation, Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 2009.
  • G. Shakhnarovich, T. Darrell, and P. Indyk, Nearest Neighbor Methods in Learning and Vision : Theory and Practice, March 2006 [Online].
  • N. Bhatia and V. Ashev, Survey of Nearest Neighbor Techniques, International Journal of Computer Science and Information Security, 8(2), 2010, pp. 1-4.
  • S. Dhanabal and S. Chandramathi, A Review of various k-Nearest Neighbor Query Processing Techniques, Computer Applications. 31(7), 2011, pp. 14-22.
  • A. Rajaraman and J. D. Ullman. Mining of Massive Datasets, December 2011 [Online].
  • D. Marshal, Nearest Neighbour Searching in High Dimensional Metric Space, M. Sc. thesis, Computer Science, Australian National University, 2006.
  • P. Zezula, G. Amato, V. Dohnal, and M. Batko, Similarity Search: The Metric Space Approach, December 2005 [Online].
  • T. Skopal and J. Lokoc, NM-Tree: Flexible Approximate Similarity Search in Metric and Non metric Spaces, in 19th International Conference on Database and Expert Systems Applications, Turin, Italy, 2008, pp. 312-325.
  • B. Zhang and S. N. Srihari, A Fast Algorithm for Finding k-Nearest Neighbors with Non-Metric Dissimilarity, in 8th International Workshop on Frontiers in Handwriting Recognition, Ontario, Canada, 2002, pp. 13-18.
  • J. Lokoc, On M-tree Variants in Metric and Non-metric Spaces, in 17th Annual Conference of Doctoral Students, Prague, Czech Republic, 2008, pp. 230-234.
  • T. Skopal, On Fast Non-Metric Similarity Search by Metric Access Methods, in 10th International Conference on Advances in Database Technology, Munich, Germany, 2006, pp. 718-736.
  • T. Skopal, Unified Framework for Fast Exact and Approximate Search in Dissimilarity Spaces, ACM Transactions on Database Systems. 32(4), 2007,pp. 29.
  • Y. Rubner, J. Puzicha, C. Tomasi, and J. M. Buhmann, Empirical Evaluation of Dissimilarity Measures for Color and Texture, Computer Vision and Image Understanding. 84(1), 2001, pp. 25-43.
  • K. Fukunaga and P. M. Narendra, A Branch and Bound Algorithm for Computing k-Nearest Neighbors, IEEE Transactions on Computer. 24(7), 1975, pp. 750-753.
  • P. N. Yianilos, Data Structures and Algorithms for Nearest Neighbor Search in General Metric Spaces, in 4th Annual ACM-SIAM Symposium on Discrete Algorithms, Austin, Texas, USA, 1993, pp. 311-321.
  • F. Bajramovic, F. Mattern, N. Butko, and J. Denzler, A Comparison of Nearest Neighbor Search Algorithms for Generic Object Recognition, in 8th International Conference on Advanced Concepts for Intelligent Vision Systems, Antwerp, Belgium, 2006, pp. 1186-1197.
  • M. Slaney and M. Casey, Locality Sensitive Hashing for Finding Nearest Neighbors, IEEE Signal Processing Magazine. 25(2), 2008, pp. 128-131.
  • A. Andoni and P. Indyk, Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimension, in 47th Annual IEEE Symposium on Foundations of Computer Science, Berkeley, California, USA, 2006, pp. 459-468.
  • A. Gionis, P. Indyk, and R. Motwani, Similarity Search in High Dimensions via Hashing, in 25th International Conference on Very Large Data Bases, Edinburgh, Scotland, 1999, pp. 518-529.
  • R. F. Sproull, Refinements to Nearest-Neighbor Searching in k Dimensional Trees, Algorithmica. 6(4), 1991, pp. 579-589.
  • J. L. Bentley, Multidimensional binary search trees Used for Associative Searching, Communications of the ACM. 18(9), 1975, pp. 509-517.
  • R. Panigrahy, An Improved Algorithm Finding Nearest Neighbor Using Kd-Trees, in 8th Latin American conference on Theoretical informatics, Bzios, Brazil, 2008, pp. 387-398.
  • A. Nuchter, K. Lingemann, and J. Hertzberg, Cached KD Tree Search for ICP Algorithms, in 6th International Conference on 3-D Digital Imaging and Modeling, Montreal, Quebec, Canada, 2007, pp. 419-426.
  • V. Gaede and O. Gunther, Multidimensional access methods, ACM Computing Surveys. 30(2), 1998, pp. 170-231.
  • H. Samet, The Quadtree and Related Hierarchical Data Structures, ACM Computing Surveys. 16(2), 1984, pp. 187-260.
  • J. Tayeb, O. Ulusoy, and O. Wolfson, A Quadtree Based Dynamic Attribute Indexing Method, The Computer Journal. 41(3), 1998,pp. 185-200.
  • C. A. Shaffer and H. Samet, Optimal Quadtree Construction Algorithms, Computer Vision, Graphics, and Image Processing. 37(3), 1987, pp. 402-419.
  • D. Meagher, Geometric Modeling Using Octree Encoding, Computer Graphics and Image Processing. 19(2), 1982, pp. 129-147.
  • T. Liu, A. W. Moore, and A. Gray, New Algorithms for Efficient High Dimensional Nonparametric Classification, The Journal of Machine Learning Research. 7(1), 2006, pp. 1135–1158.
  • S. M. Omohundro, Five Balltree Construction Algorithms, International Computer Science Institute, Berkeley, California, USA, Tech, 1989.
  • Y. Manolopoulos, A. Nanopoulos, A. N. Papadopoulos, and Y. Theodoridis, R-Trees: Theory and Applications, November 2005 [Online].
  • A. Guttman, R-Trees: A dynamic index structure for spatial searching, ACM SIGMOD Record. 14(2), 1984, pp. 47-57.
  • M. k. Wu, Evaluation of R-trees for Nearest Neighbor Search, M. Sc. thesis, Computer Science, University of Houston, 2006.
  • N. Roussopoulos, S. Kelley, and F. Vincent, Nearest Neighbor Queries, ACM SIGMOD Record. 24(2), 1995, pp. 71-79.
  • M. Adler and B. Heeringa, Search Space Reductions for Nearest Neighbor Queries, in 5th international conference on Theory and applications of models of computation, Xian, China, 2008, pp. 554-568.
  • P. Ciaccia, M. Patella, and P. Zezula, M-tree: An Efficient Access Method for Similarity Search in Metric Spaces, in 23rd Very Large Data Bases Conference, Athens, Greece, 1997.
  • T. Skopal, Pivoting M-tree: A Metric Access Method for Efficient Similarity Search, in Dateso 2004 Annual International Workshop on Databases, Texts, Specifications and Objects, Desna, Czech Republic, 2004, pp. 27-37.
  • S. Berchtold, D. A. Keim, and H. P. Kriegei, The X Tree: An Index Structure for High Dimensional Data, in 22th International Conference on Very Large Data Bases, San Francisco, California, USA, 1996, pp. 28-39.
  • Y. Sakurai, M. Yoshikawa, S. Uemura, and H. Kojima, The A Tree: An Index Structure for High Dimensional Spaces Using Relative Approximation, in 26th International Conference on Very Large Data Bases, Cairo, Egypt, 2000, pp. 516-526.
  • D. A. White and R. Jain, Similarity Indexing with the SS Tree, in 12th International Conference on Data Engineering, New Orleans, Los Angeles, USA, 1996, pp. 516-523.
  • N. Katayama and S. Satoh, The SR Tree: An Index Structure for High Dimensional Nearest Neighbor Queries, ACM SIGMOD Record. 26(2), 1997, pp. 369-380.
  • T. M. Cover and P. E. Hart, Nearest Neighbor Pattern Classification, IEEE Transactions on Information Theory. 13(1), 1967, pp. 21-27.
  • S. A. Dudani, The Distance-Weighted k-Nearest-Neighbor Rule, IEEE Transactions on Systems, Man and Cybernetics. 6(4), 1976, pp. 325-327.
  • T. Bailey and A. K. Jain, A Note on Distance-Weighted k-Nearest Neighbor Rules, IEEE Transactions on Systems, Man and Cybernetics. 8(4), 1978, pp. 311-313.
  • S. Tan, Neighbor-weighted K-nearest neighbor for unbalanced text corpus, Expert Systems with Applications. 28(4), 2005, pp. 667-671.
  • K. Hechenbichler and K. Schliep, Weighted k-Nearest-Neighbor Techniques and Ordinal Classification, Collaborative Research Center, LMU University, Munich, Germany, Tech, 2006.
  • H. Parvin, H. Alizadeh, and B. Minaei-Bidgoli, MKNN: Modified K-Nearest Neighbor, in World Congress on Engineering and Computer Science, San Francisco, California, USA, 2008, pp. 831-834.
  • Y. Zeng, Y. Yang, and L. Zhao, Pseudo Nearest Neighbor Rule for Pattern Classification, Expert Systems with Applications. 36(2), 2009, pp. 3587-3595.
  • V. J. d. A. e. S. Lobo, Ship Noise Classification: A Contribution to Prototype Based Classifier Design, Ph. D. dissertation, College of Science and Technology, New University of Lisbon, 2002.
  • P. E. Hart, The Condensed Nearest Neighbor Rule, IEEE Transactions on Information Theory. 14(3), 1968, pp. 515-516.
  • F. Angiulli, Fast Condensed Nearest Neighbor Rule, in 22nd International Conference on Machine Learning, Bonn, Germany, 2005, pp. 25-32.
  • G. W. Gates, The Reduced Nearest Neighbor Rule, IEEE Transactions on Information Theory. 18(3), 1972, pp. 431-433.
  • G. Guo, H. Wang, D. Bell, Y. Bi, and K. Greer, KNN Model-Based Approach in Classification, in OTM Confederated International Conferences on the Move to Meaningful Internet Systems, Catania, Sicily, Italy, 2003, pp. 986-996.
  • G. Guo, H. Wang, D. Bell, Y. Bi, and K. Greer, An KNN Model-Based Approach and its Application in Classification in Text Categorization, in 5th International Conference on Computational Linguistics and Intelligent Text Processing, Seoul, Korea, 2004, pp. 559-570.
  • Z. Yong, L. Youwen, and X. Shixiong, An Improved KNN Text Classification Algorithm Based on Clustering, Journal of Computers. 4(3), 2009, pp. 230-237.
  • S. Z. Li and J. Lu, Face Recognition Using the Nearest Feature Line Method, IEEE Transactions on Neural Networks. 10(2), 1999, pp. 439-443.
  • S. Z. Li, K. L. Chan, and C. Wang, Performance Evaluation of the Nearest Feature Line Method in Image Classification and Retrieval, IEEE Transactions on Pattern Analysis and Machine Intelligence. 22(11), 2000, pp. 1335-1339.
  • W. Zheng, L. Zhao, and C. Zou, Locally nearest neighbor classifiers for pattern classification, Pattern Recognition. 37(6), 2004, pp. 1307-1309.
  • Q. B. Gao and Z. Z. Wang, Center Based Nearest Neighbor Classifier, Pattern Recognition. 40(1), 2007, pp. 346-349.
  • Y. Zhou, C. Zhang, and J. Wang, Tunable Nearest Neighbor Classifier, in 26th DAGM Symposium on Artificial Intelligence, Tubingen, Germany, 2004, pp. 204-211.
  • Y. Zhou, C. Zhang, and J. Wang, Extended Nearest Feature Line Classifier, in 8th Pacific Rim International Conference on Artificial Intelligence, Auckland, New Zealand, 2004, pp. 183-190.
  • F. Korn and S. M. ukrishnan, Influence Sets Based on Reverse Nearest Neighbor Queries, ACM SIGMOD Record. 29(2), 2000, pp. 201-212.
  • C. Yang and K. I. Lin, An Index Structure for Efficient Reverse Nearest Neighbor Queries, in 17th International Conference on Data Engineering, Heidelberg, Germany, 2001, pp. 485-492.
  • R. Benetis, C. S. Jensen, G. Karciauskas, and S. Saltenis, Nearest Neighbor and Reverse Nearest Neighbor Queries for Moving Objects, The VLDB Journal. 15(3), 2006, pp. 229-249.
  • Y. Tao, M. L. Yiu, and N. Mamoulis, Reverse Nearest Neighbor Search in Metric Spaces, IEEE Transactions on Knowledge and Data Engineering. 18(9), 2006, pp. 1239-1252.
  • I. Stanoi, D. Agrawal, and A. E. Abbadi, Reverse Nearest Neighbor Queries for Dynamic Databases, in ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, Dallas, Texas, USA, 2000, pp. 44-53.
  • E. Achtert, C. Bohm, P. Kroger, P. Kunath, M. Renz, and A. Pryakhin, Efficient Reverse k-Nearest Neighbor Search in Arbitrary Metric Spaces, in ACM SIGMOD International Conference on Management of Data, Chicago, Illinois, USA, 2006, pp. 515-526.
  • Y. Gao, B. Zheng, G. Chen, and Q. Li, On Efficient Mutual Nearest Neighbor Query Processing in Spatial Databases, Data & Knowledge Engineering. 68(8), 2009, pp. 705-727.
  • Y. Tao, D. Papadias, and Q. Shen, Continuous Nearest Neighbor Search, in 28th international conference on Very Large Data Bases, Hong Kong, China, 2002, pp. 287-298.
  • H. Hu and D. L. Lee, Range Nearest-Neighbor Query, IEEE Transactions on Knowledge and Data Engineering. 18(1), 2006, pp. 78-91.
  • J. McNames, A Fast Nearest-Neighbor Algorithm Based on a Principal Axis Search Tree, IEEE Transactions on Pattern Analysis and Machine Intelligence. 23(9), 2001, pp. 964-976.
  • B. Wang and J. Q. Gan, Integration of Projected Clusters and Principal Axis Trees for High-Dimensional Data Indexing and Query, in 5th International Conference on Intelligent Data Engineering and Automated Learning, Exeter, UK, 2004, 2001, pp. 191–196.
  • Y. C. Liaw, M. L. Leou, and C. M. Wu, Fast Exact k Nearest Neighbors Search Using an Orthogonal Search Tree, Pattern Recognition. 43(6), 2010, pp. 2351-2358.
  • Y. C. Liaw, C. M. Wu, and M. L. Leou, Fast k Nearest Neighbors Search Using Modified Principal Axis Search Tree, Digital Signal Processing. 20(5), 2010 pp. 1494-1501.
  • H. Ferhatosmanoglu, I. Stanoi, D. Agrawal, and A. E. Abbadi, Constrained Nearest Neighbor Queries, in 7th International Symposium on Advances in Spatial and Temporal Databases, Redondo Beach, California, USA, 2001, pp. 257-276.
  • T. Emrich, H. P. Kriegel, P. Kröger, M. Renz, and A. Züfle, Constrained Reverse Nearest Neighbor Search on Mobile Objects, in 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Seattle, Washington, USA, 2009, pp. 197-206.
  • S. C. Bagui and N. R. Pal, A Multistage Generalization of the Rank Nearest Neighbor Classification Rule, Pattern Recognition Letters. 16(6), 1995, pp. 601-614.
  • S. C. Bagui, S. Bagui, K. Pal, and N. R. Pal, Breast Cancer Detection using Rank Nearest Neighbor Classification Rules, Pattern Recognition. 36(1), 2003, pp. 25-34.
  • D. Papadias, Q. Shen, Y. Tao, and K. Mouratidis, Group Nearest Neighbor Queries, in 20th International Conference on Data Engineering, Massachusetts, Boston, USA, 2004, pp. 301-312.
  • D. Papadias, Y. Tao, K. Mouratidis, and C. K. Hui, Aggregate Nearest Neighbor Queries in Spatial Databases, ACM Transactions on Database Systems. 30(2), 2005, pp. 529-576.
  • M. L. Yiu, N. Mamoulis, and D. Papadias, Aggregate Nearest Neighbor Queries in Road Networks, IEEE Transactions on Knowledge and Data Engineering. 17(6), 2005, pp. 820-833.
  • H. Samet, Foundations of Multidimensional and Metric Data Structures, University of Maryland at College Park, Morgan Kaufmann, August 2006.