Open Access Article

Reduced Distance Computation k Nearest Neighbor Model

Preeti Nair, Indu Kashyap

Section: Research Paper, Product Type: Journal Paper
Volume-7, Issue-5, Page no. 658-666, May-2019

CrossRef DOI: https://doi.org/10.26438/ijcse/v7i5.658666

Published online on May 31, 2019

Copyright © Preeti Nair, Indu Kashyap. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


How to Cite this Paper


IEEE Style Citation: P. Nair and I. Kashyap, "Reduced Distance Computation k Nearest Neighbor Model," International Journal of Computer Sciences and Engineering, vol. 7, no. 5, pp. 658-666, 2019.

MLA Style Citation: Nair, Preeti, and Indu Kashyap. "Reduced Distance Computation k Nearest Neighbor Model." International Journal of Computer Sciences and Engineering 7.5 (2019): 658-666.

APA Style Citation: Nair, P., & Kashyap, I. (2019). Reduced Distance Computation k Nearest Neighbor Model. International Journal of Computer Sciences and Engineering, 7(5), 658-666.

BibTeX Style Citation:
@article{Nair_2019,
author = {Preeti Nair and Indu Kashyap},
title = {Reduced Distance Computation k Nearest Neighbor Model},
journal = {International Journal of Computer Sciences and Engineering},
volume = {7},
number = {5},
month = may,
year = {2019},
issn = {2347-2693},
pages = {658-666},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=4296},
doi = {10.26438/ijcse/v7i5.658666},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY  - JOUR
TI  - Reduced Distance Computation k Nearest Neighbor Model
T2  - International Journal of Computer Sciences and Engineering
AU  - Nair, Preeti
AU  - Kashyap, Indu
PY  - 2019
DA  - 2019/05/31
PB  - IJCSE, Indore, INDIA
SP  - 658
EP  - 666
IS  - 5
VL  - 7
SN  - 2347-2693
DO  - 10.26438/ijcse/v7i5.658666
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=4296
ER  -


Abstract

In data mining, k Nearest Neighbor (kNN) classification is one of the most widely applied classification algorithms. kNN is based on the Nearest Neighbor (NN) search algorithm, where k stands for the number of nearest neighbors to be selected. One drawback of the kNN method is that whenever a query point is given to be classified, it searches through each and every data point to find the minimum distances and thereby the nearest neighbors. This increases the computational complexity when a large query set is given. To reduce this complexity and improve the performance of kNN, a novel classification model called the Reduced Distance Computation k Nearest Neighbor (RDCkNN) model is introduced in this paper. RDCkNN combines two processes: first the data is randomized, and then an optimum percentage of the randomized data is drawn as a subset, reducing the overall quantum of distance-finding tasks. This subset acts as the training set against which the query set is classified by the standard kNN procedure. The performance of RDCkNN is compared with standard kNN in terms of the number of distances computed and accuracy. The experiments were carried out on standard data sets, data sets with missing values, and a very large data set. RDCkNN was also compared with a number of other well-known classification models in order to validate its efficacy. The results obtained in these experiments show that the proposed model substantially outperformed standard kNN as well as the other classification models.
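The abstract describes RDCkNN as two combined steps: randomize the training data, then draw an optimum percentage of it as a subset over which standard kNN runs, so each query needs far fewer distance computations. The following is a minimal Python sketch of that idea, not the paper's implementation; the function name rdcknn_classify, the subset_frac parameter, and its default value are hypothetical, since the paper determines the optimum subset percentage empirically.

import numpy as np
from collections import Counter

def rdcknn_classify(X_train, y_train, X_query, k=3, subset_frac=0.3, seed=0):
    # Hypothetical sketch of the RDCkNN idea; subset_frac is a placeholder
    # for the optimum percentage that the paper derives empirically.
    rng = np.random.default_rng(seed)
    n = len(X_train)
    # Step 1: randomize the training data.
    perm = rng.permutation(n)
    # Step 2: draw a subset from the randomized data (at least k points).
    m = max(k, int(subset_frac * n))
    X_sub, y_sub = X_train[perm[:m]], y_train[perm[:m]]
    # Step 3: classify each query with plain kNN over the subset, so each
    # query costs m distance computations instead of n.
    preds = []
    for q in X_query:
        dists = np.linalg.norm(X_sub - q, axis=1)   # Euclidean distances
        nearest = np.argsort(dists)[:k]             # indices of k nearest
        preds.append(Counter(y_sub[nearest]).most_common(1)[0][0])
    return np.array(preds)

# Illustrative use on synthetic data:
# X = np.random.rand(1000, 4); y = np.random.randint(0, 2, size=1000)
# labels = rdcknn_classify(X, y, X[:10], k=5, subset_frac=0.2)

Under this sketch the per-query cost drops from n to m = subset_frac * n distance computations, which is the source of the speed-up the abstract reports.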

Key-Words / Index Term

kNN, Complexity, Distance Computation, Randomization, Subset.
