Open Access Article

Rapid Clustering Algorithm for Optimizing Cognate Data of Online Database

B.S. Rawat, K. Kumar, R.K. Mishra, S.S Bedi

Section: Research Paper, Product Type: Journal Paper
Volume-7, Issue-5, Page no. 1076-1082, May-2019

CrossRef-DOI: https://doi.org/10.26438/ijcse/v7i5.10761082

Online published on May 31, 2019

Copyright © B.S. Rawat, K. Kumar, R.K. Mishra, S.S Bedi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: B.S. Rawat, K. Kumar, R.K. Mishra, S.S Bedi, “Rapid Clustering Algorithm for Optimizing Cognate Data of Online Database,” International Journal of Computer Sciences and Engineering, Vol.7, Issue.5, pp.1076-1082, 2019.

MLA Style Citation: B.S. Rawat, K. Kumar, R.K. Mishra, S.S Bedi. "Rapid Clustering Algorithm for Optimizing Cognate Data of Online Database." International Journal of Computer Sciences and Engineering 7.5 (2019): 1076-1082.

APA Style Citation: B.S. Rawat, K. Kumar, R.K. Mishra, S.S Bedi (2019). Rapid Clustering Algorithm for Optimizing Cognate Data of Online Database. International Journal of Computer Sciences and Engineering, 7(5), 1076-1082.

BibTex Style Citation:
@article{Rawat_2019,
author = {B.S. Rawat and K. Kumar and R.K. Mishra and S.S Bedi},
title = {Rapid Clustering Algorithm for Optimizing Cognate Data of Online Database},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {5 2019},
volume = {7},
number = {5},
month = {5},
year = {2019},
issn = {2347-2693},
pages = {1076--1082},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=4364},
doi = {10.26438/ijcse/v7i5.10761082},
publisher = {IJCSE, Indore, INDIA}
}

RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v7i5.10761082
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=4364
TI  - Rapid Clustering Algorithm for Optimizing Cognate Data of Online Database
T2  - International Journal of Computer Sciences and Engineering
AU  - B.S. Rawat
AU  - K. Kumar
AU  - R.K. Mishra
AU  - S.S Bedi
PY  - 2019
DA  - 2019/05/31
PB  - IJCSE, Indore, INDIA
SP  - 1076
EP  - 1082
IS  - 5
VL  - 7
SN  - 2347-2693
ER  - 


Abstract

Clustering is one of the main analysis methods in data mining and is widely used because of its efficiency and scalability on large data sets. Numerous useful clustering algorithms have been developed for large databases, such as connectivity-based clustering [1], centroid-based clustering [2], distribution-based clustering [3], and density-based clustering [4]. The K-means clustering algorithm, proposed by MacQueen [5], is a centroid-based cluster analysis method. However, the standard K-means algorithm has limitations: the initialization of the cluster centers, and the need to calculate the distance between every data object and every cluster center in each iteration. This paper proposes an improved K-means algorithm that first preprocesses the data and then arranges the dataset in sequential order, reducing the number of iterations and the complexity. In preprocessing, noisy data is removed; the resulting data then undergoes an improved sorting and clustering process that limits the iterative computation of distances between each data object and the cluster centers, saving execution time. Experimental results show that the improved method effectively increases the speed and accuracy of clustering and reduces the computational complexity of K-means.
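The abstract does not spell out the procedure, so the following Python sketch is only an illustration under stated assumptions, not the authors' algorithm. The `kmeans` function is the standard assign-and-update loop and counts distance computations, which makes the two limitations named above visible. The helpers `remove_noise` and `sorted_initial_centers`, the `factor` parameter, and the toy data are hypothetical stand-ins for the preprocessing and ordering steps the abstract mentions.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points, k, initial_centers=None, max_iter=100, seed=0):
    """Standard K-means; returns (labels, centers, iterations, distance_calls)."""
    if initial_centers is None:
        # Random initialization: the first limitation named in the abstract.
        centers = random.Random(seed).sample(points, k)
    else:
        centers = list(initial_centers)
    labels = [None] * len(points)
    distance_calls = 0
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Assignment step: every point is compared with every center,
        # which is the per-iteration cost the abstract refers to.
        new_labels = []
        for p in points:
            dists = [squared_distance(p, c) for c in centers]
            distance_calls += k
            new_labels.append(dists.index(min(dists)))
        if new_labels == labels:
            break  # assignments are stable, so the centers would not move
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = mean_point(members)
    return labels, centers, iterations, distance_calls


def remove_noise(points, factor=2.0):
    """Hypothetical filter (an assumption): drop points far from the overall mean."""
    center = mean_point(points)
    dists = [squared_distance(p, center) ** 0.5 for p in points]
    cutoff = factor * sum(dists) / len(dists)
    return [p for p, d in zip(points, dists) if d <= cutoff]


def sorted_initial_centers(points, k):
    """Hypothetical seeding (an assumption): sort, cut into k chunks, average each."""
    ordered = sorted(points)
    size = len(ordered) // k
    chunks = [ordered[i * size:(i + 1) * size] for i in range(k - 1)]
    chunks.append(ordered[(k - 1) * size:])
    return [mean_point(chunk) for chunk in chunks]


if __name__ == "__main__":
    data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
            (5.0, 5.0), (5.1, 4.9), (4.8, 5.2),
            (40.0, 40.0)]  # the last point plays the role of noise
    clean = remove_noise(data)
    seeds = sorted_initial_centers(clean, k=2)
    labels, centers, iterations, calls = kmeans(clean, 2, initial_centers=seeds)
    print(labels, centers, iterations, calls)
```

On this toy data the filter drops the far-away point, and the sorted seeds already sit inside the two groups, so the loop reaches stable assignments on its second pass. Whether a similar saving holds on real data depends on the actual preprocessing and ordering rules, which only the full paper specifies.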

Keywords / Index Terms

Data mining, Clustering, K-means, improved K-means

References

[1] Jianhua Li and Laleh Behjat, “A Connectivity Based Clustering Algorithm With Application to VLSI Circuit Partitioning”, IEEE Transactions on Circuits and Systems-II: Express Briefs, Vol. 53, No. 5, May 2006.
[2] Lifei Chen, Shengrui Wang, Xuanhui Yan, “Centroid-based clustering for graph datasets”, 21st International Conference on Pattern Recognition, November 11-15, 2012.
[3] Xiaowei Xu, Martin Ester, Hans-Peter Kriegel, Jörg Sander, “A distribution-based clustering algorithm for mining in large spatial databases”, Proceedings 14th International Conference on Data Engineering, 06 August 2002.
[4] Ester M., Kriegel H., Sander J., Xiaowei Xu, “A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise”, KDD’96, Portland, OR, pp. 226-231, 1996.
[5] J. MacQueen, “Some methods for classification and analysis of multivariate observations”, In Proc. 5th Berkeley Symp. Math. Stat. Prob., 1:281–297, Berkeley, CA, 1967.
[6] Kaufman L. and Rousseeuw P. J., “Finding Groups in Data: An Introduction to Cluster Analysis”, John Wiley & Sons, 1990.
[7] Raymond T. Ng and Jiawei Han, “CLARANS: A Method for Clustering Objects for Spatial Data Mining”, IEEE Transactions on Knowledge and Data Engineering, Vol. 14, No. 5, September 2002.
[8] Zhang T., Ramakrishnan R., Livny M., “BIRCH: An efficient data clustering method for very large databases”, In: SIGMOD Conference, pp. 103-114, 1996.
[9] Guha S., Rastogi R., Shim K., “CURE: An efficient clustering algorithm for large databases”, In: SIGMOD Conference, pp. 73-84, 1998.
[10] Ankerst M., Markus M. B., Kriegel H., Sander J., “OPTICS: Ordering Points To Identify the Clustering Structure”, Proc. ACM SIGMOD’99 Int. Conf. on Management of Data, Philadelphia, PA, pp. 49-60, 1999.
[11] Fahim A. M., Salem A. M., Torkey F. A., “An efficient enhanced K-means clustering algorithm”, Journal of Zhejiang University Science A, Vol. 10, pp. 1626-1633, July 2006.
[12] Sun Jigui, Liu Jie, Zhao Lianyu, “Clustering algorithms Research”, Journal of Software, Vol. 19, No. 1, pp. 48-61, January 2008.
[13] Zhe Zang, Junxi Zhang, Huifeng Xue, “Improved K-means clustering algorithm”, Congress on Image and Signal Processing, IEEE DOI 10.1109/CISP.2008.
[14] Malay K. Pakhira, “A modified K-means algorithm to avoid empty clusters”, International Journal of Recent Trends in Engineering, Vol. 1, No. 1, May 2009.
[15] Shi Na, Liu Xumin, Guang Yong, “Research on K-means clustering algorithm: An improved K-means clustering algorithm”, Third International Symposium on Intelligent Information Technology and Security Informatics, IEEE, DOI 10.1109/IITSI.2010.
[16] Mumtaz, Dr. K. Duraiswamy, “A novel density based improved K-means clustering algorithm - Dbkmeans”, International Journal on Computer Science and Engineering, ISSN: 0975-3397, Vol. 02, No. 02, pp. 213-218, 2010.
[17] Juntao Wang, Xiaolong Su, “An improved K-means clustering algorithm”, IEEE, 3rd ICCSN International Conference on Communication Software and Networks, 2011.
[18] Navjot Kaur, Jaspreet Kaur Sahiwal, Navneet Kaur, “Efficient K-means clustering algorithm using ranking method in datamining”, International Journal of Advanced Research in Computer Engineering & Technology, ISSN: 2278-1323, Volume 1, Issue 3, May 2012.
[19] Shyr-Shen Yu, Shao-Wei Chu, Chuin-Mu Wang, Yung-Kuan Chan, Ting-Cheng Chang, “Two Improved K-means Algorithms”, Applied Soft Computing Journal, 2017.