Open Access Article

(EDSFCA): Efficient Document Subspace Clustering in High-Dimensional Data using Fast Clustering Algorithm

Radhika K R1, Pushpa C N2, Thriveni J3, Venugopal K R4

Section: Research Paper, Product Type: Journal Paper
Volume-7, Issue-2, Page no. 1010-1015, Feb-2019

CrossRef-DOI: https://doi.org/10.26438/ijcse/v7i2.10101015

Online published on Feb 28, 2019

Copyright © Radhika K R, Pushpa C N, Thriveni J, Venugopal K R . This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: Radhika K R, Pushpa C N, Thriveni J, Venugopal K R, “(EDSFCA): Efficient Document Subspace Clustering in High-Dimensional Data using Fast Clustering Algorithm,” International Journal of Computer Sciences and Engineering, Vol.7, Issue.2, pp.1010-1015, 2019.

MLA Style Citation: Radhika K R, Pushpa C N, Thriveni J, Venugopal K R. "(EDSFCA): Efficient Document Subspace Clustering in High-Dimensional Data using Fast Clustering Algorithm." International Journal of Computer Sciences and Engineering 7.2 (2019): 1010-1015.

APA Style Citation: Radhika K R, Pushpa C N, Thriveni J, Venugopal K R (2019). (EDSFCA): Efficient Document Subspace Clustering in High-Dimensional Data using Fast Clustering Algorithm. International Journal of Computer Sciences and Engineering, 7(2), 1010-1015.

BibTex Style Citation:
@article{R_2019,
author = {Radhika K R and Pushpa C N and Thriveni J and Venugopal K R},
title = {(EDSFCA): Efficient Document Subspace Clustering in High-Dimensional Data using Fast Clustering Algorithm},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {2 2019},
volume = {7},
number = {2},
month = {2},
year = {2019},
issn = {2347-2693},
pages = {1010-1015},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=3784},
doi = {10.26438/ijcse/v7i2.10101015},
publisher = {IJCSE, Indore, INDIA}
}

RIS Style Citation:
TY  - JOUR
AU  - Radhika K R
AU  - Pushpa C N
AU  - Thriveni J
AU  - Venugopal K R
TI  - (EDSFCA): Efficient Document Subspace Clustering in High-Dimensional Data using Fast Clustering Algorithm
T2  - International Journal of Computer Sciences and Engineering
PY  - 2019
DA  - 2019/02/28
VL  - 7
IS  - 2
SP  - 1010
EP  - 1015
SN  - 2347-2693
PB  - IJCSE, Indore, INDIA
DO  - 10.26438/ijcse/v7i2.10101015
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=3784
ER  - 

Abstract

In the contemporary age of digitization, the majority of users rely constantly on prevalent computing in telecommunication and social networking. Data is produced by many sources, from the individual to the organizational level. Existing data mining techniques are not well suited to this data, because its unstructured and semi-structured nature leads to dimensionality problems. To overcome these problems, Efficient Document Subspace Clustering in High-Dimensional Data using Fast Clustering Algorithm (EDSFCA) is proposed. The method applies data mining techniques such as preprocessing and the removal of corrupted and repetitive data from the subspace clusters. Twitter data is taken as input and divided into clusters to capture the characteristics of high-dimensional data. This information is organized arbitrarily in subspace clusters, and segmentation is then performed on the data points. The EDSFCA approach performs the cluster analysis of datasets in the least amount of time.
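The preprocessing step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' EDSFCA implementation, whose details the abstract does not give. The function names `clean_tweets` and `term_vectors`, the rule for what counts as "corrupted" (non-text, empty, or containing the Unicode replacement character), and the sample tweets are all assumptions made for illustration. The sketch drops corrupted and repeated tweets, then builds term-count vectors with one dimension per distinct term, which shows why short text data becomes high-dimensional.

```python
import re
from collections import Counter


def clean_tweets(tweets):
    """Drop corrupted and repeated tweets, preserving first-seen order.

    Assumption: a record is 'corrupted' if it is not a string, is empty after
    whitespace normalization, or contains the Unicode replacement character.
    """
    seen = set()
    cleaned = []
    for tweet in tweets:
        if not isinstance(tweet, str):
            continue  # non-text record
        norm = re.sub(r"\s+", " ", tweet).strip().lower()
        if not norm or "\ufffd" in norm:
            continue  # empty or undecodable text
        if norm in seen:
            continue  # repeated tweet
        seen.add(norm)
        cleaned.append(norm)
    return cleaned


def term_vectors(docs):
    """Return (vocabulary, vectors): one term-count vector per document."""
    tokenized = [re.findall(r"[a-z0-9#@']+", doc) for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for toks in tokenized:
        vec = [0] * len(vocab)
        for tok, count in Counter(toks).items():
            vec[index[tok]] = count
        vectors.append(vec)
    return vocab, vectors


# Hypothetical sample input, not taken from the paper's dataset.
raw = [
    "Big data is big",
    "big  data is BIG",           # duplicate after normalization
    None,                          # corrupted: not text
    "",                            # corrupted: empty
    "Clustering tweets \ufffd",    # corrupted: replacement character
    "Subspace clustering of tweets",
]
docs = clean_tweets(raw)
vocab, vectors = term_vectors(docs)
print(len(docs), "documents,", len(vocab), "dimensions")
```

Even two short tweets produce a seven-dimensional space here, and most entries in each vector are zero. A clustering step would operate on these vectors, or on subsets of their dimensions in the subspace setting.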

Keywords / Index Terms

Data Mining, Fast Clustering Algorithm, High Dimensional Data, Subspace Clustering
