Open Access Article

A Novel Approach Using Incremental Fusion Sampling for Data Stream Mining

Anupama N, Sudarson Jena, V Ravi Sankar

Section: Research Paper, Product Type: Journal Paper
Volume 7, Issue 5, Page no. 407-415, May 2019

CrossRef-DOI: https://doi.org/10.26438/ijcse/v7i5.407415

Online published on May 31, 2019

Copyright © Anupama N, Sudarson Jena, V Ravi Sankar. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: Anupama N, Sudarson Jena, V Ravi Sankar, “A Novel Approach Using Incremental Fusion Sampling for Data Stream Mining,” International Journal of Computer Sciences and Engineering, Vol.7, Issue.5, pp.407-415, 2019.

MLA Style Citation: Anupama N, Sudarson Jena, V Ravi Sankar. "A Novel Approach Using Incremental Fusion Sampling for Data Stream Mining." International Journal of Computer Sciences and Engineering 7.5 (2019): 407-415.

APA Style Citation: Anupama N, Sudarson Jena, V Ravi Sankar (2019). A Novel Approach Using Incremental Fusion Sampling for Data Stream Mining. International Journal of Computer Sciences and Engineering, 7(5), 407-415.

BibTex Style Citation:
@article{N_2019,
author = {Anupama N and Sudarson Jena and V Ravi Sankar},
title = {A Novel Approach Using Incremental Fusion Sampling for Data Stream Mining},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {5 2019},
volume = {7},
number = {5},
month = {5},
year = {2019},
issn = {2347-2693},
pages = {407--415},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=4256},
doi = {10.26438/ijcse/v7i5.407415},
publisher = {IJCSE, Indore, INDIA}
}

RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v7i5.407415
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=4256
TI  - A Novel Approach Using Incremental Fusion Sampling for Data Stream Mining
T2  - International Journal of Computer Sciences and Engineering
AU  - Anupama N
AU  - Sudarson Jena
AU  - V Ravi Sankar
PY  - 2019
DA  - 2019/05/31
PB  - IJCSE, Indore, INDIA
SP  - 407
EP  - 415
IS  - 5
VL  - 7
SN  - 2347-2693
ER  - 

Abstract

Data stream mining has become popular in recent years as advanced electronic devices generate continuous data streams. The performance of standard learning algorithms is compromised by the class imbalance present in real-world data streams. In this paper we propose a novel algorithm, Incremental Fusion Sampling for Data Streams (IFSDS), which combines oversampling and undersampling techniques to nearly balance the data sets and so minimize the effect of imbalance on the stream mining process. The experimental analysis is conducted on 10 data chunks of data streams with varied sizes and different imbalance ratios. The results suggest that the proposed IFSDS algorithm improves knowledge discovery over benchmark algorithms such as C4.5 and the Hoeffding tree in terms of the performance measures TN rate, FP rate, precision, and F-measure.
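As an illustration of the ideas named in the abstract, the following minimal sketch balances one data chunk and computes the four reported measures from confusion-matrix counts. It is an assumption-laden stand-in: the abstract does not describe the IFSDS sampling steps, so plain random oversampling and undersampling toward the midpoint of the two class sizes is used instead. The names `balance_chunk` and `binary_metrics` and the `(features, label)` row layout are hypothetical and not taken from the paper.

```python
import random
from collections import Counter


def balance_chunk(chunk, minority_label, seed=0):
    """Roughly balance one chunk of (features, label) rows.

    Assumption: random over/under sampling toward the midpoint of the
    two class sizes. This is NOT the IFSDS procedure from the paper.
    """
    rng = random.Random(seed)
    minority = [row for row in chunk if row[1] == minority_label]
    majority = [row for row in chunk if row[1] != minority_label]
    if not minority or not majority:
        return list(chunk)  # only one class present, nothing to balance
    target = (len(minority) + len(majority)) // 2
    # Undersample the majority class without replacement.
    kept_majority = rng.sample(majority, min(target, len(majority)))
    # Oversample the minority class by drawing duplicates with replacement.
    extra = [rng.choice(minority) for _ in range(max(0, target - len(minority)))]
    return kept_majority + minority + extra


def binary_metrics(tp, fp, tn, fn):
    """Compute TN rate, FP rate, precision and F-measure from counts."""
    negatives = tn + fp
    tn_rate = tn / negatives if negatives else 0.0
    fp_rate = fp / negatives if negatives else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "tn_rate": tn_rate,
        "fp_rate": fp_rate,
        "precision": precision,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Synthetic chunk: 90 majority rows (label 0), 10 minority rows (label 1).
    chunk = [((i,), 0) for i in range(90)] + [((i,), 1) for i in range(10)]
    balanced = balance_chunk(chunk, minority_label=1)
    print(Counter(label for _, label in balanced))
    print(binary_metrics(tp=8, fp=2, tn=88, fn=2))
```

In a stream setting, a function like `balance_chunk` would be applied to each incoming chunk before the learner is updated. The midpoint target is only one possible design choice and reflects the abstract's wording that the data sets are "almost" balanced rather than resampled to an exact ratio.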

Keywords / Index Terms

Knowledge Discovery, Data Streams, Imbalanced Data, Oversampling, Undersampling, Incremental Fusion Sampling for Data Streams (IFSDS)
