Identifying Oversampling and under sampling of Data-A Practical Approach Using R
V. Shobana, K. Nandhini
Section: Research Paper, Product Type: Journal Paper
Volume-7, Issue-5, Page no. 890-896, May-2019
CrossRef-DOI: https://doi.org/10.26438/ijcse/v7i5.890896
Online published on May 31, 2019
Copyright © V. Shobana, K. Nandhini. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
How to Cite this Paper
IEEE Style Citation: V. Shobana, K. Nandhini, “Identifying Oversampling and under sampling of Data-A Practical Approach Using R,” International Journal of Computer Sciences and Engineering, Vol.7, Issue.5, pp.890-896, 2019.
MLA Style Citation: V. Shobana, K. Nandhini "Identifying Oversampling and under sampling of Data-A Practical Approach Using R." International Journal of Computer Sciences and Engineering 7.5 (2019): 890-896.
APA Style Citation: V. Shobana, K. Nandhini, (2019). Identifying Oversampling and under sampling of Data-A Practical Approach Using R. International Journal of Computer Sciences and Engineering, 7(5), 890-896.
BibTex Style Citation:
@article{Shobana_2019,
author = {V. Shobana and K. Nandhini},
title = {Identifying Oversampling and under sampling of Data-A Practical Approach Using R},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {5 2019},
volume = {7},
number = {5},
month = {5},
year = {2019},
issn = {2347-2693},
pages = {890-896},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=4333},
doi = {10.26438/ijcse/v7i5.890896},
publisher = {IJCSE, Indore, INDIA},
}
RIS Style Citation:
TY  - JOUR
AU  - Shobana, V.
AU  - Nandhini, K.
TI  - Identifying Oversampling and under sampling of Data-A Practical Approach Using R
T2  - International Journal of Computer Sciences and Engineering
PY  - 2019
DA  - 2019/05/31
VL  - 7
IS  - 5
SP  - 890
EP  - 896
SN  - 2347-2693
PB  - IJCSE, Indore, INDIA
DO  - 10.26438/ijcse/v7i5.890896
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=4333
ER  -
Abstract
Thyroid hormones play a major role in maintaining the body's metabolism. Any abnormality in these hormones also affects the functioning of other organs. Because the thyroid is such an important gland, proper clinical advice should be sought whenever an abnormality is found. Machine learning algorithms play a major role in the early detection of thyroid disorders. This work focuses on applying the random forest algorithm to the prediction of thyroid disorder. The algorithm classifies the class attribute and predicts whether a case is hypothyroid, hyperthyroid, or normal. The work is implemented in R, a statistical tool that handles large volumes of data well compared with other traditional mining tools. The algorithm predicts the result with high accuracy, and various performance metrics have been analysed. The data set was taken from the UCI Machine Learning Repository.
Keywords / Index Terms
Thyroid, random forest, big data, RStudio, confusion matrix