Open Access Article

Improved Text Mining Techniques for Spam Review Detection

Akshat A. Uike, Sumera W. Ahmad, Sunil R. Gupta

Section: Research Paper, Product Type: Journal Paper
Volume 7, Issue 5, Page no. 147-152, May 2019

CrossRef-DOI:   https://doi.org/10.26438/ijcse/v7i5.147152

Online published on May 31, 2019

Copyright © Akshat A. Uike, Sumera W. Ahmad, Sunil R. Gupta. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: Akshat A. Uike, Sumera W. Ahmad, Sunil R. Gupta, “Improved Text Mining Techniques for Spam Review Detection,” International Journal of Computer Sciences and Engineering, Vol.7, Issue.5, pp.147-152, 2019.

MLA Style Citation: Akshat A. Uike, Sumera W. Ahmad, Sunil R. Gupta. "Improved Text Mining Techniques for Spam Review Detection." International Journal of Computer Sciences and Engineering 7.5 (2019): 147-152.

APA Style Citation: Akshat A. Uike, Sumera W. Ahmad, Sunil R. Gupta (2019). Improved Text Mining Techniques for Spam Review Detection. International Journal of Computer Sciences and Engineering, 7(5), 147-152.

BibTex Style Citation:
@article{Uike_2019,
author = {Akshat A. Uike and Sumera W. Ahmad and Sunil R. Gupta},
title = {Improved Text Mining Techniques for Spam Review Detection},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {5 2019},
volume = {7},
number = {5},
month = {5},
year = {2019},
issn = {2347-2693},
pages = {147-152},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=4213},
doi = {10.26438/ijcse/v7i5.147152},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v7i5.147152
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=4213
TI  - Improved Text Mining Techniques for Spam Review Detection
T2  - International Journal of Computer Sciences and Engineering
AU  - Uike, Akshat A.
AU  - Ahmad, Sumera W.
AU  - Gupta, Sunil R.
PY  - 2019
DA  - 2019/05/31
PB  - IJCSE, Indore, INDIA
SP  - 147
EP  - 152
IS  - 5
VL  - 7
SN  - 2347-2693
ER  -


Abstract

Text mining has played an important role in providing product recommendations to users. Online reviews have become an important factor when people make purchase and business decisions. Efficient recommendation systems help improve business and enhance customer satisfaction. The credibility of a product purchase depends heavily on online e-commerce reviews. However, products are often wrongly promoted or demoted through the buying and selling of fake reviews, and many websites have become sources of such opinion spam. These fake or fraudulent reviews are deliberately written to trick potential customers, either to promote or hype products or to defame their reputations. This in turn leads to recommending undeserving products. Our work aims to identify whether a review is fake or truthful. The classifiers used in our work are the Naïve Bayes classifier, logistic regression, and Support Vector Machines. This paper aims to classify online reviews into groups of positive or negative polarity using machine learning algorithms. In this study, we analyze online reviews using sentiment analysis (SA) methods in order to detect fake reviews. SA and text classification methods are applied to a dataset of movie reviews. More specifically, we compare five supervised machine learning algorithms, including Naïve Bayes (NB), Support Vector Machine (SVM), and K-Nearest Neighbours (KNN-IBK), for sentiment classification of reviews using two different movie review datasets. The measured results of our experiments show that the SVM algorithm outperforms the other algorithms and that it reaches the highest accuracy not only in text classification but also in detecting fake reviews.
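
As a rough illustration of the kind of supervised polarity classification the abstract describes, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one smoothing, written in plain Python. It is not the authors' implementation: the class name NaiveBayes, the tokenize helper, and the four toy training reviews are all assumptions made up for this example, and the paper's actual datasets, features, and tooling are not reproduced here.

```python
import math
from collections import Counter, defaultdict

PUNCT = ".,!?;:\"'()"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    stripped = (word.strip(PUNCT) for word in text.lower().split())
    return [word for word in stripped if word]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # number of documents per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            denom = total_words + len(self.vocab)
            for t in tokens:
                score += math.log((self.word_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy reviews, invented for this sketch (not the paper's data).
train_texts = [
    "great movie loved the acting",
    "wonderful story and great cast",
    "terrible plot and boring acting",
    "awful movie boring and slow",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("great cast and wonderful acting"))  # positive
print(model.predict("boring and slow plot"))             # negative
```

The same fit and predict shape would apply to a fake versus truthful labelling, provided labelled examples of each class are available; comparing against SVM, logistic regression, or KNN would require swapping in those models on the same features.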

Keywords / Index Terms

Amazon E-Commerce dataset, Active Learning, Dataset acquisition, Data pre-processing, KNN Classifier, Rough Set Classifier, Support Vector Machine
