Text mining based sentiment analysis using a novel deep learning approach

Document Type : Research Paper

Authors

1 Faculty of Education for Girls, University of Kufa, Al-Najaf, Iraq

2 College of Information Technology, University of Babylon, Babil, Iraq

3 Engineering Technical College of Al-Najaf, Al-Furat Al-Awsat Technical University (ATU), Al-Najaf, Iraq

Abstract

The main purposes of this paper are to leverage text mining for sentiment analysis and to integrate text mining with deep learning. The presented study comprises three main steps. In the first step, pre-processing is applied, including tokenization, text cleaning, stop-word removal, stemming, and text normalization. In the second step, features are extracted from reviews and tweets using the Bag of Words (BOW) method and Term Frequency-Inverse Document Frequency (TF-IDF). Finally, dense neural networks are used for classification. This research sheds light on the basic concepts of sentiment analysis and then presents a deep learning model for classification on a movie review data set and the Airline_sentiment data set. Performance was measured in terms of precision, recall, F1-measure, and accuracy. Based on the results, the proposed method achieved an accuracy of 95.38% on the movie review data set and 93.84% on the Airline_sentiment data set.
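The first two steps and the evaluation measures can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the stop-word list, the suffix-stripping rule, the tf * log(N / df) weighting, and all function names are assumptions chosen for illustration, and the dense neural network classifier is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list and suffix rules, for illustration only.
STOP_WORDS = {"the", "a", "an", "was", "is", "and", "of", "to", "it"}
SUFFIXES = ("ing", "ed", "s")


def preprocess(text):
    """Lowercase, keep letter runs as tokens, drop stop words, crudely stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    stems = []
    for tok in tokens:
        if tok in STOP_WORDS:
            continue
        for suf in SUFFIXES:
            # Strip a suffix only when at least 3 letters remain.
            if tok.endswith(suf) and len(tok) - len(suf) >= 3:
                tok = tok[: -len(suf)]
                break
        stems.append(tok)
    return stems


def bag_of_words(docs):
    """Return (sorted vocabulary, per-document count vectors)."""
    vocab = sorted({tok for doc in docs for tok in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(vectors):
    """Weight counts by tf * log(N / df), one common TF-IDF variant."""
    n_docs = len(vectors)
    n_terms = len(vectors[0])
    # Document frequency: number of documents containing each term.
    df = [sum(1 for v in vectors if v[j] > 0) for j in range(n_terms)]
    weighted = []
    for v in vectors:
        total = sum(v) or 1  # avoid division by zero for empty documents
        weighted.append(
            [(v[j] / total) * math.log(n_docs / df[j]) for j in range(n_terms)]
        )
    return weighted


def binary_metrics(y_true, y_pred):
    """Precision, recall, F1 and accuracy for labels in {0, 1} (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Made-up example texts, not taken from either data set.
docs = [
    preprocess("The movie was amazing and moving"),
    preprocess("The flight was delayed and boring"),
]
vocab, counts = bag_of_words(docs)
weights = tf_idf(counts)
```

In a full pipeline, rows of `weights` (or `counts`) would be the input vectors to the classifier, and `binary_metrics` would be applied to its predictions on held-out data.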

Volume 12, Special Issue
December 2021
Pages 595-604
  • Receive Date: 17 March 2021
  • Revise Date: 09 May 2021
  • Accept Date: 23 June 2021