Presenting an approach based on weighted CapsuleNet networks for Arabic and Persian multi-domain sentiment analysis

Document Type: Research Paper

Authors

1 Department of Computer Engineering, North Tehran Branch, Islamic Azad University, Tehran, Iran

2 Department of Computer Engineering, Sari Branch, Islamic Azad University, Sari, Iran

Abstract

Sentiment classification is a fundamental task in natural language processing that assigns one of three classes (positive, negative, or neutral) to free text. However, sentiment classification models are highly domain dependent: a classifier may reach reasonable accuracy in one domain yet perform poorly in another, because the same word can carry different meanings across domains. This article presents a new Persian/Arabic multi-domain sentiment analysis method based on a cumulative weighted capsule network approach. The weighted capsule ensemble consists of separate capsule networks, one trained for each domain, and a weighting measure called the domain belonging degree (DBD). This measure is built from TF and IDF and quantifies how strongly each document depends on each domain; the value for a domain is multiplied by the output produced by that domain's capsule network. The sum of these products is taken as the final output and is used to determine the polarity, while the domain on which the document depends most is taken as its predicted domain. The proposed method was evaluated on the Digikala dataset and obtained acceptable accuracy compared with existing approaches: 0.89 for detecting the domain of belonging and 0.99 for detecting the polarity. In addition, a cost-sensitive function was used to deal with unbalanced classes, which improved sentiment classification accuracy by 0.0162. On Amazon Arabic data, the approach achieves an accuracy of 0.9695 in domain classification.
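The weighting step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact DBD formula, so the TF-IDF-style score below (per-domain term frequency times a smoothed inverse domain frequency, normalized to sum to 1) is an assumption. The function names, the toy English tokens, and the hard-coded `capsule_probs` values standing in for per-domain capsule network outputs are likewise hypothetical.

```python
import math
from collections import Counter


def fit_domain_stats(corpora):
    """corpora: {domain: [list of token lists]} -> (per-domain TF, IDF over domains)."""
    tf = {}
    for domain, docs in corpora.items():
        counts = Counter(tok for doc in docs for tok in doc)
        total = sum(counts.values())
        tf[domain] = {t: c / total for t, c in counts.items()}
    n_domains = len(corpora)
    vocab = set().union(*(terms.keys() for terms in tf.values()))
    # Assumed smoothed IDF: terms that occur in fewer domains get more weight.
    idf = {
        t: math.log((1 + n_domains) / (1 + sum(t in tf[d] for d in tf))) + 1.0
        for t in vocab
    }
    return tf, idf


def domain_belonging_degree(tokens, tf, idf):
    """Assumed DBD: TF-IDF score of the document per domain, normalized to sum to 1."""
    raw = {d: sum(tf[d].get(t, 0.0) * idf.get(t, 0.0) for t in tokens) for d in tf}
    total = sum(raw.values())
    if total == 0.0:
        # No known term in the document: fall back to uniform weights.
        return {d: 1.0 / len(raw) for d in raw}
    return {d: score / total for d, score in raw.items()}


def weighted_ensemble(dbd, capsule_probs):
    """Weight each domain model's class scores by its DBD and sum them."""
    classes = next(iter(capsule_probs.values())).keys()
    combined = {c: sum(dbd[d] * capsule_probs[d][c] for d in dbd) for c in classes}
    polarity = max(combined, key=combined.get)  # class with the largest weighted sum
    domain = max(dbd, key=dbd.get)              # domain the document depends on most
    return domain, polarity, combined


# Toy data (hypothetical); a real system would use tokenized Persian/Arabic reviews.
corpora = {
    "phones": [["battery", "screen", "good"], ["battery", "camera"]],
    "books": [["story", "author", "good"], ["story", "plot"]],
}
tf, idf = fit_domain_stats(corpora)
dbd = domain_belonging_degree(["battery", "good", "bad"], tf, idf)

# Stand-ins for the outputs of the per-domain capsule networks.
capsule_probs = {
    "phones": {"positive": 0.2, "negative": 0.8},
    "books": {"positive": 0.9, "negative": 0.1},
}
domain, polarity, combined = weighted_ensemble(dbd, capsule_probs)
print(domain, polarity)
```

In this sketch a document dominated by phone-related terms gives the "phones" model most of the weight, so its polarity estimate drives the final decision, while the other domain's model still contributes in proportion to its DBD.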

Volume 15, Issue 5
May 2024
Pages 247-260
  • Receive Date: 17 January 2023
  • Revise Date: 08 May 2023
  • Accept Date: 23 May 2023