Sentiment analysis for covid-19 in Indonesia on Twitter with TF-IDF featured extraction and stochastic gradient descent

Document Type: Research Paper

Authors

Department of Master in Informatic Engineering, Faculty of Computer Science and Information Technology, Universitas Sumatera Utara, Medan, Indonesia

Abstract

Twitter is an information platform open to any internet user, but the opinions its users post are unstructured and unclassified. Classifying the sentiment of these opinions requires a classification algorithm, one of which is Stochastic Gradient Descent (SGD). The more training data the machine is given, the higher the accuracy of the classification model it learns. However, when text is represented as numerical vectors, the large number of features makes the data high-dimensional. Feature optimization is therefore applied to the training data to reduce its dimensionality while maintaining high model accuracy. The feature optimization used here is TF-IDF (term frequency-inverse document frequency) feature extraction. Sentiment analysis using TF-IDF feature extraction and the stochastic gradient descent algorithm can classify Indonesian text into positive and negative sentiment. Classification with TF-IDF feature extraction and the stochastic gradient descent algorithm achieved an accuracy of 85.141%.
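The pipeline described in the abstract, TF-IDF weighting followed by a linear classifier trained with SGD, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the abstract does not state the preprocessing, loss function, or hyperparameters, so the logistic loss, the smoothed IDF, the learning rate, the epoch count, and the four toy tweets below are all assumptions made for illustration.

```python
import math
import random
from collections import Counter

def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc.split()))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(doc, idf):
    """Sparse, L2-normalised TF-IDF vector (dict) for one document."""
    tf = Counter(t for t in doc.split() if t in idf)  # unseen terms are dropped
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}

def score(x, w, b):
    """Linear decision score for a sparse vector."""
    return b + sum(w.get(t, 0.0) * v for t, v in x.items())

def train_sgd(vectors, labels, epochs=50, lr=0.5, seed=0):
    """Logistic regression fitted by SGD: one sample per weight update."""
    rng = random.Random(seed)
    w, b = {}, 0.0
    order = list(range(len(vectors)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = vectors[i], labels[i]
            p = 1.0 / (1.0 + math.exp(-score(x, w, b)))
            g = p - y  # gradient of the log loss with respect to the score
            for t, v in x.items():
                w[t] = w.get(t, 0.0) - lr * g * v
            b -= lr * g
    return w, b

def predict(doc, idf, w, b):
    """Return 1 for positive sentiment, 0 for negative."""
    return 1 if score(tfidf(doc, idf), w, b) > 0 else 0

# Invented toy tweets (1 = positive, 0 = negative); not data from the paper.
train_docs = [
    "vaksin covid bagus sekali",
    "pelayanan rumah sakit bagus",
    "covid membuat saya sedih",
    "penanganan covid buruk sekali",
]
train_labels = [1, 1, 0, 0]

idf = fit_idf(train_docs)
w, b = train_sgd([tfidf(d, idf) for d in train_docs], train_labels)
print(predict("vaksin bagus", idf, w, b), predict("penanganan buruk", idf, w, b))
```

A real experiment would add tokenization and Indonesian-specific preprocessing, a held-out test set, and tuned hyperparameters. The sketch only shows how each tweet becomes a weighted term vector and how the weights are updated one sample at a time.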

Keywords

Volume 13, Issue 1
March 2022
Pages 1367-1373
  • Receive Date: 05 May 2021
  • Accept Date: 12 October 2021