
CYBER BULLYING SENTIMENT ANALYSIS BASED ON SOCIAL CATEGORIES USING THE CHI-SQUARE TEST
DOI:
https://doi.org/10.69916/comtechno.v2i1.144
Keywords:
sentiment analysis, cyber bullying, chi-square, bag of words, classification
Abstract
This study evaluates five machine learning models for classifying sentiment in cyberbullying data across six categories: not_cyberbullying, gender, religion, other_cyberbullying, age, and ethnicity. Features are extracted with a Bag of Words representation and reduced to 1000 features by Chi-Square feature selection; the models tested are SVM, Logistic Regression, Naïve Bayes, KNN, and Random Forest. SVM and Logistic Regression achieved the highest accuracy at 83%, indicating their effectiveness on this task. Naïve Bayes performed worst with 62% accuracy, suggesting a mismatch with the data or a need for further tuning. KNN and Random Forest reached 75% and 81% accuracy respectively, good results though below SVM and Logistic Regression. Comparing multiple algorithms shows how each model behaves on data with diverse characteristics, which is essential for understanding the nuances of each cyberbullying category. Model selection should weigh accuracy, interpretability, computational cost, and suitability to the characteristics of the problem. The research aims to deepen understanding of cyberbullying in order to support more effective mitigation strategies.
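The feature-selection step described in the abstract can be illustrated with a minimal sketch. It assumes a whitespace tokenizer and a count-based chi-square statistic (observed per-class term counts against counts expected from class proportions); the abstract does not state the authors' exact preprocessing or implementation, so the function names, the tie-breaking rule, and the toy sentences below are hypothetical and serve only to show the idea.

```python
from collections import Counter, defaultdict


def bag_of_words(text):
    """Lowercase, split on whitespace, and count tokens (assumed tokenizer)."""
    return Counter(text.lower().split())


def chi_square_scores(texts, labels):
    """Score each term by a chi-square statistic against the class labels.

    observed[c][t] = total count of term t in documents of class c
    expected[c][t] = (share of documents in class c) * (total count of t)
    """
    observed = defaultdict(Counter)
    term_totals = Counter()
    class_docs = Counter(labels)
    for text, label in zip(texts, labels):
        counts = bag_of_words(text)
        observed[label].update(counts)
        term_totals.update(counts)

    n_docs = len(labels)
    scores = {}
    for term, total in term_totals.items():
        score = 0.0
        for label, n_in_class in class_docs.items():
            expected = total * n_in_class / n_docs
            score += (observed[label][term] - expected) ** 2 / expected
        scores[term] = score
    return scores


def select_top_k(scores, k=1000):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Toy data, invented for illustration only (not from the study's dataset).
texts = [
    "old people are slow",
    "you are old",
    "nice weather today",
    "what a nice day",
]
labels = ["age", "age", "not_cyberbullying", "not_cyberbullying"]

scores = chi_square_scores(texts, labels)
vocabulary = select_top_k(scores, k=3)
print(vocabulary)
```

Terms that occur in only one class receive higher scores than their raw frequency alone would suggest, which is why chi-square selection tends to retain category-specific vocabulary. In the study the cutoff is 1000 features; the selected vocabulary would then define the input vectors for the classifiers being compared.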
License
Copyright (c) 2024 Zulpan Hadi, Emi Suryadi, Ardiyallah Akbar, Zaenudin, Rudi Muslim

This work is licensed under a Creative Commons Attribution 4.0 International License.