
BERT SENTIMENT ANALYSIS FOR DETECTING FRAUDULENT MESSAGES
DOI: https://doi.org/10.69916/jkbti.v4i2.225
Keywords: BERT, fraud detection, machine learning, sentiment analysis, SMS classification
Abstract
As digital communication grows, fraudulent SMS messages have become an increasing concern. This study uses a BERT-based sentiment analysis approach to classify SMS messages into four categories: fraud, gambling, unsecured credit (KTA, Kredit Tanpa Agunan), and others. The categories were derived from content analysis and from patterns common in high-risk messages: suspicious transaction invitations (fraud), betting promotions (gambling), and offers of unsecured loans (KTA); messages that fit none of these three fall into the others category. The dataset consists of approximately 20,000 messages, which underwent data cleaning, tokenization, and manual labeling according to these criteria. The model was trained with the AdamW optimizer and CrossEntropyLoss as the loss function for multi-class classification. Training ran for 3 epochs, a number chosen from evaluation metrics on the validation data: accuracy began to plateau after the third epoch, and overfitting appeared in subsequent epochs. After training, the model achieved an average accuracy of 92%. This result indicates that BERT captures patterns in text messages well enough to classify message categories with high accuracy, and it supports the use of BERT for efficient detection and identification of fraudulent messages.
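To make the loss and the accuracy metric named in the abstract concrete, the following is a minimal pure-Python sketch of multi-class cross-entropy and accuracy over the four categories. It is not the authors' training code: the label order, the helper names (`softmax`, `cross_entropy`, `accuracy`), and the example scores are all assumptions chosen for illustration, and a real pipeline would obtain the scores from a fine-tuned BERT classifier rather than hard-code them.

```python
import math

# The four categories named in the abstract; the index order is an assumption.
LABELS = ["fraud", "gambling", "KTA", "others"]


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(batch_logits, targets):
    """Mean negative log-probability assigned to the true class."""
    losses = [
        -math.log(softmax(logits)[t])
        for logits, t in zip(batch_logits, targets)
    ]
    return sum(losses) / len(losses)


def accuracy(batch_logits, targets):
    """Fraction of messages whose highest-scoring class matches the label."""
    hits = sum(
        1
        for logits, t in zip(batch_logits, targets)
        if max(range(len(logits)), key=logits.__getitem__) == t
    )
    return hits / len(targets)


# Hypothetical classifier scores for three messages (not data from the study).
logits = [
    [4.0, 0.5, 0.2, 0.1],  # highest score: fraud
    [0.3, 3.5, 0.4, 0.2],  # highest score: gambling
    [0.2, 0.1, 1.0, 1.2],  # highest score: others, but the true label is KTA
]
targets = [0, 1, 2]  # indices into LABELS

print("loss:", round(cross_entropy(logits, targets), 4))
print("accuracy:", round(accuracy(logits, targets), 4))
```

The third message is deliberately misclassified so that the loss and the accuracy move in visibly different ways: the loss penalizes how little probability the true class received, while accuracy only counts whether the top-scoring class was correct.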
License
Copyright (c) 2025 Yuyun Yusnida Lase, Arif Aryaguna Nauli, Doni Ganda Marbungaran Mahulae

This work is licensed under a Creative Commons Attribution 4.0 International License.