COMPARATIVE STUDY OF CLASSIFICATION MODELS IN PROCESSING STUDENT TEST SCORES DATASETS
DOI:
https://doi.org/10.69916/jkbti.v5i2.475
Keywords:
machine learning, classification, student test scores, model comparison, model evaluation
Abstract
Machine Learning (ML) has contributed significantly to education, particularly in analyzing student academic data to support data-driven decision-making. Predicting student exam results helps identify academic performance patterns, detect potential failures, and improve learning interventions. However, variation in student characteristics and dataset complexity means that the classification model must be chosen carefully to achieve good prediction performance. This study compares the effectiveness of several ML classification models in predicting student exam results from a student academic dataset. The dataset consists of 306 records with seven attributes, including attendance, quiz scores, midterm examination scores, final examination scores, and assignment scores, and five grade classes (A, B, C, D, and E). Data preprocessing handled missing values, duplicates, inconsistencies, and outliers. The dataset was split into training and testing data at a ratio of 75:25, and the models were evaluated using 10-fold cross-validation. Five classification models were applied: k-Nearest Neighbour (kNN), Decision Tree, Naive Bayes, Support Vector Machine (SVM), and Random Forest. Model performance was measured using accuracy, precision, recall, and F1-score. Random Forest achieved the best performance, with an accuracy of 73.9%, precision of 74.0%, recall of 73.9%, and F1-score of 73.9%, followed by Naive Bayes and Decision Tree. SVM produced the lowest performance among the tested models. The findings indicate that, on this dataset, Random Forest is the most effective of the compared models for predicting student exam results and has strong potential to support educational decision-making systems.
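The four evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes support-weighted averaging across the five grade classes; the abstract does not state which averaging scheme the authors used, so this is an assumption, as are the function name `weighted_metrics` and the made-up labels. Under weighted averaging, recall is always equal to accuracy, which is consistent with the identical 73.9% values reported, though it does not confirm the scheme.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    support = Counter(y_true)    # number of true samples per class
    predicted = Counter(y_pred)  # number of predictions per class
    # Correct predictions (true positives) per class.
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = correct[label]
        p = tp / predicted[label] if predicted[label] else 0.0
        r = tp / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n       # class share of the true labels
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    accuracy = sum(correct.values()) / n
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration only; these are not the study's data.
y_true = ["A", "A", "B", "B", "C", "C", "D", "E"]
y_pred = ["A", "B", "B", "B", "C", "D", "D", "E"]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```

With macro averaging (an unweighted mean over classes) recall would generally differ from accuracy, so the choice of averaging matters when comparing models on imbalanced grade classes.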
License
Copyright (c) 2026 Rico Pramestiawan, Arry Verdian, Chindu Lintang Bhuana, Lilik Joko Susanto

This work is licensed under a Creative Commons Attribution 4.0 International License.