Evaluating Random Forest Algorithm in Educational Data Mining: Optimizing Graduation on-time prediction using Imbalance Methods

  • Rizal Bakri STIEM Bongaya (ID)
  • Niken Probondani Astuti Statistics Research Group, STIEM Bongaya (ID)
  • Ansari Saleh Ahmar Department of Statistics, Universitas Negeri Makassar, Makassar, 90223, Indonesia (ID)
Keywords: Educational Data Mining, Random Forest Algorithm, Imbalance data methods


Abstract

This study evaluates the performance of the Random Forest algorithm in educational data mining by optimizing graduation on-time (GOT) prediction with imbalanced-data methods. The methods used to handle imbalanced data are random under-sampling (RUS), random over-sampling (ROS), a hybrid of RUS and ROS, the synthetic minority over-sampling technique for nominal and continuous features (SMOTE-NC), and a hybrid of SMOTE-NC and RUS. After applying each method, the study analyzes its performance on training and testing data. On training data, the RUS-ROS hybrid performed best among the methods compared, while on testing data the SMOTE-NC and RUS hybrid performed best based on AUC values. The results show that using an imbalanced-data method significantly improved the ability of the Random Forest algorithm to predict GOT in the context of educational data. We discuss the implications for educational data mining applications and provide suggestions for future research.
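To make the resampling terms in the abstract concrete, the following is a minimal sketch of RUS, ROS, and one possible RUS-ROS hybrid, written with the Python standard library only. It is not the authors' implementation: the helper `resample_to`, the class labels `on_time` and `late`, the 90/10 class split, and the choice of the midpoint as the hybrid's target size are all assumptions made for illustration, since the abstract does not state these details.

```python
import random
from collections import Counter, defaultdict


def resample_to(rows, labels, target, seed=0):
    """Resample every class to exactly `target` rows (hypothetical helper).

    Classes with at least `target` rows are under-sampled without replacement;
    smaller classes keep all their rows and gain random duplicates.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)

    out_rows, out_labels = [], []
    for label, group in groups.items():
        if len(group) >= target:
            # Random under-sampling: keep a random subset of this class.
            chosen = rng.sample(group, target)
        else:
            # Random over-sampling: duplicate existing rows with replacement.
            chosen = group + rng.choices(group, k=target - len(group))
        out_rows.extend(chosen)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Toy imbalanced data (assumed 90/10 split; rows are just record ids).
rows = list(range(100))
labels = ["on_time"] * 90 + ["late"] * 10

sizes = Counter(labels)
smallest, largest = min(sizes.values()), max(sizes.values())

# RUS: shrink every class to the size of the smallest class.
rus_rows, rus_labels = resample_to(rows, labels, smallest)
# ROS: grow every class to the size of the largest class.
ros_rows, ros_labels = resample_to(rows, labels, largest)
# Hybrid (assumed midpoint target): under-sample the majority, over-sample the minority.
hyb_rows, hyb_labels = resample_to(rows, labels, (smallest + largest) // 2)

print("RUS:", Counter(rus_labels))
print("ROS:", Counter(ros_labels))
print("Hybrid:", Counter(hyb_labels))
```

SMOTE-NC is not shown here because it creates synthetic minority rows from nearest neighbours rather than duplicating existing ones, which is usually done through a library such as imbalanced-learn. In a typical workflow, resampling of this kind is applied to the training split only, so that the testing data keeps its original class distribution when AUC is computed.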



Published
2024-02-27
Section
Articles
How to Cite
Bakri, R., Astuti, N. P., & Ahmar, A. S. (2024). Evaluating Random Forest Algorithm in Educational Data Mining: Optimizing Graduation on-time prediction using Imbalance Methods. ARRUS Journal of Social Sciences and Humanities, 4(1), 108-116. https://doi.org/10.35877/soshum2449