Dr. BOGNÁR LÁSZLÓ'S Abstract Title

How to Build Good Machine Learning Models in Moodle
College Professor of Applied Statistics
University of Dunaújváros

Moodle is one of the world’s most popular open-source learning platforms, with millions of users, and it is widely used in online education. Although Machine Learning models can be created within the system since version 3.4, very few studies discussing successful applications have been published to date. This is probably because creating a model with good predictive power requires a very wide range of activities: already at the course planning stage, the course must be designed so that a machine learning model can follow and map it well. The content and structure of the course, the types of learning resources used, the number of students taking the course, the degree of difficulty of the course, and the nature of the student activities can all affect the quality of a Machine Learning model. These aspects are mostly embodied in the so-called predictors of a model (called indicators in Moodle), which quantify the different characteristics of the students and their activities during the course. Moodle has its own built-in predictors, but users can also create their own, so-called self-defined predictors.
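To illustrate how a predictor can quantify a student activity, the sketch below maps a raw activity count onto a bounded numeric scale. It assumes the range [-1, 1], which is the range we understand Moodle indicators to use. The function name `scale_indicator` and the `cap` parameter are hypothetical and are not part of Moodle; actual Moodle indicators are written as PHP classes, so this is only a language-neutral illustration of the idea.

```python
def scale_indicator(count, cap):
    """Map a raw activity count in [0, cap] linearly onto [-1.0, 1.0].

    `cap` is a hypothetical upper bound (for example, the number of
    views regarded as full engagement). Counts outside [0, cap] are
    clipped to the nearest bound before scaling.
    """
    if cap <= 0:
        raise ValueError("cap must be positive")
    clipped = max(0, min(count, cap))
    return 2.0 * clipped / cap - 1.0


# Example: 5 views of a resource, where 10 views counts as full engagement.
print(scale_indicator(5, 10))
```

Clipping keeps a few extremely active students from stretching the scale for everyone else; whether such a cap suits a given course is a design decision.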

In this presentation, we review the validity and impact of the above considerations using 16 different Moodle Machine Learning models with self-defined predictors, built to predict the success of 56 full-time and 82 correspondence students enrolled in the Applied Statistics course at the University of Dunaújváros in Hungary.

Here, students’ cognitive activities are examined. The predictors used in the models are based on the number of views of Lecture Notes, Exercise Books, Lecture Videos, and Minitab Videos (videos on problem solving with statistical software), as well as the number of Quiz Attempts and the Quiz Max Grades (the best grades achieved by students on quizzes). The models differed in the number and types of predictors. Binary Logistic Regression was used for model training and evaluation. Comparing the performance of students in full-time and correspondence courses is particularly interesting, as learning habits in the two forms of study can differ significantly. The target of the models indicates whether a student is at risk of not achieving the minimum grade required to pass the course.
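The setup described above can be sketched roughly as follows: a binary logistic regression that takes activity-based predictors and outputs whether a student is at risk. This is a minimal, standard-library-only illustration and not the Moodle implementation. The predictor names, the synthetic data, the labelling rule, and all function names are assumptions made for the example; no values here come from the study.

```python
import math
import random

# Hypothetical predictor names mirroring the activity types listed above.
PREDICTORS = [
    "lecture_note_views", "exercise_book_views", "lecture_video_views",
    "minitab_video_views", "quiz_attempts", "quiz_max_grade",
]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def risk_probability(w, b, x):
    """Modelled probability that a student with predictors x is at risk."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X, y, lr=0.5, epochs=1000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = risk_probability(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_at_risk(w, b, x, threshold=0.5):
    """True if the modelled risk reaches the decision threshold."""
    return risk_probability(w, b, x) >= threshold


# Synthetic data (an assumption): predictors already scaled to [0, 1],
# and a student is labelled at risk (1) when mean activity is below 0.5.
random.seed(0)
X = [[random.random() for _ in PREDICTORS] for _ in range(120)]
y = [1 if sum(row) / len(row) < 0.5 else 0 for row in X]

w, b = fit_logistic(X, y)
hits = sum(predict_at_risk(w, b, xi) == bool(yi) for xi, yi in zip(X, y))
accuracy = hits / len(X)
print("training accuracy:", round(accuracy, 3))
```

The accuracy here is measured on the training data only, which is enough to show the mechanics. Judging real predictive power would need held-out students or cross-validation.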

The impact on predictive power of the cognitive predictors that are part of the Moodle core Analytics API was also examined. Models built purely from Moodle core cognitive predictors yield much less reliable results.