Maryland Assessment Research Center (MARC)

Cheating Detection

  • Use of large language models (LLMs) for automated detection of AI-generated essays

     

  • Exploration of the Stacking Ensemble Machine Learning Algorithm for Cheating Detection in Large-Scale Assessment (2023) (Read our work here)
    Cheating detection in large-scale assessment has received considerable attention in the literature. However, no previous study in this line of research has investigated the stacking ensemble machine learning algorithm for cheating detection, and none has addressed class imbalance using resampling. This study explored the application of the stacking ensemble machine learning algorithm to the item responses, response times, and augmented data of test-takers to detect cheating behaviors. The performance of the stacking method was compared with that of two other ensemble methods (bagging and boosting) as well as six base non-ensemble machine learning algorithms. Issues related to class imbalance and input features were addressed. The results indicated that stacking, resampling, and feature sets including augmented summary data generally performed better than their counterparts in cheating detection. Across all study conditions, the best-performing model was generally the stacking meta-model that applied discriminant analysis to the top two base models (Gradient Boosting and Random Forest), with item responses and the augmented summary statistics as input features and an under-sampling ratio of 10:1.
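To make the resampling step concrete, the following is a minimal sketch of random under-sampling to a 10:1 ratio. It is an illustration, not the study's procedure: it assumes the ratio means ten retained non-cheating cases (label 0) per flagged case (label 1), and the function name `undersample` and the toy labels are hypothetical.

```python
import random


def undersample(labels, ratio=10, seed=0):
    """Return sorted indices that keep every minority case (label 1)
    and at most `ratio` majority cases (label 0) per minority case.

    Assumption: a 10:1 ratio is read as majority:minority.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    # Never request more majority cases than exist.
    n_keep = min(len(majority), ratio * len(minority))
    kept_majority = rng.sample(majority, n_keep)
    return sorted(minority + kept_majority)


# Toy data: 5 flagged test-takers among 505.
labels = [1] * 5 + [0] * 500
idx = undersample(labels)
print(len(idx), sum(labels[i] for i in idx))
```

The retained indices would then be used to select the training rows before fitting any base model or meta-model; the held-out evaluation data would normally be left at its original class distribution.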

 

  • Integrating Psychometric Analysis and Machine Learning to Augment Data for Cheating Detection in Large-Scale Assessment (2023) (Read our work here)
    Machine learning methods have been explored for cheating detection in large-scale assessment programs. While some studies analyzed item response and response time (RT) data, a few experimented with other data and with data augmentation. However, no study has explored data augmentation that integrates results from both psychometric analysis and machine learning for cheating detection, although Kim et al. (2016) compared the results from these two perspectives. This study explored data augmentation in stacking ensemble learning for cheating detection by adding both person-fit measures from psychometric analysis and outlier measures from anomaly detection methods. An empirical data set from a high-stakes large-scale testing program was used to demonstrate the application of the proposed method. Class imbalance was addressed via resampling, and the performance of the proposed method was compared with that of alternative approaches. The proposed data augmentation approach was found to increase cheating detection accuracy, achieving an F1 score higher than the values reported in similar studies.
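Because flagged cases are rare, plain accuracy can look high even when few cheaters are caught, which is why the F1 score is a natural criterion here. The sketch below shows how F1 is computed from predicted and true labels; the function name and the toy labels are hypothetical and are not results from the study.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both 0 (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: 3 true cheaters among 10 test-takers.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
f1 = f1_score(y_true, y_pred)
print(accuracy, f1)
```

In this toy case accuracy is 0.8 while F1 is about 0.67, since one cheater is missed and one non-cheater is wrongly flagged.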

 

  • Data Augmentation in Machine Learning for Cheating Detection in Large-Scale Assessment: An Illustration with the Blending Ensemble Learning Algorithm (2022) (Read our work here)
    Machine learning methods have been explored for cheating detection in large-scale assessment in recent years. Most of these studies analyzed item response and response time data. Although a few studies investigated data augmentation in the feature space, data augmentation in machine learning for cheating detection has yet to be thoroughly investigated. This study explored data augmentation of the feature space at the meta-model level of blending ensemble learning for cheating detection. Outlier scores assigned by four anomaly detection techniques were added to the meta-model's input data, alongside the most informative features of the original dataset as identified by four feature selection methods. The performance of the meta-model with data augmentation was compared with that of each base model and of the meta-model without data augmentation. Based on the evaluation criteria, the best-performing meta-model with data augmentation was identified. In general, data augmentation in blending ensemble learning greatly improved the accuracy of cheating detection compared with the alternative approaches.
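The idea of augmenting the feature space with outlier scores can be sketched as follows. The abstract does not name the four anomaly detection techniques, so this example substitutes a deliberately simple stand-in (the mean absolute z-score across features); the function name, the feature columns, and the toy values are all hypothetical.

```python
from statistics import mean, pstdev


def augment_with_outlier_score(rows):
    """Append one outlier score to each feature row.

    Stand-in scoring rule (an assumption, not the study's method):
    the mean absolute z-score of the row across all feature columns.
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Guard against constant columns, whose standard deviation is 0.
    sds = [pstdev(col) or 1.0 for col in columns]
    augmented = []
    for row in rows:
        score = mean(abs(x - m) / s for x, m, s in zip(row, means, sds))
        augmented.append(list(row) + [score])
    return augmented


# Toy features per test-taker: [proportion correct, mean response time in seconds].
rows = [[0.80, 42.0], [0.75, 40.0], [0.78, 41.0], [0.99, 5.0]]
augmented = augment_with_outlier_score(rows)
most_unusual = max(range(len(augmented)), key=lambda i: augmented[i][-1])
print(most_unusual, augmented[most_unusual])
```

In a blending setup, a column like this would be passed to the meta-model together with the base models' predictions and the selected original features, so the meta-model can weigh how unusual each test-taker looks.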