D. Martínez-Galicia, A. Guerra-Hernández, X. Limón, N. Cruz-Ramírez, and F. Grimaldo. A Bayesian Network Framework to Study Class Noise: Exploring the Filtering of Completely Random Noise, volume 375 of Frontiers in Artificial Intelligence and Applications, pages 128–131. IOS Press, Amsterdam, The Netherlands, 2023.
Abstract. Although the negative consequences of noise during induction have been widely studied, previous work often does not use validated data to measure its impact. We propose a framework based on Bayesian Networks for modeling class noise and generating synthetic datasets in which the kind and amount of class noise are under control. The benefits of the proposed approach are illustrated by evaluating the filtering of noise completely at random in class labels when inducing decision trees. Unexpectedly, this kind of noise showed a low effect on accuracy and a low occurrence in real datasets. The framework and the methodology developed here seem promising for studying other kinds of noise in class labels.
Keywords. Noise modeling, Bayesian Networks, Data generation, Noise filtering.
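As an illustration of what "noise completely at random" in class labels means, the following is a minimal Python sketch and not the Bayesian Network framework proposed in the paper. It assumes the usual reading of the term: each label is corrupted with a fixed probability that depends neither on the attributes nor on the true class, and a corrupted label is replaced by a different class chosen uniformly. The function name `inject_ncar_noise` and the parameters `noise_rate` and `seed` are hypothetical, introduced only for this example.

```python
import random


def inject_ncar_noise(labels, classes, noise_rate, seed=0):
    """Return a copy of labels with class noise completely at random.

    Assumption for this sketch: each label is flipped with probability
    noise_rate, independently of the attributes and of the true class,
    and the replacement is drawn uniformly from the other classes.
    """
    rng = random.Random(seed)  # seeded so the corruption is reproducible
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            # Corrupt: pick uniformly among the classes other than y.
            noisy.append(rng.choice([c for c in classes if c != y]))
        else:
            # Keep the clean label.
            noisy.append(y)
    return noisy


# Toy usage: a balanced two-class label vector with 20% injected noise.
clean = ["a", "b"] * 500
noisy = inject_ncar_noise(clean, classes=["a", "b"], noise_rate=0.2)
observed_rate = sum(c != n for c, n in zip(clean, noisy)) / len(clean)
print(f"observed noise rate: {observed_rate:.3f}")
```

Because the clean labels are kept alongside the corrupted ones, the amount of injected noise is known exactly, which is the kind of control over noise that the abstract argues is needed to measure its impact.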