The Fault in Our Data Stars: Studying Mitigation Techniques against Faulty Training Data in Machine Learning Applications

2022 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2022

Abstract
Machine learning (ML) has been adopted in many safety-critical applications such as automated driving and medical diagnosis. Incorrect decisions by ML models can lead to catastrophic consequences, such as vehicle crashes and inappropriate medical procedures, thereby endangering lives. The correct behaviour of an ML model is contingent upon the availability of well-labelled training data. However, obtaining large, high-quality training datasets for safety-critical applications is difficult, which often results in the use of faulty training data. We compare the efficacy of five different error mitigation techniques, derived from a survey of more than 200 related articles, which are designed to tolerate noisy or faulty training data. We experimentally find that the error mitigation capabilities of these techniques vary across datasets, ML models, and different kinds of faults. We further find that ensemble learning offers the highest resilience among all the techniques across different configurations, followed by label smoothing.
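
The abstract names ensemble learning and label smoothing as the two most resilient techniques but does not describe how they are implemented. As a rough illustration only, the sketch below uses the textbook formulations: label smoothing replaces a hard one-hot target with (1 - epsilon) on the labelled class plus epsilon / K spread uniformly over K classes, and the ensemble combines member predictions by majority vote. The function names, the default epsilon of 0.1, and the choice of majority voting are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter


def smooth_labels(label, num_classes, epsilon=0.1):
    """Return a smoothed target distribution for one training label.

    Hypothetical helper: the labelled class keeps (1 - epsilon) of the
    probability mass, and epsilon is spread uniformly over all classes,
    so a wrong label is trusted slightly less during training.
    """
    if not 0 <= label < num_classes:
        raise ValueError("label must be in the range [0, num_classes)")
    uniform = epsilon / num_classes
    return [
        (1.0 - epsilon) + uniform if k == label else uniform
        for k in range(num_classes)
    ]


def majority_vote(predictions):
    """Combine one predicted class per ensemble member by majority vote.

    Assumed combination rule: ties resolve to the class that appears
    first in the input list.
    """
    if not predictions:
        raise ValueError("predictions must not be empty")
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    # A 4-class target for class 1, softened with epsilon = 0.1.
    print(smooth_labels(1, 4, epsilon=0.1))
    # Three models trained on noisy data disagree; the vote picks class 2.
    print(majority_vote([2, 0, 2]))
```

The intuition in both cases is that no single possibly mislabelled example, or single model trained on one, is trusted completely.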
Key words
Error resilience, Machine learning, Training