Mitigating data imbalance to improve the generalizability in IoT DDoS detection tasks

The Journal of Supercomputing (2023)

Abstract
DDoS attacks launched by IoT botnets can be classified as low-rate or high-rate, which correspond to two distinct data imbalance scenarios: attack traffic making up a minimal or a maximal proportion of the data, respectively. A detection model that handles both imbalance scenarios at once is crucial for safeguarding computers against DDoS attacks, and this requires enhanced generalizability. However, studies of cross-dataset evaluation, which is closely tied to model generalizability, have paid little attention to data imbalance. As a result, model performance drops significantly compared with training and testing on a single dataset. To identify guiding principles in this context, this paper proposes undersampling benign instances during preprocessing to eliminate redundant data. It also simulates various cross-evaluation scenarios and evaluates classifier performance after applying representative oversamplers, undersamplers, and the proposed method. The experimental results suggest employing oversamplers for low-rate DDoS datasets and undersamplers for high-rate DDoS datasets. Moreover, optimizing SMOTE-based algorithms for specific models yields the best performance.
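The resampling guideline in the abstract can be illustrated with a minimal sketch. This is not the method proposed in the paper: it uses plain random oversampling and random undersampling as stand-ins for the "representative" resamplers, and the function names, the `attack_label` parameter, and the 0.5 threshold that separates the low-rate case from the high-rate case are all assumptions made for illustration.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class
    matches the size of the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class so every class matches the size
    of the smallest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for cls in counts:
        pool = [r for r, y in zip(rows, labels) if y == cls]
        out_rows.extend(rng.sample(pool, target))
        out_labels.extend([cls] * target)
    return out_rows, out_labels


def rebalance(rows, labels, attack_label=1, seed=0):
    """Choose a resampler from the share of attack traffic.

    Assumption for illustration: attack share below 0.5 is treated as the
    low-rate scenario (oversample), otherwise as the high-rate scenario
    (undersample).
    """
    attack_share = sum(1 for y in labels if y == attack_label) / len(labels)
    if attack_share < 0.5:
        return random_oversample(rows, labels, seed)
    return random_undersample(rows, labels, seed)
```

In practice the random oversampler would likely be swapped for a SMOTE-based one, since the abstract singles out SMOTE-based algorithms; the dispatch logic stays the same.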
Key words
Data imbalance, IoT botnets, DDoS, Cross-evaluation, Model generalization, Resampling, Transformer