Exploring class imbalance with under-sampling, over-sampling, and hybrid sampling based on Mahalanobis distance for landslide susceptibility assessment: a case study of the 2018 Iburi earthquake induced landslides in Hokkaido, Japan

GEOSCIENCES JOURNAL (2023)

Abstract
This study evaluates resampling with under-sampling, over-sampling, and hybrid sampling techniques in a random forest (RF) model for landslide susceptibility assessment (LSA). The study area is Hokkaido, Japan, where the 2018 Iburi earthquake triggered 5,625 landslides in a single event. The objective is to address the class imbalance issue and improve the accuracy of LSA. Conditioning factors are derived from multiple data sources, and objective absence data sampling based on Mahalanobis distance is employed to tackle the unlabeled sample problem. The RF model is used to calculate landslide susceptibility values and produce the LSA. These values are then evaluated with two diagnostic measures suited to validating and interpreting binary classification models on imbalanced data: the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). The results show improved performance with larger sample sizes, and the resampling approach yields better consistency than random sampling within the study area. To enhance the accuracy and consistency of machine learning techniques used to reduce landslide risk, the study recommends the hybrid sampling technique together with Mahalanobis distance-based absence data sampling in LSA.
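As an illustration of the absence-sampling idea mentioned in the abstract, the following is a minimal sketch that assumes each location is described by a row of numeric conditioning factors. The abstract does not state the selection rule used in the paper, so the quantile threshold, the function names `mahalanobis_distances` and `sample_absence`, and their parameters are all hypothetical. The sketch computes the Mahalanobis distance of each unlabeled candidate to the distribution of landslide (presence) samples and draws absence samples only from candidates that lie far from it.

```python
import numpy as np


def mahalanobis_distances(candidates, presence):
    """Mahalanobis distance of each candidate row to the presence distribution."""
    mean = presence.mean(axis=0)
    cov = np.cov(presence, rowvar=False)
    # Pseudo-inverse tolerates a near-singular covariance matrix.
    cov_inv = np.linalg.pinv(cov)
    diff = candidates - mean
    # Row-wise quadratic form: diff_i^T * cov_inv * diff_i
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(d2, 0.0))


def sample_absence(candidates, presence, n_samples, quantile=0.5, seed=0):
    """Draw absence indices from candidates far from the presence samples.

    Assumption (not from the paper): a candidate is eligible when its
    distance is at or above the given quantile of all candidate distances.
    """
    d = mahalanobis_distances(candidates, presence)
    threshold = np.quantile(d, quantile)
    eligible = np.flatnonzero(d >= threshold)
    rng = np.random.default_rng(seed)
    n = min(n_samples, eligible.size)
    return rng.choice(eligible, size=n, replace=False)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    presence = rng.standard_normal((200, 3))            # synthetic landslide cells
    candidates = rng.standard_normal((1000, 3)) * 2.0   # synthetic unlabeled cells
    idx = sample_absence(candidates, presence, n_samples=50)
    print(len(idx), "absence samples selected")
```

The selected indices could then be combined with the presence samples to form the training set that an under-, over-, or hybrid-sampling step would rebalance before fitting the RF model.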
Keywords
landslide susceptibility assessment, class imbalance, Iburi earthquake, landslides, under-sampling, over-sampling