A class-rebalancing self-training semisupervised learning for imbalanced data lithology identification

Geophysics (2023)

Abstract
Lithologic identification plays a crucial role in petroleum geologic exploration, and machine learning (ML) has become increasingly prevalent in intelligent lithology identification in recent years. However, identifying lithologies is challenging because lithologic labels are scarce and the distribution of lithologies is imbalanced. To address these issues and obtain satisfactory lithologic identification results, this study investigates a class-rebalancing self-training (CReST) lithology identification framework. The framework takes logging data and limited lithologic labels as input and achieves promising lithology classification through the CReST approach. Four ML algorithms with high overall performance are selected from 25 common algorithms to establish CReST models: the bagging classifier, extra trees classifier, random forest classifier, and support vector classifier. The classification results of the models are compared and analyzed under three conditions. The experimental findings indicate that (1) under label scarcity, recognition performance varies greatly between categories with different sample numbers; (2) under self-training (ST), overall performance improves, but the performance gap caused by category imbalance also widens; and (3) under the CReST framework, the model effectively resolves the identification problems caused by a lack of labels and an imbalanced category distribution. Specifically, the precision of identifying categories with fewer samples is improved by more than 20%.
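The abstract does not spell out how pseudo-labels are rebalanced, so the following is only a minimal sketch of one commonly used class-rebalancing rule for self-training, not necessarily the rule used in the paper. It assumes that pseudo-labels predicted as a given class are kept at a rate that mirrors the class frequency ranking, so the rarest class keeps all of its pseudo-labels and the most frequent class keeps the fewest. The function names, the `alpha` exponent, and the lithology names in the usage note are hypothetical and chosen for illustration.

```python
from collections import Counter


def rebalancing_rates(labels, alpha=1.0):
    """Return a per-class keep rate for pseudo-labels (assumed rule).

    Classes are ranked from most to least frequent. The class at rank i
    gets the rate (count of the class at the mirrored rank / largest
    count) ** alpha, so rarer classes keep a larger share.
    """
    counts = Counter(labels)
    # Sort by descending count; ties are broken by class name.
    ranked = sorted(counts, key=lambda c: (-counts[c], str(c)))
    largest = counts[ranked[0]]
    rates = {}
    for i, cls in enumerate(ranked):
        mirrored = ranked[len(ranked) - 1 - i]
        rates[cls] = (counts[mirrored] / largest) ** alpha
    return rates


def select_pseudo_labels(predictions, rates):
    """Keep the most confident pseudo-labels of each class.

    predictions: list of (sample_id, predicted_class, confidence).
    The number kept per class is rate * (number predicted), rounded down.
    Classes without a rate (unseen in the labeled set) keep nothing.
    """
    by_class = {}
    for sample_id, cls, confidence in predictions:
        by_class.setdefault(cls, []).append((confidence, sample_id))
    selected = []
    for cls, items in by_class.items():
        items.sort(reverse=True)  # highest confidence first
        keep = int(rates.get(cls, 0.0) * len(items))
        selected.extend((sample_id, cls) for _, sample_id in items[:keep])
    return selected
```

As a usage illustration with made-up counts: for a labeled set of 100 "sandstone", 40 "mudstone", and 10 "limestone" samples, `rebalancing_rates` gives sandstone a keep rate of 0.1 and limestone a keep rate of 1.0. In each self-training round the selected pseudo-labeled samples would be added to the labeled set before retraining the classifier, which pushes the training distribution toward the minority classes.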
Keywords
imbalanced data lithology identification, learning, class-rebalancing, self-training, semi-supervised