谷歌Chrome浏览器插件
订阅小程序
在清言上使用

GreenRu: A Russian Dataset for Detecting Mentions of Green Practices in Social Media Posts

Olga Zakharova, Anna Glazkova

APPLIED SCIENCES-BASEL(2024)

引用 0|浏览0
暂无评分
摘要
Green practices are social practices that aim to harmonize the relations between people and the natural environment. They may involve minimizing the use of resources and the generation of waste and emissions. Detecting green practices in social media posts helps to understand which green practices are currently common and to develop recommendations on the scaling of green practices to reduce environmental problems. This paper describes GreenRu, a novel Russian social media dataset for detecting the mentions of green practices related to waste management. It has a sentence-level markup and consists of 1326 posts collected in Russian online communities. The total number of mentions of green waste practices is 3765. The paper assessed the effectiveness of the multi-label and one-versus-rest BERT-based models for detecting the mentions of green practices in social media posts and compared several data augmentation methods in terms of both classification metrics and human evaluation. To augment the dataset, a backtranslation method and generative language models, such as RuGPT, RuT5, and ChatGPT, were used in this study. The results enable researchers to monitor the green waste practices on social networks and develop environmental policies. Additionally, GreenRu can support machine learning models to analyze social media content, assess the prevalence and effectiveness of green waste practices, and identify ways to expand them.
更多
查看译文
关键词
dataset,social media posts,green practices,waste management,text classification,data augmentation
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要