Leveraging posts’ and authors’ metadata to spot several forms of abusive comments in Twitter

Journal of Intelligent Information Systems (2023)

Abstract
Social media is frequently plagued by undesirable phenomena such as cyberbullying and abusive content in the form of hateful and racist posts. It is therefore crucial to study and propose better mechanisms to automatically identify communications that promote hate speech, hostility, and aggressiveness. Traditional approaches have focused solely on exploiting the content and writing style of social media posts while ignoring information related to their context. Several recent works have reported interesting findings in this direction, but they have lacked both an exhaustive analysis of contextual information and an evaluation of whether the same premise holds for detecting different types of abusive comments, e.g. offensive, hostile, and hateful ones. To this end, we have extended seven Twitter benchmark datasets related to the detection of offensive, aggressive, hostile, and hateful communication. We evaluate our hypothesis using three different learning models, considering classical (Bag of Words), advanced (GloVe), and state-of-the-art (BERT) text representations. Experiments show statistically significant differences between the classification scores of all methods that combine text and metadata and those of the classical approach that uses only the text content of the messages, suggesting the importance of paying attention to context in order to spot the different kinds of abusive comments on social networks.
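The core idea of the abstract, combining a text representation of a post with metadata about the post and its author, can be sketched as follows. This is an illustrative outline only, not the authors' pipeline: the metadata fields, the toy data, and the choice of classifier are all hypothetical, using a Bag-of-Words representation as one of the three mentioned in the paper.

```python
# Hedged sketch: joining Bag-of-Words text features with post/author
# metadata for abusive-comment classification. Field names and data
# are hypothetical; the paper's actual features and models differ.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy dataset: tweet text plus simple (invented) context features.
df = pd.DataFrame({
    "text": ["you are awful", "have a nice day",
             "everyone hates you", "great talk today"],
    "followers": [10, 5000, 3, 800],          # author metadata (hypothetical)
    "account_age_days": [30, 2000, 5, 900],   # author metadata (hypothetical)
    "label": [1, 0, 1, 0],                    # 1 = abusive, 0 = not abusive
})

# Text content and metadata are vectorized separately, then concatenated.
features = ColumnTransformer([
    ("bow", CountVectorizer(), "text"),                             # content
    ("meta", StandardScaler(), ["followers", "account_age_days"]),  # context
])

clf = Pipeline([("features", features), ("model", LogisticRegression())])
clf.fit(df[["text", "followers", "account_age_days"]], df["label"])
preds = clf.predict(df[["text", "followers", "account_age_days"]])
print(preds)
```

The same `ColumnTransformer` structure lets the metadata branch be dropped to reproduce the text-only baseline the paper compares against.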
Keywords
Abusive language, Text classification, Twitter, Metadata