An Investigation of Large Language Models for Real-World Hate Speech Detection
CoRR (2024)
Abstract
Hate speech has emerged as a major problem plaguing our social spaces today.
While there have been significant efforts to address this problem, existing
methods are still significantly limited in effectively detecting hate speech
online. A major limitation of existing methods is that hate speech detection is
a highly contextual problem, and these methods cannot fully capture the context
of hate speech to make accurate predictions. Recently, large language models
(LLMs) have demonstrated state-of-the-art performance in several natural
language tasks. LLMs have undergone extensive training using vast amounts of
natural language data, enabling them to grasp intricate contextual details.
Hence, they could be used as knowledge bases for context-aware hate speech
detection. However, a fundamental problem with using LLMs to detect hate speech
is that there are no studies on effectively prompting LLMs for context-aware
hate speech detection. In this work, we conduct a large-scale study of hate
speech detection, employing five established hate speech datasets. We discover
that LLMs not only match but often surpass the performance of current benchmark
machine learning models in identifying hate speech. We propose four diverse
prompting strategies that optimize the use of LLMs in detecting hate speech.
Our study reveals that a meticulously crafted reasoning prompt can effectively
capture the context of hate speech by fully utilizing the knowledge base in
LLMs, significantly outperforming existing techniques. Furthermore, although
LLMs can provide a rich knowledge base for the contextual detection of hate
speech, suitable prompting strategies play a crucial role in effectively
leveraging this knowledge base for efficient detection.
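
To make the idea of a reasoning prompt concrete, the following is a minimal sketch of how one might be assembled and how a model's reply could be parsed. The abstract does not reproduce the paper's four prompting strategies, so the template wording, the `HATE` / `NOT_HATE` label format, and the function names `build_reasoning_prompt` and `parse_label` are all hypothetical illustrations, not the authors' prompts. The call to an actual LLM is deliberately left out.

```python
from typing import Optional

# Hypothetical template: asks the model to reason about context before labeling.
# This wording is an assumption for illustration, not the prompt used in the paper.
REASONING_TEMPLATE = (
    "You are a content moderation assistant.\n"
    "Post: {post}\n"
    "Step 1: Identify who or what group the post refers to.\n"
    "Step 2: Explain, considering context, whether the post attacks "
    "or demeans that group.\n"
    "Step 3: End with one line of the form 'Label: HATE' or 'Label: NOT_HATE'."
)


def build_reasoning_prompt(post: str) -> str:
    """Fill the hypothetical reasoning template with the post to classify."""
    return REASONING_TEMPLATE.format(post=post.strip())


def parse_label(response: str) -> Optional[bool]:
    """Return True for HATE, False for NOT_HATE, None if no label line is found.

    Scans from the end of the response, since the label is requested last.
    """
    for line in reversed(response.strip().splitlines()):
        text = line.strip().upper()
        if text.startswith("LABEL:"):
            value = text[len("LABEL:"):].strip()
            if value == "HATE":
                return True
            if value == "NOT_HATE":
                return False
    return None
```

In use, the string returned by `build_reasoning_prompt` would be sent to whichever LLM is being evaluated, and `parse_label` applied to its reply; returning `None` for unparseable replies keeps malformed outputs from being silently counted as either class.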
Keywords
hate speech, large language model, prompt engineering, few-shot learning