Triage of Messages and Conversations in a Large-Scale Child Victimization Corpus

Prasanna Lakkur Subramanyam, Mohit Iyyer, Brian Levine

WWW 2024 (2024)

Abstract
Children are among the most vulnerable online populations. Reports of child sexual exploitation on social media and apps have grown annually at an alarming rate and are overwhelming investigators. Even a single case can require examining millions of messages involving hundreds of victims. Triage and prioritization based on victims' experiences is an unfortunate necessity. Using a chat dataset of more than 3 million messages between victims and perpetrators, we evaluate and contribute tools for analyzing the experiences of victims of sexual exploitation. We develop both supervised and unsupervised methods to classify messages into categories of interest to law enforcement, such as age requests, persuasion, and sexual messages. We also introduce a conversation clustering technique to illuminate differences among victims' experiences based on their chat history. Through a qualitative analysis, we demonstrate that the learned clusters are coherent and represent distinct conversation patterns. For example, we can distinguish groups of users who never comply with sexual requests, comply after a few conversations, or comply immediately after being targeted. We expect this approach and associated visualizations will aid law enforcement, industry moderators, and sociologists who need to analyze massive corpora in this domain. Finally, we validate prior models derived from conversations involving adults pretending to be minors and provide statistics that could help undercover adults more accurately portray minor victims.