Identifying Sexual Predators in Chats Using SVM and Feature Ensemble

2023 International Conference on Emerging Trends in Networks and Computer Communications (ETNCC) (2023)

Cyber grooming is a pressing problem worldwide, since people now spend most of their time online. Reports strongly suggest that the online child grooming problem must be tackled urgently in order to protect children from sexual exploitation. Automatic sexual predator identification is a promising solution, since the number of online conversations is too large to monitor manually. In this work, a two-stage approach is proposed that combines several features. The first stage detects predatory conversations, while the second stage distinguishes the predator from the victim within those conversations. The feature ensemble combines lexical and behavioral features. The lexical features include BoW, POS-based, topical, and emotion-based features. The behavioral features include the number of messages, the average number of words, the number of exclamation marks, the number of questions, sentence complexity and readability, and the number of intentions. SVM was used as the classifier because it performs well on many text classification tasks. The experimental results show that BoW with tf-idf term weighting provided the best performance for both the PCI and VPD tasks, obtaining an F0.5 score of 0.9893 on PCI and 0.9798 on VPD. The feature ensemble outperforms most of the individual features that form it but still cannot beat BoW.
Chat, Sexual Predator, SVM, Bag of Words, Ensemble, Conversation
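
Two of the ingredients named in the abstract are simple enough to illustrate: the count-based behavioral features and the F0.5 score used for evaluation. The following is a minimal sketch, not the authors' implementation. The function names `behavioral_features` and `f_beta` are hypothetical, only four of the listed behavioral features are covered, and counting question marks as a proxy for "number of questions" is an assumption made here for illustration.

```python
import re


def behavioral_features(messages):
    """Compute simple count-based features for one author's messages.

    `messages` is a list of strings. Hypothetical helper: the paper's exact
    feature definitions are not given in the abstract.
    """
    n_messages = len(messages)
    # Word count per message, using runs of word characters as tokens.
    word_counts = [len(re.findall(r"\w+", m)) for m in messages]
    return {
        "num_messages": n_messages,
        "avg_words": sum(word_counts) / n_messages if n_messages else 0.0,
        "num_exclamations": sum(m.count("!") for m in messages),
        # Assumption: approximate the number of questions by counting "?".
        "num_questions": sum(m.count("?") for m in messages),
    }


def f_beta(precision, recall, beta=0.5):
    """F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + b2) * precision * recall / denom


if __name__ == "__main__":
    msgs = ["hi there!", "how old are you?", "ok"]
    print(behavioral_features(msgs))
    print(f_beta(precision=1.0, recall=0.5))
```

With beta = 0.5, precision is weighted more heavily than recall, so a classifier with high precision and moderate recall scores higher than one with the two values swapped.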