RThread: A thread-centric analysis of security forums

Knowledge Discovery and Data Mining (2020)

Abstract
Online forums have been shown to contain a wealth of useful information. With a few notable exceptions, however, they have received far less attention from the research community than other online social media. Our goal here is to conduct an in-depth, thread-centric analysis of online forums, focusing on security forums. We propose RThread, a comprehensive unsupervised clustering approach with a powerful visualization component, which we provide as a publicly accessible web-based tool. Our approach leverages 92 thread features that span three groups: (a) temporal, (b) behavioral, and (c) content-related. We analyze data from 8 security forums with 400k posts over a span of 8 years. First, we find that many thread-centric properties follow a log-normal distribution, which persists across several forums and over time. Second, we show how our approach can identify clusters of threads with similar behavior, while our visualization component provides an easy way to spot the differences between these clusters. Finally, we show how our approach can spot surprising behaviors, including a cluster whose threads are used for Search Engine Optimization. We see our approach and our publicly available platform as a building block towards understanding forum activity and extracting interesting information in an unsupervised way.
Keywords
Online communities mining