Redtweet: Recommendation Engine For Reddit

Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2015), 2015

Abstract
With the growing popularity of using social media to collect data, there is an increasing need to find productive ways to use this data. Our objective is to form an interest profile from tweets and use it to recommend loosely related Reddit threads that the reader is most likely to be interested in. We approach the problem as a genre classification problem. Given a tweet, we want to deduce which genre(s) it might fall under if the words in the tweet were used in official texts. From there, we keep track of how many tweets fall under each genre and generate a list of Reddit threads that fall under those same genres, in proportion to the interests of the user. Due to the complexity of genre classification, we chose an ensemble approach. We use three classifiers in our ensemble: 1) a classic Naive Bayesian classifier, 2) a Naive Bayesian classifier trained only on the parts of speech of sentences, and 3) a Naive Bayesian classifier that only makes a decision if the probability P(x) >= 0.9. We measured the success of our classifiers by comparing the accuracy, precision, and recall of each model. Classifiers 1 and 2 had higher accuracy than classifier 3, but classifier 3 had much higher precision and recall. After creating the classifier, we were able to form interest profiles for well-known people, one with a small number of tweets and one with a much larger number, and compile a list of recommended articles for each. The genres tagged to each person seemed to match their public personas, and most of the articles chosen fit these genres. Our results are a valuable beginning for what constitutes a much larger project.
Keywords
Genre Classification, Ensemble Classifiers, Twitter, Reddit
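
The third classifier in the abstract, together with the per-genre tally that forms the interest profile, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the features, training corpus, or genre labels, so the whitespace tokenizer, the add-one smoothing, and the names `NaiveBayes`, `classify_with_threshold`, and `interest_profile` are all hypothetical. It also assumes that "P(x) >= 0.9" refers to the posterior probability of the most likely genre.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # genre -> word -> count
        self.doc_counts = Counter()              # genre -> number of training texts
        self.vocab = set()

    def fit(self, texts, genres):
        for text, genre in zip(texts, genres):
            words = text.lower().split()  # assumed tokenizer: whitespace split
            self.word_counts[genre].update(words)
            self.doc_counts[genre] += 1
            self.vocab.update(words)
        return self

    def predict_proba(self, text):
        # Words never seen in training are ignored.
        words = [w for w in text.lower().split() if w in self.vocab]
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for genre, n_docs in self.doc_counts.items():
            counts = self.word_counts[genre]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for w in words:
                score += math.log((counts[w] + 1) / denom)
            log_scores[genre] = score
        # Normalise in log space to obtain posterior probabilities.
        top = max(log_scores.values())
        unnorm = {g: math.exp(s - top) for g, s in log_scores.items()}
        z = sum(unnorm.values())
        return {g: v / z for g, v in unnorm.items()}


def classify_with_threshold(model, text, threshold=0.9):
    """Return the top genre, or None (abstain) if its posterior is below threshold."""
    probs = model.predict_proba(text)
    genre, p = max(probs.items(), key=lambda kv: kv[1])
    return genre if p >= threshold else None


def interest_profile(model, tweets, threshold=0.9):
    """Tally confident genre decisions and return each genre's share of them."""
    tally = Counter()
    for tweet in tweets:
        genre = classify_with_threshold(model, tweet, threshold)
        if genre is not None:
            tally[genre] += 1
    total = sum(tally.values())
    return {g: n / total for g, n in tally.items()} if total else {}
```

Under these assumptions, the shares returned by `interest_profile` could serve as the proportions in which threads from each genre are recommended. Abstaining on low-confidence tweets trades coverage for fewer wrong labels, which is one plausible reading of why the thresholded classifier behaves differently from the other two.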