Authorship verification of e-mail and tweet messages applied for continuous authentication

Journal of Computer and System Sciences(2015)

Abstract
Authorship verification using stylometry consists of identifying a user based on his writing style. In this paper, authorship verification is applied to continuous authentication using unstructured online text-based entry. An online document is decomposed into consecutive blocks of short text, and a (continuous) authentication decision is made over each block, discriminating between legitimate and impostor behavior. We investigate text blocks of 140, 280 and 500 characters. The feature set includes traditional lexical, syntactic and application-specific features, together with new features extracted from n-gram analysis. Furthermore, the proposed approach includes a strategy to circumvent issues related to unbalanced datasets, and uses Information Gain and Mutual Information for feature selection and a Support Vector Machine (SVM) for classification. Experimental evaluation of the proposed approach on the Enron e-mail and Twitter corpora yields promising results, with an Equal Error Rate (EER) ranging from 9.98% to 21.45% across the different block sizes.

Continuous authentication using stylometry, specifically authorship verification.
Authorship analysis based on short messages.
Stylometric analysis using shallow machine learning classifiers.
Keywords
Continuous authentication, Stylometry, Short message verification, n-Gram features, Unbalanced dataset, SVM classifier
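
The abstract describes cutting a document into consecutive fixed-size character blocks, extracting n-gram features from each block, and reporting an Equal Error Rate. The following is a minimal sketch of those three steps only, not the authors' implementation. All function names are hypothetical, dropping a short trailing fragment is an assumption, and the "higher score means more likely genuine" convention is an assumption. The feature selection and SVM stages are not shown.

```python
from collections import Counter


def split_into_blocks(text, block_size):
    """Cut text into consecutive, non-overlapping blocks of block_size characters.

    Assumption: a trailing fragment shorter than block_size is dropped.
    """
    return [text[i:i + block_size]
            for i in range(0, len(text) - block_size + 1, block_size)]


def char_ngram_counts(block, n):
    """Count the character n-grams that occur in one block."""
    return Counter(block[i:i + n] for i in range(len(block) - n + 1))


def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    Assumption: a block is accepted when its score >= threshold.
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    best_gap, eer = None, None
    for t in sorted(set(genuine_scores) | set(impostor_scores)):
        # False rejection rate: genuine blocks scoring below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance rate: impostor blocks scoring at or above it.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer
```

With this sketch, calling `split_into_blocks(message_text, 140)` would produce blocks of the smallest size studied in the paper, and each block could be turned into a feature vector from `char_ngram_counts(block, n)` before classification. The scores passed to `equal_error_rate` would come from whatever classifier is used. The threshold sweep yields only an approximation of the EER when no threshold makes the two error rates exactly equal.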