Authorship Verification using a Graph Knowledge Discovery Approach

Journal of Intelligent & Fuzzy Systems (2019)

Abstract
This paper presents an approach to authorship verification, a forensic text problem that consists of determining whether or not an unknown document was written by a particular author, given some samples of that author's writing. The core of the approach is a graph representation from which relevant linguistic features are extracted using network analysis techniques. Graphs provide rich data structures for representing lexical and syntactic aspects of texts, allowing centrality measures to be reinterpreted as linguistic features that do not depend entirely on stylistic elements of text documents. The proposed method is applied to the English-language partitions of the CLEF PAN 2014 and 2015 author verification datasets, producing competitive results that outperform the state-of-the-art baselines and approach (or, in one case, surpass) the best results reported so far on the same training and test corpora. These experimental results show that our interpretation of four centrality measures (closeness, betweenness, degree, and eigenvector) makes it possible to detect relevant patterns in an author's writing style. In particular, words with high closeness that are part of some chunk phrases, and words with high betweenness that are included in bigrams and trigrams, contribute more effectively to verifying document authorship.
Keywords
Authorship verification, supervised learning, syntactic flow graph, social network analysis, centrality measures
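
The four centrality measures named in the abstract can be illustrated on a toy graph. The sketch below is not the paper's method: it assumes a plain undirected word-adjacency graph (each word linked to the word that follows it) rather than the syntactic flow graph, and the function names and the sample sentence are hypothetical, chosen only for illustration. It uses standard textbook definitions: breadth-first search for closeness, Brandes' algorithm for betweenness, and a shifted power iteration for eigenvector centrality.

```python
from collections import defaultdict, deque


def build_word_graph(tokens):
    """Undirected graph linking each word to the word that follows it.

    Assumption: a simple stand-in for the paper's graph representation.
    """
    graph = defaultdict(set)
    for left, right in zip(tokens, tokens[1:]):
        if left != right:  # skip self-loops from repeated words
            graph[left].add(right)
            graph[right].add(left)
    return dict(graph)


def degree_centrality(graph):
    """Fraction of the other nodes each node is directly linked to."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}


def closeness_centrality(graph):
    """(n - 1) / sum of shortest-path distances; assumes a connected graph."""
    n = len(graph)
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (n - 1) / total if total else 0.0
    return result


def betweenness_centrality(graph):
    """Unnormalized betweenness via Brandes' algorithm (unweighted graph)."""
    score = dict.fromkeys(graph, 0.0)
    for s in graph:
        order, dist = [], {s: 0}
        preds = {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths from s
        sigma[s] = 1
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):  # accumulate dependencies back toward s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in score.items()}


def eigenvector_centrality(graph, iterations=200):
    """Power iteration on (A + I); the shift avoids oscillation on bipartite graphs."""
    x = dict.fromkeys(graph, 1.0)
    for _ in range(iterations):
        nxt = {v: x[v] + sum(x[w] for w in graph[v]) for v in graph}
        norm = sum(val * val for val in nxt.values()) ** 0.5
        x = {v: val / norm for v, val in nxt.items()}
    return x


# Hypothetical sample text, used only to exercise the functions.
tokens = "the cat saw the dog and the dog saw the bird".split()
graph = build_word_graph(tokens)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
betweenness = betweenness_centrality(graph)
eigenvector = eigenvector_centrality(graph)

for word in sorted(graph, key=betweenness.get, reverse=True):
    print(f"{word:5s} deg={degree[word]:.2f} clo={closeness[word]:.2f} "
          f"btw={betweenness[word]:.2f} eig={eigenvector[word]:.2f}")
```

In this toy graph the function word "the" dominates every measure because it neighbors all other words. Turning such scores into features (for example, restricting attention to high-closeness words inside chunk phrases or high-betweenness words inside bigrams and trigrams, as the abstract describes) would require the parsing and feature-selection steps of the paper, which are not reproduced here.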