Non-word Attributes’ Efficiency in Text Mining Authorship Prediction

JOURNAL OF INTELLIGENT SYSTEMS (2020)

Abstract
Literary scripts can be compared to paintings, both artistically and in terms of financial value, since the value of these scripts rises and falls with their author's popularity. An author's scripts exhibit a specific writing style that can be measured and compared using a text mining field called Stylometric. Stylometric analysis depends on features called authorship attributes, which are fed into dedicated algorithms and methods to predict the author of a text. Generally, each method in the Stylometric field uses a variety of attributes to reach higher prediction accuracy. The aim of this research is to improve the accuracy of authorship prediction in literary works based on the artistic writing style of the authors. To achieve that, a new set of attributes is used with the Stylometric Authorship Balanced Attribution (SABA) method, which was chosen in this research among several other machine learning methods because of its delicateness in authorship prediction projects. The attributes used by most researchers have been word frequencies (single words, pairs of words, or trios of words), which led to some prediction mistakes. In this research, a new set of attributes is used to decrease these mistakes. The proposed non-word attributes are sentence length, special characters, and punctuation symbols. The results obtained by using these proposed attributes were excellent.
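To make the three non-word attributes concrete, the following is a minimal Python sketch of how they might be extracted from a text. It is an illustration only, not the paper's implementation: the function name `non_word_attributes`, the naive sentence splitter, and the split between "punctuation symbols" and "special characters" are all assumptions, since the abstract does not define them, and the SABA method itself is not shown.

```python
import re
import string
from collections import Counter

# Assumption: which symbols count as "punctuation" versus "special characters"
# is not specified in the abstract; this split is purely illustrative.
PUNCTUATION = set(".,;:!?'\"-()")
SPECIAL = set(string.punctuation) - PUNCTUATION


def non_word_attributes(text):
    """Return sentence-length, punctuation, and special-character features."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Sentence length measured in word tokens (runs of letters, digits, '_').
    lengths = [len(re.findall(r"\w+", s)) for s in sentences]
    mean_length = sum(lengths) / len(lengths) if lengths else 0.0
    return {
        "mean_sentence_length": mean_length,
        "punctuation": Counter(ch for ch in text if ch in PUNCTUATION),
        "special": Counter(ch for ch in text if ch in SPECIAL),
    }


if __name__ == "__main__":
    print(non_word_attributes("Wait, what? I paid $5 & left."))
```

In practice the counts would usually be normalized by text length before being passed to a classifier, so that long and short scripts remain comparable.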
Key words
Machine learning, Stylometric, authorship attribution, SABA, non-word attribute