Style over substance: A psychologically informed approach to feature selection and generalisability for author classification

Isabel Holmes, Timothy Cribbin, Nelli Ferenczi

Computers in Human Behavior Reports (2023)

Abstract
Author profiling, the classification of user-generated content according to demographic or other personal attributes, is a key task in social media-based research. Whilst high accuracy has been achieved on many attributes, most studies train and test models on a single domain only, ignoring cross-domain performance. Research shows that models often transfer poorly to new domains because they tend to depend heavily on topic-specific (i.e., lexical) features. Knowledge specific to the relevant field (e.g., Psychology, Political Science) is often ignored, with a reliance on data-driven algorithms for feature development and selection. Focusing on political affiliation, we evaluate an approach that selects stylistic features according to known psychological correlates (personality traits) of this attribute. Training data was collected from Reddit posts made by regular users of the political subreddits r/republican and r/democrat. A second, non-political dataset was created by collecting posts by the same users in different subreddits. Our results show that introducing domain-specific knowledge in the form of psychologically informed stylistic features resulted in better out-of-training-domain performance than lexical or more commonly used stylistic features.
Key words
Author profiling, Political affiliation classification, Stylistic feature sets, Model generalisability, Political psychology, Feature development, Interdisciplinarity, Domain-specific knowledge