Identifying Author in Bengali Literature by Bi-LSTM with Attention Mechanism

Ibrahim Al Azhar, Sohel Ahmed, Md. Saiful Islam, Aisha Khatun

2021 24th International Conference on Computer and Information Technology (ICCIT) (2021)

Abstract
Authorship attribution is the task of determining the author of an unknown text from that author’s writing patterns. It is a well-established task for high-resource languages like English, but it remains challenging for low-resource languages like Bengali. In this paper, we propose a Bi-directional Long Short-Term Memory (Bi-LSTM) model with a self-attention mechanism to address this problem. GloVe embedding vectors encode the semantic and syntactic knowledge of words and are then fed into the Bi-LSTM models. Moreover, the attention mechanism enhances the model’s ability to learn complex linguistic patterns through learnable parameters, giving lower weights to common words and higher weights to keywords that capture an author’s stylistic components. This improves performance and helps extract contextual features. We evaluate our model on multiple datasets and experiment with various architectures. Our proposed model outperforms the state-of-the-art model by 12.14%-20.24% on the BAAD6 author dataset and by 1.05%-7.34% on the BAAD16 author dataset, with a best accuracy of 97.99%. The experimental results demonstrate that the attention mechanism notably boosts the Bi-LSTM model’s performance. (The source code is shared as a free tool at https://github.com/IbrahimAlAzhar/AuthorshipAttribution)
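The weighting idea described in the abstract can be illustrated with a minimal, dependency-free sketch of attention pooling over per-token vectors. This is not the authors' implementation: the function name `attention_pool`, the dot-product scoring against a single `context` vector, and the toy numbers are all assumptions made for illustration. In a real model the token vectors would come from the Bi-LSTM and `context` would be learned during training.

```python
import math

def attention_pool(hidden_states, context):
    """Softmax-weighted sum of per-token vectors.

    hidden_states: list of token vectors (stand-ins for Bi-LSTM outputs).
    context: hypothetical learnable scoring vector of the same dimension.
    Returns (weights, pooled_vector).
    """
    # Score each token by its dot product with the context vector.
    scores = [sum(h_i * c_i for h_i, c_i in zip(h, context))
              for h in hidden_states]
    # Numerically stable softmax over the sequence positions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The weighted sum of token vectors yields one document-level vector.
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states))
              for d in range(dim)]
    return weights, pooled

# Toy input: three tokens with 2-dimensional states (made-up numbers).
states = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
context = [1.0, 1.0]
weights, pooled = attention_pool(states, context)
```

Here the second token scores highest against `context`, so it receives the largest weight and dominates the pooled vector, which would then be passed to a classifier over authors.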
Keywords
GloVe embedding, Attention mechanism, BiLSTM, Authorship attribution, Deep neural network, Bangla, Authorship identification