Comprehensive study of pre-trained language models: detecting humor in news headlines

Soft Comput. (2022)

Abstract
The ability to automatically understand and analyze human language has attracted researchers and practitioners in the Natural Language Processing (NLP) field. Humor detection is an NLP task needed in many areas, including marketing, politics, and news. However, the task is challenging because humor depends on context, emotion, culture, and rhythm. To address this problem, we propose a robust model called BFHumor, a BERT-Flair-based humor detection model that detects humor in news headlines. It is an ensemble of state-of-the-art pre-trained models combined with various NLP techniques. We evaluated the proposed model on public humor datasets from the SemEval-2020 workshop, where it achieved strong performance: a Root Mean Squared Error (RMSE) of 0.51966 and an accuracy of 0.62291. In addition, we extensively investigated the underlying reasons for the BFHumor model's high accuracy on humor detection tasks. To that end, we conducted two experiments on the BERT model: one at the vocabulary level and one at the linguistic-capturing level. Our investigation shows that BERT captures surface knowledge in its lower layers, syntactic knowledge in its middle layers, and semantic knowledge in its higher layers.
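The abstract describes an ensemble of pre-trained models evaluated with RMSE. A minimal sketch of that general idea, assuming a simple prediction-averaging ensemble over two regressors (the paper's actual combination scheme and model outputs are not given here; all scores below are stand-in numbers):

```python
import math

def ensemble_predict(preds_a, preds_b):
    """Average the per-headline humor scores of two models
    (e.g., a BERT-based and a Flair-based regressor)."""
    return [(a + b) / 2 for a, b in zip(preds_a, preds_b)]

def rmse(preds, labels):
    """Root Mean Squared Error, the metric reported in the abstract."""
    return math.sqrt(
        sum((p - y) ** 2 for p, y in zip(preds, labels)) / len(preds)
    )

# Toy example with made-up humor scores, not results from the paper.
bert_scores = [1.2, 0.4, 2.1]
flair_scores = [1.0, 0.8, 1.9]
gold = [1.1, 0.5, 2.2]

combined = ensemble_predict(bert_scores, flair_scores)
print(rmse(combined, gold))
```

Averaging is only one possible combination strategy; weighted averaging or stacking are common alternatives for ensembles of this kind.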
Keywords
Humor, Pre-trained models, BERT, Flair, BERT knowledge, BERT vocabulary