Flagging clickbait in Indonesian online news websites using fine-tuned transformers

International Journal of Electrical and Computer Engineering (IJECE)(2023)

Abstract
Click counts are tied to the amount of money that online advertisers pay news sites. This business model has pushed some news sites toward clickbait, i.e., headlines that use hyperbolic or intriguing words, and sometimes unfinished sentences, to deliberately tease readers. Some Indonesian online news sites have adopted clickbait as well, which indirectly degrades the credibility of other, established news sites. To detect clickbait headlines, a neural network was trained in which a pre-trained multilingual bidirectional encoder representations from transformers (BERT) language model acts as the embedding layer, followed by a 100-node hidden layer and a sigmoid classifier on top. The training dataset contained a total of 6,632 headlines. Evaluated with 5-fold cross-validation, the classifier achieved an accuracy of 0.914, an F1-score of 0.914, a precision of 0.916, and a receiver operating characteristic-area under curve (ROC-AUC) of 0.92. The use of multilingual BERT for an Indonesian text classification task was thus tested, and it can be enhanced further. Future possibilities, societal impact, and limitations of clickbait detection are discussed.
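The evaluation protocol named in the abstract (5-fold cross-validation scored by accuracy, precision, and F1) can be illustrated with a minimal sketch. This is not the authors' code: the helper names `k_fold_indices` and `binary_metrics` are hypothetical, the folds are contiguous and unshuffled for simplicity, and the abstract does not say whether the original splits were shuffled or stratified. The labels in the usage note are toy values, not data from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds.

    Returns a list of (train_indices, test_indices) pairs. The first
    n_samples % k folds receive one extra sample so every index is
    used as test data exactly once.
    """
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for labels in {0, 1} (1 = clickbait)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Fold sizes for a dataset of 6,632 headlines split five ways.
folds = k_fold_indices(6632, k=5)
fold_sizes = [len(test) for _, test in folds]

# Toy predictions (not from the paper) to show the metric computation.
toy = binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

In a full pipeline, the classifier would be retrained on each `train` split and scored on the matching `test` split, and the per-fold metrics would then be averaged. Shuffling or stratifying by label before splitting is a common refinement that this sketch leaves out.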
Keywords
Indonesian online news websites, clickbait, fine-tuned