
HuaAMS at SemEval-2022 Task 8: Combining Translation and Domain Pre-training for Cross-lingual News Article Similarity.

Sai Sandeep Sharma Chittilla, Talaat Khalil

International Workshop on Semantic Evaluation (SemEval), 2022

Abstract
This paper describes our submission to SemEval-2022 Multilingual News Article Similarity task. We experiment with different approaches that utilize a pre-trained language model fitted with a regression head to predict similarity scores for a given pair of news articles. Our best performing systems include 2 key steps: 1) pre-training with in-domain data and 2) training data enrichment through machine translation. Our final submission is an ensemble of predictions from our top systems. While we show the significance of pre-training and augmentation, we believe the issue of language coverage calls for more attention.
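
The abstract does not say how the ensemble combines the predictions of the top systems. As a minimal sketch, assuming a plain unweighted average of each system's similarity score per article pair, the combination step might look like the following. The function name `ensemble_scores` and the sample scores are hypothetical and are not taken from the paper.

```python
from statistics import mean


def ensemble_scores(system_predictions):
    """Average per-pair similarity scores across systems.

    system_predictions: one inner list per system, each holding one
    predicted similarity score per article pair, in the same pair order.
    Assumption: an unweighted mean; the paper may combine systems differently.
    """
    if not system_predictions:
        raise ValueError("need at least one system")
    n_pairs = len(system_predictions[0])
    if any(len(preds) != n_pairs for preds in system_predictions):
        raise ValueError("all systems must score the same article pairs")
    # zip(*...) groups together the scores every system gave to the same pair.
    return [mean(scores) for scores in zip(*system_predictions)]


if __name__ == "__main__":
    # Hypothetical scores from three systems on two article pairs.
    preds = [[1.0, 3.5], [2.0, 3.0], [1.5, 4.0]]
    print(ensemble_scores(preds))
```

Averaging is only one option; a weighted mean tuned on a development set would fit the same interface.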
Keywords
Semantic Similarity, Part-of-Speech Tagging, Machine Translation, Multilingual Neural Machine Translation, Semantic Web