Czert - Czech BERT-like Model for Language Representation

RANLP (2021)

Abstract
This paper describes the training process of the first Czech monolingual language representation models based on the BERT and ALBERT architectures. We pre-train our models on more than 340K sentences, which is 50 times more than the amount used by multilingual models that include Czech data. We outperform the multilingual models on 7 out of 10 datasets. In addition, we establish new state-of-the-art results on seven datasets. Finally, we discuss the properties of monolingual and multilingual models based on our results. We publish all the pre-trained and fine-tuned models freely for the research community.
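Since the authors release their pre-trained and fine-tuned checkpoints, they can be loaded with the Hugging Face transformers library. Below is a minimal sketch; the model identifier UWB-AIR/Czert-B-base-cased is an assumption about the published name, so substitute the actual identifier from the authors' release page.

    # A minimal sketch: load a released Czert checkpoint with Hugging Face
    # transformers. The model identifier below is an assumption; substitute
    # the actual name from the authors' release page.
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("UWB-AIR/Czert-B-base-cased")
    model = AutoModel.from_pretrained("UWB-AIR/Czert-B-base-cased")

    # Encode a Czech sentence and inspect the contextual embeddings.
    inputs = tokenizer("Ahoj, jak se máš?", return_tensors="pt")
    outputs = model(**inputs)
    print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)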
Keywords
language, representation, model, bert-like