Advances in Pre-Training Distributed Word Representations
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Abstract
Many Natural Language Processing applications nowadays rely on pre-trained word representations estimated from large text corpora such as news collections, Wikipedia and Web Crawl. In this paper, we show how to train high-quality word vector representations by using a combination of known tricks that are however rarely used together. The main result of our work is the new set of publicly available pre-trained models that outperform the current state of the art by a large margin on a number of tasks.
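One of the known tricks the fastText line of work builds on is representing each word by its character n-grams (with boundary markers), so that rare and out-of-vocabulary words share subword information with frequent ones. A minimal sketch of that n-gram extraction step is below; the function name and the 3–6 n-gram range are illustrative defaults, not the authors' exact implementation.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Extract character n-grams from a word, fastText-style.

    The word is wrapped in '<' and '>' boundary markers so that
    prefixes and suffixes are distinguished from word-internal
    n-grams (e.g. '<wh' vs. 'whe' for the word 'where').
    """
    wrapped = f"<{word}>"
    ngrams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            ngrams.append(wrapped[i:i + n])
    return ngrams


# Example: the classic 'where' example from the fastText papers.
# With n_min=3, n_max=4 this yields '<wh', 'whe', 'her', 'ere',
# 're>', '<whe', 'wher', 'here', 'ere>'.
print(char_ngrams("where", n_min=3, n_max=4))
```

In the full model, each n-gram has its own vector and a word's representation is the sum of its n-gram vectors, which is what lets pre-trained models assign vectors to words unseen during training.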
Keywords
fastText, word2vec, word vectors, pre-trained