Māori Loanwords: A Corpus of New Zealand English Tweets

57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019): STUDENT RESEARCH WORKSHOP (2019)

Abstract
Māori loanwords are widely used in New Zealand English for various social functions by New Zealanders within and outside of the Māori community. Motivated by the lack of linguistic resources for studying how Māori loanwords are used in social media, we present a new corpus of New Zealand English tweets. We collected tweets containing selected Māori words that are likely to be known by New Zealanders who do not speak Māori. Since over 30% of these words turned out to be irrelevant (e.g., mana is a popular gaming term, Moana is a character from a Disney movie), we manually annotated a sample of our tweets into relevant and irrelevant categories. This data was used to train machine learning models to automatically filter out irrelevant tweets.
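The filtering step described above can be illustrated with a minimal sketch. The abstract does not say which models were trained, so the multinomial Naive Bayes classifier below is an assumption chosen for brevity, not the authors' method. The class name `NaiveBayesFilter`, the `tokenize` helper, and the four training tweets are all hypothetical and invented for illustration; they are not drawn from the corpus.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"\w+", text.lower())


class NaiveBayesFilter:
    """Hypothetical multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # number of tweets per label
        self.word_counts = defaultdict(Counter)  # label -> token frequencies
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_docs / total_docs)
            n_tokens = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (n_tokens + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy annotations (assumption: two labels, as in the abstract).
train_texts = [
    "Kia ora whanau, so much mana in that haka tonight",
    "The kaumatua spoke with great mana at the marae",
    "Out of mana again, need a potion before the boss fight",
    "My mage has no mana left in this game",
]
train_labels = ["relevant", "relevant", "irrelevant", "irrelevant"]

model = NaiveBayesFilter().fit(train_texts, train_labels)
print(model.predict("I ran out of mana in the boss fight"))
print(model.predict("great haka at the marae"))
```

In this sketch the word "mana" appears in both classes, so the decision rests on the surrounding vocabulary, which mirrors the ambiguity the abstract describes. A realistic setup would need far more annotated tweets and a held-out evaluation set.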