Non-native text analysis: A survey.

Natural Language Engineering (2016)

Abstract
Non-native speakers of English far outnumber native speakers; English is the main language of books, newspapers, airports, air-traffic control, international business, academic conferences, science, technology, diplomacy, sports, international competitions, pop music, and advertising (British Council 2014). Online education in the form of massive online open courses is also primarily in English, even when the subject taught is English itself. This creates enormous amounts of text written by non-native speakers, which in turn generates a need for grammar correction and analysis. Even aside from massive online open courses, the number of English learners in Asia alone is in the tens of millions. In this paper, we provide a survey of the two main areas of existing work on non-native text analysis, prefaced by an overview of common datasets used by researchers, comparing their attributes and potential uses. An introduction to native language identification follows: determining the native language of an author based on text written in a second language. This section is subdivided into various techniques and a shared task on this classification problem. Next, we discuss non-native grammatical error correction: finding and modifying text to fix errors or to make it sound more fluent. Again, we discuss different methods before investigating a relevant shared task. Lastly, we end with conclusions and potential future directions. While this survey primarily focuses on detecting and correcting non-native English text, many approaches are general and can be used across any language pairing.