RAAMove: A Corpus for Analyzing Moves in Research Article Abstracts
arXiv (2024)
Abstract
Move structures have been studied in English for Specific Purposes (ESP) and
English for Academic Purposes (EAP) for decades. However, there are few move
annotation corpora for Research Article (RA) abstracts. In this paper, we
introduce RAAMove, a comprehensive multi-domain corpus dedicated to the
annotation of move structures in RA abstracts. The primary objective of RAAMove
is to facilitate move analysis and automatic move identification. This paper
provides a thorough discussion of the corpus construction process, including
the scheme, data collection, annotation guidelines, and annotation procedures.
The corpus is constructed in two stages: first, expert annotators manually
annotate high-quality data; then, based on the human-annotated data, a
BERT-based model performs automatic annotation, which experts subsequently
revise. The result is a large-scale, high-quality corpus
comprising 33,988 annotated instances. We also conduct preliminary move
identification experiments using the BERT-based model to verify the
effectiveness of the proposed corpus and model. The annotated corpus is
available for academic research purposes and can serve as an essential resource
for move analysis, English language teaching and writing, and
move/discourse-related tasks in Natural Language Processing (NLP).
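
The two-stage construction described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the move labels ("PURPOSE", "RESULT"), and the majority-label classifier standing in for the BERT-based model are all hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]  # (sentence, move label)


def train_majority(data: List[Example]) -> Callable[[str], str]:
    """Stand-in for the BERT-based model (assumption): always predicts
    the most frequent label seen in the human-annotated seed data."""
    top_label = Counter(label for _, label in data).most_common(1)[0][0]
    return lambda sentence: top_label


def two_stage_annotate(
    seed: List[Example],
    unlabeled: List[str],
    train: Callable[[List[Example]], Callable[[str], str]],
    expert_review: Callable[[str, str], str],
) -> List[Example]:
    """Stage 1: `seed` holds expert-annotated sentences.
    Stage 2: a model built from the seed pre-labels new sentences,
    and an expert accepts or corrects each predicted label."""
    model = train(seed)
    corpus = list(seed)
    for sentence in unlabeled:
        predicted = model(sentence)
        corpus.append((sentence, expert_review(sentence, predicted)))
    return corpus


# Hypothetical seed data and expert corrections, for illustration only.
seed_data: List[Example] = [
    ("We propose X.", "PURPOSE"),
    ("We aim to study Y.", "PURPOSE"),
    ("Results show Z.", "RESULT"),
]
corrections: Dict[str, str] = {"Accuracy improves by 3 points.": "RESULT"}

corpus = two_stage_annotate(
    seed_data,
    ["Accuracy improves by 3 points.", "We study W."],
    train_majority,
    lambda sentence, predicted: corrections.get(sentence, predicted),
)
```

Passing the trainer and the review step as callables keeps the sketch independent of any particular model or annotation tool; a real setup would replace `train_majority` with a fine-tuned classifier and `expert_review` with a human-in-the-loop interface.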