Annotated corpora and tools of the PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions (edition 1.1)

Carlos Ramisch, Silvio Ricardo Cordeiro, Agata Savary, Veronika Vincze, Verginica Barbu Mititelu, Archna Bhatia, Maja Buljan, Marie Candito, Polona Gantar, Voula Giouli, Tunga Güngör, Abdelati Hawwari, Uxoa Iñurrieta, Jolanta Kovalevskaitė, Simon Krek, Timm Lichte, Chaya Liebeskind, Johanna Monti, Carla Parra Escartín, Behrang QasemiZadeh, Renata Ramisch, Nathan Schneider, Ivelina Stoyanova, Ashwini Vaidya, Abigail Walsh, Cristina Aceta, Itziar Aduriz, Jean-Yves Antoine, Špela Arhar Holdt, Gözde Berk, Agnė Bielinskienė, Goranka Blagus, Loic Boizou, Claire Bonial, Valeria Caruso, Jaka Čibej, Matthieu Constant, Paul Cook, Mona Diab, Tsvetana Dimitrova, Rafael Ehren, Mohamed Elbadrashiny, Hevi Elyovich, Berna Erden, Ainara Estarrona, Aggeliki Fotopoulou, Vassiliki Foufi, Kristina Geeraert, Maarten van Gompel, Itziar Gonzalez, Antton Gurrutxaga, Yaakov Ha-Cohen Kerner, Rehab Ibrahim, Mihaela Ionescu, Kanishka Jain, Ivo-Pavao Jazbec, Teja Kavčič, Natalia Klyueva, Kristina Kocijan, Viktória Kovács, Taja Kuzman, Svetlozara Leseva, Nikola Ljubešić, Ruth Malka, Stella Markantonatou, Héctor Martínez Alonso, Ivana Matas, John McCrae, Helena de Medeiros Caseli, Mihaela Onofrei, Emilia Palka-Binkiewicz, Stella Papadelli, Yannick Parmentier, Antonio Pascucci, Caroline Pasquer, Maria Pia di Buono, Vandana Puri, Annalisa Raffone, Shraddha Ratori, Anna Riccio, Federico Sangati, Vishakha Shukla, Katalin Simkó, Jan Šnajder, Clarissa Somers, Shubham Srivastava, Valentina Stefanova, Shiva Taslimipoor, Natasa Theoxari, Maria Todorova, Ruben Urizar, Aline Villavicencio, Leonardo Zilio

(2018)

Abstract
This multilingual resource contains corpora in which verbal multiword expressions (VMWEs) have been manually annotated. VMWEs include idioms (let the cat out of the bag), light-verb constructions (make a decision), verb-particle constructions (give up), inherently reflexive verbs (help oneself), and multi-verb constructions (make do). VMWEs were annotated according to the universal guidelines in 19 languages. The corpora are provided in the cupt format, inspired by the CoNLL-U format, and were used in edition 1.1 of the PARSEME Shared Task (2018). For most languages, morphological and syntactic information (not necessarily using UD tagsets), including parts of speech, lemmas, morphological features and/or syntactic dependencies, is also provided. Depending on the language, this information comes from treebanks (e.g., Universal Dependencies) or from automatic parsers trained on treebanks (e.g., UDPipe). This item contains training, development and test data, as well as the evaluation tools used in the PARSEME Shared Task 1.1 (2018). The annotation guidelines are available online: http://parsemefr.lif.univ-mrs.fr/parseme-st-guidelines/1.1
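To make the mention of the cupt format more concrete, here is a minimal Python sketch of how VMWE annotations might be read from such a file. It rests on assumptions not stated in the abstract: that the file is tab-separated like CoNLL-U, that the VMWE annotation sits in the last column, that `*` and `_` mark tokens without a (known) annotation, and that codes look like `1:VPC.full` on the first token of an expression and `1` on its later tokens, with `;` separating several codes. The sample sentence, the function name `read_vmwes`, and the label `VPC.full` as used here are illustrative only and are not taken from the corpora.

```python
from collections import defaultdict

# Invented sample in a cupt-like layout (tab-separated columns).
# Assumption: the last column carries the VMWE annotation.
SAMPLE = "\n".join([
    "# global.columns = ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC PARSEME:MWE",
    "# text = She gave up",
    "\t".join(["1", "She", "she", "PRON", "_", "_", "2", "nsubj", "_", "_", "*"]),
    "\t".join(["2", "gave", "give", "VERB", "_", "_", "0", "root", "_", "_", "1:VPC.full"]),
    "\t".join(["3", "up", "up", "ADP", "_", "_", "2", "compound:prt", "_", "_", "1"]),
    "",
])


def read_vmwes(text):
    """Yield one dict per sentence: MWE id -> (category, [token forms])."""
    categories, tokens = {}, defaultdict(list)
    seen_token = False
    # The extra "" guarantees the final sentence is flushed.
    for line in text.splitlines() + [""]:
        if not line.strip():
            if seen_token:
                yield {k: (categories.get(k), v) for k, v in tokens.items()}
            categories, tokens, seen_token = {}, defaultdict(list), False
            continue
        if line.startswith("#"):
            continue  # comment or metadata line
        cols = line.split("\t")
        seen_token = True
        mwe = cols[-1]
        if mwe in ("*", "_"):
            continue  # assumed markers for "no annotation" / "unspecified"
        for code in mwe.split(";"):
            mwe_id, _, category = code.partition(":")
            if category:  # the category is assumed to appear on the first token only
                categories[mwe_id] = category
            tokens[mwe_id].append(cols[1])  # column 2 is the word form


sentences = list(read_vmwes(SAMPLE))
print(sentences)
```

Grouping tokens by expression id rather than by adjacency means discontinuous expressions (for example a verb and particle separated by an object) would be collected in the same way, provided the assumed coding scheme holds.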
Keywords
Parsing, Part of speech, Syntax, Reflexive verb, Annotation, Natural language processing, Expression (mathematics), Linguistics, Lemma (mathematics), Computer science, Test data, Artificial intelligence