The First Named Entity Recognizer In Maithili: Resource Creation And System Development

JOURNAL OF INTELLIGENT & FUZZY SYSTEMS (2021)

Abstract
In this paper, we present our work on developing a Maithili Named Entity Recognition (NER) system. Maithili is one of the official languages of India, with around 50 million native speakers. Although NER systems have been developed for several Indian languages, we did not find any openly available NER resource or system for Maithili. For this work, we manually annotated a Maithili NER corpus containing around 200K words. We prepared a baseline classifier using Conditional Random Fields (CRF) and then ran many experiments using various recurrent neural networks (RNN). We collected a larger raw corpus to obtain better word and character embeddings. In our experiments, we found that neural models outperform the CRF baseline, that a CRF layer is effective for predicting the final output in the RNN models, and that character embedding is effective for Maithili. We also investigated the effectiveness of gazetteer lists in neural models: we prepared a few gazetteer lists from various web resources and used them in the neural models. Incorporating the gazetteer layer improved performance. The final system achieved an F-measure of 91.6% with 94.9% precision and 88.53% recall.
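The reported scores can be checked against each other with a short sketch, assuming the F-measure is the standard balanced F1 (the harmonic mean of precision and recall). The abstract does not state which variant was used, and the function name `f_measure` is illustrative rather than taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score, the weighted harmonic mean of precision and recall.

    beta = 1.0 gives the balanced F1 score (assumed here; the abstract
    does not say which variant the authors report).
    """
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero: define the score as zero.
        return 0.0
    return (1 + b2) * precision * recall / denominator


# Precision and recall as reported in the abstract, in percent.
f1 = f_measure(94.9, 88.53)
print(round(f1, 1))
```

Under that assumption, the rounded result agrees with the 91.6% F-measure stated in the abstract.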
Key words
Named entity recognition, Maithili language, corpus annotation, LSTM model, gazetteer lists