Comparative Analysis of Part of Speech (POS) Tagger for Gujarati Language using Deep Learning and Pre-Trained LLM

2024 3rd International Conference for Innovation in Technology (INOCON) (2024)

Abstract
A part-of-speech (POS) tagger takes the words of a language and labels each with a grammatical tag such as noun, verb, or adjective. POS tagging is an important and useful component of Natural Language Processing (NLP). POS taggers are mature for English, but much work remains for low-resource languages. POS tagging also helps in understanding the morphology of a particular language. We applied the pre-trained LLM BERT, an RNN, an LSTM, a BiLSTM, and NLTK's prebuilt method to a Gujarati dataset for comparative analysis. Most of the Gujarati dataset was collected from an online source, and we tagged the remainder ourselves, for a total of 29,813 sentences with their tags. The RNN and LSTM surpass BERT and give good accuracy because the BERT LLM is not trained on the part-of-speech tagging task. The Recurrent Neural Network (RNN) gives the highest accuracy, 98.02%, whereas BERT gives 88%.
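The abstract compares taggers by accuracy but does not say how the figure is computed. A common convention, assumed here and not confirmed by the source, is token-level accuracy: the share of tokens whose predicted tag matches the gold tag. The sketch below illustrates that convention only. The function name `token_accuracy` and the toy tag sequences are hypothetical and are not taken from the paper or its dataset.

```python
from typing import List


def token_accuracy(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Return the fraction of tokens whose predicted tag equals the gold tag.

    Assumption: accuracy is measured per token, pooled over all sentences.
    The source abstract does not specify its exact metric.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of sentences")
    correct = 0
    total = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("each sentence needs exactly one predicted tag per token")
        # Count positions where the predicted tag matches the gold tag.
        correct += sum(g == p for g, p in zip(gold_sent, pred_sent))
        total += len(gold_sent)
    # Avoid division by zero on an empty evaluation set.
    return correct / total if total else 0.0


# Toy tag sequences, illustrative only (not from the paper's Gujarati dataset).
gold_tags = [["NOUN", "VERB"], ["ADJ", "NOUN", "VERB"]]
pred_tags = [["NOUN", "VERB"], ["NOUN", "NOUN", "VERB"]]

score = token_accuracy(gold_tags, pred_tags)
print(f"Token-level accuracy: {score:.2%}")
```

Under this convention, any of the compared models (BERT, RNN, LSTM, BiLSTM, or an NLTK tagger) could be scored the same way, provided each emits one tag per token of the test sentences.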
Keywords
Part of Speech tagging, Gujarati POS tagger, BERT POS tagger, RNN POS tagger, NLTK