Comparative Analysis of Part of Speech (POS) Tagger for Gujarati Language using Deep Learning and Pre-Trained LLM

2024 3rd International Conference for Innovation in Technology (INOCON) (2024)

A Part of Speech (POS) tagger takes the words of a language and labels each one with a grammatical tag such as noun, verb, or adjective. POS tagging is an important and useful component of Natural Language Processing (NLP). POS taggers are mature for English, but much work remains for low-resource languages. POS tagging also helps in understanding the morphology of a particular language. For a comparative analysis, we applied the pre-trained LLM BERT, an RNN, an LSTM, a BiLSTM, and NLTK's prebuilt method to a Gujarati dataset. Most of the Gujarati dataset was collected from an online source, and we tagged the remainder ourselves; in total, 29,813 sentences and their tags were collected. The RNN and LSTM surpass BERT and give good accuracy because the BERT LLM is not trained on the Part of Speech tagging task. The Recurrent Neural Network (RNN) gives the highest accuracy, 98.02%, whereas BERT gives 88% accuracy.
Part of Speech tagging, Gujarati POS tagger, BERT POS tagger, RNN POS tagger, NLTK
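
The abstract compares taggers by accuracy but does not say how that figure is computed. As a rough illustration only, the sketch below assumes token-level accuracy (correctly tagged tokens divided by all tokens) and pairs it with a simple most-frequent-tag baseline. The baseline is not one of the models evaluated in the paper, and the function names, the toy English sentences, and the tag set are all hypothetical stand-ins rather than the Gujarati dataset or its tags.

```python
from collections import Counter, defaultdict


def train_most_frequent_tag(tagged_sentences):
    """Learn, for each word, the tag it carries most often in training data.

    Illustrative baseline only; not a model from the paper.
    """
    word_tag_counts = defaultdict(Counter)
    tag_counts = Counter()
    for sentence in tagged_sentences:
        for word, tag in sentence:
            word_tag_counts[word][tag] += 1
            tag_counts[tag] += 1
    # Unseen words fall back to the most frequent tag overall.
    default_tag = tag_counts.most_common(1)[0][0]
    lexicon = {w: c.most_common(1)[0][0] for w, c in word_tag_counts.items()}
    return lexicon, default_tag


def tag_sentence(words, lexicon, default_tag):
    """Tag each word with its most frequent training tag, or the default."""
    return [lexicon.get(word, default_tag) for word in words]


def token_accuracy(gold_tag_seqs, predicted_tag_seqs):
    """Fraction of tokens whose predicted tag equals the gold tag.

    Assumption: accuracy is measured per token, not per sentence.
    """
    correct = 0
    total = 0
    for gold, predicted in zip(gold_tag_seqs, predicted_tag_seqs):
        for g, p in zip(gold, predicted):
            correct += int(g == p)
            total += 1
    return correct / total if total else 0.0


# Hypothetical toy data (English, made up for illustration).
train = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sees", "VERB"), ("birds", "NOUN")],
]
test = [
    [("the", "DET"), ("dog", "NOUN"), ("sees", "VERB"), ("quietly", "ADV")],
]

lexicon, default_tag = train_most_frequent_tag(train)
gold = [[tag for _, tag in sentence] for sentence in test]
predicted = [
    tag_sentence([word for word, _ in sentence], lexicon, default_tag)
    for sentence in test
]
accuracy = token_accuracy(gold, predicted)
print(f"token-level accuracy: {accuracy:.2f}")
```

In this toy run the unseen word "quietly" receives the fallback tag and is counted as an error, which shows how out-of-vocabulary words pull token-level accuracy down for a lexicon-based baseline.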