An Approach to Identify Indic Languages using Text Classification and Natural Language Processing

Deepthi Shetty, Sarojadevi H, Uzma Shakeel, Sanjana S, Aishwarya G M, Nupur P

2022 IEEE 2nd Mysore Sub Section International Conference (MysuruCon) (2022)

Abstract
India is one of the most culturally and linguistically diverse nations in the world. It ranks second in the world in the number of languages spoken, and its people communicate in their own regional languages. English serves as an additional official language in India, but because English is not widely used, a communication gap remains. Bridging this gap through human translation alone is nearly impossible, whereas machine translation makes it feasible. The literature survey indicates that Neural Machine Translation (NMT) is a cutting-edge approach that significantly outperforms more conventional machine translation methods. The main objective of the proposed work is to accurately identify the language and script of Indic texts and to report the name of the detected language. The work is carried out in stages: collecting the dataset from different sources, preprocessing it with data mining techniques, and identifying the language of the input. In future work, rule-based, statistical, and neural network approaches will be used, followed by post-processing and downstream tasks such as machine translation and named entity recognition.
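As an illustration of the identification stage, the following is a minimal rule-based sketch that guesses the script of an input string from Unicode block ranges. It is not the authors' method; the names `SCRIPT_RANGES` and `detect_script` are hypothetical and chosen only for this example. A script does not determine a language on its own (Devanagari, for instance, is shared by Hindi, Marathi, and Nepali), so a real system would still need a statistical or neural classifier on top of this step.

```python
from collections import Counter

# Unicode block ranges for several major Indic scripts.
# Assumption: one block per script is enough for a first-pass guess.
SCRIPT_RANGES = {
    "Devanagari": (0x0900, 0x097F),
    "Bengali": (0x0980, 0x09FF),
    "Gurmukhi": (0x0A00, 0x0A7F),
    "Gujarati": (0x0A80, 0x0AFF),
    "Oriya": (0x0B00, 0x0B7F),
    "Tamil": (0x0B80, 0x0BFF),
    "Telugu": (0x0C00, 0x0C7F),
    "Kannada": (0x0C80, 0x0CFF),
    "Malayalam": (0x0D00, 0x0D7F),
}


def detect_script(text):
    """Return the name of the most frequent Indic script in text, or None."""
    counts = Counter()
    for ch in text:
        code_point = ord(ch)
        for name, (low, high) in SCRIPT_RANGES.items():
            if low <= code_point <= high:
                counts[name] += 1
                break
    if not counts:
        # No character fell inside any of the listed blocks.
        return None
    return counts.most_common(1)[0][0]
```

Calling `detect_script` on a Kannada string returns `"Kannada"`, while plain Latin text returns `None` because no listed block matches.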
Keywords
Indic languages, text classification, natural language processing