Performance Evaluation of Different Word Embedding Techniques Across Machine Learning and Deep Learning Models

Tanmoy Mazumder, Shawan Das, Md. Hasibur Rahman, Tanjina Helaly, Tanmoy Sarkar Pias

2022 25th International Conference on Computer and Information Technology (ICCIT) (2022)

Abstract
Sentiment analysis is one of the core fields of Natural Language Processing (NLP). Numerous machine learning and deep learning algorithms have been developed for this task. Deep learning models generally perform better because they are trained on massive amounts of data. This is also a disadvantage: collecting sufficient data is a challenge, and training on it requires devices with high computational power. Word embedding is a vital step in applying machine learning models to NLP tasks, and different word embedding techniques affect the performance of machine learning algorithms. This paper evaluates the GloVe, CountVectorizer, and TF-IDF embedding techniques with multiple machine learning models and shows that the right combination of embedding technique and machine learning model (TF-IDF + Logistic Regression: 87.75% accuracy) can perform nearly as well as, or better than, deep learning models (LSTM: 87.89%).
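To illustrate how the choice of representation changes what a classifier sees, here is a minimal sketch contrasting raw term counts (the idea behind CountVectorizer) with TF-IDF weights. It is not the authors' pipeline: the tokenizer, the three toy reviews, the function names, and the textbook weighting idf = ln(N / df) + 1 are all assumptions made for illustration, and library implementations may use a different smoothing or normalization.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace; a deliberately naive tokenizer (assumption).
    return text.lower().split()


def count_vectors(docs):
    """Return (vocabulary, list of raw term-count vectors)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tfidf_vectors(docs):
    """Return (vocabulary, list of TF-IDF vectors) with idf = ln(N / df) + 1."""
    vocab, counts = count_vectors(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for vec in counts if vec[i] > 0) for i in range(len(vocab))]
    idf = [math.log(n_docs / d) + 1.0 for d in df]
    return vocab, [[c * w for c, w in zip(vec, idf)] for vec in counts]


# Hypothetical toy reviews, not taken from the paper's dataset.
reviews = [
    "the movie was great",
    "the movie was terrible",
    "great acting great plot",
]
vocab, counts = count_vectors(reviews)
_, weights = tfidf_vectors(reviews)
```

In the count vectors every word that occurs once gets the same value, whereas in the TF-IDF vectors a word that appears in only one review (such as "terrible") receives a larger weight than a word shared by several reviews (such as "the"). Either kind of vector could then be passed to a classifier such as logistic regression.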
Keywords
GloVe, TF-IDF, Count Vectorizer, Keras Embedding, BERT, Deep Learning, Sentiment Analysis