Feature selection method on twitter dataset with part-of-speech (PoS) pattern applied to traffic analysis

INTERNATIONAL JOURNAL OF SYSTEM ASSURANCE ENGINEERING AND MANAGEMENT (2024)

Abstract
Transportation plays a major role in day-to-day city life. Managing present-day traffic with traditional approaches is a complex task for transportation agencies, so Intelligent Transportation Systems are applied to provide traffic management solutions such as parking, e-toll charging, and traffic control by analyzing data from related sources. Data is collected from various sources to analyze transportation needs, yet transportation issues remain one of the major problems in cities. Unstructured data provides an enormous volume of information for big data analytics, but processing unstructured content remains a challenge in industry. Passive data such as social media data is a major data source for intelligent systems; social media applications such as Twitter and Facebook, where users share live comments based on their interaction with the world, are a rich source of passive data. Social media data helps in analyzing traffic issues such as traffic jams, accident locations, and road conditions. The major issue with social media data is that processing and analysis are very complex because of the data's volume and format. Big data architecture helps in extracting, processing, loading into a database, and analyzing this unstructured data. In sentiment analysis, tweets are mainly classified as positive, negative, or neutral. Because the polarity of neutral tweets is zero, they cannot be used for opinion mining. This paper therefore focuses on classifying neutral tweets based on feature selection. Part-of-speech (PoS) tagging is used to label the words in tweet text and find nouns, for example location, date, and time, which are compared with the other attribute values to improve the classification of neutral tweets.
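To make the idea concrete, the following is a minimal sketch of how a zero-polarity tweet might still be classified from its noun-like cues. It is not the paper's implementation: the tiny `POLARITY` lexicon stands in for SentiWordNet scores, the `LOCATION_NOUNS` and `TRAFFIC_NOUNS` word lists stand in for the nouns a real PoS tagger would label, and all function names are hypothetical.

```python
import re

# Hypothetical miniature polarity lexicon (assumption); a real system
# would look up scores in an opinion lexicon such as SentiWordNet.
POLARITY = {"good": 1.0, "smooth": 0.5, "bad": -1.0, "terrible": -1.0}

# Hypothetical word lists (assumption) standing in for PoS-tagged nouns.
LOCATION_NOUNS = {"highway", "bridge", "junction", "road", "street"}
TRAFFIC_NOUNS = {"accident", "jam", "congestion", "pothole", "diversion"}

# Simple clock-time pattern such as "8:30 am" or "5pm".
TIME_PATTERN = re.compile(r"\b\d{1,2}(?::\d{2})?\s?(?:am|pm)\b", re.IGNORECASE)


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z']+", text.lower())


def polarity(text):
    """Sum lexicon scores; words missing from the lexicon score zero."""
    return sum(POLARITY.get(tok, 0.0) for tok in tokenize(text))


def extract_features(text):
    """Collect location, traffic-event, and time cues from a tweet."""
    tokens = tokenize(text)
    return {
        "locations": [t for t in tokens if t in LOCATION_NOUNS],
        "events": [t for t in tokens if t in TRAFFIC_NOUNS],
        "times": TIME_PATTERN.findall(text),
    }


def classify_neutral(text):
    """Label zero-polarity tweets by their extracted features."""
    if polarity(text) != 0:
        return "opinionated"
    feats = extract_features(text)
    if feats["events"] and (feats["locations"] or feats["times"]):
        return "traffic-related"
    return "other"
```

For example, `classify_neutral("Accident near the bridge at 8:30 am, two lanes closed")` carries no lexicon polarity, but its event, location, and time cues mark it as traffic-related, which is the kind of information a polarity-only approach would discard.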
The research presented in this paper takes social media speech data (tweets) from Twitter as input and applies preprocessing techniques to the collected data. Feature selection methods are then used to extract tweet-related features for classifying neutral tweets, giving a better understanding of road conditions and helping identify traffic patterns; finally, traffic behavior is analyzed using an ensemble machine learning algorithm. The proposed model provides a new approach to sentiment analysis based on feature selection. The findings show that the SentiWordNet opinion lexicon approach gives 56% accuracy for positive or negative opinion on the Twitter dataset, whereas the proposed feature selection-based opinion mining model increases accuracy substantially, to 88%.
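The abstract does not say which ensemble algorithm is used, so the sketch below only illustrates the general idea with hard majority voting. The three base classifiers and the feature dictionary layout (`events`, `locations`, `times`) are assumptions made for illustration, not the paper's model.

```python
from collections import Counter


# Hypothetical base classifiers (assumption): each votes from one feature.
def by_event(features):
    return "traffic" if features["events"] else "other"


def by_location(features):
    return "traffic" if features["locations"] else "other"


def by_time(features):
    return "traffic" if features["times"] else "other"


def majority_vote(features, classifiers=(by_event, by_location, by_time)):
    """Return the label chosen by most base classifiers (hard voting)."""
    votes = Counter(clf(features) for clf in classifiers)
    return votes.most_common(1)[0][0]
```

With three voters and two labels there is always a strict majority. A tweet with an event and a location but no time is labelled `traffic` by two of the three voters, while a tweet with only a location is labelled `other`.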
Keywords
Intelligent transport system, Neutral tweets, Feature selection, Feature extraction, Sentimental analysis, Opinion mining