Granular IoT Device Identification Using TF-IDF and Cosine Similarity

PROCEEDINGS OF THE 5TH WORKSHOP ON CPS & IOT SECURITY AND PRIVACY, CPSIOTSEC 2023(2023)

引用 0|浏览0
暂无评分
摘要
Internet of things (IoT) devices are becoming more prevalent in home environments and are shown to be generally insecure. There have been many previous studies looking to identify unknown IoT devices on networks. To truly secure a network however, there is a need to identify unknown devices down to the granularity of firmware version; a problem previous studies have failed to solve. As devices change versions, it is expected that there would be subtle differences in the on-wire signatures that would be hard for a human analyst to notice, but easy for an NLP technique to identify. In this paper we extract keywords from both encrypted and unencrypted network traffic and first use UMAP with K-Means clustering to visualise the data and show that natural clusters form across our test dataset of 18 devices covering 61 versions. This analysis suggests that there are underlying patterns in the extracted keywords that could be detected by machine learning techniques. We then show that these patterns can be detected by proposing a novel technique using TF-IDF and cosine similarity that follows the clustering results to identify IoT devices down to the level of firmware version. We show that our chosen features are strong enough to work accurately across a range of device types, manufacturers, models and versions, and note the main observations found when trying to identify devices down to a firmware version. This approach to get granularity down to device version level achieves an accuracy of 67% without being to the detriment of identifying device models, where we achieve an accuracy of 90%.
更多
查看译文
关键词
Internet of Things (IoT),Machine Learning (ML),Firmware,Versions,Device Identification,TF-IDF,UMAP,K-Means
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要