TBDClust

Periodicals(2017)

Abstract
The World Wide Web has evolved rapidly, incorporating new content types and becoming more dynamic. The contents of a website can be distributed across several servers, and as a consequence, web traffic has become increasingly complex. From a network traffic perspective, it can be difficult to ascertain which websites a user is visiting, let alone which part of the user's traffic each website is responsible for. In this paper we present a method for identifying the TCP connections involved in the same full webpage download without the need for deep packet inspection. This identification is needed, for example, to enable free browsing of specific websites under pay-per-use mobile Internet access. This could apply not only to websites promoted by third parties but also to portals for governmental or medical emergency websites. The proposal is based on a modification of the DBSCAN clustering algorithm to work online and over one-dimensional sorted data. To validate our results we use both real traffic and packet captures from a controlled environment. The proposal achieves excellent results in consistency (99%) and completeness (92%), meaning that it makes few errors when identifying webpage downloads.
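The abstract does not describe the authors' modification of DBSCAN, so the following is only a minimal sketch of the general idea of online density clustering over one-dimensional sorted data, not the paper's algorithm. It assumes the input is a stream of TCP connection start times in ascending order and uses a hypothetical gap threshold `eps`. In one dimension this gap rule yields the same groups as DBSCAN when every point counts as a core point (a minimum of one point per neighborhood). The function name and the sample timestamps are made up for illustration.

```python
from typing import Iterable, Iterator, List


def cluster_sorted_times(times: Iterable[float], eps: float) -> Iterator[List[float]]:
    """Group ascending timestamps into clusters in a single pass.

    A new cluster starts whenever the gap to the previous timestamp
    exceeds eps (an assumed threshold, in the same unit as the timestamps).
    Each cluster is emitted as soon as it is known to be closed, so the
    function can consume a live stream.
    """
    current: List[float] = []
    for t in times:
        # The input is sorted, so only the last point of the open cluster
        # needs to be compared with the new arrival.
        if current and t - current[-1] > eps:
            yield current
            current = []
        current.append(t)
    if current:
        yield current


# Hypothetical connection start times in seconds (illustrative values only).
starts = [0.00, 0.05, 0.30, 0.90, 12.00, 12.10, 12.40]
clusters = list(cluster_sorted_times(starts, eps=1.0))
for number, cluster in enumerate(clusters, start=1):
    print(number, cluster)
```

Because the data arrive sorted, each new connection is compared only with the most recent one, so the sketch needs one pass and keeps only the open cluster in memory. A real deployment would also need per-user separation of traffic and a way to label sparse points as noise, neither of which is shown here.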
Keywords
Clustering TCP connections, DBSCAN, Mobile web browsing, Online monitoring, Real traffic dataset, Time-based density clustering