Streaming Analytics with Adaptive Near-data Processing.

International Workshop on Multimodal Human Understanding for the Web and Social Media (2022)

Abstract
Streaming analytics applications need to process massive volumes of data in a timely manner, in domains ranging from datacenter telemetry and geo-distributed log analytics to Internet-of-Things systems. Such applications incur significant network costs to transport the data to a stream processor and significant compute costs to analyze it in a timely manner. Pushing the computation closer to the data source by partitioning the analytics query is an effective strategy to reduce resource costs for the stream processor. However, the partitioning strategy depends on the nature of the resource bottleneck and on the resource variability encountered at the compute resources near the data source. In this paper, we investigate the issues that affect query partitioning strategies. We first study new partitioning techniques within cloud datacenters that operate under constrained compute conditions varying widely across data sources and time slots. With insights from this study, we suggest several ways to improve the performance of stream analytics applications operating in different resource environments by making effective partitioning decisions for a variety of use cases, such as geo-distributed streaming analytics.