Declarative Distributed Stream Processing

semanticscholar(2020)

Abstract
Many modern applications, in domains ranging from smart cities to healthcare, require the timely processing of data arriving at high speed, for example data generated by sensors in the Internet of Things (IoT). Such systems may also need to meet a range of non-functional requirements, including reliability, security, energy efficiency (for example, to prolong the battery life of sensors in the field), and privacy (to remove or de-personalise data before transmitting it over open networks). The combination of these requirements, very high data arrival rates, and the desire for timely processing makes the design and management of the supporting infrastructure very challenging. The current generation of IoT tools adopts the principles of stream processing and is designed around a three-tier architecture: sensors generate data that is sent to a local gateway (e.g. a smartphone or field gateway) before being passed on to the Cloud for processing. However, in some domains it can be beneficial to perform some processing on the gateway or on the sensors themselves [MW17]. Doing so can reduce the volume of data sent onwards to the Cloud, reduce the frequency with which sensors must invoke their networking hardware (and thus their energy expenditure), or avoid transmitting sensitive data across public networks by filtering or anonymising data sets at the point of collection. Modern distributed stream-processing systems attempt, with varying degrees of success, to separate the functional definition (typically specified as a software program) from the non-functional requirements (the deployment environment and constraints such as power requirements and network utilisation limits). In practice, the programmer must typically consider and address both functional and non-functional requirements when authoring a stream-processing program.
We are exploring a different approach in which the stream-processing operations and the non-functional requirements are described separately and supplied as inputs to an Optimiser. The Optimiser automatically generates the most appropriate deployment for the available resources, which may include sensors and gateways, so that both the functional and non-functional requirements are met. An initial approach (by PhD student Peter Michalák [MW17]) used an extended version of SQL to describe the computation. However, SQL's limited expressivity proved a barrier to encoding some stream-processing operations.
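
To make the idea of separate inputs concrete, the following is a minimal illustrative sketch and not the Optimiser described in this work. It assumes a linear pipeline over the three tiers named above, a brute-force search over placements, and a cost equal to the data volume crossing tier boundaries. All names (`Operator`, `Requirements`, `optimise`, `anonymise_before_cloud`) and all numbers are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations_with_replacement
from typing import Optional

TIERS = ("sensor", "gateway", "cloud")  # three-tier architecture


@dataclass(frozen=True)
class Operator:
    """Functional description of one pipeline stage (hypothetical model)."""
    name: str
    selectivity: float  # output volume divided by input volume
    cpu: float          # compute units needed per unit of input


@dataclass(frozen=True)
class Requirements:
    """Non-functional description, kept separate from the pipeline."""
    capacity: dict                                 # tier name -> compute units available
    anonymise_before_cloud: Optional[str] = None   # operator that must not run on the cloud


def evaluate(ops, placement, source_rate):
    """Return (network volume, per-tier compute load) for one placement."""
    volume, rate, tier = 0.0, source_rate, 0       # data starts on the sensor
    load = {name: 0.0 for name in TIERS}
    for op, t in zip(ops, placement):
        volume += rate * (t - tier)                # data crosses (t - tier) links
        load[TIERS[t]] += rate * op.cpu
        rate *= op.selectivity
        tier = t
    volume += rate * (len(TIERS) - 1 - tier)       # results are delivered to the cloud
    return volume, load


def optimise(ops, reqs, source_rate):
    """Brute-force search for the feasible placement with least network volume."""
    best, best_volume = None, float("inf")
    # Non-decreasing tier indices: data only flows sensor -> gateway -> cloud.
    for placement in combinations_with_replacement(range(len(TIERS)), len(ops)):
        volume, load = evaluate(ops, placement, source_rate)
        if any(load[t] > reqs.capacity[t] for t in TIERS):
            continue                               # exceeds a tier's compute budget
        if any(op.name == reqs.anonymise_before_cloud and TIERS[t] == "cloud"
               for op, t in zip(ops, placement)):
            continue                               # violates the privacy requirement
        if volume < best_volume:
            best, best_volume = placement, volume
    if best is None:
        return None                                # no deployment meets the requirements
    return {op.name: TIERS[t] for op, t in zip(ops, best)}


# Example inputs (made-up numbers, for illustration only).
pipeline = [
    Operator("filter", selectivity=0.1, cpu=1.0),
    Operator("anonymise", selectivity=1.0, cpu=2.0),
    Operator("aggregate", selectivity=0.01, cpu=5.0),
]
requirements = Requirements(
    capacity={"sensor": 110.0, "gateway": 50.0, "cloud": 1e9},
    anonymise_before_cloud="anonymise",
)
deployment = optimise(pipeline, requirements, source_rate=100.0)
```

In this sketch the pipeline never mentions tiers and the requirements never mention operators' logic, so changing the gateway's capacity or dropping the privacy constraint changes the deployment without touching the pipeline definition. A realistic optimiser would need a richer cost model (energy, latency, reliability) and a search strategy that scales beyond exhaustive enumeration.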