Boolean Logical Operator Driven Selective Data Filtering For Large Datasets

Glenn Davidson, Shikharesh Majumdar

2022 Annual Modeling and Simulation Conference (ANNSIM) (2022)

Abstract
Users of a system that processes large data sets are often interested in only a small subset of the available data. This paper presents a parallel-processing-based data filtering technique that extracts and stores only the subset of data that is of interest to a given user. A user's interests are captured as a set of keywords or phrases, which may be combined with Boolean operators. An Apache Spark-based prototype is built and deployed on the Amazon EC2 cloud to demonstrate the viability of the approach and to analyze the performance of the proposed technique.
Keywords
Apache Spark, Parallel Data Filtering, Boolean Operator based Selective Data Reduction, Big Data Processing on Clouds
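
The abstract does not say how keywords and Boolean operators are represented or evaluated, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a hypothetical nested-tuple encoding of the user's preference expression, case-insensitive substring matching, and plain text records; the names `matches` and `filter_records` are illustrative.

```python
from typing import Iterable, List, Tuple, Union

# Hypothetical encoding of a user's preference (an assumption, not from the paper):
#   "phrase"                 -> true if the phrase occurs in the record
#   ("NOT", expr)            -> logical negation
#   ("AND", expr, expr, ...) -> all sub-expressions must hold
#   ("OR", expr, expr, ...)  -> at least one sub-expression must hold
Expr = Union[str, Tuple]


def matches(expr: Expr, record: str) -> bool:
    """Evaluate a Boolean keyword expression against one text record."""
    if isinstance(expr, str):
        # Case-insensitive substring match (an assumed matching rule).
        return expr.lower() in record.lower()
    op, *args = expr
    if op == "NOT":
        return not matches(args[0], record)
    if op == "AND":
        return all(matches(a, record) for a in args)
    if op == "OR":
        return any(matches(a, record) for a in args)
    raise ValueError(f"unknown operator: {op}")


def filter_records(expr: Expr, records: Iterable[str]) -> List[str]:
    """Keep only the records that satisfy the user's expression."""
    return [r for r in records if matches(expr, r)]


records = [
    "Apache Spark job deployed on EC2",
    "Spark cluster in the cloud (deprecated setup)",
    "Hadoop job on EC2",
    "Spark streaming on a private cloud",
]

# spark AND (cloud OR ec2) AND NOT deprecated
interest = ("AND", "spark", ("OR", "cloud", "ec2"), ("NOT", "deprecated"))
kept = filter_records(interest, records)
```

Because `matches` examines one record at a time and keeps no shared state, a predicate of this shape could in principle be applied in parallel, for example by passing it to PySpark's `RDD.filter`; whether the prototype described in the paper is structured this way is not stated in the abstract.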