Characterizing I/O optimization opportunities for array-centric applications on HDFS

2018 IEEE High Performance extreme Computing Conference (HPEC), 2018

Abstract
An impedance mismatch exists between the increasing sophistication of array-centric analytics and the bytestream-based POSIX interface of parallel file systems. This mismatch is particularly acute in data-intensive scientific applications. This paper examines performance bottlenecks and describes optimizations to alleviate them in the context of computational astronomy pipelines and the Hadoop distributed file system (HDFS). We find that fast data ingestion and intelligent object consolidation promise to accelerate I/O performance by two orders of magnitude.
Key words
computational astronomy pipelines, Hadoop distributed file system, HDFS, fast data ingestion, array-centric applications, impedance mismatch, array-centric analytics, bytestream-based POSIX interface, parallel file systems, data-intensive scientific applications, performance bottlenecks, characterizing I/O optimization opportunities, intelligent object consolidation