Feature reduction of Darshan counters using evolutionary algorithms

semanticscholar(2021)

Abstract
Feature reduction is an integral part of data preparation in machine learning. It helps denoise the data and makes the model easier to fit. Predicting the performance of an application from Darshan counters can be difficult because of the large number of counters available, not all of which are pertinent to predicting I/O performance. Several methods for feature reduction exist, the most common being Recursive Feature Elimination (RFE) [1]. The RFE method aims to correlate the features with a specific data point. We aim to obtain a subset of features that can distinguish between different applications. We then evaluate the effectiveness of that subset by building a model to predict I/O performance and comparing it with similar models built from all the features and from a subset of features obtained using the RFE implementation in scikit-learn [2].

Currently, a variety of profiling tools, such as Darshan [3] and Recorder [5], can give us an idea of the I/O behavior of an application [4]. The counters they record, while useful, are not all necessary to distinguish applications. Additionally, when maintaining a set of varied applications against which new applications are compared, fewer counters also mean smaller storage size and lower computational requirements. In this effort, we use evolutionary algorithms, such as genetic algorithms, to select a subset of Darshan counters that can be used to distinguish between a varied set of applications. We then compare how well this subset of counters correlates with the I/O performance of the application.
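To make the idea of selecting counters with a genetic algorithm concrete, here is a minimal sketch. It is not the authors' implementation. The fitness function (in-sample nearest-centroid accuracy minus a small penalty per selected counter), the synthetic "counter" data, and all names (`make_toy_data`, `centroid_accuracy`, `fitness`, `select_features`) are assumptions made purely for illustration. Each individual is a bit mask over the counters. The population is seeded with the all-counters mask and the better half survives each generation, so the returned mask never scores worse than using every counter.

```python
import random


def make_toy_data(seed=0, per_app=20, n_counters=8, informative=(0, 3)):
    """Synthetic stand-in for Darshan counters (assumption: not real data).

    Three hypothetical applications; only the `informative` counters differ
    between applications, the rest are high-variance noise.
    """
    rng = random.Random(seed)
    samples, labels = [], []
    for app in range(3):
        for _ in range(per_app):
            row = [rng.gauss(0, 50) for _ in range(n_counters)]
            for i in informative:
                row[i] = rng.gauss(10 * app, 1)
            samples.append(row)
            labels.append(app)
    return samples, labels


def centroid_accuracy(samples, labels, mask):
    """Fraction of samples closest to their own application's centroid,
    measured only on the counters selected by `mask`."""
    idx = [i for i, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0
    groups = {}
    for x, y in zip(samples, labels):
        groups.setdefault(y, []).append([x[i] for i in idx])
    centroids = {
        y: [sum(col) / len(rows) for col in zip(*rows)]
        for y, rows in groups.items()
    }
    correct = 0
    for x, y in zip(samples, labels):
        v = [x[i] for i in idx]
        pred = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
        )
        correct += pred == y
    return correct / len(samples)


def fitness(samples, labels, mask, penalty=0.01):
    """Reward separability of applications, penalise each selected counter."""
    return centroid_accuracy(samples, labels, mask) - penalty * sum(mask)


def select_features(samples, labels, pop_size=30, generations=40,
                    mutation_rate=0.05, seed=0):
    """Genetic algorithm over bit masks; assumes at least two counters."""
    rng = random.Random(seed)
    n = len(samples[0])

    def score(mask):
        return fitness(samples, labels, mask)

    # Seed the population with the all-counters mask plus random masks.
    pop = [[1] * n]
    pop += [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]          # survivors, kept unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)         # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with probability `mutation_rate`.
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=score)


if __name__ == "__main__":
    X, y = make_toy_data()
    best = select_features(X, y)
    print("selected counters:", [i for i, bit in enumerate(best) if bit])
    print("fitness:", fitness(X, y, best))
```

In a real study, the rows would come from parsed Darshan logs, and the selected subset would then be fed to a performance model and compared against an RFE-selected subset, as the abstract describes.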