Benchmarking Deep Learning Frameworks: Design Considerations, Metrics and Beyond

2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS), 2018

Cited by 48 | Viewed 38
Abstract
With the increasing number of open-source deep learning (DL) software tools available, benchmarking DL software frameworks and systems is in high demand. This paper presents design considerations, metrics, and challenges towards developing an effective benchmark for DL software frameworks, and illustrates our observations through a comparative study of three popular DL frameworks: TensorFlow, Caffe, and Torch. First, we show that these deep learning frameworks are optimized with their default configuration settings. However, a default configuration optimized on one specific dataset may not work effectively on other datasets with respect to runtime performance and learning accuracy. Second, the default configuration optimized on a dataset by one DL framework does not work well for another DL framework on the same dataset. Third, experiments show that different DL frameworks exhibit different levels of robustness against adversarial examples. Through this study, we envision that, unlike traditional performance-driven benchmarks, benchmarking deep learning software frameworks should take into account both runtime performance and accuracy, as well as their latent interaction with hyper-parameters and data-dependent configurations of DL frameworks.
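The abstract argues that a DL benchmark must report runtime and accuracy jointly and track how both vary with hyper-parameter and data-dependent configurations. As a minimal sketch of that idea (not the paper's actual harness), the following Python snippet records both metrics for each framework and configuration pair; run_benchmark, train_and_eval, and BenchmarkResult are hypothetical names introduced here for illustration, and the framework-specific training code (TensorFlow, Caffe, or Torch) is left to the caller.

```python
# Hypothetical benchmark harness sketch: capture runtime AND accuracy per
# (framework, configuration) pair, so that interactions with hyper-parameters
# become visible instead of reporting either metric in isolation.
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BenchmarkResult:
    framework: str
    config: Dict[str, float]   # hyper-parameters, e.g. learning rate, batch size
    train_seconds: float       # wall-clock training time
    accuracy: float            # test-set accuracy


def run_benchmark(
    framework: str,
    configs: List[Dict[str, float]],
    train_and_eval: Callable[[Dict[str, float]], float],
) -> List[BenchmarkResult]:
    """Run one framework over several configurations, recording both metrics."""
    results = []
    for config in configs:
        start = time.perf_counter()
        accuracy = train_and_eval(config)        # framework-specific train + eval
        elapsed = time.perf_counter() - start
        results.append(BenchmarkResult(framework, config, elapsed, accuracy))
    return results
```

A caller would supply one train_and_eval function per framework and the same list of configurations to all of them, which makes the paper's second observation (a configuration tuned for one framework transferring poorly to another) directly measurable from the collected results.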
Keywords
Benchmarking, Deep Learning Frameworks, Performance, Accuracy, Adversarial Examples, Robustness