Evaluating Energy Efficiency of GPUs using Machine Learning Benchmarks

IPDPS Workshops (2023)

Abstract
As we enter the exascale era, the energy efficiency and performance of High-Performance Computing (HPC) systems, especially those running Machine Learning (ML) applications, are becoming increasingly important. Nvidia recently released its 9th-generation HPC-grade Graphics Processing Unit (GPU) microarchitecture, Ampere, claiming significant improvements over the previous-generation Volta architecture. In this paper, we perform fine-grained power collection and assess the performance of these two HPC architectures by profiling ML benchmarks. In addition, we analyze various hyperparameters, primarily the batch size and the number of GPUs, to determine their impact on these systems' performance and power efficiency. While Ampere is 3.16x more energy-efficient than Volta in isolation, this advantage is counteracted by the PCIe interconnects of the A100s as the ML tasks are parallelized across more GPUs.
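The abstract does not specify the power-measurement tooling beyond the nvprof keyword; as an illustration only, the sketch below shows one hypothetical way to collect fine-grained GPU power samples through NVML (via the pynvml bindings) while a benchmark workload runs, and to integrate the samples into an energy estimate. The function names, sampling interval, and placeholder workload are assumptions, not the authors' method.

```python
# Minimal sketch (assumption, not the paper's actual tooling): sample GPU power
# draw via NVML while a workload runs, then integrate samples into energy (J).
import time
import threading
import pynvml

def sample_power(device_index, interval_s, stop_event, samples):
    """Poll instantaneous power draw (watts) for one GPU until stop_event is set."""
    handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
    while not stop_event.is_set():
        # nvmlDeviceGetPowerUsage reports milliwatts
        samples.append(pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0)
        time.sleep(interval_s)

def measure_energy(workload, device_index=0, interval_s=0.05):
    """Run workload() and return (elapsed_s, mean_power_w, energy_j)."""
    pynvml.nvmlInit()
    samples, stop = [], threading.Event()
    sampler = threading.Thread(
        target=sample_power, args=(device_index, interval_s, stop, samples))
    start = time.time()
    sampler.start()
    try:
        workload()  # e.g. one training epoch of an ML benchmark at a given batch size
    finally:
        stop.set()
        sampler.join()
        pynvml.nvmlShutdown()
    elapsed = time.time() - start
    mean_w = sum(samples) / max(len(samples), 1)
    return elapsed, mean_w, mean_w * elapsed  # energy = average power x time

if __name__ == "__main__":
    # Placeholder workload; in practice this would be a Hugging Face training
    # or inference step run on one or more GPUs.
    t, p, e = measure_energy(lambda: time.sleep(2.0))
    print(f"time={t:.1f}s  mean_power={p:.1f}W  energy={e:.1f}J")
```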
Keywords
high-performance computing, benchmarking, machine learning, GPU, Ampere, NVLink, nvprof, memory footprint, data movement, Hugging Face