mNPUsim: Evaluating the Effect of Sharing Resources in Multi-core NPUs

2023 IEEE INTERNATIONAL SYMPOSIUM ON WORKLOAD CHARACTERIZATION, IISWC(2023)

Cited 0|Views10
No score
Abstract
Multi-core neural processing units (NPUs) have emerged to scale the computation capability of NPUs to efficiently support diverse machine learning tasks. In such multi-core NPUs, workloads in different cores can affect other co-runners by incurring contentions on shared resources such as external memory bandwidth and memory management unit (MMU) for address translation. However, many recent NPU studies use a single-core NPU framework without considering dynamic effect by the shared resources. For this study, we develop a new multi-core NPU simulator to assess the effect of resource sharing accurately. Using the simulator, this paper reports the sharing behaviors of multi-core NPUs with respect to overall throughput and performance variance caused by co-runners. The evaluation of the dual and quad core NPUs shows that sharing MMU and memory bandwidth in general is beneficial for throughput, with minor degradation in fairness. The evaluation also shows that page table walkers in MMU are one of the critical shareable resources. Due to the bursty nature of NPU memory accesses, sharing of walker bandwidth across multiple cores can significantly improve the performance. The study extends the evaluation of scalability with multi-core NPUs, investigating the effect of mapping heterogeneous models to multiple NPUs.
More
Translated text
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined