Hardware Specialization: Estimating Monte Carlo Cross-Section Lookup Kernel Performance and Area.

SC-W '23: Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis (2023)

Abstract
Hardware specialization is one of the promising directions in the post-Moore era. It is imperative to understand how hardware specialization paradigms can benefit HPC. An essential question is how to estimate the theoretical performance of an optimally specialized architecture without requiring extensive hardware development expertise and effort. Focusing on the Monte Carlo cross-section lookup kernel, known for its notably low resource utilization, we develop a workflow that simulates a specialized architecture’s timing and estimates its resource usage, leveraging open-source hardware tools. We implement the building blocks of the kernel pipeline in the Chisel hardware construction language and generate Verilog code for resource estimation. Our late-breaking results show a kernel latency of 46 cycles per lookup, compared with 680 cycles for the optimized CPU code, and indicate that potentially 15k pipeline copies could fit within a 698 mm² die, a size reflective of the Intel Xeon Platinum 8180.
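To put the reported figures in perspective, the following minimal sketch works through the arithmetic they imply. It uses only the numbers quoted in the abstract; the function names are hypothetical, "15k" is treated as exactly 15,000, and the per-copy area assumes the entire die is given over to pipeline copies, which is a simplification rather than a claim made by the authors.

```python
# Figures quoted in the abstract.
SPECIALIZED_CYCLES_PER_LOOKUP = 46   # specialized pipeline latency
CPU_CYCLES_PER_LOOKUP = 680          # optimized CPU code
PIPELINE_COPIES = 15_000             # "15k", taken as exactly 15,000 (assumption)
DIE_AREA_MM2 = 698.0                 # die area used as the budget


def latency_ratio(cpu_cycles: float, specialized_cycles: float) -> float:
    """Ratio of per-lookup cycle counts (a latency ratio, not a throughput claim)."""
    return cpu_cycles / specialized_cycles


def area_per_copy_mm2(die_area_mm2: float, copies: int) -> float:
    """Average die area per pipeline copy, assuming the whole die holds copies."""
    return die_area_mm2 / copies


ratio = latency_ratio(CPU_CYCLES_PER_LOOKUP, SPECIALIZED_CYCLES_PER_LOOKUP)
area = area_per_copy_mm2(DIE_AREA_MM2, PIPELINE_COPIES)

print(f"cycle-count ratio (CPU / specialized): {ratio:.2f}")
print(f"implied area per pipeline copy: {area:.4f} mm^2")
```

Under these assumptions the cycle-count ratio is a little under 15, and each pipeline copy would occupy a bit less than 0.05 mm² on average. Because the two cycle counts may correspond to different clock frequencies, the ratio should not be read as a wall-clock speedup.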