A Shared-Memory Parallel Implementation of the RePlAce Global Cell Placer

2020 33rd International Conference on VLSI Design and 2020 19th International Conference on Embedded Systems (VLSID), 2020

Abstract
RePlAce is a state-of-the-art prototype of a flat, analytic, nonlinear global cell placement algorithm that models a placement instance as an electrostatic system with positively charged objects. It can handle large-scale standard-cell and mixed-cell placement, achieving shorter wirelength and similar or shorter runtimes than other state-of-the-art placers on the ISPD-2005/2006 standard-cell benchmarks. However, the runtime of RePlAce on these benchmarks ranges from 15 minutes to more than 5 hours on a 2.6 GHz Intel Xeon server running a single thread, which makes development cycles prohibitively long. To address this concern, this paper introduces a multi-threaded shared-memory implementation of RePlAce. The contributions include techniques to reduce memory contention and to balance the workload effectively among threads, targeting the most substantial performance bottlenecks. With 2 to 12 threads, our parallel RePlAce speeds up the bin density function by $4.2\times$ to $10\times$, the wirelength function by $2.3\times$ to $3\times$, and the cost gradient function by $2.9\times$ to $6.6\times$ compared to the single-threaded original RePlAce baseline. Moreover, our parallel RePlAce is $\approx 3.5\times$ faster than the state-of-the-art PyTorch-based placer DREAMPlace when both run on 12 CPU cores.
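The abstract does not spell out how memory contention is reduced or how work is balanced. As a generic illustration only, and not the paper's actual technique, one common pattern is to give each thread a private copy of the bin grid, split the cells into near-equal contiguous chunks, and merge the private grids at the end. The sketch below assumes a hypothetical `bin_density` function, a square grid, and a simplified model in which each cell's whole area is credited to the single bin containing its coordinates. In CPython the global interpreter lock prevents a real speedup for this loop, so the code shows the structure of the pattern rather than its performance.

```python
from concurrent.futures import ThreadPoolExecutor


def bin_density(cells, grid, bin_size, num_threads=4):
    """Accumulate cell area per bin using thread-private grids.

    cells: list of (x, y, area) tuples (hypothetical simplified cell model).
    grid: number of bins per side; bin_size: side length of one bin.
    Returns a grid x grid list of lists indexed as [row][column].
    """

    def worker(chunk):
        # Each worker writes only to its own grid, so no locks or atomics
        # are needed and threads never contend for the same bin.
        local = [[0.0] * grid for _ in range(grid)]
        for x, y, area in chunk:
            col = min(int(x // bin_size), grid - 1)
            row = min(int(y // bin_size), grid - 1)
            local[row][col] += area
        return local

    # Static load balancing: contiguous chunks whose sizes differ by at most one.
    n = len(cells)
    bounds = [n * t // num_threads for t in range(num_threads + 1)]
    chunks = [cells[bounds[t]:bounds[t + 1]] for t in range(num_threads)]

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        partials = list(pool.map(worker, chunks))

    # Reduction step: sum the private grids into the final density map.
    total = [[0.0] * grid for _ in range(grid)]
    for part in partials:
        for r in range(grid):
            for c in range(grid):
                total[r][c] += part[r][c]
    return total
```

The trade-off in this pattern is extra memory (one grid per thread) and a final reduction pass in exchange for contention-free accumulation. Equal-count chunks balance the load only when every cell costs about the same to process, which is an assumption of this sketch.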
Keywords
VLSI placement, multithreading, parallelism