Accelerate Dense Matrix Multiplication on Heterogeneous-GPUs.

International Conference on Parallel and Distributed Systems (2023)

Abstract
Matrix multiplication is crucial in scientific computing, but it demands substantial resources. We propose a framework that effectively uses heterogeneous GPUs for large matrix multiplication. By splitting matrices into small blocks and applying Douglas’s variant of Strassen’s algorithm, we enable tasks to run concurrently on heterogeneous systems. Our framework improves speed by 89.5% on homogeneous GPU servers and by 108% in multi-server heterogeneous GPU setups.
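
To make the block-splitting idea concrete, the following is a minimal sketch of one level of the classic Strassen recursion in plain Python. It is not the Douglas variant named in the abstract and not the authors' framework: the helper names (`split`, `strassen_one_level`, and so on) are hypothetical, the matrices are assumed to be square with an even dimension, and GPU dispatch is not shown. What it illustrates is that the seven block products do not depend on one another, which is the property that would let a scheduler hand them to different devices.

```python
def mat_add(a, b):
    """Element-wise sum of two equally sized matrices (lists of rows)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_sub(a, b):
    """Element-wise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_mul(a, b):
    """Plain triple-loop product, used here as the per-block kernel."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def split(m):
    """Split an even-sized square matrix into four quadrant blocks."""
    h = len(m) // 2
    return (
        [r[:h] for r in m[:h]], [r[h:] for r in m[:h]],
        [r[:h] for r in m[h:]], [r[h:] for r in m[h:]],
    )


def strassen_one_level(a, b):
    """One level of classic Strassen: 7 block products instead of 8."""
    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)

    # The seven products are mutually independent, so each one could be
    # scheduled as a separate task (assumption: on separate devices).
    m1 = mat_mul(mat_add(a11, a22), mat_add(b11, b22))
    m2 = mat_mul(mat_add(a21, a22), b11)
    m3 = mat_mul(a11, mat_sub(b12, b22))
    m4 = mat_mul(a22, mat_sub(b21, b11))
    m5 = mat_mul(mat_add(a11, a12), b22)
    m6 = mat_mul(mat_sub(a21, a11), mat_add(b11, b12))
    m7 = mat_mul(mat_sub(a12, a22), mat_add(b21, b22))

    # Recombine the products into the four quadrants of the result.
    c11 = mat_add(mat_sub(mat_add(m1, m4), m5), m7)
    c12 = mat_add(m3, m5)
    c21 = mat_add(m2, m4)
    c22 = mat_add(mat_add(mat_sub(m1, m2), m3), m6)

    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom
```

In a real system the `mat_mul` calls would presumably be replaced by a GPU GEMM kernel and the recursion applied to larger blocks; the sketch only checks that the recombination reproduces the ordinary product.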