clCaffe: OpenCL Accelerated Caffe for Convolutional Neural Networks

2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (2016)

Abstract
Recent advances in deep convolutional neural networks enable researchers and developers to apply machine learning to a much broader range of applications. With the proliferation of deep learning applications, widely used deep learning frameworks, such as Caffe, Theano and Torch, have been significantly improved with the support of powerful GPUs and GPU-accelerated libraries. However, the lack of frameworks and libraries built on OpenCL could hinder exploration of more diverse compute devices (CPUs, GPUs, DSPs and FPGAs) in future deep learning domains. In this work, we present OpenCL acceleration of a well-known deep learning framework, Caffe, focusing on the convolution layer, which has been optimized with three different approaches: GEMM, spatial domain, and frequency domain. Our work, clCaffe, greatly enhances the ability to deploy deep learning use cases on all types of OpenCL devices, particularly on small form factor devices, in which discrete GPUs are rare and integrated GPUs are much more common. Our benchmark shows a 2.5× speedup on an Intel integrated GPU, compared to CPU-only AlexNet on the ImageNet dataset. As such, our work provides the deep learning community with the opportunity to embrace a broad range of devices through OpenCL.
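To make the distinction between the GEMM and spatial-domain approaches concrete, the following is a minimal pure-Python sketch, not the clCaffe OpenCL kernels. It assumes a single-channel image, stride 1, no padding, and no kernel flip (cross-correlation, as is conventional in deep learning frameworks). The function names `conv2d_direct`, `im2col`, `matmul`, and `conv2d_gemm` are hypothetical and chosen only for illustration; the frequency-domain approach is not shown.

```python
def conv2d_direct(image, kernel):
    """Spatial-domain convolution: slide the kernel over the image directly."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def im2col(image, kh, kw):
    """Unroll every kh x kw patch into one column of a (kh*kw) x (out_h*out_w) matrix."""
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [image[i + di][j + dj] for i in range(out_h) for j in range(out_w)]
        for di in range(kh) for dj in range(kw)
    ]


def matmul(a, b):
    """Plain matrix product of a (m x k) and b (k x n)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def conv2d_gemm(image, kernels):
    """GEMM-based convolution: all filters are applied with one matrix product."""
    kh, kw = len(kernels[0]), len(kernels[0][0])
    out_w = len(image[0]) - kw + 1
    cols = im2col(image, kh, kw)                                # (kh*kw) x (out_h*out_w)
    weights = [[w for row in k for w in row] for k in kernels]  # filters x (kh*kw)
    flat = matmul(weights, cols)                                # filters x (out_h*out_w)
    # Reshape each flat output row back into a 2-D feature map.
    return [[f[r:r + out_w] for r in range(0, len(f), out_w)] for f in flat]


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
kernels = [[[1, 0], [0, -1]], [[1, 1], [1, 1]]]

gemm_maps = conv2d_gemm(image, kernels)
direct_maps = [conv2d_direct(image, k) for k in kernels]
print(gemm_maps == direct_maps)
```

Both paths compute the same feature maps. The GEMM formulation trades extra memory for the unrolled patch matrix in exchange for reusing a single, heavily tuned matrix-multiply routine, whereas the direct formulation avoids that intermediate buffer.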
Keywords
Deep learning framework, Convolutional Neural Networks, Caffe, OpenCL, Integrated GPU