Comparing a High and Low-Level Deep Neural Network Implementation for Automatic Speech Recognition

High Performance Technical Computing in Dynamic Languages (2014)

Abstract
The use of deep neural networks (DNNs) has improved performance in several fields, including computer vision, natural language processing, and automatic speech recognition (ASR). The increased use of DNNs in recent years has been driven largely by the performance afforded by GPUs, as the computational cost of training large networks on a CPU is prohibitive. Many training algorithms are well suited to the GPU; however, writing hand-optimized GPGPU code is a significant undertaking. More recently, high-level libraries have attempted to simplify GPGPU development by automatically performing tasks such as optimization and code generation. This work uses Theano, a high-level Python library, to implement a DNN for phone recognition in ASR. Performance is compared against a low-level, hand-optimized C++/CUDA DNN implementation from Kaldi, a popular ASR toolkit. Results show that the Theano DNN implementation has CPU and GPU runtimes on par with those of Kaldi, while requiring approximately 95% fewer lines of code.
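As a rough illustration of the conciseness the abstract refers to, the sketch below shows how a one-hidden-layer network with symbolic gradients and SGD updates can be expressed in Theano. The layer sizes, learning rate, and variable names here are illustrative assumptions, not the paper's actual phone-recognition configuration; Theano compiles the same graph to optimized CPU or GPU code without any hand-written CUDA.

```python
import numpy as np
import theano
import theano.tensor as T

# Illustrative sizes only; the paper's network dimensions are not
# given in this abstract.
n_in, n_hidden, n_out = 440, 1024, 48
rng = np.random.RandomState(0)

x = T.matrix('x')   # minibatch of input feature vectors
y = T.ivector('y')  # integer phone labels

# Parameters live in shared variables so they persist across calls
# and can reside in GPU memory.
W1 = theano.shared(0.01 * rng.randn(n_in, n_hidden).astype(theano.config.floatX))
b1 = theano.shared(np.zeros(n_hidden, dtype=theano.config.floatX))
W2 = theano.shared(0.01 * rng.randn(n_hidden, n_out).astype(theano.config.floatX))
b2 = theano.shared(np.zeros(n_out, dtype=theano.config.floatX))

# One sigmoid hidden layer followed by a softmax output layer.
h = T.nnet.sigmoid(T.dot(x, W1) + b1)
p_y = T.nnet.softmax(T.dot(h, W2) + b2)

# Mean negative log-likelihood of the correct labels.
loss = -T.mean(T.log(p_y)[T.arange(y.shape[0]), y])

# Theano derives the gradients symbolically; no hand-coded backprop.
params = [W1, b1, W2, b2]
grads = T.grad(loss, params)
lr = 0.1  # assumed learning rate for illustration
updates = [(p, p - lr * g) for p, g in zip(params, grads)]

# Compiling generates optimized CPU or GPU code automatically.
train_step = theano.function([x, y], loss, updates=updates)
```

A comparable hand-optimized C++/CUDA implementation must manage device memory, kernel launches, and gradient code explicitly, which is the source of the order-of-magnitude difference in code size the paper reports.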
Keywords
learning (artificial intelligence), neural nets, optimisation, speech recognition, ASR toolkit, CPU runtimes, GPU runtimes, automatic speech recognition, code generation, computer vision, deep neural networks, hand-optimized C++/CUDA DNN implementation, hand-optimized GPGPU code, high-level Python library, low-level deep neural network, natural language processing, optimization, phone recognition, training algorithms, Python, Theano, DNN, Kaldi, CUDA, GPU