Accelerating Machine-Learning Algorithms on FPGAs using Pattern-Based Decomposition

Journal of Signal Processing Systems (2009)

Abstract
Machine-learning algorithms are employed in a wide variety of applications to extract useful information from data sets, and many are known to suffer from super-linear increases in computational time with increasing data size and number of signals being processed (data dimension). Certain principal machine-learning algorithms are commonly found embedded in larger detection, estimation, or classification operations. Three such principal algorithms are the Parzen window-based, non-parametric estimation of Probability Density Functions (PDFs), K-means clustering and correlation. Because they form an integral part of numerous machine-learning applications, fast and efficient execution of these algorithms is extremely desirable. FPGA-based reconfigurable computing (RC) has been successfully used to accelerate computationally intensive problems in a wide variety of scientific domains to achieve speedup over traditional software implementations. However, this potential benefit is quite often not fully realized because creating efficient FPGA designs is generally carried out in a laborious, case-specific manner requiring a great amount of redundant time and effort. In this paper, an approach using pattern-based decomposition for algorithm acceleration on FPGAs is proposed that offers significant increases in productivity via design reusability. Using this approach, we design, analyze, and implement a multi-dimensional PDF estimation algorithm using Gaussian kernels on FPGAs. First, the algorithm’s amenability to a hardware paradigm and expected speedups are predicted. After implementation, actual speedup and performance metrics are compared to the predictions, showing speedup on the order of 20× over a 3.2 GHz processor. Multi-core architectures are developed to further improve performance by scaling the design. Portability of the hardware design across multiple FPGA platforms is also analyzed. 
After implementing the PDF algorithm, the value of pattern-based decomposition to support reuse is demonstrated by rapid development of the K-means and correlation algorithms.
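For readers unfamiliar with the first of the three algorithms, the following is a minimal software sketch of multi-dimensional Parzen window PDF estimation with a Gaussian kernel. It is not the paper's hardware design: the function name `parzen_pdf`, the single isotropic bandwidth `h`, and the plain-list data layout are assumptions made for illustration, since the abstract does not specify them.

```python
import math

def parzen_pdf(x, samples, h):
    """Estimate the density at point x from d-dimensional samples.

    Assumes an isotropic Gaussian kernel with one bandwidth h for all
    dimensions (an illustrative choice, not taken from the paper).
    """
    d = len(x)
    # Normalisation constant of a d-dimensional Gaussian with std h.
    norm = (2.0 * math.pi) ** (d / 2.0) * h ** d
    total = 0.0
    for s in samples:
        # Squared Euclidean distance between x and one sample.
        sq_dist = sum((xi - si) ** 2 for xi, si in zip(x, s))
        total += math.exp(-sq_dist / (2.0 * h * h))
    # Average the kernel contributions over all samples.
    return total / (len(samples) * norm)

# Usage: density at the origin given three 2-D samples.
samples = [(0.0, 0.0), (1.0, 0.5), (-0.5, 1.0)]
density = parzen_pdf((0.0, 0.0), samples, h=1.0)
```

Evaluating the estimate at M points from N samples in d dimensions costs on the order of M·N·d kernel operations in this sketch, which illustrates why run time grows quickly with data size and dimension.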
Keywords
FPGA,Design patterns,Machine learning,Pattern recognition,Hardware acceleration,Performance prediction