
A tasks reordering model to reduce transfers overhead on GPUs.

Journal of Parallel and Distributed Computing (2017)

Abstract
The compute capabilities of current GPUs allow exploiting concurrency when several independent tasks are launched simultaneously. These tasks are typically composed of data transfer commands and kernel computation commands. In this paper we develop a run-time approach to optimize the concurrency between data transfers and kernel computations in a multithreaded scenario where each CPU thread sends tasks to the GPU. Our solution is based on a temporal execution model for concurrent tasks that establishes the task execution order minimizing the total execution time, including data transfers. Moreover, a heuristic to select the best order has been developed, which improves the execution time achieved by the hardware scheduler of current NVIDIA cards. Our approach obtains performance improvements, under real workloads, of up to 19% with respect to execution using multiple hardware queues managed by Hyper-Q.
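To make the kind of concurrency discussed above concrete, the following minimal CUDA sketch issues the same transfer-kernel-transfer task on several streams so that copies and kernels from different tasks can overlap. It is not the authors' scheduler or model; the task count, kernel body, and the breadth-first issue order are illustrative assumptions, standing in for the orderings that a temporal execution model would compare.

```cuda
// Minimal sketch (assumed example, not the paper's code): independent
// copy-kernel-copy tasks issued on separate CUDA streams so that
// host-to-device transfers, kernel execution, and device-to-host
// transfers from different tasks can overlap.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void scale(float *d, int n, float f) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= f;
}

int main() {
    const int kTasks = 4;             // number of independent tasks (assumed)
    const int n = 1 << 20;            // elements per task (assumed)
    const size_t bytes = n * sizeof(float);

    float *h[kTasks], *d[kTasks];
    cudaStream_t s[kTasks];
    for (int t = 0; t < kTasks; ++t) {
        cudaMallocHost(&h[t], bytes); // pinned memory so copies can be asynchronous
        cudaMalloc(&d[t], bytes);
        cudaStreamCreate(&s[t]);
        for (int i = 0; i < n; ++i) h[t][i] = 1.0f;
    }

    // Issue order matters for overlap: here commands are issued breadth-first
    // (all H2D copies, then all kernels, then all D2H copies), one possible
    // ordering among those a reordering heuristic could evaluate.
    for (int t = 0; t < kTasks; ++t)
        cudaMemcpyAsync(d[t], h[t], bytes, cudaMemcpyHostToDevice, s[t]);
    for (int t = 0; t < kTasks; ++t)
        scale<<<(n + 255) / 256, 256, 0, s[t]>>>(d[t], n, 2.0f);
    for (int t = 0; t < kTasks; ++t)
        cudaMemcpyAsync(h[t], d[t], bytes, cudaMemcpyDeviceToHost, s[t]);

    cudaDeviceSynchronize();
    printf("h[0][0] = %f\n", h[0][0]);

    for (int t = 0; t < kTasks; ++t) {
        cudaStreamDestroy(s[t]);
        cudaFree(d[t]);
        cudaFreeHost(h[t]);
    }
    return 0;
}
```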
Key words
GPU, Streams, Concurrency, Tasks scheduling, Hyper-Q