HiRe: using hint & release to improve synchronization of speculative threads.

ICS(2012)

Abstract
Thread-Level Speculation (TLS) is a promising technique for improving the performance of serial codes on multi-cores by automatically extracting threads and running them in parallel. However, the speculation efficiency and the performance gain of TLS systems are reduced by cross-thread data dependence violations. Reducing the cost and frequency of violations is key to improving the efficiency of TLS. One method to keep a dependence from violating is to predict it and communicate the value via synchronization. However, prior work in this field still cannot handle enough violating dependences, especially hard-to-predict ones and those in non-loop TLS tasks. Prior techniques also suffer from over-synchronization and/or introduce complicated hardware. The major reason is that these techniques are highly sensitive to the accuracy of the dependence prediction, which is hard to improve in the face of irregular dependence and task patterns. In this paper, we propose a novel synchronization technique that avoids over-synchronization and works for irregularly occurring dependences. We use a profiler to find and mark store-load pairs that generate data dependences. Then, the compiler schedules a hint instruction in advance of the store to inform successor threads of a possible pending write to a specific address; in this way, later loads only wait for a store if the loading location has been hinted. The compiler also schedules a release instruction that notifies the load when it should proceed. It places the release both after the store and on every path leading away from the hint that does not pass through the store. By placing it on all such paths, we limit the cost due to over-synchronization. Together, the hint and release form our proposal, called HiRe.
We implemented the HiRe scheme on a well-tuned TLS system and evaluated it on a set of SPEC CPU 2000 applications; we find that HiRe suffers only 22% of the violations that occur in our base TLS system, and it cuts the instruction waste rate of TLS in half. Furthermore, it outperforms prior approaches we studied by 3%.
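
The hint and release behavior described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the paper schedules hint and release as compiler-inserted instructions on a TLS system, whereas the code below imitates the bookkeeping with ordinary Python threads. The names `HintTable`, `hint`, `release`, `wait_until_released`, `run_tasks`, and the address `ADDR` are hypothetical and do not come from the paper.

```python
import threading

ADDR = 0x10  # hypothetical address of the profiled store-load pair


class HintTable:
    """Toy model: tracks which addresses have a hinted, unreleased store."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = {}  # address -> number of outstanding hints

    def hint(self, addr):
        # Announce a possible pending write to addr.
        with self._cond:
            self._pending[addr] = self._pending.get(addr, 0) + 1

    def release(self, addr):
        # Tell waiting loads that they may proceed.
        with self._cond:
            count = self._pending.get(addr, 0)
            if count <= 1:
                self._pending.pop(addr, None)
            else:
                self._pending[addr] = count - 1
            self._cond.notify_all()

    def wait_until_released(self, addr, timeout=None):
        # A load waits only if its address is currently hinted.
        with self._cond:
            return self._cond.wait_for(
                lambda: addr not in self._pending, timeout
            )


def run_tasks(take_store_path):
    """Run one predecessor/successor pair and return the loaded value."""
    memory = {ADDR: 1}
    hints = HintTable()
    loaded = []

    def successor():
        hints.wait_until_released(ADDR)  # synchronize only on hinted data
        loaded.append(memory[ADDR])      # the dependent load

    hints.hint(ADDR)                     # hint placed ahead of the store
    worker = threading.Thread(target=successor)
    worker.start()
    if take_store_path:
        memory[ADDR] = 42                # the store
    hints.release(ADDR)                  # release after the store, and
                                         # also on the path that skips it
    worker.join()
    return loaded[0]
```

In this model, `run_tasks(True)` makes the successor observe the stored value, while `run_tasks(False)` follows the path that bypasses the store: the release still fires, so the successor reads the old value instead of waiting indefinitely. Loads from addresses that were never hinted do not wait at all, which corresponds to the abstract's point about limiting over-synchronization.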