Optimizing Memory Translation Emulation in Full System Emulators

TACO (2014)

Abstract
The emulation speed of a full system emulator (FSE) determines its usefulness. This work quantitatively measures where time is spent in QEMU [Bellard 2005], an industrial-strength FSE. The analysis finds that memory emulation is one of the most heavily exercised emulator components. For the workloads studied, 38.1% of the emulation time is spent in memory emulation on average, even though QEMU implements a software translation lookaside buffer (STLB) to accelerate dynamic address translation. Despite the amount of time spent in memory emulation, there has been no study on how to further improve its speed. This work analyzes where time is spent in memory emulation and studies the performance impact of a number of STLB optimizations. Although there are several performance optimization techniques for hardware TLBs, this work finds that the trade-offs with an STLB differ substantially from those with hardware TLBs. As a result, not all hardware TLB performance optimization techniques are applicable to STLBs, and vice versa. The evaluated STLB optimizations target both STLB lookups and refills, and result in an average emulator performance improvement of 24.4% over the baseline.
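To make the lookup and refill paths mentioned in the abstract concrete, the following is a minimal sketch of a direct-mapped software TLB. It is not QEMU's implementation and is not taken from the paper: the class `SoftTLB`, the constants `PAGE_BITS` and `STLB_ENTRIES`, and the dictionary used as a guest page table are all hypothetical names and sizes chosen for illustration.

```python
PAGE_BITS = 12                      # assumption: 4 KiB guest pages
PAGE_SIZE = 1 << PAGE_BITS
PAGE_MASK = ~(PAGE_SIZE - 1)
STLB_ENTRIES = 256                  # assumption: power-of-two table size


class SoftTLB:
    """Direct-mapped software TLB: one (tag, host base) slot per index."""

    def __init__(self, page_table):
        # Hypothetical guest page table: guest page number -> host page base.
        self.page_table = page_table
        self.tags = [None] * STLB_ENTRIES
        self.host_base = [0] * STLB_ENTRIES
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        """Return the host address for a guest virtual address."""
        vpage = vaddr & PAGE_MASK
        index = (vaddr >> PAGE_BITS) & (STLB_ENTRIES - 1)
        if self.tags[index] == vpage:
            # Fast path (lookup): the tag matches, no page-table walk needed.
            self.hits += 1
        else:
            # Slow path (refill): walk the page table and overwrite the slot.
            self.misses += 1
            self._refill(index, vpage)
        return self.host_base[index] + (vaddr & (PAGE_SIZE - 1))

    def _refill(self, index, vpage):
        # A KeyError here stands in for a guest page fault.
        host = self.page_table[vpage >> PAGE_BITS]
        self.tags[index] = vpage
        self.host_base[index] = host

    def flush(self):
        """Invalidate every entry, e.g. after a guest address-space switch."""
        self.tags = [None] * STLB_ENTRIES
```

In this sketch every guest memory access pays for the index computation and tag comparison, while only misses pay for the refill. Two guest pages whose page numbers differ by a multiple of `STLB_ENTRIES` share a slot and evict each other, which is the kind of lookup-versus-refill trade-off the abstract refers to.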
Keywords
design,experimentation,run-time environments,dynamic address translation,tlb,full system emulator,performance measures,performance