I don't know much about Lluis's instrumentation approach; however, last year I roughly estimated the overhead of guest memory access (qemu_ld/st performance) in a legacy way.
My idea was to estimate performance indirectly by measuring the number of host instructions generated for qemu_ld/st and their execution counts.
To measure the execution counts of IRs, I added some legacy instrumentation IR when generating the TB IRs in TCG.
The results were as follows (qemu_ld/st overhead while booting an x86 Linux guest on QEMU 1.0):
- 1 qemu_ld/st IR -> 12 host instructions on the fast path (TLB hit case)
- qemu_ld execution count: 5.3%
- qemu_st execution count: 4.8%
Therefore, the performance overhead of qemu_ld/st could be up to about 50% (10% execution count * 12 instructions) under the following rough assumptions:
1. Most QEMU IRs are each translated into one host instruction.
2. All host instructions take the same number of CPU cycles.
3. TLB misses are ignored.
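
For reference, the arithmetic behind that estimate can be sketched as below. This is only one possible reading of the estimate, not the measurement code: the function and parameter names are made up for illustration, and it assumes the execution counts are fractions of all executed IRs, that every non-qemu_ld/st IR costs exactly one host instruction, and that all host instructions cost the same. Under that reading the share comes out in the same rough range as the figure above.

```python
def ld_st_host_share(ld_frac, st_frac,
                     host_insns_per_ld_st=12,
                     host_insns_per_other_ir=1):
    """Estimate the share of executed host instructions spent in qemu_ld/st.

    ld_frac, st_frac: fractions of executed IRs that are qemu_ld / qemu_st
    (assumed to be relative to all executed IRs).
    host_insns_per_ld_st: host instructions per qemu_ld/st on the fast path.
    host_insns_per_other_ir: assumed cost of every other IR (assumption 1).
    """
    mem_frac = ld_frac + st_frac
    # Host instructions attributed to qemu_ld/st, per executed IR on average.
    mem_cost = mem_frac * host_insns_per_ld_st
    # Host instructions attributed to all other IRs, per executed IR on average.
    other_cost = (1.0 - mem_frac) * host_insns_per_other_ir
    return mem_cost / (mem_cost + other_cost)


# Figures from the measurement above: 5.3% qemu_ld, 4.8% qemu_st, 12 host insns.
share = ld_st_host_share(0.053, 0.048)
print(f"estimated qemu_ld/st share of host instructions: {share:.1%}")
```

TLB misses (the slow path) are not modeled here, in line with assumption 3.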
__________________________________
Principal Engineer
VM Team
Yeongkyoon Lee
S-Core Co., Ltd.
D.L.: +82-31-696-7249
M.P.: +82-10-9965-1265
__________________________________