That was my guess as well at first, but my QEMU is built with CONFIG_IOTHREAD set to 0.
I am not 100% sure how interrupts are delivered in QEMU. My guess is that some timer device has to fire, QEMU has installed a signal handler for it, and that handler catches the signal and invokes unlink_tb. I hope you can enlighten me on that.

Thanks,

Xin

On Wed, Dec 28, 2011 at 2:10 PM, Peter Maydell <peter.mayd...@linaro.org> wrote:
> On 28 December 2011 00:43, Xin Tong <xerox.time.t...@gmail.com> wrote:
>> I modified QEMU to check for interrupt status at the end of every TB
>> and ran it on SPECINT2000 benchmarks with QEMU 0.15.0. The performance
>> is 70% of the unmodified one for some benchmarks on an x86_64 host. I
>> agree that the extra load-test-branch-not-taken per TB is minimal, but
>> what I found is that the average number of TBs executed per TB enter is
>> low (~3.5 TBs), while the unmodified approach has ~10 TBs per TB
>> enter. This makes me wonder why. Maybe the mechanism I used to gather
>> these statistics is flawed, but the performance is indeed hindered.
>
> Since you said you're using system mode, here's my guess. The
> unlink-tbs method of interrupting the guest CPU thread runs
> in a second thread (the io thread), and doesn't stop the guest
> CPU thread. So while the io thread is trying to unlink TBs,
> the CPU thread is still running on, and might well execute
> a few more TBs before the io thread's traversal of the TB
> graph catches up with it and manages to unlink the TB link
> the CPU thread is about to traverse.
>
> More generally: are we really taking an interrupt every 3 to
> 5 TBs? This seems very high -- surely we will be spending more
> time in the OS servicing interrupts than running useful guest
> userspace code...
>
> -- PMM
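For concreteness, here is a minimal, self-contained sketch of what I mean by the per-TB check in a single-threaded (CONFIG_IOTHREAD=0) setup: a periodic host timer signal sets a flag, and the dispatch loop tests that flag after every block. This is not QEMU's actual code; the names tb_body, run_tbs and exit_request are made up for illustration, and it only shows where the extra load-test-branch sits relative to TB execution.

#include <signal.h>
#include <stdio.h>
#include <sys/time.h>

static volatile sig_atomic_t exit_request;   /* set asynchronously by the timer signal */

static void alarm_handler(int sig)
{
    (void)sig;
    exit_request = 1;                        /* ask the dispatch loop to break out */
}

typedef void (*tb_func)(void);               /* one "translated block", modelled as a C function */

static void tb_body(void)
{
    /* stand-in for executing one block of translated guest code */
    for (volatile int i = 0; i < 1000; i++) {
    }
}

/* Dispatch loop with the extra per-TB load-test-branch. */
static void run_tbs(tb_func tb)
{
    unsigned long since_last_irq = 0;
    int interrupts = 0;

    while (interrupts < 5) {
        tb();                                /* execute one TB */
        since_last_irq++;
        if (exit_request) {                  /* the check added at the end of every TB */
            exit_request = 0;
            printf("interrupt serviced after %lu TBs\n", since_last_irq);
            since_last_irq = 0;
            interrupts++;
        }
    }
}

int main(void)
{
    struct sigaction sa = { .sa_handler = alarm_handler };
    sigaction(SIGALRM, &sa, NULL);

    /* periodic host timer standing in for QEMU's alarm timer */
    struct itimerval it = {
        .it_interval = { .tv_usec = 10000 },
        .it_value    = { .tv_usec = 10000 },
    };
    setitimer(ITIMER_REAL, &it, NULL);

    run_tbs(tb_body);
    return 0;
}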