On Wed, Feb 23, 2011 at 12:39:52PM +0100, Jan Kiszka wrote:
> On 2011-02-23 12:08, Edgar E. Iglesias wrote:
> > On Wed, Feb 23, 2011 at 11:25:54AM +0100, Paolo Bonzini wrote:
> >> On 02/23/2011 11:18 AM, Edgar E. Iglesias wrote:
> >>> Sorry, I don't know the code well enough to give any sensible feedback
> >>> on patch 2 - 4. I did test them with some of my guests and things seem
> >>> to be OK with them but quite a bit slower.
> >>> I saw around 10 - 20% slowdown with a cris guest and -icount 10.
> >>>
> >>> The slowdown might be related to the issue with super slow icount
> >>> together with iothread (addressed by Marcelo's iothread timeout patch).
> >>
> >> No, this supersedes Marcelo's patch. 10-20% doesn't seem comparable to
> >> "looks like it deadlocked" anyway. Also, Jan has ideas on how to remove
> >> the synchronization overhead in the main loop for TCG+iothread.
> >
> > I see. I tried booting two of my MIPS and CRIS linux guests with iothread
> > and -icount 4. Without your patch, the boot crawls super slow. Your patch
> > gives a huge improvement. This was the "deadlock" scenario which I
> > mentioned in previous emails.
> >
> > Just to clarify the previous test where I saw slowdown with your patch:
> > a CRIS setup that has a CRIS CPU and basically only two peripherals,
> > a timer block and a device (X) that computes stuff but delays the results
> > with a virtual timer. The guest CPU is 99% of the time just
> > busy-waiting for device X to get ready.
> >
> > This latter test runs in 3.7s with icount 4 and without iothread,
> > with or without your patch.
> >
> > With icount 4 and iothread it runs in ~1m5s without your patch and
> > ~1m20s with your patch. That was the 20% slowdown I mentioned earlier.
> >
> > Don't know if that info helps...
>
> You should try to trace the event flow in qemu, either via strace, via
> the built-in tracer (which likely requires a bit more tracepoints), or
> via a system-level tracer (ftrace / kernelshark).
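
A minimal sketch of what that tracing could look like (the qemu binary name,
guest command line and trace events below are placeholders, not taken from
this thread):

    # Syscall-level timing of the vcpu/iothread hand-offs, with timestamps:
    strace -f -tt -o qemu.strace \
        qemu-system-cris -icount 4 <guest options...>

    # System-level view via ftrace (scheduler switches/wakeups); trace-cmd
    # runs the command, records to trace.dat, which kernelshark can open:
    trace-cmd record -e sched:sched_switch -e sched:sched_wakeup \
        qemu-system-cris -icount 4 <guest options...>
    kernelshark trace.dat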
Thanks, I'll see if I can get some time to run this more carefully during
some weekend.

> Did my patches contribute a bit to overhead reduction? They specifically
> target the costly vcpu/iothread switches in TCG mode (caused by TCG's
> excessive lock-holding times).

Do you have a tree for quick access to your patches? (I couldn't find them
in my inbox.) I could give them a quick go and post the results.

Cheers