On Wed, Jul 25, 2012 at 12:12:55PM +0200, Paolo Bonzini wrote:
> On 25/07/2012 11:56, Luigi Rizzo wrote:
> > On Wed, Jul 25, 2012 at 11:53:29AM +0300, Avi Kivity wrote:
> >> On 07/24/2012 07:58 PM, Luigi Rizzo wrote:
> >>> I noticed that the various NIC modules in qemu/kvm do not implement
> >>> interrupt mitigation, which is very beneficial as it dramatically
> >>> reduces exits from the hypervisor.
> >>>
> >>> As a proof of concept i tried to implement it for the e1000 driver
> >>> (patch below), and it brings tx performance from 9 to 56Kpps on
> >>> qemu-softmmu, and from ~20 to 140Kpps on qemu-kvm.
> >>>
> >>> I am going to measure the rx interrupt mitigation in the next couple
> >>> of days.
> >>>
> >>> Is there any interest in having this code in ?
> >>
> >> Indeed. But please drop the #ifdef MITIGATIONs.
> >
> > Thanks for the comments. The #ifdef block MITIGATION was only temporary to
> > point out the differences and run the performance comparisons.
> > Similarly, the magic thresholds below will be replaced with
> > appropriately commented #defines.
> >
> > Note:
> > On the real hardware interrupt mitigation is controlled by a total of four
> > registers (TIDV, TADV, RIDV, RADV) which control it with a granularity
> > of 1024ns, see
> >
> > http://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf
> >
> > An exact emulation of the feature is hard, because the timer resolution we
> > have is much coarser (in the ms range). So i am inclined to use a different
> > approach, similar to the one i have implemented, namely:
> > - the first few packets (whether 1 or 4 or 5 will be decided on the host)
> >   report an interrupt immediately;
> > - subsequent interrupts are delayed through qemu_bh_schedule_idle()
>
> qemu_bh_schedule_idle() is really a 10ms timer.

Yes, I figured that out, which is why I said that my code was more a
"proof of concept" than an actual patch. If you have a suggestion on how
to schedule a shorter (say 1ms) timer, I am all ears. Perhaps
qemu_new_timer_ns() and friends?

That said, I do not plan to implement the full set of mitigation registers
controlled by the guest, but possibly just use a parameter, as in
virtio-net, where you can have 'tx=bh' or 'tx=timer' and 'x-txtimer=N',
where N is the mitigation delay in nanoseconds (nominally; in practice it
is rounded to whatever the host granularity is).

cheers
luigi
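
For illustration only, the policy discussed in this thread (a few immediate
interrupts, then timer-delayed ones) can be modelled outside QEMU. The sketch
below is a standalone Python model, not QEMU code and not the patch itself.
The function name mitigated_interrupts and the parameters burst and delay_ns
are hypothetical; delay_ns plays the role of the 'x-txtimer=N' style knob.
The thread does not say when the count of immediate interrupts resets, so
the model assumes it resets each time the delayed interrupt fires.

```python
def mitigated_interrupts(arrivals_ns, burst=4, delay_ns=1_000_000):
    """Return the times (ns) at which the modelled NIC raises an interrupt.

    arrivals_ns: packet completion times in nanoseconds.
    burst:       packets that interrupt immediately (hypothetical knob).
    delay_ns:    mitigation delay for later packets (hypothetical knob).
    """
    fired = []
    immediate_left = burst  # immediate interrupts still allowed
    deadline = None         # expiry time of the pending timer, if any

    for t in sorted(arrivals_ns):
        # A pending timer that expired before this packet fires first.
        if deadline is not None and deadline < t:
            fired.append(deadline)
            deadline = None
            immediate_left = burst  # assumption: budget resets here

        if deadline is not None:
            continue  # coalesced into the already pending interrupt

        if immediate_left > 0:
            fired.append(t)  # one of the first few packets: no delay
            immediate_left -= 1
        else:
            deadline = t + delay_ns  # arm the mitigation timer

    if deadline is not None:
        fired.append(deadline)  # the last armed timer eventually fires
    return fired


# 100 packets, one every 100ns, with a 1000ns mitigation delay.
arrivals = list(range(0, 10_000, 100))
irqs = mitigated_interrupts(arrivals, burst=4, delay_ns=1_000)
print(len(arrivals), "packets ->", len(irqs), "interrupts")
```

A real device model would arm a host timer instead of replaying a list of
timestamps, and the effective delay would be rounded to the host timer
granularity, so the numbers from this model are only indicative.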