On 31/12/2014 04:45, kevinnma(马文霜) wrote:
> Last month, we experienced several guests crash (6cores-8cores), qemu logs
> display the following messages:
>
> qemu-system-x86_64: /build/qemu-2.1.2/kvm-all.c:976:
> kvm_irqchip_commit_routes: Assertion `ret == 0' failed.
>
> After analysis and verification, we can confirm it's irq-balance
> daemon (in guest) leads to the assertion failure. So start a 8 core guest
> with two disks, execute the following scripts will reproduce the BUG quickly:
>
> vda_irq_num=25
> vdb_irq_num=27
> while [ 1 ]
> do
>     for irq in {1,2,4,8,10,20,40,80}
>     do
>         echo $irq > /proc/irq/$vda_irq_num/smp_affinity
>         echo $irq > /proc/irq/$vdb_irq_num/smp_affinity
>         dd if=/dev/vda of=/dev/zero bs=4K count=100 iflag=direct
>         dd if=/dev/vdb of=/dev/zero bs=4K count=100 iflag=direct
>     done
> done
>
> QEMU setup static irq route entries in kvm_pc_setup_irq_routing(), PIC and
> IOAPIC share the first 15 GSI numbers, take up 23 GSI numbers, but take up 38
> irq route entries. When change irq smp_affinity in guest, a dynamic route
> entry may be setup, the current logic is: if allocate GSI number succeeds,
> a new route entry can be added. The available dynamic GSI numbers is
> 1001 (KVM_MAX_IRQ_ROUTES-23), but available irq route entries is only
> 986 (KVM_MAX_IRQ_ROUTES-38), GSI numbers greater than route entries.
> irq-balance's behavior will eventually leads to total irq route entries
> exceed KVM_MAX_IRQ_ROUTES, ioctl(KVM_SET_GSI_ROUTING) fail and
> kvm_irqchip_commit_routes() trigger assertion failure.

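
To make the arithmetic in the report concrete, here is a rough toy model
of the two counters, not QEMU code. It assumes KVM_MAX_IRQ_ROUTES is 1024
(implied by 986 = KVM_MAX_IRQ_ROUTES-38) and that each dynamic route
consumes one GSI number and one route entry. The struct and function names
are made up for illustration.

```c
#include <stdbool.h>

/* Assumed values, taken from the figures quoted in the report. */
#define MAX_IRQ_ROUTES 1024 /* assumed value of KVM_MAX_IRQ_ROUTES */
#define STATIC_GSIS      23 /* GSI numbers used by PIC + IOAPIC */
#define STATIC_ROUTES    38 /* route entries used by PIC + IOAPIC */

/* Hypothetical toy model; this is not QEMU's KVMState. */
struct route_model {
    int gsis_used;
    int routes_used;
};

/* The check described in the report: only GSI exhaustion is tested. */
static bool can_add_by_gsi(const struct route_model *m)
{
    return m->gsis_used < MAX_IRQ_ROUTES;
}

/* The limit that actually matters: room left in the route table. */
static bool fits_route_table(const struct route_model *m)
{
    return m->routes_used < MAX_IRQ_ROUTES;
}

/* Count how many dynamic routes fit before the route table is full. */
int dynamic_routes_until_overflow(void)
{
    struct route_model m = { STATIC_GSIS, STATIC_ROUTES };
    int added = 0;

    while (can_add_by_gsi(&m) && fits_route_table(&m)) {
        m.gsis_used++;   /* one GSI number per dynamic route */
        m.routes_used++; /* one route entry per dynamic route */
        added++;
    }
    return added;
}

/* True if the GSI-only check still passes once the table is full. */
bool gsi_check_passes_at_overflow(void)
{
    int n = dynamic_routes_until_overflow();
    struct route_model m = { STATIC_GSIS + n, STATIC_ROUTES + n };

    return can_add_by_gsi(&m) && !fits_route_table(&m);
}
```

Under these assumptions the table fills after 986 dynamic routes while the
GSI-only check still reports free space, which matches the failure
described above.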
I have two questions:

1) why isn't the existing check in kvm_irqchip_get_virq enough to fix the bug?

    if (!s->direct_msi && retry) {
        retry = false;
        kvm_flush_dynamic_msi_routes(s);
        goto again;
    }

2) If you introduce this extra call to kvm_flush_dynamic_msi_routes, does
the existing check become obsolete?

Thanks,

Paolo