>> Hi, all
>> 
>> A segmentation fault happens when rebooting a VM after hot-unplugging a
>> virtio NIC; it can be reproduced 100% of the time.
>> See a similar bug report at
>> https://bugzilla.redhat.com/show_bug.cgi?id=988256
>> 
>> test environment:
>> host: SLES11SP2 (kernel version: 3.0.58)
>> qemu: 1.5.1, upstream-qemu (commit 
>> 545825d4cda03ea292b7788b3401b99860efe8bc)
>> libvirt: 1.1.0
>> guest os: win2k8 R2 x64bit or sles11sp2 x64 or win2k3 32bit
>> 
>> You can reproduce this problem with the following steps:
>> 1. start a VM with virtio NIC(s)
>> 2. hot-unplug a virtio NIC from the VM
>> 3. reboot the VM; a segmentation fault then happens during startup
>> 
>> The qemu backtrace is shown below:
>> #0  0x00007ff4be3288d0 in __memcmp_sse4_1 () from /lib64/libc.so.6
>> #1  0x00007ff4c07f82c0 in patch_hypercalls (s=0x7ff4c15dd610) at 
>> /mnt/zhanghaoyu/qemu/qemu-1.5.1/hw/i386/kvmvapic.c:549
>> #2  0x00007ff4c07f84f0 in vapic_prepare (s=0x7ff4c15dd610) at 
>> /mnt/zhanghaoyu/qemu/qemu-1.5.1/hw/i386/kvmvapic.c:614
>> #3  0x00007ff4c07f85e7 in vapic_write (opaque=0x7ff4c15dd610, addr=0, 
>> data=32, size=2)
>>     at /mnt/zhanghaoyu/qemu/qemu-1.5.1/hw/i386/kvmvapic.c:651
>> #4  0x00007ff4c082a917 in memory_region_write_accessor 
>> (opaque=0x7ff4c15df938, addr=0, value=0x7ff4bbfe3d00, size=2, 
>>     shift=0, mask=65535) at 
>> /mnt/zhanghaoyu/qemu/qemu-1.5.1/memory.c:334
>> #5  0x00007ff4c082a9ee in access_with_adjusted_size (addr=0, 
>> value=0x7ff4bbfe3d00, size=2, access_size_min=1, 
>>     access_size_max=4, access=0x7ff4c082a89a <memory_region_write_accessor>, 
>> opaque=0x7ff4c15df938)
>>     at /mnt/zhanghaoyu/qemu/qemu-1.5.1/memory.c:364
>> #6  0x00007ff4c082ae49 in memory_region_iorange_write 
>> (iorange=0x7ff4c15dfca0, offset=0, width=2, data=32)
>>     at /mnt/zhanghaoyu/qemu/qemu-1.5.1/memory.c:439
>> #7  0x00007ff4c08236f7 in ioport_writew_thunk (opaque=0x7ff4c15dfca0, 
>> addr=126, data=32)
>>     at /mnt/zhanghaoyu/qemu/qemu-1.5.1/ioport.c:219
>> #8  0x00007ff4c0823078 in ioport_write (index=1, address=126, data=32) 
>> at /mnt/zhanghaoyu/qemu/qemu-1.5.1/ioport.c:83
>> #9  0x00007ff4c0823ca9 in cpu_outw (addr=126, val=32) at 
>> /mnt/zhanghaoyu/qemu/qemu-1.5.1/ioport.c:296
>> #10 0x00007ff4c0827485 in kvm_handle_io (port=126, data=0x7ff4c0510000, 
>> direction=1, size=2, count=1)
>>     at /mnt/zhanghaoyu/qemu/qemu-1.5.1/kvm-all.c:1485
>> #11 0x00007ff4c0827e14 in kvm_cpu_exec (env=0x7ff4c15bf270) at 
>> /mnt/zhanghaoyu/qemu/qemu-1.5.1/kvm-all.c:1634
>> #12 0x00007ff4c07b6f27 in qemu_kvm_cpu_thread_fn (arg=0x7ff4c15bf270) 
>> at /mnt/zhanghaoyu/qemu/qemu-1.5.1/cpus.c:759
>> #13 0x00007ff4be58af05 in start_thread () from /lib64/libpthread.so.0
>> #14 0x00007ff4be2cd53d in clone () from /lib64/libc.so.6
>> 
>> If I apply the patch below to upstream qemu, the problem disappears:
>> ---
>>  hw/i386/kvmvapic.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>> 
>> diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
>> index 15beb80..6fff299 100644
>> --- a/hw/i386/kvmvapic.c
>> +++ b/hw/i386/kvmvapic.c
>> @@ -652,11 +652,11 @@ static void vapic_write(void *opaque, hwaddr addr, uint64_t data,
>>      switch (size) {
>>      case 2:
>>          if (s->state == VAPIC_INACTIVE) {
>> -            rom_paddr = (env->segs[R_CS].base + env->eip) & ROM_BLOCK_MASK;
>> -            s->rom_state_paddr = rom_paddr + data;
>> -
>>              s->state = VAPIC_STANDBY;
>>          }
>> +        rom_paddr = (env->segs[R_CS].base + env->eip) & ROM_BLOCK_MASK;
>> +        s->rom_state_paddr = rom_paddr + data;
>> +
>>          if (vapic_prepare(s) < 0) {
>>              s->state = VAPIC_INACTIVE;
>>              break;
>
>Yes, we need to update the ROM's physical address after the BIOS reshuffled 
>the layout.
>
>But I'm not happy with simply updating the address unconditionally. We need to 
>understand the crash first, then make QEMU robust against the guest not 
>issuing this initial write after a ROM region layout change.
>And finally make it work properly in the normal case.
>
The direct cause of the crash is an access to an invalid address, which happens
because the ROM's physical address is not updated.
In my opinion, since hot-plug/unplug is involved, we need to recalculate the
ROM's physical address for all devices that have a ROM during startup when the
VM is rebooted or reset.
Is it reasonable to set the vapic's state to VAPIC_INACTIVE during the vapic's
reset?

>Jan
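
To make the reset idea above concrete, here is a minimal standalone sketch, not
QEMU code. ToyVAPIC, toy_vapic_write16() and toy_vapic_reset() are hypothetical
names, and the 2 KiB ROM_BLOCK_SIZE is an assumption made for this toy model.
It only shows that once the state has left VAPIC_INACTIVE a second port write
keeps the stale address, while going back to VAPIC_INACTIVE on reset lets the
next write recompute it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for this sketch: ROM blocks aligned to 2 KiB. */
#define ROM_BLOCK_SIZE 2048u
#define ROM_BLOCK_MASK (~(uint64_t)(ROM_BLOCK_SIZE - 1))

typedef enum { VAPIC_INACTIVE, VAPIC_STANDBY, VAPIC_ACTIVE } VAPICState;

/* Hypothetical toy model; it keeps only the two fields discussed here. */
typedef struct {
    VAPICState state;
    uint64_t rom_state_paddr;
} ToyVAPIC;

/*
 * Hypothetical stand-in for the 16-bit port write path: the ROM address
 * is derived from the guest's code address only while the state is
 * VAPIC_INACTIVE, mirroring the unpatched logic in the diff.
 */
static void toy_vapic_write16(ToyVAPIC *s, uint64_t cs_base, uint64_t eip,
                              uint64_t data)
{
    if (s->state == VAPIC_INACTIVE) {
        uint64_t rom_paddr = (cs_base + eip) & ROM_BLOCK_MASK;

        s->rom_state_paddr = rom_paddr + data;
        s->state = VAPIC_STANDBY;
    }
}

/*
 * Hypothetical reset hook: returning to VAPIC_INACTIVE means the next
 * port write recomputes rom_state_paddr for the new ROM layout.
 */
static void toy_vapic_reset(ToyVAPIC *s)
{
    s->state = VAPIC_INACTIVE;
}
```

In this model, a write from a ROM mapped at 0xC0000 followed by a write from
0xC8000 leaves rom_state_paddr pointing into the old block unless
toy_vapic_reset() runs in between.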
