Hi,

We recently hit a problem in our project: two vCPUs in a VM were not brought
up after a reboot.
Our host runs KVM kmod 3.6 and QEMU 2.1.
The guest is a SLES 11 SP3 VM configured with 8 vCPUs,
and the CPU model is set to 'host-passthrough'.

When the VM was started for the first time, everything seemed to be OK.
The VM then panicked and rebooted.
After the reboot, only 6 CPUs were brought up in the VM; cpu1 and cpu7 were not online.

This is the only message we could get from the VM.
The VM's dmesg shows:
[    0.069867] Booting Node   0, Processors  #1
[    5.060042] CPU1: Stuck ??
[    5.060499]  #2
[    5.088322] kvm-clock: cpu 2, msr 6:3fc90901, secondary cpu clock
[    5.088335] KVM setup async PF for cpu 2
[    5.092967] NMI watchdog enabled, takes one hw-pmu counter.
[    5.094405]  #3
[    5.108324] kvm-clock: cpu 3, msr 6:3fcd0901, secondary cpu clock
[    5.108333] KVM setup async PF for cpu 3
[    5.113553] NMI watchdog enabled, takes one hw-pmu counter.
[    5.114970]  #4
[    5.128325] kvm-clock: cpu 4, msr 6:3fd10901, secondary cpu clock
[    5.128336] KVM setup async PF for cpu 4
[    5.134576] NMI watchdog enabled, takes one hw-pmu counter.
[    5.135998]  #5
[    5.152324] kvm-clock: cpu 5, msr 6:3fd50901, secondary cpu clock
[    5.152334] KVM setup async PF for cpu 5
[    5.154764] NMI watchdog enabled, takes one hw-pmu counter.
[    5.156467]  #6
[    5.172327] kvm-clock: cpu 6, msr 6:3fd90901, secondary cpu clock
[    5.172341] KVM setup async PF for cpu 6
[    5.180738] NMI watchdog enabled, takes one hw-pmu counter.
[    5.182173]  #7 Ok.
[   10.170815] CPU7: Stuck ??
[   10.171648] Brought up 6 CPUs
[   10.172394] Total of 6 processors activated (28799.97 BogoMIPS).
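
To check other guests for the same symptom, a log like the one above can be
scanned mechanically. This is only a minimal sketch that assumes the message
format shown here; summarize_boot and the two regular expressions are
illustrative names, not part of any existing tool:

```python
import re

# Hypothetical helper; the patterns assume the dmesg format quoted above.
STUCK_RE = re.compile(r"CPU(\d+): Stuck")
BROUGHT_UP_RE = re.compile(r"Brought up (\d+) CPUs")


def summarize_boot(dmesg_text):
    """Return (sorted stuck CPU ids, number of CPUs brought up or None)."""
    stuck = sorted({int(n) for n in STUCK_RE.findall(dmesg_text)})
    match = BROUGHT_UP_RE.search(dmesg_text)
    brought_up = int(match.group(1)) if match else None
    return stuck, brought_up


# A shortened copy of the log above, used as sample input.
sample = """\
[    0.069867] Booting Node   0, Processors  #1
[    5.060042] CPU1: Stuck ??
[    5.060499]  #2
[    5.182173]  #7 Ok.
[   10.170815] CPU7: Stuck ??
[   10.171648] Brought up 6 CPUs
"""

print(summarize_boot(sample))
```

On the excerpt above it reports cpu1 and cpu7 as stuck and 6 CPUs brought up.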

On the host, we found that the QEMU vcpu1 and vcpu7 threads were not consuming
any CPU time (presumably they were idle).
The host kernel stacks of all the vCPU threads look like this:

[<ffffffffa07089b5>] kvm_vcpu_block+0x65/0xa0 [kvm]
[<ffffffffa071c7c1>] __vcpu_run+0xd1/0x260 [kvm]
[<ffffffffa071d508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm]
[<ffffffffa0709cee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm]
[<ffffffff8116be8b>] do_vfs_ioctl+0x8b/0x3b0
[<ffffffff8116c251>] sys_ioctl+0xa1/0xb0
[<ffffffff81468092>] system_call_fastpath+0x16/0x1b
[<00002ab9fe1f99a7>] 0x2ab9fe1f99a7
[<ffffffffffffffff>] 0xffffffffffffffff

We looked into the kernel code that can lead to the 'Stuck' warning above,
and the only possible cause we found is that something is wrong with the
emulation of the 'cpuid' instruction in KVM/QEMU.
But since we can't reproduce the problem, we are not sure.
Is it possible that the cpuid emulation in KVM/QEMU has a bug?

Has anyone come across this problem before? Any ideas?

Thanks,
zhanghailiang



