Hi Marek,
On 26/08/2021 23:51, Marek Marczykowski-Górecki wrote:
On Thu, Aug 26, 2021 at 10:00:58PM +0100, Julien Grall wrote:
While doing more testing today, I noticed that only one vCPU would be
brought up with an HVM guest with Xen 4.16 on my setup (QEMU):
[ 1.122180] ================================================================================
[ 1.122180] UBSAN: shift-out-of-bounds in oss/linux/arch/x86/kernel/apic/apic.c:2362:13
[ 1.122180] shift exponent -1 is negative
[ 1.122180] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc7+ #304
[ 1.122180] Hardware name: Xen HVM domU, BIOS 4.16-unstable 06/07/2021
[ 1.122180] Call Trace:
[ 1.122180] dump_stack_lvl+0x56/0x6c
[ 1.122180] ubsan_epilogue+0x5/0x50
[ 1.122180] __ubsan_handle_shift_out_of_bounds+0xfa/0x140
[ 1.122180] ? cgroup_kill_write+0x4d/0x150
[ 1.122180] ? cpu_up+0x6e/0x100
[ 1.122180] ? _raw_spin_unlock_irqrestore+0x30/0x50
[ 1.122180] ? rcu_read_lock_held_common+0xe/0x40
[ 1.122180] ? irq_shutdown_and_deactivate+0x11/0x30
[ 1.122180] ? lock_release+0xc7/0x2a0
[ 1.122180] ? apic_id_is_primary_thread+0x56/0x60
[ 1.122180] apic_id_is_primary_thread+0x56/0x60
[ 1.122180] cpu_up+0xbd/0x100
[ 1.122180] bringup_nonboot_cpus+0x4f/0x60
[ 1.122180] smp_init+0x26/0x74
[ 1.122180] kernel_init_freeable+0x183/0x32d
[ 1.122180] ? _raw_spin_unlock_irq+0x24/0x40
[ 1.122180] ? rest_init+0x330/0x330
[ 1.122180] kernel_init+0x17/0x140
[ 1.122180] ? rest_init+0x330/0x330
[ 1.122180] ret_from_fork+0x22/0x30
[ 1.122244] ================================================================================
[ 1.123176] installing Xen timer for CPU 1
[ 1.123369] x86: Booting SMP configuration:
[ 1.123409] .... node #0, CPUs: #1
[ 1.154400] Callback from call_rcu_tasks_trace() invoked.
[ 1.154491] smp: Brought up 1 node, 1 CPU
[ 1.154526] smpboot: Max logical packages: 2
[ 1.154570] smpboot: Total of 1 processors activated (5999.99 BogoMIPS)
I have tried a PV guest (same setup) and the kernel could bring up all the
vCPUs.
Digging down, Linux will set smp_num_siblings to 0 (via detect_ht_early())
and as a result will skip all the CPUs. The value is retrieved from a CPUID
leaf. So it sounds like we don't set the leaf correctly.
FWIW, I have also tried on Xen 4.11 and could spot the same issue. Does this
ring any bells for you?
Is it maybe this:
https://lore.kernel.org/xen-devel/20201106003529.391649-1-bmas...@redhat.com/T/#u
?
It looks to be different as I don't see the splat.
Anyway, Jan just posted a patch that allows a Linux HVM domain to bring
up all the vCPUs.
Cheers,
--
Julien Grall