On 27.08.2021 12:35, Julien Grall wrote:
> Hi Jan,
> 
> On 27/08/2021 07:28, Jan Beulich wrote:
>> On 27.08.2021 01:42, Andrew Cooper wrote:
>>> On 26/08/2021 22:00, Julien Grall wrote:
>>>> Hi Andrew,
>>>>
>>>> While doing more testing today, I noticed that only one vCPU would be
>>>> brought up with an HVM guest on Xen 4.16 on my setup (QEMU):
>>>>
>>>> [    1.122180]
>>>> ================================================================================
>>>> [    1.122180] UBSAN: shift-out-of-bounds in
>>>> oss/linux/arch/x86/kernel/apic/apic.c:2362:13
>>>> [    1.122180] shift exponent -1 is negative
>>>> [    1.122180] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc7+ #304
>>>> [    1.122180] Hardware name: Xen HVM domU, BIOS 4.16-unstable 06/07/2021
>>>> [    1.122180] Call Trace:
>>>> [    1.122180]  dump_stack_lvl+0x56/0x6c
>>>> [    1.122180]  ubsan_epilogue+0x5/0x50
>>>> [    1.122180]  __ubsan_handle_shift_out_of_bounds+0xfa/0x140
>>>> [    1.122180]  ? cgroup_kill_write+0x4d/0x150
>>>> [    1.122180]  ? cpu_up+0x6e/0x100
>>>> [    1.122180]  ? _raw_spin_unlock_irqrestore+0x30/0x50
>>>> [    1.122180]  ? rcu_read_lock_held_common+0xe/0x40
>>>> [    1.122180]  ? irq_shutdown_and_deactivate+0x11/0x30
>>>> [    1.122180]  ? lock_release+0xc7/0x2a0
>>>> [    1.122180]  ? apic_id_is_primary_thread+0x56/0x60
>>>> [    1.122180]  apic_id_is_primary_thread+0x56/0x60
>>>> [    1.122180]  cpu_up+0xbd/0x100
>>>> [    1.122180]  bringup_nonboot_cpus+0x4f/0x60
>>>> [    1.122180]  smp_init+0x26/0x74
>>>> [    1.122180]  kernel_init_freeable+0x183/0x32d
>>>> [    1.122180]  ? _raw_spin_unlock_irq+0x24/0x40
>>>> [    1.122180]  ? rest_init+0x330/0x330
>>>> [    1.122180]  kernel_init+0x17/0x140
>>>> [    1.122180]  ? rest_init+0x330/0x330
>>>> [    1.122180]  ret_from_fork+0x22/0x30
>>>> [    1.122244]
>>>> ================================================================================
>>>> [    1.123176] installing Xen timer for CPU 1
>>>> [    1.123369] x86: Booting SMP configuration:
>>>> [    1.123409] .... node  #0, CPUs:      #1
>>>> [    1.154400] Callback from call_rcu_tasks_trace() invoked.
>>>> [    1.154491] smp: Brought up 1 node, 1 CPU
>>>> [    1.154526] smpboot: Max logical packages: 2
>>>> [    1.154570] smpboot: Total of 1 processors activated (5999.99
>>>> BogoMIPS)
>>>>
>>>> I have tried a PV guest (same setup) and the kernel could bring up all
>>>> the vCPUs.
>>>>
>>>> Digging down, Linux will set smp_num_siblings to 0 (via
>>>> detect_ht_early()) and as a result will skip all the CPUs. The value
>>>> is retrieved from a CPUID leaf. So it sounds like we don't set the
>>>> leaf correctly.
>>>>
>>>> FWIW, I have also tried on Xen 4.11 and could spot the same issue.
>>>> Does this ring any bell to you?
>>>
>>> The CPUID data we give to guests is generally nonsense when it comes to
>>> topology.  By any chance does the hardware you're booting this on not
>>> have hyperthreading enabled/active to begin with?
>>
>> Well, I'd put the question slightly differently: What CPUID data does
>> qemu supply to Xen here? I could easily see us making an assumption
>> somewhere that is met by all hardware but is theoretically wrong to
>> make and not met by qemu, which then leads to further issues with what
>> we expose to our guest.
> I have pasted the output from cpuid on a baremetal Linux here:

"baremetal" still meaning it was running on qemu, not itself baremetal?

> https://pastebin.com/WvaXiXuL

   miscellaneous (1/ebx):
      process local APIC physical ID = 0x0 (0)
      maximum IDs for CPUs in pkg    = 0x0 (0)
      CLFLUSH line size              = 0x8 (8)
      brand index                    = 0x0 (0)

As suspected the field is zero, and hence will remain zero after
multiplying by 2. I suppose the patch sent earlier should then get you
further.
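
To make the arithmetic concrete, here is a minimal sketch in C. It assumes
the guest reads its sibling count from CPUID.1:EBX[23:16] (the "maximum IDs
for CPUs in pkg" field above) and that the kernel then builds its SMT mask
from fls(siblings) - 1, which is what the "shift exponent -1" in the UBSAN
report suggests. The helper names are hypothetical and are not taken from
either Xen or Linux.

```c
#include <stdint.h>

/* CPUID.1:EBX[23:16], the "maximum IDs for CPUs in pkg" field. */
static unsigned int logical_ids_per_pkg(uint32_t ebx)
{
    return (ebx >> 16) & 0xff;
}

/*
 * Find-last-set, 1-based, returning 0 for x == 0.
 * Assumption: this mirrors the convention of the kernel's fls().
 */
static int fls_sketch(unsigned int x)
{
    int n = 0;

    while (x) {
        n++;
        x >>= 1;
    }
    return n;
}

/*
 * Shift count that would be used to build the SMT mask.
 * Assumption: modelled on a "1U << (fls(siblings) - 1)" expression.
 */
static int smt_shift(unsigned int siblings)
{
    return fls_sketch(siblings) - 1;
}
```

With the values from the paste, EBX is 0x00000800 (only the CLFLUSH field is
non-zero), so logical_ids_per_pkg() returns 0, doubling it still gives 0, and
smt_shift(0) comes out as -1. A sibling count of 2 would give a shift of 1.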

Jan

