On 01.08.2022 09:38, Juergen Gross wrote:
> On 27.07.22 08:32, Jan Beulich wrote:
>> On 27.07.2022 03:19, Gao, Ruifeng wrote:
>>> Problem Description:
>>> When executing "/usr/local/sbin/xen-hptool cpu-offline <cpuid>", the
>>> host hangs immediately.
>>>
>>> Version-Release and System Details:
>>> Platform: Ice Lake Server
>>> Host OS: Red Hat Enterprise Linux 8.3 (Ootpa)
>>> Kernel: 5.19.0-rc6
>>> HW: Intel(R) Xeon(R) Gold 6336Y CPU @ 2.40GHz
>>> Xen Version: 4.17-unstable (ab2977b027-dirty)
>>>
>>> Reproduce Steps:
>>> 1. Boot from Xen and check the information:
>>> [root@icx-2s1 ~]# xl info
>>> host                   : icx-2s1
>>> release                : 5.19.0-rc6
>>> xen_version            : 4.17-unstable
>>> xen_caps               : xen-3.0-x86_64 hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
>>> platform_params        : virt_start=0xffff800000000000
>>> xen_changeset          : Thu Jul 14 19:45:36 2022 +0100 git:ab2977b027-dirty
>>> 2. Execute the cpu-offline command, here cpuid is 48 as an example:
>>> [root@icx-2s1 ~]# /usr/local/sbin/xen-hptool cpu-offline 48
>>>
>>> Actual Results:
>>> The host will hang immediately.
>>
>> Well, it crashes (which is an important difference). Also you've hidden
>> the important details (which would allow one to easily identify what area
>> the issue is in) quite well in the attachment.
>>
>> Jürgen (and possibly George / Dario),
>>
>> this
>>
>> (XEN) Xen call trace:
>> (XEN)    [<ffff82d04023be76>] R xfree+0x150/0x1f7
>> (XEN)    [<ffff82d040248795>] F common/sched/credit2.c#csched2_free_udata+0xc/0xe
>> (XEN)    [<ffff82d040259169>] F schedule_cpu_rm+0x38d/0x4b3
>> (XEN)    [<ffff82d0402430ca>] F common/sched/cpupool.c#cpupool_unassign_cpu_finish+0x17e/0x22c
>> (XEN)    [<ffff82d04021d402>] F common/sched/cpupool.c#cpu_callback+0x3fb/0x4dc
>> (XEN)    [<ffff82d040229fc3>] F notifier_call_chain+0x6b/0x96
>> (XEN)    [<ffff82d040204df7>] F common/cpu.c#cpu_notifier_call_chain+0x1b/0x33
>> (XEN)    [<ffff82d040204e33>] F common/cpu.c#_take_cpu_down+0x24/0x2b
>> (XEN)    [<ffff82d040204e43>] F common/cpu.c#take_cpu_down+0x9/0x10
>> (XEN)    [<ffff82d040231517>] F common/stop_machine.c#stopmachine_action+0x86/0x96
>> (XEN)    [<ffff82d040231cc5>] F common/tasklet.c#do_tasklet_work+0x72/0xa5
>> (XEN)    [<ffff82d040231f42>] F do_tasklet+0x58/0x8a
>> (XEN)    [<ffff82d040320b60>] F arch/x86/domain.c#idle_loop+0x8d/0xee
>> (XEN)
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 48:
>> (XEN) Assertion '!in_irq() && (local_irq_is_enabled() || num_online_cpus() <= 1)' failed at common/xmalloc_tlsf.c:704
>> (XEN) ****************************************
>>
>> is pointing at the problem quite clearly. Conceptually I think it
>> has always been wrong to call xfree() from stop-machine context. It
>> just so happened that we got away with that so far, because the CPU
>> being brought down was the only one using the respective functions
>> (and hence there was no other risk of locking issues).
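
For reference, the check that fires is the allocation-context assertion in
the TLSF allocator; a minimal sketch of the condition, taken from the panic
message above (the exact wording in common/xmalloc_tlsf.c may differ):

    /* Sketch of the allocation-context check, per the panic message above. */
    #define ASSERT_ALLOC_CONTEXT() \
        ASSERT(!in_irq() && (local_irq_is_enabled() || num_online_cpus() <= 1))

take_cpu_down() runs with interrupts disabled while the other CPUs are still
online, so both halves of the || are false and any xmalloc()/xfree() reached
from there asserts.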
>>
>> The question is whether we want to continue building upon this (and
>> hence the involved assertion would need to "learn" to ignore
>> stop-machine context), or whether instead the freeing of the memory
>> here can be deferred, e.g. to be taken care of by the CPU driving
>> the offlining process.
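
To make the deferral idea concrete, a minimal hypothetical sketch
(sched_defer_free() and sched_flush_deferred_free() are invented names, not
existing Xen code; DEFINE_PER_CPU()/this_cpu()/per_cpu()/xfree() are the
existing primitives assumed here): the CPU in stop-machine context only
parks the pointer, and the CPU driving the offlining releases it later with
interrupts enabled.

    /* Hypothetical sketch only; the helper names are invented. */
    static DEFINE_PER_CPU(void *, sched_deferred_free);

    /* Called in stop-machine context instead of xfree(). */
    static void sched_defer_free(void *ptr)
    {
        this_cpu(sched_deferred_free) = ptr;
    }

    /* Called later by the CPU driving the offlining, with IRQs enabled. */
    static void sched_flush_deferred_free(unsigned int cpu)
    {
        xfree(per_cpu(sched_deferred_free, cpu));
        per_cpu(sched_deferred_free, cpu) = NULL;
    }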
> 
> This is even more complicated.
> 
> I think ASSERT_ALLOC_CONTEXT() will trigger more often, especially
> with core scheduling enabled. In fact I think this is the reason why
> I've seen very rare, strange failures with core scheduling when trying
> cpu hotplug operations, as there are even xmalloc() calls in
> stop-machine context.
> 
> I'm seeing the following possibilities:
> 
> 1) Pre-allocating the needed data and deferring freeing of no longer
>     needed data when taking a cpu down (a sketch of this idea follows
>     after this list). Apart from some refactoring in
>     common/sched/cpupool.c and common/sched/core.c this should be
>     doable.
> 
> 2) In case stop_machine() is called for action on only one cpu, allow
>     memory allocations and freeing with interrupts off, and flush the
>     TLBs locally when enabling interrupts again. This would require
>     rather limited changes, but wouldn't be as clean as the other
>     approach.
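
A rough illustration of how option 1's pre-allocation could be structured
(purely hypothetical: struct sched_rm_mem, sched_rm_mem_alloc() and
sched_rm_mem_free() are invented names, not part of any real patch, while
xzalloc()/xzalloc_bytes()/xfree() are the existing Xen allocator interfaces):
the driving CPU allocates with interrupts enabled before stop_machine(), the
stopped CPU only consumes the pre-allocated memory, and anything replaced is
freed afterwards.

    /* Hypothetical sketch of option 1; names are invented. */
    struct sched_rm_mem {
        void *new_udata;   /* pre-allocated, consumed in stop-machine context */
        void *old_udata;   /* replaced data, freed after stop-machine returns */
    };

    static struct sched_rm_mem *sched_rm_mem_alloc(size_t udata_size)
    {
        struct sched_rm_mem *mem = xzalloc(struct sched_rm_mem);

        if ( mem )
        {
            mem->new_udata = xzalloc_bytes(udata_size);
            if ( !mem->new_udata )
            {
                xfree(mem);
                mem = NULL;
            }
        }
        return mem;
    }

    static void sched_rm_mem_free(struct sched_rm_mem *mem)
    {
        if ( !mem )
            return;
        xfree(mem->old_udata);   /* the deferred free, with IRQs enabled */
        xfree(mem->new_udata);   /* only non-NULL if it was never consumed */
        xfree(mem);
    }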
> 
> Any preferences? I'd be fine with both variants and could write the
> patches.

I'd prefer 1 over 2, but in the unlikely event that 1 ends up unwieldy
I could live with an extensively commented form of 2.

Jan
