On 28.04.2022 09:37, Jan Beulich wrote:
> On 27.04.2022 20:44, David Vrabel wrote:
>>
>>
>> On 27/04/2022 19:03, Andrew Cooper wrote:
>>> On 19/04/2022 16:03, David Vrabel wrote:
>>>> From: David Vrabel <dvra...@amazon.co.uk>
>>>>
>>>> If the direct map is incorrectly modified with interrupts disabled,
>>>> the required TLB flushes are degraded to flushing the local CPU only.
>>>>
>>>> This could lead to very hard to diagnose problems as different CPUs will
>>>> end up with different views of memory. Although, no such issues have yet
>>>> been identified.
>>>>
>>>> Change the check in the flush_area() macro to look at system_state
>>>> instead. This defers the switch from local to all later in the boot
>>>> (see xen/arch/x86/setup.c:__start_xen()). This is fine because
>>>> additional PCPUs are not brought up until after the system state is
>>>> SYS_STATE_smp_boot.
>>>>
>>>> Signed-off-by: David Vrabel <dvra...@amazon.co.uk>
>>>
>>> This explodes on CET systems:
>>>
>>> (XEN) Assertion 'local_irq_is_enabled()' failed at arch/x86/smp.c:265
>>> (XEN) ----[ Xen-4.17.0-10.24-d  x86_64  debug=y  Not tainted ]----
>>> (XEN) CPU:    0
>>> (XEN) RIP:    e008:[<ffff82d040345300>] flush_area_mask+0x40/0x13e
>>> <snip>
>>> (XEN) Xen call trace:
>>> (XEN)    [<ffff82d040345300>] R flush_area_mask+0x40/0x13e
>>> (XEN)    [<ffff82d040338a40>] F modify_xen_mappings+0xc5/0x958
>>> (XEN)    [<ffff82d0404474f9>] F
>>> arch/x86/alternative.c#_alternative_instructions+0xb7/0xb9
>>> (XEN)    [<ffff82d0404476cc>] F alternative_branches+0xf/0x12
>>> (XEN)    [<ffff82d04044e37d>] F __start_xen+0x1ef4/0x2776
>>> (XEN)    [<ffff82d040203344>] F __high_start+0x94/0xa0
>>> (XEN)
>>> (XEN)
>>> (XEN) ****************************************
>>> (XEN) Panic on CPU 0:
>>> (XEN) Assertion 'local_irq_is_enabled()' failed at arch/x86/smp.c:265
>>> (XEN) ****************************************
>>> (XEN)
>>>
>>> We really did want a local-only flush here, because we specifically
>>> intended to make self-modifying changes before bringing secondary CPUs up.
>>
>> I think the transition to SYS_STATE_smp_boot system state should be 
>> later. i.e., the last point were only 1 PCPU is guaranteed.
> 
> I'm not sure there isn't code which assumes pre-SMP initcalls to happen
> in this state already. So it may take addition of yet another state if
> no other solution can be found.

Or maybe this again shouldn't be using system_state but num_online_cpus()?

Jan


Reply via email to