On 14.08.2025 20:20, Andrew Cooper wrote:
> On 14/08/2025 8:26 am, Jan Beulich wrote:
>> On 13.08.2025 13:36, Andrew Cooper wrote:
>>> On 12/08/2025 10:43 am, Nicola Vetrini wrote:
>>>> On 2025-08-08 22:23, Andrew Cooper wrote:
>>>>> diff --git a/xen/arch/x86/traps-setup.c b/xen/arch/x86/traps-setup.c
>>>>> index 8ca379c9e4cb..13b8fcf0ba51 100644
>>>>> --- a/xen/arch/x86/traps-setup.c
>>>>> +++ b/xen/arch/x86/traps-setup.c
>>>>> @@ -19,6 +20,124 @@ boolean_param("ler", opt_ler);
>>>>>
>>>>>  void nocall entry_PF(void);
>>>>>
>>>>> +/*
>>>>> + * Sets up system tables and descriptors for IDT devliery.
>>>>> + *
>>>>> + * - Sets up TSS with stack pointers, including ISTs
>>>>> + * - Inserts TSS selector into regular and compat GDTs
>>>>> + * - Loads GDT, IDT, TR then null LDT
>>>>> + * - Sets up IST references in the IDT
>>>>> + */
>>>>> +static void load_system_tables(void)
>>>>> +{
>>>>> +    unsigned int i, cpu = smp_processor_id();
>>>>> +    unsigned long stack_bottom = get_stack_bottom(),
>>>>> +        stack_top = stack_bottom & ~(STACK_SIZE - 1);
>>>>> +    /*
>>>>> +     * NB: define tss_page as a local variable because clang 3.5
>>>>> doesn't
>>>>> +     * support using ARRAY_SIZE against per-cpu variables.
>>>>> +     */
>>>>> +    struct tss_page *tss_page = &this_cpu(tss_page);
>>>>> +    idt_entry_t *idt = this_cpu(idt);
>>>>> +
>>>> Given the clang baseline this might not be needed anymore?
>>> Hmm.  While true, looking at 51461114e26, the code is definitely better
>>> written with the tss_page variable and we wouldn't want to go back to
>>> the old form.
>>>
>>> I think that I'll simply drop the comment.
>>>
>>> ~Andrew
>>>
>>> P.S.
>>>
>>> Generally speaking, because of the RELOC_HIDE() in this_cpu(), any time
>>> you ever want two accesses to a variable, it's better (code gen wise) to
>>> construct a pointer to it and use the point multiple times.
>>>
>>> I don't understand why there's a RELOC_HIDE() in this_cpu().  The
>>> justification doesn't make sense, but I've not had time to explore what
>>> happens if we take it out.
>> There's no justification in xen/percpu.h?
> 
> Well, it's given in compiler.h by RELOC_HIDE().
> 
> /* This macro obfuscates arithmetic on a variable address so that gcc
>    shouldn't recognize the original var, and make assumptions about it */
> 
> 
> But this is far from convincing.
> 
>>
>> My understanding is that we simply may not expose any accesses to per_cpu_*
>> variables directly to the compiler, or there's a risk that it might access
>> the "master" variable (i.e. CPU0's on at least x86).
> 
> RELOC_HIDE() doesn't do anything about the correctness of the pointer
> arithmetic expression to make the access work.
> 
> I don't see how a correct expression can ever access CPU0's data by
> accident.

Hmm, upon another look I agree. I wonder whether we inherited this from
Linux, where in turn it may have been merely a workaround to deal with
preemptible code not correctly accessing per-CPU data (i.e. not
accounting for get_per_cpu_offset() not being stable across preemption).
Yet then per_cpu() would have been of similar concern when "cpu" isn't
properly re-fetched after any possible preemption point ...

Jan

Reply via email to