On 2025-08-15 10:30, Jan Beulich wrote:
On 14.08.2025 20:20, Andrew Cooper wrote:
On 14/08/2025 8:26 am, Jan Beulich wrote:
On 13.08.2025 13:36, Andrew Cooper wrote:
On 12/08/2025 10:43 am, Nicola Vetrini wrote:
On 2025-08-08 22:23, Andrew Cooper wrote:
diff --git a/xen/arch/x86/traps-setup.c b/xen/arch/x86/traps-setup.c
index 8ca379c9e4cb..13b8fcf0ba51 100644
--- a/xen/arch/x86/traps-setup.c
+++ b/xen/arch/x86/traps-setup.c
@@ -19,6 +20,124 @@ boolean_param("ler", opt_ler);

 void nocall entry_PF(void);

+/*
+ * Sets up system tables and descriptors for IDT devliery.
+ *
+ * - Sets up TSS with stack pointers, including ISTs
+ * - Inserts TSS selector into regular and compat GDTs
+ * - Loads GDT, IDT, TR then null LDT
+ * - Sets up IST references in the IDT
+ */
+static void load_system_tables(void)
+{
+    unsigned int i, cpu = smp_processor_id();
+    unsigned long stack_bottom = get_stack_bottom(),
+        stack_top = stack_bottom & ~(STACK_SIZE - 1);
+    /*
+     * NB: define tss_page as a local variable because clang 3.5
doesn't
+     * support using ARRAY_SIZE against per-cpu variables.
+     */
+    struct tss_page *tss_page = &this_cpu(tss_page);
+    idt_entry_t *idt = this_cpu(idt);
+
Given the clang baseline this might not be needed anymore?
Hmm.  While true, looking at 51461114e26, the code is definitely better written with the tss_page variable and we wouldn't want to go back to
the old form.

I think that I'll simply drop the comment.

~Andrew

P.S.

Generally speaking, because of the RELOC_HIDE() in this_cpu(), any time you ever want two accesses to a variable, it's better (code gen wise) to
construct a pointer to it and use the point multiple times.

I don't understand why there's a RELOC_HIDE() in this_cpu().  The
justification doesn't make sense, but I've not had time to explore what
happens if we take it out.
There's no justification in xen/percpu.h?

Well, it's given in compiler.h by RELOC_HIDE().

/* This macro obfuscates arithmetic on a variable address so that gcc
   shouldn't recognize the original var, and make assumptions about it */


But this is far from convincing.


My understanding is that we simply may not expose any accesses to per_cpu_* variables directly to the compiler, or there's a risk that it might access
the "master" variable (i.e. CPU0's on at least x86).

RELOC_HIDE() doesn't do anything about the correctness of the pointer
arithmetic expression to make the access work.

I don't see how a correct expression can ever access CPU0's data by
accident.

Hmm, upon another look I agree. I wonder whether we inherited this from
Linux, where in turn it may have been merely a workaround to deal with
preemptible code not correctly accessing per-CPU data (i.e. not
accounting for get_per_cpu_offset() not being stable across preemption).
Yet then per_cpu() would have been of similar concern when "cpu" isn't
properly re-fetched after any possible preemption point ...

Jan

Probably inherited with a stripped-down comment on top of RELOC_HIDE, see [1]. In a way it does make sense that the compiler may decide to optimize based on this assumption, though I don't know whether wrapping is meant to happen with per-CPU variables.

[1] https://elixir.bootlin.com/linux/v6.16/source/include/linux/compiler-gcc.h#L31

--
Nicola Vetrini, B.Sc.
Software Engineer
BUGSENG (https://bugseng.com)
LinkedIn: https://www.linkedin.com/in/nicola-vetrini-a42471253

Reply via email to