Before CPU identification has run (and it may not have run at all, e.g.
when AP bringup failed altogether), cpu_data[].phys_proc_id (which is
what cpu_to_socket() resolves to) can't really be used. The use of
cpu_to_socket()'s result as an array index in cpu_smpboot_free()
therefore needs guarding, as the function will also be invoked upon AP
bringup failure, in which case CPU identification may not have run.

Without "x86/CPU: re-work populating of cpu_data[]" [1] the issue is
less pronounced: The field starts out as zero, then has the BSP value
(likely again zero) copied into it, and it is properly invalidated only
in cpu_smpboot_free(). Still it is clearly wrong to use the BSP's socket
number here.

Making the guard work both with and without the above patch applied
turns out to be interesting: Prior to that patch, the sole invalidation
done is that in cpu_smpboot_free(). Upon a later bringup attempt, the
fields invalidated are overwritten by the BSP values again, though.
Hence compare APIC IDs, as they cannot validly be the same once CPU
identification has run.

[1] https://lists.xen.org/archives/html/xen-devel/2024-02/msg00727.html

Signed-off-by: Jan Beulich <jbeul...@suse.com>
---
Sadly there was no feedback at all yet for the referenced patch.

--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -958,7 +958,13 @@ static void cpu_smpboot_free(unsigned in
     unsigned int socket = cpu_to_socket(cpu);
     struct cpuinfo_x86 *c = cpu_data;
 
-    if ( cpumask_empty(socket_cpumask[socket]) )
+    /*
+     * We may come here without the CPU having run through CPU identification.
+     * In that case the socket number cannot be relied upon, but the respective
+     * socket_cpumask[] slot also wouldn't have been set.
+     */
+    if ( c[cpu].apicid != boot_cpu_data.apicid &&
+         cpumask_empty(socket_cpumask[socket]) )
     {
         xfree(socket_cpumask[socket]);
         socket_cpumask[socket] = NULL;
