On 24.10.2023 16:53, Roger Pau Monne wrote:
> Sporadically we have seen the following during AP bringup on AMD platforms
> only:
> 
> microcode: CPU59 updated from revision 0x830107a to 0x830107a, date = 
> 2023-05-17
> microcode: CPU60 updated from revision 0x830104d to 0x830107a, date = 
> 2023-05-17
> CPU60: No irq handler for vector 27 (IRQ -2147483648)
> microcode: CPU61 updated from revision 0x830107a to 0x830107a, date = 
> 2023-05-17
> 
> This is similar to the issue raised on Linux commit 36e9e1eab777e, where they
> observed i8259 (active) vectors getting delivered to CPUs different than 0.
> 
> On AMD or Hygon platforms adjust the target CPU mask of i8259 interrupt
> descriptors to contain all possible CPUs, so that APs will reserve the vector
> at startup if any legacy IRQ is still delivered through the i8259.  Note that
> if the IO-APIC takes over those interrupt descriptors the CPU mask will be
> reset.
> 
> Spurious i8259 interrupt vectors however (IRQ7 and IRQ15) can be injected even
> when all i8259 pins are masked, and hence would need to be handled on all 
> CPUs.
> 
> Continue to reserve PIC vectors on CPU0 only, but do check for such spurious
> interrupts on all CPUs if the vendor is AMD or Hygon.  Note that once the
> vectors get used by devices detecting PIC spurious interrupts will no longer 
> be
> possible, however the device driver should be able to cope with spurious
> interrupts.  Such PIC spurious interrupts occurring when the vector is in use
> by a local APIC routed source will lead to an extra EOI, which might
> unintentionally clear a different vector from ISR.  Note this is already the
> current behavior, so assume it's infrequent enough to not cause real issues.
> 
> Finally, adjust the printed message to display the CPU where the spurious
> interrupt has been received, so it looks like:
> 
> microcode: CPU1 updated from revision 0x830107a to 0x830107a, date = 
> 2023-05-17
> cpu1: spurious 8259A interrupt: IRQ7
> microcode: CPU2 updated from revision 0x830104d to 0x830107a, date = 
> 2023-05-17
> 
> Amends: 3fba06ba9f8b ('x86/IRQ: re-use legacy vector ranges on APs')
> Signed-off-by: Roger Pau Monné <roger....@citrix.com>

Reviewed-by: Jan Beulich <jbeul...@suse.com>
with one nit (which I think can be taken care of when committing):

> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1920,7 +1920,16 @@ void do_IRQ(struct cpu_user_regs *regs)
>                  kind = "";
>              if ( !(vector >= FIRST_LEGACY_VECTOR &&
>                     vector <= LAST_LEGACY_VECTOR &&
> -                   !smp_processor_id() &&
> +                   (!smp_processor_id() ||
> +                    /*
> +                     * For AMD/Hygon do spurious PIC interrupt
> +                     * detection on all CPUs, as it has been observed
> +                     * that during unknown circumstances spurious PIC
> +                     * interrupts have been delivered to CPUs
> +                     * different than the BSP.
> +                     */
> +                   (boot_cpu_data.x86_vendor & (X86_VENDOR_AMD |
> +                                                X86_VENDOR_HYGON))) &&

Afaict these two lines need indenting by one more blank, to account
for the parentheses enclosing the || operands.

Jan

>                     bogus_8259A_irq(vector - FIRST_LEGACY_VECTOR)) )
>              {
>                  printk("CPU%u: No irq handler for vector %02x (IRQ %d%s)\n",


Reply via email to