On Tue, Jul 11, 2023 at 12:53:31PM +0200, Jan Beulich wrote:
> On 10.07.2023 16:43, Roger Pau Monné wrote:
> > On Mon, Jul 10, 2023 at 12:56:27PM +0200, Jan Beulich wrote:
> >> On 07.07.2023 11:53, Roger Pau Monne wrote:
> >>> The current logic to init the local APIC and the IO-APIC does init the
> >>> former first before doing any kind of sanitation on the IO-APIC pin
> >>> configuration.  It's already noted on enable_IO_APIC() that Xen
> >>> shouldn't trust the IO-APIC being empty at bootup.
> >>>
> >>> At XenServer we have a system where the IO-APIC 0 is handed to Xen
> >>> with pin 0 unmasked, set to Fixed delivery mode, edge triggered and
> >>> with a vector of 0 (all fields of the RTE are zeroed).  Once the local
> >>> APIC is enabled periodic injections from such pin cause a storm of
> >>> errors:
> >>>
> >>> APIC error on CPU0: 00(40), Received illegal vector
> >>> APIC error on CPU0: 40(40), Received illegal vector
> >>> APIC error on CPU0: 40(40), Received illegal vector
> >>> APIC error on CPU0: 40(40), Received illegal vector
> >>> APIC error on CPU0: 40(40), Received illegal vector
> >>> APIC error on CPU0: 40(40), Received illegal vector
> >>>
> >>> That prevents Xen from booting.
> >>
> >> And I expect no RTE is in ExtInt mode, so one might conclude that
> >> firmware meant to set up RTE 0 that way. Mainly as a remark, albeit
> >> of course there's then the question whether to change the RTE
> >> rather than masking it. What do ACPI tables say?
> > 
> > There's one relevant override:
> > 
> > [668h 1640   1]                Subtable Type : 02 [Interrupt Source 
> > Override]
> > [669h 1641   1]                       Length : 0A
> > [66Ah 1642   1]                          Bus : 00
> > [66Bh 1643   1]                       Source : 00
> > [66Ch 1644   4]                    Interrupt : 00000002
> > [670h 1648   2]        Flags (decoded below) : 0000
> >                                     Polarity : 0
> >                                 Trigger Mode : 0
> > 
> > So IRQ 0 -> GSI 2, so it's likely pin 0 is the one the i8259 is
> > connected to.
> 
> Then wouldn't we be better off converting that RTE to ExtInt? That
> would allow PIC IRQs (not likely to exist, but still) to work
> (without undue other side effects afaics), whereas masking this RTE
> would not.

I'm kind of worry of trying to automate this logic.  Should we always
convert pin 0 to ExtInt if it's found unmasked, with and invalid
vector and there's a source override entry for the IRQ?

It seems weird to infer this just from the fact that pin 0 is all
zeroed out.

> >>> --- a/xen/arch/x86/apic.c
> >>> +++ b/xen/arch/x86/apic.c
> >>> @@ -1476,6 +1476,10 @@ int __init APIC_init_uniprocessor (void)
> >>>          return -1;
> >>>      }
> >>>  
> >>> +    if ( smp_found_config && !skip_ioapic_setup && nr_ioapics )
> >>> +        /* Sanitize the IO-APIC pins before enabling the local APIC. */
> >>> +        sanitize_IO_APIC();
> >>
> >> I'm a little puzzled by the smp_found_config part of the check here,
> >> but not in smp_prepare_cpus(). What's the reason for (a) the check
> >> and (b) the difference?
> > 
> > This just mimics what gates the call to setup_IO_APIC() in that same
> > function.  It makes no sense to call sanitize_IO_APIC() if
> > setup_IO_APIC() is not called, and I wasn't planning on changing the
> > logic that gates the call setup_IO_APIC() in this patch.
> > 
> > I did note the difference with smp_prepare_cpus(), and I think we
> > should look at unifying those paths, but didn't want to do it as part
> > of this fix.
> 
> Well, consistency is one valid goal. But masking RTEs may need to be
> done more aggressively than setting up the IO-APICs. In particular
> if we're not to use them, we still want to mask all RTEs. Otherwise
> we're likely to observe each IRQ to arrive via two separate routes
> (once through the 8259 and once from an unmasked IO-APIC pin).

So avoid the smp_found_config check here and use the same condition as in
smp_prepare_cpus(), I will adjust the patch to that effect.

I do wonder why we don't simply mandate IO-APIC usage and get rid of
the code to handle the 8259.  Is it only to cope with systems that
have a broken IO-APIC configuration?  Because I don't think there are
x86 64bit systems without an IO-APIC?

> >> Isn't checking nr_ioapics sufficient in this
> >> regard? (b) probably could be addressed by moving the code to the
> >> beginning of verify_local_APIC(), immediately ahead of which you
> >> insert both call sites. (At which point the function may want naming
> >> verify_IO_APIC() for consistency, but that's surely minor.)
> > 
> > I wanted the call to sanitize_IO_APIC() to be done at the same level
> > where the call to setup_IO_APIC() is done, because it makes the logic
> > clearer.
> 
> Hmm, I see.
> 
> >> As to also checking skip_ioapic_setup - wouldn't the unmasked pin be
> >> similarly troublesome in that case?
> > 
> > skip_ioapic_setup is set when the IO-APIC address in the MADT is
> > invalid, so in that case attempting to access IO-APIC registers in
> > that case won't lead to anything good.
> 
> Of course, as I did say as well. Still if for some IO-APICs we know
> their addresses, we would still be able to deal with those (if we
> were to stick to this mask-all-RTEs-early model).

The issue is that ioapic_init_mappings() will refuse to process
further IO-APIC entries once an entry with address 0 is found.  We
could change that, but I would likely do it as an improvement rather
than tie it to this change.

Thanks, Roger.

Reply via email to