Hello,

The following series aims to prevent local APIC errors from stalling the
shutdown process.  In XenServer testing we have seen reports of AMD
boxes sporadically getting stuck in a spam of:

APIC error on CPU0: 00(08), Receive accept error

messages during shutdown, caused by device interrupts targeting CPUs
that are offline (and have their local APIC disabled).

The first patch strictly solves the issue of shutdown getting stuck.
Further patches aim to quiesce interrupts from all devices (known to
Xen) in an attempt to prevent a spurious "APIC error on CPU0: 00(00)",
and to make kexec more reliable.
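
For illustration only, the ordering problem can be sketched with a toy
model.  This is not Xen code: every name below (toy_dev, toy_shutdown()
and so on) is hypothetical, and the model only counts interrupts that
would be delivered to an offline CPU.  It assumes each device has a
single interrupt routed to one fixed CPU.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4
#define TOY_NR_DEVS 3

/* Hypothetical device: one interrupt, routed to a single CPU. */
struct toy_dev {
    unsigned int target_cpu;
    bool irq_enabled;
};

static bool toy_cpu_online[TOY_NR_CPUS];
static struct toy_dev toy_devs[TOY_NR_DEVS];

/* Bring all CPUs online and route the device interrupts to CPUs 1, 2, 0. */
static void toy_init(void)
{
    static const unsigned int targets[TOY_NR_DEVS] = { 1, 2, 0 };
    unsigned int i;

    for ( i = 0; i < TOY_NR_CPUS; i++ )
        toy_cpu_online[i] = true;

    for ( i = 0; i < TOY_NR_DEVS; i++ )
    {
        toy_devs[i].target_cpu = targets[i];
        toy_devs[i].irq_enabled = true;
    }
}

/* Mask the interrupt at every device known to the model. */
static void toy_quiesce_devices(void)
{
    unsigned int i;

    for ( i = 0; i < TOY_NR_DEVS; i++ )
        toy_devs[i].irq_enabled = false;
}

/* Offline every CPU except CPU0, which drives the shutdown. */
static void toy_offline_aps(void)
{
    unsigned int cpu;

    for ( cpu = 1; cpu < TOY_NR_CPUS; cpu++ )
        toy_cpu_online[cpu] = false;
}

/* Count enabled interrupts whose target CPU is offline. */
static unsigned int toy_count_misdelivered(void)
{
    unsigned int i, errors = 0;

    for ( i = 0; i < TOY_NR_DEVS; i++ )
    {
        assert(toy_devs[i].target_cpu < TOY_NR_CPUS);
        if ( toy_devs[i].irq_enabled &&
             !toy_cpu_online[toy_devs[i].target_cpu] )
            errors++;
    }

    return errors;
}

/* Run the modelled shutdown, optionally quiescing devices first. */
unsigned int toy_shutdown(bool quiesce_first)
{
    toy_init();

    if ( quiesce_first )
        toy_quiesce_devices();

    toy_offline_aps();

    return toy_count_misdelivered();
}
```

In this model toy_shutdown(false) returns 2, because two devices still
target CPUs that have gone offline, while toy_shutdown(true) returns 0.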

Thanks, Roger.

Roger Pau Monne (5):
  x86/shutdown: offline APs with interrupts disabled on all CPUs
  x86/irq: drop fixup_irqs() parameters
  x86/smp: perform disabling on interrupts ahead of AP shutdown
  x86/pci: disable MSI(-X) on all devices at shutdown
  x86/iommu: disable interrupts at shutdown

 xen/arch/x86/crash.c                        |  2 ++
 xen/arch/x86/include/asm/irq.h              |  4 +--
 xen/arch/x86/include/asm/msi.h              |  1 +
 xen/arch/x86/irq.c                          | 30 ++++++++-----------
 xen/arch/x86/msi.c                          | 18 +++++++++++
 xen/arch/x86/smp.c                          | 33 +++++++++++++++------
 xen/arch/x86/smpboot.c                      |  2 +-
 xen/drivers/passthrough/amd/iommu.h         |  1 +
 xen/drivers/passthrough/amd/iommu_init.c    | 17 +++++++++++
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  1 +
 xen/drivers/passthrough/iommu.c             |  6 ++++
 xen/drivers/passthrough/pci.c               | 33 +++++++++++++++++++++
 xen/drivers/passthrough/vtd/iommu.c         | 19 ++++++++++++
 xen/include/xen/iommu.h                     |  3 ++
 xen/include/xen/pci.h                       |  4 +++
 15 files changed, 145 insertions(+), 29 deletions(-)

-- 
2.46.0

