On Thu, Oct 08 2020 at 12:10, David Woodhouse wrote:
> On Thu, 2020-10-08 at 11:34 +0200, Thomas Gleixner wrote:
>> The overall conclusion for this is:
>> 
>>  1) X2APIC support on bare metal w/o irq remapping is not going to
>>     happen unless you:
>> 
>>       - add support in multi-queue devices which utilize managed
>>         interrupts
>>
>>       - audit the whole tree for other assumptions related to the
>>         reachability of possible CPUs.
>> 
>>     I'm not expecting you to be done with that before I retire so for
>>     me it's just not going to happen :)
>
> Makes sense. It probably does mean we should add a BUG_ON for the case
> where IRQ remapping *is* enabled but any device is found which isn't
> behind it. But that's OK.

We can kinda gracefully handle that. See the completely untested and
incomplete patch below.
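
To spell out the convention the patch relies on: the domain lookup gets three outcomes instead of two. NULL keeps meaning "no remapping in use, fall back to the vector domain", a valid pointer is the remapping domain, and an ERR_PTR means "remapping is enabled but this device is not behind it". A minimal userspace sketch of that, assuming stand-ins modeled on the helpers in linux/err.h; get_remap_domain(), pick_parent() and their flag arguments are hypothetical names for illustration only and are not part of the patch:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins modeled on the kernel's linux/err.h helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(intptr_t error) { return (void *)error; }
static inline intptr_t PTR_ERR(const void *ptr) { return (intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errnos. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct irq_domain { const char *name; };

static struct irq_domain vector_domain = { "VECTOR" };
static struct irq_domain ir_domain = { "IR" };

/* Hypothetical lookup; the two flags only exist for this illustration. */
static struct irq_domain *get_remap_domain(int remap_enabled, int behind_iommu)
{
	if (!remap_enabled)
		return NULL;			/* no remapping at all */
	if (!behind_iommu)
		return ERR_PTR(-ENODEV);	/* remapping on, device not covered */
	return &ir_domain;
}

/* Caller side: error out, fall back, or use the remapping domain. */
static int pick_parent(int remap_enabled, int behind_iommu,
		       struct irq_domain **parent)
{
	struct irq_domain *d = get_remap_domain(remap_enabled, behind_iommu);

	if (IS_ERR(d))
		return (int)PTR_ERR(d);
	*parent = d ? d : &vector_domain;
	return 0;
}
```

The point of the third state is that a device which is not covered fails at domain creation time instead of silently ending up on the vector domain.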

>>  2) X2APIC support on VIRT is possible if the extended ID magic is
>>     supported by the hypervisor because that does not make any CPU
>>     unreachable for MSI and therefore the multi-queue muck and
>>     everything else just works.
>> 
>>     This requires either having the domain affinity limitation for HPET
>>     in place, or force-disabling HPET or at least HPET-MSI, which is
>>     a reasonable tradeoff.
>> 
>>     HPET is not required for guests which have kvmclock and
>>     APIC/deadline timer and known (hypervisor provided) frequencies.
>
> HPET-MSI should work fine. Like the IOAPIC, it's just a child of the
> *actual* MSI domain. The address/data in the MSI message are completely
> opaque to it, and if the parent domain happens to put meaningful
> information into bits 11-5 of the MSI address, the HPET won't even
> notice.
>
> The HPET's Tn_FSB_INT_ADDR register does have a full 32 bits of the MSI
> address; it's not doing bit-swizzling like the IOAPIC does, which might
> potentially *not* have been able to set certain bits in the MSI.

Indeed. I thought it was crippled in some way, but you're right, it has
all the bits.
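
For illustration, a rough sketch of how a parent domain could fold a 15-bit destination ID into the low MSI address word. It assumes the conventional x86 layout (base 0xFEE00000, destination ID bits 0-7 in address bits 19-12), physical destination mode with no redirection hint, and the extended ID scheme putting destination ID bits 8-14 into address bits 11-5. msi_addr_lo() and MSI_ADDR_BASE are made-up names, not kernel symbols:

```c
#include <assert.h>
#include <stdint.h>

#define MSI_ADDR_BASE	0xfee00000u	/* fixed x86 MSI address window */

/* Hypothetical helper: compose the low 32 bits of the MSI address. */
static uint32_t msi_addr_lo(uint32_t dest_id)
{
	uint32_t addr = MSI_ADDR_BASE;

	/* Only 8 + 7 = 15 destination ID bits fit into this format. */
	assert(dest_id < (1u << 15));

	addr |= (dest_id & 0xffu) << 12;	/* ID bits 0-7  -> addr bits 19-12 */
	addr |= ((dest_id >> 8) & 0x7fu) << 5;	/* ID bits 8-14 -> addr bits 11-5 */
	return addr;
}
```

A child which stores the full 32 bit address unmodified, like the HPET register mentioned above, passes bits 11-5 through without looking at them.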

Thanks,

        tglx
---
Subject: x86/iommu: Make interrupt remapping more robust
From: Thomas Gleixner <t...@linutronix.de>
Date: Thu, 08 Oct 2020 14:09:44 +0200

Needs to be split into pieces and cover PCI proper. Right now PCI gets a
NULL pointer assigned which makes it explode at the wrong place
later. Also hyperv iommu wants some love.

NOT-Signed-off-by: Thomas Gleixner <t...@linutronix.de>
---
 arch/x86/kernel/apic/io_apic.c      |    4 +++-
 arch/x86/kernel/apic/msi.c          |   24 ++++++++++++++----------
 drivers/iommu/amd/iommu.c           |    6 +++---
 drivers/iommu/intel/irq_remapping.c |    4 ++--
 4 files changed, 22 insertions(+), 16 deletions(-)

--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2300,7 +2300,9 @@ static int mp_irqdomain_create(int ioapi
        info.type = X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT;
        info.devid = mpc_ioapic_id(ioapic);
        parent = irq_remapping_get_irq_domain(&info);
-       if (!parent)
+       if (IS_ERR(parent))
+               return PTR_ERR(parent);
+       else if (!parent)
                parent = x86_vector_domain;
        else
                name = "IO-APIC-IR";
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -415,9 +415,9 @@ static struct msi_domain_info hpet_msi_d
 struct irq_domain *hpet_create_irq_domain(int hpet_id)
 {
        struct msi_domain_info *domain_info;
+       struct fwnode_handle *fn = NULL;
        struct irq_domain *parent, *d;
        struct irq_alloc_info info;
-       struct fwnode_handle *fn;
 
        if (x86_vector_domain == NULL)
                return NULL;
@@ -432,25 +432,29 @@ struct irq_domain *hpet_create_irq_domai
        init_irq_alloc_info(&info, NULL);
        info.type = X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT;
        info.devid = hpet_id;
+
        parent = irq_remapping_get_irq_domain(&info);
-       if (parent == NULL)
+       if (IS_ERR(parent))
+               goto fail;
+       else if (!parent)
                parent = x86_vector_domain;
        else
                hpet_msi_controller.name = "IR-HPET-MSI";
 
        fn = irq_domain_alloc_named_id_fwnode(hpet_msi_controller.name,
                                              hpet_id);
-       if (!fn) {
-               kfree(domain_info);
-               return NULL;
-       }
+       if (!fn)
+               goto fail;
 
        d = msi_create_irq_domain(fn, domain_info, parent);
-       if (!d) {
-               irq_domain_free_fwnode(fn);
-               kfree(domain_info);
-       }
+       if (!d)
+               goto fail;
        return d;
+
+fail:
+       irq_domain_free_fwnode(fn);
+       kfree(domain_info);
+       return NULL;
 }
 
 int hpet_assign_irq(struct irq_domain *domain, struct hpet_channel *hc,
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3557,7 +3557,7 @@ static struct irq_domain *get_irq_domain
        struct amd_iommu *iommu = amd_iommu_rlookup_table[devid];
 
        if (!iommu)
-               return NULL;
+               return ERR_PTR(-ENODEV);
 
        switch (info->type) {
        case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT:
@@ -3565,7 +3565,7 @@ static struct irq_domain *get_irq_domain
                return iommu->ir_domain;
        default:
                WARN_ON_ONCE(1);
-               return NULL;
+               return ERR_PTR(-ENODEV);
        }
 }
 
@@ -3578,7 +3578,7 @@ static struct irq_domain *get_irq_domain
 
        devid = get_devid(info);
        if (devid < 0)
-               return NULL;
+               return ERR_PTR(-ENODEV);
        return get_irq_domain_for_devid(info, devid);
 }
 
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -212,7 +212,7 @@ static struct irq_domain *map_hpet_to_ir
                if (ir_hpet[i].id == hpet_id && ir_hpet[i].iommu)
                        return ir_hpet[i].iommu->ir_domain;
        }
-       return NULL;
+       return ERR_PTR(-ENODEV);
 }
 
 static struct intel_iommu *map_ioapic_to_iommu(int apic)
@@ -230,7 +230,7 @@ static struct irq_domain *map_ioapic_to_
 {
        struct intel_iommu *iommu = map_ioapic_to_iommu(apic);
 
-       return iommu ? iommu->ir_domain : NULL;
+       return iommu ? iommu->ir_domain : ERR_PTR(-ENODEV);
 }
 
 static struct irq_domain *map_dev_to_ir(struct pci_dev *dev)