> -----Original Message-----
> From: Thomas Gleixner [mailto:t...@linutronix.de]
> Sent: Monday, May 25, 2015 4:38 PM
> To: Wu, Feng
> Cc: j...@8bytes.org; dw...@infradead.org; jiang....@linux.intel.com;
> io...@lists.linux-foundation.org; linux-kernel@vger.kernel.org
> Subject: Re: [v7 4/8] iommu, x86: No need to migrating irq for VT-d
> Posted-Interrupts
> 
> On Mon, 25 May 2015, Feng Wu wrote:
> 
> > We don't need to migrate the irqs for VT-d Posted-Interrupts here.
> > When 'pst' is set in IRTE, the associated irq will be posted to
> > guests instead of interrupt remapping. The destination of the
> > interrupt is set in Posted-Interrupts Descriptor, and the migration
> > happens during vCPU scheduling.
> >
> > However, we still update the cached irte here, which can be used
> > when changing back to remapping mode.
> >
> > Signed-off-by: Feng Wu <feng...@intel.com>
> > Reviewed-by: Jiang Liu <jiang....@linux.intel.com>
> > Acked-by: David Woodhouse <david.woodho...@intel.com>
> > ---
> >  drivers/iommu/intel_irq_remapping.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/intel_irq_remapping.c
> b/drivers/iommu/intel_irq_remapping.c
> > index 1955b09..646f4cf 100644
> > --- a/drivers/iommu/intel_irq_remapping.c
> > +++ b/drivers/iommu/intel_irq_remapping.c
> > @@ -994,7 +994,10 @@ intel_ir_set_affinity(struct irq_data *data, const
> struct cpumask *mask,
> >      */
> >     irte->vector = cfg->vector;
> >     irte->dest_id = IRTE_DEST(cfg->dest_apicid);
> > -   modify_irte(&ir_data->irq_2_iommu, irte);
> > +
> > +   /* We don't need to modify irte if the interrupt is for posting. */
> > +   if (irte->pst != 1)
> > +           modify_irte(&ir_data->irq_2_iommu, irte);
> 
> I don't think this is correct. ir_data->irte_entry contains the non
> posted version, which has pst == 0.
> 
> You need some other way to store whether you are in posted mode or
> not.

Yes, it seems this is incorrect. Thank you for pointing it out. After thinking
about it some more, I think I can do it this way:
#1. Check the 'pst' field in the hardware IRTE.
#2. If 'pst' is 1, don't update the IRTE in hardware.

However, the check and the update must be protected by the same spinlock
'irq_2_ir_lock', otherwise a race condition may happen.

Based on the above idea, I have two solutions. Which one do you think is
better, or do you have another suggestion? Your comments on them would be
highly appreciated!

Solution 1:
Introduce a new function, test_and_modify_irte(), which is called by
intel_ir_set_affinity() in place of the original modify_irte().
Here are the changes:

+static int test_and_modify_irte(struct irq_2_iommu *irq_iommu,
+                               struct irte *irte_modified)
+{
+       struct intel_iommu *iommu;
+       unsigned long flags;
+       struct irte *irte;
+       int rc = 0, index;
+
+       if (!irq_iommu)
+               return -1;
+
+       raw_spin_lock_irqsave(&irq_2_ir_lock, flags);
+
+       iommu = irq_iommu->iommu;
+
+       index = irq_iommu->irte_index + irq_iommu->sub_handle;
+       irte = &iommu->ir_table->base[index];
+
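+       /* Leave the hardware IRTE untouched while the entry is in posted mode. */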
+       if (irte->pst)
+               goto unlock;
+
+       set_64bit(&irte->low, irte_modified->low);
+       set_64bit(&irte->high, irte_modified->high);
+       __iommu_flush_cache(iommu, irte, sizeof(*irte));
+
+       rc = qi_flush_iec(iommu, index, 0);
+unlock:
+       raw_spin_unlock_irqrestore(&irq_2_ir_lock, flags);
+
+       return rc;
+}
+
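
For reference, the caller side in intel_ir_set_affinity() would then look
roughly like this (only a sketch based on the hunk quoted above, not a
complete patch):

        irte->vector = cfg->vector;
        irte->dest_id = IRTE_DEST(cfg->dest_apicid);

        /* Skips the hardware write when the IRTE is currently in posted mode. */
        test_and_modify_irte(&ir_data->irq_2_iommu, irte);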

Solution 2:
Instead of introducing a new function, add a flag to the original modify_irte()
to indicate whether we need to check 'pst' and return before updating the real
hardware, and pass true for return_on_pst in intel_ir_set_affinity().
Here are the changes:
static int modify_irte(struct irq_2_iommu *irq_iommu,
-                      struct irte *irte_modified)
+                      struct irte *irte_modified,
+                      bool return_on_pst)
 {
        struct intel_iommu *iommu;
        unsigned long flags;
@@ -140,11 +173,15 @@ static int modify_irte(struct irq_2_iommu *irq_iommu,
        index = irq_iommu->irte_index + irq_iommu->sub_handle;
        irte = &iommu->ir_table->base[index];

+       /* Nothing to do if the IRTE is in posted mode. */
+       if (return_on_pst && irte->pst) {
+               rc = 0;
+               goto unlock;
+       }
+
        set_64bit(&irte->low, irte_modified->low);
        set_64bit(&irte->high, irte_modified->high);
        __iommu_flush_cache(iommu, irte, sizeof(*irte));

        rc = qi_flush_iec(iommu, index, 0);
+unlock:
        raw_spin_unlock_irqrestore(&irq_2_ir_lock, flags);

        return rc;
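
And the caller in intel_ir_set_affinity() would become something like this
(again only a sketch; the other existing modify_irte() callers would pass
false so their behaviour is unchanged):

        irte->vector = cfg->vector;
        irte->dest_id = IRTE_DEST(cfg->dest_apicid);
        modify_irte(&ir_data->irq_2_iommu, irte, true);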

Thanks,
Feng

> 
> Thanks,
> 
>       tglx