Re: [RESEND PATCH V6 0/6] Add support for privileged mappings

2016-12-02 Thread Robin Murphy
Hi Sricharan,

On 02/12/16 14:55, Sricharan R wrote:
> This series is a resend of the V5 that Mitch sent some time back [2].
> All the patches are the same; I have just rebased them. Not sure why this
> series did not make it in last time. The last patch in the previous
> series does not apply now [3], so I just redid that. I also copied over
> the tags it had gathered last time.

Heh, I was assuming this would be down to me to pick up. Vinod did have
some complaints last time about the commit message on the PL330 patch -
I did get as far as rewriting that and reworking onto my SMMU
changes[1], I just hadn't got round to sending it, so it fell onto the
"after the next merge window" pile.

I'd give some review comments, but they'd essentially be a diff against
that branch :)

Robin.

[1]:http://www.linux-arm.org/git?p=linux-rm.git;a=shortlog;h=refs/heads/mh/dma-priv

> The following patch to the ARM SMMU driver:
> 
> commit d346180e70b91b3d5a1ae7e5603e65593d4622bc
> Author: Robin Murphy 
> Date:   Tue Jan 26 18:06:34 2016 +
> 
> iommu/arm-smmu: Treat all device transactions as unprivileged
> 
> started forcing all SMMU transactions to come through as "unprivileged".
> The rationale given was that:
> 
>   (1) There is no way in the IOMMU API to even request privileged
>   mappings.
> 
>   (2) It's difficult to implement a DMA mapper that correctly models the
>   ARM VMSAv8 behavior of unprivileged-writeable =>
>   privileged-execute-never.
> 
> This series rectifies (1) by introducing an IOMMU API for privileged
> mappings and implements it in io-pgtable-arm.
> 
> This series rectifies (2) by introducing a new dma attribute
> (DMA_ATTR_PRIVILEGED) for users of the DMA API that need privileged
> mappings which are inaccessible to lesser-privileged execution levels, and
> implements it in the arm64 IOMMU DMA mapper.  The one known user (pl330.c)
> is converted over to the new attribute.
> 
> Jordan and Jeremy can provide more info on the use case if needed, but the
> high level is that it's a security feature to prevent attacks such as [1].
> 
> [1] https://github.com/robclark/kilroy
> [2] https://lkml.org/lkml/2016/7/27/590
> [3] https://patchwork.kernel.org/patch/9250493/
> 
> Changelog:
> 
>  v5..v6
> - Rebased all the patches and redid 6/6, as it did not apply to
>   this code base.
> 
>  v4..v5
> 
> - Simplified patch 4/6 (suggested by Robin Murphy).
> 
>   v3..v4
> 
> - Rebased and reworked on linux next due to the dma attrs rework going
>   on over there.  Patches changed: 3/6, 4/6, and 5/6.
> 
>   v2..v3
> 
> - Incorporated feedback from Robin:
>   * Various comments and re-wordings.
>   * Use existing bit definitions for IOMMU_PRIV implementation
> in io-pgtable-arm.
>   * Renamed and redocumented dma_direction_to_prot.
>   * Don't worry about executability in new DMA attr.
> 
>   v1..v2
> 
> - Added a new DMA attribute to make executable privileged mappings
>   work, and use that in the pl330 driver (suggested by Will).
> 
> Jeremy Gebben (1):
>   iommu/io-pgtable-arm: add support for the IOMMU_PRIV flag
> 
> Mitchel Humpherys (4):
>   iommu: add IOMMU_PRIV attribute
>   common: DMA-mapping: add DMA_ATTR_PRIVILEGED attribute
>   arm64/dma-mapping: Implement DMA_ATTR_PRIVILEGED
>   dmaengine: pl330: Make sure microcode is privileged
> 
> Sricharan R (1):
>   iommu/arm-smmu: Set privileged attribute to 'default' instead of
> 'unprivileged'
> 
>  Documentation/DMA-attributes.txt | 10 ++
>  arch/arm64/mm/dma-mapping.c  |  6 +++---
>  drivers/dma/pl330.c  |  6 --
>  drivers/iommu/arm-smmu.c |  2 +-
>  drivers/iommu/dma-iommu.c| 10 --
>  drivers/iommu/io-pgtable-arm.c   |  5 -
>  include/linux/dma-iommu.h|  3 ++-
>  include/linux/dma-mapping.h  |  7 +++
>  include/linux/iommu.h|  1 +
>  9 files changed, 40 insertions(+), 10 deletions(-)
> 
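For illustration, here is a userspace sketch of how the new attribute plumbs through to the page tables: DMA_ATTR_PRIVILEGED on the DMA API side selects IOMMU_PRIV on the IOMMU API side. The helper name and the exact flow through the arm64 mapper are simplified stand-ins, though the flag values mirror the kernel headers of the time.

```c
#include <assert.h>

/* Bit values mirror include/linux/iommu.h and dma-mapping.h from this
 * series; treat them as illustrative. */
#define IOMMU_READ          (1 << 0)
#define IOMMU_WRITE         (1 << 1)
#define IOMMU_PRIV          (1 << 5)
#define DMA_ATTR_PRIVILEGED (1UL << 9)

enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
	DMA_NONE = 3,
};

/* Hypothetical stand-in for the DMA mapper's direction-to-prot helper,
 * extended to honour DMA_ATTR_PRIVILEGED. */
static int dma_dir_to_prot(enum dma_data_direction dir, unsigned long attrs)
{
	int prot = (attrs & DMA_ATTR_PRIVILEGED) ? IOMMU_PRIV : 0;

	switch (dir) {
	case DMA_BIDIRECTIONAL:
		return prot | IOMMU_READ | IOMMU_WRITE;
	case DMA_TO_DEVICE:
		return prot | IOMMU_READ;
	case DMA_FROM_DEVICE:
		return prot | IOMMU_WRITE;
	default:
		return 0;
	}
}
```

A caller like pl330 would then pass the attribute when allocating its microcode buffer, and every page of that buffer gets mapped privileged.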

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC v3 03/10] iommu: Add new reserved IOMMU attributes

2016-12-06 Thread Robin Murphy
On 15/11/16 13:09, Eric Auger wrote:
> IOMMU_RESV_NOMAP is used to tag reserved IOVAs that are not
> supposed to be IOMMU mapped. IOMMU_RESV_MSI tags IOVAs
> corresponding to MSIs that need to be IOMMU mapped.
> 
> IOMMU_RESV_MASK allows checking whether an IOVA is reserved.
> 
> Signed-off-by: Eric Auger 
> ---
>  include/linux/iommu.h | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 7f6ebd0..02cf565 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -32,6 +32,10 @@
>  #define IOMMU_NOEXEC (1 << 3)
>  #define IOMMU_MMIO   (1 << 4) /* e.g. things like MSI doorbells */
>  
> +#define IOMMU_RESV_MASK  0x300 /* Reserved IOVA mask */
> +#define IOMMU_RESV_NOMAP (1 << 8) /* IOVA that cannot be mapped */
> +#define IOMMU_RESV_MSI   (1 << 9) /* MSI region transparently mapped */

It feels a bit grotty encoding these in prot - NOMAP sort of fits, but
MSI really is something else entirely. On reflection I think a separate
iommu_resv_region::type field would be better, particularly as it's
something we might potentially want to expose via the sysfs entry.
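A minimal sketch of that suggestion — a dedicated type field on iommu_resv_region instead of extra prot bits. The enum names and struct layout here are illustrative, not the eventual API:

```c
#include <stddef.h>

/* Illustrative region descriptor with a separate type field, so prot
 * stays a pure permissions bitmap. */
enum iommu_resv_type {
	IOMMU_RESV_TYPE_DIRECT,	/* direct-mapped, needs an identity mapping */
	IOMMU_RESV_TYPE_NOMAP,	/* must not be IOMMU mapped at all */
	IOMMU_RESV_TYPE_MSI,	/* MSI window, transparently mapped */
};

struct iommu_resv_region {
	unsigned long start;
	size_t length;
	unsigned int prot;
	enum iommu_resv_type type;
};

/* Only direct-mapped regions get identity mappings created for them. */
static int iommu_resv_needs_mapping(const struct iommu_resv_region *r)
{
	return r->type == IOMMU_RESV_TYPE_DIRECT;
}
```

This also gives sysfs something human-meaningful to print per region.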

Robin.

> +
>  struct iommu_ops;
>  struct iommu_group;
>  struct bus_type;
> 



Re: [RFC v3 02/10] iommu: Rename iommu_dm_regions into iommu_resv_regions

2016-12-06 Thread Robin Murphy
On 15/11/16 13:09, Eric Auger wrote:
> We want to extend the callbacks used for dm regions and
> use them for reserved regions. Reserved regions can be
> - directly mapped regions
> - regions that cannot be iommu mapped (PCI host bridge windows, ...)
> - MSI regions (because they belong to another address space or because
>   they are not translated by the IOMMU and need special handling)
> 
> So let's rename the struct and also the callbacks.

Acked-by: Robin Murphy 

> Signed-off-by: Eric Auger 
> ---
>  drivers/iommu/amd_iommu.c | 20 ++--
>  drivers/iommu/iommu.c | 22 +++---
>  include/linux/iommu.h | 29 +++--
>  3 files changed, 36 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 754595e..a6c351d 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -3159,8 +3159,8 @@ static bool amd_iommu_capable(enum iommu_cap cap)
>   return false;
>  }
>  
> -static void amd_iommu_get_dm_regions(struct device *dev,
> -  struct list_head *head)
> +static void amd_iommu_get_resv_regions(struct device *dev,
> +struct list_head *head)
>  {
>   struct unity_map_entry *entry;
>   int devid;
> @@ -3170,7 +3170,7 @@ static void amd_iommu_get_dm_regions(struct device *dev,
>   return;
>  
>   list_for_each_entry(entry, &amd_iommu_unity_map, list) {
> - struct iommu_dm_region *region;
> + struct iommu_resv_region *region;
>  
>   if (devid < entry->devid_start || devid > entry->devid_end)
>   continue;
> @@ -3193,18 +3193,18 @@ static void amd_iommu_get_dm_regions(struct device 
> *dev,
>   }
>  }
>  
> -static void amd_iommu_put_dm_regions(struct device *dev,
> +static void amd_iommu_put_resv_regions(struct device *dev,
>struct list_head *head)
>  {
> - struct iommu_dm_region *entry, *next;
> + struct iommu_resv_region *entry, *next;
>  
>   list_for_each_entry_safe(entry, next, head, list)
>   kfree(entry);
>  }
>  
> -static void amd_iommu_apply_dm_region(struct device *dev,
> +static void amd_iommu_apply_resv_region(struct device *dev,
> struct iommu_domain *domain,
> -   struct iommu_dm_region *region)
> +   struct iommu_resv_region *region)
>  {
>   struct dma_ops_domain *dma_dom = to_dma_ops_domain(to_pdomain(domain));
>   unsigned long start, end;
> @@ -3228,9 +3228,9 @@ static void amd_iommu_apply_dm_region(struct device 
> *dev,
>   .add_device = amd_iommu_add_device,
>   .remove_device = amd_iommu_remove_device,
>   .device_group = amd_iommu_device_group,
> - .get_dm_regions = amd_iommu_get_dm_regions,
> - .put_dm_regions = amd_iommu_put_dm_regions,
> - .apply_dm_region = amd_iommu_apply_dm_region,
> + .get_resv_regions = amd_iommu_get_resv_regions,
> + .put_resv_regions = amd_iommu_put_resv_regions,
> + .apply_resv_region = amd_iommu_apply_resv_region,
>   .pgsize_bitmap  = AMD_IOMMU_PGSIZES,
>  };
>  
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 9a2f196..c7ed334 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -318,7 +318,7 @@ static int iommu_group_create_direct_mappings(struct 
> iommu_group *group,
> struct device *dev)
>  {
>   struct iommu_domain *domain = group->default_domain;
> - struct iommu_dm_region *entry;
> + struct iommu_resv_region *entry;
>   struct list_head mappings;
>   unsigned long pg_size;
>   int ret = 0;
> @@ -331,14 +331,14 @@ static int iommu_group_create_direct_mappings(struct 
> iommu_group *group,
>   pg_size = 1UL << __ffs(domain->pgsize_bitmap);
>   INIT_LIST_HEAD(&mappings);
>  
> - iommu_get_dm_regions(dev, &mappings);
> + iommu_get_resv_regions(dev, &mappings);
>  
>   /* We need to consider overlapping regions for different devices */
>   list_for_each_entry(entry, &mappings, list) {
>   dma_addr_t start, end, addr;
>  
> - if (domain->ops->apply_dm_region)
> - domain->ops->apply_dm_region(dev, domain, entry);
> + if (domain->ops->apply_resv_region)
> + domain->ops->apply_resv_region(dev, domain, entry);
>  
>   start = ALIGN(entry->start,

Re: [RFC v3 04/10] iommu: iommu_alloc_resv_region

2016-12-06 Thread Robin Murphy
On 15/11/16 13:09, Eric Auger wrote:
> Introduce a new helper serving the purpose to allocate a reserved
> region.  This will be used in iommu driver implementing reserved
> region callbacks.
> 
> Signed-off-by: Eric Auger 
> ---
>  drivers/iommu/iommu.c | 16 
>  include/linux/iommu.h |  8 
>  2 files changed, 24 insertions(+)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index c7ed334..6ee529f 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1562,6 +1562,22 @@ void iommu_put_resv_regions(struct device *dev, struct 
> list_head *list)
>   ops->put_resv_regions(dev, list);
>  }
>  
> +struct iommu_resv_region *iommu_alloc_resv_region(phys_addr_t start,
> +   size_t length,
> +   unsigned int prot)
> +{
> + struct iommu_resv_region *region;
> +
> + region = kzalloc(sizeof(*region), GFP_KERNEL);
> + if (!region)
> + return NULL;
> +
> + region->start = start;
> + region->length = length;
> + region->prot = prot;
> + return region;
> +}

I think you need an INIT_LIST_HEAD() in here as well, or
CONFIG_DEBUG_LIST might get unhappy about using an uninitialised head later.
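A userspace mock of the allocator with that initialisation added; list_head and the region layout are simplified stand-ins for the kernel types:

```c
#include <stdlib.h>

/* Minimal circular-list head, mimicking include/linux/list.h. */
struct list_head {
	struct list_head *next, *prev;
};

#define INIT_LIST_HEAD(h) do { (h)->next = (h); (h)->prev = (h); } while (0)

struct iommu_resv_region {
	struct list_head list;
	unsigned long start;
	size_t length;
	unsigned int prot;
};

static struct iommu_resv_region *
iommu_alloc_resv_region(unsigned long start, size_t length, unsigned int prot)
{
	struct iommu_resv_region *region = calloc(1, sizeof(*region));

	if (!region)
		return NULL;

	/* Self-pointing head: keeps CONFIG_DEBUG_LIST-style sanity checks
	 * happy if the region is touched before being added to a list. */
	INIT_LIST_HEAD(&region->list);
	region->start = start;
	region->length = length;
	region->prot = prot;
	return region;
}
```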

Robin.

> +
>  /* Request that a device is direct mapped by the IOMMU */
>  int iommu_request_dm_for_dev(struct device *dev)
>  {
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 02cf565..0aea877 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -241,6 +241,8 @@ extern void iommu_set_fault_handler(struct iommu_domain 
> *domain,
>  extern void iommu_get_resv_regions(struct device *dev, struct list_head 
> *list);
>  extern void iommu_put_resv_regions(struct device *dev, struct list_head 
> *list);
>  extern int iommu_request_dm_for_dev(struct device *dev);
> +extern struct iommu_resv_region *
> +iommu_alloc_resv_region(phys_addr_t start, size_t length, unsigned int prot);
>  
>  extern int iommu_attach_group(struct iommu_domain *domain,
> struct iommu_group *group);
> @@ -454,6 +456,12 @@ static inline void iommu_put_resv_regions(struct device 
> *dev,
>  {
>  }
>  
> +static inline struct iommu_resv_region *
> +iommu_alloc_resv_region(phys_addr_t start, size_t length, unsigned int prot)
> +{
> + return NULL;
> +}
> +
>  static inline int iommu_request_dm_for_dev(struct device *dev)
>  {
>   return -ENODEV;
> 



Re: [RFC v3 05/10] iommu: Do not map reserved regions

2016-12-06 Thread Robin Murphy
On 15/11/16 13:09, Eric Auger wrote:
> As we introduced IOMMU_RESV_NOMAP and IOMMU_RESV_MSI regions,
> let's prevent those new regions from being mapped.
> 
> Signed-off-by: Eric Auger 
> ---
>  drivers/iommu/iommu.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 6ee529f..a4530ad 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -343,6 +343,9 @@ static int iommu_group_create_direct_mappings(struct 
> iommu_group *group,
>   start = ALIGN(entry->start, pg_size);
>   end   = ALIGN(entry->start + entry->length, pg_size);
>  
> + if (entry->prot & IOMMU_RESV_MASK)

This seems to be the only place that this mask is used, and frankly I
think it's less clear than simply "(IOMMU_RESV_NOMAP | IOMMU_RESV_MSI)"
would be, at which point we may as well drop the mask and special value
trickery altogether. Plus, per my previous comment, if it were to be "if
(entry->type != )" instead, that's about as obvious
as it can get.
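For the record, the mask really is just shorthand for the two flags — a trivial standalone check, with values copied from the patch:

```c
/* Values from the patch under review. */
#define IOMMU_RESV_NOMAP (1 << 8)
#define IOMMU_RESV_MSI   (1 << 9)
#define IOMMU_RESV_MASK  0x300	/* == IOMMU_RESV_NOMAP | IOMMU_RESV_MSI */

static int is_resv_by_mask(unsigned int prot)
{
	return !!(prot & IOMMU_RESV_MASK);
}

static int is_resv_by_flags(unsigned int prot)
{
	return !!(prot & (IOMMU_RESV_NOMAP | IOMMU_RESV_MSI));
}
```

Since the two forms are identical for every prot value, the explicit-flags spelling (or better, a type field) loses nothing.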

Robin.

> + continue;
> +
>   for (addr = start; addr < end; addr += pg_size) {
>   phys_addr_t phys_addr;
>  
> 



Re: [RFC v3 06/10] iommu: iommu_get_group_resv_regions

2016-12-06 Thread Robin Murphy
On 15/11/16 13:09, Eric Auger wrote:
> Introduce iommu_get_group_resv_regions, whose role is to enumerate
> all devices of the group and collect their reserved regions, checking
> for duplicates.
> 
> Signed-off-by: Eric Auger 
> 
> ---
> 
> - we do not move list elements from the device list to the group list,
>   since iommu_put_resv_regions() could then no longer be called.
> - at the moment I did not introduce any iommu_put_group_resv_regions,
>   since it would simply consist of freeing the list
> ---
>  drivers/iommu/iommu.c | 53 
> +++
>  include/linux/iommu.h |  8 
>  2 files changed, 61 insertions(+)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index a4530ad..e0fbcc5 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -133,6 +133,59 @@ static ssize_t iommu_group_show_name(struct iommu_group 
> *group, char *buf)
>   return sprintf(buf, "%s\n", group->name);
>  }
>  
> +static bool iommu_resv_region_present(struct iommu_resv_region *region,
> +   struct list_head *head)
> +{
> + struct iommu_resv_region *entry;
> +
> + list_for_each_entry(entry, head, list) {
> + if ((region->start == entry->start) &&
> + (region->length == entry->length) &&
> + (region->prot == entry->prot))
> + return true;
> + }
> + return false;
> +}
> +
> +static int
> +iommu_insert_device_resv_regions(struct list_head *dev_resv_regions,
> +  struct list_head *group_resv_regions)
> +{
> + struct iommu_resv_region *entry, *region;
> +
> + list_for_each_entry(entry, dev_resv_regions, list) {
> + if (iommu_resv_region_present(entry, group_resv_regions))
> + continue;

In the case of overlapping regions which _aren't_ an exact match, would
it be better to expand the existing one rather than leave the caller to
sort it out? It seems a bit inconsistent to handle only the one case here.

> + region = iommu_alloc_resv_region(entry->start, entry->length,
> +entry->prot);
> + if (!region)
> + return -ENOMEM;
> +
> + list_add_tail(®ion->list, group_resv_regions);
> + }
> + return 0;
> +}
> +
> +int iommu_get_group_resv_regions(struct iommu_group *group,
> +  struct list_head *head)
> +{
> + struct iommu_device *device;
> + int ret = 0;
> +
> + list_for_each_entry(device, &group->devices, list) {

Should we not be taking the group mutex around this?

Robin.

> + struct list_head dev_resv_regions;
> +
> + INIT_LIST_HEAD(&dev_resv_regions);
> + iommu_get_resv_regions(device->dev, &dev_resv_regions);
> + ret = iommu_insert_device_resv_regions(&dev_resv_regions, head);
> + iommu_put_resv_regions(device->dev, &dev_resv_regions);
> + if (ret)
> + break;
> + }
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_get_group_resv_regions);
> +
>  static IOMMU_GROUP_ATTR(name, S_IRUGO, iommu_group_show_name, NULL);
>  
>  static void iommu_group_release(struct kobject *kobj)
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 0aea877..0f7ae2c 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -243,6 +243,8 @@ extern void iommu_set_fault_handler(struct iommu_domain 
> *domain,
>  extern int iommu_request_dm_for_dev(struct device *dev);
>  extern struct iommu_resv_region *
>  iommu_alloc_resv_region(phys_addr_t start, size_t length, unsigned int prot);
> +extern int iommu_get_group_resv_regions(struct iommu_group *group,
> + struct list_head *head);
>  
>  extern int iommu_attach_group(struct iommu_domain *domain,
> struct iommu_group *group);
> @@ -462,6 +464,12 @@ static inline void iommu_put_resv_regions(struct device 
> *dev,
>   return NULL;
>  }
>  
> +static inline int iommu_get_group_resv_regions(struct iommu_group *group,
> +struct list_head *head)
> +{
> + return -ENODEV;
> +}
> +
>  static inline int iommu_request_dm_for_dev(struct device *dev)
>  {
>   return -ENODEV;
> 



Re: [RFC v3 09/10] iommu/arm-smmu: Implement reserved region get/put callbacks

2016-12-06 Thread Robin Murphy
On 15/11/16 13:09, Eric Auger wrote:
> The get() populates the list with the PCI host bridge windows
> and the MSI IOVA range.
> 
> At the moment an arbitrary MSI IOVA window is set at 0x8000000,
> 1MB in size. This will allow those regions to be reported in
> iommu-group sysfs.
> 
> Signed-off-by: Eric Auger 
> 
> ---
> 
> RFC v2 -> v3:
> - use existing get/put_resv_regions
> 
> RFC v1 -> v2:
> - use defines for MSI IOVA base and length
> ---
>  drivers/iommu/arm-smmu.c | 52 
> 
>  1 file changed, 52 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 8f72814..81f1a83 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -278,6 +278,9 @@ enum arm_smmu_s2cr_privcfg {
>  
>  #define FSYNR0_WNR   (1 << 4)
>  
> +#define MSI_IOVA_BASE		0x8000000
> +#define MSI_IOVA_LENGTH		0x100000
> +
>  static int force_stage;
>  module_param(force_stage, int, S_IRUGO);
>  MODULE_PARM_DESC(force_stage,
> @@ -1545,6 +1548,53 @@ static int arm_smmu_of_xlate(struct device *dev, 
> struct of_phandle_args *args)
>   return iommu_fwspec_add_ids(dev, &fwid, 1);
>  }
>  
> +static void arm_smmu_get_resv_regions(struct device *dev,
> +   struct list_head *head)
> +{
> + struct iommu_resv_region *region;
> + struct pci_host_bridge *bridge;
> + struct resource_entry *window;
> +
> + /* MSI region */
> + region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
> +  IOMMU_RESV_MSI);
> + if (!region)
> + return;
> +
> + list_add_tail(®ion->list, head);
> +
> + if (!dev_is_pci(dev))
> + return;
> +
> + bridge = pci_find_host_bridge(to_pci_dev(dev)->bus);
> +
> + resource_list_for_each_entry(window, &bridge->windows) {
> + phys_addr_t start;
> + size_t length;
> +
> + if (resource_type(window->res) != IORESOURCE_MEM &&
> + resource_type(window->res) != IORESOURCE_IO)

As Joerg commented elsewhere, considering anything other than memory
resources isn't right (I appreciate you've merely copied my own mistake
here). We need some other way to handle root complexes where the CPU
MMIO views of PCI windows appear in PCI memory space - using the I/O
address of I/O resources only works by chance on Juno, and it still
doesn't account for config space. I suggest we just leave that out for
the time being to make life easier (does it even apply to anything other
than Juno?) and figure it out later.
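A sketch of the filtering being suggested — only CPU memory windows give IOVAs the device must avoid. The flag values mirror include/linux/ioport.h; the helper names are simplified:

```c
/* Resource-type constants as in include/linux/ioport.h. */
#define IORESOURCE_TYPE_BITS	0x00001f00
#define IORESOURCE_IO		0x00000100
#define IORESOURCE_MEM		0x00000200

struct resource {
	unsigned long flags;
};

static unsigned long resource_type(const struct resource *res)
{
	return res->flags & IORESOURCE_TYPE_BITS;
}

static int should_reserve_window(const struct resource *res)
{
	/* I/O space and config space need separate handling; considering
	 * only memory windows avoids the Juno-style accident. */
	return resource_type(res) == IORESOURCE_MEM;
}
```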

> + continue;
> +
> + start = window->res->start - window->offset;
> + length = window->res->end - window->res->start + 1;
> + region = iommu_alloc_resv_region(start, length,
> +  IOMMU_RESV_NOMAP);
> + if (!region)
> + return;
> + list_add_tail(®ion->list, head);
> + }
> +}

Either way, there's nothing SMMU-specific about PCI windows. The fact
that we'd have to copy-paste all of this into the SMMUv3 driver
unchanged suggests it should go somewhere common (although I would be
inclined to leave the insertion of the fake MSI region to driver-private
wrappers). As I said before, the current iova_reserve_pci_windows()
simply wants splitting into appropriate public callbacks for
get_resv_regions and apply_resv_regions.

Robin.

> +static void arm_smmu_put_resv_regions(struct device *dev,
> +   struct list_head *head)
> +{
> + struct iommu_resv_region *entry, *next;
> +
> + list_for_each_entry_safe(entry, next, head, list)
> + kfree(entry);
> +}
> +
>  static struct iommu_ops arm_smmu_ops = {
>   .capable= arm_smmu_capable,
>   .domain_alloc   = arm_smmu_domain_alloc,
> @@ -1560,6 +1610,8 @@ static int arm_smmu_of_xlate(struct device *dev, struct 
> of_phandle_args *args)
>   .domain_get_attr= arm_smmu_domain_get_attr,
>   .domain_set_attr= arm_smmu_domain_set_attr,
>   .of_xlate   = arm_smmu_of_xlate,
> + .get_resv_regions   = arm_smmu_get_resv_regions,
> + .put_resv_regions   = arm_smmu_put_resv_regions,
>   .pgsize_bitmap  = -1UL, /* Restricted during device attach */
>  };
>  
> 



Re: [RESEND PATCH V6 0/6] Add support for privileged mappings

2016-12-06 Thread Robin Murphy
On 04/12/16 07:48, Sricharan wrote:
> Hi Robin,
> 
>> Hi Sricharan,
>>
>> On 02/12/16 14:55, Sricharan R wrote:
>>> This series is a resend of the V5 that Mitch sent some time back [2].
>>> All the patches are the same; I have just rebased them. Not sure why this
>>> series did not make it in last time. The last patch in the previous
>>> series does not apply now [3], so I just redid that. I also copied over
>>> the tags it had gathered last time.
>>
>> Heh, I was assuming this would be down to me to pick up. Vinod did have
>> some complaints last time about the commit message on the PL330 patch -
>> I did get as far as rewriting that and reworking onto my SMMU
>> changes[1], I just hadn't got round to sending it, so it fell onto the
>> "after the next merge window" pile.
>>
>> I'd give some review comments, but they'd essentially be a diff against
>> that branch :)
>>
> 
> Sure, I did not know that you were already on this. I can repost with
> the diff from your branch folded in, or wait for you. I am fine with
> whichever you suggest.
> 
> I checked the patches against your branch, and I see that the changes are:
> 
> 1) One patch implementing it for the ARMv7 short-descriptor format
> 2) Changes to the pl330 patch commit log, and
> 3) One patch doing the revert on SMMUv3 as well.

If you want to pick up my short-descriptor and SMMUv3 patches and run
with them you're more than welcome - the rest is just cosmetic stuff
which doesn't really matter, especially as it's picking up acks as-is.

Robin.

> Regards,
>  Sricharan
> 
> 



Re: [RFC v3 09/10] iommu/arm-smmu: Implement reserved region get/put callbacks

2016-12-07 Thread Robin Murphy
On 07/12/16 15:02, Auger Eric wrote:
> Hi Robin,
> On 06/12/2016 19:55, Robin Murphy wrote:
>> On 15/11/16 13:09, Eric Auger wrote:
>>> The get() populates the list with the PCI host bridge windows
>>> and the MSI IOVA range.
>>>
>>> At the moment an arbitrary MSI IOVA window is set at 0x8000000,
>>> 1MB in size. This will allow those regions to be reported in
>>> iommu-group sysfs.
> 
> 
> First thank you for reviewing the series. This is definitively helpful!
>>>
>>> Signed-off-by: Eric Auger 
>>>
>>> ---
>>>
>>> RFC v2 -> v3:
>>> - use existing get/put_resv_regions
>>>
>>> RFC v1 -> v2:
>>> - use defines for MSI IOVA base and length
>>> ---
>>>  drivers/iommu/arm-smmu.c | 52 
>>> 
>>>  1 file changed, 52 insertions(+)
>>>
>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>> index 8f72814..81f1a83 100644
>>> --- a/drivers/iommu/arm-smmu.c
>>> +++ b/drivers/iommu/arm-smmu.c
>>> @@ -278,6 +278,9 @@ enum arm_smmu_s2cr_privcfg {
>>>  
>>>  #define FSYNR0_WNR (1 << 4)
>>>  
>>> +#define MSI_IOVA_BASE		0x8000000
>>> +#define MSI_IOVA_LENGTH		0x100000
>>> +
>>>  static int force_stage;
>>>  module_param(force_stage, int, S_IRUGO);
>>>  MODULE_PARM_DESC(force_stage,
>>> @@ -1545,6 +1548,53 @@ static int arm_smmu_of_xlate(struct device *dev, 
>>> struct of_phandle_args *args)
>>> return iommu_fwspec_add_ids(dev, &fwid, 1);
>>>  }
>>>  
>>> +static void arm_smmu_get_resv_regions(struct device *dev,
>>> + struct list_head *head)
>>> +{
>>> +   struct iommu_resv_region *region;
>>> +   struct pci_host_bridge *bridge;
>>> +   struct resource_entry *window;
>>> +
>>> +   /* MSI region */
>>> +   region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
>>> +IOMMU_RESV_MSI);
>>> +   if (!region)
>>> +   return;
>>> +
>>> +   list_add_tail(®ion->list, head);
>>> +
>>> +   if (!dev_is_pci(dev))
>>> +   return;
>>> +
>>> +   bridge = pci_find_host_bridge(to_pci_dev(dev)->bus);
>>> +
>>> +   resource_list_for_each_entry(window, &bridge->windows) {
>>> +   phys_addr_t start;
>>> +   size_t length;
>>> +
>>> +   if (resource_type(window->res) != IORESOURCE_MEM &&
>>> +   resource_type(window->res) != IORESOURCE_IO)
>>
>> As Joerg commented elsewhere, considering anything other than memory
>> resources isn't right (I appreciate you've merely copied my own mistake
>> here). We need some other way to handle root complexes where the CPU
>> MMIO views of PCI windows appear in PCI memory space - using the I/O
>> address of I/O resources only works by chance on Juno, and it still
>> doesn't account for config space. I suggest we just leave that out for
>> the time being to make life easier (does it even apply to anything other
>> than Juno?) and figure it out later.
> OK, so I understand I should remove the IORESOURCE_IO check.
>>
>>> +   continue;
>>> +
>>> +   start = window->res->start - window->offset;
>>> +   length = window->res->end - window->res->start + 1;
>>> +   region = iommu_alloc_resv_region(start, length,
>>> +IOMMU_RESV_NOMAP);
>>> +   if (!region)
>>> +   return;
>>> +   list_add_tail(®ion->list, head);
>>> +   }
>>> +}
>>
>> Either way, there's nothing SMMU-specific about PCI windows. The fact
>> that we'd have to copy-paste all of this into the SMMUv3 driver
>> unchanged suggests it should go somewhere common (although I would be
>> inclined to leave the insertion of the fake MSI region to driver-private
>> wrappers). As I said before, the current iova_reserve_pci_windows()
>> simply wants splitting into appropriate public callbacks for
>> get_resv_regions and apply_resv_regions.
> Do you mean somewhere common in the arm-smmu subsystem (new file) or in
> another subsystem (pci?)
> 
> More generally the current implementation does not handle the case w

Re: [RFC v3 00/10] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions

2016-12-08 Thread Robin Murphy
On 08/12/16 09:36, Auger Eric wrote:
> Hi,
> 
> On 15/11/2016 14:09, Eric Auger wrote:
>> Following LPC discussions, we now report reserved regions through
>> iommu-group sysfs reserved_regions attribute file.
> 
> 
> While I am respinning this series into v4, here is a tentative summary
> of technical topics for which no consensus was reached at this point.
> 
> 1) Shall we report the usable IOVA range instead of reserved IOVA
>ranges. Not discussed at/after LPC.
>x I currently report reserved regions. Alex expressed the need to
>  report the full usable IOVA range instead (x86 min-max range
>  minus MSI APIC window). I think this is meaningful for ARM
>  too where arm-smmu might not support the full 64b range.
>x Any objections to reporting the usable IOVA regions instead?

The issue with that is that we can't actually report "the usable
regions" at the moment, as that involves pulling together disjoint
properties of arbitrary hardware unrelated to the IOMMU. We'd be
reporting "the not-definitely-unusable regions, which may have some
unusable holes in them still". That seems like an ABI nightmare - I'd
still much rather say "here are some, but not necessarily all, regions
you definitely can't use", because saying "here are some regions which
you might be able to use most of, probably" is what we're already doing
today, via a single implicit region from 0 to ULONG_MAX ;)

The address space limits are definitely useful to know, but I think it
would be better to expose them separately to avoid the ambiguity. At
worst, I guess it would be reasonable to express the limits via an
"out-of-range" reserved region type for 0 to $base and $top to
ULONG-MAX. To *safely* expose usable regions, we'd have to start out
with a very conservative assumption (e.g. only IOVAs matching physical
RAM), and only expand them once we're sure we can detect every possible
bit of problematic hardware in the system - that's just too limiting to
be useful. And if we expose something knowingly inaccurate, we risk
having another "bogoMIPS in /proc/cpuinfo" ABI burden on our hands, and
nobody wants that...
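To be clear, computing the complement of a reserved list is the easy part — a few lines, as sketched below; the problem is that the result still only excludes the holes we know about:

```c
#include <stddef.h>

struct range {
	unsigned long long start, end;	/* end is exclusive */
};

/* Gaps between sorted, non-overlapping reserved ranges within
 * [base, top).  The output is exactly the "not-definitely-unusable
 * regions" - it cannot account for holes the IOMMU layer never sees. */
static size_t usable_from_reserved(const struct range *resv, size_t n,
				   unsigned long long base,
				   unsigned long long top,
				   struct range *out)
{
	size_t m = 0;
	unsigned long long cur = base;

	for (size_t i = 0; i < n; i++) {
		if (resv[i].start > cur)
			out[m++] = (struct range){ cur, resv[i].start };
		if (resv[i].end > cur)
			cur = resv[i].end;
	}
	if (cur < top)
		out[m++] = (struct range){ cur, top };
	return m;
}
```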

> 2) Shall the kernel check collision with MSI window* when userspace
>calls VFIO_IOMMU_MAP_DMA?
>Joerg/Will No; Alex yes
>*for IOVA regions consumed downstream to the IOMMU: everyone says NO

If we're starting off by having the SMMU drivers expose it as a fake
fixed region, I don't think we need to worry about this yet. We all seem
to agree that as long as we communicate the fixed regions to userspace,
it's then userspace's job to work around them. Let's come back to this
one once we actually get to the point of dynamically sizing and
allocating 'real' MSI remapping region(s).

Ultimately, the kernel *will* police collisions either way, because an
underlying iommu_map() is going to fail if overlapping IOVAs are ever
actually used, so it's really just a question of whether to have a more
user-friendly failure mode.

> 3) RMRR reporting in the iommu group sysfs? Joerg: yes; Don: no
>My current series does not expose them in iommu group sysfs.
>I understand we can expose the RMRR regions in the iommu group sysfs
>without necessarily supporting RMRR requiring device assignment.
>We can also add this support later.

As you say, reporting them doesn't necessitate allowing device
assignment, and it's information which can already be easily grovelled
out of dmesg (for intel-iommu at least) - there doesn't seem to be any
need to hide them, but the x86 folks can have the final word on that.

Robin.

> Thanks
> 
> Eric
> 
> 
>>
>> Reserved regions are populated through the IOMMU get_resv_region callback
>> (former get_dm_regions), now implemented by amd-iommu, intel-iommu and
>> arm-smmu.
>>
>> The intel-iommu reports the [FEE0_h - FEF0_000h] MSI window as an
>> IOMMU_RESV_NOMAP reserved region.
>>
>> arm-smmu reports the MSI window (arbitrarily located at 0x8000000 and
>> 1MB large) and the PCI host bridge windows.
>>
>> The series integrates a not officially posted patch from Robin:
>> "iommu/dma: Allow MSI-only cookies".
>>
>> This series currently does not address IRQ safety assessment.
>>
>> Best Regards
>>
>> Eric
>>
>> Git: complete series available at
>> https://github.com/eauger/linux/tree/v4.9-rc5-reserved-rfc-v3
>>
>> History:
>> RFC v2 -> v3:
>> - switch to an iommu-group sysfs API
>> - use new dummy allocator provided by Robin
>> - dummy allocator initialized by vfio-iommu-type1 after enumerating
>>   the reserved regions
>> - at the moment ARM MSI base address/size is left unchanged compared
>>   to v2
>> - we currently report reserved regions and not usable IOVA regions as
>>   requested by Alex
>>
>> RFC v1 -> v2:
>> - fix intel_add_reserved_regions
>> - add mutex lock/unlock in vfio_iommu_type1
>>
>>
>> Eric Auger (10):
>>   iommu/dma: Allow MSI-only cookies
>>   iommu: Rename iommu_dm_regions into iommu_resv_regions
>>   iommu: Add new reserved IOMMU attributes
>>   iommu: iomm

Re: [RFC v3 00/10] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions

2016-12-08 Thread Robin Murphy
On 08/12/16 13:36, Auger Eric wrote:
> Hi Robin,
> 
> On 08/12/2016 14:14, Robin Murphy wrote:
>> On 08/12/16 09:36, Auger Eric wrote:
>>> Hi,
>>>
>>> On 15/11/2016 14:09, Eric Auger wrote:
>>>> Following LPC discussions, we now report reserved regions through
>>>> iommu-group sysfs reserved_regions attribute file.
>>>
>>>
>>> While I am respinning this series into v4, here is a tentative summary
>>> of technical topics for which no consensus was reached at this point.
>>>
>>> 1) Shall we report the usable IOVA range instead of reserved IOVA
>>>ranges. Not discussed at/after LPC.
>>>x I currently report reserved regions. Alex expressed the need to
>>>  report the full usable IOVA range instead (x86 min-max range
>>>  minus MSI APIC window). I think this is meaningful for ARM
>>>  too where arm-smmu might not support the full 64b range.
>>>x Any objection we report the usable IOVA regions instead?
>>
>> The issue with that is that we can't actually report "the usable
>> regions" at the moment, as that involves pulling together disjoint
>> properties of arbitrary hardware unrelated to the IOMMU. We'd be
>> reporting "the not-definitely-unusable regions, which may have some
>> unusable holes in them still". That seems like an ABI nightmare - I'd
>> still much rather say "here are some, but not necessarily all, regions
>> you definitely can't use", because saying "here are some regions which
>> you might be able to use most of, probably" is what we're already doing
>> today, via a single implicit region from 0 to ULONG_MAX ;)
>>
>> The address space limits are definitely useful to know, but I think it
>> would be better to expose them separately to avoid the ambiguity. At
>> worst, I guess it would be reasonable to express the limits via an
>> "out-of-range" reserved region type for 0 to $base and $top to
>> ULONG-MAX. To *safely* expose usable regions, we'd have to start out
>> with a very conservative assumption (e.g. only IOVAs matching physical
>> RAM), and only expand them once we're sure we can detect every possible
>> bit of problematic hardware in the system - that's just too limiting to
>> be useful. And if we expose something knowingly inaccurate, we risk
>> having another "bogoMIPS in /proc/cpuinfo" ABI burden on our hands, and
>> nobody wants that...
> Makes sense to me. "out-of-range reserved region type for 0 to $base and
> $top to ULONG-MAX" can be an alternative to fulfill the requirement.
>>
>>> 2) Shall the kernel check collision with MSI window* when userspace
>>>calls VFIO_IOMMU_MAP_DMA?
>>>Joerg/Will No; Alex yes
>>>*for IOVA regions consumed downstream to the IOMMU: everyone says NO
>>
>> If we're starting off by having the SMMU drivers expose it as a fake
>> fixed region, I don't think we need to worry about this yet. We all seem
>> to agree that as long as we communicate the fixed regions to userspace,
>> it's then userspace's job to work around them. Let's come back to this
>> one once we actually get to the point of dynamically sizing and
>> allocating 'real' MSI remapping region(s).
>>
>> Ultimately, the kernel *will* police collisions either way, because an
>> underlying iommu_map() is going to fail if overlapping IOVAs are ever
>> actually used, so it's really just a question of whether to have a more
>> user-friendly failure mode.
> That's true on ARM but not on x86 where the APIC MSI region is not
> mapped I think.

Yes, but the APIC interrupt region is fixed, i.e. it falls under "IOVA
regions consumed downstream to the IOMMU" since the DMAR units are
physically incapable of remapping those addresses. I take "MSI window"
to mean specifically the thing we have to set aside and get a magic
remapping cookie for, which is also why it warrants its own special
internal type - we definitely *don't* want VFIO trying to set up a
remapping cookie on x86. We just don't let userspace know about the
difference, yet (if ever).

Robin.

>>> 3) RMRR reporting in the iommu group sysfs? Joerg: yes; Don: no
>>>My current series does not expose them in iommu group sysfs.
>>>I understand we can expose the RMRR regions in the iommu group sysfs
>>>without necessarily supporting RMRR requiring device assignment.
>>>We can also add this support later.
>>
>> As you say, reporting th

Re: [RFC v3 00/10] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions

2016-12-08 Thread Robin Murphy
On 08/12/16 17:01, Alex Williamson wrote:
> On Thu, 8 Dec 2016 13:14:04 +
> Robin Murphy  wrote:
>> On 08/12/16 09:36, Auger Eric wrote:
>>> 3) RMRR reporting in the iommu group sysfs? Joerg: yes; Don: no
>>>My current series does not expose them in iommu group sysfs.
>>>I understand we can expose the RMRR regions in the iommu group sysfs
>>>without necessarily supporting RMRR requiring device assignment.
>>>We can also add this support later.  
>>
>> As you say, reporting them doesn't necessitate allowing device
>> assignment, and it's information which can already be easily grovelled
>> out of dmesg (for intel-iommu at least) - there doesn't seem to be any
>> need to hide them, but the x86 folks can have the final word on that.
> 
> Eric and I talked about this and I don't see the value in identifying
> an RMRR as anything other than a reserved range for a device.  It's not
> userspace's job to maintain an identity-mapped range for the device,
> and it can't be trusted to do so anyway.  It does throw a kink in the
> machinery though as an RMRR is a reserved memory range unique to a
> device.  It doesn't really fit into a monolithic /sys/class/iommu view
> of global reserved regions as an RMRR is only relevant to the device
> paths affected.

I think we're in violent agreement then - to clarify, I was thinking in
terms of patch 7 of this series, where everything relevant to a
particular group would be exposed as just an opaque "don't use this
address range" regardless of the internal type.

I'm less convinced the kernel has any need to provide its own 'global'
view of reservations which strictly are always at least per-IOMMU, if
not per-root-complex, even when all the instances do share the same
address by design. The group-based interface fits the reality neatly,
and userspace can easily iterate all the groups if it wants to consider
everything. Plus if it doesn't want to, then it needn't bother reserving
anything which doesn't apply to the group(s) it's going to bind to VFIO.

Robin.

> Another kink is that sometimes we know what the RMRR is for, know that
> it's irrelevant for our use case, and ignore it.  This is true for USB
> and Intel graphics use cases of RMRRs.
> 
> Also, aside from the above mentioned cases, devices with RMRRs are
> currently excluded from participating in the IOMMU API by the
> intel-iommu driver and I expect this to continue in the general case
> regardless of whether the ranges are more easily exposed to userspace.
> ARM may have to deal with mangling a guest memory map due to lack of
> any standard layout, de facto or otherwise, but for x86 I don't think
> it's worth the migration and hotplug implications.  Thanks,
> 
> Alex
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH V7 7/8] iommu/arm-smmu: Set privileged attribute to 'default' instead of 'unprivileged'

2016-12-13 Thread Robin Murphy
On 12/12/16 18:38, Sricharan R wrote:
> Currently the driver sets all the device transactions privileges
> to UNPRIVILEGED, but there are cases where the iommu masters wants
> to isolate privileged supervisor and unprivileged user.
> So don't override the privileged setting to unprivileged, instead
> set it to default as incoming and let it be controlled by the pagetable
> settings.
> 
> Acked-by: Will Deacon 
> Signed-off-by: Sricharan R 

Since everything else has already got my tags on it:

Reviewed-by: Robin Murphy 

I'd say the whole series looks good to go now, thanks for picking it up.

Robin.

> ---
>  drivers/iommu/arm-smmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index eaa8f44..8bb0eea 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -1213,7 +1213,7 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
>   continue;
>  
>   s2cr[idx].type = type;
> - s2cr[idx].privcfg = S2CR_PRIVCFG_UNPRIV;
> + s2cr[idx].privcfg = S2CR_PRIVCFG_DEFAULT;
>   s2cr[idx].cbndx = cbndx;
>   arm_smmu_write_s2cr(smmu, idx);
>   }
> 



Re: [PATCH V7 4/8] common: DMA-mapping: add DMA_ATTR_PRIVILEGED attribute

2016-12-13 Thread Robin Murphy
On 12/12/16 18:38, Sricharan R wrote:
> From: Mitchel Humpherys 
> 
> This patch adds the DMA_ATTR_PRIVILEGED attribute to the DMA-mapping
> subsystem.
> 
> Some advanced peripherals such as remote processors and GPUs perform
> accesses to DMA buffers in both privileged "supervisor" and unprivileged
> "user" modes.  This attribute is used to indicate to the DMA-mapping
> subsystem that the buffer is fully accessible at the elevated privilege
> level (and ideally inaccessible or at least read-only at the
> lesser-privileged levels).
> 
> Cc: linux-...@vger.kernel.org
> Reviewed-by: Robin Murphy 
> Tested-by: Robin Murphy 
> Acked-by: Will Deacon 
> Signed-off-by: Mitchel Humpherys 
> ---
>  Documentation/DMA-attributes.txt | 10 ++
>  include/linux/dma-mapping.h  |  7 +++
>  2 files changed, 17 insertions(+)
> 
> diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt
> index 98bf7ac..44c6bc4 100644
> --- a/Documentation/DMA-attributes.txt
> +++ b/Documentation/DMA-attributes.txt
> @@ -143,3 +143,13 @@ So, this provides a way for drivers to avoid those error messages on calls
>  where allocation failures are not a problem, and shouldn't bother the logs.
>  
>  NOTE: At the moment DMA_ATTR_NO_WARN is only implemented on PowerPC.
> +
> +DMA_ATTR_PRIVILEGED
> +--
> +
> +Some advanced peripherals such as remote processors and GPUs perform
> +accesses to DMA buffers in both privileged "supervisor" and unprivileged
> +"user" modes.  This attribute is used to indicate to the DMA-mapping
> +subsystem that the buffer is fully accessible at the elevated privilege
> +level (and ideally inaccessible or at least read-only at the
> +lesser-privileged levels).
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index 6f3e6ca..ee31ea1 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -63,6 +63,13 @@
>  #define DMA_ATTR_NO_WARN (1UL << 8)
>  
>  /*
> + * DMA_ATTR_PRIVILEGED: used to indicate that the buffer is fully
> + * accessible at an elevated privilege level (and ideally inaccessible or
> + * at least read-only at lesser-privileged levels).
> + */
> +#define DMA_ATTR_PRIVILEGED  (1UL << 8)

Oops, I spoke slightly too soon - there's a value conflict here which
has been missed in the rebase.

Robin

> +
> +/*
>   * A dma_addr_t can hold any valid DMA or bus address for the platform.
>   * It can be given to a device to use as a DMA source or target.  A CPU cannot
>   * reference a dma_addr_t directly because there may be translation between
> 



Re: [PATCH V7 5/8] arm64/dma-mapping: Implement DMA_ATTR_PRIVILEGED

2016-12-13 Thread Robin Murphy
On 12/12/16 18:38, Sricharan R wrote:
> From: Mitchel Humpherys 
> 
> The newly added DMA_ATTR_PRIVILEGED is useful for creating mappings that
> are only accessible to privileged DMA engines.  Implement it in
> dma-iommu.c so that the ARM64 DMA IOMMU mapper can make use of it.
> 
> Reviewed-by: Robin Murphy 
> Tested-by: Robin Murphy 
> Acked-by: Will Deacon 
> Signed-off-by: Mitchel Humpherys 
> ---
>  arch/arm64/mm/dma-mapping.c |  6 +++---
>  drivers/iommu/dma-iommu.c   | 10 --
>  include/linux/dma-iommu.h   |  3 ++-
>  3 files changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 401f79a..ae76ead 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -557,7 +557,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
>unsigned long attrs)
>  {
>   bool coherent = is_device_dma_coherent(dev);
> - int ioprot = dma_direction_to_prot(DMA_BIDIRECTIONAL, coherent);
> + int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
>   size_t iosize = size;
>   void *addr;
>  
> @@ -711,7 +711,7 @@ static dma_addr_t __iommu_map_page(struct device *dev, struct page *page,
>  unsigned long attrs)
>  {
>   bool coherent = is_device_dma_coherent(dev);
> - int prot = dma_direction_to_prot(dir, coherent);
> + int prot = dma_info_to_prot(dir, coherent, attrs);
>   dma_addr_t dev_addr = iommu_dma_map_page(dev, page, offset, size, prot);
>  
>   if (!iommu_dma_mapping_error(dev, dev_addr) &&
> @@ -769,7 +769,7 @@ static int __iommu_map_sg_attrs(struct device *dev, struct scatterlist *sgl,
>   __iommu_sync_sg_for_device(dev, sgl, nelems, dir);
>  
>   return iommu_dma_map_sg(dev, sgl, nelems,
> - dma_direction_to_prot(dir, coherent));
> + dma_info_to_prot(dir, coherent, attrs));
>  }
>  
>  static void __iommu_unmap_sg_attrs(struct device *dev,
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index d2a7a46..756d5e0 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -182,16 +182,22 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
>  EXPORT_SYMBOL(iommu_dma_init_domain);
>  
>  /**
> - * dma_direction_to_prot - Translate DMA API directions to IOMMU API page flags
> + * dma_info_to_prot - Translate DMA API directions and attributes to IOMMU API
> + * page flags.
>   * @dir: Direction of DMA transfer
>   * @coherent: Is the DMA master cache-coherent?
> + * @attrs: DMA attributes for the mapping
>   *
>   * Return: corresponding IOMMU API page protection flags
>   */
> -int dma_direction_to_prot(enum dma_data_direction dir, bool coherent)
> +int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
> +  unsigned long attrs)
>  {
>   int prot = coherent ? IOMMU_CACHE : 0;
>  
> + if (attrs & DMA_ATTR_PRIVILEGED)
> + prot |= IOMMU_PRIV;
> +
>   switch (dir) {
>   case DMA_BIDIRECTIONAL:
>   return prot | IOMMU_READ | IOMMU_WRITE;

...and applying against -next now also needs this hunk:

@@ -639,7 +639,7 @@ dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
size_t size, enum dma_data_direction dir, unsigned long attrs)
 {
return __iommu_dma_map(dev, phys, size,
-   dma_direction_to_prot(dir, false) | IOMMU_MMIO);
+   dma_info_to_prot(dir, false, attrs) | IOMMU_MMIO);
 }

 void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,

With those two issues fixed up, I've given the series (applied to
next-20161213) a spin on a SMMUv3/PL330 fast model and it still checks out.

Robin.

> diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> index 32c5890..a203181 100644
> --- a/include/linux/dma-iommu.h
> +++ b/include/linux/dma-iommu.h
> @@ -34,7 +34,8 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
>   u64 size, struct device *dev);
>  
>  /* General helpers for DMA-API <-> IOMMU-API interaction */
> -int dma_direction_to_prot(enum dma_data_direction dir, bool coherent);
> +int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
> +  unsigned long attrs);
>  
>  /*
>   * These implement the bulk of the relevant DMA mapping callbacks, but require
> 



Re: [PATCH V7 5/8] arm64/dma-mapping: Implement DMA_ATTR_PRIVILEGED

2016-12-13 Thread Robin Murphy
On 13/12/16 14:38, Sricharan wrote:
> Hi Robin,
> 
>> -Original Message-
>> From: linux-arm-kernel [mailto:linux-arm-kernel-boun...@lists.infradead.org] 
>> On Behalf Of Robin Murphy
>> Sent: Tuesday, December 13, 2016 7:33 PM
>> To: Sricharan R ; jcro...@codeaurora.org; 
>> pd...@codeaurora.org; jgeb...@codeaurora.org;
>> j...@8bytes.org; linux-ker...@vger.kernel.org; prat...@codeaurora.org; 
>> iommu@lists.linux-foundation.org; tz...@codeaurora.org;
>> linux-arm-ker...@lists.infradead.org; will.dea...@arm.com; 
>> mitch...@codeaurora.org; vinod.k...@intel.com
>> Cc: dan.j.willi...@intel.com
>> Subject: Re: [PATCH V7 5/8] arm64/dma-mapping: Implement DMA_ATTR_PRIVILEGED
>>
>> On 12/12/16 18:38, Sricharan R wrote:
>>> From: Mitchel Humpherys 
>>>
>>> The newly added DMA_ATTR_PRIVILEGED is useful for creating mappings that
>>> are only accessible to privileged DMA engines.  Implement it in
>>> dma-iommu.c so that the ARM64 DMA IOMMU mapper can make use of it.
>>>
>>> Reviewed-by: Robin Murphy 
>>> Tested-by: Robin Murphy 
>>> Acked-by: Will Deacon 
>>> Signed-off-by: Mitchel Humpherys 
>>> ---
>>>  arch/arm64/mm/dma-mapping.c |  6 +++---
>>>  drivers/iommu/dma-iommu.c   | 10 --
>>>  include/linux/dma-iommu.h   |  3 ++-
>>>  3 files changed, 13 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
>>> index 401f79a..ae76ead 100644
>>> --- a/arch/arm64/mm/dma-mapping.c
>>> +++ b/arch/arm64/mm/dma-mapping.c
>>> @@ -557,7 +557,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
>>>  unsigned long attrs)
>>>  {
>>> bool coherent = is_device_dma_coherent(dev);
>>> -   int ioprot = dma_direction_to_prot(DMA_BIDIRECTIONAL, coherent);
>>> +   int ioprot = dma_info_to_prot(DMA_BIDIRECTIONAL, coherent, attrs);
>>> size_t iosize = size;
>>> void *addr;
>>>
>>> @@ -711,7 +711,7 @@ static dma_addr_t __iommu_map_page(struct device *dev, struct page *page,
>>>unsigned long attrs)
>>>  {
>>> bool coherent = is_device_dma_coherent(dev);
>>> -   int prot = dma_direction_to_prot(dir, coherent);
>>> +   int prot = dma_info_to_prot(dir, coherent, attrs);
>>> dma_addr_t dev_addr = iommu_dma_map_page(dev, page, offset, size, prot);
>>>
>>> if (!iommu_dma_mapping_error(dev, dev_addr) &&
>>> @@ -769,7 +769,7 @@ static int __iommu_map_sg_attrs(struct device *dev, struct scatterlist *sgl,
>>> __iommu_sync_sg_for_device(dev, sgl, nelems, dir);
>>>
>>> return iommu_dma_map_sg(dev, sgl, nelems,
>>> -   dma_direction_to_prot(dir, coherent));
>>> +   dma_info_to_prot(dir, coherent, attrs));
>>>  }
>>>
>>>  static void __iommu_unmap_sg_attrs(struct device *dev,
>>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>>> index d2a7a46..756d5e0 100644
>>> --- a/drivers/iommu/dma-iommu.c
>>> +++ b/drivers/iommu/dma-iommu.c
>>> @@ -182,16 +182,22 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
>>>  EXPORT_SYMBOL(iommu_dma_init_domain);
>>>
>>>  /**
>>> - * dma_direction_to_prot - Translate DMA API directions to IOMMU API page flags
>>> + * dma_info_to_prot - Translate DMA API directions and attributes to IOMMU API
>>> + * page flags.
>>>   * @dir: Direction of DMA transfer
>>>   * @coherent: Is the DMA master cache-coherent?
>>> + * @attrs: DMA attributes for the mapping
>>>   *
>>>   * Return: corresponding IOMMU API page protection flags
>>>   */
>>> -int dma_direction_to_prot(enum dma_data_direction dir, bool coherent)
>>> +int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
>>> +unsigned long attrs)
>>>  {
>>> int prot = coherent ? IOMMU_CACHE : 0;
>>>
>>> +   if (attrs & DMA_ATTR_PRIVILEGED)
>>> +   prot |= IOMMU_PRIV;
>>> +
>>> switch (dir) {
>>> case DMA_BIDIRECTIONAL:
>>> return prot | IOMMU_READ | IOMMU_WRITE;
>>
>> ...and applying against -next now also needs this hunk:
>>
>> @@ -63

Re: [PATCH V7 5/8] arm64/dma-mapping: Implement DMA_ATTR_PRIVILEGED

2016-12-13 Thread Robin Murphy
r_mapping;
>  
> @@ -1522,7 +1524,8 @@ static void *__arm_iommu_alloc_attrs(struct device *dev, size_t size,
>  
>   if (coherent_flag  == COHERENT || !gfpflags_allow_blocking(gfp))
>   return __iommu_alloc_simple(dev, size, gfp, handle,
> - coherent_flag);
> + coherent_flag,
> + attrs);

Super-nit: unnecessary line break.

>  
>   /*
>* Following is a work-around (a.k.a. hack) to prevent pages
> @@ -1672,10 +1675,13 @@ static int arm_iommu_get_sgtable(struct device *dev, struct sg_table *sgt,
>GFP_KERNEL);
>  }
>  
> -static int __dma_direction_to_prot(enum dma_data_direction dir)
> +static int __dma_info_to_prot(enum dma_data_direction dir, unsigned long attrs)
>  {
>   int prot;
>  
> + if (attrs & DMA_ATTR_PRIVILEGED)
> + prot |= IOMMU_PRIV;
> +
>   switch (dir) {
>   case DMA_BIDIRECTIONAL:
>   prot = IOMMU_READ | IOMMU_WRITE;
> @@ -1722,7 +1728,7 @@ static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
>   if (!is_coherent && (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
>   __dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
>  
> - prot = __dma_direction_to_prot(dir);
> + prot = __dma_info_to_prot(dir, attrs);
>  
>   ret = iommu_map(mapping->domain, iova, phys, len, prot);
>   if (ret < 0)
> @@ -1930,7 +1936,7 @@ static dma_addr_t arm_coherent_iommu_map_page(struct device *dev, struct page *p
>   if (dma_addr == DMA_ERROR_CODE)
>   return dma_addr;
>  
> - prot = __dma_direction_to_prot(dir);
> + prot = __dma_info_to_prot(dir, attrs);
>  
>   ret = iommu_map(mapping->domain, dma_addr, page_to_phys(page), len, prot);
>   if (ret < 0)
> @@ -2036,7 +2042,7 @@ static dma_addr_t arm_iommu_map_resource(struct device *dev,
>   if (dma_addr == DMA_ERROR_CODE)
>   return dma_addr;
>  
> - prot = __dma_direction_to_prot(dir) | IOMMU_MMIO;
> + prot = __dma_info_to_prot(dir, attrs) | IOMMU_MMIO;
>  
>   ret = iommu_map(mapping->domain, dma_addr, addr, len, prot);
>   if (ret < 0)
> 

Looks reasonable to me. Assuming it survives testing:

Acked-by: Robin Murphy 



Re: RFC: extend iommu-map binding to support #iommu-cells > 1

2016-12-16 Thread Robin Murphy
On 16/12/16 02:36, Stuart Yoder wrote:
> For context, please see the thread:
> https://www.spinics.net/lists/arm-kernel/msg539066.html
> 
> The existing iommu-map binding did not account for the situation where
> #iommu-cells == 2, as permitted in the ARM SMMU binding.  The 2nd cell
> of the IOMMU specifier being the SMR mask.  The existing binding defines
> the mapping as:
>Any RID r in the interval [rid-base, rid-base + length) is associated with
>the listed IOMMU, with the iommu-specifier (r - rid-base + iommu-base).
> 
> ...and that does not work if iommu-base is 2 cells, the second being the
> SMR mask.
> 
> While this can be worked around by always having length=1, it seems we
> can get this cleaned up by updating the binding definition for iommu-map.
> 
> See patch below.  Thoughts?

I really don't think defining a generic binding to have a magic
non-standard meaning for one specific use-case is the right way to go.

Give me a moment to spin the patch I reckon you actually want...

Robin.

> 
> Thanks,
> Stuart
> 
> -
> 
> diff --git a/Documentation/devicetree/bindings/pci/pci-iommu.txt b/Documentation/devicetree/bindings/pci/pci-iommu.txt
> index 56c8296..e81b461 100644
> --- a/Documentation/devicetree/bindings/pci/pci-iommu.txt
> +++ b/Documentation/devicetree/bindings/pci/pci-iommu.txt
> @@ -38,8 +38,20 @@ Optional properties
>The property is an arbitrary number of tuples of
>(rid-base,iommu,iommu-base,length).
> 
> -  Any RID r in the interval [rid-base, rid-base + length) is associated with
> -  the listed IOMMU, with the iommu-specifier (r - rid-base + iommu-base).
> +  If the associated IOMMU has an #iommu-cells value of 1, any RID r in the
> +  interval [rid-base, rid-base + length) is associated with the listed IOMMU,
> +  with the iommu-specifier (r - rid-base + iommu-base).
> +
> +  ARM SMMU Note:
> +The ARM SMMU binding permits an #iommu-cells value of 2 and in this
> +case defines an IOMMU specifier to be: (stream-id,smr-mask)
> +
> +In an iommu-map this means the iommu-base consists of 2 cells:
> +(rid-base,iommu,[stream-id,smr-mask],length).
> +
> +In this case the RID to IOMMU specifier mapping is defined to be:
> +any RID r in the interval [rid-base, rid-base + length) is associated
> +with the listed IOMMU, with the iommu-specifier (r - rid-base + stream-id).
> 
>  - iommu-map-mask: A mask to be applied to each Requester ID prior to being
>mapped to an iommu-specifier per the iommu-map property.
> 
> 
> 
> 



[RFC PATCH] iommu/arm-smmu: Add global SMR masking property

2016-12-16 Thread Robin Murphy
The current SMR masking support using a 2-cell iommu-specifier is
primarily intended to handle individual masters with large and/or
complex Stream ID assignments; it quickly gets a bit clunky in other SMR
use-cases where we just want to consistently mask out the same part of
every Stream ID (e.g. for MMU-500 configurations where the appended TBU
number gets in the way unnecessarily). Let's add a new property to allow
a single global mask value to better fit the latter situation.

CC: Stuart Yoder 
Signed-off-by: Robin Murphy 
---

Compile-tested only...

 Documentation/devicetree/bindings/iommu/arm,smmu.txt | 8 
 drivers/iommu/arm-smmu.c | 4 +++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index e862d1485205..98f5cbe5fdb4 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -60,6 +60,14 @@ conditions.
   aliases of secure registers have to be used during
   SMMU configuration.
 
+- stream-match-mask : Specifies a fixed SMR mask value to combine with
+  the Stream ID value from every iommu-specifier. This
+  may be used instead of an "#iommu-cells" value of 2
+  when there is no need for per-master SMR masks, but
+  it is still desired to mask some portion of every
+  Stream ID (e.g. for certain MMU-500 configurations
+  given globally unique external IDs).
+
 ** Deprecated properties:
 
 - mmu-masters (deprecated in favour of the generic "iommus" binding) :
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 8f7281444551..f1abcb7dde36 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1534,13 +1534,15 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
 
 static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
 {
-   u32 fwid = 0;
+   u32 mask, fwid = 0;
 
if (args->args_count > 0)
fwid |= (u16)args->args[0];
 
if (args->args_count > 1)
fwid |= (u16)args->args[1] << SMR_MASK_SHIFT;
+   else if (!of_property_read_u32(args->np, "stream-match-mask", &mask))
+   fwid |= (u16)mask << SMR_MASK_SHIFT;
 
return iommu_fwspec_add_ids(dev, &fwid, 1);
 }
-- 
2.10.2.dirty
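
[Editor's note: a hypothetical device tree fragment showing how the proposed property might be used. The unit address, compatible strings, mask value and Stream ID are all illustrative, not taken from any real platform.]

```dts
smmu: iommu@2b400000 {
	compatible = "arm,mmu-500", "arm,smmu-v2";
	/* ... reg, interrupts, #global-interrupts, etc. ... */
	#iommu-cells = <1>;
	/* Mask out an appended TBU number in the upper Stream ID
	 * bits; 0x7f80 is purely illustrative. */
	stream-match-mask = <0x7f80>;
};

master {
	/* Single-cell specifier: Stream ID only.  The SMR mask for
	 * every master comes from the global property above, instead
	 * of a per-master second cell. */
	iommus = <&smmu 0x10>;
};
```

Note how this matches the patch's of_xlate logic: with args_count == 1 the driver falls back to reading "stream-match-mask" from the SMMU node and shifts it into the fwid alongside the Stream ID.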



Re: RFC: extend iommu-map binding to support #iommu-cells > 1

2016-12-16 Thread Robin Murphy
On 16/12/16 14:21, Stuart Yoder wrote:
> 
> 
>> -Original Message-
>> From: Mark Rutland [mailto:mark.rutl...@arm.com]
>> Sent: Friday, December 16, 2016 5:39 AM
>> To: Stuart Yoder 
>> Cc: robin.mur...@arm.com; will.dea...@arm.com; robh...@kernel.org; Bharat 
>> Bhushan
>> ; Nipun Gupta ; Diana Madalina 
>> Craciun
>> ; devicet...@vger.kernel.org; 
>> iommu@lists.linux-foundation.org
>> Subject: Re: RFC: extend iommu-map binding to support #iommu-cells > 1
>>
>> On Fri, Dec 16, 2016 at 02:36:57AM +, Stuart Yoder wrote:
>>> For context, please see the thread:
>>> https://www.spinics.net/lists/arm-kernel/msg539066.html
>>>
>>> The existing iommu-map binding did not account for the situation where
>>> #iommu-cells == 2, as permitted in the ARM SMMU binding.  The 2nd cell
>>> of the IOMMU specifier being the SMR mask.  The existing binding defines
>>> the mapping as:
>>>Any RID r in the interval [rid-base, rid-base + length) is associated with
>>>the listed IOMMU, with the iommu-specifier (r - rid-base + iommu-base).
>>>
>>> ...and that does not work if iommu-base is 2 cells, the second being the
>>> SMR mask.
>>>
>>> While this can be worked around by always having length=1, it seems we
>>> can get this cleaned up by updating the binding definition for iommu-map.
>>
>> To reiterate, I'm very much not keen on the pci-iommu binding having
>> knowledge of the decomposition of iommu-specifiers from other bindings.
> 
> With the current definition of iommu-map we already have baked in an
> assumption that an iommu-specifier is a value that can be incremented
> by 1 to get to the next sequential specifier.  So the binding already
> has "knowledge" that applies in most, but not all cases.
> 
> The generic iommu binding also defines a case where #iommu-cells=4
> for some IOMMUs.
> 
> Since the ARM SMMU is a special case, perhaps the interpretation
> of an iommu-specifier in the context of iommu-map should be moved
> into the SMMU binding.
> 
>> As mentioned previously, there's an intended interpretation [1] that we
>> need to fix up the pci-iommu binding with. With that, I don't think it's
>> even necessary to bodge iommu-cells = <1> AFAICT.
> 
> You had said in the previous thread:
> 
>   > I had intended that the offset was added to the final cell of the
>   > iommu-specifier (i.e. that the iommu-specifier was treated as a large
>   > number).
> 
>   > You can handle this case by adding additional entries in the map table,
>   > with a single entry length.
> 
> I understand that, and it works, but I don't see why the definition has
> to be that the offset is added to the "final cell".

Because the cells of a specifier form a single opaque big-endian value.
Were DT little-endian it would be the "first cell". To be pedantic, both
of those descriptions are technically wrong because they fail to account
for overflow and carry up into the next cell (in whichever direction).

>  Why can't it be
> an iommu specific definition that makes sense for a given IOMMU?

Because the implementation would then become a nightmarish rabbit-warren
of special-cases, largely defeating the point of a *generic* binding. At
the very least it'd have to chase every phandle and determine its
compatible just to work out what to do with any given specifier - merely
thinking about the complexity of the error handling for the number of
additional failure modes that introduces is enough to put me off.

Robin.

> 
> Stuart
> 



Re: RFC: extend iommu-map binding to support #iommu-cells > 1

2016-12-16 Thread Robin Murphy
On 16/12/16 15:56, Stuart Yoder wrote:
> The existing iommu-map binding did not account for the situation where
> #iommu-cells == 2, as permitted in the ARM SMMU binding.  The 2nd cell
> of the IOMMU specifier being the SMR mask.  The existing binding defines
> the mapping as:
>    Any RID r in the interval [rid-base, rid-base + length) is associated with
>the listed IOMMU, with the iommu-specifier (r - rid-base + iommu-base).
>
> ...and that does not work if iommu-base is 2 cells, the second being the
> SMR mask.
>
> While this can be worked around by always having length=1, it seems we
> can get this cleaned up by updating the binding definition for iommu-map.

 To reiterate, I'm very much not keen on the pci-iommu binding having
 knowledge of the decomposition of iommu-specifiers from other bindings.
>>>
>>> With the current definition of iommu-map we already have baked in an
>>> assumption that an iommu-specifier is a value that can be incremented
>>> by 1 to get to the next sequential specifier.  So the binding already
>>> has "knowledge" that applies in most, but not all cases.
>>>
>>> The generic iommu binding also defines a case where #iommu-cells=4
>>> for some IOMMUs.
>>>
>>> Since the ARM SMMU is a special case, perhaps the interpretation
>>> of an iommu-specifier in the context of iommu-map should be moved
>>> into the SMMU binding.
>>>
 As mentioned previously, there's an intended interpretation [1] that we
 need to fix up the pci-iommu binding with. With that, I don't think it's
 even necessary to bodge iommu-cells = <1> AFAICT.
>>>
>>> You had said in the previous thread:
>>>
>>>   > I had intended that the offset was added to the final cell of the
>>>   > iommu-specifier (i.e. that the iommu-specifier was treated as a large
>>>   > number).
>>>
>>>   > You can handle this case by adding additional entries in the map table,
>>>   > with a single entry length.
>>>
>>> I understand that, and it works, but I don't see why the definition has
>>> to be that the offset is added to the "final cell".
>>
>> Because the cells of a specifier form a single opaque big-endian value.
>> Were DT little-endian it would be the "first cell". To be pedantic, both
>> of those descriptions are technically wrong because they fail to account
>> for overflow and carry up into the next cell (in whichever direction).
> 
> If it is really opaque how can we reliably add 1 to it?

The pci-iommu binding isn't adding 1 to an opaque value. It is
*generating* an IOMMU specifier of the form of a single numeric value,
as defined by some linear translation of a PCI RID relative to a numeric
base value appropriate for the IOMMU topology. It is explicit therein
that a single numeric value must be the appropriate interpretation of
that specifier. That happens to be true of a 1-cell arm-smmu specifier,
therefore iommu-map can be used with arm-smmu with #iommu-cells=1. It is
also true of a 2-cell some-other-iommu specifier, therefore iommu-map
can be used with some-other-iommu with #iommu-cells=2 (if we fix the
fact that the current implementation fails to consider more than one
cell). It is not, however, true of a 2-cell arm-smmu specifier,
therefore iommu-map cannot be used with arm-smmu with #iommu-cells=2,
although the fact that of_pci_iommu_configure() unconditionally
generates a 1-cell specifier at the moment does happen to sidestep that.
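As an illustration of that linear translation, a minimal sketch (hypothetical struct and helper, assuming single-value specifiers as the binding requires):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flattened view of one iommu-map entry. */
struct iommu_map_entry {
	uint32_t rid_base;
	uint32_t iommu_base;	/* single numeric specifier assumed */
	uint32_t length;
};

/*
 * Sketch of the iommu-map rule: any RID r in
 * [rid_base, rid_base + length) maps to (r - rid_base + iommu_base).
 */
static int map_rid(const struct iommu_map_entry *map, int n,
		   uint32_t rid, uint32_t *out)
{
	for (int i = 0; i < n; i++) {
		if (rid >= map[i].rid_base &&
		    rid < map[i].rid_base + map[i].length) {
			*out = rid - map[i].rid_base + map[i].iommu_base;
			return 0;
		}
	}
	return -1;	/* no translation for this RID */
}
```

The length-1 workaround mentioned earlier amounts to one such entry per RID, which degenerates to a simple lookup table.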

The point of the 2-cell arm-smmu specifier is really to handle the
devices (which exist) with dozens or hundreds of stream IDs, which are
*only* usable via SMR masking, and particularly with a hand-crafted mask
that is able to assume the non-existence of overlapping IDs (that aspect
being actually quite significant for optimal allocation - one of the
reasons my automatic-mask-generation prototype is now gathering dust).
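For reference, SMR masking works by ignoring the ID bits covered by the mask when matching an incoming stream ID, roughly (hypothetical helper):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch of SMMUv2 SMR matching: bits set in smr_mask are ignored
 * when comparing an incoming stream ID against the SMR's ID, so one
 * SMR can claim a whole group of IDs.
 */
static bool smr_matches(uint16_t smr_id, uint16_t smr_mask, uint16_t sid)
{
	return ((smr_id ^ sid) & ~smr_mask) == 0;
}
```

A hand-crafted mask is only safe if no *other* device's stream IDs fall inside the masked group, which is the overlap problem alluded to above.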

The MMU-500 TBU use-case is really an entirely different kettle of fish,
hence the RFC I posted earlier.

Robin.

>>>  Why can't it be
>>> an iommu specific definition that makes sense for a given IOMMU?
>>
>> Because the implementation would then become a nightmarish rabbit-warren
>> of special-cases, largely defeating the point of a *generic* binding. At
>> the very least it'd have to chase every phandle and determine its
>> compatible just to work out what to do with any given specifier - merely
>> thinking about the complexity of the error handling for the number of
>> additional failure modes that introduces is enough to put me off.
> 
> In order to decode an iommu-map at all we have to chase every phandle, no?
> Isn't that how we know how many cells an iommu-map entry has?
> 
> I have not thought through the implementation, but my thought was that
> the code could use a callback to handle the IOMMU-specific case.  If
> callback==NULL then the default case is what we have today.  An IOMMU like
> the SMMU can implement a simple callback that can return the mapping.
> 
> Not sure how this is

Re: [PATCH] iommu: Drop the of_iommu_{set/get}_ops() interface

2017-01-04 Thread Robin Murphy
[+Yong Wu for mtk_iommu]

On 03/01/17 17:34, Lorenzo Pieralisi wrote:
> With the introduction of the new iommu_{register/get}_instance()
> interface in commit e4f10ffe4c9b ("iommu: Make of_iommu_set/get_ops() DT
> agnostic") (based on struct fwnode_handle as look-up token, so firmware
> agnostic) to register IOMMU instances with the core IOMMU layer there is
> no reason to keep the old OF based interface around any longer.
> 
> Convert all the IOMMU drivers (and OF IOMMU core code) that rely on the
> of_iommu_{set/get}_ops() to the new kernel interface to register/retrieve
> IOMMU instances and remove the of_iommu_{set/get}_ops() remaining glue
> code in order to complete the interface rework.

Reviewed-by: Robin Murphy 

Looking at before-and-after disassemblies, of_iommu.o is binary
identical, and exynos-iommu.o differs only in the use of dev->fwnode
rather than &dev->of_node->fwnode (and is binary identical if I hack it
back to the latter). I'm not sure why the (GCC 6.2) codegen for
mtk_iommu.o changes quite so much when merely replacing a callsite with
the contents of its static inline callee, but it does :/

Robin.

> Signed-off-by: Lorenzo Pieralisi 
> Cc: Matthias Brugger 
> Cc: Will Deacon 
> Cc: Robin Murphy 
> Cc: Joerg Roedel 
> Cc: Marek Szyprowski 
> ---
> Exynos, msm and mtk code compile tested only owing to lack of
> test platforms, I would appreciate some help in testing this
> patch on those platforms before merging it even if it is just
> a simple interface conversion.
> 
> Thanks,
> Lorenzo
> 
>  drivers/iommu/exynos-iommu.c |  2 +-
>  drivers/iommu/msm_iommu.c|  2 +-
>  drivers/iommu/mtk_iommu.c|  2 +-
>  drivers/iommu/of_iommu.c |  4 ++--
>  include/linux/of_iommu.h | 11 ---
>  5 files changed, 5 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
> index 57ba0d3..b79e4c4 100644
> --- a/drivers/iommu/exynos-iommu.c
> +++ b/drivers/iommu/exynos-iommu.c
> @@ -628,7 +628,7 @@ static int __init exynos_sysmmu_probe(struct 
> platform_device *pdev)
>  
>   pm_runtime_enable(dev);
>  
> - of_iommu_set_ops(dev->of_node, &exynos_iommu_ops);
> + iommu_register_instance(dev->fwnode, &exynos_iommu_ops);
>  
>   return 0;
>  }
> diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
> index b09692b..9cd3cee 100644
> --- a/drivers/iommu/msm_iommu.c
> +++ b/drivers/iommu/msm_iommu.c
> @@ -737,7 +737,7 @@ static int msm_iommu_probe(struct platform_device *pdev)
>   }
>  
>   list_add(&iommu->dev_node, &qcom_iommu_devices);
> - of_iommu_set_ops(pdev->dev.of_node, &msm_iommu_ops);
> + iommu_register_instance(pdev->dev.fwnode, &msm_iommu_ops);
>  
>   pr_info("device mapped at %p, irq %d with %d ctx banks\n",
>   iommu->base, iommu->irq, iommu->ncb);
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 1479c76..0596ab2 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -655,7 +655,7 @@ static int mtk_iommu_init_fn(struct device_node *np)
>   return ret;
>   }
>  
> - of_iommu_set_ops(np, &mtk_iommu_ops);
> + iommu_register_instance(&np->fwnode, &mtk_iommu_ops);
>   return 0;
>  }
>  
> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> index 0f57ddc..d7f480a 100644
> --- a/drivers/iommu/of_iommu.c
> +++ b/drivers/iommu/of_iommu.c
> @@ -127,7 +127,7 @@ static const struct iommu_ops
>  "iommu-map-mask", &iommu_spec.np, iommu_spec.args))
>   return NULL;
>  
> - ops = of_iommu_get_ops(iommu_spec.np);
> + ops = iommu_get_instance(&iommu_spec.np->fwnode);
>   if (!ops || !ops->of_xlate ||
>   iommu_fwspec_init(&pdev->dev, &iommu_spec.np->fwnode, ops) ||
>   ops->of_xlate(&pdev->dev, &iommu_spec))
> @@ -157,7 +157,7 @@ const struct iommu_ops *of_iommu_configure(struct device 
> *dev,
>  "#iommu-cells", idx,
>  &iommu_spec)) {
>   np = iommu_spec.np;
> - ops = of_iommu_get_ops(np);
> + ops = iommu_get_instance(&np->fwnode);
>  
>   if (!ops || !ops->of_xlate ||
>   iommu_fwspec_init(dev, &np->fwnode, ops) ||
> diff --git a/include/linux/of_iommu.h b/include/linux/of_iommu.h
> index 6a7fc50..13394ac 100644
> --- a/include/linux/of_iommu.h
> +++ b/include/linux/of_iommu.h
> @@ -31,17 +31,6 @

Re: [PATCH 02/10] iommu/of: Prepare for deferred IOMMU configuration

2017-01-05 Thread Robin Murphy
On 05/01/17 12:27, Lorenzo Pieralisi wrote:
> On Thu, Jan 05, 2017 at 02:04:37PM +0530, Sricharan wrote:
>> Hi Robin/Lorenzo,
>>
>>> Hi Robin,Lorenzo,
>>>
>>>> On Wed, Nov 30, 2016 at 04:42:27PM +, Robin Murphy wrote:
>>>>> On 30/11/16 16:17, Lorenzo Pieralisi wrote:
>>>>>> Sricharan, Robin,
>>>>>>
>>>>>> I gave this series a go on ACPI and apart from an SMMU v3 fix-up
>>>>>> it seems to work, more thorough testing required though.
>>>>>>
>>>>>> A key question below.
>>>>>>
>>>>>> On Wed, Nov 30, 2016 at 05:52:16AM +0530, Sricharan R wrote:
>>>>>>> From: Robin Murphy 
>>>>>>>
>>>>>>> IOMMU configuration represents unchanging properties of the hardware,
>>>>>>> and as such should only need happen once in a device's lifetime, but
>>>>>>> the necessary interaction with the IOMMU device and driver complicates
>>>>>>> exactly when that point should be.
>>>>>>>
>>>>>>> Since the only reasonable tool available for handling the inter-device
>>>>>>> dependency is probe deferral, we need to prepare of_iommu_configure()
>>>>>>> to run later than it is currently called (i.e. at driver probe rather
>>>>>>> than device creation), to handle being retried, and to tell whether a
>>>>>>> not-yet present IOMMU should be waited for or skipped (by virtue of
>>>>>>> having declared a built-in driver or not).
>>>>>>>
>>>>>>> Signed-off-by: Robin Murphy 
>>>>>>> ---
>>>>>>>  drivers/iommu/of_iommu.c | 30 +-
>>>>>>>  1 file changed, 29 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>>>>>>> index ee49081..349bd1d 100644
>>>>>>> --- a/drivers/iommu/of_iommu.c
>>>>>>> +++ b/drivers/iommu/of_iommu.c
>>>>>>> @@ -104,12 +104,20 @@ int of_get_dma_window(struct device_node *dn, 
>>>>>>> const char *prefix, int index,
>>>>>>> int err;
>>>>>>>
>>>>>>> ops = iommu_get_instance(fwnode);
>>>>>>> -   if (!ops || !ops->of_xlate)
>>>>>>> +   if ((ops && !ops->of_xlate) ||
>>>>>>> +   (!ops && !of_match_node(&__iommu_of_table, iommu_spec->np)))
>>>>>>
>>>>>> IIUC of_match_node() here is there to check there is a driver compiled
>>>>>> in for this device_node (aka compatible string in OF world), correct ?
>>>>>
>>>>> Yes - specifically, it's checking the magic table for a matching
>>>>> IOMMU_OF_DECLARE entry.
>>>>>
>>>>>> If that's the case (and I think that's what Sricharan was referring to
>>>>>> in his ACPI query) I need to cook-up something on the ACPI side to
>>>>>> emulate the OF linker table behaviour (or anyway to detect a driver is
>>>>>> actually in the kernel), it is not that difficult but it is key to know,
>>>>>> I will give it some thought to make it as clean as possible.
>>>>>
>>>>> I didn't think this would be a concern for ACPI, since IORT works much
>>>>> the same way the current of_iommu_init_fn/of_platform_device_create()
>>>>> bodges in drivers so for DT. If you can only discover SMMUs from IORT,
>>>>> then iort_init_platform_devices() will have already created every SMMU
>>>>> that's going to exist before discovering other devices from wherever
>>>>> they come from, thus you could never get into the situation of probing a
>>>>> device without its SMMU being ready (if it's ever going to be). Is that
>>>>> not right?
>>>>
>>>> It is right, my point and question is: we are probing a device and we
>>>> have to know whether it is worth deferring its IOMMU DMA setup. On DT,
>>>> through of_match_node(&__iommu_of_table, iommu_device_node) we check at
>>>> once that:
>>>>
>>>> 1 - A device for the IOMMU exists
>>>>
>>>> AND
>>>>
>>>> 2 - A

Re: [RFC 1/3] iommu/arm-smmu: Add support to opt-in to stalling

2017-01-05 Thread Robin Murphy
On 05/01/17 14:47, Will Deacon wrote:
> On Thu, Jan 05, 2017 at 02:07:31PM +, Mark Rutland wrote:
>> On Thu, Jan 05, 2017 at 02:00:05PM +, Will Deacon wrote:
>>> On Thu, Jan 05, 2017 at 12:08:57PM +, Mark Rutland wrote:
 On Thu, Jan 05, 2017 at 11:55:29AM +, Will Deacon wrote:
> On Tue, Jan 03, 2017 at 04:30:54PM -0500, Rob Clark wrote:
>> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt 
>> b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>> index ef465b0..5f405a6 100644
>> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>> @@ -68,6 +68,9 @@ conditions.
>>aliases of secure registers have to be used during
>>SMMU configuration.
>>  
>> +- arm,smmu-enable-stall : Enable stall mode to stall memory transactions
>> +  and resume after fault is handled

 The wording here seems to describe a policy rather than a property.

 Can you elaborate on when/why this is required/preferred/valid?
>>>
>>> It's not a policy, it's a hardware capability. There are some non-probeable
>>> reasons why stalling mode is unsafe or unusable:
>>>
>>>   
>>> http://lists.infradead.org/pipermail/linux-arm-kernel/2016-December/474530.html
>>
>> Ok. My point was that the wording above is an imperative -- it tells
>> the kernel to enable stall mode, not if/why it is safe to do so (i.e. it
>> is a policy, not a property).
>>
>> It sounds like that's just a wording issue. Something like
>> "arm,stalling-is-usable" (along with a descrition of when that
>> can/should be in the DT) would be vastly better.
> 
> Why does it need a vendor prefix? I'm not down on the convention there.
> "stalling-safe" or "stalling-supported" are alternative strings.
> 
>>  static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
>> @@ -824,6 +852,8 @@ static void arm_smmu_init_context_bank(struct 
>> arm_smmu_domain *smmu_domain,
>>  
>>  /* SCTLR */
>>  reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE | SCTLR_M;
>> +if (smmu->options & ARM_SMMU_OPT_ENABLE_STALL)
>> +reg |= SCTLR_CFCFG;
>
> I wonder if this should also be predicated on the compatible string, so
> that the "arm,smmu-enable-stall" property is ignored (with a warning) if
> the compatible string isn't specific enough to identify an implementation
> with the required SS behaviour? On the other hand, it feels pretty
> redundant and a single "stalling works" property is all we need.

 Can you elaborate on what "stalling works" entails? Is that just the SS
 bit behaviour? are there integration or endpoint-specific things that we
 need to care about?
>>>
>>> See above. The "stalling works" property (arm,smmu-enable-stall) would
>>> indicate that both the implementation *and* the integration are such
>>> that stalling is usable for demand paging. I suspect there are endpoints
>>> that can't deal with stalls (e.g. they might timeout and signal a RAS
>>> event), but in that case their respective device drivers should ensure
>>> that any DMA buffers are pinned and/or register a fault handler to
>>> request termination of the faulting transaction.
>>
>> Ok. It would be good to elaborate on what "stalling is useable" means in
>> the property description, i.e. what specifically the implementation and
>> integration need to ensure.
> 
> We can describe some of those guarantees in the property description, but
> it's difficult to enumerate them exhaustively. For example, you wouldn't
> want stalling to lead to data corruption, denial of service, or for the
> thing to catch fire, but having those as explicit requirements is a bit
> daft. It's also impossible to check that you thought of everything.
> 
> Aside from renaming the option, I'm really after an opinion on whether
> it's better to have one property or combine it with the compatible
> string, because I can see benefits of both and don't much care either
> way.

The SMMU implementation side of the decision (i.e. independence of IRQ
assertion vs. SS) seems like exactly the sort of stuff the compatible
string already has covered. The integration side I'm less confident can
be described this way at all - the "this device definitely won't go
wrong if stalled for an indefinite length of time" is inherently a
per-master thing, so a single property on the SMMU implying that for
every device connected to it seems a bit optimistic, and breaks down as
soon as you have one device in the system for which that isn't true (a
PCI root complex, say), even if that guy's traffic never crosses paths
with whichever few devices you actually care about using stalls with.

I think this needs to be some kind of "arm,smmu-stall-safe" property
placed on individual master device nodes (mad idea: or even an extra
cell of flags in the IOMMU specifier) to encapsulate both that the given
device itself is OK with being stalled, and that it's integrated in such
a way that its stalled transactions cannot disrupt any *other* device
(e.g. it has a TBU all to itself). Then upon initialising a context bank
on a suitable SMMU implementation, we set CFCFG based on whatever device
is being attached to the corresponding domain, and refuse any subsequent
attempts to attach a non-stallable device to a stalling domain (and
possibly even vice-versa).

Re: [RFC 1/3] iommu/arm-smmu: Add support to opt-in to stalling

2017-01-05 Thread Robin Murphy
On 05/01/17 16:07, Will Deacon wrote:
> On Thu, Jan 05, 2017 at 03:32:50PM +0000, Robin Murphy wrote:
>> On 05/01/17 14:47, Will Deacon wrote:
>>> On Thu, Jan 05, 2017 at 02:07:31PM +, Mark Rutland wrote:
>>>> Ok. It would be good to elaborate on what "stalling is useable" means in
>>>> the property description, i.e. what specifically the implementation and
>>>> integration need to ensure.
>>>
>>> We can describe some of those guarantees in the property description, but
>>> it's difficult to enumerate them exhaustively. For example, you wouldn't
>>> want stalling to lead to data corruption, denial of service, or for the
>>> thing to catch fire, but having those as explicit requirements is a bit
>>> daft. It's also impossible to check that you thought of everything.
>>>
>>> Aside from renaming the option, I'm really after an opinion on whether
>>> it's better to have one property or combine it with the compatible
>>> string, because I can see benefits of both and don't much care either
>>> way.
>>
>> The SMMU implementation side of the decision (i.e. independence of IRQ
>> assertion vs. SS) seems like exactly the sort of stuff the compatible
>> string already has covered. The integration side I'm less confident can
>> be described this way at all - the "this device definitely won't go
>> wrong if stalled for an indefinite length of time" is inherently a
>> per-master thing, so a single property on the SMMU implying that for
>> every device connected to it seems a bit optimistic, and breaks down as
>> soon as you have one device in the system for which that isn't true (a
>> PCI root complex, say), even if that guy's traffic never crosses paths
>> with whichever few devices you actually care about using stalls with.
>>
>> I think this needs to be some kind of "arm,smmu-stall-safe" property
>> placed on individual master device nodes (mad idea: or even an extra
>> cell of flags in the IOMMU specifier) to encapsulate both that the given
>> device itself is OK with being stalled, and that it's integrated in such
>> a way that its stalled transactions cannot disrupt any *other* device
>> (e.g. it has a TBU all to itself). Then upon initialising a context bank
>> on a suitable SMMU implementation, we set CFCFG based on whatever device
>> is being attached to the corresponding domain, and refuse any subsequent
>> attempts to attach a non-stallable device to a stalling domain (and
>> possibly even vice-versa).
> 
> If we're going to add per-master properties, I'd *really* like them to be
> independent of the IOMMU in use. That is, we should be able to re-use this
> property as part of supporting SVM for platform devices in future.

I'd argue that they are still fairly separate things despite the
overlap: stalling is a specific ARM SMMU architecture thing (in both
architectures) which may be used for purposes unrelated to SVM;
conversely SVM implemented via PRI or similar mechanisms should be
pretty much oblivious to the transaction fault model.

> But I think we agree that we need:
> 
>   1. A compatible string for the SMMU that can be used to infer the SS
>  behaviour in the driver
> 
>   2. A property on the SMMU to say that it's been integrated in such a
>  way that stalling is safe (doesn't deadlock)

That's still got to be a per-master property, not a SMMU property, I
think. To illustrate:

   [A]        [B]    [C]
    |          |______|
  __|__________|_______
 | TBU |     | TBU |   |
 |_____|     |_____|   |
 |        SMMU         |
 |_____________________|
     |           |

Say A and B are instances of some device happy to be stalled, and C is a
PCIe RC, and each is attached to their own context bank - enabling
stalls for A is definitely fine. However even though B and C are using
different context banks, enabling stalls for B might deadlock C if it
results in more total outstanding transactions than the TBU's slave port
supports. Therefore A can happily claim to be stall-safe, but B cannot
due to its integration with respect to C.

And yes, I can point you at some existing hardware which really does
possess a topology like that.

>   3. A generic master property that says that the device can DMA to
>  unpinned memory

That sounds a bit *too* generic to me, given that there are multiple
incompatible ways that could be implemented. I'm not the biggest fan of
properties with heavily context-specific interpretations, especially
when there's more than a hint of software implementation details in the mix.

Robin.

> 
> Anything else?
> 
> Will
> 



Re: [PATCH v6 01/18] iommu/dma: Allow MSI-only cookies

2017-01-06 Thread Robin Murphy
On 06/01/17 11:46, Auger Eric wrote:
> 
> 
> On 06/01/2017 11:59, Joerg Roedel wrote:
>> On Thu, Jan 05, 2017 at 07:04:29PM +, Eric Auger wrote:
>>>  struct iommu_dma_cookie {
>>> -   struct iova_domain  iovad;
>>> -   struct list_headmsi_page_list;
>>> -   spinlock_t  msi_lock;
>>> +   union {
>>> +   struct iova_domain  iovad;
>>> +   dma_addr_t  msi_iova;
>>> +   };
>>> +   struct list_headmsi_page_list;
>>> +   spinlock_t  msi_lock;
>>> +   enum iommu_dma_cookie_type  type;
>>
>> Please move the type to the beginning of the struct and add a comment
>> how the type relates to the union.
> 
> Sure
> 
> Thank you for the review.

FWIW I already had a cleaned up version of this patch, I just hadn't
mentioned it. I've pushed out an update with that change added too[1].

Robin.

[1]:http://linux-arm.org/git?p=linux-rm.git;a=shortlog;h=refs/heads/iommu/misc

> 
> Best regards
> 
> Eric
>>
>>
>>
>>  Joerg
>>



Re: [PATCH v2 3/4] iommu/exynos: Ensure that SYSMMU is added only once to its master device

2017-01-09 Thread Robin Murphy
Hi Marek,

On 09/01/17 12:03, Marek Szyprowski wrote:
> This patch prepares Exynos IOMMU driver for deferred probing
> support. Once it gets added, of_xlate() callback might be called
> more than once for the same SYSMMU controller and master device
> (for example it happens when masters device driver fails with
> EPROBE_DEFER). This patch adds a check, which ensures that SYSMMU
> controller is added to its master device (owner) only once.

FWIW, I think it would be better to address that issue in the
probe-deferral code itself, rather than fixing up individual IOMMU
drivers. In fact, I *think* it's already covered in the latest version
of the of_iommu_configure() changes[1], specifically this bit:

...
if (fwspec) {
if (fwspec->ops)
return fwspec->ops;
...

Which should ensure that a successful IOMMU configuration happens
exactly once between device creation and destruction.

That said, this patch would still add robustness against actual error
conditions, e.g. an erroneous DT which managed to specify the same
phandle twice, although making it "if (WARN_ON(entry == data))" might be
a good idea.

Robin.

[1]:https://www.mail-archive.com/iommu@lists.linux-foundation.org/msg15333.html

> Signed-off-by: Marek Szyprowski 
> ---
>  drivers/iommu/exynos-iommu.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
> index b0d537e6a445..8bf5a06a6c01 100644
> --- a/drivers/iommu/exynos-iommu.c
> +++ b/drivers/iommu/exynos-iommu.c
> @@ -1253,7 +1253,7 @@ static int exynos_iommu_of_xlate(struct device *dev,
>  {
>   struct exynos_iommu_owner *owner = dev->archdata.iommu;
>   struct platform_device *sysmmu = of_find_device_by_node(spec->np);
> - struct sysmmu_drvdata *data;
> + struct sysmmu_drvdata *data, *entry;
>  
>   if (!sysmmu)
>   return -ENODEV;
> @@ -1272,6 +1272,10 @@ static int exynos_iommu_of_xlate(struct device *dev,
>   dev->archdata.iommu = owner;
>   }
>  
> + list_for_each_entry(entry, &owner->controllers, owner_node)
> + if (entry == data)
> + return 0;
> +
>   list_add_tail(&data->owner_node, &owner->controllers);
>   data->master = dev;
>  
> 



Re: [PATCH v7 01/19] iommu/dma: Implement PCI allocation optimisation

2017-01-10 Thread Robin Murphy
On 09/01/17 13:45, Eric Auger wrote:
> From: Robin Murphy 
> 
> Whilst PCI devices may have 64-bit DMA masks, they still benefit from
> using 32-bit addresses wherever possible in order to avoid DAC (PCI) or
> longer address packets (PCIe), which may incur a performance overhead.
> Implement the same optimisation as other allocators by trying to get a
> 32-bit address first, only falling back to the full mask if that fails.

Oops, this was just some development work which got promoted to my misc
branch for safe keeping; it really has nothing to do with this series.

I'd managed to overlook that one of the __alloc_iova() conflicts hit the
MSI cookie patch, sorry. Things are now in a more appropriate order in
my tree.

Robin.

> Signed-off-by: Robin Murphy 
> ---
>  drivers/iommu/dma-iommu.c | 21 +++--
>  1 file changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 2db0d64..a59ca47 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -204,19 +204,28 @@ int dma_direction_to_prot(enum dma_data_direction dir, 
> bool coherent)
>  }
>  
>  static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
> - dma_addr_t dma_limit)
> + dma_addr_t dma_limit, struct device *dev)
>  {
>   struct iova_domain *iovad = cookie_iovad(domain);
>   unsigned long shift = iova_shift(iovad);
>   unsigned long length = iova_align(iovad, size) >> shift;
> + struct iova *iova = NULL;
>  
>   if (domain->geometry.force_aperture)
>   dma_limit = min(dma_limit, domain->geometry.aperture_end);
> +
> + /* Try to get PCI devices a SAC address */
> + if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci(dev))
> + iova = alloc_iova(iovad, length, DMA_BIT_MASK(32) >> shift,
> +   true);
>   /*
>* Enforce size-alignment to be safe - there could perhaps be an
>* attribute to control this per-device, or at least per-domain...
>*/
> - return alloc_iova(iovad, length, dma_limit >> shift, true);
> + if (!iova)
> + iova = alloc_iova(iovad, length, dma_limit >> shift, true);
> +
> + return iova;
>  }
>  
>  /* The IOVA allocator knows what we mapped, so just unmap whatever that was 
> */
> @@ -369,7 +378,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t 
> size, gfp_t gfp,
>   if (!pages)
>   return NULL;
>  
> - iova = __alloc_iova(domain, size, dev->coherent_dma_mask);
> + iova = __alloc_iova(domain, size, dev->coherent_dma_mask, dev);
>   if (!iova)
>   goto out_free_pages;
>  
> @@ -440,7 +449,7 @@ static dma_addr_t __iommu_dma_map(struct device *dev, 
> phys_addr_t phys,
>   struct iova_domain *iovad = cookie_iovad(domain);
>   size_t iova_off = iova_offset(iovad, phys);
>   size_t len = iova_align(iovad, size + iova_off);
> - struct iova *iova = __alloc_iova(domain, len, dma_get_mask(dev));
> + struct iova *iova = __alloc_iova(domain, len, dma_get_mask(dev), dev);
>  
>   if (!iova)
>   return DMA_ERROR_CODE;
> @@ -598,7 +607,7 @@ int iommu_dma_map_sg(struct device *dev, struct 
> scatterlist *sg,
>   prev = s;
>   }
>  
> - iova = __alloc_iova(domain, iova_len, dma_get_mask(dev));
> + iova = __alloc_iova(domain, iova_len, dma_get_mask(dev), dev);
>   if (!iova)
>   goto out_restore_sg;
>  
> @@ -675,7 +684,7 @@ static struct iommu_dma_msi_page 
> *iommu_dma_get_msi_page(struct device *dev,
>   if (!msi_page)
>   return NULL;
>  
> - iova = __alloc_iova(domain, iovad->granule, dma_get_mask(dev));
> + iova = __alloc_iova(domain, iovad->granule, dma_get_mask(dev), dev);
>   if (!iova)
>   goto out_free_page;
>  
> 



Re: [PATCH 1/1] iommu/arm-smmu: Fix for ThunderX erratum #27704

2017-01-11 Thread Robin Murphy
On 11/01/17 11:51, Tomasz Nowicki wrote:
> The goal of erratum #27704 workaround was to make sure that ASIDs and VMIDs
> are unique across all SMMU instances on affected Cavium systems.
> 
> Currently, the workaround code partitions ASIDs and VMIDs by increasing
> global cavium_smmu_context_count which in turn becomes the base ASID and VMID
> value for the given SMMU instance upon the context bank initialization.
> 
> For systems with multiple SMMU instances this approach implies the risk
> of crossing 8-bit ASID, like for CN88xx capable of 4 SMMUv2, 128 context
> banks each:
> SMMU_0 (0-127 ASID RANGE)
> SMMU_1 (128-255 ASID RANGE)
> SMMU_2 (256-383 ASID RANGE) <--- crossing 8-bit ASID
> SMMU_3 (384-511 ASID RANGE) <--- crossing 8-bit ASID
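The partitioning being described can be sketched as follows (illustrative values only, mirroring the CN88xx figures above):

```c
#include <assert.h>
#include <stdint.h>

#define CBS_PER_SMMU	128	/* context banks per instance, as on CN88xx */

/*
 * Sketch of the erratum workaround's global partitioning: each probed
 * SMMU instance claims the next CBS_PER_SMMU ASID/VMID values.
 */
static uint32_t cavium_smmu_context_count;

static uint32_t claim_asid_base(void)
{
	uint32_t base = cavium_smmu_context_count;

	cavium_smmu_context_count += CBS_PER_SMMU;
	return base;
}
```

With four instances, the third base value (256) no longer fits in an 8-bit ASID field, hence the need for 16-bit ASIDs.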

I could swear that at some point in the original discussion it was said
that the TLBs were only shared between pairs of SMMUs, so in fact 0/1
and 2/3 are independent of each other - out of interest, have you
managed to hit an actual problem in practice or is this patch just by
inspection?

Of course, depending on the SMMUs to probe in the right order isn't
particularly robust, so it's still probably a worthwhile change.

> Since we use 8-bit ASID now we effectively misconfigure ASID[15:8] bits for
> SMMU_CBn_TTBRm register. Also, we still use non-zero ASID[15:8] bits
> upon context invalidation. This patch adds 16-bit ASID support for stage-1
> AArch64 contexts for Cavium SMMUv2 model so that we use ASIDs consistently.
> 
> Signed-off-by: Tomasz Nowicki 
> ---
>  drivers/iommu/arm-smmu.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index a60cded..ae8f059 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -260,6 +260,7 @@ enum arm_smmu_s2cr_privcfg {
>  
>  #define TTBCR2_SEP_SHIFT 15
>  #define TTBCR2_SEP_UPSTREAM  (0x7 << TTBCR2_SEP_SHIFT)
> +#define TTBCR2_AS			(1 << 4)
>  
>  #define TTBRn_ASID_SHIFT 48
>  
> @@ -778,6 +779,9 @@ static void arm_smmu_init_context_bank(struct 
> arm_smmu_domain *smmu_domain,
>   reg = pgtbl_cfg->arm_lpae_s1_cfg.tcr;
>   reg2 = pgtbl_cfg->arm_lpae_s1_cfg.tcr >> 32;
>   reg2 |= TTBCR2_SEP_UPSTREAM;
> + if (smmu->model == CAVIUM_SMMUV2 &&

I'd be inclined to say "smmu->version == ARM_SMMU_V2" there, rather than
make it Cavium-specific - we enable 16-bit VMID unconditionally where
supported, so I don't see any reason not to handle 16-bit ASIDs in the
same manner.

> + cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64)
> +     reg2 |= TTBCR2_AS;
>   }
>   if (smmu->version > ARM_SMMU_V1)
>   writel_relaxed(reg2, cb_base + ARM_SMMU_CB_TTBCR2);
> 

Either way:

Reviewed-by: Robin Murphy 



Re: [RFC PATCH] IOMMU: SMMUv2: Support for Extended Stream ID (16 bit)

2017-01-11 Thread Robin Murphy
On 10/01/17 11:57, Aleksey Makarov wrote:
> Enable the Extended Stream ID feature when available.
> 
> This patch on top of series "[PATCH v7 00/19] KVM PCIe/MSI passthrough
> on ARM/ARM64 and IOVA reserved regions" by Eric Auger allows an
> external PCIe network card on a ThunderX server to be passed through
> successfully.
> 
> Without this patch that card caused a warning like
> 
>   pci 0006:90:00.0: stream ID 0x9000 out of range for SMMU (0x7fff)
> 
> during boot.
> 
> Signed-off-by: Aleksey Makarov 
> ---
>  drivers/iommu/arm-smmu.c | 53 
> +---
>  1 file changed, 37 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 13d26009b8e0..d160c12828f4 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -24,6 +24,7 @@
>   *   - v7/v8 long-descriptor format
>   *   - Non-secure access to the SMMU
>   *   - Context fault reporting
> + *   - Extended Stream ID (16 bit)
>   */
>  
>  #define pr_fmt(fmt) "arm-smmu: " fmt
> @@ -87,6 +88,7 @@
>  #define sCR0_CLIENTPD(1 << 0)
>  #define sCR0_GFRE(1 << 1)
>  #define sCR0_GFIE(1 << 2)
> +#define sCR0_EXIDENABLE  (1 << 3)
>  #define sCR0_GCFGFRE (1 << 4)
>  #define sCR0_GCFGFIE (1 << 5)
>  #define sCR0_USFCFG  (1 << 10)
> @@ -126,6 +128,7 @@
>  #define ID0_NUMIRPT_MASK 0xff
>  #define ID0_NUMSIDB_SHIFT9
>  #define ID0_NUMSIDB_MASK 0xf
> +#define ID0_EXIDS(1 << 8)
>  #define ID0_NUMSMRG_SHIFT0
>  #define ID0_NUMSMRG_MASK 0xff
>  
> @@ -169,6 +172,7 @@
>  #define ARM_SMMU_GR0_S2CR(n) (0xc00 + ((n) << 2))
>  #define S2CR_CBNDX_SHIFT 0
>  #define S2CR_CBNDX_MASK  0xff
> +#define S2CR_EXIDVALID   (1 << 10)
>  #define S2CR_TYPE_SHIFT  16
>  #define S2CR_TYPE_MASK   0x3
>  enum arm_smmu_s2cr_type {
> @@ -354,6 +358,7 @@ struct arm_smmu_device {
>  #define ARM_SMMU_FEAT_FMT_AARCH64_64K(1 << 9)
>  #define ARM_SMMU_FEAT_FMT_AARCH32_L  (1 << 10)
>  #define ARM_SMMU_FEAT_FMT_AARCH32_S  (1 << 11)
> +#define ARM_SMMU_FEAT_EXIDS  (1 << 12)
>   u32 features;
>  
>  #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
> @@ -1051,7 +1056,7 @@ static void arm_smmu_write_smr(struct arm_smmu_device 
> *smmu, int idx)
>   struct arm_smmu_smr *smr = smmu->smrs + idx;
>   u32 reg = smr->id << SMR_ID_SHIFT | smr->mask << SMR_MASK_SHIFT;
>  
> - if (smr->valid)
> + if (!(smmu->features & ARM_SMMU_FEAT_EXIDS) && smr->valid)
>   reg |= SMR_VALID;
>   writel_relaxed(reg, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_SMR(idx));
>  }
> @@ -1063,6 +1068,9 @@ static void arm_smmu_write_s2cr(struct arm_smmu_device 
> *smmu, int idx)
> (s2cr->cbndx & S2CR_CBNDX_MASK) << S2CR_CBNDX_SHIFT |
> (s2cr->privcfg & S2CR_PRIVCFG_MASK) << S2CR_PRIVCFG_SHIFT;
>  
> + if (smmu->features & ARM_SMMU_FEAT_EXIDS && smmu->smrs &&
> + smmu->smrs[idx].valid)
> + reg |= S2CR_EXIDVALID;
>   writel_relaxed(reg, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_S2CR(idx));
>  }
>  
> @@ -1674,6 +1682,9 @@ static void arm_smmu_device_reset(struct 
> arm_smmu_device *smmu)
>   if (smmu->features & ARM_SMMU_FEAT_VMID16)
>   reg |= sCR0_VMID16EN;
>  
> + if (smmu->features & ARM_SMMU_FEAT_EXIDS)
> + reg |= sCR0_EXIDENABLE;
> +
>   /* Push the button */
>   __arm_smmu_tlb_sync(smmu);
>   writel(reg, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
> @@ -1761,7 +1772,12 @@ static int arm_smmu_device_cfg_probe(struct 
> arm_smmu_device *smmu)
>  "\t(IDR0.CTTW overridden by FW configuration)\n");
>  
>   /* Max. number of entries we have for stream matching/indexing */
> - size = 1 << ((id >> ID0_NUMSIDB_SHIFT) & ID0_NUMSIDB_MASK);
> + if (smmu->version == ARM_SMMU_V2 && id & ID0_EXIDS) {
> + smmu->features |= ARM_SMMU_FEAT_EXIDS;
> + size = (1 << 16);

Unnecessary parentheses.

> + } else {
> + size = 1 << ((id >> ID0_NUMSIDB_SHIFT) & ID0_NUMSIDB_MASK);
> + }

Given what the architecture says about the relationship between EXIDS
and NUMSIDB, I suppose an even shorter version could be:

if (smmu->version == ARM_SMMU_V2 && id & ID0_EXIDS)
size *= 2;

but I'm not sure that's actually any nicer to read.
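The equivalence relies on an EXIDS-capable implementation reporting ID0.NUMSIDB == 15; under that assumption the two computations can be checked side by side in a standalone sketch (function names are illustrative):

```c
#include <assert.h>

/* The explicit form: 16-bit stream IDs when EXIDS, else NUMSIDB bits. */
static unsigned int smr_size_explicit(int exids, unsigned int numsidb)
{
	return exids ? (1u << 16) : (1u << numsidb);
}

/* The shorter form suggested above: double the NUMSIDB-derived size. */
static unsigned int smr_size_doubled(int exids, unsigned int numsidb)
{
	unsigned int size = 1u << numsidb;

	if (exids)
		size *= 2;
	return size;
}
```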

>   smmu->streamid_mask = size - 1;
>   if (id & ID0_SMS) {
>   u32 smr;
> @@ -1774,20 +1790,25 @@ static int arm_smmu_device_cfg_probe(struct 
> arm_smmu_device *smmu)
>   return -ENODEV;
>   }
>  
> - /*
> -  * SMR.ID bits may not be preserved if the corresponding MASK
> -  

Re: [PATCH 1/1] iommu/arm-smmu: Fix for ThunderX erratum #27704

2017-01-13 Thread Robin Murphy
On 13/01/17 10:43, Tomasz Nowicki wrote:
> On 12.01.2017 07:41, Tomasz Nowicki wrote:
>> On 11.01.2017 13:19, Robin Murphy wrote:
>>> On 11/01/17 11:51, Tomasz Nowicki wrote:
>>>> The goal of erratum #27704 workaround was to make sure that ASIDs and
>>>> VMIDs
>>>> are unique across all SMMU instances on affected Cavium systems.
>>>>
>>>> Currently, the workaround code partitions ASIDs and VMIDs by increasing
>>>> global cavium_smmu_context_count which in turn becomes the base ASID
>>>> and VMID
>>>> value for the given SMMU instance upon the context bank initialization.
>>>>
>>>> For systems with multiple SMMU instances this approach implies the risk
>>>> of crossing 8-bit ASID, like for CN88xx capable of 4 SMMUv2, 128
>>>> context bank each:
>>>> SMMU_0 (0-127 ASID RANGE)
>>>> SMMU_1 (128-255 ASID RANGE)
>>>> SMMU_2 (256-383 ASID RANGE) <--- crossing 8-bit ASID
>>>> SMMU_3 (384-511 ASID RANGE) <--- crossing 8-bit ASID
>>>
>>> I could swear that at some point in the original discussion it was said
>>> that the TLBs were only shared between pairs of SMMUs, so in fact 0/1
>>> and 2/3 are independent of each other
>>
>> Indeed TLBs are only shared between pairs of SMMUs but the workaround
>> makes sure ASIDs are unique across all SMMU instances so we do not have
>> to worry about SMMU probe order.
>>
>>  - out of interest, have you
>>> managed to hit an actual problem in practice or is this patch just by
>>> inspection?
>>
>> Except for devices behind SMMU0/1, all devices under the other SMMUs
>> will fail on guest power off/on, since we try to invalidate the TLB
>> with a 16-bit ASID while the entry actually holds an 8-bit ASID
>> zero-padded to 16 bits.
>>
>>>
>>> Of course, depending on the SMMUs to probe in the right order isn't
>>> particularly robust, so it's still probably a worthwhile change.
>>>
>>>> Since we use 8-bit ASID now we effectively misconfigure ASID[15:8]
>>>> bits for
>>>> SMMU_CBn_TTBRm register. Also, we still use non-zero ASID[15:8] bits
>>>> upon context invalidation. This patch adds 16-bit ASID support for
>>>> stage-1
>>>> AArch64 contexts for Cavium SMMUv2 model so that we use ASIDs
>>>> consistently.
>>>>
>>>> Signed-off-by: Tomasz Nowicki 
>>>> ---
>>>>  drivers/iommu/arm-smmu.c | 4 ++++
>>>>  1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>>> index a60cded..ae8f059 100644
>>>> --- a/drivers/iommu/arm-smmu.c
>>>> +++ b/drivers/iommu/arm-smmu.c
>>>> @@ -260,6 +260,7 @@ enum arm_smmu_s2cr_privcfg {
>>>>
>>>>  #define TTBCR2_SEP_SHIFT15
>>>>  #define TTBCR2_SEP_UPSTREAM(0x7 << TTBCR2_SEP_SHIFT)
>>>> +#define TTBCR2_AS(1 << 4)
>>>>
>>>>  #define TTBRn_ASID_SHIFT48
>>>>
>>>> @@ -778,6 +779,9 @@ static void arm_smmu_init_context_bank(struct
>>>> arm_smmu_domain *smmu_domain,
>>>>  reg = pgtbl_cfg->arm_lpae_s1_cfg.tcr;
>>>>  reg2 = pgtbl_cfg->arm_lpae_s1_cfg.tcr >> 32;
>>>>  reg2 |= TTBCR2_SEP_UPSTREAM;
>>>> +if (smmu->model == CAVIUM_SMMUV2 &&
>>>
>>> I'd be inclined to say "smmu->version == ARM_SMMU_V2" there, rather than
>>> make it Cavium-specific - we enable 16-bit VMID unconditionally where
>>> supported, so I don't see any reason not to handle 16-bit ASIDs in the
>>> same manner.
>>
>> I agree, I will enable 16-bit ASID for ARM_SMMU_V2.
>>
> 
> Actually, the ARM_SMMU_CTX_FMT_AARCH64 context check is all we need here:
> 
> +if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64)
> +reg2 |= TTBCR2_AS;

Ah, clever! The horrible SMMUv1 64KB supplement supports AArch64
contexts without being SMMUv2, but of course doesn't have stage 1 :)

Robin.

> 
> Thanks,
> Tomasz



Re: [PATCH] iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS

2017-01-13 Thread Robin Murphy
On 13/01/17 11:07, Geert Uytterhoeven wrote:
> Add support for DMA_ATTR_FORCE_CONTIGUOUS to the generic IOMMU DMA code.
> This allows allocating physically contiguous DMA buffers on arm64
> systems with an IOMMU.

Can anyone explain what this attribute is actually used for? I've never
quite figured it out.

> Note that as this uses the CMA allocator, setting this attribute has a
> runtime-dependency on CONFIG_DMA_CMA, just like on arm32.
> 
> For arm64 systems using swiotlb, no changes are needed to support the
> allocation of physically contiguous DMA buffers:
>   - swiotlb always uses physically contiguous buffers (up to
> IO_TLB_SEGSIZE = 128 pages),
>   - arm64's __dma_alloc_coherent() already calls
> dma_alloc_from_contiguous() when CMA is available.
> 
> Signed-off-by: Geert Uytterhoeven 
> ---
>  arch/arm64/mm/dma-mapping.c |  4 ++--
>  drivers/iommu/dma-iommu.c   | 44 ++--
>  include/linux/dma-iommu.h   |  2 +-
>  3 files changed, 37 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 1d7d5d2881db7c19..5fba14a21163e41f 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -589,7 +589,7 @@ static void *__iommu_alloc_attrs(struct device *dev, 
> size_t size,
>   addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
> __builtin_return_address(0));
>   if (!addr)
> - iommu_dma_free(dev, pages, iosize, handle);
> + iommu_dma_free(dev, pages, iosize, handle, attrs);
>   } else {
>   struct page *page;
>   /*
> @@ -642,7 +642,7 @@ static void __iommu_free_attrs(struct device *dev, size_t 
> size, void *cpu_addr,
>  
>   if (WARN_ON(!area || !area->pages))
>   return;
> - iommu_dma_free(dev, area->pages, iosize, &handle);
> + iommu_dma_free(dev, area->pages, iosize, &handle, attrs);
>   dma_common_free_remap(cpu_addr, size, VM_USERMAP);
>   } else {
>   iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 2db0d641cf4505b5..b991e8dc35c5 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -30,6 +30,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  struct iommu_dma_msi_page {
>   struct list_headlist;
> @@ -238,15 +239,21 @@ static void __iommu_dma_unmap(struct iommu_domain 
> *domain, dma_addr_t dma_addr)
>   __free_iova(iovad, iova);
>  }
>  
> -static void __iommu_dma_free_pages(struct page **pages, int count)
> +static void __iommu_dma_free_pages(struct device *dev, struct page **pages,
> +int count, unsigned long attrs)
>  {
> - while (count--)
> - __free_page(pages[count]);
> + if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> + dma_release_from_contiguous(dev, pages[0], count);
> + } else {
> + while (count--)
> + __free_page(pages[count]);
> + }
>   kvfree(pages);
>  }
>  
> -static struct page **__iommu_dma_alloc_pages(unsigned int count,
> - unsigned long order_mask, gfp_t gfp)
> +static struct page **__iommu_dma_alloc_pages(struct device *dev,
> + unsigned int count, unsigned long order_mask, gfp_t gfp,
> + unsigned long attrs)
>  {
>   struct page **pages;
>   unsigned int i = 0, array_size = count * sizeof(*pages);
> @@ -265,6 +272,20 @@ static struct page **__iommu_dma_alloc_pages(unsigned 
> int count,
>   /* IOMMU can map any pages, so himem can also be used here */
>   gfp |= __GFP_NOWARN | __GFP_HIGHMEM;
>  
> + if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> + int order = get_order(count << PAGE_SHIFT);
> + struct page *page;
> +
> + page = dma_alloc_from_contiguous(dev, count, order);
> + if (!page)
> + return NULL;
> +
> + while (count--)
> + pages[i++] = page++;
> +
> + return pages;
> + }
> +

This is really yuck. Plus it's entirely pointless to go through the
whole page array/scatterlist dance when we know the buffer is going to
be physically contiguous - it should just be allocate, map, done. I'd
much rather see standalone iommu_dma_{alloc,free}_contiguous()
functions, and let the arch code handle dispatching appropriately.

Robin.

>   while (count) {
>   struct page *page = NULL;
>   unsigned int order_size;
> @@ -294,7 +315,7 @@ static struct page **__iommu_dma_alloc_pages(unsigned int 
> count,
>   __free_pages(page, order);
>   }
>   if (!page) {
> - __iommu_dma_free_pages(pages, i);
> + __iommu_dma_free_pages(de

Re: [PATCH] iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS

2017-01-13 Thread Robin Murphy
On 13/01/17 11:59, Geert Uytterhoeven wrote:
> Hi Robin,
> 
> On Fri, Jan 13, 2017 at 12:32 PM, Robin Murphy  wrote:
>> On 13/01/17 11:07, Geert Uytterhoeven wrote:
>>> Add support for DMA_ATTR_FORCE_CONTIGUOUS to the generic IOMMU DMA code.
>>> This allows allocating physically contiguous DMA buffers on arm64
>>> systems with an IOMMU.
>>
>> Can anyone explain what this attribute is actually used for? I've never
>> quite figured it out.
> 
> My understanding is that DMA_ATTR_FORCE_CONTIGUOUS is needed when using
> an IOMMU but wanting the buffers to be both contiguous in IOVA space and
> physically contiguous to allow passing to devices without IOMMU.
> 
> Main users are graphic and remote processors.

Sure, I assumed it must be to do with buffer sharing, but the systems
I'm aware of which have IOMMUs in their media subsystems tend to have
them in front of every IP block involved, so I was curious as to what
bit of non-IOMMU hardware wanted to play too. The lone in-tree use in
the Exynos DRM driver was never very revealing, and the new one I see in
the Qualcomm PIL driver frankly looks redundant to me.

Robin.

>>> --- a/drivers/iommu/dma-iommu.c
>>> +++ b/drivers/iommu/dma-iommu.c
> 
>>> @@ -265,6 +272,20 @@ static struct page **__iommu_dma_alloc_pages(unsigned 
>>> int count,
>>>   /* IOMMU can map any pages, so himem can also be used here */
>>>   gfp |= __GFP_NOWARN | __GFP_HIGHMEM;
>>>
>>> + if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
>>> + int order = get_order(count << PAGE_SHIFT);
>>> + struct page *page;
>>> +
>>> + page = dma_alloc_from_contiguous(dev, count, order);
>>> + if (!page)
>>> + return NULL;
>>> +
>>> + while (count--)
>>> + pages[i++] = page++;
>>> +
>>> + return pages;
>>> + }
>>> +
>>
>> This is really yuck. Plus it's entirely pointless to go through the
>> whole page array/scatterlist dance when we know the buffer is going to
>> be physically contiguous - it should just be allocate, map, done. I'd
>> much rather see standalone iommu_dma_{alloc,free}_contiguous()
>> functions, and let the arch code handle dispatching appropriately.
> 
> Fair enough.
> 
> Gr{oetje,eeting}s,
> 
> Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
> ge...@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like 
> that.
> -- Linus Torvalds
> 



[PATCH] iommu: Handle default domain attach failure

2017-01-16 Thread Robin Murphy
We wouldn't normally expect ops->attach_dev() to fail, but on IOMMUs
with limited hardware resources, or generally misconfigured systems,
it is certainly possible. We report failure correctly from the external
iommu_attach_device() interface, but do not do so in iommu_group_add()
when attaching to the default domain. The result of failure there is
that the device, group and domain all get left in a broken,
part-configured state which leads to weird errors and misbehaviour down
the line when IOMMU API calls sort-of-but-don't-quite work.

Check the return value of __iommu_attach_device() on the default domain,
and refactor the error handling paths to cope with its failure and clean
up correctly in such cases.
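The unwind pattern the refactoring introduces — one err_* label per completed setup step, jumped to in reverse order so each failure path undoes exactly what has been done so far — can be illustrated with a minimal standalone sketch (not the kernel code itself):

```c
#include <assert.h>
#include <string.h>

static char trace[64];	/* records setup steps and their undos */

static int step(const char *name, int fail)
{
	strcat(trace, name);
	return fail ? -1 : 0;
}

/* fail_at: which step fails (1..3), or 0 for full success. */
static int setup(int fail_at)
{
	int ret;

	trace[0] = '\0';
	if ((ret = step("A", fail_at == 1)))
		goto err_none;
	if ((ret = step("B", fail_at == 2)))
		goto err_undo_a;
	if ((ret = step("C", fail_at == 3)))
		goto err_undo_b;
	return 0;

err_undo_b:
	strcat(trace, "b");	/* undo B */
err_undo_a:
	strcat(trace, "a");	/* undo A */
err_none:
	return ret;
}
```

A failure at step C unwinds B then A; a failure at step A unwinds nothing — the same shape as the err_put_group/err_free_name/err_remove_link/err_free_device chain in the patch.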

Fixes: e39cb8a3aa98 ("iommu: Make sure a device is always attached to a domain")
Reported-by: Punit Agrawal 
Signed-off-by: Robin Murphy 
---
 drivers/iommu/iommu.c | 37 -
 1 file changed, 24 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index dbe7f653bb7c..aed906a3e3db 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -383,36 +383,30 @@ int iommu_group_add_device(struct iommu_group *group, 
struct device *dev)
device->dev = dev;
 
ret = sysfs_create_link(&dev->kobj, &group->kobj, "iommu_group");
-   if (ret) {
-   kfree(device);
-   return ret;
-   }
+   if (ret)
+   goto err_free_device;
 
device->name = kasprintf(GFP_KERNEL, "%s", kobject_name(&dev->kobj));
 rename:
if (!device->name) {
-   sysfs_remove_link(&dev->kobj, "iommu_group");
-   kfree(device);
-   return -ENOMEM;
+   ret = -ENOMEM;
+   goto err_remove_link;
}
 
ret = sysfs_create_link_nowarn(group->devices_kobj,
   &dev->kobj, device->name);
if (ret) {
-   kfree(device->name);
if (ret == -EEXIST && i >= 0) {
/*
 * Account for the slim chance of collision
 * and append an instance to the name.
 */
+   kfree(device->name);
device->name = kasprintf(GFP_KERNEL, "%s.%d",
 kobject_name(&dev->kobj), i++);
goto rename;
}
-
-   sysfs_remove_link(&dev->kobj, "iommu_group");
-   kfree(device);
-   return ret;
+   goto err_free_name;
}
 
kobject_get(group->devices_kobj);
@@ -424,8 +418,10 @@ int iommu_group_add_device(struct iommu_group *group, 
struct device *dev)
mutex_lock(&group->mutex);
list_add_tail(&device->list, &group->devices);
if (group->domain)
-   __iommu_attach_device(group->domain, dev);
+   ret = __iommu_attach_device(group->domain, dev);
mutex_unlock(&group->mutex);
+   if (ret)
+   goto err_put_group;
 
/* Notify any listeners about change to group. */
blocking_notifier_call_chain(&group->notifier,
@@ -436,6 +432,21 @@ int iommu_group_add_device(struct iommu_group *group, 
struct device *dev)
pr_info("Adding device %s to group %d\n", dev_name(dev), group->id);
 
return 0;
+
+err_put_group:
+   mutex_lock(&group->mutex);
+   list_del(&device->list);
+   mutex_unlock(&group->mutex);
+   dev->iommu_group = NULL;
+   kobject_put(group->devices_kobj);
+err_free_name:
+   kfree(device->name);
+err_remove_link:
+   sysfs_remove_link(&dev->kobj, "iommu_group");
+err_free_device:
+   kfree(device);
+   pr_err("Failed to add device %s to group %d: %d\n", dev_name(dev), 
group->id, ret);
+   return ret;
 }
 EXPORT_SYMBOL_GPL(iommu_group_add_device);
 
-- 
2.10.2.dirty



[PATCH v2 1/2] iommu/dma: Stop getting dma_32bit_pfn wrong

2017-01-16 Thread Robin Murphy
iommu_dma_init_domain() was originally written under the misconception
that dma_32bit_pfn represented some sort of size limit for IOVA domains.
Since the truth is almost the exact opposite of that, rework the logic
and comments to reflect its real purpose of optimising lookups when
allocating from a subset of the available 64-bit space.
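As a standalone sketch of the arithmetic involved (names are illustrative): clamping the free-area cache limit for PCI devices just masks the end PFN against the 32-bit boundary scaled down by the IOVA granule order, as the patch below does.

```c
#include <assert.h>
#include <stdint.h>

#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Clamp the cached allocation limit to 32 bits for PCI devices;
 * platform devices keep the top of their range. */
static uint64_t clamp_end_pfn(uint64_t end_pfn, unsigned int order,
			      int is_pci)
{
	if (is_pci)
		end_pfn &= DMA_BIT_MASK(32) >> order;
	return end_pfn;
}
```

With a 4KB IOVA granule (order 12), the 32-bit limit becomes PFN 0xFFFFF, while a non-PCI device's end PFN is left untouched.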

Signed-off-by: Robin Murphy 
---

Sending this as a v2 since both patches have been seen before, and #1 is
ever so slightly tweaked. #2 applies on top of Eric's MSI series, since
that seems ready to go now - there is a trivial merge conflict otherwise
around the extra argument in the __alloc_iova() call.

Robin.

 drivers/iommu/dma-iommu.c | 23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index de41ead6542a..9aa432e8fbd3 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -203,6 +203,7 @@ int iommu_dma_init_domain(struct iommu_domain *domain, 
dma_addr_t base,
struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iova_domain *iovad = &cookie->iovad;
unsigned long order, base_pfn, end_pfn;
+   bool pci = dev && dev_is_pci(dev);
 
if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
return -EINVAL;
@@ -225,19 +226,31 @@ int iommu_dma_init_domain(struct iommu_domain *domain, 
dma_addr_t base,
end_pfn = min_t(unsigned long, end_pfn,
domain->geometry.aperture_end >> order);
}
+   /*
+* PCI devices may have larger DMA masks, but still prefer allocating
+* within a 32-bit mask to avoid DAC addressing. Such limitations don't
+* apply to the typical platform device, so for those we may as well
+* leave the cache limit at the top of their range to save an rb_last()
+* traversal on every allocation.
+*/
+   if (pci)
+   end_pfn &= DMA_BIT_MASK(32) >> order;
 
-   /* All we can safely do with an existing domain is enlarge it */
+   /* start_pfn is always nonzero for an already-initialised domain */
if (iovad->start_pfn) {
if (1UL << order != iovad->granule ||
-   base_pfn != iovad->start_pfn ||
-   end_pfn < iovad->dma_32bit_pfn) {
+   base_pfn != iovad->start_pfn) {
pr_warn("Incompatible range for DMA domain\n");
return -EFAULT;
}
-   iovad->dma_32bit_pfn = end_pfn;
+   /*
+* If we have devices with different DMA masks, move the free
+* area cache limit down for the benefit of the smaller one.
+*/
+   iovad->dma_32bit_pfn = min(end_pfn, iovad->dma_32bit_pfn);
} else {
init_iova_domain(iovad, 1UL << order, base_pfn, end_pfn);
-   if (dev && dev_is_pci(dev))
+   if (pci)
iova_reserve_pci_windows(to_pci_dev(dev), iovad);
}
return 0;
-- 
2.10.2.dirty



[PATCH v2 2/2] iommu/dma: Implement PCI allocation optimisation

2017-01-16 Thread Robin Murphy
Whilst PCI devices may have 64-bit DMA masks, they still benefit from
using 32-bit addresses wherever possible in order to avoid DAC (PCI) or
longer address packets (PCIe), which may incur a performance overhead.
Implement the same optimisation as other allocators by trying to get a
32-bit address first, only falling back to the full mask if that fails.
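The policy can be modeled in a standalone sketch, with a fake allocator standing in for alloc_iova() (all names hypothetical; the fake pretends everything below `low_watermark` is already in use):

```c
#include <assert.h>
#include <stdint.h>

#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Fake top-down allocator: succeeds (returning the limit itself) only
 * if the requested limit is at or above the lowest free address. */
static uint64_t fake_alloc_iova(uint64_t limit, uint64_t low_watermark)
{
	return limit >= low_watermark ? limit : 0;	/* 0 == failure */
}

/* Try a SAC (32-bit) address first for PCI devices with a wider mask,
 * then fall back to the full DMA mask. */
static uint64_t alloc_with_sac_preference(uint64_t dma_limit, int is_pci,
					  uint64_t low_watermark)
{
	uint64_t iova = 0;

	if (is_pci && dma_limit > DMA_BIT_MASK(32))
		iova = fake_alloc_iova(DMA_BIT_MASK(32), low_watermark);
	if (!iova)
		iova = fake_alloc_iova(dma_limit, low_watermark);
	return iova;
}
```

A 64-bit-capable PCI device gets a 32-bit address while one is available, and only spills above 4GB once the low space is exhausted; non-PCI devices go straight to their full mask.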

Signed-off-by: Robin Murphy 
---
 drivers/iommu/dma-iommu.c | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 9aa432e8fbd3..8adff5f83b38 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -281,19 +281,28 @@ int dma_direction_to_prot(enum dma_data_direction dir, 
bool coherent)
 }
 
 static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
-   dma_addr_t dma_limit)
+   dma_addr_t dma_limit, struct device *dev)
 {
struct iova_domain *iovad = cookie_iovad(domain);
unsigned long shift = iova_shift(iovad);
unsigned long length = iova_align(iovad, size) >> shift;
+   struct iova *iova = NULL;
 
if (domain->geometry.force_aperture)
dma_limit = min(dma_limit, domain->geometry.aperture_end);
+
+   /* Try to get PCI devices a SAC address */
+   if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci(dev))
+   iova = alloc_iova(iovad, length, DMA_BIT_MASK(32) >> shift,
+ true);
/*
 * Enforce size-alignment to be safe - there could perhaps be an
 * attribute to control this per-device, or at least per-domain...
 */
-   return alloc_iova(iovad, length, dma_limit >> shift, true);
+   if (!iova)
+   iova = alloc_iova(iovad, length, dma_limit >> shift, true);
+
+   return iova;
 }
 
 /* The IOVA allocator knows what we mapped, so just unmap whatever that was */
@@ -446,7 +455,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t 
size, gfp_t gfp,
if (!pages)
return NULL;
 
-   iova = __alloc_iova(domain, size, dev->coherent_dma_mask);
+   iova = __alloc_iova(domain, size, dev->coherent_dma_mask, dev);
if (!iova)
goto out_free_pages;
 
@@ -517,7 +526,7 @@ static dma_addr_t __iommu_dma_map(struct device *dev, 
phys_addr_t phys,
struct iova_domain *iovad = cookie_iovad(domain);
size_t iova_off = iova_offset(iovad, phys);
size_t len = iova_align(iovad, size + iova_off);
-   struct iova *iova = __alloc_iova(domain, len, dma_get_mask(dev));
+   struct iova *iova = __alloc_iova(domain, len, dma_get_mask(dev), dev);
 
if (!iova)
return DMA_ERROR_CODE;
@@ -675,7 +684,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist 
*sg,
prev = s;
}
 
-   iova = __alloc_iova(domain, iova_len, dma_get_mask(dev));
+   iova = __alloc_iova(domain, iova_len, dma_get_mask(dev), dev);
if (!iova)
goto out_restore_sg;
 
@@ -755,7 +764,7 @@ static struct iommu_dma_msi_page 
*iommu_dma_get_msi_page(struct device *dev,
 
msi_page->phys = msi_addr;
if (iovad) {
-   iova = __alloc_iova(domain, size, dma_get_mask(dev));
+   iova = __alloc_iova(domain, size, dma_get_mask(dev), dev);
if (!iova)
goto out_free_page;
msi_page->iova = iova_dma_addr(iovad, iova);
-- 
2.10.2.dirty



Re: [PATCH v2] IOMMU: SMMUv2: Support for Extended Stream ID (16 bit)

2017-01-16 Thread Robin Murphy
On 16/01/17 14:11, Aleksey Makarov wrote:
> Enable the Extended Stream ID feature when available.
> 
> This patch on top of series "KVM PCIe/MSI passthrough on ARM/ARM64
> and IOVA reserved regions" by Eric Auger [1] allows to passthrough
> an external PCIe network card on a ThunderX server successfully.
> 
> Without this patch that card caused a warning like
> 
>   pci 0006:90:00.0: stream ID 0x9000 out of range for SMMU (0x7fff)
> 
> during boot.
> 
> [1] 
> https://lkml.kernel.org/r/1484127714-3263-1-git-send-email-eric.au...@redhat.com
> 
> Signed-off-by: Aleksey Makarov 
> ---
> v2:
> - remove unnecessary parentheses (Robin Murphy)
> - refactor testing SMR fields to after setting sCR0 as their width
>   depends on sCR0_EXIDENABLE (Robin Murphy)
> 
> v1 (rfc):
> https://lkml.kernel.org/r/20170110115755.19102-1-aleksey.maka...@linaro.org
> 
>  drivers/iommu/arm-smmu.c | 67 
> ++--
>  1 file changed, 48 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 13d26009b8e0..c33df4083d24 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -24,6 +24,7 @@
>   *   - v7/v8 long-descriptor format
>   *   - Non-secure access to the SMMU
>   *   - Context fault reporting
> + *   - Extended Stream ID (16 bit)
>   */
>  
>  #define pr_fmt(fmt) "arm-smmu: " fmt
> @@ -87,6 +88,7 @@
>  #define sCR0_CLIENTPD(1 << 0)
>  #define sCR0_GFRE(1 << 1)
>  #define sCR0_GFIE(1 << 2)
> +#define sCR0_EXIDENABLE  (1 << 3)
>  #define sCR0_GCFGFRE (1 << 4)
>  #define sCR0_GCFGFIE (1 << 5)
>  #define sCR0_USFCFG  (1 << 10)
> @@ -126,6 +128,7 @@
>  #define ID0_NUMIRPT_MASK 0xff
>  #define ID0_NUMSIDB_SHIFT9
>  #define ID0_NUMSIDB_MASK 0xf
> +#define ID0_EXIDS(1 << 8)
>  #define ID0_NUMSMRG_SHIFT0
>  #define ID0_NUMSMRG_MASK 0xff
>  
> @@ -169,6 +172,7 @@
>  #define ARM_SMMU_GR0_S2CR(n) (0xc00 + ((n) << 2))
>  #define S2CR_CBNDX_SHIFT 0
>  #define S2CR_CBNDX_MASK  0xff
> +#define S2CR_EXIDVALID   (1 << 10)
>  #define S2CR_TYPE_SHIFT  16
>  #define S2CR_TYPE_MASK   0x3
>  enum arm_smmu_s2cr_type {
> @@ -354,6 +358,7 @@ struct arm_smmu_device {
>  #define ARM_SMMU_FEAT_FMT_AARCH64_64K(1 << 9)
>  #define ARM_SMMU_FEAT_FMT_AARCH32_L  (1 << 10)
>  #define ARM_SMMU_FEAT_FMT_AARCH32_S  (1 << 11)
> +#define ARM_SMMU_FEAT_EXIDS  (1 << 12)
>   u32 features;
>  
>  #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
> @@ -1051,7 +1056,7 @@ static void arm_smmu_write_smr(struct arm_smmu_device 
> *smmu, int idx)
>   struct arm_smmu_smr *smr = smmu->smrs + idx;
>   u32 reg = smr->id << SMR_ID_SHIFT | smr->mask << SMR_MASK_SHIFT;
>  
> - if (smr->valid)
> + if (!(smmu->features & ARM_SMMU_FEAT_EXIDS) && smr->valid)
>   reg |= SMR_VALID;
>   writel_relaxed(reg, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_SMR(idx));
>  }
> @@ -1063,6 +1068,9 @@ static void arm_smmu_write_s2cr(struct arm_smmu_device 
> *smmu, int idx)
> (s2cr->cbndx & S2CR_CBNDX_MASK) << S2CR_CBNDX_SHIFT |
> (s2cr->privcfg & S2CR_PRIVCFG_MASK) << S2CR_PRIVCFG_SHIFT;
>  
> + if (smmu->features & ARM_SMMU_FEAT_EXIDS && smmu->smrs &&
> + smmu->smrs[idx].valid)
> + reg |= S2CR_EXIDVALID;
>   writel_relaxed(reg, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_S2CR(idx));
>  }
>  
> @@ -1073,6 +1081,35 @@ static void arm_smmu_write_sme(struct arm_smmu_device 
> *smmu, int idx)
>   arm_smmu_write_smr(smmu, idx);
>  }
>  
> +/*
> + * The width of SMR's mask field depends on sCR0_EXIDENABLE, so this function
> + * should be called after sCR0 is written.
> + */
> +static void arm_smmu_test_smr_masks(struct arm_smmu_device *smmu)
> +{
> + void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
> + u32 smr;
> +
> + if (!smmu->smrs)
> + return;
> +
> + /*
> +  * SMR.ID bits may not be preserved if the corresponding
> +  * MASK bits are set, so check each one separately.
> +  * We can reject masters later if they try to claim IDs
> +  * outside these

Re: [PATCH V5 05/12] drivers: platform: Configure dma operations at probe time

2017-01-19 Thread Robin Murphy
On 19/01/17 15:05, Sricharan R wrote:
> Configuring DMA ops at probe time will allow deferring device probe when
> the IOMMU isn't available yet. The dma_configure for the device is
> now called from the generic device_attach callback just before the
> bus/driver probe is called. This way, configuring the DMA ops for the
> device would be called at the same place for all bus_types, hence the
> deferred probing mechanism should work for all buses as well.
> 
> pci_bus_add_devices       (platform/amba)(_device_create/driver_register)
>         |                                  |
> pci_bus_add_device            (device_add/driver_register)
>         |                                  |
>   device_attach                   device_initial_probe
>         |                                  |
> __device_attach_driver         __device_attach_driver
>                    \                 /
>                   driver_probe_device
>                            |
>                       really_probe
>                            |
>                      dma_configure
> 
> Similarly on the device/driver_unregister path __device_release_driver is
> called, which in turn calls dma_deconfigure.
> 
> Signed-off-by: Sricharan R 
> ---
>  * Removed the dma configuration for the pci devices in case of DT
>from pci_dma_configure which was hanging outside separately and
>doing it in dma_configure function itself.
> 
>  drivers/base/dd.c   |  9 +
>  drivers/base/dma-mapping.c  | 32 
>  drivers/of/platform.c   |  5 +
>  drivers/pci/probe.c |  5 +
>  include/linux/dma-mapping.h |  3 +++
>  5 files changed, 46 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index a1fbf55..4882f06 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -19,6 +19,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -356,6 +357,10 @@ static int really_probe(struct device *dev, struct 
> device_driver *drv)
>   if (ret)
>   goto pinctrl_bind_failed;
>  
> + ret = dma_configure(dev);
> + if (ret)
> + goto dma_failed;
> +
>   if (driver_sysfs_add(dev)) {
>   printk(KERN_ERR "%s: driver_sysfs_add(%s) failed\n",
>   __func__, dev_name(dev));
> @@ -417,6 +422,8 @@ static int really_probe(struct device *dev, struct 
> device_driver *drv)
>   goto done;
>  
>  probe_failed:
> + dma_deconfigure(dev);
> +dma_failed:
>   if (dev->bus)
>   blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
>BUS_NOTIFY_DRIVER_NOT_BOUND, dev);
> @@ -826,6 +833,8 @@ static void __device_release_driver(struct device *dev, 
> struct device *parent)
>   drv->remove(dev);
>  
>   device_links_driver_cleanup(dev);
> + dma_deconfigure(dev);
> +
>   devres_release_all(dev);
>   dev->driver = NULL;
>   dev_set_drvdata(dev, NULL);
> diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
> index efd71cf..dfe6fd7 100644
> --- a/drivers/base/dma-mapping.c
> +++ b/drivers/base/dma-mapping.c
> @@ -10,6 +10,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -341,3 +342,34 @@ void dma_common_free_remap(void *cpu_addr, size_t size, 
> unsigned long vm_flags)
>   vunmap(cpu_addr);
>  }
>  #endif
> +
> +/*
> + * Common configuration to enable DMA API use for a device
> + */
> +#include 
> +
> +int dma_configure(struct device *dev)
> +{
> + struct device *_dev = dev;
> + int is_pci = dev_is_pci(dev);
> +
> + if (is_pci) {
> + _dev = pci_get_host_bridge_device(to_pci_dev(dev));
> + if (IS_ENABLED(CONFIG_OF) && _dev->parent &&
> + _dev->parent->of_node)
> + _dev = _dev->parent;
> + }
> +
> + if (_dev->of_node)
> + of_dma_configure(dev, _dev->of_node);
> +
> + if (is_pci)
> + pci_put_host_bridge_device(_dev);

There's a fun bug here - at this point _dev is the *parent* of the
bridge device, so we put the refcount on the wrong device (the platform
device representing the host controller, rather than the PCI device
representing its insides), which frees the guy we're in the middle of
probing, and things rapidly go wrong afterwards:

[1.461026] bus: 'platform': driver_probe_device: matched device
4000.pcie-controller with driver pci-host-generic
[1.471640] bus: 'platform': really_probe: probing driver
pci-host-generic with device 4000.pcie-controller
[1.481678] OF: PCI: host bridge /pcie-controller@4000 ranges:

...

[2.158259] bus: 'pci': driver_probe_device: matched device
:02:10.0 with driver pcieport
[2.166716] bus: 'pci': really_probe: probing driver pcieport with
device :02:10.0
[2.174590] pci :02:10.0: Driver pcieport requests probe deferral
[2.180978] pci :02:10.0: Added to deferred list
[2.185915] bus: 'pci': driver_probe_device: matched device
:02:1f.0 with driver pcieport
[  

Re: [PATCH V5 08/12] iommu/arm-smmu: Clean up early-probing workarounds

2017-01-19 Thread Robin Murphy
On 19/01/17 16:50, Lorenzo Pieralisi wrote:
> On Thu, Jan 19, 2017 at 08:35:52PM +0530, Sricharan R wrote:
>> From: Robin Murphy 
>>
>> Now that the appropriate ordering is enforced via probe-deferral of
>> masters in core code, rip it all out and bask in the simplicity.
>>
>> Signed-off-by: Robin Murphy 
>> [Sricharan: Rebased on top of ACPI IORT SMMU series]
>> Signed-off-by: Sricharan R 
>> ---
>>  * No change
> 
> Well, a tad too early on the series for ACPI, aka if we bisect the
> series here you would break ACPI.
> 
> Totally agree on the patch, but you should move it to the end of the
> series.

Indeed - I think a more appropriate ordering of the current patch
numbers would be:

1, 2, 3, 4, 9, 5+10 (squashed), 6, 11, 7, 8, 12

Robin.

> 
> Lorenzo
> 
>>  drivers/iommu/arm-smmu-v3.c | 46 ++-
>>  drivers/iommu/arm-smmu.c| 58 
>> +++--
>>  2 files changed, 10 insertions(+), 94 deletions(-)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 7d45d8b..7fc4e5f 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -2724,51 +2724,9 @@ static int arm_smmu_device_remove(struct 
>> platform_device *pdev)
>>  .probe  = arm_smmu_device_probe,
>>  .remove = arm_smmu_device_remove,
>>  };
>> +module_platform_driver(arm_smmu_driver);
>>  
>> -static int __init arm_smmu_init(void)
>> -{
>> -static bool registered;
>> -int ret = 0;
>> -
>> -if (!registered) {
>> -ret = platform_driver_register(&arm_smmu_driver);
>> -registered = !ret;
>> -}
>> -return ret;
>> -}
>> -
>> -static void __exit arm_smmu_exit(void)
>> -{
>> -return platform_driver_unregister(&arm_smmu_driver);
>> -}
>> -
>> -subsys_initcall(arm_smmu_init);
>> -module_exit(arm_smmu_exit);
>> -
>> -static int __init arm_smmu_of_init(struct device_node *np)
>> -{
>> -int ret = arm_smmu_init();
>> -
>> -if (ret)
>> -return ret;
>> -
>> -if (!of_platform_device_create(np, NULL, platform_bus_type.dev_root))
>> -return -ENODEV;
>> -
>> -return 0;
>> -}
>> -IOMMU_OF_DECLARE(arm_smmuv3, "arm,smmu-v3", arm_smmu_of_init);
>> -
>> -#ifdef CONFIG_ACPI
>> -static int __init acpi_smmu_v3_init(struct acpi_table_header *table)
>> -{
>> -if (iort_node_match(ACPI_IORT_NODE_SMMU_V3))
>> -return arm_smmu_init();
>> -
>> -return 0;
>> -}
>> -IORT_ACPI_DECLARE(arm_smmu_v3, ACPI_SIG_IORT, acpi_smmu_v3_init);
>> -#endif
>> +IOMMU_OF_DECLARE(arm_smmuv3, "arm,smmu-v3", NULL);
>>  
>>  MODULE_DESCRIPTION("IOMMU API for ARM architected SMMUv3 implementations");
>>  MODULE_AUTHOR("Will Deacon ");
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index 73a0a25..c86ae5f 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -2134,56 +2134,14 @@ static int arm_smmu_device_remove(struct 
>> platform_device *pdev)
>>  .probe  = arm_smmu_device_probe,
>>  .remove = arm_smmu_device_remove,
>>  };
>> -
>> -static int __init arm_smmu_init(void)
>> -{
>> -static bool registered;
>> -int ret = 0;
>> -
>> -if (!registered) {
>> -ret = platform_driver_register(&arm_smmu_driver);
>> -registered = !ret;
>> -}
>> -return ret;
>> -}
>> -
>> -static void __exit arm_smmu_exit(void)
>> -{
>> -return platform_driver_unregister(&arm_smmu_driver);
>> -}
>> -
>> -subsys_initcall(arm_smmu_init);
>> -module_exit(arm_smmu_exit);
>> -
>> -static int __init arm_smmu_of_init(struct device_node *np)
>> -{
>> -int ret = arm_smmu_init();
>> -
>> -if (ret)
>> -return ret;
>> -
>> -if (!of_platform_device_create(np, NULL, platform_bus_type.dev_root))
>> -return -ENODEV;
>> -
>> -return 0;
>> -}
>> -IOMMU_OF_DECLARE(arm_smmuv1, "arm,smmu-v1", arm_smmu_of_init);
>> -IOMMU_OF_DECLARE(arm_smmuv2, "arm,smmu-v2", arm_smmu_of_init);
>> -IOMMU_OF_DECLARE(arm_mmu400, "arm,mmu-400", arm_smmu_of_init);
>> -IOMMU_OF_DECLARE(arm_mmu401, "arm,mmu-401", arm_smmu_of_init);
>> -IOMMU_OF_DECLARE(arm_m

Re: [PATCH 1/5] iommu/arm-smmu: Restrict domain attributes to UNMANAGED domains

2017-01-19 Thread Robin Murphy
On 19/01/17 18:19, Will Deacon wrote:
> The ARM SMMU drivers provide a DOMAIN_ATTR_NESTING domain attribute,
> which allows callers of the IOMMU API to request that the page table
> for a domain is installed at stage-2, if supported by the hardware.
> 
> Since setting this attribute only makes sense for UNMANAGED domains,
> this patch returns -ENODEV if the domain_{get,set}_attr operations are
> called on other domain types.

For the sake of discussion, would it make sense to enforce this in
domain_set_attr() itself? The intersection of drivers providing these
callbacks and drivers supporting anything other than unmanaged domains
is currently these two below, so it clearly wouldn't break anything to
put this check in core code today. Looking forward, is there likely to
be any plausible situation where users of a managed domain would be
legitimate in mucking about with its attrs, on any platform?

Robin.

> Signed-off-by: Will Deacon 
> ---
>  drivers/iommu/arm-smmu-v3.c | 6 ++
>  drivers/iommu/arm-smmu.c| 6 ++
>  2 files changed, 12 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 4d6ec444a9d6..c254325b0c7a 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -1839,6 +1839,9 @@ static int arm_smmu_domain_get_attr(struct iommu_domain 
> *domain,
>  {
>   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>  
> + if (domain->type != IOMMU_DOMAIN_UNMANAGED)
> + return -ENODEV;
> +
>   switch (attr) {
>   case DOMAIN_ATTR_NESTING:
>   *(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
> @@ -1854,6 +1857,9 @@ static int arm_smmu_domain_set_attr(struct iommu_domain 
> *domain,
>   int ret = 0;
>   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>  
> + if (domain->type != IOMMU_DOMAIN_UNMANAGED)
> + return -ENODEV;
> +
>   mutex_lock(&smmu_domain->init_mutex);
>  
>   switch (attr) {
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index a60cded8a6ed..a328ffb75509 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -1497,6 +1497,9 @@ static int arm_smmu_domain_get_attr(struct iommu_domain 
> *domain,
>  {
>   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>  
> + if (domain->type != IOMMU_DOMAIN_UNMANAGED)
> + return -ENODEV;
> +
>   switch (attr) {
>   case DOMAIN_ATTR_NESTING:
>   *(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
> @@ -1512,6 +1515,9 @@ static int arm_smmu_domain_set_attr(struct iommu_domain 
> *domain,
>   int ret = 0;
>   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>  
> + if (domain->type != IOMMU_DOMAIN_UNMANAGED)
> + return -ENODEV;
> +
>   mutex_lock(&smmu_domain->init_mutex);
>  
>   switch (attr) {
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/5] iommu/arm-smmu: Install bypass S2CRs for IOMMU_DOMAIN_IDENTITY domains

2017-01-19 Thread Robin Murphy
On 19/01/17 18:19, Will Deacon wrote:
> In preparation for allowing the default domain type to be overridden,
> this patch adds support for IOMMU_DOMAIN_IDENTITY domains to the
> ARM SMMU driver.
> 
> An identity domain is created by placing the corresponding S2CR
> registers into "bypass" mode, which allows transactions to flow through
> the SMMU without any translation.

The other subtle nicety this opens the door to is being able to disable
unmatched stream bypass without needing enough context banks for every
single known device, since S2CRs are generally more abundant.

Reviewed-by: Robin Murphy 

> Signed-off-by: Will Deacon 
> ---
>  drivers/iommu/arm-smmu.c | 20 +---
>  1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index a328ffb75509..0f5e42a719e5 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -404,6 +404,7 @@ enum arm_smmu_domain_stage {
>   ARM_SMMU_DOMAIN_S1 = 0,
>   ARM_SMMU_DOMAIN_S2,
>   ARM_SMMU_DOMAIN_NESTED,
> + ARM_SMMU_DOMAIN_BYPASS,
>  };
>  
>  struct arm_smmu_domain {
> @@ -824,6 +825,12 @@ static int arm_smmu_init_domain_context(struct 
> iommu_domain *domain,
>   if (smmu_domain->smmu)
>   goto out_unlock;
>  
> + if (domain->type == IOMMU_DOMAIN_IDENTITY) {
> + smmu_domain->stage = ARM_SMMU_DOMAIN_BYPASS;
> + smmu_domain->smmu = smmu;
> + goto out_unlock;
> + }
> +
>   /*
>* Mapping the requested stage onto what we support is surprisingly
>* complicated, mainly because the spec allows S1+S2 SMMUs without
> @@ -984,7 +991,7 @@ static void arm_smmu_destroy_domain_context(struct 
> iommu_domain *domain)
>   void __iomem *cb_base;
>   int irq;
>  
> - if (!smmu)
> + if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY)
>   return;
>  
>   /*
> @@ -1007,7 +1014,9 @@ static struct iommu_domain 
> *arm_smmu_domain_alloc(unsigned type)
>  {
>   struct arm_smmu_domain *smmu_domain;
>  
> - if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
> + if (type != IOMMU_DOMAIN_UNMANAGED &&
> + type != IOMMU_DOMAIN_DMA &&
> + type != IOMMU_DOMAIN_IDENTITY)
>   return NULL;
>   /*
>* Allocate the domain and initialise some of its data structures.
> @@ -1205,10 +1214,15 @@ static int arm_smmu_domain_add_master(struct 
> arm_smmu_domain *smmu_domain,
>  {
>   struct arm_smmu_device *smmu = smmu_domain->smmu;
>   struct arm_smmu_s2cr *s2cr = smmu->s2crs;
> - enum arm_smmu_s2cr_type type = S2CR_TYPE_TRANS;
>   u8 cbndx = smmu_domain->cfg.cbndx;
> + enum arm_smmu_s2cr_type type;
>   int i, idx;
>  
> + if (smmu_domain->stage == ARM_SMMU_DOMAIN_BYPASS)
> + type = S2CR_TYPE_BYPASS;
> + else
> + type = S2CR_TYPE_TRANS;
> +
>   for_each_cfg_sme(fwspec, i, idx) {
>   if (type == s2cr[idx].type && cbndx == s2cr[idx].cbndx)
>   continue;
> 



Re: [PATCH 3/5] iommu/arm-smmu-v3: Install bypass STEs for IOMMU_DOMAIN_IDENTITY domains

2017-01-19 Thread Robin Murphy
On 19/01/17 18:19, Will Deacon wrote:
> In preparation for allowing the default domain type to be overridden,
> this patch adds support for IOMMU_DOMAIN_IDENTITY domains to the
> ARM SMMUv3 driver.
> 
> An identity domain is created by placing the corresponding stream table
> entries into "bypass" mode, which allows transactions to flow through
> the SMMU without any translation.
> 
> Signed-off-by: Will Deacon 
> ---
>  drivers/iommu/arm-smmu-v3.c | 14 --
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index c254325b0c7a..d33291274455 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -629,6 +629,7 @@ enum arm_smmu_domain_stage {
>   ARM_SMMU_DOMAIN_S1 = 0,
>   ARM_SMMU_DOMAIN_S2,
>   ARM_SMMU_DOMAIN_NESTED,
> + ARM_SMMU_DOMAIN_BYPASS,
>  };
>  
>  struct arm_smmu_domain {
> @@ -1385,7 +1386,9 @@ static struct iommu_domain 
> *arm_smmu_domain_alloc(unsigned type)
>  {
>   struct arm_smmu_domain *smmu_domain;
>  
> - if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
> + if (type != IOMMU_DOMAIN_UNMANAGED &&
> + type != IOMMU_DOMAIN_DMA &&
> + type != IOMMU_DOMAIN_IDENTITY)
>   return NULL;
>  
>   /*
> @@ -1516,6 +1519,11 @@ static int arm_smmu_domain_finalise(struct 
> iommu_domain *domain)
>   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>   struct arm_smmu_device *smmu = smmu_domain->smmu;
>  
> + if (domain->type == IOMMU_DOMAIN_IDENTITY) {
> + smmu_domain->stage = ARM_SMMU_DOMAIN_BYPASS;
> + return 0;
> + }
> +
>   /* Restrict the stage to what we can actually support */
>   if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1))
>   smmu_domain->stage = ARM_SMMU_DOMAIN_S2;
> @@ -1651,7 +1659,9 @@ static int arm_smmu_attach_dev(struct iommu_domain 
> *domain, struct device *dev)
>   ste->bypass = false;
>   ste->valid = true;
>  
> - if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
> + if (smmu_domain->stage == ARM_SMMU_DOMAIN_BYPASS) {
> + ste->bypass = true;

How is this intended to interact with the disable_bypass parameter? at
the moment, that will still end up transforming this into a faulting
STE, and I'm not sure that's right. I'd say we want to treat "bypass
because not attached to a domain" and "bypass because attached to a
passthrough domain" as distinct things, and it's only really the former
which makes sense to disable.

Robin.

> + } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
>   ste->s1_cfg = &smmu_domain->s1_cfg;
>   ste->s2_cfg = NULL;
>   arm_smmu_write_ctx_desc(smmu, ste->s1_cfg);
> 



Re: [PATCH 4/5] arm64: dma-mapping: Only swizzle DMA ops for IOMMU_DOMAIN_DMA

2017-01-19 Thread Robin Murphy
On 19/01/17 18:19, Will Deacon wrote:
> The arm64 DMA-mapping implementation sets the DMA ops to the IOMMU DMA
> ops if we detect that an IOMMU is present for the master and the DMA
> ranges are valid.
> 
> In the case when the IOMMU domain for the device is not of type
> IOMMU_DOMAIN_DMA, then we have no business swizzling the ops, since
> we're not in control of the underlying address space. This patch leaves
> the DMA ops alone for masters attached to non-DMA IOMMU domains.

In fact, I don't think there would be any harm in taking this one
through arm64 straight away. The DMA ops can't be expected to work
successfully on any old domain, so it's a reasonable sanity check
regardless.

Reviewed-by: Robin Murphy 

> Signed-off-by: Will Deacon 
> ---
>  arch/arm64/mm/dma-mapping.c | 17 -
>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index e04082700bb1..5d3c6ad621e8 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -831,14 +831,21 @@ static bool do_iommu_attach(struct device *dev, const 
> struct iommu_ops *ops,
>* then the IOMMU core will have already configured a group for this
>* device, and allocated the default domain for that group.
>*/
> - if (!domain || iommu_dma_init_domain(domain, dma_base, size, dev)) {
> - pr_warn("Failed to set up IOMMU for device %s; retaining 
> platform DMA ops\n",
> - dev_name(dev));
> - return false;
> + if (!domain)
> + goto out_err;
> +
> + if (domain->type == IOMMU_DOMAIN_DMA) {
> + if (iommu_dma_init_domain(domain, dma_base, size, dev))
> + goto out_err;
> +
> + dev->archdata.dma_ops = &iommu_dma_ops;
>   }
>  
> - dev->archdata.dma_ops = &iommu_dma_ops;
>   return true;
> +out_err:
> + pr_warn("Failed to set up IOMMU for device %s; retaining platform DMA 
> ops\n",
> +  dev_name(dev));
> + return false;
>  }
>  
>  static void queue_iommu_attach(struct device *dev, const struct iommu_ops 
> *ops,
> 



Re: [PATCH/RFC 1/2] arm64: mm: Silently allow devices lacking IOMMU group

2017-01-23 Thread Robin Murphy
Hi Magnus,

On 23/01/17 12:12, Magnus Damm wrote:
> From: Magnus Damm 
> 
> Consider failure of iommu_get_domain_for_dev() as non-critical and
> get rid of the warning printout. This allows IOMMU properties to be
> included in the DTB even though the kernel is configured with
> CONFIG_IOMMU_API=n or in case a particular IOMMU driver refuses to
> enable IOMMU support for a certain slave device and returns error
> from the ->add_device() callback.
> 
> This is only a cosmetic change that removes console warning printouts.

The warning is there for a reason - at this point, we *expected* the
device to be using an IOMMU for DMA, so a failure is significant. Rather
than masking genuine failures in other cases because your case
deliberately breaks that expectation, simply change the expectation -
i.e. rather than letting of_xlate() succeed then failing add_device()
later, reject the of_xlate() call up-front such that the DMA layer never
gets told about the IOMMU in the first place.

Robin.

> Signed-off-by: Magnus Damm 
> ---
> 
>  arch/arm64/mm/dma-mapping.c |   10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> --- 0001/arch/arm64/mm/dma-mapping.c
> +++ work/arch/arm64/mm/dma-mapping.c  2017-01-23 20:54:40.060607110 +0900
> @@ -827,11 +827,19 @@ static bool do_iommu_attach(struct devic
>   struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
>  
>   /*
> +  * In case IOMMU support is excluded from the kernel or if the device
> +  * is not hooked up to any IOMMU group then be silent and keep the
> +  * old dma_ops.
> +  */
> + if (!domain)
> + return false;
> +
> + /*
>* If the IOMMU driver has the DMA domain support that we require,
>* then the IOMMU core will have already configured a group for this
>* device, and allocated the default domain for that group.
>*/
> - if (!domain || iommu_dma_init_domain(domain, dma_base, size, dev)) {
> + if (iommu_dma_init_domain(domain, dma_base, size, dev)) {
>   pr_warn("Failed to set up IOMMU for device %s; retaining 
> platform DMA ops\n",
>   dev_name(dev));
>   return false;
> 



Re: [PATCH/RFC 2/4] iommu: iova: use __alloc_percpu_gfp() with GFP_NOWAIT in init_iova_rcaches()

2017-01-25 Thread Robin Murphy
On 25/01/17 12:54, Yoshihiro Shimoda wrote:
> In the future, init_iova_rcaches() will be called in atomic context.

That screams "doing the wrong thing". The sole point of the rcaches is
to reuse IOVAs, whereas the main point of this series seems to involve
not reusing IOVAs. The fact that we have to affect the allocation of
something we explicitly want to avoid using, rather than, say, not
allocating it at all, is a bit of a red flag.

Also, the rcaches are rather big. Allocating all this lot in atomic
context is not necessarily going to be the best idea.

Robin.

> Signed-off-by: Yoshihiro Shimoda 
> ---
>  drivers/iommu/iova.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index b7268a1..866ad65 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -723,7 +723,9 @@ static void init_iova_rcaches(struct iova_domain *iovad)
>   rcache = &iovad->rcaches[i];
>   spin_lock_init(&rcache->lock);
>   rcache->depot_size = 0;
> - rcache->cpu_rcaches = __alloc_percpu(sizeof(*cpu_rcache), 
> cache_line_size());
> + rcache->cpu_rcaches = __alloc_percpu_gfp(sizeof(*cpu_rcache),
> +  cache_line_size(),
> +  GFP_NOWAIT);
>   if (WARN_ON(!rcache->cpu_rcaches))
>   continue;
>   for_each_possible_cpu(cpu) {
> 



Re: [PATCH/RFC 1/4] iommu: dma: track mapped iova

2017-01-25 Thread Robin Murphy
On 25/01/17 12:53, Yoshihiro Shimoda wrote:
> From: Magnus Damm 
> 
> To track mapped iova for a workaround code in the future.
> 
> Signed-off-by: Magnus Damm 
> Signed-off-by: Yoshihiro Shimoda 
> ---
>  drivers/iommu/dma-iommu.c | 29 +++--
>  include/linux/iommu.h |  2 ++

So what's being added here is a counter of allocations within the
iova_domain held by an iommu_dma_cookie? Then why is it all the way down
in the iommu_domain and not in the cookie? That's needlessly invasive -
it would be almost understandable (but still horrible) if you needed to
refer to it directly from the IOMMU driver, but as far as I can see you
don't.

Robin.

>  2 files changed, 25 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 2db0d64..a0b8c0f 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -19,6 +19,7 @@
>   * along with this program.  If not, see .
>   */
>  
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -147,6 +148,7 @@ int iommu_dma_init_domain(struct iommu_domain *domain, 
> dma_addr_t base,
>   order = __ffs(domain->pgsize_bitmap);
>   base_pfn = max_t(unsigned long, 1, base >> order);
>   end_pfn = (base + size - 1) >> order;
> + atomic_set(&domain->iova_pfns_mapped, 0);
>  
>   /* Check the domain allows at least some access to the device... */
>   if (domain->geometry.force_aperture) {
> @@ -209,6 +211,7 @@ static struct iova *__alloc_iova(struct iommu_domain 
> *domain, size_t size,
>   struct iova_domain *iovad = cookie_iovad(domain);
>   unsigned long shift = iova_shift(iovad);
>   unsigned long length = iova_align(iovad, size) >> shift;
> + struct iova *iova;
>  
>   if (domain->geometry.force_aperture)
>   dma_limit = min(dma_limit, domain->geometry.aperture_end);
> @@ -216,9 +219,23 @@ static struct iova *__alloc_iova(struct iommu_domain 
> *domain, size_t size,
>* Enforce size-alignment to be safe - there could perhaps be an
>* attribute to control this per-device, or at least per-domain...
>*/
> - return alloc_iova(iovad, length, dma_limit >> shift, true);
> + iova = alloc_iova(iovad, length, dma_limit >> shift, true);
> + if (iova)
> + atomic_add(iova_size(iova), &domain->iova_pfns_mapped);
> +
> + return iova;
>  }
>  
> +void
> +__free_iova_domain(struct iommu_domain *domain, struct iova *iova)
> +{
> + struct iova_domain *iovad = cookie_iovad(domain);
> +
> + atomic_sub(iova_size(iova), &domain->iova_pfns_mapped);
> + __free_iova(iovad, iova);
> +}
> +
> +
>  /* The IOVA allocator knows what we mapped, so just unmap whatever that was 
> */
>  static void __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t 
> dma_addr)
>  {
> @@ -235,7 +252,7 @@ static void __iommu_dma_unmap(struct iommu_domain 
> *domain, dma_addr_t dma_addr)
>   size -= iommu_unmap(domain, pfn << shift, size);
>   /* ...and if we can't, then something is horribly, horribly wrong */
>   WARN_ON(size > 0);
> - __free_iova(iovad, iova);
> + __free_iova_domain(domain, iova);
>  }
>  
>  static void __iommu_dma_free_pages(struct page **pages, int count)
> @@ -401,7 +418,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t 
> size, gfp_t gfp,
>  out_free_sg:
>   sg_free_table(&sgt);
>  out_free_iova:
> - __free_iova(iovad, iova);
> + __free_iova_domain(domain, iova);
>  out_free_pages:
>   __iommu_dma_free_pages(pages, count);
>   return NULL;
> @@ -447,7 +464,7 @@ static dma_addr_t __iommu_dma_map(struct device *dev, 
> phys_addr_t phys,
>  
>   dma_addr = iova_dma_addr(iovad, iova);
>   if (iommu_map(domain, dma_addr, phys - iova_off, len, prot)) {
> - __free_iova(iovad, iova);
> + __free_iova_domain(domain, iova);
>   return DMA_ERROR_CODE;
>   }
>   return dma_addr + iova_off;
> @@ -613,7 +630,7 @@ int iommu_dma_map_sg(struct device *dev, struct 
> scatterlist *sg,
>   return __finalise_sg(dev, sg, nents, dma_addr);
>  
>  out_free_iova:
> - __free_iova(iovad, iova);
> + __free_iova_domain(domain, iova);
>  out_restore_sg:
>   __invalidate_sg(sg, nents);
>   return 0;
> @@ -689,7 +706,7 @@ static struct iommu_dma_msi_page 
> *iommu_dma_get_msi_page(struct device *dev,
>   return msi_page;
>  
>  out_free_iova:
> - __free_iova(iovad, iova);
> + __free_iova_domain(domain, iova);
>  out_free_page:
>   kfree(msi_page);
>   return NULL;
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 0ff5111..91d0159 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -19,6 +19,7 @@
>  #ifndef __LINUX_IOMMU_H
>  #define __LINUX_IOMMU_H
>  
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -84,6 +85,7 @@ struct iommu_domain {
>   void *handler_token;
>   struct iommu_domain_geometry 

Re: [PATCH/RFC 3/4] iommu: dma: iommu iova domain reset

2017-01-25 Thread Robin Murphy
On 25/01/17 12:54, Yoshihiro Shimoda wrote:
> From: Magnus Damm 
> 
> To add a workaround code for ipmmu-vmsa driver, this patch adds
> a new geometry "force_reset_when_empty" not to reuse iova space.

The domain geometry is absolutely not the appropriate place for that. If
anything, it could possibly be a domain attribute, but it has nothing to
do with describing the range of input addresses the hardware is able to
translate.

> When all pfns happen to get unmapped then ask the IOMMU driver to
> flush the state followed by starting from an empty iova space.

And what happens if all the PFNs are never unmapped? Many devices (USB
being among them) use a coherent allocation for some kind of descriptor
buffer which exists for the lifetime of the device, then use streaming
mappings for data transfers - the net result of that is that the number
of PFNs mapped will always be >=1, and eventually streaming mapping will
fail because you've exhausted the address space. And if the device *is*
a USB controller, at that point the thread will hang because the USB
core's response to a DMA mapping failure happens to be "keep trying
indefinitely".

Essentially, however long this allocation workaround postpones it, one
or other failure mode is unavoidable with certain devices unless you can
do something drastic like periodically suspend and resume the entire
system to reset everything.

> Signed-off-by: Magnus Damm 
> Signed-off-by: Yoshihiro Shimoda 
> ---
>  drivers/iommu/dma-iommu.c | 42 +-
>  drivers/iommu/iova.c  |  9 +
>  include/linux/iommu.h |  2 ++
>  include/linux/iova.h  |  1 +
>  4 files changed, 49 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index a0b8c0f..d0fa0b1 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -42,6 +42,7 @@ struct iommu_dma_cookie {
>   struct iova_domain  iovad;
>   struct list_headmsi_page_list;
>   spinlock_t  msi_lock;
> + spinlock_t  reset_lock;

So now we do get something in the cookie, but it's protecting a bunch of
machinery that's accessible from a wider scope? That doesn't seem like a
good design.

>  };
>  
>  static inline struct iova_domain *cookie_iovad(struct iommu_domain *domain)
> @@ -74,6 +75,7 @@ int iommu_get_dma_cookie(struct iommu_domain *domain)
>  
>   spin_lock_init(&cookie->msi_lock);
>   INIT_LIST_HEAD(&cookie->msi_page_list);
> + spin_lock_init(&cookie->reset_lock);
>   domain->iova_cookie = cookie;
>   return 0;
>  }
> @@ -208,9 +210,11 @@ int dma_direction_to_prot(enum dma_data_direction dir, 
> bool coherent)
>  static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
>   dma_addr_t dma_limit)
>  {
> + struct iommu_dma_cookie *cookie = domain->iova_cookie;
>   struct iova_domain *iovad = cookie_iovad(domain);
>   unsigned long shift = iova_shift(iovad);
>   unsigned long length = iova_align(iovad, size) >> shift;
> + unsigned long flags;
>   struct iova *iova;
>  
>   if (domain->geometry.force_aperture)
> @@ -219,9 +223,19 @@ static struct iova *__alloc_iova(struct iommu_domain 
> *domain, size_t size,
>* Enforce size-alignment to be safe - there could perhaps be an
>* attribute to control this per-device, or at least per-domain...
>*/
> - iova = alloc_iova(iovad, length, dma_limit >> shift, true);
> - if (iova)
> - atomic_add(iova_size(iova), &domain->iova_pfns_mapped);
> + if (domain->geometry.force_reset_when_empty) {
> + spin_lock_irqsave(&cookie->reset_lock, flags);

Is this locking definitely safe against the potential concurrent callers
of __free_iova() on the same domain who won't touch reset_lock? (It may
well be, it's just not clear)

> +
> + iova = alloc_iova(iovad, length, dma_limit >> shift, true);
> + if (iova)
> + atomic_add(iova_size(iova), &domain->iova_pfns_mapped);
> +
> + spin_unlock_irqrestore(&cookie->reset_lock, flags);
> + } else {
> + iova = alloc_iova(iovad, length, dma_limit >> shift, true);
> + if (iova)
> + atomic_add(iova_size(iova), &domain->iova_pfns_mapped);

Why would we even need to keep track of this on non-broken IOMMUS?

> + }
>  
>   return iova;
>  }
> @@ -229,10 +243,28 @@ static struct iova *__alloc_iova(struct iommu_domain 
> *domain, size_t size,
>  void
>  __free_iova_domain(struct iommu_domain *domain, struct iova *iova)
>  {
> + struct iommu_dma_cookie *cookie = domain->iova_cookie;
>   struct iova_domain *iovad = cookie_iovad(domain);
> + unsigned long flags;
>  
> - atomic_sub(iova_size(iova), &domain->iova_pfns_mapped);
> - __free_iova(iovad, iova);
> + /* In case force_reset_when_empty is set, do not reuse iova space
> +  * but instead si

Re: [PATCH 0/2] Fix incorrect warning from dma-debug

2017-01-25 Thread Robin Murphy
On 25/01/17 16:23, Geert Uytterhoeven wrote:
> Hi Robin,

Hi Geert,

> On Mon, May 9, 2016 at 11:37 AM, Robin Murphy  wrote:
>> On 08/05/16 11:59, Niklas Söderlund wrote:
>>> While using CONFIG_DMA_API_DEBUG i came across this warning which I
>>> think is a false positive. As shown dma_sync_single_for_device() are
>>> called from the dma_map_single() call path. This triggers the warning
>>> since the dma-debug code have not yet been made aware of the mapping.
>>
>> Almost right ;) The thing being mapped (the SPI device's buffer) and the
>> thing being synced (the IOMMU's PTE) are entirely unrelated. Due to the
>> current of_iommu_init() setup, the IOMMU is probed long before
>> dma_debug_init() gets called, therefore DMA debug is missing entries for
>> some of the initial page table mappings and gets confused when we update
>> them later.
> 
> I think I've been seeing the same as Niklas since quite a while.
> Finally I had a deeper look, and it looks like there is a bug somewhere,
> causing the wrong IOMMU PTE to be synced.
> 
>>> I try to solve this by introducing __dma_sync_single_for_device() which
>>> do not call into the dma-debug code. I'm no expert and this might be a
>>> bad way of solving the problem but it allowed me to keep working.
>>
>> The simple fix should be to just call dma_debug_init() from a sufficiently
>> earlier initcall level. The best would be to sort out a proper device
>> dependency order to avoid the whole early-IOMMU-creation thing entirely.
> 
> And so I did. After disabling the call to dma_debug_fs_init(), you can call
> dma_debug_init() quite early. But the warning didn't go away:

Yet the underlying reason has subtly changed!

> ipmmu-vmsa e67b.mmu: DMA-API: device driver tries to sync DMA memory
> it has not allocated [device address=0x00067bab2ff8] [size=8 
> bytes]
> [ cut here ]
> WARNING: CPU: 0 PID: 174 at lib/dma-debug.c:1235 check_sync+0xcc/0x568
> ...
> [] check_sync+0xcc/0x568
> [] debug_dma_sync_single_for_device+0x44/0x4c
> [] __arm_lpae_set_pte.isra.3+0x6c/0x78
> [] __arm_lpae_map+0x318/0x384
> [] arm_lpae_map+0xb0/0xc4
> [] ipmmu_map+0x48/0x58
> [] iommu_map+0x120/0x1fc
> [] __iommu_dma_map+0xb8/0xec
> [] iommu_dma_map_page+0x50/0x58
> [] __iommu_map_page+0x54/0x98
> 
> So, who allocated that memory?
> 
> During early kernel init (before fs_initcall(dma_debug_init)):
> 
> arm-lpae io-pgtable: arm_lpae_alloc_pgtable:652: cfg->ias = 32

Note that you have a 32-bit IAS...

> data->pg_shift = 12 va_bits = 20
> arm-lpae io-pgtable: arm_lpae_alloc_pgtable:657: data->bits_per_level = 9
> data->levels = 3 pgd_bits = 2
> ipmmu-vmsa e67b.mmu: __arm_lpae_alloc_pages:224
> dma_map_single(0xffc63bab2000, 32) returned 0x00067bab2000
> 
> Hence 0x67bab2000 is the PGD, which has only 4 entries (32 bytes).
> Call stack:
> 
> [] __arm_lpae_alloc_pages.isra.11+0x144/0x1e8
> [] arm_64_lpae_alloc_pgtable_s1+0xdc/0x118
> [] arm_32_lpae_alloc_pgtable_s1+0x44/0x68
> [] alloc_io_pgtable_ops+0x4c/0x80
> [] ipmmu_attach_device+0xd0/0x3b0
> 
> When starting DMA from the device:
> 
> iommu: map: iova 0xfff000 pa 0x00067a555000 size 0x1000 pgsize 4096

...then count the f's carefully.

> arm-lpae io-pgtable: __arm_lpae_map:318: iova 0xfff000
> phys 0x00067a555000 size 4096 lvl 1 ptep 0xffc63bab2000
> arm-lpae io-pgtable: __arm_lpae_map:320: incr. ptep to 0xffc63bab2ff8
> ipmmu-vmsa e67b.mmu: __arm_lpae_alloc_pages:224
> dma_map_single(0xffc63a49, 4096) returned 0x00067a49
> ipmmu-vmsa e67b.mmu: DMA-API: device driver tries to sync DMA memory
> it has not allocated [device address=0x00067bab2ff8] [size=8 bytes]
> 
> __arm_lpae_map() added "ARM_LPAE_LVL_IDX(iova, lvl, data)" == 0xff8 to ptep
> (the PGD base address), but the PGD has only 32 bytes, leading to the warning.
> 
> Does my analysis make sense?
> Do you have a clue?

The initial false positive distracts from the fact that this is actually
DMA-debug doing its job admirably. The bug lies in however you ended up
trying to map a 40-bit IOVA in a 32-bit pagetable.

Robin.

> 
> Thanks!
> 
> Gr{oetje,eeting}s,
> 
> Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH V7 01/11] iommu/of: Refactor of_iommu_configure() for error handling

2017-01-25 Thread Robin Murphy
Hi Tomasz,

On 25/01/17 17:17, Tomasz Nowicki wrote:
> Hi Sricharan,
> 
> On 23.01.2017 17:18, Sricharan R wrote:
>> From: Robin Murphy 
>>
>> In preparation for some upcoming cleverness, rework the control flow in
>> of_iommu_configure() to minimise duplication and improve the propagation
>> of errors. It's also as good a time as any to switch over from the
>> now-just-a-compatibility-wrapper of_iommu_get_ops() to using the generic
>> IOMMU instance interface directly.
>>
>> Signed-off-by: Robin Murphy 
>> ---
>>  drivers/iommu/of_iommu.c | 83 +++-
>>  1 file changed, 53 insertions(+), 30 deletions(-)
>>
>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>> index 0f57ddc..ee49081 100644
>> --- a/drivers/iommu/of_iommu.c
>> +++ b/drivers/iommu/of_iommu.c
>> @@ -96,6 +96,28 @@ int of_get_dma_window(struct device_node *dn, const
>> char *prefix, int index,
>>  }
>>  EXPORT_SYMBOL_GPL(of_get_dma_window);
>>
>> +static const struct iommu_ops
>> +*of_iommu_xlate(struct device *dev, struct of_phandle_args *iommu_spec)
>> +{
>> +const struct iommu_ops *ops;
>> +struct fwnode_handle *fwnode = &iommu_spec->np->fwnode;
>> +int err;
>> +
>> +ops = iommu_get_instance(fwnode);
>> +if (!ops || !ops->of_xlate)
>> +return NULL;
>> +
>> +err = iommu_fwspec_init(dev, &iommu_spec->np->fwnode, ops);
>> +if (err)
>> +return ERR_PTR(err);
>> +
>> +err = ops->of_xlate(dev, iommu_spec);
>> +if (err)
>> +return ERR_PTR(err);
>> +
>> +return ops;
>> +}
>> +
>>  static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
>>  {
>>  struct of_phandle_args *iommu_spec = data;
>> @@ -105,10 +127,11 @@ static int __get_pci_rid(struct pci_dev *pdev,
>> u16 alias, void *data)
>>  }
>>
>>  static const struct iommu_ops
>> -*of_pci_iommu_configure(struct pci_dev *pdev, struct device_node
>> *bridge_np)
>> +*of_pci_iommu_init(struct pci_dev *pdev, struct device_node *bridge_np)
>>  {
>>  const struct iommu_ops *ops;
>>  struct of_phandle_args iommu_spec;
>> +int err;
>>
>>  /*
>>   * Start by tracing the RID alias down the PCI topology as
>> @@ -123,56 +146,56 @@ static int __get_pci_rid(struct pci_dev *pdev,
>> u16 alias, void *data)
>>   * bus into the system beyond, and which IOMMU it ends up at.
>>   */
>>  iommu_spec.np = NULL;
>> -if (of_pci_map_rid(bridge_np, iommu_spec.args[0], "iommu-map",
>> -   "iommu-map-mask", &iommu_spec.np, iommu_spec.args))
>> -return NULL;
>> +err = of_pci_map_rid(bridge_np, iommu_spec.args[0], "iommu-map",
>> + "iommu-map-mask", &iommu_spec.np,
>> + iommu_spec.args);
>> +if (err)
>> +return ERR_PTR(err);
>>
>> -ops = of_iommu_get_ops(iommu_spec.np);
>> -if (!ops || !ops->of_xlate ||
>> -iommu_fwspec_init(&pdev->dev, &iommu_spec.np->fwnode, ops) ||
>> -ops->of_xlate(&pdev->dev, &iommu_spec))
>> -ops = NULL;
>> +ops = of_iommu_xlate(&pdev->dev, &iommu_spec);
>>
>>  of_node_put(iommu_spec.np);
>>  return ops;
>>  }
>>
>> -const struct iommu_ops *of_iommu_configure(struct device *dev,
>> -   struct device_node *master_np)
>> +static const struct iommu_ops
>> +*of_platform_iommu_init(struct device *dev, struct device_node *np)
>>  {
>>  struct of_phandle_args iommu_spec;
>> -struct device_node *np;
>>  const struct iommu_ops *ops = NULL;
>>  int idx = 0;
>>
>> -if (dev_is_pci(dev))
>> -return of_pci_iommu_configure(to_pci_dev(dev), master_np);
>> -
>>  /*
>>   * We don't currently walk up the tree looking for a parent IOMMU.
>>   * See the `Notes:' section of
>>   * Documentation/devicetree/bindings/iommu/iommu.txt
>>   */
>> -while (!of_parse_phandle_with_args(master_np, "iommus",
>> -   "#iommu-cells", idx,
>> -   &iommu_spec)) {
>> -np = iommu_spec.np;
>> -ops = of_iommu_get_ops(np);
>> -
>> -if (!ops || !ops->of_xlate ||
>> -iommu_fwspec_init

Re: [PATCH/RFC] iommu/ipmmu-vmsa: Restrict IOMMU Domain Geometry to 32-bit address space

2017-01-26 Thread Robin Murphy
On 26/01/17 09:53, Geert Uytterhoeven wrote:
> Currently, the IPMMU/VMSA driver supports 32-bit I/O Virtual Addresses
> only, and thus sets io_pgtable_cfg.ias = 32.  However, it doesn't force
> a 32-bit IOVA space through the IOMMU Domain Geometry.
> 
> Hence if a device (e.g. SYS-DMAC) rightfully configures a 40-bit DMA
> mask, it will still be handed out a 40-bit IOVA, outside the 32-bit IOVA
> space, leading to out-of-bounds accesses of the PGD when mapping the
> IOVA.
> 
> Force a 32-bit IOMMU Domain Geometry to fix this.

Reviewed-by: Robin Murphy 

> Signed-off-by: Geert Uytterhoeven 
> ---
> Should the generic code restrict the geometry based on IAS instead?

Which generic code? IAS is specific to the io-pgtable library (well,
really it's an ARM-SMMU-ism, but "input address size" is a fairly
portable concept), but io-pgtable is just factored-out driver helper
code and doesn't know anything about iommu_domains and the IOMMU API.
Conversely, the IOMMU API core and kernel code beyond also know nothing
about io-pgtable internals - in fact the domain geometry *is* how the
IOMMU driver communicates its configured address space size to the
outside world.

Robin.

> ---
>  drivers/iommu/ipmmu-vmsa.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
> index 798578f1676480c6..eb8b3df8733b15fb 100644
> --- a/drivers/iommu/ipmmu-vmsa.c
> +++ b/drivers/iommu/ipmmu-vmsa.c
> @@ -424,6 +424,8 @@ static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
>   domain->cfg.ias = 32;
>   domain->cfg.oas = 40;
>   domain->cfg.tlb = &ipmmu_gather_ops;
> + domain->io_domain.geometry.aperture_end = DMA_BIT_MASK(32);
> + domain->io_domain.geometry.force_aperture = true;
>   /*
>* TODO: Add support for coherent walk through CCI with DVM and remove
>* cache handling. For now, delegate it to the io-pgtable code.
> 



Re: [PATCH 5/5] iommu: Allow default domain type to be set on the kernel command line

2017-01-26 Thread Robin Murphy
On 26/01/17 17:15, Joerg Roedel wrote:
> On Thu, Jan 19, 2017 at 06:19:15PM +, Will Deacon wrote:
>> Rather than modify each IOMMU driver to provide different semantics for
>> DMA domains, instead we introduce a command line parameter that can be
>> used to change the type of the default domain. Passthrough can then be
>> specified using "iommu.default_domain=identity" on the kernel command
>> line.
> 
> I like the general idea of this, but the above is a terrible name for a
> kernel commandline-parameter. The x86 iommus support iommu=pt which is
> pretty much the same as this patch does.

Indeed, I was keen on making "iommu=pt" also do this default domain
switch itself so we wouldn't need a new option - it didn't *appear* that
that would break the AMD driver (as the only other default domain user
supporting identity domains) but I may have overlooked something.

Robin.

> How about something like "iommu.passthrough=0/1"? And please add the
> parameter to the kernel documentation too.
> 
> 
>   Joerg
> 



[PATCH] iommu: Better document the IOMMU_PRIV flag

2017-01-27 Thread Robin Murphy
This is a fairly subtle thing - let's make sure it's described as
clearly as possible to avoid potential misunderstandings.

Signed-off-by: Robin Murphy 
---

Having another look through the IOMMU_PRIV series, I wasn't convinced
that the original comment was really all that helpful - I'm happy for
this to be squashed in if you like.

Robin.

 include/linux/iommu.h | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 69e2417a2965..3c830e153069 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -32,10 +32,13 @@
 #define IOMMU_NOEXEC   (1 << 3)
 #define IOMMU_MMIO (1 << 4) /* e.g. things like MSI doorbells */
 /*
- * This is to make the IOMMU API setup privileged
- * mapppings accessible by the master only at higher
- * privileged execution level and inaccessible at
- * less privileged levels.
+ * Where the bus hardware includes a privilege level as part of its access type
+ * markings, and certain devices are capable of issuing transactions marked as
+ * either 'supervisor' or 'user', the IOMMU_PRIV flag requests that the other
+ * given permission flags only apply to accesses at the higher privilege level,
+ * and that unprivileged transactions should have as little access as possible.
+ * This would usually imply the same permissions as kernel mappings on the CPU,
+ * if the IOMMU page table format is equivalent.
  */
 #define IOMMU_PRIV (1 << 5)
 
-- 
2.11.0.dirty



Re: [PATCH/RFC] iommu/dma: Per-domain flag to control size-alignment

2017-01-27 Thread Robin Murphy
Hi Magnus,

On 27/01/17 06:24, Magnus Damm wrote:
> From: Magnus Damm 
> 
> Introduce the flag "no_size_align" to allow disabling size-alignment
> on a per-domain basis. This follows the suggestion by the comment
> in the code, however a per-device control may be preferred?
> 
> Needed to make virtual space contiguous for certain devices.

That sounds very suspicious - a single allocation is contiguous with
itself by definition, and anyone relying on multiple allocations being
contiguous with one another is doing it wrong, because there's no way we
could ever guarantee that (with this allocator, at any rate). I'd be
very reticent to touch this without a specific example of what problem
it solves.

Since I understand all this stuff a lot more deeply now than back when I
first wrote that comment, I'd say that if there really is some real need
to implement this feature, it should be a dma_attr flag, i.e. not even
per-device, but per-allocation. We'd be breaking a behaviour currently
guaranteed by the DMA API, so we need to be really sure the caller is OK
with that - having it be their responsibility to explicitly ask is
definitely safest.

Robin.

> Signed-off-by: Magnus Damm 
> ---
> 
>  drivers/iommu/dma-iommu.c |6 +-
>  include/linux/iommu.h |1 +
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> --- 0001/drivers/iommu/dma-iommu.c
> +++ work/drivers/iommu/dma-iommu.c2017-01-27 15:17:50.280607110 +0900
> @@ -209,14 +209,18 @@ static struct iova *__alloc_iova(struct
>   struct iova_domain *iovad = cookie_iovad(domain);
>   unsigned long shift = iova_shift(iovad);
>   unsigned long length = iova_align(iovad, size) >> shift;
> + bool size_aligned = true;
>  
>   if (domain->geometry.force_aperture)
>   dma_limit = min(dma_limit, domain->geometry.aperture_end);
> +
> + if (domain->no_size_align)
> + size_aligned = false;
>   /*
>* Enforce size-alignment to be safe - there could perhaps be an
>* attribute to control this per-device, or at least per-domain...
>*/
> - return alloc_iova(iovad, length, dma_limit >> shift, true);
> + return alloc_iova(iovad, length, dma_limit >> shift, size_aligned);
>  }
>  
>  /* The IOVA allocator knows what we mapped, so just unmap whatever that was 
> */
> --- 0001/include/linux/iommu.h
> +++ work/include/linux/iommu.h2017-01-27 15:16:37.630607110 +0900
> @@ -83,6 +83,7 @@ struct iommu_domain {
>   iommu_fault_handler_t handler;
>   void *handler_token;
>   struct iommu_domain_geometry geometry;
> + bool no_size_align;
>   void *iova_cookie;
>  };
>  
> 



Re: [PATCH v2 1/2] iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS

2017-01-27 Thread Robin Murphy
Hi Geert,

On 27/01/17 15:34, Geert Uytterhoeven wrote:
> Add helpers for allocating physically contiguous DMA buffers to the
> generic IOMMU DMA code.  This can be useful when two or more devices
> with different memory requirements are involved in buffer sharing.
> 
> The iommu_dma_{alloc,free}_contiguous() functions complement the existing
> iommu_dma_{alloc,free}() functions, and allow architecture-specific code
> to implement support for the DMA_ATTR_FORCE_CONTIGUOUS attribute on
> systems with an IOMMU.  As this uses the CMA allocator, setting this
> attribute has a runtime-dependency on CONFIG_DMA_CMA.
> 
> Note that unlike the existing iommu_dma_alloc() helper,
> iommu_dma_alloc_contiguous() has no callback to flush pages.
> Ensuring the returned region is made visible to a non-coherent device is
> the responsibility of the caller.
> 
> Signed-off-by: Geert Uytterhoeven 
> ---
> v2:
>   - Provide standalone iommu_dma_{alloc,free}_contiguous() functions, as
> requested by Robin Murphy,
>   - Simplify operations by getting rid of the page array/scatterlist
> dance, as the buffer is contiguous,
>   - Move CPU cache magement into the caller, which is much simpler with
> a single contiguous buffer.

Thanks for the rework, that's a lot easier to make sense of! Now, please
don't hate me, but...

> ---
>  drivers/iommu/dma-iommu.c | 72 +++
>  include/linux/dma-iommu.h |  4 +++
>  2 files changed, 76 insertions(+)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 2db0d641cf4505b5..8f8ed4426f9a3a12 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -30,6 +30,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  struct iommu_dma_msi_page {
>   struct list_headlist;
> @@ -408,6 +409,77 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
>  }
>  
>  /**
> + * iommu_dma_free_contiguous - Free a buffer allocated by
> + *  iommu_dma_alloc_contiguous()
> + * @dev: Device which owns this buffer
> + * @page: Buffer page pointer as returned by iommu_dma_alloc_contiguous()
> + * @size: Size of buffer in bytes
> + * @handle: DMA address of buffer
> + *
> + * Frees the pages associated with the buffer.
> + */
> +void iommu_dma_free_contiguous(struct device *dev, struct page *page,
> + size_t size, dma_addr_t *handle)
> +{
> + __iommu_dma_unmap(iommu_get_domain_for_dev(dev), *handle);
> + dma_release_from_contiguous(dev, page, PAGE_ALIGN(size) >> PAGE_SHIFT);
> + *handle = DMA_ERROR_CODE;
> +}
> +
> +/**
> + * iommu_dma_alloc_contiguous - Allocate and map a buffer contiguous in IOVA
> + *   and physical space
> + * @dev: Device to allocate memory for. Must be a real device attached to an
> + *iommu_dma_domain
> + * @size: Size of buffer in bytes
> + * @prot: IOMMU mapping flags
> + * @handle: Out argument for allocated DMA handle
> + *
> + * Return: Buffer page pointer, or NULL on failure.
> + *
> + * Note that unlike iommu_dma_alloc(), it's the caller's responsibility to
> + * ensure the returned region is made visible to the given non-coherent 
> device.
> + */
> +struct page *iommu_dma_alloc_contiguous(struct device *dev, size_t size,
> + int prot, dma_addr_t *handle)
> +{
> + struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> + struct iova_domain *iovad = cookie_iovad(domain);
> + dma_addr_t dma_addr;
> + unsigned int count;
> + struct page *page;
> + struct iova *iova;
> + int ret;
> +
> + *handle = DMA_ERROR_CODE;
> +
> + size = PAGE_ALIGN(size);
> + count = size >> PAGE_SHIFT;
> + page = dma_alloc_from_contiguous(dev, count, get_order(size));
> + if (!page)
> + return NULL;
> +
> + iova = __alloc_iova(domain, size, dev->coherent_dma_mask);
> + if (!iova)
> + goto out_free_pages;
> +
> + size = iova_align(iovad, size);
> + dma_addr = iova_dma_addr(iovad, iova);
> + ret = iommu_map(domain, dma_addr, page_to_phys(page), size, prot);
> + if (ret < 0)
> + goto out_free_iova;
> +
> + *handle = dma_addr;
> + return page;
> +
> +out_free_iova:
> + __free_iova(iovad, iova);
> +out_free_pages:
> + dma_release_from_contiguous(dev, page, count);
> + return NULL;
> +}

...now that I can see it clearly, isn't this more or less just:

page = dma_alloc_from_contiguous(dev, ...);
if (page)
dma_addr = iommu_dma_map_page(dev,

Re: [PATCH V7 01/11] iommu/of: Refactor of_iommu_configure() for error handling

2017-01-27 Thread Robin Murphy
On 27/01/17 18:00, Sricharan wrote:
> Hi Robin,
> 
> [..]
> 
 +const struct iommu_ops *of_iommu_configure(struct device *dev,
 +   struct device_node *master_np)
 +{
 +const struct iommu_ops *ops;
 +
 +if (!master_np)
 +return NULL;
 +
 +if (dev_is_pci(dev))
 +ops = of_pci_iommu_init(to_pci_dev(dev), master_np);
>>>
>>> I gave the whole patch set a try on ThunderX. really_probe() is failing
>>> on dma_configure()->of_pci_iommu_init() for each PCI device.
>>
>> When you say "failing", do you mean cleanly, or with a crash? I've
>> managed to hit __of_match_node() dereferencing NULL from
>> of_iommu_xlate() in a horribly complicated chain of events, which I'm
>> trying to figure out now, and I wonder if the two might be related.
> 
> Sorry that there is a crash still. __of_match_node seems to be checking
> for NULL arguments, feels like some invalid pointer was passed in.
> Is there any particular sequence to try for this?

Ah, I did figure it out - it wasn't actually a NULL dereference, but an
unmapped address. Turns out __iommu_of_table is in initdata, so any
driver probing after init, connected to an unprobed IOMMU (in this case
disabled in DT), trips over trying to match the now-freed table. I'm
working on the fix - technically the bug's in my patch (#2) anyway ;)

Robin.

> 
> Regards,
>  Sricharan
> 
> 
> 
> 



Re: [PATCH/RFC] iommu/dma: Per-domain flag to control size-alignment

2017-01-30 Thread Robin Murphy
On 30/01/17 07:20, Yoshihiro Shimoda wrote:
> Hi Robin, Magnus,
> 
>> -Original Message-----
>> From: Robin Murphy
>> Sent: Saturday, January 28, 2017 2:38 AM
>>
>> Hi Magnus,
>>
>> On 27/01/17 06:24, Magnus Damm wrote:
>>> From: Magnus Damm 
>>>
>>> Introduce the flag "no_size_align" to allow disabling size-alignment
>>> on a per-domain basis. This follows the suggestion by the comment
>>> in the code, however a per-device control may be preferred?
>>>
>>> Needed to make virtual space contiguous for certain devices.
>>
>> That sounds very suspicious - a single allocation is contiguous with
>> itself by definition, and anyone relying on multiple allocations being
>> contiguous with one another is doing it wrong, because there's no way we
>> could ever guarantee that (with this allocator, at any rate). I'd be
>> very reticent to touch this without a specific example of what problem
>> it solves.
> 
> Thank you for the comment! This patch was from my request.
> But, I completely misunderstood this "size-alignment" behavior.
> And, my concern was already resolved by the following patch last April:
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/iommu?id=809eac54cdd62c67afea1e17080e681dfa33dc09
> 
> So, no one needs this patch anymore.

Cool, thanks for the clarification :)

Robin.

> 
> Best regards,
> Yoshihiro Shimoda
> 



Re: [PATCH V7 08/11] drivers: acpi: Handle IOMMU lookup failure with deferred probing or error

2017-01-30 Thread Robin Murphy
On 29/01/17 17:53, Sinan Kaya wrote:
> On 1/24/2017 7:37 AM, Lorenzo Pieralisi wrote:
>> [+hanjun, tomasz, sinan]
>>
>> It is quite a key patchset, I would be glad if they can test on their
>> respective platforms with IORT.
>>
> 
> Tested on top of 4.10-rc5.
> 
> 1.Platform Hidma device passed dmatest
> 2.Seeing some USB stalls on a platform USB device.
> 3.PCIe NVME drive probed and worked fine with MSI interrupts after boot.
> 4.NVMe driver didn't probe following a hotplug insertion and received an
> SMMU error event during the insertion.

What was the SMMU error - a translation/permission fault (implying the
wrong DMA ops) or a bad STE fault (implying we totally failed to tell
the SMMU about the device at all)?

Robin.

> 
> /sys/bus/pci/slots/4 #
> /sys/bus/pci/slots/4 # dmesg | grep nvme
> [   14.041357] nvme nvme0: pci function 0003:01:00.0
> [  198.399521] nvme nvme0: pci function 0003:01:00.0
> [  198.416232] nvme 0003:01:00.0: enabling device ( -> 0002)
> [  264.402216] nvme nvme0: I/O 228 QID 0 timeout, disable controller
> [  264.402313] nvme nvme0: Identify Controller failed (-4)
> [  264.421270] nvme nvme0: Removing after probe failure status: -5
> /sys/bus/pci/slots/4 #
> 
> 
> 



Re: [PATCH V7 01/11] iommu/of: Refactor of_iommu_configure() for error handling

2017-01-30 Thread Robin Murphy
On 30/01/17 07:00, Sricharan wrote:
> Hi Robin,
> 
>>> [..]
>>>
>> +const struct iommu_ops *of_iommu_configure(struct device *dev,
>> +   struct device_node *master_np)
>> +{
>> +const struct iommu_ops *ops;
>> +
>> +if (!master_np)
>> +return NULL;
>> +
>> +if (dev_is_pci(dev))
>> +ops = of_pci_iommu_init(to_pci_dev(dev), master_np);
>
> I gave the whole patch set a try on ThunderX. really_probe() is failing
> on dma_configure()->of_pci_iommu_init() for each PCI device.

 When you say "failing", do you mean cleanly, or with a crash? I've
 managed to hit __of_match_node() dereferencing NULL from
 of_iommu_xlate() in a horribly complicated chain of events, which I'm
 trying to figure out now, and I wonder if the two might be related.
>>>
>>> Sorry that there is a crash still. __of_match_node seems to be checking
>>> for NULL arguments, feels like some invalid pointer was passed in.
>>> Is there any particular sequence to try for this?
>>
>> Ah, I did figure it out - it wasn't actually a NULL dereference, but an
>> unmapped address. Turns out __iommu_of_table is in initdata, so any
>> driver probing after init, connected to an unprobed IOMMU (in this case
>> disabled in DT), trips over trying to match the now-freed table. I'm
>> working on the fix - technically the bug's in my patch (#2) anyway ;)
>>
> 
> Ok, thanks for bringing this out. There is also an issue that
> Sinan has mentioned while testing the ACPI hotplug path, probably
> it's related to the above, not sure. I will try to check more on that
> in the meanwhile. Then, taking your fix and fixing the hotplug case
> I will do one more repost.

OK, I've finally settled on the below fixup for patch #2 - I have some
follow-on ideas for eventually getting rid of the magic table altogether,
but they can wait until we've got the baseline functionality sorted.
Updated full patch here:

http://www.linux-arm.org/git?p=linux-rm.git;a=commitdiff;h=5616af885f7c5c24f7239d5c689583b2b583c407

Robin.

-8<-

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 349bd1d01612..1f92d98237d5 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -96,6 +96,19 @@ int of_get_dma_window(struct device_node *dn, const
char *prefix, int index,
 }
 EXPORT_SYMBOL_GPL(of_get_dma_window);

+static bool of_iommu_driver_present(struct device_node *np)
+{
+   /*
+* If the IOMMU still isn't ready by the time we reach init, assume
+* it never will be. We don't want to defer indefinitely, nor attempt
+* to dereference __iommu_of_table after it's been freed.
+*/
+   if (system_state > SYSTEM_BOOTING)
+   return false;
+
+   return of_match_node(&__iommu_of_table, np);
+}
+
 static const struct iommu_ops
 *of_iommu_xlate(struct device *dev, struct of_phandle_args *iommu_spec)
 {
@@ -105,7 +118,7 @@ static const struct iommu_ops

ops = iommu_get_instance(fwnode);
if ((ops && !ops->of_xlate) ||
-   (!ops && !of_match_node(&__iommu_of_table, iommu_spec->np)))
+   (!ops && !of_iommu_driver_present(iommu_spec->np)))
return NULL;

err = iommu_fwspec_init(dev, &iommu_spec->np->fwnode, ops);


[PATCH] iommu/dma: Remove bogus dma_supported() implementation

2017-02-01 Thread Robin Murphy
Back when this was first written, dma_supported() was somewhat of a
murky mess, with subtly different interpretations being relied upon in
various places. The "does device X support DMA to address range Y?"
uses assuming Y to be physical addresses, which motivated the current
iommu_dma_supported() implementation and are alluded to in the comment
therein, have since been cleaned up, leaving only the far less ambiguous
"can device X drive address bits Y" usage internal to DMA API mask
setting. As such, there is no reason to keep a slightly misleading
callback which does nothing but duplicate the current default behaviour;
we already constrain IOVA allocations to the iommu_domain aperture where
necessary, so let's leave DMA mask business to architecture-specific
code where it belongs.

Signed-off-by: Robin Murphy 
---

Technically an IOMMU patch, but could possibly go via arm64 I suppose.

 arch/arm64/mm/dma-mapping.c |  1 -
 drivers/iommu/dma-iommu.c   | 10 --
 include/linux/dma-iommu.h   |  1 -
 3 files changed, 12 deletions(-)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 88493b0ebd5e..351f7595cb3e 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -810,7 +810,6 @@ static struct dma_map_ops iommu_dma_ops = {
.sync_sg_for_device = __iommu_sync_sg_for_device,
.map_resource = iommu_dma_map_resource,
.unmap_resource = iommu_dma_unmap_resource,
-   .dma_supported = iommu_dma_supported,
.mapping_error = iommu_dma_mapping_error,
 };
 
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 1c9ac26e3b68..48d36ce59efb 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -734,16 +734,6 @@ void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
__iommu_dma_unmap(iommu_get_domain_for_dev(dev), handle);
 }
 
-int iommu_dma_supported(struct device *dev, u64 mask)
-{
-   /*
-* 'Special' IOMMUs which don't have the same addressing capability
-* as the CPU will have to wait until we have some way to query that
-* before they'll be able to use this framework.
-*/
-   return 1;
-}
-
 int iommu_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
 {
return dma_addr == DMA_ERROR_CODE;
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 3a846f9ec0fd..5725c94b1f12 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -67,7 +67,6 @@ dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys,
size_t size, enum dma_data_direction dir, unsigned long attrs);
 void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
size_t size, enum dma_data_direction dir, unsigned long attrs);
-int iommu_dma_supported(struct device *dev, u64 mask);
 int iommu_dma_mapping_error(struct device *dev, dma_addr_t dma_addr);
 
 /* The DMA API isn't _quite_ the whole story, though... */
-- 
2.11.0.dirty



Re: [PATCH v3] arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU

2017-02-02 Thread Robin Murphy
Hi Geert,

On 31/01/17 11:12, Geert Uytterhoeven wrote:
> Add support for allocating physically contiguous DMA buffers on arm64
> systems with an IOMMU.  This can be useful when two or more devices
> with different memory requirements are involved in buffer sharing.
> 
> Note that as this uses the CMA allocator, setting the
> DMA_ATTR_FORCE_CONTIGUOUS attribute has a runtime-dependency on
> CONFIG_DMA_CMA, just like on arm32.
> 
> For arm64 systems using swiotlb, no changes are needed to support the
> allocation of physically contiguous DMA buffers:
>   - swiotlb always uses physically contiguous buffers (up to
> IO_TLB_SEGSIZE = 128 pages),
>   - arm64's __dma_alloc_coherent() already calls
> dma_alloc_from_contiguous() when CMA is available.
> 
> Signed-off-by: Geert Uytterhoeven 
> Acked-by: Laurent Pinchart 
> ---
> v3:
>   - Add Acked-by,
>   - Update comment to "one of _4_ things",
>   - Call dma_alloc_from_contiguous() and iommu_dma_map_page() directly,
> as suggested by Robin Murphy,
> 
> v2:
>   - New, handle dispatching in the arch (arm64) code, as requested by
> Robin Murphy.
> ---
>  arch/arm64/mm/dma-mapping.c | 63 ++---
>  1 file changed, 48 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 1d7d5d2881db7c19..b314256fcee028ce 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -577,20 +577,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
>*/
>   gfp |= __GFP_ZERO;
>  
> - if (gfpflags_allow_blocking(gfp)) {
> - struct page **pages;
> - pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
> -
> - pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
> - handle, flush_page);
> - if (!pages)
> - return NULL;
> -
> - addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
> -   __builtin_return_address(0));
> - if (!addr)
> - iommu_dma_free(dev, pages, iosize, handle);
> - } else {
> + if (!gfpflags_allow_blocking(gfp)) {
>   struct page *page;
>   /*
>* In atomic context we can't remap anything, so we'll only
> @@ -614,6 +601,45 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
>   __free_from_pool(addr, size);
>   addr = NULL;
>   }
> + } else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> + pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
> + struct page *page;
> +
> + page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
> +  get_order(size));
> + if (!page)
> + return NULL;
> +
> + *handle = iommu_dma_map_page(dev, page, 0, iosize, ioprot);
> + if (iommu_dma_mapping_error(dev, *handle)) {
> + dma_release_from_contiguous(dev, page,
> + size >> PAGE_SHIFT);
> + return NULL;
> + }
> + if (!coherent)

I think we might be able to return early if coherent here, as the
non-IOMMU __dma_alloc() does. We still need a (cacheable) remap in the
normal coherent case to make whatever pages iommu_dma_alloc() scrapes
together appear contiguous, but that obviously isn't a concern here.
However, I'm not entirely confident that that wouldn't break (or at
least further complicate) the already-horrible "what the hell is this?"
logic in __iommu_free_attrs() below, so I'm inclined to leave it as-is.

> + __dma_flush_area(page_to_virt(page), iosize);
> +
> + addr = dma_common_contiguous_remap(page, size, VM_USERMAP,
> +prot,
> +__builtin_return_address(0));
> + if (!addr) {
> + iommu_dma_unmap_page(dev, *handle, iosize, 0, attrs);
> + dma_release_from_contiguous(dev, page,
> + size >> PAGE_SHIFT);
> + }
> + } else {
> + pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
> + struct page **pages;
> +
> + pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
> +   

Re: [PATCH 1/1] iommu: to avoid an unnecessary assignment

2017-02-03 Thread Robin Murphy
On 03/02/17 09:35, Zhen Lei wrote:
> Move the assignment statement into the if branch above, where it only
> needs to be.
> 
> Signed-off-by: Zhen Lei 
> ---
>  drivers/iommu/iommu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index dbe7f65..b231400 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1714,13 +1714,14 @@ int iommu_fwspec_add_ids(struct device *dev, u32 
> *ids, int num_ids)
>   fwspec = krealloc(dev->iommu_fwspec, size, GFP_KERNEL);
>   if (!fwspec)
>   return -ENOMEM;
> +
> + dev->iommu_fwspec = fwspec;
>   }
> 
>   for (i = 0; i < num_ids; i++)
>   fwspec->ids[fwspec->num_ids + i] = ids[i];
> 
>   fwspec->num_ids += num_ids;
> - dev->iommu_fwspec = fwspec;

Strictly, it was done here because I like following transactional
idioms, i.e. at any point dev->iommu_fwspec is either the old one or the
fully-initialised new one. However, since the state of the new one
immediately after krealloc() isn't uninitialised, but still directly
equivalent to the old one, I don't see an issue with moving the
assignment there; plus it does avoid a redundant reassignment the first
time through.

Reviewed-by: Robin Murphy 

>   return 0;
>  }
>  EXPORT_SYMBOL_GPL(iommu_fwspec_add_ids);
> --
> 2.5.0
> 
> 
> ___
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
> 



Re: [PATCH V8 08/11] drivers: acpi: Handle IOMMU lookup failure with deferred probing or error

2017-02-03 Thread Robin Murphy
On 03/02/17 16:15, Sricharan wrote:
> Hi Lorenzo, Robin,
> 
>> -Original Message-
>> From: linux-arm-kernel [mailto:linux-arm-kernel-boun...@lists.infradead.org] 
>> On Behalf Of Sricharan R
>> Sent: Friday, February 03, 2017 9:19 PM
>> To: robin.mur...@arm.com; will.dea...@arm.com; j...@8bytes.org; 
>> lorenzo.pieral...@arm.com; iommu@lists.linux-foundation.org;
>> linux-arm-ker...@lists.infradead.org; linux-arm-...@vger.kernel.org; 
>> m.szyprow...@samsung.com; bhelg...@google.com; linux-
>> p...@vger.kernel.org; linux-a...@vger.kernel.org; t...@semihalf.com; 
>> hanjun@linaro.org; ok...@codeaurora.org
>> Cc: sricha...@codeaurora.org
>> Subject: [PATCH V8 08/11] drivers: acpi: Handle IOMMU lookup failure with 
>> deferred probing or error
>>
>> This is an equivalent to the DT's handling of the iommu master's probe
>> with deferred probing when the corresponding iommu is not probed yet.
>> The lack of a registered IOMMU can be caused by the lack of a driver for
>> the IOMMU, the IOMMU device probe not having been performed yet, having
>> been deferred, or having failed.
>>
>> The first case occurs when the firmware describes the bus master and
>> IOMMU topology correctly but no device driver exists for the IOMMU yet
>> or the device driver has not been compiled in. Return NULL, the caller
>> will configure the device without an IOMMU.
>>
>> The second and third cases are handled by deferring the probe of the bus
>> master device which will eventually get reprobed after the IOMMU.
>>
>> The last case is currently handled by deferring the probe of the bus
>> master device as well. A mechanism to either configure the bus master
>> device without an IOMMU or to fail the bus master device probe depending
>> on whether the IOMMU is optional or mandatory would be a good
>> enhancement.
>>
>> Tested-by: Hanjun Guo 
>> Acked-by: Lorenzo Pieralisi 
>> Signed-off-by: Sricharan R 
>> ---
>> drivers/acpi/arm64/iort.c  | 25 -
>> drivers/acpi/scan.c|  7 +--
>> drivers/base/dma-mapping.c |  2 +-
>> include/acpi/acpi_bus.h|  2 +-
>> include/linux/acpi.h   |  7 +--
>> 5 files changed, 36 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
>> index bf0ed09..d01bae8 100644
>> --- a/drivers/acpi/arm64/iort.c
>> +++ b/drivers/acpi/arm64/iort.c
>> @@ -550,8 +550,17 @@ static const struct iommu_ops *iort_iommu_xlate(struct 
>> device *dev,
>>  return NULL;
>>
>>  ops = iommu_get_instance(iort_fwnode);
>> +/*
>> + * If the ops look-up fails, this means that either
>> + * the SMMU drivers have not been probed yet or that
>> + * the SMMU drivers are not built in the kernel;
>> + * Depending on whether the SMMU drivers are built-in
>> + * in the kernel or not, defer the IOMMU configuration
>> + * or just abort it.
>> + */
>>  if (!ops)
>> -return NULL;
>> +return iort_iommu_driver_enabled(node->type) ?
>> +   ERR_PTR(-EPROBE_DEFER) : NULL;
>>
>>  ret = arm_smmu_iort_xlate(dev, streamid, iort_fwnode, ops);
>>  }
>> @@ -625,12 +634,26 @@ const struct iommu_ops *iort_iommu_configure(struct 
>> device *dev)
>>
>>  while (parent) {
>>  ops = iort_iommu_xlate(dev, parent, streamid);
>> +if (IS_ERR_OR_NULL(ops))
>> +return ops;
>>
>>  parent = iort_node_get_id(node, &streamid,
>>IORT_IOMMU_TYPE, i++);
>>  }
>>  }
>>
>> +/*
>> + * If we have reason to believe the IOMMU driver missed the initial
>> + * add_device callback for dev, replay it to get things in order.
>> + */
>> +if (!IS_ERR_OR_NULL(ops) && ops->add_device &&
>> +dev->bus && !dev->iommu_group) {
>> +int err = ops->add_device(dev);
>> +
>> +if (err)
>> +ops = ERR_PTR(err);
>> +}
>> +
> 
> On the last post we discussed that the above replay hunk can be made
> common. In trying to do that, I ended up with a patch like below, but I'm
> not sure whether it should be part of this series. I have tested it with
> OF devices; it still needs to be tested with ACPI devices. Ideally nothing
> changes functionally because of this. It should be split into two patches
> though.
> 
> Regards,
>  Sricharan
> 
> From aafbf2c97375a086327504f2367eaf9197c719b1 Mon Sep 17 00:00:00 2001
> From: Sricharan R 
> Date: Fri, 3 Feb 2017 15:24:47 +0530
> Subject: [PATCH] drivers: iommu: Add iommu_add_device api
> 
> The code to call the IOMMU driver's add_device is the same
> for both the OF and ACPI cases, so add an API which can
> be shared across both places.
> 
> Also, now with probe deferral the iommu master devices get
> added to the respective iommus during 

Re: [PATCH V2 1/3] iommu/arm-smmu: Add pm_runtime/sleep ops

2017-02-08 Thread Robin Murphy
On 08/02/17 12:30, Sricharan wrote:
> Hi Mark,
> 
>> On Wed, Feb 08, 2017 at 04:23:17PM +0530, Sricharan wrote:
 On Thu, Feb 02, 2017 at 10:40:18PM +0530, Sricharan R wrote:
> +- clock-names:Should be a pair of "smmu_iface_clk" and "smmu_bus_clk"
> +  required for smmu's register group access and interface
> +  clk for the smmu's underlying bus access.
> +
> +- clocks: Phandles for respective clocks described by 
> clock-names.

 Which SMMU implementations are those clock-names valid for?

 The SMMU architecture specifications do not architect the clocks, which
 are implemementation-specific.

 AFAICT, this doesn't match MMU-400 or MMU-500.
>>>
>>> Ok, I should be more specific. In fact QCOM has MMU-500 and also
>>> an SMMU v2 implementation which is fully compatible with
>>> "arm,smmu-v2", with the clocks being controlled by the SoC's
>>> clock controller. I was trying to define these clock bindings
>>> so that they work across SoCs.
>>
>> I don't think we can do that, if we don't know precisely what those
>> clocks are used for.
>>
>> i.e. we'd need a compatible string for the QCOM SMMUv2 variant, which
>> would imply the set of clocks.
>>
> 
> Ok, this is what I was trying to do for V3, and will put it
> that way.

Clocks are not architectural, so it only makes sense to associate them
with an implementation-specific compatible string. There's also no
guarantee that different microarchitectures have equivalent internal
clock domains - I'm not sure if "the SMMU's underlying bus access" is
meant to refer to accesses *by* the SMMU, i.e. page table walks,
accesses *through* the SMMU by upstream masters, or both, and the
differences are rather significant. I'd also note that an MMU-500
configuration may have up to *33* clocks.

>>> So there are one or more interface clocks which are required for the
>>> smmu's interface or the configuration access and one or more clocks
>>> required for the smmu's downstream bus access. That was the reason I was
>>> trying to iterate over the list of clocks down below. But I agree that
>>> the bindings should define each of the required clocks separately.
>>
>> As above, I don't think the code should do this. It should only touch
>> the clocks it knows about.
>>
> 
> ok, after defining QCOM-specific SMMU bindings, this would become
> handling clocks specific to the QCOM implementation as per its clock
> bindings, which as I understand is what you suggest.
> 
>>> So one way here is, define a separate compatible for QCOM's SMMU
>>> implementation and define all the clock bindings as a part of it
>>> and handle it in the same way in the driver.
>>
>> That would be my preference.
>>
> 
> ok.

Either way, the QCOM implementation deserves its own compatible if only
for the sake of the imp-def gaps in the architecture (e.g. FSR.SS
behaviour WRT IRQs as touched upon in the other thread).

Robin.

>>> But just thinking if it would scale well for any other soc that is
>>> compatible with arm,smmu-v2 driver and wants to handle clocks in the
>>> future ?
>>
>> I don't think we can have our cake and eat it here. Either we handle the
>> clock management for each variant, or we don't do it at all. We have no
>> idea what requirements a future variant might have w.r.t. the management
>> of clocks we don't know about yet.
>>
> 
> Right, at this point this is the first SoC which adds clocks to the
> driver. It feels like initializing and enabling/disabling the clocks as
> part of some implementation-specific callbacks would always help, because
> that is the part which is going to differ for each implementation, while
> patches 2 and 3 would remain common. Finally, as you have suggested, I
> will introduce a new SMMU binding for QCOM, try to handle clocks
> specifically for that implementation, and see how it looks.
> 
> Regards,
>  Sricharan
> 



Re: [PATCH V2 1/3] iommu/arm-smmu: Add pm_runtime/sleep ops

2017-02-08 Thread Robin Murphy
On 08/02/17 13:52, Mark Rutland wrote:
> On Wed, Feb 08, 2017 at 07:15:37PM +0530, Sricharan wrote:
>>> Clocks are not architectural, so it only makes sense to associate them
>>> with an implementation-specific compatible string. There's also no
>>
>> ok, it is for this that the QCOM-specific implementation binding is
>> going to be tried.
>>
>>> guarantee that different microarchitectures have equivalent internal
>>> clock domains - I'm not sure if "the SMMU's underlying bus access" is
>>> meant to refer to accesses *by* the SMMU, i.e. page table walks,
>>> accesses *through* the SMMU by upstream masters, or both
>>
>> In the above QCOM case, it is actually both. It's the same path for both
>> the page table walker and upstream masters.

Right, that's what I feared. As far as I can make out the current ARM
implementations, transactions passing through will require at least
TBUn_BCLK for the appropriate TBU, but would also need the page table
walker clocked with CCLK to resolve TLB misses. But then the programming
interface is also in the CCLK domain (not counting the incoming APB or
AXI clock for the actual slave port itself). Thus this 'generic' clock
binding already isn't compatible with MMU-40x/500.

>>> differences are rather significant. I'd also note that an MMU-500
>>> configuration may have up to *33* clocks.
>>>
>>> Either way, the QCOM implementation deserves its own compatible if only
>>> for the sake of the imp-def gaps in the architecture (e.g. FSR.SS
>>> behaviour WRT IRQs as touched upon in the other thread).
>>>
>>
>> Ok, slightly unclear - so you mean that *clocks* alone are not a good
>> enough reason to have a new compatible?
> 
> I believe Robin's point was that even if the clocks didn't matter, there
> are other reasons we should have the QCOM-specific compatible string.
> 
> So we should have one regardless.

Exactly.

Robin.

> 
> Thanks,
> Mark.
> 



Re: [PATCH 01/11] iommu: Rename iommu_get_instance()

2017-02-10 Thread Robin Murphy
Hi Joerg,

I'm really liking this series! Superficially it doesn't seem to break
anything on my Juno, but I'll give it a more thorough workout soon.

Just a few comments from skimming through...

On 09/02/17 11:32, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> Rename the function to iommu_ops_from_fwnode(), because that
> is what the function actually does; the new name is much more
> descriptive.
> 
> Signed-off-by: Joerg Roedel 
> ---
>  drivers/acpi/arm64/iort.c | 2 +-
>  drivers/iommu/iommu.c | 2 +-
>  include/linux/iommu.h | 4 ++--
>  include/linux/of_iommu.h  | 2 +-
>  4 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> index e0d2e6e..3752521 100644
> --- a/drivers/acpi/arm64/iort.c
> +++ b/drivers/acpi/arm64/iort.c
> @@ -536,7 +536,7 @@ static const struct iommu_ops *iort_iommu_xlate(struct 
> device *dev,
>   if (!iort_fwnode)
>   return NULL;
>  
> - ops = iommu_get_instance(iort_fwnode);
> + ops = iommu_ops_from_fwnode(iort_fwnode);
>   if (!ops)
>   return NULL;
>  
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index dbe7f65..0ee05bb 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1653,7 +1653,7 @@ void iommu_register_instance(struct fwnode_handle 
> *fwnode,
>   spin_unlock(&iommu_instance_lock);
>  }
>  
> -const struct iommu_ops *iommu_get_instance(struct fwnode_handle *fwnode)
> +const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode)
>  {
>   struct iommu_instance *instance;
>   const struct iommu_ops *ops = NULL;
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 0ff5111..085e1f0 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -354,7 +354,7 @@ int iommu_fwspec_init(struct device *dev, struct 
> fwnode_handle *iommu_fwnode,
>  int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids);
>  void iommu_register_instance(struct fwnode_handle *fwnode,
>const struct iommu_ops *ops);
> -const struct iommu_ops *iommu_get_instance(struct fwnode_handle *fwnode);
> +const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode);
>  
>  #else /* CONFIG_IOMMU_API */
>  
> @@ -590,7 +590,7 @@ static inline void iommu_register_instance(struct 
> fwnode_handle *fwnode,
>  }
>  
>  static inline
> -const struct iommu_ops *iommu_get_instance(struct fwnode_handle *fwnode)
> +const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode)
>  {
>   return NULL;
>  }
> diff --git a/include/linux/of_iommu.h b/include/linux/of_iommu.h
> index 6a7fc50..66fcbc9 100644
> --- a/include/linux/of_iommu.h
> +++ b/include/linux/of_iommu.h
> @@ -39,7 +39,7 @@ static inline void of_iommu_set_ops(struct device_node *np,
>  
>  static inline const struct iommu_ops *of_iommu_get_ops(struct device_node 
> *np)
>  {
> - return iommu_get_instance(&np->fwnode);
> + return iommu_ops_from_fwnode(&np->fwnode);
>  }

Note that you've already got Lorenzo's patch queued to remove these of_
wrappers.

Robin.

>  
>  extern struct of_device_id __iommu_of_table;
> 



Re: [PATCH 06/11] iommu: Add iommu_device_set_fwnode() interface

2017-02-10 Thread Robin Murphy
On 09/02/17 11:32, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> Allow storing a fwnode in 'struct iommu_device'.
> 
> Signed-off-by: Joerg Roedel 
> ---
>  include/linux/iommu.h | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index bae3cfc..626c935 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -214,6 +214,7 @@ struct iommu_ops {
>  struct iommu_device {
>   struct list_head list;
>   const struct iommu_ops *ops;
> + struct fwnode_handle *fwnode;
>   struct device dev;
>  };
>  
> @@ -233,6 +234,12 @@ static inline void iommu_device_set_ops(struct 
> iommu_device *iommu,
>   iommu->ops = ops;
>  }
>  
> +static inline void iommu_device_set_fwnode(struct iommu_device *iommu,
> +struct fwnode_handle *fwnode)
> +{
> + iommu->fwnode = fwnode;
> +}

Would it make sense to simply make the ops and fwnode additional
arguments to iommu_device_register() (permitting fwnode to be NULL)?
AFAICS they should typically all have the same effective lifetime so
there doesn't seem to be any real need to handle everything separately.

Robin.

> +
>  #define IOMMU_GROUP_NOTIFY_ADD_DEVICE1 /* Device added */
>  #define IOMMU_GROUP_NOTIFY_DEL_DEVICE2 /* Pre Device removed 
> */
>  #define IOMMU_GROUP_NOTIFY_BIND_DRIVER   3 /* Pre Driver bind */
> @@ -580,6 +587,11 @@ static inline void iommu_device_set_ops(struct 
> iommu_device *iommu,
>  {
>  }
>  
> +static inline void iommu_device_set_fwnode(struct iommu_device *iommu,
> +struct fwnode_handle *fwnode)
> +{
> +}
> +
>  static inline void iommu_device_unregister(struct iommu_device *iommu)
>  {
>  }
> 



Re: [PATCH 07/11] iommu/arm-smmu: Make use of the iommu_register interface

2017-02-10 Thread Robin Murphy
On 09/02/17 11:32, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> Also add the smmu devices to sysfs.
> 
> Signed-off-by: Joerg Roedel 
> ---
>  drivers/iommu/arm-smmu-v3.c | 22 +-
>  drivers/iommu/arm-smmu.c| 30 ++
>  2 files changed, 51 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 4d6ec44..32133e2 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -616,6 +616,9 @@ struct arm_smmu_device {
>   unsigned intsid_bits;
>  
>   struct arm_smmu_strtab_cfg  strtab_cfg;
> +
> + /* IOMMU core code handle */
> + struct iommu_device iommu;
>  };
>  
>  /* SMMU private data for each master */
> @@ -1795,8 +1798,10 @@ static int arm_smmu_add_device(struct device *dev)
>   }
>  
>   group = iommu_group_get_for_dev(dev);
> - if (!IS_ERR(group))
> + if (!IS_ERR(group)) {
>   iommu_group_put(group);
> + iommu_device_link(&smmu->iommu, dev);

Given the coupling evident from this and the other patches, might it
work to simply do the linking/unlinking automatically in
iommu_group_{add,remove}_device()?

Robin.

> + }
>  
>   return PTR_ERR_OR_ZERO(group);
>  }
> @@ -1805,14 +1810,17 @@ static void arm_smmu_remove_device(struct device *dev)
>  {
>   struct iommu_fwspec *fwspec = dev->iommu_fwspec;
>   struct arm_smmu_master_data *master;
> + struct arm_smmu_device *smmu;
>  
>   if (!fwspec || fwspec->ops != &arm_smmu_ops)
>   return;
>  
>   master = fwspec->iommu_priv;
> + smmu = master->smmu;
>   if (master && master->ste.valid)
>   arm_smmu_detach_dev(dev);
>   iommu_group_remove_device(dev);
> + iommu_device_unlink(&smmu->iommu, dev);
>   kfree(master);
>   iommu_fwspec_free(dev);
>  }
> @@ -2613,6 +2621,7 @@ static int arm_smmu_device_probe(struct platform_device 
> *pdev)
>  {
>   int irq, ret;
>   struct resource *res;
> + resource_size_t ioaddr;
>   struct arm_smmu_device *smmu;
>   struct device *dev = &pdev->dev;
>   bool bypass;
> @@ -2630,6 +2639,7 @@ static int arm_smmu_device_probe(struct platform_device 
> *pdev)
>   dev_err(dev, "MMIO region too small (%pr)\n", res);
>   return -EINVAL;
>   }
> + ioaddr = res->start;
>  
>   smmu->base = devm_ioremap_resource(dev, res);
>   if (IS_ERR(smmu->base))
> @@ -2682,6 +2692,16 @@ static int arm_smmu_device_probe(struct 
> platform_device *pdev)
>   return ret;
>  
>   /* And we're up. Go go go! */
> + ret = iommu_device_sysfs_add(&smmu->iommu, dev, NULL,
> +  "smmu3.%pa", &ioaddr);
> + if (ret)
> + return ret;
> +
> + iommu_device_set_ops(&smmu->iommu, &arm_smmu_ops);
> + iommu_device_set_fwnode(&smmu->iommu, dev->fwnode);
> +
> + ret = iommu_device_register(&smmu->iommu);
> +
>   iommu_register_instance(dev->fwnode, &arm_smmu_ops);
>  
>  #ifdef CONFIG_PCI
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index a60cded..f4ce1e7 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -380,6 +380,9 @@ struct arm_smmu_device {
>   unsigned int*irqs;
>  
>   u32 cavium_id_base; /* Specific to Cavium */
> +
> + /* IOMMU core code handle */
> + struct iommu_device iommu;
>  };
>  
>  enum arm_smmu_context_fmt {
> @@ -1444,6 +1447,8 @@ static int arm_smmu_add_device(struct device *dev)
>   if (ret)
>   goto out_free;
>  
> + iommu_device_link(&smmu->iommu, dev);
> +
>   return 0;
>  
>  out_free:
> @@ -1456,10 +1461,17 @@ static int arm_smmu_add_device(struct device *dev)
>  static void arm_smmu_remove_device(struct device *dev)
>  {
>   struct iommu_fwspec *fwspec = dev->iommu_fwspec;
> + struct arm_smmu_master_cfg *cfg;
> + struct arm_smmu_device *smmu;
> +
>  
>   if (!fwspec || fwspec->ops != &arm_smmu_ops)
>   return;
>  
> + cfg  = fwspec->iommu_priv;
> + smmu = cfg->smmu;
> +
> + iommu_device_unlink(&smmu->iommu, dev);
>   arm_smmu_master_free_smes(fwspec);
>   iommu_group_remove_device(dev);
>   kfree(fwspec->iommu_priv);
> @@ -2011,6 +2023,7 @@ static int arm_smmu_device_dt_probe(struct 
> platform_device *pdev,
>  static int arm_smmu_device_probe(struct platform_device *pdev)
>  {
>   struct resource *res;
> + resource_size_t ioaddr;
>   struct arm_smmu_device *smmu;
>   struct device *dev = &pdev->dev;
>   int num_irqs, i, err;
> @@ -2031,6 +2044,7 @@ static int arm_smmu_device_probe(struct platform_device 
> *pdev)
>   return err;
>  
>   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> + ioaddr = res->start;
>   smmu->base = devm_ioremap_resource(dev, res);
>  

Re: [PATCH 08/11] iommu/msm: Make use of iommu_device_register interface

2017-02-10 Thread Robin Murphy
On 09/02/17 11:32, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> Register the MSM IOMMUs to the iommu core and add sysfs
> entries for that driver.
> 
> Signed-off-by: Joerg Roedel 
> ---
>  drivers/iommu/msm_iommu.c | 73 
> +++
>  drivers/iommu/msm_iommu.h |  3 ++
>  2 files changed, 76 insertions(+)
> 
> diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
> index b09692b..30795cb 100644
> --- a/drivers/iommu/msm_iommu.c
> +++ b/drivers/iommu/msm_iommu.c
> @@ -371,6 +371,58 @@ static int msm_iommu_domain_config(struct msm_priv *priv)
>   return 0;
>  }
>  
> +/* Must be called under msm_iommu_lock */
> +static struct msm_iommu_dev *find_iommu_for_dev(struct device *dev)
> +{
> + struct msm_iommu_dev *iommu, *ret = NULL;
> + struct msm_iommu_ctx_dev *master;
> +
> + list_for_each_entry(iommu, &qcom_iommu_devices, dev_node) {
> + master = list_first_entry(&iommu->ctx_list,
> +   struct msm_iommu_ctx_dev,
> +   list);
> + if (master->of_node == dev->of_node) {
> + ret = iommu;
> + break;
> + }
> + }
> +
> + return ret;
> +}
> +
> +static int msm_iommu_add_device(struct device *dev)
> +{
> + struct msm_iommu_dev *iommu;
> + unsigned long flags;
> + int ret = 0;
> +
> + spin_lock_irqsave(&msm_iommu_lock, flags);
> +
> + iommu = find_iommu_for_dev(dev);
> + if (iommu)
> + iommu_device_link(&iommu->iommu, dev);
> + else
> + ret = -ENODEV;
> +
> + spin_unlock_irqrestore(&msm_iommu_lock, flags);
> +
> + return ret;
> +}
> +
> +static void msm_iommu_remove_device(struct device *dev)
> +{
> + struct msm_iommu_dev *iommu;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&msm_iommu_lock, flags);
> +
> + iommu = find_iommu_for_dev(dev);
> + if (iommu)
> + iommu_device_unlink(&iommu->iommu, dev);
> +
> + spin_unlock_irqrestore(&msm_iommu_lock, flags);
> +}
> +
>  static int msm_iommu_attach_dev(struct iommu_domain *domain, struct device 
> *dev)
>  {
>   int ret = 0;
> @@ -646,6 +698,8 @@ irqreturn_t msm_iommu_fault_handler(int irq, void *dev_id)
>   .unmap = msm_iommu_unmap,
>   .map_sg = default_iommu_map_sg,
>   .iova_to_phys = msm_iommu_iova_to_phys,
> + .add_device = msm_iommu_add_device,
> + .remove_device = msm_iommu_remove_device,
>   .pgsize_bitmap = MSM_IOMMU_PGSIZES,
>   .of_xlate = qcom_iommu_of_xlate,
>  };
> @@ -653,6 +707,7 @@ irqreturn_t msm_iommu_fault_handler(int irq, void *dev_id)
>  static int msm_iommu_probe(struct platform_device *pdev)
>  {
>   struct resource *r;
> + resource_size_t ioaddr;
>   struct msm_iommu_dev *iommu;
>   int ret, par, val;
>  
> @@ -696,6 +751,7 @@ static int msm_iommu_probe(struct platform_device *pdev)
>   ret = PTR_ERR(iommu->base);
>   goto fail;
>   }
> + ioaddr = r->start;
>  
>   iommu->irq = platform_get_irq(pdev, 0);
>   if (iommu->irq < 0) {
> @@ -737,6 +793,23 @@ static int msm_iommu_probe(struct platform_device *pdev)
>   }
>  
>   list_add(&iommu->dev_node, &qcom_iommu_devices);
> +
> + ret = iommu_device_sysfs_add(&iommu->iommu, iommu->dev, NULL,
> +  "msm-smmu.%pa", &ioaddr);
> + if (ret) {
> + pr_err("Could not add msm-smmu at %pa to sysfs\n", &ioaddr);
> + goto fail;
> + }

Nit: there's a bit of inconsistency with printing errors between the
various drivers (for both _sysfs_add and _register). I reckon if we want
error messages we may as well just fold them into the helper functions.

> +
> + iommu_device_set_ops(&iommu->iommu, &msm_iommu_ops);
> + iommu_device_set_fwnode(&iommu->iommu, &pdev->dev.of_node->fwnode);
> +
> + ret = iommu_device_register(&iommu->iommu);
> + if (ret) {
> + pr_err("Could not register msm-smmu at %pa\n", &ioaddr);
> + goto fail;
> + }

I think there's a corresponding unregister missing for
msm_iommu_remove() here (and similarly in the ARM SMMU drivers, looking
back). I know it's not strictly a problem at the moment, but I do now
have IOMMU-drivers-as-modules working on top of the probe deferral
series... ;)

Robin.

> +
>   of_iommu_set_ops(pdev->dev.of_node, &msm_iommu_ops);
>  
>   pr_info("device mapped at %p, irq %d with %d ctx banks\n",
> diff --git a/drivers/iommu/msm_iommu.h b/drivers/iommu/msm_iommu.h
> index 4ca25d5..ae92d27 100644
> --- a/drivers/iommu/msm_iommu.h
> +++ b/drivers/iommu/msm_iommu.h
> @@ -19,6 +19,7 @@
>  #define MSM_IOMMU_H
>  
>  #include 
> +#include 
>  #include 
>  
>  /* Sharability attributes of MSM IOMMU mappings */
> @@ -68,6 +69,8 @@ struct msm_iommu_dev {
>   struct list_head dom_node;
>   struct list_head ctx_list;
>   DECLARE_BITMAP(context_map, 

Re: [PATCH 06/11] iommu: Add iommu_device_set_fwnode() interface

2017-02-10 Thread Robin Murphy
On 10/02/17 15:22, Joerg Roedel wrote:
> Hi Robin,
> 
> On Fri, Feb 10, 2017 at 02:16:54PM +0000, Robin Murphy wrote:
>>> +static inline void iommu_device_set_fwnode(struct iommu_device *iommu,
>>> +  struct fwnode_handle *fwnode)
>>> +{
>>> +   iommu->fwnode = fwnode;
>>> +}
>>
>> Would it make sense to simply make the ops and fwnode additional
>> arguments to iommu_device_register() (permitting fwnode to be NULL)?
>> AFAICS they should typically all have the same effective lifetime so
>> there doesn't seem to be any real need to handle everything separately.
> 
> Well, it is not yet clear what other information will end up in
> 'struct iommu_device', and I don't want to add another parameter to
> iommu_device_register for every new struct member.

That's a fair point. I think the ops, as a core piece of the whole API,
would be sufficiently self-explanatory as part of registration, but then
we'd end up with a weird interface with different members initialised
through different paths, and I agree that ends up just as ugly.

> Also I think having these wrappers is more readable in the code, as it
> is clear what the code does without looking up the function prototypes
> in the header.

Yeah, on reflection explicit initialisation is certainly easier to read
than a bunch of arguments handled implicitly by register(), but then
from that angle, even more clear would be to simply have the drivers
write the relevant struct members directly - I'd be quite happy with
that, and we then don't have to add another setter to iommu.h for every
new struct member (and risk it looking like Java code...)

Robin.

> 
> It might make sense to set the mandatory struct members via
> iommu_device_register in the future, but we'll see :)
> 
> 
>   Joerg
> 



Re: [PATCH 06/11] iommu: Add iommu_device_set_fwnode() interface

2017-02-10 Thread Robin Murphy
On 10/02/17 16:11, Joerg Roedel wrote:
> On Fri, Feb 10, 2017 at 04:03:07PM +0000, Robin Murphy wrote:
>> Yeah, on reflection explicit initialisation is certainly easier to read
>> than a bunch of arguments handled implicitly by register(), but then
>> from that angle, even more clear would be to simply have the drivers
>> write the relevant struct members directly - I'd be quite happy with
>> that, and we then don't have to add another setter to iommu.h for every
>> new struct member (and risk it looking like Java code...)
> 
> Yeah, that was my first approach. But there is the Intel VT-d anomaly,
> where a part of the driver can be built-in (dmar.c) with
> CONFIG_IOMMU_API=N. In this case 'struct iommu_device' is empty, and
> trying to access the members directly doesn't compile anymore.
> 
> I have to look if this anomaly could be removed, then it is probably the
> best to set the struct members directly without wrapper functions.

Ah, I hadn't managed to spot that - I assume there probably is some
valid edge case for wanting x2APIC functionality without DMA remapping
which prevents us from just adding the dependency. Looking at the code,
though, that situation does seem to rely on the call never actually
executing at runtime - not only is it conditional on a static variable
which is only ever set by non-present code, it would fail the probe if
it were called - so I think it would be perfectly reasonable to just
address that particular problem as below (untested, but if it lets us
get rid of the dummy !IOMMU_API definitions of the registration
functions I'd say we've done the right thing).

Robin.

->8-
diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 8ccbd7023194..161641caff79 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -1077,6 +1077,7 @@ static int alloc_iommu(struct dmar_drhd_unit *drhd)

raw_spin_lock_init(&iommu->register_lock);

+#ifdef CONFIG_IOMMU_API
if (intel_iommu_enabled) {
iommu->iommu_dev = iommu_device_create(NULL, iommu,
   intel_iommu_groups,
@@ -1087,6 +1088,7 @@ static int alloc_iommu(struct dmar_drhd_unit *drhd)
goto err_unmap;
}
}
+#endif

drhd->iommu = iommu;


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 07/11] iommu/arm-smmu: Make use of the iommu_register interface

2017-02-10 Thread Robin Murphy
On 10/02/17 15:25, Joerg Roedel wrote:
> On Fri, Feb 10, 2017 at 02:20:34PM +0000, Robin Murphy wrote:
>>> @@ -1795,8 +1798,10 @@ static int arm_smmu_add_device(struct device *dev)
>>> }
>>>  
>>> group = iommu_group_get_for_dev(dev);
>>> -   if (!IS_ERR(group))
>>> +   if (!IS_ERR(group)) {
>>> iommu_group_put(group);
>>> +   iommu_device_link(&smmu->iommu, dev);
>>
>> Given the coupling evident from this and the other patches, might it
>> work to simply do the linking/unlinking automatically in
>> iommu_group_{add,remove}_device()?
> 
> Yes, this is one of the goals too. But currently we don't have a generic
> device->hw_iommu mapping in the iommu-code which would allow to call
> the link/unlink functions in generic code too.

At some point we should change the iommu_ops pointer in iommu_fwspec for
an iommu_device pointer - that would then give us an easy
dev->fwspec->hw_iommu relationship which is mostly managed by core code
already. In the meantime I was imagining just passing it around,
something like iommu_group_add_device(hw_iommu, group, dev), but now I
suspect that'd be running up against a similar objection to before ;)

Robin.

> 
> But changing this is one of the next things on my list :)
> 
> 
> 
>   Joerg
> 



Re: [PATCH 08/11] iommu/msm: Make use of iommu_device_register interface

2017-02-10 Thread Robin Murphy
On 10/02/17 15:33, Joerg Roedel wrote:
> On Fri, Feb 10, 2017 at 02:35:39PM +0000, Robin Murphy wrote:
>> On 09/02/17 11:32, Joerg Roedel wrote:
>>> +   ret = iommu_device_sysfs_add(&iommu->iommu, iommu->dev, NULL,
>>> +"msm-smmu.%pa", &ioaddr);
>>> +   if (ret) {
>>> +   pr_err("Could not add msm-smmu at %pa to sysfs\n", &ioaddr);
>>> +   goto fail;
>>> +   }
>>
>> Nit: there's a bit of inconsistency with printing errors between the
>> various drivers (for both _sysfs_add and _register). I reckon if we want
>> error messages we may as well just fold them into the helper functions.
> 
> Yeah, this could be unified too. For now I looked how verbose the
> driver was that I was going to change and added messages to be
> consistent inside the drivers.
> 
>>
>>> +
>>> +   iommu_device_set_ops(&iommu->iommu, &msm_iommu_ops);
>>> +   iommu_device_set_fwnode(&iommu->iommu, &pdev->dev.of_node->fwnode);
>>> +
>>> +   ret = iommu_device_register(&iommu->iommu);
>>> +   if (ret) {
>>> +   pr_err("Could not register msm-smmu at %pa\n", &ioaddr);
>>> +   goto fail;
>>> +   }
>>
>> I think there's a corresponding unregister missing for
>> msm_iommu_remove() here (and similarly in the ARM SMMU drivers, looking
>> back). I know it's not strictly a problem at the moment, but I do now
>> have IOMMU-drivers-as-modules working on top of the probe deferral
>> series... ;)
> 
> Well, that there was an iommu_register_instance() without any
> unregistration interface at all makes me believe that unregistering
> iommus is not really implemented yet.
> 
> And in fact, the remove functions for msm and arm-smmu seem to only
> disable the hardware, but are not removing the corresponding data
> structures.

For the ARM SMMUs at least, the SMMU-specific data is (well, should be)
all devm_* managed, thus freed automatically by the driver core after
remove() returns. It is true that there's an implicit expectation that
the SMMU won't be removed until all domains, groups and masters have
been explicitly torn down by the relevant detach()/remove()/free()
calls, although I guess the sysfs links might actually help enforce that.

> So I think we are fine from that side.

Sure, I mostly just wanted not to lose sight of the future possibility
of unloadable IOMMU drivers (admittedly I've not even tried that yet,
only loading them post-boot).

Robin.

> 
> 
>   Joerg
> 



Re: RFC on No ACS Support and SMMUv3 Support

2017-02-14 Thread Robin Murphy
On 14/02/17 01:54, Sinan Kaya wrote:
> On 2/13/2017 8:46 PM, Alex Williamson wrote:
>>> My first goal is to support virtual function passthrough for device's that 
>>> are directly
>>> connected. This will be possible with the quirk I proposed and it will be 
>>> the most
>>> secure solution. It can certainly be generalized for other systems.
>> Why is this anything more than a quirk for the affected PCIe root port
>> vendor:device IDs and use of pci_device_group() to evaluate the rest of
>> the topology, as appears is already done?  Clearly a blanket exception
>> for the platform wouldn't necessarily be correct if a user could plugin
>> a device that adds a PCIe switch.
> 
> I was going to go this direction first. I wanted to check with everybody to 
> see
> if there are other/better alternatives possible via either changing 
> pci_device_group or changing the smmuv3 driver.

I second Alex's opinion here. The SMMU driver is absolutely not the
appropriate place to address this - it's a PCI issue that needs to be
solved in the PCI domain. To flip things around, regardless of VFIO,
overriding group allocation may just plain break some devices - if you
plug in, say, some USB 2.0 card with an OHCI/EHCI pair on two different
functions, assigning them to different groups such that they get
different domains and can't hand off valid DMA addresses to each other
is liable to go downhill very quickly.

Robin.

>>> My second goal is extend the code such that ACS validation is up to the 
>>> customer via 
>>> pci=noacs kernel command line for instance. This will let the customer 
>>> choose what he
>>> really wants rather than kernel trying to be smart and protective. By 
>>> passing pci=noacs
>>> parameter, customer acknowledges the risks this command line carries.
>> Be prepared that this will need to taint the kernel since we make
>> assertions with drivers like vfio to provide secure, isolated user
>> access to devices and we can't make that statement if the user has
>> bypassed ACS enforcement.  There is absolutely no way that such an
>> option would not be severely abused.  In fact, I tried adding such an
>> option with the pcie_acs_override= patch[1], Bjorn rejected it and I'm
>> thankful that he did.  I don't think this is a good idea, sometimes the
>> kernel does need to be smarter than the user to protect them from
>> themselves.  Any easy bypass also lets hardware vendors ignore the
>> issue longer.  Thanks,
> 
> Bjorn, any inputs?
> 
>>
>> Alex
>>
>> [1] https://lkml.org/lkml/2013/5/30/513
>>
> 
> 

(apologies if a disclaimer appears below; SMTP problems...)


Re: [RFC 2/2] iommu/arm-smmu: support qcom implementation

2017-02-14 Thread Robin Murphy
Hi Rob,

On 10/02/17 18:41, Rob Clark wrote:
> For devices with iommu(s) in secure mode, we cannot touch global
> registers, and we have to live with the context -> sid mapping that
> the secure world has set up for us.
> 
> This enables, for example db410c (apq8016) devices to use the up-
> stream arm-smmu driver.  This is the last major hurdle for having
> the gpu working out of the box on an upstream kernel.
> 
> NOTE: at this point, it works but I haven't spent any time thinking
> much about the bindings.  Since we can't read the global regs, we
> need to get all the device config info from DT (or at least anything
> that can't be hard-coded).

This approach seems, I have to say, unworkably horrible. I'm absolutely
against the idea of pretending we have access to global state which we
don't, then piling a facsimile of that state into the DT for no reason
other than to keep the overcomplicated pretence up. This configuration
comes out looking and behaving like a discrete IOMMU (see e.g.
rockchip-iommu for inspiration) - if we have to support it, it would
make far more sense to simply describe what we have, i.e. a
"qcom,apq8016-smmu-context-bank" with a single interrupt and
#iommu-cells = <0>. They'd want their own probe routine and private data
(AFAICS more or less just a base address, an IRQ, ID features and an
iommu_group), and probably a separate iommu_ops because you'd want to
handle {add,remove}_device() and device_group() significantly
differently. By the time we get up to {attach,remove}_dev() it might be
clean enough to dynamically handle both cases in the same code,
especially with a little general refactor of the ARM_SMMU_CB*
arithmetic, and once we get to operating on arm_smmu_domains there
should be no real difference.
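For comparison, describing each secure-world-assigned context bank as its own discrete IOMMU might look something like this (entirely hypothetical binding and register values, sketched only to show the shape of the idea):

```dts
gfx3d_iommu: iommu@1f09000 {
	compatible = "qcom,apq8016-smmu-context-bank";
	reg = <0x1f09000 0x1000>;		/* just this context bank */
	interrupts = <GIC_SPI 241 IRQ_TYPE_LEVEL_HIGH>;
	#iommu-cells = <0>;			/* no stream IDs to pass */
};
```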

> Also, this only works for non-secure contexts.  For secure contexts,
> even things like map/unmap need to go via scm, so that bypasses
> even more of arm-smmu.  I'm not sure if it is better to have all of
> those special paths added to arm-smmu.  Or have a 2nd iommu driver
> for the secure contexts.  (I think the second path only works since
> I don't think the CPU is really touching the hw at all for secure
> contexts.)
> 
> Not in lieu of bindings docs (which will come before this is more
> than just an RFC), but here is an example of what the DT bindings
> look like:
> 
>   gpu_iommu: qcom,iommu@1f0 {
>   compatible = "qcom,smmu-v2";
>   reg = <0x1f0 0x1>;
> 
>   #global-interrupts = <1>;
>   interrupts =
>   ,// global
>   ,   // unk0
>   ,   // gfx3d_user
>   ;   // gfx3d_priv
> 
>   qcom,stream-to-cb = <
>   0x0002   // unk0
>   0x   // gfx3d_user
>   0x0001   // gfx3d_priv
>   >;
> 
>   #iommu-cells = <1>;
> 
>   clocks = <&gcc GCC_SMMU_CFG_CLK>,
><&gcc GCC_GFX_TCU_CLK>;
>   clock-names = "smmu_iface_clk", "smmu_bus_clk";
> 
>   qcom,cb-count = <3>;
>   qcom,iommu-secure-id = <18>;
>   qcom,mapping-groups-count = <3>;
> 
>   status = "okay";
>   };
> 
> Since we must live with the assignment of stream-id's to context bank
> mapping that the secure world has set up for us, the "qcom,stream-to-cb"
> binding gives a mapping table of sid to cb.  (Indexed by cb, value is
> the sid.)  I'm not 100% sure what to do about devices with multiple
> sid's..  if I understand how things work properly, we could probably
> make the values in this table the result of OR'ing all the sids
> together.
> 
> The "qcom,cb-count" and "qcom,mapping-groups-count" can go away and
> be computed from "qcom,stream-to-cb".  I just haven't done that yet.
> The "qcom,iommu-secure-id" field is needed for the one scm call needed
> for non-secure contexts for initial configuration.
> 
> Anyways, at this point, I'm mostly just looking for feedback about
> whether this is the best way forward, vs introducing a seperate iommu
> driver, and any suggestions anyone might have.  And any ideas about how
> to best handle the secure context banks, since I think we have no
> choice but to use them for venus (the video enc/dec block).

I'd say the answer to how we handle secure contexts is "we don't".
Disregarding the apparent craziness of why the non-secure world would be
allowed direct control of an entirely secure device, at that point it's
not really an ARM SMMU any more, it's just *some*
firmware-paravirtualised IOMMU. If you can't touch the hardware, who
knows what's actually under there; "qcom-scm-iommu" is welcome to its
own driver.

Robin.

> Signed-off-by: Rob Clark 
> ---
>  drivers/iommu/arm-smmu.c | 233 
> +++
>  1 file changed, 217 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/i

Re: [PATCH v4] arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU

2017-02-17 Thread Robin Murphy
On 07/02/17 15:38, Geert Uytterhoeven wrote:
> Add support for allocating physically contiguous DMA buffers on arm64
> systems with an IOMMU.  This can be useful when two or more devices
> with different memory requirements are involved in buffer sharing.
> 
> Note that as this uses the CMA allocator, setting the
> DMA_ATTR_FORCE_CONTIGUOUS attribute has a runtime-dependency on
> CONFIG_DMA_CMA, just like on arm32.
> 
> For arm64 systems using swiotlb, no changes are needed to support the
> allocation of physically contiguous DMA buffers:
>   - swiotlb always uses physically contiguous buffers (up to
> IO_TLB_SEGSIZE = 128 pages),
>   - arm64's __dma_alloc_coherent() already calls
> dma_alloc_from_contiguous() when CMA is available.

I think this looks about as good as it ever could now :)

Reviewed-by: Robin Murphy 

Thanks,
Robin.

> Signed-off-by: Geert Uytterhoeven 
> Acked-by: Laurent Pinchart 
> ---
> v4:
>   - Replace dma_to_phys()/phys_to_page() by vmalloc_to_page(), to pass
> the correct page pointer to dma_release_from_contiguous().
> Note that the latter doesn't scream when passed a wrong pointer, but
> just returns false.  While this makes life easier for callers who
> may want to call another deallocator, it makes it harder catching
> bugs.
> 
> v3:
>   - Add Acked-by,
>   - Update comment to "one of _4_ things",
>   - Call dma_alloc_from_contiguous() and iommu_dma_map_page() directly,
> as suggested by Robin Murphy,
> 
> v2:
>   - New, handle dispatching in the arch (arm64) code, as requested by
> Robin Murphy.
> ---
>  arch/arm64/mm/dma-mapping.c | 63 
> ++---
>  1 file changed, 48 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 351f7595cb3ebdb9..fb76e82c90edd514 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -584,20 +584,7 @@ static void *__iommu_alloc_attrs(struct device *dev, 
> size_t size,
>*/
>   gfp |= __GFP_ZERO;
>  
> - if (gfpflags_allow_blocking(gfp)) {
> - struct page **pages;
> - pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
> -
> - pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
> - handle, flush_page);
> - if (!pages)
> - return NULL;
> -
> - addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
> -   __builtin_return_address(0));
> - if (!addr)
> - iommu_dma_free(dev, pages, iosize, handle);
> - } else {
> + if (!gfpflags_allow_blocking(gfp)) {
>   struct page *page;
>   /*
>* In atomic context we can't remap anything, so we'll only
> @@ -621,6 +608,45 @@ static void *__iommu_alloc_attrs(struct device *dev, 
> size_t size,
>   __free_from_pool(addr, size);
>   addr = NULL;
>   }
> + } else if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> + pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
> + struct page *page;
> +
> + page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
> +  get_order(size));
> + if (!page)
> + return NULL;
> +
> + *handle = iommu_dma_map_page(dev, page, 0, iosize, ioprot);
> + if (iommu_dma_mapping_error(dev, *handle)) {
> + dma_release_from_contiguous(dev, page,
> + size >> PAGE_SHIFT);
> + return NULL;
> + }
> + if (!coherent)
> + __dma_flush_area(page_to_virt(page), iosize);
> +
> + addr = dma_common_contiguous_remap(page, size, VM_USERMAP,
> +prot,
> +__builtin_return_address(0));
> + if (!addr) {
> + iommu_dma_unmap_page(dev, *handle, iosize, 0, attrs);
> + dma_release_from_contiguous(dev, page,
> + size >> PAGE_SHIFT);
> + }
> + } else {
> + pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL, coherent);
> + struct page **pages;
> +
> + pages = iommu_dma_alloc(dev, iosize, gfp, attrs, ioprot,
> + handle

Re: [RFC PATCH v1] iommu/io-pgtable-arm-v7s: Check for leaf entry right after finding it

2017-02-21 Thread Robin Murphy
On 16/02/17 13:52, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko 
> 
> Do a check for already installed leaf entry at the current level before
> performing any actions when trying to map.
> 
> This check is already present in arm_v7s_init_pte(), i.e. before
> installing new leaf entry at the current level if conditions to do so
> are met (num_entries > 0).
> 
> But, this might be insufficient in case when we have already
> installed block mapping at this level and it is not time to
> install new leaf entry (num_entries == 0).
> In that case we continue walking the page table down with wrong pointer
> to the next level.
> 
> So, move check from arm_v7s_init_pte() to __arm_v7s_map() in order to
> avoid all cases.

Would it not be more logical (and simpler) to just check that the thing
we dereference is valid to dereference when we dereference it? i.e.:

-8<-
diff --git a/drivers/iommu/io-pgtable-arm-v7s.c
b/drivers/iommu/io-pgtable-arm-v7s.c
index 0769276c0537..f3112f9ff494 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -418,8 +418,10 @@ static int __arm_v7s_map(struct arm_v7s_io_pgtable
*data, unsigned long iova,
pte |= ARM_V7S_ATTR_NS_TABLE;

__arm_v7s_set_pte(ptep, pte, 1, cfg);
-   } else {
+   } else if (ARM_V7S_PTE_IS_TABLE(pte, lvl)) {
cptep = iopte_deref(pte, lvl);
+   } else {
+   return -EEXIST;
}

/* Rinse, repeat */
->8-

I think the equivalent could be done in LPAE as well.
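An equivalent tweak on the LPAE side might look something like this (untested sketch against io-pgtable-arm.c, with names assumed by analogy with the v7s code):

-8<-
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, unsigned long iova,
 		__arm_lpae_set_pte(ptep, pte, cfg);
-	} else {
+	} else if (!iopte_leaf(pte, lvl)) {
 		cptep = iopte_deref(pte, data);
+	} else {
+		/* We require an unmap first */
+		return -EEXIST;
 	}
->8-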

Robin.

> Signed-off-by: Oleksandr Tyshchenko 
> ---
> This patch does similar things as the patch I have pushed a week ago
> for long-descriptor format [1].
> The reason why this patch is RFC is because I am not sure I did the right
> thing and I even didn't test it.
> 
> [1] 
> https://lists.linuxfoundation.org/pipermail/iommu/2017-February/020411.html
> ---
> ---
>  drivers/iommu/io-pgtable-arm-v7s.c | 15 ++-
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
> b/drivers/iommu/io-pgtable-arm-v7s.c
> index f50e51c..7f7594b 100644
> --- a/drivers/iommu/io-pgtable-arm-v7s.c
> +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> @@ -364,10 +364,6 @@ static int arm_v7s_init_pte(struct arm_v7s_io_pgtable 
> *data,
>   if (WARN_ON(__arm_v7s_unmap(data, iova + i * sz,
>   sz, lvl, tblp) != sz))
>   return -EINVAL;
> - } else if (ptep[i]) {
> - /* We require an unmap first */
> - WARN_ON(!selftest_running);
> - return -EEXIST;
>   }
>  
>   pte |= ARM_V7S_PTE_TYPE_PAGE;
> @@ -392,11 +388,20 @@ static int __arm_v7s_map(struct arm_v7s_io_pgtable 
> *data, unsigned long iova,
>  {
>   struct io_pgtable_cfg *cfg = &data->iop.cfg;
>   arm_v7s_iopte pte, *cptep;
> - int num_entries = size >> ARM_V7S_LVL_SHIFT(lvl);
> + int i = 0, num_entries = size >> ARM_V7S_LVL_SHIFT(lvl);
>  
>   /* Find our entry at the current level */
>   ptep += ARM_V7S_LVL_IDX(iova, lvl);
>  
> + /* Check for already installed leaf entry */
> + do {
> + if (ptep[i] && !ARM_V7S_PTE_IS_TABLE(ptep[i], lvl)) {
> + /* We require an unmap first */
> + WARN_ON(!selftest_running);
> + return -EEXIST;
> + }
> + } while (++i < num_entries);
> +
>   /* If we can install a leaf entry at this level, then do so */
>   if (num_entries)
>   return arm_v7s_init_pte(data, iova, paddr, prot,
> 



Re: [bug report] iommu/arm-smmu: Make use of the iommu_register interface

2017-02-22 Thread Robin Murphy
On 22/02/17 11:00, Joerg Roedel wrote:
> Hi Dan,
> 
> thanks for the report! There are more bogus things going on here.
> 
> On Wed, Feb 15, 2017 at 11:36:48AM +0300, Dan Carpenter wrote:
>> The patch 9648cbc9625b: "iommu/arm-smmu: Make use of the 
>> iommu_register interface" from Feb 1, 2017, leads to the following 
>> Smatch complaint:
>>
>> drivers/iommu/arm-smmu-v3.c:1810 arm_smmu_remove_device()
>>   warn: variable dereferenced before check 'master' (see line 1809)
>>
>> drivers/iommu/arm-smmu-v3.c
>>   1808   master = fwspec->iommu_priv;
>>   1809   smmu = master->smmu;
>>
>> New dereference.
>>
>>   1810   if (master && master->ste.valid)
>> ^^
>> Old code checked for NULL.
>>
>>   1811   arm_smmu_detach_dev(dev);
>>   1812   iommu_group_remove_device(dev);
> 
> So the master pointer comes from fwspec->iommu_priv, and master is freed
> later in the function. But I can't find where the fwspec->iommu_priv
> pointer is cleared. To me it looks like this breaks when a device is
> removed and the added again.

The pointer isn't cleared because the whole fwspec is freed on the very
next line.

Robin.

> 
> Robin, Will, can you have a look please?
> 
> 
> Thanks,
> 
>   Joerg
> 



Re: Partial BAR Address Allocation

2017-02-23 Thread Robin Murphy
On 22/02/17 23:39, Bjorn Helgaas wrote:
> [+cc Joerg, iommu list]
> 
> On Wed, Feb 22, 2017 at 03:44:53PM -0500, Sinan Kaya wrote:
>> On 2/22/2017 1:44 PM, Bjorn Helgaas wrote:
>>> There is no way for a driver to say "I only need this memory BAR and
>>> not the other ones."  The reason is because the PCI_COMMAND_MEMORY bit
>>> enables *all* the memory BARs; there's no way to enable memory BARs
>>> selectively.  If we enable memory BARs and one of them is unassigned,
>>> that unassigned BAR is enabled, and the device will respond at
>>> whatever address the register happens to contain, and that may cause
>>> conflicts.
>>>
>>> I'm not sure this answers your question.  Do you want to get rid of
>>> 32-bit BAR addresses because your host bridge doesn't have a window to
>>> 32-bit PCI addresses?  It's typical for a bridge to support a window
>>> to the 32-bit PCI space as well as one to the 64-bit PCI space.  Often
>>> it performs address translation for the 32-bit window so it doesn't
>>> have to be in the 32-bit area on the CPU side, e.g., you could have
>>> something like this where we have three host bridges and the 2-4GB
>>> space on each PCI root bus is addressable:
>>>
>>>   pci_bus :00: root bus resource [mem 0x108000-0x10] (bus 
>>> address [0x8000-0x])
>>>   pci_bus 0001:00: root bus resource [mem 0x118000-0x11] (bus 
>>> address [0x8000-0x])
>>>   pci_bus 0002:00: root bus resource [mem 0x128000-0x12] (bus 
>>> address [0x8000-0x])
>>
>> The problem is that according to PCI specification BAR addresses and
>> DMA addresses cannot overlap.
>>
>> From PCI-to-PCI Bridge Arch. spec.: "A bridge forwards PCI memory
>> transactions from its primary interface to its secondary interface
>> (downstream) if a memory address is in the range defined by the
>> Memory Base and Memory Limit registers (when the base is less than
>> or equal to the limit) as illustrated in Figure 4-3. Conversely, a
>> memory transaction on the secondary interface that is within this
>> address range will not be forwarded upstream to the primary
>> interface."
>>
>> To be specific, if your DMA address happens to be in
>> [0x8000-0x] and root port's aperture includes this
>> range; the DMA will never make to the system memory.
>>
>> Lorenzo and Robin took some steps to carve out PCI addresses out of
>> DMA addresses in IOMMU drivers by using iova_reserve_pci_windows()
>> function.
>>
>> However, I see that we are still exposed when the operating system
>> doesn't have any IOMMU driver and is using the SWIOTLB for instance. 
> 
> Hmmm.  I guess SWIOTLB assumes there's no address translation in the
> DMA direction, right?

Not entirely - it does rely on arch-provided dma_to_phys() and
phys_to_dma() helpers which are free to accommodate such translations in
a device-specific manner. On arm64 we use these to account for
dev->dma_pfn_offset describing a straightforward linear offset, but
unless one constant offset would apply to all possible outbound windows
I'm not sure that's much help here.

>  If there's no address translation in the PIO
> direction, PCI bus BAR addresses are identical to the CPU-side
> addresses.  In that case, there's no conflict because we already have
> to assign BARs so they never look like a system memory address.
> 
> But if there *is* address translation in the PIO direction, we can
> have conflicts because the bridge can translate CPU-side PIO accesses
> to arbitrary PCI bus addresses.
> 
>> The FW solution I'm looking at requires carving out some part of the
>> DDR from before OS boot so that OS doesn't reclaim that area for
>> DMA.
> 
> If you want to reach system RAM, I guess you need to make sure you
> only DMA to bus addresses outside the host bridge windows, as you said
> above.  DMA inside the windows would be handled as peer-to-peer DMA.
> 
>> I'm not very happy with this solution. I'm also surprised that there
>> is no generic solution in the kernel takes care of this for all root
>> ports regardless of IOMMU driver presence.
> 
> The PCI core isn't really involved in allocating DMA addresses,
> although there definitely is the connection with PCI-to-PCI bridge
> windows that you mentioned.  I added IOMMU guys, who would know a lot
> more than I do.

To me, having the bus addresses of windows shadow assigned physical
addresses sounds mostly like a broken system configuration. Can the
firmware not reprogram them elsewhere, or is the entire bottom 4GB of
the physical memory map occupied by system RAM?

Robin.

> 
> Bjorn
> 



Re: [PATCH] iommu: iova: Consolidate code for adding new node to iovad domain rbtree

2017-02-23 Thread Robin Murphy
On 23/02/17 08:17, Marek Szyprowski wrote:
> This patch consolidates almost the same code used in iova_insert_rbtree()
> and __alloc_and_insert_iova_range() functions. There is no functional change.
> 
> Signed-off-by: Marek Szyprowski 
> ---
>  drivers/iommu/iova.c | 85 
> +++-
>  1 file changed, 31 insertions(+), 54 deletions(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index b7268a14184f..32b9c2fb37b6 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -100,6 +100,32 @@ static unsigned long iova_rcache_get(struct iova_domain 
> *iovad,
>   }
>  }
>  
> +/* Insert the iova into domain rbtree by holding writer lock */
> +static void
> +iova_insert_rbtree(struct rb_root *root, struct iova *iova,
> +struct rb_node *start)
> +{
> + struct rb_node **new, *parent = NULL;
> +
> + new = (start) ? &start : &(root->rb_node);
> + /* Figure out where to put new node */
> + while (*new) {
> + struct iova *this = rb_entry(*new, struct iova, node);
> +
> + parent = *new;
> +
> + if (iova->pfn_lo < this->pfn_lo)
> + new = &((*new)->rb_left);
> + else if (iova->pfn_lo > this->pfn_lo)
> + new = &((*new)->rb_right);
> + else
> + BUG(); /* this should not happen */

Ooh, if we're touching this, can we downgrade it to a WARN()? Granted,
allocating an IOVA region of size 0 is not a reasonable thing to do
intentionally, but the fact that it's guaranteed to take down the kernel
is perhaps a bit much (I hit it so many times back when debugging the
iommu_dma_map_sg() stuff).

Nice tidyup otherwise, though.

Robin.

> + }
> + /* Add new node and rebalance tree. */
> + rb_link_node(&iova->node, parent, new);
> + rb_insert_color(&iova->node, root);
> +}
> +
>  /*
>   * Computes the padding size required, to make the start address
>   * naturally aligned on the power-of-two order of its size
> @@ -157,35 +183,8 @@ static int __alloc_and_insert_iova_range(struct 
> iova_domain *iovad,
>   new->pfn_lo = limit_pfn - (size + pad_size) + 1;
>   new->pfn_hi = new->pfn_lo + size - 1;
>  
> - /* Insert the new_iova into domain rbtree by holding writer lock */
> - /* Add new node and rebalance tree. */
> - {
> - struct rb_node **entry, *parent = NULL;
> -
> - /* If we have 'prev', it's a valid place to start the
> -insertion. Otherwise, start from the root. */
> - if (prev)
> - entry = &prev;
> - else
> - entry = &iovad->rbroot.rb_node;
> -
> - /* Figure out where to put new node */
> - while (*entry) {
> - struct iova *this = rb_entry(*entry, struct iova, node);
> - parent = *entry;
> -
> - if (new->pfn_lo < this->pfn_lo)
> - entry = &((*entry)->rb_left);
> - else if (new->pfn_lo > this->pfn_lo)
> - entry = &((*entry)->rb_right);
> - else
> - BUG(); /* this should not happen */
> - }
> -
> - /* Add new node and rebalance tree. */
> - rb_link_node(&new->node, parent, entry);
> - rb_insert_color(&new->node, &iovad->rbroot);
> - }
> + /* If we have 'prev', it's a valid place to start the insertion. */
> + iova_insert_rbtree(&iovad->rbroot, new, prev);
>   __cached_rbnode_insert_update(iovad, saved_pfn, new);
>  
>   spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
> @@ -194,28 +193,6 @@ static int __alloc_and_insert_iova_range(struct 
> iova_domain *iovad,
>   return 0;
>  }
>  
> -static void
> -iova_insert_rbtree(struct rb_root *root, struct iova *iova)
> -{
> - struct rb_node **new = &(root->rb_node), *parent = NULL;
> - /* Figure out where to put new node */
> - while (*new) {
> - struct iova *this = rb_entry(*new, struct iova, node);
> -
> - parent = *new;
> -
> - if (iova->pfn_lo < this->pfn_lo)
> - new = &((*new)->rb_left);
> - else if (iova->pfn_lo > this->pfn_lo)
> - new = &((*new)->rb_right);
> - else
> - BUG(); /* this should not happen */
> - }
> - /* Add new node and rebalance tree. */
> - rb_link_node(&iova->node, parent, new);
> - rb_insert_color(&iova->node, root);
> -}
> -
>  static struct kmem_cache *iova_cache;
>  static unsigned int iova_cache_users;
>  static DEFINE_MUTEX(iova_cache_mutex);
> @@ -505,7 +482,7 @@ void put_iova_domain(struct iova_domain *iovad)
>  
>   iova = alloc_and_init_iova(pfn_lo, pfn_hi);
>   if (iova)
> - iova_insert_rbtree(&iovad->rbroot, iova);
> + iova_insert_

Re: [PATCH v2] iommu: iova: Consolidate code for adding new node to iovad domain rbtree

2017-02-24 Thread Robin Murphy
On 24/02/17 11:13, Marek Szyprowski wrote:
> This patch consolidates almost the same code used in iova_insert_rbtree()
> and __alloc_and_insert_iova_range() functions. While touching this code,
> replace BUG() with WARN_ON(1) to avoid taking down the whole system in
> case of corrupted iova tree or incorrect calls.

Thanks Marek!

Tested-by: Robin Murphy 
Reviewed-by: Robin Murphy 

> Signed-off-by: Marek Szyprowski 
> ---
> v2:
> - replaced BUG() with WARN_ON(1) as suggested by Robin Murphy
> 
> v1:
> - initial version
> ---
>  drivers/iommu/iova.c | 87 
> 
>  1 file changed, 33 insertions(+), 54 deletions(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index b7268a14184f..e80a4105ac2a 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -100,6 +100,34 @@ static unsigned long iova_rcache_get(struct iova_domain 
> *iovad,
>   }
>  }
>  
> +/* Insert the iova into domain rbtree by holding writer lock */
> +static void
> +iova_insert_rbtree(struct rb_root *root, struct iova *iova,
> +struct rb_node *start)
> +{
> + struct rb_node **new, *parent = NULL;
> +
> + new = (start) ? &start : &(root->rb_node);
> + /* Figure out where to put new node */
> + while (*new) {
> + struct iova *this = rb_entry(*new, struct iova, node);
> +
> + parent = *new;
> +
> + if (iova->pfn_lo < this->pfn_lo)
> + new = &((*new)->rb_left);
> + else if (iova->pfn_lo > this->pfn_lo)
> + new = &((*new)->rb_right);
> + else {
> + WARN_ON(1); /* this should not happen */
> + return;
> + }
> + }
> + /* Add new node and rebalance tree. */
> + rb_link_node(&iova->node, parent, new);
> + rb_insert_color(&iova->node, root);
> +}
> +
>  /*
>   * Computes the padding size required, to make the start address
>   * naturally aligned on the power-of-two order of its size
> @@ -157,35 +185,8 @@ static int __alloc_and_insert_iova_range(struct 
> iova_domain *iovad,
>   new->pfn_lo = limit_pfn - (size + pad_size) + 1;
>   new->pfn_hi = new->pfn_lo + size - 1;
>  
> - /* Insert the new_iova into domain rbtree by holding writer lock */
> - /* Add new node and rebalance tree. */
> - {
> - struct rb_node **entry, *parent = NULL;
> -
> - /* If we have 'prev', it's a valid place to start the
> -insertion. Otherwise, start from the root. */
> - if (prev)
> - entry = &prev;
> - else
> - entry = &iovad->rbroot.rb_node;
> -
> - /* Figure out where to put new node */
> - while (*entry) {
> - struct iova *this = rb_entry(*entry, struct iova, node);
> - parent = *entry;
> -
> - if (new->pfn_lo < this->pfn_lo)
> - entry = &((*entry)->rb_left);
> - else if (new->pfn_lo > this->pfn_lo)
> - entry = &((*entry)->rb_right);
> - else
> - BUG(); /* this should not happen */
> - }
> -
> - /* Add new node and rebalance tree. */
> - rb_link_node(&new->node, parent, entry);
> - rb_insert_color(&new->node, &iovad->rbroot);
> - }
> + /* If we have 'prev', it's a valid place to start the insertion. */
> + iova_insert_rbtree(&iovad->rbroot, new, prev);
>   __cached_rbnode_insert_update(iovad, saved_pfn, new);
>  
>   spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
> @@ -194,28 +195,6 @@ static int __alloc_and_insert_iova_range(struct 
> iova_domain *iovad,
>   return 0;
>  }
>  
> -static void
> -iova_insert_rbtree(struct rb_root *root, struct iova *iova)
> -{
> - struct rb_node **new = &(root->rb_node), *parent = NULL;
> - /* Figure out where to put new node */
> - while (*new) {
> - struct iova *this = rb_entry(*new, struct iova, node);
> -
> - parent = *new;
> -
> - if (iova->pfn_lo < this->pfn_lo)
> - new = &((*new)->rb_left);
> - else if (iova->pfn_lo > this->pfn_lo)
> - new = &((*new)->rb_right);
> - else
> -  

Re: [RFC PATCH] iommu/arm-smmu: Add global SMR masking property

2017-03-02 Thread Robin Murphy
On 02/03/17 04:18, Nipun Gupta wrote:
> 
> Hi Robin/Will,
> 
> This patch is currently not applied on the tree.
> I had verified the patch and it seems good.
> Is ack required on the patch or do I need to send a non RFC patch (with 
> Robin's signoff)?
> This is very much required to support SMMU on NXP platform.

It's still sat in my "patches to do something with" queue - I don't
think we ever reached a concrete decision on the property name for a DT
maintainer ack, but I've tweaked the description per Will's comment;
thanks for the reminder. I'll send an rc1-based version out next week to
reboot the discussion.

Robin.

> 
> Thanks,
> Nipun
> 
> 
>> -----Original Message-----
>> From: Nipun Gupta
>> Sent: Sunday, December 18, 2016 2:37
>> To: Robin Murphy ; iommu@lists.linux-
>> foundation.org; devicet...@vger.kernel.org; linux-arm-
>> ker...@lists.infradead.org
>> Cc: mark.rutl...@arm.com; will.dea...@arm.com; Stuart Yoder
>> 
>> Subject: RE: [RFC PATCH] iommu/arm-smmu: Add global SMR masking property
>>
>>
>>
>>> -----Original Message-----
>>> From: iommu-boun...@lists.linux-foundation.org [mailto:iommu-
>>> boun...@lists.linux-foundation.org] On Behalf Of Robin Murphy
>>> Sent: Friday, December 16, 2016 18:49
>>> To: iommu@lists.linux-foundation.org; devicet...@vger.kernel.org; linux-
>> arm-
>>> ker...@lists.infradead.org
>>> Cc: mark.rutl...@arm.com; will.dea...@arm.com; Stuart Yoder
>>> 
>>> Subject: [RFC PATCH] iommu/arm-smmu: Add global SMR masking property
>>>
>>> The current SMR masking support using a 2-cell iommu-specifier is
>>> primarily intended to handle individual masters with large and/or
>>> complex Stream ID assignments; it quickly gets a bit clunky in other SMR
>>> use-cases where we just want to consistently mask out the same part of
>>> every Stream ID (e.g. for MMU-500 configurations where the appended TBU
>>> number gets in the way unnecessarily). Let's add a new property to allow
>>> a single global mask value to better fit the latter situation.
>>>
>>> CC: Stuart Yoder 
>>
>> Tested-by: Nipun Gupta 
>>
>>> Signed-off-by: Robin Murphy 
>>> ---
>>>
>>> Compile-tested only...
>>>
>>>  Documentation/devicetree/bindings/iommu/arm,smmu.txt | 8 
>>>  drivers/iommu/arm-smmu.c | 4 +++-
>>>  2 files changed, 11 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>>> b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>>> index e862d1485205..98f5cbe5fdb4 100644
>>> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>>> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>>> @@ -60,6 +60,14 @@ conditions.
>>>aliases of secure registers have to be used during
>>>SMMU configuration.
>>>
>>> +- stream-match-mask : Specifies a fixed SMR mask value to combine with
>>> +  the Stream ID value from every iommu-specifier. This
>>> +  may be used instead of an "#iommu-cells" value of 2
>>> +  when there is no need for per-master SMR masks, but
>>> +  it is still desired to mask some portion of every
>>> +  Stream ID (e.g. for certain MMU-500 configurations
>>> +  given globally unique external IDs).
>>> +
>>>  ** Deprecated properties:
>>>
>>>  - mmu-masters (deprecated in favour of the generic "iommus" binding) :
>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>> index 8f7281444551..f1abcb7dde36 100644
>>> --- a/drivers/iommu/arm-smmu.c
>>> +++ b/drivers/iommu/arm-smmu.c
>>> @@ -1534,13 +1534,15 @@ static int arm_smmu_domain_set_attr(struct
>>> iommu_domain *domain,
>>>
>>>  static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args
>> *args)
>>>  {
>>> -   u32 fwid = 0;
>>> +   u32 mask, fwid = 0;
>>>
>>> if (args->args_count > 0)
>>> fwid |= (u16)args->args[0];
>>>
>>> if (args->args_count > 1)
>>> fwid |= (u16)args->args[1] << SMR_MASK_SHIFT;
>>> +   else if (!of_property_read_u32(args->np, "stream-match-mask",
>>> &mask))
>>> +   fwid |= (u16)mask << SMR_MASK_SHIFT;
>>>
>>> return iommu_fwspec_add_ids(dev, &fwid, 1);
>>>  }
>>> --
>>> 2.10.2.dirty
>>>
>>> ___
>>> iommu mailing list
>>> iommu@lists.linux-foundation.org
>>> https://lists.linuxfoundation.org/mailman/listinfo/iommu

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm-smmu: Report smmu type in dmesg

2017-03-06 Thread Robin Murphy
On 06/03/17 13:58, Robert Richter wrote:
> The ARM SMMU detection especially depends from system firmware. For
> better diagnostic, log the detected type in dmesg.

This paragraph especially depends from grammar. I think.

> The smmu type's name is now stored in struct arm_smmu_type and ACPI
> code is modified to use that struct too. Rename ARM_SMMU_MATCH_DATA()
> macro to ARM_SMMU_TYPE() for better readability.
> 
> Signed-off-by: Robert Richter 
> ---
>  drivers/iommu/arm-smmu.c | 61 
> 
>  1 file changed, 30 insertions(+), 31 deletions(-)

That seems a relatively invasive diffstat for the sake of printing a
string once at boot time to what I can only assume is a small audience
of firmware developers who find "cat
/sys/firmware/devicetree/base/iommu*/compatible" (or the ACPI
equivalent) too hard ;)

Assuming there is a really good reason somewhere to justify this, I
still wonder if a simple self-contained "smmu->model to string" function
wouldn't do, if we really want to do this? Maybe it's not quite that
simple if the generic case needs to key off smmu->version as well, but
still. Arguably, just searching the of_match_table by model/version and
printing the corresponding DT compatible would do the job (given an
MMU-400 model to disambiguate those).

Either way it ought to be replacing the "SMMUv%d with:" message in
arm_smmu_device_cfg_probe() - this driver is noisy enough already
without starting to repeat itself.

> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index abf6496843a6..5c793b3d3173 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -366,6 +366,7 @@ struct arm_smmu_device {
>   u32 options;
>   enum arm_smmu_arch_version  version;
>   enum arm_smmu_implementationmodel;
> + const char  *name;

If we are going to add a pointer to static implementation data, it may
as well be a "const arm_smmu_type *type" pointer to subsume version and
model as well.

>  
>   u32 num_context_banks;
>   u32 num_s2_context_banks;
> @@ -1955,19 +1956,20 @@ static int arm_smmu_device_cfg_probe(struct 
> arm_smmu_device *smmu)
>   return 0;
>  }
>  
> -struct arm_smmu_match_data {
> +struct arm_smmu_type {
>   enum arm_smmu_arch_version version;
>   enum arm_smmu_implementation model;
> + const char *name;
>  };
>  
> -#define ARM_SMMU_MATCH_DATA(name, ver, imp)  \
> -static struct arm_smmu_match_data name = { .version = ver, .model = imp }
> +#define ARM_SMMU_TYPE(var, ver, imp, _name)  \
> +static struct arm_smmu_type var = { .version = ver, .model = imp, .name = 
> _name }
>  
> -ARM_SMMU_MATCH_DATA(smmu_generic_v1, ARM_SMMU_V1, GENERIC_SMMU);
> -ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU);
> -ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, GENERIC_SMMU);
> -ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
> -ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);
> +ARM_SMMU_TYPE(smmu_generic_v1, ARM_SMMU_V1, GENERIC_SMMU, "smmu-generic-v1");
> +ARM_SMMU_TYPE(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU, "smmu-generic-v2");
> +ARM_SMMU_TYPE(arm_mmu401, ARM_SMMU_V1_64K, GENERIC_SMMU, "arm-mmu401");

Strictly, I think you mean "ARM® CoreLink™ MMU-401". Printing the name
of a driver-internal structure looks like someone left a debugging hack in.

> +ARM_SMMU_TYPE(arm_mmu500, ARM_SMMU_V2, ARM_MMU500, "arm-mmu500");

And similarly. Of course, I'm not actually suggesting that using the
full marketing names of things is a great idea, but then again if we do
go naming specific IPs, and anyone gets sniffy about using the names
"properly", then guess what's going to get removed again? You'll always
find me firmly in the "easier not to go there" camp.

Robin.

> +ARM_SMMU_TYPE(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2, "cavium-smmuv2");
>  
>  static const struct of_device_id arm_smmu_of_match[] = {
>   { .compatible = "arm,smmu-v1", .data = &smmu_generic_v1 },
> @@ -1981,29 +1983,19 @@ static const struct of_device_id arm_smmu_of_match[] 
> = {
>  MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
>  
>  #ifdef CONFIG_ACPI
> -static int acpi_smmu_get_data(u32 model, struct arm_smmu_device *smmu)
> +static struct arm_smmu_type *acpi_smmu_get_type(u32 model)
>  {
> - int ret = 0;
> -
>   switch (model) {
>   case ACPI_IORT_SMMU_V1:
>   case ACPI_IORT_SMMU_CORELINK_MMU400:
> - smmu->version = ARM_SMMU_V1;
> - smmu->model = GENERIC_SMMU;
> - break;
> + return &smmu_generic_v1;
>   case ACPI_IORT_SMMU_V2:
> - smmu->version = ARM_SMMU_V2;
> - smmu->model = GENERIC_SMMU;
> - break;
> + return &smmu_generic_v2;
>   case ACPI_IORT_SMMU_CORELINK_MMU500:
> - smmu->version = ARM_SMMU_V2;
> - smmu->model = ARM_MMU500;

Re: [PATCH 5/9] iommu: add qcom_iommu

2017-03-07 Thread Robin Murphy
On 01/03/17 17:42, Rob Clark wrote:
> An iommu driver for Qualcomm "B" family devices which do not completely
> implement the ARM SMMU spec.

Is that actually true, or is it just that it's a compliant SMMU on which
firmware has set SCR1.GASRAE? (which makes the global address space
secure-access-only). I don't know which Qualcomm SoCs are the ones
apparently using a plain ARM MMU-500 IP, but if any of those are also
running this particular firmware configuration that puts us in a
somewhat weird situation with respect to drivers :/

Robin.

>  These devices have context-bank register
> layout that is similar to ARM SMMU, but no global register space (or at
> least not one that is accessible).
> 
> Signed-off-by: Rob Clark 
> ---
>  drivers/iommu/Kconfig |  10 +
>  drivers/iommu/Makefile|   1 +
>  drivers/iommu/arm-smmu-regs.h |   2 +
>  drivers/iommu/qcom_iommu.c| 825 
> ++
>  4 files changed, 838 insertions(+)
>  create mode 100644 drivers/iommu/qcom_iommu.c
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 37e204f..400a404 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -359,4 +359,14 @@ config MTK_IOMMU_V1
>  
> if unsure, say N here.
>  
> +config QCOM_IOMMU
> + bool "Qualcomm IOMMU Support"
> + depends on ARM || ARM64
> + depends on ARCH_QCOM || COMPILE_TEST
> + select IOMMU_API
> + select IOMMU_IO_PGTABLE_LPAE
> + select ARM_DMA_USE_IOMMU
> + help
> +   Support for IOMMU on certain Qualcomm SoCs.
> +
>  endif # IOMMU_SUPPORT
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 195f7b9..b910aea 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
>  obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
>  obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
>  obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
> +obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
> diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
> index 632240f..e643164 100644
> --- a/drivers/iommu/arm-smmu-regs.h
> +++ b/drivers/iommu/arm-smmu-regs.h
> @@ -174,6 +174,8 @@ enum arm_smmu_s2cr_privcfg {
>  #define ARM_SMMU_CB_S1_TLBIVAL   0x620
>  #define ARM_SMMU_CB_S2_TLBIIPAS2 0x630
>  #define ARM_SMMU_CB_S2_TLBIIPAS2L0x638
> +#define ARM_SMMU_CB_TLBSYNC  0x7f0
> +#define ARM_SMMU_CB_TLBSTATUS0x7f4
>  #define ARM_SMMU_CB_ATS1PR   0x800
>  #define ARM_SMMU_CB_ATSR 0x8f0
>  
> diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
> new file mode 100644
> index 000..5d3bb63
> --- /dev/null
> +++ b/drivers/iommu/qcom_iommu.c
> @@ -0,0 +1,825 @@
> +/*
> + * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
> + *
> + * Copyright (C) 2013 ARM Limited
> + * Copyright (C) 2017 Red Hat
> + */
> +
> +#define pr_fmt(fmt) "qcom-iommu: " fmt
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "io-pgtable.h"
> +#include "arm-smmu-regs.h"
> +
> +#define SMMU_INTR_SEL_NS 0x2000
> +
> +struct qcom_iommu_dev {
> + /* IOMMU core code handle */
> + struct iommu_device  iommu;
> + struct device   *dev;
> + struct clk  *iface_clk;
> + struct clk  *bus_clk;
> + void __iomem*local_base;
> + u32  sec_id;
> + struct list_head context_list;   /* list of qcom_iommu_context 
> */
> +};
> +
> +struct qcom_iommu_ctx {
> + struct device   *dev;
> + void __iomem*base;
> + unsigned int irq;
> + bool secure_init;
> + u32  asid;  /* asid and ctx bank # are 1:1 */
> + struct iommu_group  *group;
> + struct list_head node;  /* head in 
> qcom_iommu_device::context_list */
> +};
> +
> +struct qcom_iommu_domain {
> + struct io_pgtable_ops   *pgtbl_ops;
> + spinlock_t   pgtbl_l

[PATCH 1/4] iommu/arm-smmu: Handle size mismatches better

2017-03-07 Thread Robin Murphy
We currently warn if the firmware-described region size differs from the
SMMU address space size reported by the hardware, but continue to use
the former to calculate where our context bank base should be,
effectively guaranteeing that things will not work correctly.

Since over-mapping is effectively harmless, and under-mapping can be OK
provided all the usable context banks are still covered, let's let the
hardware information take precedence in the case of a mismatch, such
that we get the correct context bank base and in most cases things will
actually work instead of silently misbehaving. And at worst, if the
firmware is wrong enough to have not mapped something we actually try to
use, the resulting out-of-bounds access will hopefully provide a much
more obvious clue.

Signed-off-by: Robin Murphy 
---
 drivers/iommu/arm-smmu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index abf6496843a6..bc7ef6a0c54d 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1864,10 +1864,12 @@ static int arm_smmu_device_cfg_probe(struct 
arm_smmu_device *smmu)
/* Check for size mismatch of SMMU address space from mapped region */
size = 1 << (((id >> ID1_NUMPAGENDXB_SHIFT) & ID1_NUMPAGENDXB_MASK) + 
1);
size *= 2 << smmu->pgshift;
-   if (smmu->size != size)
+   if (smmu->size != size) {
dev_warn(smmu->dev,
"SMMU address space size (0x%lx) differs from mapped 
region size (0x%lx)!\n",
size, smmu->size);
+   smmu->size = size;
+   }
 
smmu->num_s2_context_banks = (id >> ID1_NUMS2CB_SHIFT) & 
ID1_NUMS2CB_MASK;
smmu->num_context_banks = (id >> ID1_NUMCB_SHIFT) & ID1_NUMCB_MASK;
-- 
2.11.0.dirty



[PATCH 0/4] ARM SMMU per-context TLB sync

2017-03-07 Thread Robin Murphy
The discussion around context-level access for Qualcomm SMMUs reminded
me to dig up this patch I started ages ago and finish it off. As it's
ended up, it's now a mini-series, with some new preparatory cleanup
manifesting as patches 2 and 3. Patch 1 is broken out of patch 3 for
clarity as somewhat of a fix in its own right, in that it's really an
entirely unrelated thing which came up in parallel, but happens to
be inherent to code I'm touching here anyway.

Robin.

Robin Murphy (4):
  iommu/arm-smmu: Handle size mismatches better
  iommu/arm-smmu: Simplify ASID/VMID handling
  iommu/arm-smmu: Tidy up context bank indexing
  iommu/arm-smmu: Use per-context TLB sync as appropriate

 drivers/iommu/arm-smmu.c | 207 +--
 1 file changed, 127 insertions(+), 80 deletions(-)

-- 
2.11.0.dirty



[PATCH 2/4] iommu/arm-smmu: Simplify ASID/VMID handling

2017-03-07 Thread Robin Murphy
Calculating ASIDs/VMIDs dynamically from arm_smmu_cfg was a neat trick,
but the global uniqueness workaround makes it somewhat more awkward, and
means we end up having to pass extra state around in certain cases just
to keep a handle on the offset.

We already have 16 bits going spare in arm_smmu_cfg; let's just
precalculate an ASID/VMID, plop it in there, and tidy up the users
accordingly. We'd also need something like this anyway if we ever get
near to thinking about SVM, so it's no bad thing.

Signed-off-by: Robin Murphy 
---
 drivers/iommu/arm-smmu.c | 36 +++-
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index bc7ef6a0c54d..0b64852baa6a 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -404,14 +404,15 @@ enum arm_smmu_context_fmt {
 struct arm_smmu_cfg {
u8  cbndx;
u8  irptndx;
+   union {
+   u16 asid;
+   u16 vmid;
+   };
u32 cbar;
enum arm_smmu_context_fmt   fmt;
 };
 #define INVALID_IRPTNDX0xff
 
-#define ARM_SMMU_CB_ASID(smmu, cfg) ((u16)(smmu)->cavium_id_base + 
(cfg)->cbndx)
-#define ARM_SMMU_CB_VMID(smmu, cfg) ((u16)(smmu)->cavium_id_base + 
(cfg)->cbndx + 1)
-
 enum arm_smmu_domain_stage {
ARM_SMMU_DOMAIN_S1 = 0,
ARM_SMMU_DOMAIN_S2,
@@ -603,12 +604,10 @@ static void arm_smmu_tlb_inv_context(void *cookie)
 
if (stage1) {
base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
-   writel_relaxed(ARM_SMMU_CB_ASID(smmu, cfg),
-  base + ARM_SMMU_CB_S1_TLBIASID);
+   writel_relaxed(cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID);
} else {
base = ARM_SMMU_GR0(smmu);
-   writel_relaxed(ARM_SMMU_CB_VMID(smmu, cfg),
-  base + ARM_SMMU_GR0_TLBIVMID);
+   writel_relaxed(cfg->vmid, base + ARM_SMMU_GR0_TLBIVMID);
}
 
__arm_smmu_tlb_sync(smmu);
@@ -629,14 +628,14 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long 
iova, size_t size,
 
if (cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64) {
iova &= ~12UL;
-   iova |= ARM_SMMU_CB_ASID(smmu, cfg);
+   iova |= cfg->asid;
do {
writel_relaxed(iova, reg);
iova += granule;
} while (size -= granule);
} else {
iova >>= 12;
-   iova |= (u64)ARM_SMMU_CB_ASID(smmu, cfg) << 48;
+   iova |= (u64)cfg->asid << 48;
do {
writeq_relaxed(iova, reg);
iova += granule >> 12;
@@ -653,7 +652,7 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long 
iova, size_t size,
} while (size -= granule);
} else {
reg = ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_TLBIVMID;
-   writel_relaxed(ARM_SMMU_CB_VMID(smmu, cfg), reg);
+   writel_relaxed(cfg->vmid, reg);
}
 }
 
@@ -735,7 +734,7 @@ static void arm_smmu_init_context_bank(struct 
arm_smmu_domain *smmu_domain,
reg = CBA2R_RW64_32BIT;
/* 16-bit VMIDs live in CBA2R */
if (smmu->features & ARM_SMMU_FEAT_VMID16)
-   reg |= ARM_SMMU_CB_VMID(smmu, cfg) << CBA2R_VMID_SHIFT;
+   reg |= cfg->vmid << CBA2R_VMID_SHIFT;
 
writel_relaxed(reg, gr1_base + ARM_SMMU_GR1_CBA2R(cfg->cbndx));
}
@@ -754,26 +753,24 @@ static void arm_smmu_init_context_bank(struct 
arm_smmu_domain *smmu_domain,
(CBAR_S1_MEMATTR_WB << CBAR_S1_MEMATTR_SHIFT);
} else if (!(smmu->features & ARM_SMMU_FEAT_VMID16)) {
/* 8-bit VMIDs live in CBAR */
-   reg |= ARM_SMMU_CB_VMID(smmu, cfg) << CBAR_VMID_SHIFT;
+   reg |= cfg->vmid << CBAR_VMID_SHIFT;
}
writel_relaxed(reg, gr1_base + ARM_SMMU_GR1_CBAR(cfg->cbndx));
 
/* TTBRs */
if (stage1) {
-   u16 asid = ARM_SMMU_CB_ASID(smmu, cfg);
-
if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_S) {
reg = pgtbl_cfg->arm_v7s_cfg.ttbr[0];
writel_relaxed(reg, cb_base + ARM_SMMU_CB_TTBR0);
reg = pgtbl_cfg->arm_v7s_cfg.ttbr[1];
writel_relaxed(reg, cb_base + ARM_SMMU_CB_TTBR1);
-   writel_relaxed(asid, cb

[PATCH 3/4] iommu/arm-smmu: Tidy up context bank indexing

2017-03-07 Thread Robin Murphy
ARM_SMMU_CB() is calculated relative to ARM_SMMU_CB_BASE(), but the
latter is never of use on its own, and what we end up with is the same
ARM_SMMU_CB_BASE() + ARM_SMMU_CB() expression being duplicated at every
callsite. Folding the two together gives us a self-contained context
bank accessor which is much more pleasant to work with.

Secondly, we might as well simplify CB_BASE itself at the same time.
We use the address space size for its own sake precisely once, at probe
time, and every other usage is to dynamically calculate CB_BASE over
and over and over again. Let's flip things around so that we just
maintain the CB_BASE address directly.

Signed-off-by: Robin Murphy 
---
 drivers/iommu/arm-smmu.c | 31 +++
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 0b64852baa6a..c8aafe304171 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -216,8 +216,7 @@ enum arm_smmu_s2cr_privcfg {
 #define CBA2R_VMID_MASK0x
 
 /* Translation context bank */
-#define ARM_SMMU_CB_BASE(smmu) ((smmu)->base + ((smmu)->size >> 1))
-#define ARM_SMMU_CB(smmu, n)   ((n) * (1 << (smmu)->pgshift))
+#define ARM_SMMU_CB(smmu, n)   ((smmu)->cb_base + ((n) << (smmu)->pgshift))
 
 #define ARM_SMMU_CB_SCTLR  0x0
 #define ARM_SMMU_CB_ACTLR  0x4
@@ -344,7 +343,7 @@ struct arm_smmu_device {
struct device   *dev;
 
void __iomem*base;
-   unsigned long   size;
+   void __iomem*cb_base;
unsigned long   pgshift;
 
 #define ARM_SMMU_FEAT_COHERENT_WALK(1 << 0)
@@ -603,7 +602,7 @@ static void arm_smmu_tlb_inv_context(void *cookie)
void __iomem *base;
 
if (stage1) {
-   base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
+   base = ARM_SMMU_CB(smmu, cfg->cbndx);
writel_relaxed(cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID);
} else {
base = ARM_SMMU_GR0(smmu);
@@ -623,7 +622,7 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long 
iova, size_t size,
void __iomem *reg;
 
if (stage1) {
-   reg = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
+   reg = ARM_SMMU_CB(smmu, cfg->cbndx);
reg += leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA;
 
if (cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64) {
@@ -642,7 +641,7 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long 
iova, size_t size,
} while (size -= granule);
}
} else if (smmu->version == ARM_SMMU_V2) {
-   reg = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
+   reg = ARM_SMMU_CB(smmu, cfg->cbndx);
reg += leaf ? ARM_SMMU_CB_S2_TLBIIPAS2L :
  ARM_SMMU_CB_S2_TLBIIPAS2;
iova >>= 12;
@@ -672,7 +671,7 @@ static irqreturn_t arm_smmu_context_fault(int irq, void 
*dev)
struct arm_smmu_device *smmu = smmu_domain->smmu;
void __iomem *cb_base;
 
-   cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
+   cb_base = ARM_SMMU_CB(smmu, cfg->cbndx);
fsr = readl_relaxed(cb_base + ARM_SMMU_CB_FSR);
 
if (!(fsr & FSR_FAULT))
@@ -725,7 +724,7 @@ static void arm_smmu_init_context_bank(struct 
arm_smmu_domain *smmu_domain,
 
gr1_base = ARM_SMMU_GR1(smmu);
stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS;
-   cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
+   cb_base = ARM_SMMU_CB(smmu, cfg->cbndx);
 
if (smmu->version > ARM_SMMU_V1) {
if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH64)
@@ -1007,7 +1006,7 @@ static void arm_smmu_destroy_domain_context(struct 
iommu_domain *domain)
 * Disable the context bank and free the page tables before freeing
 * it.
 */
-   cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
+   cb_base = ARM_SMMU_CB(smmu, cfg->cbndx);
writel_relaxed(0, cb_base + ARM_SMMU_CB_SCTLR);
 
if (cfg->irptndx != INVALID_IRPTNDX) {
@@ -1358,7 +1357,7 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct 
iommu_domain *domain,
u64 phys;
unsigned long va;
 
-   cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
+   cb_base = ARM_SMMU_CB(smmu, cfg->cbndx);
 
/* ATS1 registers can only be written atomically */
va = iova & ~0xfffUL;
@@ -1685,7 +1684,7 @@ static void arm_smmu_device_reset(struct arm_smmu_device 
*smmu)
 
/* Make sure all context banks are disabled and clear CB_FSR  */
for (i = 0; i < smmu->num_context

[PATCH 4/4] iommu/arm-smmu: Use per-context TLB sync as appropriate

2017-03-07 Thread Robin Murphy
TLB synchronisation typically involves the SMMU blocking all incoming
transactions until the TLBs report completion of all outstanding
operations. In the common SMMUv2 configuration of a single distributed
SMMU serving multiple peripherals, that means that a single unmap
request has the potential to bring the hammer down on the entire system
if synchronised globally. Since stage 1 contexts, and stage 2 contexts
under SMMUv2, offer local sync operations, let's make use of those
wherever we can in the hope of minimising global disruption.

To that end, rather than add any more branches to the already unwieldy
monolithic TLB maintenance ops, break them up into smaller, neater,
functions which we can then mix and match as appropriate.

Signed-off-by: Robin Murphy 
---
 drivers/iommu/arm-smmu.c | 156 ++-
 1 file changed, 100 insertions(+), 56 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index c8aafe304171..f7411109670f 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -237,6 +237,8 @@ enum arm_smmu_s2cr_privcfg {
 #define ARM_SMMU_CB_S1_TLBIVAL 0x620
 #define ARM_SMMU_CB_S2_TLBIIPAS2   0x630
 #define ARM_SMMU_CB_S2_TLBIIPAS2L  0x638
+#define ARM_SMMU_CB_TLBSYNC0x7f0
+#define ARM_SMMU_CB_TLBSTATUS  0x7f4
 #define ARM_SMMU_CB_ATS1PR 0x800
 #define ARM_SMMU_CB_ATSR   0x8f0
 
@@ -569,14 +571,13 @@ static void __arm_smmu_free_bitmap(unsigned long *map, 
int idx)
 }
 
 /* Wait for any pending TLB invalidations to complete */
-static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu)
+static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu,
+   void __iomem *sync, void __iomem *status)
 {
int count = 0;
-   void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
 
-   writel_relaxed(0, gr0_base + ARM_SMMU_GR0_sTLBGSYNC);
-   while (readl_relaxed(gr0_base + ARM_SMMU_GR0_sTLBGSTATUS)
-  & sTLBGSTATUS_GSACTIVE) {
+   writel_relaxed(0, sync);
+   while (readl_relaxed(status) & sTLBGSTATUS_GSACTIVE) {
cpu_relax();
if (++count == TLB_LOOP_TIMEOUT) {
dev_err_ratelimited(smmu->dev,
@@ -587,29 +588,49 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device 
*smmu)
}
 }
 
-static void arm_smmu_tlb_sync(void *cookie)
+static void arm_smmu_tlb_sync_global(struct arm_smmu_device *smmu)
 {
-   struct arm_smmu_domain *smmu_domain = cookie;
-   __arm_smmu_tlb_sync(smmu_domain->smmu);
+   void __iomem *base = ARM_SMMU_GR0(smmu);
+
+   __arm_smmu_tlb_sync(smmu, base + ARM_SMMU_GR0_sTLBGSYNC,
+   base + ARM_SMMU_GR0_sTLBGSTATUS);
 }
 
-static void arm_smmu_tlb_inv_context(void *cookie)
+static void arm_smmu_tlb_sync_context(void *cookie)
+{
+   struct arm_smmu_domain *smmu_domain = cookie;
+   struct arm_smmu_device *smmu = smmu_domain->smmu;
+   void __iomem *base = ARM_SMMU_CB(smmu, smmu_domain->cfg.cbndx);
+
+   __arm_smmu_tlb_sync(smmu, base + ARM_SMMU_CB_TLBSYNC,
+   base + ARM_SMMU_CB_TLBSTATUS);
+}
+
+static void arm_smmu_tlb_sync_vmid(void *cookie)
+{
+   struct arm_smmu_domain *smmu_domain = cookie;
+
+   arm_smmu_tlb_sync_global(smmu_domain->smmu);
+}
+
+static void arm_smmu_tlb_inv_context_s1(void *cookie)
 {
struct arm_smmu_domain *smmu_domain = cookie;
struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+   void __iomem *base = ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx);
+
+   writel_relaxed(cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID);
+   arm_smmu_tlb_sync_context(cookie);
+}
+
+static void arm_smmu_tlb_inv_context_s2(void *cookie)
+{
+   struct arm_smmu_domain *smmu_domain = cookie;
struct arm_smmu_device *smmu = smmu_domain->smmu;
-   bool stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS;
-   void __iomem *base;
+   void __iomem *base = ARM_SMMU_GR0(smmu);
 
-   if (stage1) {
-   base = ARM_SMMU_CB(smmu, cfg->cbndx);
-   writel_relaxed(cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID);
-   } else {
-   base = ARM_SMMU_GR0(smmu);
-   writel_relaxed(cfg->vmid, base + ARM_SMMU_GR0_TLBIVMID);
-   }
-
-   __arm_smmu_tlb_sync(smmu);
+   writel_relaxed(smmu_domain->cfg.vmid, base + ARM_SMMU_GR0_TLBIVMID);
+   arm_smmu_tlb_sync_global(smmu);
 }
 
 static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
@@ -617,48 +638,66 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long 
iova, size_t size,
 {
struct arm_smmu_domain *smmu_domain = cookie;
struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
-   struct arm_smmu_device *smmu = smmu_domain->smmu;
bool stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS;
-   void __iomem *reg;
+   void __i

[PATCH] iommu/arm-smmu: Add global SMR masking property

2017-03-07 Thread Robin Murphy
The current SMR masking support using a 2-cell iommu-specifier is
primarily intended to handle individual masters with large and/or
complex Stream ID assignments; it quickly gets a bit clunky in other SMR
use-cases where we just want to consistently mask out the same part of
every Stream ID (e.g. for MMU-500 configurations where the appended TBU
number gets in the way unnecessarily). Let's add a new property to allow
a single global mask value to better fit the latter situation.

Tested-by: Nipun Gupta 
Signed-off-by: Robin Murphy 
---

Time to rekindle the discussion about whether an architecture-level
concept with a rather specific name needs a vendor prefix ;)

Robin.

 Documentation/devicetree/bindings/iommu/arm,smmu.txt | 10 ++
 drivers/iommu/arm-smmu.c |  4 +++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt 
b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index 6cdf32d037fc..d66f355e174f 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -60,6 +60,16 @@ conditions.
   aliases of secure registers have to be used during
   SMMU configuration.
 
+- stream-match-mask : Specifies a fixed SMR mask value to combine with
+  the Stream ID value from every iommu-specifier. This
+  may be used instead of an "#iommu-cells" value of 2
+  when there is no need for per-master SMR masks, but
+  it is still desired to mask some portion of every
+  Stream ID (e.g. for certain MMU-500 configurations
+  given globally unique external IDs). This property is
+  not valid for SMMUs using stream indexing, and may be
+  ignored if stream matching is not supported.
+
 ** Deprecated properties:
 
 - mmu-masters (deprecated in favour of the generic "iommus" binding) :
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index abf6496843a6..e394d55146a6 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1590,13 +1590,15 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
 
 static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
 {
-   u32 fwid = 0;
+   u32 mask, fwid = 0;
 
if (args->args_count > 0)
fwid |= (u16)args->args[0];
 
if (args->args_count > 1)
fwid |= (u16)args->args[1] << SMR_MASK_SHIFT;
+   else if (!of_property_read_u32(args->np, "stream-match-mask", &mask))
+   fwid |= (u16)mask << SMR_MASK_SHIFT;
 
return iommu_fwspec_add_ids(dev, &fwid, 1);
 }
-- 
2.11.0.dirty
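For illustration, here is a hedged sketch of what the new property might look like in practice. The node names, addresses, interrupt values, and Stream IDs below are all made up; only the `stream-match-mask` / single-cell `iommus` combination reflects the binding added above.

```dts
/* Hypothetical MMU-500 setup where an appended TBU number occupies the
 * upper bits of every incoming Stream ID. Rather than giving each master
 * a 2-cell specifier, the mask is applied globally by the SMMU node.
 */
smmu: iommu@2b400000 {
	compatible = "arm,mmu-500", "arm,smmu-v2";
	reg = <0x2b400000 0x10000>;
	#global-interrupts = <1>;
	interrupts = <0 59 4>, <0 60 4>;
	#iommu-cells = <1>;
	stream-match-mask = <0x7c00>;
};

master@40000000 {
	/* Single-cell specifier; 0x7c00 is ORed into the SMR mask
	 * by arm_smmu_of_xlate() per the driver change above. */
	iommus = <&smmu 0x123>;
};
```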

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm-smmu: Report smmu type in dmesg

2017-03-07 Thread Robin Murphy
On 07/03/17 14:06, Robert Richter wrote:
> On 06.03.17 18:22:08, Robin Murphy wrote:
>> On 06/03/17 13:58, Robert Richter wrote:
>>> The ARM SMMU detection especially depends from system firmware. For
>>> better diagnostic, log the detected type in dmesg.
>>
>> This paragraph especially depends from grammar. I think.
> 
> Thanks for the mail on you. :)
> 
>>
>>> The smmu type's name is now stored in struct arm_smmu_type and ACPI
>>> code is modified to use that struct too. Rename ARM_SMMU_MATCH_DATA()
>>> macro to ARM_SMMU_TYPE() for better readability.
>>>
>>> Signed-off-by: Robert Richter 
>>> ---
>>>  drivers/iommu/arm-smmu.c | 61 
>>> 
>>>  1 file changed, 30 insertions(+), 31 deletions(-)
>>
>> That seems a relatively invasive diffstat for the sake of printing a
>> string once at boot time to what I can only assume is a small audience
>> of firmware developers who find "cat
>> /sys/firmware/devicetree/base/iommu*/compatible" (or the ACPI
>> equivalent) too hard ;)
> 
> Reading firmware data is not really a solution as you don't know what
> the driver is doing with it. The actual background of this patch is to
> be sure a certain workaround was enabled in the kernel. ARM's CPU
> errata framework does this nicely. In case of SMMUs we just have the
> internal model implementation type which is not visible in the logs.
> Right now, there is no way to figure that out without knowing fw
> specifics and kernel sources.

Ah, now it starts to become clear. In that case, if we want to confirm
the presence of specific workarounds, we should actually _confirm the
presence of specific workarounds_. I'd have no complaint with e.g. this:

-8<-
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index f7411109670f..9e50a092632c 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1934,6 +1934,8 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
atomic_add_return(smmu->num_context_banks,
  &cavium_smmu_context_count);
smmu->cavium_id_base -= smmu->num_context_banks;
+   dev_notice(smmu->dev, "\tusing ASID/VMID offset %u\n",
+  smmu->cavium_id_base);
}

/* ID2 */
->8-

and the equivalent for other things, if need be. If you just print "hey,
this is SMMU-x", the user is in fact no better off, since they would
then still have to go and look at the source for whatever kernel they're
running to find out which particular workarounds for SMMU-x bugs that
particular kernel implements.

Robin.

> The change is big but most of it is a reasonable rework anyway. I
> didn't want to split that into a series of patches. But I could do
> that.
> 
>> Assuming there is a really good reason somewhere to justify this, I
>> still wonder if a simple self-contained "smmu->model to string" function
>> wouldn't do, if we really want to do this? Maybe it's not quite that
>> simple if the generic case needs to key off smmu->version as well, but
>> still. Arguably, just searching the of_match_table by model/version and
>> printing the corresponding DT compatible would do the job (given an
>> MMU-400 model to disambiguate those).
> 
> Whatever you prefer. To me a dynamic search makes things more complex
> for no benefit. And providing DT names in an ACPI context is confusing
> too.
> 
>> Either way it ought to be replacing the "SMMUv%d with:" message in
>> arm_smmu_device_cfg_probe() - this driver is noisy enough already
>> without starting to repeat itself.
>>
>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>> index abf6496843a6..5c793b3d3173 100644
>>> --- a/drivers/iommu/arm-smmu.c
>>> +++ b/drivers/iommu/arm-smmu.c
>>> @@ -366,6 +366,7 @@ struct arm_smmu_device {
>>> u32 options;
>>> enum arm_smmu_arch_version  version;
>>> enum arm_smmu_implementationmodel;
>>> +   const char  *name;
>>
>> If we are going to add a pointer to static implementation data, it may
>> as well be a "const arm_smmu_type *type" pointer to subsume version and
>> model as well.
> 
> The name still may change but not the particular string. Both work for
> me.
> 
>>
>>>  
>>> u32 num_context_banks;
>>> u32 num_s

Re: [PATCH v3 01/09] iommu/ipmmu-vmsa: Introduce features, break out alias

2017-03-08 Thread Robin Murphy
Hi Magnus,

On 08/03/17 11:01, Magnus Damm wrote:
> From: Magnus Damm 
> 
> Introduce struct ipmmu_features to track various hardware
> and software implementation changes inside the driver for
> different kinds of IPMMU hardware. Add use_ns_alias_offset
> as a first example of a feature to control if the secure
> register bank offset should be used or not.
> 
> Signed-off-by: Magnus Damm 
> ---
> 
>  Changes since V2:
>  - None
> 
>  Changes since V1:
>  - Moved patch to front of the series
> 
>  drivers/iommu/ipmmu-vmsa.c |   35 ---
>  1 file changed, 28 insertions(+), 7 deletions(-)
> 
> --- 0007/drivers/iommu/ipmmu-vmsa.c
> +++ work/drivers/iommu/ipmmu-vmsa.c   2017-03-07 12:25:47.0 +0900
> @@ -32,11 +32,15 @@
>  
>  #define IPMMU_CTX_MAX 1
>  
> +struct ipmmu_features {
> + bool use_ns_alias_offset;
> +};
> +
>  struct ipmmu_vmsa_device {
>   struct device *dev;
>   void __iomem *base;
>   struct list_head list;
> -
> + const struct ipmmu_features *features;
>   unsigned int num_utlbs;
>   spinlock_t lock;/* Protects ctx and domains[] */
>   DECLARE_BITMAP(ctx, IPMMU_CTX_MAX);
> @@ -999,13 +1003,33 @@ static void ipmmu_device_reset(struct ip
>   ipmmu_write(mmu, i * IM_CTX_SIZE + IMCTR, 0);
>  }
>  
> +static const struct ipmmu_features ipmmu_features_default = {
> + .use_ns_alias_offset = true,
> +};
> +
> +static const struct of_device_id ipmmu_of_ids[] = {
> + {
> + .compatible = "renesas,ipmmu-vmsa",
> + .data = &ipmmu_features_default,
> + }, {
> + /* Terminator */
> + },
> +};
> +
> +MODULE_DEVICE_TABLE(of, ipmmu_of_ids);
> +
>  static int ipmmu_probe(struct platform_device *pdev)
>  {
>   struct ipmmu_vmsa_device *mmu;
> + const struct of_device_id *match;
>   struct resource *res;
>   int irq;
>   int ret;
>  
> + match = of_match_node(ipmmu_of_ids, pdev->dev.of_node);

of_device_get_match_data() makes this a lot easier.

> + if (!match)
> + return -EINVAL;

Also, if the driver is DT-only per the other series, note that this
cannot happen anyway, since of_driver_match_device() would have to have
found a match for your probe function to be called in the first place.

Robin.

> +
>   mmu = devm_kzalloc(&pdev->dev, sizeof(*mmu), GFP_KERNEL);
>   if (!mmu) {
>   dev_err(&pdev->dev, "cannot allocate device data\n");
> @@ -1016,6 +1040,7 @@ static int ipmmu_probe(struct platform_d
>   mmu->num_utlbs = 32;
>   spin_lock_init(&mmu->lock);
>   bitmap_zero(mmu->ctx, IPMMU_CTX_MAX);
> + mmu->features = match->data;
>  
>   /* Map I/O memory and request IRQ. */
>   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> @@ -1035,7 +1060,8 @@ static int ipmmu_probe(struct platform_d
>* Offset the registers base unconditionally to point to the non-secure
>* alias space for now.
>*/
> - mmu->base += IM_NS_ALIAS_OFFSET;
> + if (mmu->features->use_ns_alias_offset)
> + mmu->base += IM_NS_ALIAS_OFFSET;
>  
>   irq = platform_get_irq(pdev, 0);
>   if (irq < 0) {
> @@ -1084,11 +1110,6 @@ static int ipmmu_remove(struct platform_
>   return 0;
>  }
>  
> -static const struct of_device_id ipmmu_of_ids[] = {
> - { .compatible = "renesas,ipmmu-vmsa", },
> - { }
> -};
> -
>  static struct platform_driver ipmmu_driver = {
>   .driver = {
>   .name = "ipmmu-vmsa",
> 



Re: [PATCH v3 03/09] iommu/ipmmu-vmsa: Enable multi context support

2017-03-08 Thread Robin Murphy
On 08/03/17 11:01, Magnus Damm wrote:
> From: Magnus Damm 
> 
> Add support for up to 8 contexts. Each context is mapped to one
> domain. One domain is assigned one or more slave devices. Contexts
> are allocated dynamically and slave devices are grouped together
> based on which IPMMU device they are connected to. This makes slave
> devices tied to the same IPMMU device share the same IOVA space.
> 
> Signed-off-by: Magnus Damm 
> ---
> 
>  Changes since V2:
>  - Updated patch description to reflect code included in:
>[PATCH v7 00/07] iommu/ipmmu-vmsa: IPMMU multi-arch update V7
> 
>  Changes since V1:
>  - Support up to 8 contexts instead of 4
>  - Use feature flag and runtime handling
>  - Default to single context
> 
>  drivers/iommu/ipmmu-vmsa.c |   38 ++
>  1 file changed, 30 insertions(+), 8 deletions(-)
> 
> --- 0012/drivers/iommu/ipmmu-vmsa.c
> +++ work/drivers/iommu/ipmmu-vmsa.c   2017-03-08 17:59:19.900607110 +0900
> @@ -30,11 +30,12 @@
>  
>  #include "io-pgtable.h"
>  
> -#define IPMMU_CTX_MAX 1
> +#define IPMMU_CTX_MAX 8
>  
>  struct ipmmu_features {
>   bool use_ns_alias_offset;
>   bool has_cache_leaf_nodes;
> + bool has_eight_ctx;

Wouldn't it be more sensible to just encode a number of contexts
directly, if it isn't reported by the hardware itself? I'm just
imagining future hardware generations... :P

bool also_has_another_eight_ctx_on_top_of_that;
bool wait_no_this_is_the_one_where_ctx_15_isnt_usable;

>  };
>  
>  struct ipmmu_vmsa_device {
> @@ -44,6 +45,7 @@ struct ipmmu_vmsa_device {
>   const struct ipmmu_features *features;
>   bool is_leaf;
>   unsigned int num_utlbs;
> + unsigned int num_ctx;
>   spinlock_t lock;/* Protects ctx and domains[] */
>   DECLARE_BITMAP(ctx, IPMMU_CTX_MAX);
>   struct ipmmu_vmsa_domain *domains[IPMMU_CTX_MAX];
> @@ -376,11 +378,12 @@ static int ipmmu_domain_allocate_context
>  
>   spin_lock_irqsave(&mmu->lock, flags);
>  
> - ret = find_first_zero_bit(mmu->ctx, IPMMU_CTX_MAX);
> - if (ret != IPMMU_CTX_MAX) {
> + ret = find_first_zero_bit(mmu->ctx, mmu->num_ctx);
> + if (ret != mmu->num_ctx) {
>   mmu->domains[ret] = domain;
>   set_bit(ret, mmu->ctx);

Using test_and_set_bit() in a loop would avoid having to take a lock here.

> - }
> + } else
> + ret = -EBUSY;
>  
>   spin_unlock_irqrestore(&mmu->lock, flags);
>  
> @@ -425,9 +428,9 @@ static int ipmmu_domain_init_context(str
>* Find an unused context.
>*/
>   ret = ipmmu_domain_allocate_context(domain->root, domain);
> - if (ret == IPMMU_CTX_MAX) {
> + if (ret < 0) {
>   free_io_pgtable_ops(domain->iop);
> - return -EBUSY;
> + return ret;
>   }
>  
>   domain->context_id = ret;
> @@ -562,7 +565,7 @@ static irqreturn_t ipmmu_irq(int irq, vo
>   /*
>* Check interrupts for all active contexts.
>*/
> - for (i = 0; i < IPMMU_CTX_MAX; i++) {
> + for (i = 0; i < mmu->num_ctx; i++) {
>   if (!mmu->domains[i])
>   continue;
>   if (ipmmu_domain_irq(mmu->domains[i]) == IRQ_HANDLED)
> @@ -632,6 +635,13 @@ static int ipmmu_attach_device(struct io
>   domain->mmu = mmu;
>   domain->root = root;
>   ret = ipmmu_domain_init_context(domain);
> + if (ret < 0) {
> + dev_err(dev, "Unable to initialize IPMMU context\n");
> + domain->mmu = NULL;
> + } else {
> + dev_info(dev, "Using IPMMU context %u\n",
> +  domain->context_id);
> + }
>   } else if (domain->mmu != mmu) {
>   /*
>* Something is wrong, we can't attach two devices using
> @@ -1047,13 +1057,14 @@ static void ipmmu_device_reset(struct ip
>   unsigned int i;
>  
>   /* Disable all contexts. */
> - for (i = 0; i < 4; ++i)
> + for (i = 0; i < mmu->num_ctx; ++i)
>   ipmmu_write(mmu, i * IM_CTX_SIZE + IMCTR, 0);
>  }
>  
>  static const struct ipmmu_features ipmmu_features_default = {
>   .use_ns_alias_offset = true,
>   .has_cache_leaf_nodes = false,
> + .has_eight_ctx = false,
>  };
>  
>  static const struct of_device_id ipmmu_of_ids[] = {
> @@ -1112,6 +1123,17 @@ static int ipmmu_probe(struct platform_d
>   if (mmu->features->use_ns_alias_offset)
>   mmu->base += IM_NS_ALIAS_OFFSET;
>  
> + /*
> +  * The number of contexts varies with generation and instance.
> +  * Newer SoCs get a total of 8 contexts enabled, older ones just one.
> +  */
> + if (mmu->features->has_eight_ctx)
> + mmu->num_ctx = 8;
> + else
> + mmu->num_ctx = 1;
> +
> + WARN_ON(mmu->num_ctx > IPMMU_CTX_MAX);

The likelihood of that happening doesn't appear to warrant a runtime
check. Espe

Re: [PATCH v3 06/09] iommu/ipmmu-vmsa: Write IMCTR twice

2017-03-08 Thread Robin Murphy
On 08/03/17 11:02, Magnus Damm wrote:
> From: Magnus Damm 
> 
> Write IMCTR both in the root device and the leaf node.
> 
> Signed-off-by: Magnus Damm 
> ---
> 
>  Changes since V2:
>  - None
> 
>  Changes since V1:
>  - None
> 
>  drivers/iommu/ipmmu-vmsa.c |   17 ++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 
> --- 0018/drivers/iommu/ipmmu-vmsa.c
> +++ work/drivers/iommu/ipmmu-vmsa.c   2017-03-08 18:30:36.870607110 +0900
> @@ -286,6 +286,16 @@ static void ipmmu_ctx_write(struct ipmmu
>   ipmmu_write(domain->root, domain->context_id * IM_CTX_SIZE + reg, data);
>  }
>  
> +static void ipmmu_ctx_write2(struct ipmmu_vmsa_domain *domain, unsigned int reg,
> +  u32 data)

That's pretty cryptic. Maybe both functions could do with less ambiguous
names - something like ipmmu_ctx_write_root() vs. ipmmu_ctx_write_all(),
perhaps? (and if there's a more specific hardware term than "all" that
describes this kind of configuration, even better).

Robin.

> +{
> + if (domain->mmu != domain->root)
> + ipmmu_write(domain->mmu,
> + domain->context_id * IM_CTX_SIZE + reg, data);
> +
> + ipmmu_write(domain->root, domain->context_id * IM_CTX_SIZE + reg, data);
> +}
> +
>  /* -----------------------------------------------------------------------------
>   * TLB and microTLB Management
>   */
> @@ -312,7 +322,7 @@ static void ipmmu_tlb_invalidate(struct
>  
>   reg = ipmmu_ctx_read(domain, IMCTR);
>   reg |= IMCTR_FLUSH;
> - ipmmu_ctx_write(domain, IMCTR, reg);
> + ipmmu_ctx_write2(domain, IMCTR, reg);
>  
>   ipmmu_tlb_sync(domain);
>  }
> @@ -472,7 +482,8 @@ static int ipmmu_domain_init_context(str
>* software management as we have no use for it. Flush the TLB as
>* required when modifying the context registers.
>*/
> - ipmmu_ctx_write(domain, IMCTR, IMCTR_INTEN | IMCTR_FLUSH | IMCTR_MMUEN);
> + ipmmu_ctx_write2(domain, IMCTR,
> +  IMCTR_INTEN | IMCTR_FLUSH | IMCTR_MMUEN);
>  
>   return 0;
>  }
> @@ -498,7 +509,7 @@ static void ipmmu_domain_destroy_context
>*
>* TODO: Is TLB flush really needed ?
>*/
> - ipmmu_ctx_write(domain, IMCTR, IMCTR_FLUSH);
> + ipmmu_ctx_write2(domain, IMCTR, IMCTR_FLUSH);
>   ipmmu_tlb_sync(domain);
>   ipmmu_domain_free_context(domain->root, domain->context_id);
>  }
> 



Re: [PATCH v7 06/07] iommu/ipmmu-vmsa: ARM and ARM64 archdata access

2017-03-08 Thread Robin Murphy
On 07/03/17 03:17, Magnus Damm wrote:
> From: Magnus Damm 
> 
> Not all architectures have an iommu member in their archdata, so
> use #ifdefs to support building with COMPILE_TEST on any architecture.

I have a feeling I might be repeating myself, but ipmmu_vmsa_archdata
looks to be trivially convertible to iommu_fwspec, which I strongly
encourage, not least because it would obviate bodges like this.

Robin.

> Signed-off-by: Magnus Damm 
> Reviewed-by: Joerg Roedel 
> ---
> 
> Changes since V6:
>  - Updated patch to handle newly introduced functions in:
>[PATCH v7 05/07] iommu/ipmmu-vmsa: Add new IOMMU_DOMAIN_DMA ops
> 
>  drivers/iommu/ipmmu-vmsa.c |   43 ++-
>  1 file changed, 30 insertions(+), 13 deletions(-)
> 
> --- 0010/drivers/iommu/ipmmu-vmsa.c
> +++ work/drivers/iommu/ipmmu-vmsa.c   2017-03-06 19:26:26.070607110 +0900
> @@ -72,6 +72,25 @@ static struct ipmmu_vmsa_domain *to_vmsa
>   return container_of(dom, struct ipmmu_vmsa_domain, io_domain);
>  }
>  
> +#if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> +static struct ipmmu_vmsa_archdata *to_archdata(struct device *dev)
> +{
> + return dev->archdata.iommu;
> +}
> +static void set_archdata(struct device *dev, struct ipmmu_vmsa_archdata *p)
> +{
> + dev->archdata.iommu = p;
> +}
> +#else
> +static struct ipmmu_vmsa_archdata *to_archdata(struct device *dev)
> +{
> + return NULL;
> +}
> +static void set_archdata(struct device *dev, struct ipmmu_vmsa_archdata *p)
> +{
> +}
> +#endif
> +
>  #define TLB_LOOP_TIMEOUT 100 /* 100us */
>  
>  /* 
> -
> @@ -543,7 +562,7 @@ static void ipmmu_domain_free(struct iom
>  static int ipmmu_attach_device(struct iommu_domain *io_domain,
>  struct device *dev)
>  {
> - struct ipmmu_vmsa_archdata *archdata = dev->archdata.iommu;
> + struct ipmmu_vmsa_archdata *archdata = to_archdata(dev);
>   struct ipmmu_vmsa_device *mmu = archdata->mmu;
>   struct ipmmu_vmsa_domain *domain = to_vmsa_domain(io_domain);
>   unsigned long flags;
> @@ -588,7 +607,7 @@ static int ipmmu_attach_device(struct io
>  static void ipmmu_detach_device(struct iommu_domain *io_domain,
>   struct device *dev)
>  {
> - struct ipmmu_vmsa_archdata *archdata = dev->archdata.iommu;
> + struct ipmmu_vmsa_archdata *archdata = to_archdata(dev);
>   struct ipmmu_vmsa_domain *domain = to_vmsa_domain(io_domain);
>   unsigned int i;
>  
> @@ -709,7 +728,7 @@ static int ipmmu_init_platform_device(st
>   archdata->utlbs = utlbs;
>   archdata->num_utlbs = num_utlbs;
>   archdata->dev = dev;
> - dev->archdata.iommu = archdata;
> + set_archdata(dev, archdata);
>   return 0;
>  
>  error:
> @@ -729,12 +748,11 @@ static struct iommu_domain *ipmmu_domain
>  
>  static int ipmmu_add_device(struct device *dev)
>  {
> - struct ipmmu_vmsa_archdata *archdata;
>   struct ipmmu_vmsa_device *mmu = NULL;
>   struct iommu_group *group;
>   int ret;
>  
> - if (dev->archdata.iommu) {
> + if (to_archdata(dev)) {
>   dev_warn(dev, "IOMMU driver already assigned to device %s\n",
>dev_name(dev));
>   return -EINVAL;
> @@ -770,8 +788,7 @@ static int ipmmu_add_device(struct devic
>* - Make the mapping size configurable ? We currently use a 2GB mapping
>*   at a 1GB offset to ensure that NULL VAs will fault.
>*/
> - archdata = dev->archdata.iommu;
> - mmu = archdata->mmu;
> + mmu = to_archdata(dev)->mmu;
>   if (!mmu->mapping) {
>   struct dma_iommu_mapping *mapping;
>  
> @@ -799,7 +816,7 @@ error:
>   if (mmu)
>   arm_iommu_release_mapping(mmu->mapping);
>  
> - dev->archdata.iommu = NULL;
> + set_archdata(dev, NULL);
>  
>   if (!IS_ERR_OR_NULL(group))
>   iommu_group_remove_device(dev);
> @@ -809,7 +826,7 @@ error:
>  
>  static void ipmmu_remove_device(struct device *dev)
>  {
> - struct ipmmu_vmsa_archdata *archdata = dev->archdata.iommu;
> + struct ipmmu_vmsa_archdata *archdata = to_archdata(dev);
>  
>   arm_iommu_detach_device(dev);
>   iommu_group_remove_device(dev);
> @@ -817,7 +834,7 @@ static void ipmmu_remove_device(struct d
>   kfree(archdata->utlbs);
>   kfree(archdata);
>  
> - dev->archdata.iommu = NULL;
> + set_archdata(dev, NULL);
>  }
>  
>  static const struct iommu_ops ipmmu_ops = {
> @@ -874,7 +891,7 @@ static void ipmmu_domain_free_dma(struct
>  
>  static int ipmmu_add_device_dma(struct device *dev)
>  {
> - struct ipmmu_vmsa_archdata *archdata = dev->archdata.iommu;
> + struct ipmmu_vmsa_archdata *archdata = to_archdata(dev);
>   struct iommu_group *group;
>  
>   /* The device has been verified in xlate() */
> @@ -893,7 +910,7 @@ static int ipmmu_add_device_dma(struct d
>  
>  static void i

Re: [PATCH v7 05/07] iommu/ipmmu-vmsa: Add new IOMMU_DOMAIN_DMA ops

2017-03-08 Thread Robin Murphy
On 07/03/17 03:17, Magnus Damm wrote:
> From: Magnus Damm 
> 
> Introduce an alternative set of iommu_ops suitable for 64-bit ARM
> as well as 32-bit ARM when CONFIG_IOMMU_DMA=y. Also adjust the
> Kconfig to depend on ARM or IOMMU_DMA. Initialize the device
> from ->xlate() when CONFIG_IOMMU_DMA=y.
> 
> Signed-off-by: Magnus Damm 
> ---
> 
>  Changes since V6:
>  - Rolled in the following patches from "r8a7795 support V2":
>[PATCH v2 04/11] iommu/ipmmu-vmsa: Reuse iommu groups
>[PATCH v2 06/11] iommu/ipmmu-vmsa: Teach xlate() to skip disabled iommus
>  - Moved find_group() implementation to prevent warning on 32-bit ARM
>  - Rolled in the following patch from "IPMMU slave device whitelist V2":
>[PATCH/RFC v2 3/4] iommu/ipmmu-vmsa: Check devices in xlate()
> 
>  drivers/iommu/Kconfig  |1 
>  drivers/iommu/ipmmu-vmsa.c |  164 +---
>  2 files changed, 157 insertions(+), 8 deletions(-)
> 
> --- 0001/drivers/iommu/Kconfig
> +++ work/drivers/iommu/Kconfig2017-03-06 18:42:42.0 +0900
> @@ -274,6 +274,7 @@ config EXYNOS_IOMMU_DEBUG
>  
>  config IPMMU_VMSA
>   bool "Renesas VMSA-compatible IPMMU"
> + depends on ARM || IOMMU_DMA
>   depends on ARM_LPAE
>   depends on ARCH_RENESAS || COMPILE_TEST
>   select IOMMU_API
> --- 0009/drivers/iommu/ipmmu-vmsa.c
> +++ work/drivers/iommu/ipmmu-vmsa.c   2017-03-06 19:22:27.700607110 +0900
> @@ -10,6 +10,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -22,8 +23,10 @@
>  #include 
>  #include 
>  
> +#if defined(CONFIG_ARM) && !defined(CONFIG_IOMMU_DMA)
>  #include 
>  #include 
> +#endif
>  
>  #include "io-pgtable.h"
>  
> @@ -57,6 +60,8 @@ struct ipmmu_vmsa_archdata {
>   struct ipmmu_vmsa_device *mmu;
>   unsigned int *utlbs;
>   unsigned int num_utlbs;
> + struct device *dev;
> + struct list_head list;
>  };
>  
>  static DEFINE_SPINLOCK(ipmmu_devices_lock);
> @@ -522,14 +527,6 @@ static struct iommu_domain *__ipmmu_doma
>   return &domain->io_domain;
>  }
>  
> -static struct iommu_domain *ipmmu_domain_alloc(unsigned type)
> -{
> - if (type != IOMMU_DOMAIN_UNMANAGED)
> - return NULL;
> -
> - return __ipmmu_domain_alloc(type);
> -}
> -
>  static void ipmmu_domain_free(struct iommu_domain *io_domain)
>  {
>   struct ipmmu_vmsa_domain *domain = to_vmsa_domain(io_domain);
> @@ -572,6 +569,9 @@ static int ipmmu_attach_device(struct io
>   dev_err(dev, "Can't attach IPMMU %s to domain on IPMMU %s\n",
>   dev_name(mmu->dev), dev_name(domain->mmu->dev));
>   ret = -EINVAL;
> + } else {
> + dev_info(dev, "Reusing IPMMU context %u\n",
> +  domain->context_id);

Indentation?

>   }
>  
>   spin_unlock_irqrestore(&domain->lock, flags);
> @@ -708,6 +708,7 @@ static int ipmmu_init_platform_device(st
>   archdata->mmu = mmu;
>   archdata->utlbs = utlbs;
>   archdata->num_utlbs = num_utlbs;
> + archdata->dev = dev;
>   dev->archdata.iommu = archdata;
>   return 0;
>  
> @@ -716,6 +717,16 @@ error:
>   return ret;
>  }
>  
> +#if defined(CONFIG_ARM) && !defined(CONFIG_IOMMU_DMA)
> +
> +static struct iommu_domain *ipmmu_domain_alloc(unsigned type)
> +{
> + if (type != IOMMU_DOMAIN_UNMANAGED)
> + return NULL;
> +
> + return __ipmmu_domain_alloc(type);
> +}
> +
>  static int ipmmu_add_device(struct device *dev)
>  {
>   struct ipmmu_vmsa_archdata *archdata;
> @@ -823,6 +834,141 @@ static const struct iommu_ops ipmmu_ops
>   .pgsize_bitmap = SZ_1G | SZ_2M | SZ_4K,
>  };
>  
> +#endif /* !CONFIG_ARM && CONFIG_IOMMU_DMA */
> +
> +#ifdef CONFIG_IOMMU_DMA
> +
> +static DEFINE_SPINLOCK(ipmmu_slave_devices_lock);
> +static LIST_HEAD(ipmmu_slave_devices);
> +
> +static struct iommu_domain *ipmmu_domain_alloc_dma(unsigned type)
> +{
> + struct iommu_domain *io_domain = NULL;
> +
> + switch (type) {
> + case IOMMU_DOMAIN_UNMANAGED:
> + io_domain = __ipmmu_domain_alloc(type);
> + break;
> +
> + case IOMMU_DOMAIN_DMA:
> + io_domain = __ipmmu_domain_alloc(type);
> + if (io_domain)
> + iommu_get_dma_cookie(io_domain);
> + break;
> + }
> +
> + return io_domain;
> +}

I still think it would be tidier to put this logic straight into
__ipmmu_domain_alloc(), and use that directly as the callback for this
case. The ipmmu_domain_alloc() wrapper ensures that IOMMU_DOMAIN_DMA
can't be passed through in the legacy 32-bit case, and the cookie calls
are stubbed for !CONFIG_IOMMU_DMA so there are no build concerns.

> +static void ipmmu_domain_free_dma(struct iommu_domain *io_domain)
> +{
> + switch (io_domain->type) {
> + case IOMMU_DOMAIN_DMA:
> + iommu_put_dma_cookie(io_domain);
> + /* fall-through */
> + default:
> + ipmm

Re: [PATCH V8 01/11] iommu/of: Refactor of_iommu_configure() for error handling

2017-03-08 Thread Robin Murphy
On 08/03/17 18:58, Jean-Philippe Brucker wrote:
[...]
>>  static const struct iommu_ops
>> -*of_pci_iommu_configure(struct pci_dev *pdev, struct device_node *bridge_np)
>> +*of_pci_iommu_init(struct pci_dev *pdev, struct device_node *bridge_np)
>>  {
>>  const struct iommu_ops *ops;
>>  struct of_phandle_args iommu_spec;
>> +int err;
>>  
>>  /*
>>   * Start by tracing the RID alias down the PCI topology as
>> @@ -123,56 +146,56 @@ static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
>>   * bus into the system beyond, and which IOMMU it ends up at.
>>   */
>>  iommu_spec.np = NULL;
>> -if (of_pci_map_rid(bridge_np, iommu_spec.args[0], "iommu-map",
>> -   "iommu-map-mask", &iommu_spec.np, iommu_spec.args))
>> -return NULL;
>> +err = of_pci_map_rid(bridge_np, iommu_spec.args[0], "iommu-map",
>> + "iommu-map-mask", &iommu_spec.np,
>> + iommu_spec.args);
>> +if (err)
>> +return ERR_PTR(err);
> 
> This change doesn't work with of_pci_map_rid when the PCI RC isn't behind
> an IOMMU:
> 
> map = of_get_property(np, map_name, &map_len);
> if (!map) {
> if (target)
> return -ENODEV;
> /* Otherwise, no map implies no translation */
> *id_out = rid;
> return 0;
> }
> 
> Previously with no iommu-map, we returned -ENODEV but it was discarded by
> of_pci_iommu_configure. Now it is propagated and the whole device probing
> fails. Instead, maybe of_pci_map_rid should always return 0 if no
> iommu-map, and the caller should check if *target is still NULL?

Ah yes, Tomasz had found breakages with the "mmu-masters" binding
before, and I'd already pushed out a fixup for this one[1], but I forgot
that that discussion was all off-list (out of diplomatic concern that
the breakage might have been intentional - it wasn't, honest!)

Now that rc1 is out I should re-do that branch with v8 of this series
plus the fixups folded in, unless Sricharan beats me to it.

Thanks for the reminder,
Robin.

[1]:http://www.linux-arm.org/git?p=linux-rm.git;a=commitdiff;h=0049a34e523506813995c05766f5e2c16d616354

> 
> Thanks,
> Jean-Philippe



Re: [PATCH V8 01/11] iommu/of: Refactor of_iommu_configure() for error handling

2017-03-09 Thread Robin Murphy
On 09/03/17 09:52, sricharan wrote:
> Hi Robin,
> 
>> On 08/03/17 18:58, Jean-Philippe Brucker wrote:
>> [...]
  static const struct iommu_ops
 -*of_pci_iommu_configure(struct pci_dev *pdev, struct device_node
 *bridge_np)
 +*of_pci_iommu_init(struct pci_dev *pdev, struct device_node
 +*bridge_np)
  {
const struct iommu_ops *ops;
struct of_phandle_args iommu_spec;
 +  int err;

/*
 * Start by tracing the RID alias down the PCI topology as @@
 -123,56 +146,56 @@ static int __get_pci_rid(struct pci_dev *pdev, u16
>> alias, void *data)
 * bus into the system beyond, and which IOMMU it ends up at.
 */
iommu_spec.np = NULL;
 -  if (of_pci_map_rid(bridge_np, iommu_spec.args[0], "iommu-map",
 - "iommu-map-mask", &iommu_spec.np,
>> iommu_spec.args))
 -  return NULL;
 +  err = of_pci_map_rid(bridge_np, iommu_spec.args[0], "iommu-map",
 +   "iommu-map-mask", &iommu_spec.np,
 +   iommu_spec.args);
 +  if (err)
 +  return ERR_PTR(err);
>>>
>>> This change doesn't work with of_pci_map_rid when the PCI RC isn't
>>> behind an IOMMU:
>>>
>>> map = of_get_property(np, map_name, &map_len);
>>> if (!map) {
>>> if (target)
>>> return -ENODEV;
>>> /* Otherwise, no map implies no translation */
>>> *id_out = rid;
>>> return 0;
>>> }
>>>
>>> Previously with no iommu-map, we returned -ENODEV but it was discarded
>>> by of_pci_iommu_configure. Now it is propagated and the whole device
>>> probing fails. Instead, maybe of_pci_map_rid should always return 0 if
>>> no iommu-map, and the caller should check if *target is still NULL?
>>
>> Ah yes, Tomasz had found breakages with the "mmu-masters" binding
>> before, and I'd already pushed out a fixup for this one[1], but I forgot
>> that that discussion was all off-list (out of diplomatic concern that
>> the breakage might have been intentional - it wasn't, honest!)
>>
>> Now that rc1 is out I should re-do that branch with v8 of this series
>> plus the fixups folded in, unless Sricharan beats me to it.
>>
> 
> Right, I had this one [1] as V9 which had all your fixes that we discussed
> offline as well. I was about to post a V9 on top of -rc1 today as well.
> 
> [1] https://github.com/sricharanaz/iommu/tree/pd_v9

Awesome, I'll try to get on with testing that ASAP. Thanks!

Robin.

> 
> Regards,
>  Sricharan
> 
> 
>> Thanks for the reminder,
>> Robin.
>>
>> [1]: http://www.linux-arm.org/git?p=linux-rm.git;a=commitdiff;h=0049a34e523506813995c05766f5e2c16d616354
>>
>>>
>>> Thanks,
>>> Jean-Philippe
>>
> 



Re: [PATCH v7 06/07] iommu/ipmmu-vmsa: ARM and ARM64 archdata access

2017-03-09 Thread Robin Murphy
On 09/03/17 03:44, Magnus Damm wrote:
> Hi Robin,
> 
> On Wed, Mar 8, 2017 at 9:48 PM, Robin Murphy  wrote:
>> On 07/03/17 03:17, Magnus Damm wrote:
>>> From: Magnus Damm 
>>>
>>> Not all architectures have an iommu member in their archdata, so
>>> use #ifdefs to support building with COMPILE_TEST on any architecture.
>>
>> I have a feeling I might be repeating myself, but ipmmu_vmsa_archdata
>> looks to be trivially convertible to iommu_fwspec, which I strongly
>> encourage, not least because it would obviate bodges like this.
> 
> Yeah, I think it should be possible to use iommu_fwspec for this
> purpose. The question is when to do it. =)

I'd actually be inclined to do it *before* any other major changes, as
it would be pretty minimal given the current structure of the driver.
But then I'm free to wilfully ignore the burden of maintaining patch
stacks between both mainline and older trees ;)

> I actually looked into it recently, but then realised that for this to
> work then due to code sharing I need to make use of iommu_fwspec on
> both 32-bit and 64-bit ARM. So it requires rework of the existing
> IPMMU for 32-bit ARM (including hairy legacy CONFIG_IOMMU_DMA=n code).
> I was actually thinking of doing some rework of 32-bit ARM IPMMU code
> anyway (I suspect iommu_device_* conversion caused breakage) and it
> probably has to happen on top of current -next. I would also like to
> start reducing burden of forward porting all these patches, and
> stirring up the ground does not really help much there...

Note that iommu_fwspec can be used pretty much orthogonally to any of
the core DMA ops support, so it shouldn't be as invasive as you might
think. See 84672f192671 ("iommu/mediatek: Convert M4Uv1 to
iommu_fwspec") as an example of an archdata-to-fwspec conversion of a
driver which only supports 32-bit ARM, and notably borrows its master
handling directly from the the IPMMU driver.

Robin.

> 
> Cheers,
> 
> / magnus
> 

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm-smmu: Report smmu type in dmesg

2017-03-09 Thread Robin Murphy
On 09/03/17 12:02, Robert Richter wrote:
> On 07.03.17 18:41:33, Robin Murphy wrote:
>> On 07/03/17 14:06, Robert Richter wrote:
>>> On 06.03.17 18:22:08, Robin Murphy wrote:
>>>> On 06/03/17 13:58, Robert Richter wrote:
>>>>> The ARM SMMU detection especially depends from system firmware. For
>>>>> better diagnostic, log the detected type in dmesg.
>>>>
>>>> This paragraph especially depends from grammar. I think.
>>>
>>> Thanks for the mail on you. :)
>>>
>>>>
>>>>> The smmu type's name is now stored in struct arm_smmu_type and ACPI
>>>>> code is modified to use that struct too. Rename ARM_SMMU_MATCH_DATA()
>>>>> macro to ARM_SMMU_TYPE() for better readability.
>>>>>
>>>>> Signed-off-by: Robert Richter 
>>>>> ---
>>>>>  drivers/iommu/arm-smmu.c | 61 ++++++++++++++++++++++++++++++-------------------------------
>>>>>  1 file changed, 30 insertions(+), 31 deletions(-)
>>>>
>>>> That seems a relatively invasive diffstat for the sake of printing a
>>>> string once at boot time to what I can only assume is a small audience
>>>> of firmware developers who find "cat
>>>> /sys/firmware/devicetree/base/iommu*/compatible" (or the ACPI
>>>> equivalent) too hard ;)
>>>
>>> Reading firmware data is not really a solution as you don't know what
>>> the driver is doing with it. The actual background of this patch is to
>>> be sure a certain workaround was enabled in the kernel. ARM's cpu
>>> errata framework does this nicely. In case of smmus we just have the
>>> internal model implementation type which is not visible in the logs.
>>> Right now, there is no way to figure that out without knowing fw
>>> specifics and kernel sources.
>>
>> Ah, now it starts to become clear. In that case, if we want to confirm
>> the presence of specific workarounds, we should actually _confirm the
>> presence of specific workarounds_. I'd have no complaint with e.g. this:
>>
>> -8<-
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index f7411109670f..9e50a092632c 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -1934,6 +1934,8 @@ static int arm_smmu_device_cfg_probe(struct
>> arm_smmu_device *smmu)
>> atomic_add_return(smmu->num_context_banks,
>>   &cavium_smmu_context_count);
>> smmu->cavium_id_base -= smmu->num_context_banks;
>> +   dev_notice(smmu->dev, "\tusing ASID/VMID offset %u\n",
>> +  smmu->cavium_id_base);
>> }
>>
>> /* ID2 */
>> ->8-
>>
>> and the equivalent for other things, if need be. If you just print "hey,
>> this is SMMU-x", the user is in fact no better off, since they would
>> then still have to go and look at the source for whatever kernel they're
>> running to find out which particular workarounds for SMMU-x bugs that
>> particular kernel implements.
> 
> I don't understand why you don't want to expose the smmu type in some
> way. There are a lot of things that can go wrong, esp. with firmware,
> when detecting the proper smmu type.

Because there is only one reason for which detecting the "proper SMMU
type" matters - implementation-specific workarounds for areas in which a
given hardware implementation deviates from the architecture assumed by
the driver.

OK, let's print the model name. Now, if I give you this:

[0.475009] arm-smmu 2b50.iommu: probing hardware configuration...
[0.481650] arm-smmu 2b50.iommu: ARM MMU-401 r0p0 with:
[0.486436] arm-smmu 2b50.iommu: stage 2 translation
[0.491925] arm-smmu 2b50.iommu: coherent table walk
[0.497420] arm-smmu 2b50.iommu: stream matching with 32 register groups
[0.504678] arm-smmu 2b50.iommu: 4 context banks (4 stage-2 only)
[0.511312] arm-smmu 2b50.iommu: Supported page sizes: 0x60211000
[0.517943] arm-smmu 2b50.iommu: Stage-2: 40-bit IPA -> 40-bit PA

please tell me which hardware problems this system has kernel
workarounds in place for, and which it doesn't. If you want another
hint, the version is 4.11.0-rc1+. Note the "+".

Robin.

> The above change is not a general solution for reporting the
> enablement of smmu errata workarounds either. The check could be done
> multiple times and in the fast path. For this particular problem the
> above would work, but some message on the detected type would still be
> nice. I could rework my patch so that .name is not permanently
> stored in struct arm_smmu_device.
> 
> -Robert
> 


[PATCH 1/3] iommu: Disambiguate MSI region types

2017-03-09 Thread Robin Murphy
Whilst it doesn't matter much to VFIO at the moment, when parsing
reserved regions on the host side we really need to be able to tell
the difference between the software-reserved region used to map MSIs
translated by an IOMMU, and hardware regions for which the write might
never even reach the IOMMU. In particular, ARM systems assume the former
topology, but may need to cope with the latter as well, which will
require rather different handling in the iommu-dma layer.

For clarity, rename the software-managed type to IOMMU_RESV_SW_MSI, use
IOMMU_RESV_MSI to describe the hardware type, and document everything a
little bit. Since the x86 MSI remapping hardware falls squarely under
this meaning of IOMMU_RESV_MSI, apply that type to their regions as well,
so that we tell a consistent story to userspace across platforms (and
have future consistency if those drivers start migrating to iommu-dma).

Fixes: d30ddcaa7b02 ("iommu: Add a new type field in iommu_resv_region")
CC: Eric Auger 
CC: Alex Williamson 
CC: David Woodhouse 
CC: k...@vger.kernel.org
Signed-off-by: Robin Murphy 
---
 drivers/iommu/amd_iommu.c   | 2 +-
 drivers/iommu/arm-smmu-v3.c | 2 +-
 drivers/iommu/arm-smmu.c| 2 +-
 drivers/iommu/intel-iommu.c | 2 +-
 drivers/iommu/iommu.c   | 1 +
 drivers/vfio/vfio_iommu_type1.c | 2 +-
 include/linux/iommu.h   | 5 +
 7 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 98940d1392cb..b17536d6e69b 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3202,7 +3202,7 @@ static void amd_iommu_get_resv_regions(struct device *dev,
 
region = iommu_alloc_resv_region(MSI_RANGE_START,
 MSI_RANGE_END - MSI_RANGE_START + 1,
-0, IOMMU_RESV_RESERVED);
+0, IOMMU_RESV_MSI);
if (!region)
return;
list_add_tail(®ion->list, head);
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 5806a6acc94e..591bb96047c9 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1888,7 +1888,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
 
region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
-prot, IOMMU_RESV_MSI);
+prot, IOMMU_RESV_SW_MSI);
if (!region)
return;
 
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index abf6496843a6..b493c99e17f7 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1608,7 +1608,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
 
region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
-prot, IOMMU_RESV_MSI);
+prot, IOMMU_RESV_SW_MSI);
if (!region)
return;
 
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 238ad3447712..f1611fd6f5b0 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5249,7 +5249,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
 
reg = iommu_alloc_resv_region(IOAPIC_RANGE_START,
  IOAPIC_RANGE_END - IOAPIC_RANGE_START + 1,
- 0, IOMMU_RESV_RESERVED);
+ 0, IOMMU_RESV_MSI);
if (!reg)
return;
list_add_tail(®->list, head);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8ea14f41a979..7dbc05f10d5a 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -72,6 +72,7 @@ static const char * const iommu_group_resv_type_string[] = {
[IOMMU_RESV_DIRECT] = "direct",
[IOMMU_RESV_RESERVED]   = "reserved",
[IOMMU_RESV_MSI]= "msi",
+   [IOMMU_RESV_SW_MSI] = "msi",
 };
 
 #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)  \
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index c26fa1f3ed86..e32abdebd2df 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1192,7 +1192,7 @@ static bool vfio_iommu_has_resv_msi(struct iommu_group *group,
INIT_LIST_HEAD(&group_resv_regions);
iommu_get_group_resv_regions(group, &group_resv_regions);
list_for_each_entry(region, &group_resv_regions, list) {
-   if (region->type & IOMMU_RESV_MSI) {
+   if (region->type & IOMMU_RESV_SW_MSI) {
*base = region->start;
ret = true;
