Re: [RESEND,3/3] iommu/dma: Plumb in the per-CPU IOVA caches

2017-04-07 Thread Nate Watterson
Hi Robin, On 4/6/2017 2:56 PM, Robin Murphy wrote: On 06/04/17 19:15, Manoj Iyer wrote: On Fri, 31 Mar 2017, Robin Murphy wrote: With IOVA allocation suitably tidied up, we are finally free to opt in to the per-CPU caching mechanism. The caching alone can provide a modest improvement over walk

Re: [PATCH] iommu/amd: flush IOTLB for specific domains only

2017-04-07 Thread Joerg Roedel
On Mon, Mar 27, 2017 at 11:47:07AM +0530, arindam.n...@amd.com wrote: > From: Arindam Nath > > The idea behind flush queues is to defer the IOTLB flushing > for domains for which the mappings are no longer valid. We > add such domains in queue_add(), and when the queue size > reaches FLUSH_QUEUE_
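
The deferral idea described in this message can be illustrated with a minimal sketch. It is not the AMD driver code: `toy_queue_add()`, `TOY_QUEUE_CAPACITY`, `TOY_MAX_DOMAINS` and the flush counter are hypothetical names chosen for illustration, and the real threshold constant is truncated in the excerpt above. Entries accumulate in a queue, and once the queue is full, each domain that appears in it is flushed exactly once.

```c
#include <stddef.h>

#define TOY_QUEUE_CAPACITY 4   /* hypothetical threshold, not the driver's value */
#define TOY_MAX_DOMAINS    8   /* hypothetical upper bound on domain ids */

struct toy_flush_queue {
    int domain_ids[TOY_QUEUE_CAPACITY];     /* domains with stale mappings */
    size_t next;                            /* number of queued entries */
    unsigned flush_count[TOY_MAX_DOMAINS];  /* stands in for real IOTLB flushes */
};

/* Flush every domain that appears in the queue once, then empty the queue. */
void toy_flush_queued_domains(struct toy_flush_queue *q)
{
    unsigned char seen[TOY_MAX_DOMAINS] = {0};

    for (size_t i = 0; i < q->next; i++) {
        int id = q->domain_ids[i];
        if (!seen[id]) {
            seen[id] = 1;
            q->flush_count[id]++;
        }
    }
    q->next = 0;
}

/*
 * Queue a domain for a deferred flush.
 * Returns 1 if this call triggered a flush, 0 if the entry was only queued,
 * and -1 if the domain id is out of range.
 */
int toy_queue_add(struct toy_flush_queue *q, int domain_id)
{
    if (domain_id < 0 || domain_id >= TOY_MAX_DOMAINS)
        return -1;

    q->domain_ids[q->next++] = domain_id;
    if (q->next == TOY_QUEUE_CAPACITY) {
        toy_flush_queued_domains(q);
        return 1;
    }
    return 0;
}
```

Domains that never appear in the queue are left untouched, which is the point of flushing for specific domains only.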

Re: [PATCH] iommu/iova: fix underflow bug in __alloc_and_insert_iova_range

2017-04-07 Thread Joerg Roedel
On Fri, Apr 07, 2017 at 01:36:20AM -0400, Nate Watterson wrote: > Signed-off-by: Nate Watterson > --- > drivers/iommu/iova.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Applied, thanks.

Re: [PATCH 2/5] iommu/omap: Permanently keep iommu_dev pointer in arch_data

2017-04-07 Thread Joerg Roedel
Hi Suman, On Mon, Apr 03, 2017 at 03:35:46PM -0500, Suman Anna wrote: > > + iommu = platform_get_drvdata(pdev); > > + if (!iommu) { > > + of_node_put(np); > > + return -EINVAL; > > + } > > This change is causing the issues. OMAP IOMMU driver is not probed yet, > but this

[PATCH 2/4] iommu/omap: Set dev->archdata.iommu = NULL in omap_iommu_remove_device

2017-04-07 Thread Joerg Roedel
From: Joerg Roedel Don't leave a stale pointer in case the device continues to exist for some more time. Signed-off-by: Joerg Roedel --- drivers/iommu/omap-iommu.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c index e9c9b08..08bd7

[PATCH 0/4 v2] iommu/omap: Add support for iommu-groups and 'struct iommu_device'

2017-04-07 Thread Joerg Roedel
Hi, here is a small patch-set for the omap-iommu driver to make it use new features of the iommu-core. Please review. Thanks, Joerg Changes since v1: * Dropped patch 2 and moved device-link and group-handling to attach/detach_dev call-backs for now. Joerg R

[PATCH 4/4] iommu/omap: Add iommu-group support

2017-04-07 Thread Joerg Roedel
From: Joerg Roedel Support for IOMMU groups will become mandatory for drivers, so add it to the omap iommu driver. Signed-off-by: Joerg Roedel --- drivers/iommu/omap-iommu.c | 43 +-- drivers/iommu/omap-iommu.h | 1 + 2 files changed, 42 insertions(+),

[PATCH 1/4] iommu/omap: Move data structures to omap-iommu.h

2017-04-07 Thread Joerg Roedel
From: Joerg Roedel The internal data-structures are scattered over various header and C files. Consolidate them in omap-iommu.h. Signed-off-by: Joerg Roedel --- drivers/iommu/omap-iommu.c | 16 drivers/iommu/omap-iommu.h | 32 +++

[PATCH 3/4] iommu/omap: Make use of 'struct iommu_device'

2017-04-07 Thread Joerg Roedel
From: Joerg Roedel Modify the driver to register individual iommus and establish links between devices and iommus in sysfs. Signed-off-by: Joerg Roedel --- drivers/iommu/omap-iommu.c | 25 + drivers/iommu/omap-iommu.h | 2 ++ 2 files changed, 27 insertions(+) diff --g

Re: [PATCH V10 06/12] of: device: Fix overflow of coherent_dma_mask

2017-04-07 Thread Robin Murphy
On 06/04/17 20:34, Frank Rowand wrote: > On 04/06/17 04:01, Sricharan R wrote: >> Hi Frank, >> >> On 4/6/2017 12:31 PM, Frank Rowand wrote: >>> On 04/04/17 03:18, Sricharan R wrote: Size of the dma-range is calculated as coherent_dma_mask + 1 and passed to arch_setup_dma_ops further. It o

[GIT PULL] iommu/arm-smmu: Updates for 4.12

2017-04-07 Thread Will Deacon
Hi Joerg, Please pull these arm-smmu updates for 4.12. Highlights include: * TLB sync optimisations for SMMUv2 * Support for using an IDENTITY domain in conjunction with DMA ops * Support for SMR masking * Support for 16-bit ASIDs (was previously broken) Thanks, Will --->8 The followi

Re: [GIT PULL] iommu/arm-smmu: Updates for 4.12

2017-04-07 Thread Joerg Roedel
On Fri, Apr 07, 2017 at 05:02:12PM +0100, Will Deacon wrote: > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git > for-joerg/arm-smmu/updates Pulled, thanks Will.

Re: [PATCH V3 0/5] iommu/arm-smmu: Add runtime pm/sleep support

2017-04-07 Thread Jordan Crouse
On Tue, Apr 04, 2017 at 12:39:14PM -0700, Stephen Boyd wrote: > On 04/03, Will Deacon wrote: > > On Fri, Mar 31, 2017 at 10:58:16PM -0400, Rob Clark wrote: > > > On Fri, Mar 31, 2017 at 1:54 PM, Will Deacon wrote: > > > > On Thu, Mar 09, 2017 at 09:05:43PM +0530, Sricharan R wrote: > > > >> This s

[RFC 0/3] virtio-iommu: a paravirtualized IOMMU

2017-04-07 Thread Jean-Philippe Brucker
This is the initial proposal for a paravirtualized IOMMU device using virtio transport. It contains a description of the device, a Linux driver, and a toy implementation in kvmtool. With this prototype, you can translate DMA to guest memory from emulated (virtio), or passed-through (VFIO) devices.

[RFC 1/3] virtio-iommu: firmware description of the virtual topology

2017-04-07 Thread Jean-Philippe Brucker
Unlike other virtio devices, the virtio-iommu doesn't work independently, it is linked to other virtual or assigned devices. So before jumping into device operations, we need to define a way for the guest to discover the virtual IOMMU and the devices it translates. The host must describe the relat

[RFC 3/3] virtio-iommu: future work

2017-04-07 Thread Jean-Philippe Brucker
Here I propose a few ideas for extensions and optimizations. This is all very exploratory, feel free to correct mistakes and suggest more things. I. Linux host 1. vhost-iommu 2. VFIO nested translation II. Page table sharing 1. Sharing IOMM

[RFC 2/3] virtio-iommu: device probing and operations

2017-04-07 Thread Jean-Philippe Brucker
After the virtio-iommu device has been probed and the driver is aware of the devices translated by the IOMMU, it can start sending requests to the virtio-iommu device. The operations described here are voluntarily minimalistic, so vIOMMU devices can be as simple as possible to implement, and can be

[RFC PATCH linux] iommu: Add virtio-iommu driver

2017-04-07 Thread Jean-Philippe Brucker
The virtio IOMMU is a para-virtualized device that allows IOMMU requests such as map/unmap to be sent over virtio-mmio transport. This driver should illustrate the initial proposal for virtio-iommu, that you hopefully received with it. It handles attach, detach, map and unmap requests. The bulk of the c

[RFC PATCH kvmtool 00/15] Add virtio-iommu

2017-04-07 Thread Jean-Philippe Brucker
Implement a virtio-iommu device and translate DMA traffic from vfio and virtio devices. Virtio needed some rework to support scatter-gather accesses to vring and buffers at page granularity. Patch 3 implements the actual virtio-iommu device. Adding --viommu on the command-line now inserts a virtua

[RFC PATCH kvmtool 03/15] virtio: add virtio-iommu

2017-04-07 Thread Jean-Philippe Brucker
Implement a simple para-virtualized IOMMU for handling device address spaces in guests. Four operations are implemented: * attach/detach: guest creates an address space, symbolized by a unique identifier (IOASID), and attaches the device to it. * map/unmap: guest creates a GVA->GPA mapping in an

[RFC PATCH kvmtool 02/15] FDT: (re)introduce a dynamic phandle allocator

2017-04-07 Thread Jean-Philippe Brucker
The phandle allocator was removed because static values were sufficient for creating a common irqchip. With adding multiple virtual IOMMUs to the device-tree, there is a need for a dynamic allocation of phandles. Add a simple allocator that returns values above the static ones. Signed-off-by: Jean
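
A minimal sketch of such an allocator, assuming the static phandles live in an enum whose last member marks the first free value. All names here (`toy_alloc_phandle()`, `TOY_PHANDLE_IRQCHIP`, `TOY_PHANDLES_STATIC_MAX`) are hypothetical and are not taken from the kvmtool patch.

```c
#include <stdint.h>

/* Hypothetical static phandles; the last member is the first dynamic value. */
enum toy_static_phandle {
    TOY_PHANDLE_RESERVED = 0,
    TOY_PHANDLE_IRQCHIP,
    TOY_PHANDLES_STATIC_MAX
};

/* Next value to hand out; starts just above the static range. */
static uint32_t toy_next_phandle = TOY_PHANDLES_STATIC_MAX;

/* Return a fresh phandle that cannot collide with the static ones. */
uint32_t toy_alloc_phandle(void)
{
    return toy_next_phandle++;
}
```

Each virtual IOMMU node would call the allocator once and reuse the returned value wherever the device tree refers to that node. The sketch is not thread-safe and does not handle exhaustion of the 32-bit range.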

[RFC PATCH kvmtool 01/15] virtio: synchronize virtio-iommu headers with Linux

2017-04-07 Thread Jean-Philippe Brucker
Pull virtio-iommu header (initial proposal) from Linux. Also add virtio_config.h because it defines VIRTIO_F_IOMMU_PLATFORM, which I'm going to need soon, and it's not provided by my toolchain. Signed-off-by: Jean-Philippe Brucker --- include/linux/virtio_config.h | 74 ++ i

[RFC PATCH kvmtool 04/15] Add a simple IOMMU

2017-04-07 Thread Jean-Philippe Brucker
Add a rb-tree based IOMMU with support for map, unmap and access operations. It will be used to store mappings for virtio devices and MSI doorbells. If needed, it could also be extended with a TLB implementation. Signed-off-by: Jean-Philippe Brucker --- Makefile| 1 + include/kvm/i
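
The three operations named here (map, unmap, access) can be sketched as follows. This is not the patch's implementation: for brevity it swaps the rb-tree for a small fixed-size table with linear search, and every identifier (`toy_iommu_map()`, `toy_iommu_unmap()`, `toy_iommu_access()`, `TOY_MAX_MAPPINGS`) is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_MAPPINGS 16   /* hypothetical table size */

struct toy_mapping {
    uint64_t iova;   /* start of the I/O virtual range */
    uint64_t phys;   /* start of the target range */
    uint64_t size;   /* length in bytes, non-zero when used */
    int used;
};

struct toy_iommu {
    struct toy_mapping maps[TOY_MAX_MAPPINGS];
};

/* Add a mapping. Fails on zero size, wraparound, overlap, or a full table. */
int toy_iommu_map(struct toy_iommu *io, uint64_t iova, uint64_t phys,
                  uint64_t size)
{
    struct toy_mapping *slot = NULL;

    if (size == 0 || iova + size - 1 < iova)
        return -1;

    for (size_t i = 0; i < TOY_MAX_MAPPINGS; i++) {
        struct toy_mapping *m = &io->maps[i];

        if (!m->used) {
            if (!slot)
                slot = m;
            continue;
        }
        /* Two inclusive ranges overlap if each starts before the other ends. */
        if (iova <= m->iova + m->size - 1 && m->iova <= iova + size - 1)
            return -1;
    }
    if (!slot)
        return -1;

    slot->iova = iova;
    slot->phys = phys;
    slot->size = size;
    slot->used = 1;
    return 0;
}

/* Remove the mapping that exactly matches (iova, size). */
int toy_iommu_unmap(struct toy_iommu *io, uint64_t iova, uint64_t size)
{
    for (size_t i = 0; i < TOY_MAX_MAPPINGS; i++) {
        struct toy_mapping *m = &io->maps[i];

        if (m->used && m->iova == iova && m->size == size) {
            m->used = 0;
            return 0;
        }
    }
    return -1;
}

/* Translate one address. Returns 0 and stores the result, or -1 on a miss. */
int toy_iommu_access(const struct toy_iommu *io, uint64_t iova, uint64_t *phys)
{
    for (size_t i = 0; i < TOY_MAX_MAPPINGS; i++) {
        const struct toy_mapping *m = &io->maps[i];

        if (m->used && iova >= m->iova && iova - m->iova < m->size) {
            *phys = m->phys + (iova - m->iova);
            return 0;
        }
    }
    return -1;
}
```

A tree keyed on the IOVA range gives logarithmic lookups and makes overlap checks cheaper once the number of mappings grows, which is presumably why the patch uses one; the linear table only keeps the sketch short.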

[RFC PATCH kvmtool 06/15] irq: register MSI doorbell addresses

2017-04-07 Thread Jean-Philippe Brucker
For passed-through devices behind a vIOMMU, we'll need to translate writes to MSI vectors. Let the IRQ code register MSI doorbells, and add a simple way for other systems to check if an address is a doorbell. Signed-off-by: Jean-Philippe Brucker --- arm/gic.c | 4 include/kvm/irq.h

[RFC PATCH kvmtool 05/15] iommu: describe IOMMU topology in device-trees

2017-04-07 Thread Jean-Philippe Brucker
Add an "iommu-map" property to the PCI host controller, describing which iommus translate which devices. We describe individual devices in iommu-map, not ranges. This patch is incompatible with current mainline Linux, which requires *all* devices under a host controller to be described by the iommu

[RFC PATCH kvmtool 07/15] virtio: factor virtqueue initialization

2017-04-07 Thread Jean-Philippe Brucker
All virtio devices are doing the same few operations when initializing their virtqueues. Move these operations to virtio core, as we'll have to complexify vring initialization when implementing a virtual IOMMU. Signed-off-by: Jean-Philippe Brucker --- include/kvm/virtio.h | 16 +---

[RFC PATCH kvmtool 08/15] virtio: add vIOMMU instance for virtio devices

2017-04-07 Thread Jean-Philippe Brucker
Virtio devices can now opt-in to use an IOMMU, by setting the use_iommu field. None of this will work in the current state, since virtio devices still access memory linearly. A subsequent patch implements sg accesses. Signed-off-by: Jean-Philippe Brucker --- include/kvm/virtio-mmio.h | 1 + inc

[RFC PATCH kvmtool 10/15] virtio-pci: translate MSIs with the virtual IOMMU

2017-04-07 Thread Jean-Philippe Brucker
When the virtio device is behind a virtual IOMMU, the doorbell address written into the MSI-X table by the guest is an IOVA, not a physical one. When injecting an MSI, KVM needs a physical address to recognize the doorbell and the associated IRQ chip. Translate the address given by the guest into a

[RFC PATCH kvmtool 09/15] virtio: access vring and buffers through IOMMU mappings

2017-04-07 Thread Jean-Philippe Brucker
Teach the virtio core how to access scattered vring structures. When presenting a virtual IOMMU to the guest in front of virtio devices, the virtio ring and buffers will be scattered across discontiguous guest- physical pages. The device has to translate all IOVAs to host-virtual addresses and gath

[RFC PATCH kvmtool 12/15] vfio: add support for virtual IOMMU

2017-04-07 Thread Jean-Philippe Brucker
Currently all passed-through devices must access the same guest-physical address space. Register an IOMMU to offer individual address spaces to devices. The way we do it is allocate one container per group, and add mappings on demand. Since guest cannot access devices unless it is attached to a co

[RFC PATCH kvmtool 11/15] virtio: set VIRTIO_F_IOMMU_PLATFORM when necessary

2017-04-07 Thread Jean-Philippe Brucker
Pass the VIRTIO_F_IOMMU_PLATFORM to tell the guest when a device is behind an IOMMU. Other feature bits in virtio do not depend on the device type and could be factored the same way. For instance our vring implementation always supports indirect descriptors (VIRTIO_RING_F_INDIRECT_DESC), so we cou

[RFC PATCH kvmtool 13/15] virtio-iommu: debug via IPC

2017-04-07 Thread Jean-Philippe Brucker
Add a new parameter to lkvm debug, '-i' or '--iommu'. Commands will be added later. For the moment, rework the debug builtin to share dump facilities with the '-d'/'--dump' parameter. Signed-off-by: Jean-Philippe Brucker --- builtin-debug.c | 8 +++- include/kvm/builtin-debug.h

[RFC PATCH kvmtool 14/15] virtio-iommu: implement basic debug commands

2017-04-07 Thread Jean-Philippe Brucker
Using debug printf with the virtual IOMMU can be extremely verbose. To ease debugging, add a few commands that can be sent via IPC. Format for commands is "cmd [iommu [address_space]]" (or cmd:[iommu:[address_space]]) $ lkvm debug -a -i list iommu 0 "viommu-vfio" ioas 1 devic
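
The command format quoted above, "cmd [iommu [address_space]]" with either spaces or colons as separators, could be parsed along these lines. This is a sketch under assumptions: it treats both optional fields as unsigned numbers, limits the command name to 15 characters, and uses hypothetical names (`toy_parse_debug_cmd()`, `struct toy_debug_cmd`) that do not come from the kvmtool patch.

```c
#include <stdio.h>

struct toy_debug_cmd {
    char name[16];            /* command name, at most 15 characters */
    unsigned iommu;           /* optional, 0 when absent */
    unsigned address_space;   /* optional, 0 when absent */
    int nargs;                /* 1 = cmd, 2 = cmd + iommu, 3 = all fields */
};

/*
 * Parse "cmd [iommu [address_space]]" or "cmd:[iommu:[address_space]]".
 * Returns 0 if at least the command name was read, -1 otherwise.
 */
int toy_parse_debug_cmd(const char *input, struct toy_debug_cmd *out)
{
    out->name[0] = '\0';
    out->iommu = 0;
    out->address_space = 0;

    /* %*1[: ] consumes one separator without counting as a conversion. */
    out->nargs = sscanf(input, "%15[^: ]%*1[: ]%u%*1[: ]%u",
                        out->name, &out->iommu, &out->address_space);

    return out->nargs >= 1 ? 0 : -1;
}
```

The `nargs` field lets a caller distinguish "apply to every IOMMU" from "apply to IOMMU 0", since both leave `iommu` at zero.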

[RFC PATCH kvmtool 15/15] virtio: use virtio-iommu when available

2017-04-07 Thread Jean-Philippe Brucker
This is for development only. Virtual devices might blow up unexpectedly. In general it seems to work (slowing devices down by a factor of two of course). virtio-scsi, virtio-rng and virtio-balloon are still untested. Signed-off-by: Jean-Philippe Brucker --- virtio/core.c | 3 +++ 1 file changed

Re: [RFC 0/3] virtio-iommu: a paravirtualized IOMMU

2017-04-07 Thread Michael S. Tsirkin
On Fri, Apr 07, 2017 at 08:17:44PM +0100, Jean-Philippe Brucker wrote: > There are a number of advantages in a paravirtualized IOMMU over a full > emulation. It is portable and could be reused on different architectures. > It is easier to implement than a full emulation, with less state tracking. >

Re: [PATCH V10 06/12] of: device: Fix overflow of coherent_dma_mask

2017-04-07 Thread Frank Rowand
On 04/06/17 00:01, Frank Rowand wrote: > On 04/04/17 03:18, Sricharan R wrote: >> Size of the dma-range is calculated as coherent_dma_mask + 1 >> and passed to arch_setup_dma_ops further. It overflows when >> the coherent_dma_mask is set for full 64 bits 0x, >> resulting in size get
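
The overflow under discussion is ordinary unsigned wraparound, which a few lines of C make concrete. The function names below are hypothetical, and the guarded variant is only one possible way to avoid the wrap; it is not presented as the fix adopted in the patch.

```c
#include <stdint.h>

/* Size as described in the commit message: mask + 1. Wraps to 0 for a
 * full 64-bit mask because unsigned arithmetic is modulo 2^64. */
uint64_t toy_dma_size_naive(uint64_t coherent_dma_mask)
{
    return coherent_dma_mask + 1;
}

/* One possible guard: if the addition wrapped, mask + 1 is smaller than
 * mask, so taking the larger of the two yields the mask itself. */
uint64_t toy_dma_size_guarded(uint64_t coherent_dma_mask)
{
    uint64_t size = coherent_dma_mask + 1;

    return size > coherent_dma_mask ? size : coherent_dma_mask;
}
```

For a 32-bit mask both functions agree; only the all-ones 64-bit mask makes the naive version report a size of zero.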

Re: [PATCH V10 06/12] of: device: Fix overflow of coherent_dma_mask

2017-04-07 Thread Frank Rowand
On 04/07/17 07:46, Robin Murphy wrote: > On 06/04/17 20:34, Frank Rowand wrote: >> On 04/06/17 04:01, Sricharan R wrote: >>> Hi Frank, >>> >>> On 4/6/2017 12:31 PM, Frank Rowand wrote: On 04/04/17 03:18, Sricharan R wrote: > Size of the dma-range is calculated as coherent_dma_mask + 1

AMD Nested Paging Performance Issues

2017-04-07 Thread Nick Sarnie
Hi all, Many community members and I have noticed an issue when enabling Nested Page Tables in a VM which uses GPU Passthrough on AMD CPUs. I have found that NPT universally increases CPU performance when it is enabled, but it destroys passed-through GPU performance by around two to three tim