[RFC PATCH v1 0/9] deferred_probe_timeout logic clean up

2022-05-26 Thread Saravana Kannan via iommu
This series is based on linux-next + these 2 small patches applied on top: https://lore.kernel.org/lkml/20220526034609.480766-1-sarava...@google.com/ A lot of the deferred_probe_timeout logic is redundant with fw_devlink=on. Also, enabling deferred_probe_timeout by default breaks a few cases. Th

[RFC PATCH v1 2/9] pinctrl: devicetree: Delete usage of driver_deferred_probe_check_state()

2022-05-26 Thread Saravana Kannan via iommu
Now that fw_devlink=on by default and fw_devlink supports "pinctrl-[0-8]" property, the execution will never get to the point where driver_deferred_probe_check_state() is called before the supplier has probed successfully or before deferred probe timeout has expired. So, delete the call and replac
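For readers unfamiliar with the helper these patches remove, here is a rough userspace model of the decision it makes. It is a sketch written from memory of how driver_deferred_probe_check_state() behaved around this time, so the exact conditions are an assumption; `struct probe_state` and `check_state_model()` are hypothetical names, not kernel API.

```c
#include <errno.h>
#include <stdbool.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517 /* kernel-internal value, absent from userspace errno.h */
#endif

/* Hypothetical snapshot of the boot state the helper consults. */
struct probe_state {
	bool modules_enabled; /* stands in for IS_ENABLED(CONFIG_MODULES) */
	bool initcalls_done;  /* late initcalls have finished */
	bool timeout_expired; /* deferred_probe_timeout has counted down to 0 */
};

/*
 * Model of the check: give up on a missing supplier once no driver can
 * still appear, or once the timeout has run out; otherwise keep deferring.
 */
static int check_state_model(const struct probe_state *s)
{
	if (!s->modules_enabled && s->initcalls_done)
		return -ENODEV;       /* no modules: the driver will never show up */
	if (s->timeout_expired && s->initcalls_done)
		return -ETIMEDOUT;    /* waited long enough, ignore the dependency */
	return -EPROBE_DEFER;         /* try again later */
}
```

In this model a consumer keeps getting -EPROBE_DEFER during early boot, which is the ordering job the series argues fw_devlink=on already performs before the consumer's probe function runs.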

[RFC PATCH v1 4/9] Revert "driver core: Set default deferred_probe_timeout back to 0."

2022-05-26 Thread Saravana Kannan via iommu
This reverts commit 11f7e7ef553b6b93ac1aa74a3c2011b9cc8aeb61. Let's take another shot at getting deferred_probe_timeout=10 to work. Signed-off-by: Saravana Kannan --- drivers/base/dd.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 11b0fb641

[RFC PATCH v1 1/9] PM: domains: Delete usage of driver_deferred_probe_check_state()

2022-05-26 Thread Saravana Kannan via iommu
Now that fw_devlink=on by default and fw_devlink supports "power-domains" property, the execution will never get to the point where driver_deferred_probe_check_state() is called before the supplier has probed successfully or before deferred probe timeout has expired. So, delete the call and replac

[RFC PATCH v1 3/9] net: mdio: Delete usage of driver_deferred_probe_check_state()

2022-05-26 Thread Saravana Kannan via iommu
Now that fw_devlink=on by default and fw_devlink supports interrupt properties, the execution will never get to the point where driver_deferred_probe_check_state() is called before the supplier has probed successfully or before deferred probe timeout has expired. So, delete the call and replace it

[RFC PATCH v1 5/9] driver core: Set fw_devlink.strict=1 by default

2022-05-26 Thread Saravana Kannan via iommu
Now that deferred_probe_timeout is non-zero by default, fw_devlink will never permanently block the probing of devices. It'll try its best to probe the devices in the right order and then finally let devices probe even if their suppliers don't have any drivers. Signed-off-by: Saravana Kannan ---

[RFC PATCH v1 6/9] iommu/of: Delete usage of driver_deferred_probe_check_state()

2022-05-26 Thread Saravana Kannan via iommu
Now that fw_devlink=on and fw_devlink.strict=1 by default and fw_devlink supports iommu DT properties, the execution will never get to the point where driver_deferred_probe_check_state() is called before the supplier has probed successfully or before deferred probe timeout has expired. So, delete

[RFC PATCH v1 7/9] driver core: Add fw_devlink_unblock_may_probe() helper function

2022-05-26 Thread Saravana Kannan via iommu
This function can be used during the kernel boot sequence to forcefully override fw_devlink=on and unblock the probing of all devices that have a driver. It's mainly meant to be called from late_initcall() or late_initcall_sync() where a device needs to probe before the kernel can mount rootfs. S

[RFC PATCH v1 8/9] net: ipconfig: Force fw_devlink to unblock any devices that might probe

2022-05-26 Thread Saravana Kannan via iommu
If there are network devices that could probe without some of their suppliers probing and those network devices are needed for IP auto config to work, then fw_devlink=on might break that use case by blocking the network devices from probing by the time IP auto config starts. So, when IP auto config

[RFC PATCH v1 9/9] driver core: Delete driver_deferred_probe_check_state()

2022-05-26 Thread Saravana Kannan via iommu
The function is no longer used. So delete it. Signed-off-by: Saravana Kannan --- drivers/base/dd.c | 30 -- include/linux/device/driver.h | 1 - 2 files changed, 31 deletions(-) diff --git a/drivers/base/dd.c b/drivers/base/dd.c index af8138d44e6c..789b0

Re: [PATCH v2 2/2] iommu: mtk_iommu: Add support for MT6795 Helio X10 M4Us

2022-05-26 Thread Yong Wu via iommu
On Wed, 2022-05-18 at 12:18 +0200, AngeloGioacchino Del Regno wrote: > Add support for the M4Us found in the MT6795 Helio X10 SoC. > > Signed-off-by: AngeloGioacchino Del Regno < > angelogioacchino.delre...@collabora.com> > --- > drivers/iommu/mtk_iommu.c | 17 - > 1 file changed,

Re: [PATCH v2 1/7] dt-bindings: iommu: mediatek: Add phandles for mediatek infra/pericfg

2022-05-26 Thread Yong Wu via iommu
On Wed, 2022-05-18 at 12:04 +0200, AngeloGioacchino Del Regno wrote: > Add properties "mediatek,infracfg" and "mediatek,pericfg" to let the > mtk_iommu driver retrieve phandles to the infracfg and pericfg > syscon(s) > instead of performing a per-soc compatible lookup. > > Signed-off-by: AngeloGio

Re: [PATCH v2 2/7] iommu: mtk_iommu: Lookup phandle to retrieve syscon to infracfg

2022-05-26 Thread Yong Wu via iommu
On Wed, 2022-05-18 at 12:04 +0200, AngeloGioacchino Del Regno wrote: > This driver will get support for more SoCs and the list of infracfg > compatibles is expected to grow: in order to prevent getting this > situation out of control and see a long list of compatible strings, > add support to retri

Re: [PATCH v2 3/7] iommu: mtk_iommu: Lookup phandle to retrieve syscon to pericfg

2022-05-26 Thread Yong Wu via iommu
On Wed, 2022-05-18 at 12:04 +0200, AngeloGioacchino Del Regno wrote: > On some SoCs (of which only MT8195 is supported at the time of > writing), > the "R" and "W" (I/O) enable bits for the IOMMUs are in the > pericfg_ao > register space and not in the IOMMU space: as it happened already > with > i

[PATCH v2 0/4] DMA mapping changes for SCSI core

2022-05-26 Thread John Garry via iommu
As reported in [0], DMA mappings whose size exceeds the IOMMU IOVA caching limit may see a big performance hit. This series introduces a new DMA mapping API, dma_opt_mapping_size(), so that drivers may know this limit when performance is a factor in the mapping. Robin didn't like using dma_max_ma

[PATCH v2 1/4] dma-mapping: Add dma_opt_mapping_size()

2022-05-26 Thread John Garry via iommu
Streaming DMA mapping involving an IOMMU may be much slower for larger total mapping size. This is because every IOMMU DMA mapping requires an IOVA to be allocated and freed. IOVA sizes above a certain limit are not cached, which can have a big impact on DMA mapping performance. Provide an API for

[PATCH v2 2/4] dma-iommu: Add iommu_dma_opt_mapping_size()

2022-05-26 Thread John Garry via iommu
Add the IOMMU callback for DMA mapping API dma_opt_mapping_size(), which allows the drivers to know the optimal mapping limit and thus limit the requested IOVA lengths. This value is based on the IOVA rcache range limit, as IOVAs allocated above this limit must always be newly allocated, which may

[PATCH v2 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits

2022-05-26 Thread John Garry via iommu
Streaming DMA mappings may be considerably slower when mappings go through an IOMMU and the total mapping length is somewhat long. This is because the IOMMU IOVA code allocates and frees an IOVA for each mapping, which may affect performance. For performance reasons set the request_queue max_sector
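The arithmetic behind such a cap can be sketched in plain C. This is an illustration only, not the patch itself: the page size, the cache order, and the names `model_opt_mapping_size()` and `cap_max_sectors()` are assumptions made for the example.

```c
#include <stddef.h>

#define MODEL_SECTOR_SHIFT 9      /* 512-byte sectors */
#define MODEL_PAGE_SIZE 4096UL    /* assumed page size */
#define MODEL_RCACHE_MAX_ORDER 5  /* assumed: IOVAs up to 2^5 pages are cached */

/* Largest mapping size in bytes that the modelled IOVA cache can still serve. */
static size_t model_opt_mapping_size(void)
{
	return MODEL_PAGE_SIZE << MODEL_RCACHE_MAX_ORDER;
}

/* Cap max_sectors so that a single request never exceeds the optimal size. */
static unsigned int cap_max_sectors(unsigned int max_sectors, size_t opt_bytes)
{
	size_t opt_sectors = opt_bytes >> MODEL_SECTOR_SHIFT;

	return max_sectors < opt_sectors ? max_sectors : (unsigned int)opt_sectors;
}
```

Under these assumed constants the optimal size is 128 KiB, so a host advertising 1024 sectors would be capped to 256, while a host that already advertises fewer sectors is left unchanged.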

[PATCH v2 4/4] libata-scsi: Cap ata_device->max_sectors according to shost->max_sectors

2022-05-26 Thread John Garry via iommu
ATA devices (struct ata_device) have a max_sectors field which is configured internally in libata. This is then used to (re)configure the associated sdev request queue max_sectors value from how it is earlier set in __scsi_init_queue(). In __scsi_init_queue() the max_sectors value is set according

[RFC PATCH V3 0/2] swiotlb: Add child io tlb mem support

2022-05-26 Thread Tianyu Lan
From: Tianyu Lan Traditionally swiotlb was not performance critical because it was only used for slow devices. But in some setups, like TDX/SEV confidential guests, all IO has to go through swiotlb. Currently swiotlb only has a single lock. Under high IO load with multiple CPUs this can lead to s

[RFC PATCH V3 2/2] net: netvsc: Allocate per-device swiotlb bounce buffer for netvsc

2022-05-26 Thread Tianyu Lan
From: Tianyu Lan The netvsc driver allocates device io tlb mem by calling swiotlb_device_allocate() and sets the child io tlb mem number according to the device queue number. Child io tlb mem may reduce the overhead of the single spin lock in device io tlb mem among multiple device queues. Signed-off-by: Tianyu Lan

[RFC PATCH V3 1/2] swiotlb: Add Child IO TLB mem support

2022-05-26 Thread Tianyu Lan
From: Tianyu Lan Traditionally swiotlb was not performance critical because it was only used for slow devices. But in some setups, like TDX/SEV confidential guests, all IO has to go through swiotlb. Currently swiotlb only has a single lock. Under high IO load with multiple CPUs this can lead to s
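The idea of replacing one pool-wide lock with one lock per child region can be sketched in userspace C. This is a toy model under stated assumptions, not the swiotlb implementation: the slot allocator is a simple bump counter, the spinlock is a C11 `atomic_flag`, and every `model_*` name is hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_MAX_AREAS 8

/* One child region: its own lock and its own slice of the slot pool. */
struct model_area {
	atomic_flag lock;  /* per-area spinlock */
	size_t base;       /* index of the area's first slot in the pool */
	size_t nslots;     /* number of slots owned by this area */
	size_t next_free;  /* next unused slot within the area */
};

struct model_pool {
	struct model_area areas[MODEL_MAX_AREAS];
	size_t nareas;
};

/* Split total_slots evenly across nareas child regions. */
static void model_pool_init(struct model_pool *p, size_t total_slots, size_t nareas)
{
	if (nareas == 0 || nareas > MODEL_MAX_AREAS)
		nareas = 1;
	p->nareas = nareas;
	for (size_t i = 0; i < nareas; i++) {
		atomic_flag_clear(&p->areas[i].lock);
		p->areas[i].nslots = total_slots / nareas;
		p->areas[i].base = i * (total_slots / nareas);
		p->areas[i].next_free = 0;
	}
}

/* Take a slot from the area tied to this queue; returns -1 if it is full. */
static long model_alloc_slot(struct model_pool *p, size_t queue)
{
	struct model_area *a = &p->areas[queue % p->nareas];
	long slot = -1;

	while (atomic_flag_test_and_set_explicit(&a->lock, memory_order_acquire))
		; /* spin: only queues mapped to the same area contend here */
	if (a->next_free < a->nslots)
		slot = (long)(a->base + a->next_free++);
	atomic_flag_clear_explicit(&a->lock, memory_order_release);
	return slot;
}
```

Because each queue maps to its own area, two queues allocating at the same time spin on different locks. The trade-off in this toy version is that an area can run out of slots while others still have free ones.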

RE: [RFC PATCH V3 2/2] net: netvsc: Allocate per-device swiotlb bounce buffer for netvsc

2022-05-26 Thread Dexuan Cui via iommu
> From: Tianyu Lan > Sent: Thursday, May 26, 2022 5:01 AM > ... > @@ -119,6 +124,10 @@ static void netvsc_subchan_work(struct work_struct > *w) > nvdev->max_chn = 1; > nvdev->num_chn = 1; > } > + > + /* Allocate boucne buffer.*/

Re: [PATCH v2 2/4] dma-iommu: Add iommu_dma_opt_mapping_size()

2022-05-26 Thread Damien Le Moal via iommu
On 2022/05/26 19:28, John Garry wrote: > Add the IOMMU callback for DMA mapping API dma_opt_mapping_size(), which > allows the drivers to know the optimal mapping limit and thus limit the > requested IOVA lengths. > > This value is based on the IOVA rcache range limit, as IOVAs allocated > above t

Re: [PATCH v2 0/9] Add dynamic iommu backed bounce buffers

2022-05-26 Thread David Stevens
On Tue, May 24, 2022 at 9:27 PM Niklas Schnelle wrote: > > On Fri, 2021-08-06 at 19:34 +0900, David Stevens wrote: > > From: David Stevens > > > > This patch series adds support for per-domain dynamic pools of iommu > > bounce buffers to the dma-iommu API. This allows iommu mappings to be > > reu

[PATCH 1/1] iommu/vt-d: Remove unused iovad from dmar_domain

2022-05-26 Thread Lu Baolu
Not used anywhere. Clean it up to avoid dead code. Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.h | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h index 0f9df5a19ef7..a22adfbdf870 100644 --- a/drivers/iommu/intel/iommu.h +++ b/d

[PATCH 00/12] iommu/vt-d: Optimize the use of locks

2022-05-26 Thread Lu Baolu
Hi folks, This series tries to optimize the uses of two locks in the Intel IOMMU driver:
- The intel_iommu::lock is used to protect the IOMMU resources shared by devices. They include the IOMMU root and context tables, the pasid tables and the domain IDs.
- The global device_domain_lock is us

[PATCH 01/12] iommu/vt-d: Use iommu_get_domain_for_dev() in debugfs

2022-05-26 Thread Lu Baolu
Retrieve the attached domain for a device through the generic interface exposed by the iommu core. This also makes device_domain_lock static. Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.h | 1 - drivers/iommu/intel/debugfs.c | 20 drivers/iommu/intel/iommu.c |

[PATCH 02/12] iommu/vt-d: Remove for_each_device_domain()

2022-05-26 Thread Lu Baolu
The per-device device_domain_info data could be retrieved from the device itself. There's no need to search a global list. Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.h | 2 -- drivers/iommu/intel/iommu.c | 25 - drivers/iommu/intel/pasid.c | 37 +++

[PATCH 03/12] iommu/vt-d: Remove clearing translation data in disable_dmar_iommu()

2022-05-26 Thread Lu Baolu
The disable_dmar_iommu() is called when IOMMU initialization fails or the IOMMU is hot-removed from the system. In both cases, there is no need to clear the IOMMU translation data structures for devices. On the initialization path, the device probing only happens after the IOMMU is initialized succ

[PATCH 05/12] iommu/vt-d: Unncessary spinlock for root table alloc and free

2022-05-26 Thread Lu Baolu
The IOMMU root table is allocated and freed in the IOMMU initialization code in static boot or hot-plug paths. There's no need for a spinlock. Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.c | 18 +- 1 file changed, 5 insertions(+), 13 deletions(-) diff --git a/drivers/i

[PATCH 04/12] iommu/vt-d: Use pci_get_domain_bus_and_slot() in pgtable_walk()

2022-05-26 Thread Lu Baolu
Use pci_get_domain_bus_and_slot() instead of searching the global list to retrieve the pci device pointer. This removes device_domain_list global list as there are no consumers anymore. Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.h | 1 - drivers/iommu/intel/iommu.c | 33 ++---

[PATCH 06/12] iommu/vt-d: Acquiring lock in domain ID allocation helpers

2022-05-26 Thread Lu Baolu
The iommu->lock is used to protect the per-IOMMU domain ID resource. Move the spinlock acquisition/release into the helpers where domain IDs are allocated and freed. The device_domain_lock is irrelevant to domain ID resources, remove its assertion as well. Signed-off-by: Lu Baolu --- drivers/iom

[PATCH 07/12] iommu/vt-d: Acquiring lock in pasid manipulation helpers

2022-05-26 Thread Lu Baolu
The iommu->lock is used to protect the per-IOMMU pasid directory table and pasid table. Move the spinlock acquisition/release into the helpers to make the code self-contained. Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.c | 2 - drivers/iommu/intel/pasid.c | 106 +++--

[PATCH 08/12] iommu/vt-d: Replace spin_lock_irqsave() with spin_lock()

2022-05-26 Thread Lu Baolu
The iommu->lock is used to protect changes in root/context/pasid tables and domain ID allocation. There's no use case to change these resources in any interrupt context. Hence there's no need to disable interrupts when holding the spinlock. Signed-off-by: Lu Baolu --- drivers/iommu/intel/debugfs

[PATCH 09/12] iommu/vt-d: Check device list of domain in domain free path

2022-05-26 Thread Lu Baolu
When the IOMMU domain is about to be freed, it should not be set on any device. Instead of silently dealing with some bug cases, it's better to trigger a warning so that any potential bugs are reported and fixed as early as possible. Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.c | 17 ++---

[PATCH 10/12] iommu/vt-d: Fold __dmar_remove_one_dev_info() into its caller

2022-05-26 Thread Lu Baolu
Fold __dmar_remove_one_dev_info() into dmar_remove_one_dev_info() which is its only caller. Make the spin lock critical range only cover the device list change code and remove some unnecessary checks. Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.c | 34 +-

[PATCH 11/12] iommu/vt-d: Use device_domain_lock accurately

2022-05-26 Thread Lu Baolu
The device_domain_lock is used to protect the device tracking list of a domain. Remove unnecessary spin_lock/unlock()'s and move the necessary ones around the list access. Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.c | 68 +++-- 1 file changed, 27 inser

[PATCH 12/12] iommu/vt-d: Convert device_domain_lock into per-domain mutex

2022-05-26 Thread Lu Baolu
Using a global device_domain_lock spinlock to protect per-domain device tracking lists is inefficient, especially considering this lock is also needed in the hot paths. On the other hand, in the iommu_unmap() path, the driver needs to iterate over the device tracking list and flush the cach