From: Magnus Damm
Update the IPMMU DT binding documentation to include the r8a7796 compat
string for the IPMMU devices included in the R-Car M3-W SoC.
Signed-off-by: Magnus Damm
Acked-by: Laurent Pinchart
Acked-by: Rob Herring
Acked-by: Simon Horman
Acked-by: Geert Uytterhoeven
---
Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.txt | 1 +
1 file changed, 1 insertion(+)
From: Magnus Damm
Update the IPMMU DT binding documentation to include the r8a77970 compat
string for the IPMMU devices included in the R-Car V3M SoC.
Signed-off-by: Magnus Damm
---
Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.txt | 1 +
1 file changed, 1 insertion(+)
From: Magnus Damm
Update the IPMMU DT binding documentation to include the r8a77995 compat
string for the IPMMU devices included in the R-Car D3 SoC.
Signed-off-by: Magnus Damm
---
Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.txt | 1 +
1 file changed, 1 insertion(+)
iommu/ipmmu-vmsa: R-Car Gen3 IPMMU DT binding update
[PATCH 1/3] iommu/ipmmu-vmsa: Document R-Car M3-W IPMMU DT bindings
[PATCH 2/3] iommu/ipmmu-vmsa: Document R-Car V3M IPMMU DT bindings
[PATCH 3/3] iommu/ipmmu-vmsa: Document R-Car D3 IPMMU DT bindings
This series documents IPMMU DT bindings for the R-Car M3-W, V3M and D3 SoCs.
The logic of __get_cached_rbnode() is a little obtuse, but then
__get_prev_node_of_cached_rbnode_or_last_node_and_update_limit_pfn()
wouldn't exactly roll off the tongue...
Now that we have the invariant that there is always a valid node to
start searching downwards from, everything gets a bit easier.
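Concretely, the start-point choice can collapse to something like this (a sketch against the post-series struct iova_domain fields; not a verbatim excerpt from the patch):

static struct rb_node *get_search_start(struct iova_domain *iovad,
					unsigned long limit_pfn)
{
	/* The anchor node guarantees both cached pointers are always
	 * valid, so no rb_last() fallback is needed. */
	if (limit_pfn <= iovad->dma_32bit_pfn)
		return iovad->cached32_node;

	return iovad->cached_node;
}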
From: Zhen Lei
Now that the cached node optimisation can apply to all allocations, the
couple of users which were playing tricks with dma_32bit_pfn in order to
benefit from it can stop doing so. Conversely, there is also no need for
all the other users to explicitly calculate a 'real' 32-bit PFN, so that
extra argument can simply be dropped from the interface.
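The caller-visible effect, sketched under the assumption of the old four-argument init_iova_domain() signature (illustrative, not a literal hunk from the series):

static void domain_setup_sketch(struct iova_domain *iovad,
				unsigned long base_pfn)
{
	/* Before: callers picked a "real" 32-bit PFN by hand, e.g.
	 *   init_iova_domain(iovad, SZ_4K, base_pfn,
	 *                    DMA_BIT_MASK(32) >> PAGE_SHIFT);
	 * After: the 32-bit boundary is implicit in the domain itself.
	 */
	init_iova_domain(iovad, SZ_4K, base_pfn);
}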
Add a permanent dummy IOVA reservation to the rbtree, such that we can
always access the top of the address space instantly. The immediate
benefit is that we remove the overhead of the rb_last() traversal when
not using the cached node, but it also paves the way for further
simplifications.
Signed-off-by:
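A sketch of that reservation as planted at init time (IOVA_ANCHOR and the field names follow the series; treat the details as illustrative, not a verbatim excerpt):

#define IOVA_ANCHOR	~0UL

static void plant_anchor(struct iova_domain *iovad)
{
	/* A dummy entry pinned at the very top pfn: the tree is empty
	 * at init time, so this becomes (and permanently stays) the
	 * right-most node, making rb_last() walks unnecessary. */
	iovad->anchor.pfn_lo = IOVA_ANCHOR;
	iovad->anchor.pfn_hi = IOVA_ANCHOR;
	rb_link_node(&iovad->anchor.node, NULL, &iovad->rbroot.rb_node);
	rb_insert_color(&iovad->anchor.node, &iovad->rbroot);
}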
From: Zhen Lei
Checking the IOVA bounds separately before deciding which direction to
continue the search (if necessary) results in redundantly comparing both
pfns twice each. GCC can already determine that the final comparison op
is redundant and optimise it down to 3 in total, but we can go one better
by reordering the branches so that at most two comparisons decide each step.
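The shape of the rework, sketched on a generic range lookup (struct iova fields as in drivers/iommu/iova.c; illustrative):

static struct iova *find_iova_sketch(struct rb_root *root, unsigned long pfn)
{
	struct rb_node *node = root->rb_node;

	while (node) {
		struct iova *iova = rb_entry(node, struct iova, node);

		/* The two directional tests double as the range check,
		 * so each step costs at most two comparisons. */
		if (pfn < iova->pfn_lo)
			node = node->rb_left;
		else if (pfn > iova->pfn_hi)
			node = node->rb_right;
		else
			return iova;	/* pfn falls within this range */
	}

	return NULL;
}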
The cached node mechanism provides a significant performance benefit for
allocations using a 32-bit DMA mask, but in the case of non-PCI devices
or where the 32-bit space is full, the loss of this benefit can be
significant - on large systems there can be many thousands of entries in
the tree, such that traversing all the way down to find free space every
time becomes increasingly expensive.
From: Zhen Lei
The mask for calculating the padding size doesn't change, so there's no
need to recalculate it every loop iteration. Furthermore, once we've
done that, it becomes clear that we don't actually need to calculate a
padding size at all - by flipping the arithmetic around, we can just
combine the alignment directly into the limit calculation.
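Sketched below (the function name is mine; the arithmetic mirrors what the patch describes):

static unsigned long highest_candidate(unsigned long limit_pfn,
				       unsigned long size, bool size_aligned)
{
	unsigned long align_mask = ~0UL;

	/* Computed once, outside any retry loop... */
	if (size_aligned)
		align_mask <<= fls_long(size - 1);

	/* ...and folded straight into the candidate base: the highest
	 * suitably-aligned pfn whose allocation fits below the limit,
	 * with no separate padding-size calculation at all. */
	return (limit_pfn - size) & align_mask;
}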
v4: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1493704.html
Right, this is hopefully the last version - I've put things back in a
sensible order with the new additions at the end, so if they prove
contentious the first 4 previously-tested patches can still get their
time in -next.
get_cpu_ptr() disables preemption and returns the ->fq object of the
current CPU. raw_cpu_ptr() does the same except that it does not disable
preemption, which means the scheduler can move the task to another CPU after
it has obtained the per-CPU object.
In this case this is not bad, because the data structure itself is
protected with a spin_lock.
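The difference between the two accessors, sketched against the flush-queue data (names approximate drivers/iommu/iova.c):

static void fq_access_sketch(struct iova_domain *iovad)
{
	struct iova_fq *fq;
	unsigned long flags;

	/* get_cpu_ptr() pins the task: preemption stays disabled until
	 * the matching put_cpu_ptr(), so no migration off this CPU. */
	fq = get_cpu_ptr(iovad->fq);
	put_cpu_ptr(iovad->fq);

	/* raw_cpu_ptr() takes no such pin; the task may migrate right
	 * afterwards, which is harmless here because every access to
	 * the fq takes its spinlock anyway. */
	fq = raw_cpu_ptr(iovad->fq);
	spin_lock_irqsave(&fq->lock, flags);
	/* ... queue the entry ... */
	spin_unlock_irqrestore(&fq->lock, flags);
}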
On 2017-09-11 22:22:11 [-0400], Vinod Adhikary wrote:
> Dear all,
Hi,
> Thank you for the great community support and support from Sebastian to
> provide me this patch. I wanted to send this email to inform you and
> perhaps get some information on how I could keep myself updated on updates
> in r
On Thu, Sep 21, 2017 at 12:58:04PM +0100, Robin Murphy wrote:
> Christoph, Marek; how reasonable do you think it is to expect
> dma_alloc_coherent() to be inherently NUMA-aware on NUMA-capable
> systems? SWIOTLB looks fairly straightforward to fix up (for the simple
> allocation case; I'm not sure
According to the spec, it is ILLEGAL to set STE.S1STALLD if STALL_MODEL
is not 0b00, which means we should not disable stall mode if stall
or terminate mode is not configurable.
Meanwhile, it is also ILLEGAL when STALL_MODEL == 0b10 && CD.S == 0, which
means that if stall mode is forced we should always set CD.S.
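Expressed as driver logic, the two rules look roughly like this (the struct and its fields are hypothetical stand-ins; STRTAB_STE_1_S1STALLD and CTXDESC_CD_0_S follow the arm-smmu-v3 driver's naming):

struct smmu_stall_caps {
	bool stall_supported;	/* STALL_MODEL is 0b00 or 0b10 */
	bool stall_forced;	/* STALL_MODEL is 0b10 */
};

static void apply_stall_rules(const struct smmu_stall_caps *caps,
			      u64 *ste, u64 *cd)
{
	/* STE.S1STALLD may only be set when STALL_MODEL == 0b00, i.e.
	 * when stall vs. terminate is genuinely configurable. */
	if (caps->stall_supported && !caps->stall_forced)
		ste[1] |= STRTAB_STE_1_S1STALLD;

	/* STALL_MODEL == 0b10 with CD.S == 0 is equally ILLEGAL, so
	 * forced stall mode implies always setting CD.S. */
	if (caps->stall_forced)
		cd[0] |= CTXDESC_CD_0_S;
}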
[+Christoph and Marek]
On 21/09/17 09:59, Ganapatrao Kulkarni wrote:
> Introduce smmu_alloc_coherent and smmu_free_coherent functions to
> allocate/free dma coherent memory from NUMA node associated with SMMU.
> Replace all calls of dmam_alloc_coherent with smmu_alloc_coherent
> for SMMU stream tables and command queues.
On 21/09/17 09:59, Ganapatrao Kulkarni wrote:
> Change function __iommu_dma_alloc_pages to allocate memory/pages
> for dma from respective device numa node.
>
> Signed-off-by: Ganapatrao Kulkarni
> ---
> drivers/iommu/dma-iommu.c | 17 ++---
> 1 file changed, 10 insertions(+), 7 deletions(-)
On 21/09/17 09:59, Ganapatrao Kulkarni wrote:
> The function __arm_lpae_alloc_pages() is used to allocate memory for smmu
> translation tables. Update the function to allocate memory/pages
> from the proximity domain of the SMMU device.
AFAICS, data->pgd_size always works out to a power-of-two number of
pages
On 21/09/17 11:20, Robin Murphy wrote:
> of_pci_iommu_init() tries to be clever and stop its alias walk at the
> device represented by master_np, in case of weird PCI topologies where
> the bridge to the IOMMU and the rest of the system is not at the root.
> It turns out this is a bit short-sighted
On 20/09/17 10:37, Auger Eric wrote:
> Hi Jean,
> On 19/09/2017 12:47, Jean-Philippe Brucker wrote:
>> Hi Eric,
>>
>> On 12/09/17 18:13, Auger Eric wrote:
>>> 2.6.7
>>> - As I am currently integrating v0.4 in QEMU here are some other comments:
>>> At the moment struct virtio_iommu_req_probe flags i
of_pci_iommu_init() tries to be clever and stop its alias walk at the
device represented by master_np, in case of weird PCI topologies where
the bridge to the IOMMU and the rest of the system is not at the root.
It turns out this is a bit short-sighted, since there are plenty of
other callers of pci_for_each_dma_alias() which would want the same
treatment in that situation.
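Since the fix hinges on every user of pci_for_each_dma_alias() seeing the same complete walk, here is a minimal illustration of what that walk reports (the callback is hypothetical; pci_for_each_dma_alias() is the real API):

/* Called for the device's own RID and for every bridge alias on the
 * way up to the host bridge; returning non-zero stops the walk. */
static int report_alias(struct pci_dev *pdev, u16 alias, void *data)
{
	dev_info(&pdev->dev, "DMA alias RID 0x%04x\n", alias);
	return 0;
}

static void show_dma_aliases(struct pci_dev *pdev)
{
	pci_for_each_dma_alias(pdev, report_alias, NULL);
}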
Introduce smmu_alloc_coherent and smmu_free_coherent functions to
allocate/free dma coherent memory from NUMA node associated with SMMU.
Replace all calls of dmam_alloc_coherent with smmu_alloc_coherent
for SMMU stream tables and command queues.
Signed-off-by: Ganapatrao Kulkarni
---
drivers/iom
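A hedged sketch of the helpers this patch describes: allocate the buffer from the SMMU's own NUMA node, then map it for the device. The bodies are illustrative, not the posted implementation, and whether a streaming dma_map_single() mapping is an acceptable substitute for dmam_alloc_coherent() is a design question this sketch does not settle.

static void *smmu_alloc_coherent(struct device *dev, size_t size,
				 dma_addr_t *dma_handle, gfp_t gfp)
{
	void *buf;

	/* Allocate from the SMMU's node rather than the local one */
	buf = alloc_pages_exact_nid(dev_to_node(dev), size,
				    gfp | __GFP_ZERO);
	if (!buf)
		return NULL;

	*dma_handle = dma_map_single(dev, buf, size, DMA_BIDIRECTIONAL);
	if (dma_mapping_error(dev, *dma_handle)) {
		free_pages_exact(buf, size);
		return NULL;
	}

	return buf;
}

static void smmu_free_coherent(struct device *dev, size_t size, void *buf,
			       dma_addr_t dma_handle)
{
	dma_unmap_single(dev, dma_handle, size, DMA_BIDIRECTIONAL);
	free_pages_exact(buf, size);
}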
Change function __iommu_dma_alloc_pages to allocate memory/pages
for dma from respective device numa node.
Signed-off-by: Ganapatrao Kulkarni
---
drivers/iommu/dma-iommu.c | 17 ++---
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
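In essence the change boils down to allocating each chunk on the device's node; a sketch (alloc_chunk_on_node() is a made-up helper, while alloc_pages_node() and dev_to_node() are the real kernel APIs):

static struct page *alloc_chunk_on_node(struct device *dev,
					unsigned int order, gfp_t gfp)
{
	/* dev_to_node() returns NUMA_NO_NODE for un-placed devices,
	 * which alloc_pages_node() treats as "use the local node". */
	return alloc_pages_node(dev_to_node(dev), gfp, order);
}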
Adding NUMA-aware memory allocations for iommu dma allocation and for
memory allocated for SMMU stream tables, page walk tables and command queues.
With this patch, iperf testing on ThunderX2 with a 40G NIC card on
NODE 1 PCI showed the same performance as NODE 0 (around a 30% improvement).
Ganapatrao Kulkarni
The function __arm_lpae_alloc_pages() is used to allocate memory for smmu
translation tables. Update the function to allocate memory/pages
from the proximity domain of the SMMU device.
Signed-off-by: Ganapatrao Kulkarni
---
drivers/iommu/io-pgtable-arm.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
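A sketch of the same idea in the page-table walker (the shape approximates __arm_lpae_alloc_pages(); cfg->iommu_dev is the real io_pgtable_cfg field carrying the SMMU device):

static void *lpae_table_alloc_sketch(size_t size, gfp_t gfp,
				     struct io_pgtable_cfg *cfg)
{
	struct device *dev = cfg->iommu_dev;
	struct page *p;

	/* Pull the table from the SMMU's proximity domain instead of
	 * the allocating CPU's local node. */
	p = alloc_pages_node(dev_to_node(dev), gfp | __GFP_ZERO,
			     get_order(size));
	return p ? page_address(p) : NULL;
}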
This function can be used on NUMA systems in place of alloc_pages_exact.
Add code to export it and to remove the __meminit section tagging.
Signed-off-by: Ganapatrao Kulkarni
---
include/linux/gfp.h | 2 +-
mm/page_alloc.c | 3 ++-
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
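Usage is then straightforward; a hypothetical node-local caller:

/* alloc_pages_exact_nid() behaves like alloc_pages_exact() but takes
 * an explicit node id; pair with free_pages_exact() to release. */
static void *table_alloc_on_node(int nid, size_t size)
{
	return alloc_pages_exact_nid(nid, size, GFP_KERNEL | __GFP_ZERO);
}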