On Tue, Jul 28, 2020 at 11:30:32AM +0200, Nicolas Saenz Julienne wrote:
> On Tue, 2020-07-28 at 11:13 +0200, Christoph Hellwig wrote:
> > On Mon, Jul 27, 2020 at 07:56:56PM +0200, Nicolas Saenz Julienne wrote:
> > > Hi Christoph,
> > > thanks for having a look at this!
> > > 
> > > On Fri, 2020-07-24 at 15:41 +0200, Christoph Hellwig wrote:
> > > > Yes, the iommu is an interesting case, and the current code is
> > > > wrong for that.
> > > 
> > > Care to expand on this? I do get that checking dma_coherent_ok() on memory
> > > that'll later on be mapped into an iommu is kind of silly, although I think
> > > harmless in Amit's specific case, since devices have wide enough dma-ranges.
> > > Is there more to it?
> > 
> > I think the problem is that it can lead to not finding suitable memory.
> > 
> > > > Can you try the patch below?  It contains a modified version of Nicolas'
> > > > patch to try CMA again for the expansion and a new (for now hackish) way
> > > > to not apply the addressability check for dma-iommu allocations.
> > > > 
> > > > diff --git a/kernel/dma/pool.c b/kernel/dma/pool.c
> > > > index 6bc74a2d51273e..ec5e525d2b9309 100644
> > > > --- a/kernel/dma/pool.c
> > > > +++ b/kernel/dma/pool.c
> > > > @@ -3,7 +3,9 @@
> > > >   * Copyright (C) 2012 ARM Ltd.
> > > >   * Copyright (C) 2020 Google LLC
> > > >   */
> > > > +#include <linux/cma.h>
> > > >  #include <linux/debugfs.h>
> > > > +#include <linux/dma-contiguous.h>
> > > >  #include <linux/dma-direct.h>
> > > >  #include <linux/dma-noncoherent.h>
> > > >  #include <linux/init.h>
> > > > @@ -55,6 +57,31 @@ static void dma_atomic_pool_size_add(gfp_t gfp, size_t size)
> > > >                 pool_size_kernel += size;
> > > >  }
> > > >  
> > > > +static bool cma_in_zone(gfp_t gfp)
> > > > +{
> > > > +       phys_addr_t end;
> > > > +       unsigned long size;
> > > > +       struct cma *cma;
> > > > +
> > > > +       cma = dev_get_cma_area(NULL);
> > > > +       if (!cma)
> > > > +               return false;
> > > > +
> > > > +       size = cma_get_size(cma);
> > > > +       if (!size)
> > > > +               return false;
> > > > +       end = cma_get_base(cma) - memblock_start_of_DRAM() + size - 1;
> > > > +
> > > > +       /* CMA can't cross zone boundaries, see cma_activate_area() */
> > > > +       if (IS_ENABLED(CONFIG_ZONE_DMA) && (gfp & GFP_DMA) &&
> > > > +           end <= DMA_BIT_MASK(zone_dma_bits))
> > > > +               return true;
> > > > +       if (IS_ENABLED(CONFIG_ZONE_DMA32) && (gfp & GFP_DMA32) &&
> > > > +           end <= DMA_BIT_MASK(32))
> > > > +               return true;
> > > > +       return true;
> > > 
> > > IIUC this will always return true when a CMA area is present, which reverts
> > > to the previous behaviour (previous as in breaking some rpi4 setups),
> > > doesn't it?
> > 
> > Was that really what broke the Pi?  I'll try to get the split-out series
> > today, which might have a few more tweaks, and then we'll need to test it
> > both on these rpi4 setups and Amit's phone.
> 
> There were two issues with RPi:
>  - Not validating that pool allocated memory was OK for the device
>  - Locating all atomic pools in CMA, which doesn't work for all RPi4 devices*,
>    and IMO misses the point of having multiple pools.
> 
> * With ACPI RPi4 we have CMA located in ZONE_DMA32, yet have an atomic pool
> consumer, PCIe, that only wants memory in the [0 3GB] area, effectively needing
> ZONE_DMA memory.

Ok, I found a slight bug that wasn't intended.  I wanted to make sure
we can always fall back to a lower pool, but got that wrong.  Should be
fixed in the next version.
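
For reference, here is a rough userspace sketch of the two ideas discussed
above: a zone check that does not fall through to "true" for the DMA and DMA32
cases, and a walk from the least restrictive pool down to a lower one until the
device can address the memory.  It is not the code from the next version.  The
names pool_zone, pool_model, bit_mask(), cma_in_zone_model() and pick_pool()
are made up for illustration, and the zone_dma_bits value of 30 used in the
usage note is an assumption about the RPi4 setup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the GFP_KERNEL / GFP_DMA32 / GFP_DMA pools. */
enum pool_zone { POOL_KERNEL, POOL_DMA32, POOL_DMA, POOL_NONE };

/* Hypothetical model of a pool: its zone and its highest backing address. */
struct pool_model {
	enum pool_zone zone;
	uint64_t last_addr;
};

/* Userspace equivalent of DMA_BIT_MASK(bits). */
static uint64_t bit_mask(unsigned int bits)
{
	return bits >= 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/*
 * Model of the zone check: cma_end is the last byte of the CMA area,
 * relative to the start of DRAM.  For the DMA and DMA32 pools the result
 * depends only on the comparison, so a CMA area that ends above the zone
 * limit is rejected instead of falling through to "true".
 */
static bool cma_in_zone_model(enum pool_zone zone, uint64_t cma_end,
			      unsigned int zone_dma_bits)
{
	if (zone == POOL_DMA)
		return cma_end <= bit_mask(zone_dma_bits);
	if (zone == POOL_DMA32)
		return cma_end <= bit_mask(32);
	return true;
}

/*
 * Walk the pools in the order given (least restrictive first) and return
 * the first one the device can fully address, or POOL_NONE if none fits.
 */
static enum pool_zone pick_pool(const struct pool_model *pools, size_t n,
				uint64_t dev_limit)
{
	assert(pools != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (pools[i].last_addr <= dev_limit)
			return pools[i].zone;
	return POOL_NONE;
}
```

With a CMA area ending just below 4GB and zone_dma_bits assumed to be 30,
cma_in_zone_model() accepts the area for the DMA32 pool and rejects it for the
DMA pool.  A device limited to [0 3GB] then skips the kernel and DMA32 pools in
pick_pool() and ends up in the DMA pool.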
