[Bug 208205] [amdgpu] System hang / freeze

2020-10-04 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=208205

Andrea Borgia (and...@borgia.bo.it) changed:

   What|Removed |Added

 CC||and...@borgia.bo.it

--- Comment #5 from Andrea Borgia (and...@borgia.bo.it) ---
Ryzen 3 2200G: I had multiple random freezes with the Debian kernel package
"linux-image-5.8.0-2-amd64". Now I am back to "linux-image-5.7.0-2-amd64", which
was working nicely.

Other notable packages:
firmware-amd-graphics 20200918-1
amd64-microcode   3.20191218.1



Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM

2020-10-04 Thread Daniel Vetter
On Sun, Oct 4, 2020 at 1:24 AM Jason Gunthorpe  wrote:
> On Sat, Oct 03, 2020 at 03:52:32PM -0700, John Hubbard wrote:
> > On 10/3/20 2:45 AM, Daniel Vetter wrote:
> > > On Sat, Oct 3, 2020 at 12:39 AM John Hubbard  wrote:
> > > >
> > > > On 10/2/20 10:53 AM, Daniel Vetter wrote:
> > > > > For $reasons I've stumbled over this code and I'm not sure the change
> > > > > to the new gup functions in 55a650c35fea ("mm/gup: frame_vector:
> > > > > convert get_user_pages() --> pin_user_pages()") was entirely correct.
> > > > >
> > > > > This here is used for long term buffers (not just quick I/O) like
> > > > > RDMA, and John notes this in his patch. But I thought the rule for
> > > > > these is that they need to add FOLL_LONGTERM, which John's patch
> > > > > didn't do.
> > > >
> > > > Yep. The earlier gup --> pup conversion patches were intended to not
> > > > have any noticeable behavior changes, and FOLL_LONGTERM, with its
> > > > special cases and such, added some risk that I wasn't ready to take
> > > > on yet. Also, FOLL_LONGTERM rules are only *recently* getting firmed
> > > > up. So there was some doubt at least in my mind, about which sites
> > > > should have it.
> > > >
> > > > But now that we're here, I think it's really good that you've brought
> > > > this up. It's definitely time to add FOLL_LONGTERM wherever it's 
> > > > missing.
> > >
> > > So should I keep this patch, or will it collide with a series you're 
> > > working on?
> >
> > It doesn't collide with anything on my end yet, because I've been slow to
> > pick up on the need for changing callsites to add FOLL_LONGTERM. :)
> >
> > And it looks like that's actually a problem, because:
> >
> > >
> > > Also with the firmed up rules, correct that I can also drop the
> > > vma_is_fsdax check when the FOLL_LONGTERM flag is set?
> >
> > That's the right direction to go *in general*, but I see that the
> > pin_user_pages code is still a bit stuck in the past. And this patch
> > won't actually work, with or without that vma_is_fsdax() check.
> > Because:
> >
> > get_vaddr_frames(FOLL_LONGTERM)
> >pin_user_pages_locked()
> >   if (WARN_ON_ONCE(gup_flags & FOLL_LONGTERM))
> >   return -EINVAL;
>
> There is no particular reason this code needs to have the mm sem at
> that point.
>
> It should call pin_user_pages_fast() and only if that fails get the mmap
> lock and extract the VMA to do broken hackery.

Yeah I think that works. I tried to understand the gup.c code a bit more,
and it looks like FOLL_LONGTERM only works for the pup_fast variant
right now? All others seem to have this comment that it's somehow
incompatible with FAULT_FLAG_ALLOW_RETRY and daxfs. But grepping
around for that didn't turn up anything, at least not near the dax code.
For my understanding of all this, what's the hiccup there?

For plans, I only started this for a bit of my own learning, but I
think I'll respin with the following changes:
- convert exynos and habanalabs to pin_user_pages_fast directly,
instead of going through this frame-vector detour
- move the locking and convert get_vaddr_frames to pup_fast as Jason suggested (roughly sketched below)
- hack up some truly gross rfc to plug the follow_pfn hole
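
A rough sketch of the get_vaddr_frames() rework Jason suggests above
(simplified, assumed prototype; the frame_vector bookkeeping, gup flag
selection and the existing follow_pfn() walk are elided, and this is not
the actual patch):

#include <linux/mm.h>
#include <linux/sched.h>

int get_vaddr_frames_sketch(unsigned long start, unsigned int nr_frames,
			    struct page **pages)
{
	struct mm_struct *mm = current->mm;
	struct vm_area_struct *vma;
	int ret;

	/* Long-term pin via the fast path, no mmap lock needed here. */
	ret = pin_user_pages_fast(start, nr_frames,
				  FOLL_WRITE | FOLL_LONGTERM, pages);
	if (ret > 0)
		return ret;

	/* Only on failure take the mmap lock and look at the VMA. */
	mmap_read_lock(mm);
	vma = find_vma(mm, start);
	if (vma && (vma->vm_flags & (VM_IO | VM_PFNMAP))) {
		/* legacy follow_pfn() walk for pfnmap VMAs goes here */
		ret = -EINVAL;
	} else {
		ret = -EFAULT;
	}
	mmap_read_unlock(mm);
	return ret;
}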

Cheers, Daniel

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH rdma-next v5 1/4] lib/scatterlist: Add support in dynamic allocation of SG table from pages

2020-10-04 Thread Leon Romanovsky
From: Maor Gottlieb 

Extend __sg_alloc_table_from_pages to support dynamic allocation of
SG table from pages. It should be used by drivers that can't supply
all the pages at one time.

This function returns the last populated SGE in the table. Users should
pass it as an argument to the function on the second and subsequent calls.
As before, nents will be equal to the number of populated SGEs (chunks).

With this new extension, drivers can benefit from the optimization of
merging contiguous pages without needing to allocate all the pages in
advance and hold them in a large buffer.

E.g. the InfiniBand driver can allocate a single page to hold the page
pointers: for a 1TB memory registration, the temporary buffer would then
consume only 4KB instead of 2GB.
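
A minimal sketch of the intended call pattern (the extra parameter names
and exact semantics are inferred from the call sites in this patch, not
quoted from it; total_pages, chunk_pages, pages[] and max_segment stand
in for the caller's own state):

	struct sg_table sgt = {};
	struct scatterlist *prv = NULL;	/* last SGE returned by the previous call */
	unsigned long left = total_pages;	/* pages still to be added */

	while (left) {
		unsigned long n = min(left, chunk_pages);

		/* pages[] only needs to hold the current chunk */
		prv = __sg_alloc_table_from_pages(&sgt, pages, n, 0,
						  n << PAGE_SHIFT, max_segment,
						  prv, left - n, GFP_KERNEL);
		if (IS_ERR(prv))
			return PTR_ERR(prv);	/* unwind/cleanup omitted */
		left -= n;
	}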

Signed-off-by: Maor Gottlieb 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Leon Romanovsky 
---
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c |  12 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c  |  15 ++-
 include/linux/scatterlist.h |  38 +++---
 lib/scatterlist.c   | 125 
 tools/testing/scatterlist/main.c|   9 +-
 5 files changed, 142 insertions(+), 57 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c 
b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 12b30075134a..f2eaed6aca3d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -403,6 +403,7 @@ __i915_gem_userptr_alloc_pages(struct drm_i915_gem_object 
*obj,
unsigned int max_segment = i915_sg_segment_size();
struct sg_table *st;
unsigned int sg_page_sizes;
+   struct scatterlist *sg;
int ret;

st = kmalloc(sizeof(*st), GFP_KERNEL);
@@ -410,13 +411,12 @@ __i915_gem_userptr_alloc_pages(struct drm_i915_gem_object 
*obj,
return ERR_PTR(-ENOMEM);

 alloc_table:
-   ret = __sg_alloc_table_from_pages(st, pvec, num_pages,
- 0, num_pages << PAGE_SHIFT,
- max_segment,
- GFP_KERNEL);
-   if (ret) {
+   sg = __sg_alloc_table_from_pages(st, pvec, num_pages, 0,
+num_pages << PAGE_SHIFT, max_segment,
+NULL, 0, GFP_KERNEL);
+   if (IS_ERR(sg)) {
kfree(st);
-   return ERR_PTR(ret);
+   return ERR_CAST(sg);
}

ret = i915_gem_gtt_prepare_pages(obj, st);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c
index ab524ab3b0b4..f22acd398b1f 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c
@@ -419,6 +419,7 @@ static int vmw_ttm_map_dma(struct vmw_ttm_tt *vmw_tt)
int ret = 0;
static size_t sgl_size;
static size_t sgt_size;
+   struct scatterlist *sg;

if (vmw_tt->mapped)
return 0;
@@ -441,13 +442,15 @@ static int vmw_ttm_map_dma(struct vmw_ttm_tt *vmw_tt)
if (unlikely(ret != 0))
return ret;

-   ret = __sg_alloc_table_from_pages
-   (&vmw_tt->sgt, vsgt->pages, vsgt->num_pages, 0,
-(unsigned long) vsgt->num_pages << PAGE_SHIFT,
-dma_get_max_seg_size(dev_priv->dev->dev),
-GFP_KERNEL);
-   if (unlikely(ret != 0))
+   sg = __sg_alloc_table_from_pages(&vmw_tt->sgt, vsgt->pages,
+   vsgt->num_pages, 0,
+   (unsigned long) vsgt->num_pages << PAGE_SHIFT,
+   dma_get_max_seg_size(dev_priv->dev->dev),
+   NULL, 0, GFP_KERNEL);
+   if (IS_ERR(sg)) {
+   ret = PTR_ERR(sg);
goto out_sg_alloc_fail;
+   }

if (vsgt->num_pages > vmw_tt->sgt.nents) {
uint64_t over_alloc =
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 45cf7b69d852..36c47e7e66a2 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -165,6 +165,22 @@ static inline void sg_set_buf(struct scatterlist *sg, 
const void *buf,
 #define for_each_sgtable_dma_sg(sgt, sg, i)\
for_each_sg((sgt)->sgl, sg, (sgt)->nents, i)

+static inline void __sg_chain(struct scatterlist *chain_sg,
+ struct scatterlist *sgl)
+{
+   /*
+* offset and length are unused for chain entry. Clear them.
+*/
+   chain_sg->offset = 0;
+   chain_sg->length = 0;
+
+   /*
+* Set lowest bit to indicate a link pointer, and make sure to clear
+* the termination bit if it happens to be set.
+*/
+   chain_sg->page_link = ((unsigned long) sgl | SG_CHAIN) & ~SG_END;
+}
+
 /**
  *

[PATCH rdma-next v5 4/4] RDMA/umem: Move to allocate SG table from pages

2020-10-04 Thread Leon Romanovsky
From: Maor Gottlieb 

Remove the implementation of ib_umem_add_sg_table and instead
call to __sg_alloc_table_from_pages which already has the logic to
merge contiguous pages.

Besides removing duplicated functionality, this reduces the
memory consumption of the SG table significantly. Prior to this
patch, the SG table was allocated in advance regardless of whether
the pages were contiguous.

On a system using 2MB huge pages, without this change the SG table
would contain 512x more SG entries than necessary.
E.g. for a 100GB memory registration:

        Number of entries      Size
Before  26214400               600.0 MB
After   51200                  1.2 MB
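
(For reference, these numbers are consistent with 4KB base pages and
roughly 24 bytes per SG entry; the per-entry size is an assumption, not
stated in the patch:

	100GB / 4KB = 26214400 entries, 26214400 * 24B  = 600MB
	100GB / 2MB = 51200 entries,    51200    * 24B ~= 1.2MB)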

Signed-off-by: Maor Gottlieb 
Signed-off-by: Leon Romanovsky 
---
 drivers/infiniband/core/umem.c | 94 +-
 1 file changed, 12 insertions(+), 82 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index c1ab6a4f2bc3..e9fecbdf391b 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -61,73 +61,6 @@ static void __ib_umem_release(struct ib_device *dev, struct 
ib_umem *umem, int d
sg_free_table(&umem->sg_head);
 }

-/* ib_umem_add_sg_table - Add N contiguous pages to scatter table
- *
- * sg: current scatterlist entry
- * page_list: array of npage struct page pointers
- * npages: number of pages in page_list
- * max_seg_sz: maximum segment size in bytes
- * nents: [out] number of entries in the scatterlist
- *
- * Return new end of scatterlist
- */
-static struct scatterlist *ib_umem_add_sg_table(struct scatterlist *sg,
-   struct page **page_list,
-   unsigned long npages,
-   unsigned int max_seg_sz,
-   int *nents)
-{
-   unsigned long first_pfn;
-   unsigned long i = 0;
-   bool update_cur_sg = false;
-   bool first = !sg_page(sg);
-
-   /* Check if new page_list is contiguous with end of previous page_list.
-* sg->length here is a multiple of PAGE_SIZE and sg->offset is 0.
-*/
-   if (!first && (page_to_pfn(sg_page(sg)) + (sg->length >> PAGE_SHIFT) ==
-  page_to_pfn(page_list[0])))
-   update_cur_sg = true;
-
-   while (i != npages) {
-   unsigned long len;
-   struct page *first_page = page_list[i];
-
-   first_pfn = page_to_pfn(first_page);
-
-   /* Compute the number of contiguous pages we have starting
-* at i
-*/
-   for (len = 0; i != npages &&
- first_pfn + len == page_to_pfn(page_list[i]) &&
- len < (max_seg_sz >> PAGE_SHIFT);
-len++)
-   i++;
-
-   /* Squash N contiguous pages from page_list into current sge */
-   if (update_cur_sg) {
-   if ((max_seg_sz - sg->length) >= (len << PAGE_SHIFT)) {
-   sg_set_page(sg, sg_page(sg),
-   sg->length + (len << PAGE_SHIFT),
-   0);
-   update_cur_sg = false;
-   continue;
-   }
-   update_cur_sg = false;
-   }
-
-   /* Squash N contiguous pages into next sge or first sge */
-   if (!first)
-   sg = sg_next(sg);
-
-   (*nents)++;
-   sg_set_page(sg, first_page, len << PAGE_SHIFT, 0);
-   first = false;
-   }
-
-   return sg;
-}
-
 /**
  * ib_umem_find_best_pgsz - Find best HW page size to use for this MR
  *
@@ -217,7 +150,7 @@ struct ib_umem *ib_umem_get(struct ib_device *device, 
unsigned long addr,
struct mm_struct *mm;
unsigned long npages;
int ret;
-   struct scatterlist *sg;
+   struct scatterlist *sg = NULL;
unsigned int gup_flags = FOLL_WRITE;

/*
@@ -272,15 +205,9 @@ struct ib_umem *ib_umem_get(struct ib_device *device, 
unsigned long addr,

cur_base = addr & PAGE_MASK;

-   ret = sg_alloc_table(&umem->sg_head, npages, GFP_KERNEL);
-   if (ret)
-   goto vma;
-
if (!umem->writable)
gup_flags |= FOLL_FORCE;

-   sg = umem->sg_head.sgl;
-
while (npages) {
cond_resched();
ret = pin_user_pages_fast(cur_base,
@@ -292,15 +219,19 @@ struct ib_umem *ib_umem_get(struct ib_device *device, 
unsigned long addr,
goto umem_release;

cur_base += ret * PAGE_SIZE;
-   npages   -= ret;
-
-   sg = ib_umem_add_sg_table(sg, page_list, ret,
-   dma_get_max_seg_size(device->dma_device),
-   &umem->sg_nent

[PATCH rdma-next v5 0/4] Dynamicaly allocate SG table from the pages

2020-10-04 Thread Leon Romanovsky
From: Leon Romanovsky 

Changelog:
v5:
 * Use sg_init_table to allocate table and avoid changes in __sg_alloc_table
 * Fix offset issue
v4: https://lore.kernel.org/lkml/20200927064647.3106737-1-l...@kernel.org
 * Fixed formatting in first patch.
 * Added fix (clear tmp_netnts) in first patch to fix i915 failure.
 * Added test patches
v3: https://lore.kernel.org/linux-rdma/20200922083958.2150803-1-l...@kernel.org/
 * Per Christopher's suggestion, avoided introducing a new API and extended the
existing one instead.
v2: https://lore.kernel.org/linux-rdma/20200916140726.839377-1-l...@kernel.org
 * Fixed indentations and comments
 * Deleted sg_alloc_next()
 * Squashed lib/scatterlist patches into one
v1: https://lore.kernel.org/lkml/20200910134259.1304543-1-l...@kernel.org
 * Changed _sg_chain to be __sg_chain
 * Added dependency on ARCH_NO_SG_CHAIN
 * Removed struct sg_append
v0:
 * https://lore.kernel.org/lkml/20200903121853.1145976-1-l...@kernel.org

--
>From Maor:

This series extends __sg_alloc_table_from_pages to allow chaining of
new pages to an already initialized SG table.

This allows drivers to utilize the optimization of merging contiguous
pages without needing to pre-allocate all the pages and hold them in
a very large temporary buffer prior to the call to SG table initialization.

The second patch changes the InfiniBand driver to use the new API. It
removes duplicate functionality from the code and benefits from the
optimization of allocating a dynamic SG table from pages.

On a system using 2MB huge pages, without this change the SG table
would contain 512x more SG entries than necessary.
E.g. for a 100GB memory registration:

        Number of entries      Size
Before  26214400               600.0 MB
After   51200                  1.2 MB

Thanks

Maor Gottlieb (2):
  lib/scatterlist: Add support in dynamic allocation of SG table from
pages
  RDMA/umem: Move to allocate SG table from pages

Tvrtko Ursulin (2):
  tools/testing/scatterlist: Rejuvenate bit-rotten test
  tools/testing/scatterlist: Show errors in human readable form

 drivers/gpu/drm/i915/gem/i915_gem_userptr.c |  12 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c  |  15 ++-
 drivers/infiniband/core/umem.c  |  94 ++-
 include/linux/scatterlist.h |  38 +++---
 lib/scatterlist.c   | 125 
 tools/testing/scatterlist/Makefile  |   3 +-
 tools/testing/scatterlist/linux/mm.h|  35 ++
 tools/testing/scatterlist/main.c|  53 ++---
 8 files changed, 225 insertions(+), 150 deletions(-)

--
2.26.2



[PATCH rdma-next v5 3/4] tools/testing/scatterlist: Show errors in human readable form

2020-10-04 Thread Leon Romanovsky
From: Tvrtko Ursulin 

Instead of just asserting, dump some more useful info about what the test
saw versus what it expected to see.

Signed-off-by: Tvrtko Ursulin 
Cc: Maor Gottlieb 
Signed-off-by: Leon Romanovsky 
---
 tools/testing/scatterlist/main.c | 44 
 1 file changed, 34 insertions(+), 10 deletions(-)

diff --git a/tools/testing/scatterlist/main.c b/tools/testing/scatterlist/main.c
index 4899359a31ac..b2c7e9f7b8d3 100644
--- a/tools/testing/scatterlist/main.c
+++ b/tools/testing/scatterlist/main.c
@@ -5,6 +5,15 @@

 #define MAX_PAGES (64)

+struct test {
+   int alloc_ret;
+   unsigned num_pages;
+   unsigned *pfn;
+   unsigned size;
+   unsigned int max_seg;
+   unsigned int expected_segments;
+};
+
 static void set_pages(struct page **pages, const unsigned *array, unsigned num)
 {
unsigned int i;
@@ -17,17 +26,32 @@ static void set_pages(struct page **pages, const unsigned 
*array, unsigned num)

 #define pfn(...) (unsigned []){ __VA_ARGS__ }

+static void fail(struct test *test, struct sg_table *st, const char *cond)
+{
+   unsigned int i;
+
+   fprintf(stderr, "Failed on '%s'!\n\n", cond);
+
+   printf("size = %u, max segment = %u, expected nents = %u\nst->nents = 
%u, st->orig_nents= %u\n",
+  test->size, test->max_seg, test->expected_segments, st->nents,
+  st->orig_nents);
+
+   printf("%u input PFNs:", test->num_pages);
+   for (i = 0; i < test->num_pages; i++)
+   printf(" %x", test->pfn[i]);
+   printf("\n");
+
+   exit(1);
+}
+
+#define VALIDATE(cond, st, test) \
+   if (!(cond)) \
+   fail((test), (st), #cond);
+
 int main(void)
 {
const unsigned int sgmax = SCATTERLIST_MAX_SEGMENT;
-   struct test {
-   int alloc_ret;
-   unsigned num_pages;
-   unsigned *pfn;
-   unsigned size;
-   unsigned int max_seg;
-   unsigned int expected_segments;
-   } *test, tests[] = {
+   struct test *test, tests[] = {
{ -EINVAL, 1, pfn(0), PAGE_SIZE, PAGE_SIZE + 1, 1 },
{ -EINVAL, 1, pfn(0), PAGE_SIZE, 0, 1 },
{ -EINVAL, 1, pfn(0), PAGE_SIZE, sgmax + 1, 1 },
@@ -66,8 +90,8 @@ int main(void)
if (test->alloc_ret)
continue;

-   assert(st.nents == test->expected_segments);
-   assert(st.orig_nents == test->expected_segments);
+   VALIDATE(st.nents == test->expected_segments, &st, test);
+   VALIDATE(st.orig_nents == test->expected_segments, &st, test);

sg_free_table(&st);
}
--
2.26.2



[PATCH rdma-next v5 2/4] tools/testing/scatterlist: Rejuvenate bit-rotten test

2020-10-04 Thread Leon Romanovsky
From: Tvrtko Ursulin 

A couple small tweaks are needed to make the test build and run
on current kernels.

Signed-off-by: Tvrtko Ursulin 
Cc: Maor Gottlieb 
Signed-off-by: Leon Romanovsky 
---
 tools/testing/scatterlist/Makefile   |  3 ++-
 tools/testing/scatterlist/linux/mm.h | 35 
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/tools/testing/scatterlist/Makefile 
b/tools/testing/scatterlist/Makefile
index cbb003d9305e..c65233876622 100644
--- a/tools/testing/scatterlist/Makefile
+++ b/tools/testing/scatterlist/Makefile
@@ -14,7 +14,7 @@ targets: include $(TARGETS)
 main: $(OFILES)

 clean:
-   $(RM) $(TARGETS) $(OFILES) scatterlist.c linux/scatterlist.h 
linux/highmem.h linux/kmemleak.h asm/io.h
+   $(RM) $(TARGETS) $(OFILES) scatterlist.c linux/scatterlist.h 
linux/highmem.h linux/kmemleak.h linux/slab.h asm/io.h
@rmdir asm

 scatterlist.c: ../../../lib/scatterlist.c
@@ -28,4 +28,5 @@ include: ../../../include/linux/scatterlist.h
@touch asm/io.h
@touch linux/highmem.h
@touch linux/kmemleak.h
+   @touch linux/slab.h
@cp $< linux/scatterlist.h
diff --git a/tools/testing/scatterlist/linux/mm.h 
b/tools/testing/scatterlist/linux/mm.h
index 6f9ac14aa800..6ae907f375d2 100644
--- a/tools/testing/scatterlist/linux/mm.h
+++ b/tools/testing/scatterlist/linux/mm.h
@@ -114,6 +114,12 @@ static inline void *kmalloc(unsigned int size, unsigned 
int flags)
return malloc(size);
 }

+static inline void *
+kmalloc_array(unsigned int n, unsigned int size, unsigned int flags)
+{
+   return malloc(n * size);
+}
+
 #define kfree(x) free(x)

 #define kmemleak_alloc(a, b, c, d)
@@ -122,4 +128,33 @@ static inline void *kmalloc(unsigned int size, unsigned 
int flags)
 #define PageSlab(p) (0)
 #define flush_kernel_dcache_page(p)

+#define MAX_ERRNO  4095
+
+#define IS_ERR_VALUE(x) unlikely((unsigned long)(void *)(x) >= (unsigned 
long)-MAX_ERRNO)
+
+static inline void * __must_check ERR_PTR(long error)
+{
+   return (void *) error;
+}
+
+static inline long __must_check PTR_ERR(__force const void *ptr)
+{
+   return (long) ptr;
+}
+
+static inline bool __must_check IS_ERR(__force const void *ptr)
+{
+   return IS_ERR_VALUE((unsigned long)ptr);
+}
+
+static inline int __must_check PTR_ERR_OR_ZERO(__force const void *ptr)
+{
+   if (IS_ERR(ptr))
+   return PTR_ERR(ptr);
+   else
+   return 0;
+}
+
+#define IS_ENABLED(x) (0)
+
 #endif
--
2.26.2



Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM

2020-10-04 Thread Daniel Vetter
On Sun, Oct 4, 2020 at 2:51 PM Jason Gunthorpe  wrote:
>
> On Sat, Oct 03, 2020 at 11:40:22AM +0200, Daniel Vetter wrote:
>
> > > That leaves the only interesting places as vb2_dc_get_userptr() and
> > > vb2_vmalloc_get_userptr() which both completely fail to follow the
> > > REQUIRED behavior in the function's comment about checking PTEs. It
> > > just DMA maps them. Badly broken.
> > >
> > > Guessing this hackery is for some embedded P2P DMA transfer?
> >
> > Yeah, see also the follow_pfn trickery in
> > videobuf_dma_contig_user_get(), I think this is fully intentional and
> > userspace abi we can't break :-/
>
> We don't need to break uABI, it just needs to work properly in the
> kernel:
>
>   vma = find_vma_intersection()
>   dma_buf = dma_buf_get_from_vma(vma)
>   sg = dma_buf_p2p_dma_map(dma_buf)
>   [.. do dma ..]
>   dma_buf_unmap(sg)
>   dma_buf_put(dma_buf)
>
> It is as we discussed before, dma buf needs to be discoverable from a
> VMA, at least for users doing this kind of stuff.

I'm not a big fan of magic behaviour like this, there's more to
dma-buf buffer sharing than just "how do I get at the backing
storage". Thus far we've done everything rather explicitly. Plus with
exynos and habanalabs converted there's only v4l left over, and that
has a proper dma-buf import path already.

> > Yup this should be done with dma_buf instead, and v4l has that. But
> > old uapi and all that. This is why I said we might need a new
> > VM_DYNAMIC_PFNMAP or so, to make follow_pfn not resolve this in the
> > case where the driver manages the underlying iomem range (or whatever
> > it is) dynamically and moves buffer objects around, like drm drivers
> > do. But I looked, and we've run out of vma->vm_flags :-(
>
> A VM flag doesn't help - we need to introduce some kind of lifetime,
> and that has to be derived from the VMA. It needs data not just a flag

I don't want to make it work, I just want to make it fail. Rough idea
I have in mind is to add a follow_pfn_longterm, for all callers which
aren't either synchronized through mmap_sem or an mmu_notifier. If
this really breaks anyone's use-case we can add a tainting kernel
option which re-enables this (we've done something similar for
phys_addr_t based buffer sharing in fbdev, entirely unfixable since
the other driver has to just blindly trust that what userspace passes
around is legit). This here isn't unfixable, but if v4l people want to
keep it without a big "security hole here" sticker, they should do the
work, not me :-)

> > The other problem is that I also have no real working clue about all
> > the VM_* flags and what they all mean, and whether drm drivers set the
> > right ones in all cases (they probably don't, but oh well).
> > Documentation for this stuff in headers is a bit thin at times.
>
> Yah, I don't really know either :\
>
> The comment above vm_normal_page() is a bit helpful. Don't know what
> VM_IO/VM_PFNMAP mean in their 3 combinations
>
> There are very few places that set VM_PFNMAP without VM_IO..

Best I could find is:
- m68k seems to outright reject page faults on a VM_IO vma
- some places set VM_IO together with VM_MIXEDMAP instead of
VM_PFNMAP. There are some comments that this makes CoW possible, but I
guess that's for the old pfn remap vma (remap_file_pages, which is
removed now). But really no clue.

VM_IO | VM_MIXEDMAP kinda makes me wonder whether follow_pfn gets the
page refcounting all right or horribly wrong in some cases ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] dt-bindings: Another round of adding missing 'additionalProperties'

2020-10-04 Thread Vinod Koul
On 02-10-20, 18:41, Rob Herring wrote:

>  .../phy/amlogic,meson-g12a-usb2-phy.yaml  |  2 ++
>  .../bindings/phy/qcom,ipq806x-usb-phy-hs.yaml |  2 ++
>  .../bindings/phy/qcom,ipq806x-usb-phy-ss.yaml |  2 ++
>  .../bindings/phy/qcom,qusb2-phy.yaml  |  1 +
>  .../bindings/phy/qcom-usb-ipq4019-phy.yaml|  2 ++

For phy changes:

Acked-By: Vinod Koul 

-- 
~Vinod


[RFC PATCH v3 1/4] RDMA/umem: Support importing dma-buf as user memory region

2020-10-04 Thread Jianxin Xiong
Dma-buf is a standard cross-driver buffer sharing mechanism that can be
used to support peer-to-peer access from RDMA devices.

Device memory exported via dma-buf is associated with a file descriptor.
This is passed to the user space as a property associated with the
buffer allocation. When the buffer is registered as a memory region,
the file descriptor is passed to the RDMA driver along with other
parameters.

Implement the common code for importing dma-buf object and mapping
dma-buf pages.
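
The importer-side flow implemented here roughly follows the standard
dynamic dma-buf sequence. A simplified sketch, with error handling
omitted and the exact call sites left to the patch below (ib_dev,
umem_dmabuf and ib_umem_dmabuf_attach_ops refer to objects set up in
this patch):

	struct dma_buf *dmabuf = dma_buf_get(fd);	/* fd passed in from user space */
	struct dma_buf_attachment *attach;
	struct sg_table *sgt;

	attach = dma_buf_dynamic_attach(dmabuf, ib_dev->dma_device,
					&ib_umem_dmabuf_attach_ops, umem_dmabuf);

	dma_resv_lock(dmabuf->resv, NULL);
	sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
	dma_resv_unlock(dmabuf->resv);

	/* ... use sgt; move_notify invalidates it and the next page fault remaps ... */

	dma_resv_lock(dmabuf->resv, NULL);
	dma_buf_unmap_attachment(attach, sgt, DMA_BIDIRECTIONAL);
	dma_resv_unlock(dmabuf->resv);

	dma_buf_detach(dmabuf, attach);
	dma_buf_put(dmabuf);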

Signed-off-by: Jianxin Xiong 
Reviewed-by: Sean Hefty 
Acked-by: Michael J. Ruhl 
---
 drivers/infiniband/core/Makefile  |   2 +-
 drivers/infiniband/core/umem.c|   4 +
 drivers/infiniband/core/umem_dmabuf.c | 291 ++
 drivers/infiniband/core/umem_dmabuf.h |  14 ++
 drivers/infiniband/core/umem_odp.c|  12 ++
 include/rdma/ib_umem.h|  19 ++-
 6 files changed, 340 insertions(+), 2 deletions(-)
 create mode 100644 drivers/infiniband/core/umem_dmabuf.c
 create mode 100644 drivers/infiniband/core/umem_dmabuf.h

diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index 24cb71a..b8d51a7 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -40,5 +40,5 @@ ib_uverbs-y :=uverbs_main.o 
uverbs_cmd.o uverbs_marshall.o \
uverbs_std_types_srq.o \
uverbs_std_types_wq.o \
uverbs_std_types_qp.o
-ib_uverbs-$(CONFIG_INFINIBAND_USER_MEM) += umem.o
+ib_uverbs-$(CONFIG_INFINIBAND_USER_MEM) += umem.o umem_dmabuf.o
 ib_uverbs-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += umem_odp.o
diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 831bff8..59ec36c 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -2,6 +2,7 @@
  * Copyright (c) 2005 Topspin Communications.  All rights reserved.
  * Copyright (c) 2005 Cisco Systems.  All rights reserved.
  * Copyright (c) 2005 Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2020 Intel Corporation. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -42,6 +43,7 @@
 #include 
 
 #include "uverbs.h"
+#include "umem_dmabuf.h"
 
 static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int 
dirty)
 {
@@ -318,6 +320,8 @@ void ib_umem_release(struct ib_umem *umem)
 {
if (!umem)
return;
+   if (umem->is_dmabuf)
+   return ib_umem_dmabuf_release(umem);
if (umem->is_odp)
return ib_umem_odp_release(to_ib_umem_odp(umem));
 
diff --git a/drivers/infiniband/core/umem_dmabuf.c 
b/drivers/infiniband/core/umem_dmabuf.c
new file mode 100644
index 000..10ed646
--- /dev/null
+++ b/drivers/infiniband/core/umem_dmabuf.c
@@ -0,0 +1,291 @@
+// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
+/*
+ * Copyright (c) 2020 Intel Corporation. All rights reserved.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include "uverbs.h"
+
+struct ib_umem_dmabuf {
+   struct ib_umem_odp umem_odp;
+   struct dma_buf_attachment *attach;
+   struct sg_table *sgt;
+   atomic_t notifier_seq;
+};
+
+static inline struct ib_umem_dmabuf *to_ib_umem_dmabuf(struct ib_umem *umem)
+{
+   struct ib_umem_odp *umem_odp = to_ib_umem_odp(umem);
+   return container_of(umem_odp, struct ib_umem_dmabuf, umem_odp);
+}
+
+static void ib_umem_dmabuf_invalidate_cb(struct dma_buf_attachment *attach)
+{
+   struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv;
+   struct ib_umem_odp *umem_odp = &umem_dmabuf->umem_odp;
+   struct mmu_notifier_range range = {};
+   unsigned long current_seq;
+
+   /* no concurrent invalidation due to the dma_resv lock */
+
+   dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv);
+
+   if (!umem_dmabuf->sgt)
+   return;
+
+   range.start = ib_umem_start(umem_odp);
+   range.end = ib_umem_end(umem_odp);
+   range.flags = MMU_NOTIFIER_RANGE_BLOCKABLE;
+   current_seq = atomic_read(&umem_dmabuf->notifier_seq);
+   umem_odp->notifier.ops->invalidate(&umem_odp->notifier, &range,
+  current_seq);
+
+   atomic_inc(&umem_dmabuf->notifier_seq);
+}
+
+static struct dma_buf_attach_ops ib_umem_dmabuf_attach_ops = {
+   .allow_peer2peer = 1,
+   .move_notify = ib_umem_dmabuf_invalidate_cb,
+};
+
+static inline int ib_umem_dmabuf_init_odp(struct ib_umem_odp *umem_odp)
+{
+   size_t page_size = 1UL << umem_odp->page_shift;
+   unsigned long start;
+   unsigned long end;
+   size_t pages;
+
+   umem_odp->umem.is_odp = 1;
+   mutex_init(&umem_odp->umem_mutex);
+
+   start = ALIGN_DOWN(umem_odp->umem.address, page_size);
+   if (check_add_overflow(umem_odp->umem.a

[RFC PATCH v3 0/4] RDMA: Add dma-buf support

2020-10-04 Thread Jianxin Xiong
When enabled, an RDMA capable NIC can perform peer-to-peer transactions
over PCIe to access the local memory located on another device. This can
often lead to better performance than using a system memory buffer for
RDMA and copying data between the buffer and device memory.

The current kernel RDMA stack uses get_user_pages() to pin the physical
pages backing the user buffer and uses dma_map_sg_attrs() to get the
dma addresses for memory access. This usually doesn't work for peer
device memory due to the lack of associated page structures.
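
A minimal sketch of that conventional flow (simplified; unpinning and
error handling omitted, and addr, n, pages, sgt and dev stand in for the
caller's own state), which only works when the buffer is backed by
struct pages:

	/* pin the user pages backing the buffer (pin_user_pages_fast() in current code) */
	npages = pin_user_pages_fast(addr, n, FOLL_WRITE | FOLL_LONGTERM, pages);

	/* build an SG table and get DMA addresses for the device */
	sg_alloc_table_from_pages(&sgt, pages, npages, 0,
				  (unsigned long)npages << PAGE_SHIFT, GFP_KERNEL);
	dma_map_sg_attrs(dev, sgt.sgl, sgt.orig_nents, DMA_BIDIRECTIONAL, 0);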

Several mechanisms exist today to facilitate device memory access.

ZONE_DEVICE is a new zone for device memory in the memory management
subsystem. It allows pages from device memory to be described with
specialized page structures, but what can be done with these page
structures may differ from system memory. ZONE_DEVICE is further
specialized into multiple memory types, such as one type for PCI
p2pmem/p2pdma and one type for HMM.

PCI p2pmem/p2pdma uses ZONE_DEVICE to represent device memory residing
in a PCI BAR and provides a set of calls to publish, discover, allocate,
and map such memory for peer-to-peer transactions. One feature of the
API is that the buffer is allocated by the side that does the DMA
transfer. This works well with the storage usage case, but is awkward
with GPU-NIC communication, where typically the buffer is allocated by
the GPU driver rather than the NIC driver.

Heterogeneous Memory Management (HMM) utilizes mmu_interval_notifier
and ZONE_DEVICE to support shared virtual address space and page
migration between system memory and device memory. HMM doesn't support
pinning device memory because pages located on the device must be able to
migrate to system memory when accessed by the CPU. Peer-to-peer access
is currently not supported by HMM.

Dma-buf is a standard mechanism for sharing buffers among different
device drivers. The buffer to be shared is exported by the owning
driver and imported by the driver that wants to use it. The exporter
provides a set of ops that the importer can call to pin and map the
buffer. In addition, a file descriptor can be associated with a dma-
buf object as the handle that can be passed to user space.

This patch series adds the dma-buf importer role to the RDMA driver in an
attempt to support RDMA using device memory such as GPU VRAM. Dma-buf is
chosen for a few reasons: first, the API is relatively simple and allows
a lot of flexibility in implementing the buffer manipulation ops.
Second, it doesn't require page structures. Third, dma-buf is already
supported in many GPU drivers. However, we are aware that existing GPU
drivers don't allow pinning device memory via the dma-buf interface.
Pinning would simply cause the backing storage to migrate to system RAM.
True peer-to-peer access is only possible using dynamic attach, which
requires on-demand paging support from the NIC to work. For this reason,
this series only works with ODP capable NICs.

This is the third version of the patch series. Here are the changes
from the previous version:
* Use dma_buf_dynamic_attach() instead of dma_buf_attach()
* Use on-demand paging mechanism to avoid pinning the GPU memory
* Instead of adding a new parameter to the device method for memory
registration, pass all the attributes including the file descriptor
as a structure.
* Define a new access flag for dma-buf based memory region.
* Check for on-demand paging support in the new uverbs command

This series consists of four patches. The first patch adds the common
code for importing dma-buf from a file descriptor and mapping the
dma-buf pages. Patch 2 changes the signature of driver method for user
space memory registration to accept a structure of attributes and adds
dma-buf file descriptor to the structure. Vendor drivers are updated
accordingly. Patch 3 adds dma-buf support to the mlx5 driver. Patch 4
adds a new uverbs command for registering dma-buf based memory region.

Related user space RDMA library changes will be provided as a separate
patch series.

Jianxin Xiong (4):
  RDMA/umem: Support importing dma-buf as user memory region
  RDMA: Expand driver memory registration methods to support dma-buf
  RDMA/mlx5: Support dma-buf based userspace memory region
  RDMA/uverbs: Add uverbs command for dma-buf based MR registration

 drivers/infiniband/core/Makefile|   2 +-
 drivers/infiniband/core/umem.c  |   4 +
 drivers/infiniband/core/umem_dmabuf.c   | 291 
 drivers/infiniband/core/umem_dmabuf.h   |  14 ++
 drivers/infiniband/core/umem_odp.c  |  12 +
 drivers/infiniband/core/uverbs_cmd.c|  25 +-
 drivers/infiniband/core/uverbs_std_types_mr.c   | 115 ++
 drivers/infiniband/core/verbs.c |  15 +-
 drivers/infiniband/hw/bnxt_re/ib_verbs.c|  23 +-
 drivers/infiniband/hw/bnxt_re/ib_verbs.h|   4 +-
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h  |   6 +-
 driv

[RFC PATCH v3 3/4] RDMA/mlx5: Support dma-buf based userspace memory region

2020-10-04 Thread Jianxin Xiong
Recognize the new access flag and call the core function to import
dma-buf based memory region.

Since the invalidation callback is called from the dma-buf driver
instead of mmu_interval_notifier, the part inside the ODP pagefault
handler that checks for on-going invalidation is modified to handle
the difference.

Signed-off-by: Jianxin Xiong 
Reviewed-by: Sean Hefty 
Acked-by: Michael J. Ruhl 
---
 drivers/infiniband/hw/mlx5/mr.c  | 50 +---
 drivers/infiniband/hw/mlx5/odp.c | 22 --
 2 files changed, 67 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 3c91e32..d58be20 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -912,6 +912,44 @@ static int mr_umem_get(struct mlx5_ib_dev *dev, u64 start, 
u64 length,
return 0;
 }
 
+static int mr_umem_dmabuf_get(struct mlx5_ib_dev *dev, u64 start, u64 length,
+ int dmabuf_fd, int access_flags,
+ struct ib_umem **umem, int *npages,
+ int *page_shift, int *ncont, int *order)
+{
+   struct ib_umem *u;
+   struct ib_umem_odp *odp;
+
+   *umem = NULL;
+
+   u = ib_umem_dmabuf_get(&dev->ib_dev, start, length, dmabuf_fd,
+  access_flags, &mlx5_mn_ops);
+   if (IS_ERR(u)) {
+   mlx5_ib_dbg(dev, "umem get failed (%ld)\n", PTR_ERR(u));
+   return PTR_ERR(u);
+   }
+
+   odp = to_ib_umem_odp(u);
+   *page_shift = odp->page_shift;
+   *ncont = ib_umem_odp_num_pages(odp);
+   *npages = *ncont << (*page_shift - PAGE_SHIFT);
+   if (order)
+   *order = ilog2(roundup_pow_of_two(*ncont));
+
+   if (!*npages) {
+   mlx5_ib_warn(dev, "avoid zero region\n");
+   ib_umem_release(u);
+   return -EINVAL;
+   }
+
+   *umem = u;
+
+   mlx5_ib_dbg(dev, "npages %d, ncont %d, order %d, page_shift %d\n",
+   *npages, *ncont, *order, *page_shift);
+
+   return 0;
+}
+
 static void mlx5_ib_umr_done(struct ib_cq *cq, struct ib_wc *wc)
 {
struct mlx5_ib_umr_context *context =
@@ -1382,9 +1420,15 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd,
return &mr->ibmr;
}
 
-   err = mr_umem_get(dev, attr->start, attr->length,
- attr->access_flags, &umem,
- &npages, &page_shift, &ncont, &order);
+   if (attr->access_flags & IB_ACCESS_DMABUF)
+   err = mr_umem_dmabuf_get(dev, attr->start, attr->length,
+attr->fd, attr->access_flags,
+&umem, &npages, &page_shift, &ncont,
+&order);
+   else
+   err = mr_umem_get(dev, attr->start, attr->length,
+ attr->access_flags, &umem,
+ &npages, &page_shift, &ncont, &order);
if (err < 0)
return ERR_PTR(err);
 
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index cfd7efa..f2ca3f8 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -665,6 +665,24 @@ void mlx5_ib_fence_odp_mr(struct mlx5_ib_mr *mr)
dma_fence_odp_mr(mr);
 }
 
+static inline unsigned long notifier_read_begin(struct ib_umem_odp *odp)
+{
+   if (odp->umem.is_dmabuf)
+   return ib_umem_dmabuf_notifier_read_begin(odp);
+   else
+   return mmu_interval_read_begin(&odp->notifier);
+}
+
+static inline int notifier_read_retry(struct ib_umem_odp *odp,
+ unsigned long current_seq)
+{
+   if (odp->umem.is_dmabuf) {
+   return ib_umem_dmabuf_notifier_read_retry(odp, current_seq);
+   } else {
+   return mmu_interval_read_retry(&odp->notifier, current_seq);
+   }
+}
+
 #define MLX5_PF_FLAGS_DOWNGRADE BIT(1)
 static int pagefault_real_mr(struct mlx5_ib_mr *mr, struct ib_umem_odp *odp,
 u64 user_va, size_t bcnt, u32 *bytes_mapped,
@@ -683,7 +701,7 @@ static int pagefault_real_mr(struct mlx5_ib_mr *mr, struct 
ib_umem_odp *odp,
if (odp->umem.writable && !downgrade)
access_mask |= ODP_WRITE_ALLOWED_BIT;
 
-   current_seq = mmu_interval_read_begin(&odp->notifier);
+   current_seq = notifier_read_begin(odp);
 
np = ib_umem_odp_map_dma_pages(odp, user_va, bcnt, access_mask,
   current_seq);
@@ -691,7 +709,7 @@ static int pagefault_real_mr(struct mlx5_ib_mr *mr, struct 
ib_umem_odp *odp,
return np;
 
mutex_lock(&odp->umem_mutex);
-   if (!mmu_interval_read_retry(&odp->notifier, current_seq)) {
+   if (!notifier_read_retry(odp, current_seq)) {
/*
 *

[RFC PATCH v3 2/4] RDMA: Expand driver memory registration methods to support dma-buf

2020-10-04 Thread Jianxin Xiong
For better extensibility, driver methods for user memory registration
are changed to accept a structure instead of individual attributes
of the memory region.

To support dma-buf based memory regions, an 'fd' field is added to
the structure and a new access flag is defined.
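
From the call sites in the diff below, the new structure looks roughly
like this (field types are guessed from how the fields are filled in;
the authoritative definition is the one added by this patch):

	struct ib_user_mr_attr {
		u64	start;		/* user VA of the region (cmd.start) */
		u64	length;		/* length in bytes (cmd.length) */
		u64	virt_addr;	/* HCA virtual address (cmd.hca_va) */
		u32	fd;		/* dma-buf fd, used with IB_ACCESS_DMABUF */
		u32	access_flags;	/* IB_ACCESS_* flags */
	};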

Signed-off-by: Jianxin Xiong 
Reviewed-by: Sean Hefty 
Acked-by: Michael J. Ruhl 
---
 drivers/infiniband/core/uverbs_cmd.c| 25 +---
 drivers/infiniband/core/verbs.c | 15 ++--
 drivers/infiniband/hw/bnxt_re/ib_verbs.c| 23 +--
 drivers/infiniband/hw/bnxt_re/ib_verbs.h|  4 +-
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h  |  6 +--
 drivers/infiniband/hw/cxgb4/mem.c   | 19 -
 drivers/infiniband/hw/efa/efa.h |  3 +-
 drivers/infiniband/hw/efa/efa_verbs.c   | 24 ++--
 drivers/infiniband/hw/hns/hns_roce_device.h |  8 ++--
 drivers/infiniband/hw/hns/hns_roce_mr.c | 28 +++---
 drivers/infiniband/hw/i40iw/i40iw_verbs.c   | 24 +---
 drivers/infiniband/hw/mlx4/mlx4_ib.h|  7 ++--
 drivers/infiniband/hw/mlx4/mr.c | 37 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h|  8 ++--
 drivers/infiniband/hw/mlx5/mr.c | 51 +
 drivers/infiniband/hw/mthca/mthca_provider.c| 13 ---
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 23 ++-
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.h |  5 ++-
 drivers/infiniband/hw/qedr/verbs.c  | 25 ++--
 drivers/infiniband/hw/qedr/verbs.h  |  4 +-
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c| 12 +++---
 drivers/infiniband/hw/usnic/usnic_ib_verbs.h|  4 +-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c| 24 ++--
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h |  4 +-
 drivers/infiniband/sw/rdmavt/mr.c   | 21 +-
 drivers/infiniband/sw/rdmavt/mr.h   |  4 +-
 drivers/infiniband/sw/rxe/rxe_verbs.c   | 10 ++---
 drivers/infiniband/sw/siw/siw_verbs.c   | 26 +++--
 drivers/infiniband/sw/siw/siw_verbs.h   |  5 ++-
 include/rdma/ib_verbs.h | 21 +++---
 include/uapi/rdma/ib_user_ioctl_verbs.h |  2 +
 31 files changed, 265 insertions(+), 220 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_cmd.c 
b/drivers/infiniband/core/uverbs_cmd.c
index 2fbc583..b522204 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3,6 +3,7 @@
  * Copyright (c) 2005, 2006, 2007 Cisco Systems.  All rights reserved.
  * Copyright (c) 2005 PathScale, Inc.  All rights reserved.
  * Copyright (c) 2006 Mellanox Technologies.  All rights reserved.
+ * Copyright (c) 2020 Intel Corporatiion.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -694,6 +695,7 @@ static int ib_uverbs_reg_mr(struct uverbs_attr_bundle 
*attrs)
struct ib_uobject   *uobj;
struct ib_pd*pd;
struct ib_mr*mr;
+   struct ib_user_mr_attr   user_mr_attr;
int  ret;
struct ib_device *ib_dev;
 
@@ -727,8 +729,17 @@ static int ib_uverbs_reg_mr(struct uverbs_attr_bundle 
*attrs)
}
}
 
-   mr = pd->device->ops.reg_user_mr(pd, cmd.start, cmd.length, cmd.hca_va,
-cmd.access_flags,
+   if (cmd.access_flags & IB_ACCESS_DMABUF) {
+   pr_debug("Dma-buf support not available via regular reg_mr 
call\n");
+   ret = -EINVAL;
+   goto err_put;
+   }
+
+   user_mr_attr.start = cmd.start;
+   user_mr_attr.length = cmd.length;
+   user_mr_attr.virt_addr = cmd.hca_va;
+   user_mr_attr.access_flags = cmd.access_flags;
+   mr = pd->device->ops.reg_user_mr(pd, &user_mr_attr,
 &attrs->driver_udata);
if (IS_ERR(mr)) {
ret = PTR_ERR(mr);
@@ -769,6 +780,7 @@ static int ib_uverbs_rereg_mr(struct uverbs_attr_bundle 
*attrs)
struct ib_pd*pd = NULL;
struct ib_mr*mr;
struct ib_pd*old_pd;
+   struct ib_user_mr_attr   user_mr_attr;
int  ret;
struct ib_uobject   *uobj;
 
@@ -811,9 +823,12 @@ static int ib_uverbs_rereg_mr(struct uverbs_attr_bundle 
*attrs)
}
 
old_pd = mr->pd;
-   ret = mr->device->ops.rereg_user_mr(mr, cmd.flags, cmd.start,
-   cmd.length, cmd.hca_va,
-   cmd.access_flags, pd,
+   user_mr_attr.start = cmd.start;
+   user_mr_attr.length = cmd.length;
+   user_mr_attr.virt_addr = cmd.hca_va;
+   user_mr_attr.a

[RFC PATCH v3 4/4] RDMA/uverbs: Add uverbs command for dma-buf based MR registration

2020-10-04 Thread Jianxin Xiong
Add uverbs command for registering user memory region associated
with a dma-buf file descriptor.

Signed-off-by: Jianxin Xiong 
Reviewed-by: Sean Hefty 
Acked-by: Michael J. Ruhl 
---
 drivers/infiniband/core/uverbs_std_types_mr.c | 115 ++
 include/uapi/rdma/ib_user_ioctl_cmds.h|  14 
 2 files changed, 129 insertions(+)

diff --git a/drivers/infiniband/core/uverbs_std_types_mr.c 
b/drivers/infiniband/core/uverbs_std_types_mr.c
index 9b22bb5..388364a 100644
--- a/drivers/infiniband/core/uverbs_std_types_mr.c
+++ b/drivers/infiniband/core/uverbs_std_types_mr.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2018, Mellanox Technologies inc.  All rights reserved.
+ * Copyright (c) 2020, Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -178,6 +179,88 @@ static int UVERBS_HANDLER(UVERBS_METHOD_QUERY_MR)(
return IS_UVERBS_COPY_ERR(ret) ? ret : 0;
 }
 
+static int UVERBS_HANDLER(UVERBS_METHOD_REG_DMABUF_MR)(
+   struct uverbs_attr_bundle *attrs)
+{
+   struct ib_uobject *uobj =
+   uverbs_attr_get_uobject(attrs, 
UVERBS_ATTR_REG_DMABUF_MR_HANDLE);
+   struct ib_pd *pd =
+   uverbs_attr_get_obj(attrs, UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE);
+   struct ib_device *ib_dev = pd->device;
+
+   struct ib_user_mr_attr user_mr_attr;
+   struct ib_mr *mr;
+   int ret;
+
+   if (!ib_dev->ops.reg_user_mr)
+   return -EOPNOTSUPP;
+
+   if (!(ib_dev->attrs.device_cap_flags & IB_DEVICE_ON_DEMAND_PAGING))
+   return -EOPNOTSUPP;
+
+   ret = uverbs_copy_from(&user_mr_attr.start, attrs,
+  UVERBS_ATTR_REG_DMABUF_MR_ADDR);
+   if (ret)
+   return ret;
+
+   ret = uverbs_copy_from(&user_mr_attr.length, attrs,
+  UVERBS_ATTR_REG_DMABUF_MR_LENGTH);
+   if (ret)
+   return ret;
+
+   ret = uverbs_copy_from(&user_mr_attr.virt_addr, attrs,
+  UVERBS_ATTR_REG_DMABUF_MR_HCA_VA);
+   if (ret)
+   return ret;
+
+   ret = uverbs_copy_from(&user_mr_attr.fd, attrs,
+  UVERBS_ATTR_REG_DMABUF_MR_FD);
+   if (ret)
+   return ret;
+
+   ret = uverbs_get_flags32(&user_mr_attr.access_flags, attrs,
+UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS,
+IB_ACCESS_SUPPORTED);
+   if (ret)
+   return ret;
+
+   user_mr_attr.access_flags |= IB_ACCESS_DMABUF;
+
+   ret = ib_check_mr_access(user_mr_attr.access_flags);
+   if (ret)
+   return ret;
+
+   mr = pd->device->ops.reg_user_mr(pd, &user_mr_attr,
+&attrs->driver_udata);
+   if (IS_ERR(mr))
+   return PTR_ERR(mr);
+
+   mr->device  = pd->device;
+   mr->pd  = pd;
+   mr->type= IB_MR_TYPE_USER;
+   mr->uobject = uobj;
+   atomic_inc(&pd->usecnt);
+
+   uobj->object = mr;
+
+   ret = uverbs_copy_to(attrs, UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY,
+&mr->lkey, sizeof(mr->lkey));
+   if (ret)
+   goto err_dereg;
+
+   ret = uverbs_copy_to(attrs, UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY,
+&mr->rkey, sizeof(mr->rkey));
+   if (ret)
+   goto err_dereg;
+
+   return 0;
+
+err_dereg:
+   ib_dereg_mr_user(mr, uverbs_get_cleared_udata(attrs));
+
+   return ret;
+}
+
 DECLARE_UVERBS_NAMED_METHOD(
UVERBS_METHOD_ADVISE_MR,
UVERBS_ATTR_IDR(UVERBS_ATTR_ADVISE_MR_PD_HANDLE,
@@ -243,6 +326,37 @@ static int UVERBS_HANDLER(UVERBS_METHOD_QUERY_MR)(
UVERBS_ATTR_TYPE(u32),
UA_MANDATORY));
 
+DECLARE_UVERBS_NAMED_METHOD(
+   UVERBS_METHOD_REG_DMABUF_MR,
+   UVERBS_ATTR_IDR(UVERBS_ATTR_REG_DMABUF_MR_HANDLE,
+   UVERBS_OBJECT_MR,
+   UVERBS_ACCESS_NEW,
+   UA_MANDATORY),
+   UVERBS_ATTR_IDR(UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE,
+   UVERBS_OBJECT_PD,
+   UVERBS_ACCESS_READ,
+   UA_MANDATORY),
+   UVERBS_ATTR_PTR_IN(UVERBS_ATTR_REG_DMABUF_MR_ADDR,
+  UVERBS_ATTR_TYPE(u64),
+  UA_MANDATORY),
+   UVERBS_ATTR_PTR_IN(UVERBS_ATTR_REG_DMABUF_MR_LENGTH,
+  UVERBS_ATTR_TYPE(u64),
+  UA_MANDATORY),
+   UVERBS_ATTR_PTR_IN(UVERBS_ATTR_REG_DMABUF_MR_HCA_VA,
+  UVERBS_ATTR_TYPE(u64),
+  UA_MANDATORY),
+   UVERBS_ATTR_PTR_IN(UVERBS_ATTR_REG_DMABUF_MR_FD,
+  UVERBS_ATTR_TYPE(u32),
+  UA_MANDATORY),
+   UVERBS_ATTR_FLAGS_I

[Bug 201497] [amdgpu]: '*ERROR* No EDID read' is back in 4.19

2020-10-04 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=201497

C0rn3j (spleefe...@gmail.com) changed:

   What|Removed |Added

 CC||spleefe...@gmail.com

--- Comment #17 from C0rn3j (spleefe...@gmail.com) ---
Created attachment 292811
  --> https://bugzilla.kernel.org/attachment.cgi?id=292811&action=edit
5.8 dmesg RX 580

I believe I am hitting the same issue.

I have a DP monitor ACER XB270HAbprz and it is stuck in 640x480 on my RX 580.

This is on Linux 5.8.

On monitor replug this relevant line pops up:
[drm:dc_link_detect_helper [amdgpu]] *ERROR* No EDID read.

This same monitor with the same cable works great on GTX 970 and GTX 1080 Ti
under Nvidia 455.23.04.



Re: [PATCH RESEND] drm/bridge: tc358764: restore connector support

2020-10-04 Thread Sam Ravnborg
Hi Marek.

On Wed, Sep 30, 2020 at 01:40:42PM +0200, Marek Szyprowski wrote:
> This patch restores DRM connector registration in the TC358764 bridge
> driver and restores usage of the old drm_panel_* API, thus allowing dynamic
> panel registration. This fixes panel operation on the Exynos5250-based
> Arndale board.
> 
> This is equivalent to the revert of the following commits:
> 1644127f83bc "drm/bridge: tc358764: add drm_panel_bridge support"
> 385ca38da29c "drm/bridge: tc358764: drop drm_connector_(un)register"
> and removal of the calls to drm_panel_attach()/drm_panel_detach(), which
> were no-ops and have been removed in the meantime.
> 
> Signed-off-by: Marek Szyprowski 
> Reviewed-by: Andrzej Hajda 

Thanks for providing the revert so we can have this fixed upstream.
So far I have had no time to dive deeper into what is going wrong, but
the revert is the right course of action for now.

I expect Andrzej to pick it up as he has already reviewed it.

Sam

> ---
> As I've reported and Andrzej Hajda pointed out, the reverted patches break
> operation of the panel on the Arndale board. Since no one has suggested how to fix
> the regression, I've decided to send a revert until a new solution is
> found.
> 
> The issues with tc358764 might be automatically resolved once the Exynos
> DSI itself is converted to DRM bridge:
> https://patchwork.kernel.org/cover/11770683/
> but that approach has also its own issues so far.
> 
> Resend reason: added Sam Ravnborg to CC:
> 
> Best regards,
> Marek Szyprowski
> ---
>  drivers/gpu/drm/bridge/tc358764.c | 107 +-
>  1 file changed, 92 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/bridge/tc358764.c 
> b/drivers/gpu/drm/bridge/tc358764.c
> index d89394bc5aa4..c1e35bdf9232 100644
> --- a/drivers/gpu/drm/bridge/tc358764.c
> +++ b/drivers/gpu/drm/bridge/tc358764.c
> @@ -153,9 +153,10 @@ static const char * const tc358764_supplies[] = {
>  struct tc358764 {
>   struct device *dev;
>   struct drm_bridge bridge;
> + struct drm_connector connector;
>   struct regulator_bulk_data supplies[ARRAY_SIZE(tc358764_supplies)];
>   struct gpio_desc *gpio_reset;
> - struct drm_bridge *panel_bridge;
> + struct drm_panel *panel;
>   int error;
>  };
>  
> @@ -209,6 +210,12 @@ static inline struct tc358764 *bridge_to_tc358764(struct 
> drm_bridge *bridge)
>   return container_of(bridge, struct tc358764, bridge);
>  }
>  
> +static inline
> +struct tc358764 *connector_to_tc358764(struct drm_connector *connector)
> +{
> + return container_of(connector, struct tc358764, connector);
> +}
> +
>  static int tc358764_init(struct tc358764 *ctx)
>  {
>   u32 v = 0;
> @@ -271,11 +278,43 @@ static void tc358764_reset(struct tc358764 *ctx)
>   usleep_range(1000, 2000);
>  }
>  
> +static int tc358764_get_modes(struct drm_connector *connector)
> +{
> + struct tc358764 *ctx = connector_to_tc358764(connector);
> +
> + return drm_panel_get_modes(ctx->panel, connector);
> +}
> +
> +static const
> +struct drm_connector_helper_funcs tc358764_connector_helper_funcs = {
> + .get_modes = tc358764_get_modes,
> +};
> +
> +static const struct drm_connector_funcs tc358764_connector_funcs = {
> + .fill_modes = drm_helper_probe_single_connector_modes,
> + .destroy = drm_connector_cleanup,
> + .reset = drm_atomic_helper_connector_reset,
> + .atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
> + .atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
> +};
> +
> +static void tc358764_disable(struct drm_bridge *bridge)
> +{
> + struct tc358764 *ctx = bridge_to_tc358764(bridge);
> + int ret = drm_panel_disable(bridge_to_tc358764(bridge)->panel);
> +
> + if (ret < 0)
> + dev_err(ctx->dev, "error disabling panel (%d)\n", ret);
> +}
> +
>  static void tc358764_post_disable(struct drm_bridge *bridge)
>  {
>   struct tc358764 *ctx = bridge_to_tc358764(bridge);
>   int ret;
>  
> + ret = drm_panel_unprepare(ctx->panel);
> + if (ret < 0)
> + dev_err(ctx->dev, "error unpreparing panel (%d)\n", ret);
>   tc358764_reset(ctx);
>   usleep_range(1, 15000);
>   ret = regulator_bulk_disable(ARRAY_SIZE(ctx->supplies), ctx->supplies);
> @@ -296,28 +335,71 @@ static void tc358764_pre_enable(struct drm_bridge 
> *bridge)
>   ret = tc358764_init(ctx);
>   if (ret < 0)
>   dev_err(ctx->dev, "error initializing bridge (%d)\n", ret);
> + ret = drm_panel_prepare(ctx->panel);
> + if (ret < 0)
> + dev_err(ctx->dev, "error preparing panel (%d)\n", ret);
> +}
> +
> +static void tc358764_enable(struct drm_bridge *bridge)
> +{
> + struct tc358764 *ctx = bridge_to_tc358764(bridge);
> + int ret = drm_panel_enable(ctx->panel);
> +
> + if (ret < 0)
> + dev_err(ctx->dev, "error enabling panel (%d)\n", ret);
>  }
>  
>  static int tc358764_attach(struct drm_bridge *bridge,
> 

[PATCH 00/14] drm/msm: de-struct_mutex-ification

2020-10-04 Thread Rob Clark
From: Rob Clark 

This doesn't remove *all* the struct_mutex usage, but it covers the worst
of it, i.e. shrinker/madvise/free/retire.  The submit path still uses
struct_mutex, but it still needs *something* to serialize a portion of
the submit path, and lock_stat mostly just shows the lock contention
there being with other submits.  And there are a few other bits of
struct_mutex usage in less critical paths (debugfs, etc).  But this
seems like a reasonable step in the right direction.
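
As an illustration of the general pattern the series applies (names
follow the patch titles below; the actual fields and call sites are
defined by the patches themselves), each list gets its own lock so that
paths touching only that list no longer need dev->struct_mutex:

	struct msm_drm_private {
		/* ... */
		struct mutex mm_lock;		/* protects the active/inactive lists */
		struct list_head inactive_list;
	};

	static void mark_inactive(struct msm_drm_private *priv,
				  struct msm_gem_object *obj)
	{
		mutex_lock(&priv->mm_lock);	/* instead of dev->struct_mutex */
		list_add_tail(&obj->mm_list, &priv->inactive_list);
		mutex_unlock(&priv->mm_lock);
	}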

Rob Clark (14):
  drm/msm: Use correct drm_gem_object_put() in fail case
  drm/msm: Drop chatty trace
  drm/msm: Move update_fences()
  drm/msm: Add priv->mm_lock to protect active/inactive lists
  drm/msm: Document and rename preempt_lock
  drm/msm: Protect ring->submits with it's own lock
  drm/msm: Refcount submits
  drm/msm: Remove obj->gpu
  drm/msm: Drop struct_mutex from the retire path
  drm/msm: Drop struct_mutex in free_object() path
  drm/msm: remove msm_gem_free_work
  drm/msm: drop struct_mutex in madvise path
  drm/msm: Drop struct_mutex in shrinker path
  drm/msm: Don't implicit-sync if only a single ring

 drivers/gpu/drm/msm/adreno/a5xx_gpu.c |  4 +-
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 12 +--
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c |  4 +-
 drivers/gpu/drm/msm/msm_debugfs.c |  7 ++
 drivers/gpu/drm/msm/msm_drv.c | 15 +---
 drivers/gpu/drm/msm/msm_drv.h | 19 +++--
 drivers/gpu/drm/msm/msm_gem.c | 76 ++
 drivers/gpu/drm/msm/msm_gem.h | 53 +
 drivers/gpu/drm/msm/msm_gem_shrinker.c| 58 ++
 drivers/gpu/drm/msm/msm_gem_submit.c  | 17 ++--
 drivers/gpu/drm/msm/msm_gpu.c | 96 ++-
 drivers/gpu/drm/msm/msm_gpu.h |  5 +-
 drivers/gpu/drm/msm/msm_ringbuffer.c  |  3 +-
 drivers/gpu/drm/msm/msm_ringbuffer.h  | 13 ++-
 14 files changed, 188 insertions(+), 194 deletions(-)

-- 
2.26.2



[PATCH 03/14] drm/msm: Move update_fences()

2020-10-04 Thread Rob Clark
From: Rob Clark 

Small cleanup: update_fences() is used in the hangcheck path, but also
in the normal retire path.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gpu.c | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 31fce3ac0cdc..ca8c95b32c8b 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -265,6 +265,20 @@ int msm_gpu_hw_init(struct msm_gpu *gpu)
return ret;
 }
 
+static void update_fences(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
+   uint32_t fence)
+{
+   struct msm_gem_submit *submit;
+
+   list_for_each_entry(submit, &ring->submits, node) {
+   if (submit->seqno > fence)
+   break;
+
+   msm_update_fence(submit->ring->fctx,
+   submit->fence->seqno);
+   }
+}
+
 #ifdef CONFIG_DEV_COREDUMP
 static ssize_t msm_gpu_devcoredump_read(char *buffer, loff_t offset,
size_t count, void *data, size_t datalen)
@@ -411,20 +425,6 @@ static void msm_gpu_crashstate_capture(struct msm_gpu *gpu,
  * Hangcheck detection for locked gpu:
  */
 
-static void update_fences(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
-   uint32_t fence)
-{
-   struct msm_gem_submit *submit;
-
-   list_for_each_entry(submit, &ring->submits, node) {
-   if (submit->seqno > fence)
-   break;
-
-   msm_update_fence(submit->ring->fctx,
-   submit->fence->seqno);
-   }
-}
-
 static struct msm_gem_submit *
 find_submit(struct msm_ringbuffer *ring, uint32_t fence)
 {
-- 
2.26.2



[PATCH 05/14] drm/msm: Document and rename preempt_lock

2020-10-04 Thread Rob Clark
From: Rob Clark 

Before adding another lock, give ring->lock a more descriptive name.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c |  4 ++--
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 12 ++--
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c |  4 ++--
 drivers/gpu/drm/msm/msm_ringbuffer.c  |  2 +-
 drivers/gpu/drm/msm/msm_ringbuffer.h  |  7 ++-
 5 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index c941c8138f25..543437a2186e 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -36,7 +36,7 @@ void a5xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer 
*ring,
OUT_RING(ring, upper_32_bits(shadowptr(a5xx_gpu, ring)));
}
 
-   spin_lock_irqsave(&ring->lock, flags);
+   spin_lock_irqsave(&ring->preempt_lock, flags);
 
/* Copy the shadow to the actual register */
ring->cur = ring->next;
@@ -44,7 +44,7 @@ void a5xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer 
*ring,
/* Make sure to wrap wptr if we need to */
wptr = get_wptr(ring);
 
-   spin_unlock_irqrestore(&ring->lock, flags);
+   spin_unlock_irqrestore(&ring->preempt_lock, flags);
 
/* Make sure everything is posted before making a decision */
mb();
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c 
b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
index 7e04509c4e1f..183de1139eeb 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
@@ -45,9 +45,9 @@ static inline void update_wptr(struct msm_gpu *gpu, struct 
msm_ringbuffer *ring)
if (!ring)
return;
 
-   spin_lock_irqsave(&ring->lock, flags);
+   spin_lock_irqsave(&ring->preempt_lock, flags);
wptr = get_wptr(ring);
-   spin_unlock_irqrestore(&ring->lock, flags);
+   spin_unlock_irqrestore(&ring->preempt_lock, flags);
 
gpu_write(gpu, REG_A5XX_CP_RB_WPTR, wptr);
 }
@@ -62,9 +62,9 @@ static struct msm_ringbuffer *get_next_ring(struct msm_gpu 
*gpu)
bool empty;
struct msm_ringbuffer *ring = gpu->rb[i];
 
-   spin_lock_irqsave(&ring->lock, flags);
+   spin_lock_irqsave(&ring->preempt_lock, flags);
empty = (get_wptr(ring) == ring->memptrs->rptr);
-   spin_unlock_irqrestore(&ring->lock, flags);
+   spin_unlock_irqrestore(&ring->preempt_lock, flags);
 
if (!empty)
return ring;
@@ -132,9 +132,9 @@ void a5xx_preempt_trigger(struct msm_gpu *gpu)
}
 
/* Make sure the wptr doesn't update while we're in motion */
-   spin_lock_irqsave(&ring->lock, flags);
+   spin_lock_irqsave(&ring->preempt_lock, flags);
a5xx_gpu->preempt[ring->id]->wptr = get_wptr(ring);
-   spin_unlock_irqrestore(&ring->lock, flags);
+   spin_unlock_irqrestore(&ring->preempt_lock, flags);
 
/* Set the address of the incoming preemption record */
gpu_write64(gpu, REG_A5XX_CP_CONTEXT_SWITCH_RESTORE_ADDR_LO,
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 8915882e..fc85f008d69d 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -65,7 +65,7 @@ static void a6xx_flush(struct msm_gpu *gpu, struct 
msm_ringbuffer *ring)
OUT_RING(ring, upper_32_bits(shadowptr(a6xx_gpu, ring)));
}
 
-   spin_lock_irqsave(&ring->lock, flags);
+   spin_lock_irqsave(&ring->preempt_lock, flags);
 
/* Copy the shadow to the actual register */
ring->cur = ring->next;
@@ -73,7 +73,7 @@ static void a6xx_flush(struct msm_gpu *gpu, struct 
msm_ringbuffer *ring)
/* Make sure to wrap wptr if we need to */
wptr = get_wptr(ring);
 
-   spin_unlock_irqrestore(&ring->lock, flags);
+   spin_unlock_irqrestore(&ring->preempt_lock, flags);
 
/* Make sure everything is posted before making a decision */
mb();
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c 
b/drivers/gpu/drm/msm/msm_ringbuffer.c
index 935bf9b1d941..1b6958e908dc 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.c
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
@@ -46,7 +46,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu 
*gpu, int id,
ring->memptrs_iova = memptrs_iova;
 
INIT_LIST_HEAD(&ring->submits);
-   spin_lock_init(&ring->lock);
+   spin_lock_init(&ring->preempt_lock);
 
snprintf(name, sizeof(name), "gpu-ring-%d", ring->id);
 
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h 
b/drivers/gpu/drm/msm/msm_ringbuffer.h
index 0987d6bf848c..4956d1bc5d0e 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.h
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
@@ -46,7 +46,12 @@ struct msm_ringbuffer {
struct msm_rbmemptrs *memptrs;
uint64_t m

[PATCH 04/14] drm/msm: Add priv->mm_lock to protect active/inactive lists

2020-10-04 Thread Rob Clark
From: Rob Clark 

Rather than relying on the big dev->struct_mutex hammer, introduce a
more specific lock for protecting the bo lists.
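
A minimal sketch of the intended usage (illustration only, not part of
the patch; names as in the diff below): the new mm_lock is held just
around the list manipulation, and never in paths that can recurse into
the shrinker:

    mutex_lock(&priv->mm_lock);
    list_del_init(&msm_obj->mm_list);
    list_add_tail(&msm_obj->mm_list, &gpu->active_list);
    mutex_unlock(&priv->mm_lock);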

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_debugfs.c  |  7 +++
 drivers/gpu/drm/msm/msm_drv.c  |  1 +
 drivers/gpu/drm/msm/msm_drv.h  | 13 +++-
 drivers/gpu/drm/msm/msm_gem.c  | 28 +++---
 drivers/gpu/drm/msm/msm_gem_shrinker.c | 12 +++
 drivers/gpu/drm/msm/msm_gpu.h  |  5 -
 6 files changed, 52 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_debugfs.c 
b/drivers/gpu/drm/msm/msm_debugfs.c
index ee2e270f464c..64afbed89821 100644
--- a/drivers/gpu/drm/msm/msm_debugfs.c
+++ b/drivers/gpu/drm/msm/msm_debugfs.c
@@ -112,6 +112,11 @@ static int msm_gem_show(struct drm_device *dev, struct 
seq_file *m)
 {
struct msm_drm_private *priv = dev->dev_private;
struct msm_gpu *gpu = priv->gpu;
+   int ret;
+
+   ret = mutex_lock_interruptible(&priv->mm_lock);
+   if (ret)
+   return ret;
 
if (gpu) {
seq_printf(m, "Active Objects (%s):\n", gpu->name);
@@ -121,6 +126,8 @@ static int msm_gem_show(struct drm_device *dev, struct 
seq_file *m)
seq_printf(m, "Inactive Objects:\n");
msm_gem_describe_objects(&priv->inactive_list, m);
 
+   mutex_unlock(&priv->mm_lock);
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 49685571dc0e..dc6efc089285 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -441,6 +441,7 @@ static int msm_drm_init(struct device *dev, struct 
drm_driver *drv)
init_llist_head(&priv->free_list);
 
INIT_LIST_HEAD(&priv->inactive_list);
+   mutex_init(&priv->mm_lock);
 
drm_mode_config_init(ddev);
 
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index b9dd8f8f4887..50978e5db376 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -174,8 +174,19 @@ struct msm_drm_private {
struct msm_rd_state *hangrd;   /* debugfs to dump hanging submits */
struct msm_perf_state *perf;
 
-   /* list of GEM objects: */
+   /*
+* List of inactive GEM objects.  Every bo is either in the 
inactive_list
+* or gpu->active_list (for the gpu it is active on[1])
+*
+* These lists are protected by mm_lock.  If struct_mutex is involved, 
it
+* should be aquired prior to mm_lock.  One should *not* hold mm_lock in
+* get_pages()/vmap()/etc paths, as they can trigger the shrinker.
+*
+* [1] if someone ever added support for the old 2d cores, there could 
be
+* more than one gpu object
+*/
struct list_head inactive_list;
+   struct mutex mm_lock;
 
/* worker for delayed free of objects: */
struct work_struct free_work;
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index a870b3ad129d..b04ed8b52f9d 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -746,13 +746,17 @@ int msm_gem_sync_object(struct drm_gem_object *obj,
 void msm_gem_active_get(struct drm_gem_object *obj, struct msm_gpu *gpu)
 {
struct msm_gem_object *msm_obj = to_msm_bo(obj);
-   WARN_ON(!mutex_is_locked(&obj->dev->struct_mutex));
+   struct msm_drm_private *priv = obj->dev->dev_private;
+
+   might_sleep();
WARN_ON(msm_obj->madv != MSM_MADV_WILLNEED);
 
if (!atomic_fetch_inc(&msm_obj->active_count)) {
+   mutex_lock(&priv->mm_lock);
msm_obj->gpu = gpu;
list_del_init(&msm_obj->mm_list);
list_add_tail(&msm_obj->mm_list, &gpu->active_list);
+   mutex_unlock(&priv->mm_lock);
}
 }
 
@@ -761,12 +765,14 @@ void msm_gem_active_put(struct drm_gem_object *obj)
struct msm_gem_object *msm_obj = to_msm_bo(obj);
struct msm_drm_private *priv = obj->dev->dev_private;
 
-   WARN_ON(!mutex_is_locked(&obj->dev->struct_mutex));
+   might_sleep();
 
if (!atomic_dec_return(&msm_obj->active_count)) {
+   mutex_lock(&priv->mm_lock);
msm_obj->gpu = NULL;
list_del_init(&msm_obj->mm_list);
list_add_tail(&msm_obj->mm_list, &priv->inactive_list);
+   mutex_unlock(&priv->mm_lock);
}
 }
 
@@ -921,13 +927,16 @@ static void free_object(struct msm_gem_object *msm_obj)
 {
struct drm_gem_object *obj = &msm_obj->base;
struct drm_device *dev = obj->dev;
+   struct msm_drm_private *priv = dev->dev_private;
 
WARN_ON(!mutex_is_locked(&dev->struct_mutex));
 
/* object should not be on active list: */
WARN_ON(is_active(msm_obj));
 
+   mutex_lock(&priv->mm_lock);
list_del(&msm_obj->mm_list);
+   mutex_unlock(&priv->mm_lock);
 
mutex_lock(&msm_ob

[PATCH 01/14] drm/msm: Use correct drm_gem_object_put() in fail case

2020-10-04 Thread Rob Clark
From: Rob Clark 

We only want to use the unlocked drm_gem_object_put() in the unlocked
case; when struct_mutex is held we need drm_gem_object_put_locked().

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gem.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 14e14caf90f9..a870b3ad129d 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -1115,7 +1115,11 @@ static struct drm_gem_object *_msm_gem_new(struct 
drm_device *dev,
return obj;
 
 fail:
-   drm_gem_object_put(obj);
+   if (struct_mutex_locked) {
+   drm_gem_object_put_locked(obj);
+   } else {
+   drm_gem_object_put(obj);
+   }
return ERR_PTR(ret);
 }
 
-- 
2.26.2

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 13/14] drm/msm: Drop struct_mutex in shrinker path

2020-10-04 Thread Rob Clark
From: Rob Clark 

Now that the inactive_list is protected by mm_lock, and everything
else that is per-object is protected by obj->lock, we no longer depend
on struct_mutex.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gem.c  |  1 -
 drivers/gpu/drm/msm/msm_gem_shrinker.c | 54 --
 2 files changed, 55 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 9cdac4f7228c..e749a1c6f4e0 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -654,7 +654,6 @@ void msm_gem_purge(struct drm_gem_object *obj, enum 
msm_gem_lock subclass)
struct drm_device *dev = obj->dev;
struct msm_gem_object *msm_obj = to_msm_bo(obj);
 
-   WARN_ON(!mutex_is_locked(&dev->struct_mutex));
WARN_ON(!is_purgeable(msm_obj, subclass));
WARN_ON(obj->import_attach);
 
diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c 
b/drivers/gpu/drm/msm/msm_gem_shrinker.c
index 39a1b5327267..2c7bda1e2bf9 100644
--- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
+++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
@@ -8,48 +8,13 @@
 #include "msm_gem.h"
 #include "msm_gpu_trace.h"
 
-static bool msm_gem_shrinker_lock(struct drm_device *dev, bool *unlock)
-{
-   /* NOTE: we are *closer* to being able to get rid of
-* mutex_trylock_recursive().. the msm_gem code itself does
-* not need struct_mutex, although codepaths that can trigger
-* shrinker are still called in code-paths that hold the
-* struct_mutex.
-*
-* Also, msm_obj->madv is protected by struct_mutex.
-*
-* The next step is probably split out a seperate lock for
-* protecting inactive_list, so that shrinker does not need
-* struct_mutex.
-*/
-   switch (mutex_trylock_recursive(&dev->struct_mutex)) {
-   case MUTEX_TRYLOCK_FAILED:
-   return false;
-
-   case MUTEX_TRYLOCK_SUCCESS:
-   *unlock = true;
-   return true;
-
-   case MUTEX_TRYLOCK_RECURSIVE:
-   *unlock = false;
-   return true;
-   }
-
-   BUG();
-}
-
 static unsigned long
 msm_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
 {
struct msm_drm_private *priv =
container_of(shrinker, struct msm_drm_private, shrinker);
-   struct drm_device *dev = priv->dev;
struct msm_gem_object *msm_obj;
unsigned long count = 0;
-   bool unlock;
-
-   if (!msm_gem_shrinker_lock(dev, &unlock))
-   return 0;
 
mutex_lock(&priv->mm_lock);
 
@@ -60,9 +25,6 @@ msm_gem_shrinker_count(struct shrinker *shrinker, struct 
shrink_control *sc)
 
mutex_unlock(&priv->mm_lock);
 
-   if (unlock)
-   mutex_unlock(&dev->struct_mutex);
-
return count;
 }
 
@@ -71,13 +33,8 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct 
shrink_control *sc)
 {
struct msm_drm_private *priv =
container_of(shrinker, struct msm_drm_private, shrinker);
-   struct drm_device *dev = priv->dev;
struct msm_gem_object *msm_obj;
unsigned long freed = 0;
-   bool unlock;
-
-   if (!msm_gem_shrinker_lock(dev, &unlock))
-   return SHRINK_STOP;
 
mutex_lock(&priv->mm_lock);
 
@@ -92,9 +49,6 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct 
shrink_control *sc)
 
mutex_unlock(&priv->mm_lock);
 
-   if (unlock)
-   mutex_unlock(&dev->struct_mutex);
-
if (freed > 0)
trace_msm_gem_purge(freed << PAGE_SHIFT);
 
@@ -106,13 +60,8 @@ msm_gem_shrinker_vmap(struct notifier_block *nb, unsigned 
long event, void *ptr)
 {
struct msm_drm_private *priv =
container_of(nb, struct msm_drm_private, vmap_notifier);
-   struct drm_device *dev = priv->dev;
struct msm_gem_object *msm_obj;
unsigned unmapped = 0;
-   bool unlock;
-
-   if (!msm_gem_shrinker_lock(dev, &unlock))
-   return NOTIFY_DONE;
 
mutex_lock(&priv->mm_lock);
 
@@ -130,9 +79,6 @@ msm_gem_shrinker_vmap(struct notifier_block *nb, unsigned 
long event, void *ptr)
 
mutex_unlock(&priv->mm_lock);
 
-   if (unlock)
-   mutex_unlock(&dev->struct_mutex);
-
*(unsigned long *)ptr += unmapped;
 
if (unmapped > 0)
-- 
2.26.2

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 12/14] drm/msm: drop struct_mutex in madvise path

2020-10-04 Thread Rob Clark
From: Rob Clark 

The obj->lock is sufficient for what we need.

This *does* have the implication that userspace can try to shoot
themselves in the foot by racing madvise(DONTNEED) with submit.  But
the result will be about the same if they did madvise(DONTNEED) before
the submit ioctl, i.e. they might not get what they want if they race
with the shrinker.  But iova fault handling is robust enough, and userspace
is only shooting its own foot.
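
The shrinker-facing helpers now take a lock subclass so they can read
madv safely without struct_mutex; a rough sketch of the call pattern
(names as in the diff below, illustration only):

    /* shrinker scan, which may already hold another object's lock: */
    if (is_purgeable(msm_obj, OBJ_LOCK_SHRINKER))
        msm_gem_purge(&msm_obj->base, OBJ_LOCK_SHRINKER);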

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_drv.c  | 11 ++--
 drivers/gpu/drm/msm/msm_gem.c  |  6 ++--
 drivers/gpu/drm/msm/msm_gem.h  | 38 ++
 drivers/gpu/drm/msm/msm_gem_shrinker.c |  4 +--
 4 files changed, 32 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index e766c1f45045..d2488816ce48 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -906,14 +906,9 @@ static int msm_ioctl_gem_madvise(struct drm_device *dev, 
void *data,
return -EINVAL;
}
 
-   ret = mutex_lock_interruptible(&dev->struct_mutex);
-   if (ret)
-   return ret;
-
obj = drm_gem_object_lookup(file, args->handle);
if (!obj) {
-   ret = -ENOENT;
-   goto unlock;
+   return -ENOENT;
}
 
ret = msm_gem_madvise(obj, args->madv);
@@ -922,10 +917,8 @@ static int msm_ioctl_gem_madvise(struct drm_device *dev, 
void *data,
ret = 0;
}
 
-   drm_gem_object_put_locked(obj);
+   drm_gem_object_put(obj);
 
-unlock:
-   mutex_unlock(&dev->struct_mutex);
return ret;
 }
 
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 5e75d644ce41..9cdac4f7228c 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -639,8 +639,6 @@ int msm_gem_madvise(struct drm_gem_object *obj, unsigned 
madv)
 
mutex_lock(&msm_obj->lock);
 
-   WARN_ON(!mutex_is_locked(&obj->dev->struct_mutex));
-
if (msm_obj->madv != __MSM_MADV_PURGED)
msm_obj->madv = madv;
 
@@ -657,7 +655,7 @@ void msm_gem_purge(struct drm_gem_object *obj, enum 
msm_gem_lock subclass)
struct msm_gem_object *msm_obj = to_msm_bo(obj);
 
WARN_ON(!mutex_is_locked(&dev->struct_mutex));
-   WARN_ON(!is_purgeable(msm_obj));
+   WARN_ON(!is_purgeable(msm_obj, subclass));
WARN_ON(obj->import_attach);
 
mutex_lock_nested(&msm_obj->lock, subclass);
@@ -749,7 +747,7 @@ void msm_gem_active_get(struct drm_gem_object *obj, struct 
msm_gpu *gpu)
struct msm_drm_private *priv = obj->dev->dev_private;
 
might_sleep();
-   WARN_ON(msm_obj->madv != MSM_MADV_WILLNEED);
+   WARN_ON(msm_gem_madv(msm_obj, OBJ_LOCK_NORMAL) != MSM_MADV_WILLNEED);
 
if (!atomic_fetch_inc(&msm_obj->active_count)) {
mutex_lock(&priv->mm_lock);
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index e98a8004813b..bb8aa6b1b254 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -97,18 +97,6 @@ static inline bool is_active(struct msm_gem_object *msm_obj)
return atomic_read(&msm_obj->active_count);
 }
 
-static inline bool is_purgeable(struct msm_gem_object *msm_obj)
-{
-   WARN_ON(!mutex_is_locked(&msm_obj->base.dev->struct_mutex));
-   return (msm_obj->madv == MSM_MADV_DONTNEED) && msm_obj->sgt &&
-   !msm_obj->base.dma_buf && !msm_obj->base.import_attach;
-}
-
-static inline bool is_vunmapable(struct msm_gem_object *msm_obj)
-{
-   return (msm_obj->vmap_count == 0) && msm_obj->vaddr;
-}
-
 /* The shrinker can be triggered while we hold objA->lock, and need
  * to grab objB->lock to purge it.  Lockdep just sees these as a single
  * class of lock, so we use subclasses to teach it the difference.
@@ -125,6 +113,32 @@ enum msm_gem_lock {
OBJ_LOCK_SHRINKER,
 };
 
+/* Use this helper to read msm_obj->madv when msm_obj->lock not held: */
+static inline unsigned
+msm_gem_madv(struct msm_gem_object *msm_obj, enum msm_gem_lock subclass)
+{
+   unsigned madv;
+
+   mutex_lock_nested(&msm_obj->lock, subclass);
+   madv = msm_obj->madv;
+   mutex_unlock(&msm_obj->lock);
+
+   return madv;
+}
+
+static inline bool
+is_purgeable(struct msm_gem_object *msm_obj, enum msm_gem_lock subclass)
+{
+   return (msm_gem_madv(msm_obj, subclass) == MSM_MADV_DONTNEED) &&
+   msm_obj->sgt && !msm_obj->base.dma_buf &&
+   !msm_obj->base.import_attach;
+}
+
+static inline bool is_vunmapable(struct msm_gem_object *msm_obj)
+{
+   return (msm_obj->vmap_count == 0) && msm_obj->vaddr;
+}
+
 void msm_gem_purge(struct drm_gem_object *obj, enum msm_gem_lock subclass);
 void msm_gem_vunmap(struct drm_gem_object *obj, enum msm_gem_lock subclass);
 
diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c 
b/drivers/gpu/d

[PATCH 08/14] drm/msm: Remove obj->gpu

2020-10-04 Thread Rob Clark
From: Rob Clark 

It cannot be atomically updated with obj->active_count, and the only
purpose is a useless WARN_ON() (which becomes a buggy WARN_ON() once
retire_submits() is not serialized with incoming submits via
struct_mutex).

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gem.c | 2 --
 drivers/gpu/drm/msm/msm_gem.h | 1 -
 drivers/gpu/drm/msm/msm_gpu.c | 5 -
 3 files changed, 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index b04ed8b52f9d..c52a3969e60b 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -753,7 +753,6 @@ void msm_gem_active_get(struct drm_gem_object *obj, struct 
msm_gpu *gpu)
 
if (!atomic_fetch_inc(&msm_obj->active_count)) {
mutex_lock(&priv->mm_lock);
-   msm_obj->gpu = gpu;
list_del_init(&msm_obj->mm_list);
list_add_tail(&msm_obj->mm_list, &gpu->active_list);
mutex_unlock(&priv->mm_lock);
@@ -769,7 +768,6 @@ void msm_gem_active_put(struct drm_gem_object *obj)
 
if (!atomic_dec_return(&msm_obj->active_count)) {
mutex_lock(&priv->mm_lock);
-   msm_obj->gpu = NULL;
list_del_init(&msm_obj->mm_list);
list_add_tail(&msm_obj->mm_list, &priv->inactive_list);
mutex_unlock(&priv->mm_lock);
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index e05b1530aca6..61147bd96b06 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -64,7 +64,6 @@ struct msm_gem_object {
 *
 */
struct list_head mm_list;
-   struct msm_gpu *gpu; /* non-null if active */
 
/* Transiently in the process of submit ioctl, objects associated
 * with the submit are on submit->bo_list.. this only lasts for
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index fd3fc6f36ab1..c9ff19a75169 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -800,11 +800,6 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct 
msm_gem_submit *submit)
struct drm_gem_object *drm_obj = &msm_obj->base;
uint64_t iova;
 
-   /* can't happen yet.. but when we add 2d support we'll have
-* to deal w/ cross-ring synchronization:
-*/
-   WARN_ON(is_active(msm_obj) && (msm_obj->gpu != gpu));
-
/* submit takes a reference to the bo and iova until retired: */
drm_gem_object_get(&msm_obj->base);
msm_gem_get_and_pin_iova(&msm_obj->base, submit->aspace, &iova);
-- 
2.26.2

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 10/14] drm/msm: Drop struct_mutex in free_object() path

2020-10-04 Thread Rob Clark
From: Rob Clark 

Now that active_list/inactive_list is protected by mm_lock, we no longer
need dev->struct_mutex in the free_object() path.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gem.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index c52a3969e60b..126d92fd21cd 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -927,8 +927,6 @@ static void free_object(struct msm_gem_object *msm_obj)
struct drm_device *dev = obj->dev;
struct msm_drm_private *priv = dev->dev_private;
 
-   WARN_ON(!mutex_is_locked(&dev->struct_mutex));
-
/* object should not be on active list: */
WARN_ON(is_active(msm_obj));
 
@@ -965,20 +963,14 @@ void msm_gem_free_work(struct work_struct *work)
 {
struct msm_drm_private *priv =
container_of(work, struct msm_drm_private, free_work);
-   struct drm_device *dev = priv->dev;
struct llist_node *freed;
struct msm_gem_object *msm_obj, *next;
 
while ((freed = llist_del_all(&priv->free_list))) {
-
-   mutex_lock(&dev->struct_mutex);
-
llist_for_each_entry_safe(msm_obj, next,
  freed, freed)
free_object(msm_obj);
 
-   mutex_unlock(&dev->struct_mutex);
-
if (need_resched())
break;
}
-- 
2.26.2

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 14/14] drm/msm: Don't implicit-sync if only a single ring

2020-10-04 Thread Rob Clark
From: Rob Clark 

Any cross-device sync use-cases *must* use explicit sync.  And if there
is only a single ring (no preemption), everything is in FIFO order and
there is no need to implicit-sync.

Mesa should probably just always use MSM_SUBMIT_NO_IMPLICIT, as behavior
is undefined when fences are not used to synchronize buffer usage across
contexts (which is the only case where multiple different priority rings
could come into play).
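
For userspace this is just a submit flag; a rough sketch of opting out
of implicit sync (assuming the UAPI names from include/uapi/drm/msm_drm.h
and libdrm's drmCommandWriteRead(), with error handling and the bos/cmds
arrays omitted):

    struct drm_msm_gem_submit req = {
        .flags = MSM_SUBMIT_NO_IMPLICIT,
        /* queueid, bos and cmds filled in as usual */
    };

    drmCommandWriteRead(fd, DRM_MSM_GEM_SUBMIT, &req, sizeof(req));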

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gem_submit.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index 7d653bdc92dc..b9b68153b7b2 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -219,7 +219,7 @@ static int submit_lock_objects(struct msm_gem_submit 
*submit)
return ret;
 }
 
-static int submit_fence_sync(struct msm_gem_submit *submit, bool no_implicit)
+static int submit_fence_sync(struct msm_gem_submit *submit, bool implicit_sync)
 {
int i, ret = 0;
 
@@ -239,7 +239,7 @@ static int submit_fence_sync(struct msm_gem_submit *submit, 
bool no_implicit)
return ret;
}
 
-   if (no_implicit)
+   if (!implicit_sync)
continue;
 
ret = msm_gem_sync_object(&msm_obj->base, submit->ring->fctx,
@@ -704,7 +704,8 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
if (ret)
goto out;
 
-   ret = submit_fence_sync(submit, !!(args->flags & 
MSM_SUBMIT_NO_IMPLICIT));
+   ret = submit_fence_sync(submit, (gpu->nr_rings > 1) &&
+   !(args->flags & MSM_SUBMIT_NO_IMPLICIT));
if (ret)
goto out;
 
-- 
2.26.2

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 11/14] drm/msm: remove msm_gem_free_work

2020-10-04 Thread Rob Clark
From: Rob Clark 

Now that we don't need struct_mutex in the free path, we can get rid of
the asynchronous free altogether.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_drv.c |  3 ---
 drivers/gpu/drm/msm/msm_drv.h |  5 -
 drivers/gpu/drm/msm/msm_gem.c | 27 ---
 drivers/gpu/drm/msm/msm_gem.h |  1 -
 4 files changed, 36 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index dc6efc089285..e766c1f45045 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -437,9 +437,6 @@ static int msm_drm_init(struct device *dev, struct 
drm_driver *drv)
 
priv->wq = alloc_ordered_workqueue("msm", 0);
 
-   INIT_WORK(&priv->free_work, msm_gem_free_work);
-   init_llist_head(&priv->free_list);
-
INIT_LIST_HEAD(&priv->inactive_list);
mutex_init(&priv->mm_lock);
 
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 535f9e718e2d..96f8009e247c 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -188,10 +188,6 @@ struct msm_drm_private {
struct list_head inactive_list;
struct mutex mm_lock;
 
-   /* worker for delayed free of objects: */
-   struct work_struct free_work;
-   struct llist_head free_list;
-
struct workqueue_struct *wq;
 
unsigned int num_planes;
@@ -340,7 +336,6 @@ void msm_gem_kernel_put(struct drm_gem_object *bo,
struct msm_gem_address_space *aspace, bool locked);
 struct drm_gem_object *msm_gem_import(struct drm_device *dev,
struct dma_buf *dmabuf, struct sg_table *sgt);
-void msm_gem_free_work(struct work_struct *work);
 
 __printf(2, 3)
 void msm_gem_object_set_name(struct drm_gem_object *bo, const char *fmt, ...);
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 126d92fd21cd..5e75d644ce41 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -917,16 +917,6 @@ void msm_gem_free_object(struct drm_gem_object *obj)
struct drm_device *dev = obj->dev;
struct msm_drm_private *priv = dev->dev_private;
 
-   if (llist_add(&msm_obj->freed, &priv->free_list))
-   queue_work(priv->wq, &priv->free_work);
-}
-
-static void free_object(struct msm_gem_object *msm_obj)
-{
-   struct drm_gem_object *obj = &msm_obj->base;
-   struct drm_device *dev = obj->dev;
-   struct msm_drm_private *priv = dev->dev_private;
-
/* object should not be on active list: */
WARN_ON(is_active(msm_obj));
 
@@ -959,23 +949,6 @@ static void free_object(struct msm_gem_object *msm_obj)
kfree(msm_obj);
 }
 
-void msm_gem_free_work(struct work_struct *work)
-{
-   struct msm_drm_private *priv =
-   container_of(work, struct msm_drm_private, free_work);
-   struct llist_node *freed;
-   struct msm_gem_object *msm_obj, *next;
-
-   while ((freed = llist_del_all(&priv->free_list))) {
-   llist_for_each_entry_safe(msm_obj, next,
- freed, freed)
-   free_object(msm_obj);
-
-   if (need_resched())
-   break;
-   }
-}
-
 /* convenience method to construct a GEM buffer object, and userspace handle */
 int msm_gem_new_handle(struct drm_device *dev, struct drm_file *file,
uint32_t size, uint32_t flags, uint32_t *handle,
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index 61147bd96b06..e98a8004813b 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -127,7 +127,6 @@ enum msm_gem_lock {
 
 void msm_gem_purge(struct drm_gem_object *obj, enum msm_gem_lock subclass);
 void msm_gem_vunmap(struct drm_gem_object *obj, enum msm_gem_lock subclass);
-void msm_gem_free_work(struct work_struct *work);
 
 /* Created per submit-ioctl, to track bo's and cmdstream bufs, etc,
  * associated with the cmdstream submission for synchronization (and
-- 
2.26.2

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 02/14] drm/msm: Drop chatty trace

2020-10-04 Thread Rob Clark
From: Rob Clark 

It is somewhat redundant with the gpu tracepoints, and anyway not
useful enough to justify spamming the log when debug traces are enabled.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gpu.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 55d16489d0f3..31fce3ac0cdc 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -535,7 +535,6 @@ static void recover_worker(struct work_struct *work)
 
 static void hangcheck_timer_reset(struct msm_gpu *gpu)
 {
-   DBG("%s", gpu->name);
mod_timer(&gpu->hangcheck_timer,
round_jiffies_up(jiffies + DRM_MSM_HANGCHECK_JIFFIES));
 }
-- 
2.26.2

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 09/14] drm/msm: Drop struct_mutex from the retire path

2020-10-04 Thread Rob Clark
From: Rob Clark 

Now that we are not relying on dev->struct_mutex to protect the
ring->submits lists, drop the struct_mutex lock.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gpu.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index c9ff19a75169..5e351d1c00e9 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -707,7 +707,7 @@ static void retire_submit(struct msm_gpu *gpu, struct 
msm_ringbuffer *ring,
 
msm_gem_active_put(&msm_obj->base);
msm_gem_unpin_iova(&msm_obj->base, submit->aspace);
-   drm_gem_object_put_locked(&msm_obj->base);
+   drm_gem_object_put(&msm_obj->base);
}
 
pm_runtime_mark_last_busy(&gpu->pdev->dev);
@@ -722,11 +722,8 @@ static void retire_submit(struct msm_gpu *gpu, struct 
msm_ringbuffer *ring,
 
 static void retire_submits(struct msm_gpu *gpu)
 {
-   struct drm_device *dev = gpu->dev;
int i;
 
-   WARN_ON(!mutex_is_locked(&dev->struct_mutex));
-
/* Retire the commits starting with highest priority */
for (i = 0; i < gpu->nr_rings; i++) {
struct msm_ringbuffer *ring = gpu->rb[i];
@@ -756,15 +753,12 @@ static void retire_submits(struct msm_gpu *gpu)
 static void retire_worker(struct work_struct *work)
 {
struct msm_gpu *gpu = container_of(work, struct msm_gpu, retire_work);
-   struct drm_device *dev = gpu->dev;
int i;
 
for (i = 0; i < gpu->nr_rings; i++)
update_fences(gpu, gpu->rb[i], gpu->rb[i]->memptrs->fence);
 
-   mutex_lock(&dev->struct_mutex);
retire_submits(gpu);
-   mutex_unlock(&dev->struct_mutex);
 }
 
 /* call from irq handler to schedule work to retire bo's */
-- 
2.26.2

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 07/14] drm/msm: Refcount submits

2020-10-04 Thread Rob Clark
From: Rob Clark 

Before we remove dev->struct_mutex from the retire path, we have to deal
with the situation of a submit retiring before the submit ioctl returns.

To deal with this, ring->submits will hold a reference to the submit,
which is dropped when the submit is retired.  And the submit ioctl path
holds its own ref, which it drops when it is done with the submit.

Also, add to submit list *after* getting/pinning bo's, to prevent badness
in case the completed fence is corrupted, and retire_worker mistakenly
believes the submit is done too early.
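
Roughly, the lifetime now looks like this (sketch only; the exact call
sites are in the diff below): the ioctl's reference comes from
kref_init(), the ring list takes its own reference before the submit is
written to the ringbuffer, and each side drops what it holds:

    kref_init(&submit->ref);        /* submit ioctl's reference */
    ...
    msm_gem_submit_get(submit);     /* reference held by ring->submits */
    list_add_tail(&submit->node, &ring->submits);
    ...
    /* retire path, once submit->fence has signalled: */
    list_del(&submit->node);
    msm_gem_submit_put(submit);     /* drop the ring's reference */
    ...
    msm_gem_submit_put(submit);     /* ioctl done: drop its own reference */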

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_drv.h|  1 -
 drivers/gpu/drm/msm/msm_gem.h| 13 +
 drivers/gpu/drm/msm/msm_gem_submit.c | 12 ++--
 drivers/gpu/drm/msm/msm_gpu.c| 21 -
 4 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 50978e5db376..535f9e718e2d 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -277,7 +277,6 @@ void msm_unregister_mmu(struct drm_device *dev, struct 
msm_mmu *mmu);
 
 bool msm_use_mmu(struct drm_device *dev);
 
-void msm_gem_submit_free(struct msm_gem_submit *submit);
 int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
struct drm_file *file);
 
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index a1bf741b9b89..e05b1530aca6 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -136,6 +136,7 @@ void msm_gem_free_work(struct work_struct *work);
  * lasts for the duration of the submit-ioctl.
  */
 struct msm_gem_submit {
+   struct kref ref;
struct drm_device *dev;
struct msm_gpu *gpu;
struct msm_gem_address_space *aspace;
@@ -169,6 +170,18 @@ struct msm_gem_submit {
} bos[];
 };
 
+void __msm_gem_submit_destroy(struct kref *kref);
+
+static inline void msm_gem_submit_get(struct msm_gem_submit *submit)
+{
+   kref_get(&submit->ref);
+}
+
+static inline void msm_gem_submit_put(struct msm_gem_submit *submit)
+{
+   kref_put(&submit->ref, __msm_gem_submit_destroy);
+}
+
 /* helper to determine of a buffer in submit should be dumped, used for both
  * devcoredump and debugfs cmdstream dumping:
  */
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index e1d1f005b3d4..7d653bdc92dc 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -42,6 +42,7 @@ static struct msm_gem_submit *submit_create(struct drm_device 
*dev,
if (!submit)
return NULL;
 
+   kref_init(&submit->ref);
submit->dev = dev;
submit->aspace = queue->ctx->aspace;
submit->gpu = gpu;
@@ -60,12 +61,12 @@ static struct msm_gem_submit *submit_create(struct 
drm_device *dev,
return submit;
 }
 
-void msm_gem_submit_free(struct msm_gem_submit *submit)
+void __msm_gem_submit_destroy(struct kref *kref)
 {
+   struct msm_gem_submit *submit =
+   container_of(kref, struct msm_gem_submit, ref);
+
dma_fence_put(submit->fence);
-   spin_lock(&submit->ring->submit_lock);
-   list_del(&submit->node);
-   spin_unlock(&submit->ring->submit_lock);
put_pid(submit->pid);
msm_submitqueue_put(submit->queue);
 
@@ -805,8 +806,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
submit_cleanup(submit);
if (has_ww_ticket)
ww_acquire_fini(&submit->ticket);
-   if (ret)
-   msm_gem_submit_free(submit);
+   msm_gem_submit_put(submit);
 out_unlock:
if (ret && (out_fence_fd >= 0))
put_unused_fd(out_fence_fd);
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 8d1e254f964a..fd3fc6f36ab1 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -712,7 +712,12 @@ static void retire_submit(struct msm_gpu *gpu, struct 
msm_ringbuffer *ring,
 
pm_runtime_mark_last_busy(&gpu->pdev->dev);
pm_runtime_put_autosuspend(&gpu->pdev->dev);
-   msm_gem_submit_free(submit);
+
+   spin_lock(&ring->submit_lock);
+   list_del(&submit->node);
+   spin_unlock(&ring->submit_lock);
+
+   msm_gem_submit_put(submit);
 }
 
 static void retire_submits(struct msm_gpu *gpu)
@@ -786,10 +791,6 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct 
msm_gem_submit *submit)
 
submit->seqno = ++ring->seqno;
 
-   spin_lock(&ring->submit_lock);
-   list_add_tail(&submit->node, &ring->submits);
-   spin_unlock(&ring->submit_lock);
-
msm_rd_dump_submit(priv->rd, submit, NULL);
 
update_sw_cntrs(gpu);
@@ -816,6 +817,16 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct 
msm_gem_submit *submit)
msm_gem_active_get(drm_obj, gpu);
}
 
+   /*
+* ring->submits holds a ref to the submit, to deal with

[PATCH 06/14] drm/msm: Protect ring->submits with it's own lock

2020-10-04 Thread Rob Clark
From: Rob Clark 

One less place to rely on dev->struct_mutex.
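
The shape of the new retire loop (see retire_submits() below) is worth
spelling out: since retire_submit() can sleep, the spinlock is only held
long enough to peek at the head of the list, roughly:

    struct msm_gem_submit *submit;

    spin_lock(&ring->submit_lock);
    submit = list_first_entry_or_null(&ring->submits,
            struct msm_gem_submit, node);
    spin_unlock(&ring->submit_lock);

    /* lock is not held across retire_submit(), so sleeping is fine */
    if (submit && dma_fence_is_signaled(submit->fence))
        retire_submit(gpu, ring, submit);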

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gem_submit.c |  2 ++
 drivers/gpu/drm/msm/msm_gpu.c| 37 ++--
 drivers/gpu/drm/msm/msm_ringbuffer.c |  1 +
 drivers/gpu/drm/msm/msm_ringbuffer.h |  6 +
 4 files changed, 39 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index aa5c60a7132d..e1d1f005b3d4 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -63,7 +63,9 @@ static struct msm_gem_submit *submit_create(struct drm_device 
*dev,
 void msm_gem_submit_free(struct msm_gem_submit *submit)
 {
dma_fence_put(submit->fence);
+   spin_lock(&submit->ring->submit_lock);
list_del(&submit->node);
+   spin_unlock(&submit->ring->submit_lock);
put_pid(submit->pid);
msm_submitqueue_put(submit->queue);
 
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index ca8c95b32c8b..8d1e254f964a 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -270,6 +270,7 @@ static void update_fences(struct msm_gpu *gpu, struct 
msm_ringbuffer *ring,
 {
struct msm_gem_submit *submit;
 
+   spin_lock(&ring->submit_lock);
list_for_each_entry(submit, &ring->submits, node) {
if (submit->seqno > fence)
break;
@@ -277,6 +278,7 @@ static void update_fences(struct msm_gpu *gpu, struct 
msm_ringbuffer *ring,
msm_update_fence(submit->ring->fctx,
submit->fence->seqno);
}
+   spin_unlock(&ring->submit_lock);
 }
 
 #ifdef CONFIG_DEV_COREDUMP
@@ -430,11 +432,14 @@ find_submit(struct msm_ringbuffer *ring, uint32_t fence)
 {
struct msm_gem_submit *submit;
 
-   WARN_ON(!mutex_is_locked(&ring->gpu->dev->struct_mutex));
-
-   list_for_each_entry(submit, &ring->submits, node)
-   if (submit->seqno == fence)
+   spin_lock(&ring->submit_lock);
+   list_for_each_entry(submit, &ring->submits, node) {
+   if (submit->seqno == fence) {
+   spin_unlock(&ring->submit_lock);
return submit;
+   }
+   }
+   spin_unlock(&ring->submit_lock);
 
return NULL;
 }
@@ -523,8 +528,10 @@ static void recover_worker(struct work_struct *work)
for (i = 0; i < gpu->nr_rings; i++) {
struct msm_ringbuffer *ring = gpu->rb[i];
 
+   spin_lock(&ring->submit_lock);
list_for_each_entry(submit, &ring->submits, node)
gpu->funcs->submit(gpu, submit);
+   spin_unlock(&ring->submit_lock);
}
}
 
@@ -711,7 +718,6 @@ static void retire_submit(struct msm_gpu *gpu, struct 
msm_ringbuffer *ring,
 static void retire_submits(struct msm_gpu *gpu)
 {
struct drm_device *dev = gpu->dev;
-   struct msm_gem_submit *submit, *tmp;
int i;
 
WARN_ON(!mutex_is_locked(&dev->struct_mutex));
@@ -720,9 +726,24 @@ static void retire_submits(struct msm_gpu *gpu)
for (i = 0; i < gpu->nr_rings; i++) {
struct msm_ringbuffer *ring = gpu->rb[i];
 
-   list_for_each_entry_safe(submit, tmp, &ring->submits, node) {
-   if (dma_fence_is_signaled(submit->fence))
+   while (true) {
+   struct msm_gem_submit *submit = NULL;
+
+   spin_lock(&ring->submit_lock);
+   submit = list_first_entry_or_null(&ring->submits,
+   struct msm_gem_submit, node);
+   spin_unlock(&ring->submit_lock);
+
+   /*
+* If no submit, we are done.  If submit->fence hasn't
+* been signalled, then later submits are not signalled
+* either, so we are also done.
+*/
+   if (submit && dma_fence_is_signaled(submit->fence)) {
retire_submit(gpu, ring, submit);
+   } else {
+   break;
+   }
}
}
 }
@@ -765,7 +786,9 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct 
msm_gem_submit *submit)
 
submit->seqno = ++ring->seqno;
 
+   spin_lock(&ring->submit_lock);
list_add_tail(&submit->node, &ring->submits);
+   spin_unlock(&ring->submit_lock);
 
msm_rd_dump_submit(priv->rd, submit, NULL);
 
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c 
b/drivers/gpu/drm/msm/msm_ringbuffer.c
index 1b6958e908dc..4d2a2a4abef8 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.c
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
@@ -46,6 +46,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu 
*gpu,

Re: [PATCH] Revert "gpu/drm: ingenic: Add option to mmap GEM buffers cached"

2020-10-04 Thread Sam Ravnborg
Hi Paul.

On Sun, Oct 04, 2020 at 04:17:58PM +0200, Paul Cercueil wrote:
> This reverts commit 37054fc81443cc6a8c3a38395f384412b8373d82.

In the changelog please refer to commits like this:
37054fc81443 ("gpu/drm: ingenic: Add option to mmap GEM buffers cached")

Use "dim cite 37054fc81443cc6a8c3a38395f384412b8373d82" to get the right format.

> 
> At the very moment this commit was created, the DMA API it relied on was
> modified in the DMA tree, which caused the driver to break in
> linux-next.
> 
> Revert it for now, and it will be resubmitted later to work with the new
> DMA API.
> 
> Signed-off-by: Paul Cercueil 

With the changelog updated:
Acked-by: Sam Ravnborg 

> ---
>  drivers/gpu/drm/ingenic/ingenic-drm-drv.c | 114 +-
>  drivers/gpu/drm/ingenic/ingenic-drm.h |   4 -
>  drivers/gpu/drm/ingenic/ingenic-ipu.c |  12 +--
>  3 files changed, 4 insertions(+), 126 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c 
> b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
> index 0225dc1f5eb8..7d8b0ad52979 100644
> --- a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
> +++ b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
> @@ -9,8 +9,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -24,7 +22,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -100,11 +97,6 @@ struct ingenic_drm {
>   struct notifier_block clock_nb;
>  };
>  
> -static bool ingenic_drm_cached_gem_buf;
> -module_param_named(cached_gem_buffers, ingenic_drm_cached_gem_buf, bool, 
> 0400);
> -MODULE_PARM_DESC(cached_gem_buffers,
> -  "Enable fully cached GEM buffers [default=false]");
> -
>  static bool ingenic_drm_writeable_reg(struct device *dev, unsigned int reg)
>  {
>   switch (reg) {
> @@ -402,8 +394,6 @@ static int ingenic_drm_plane_atomic_check(struct 
> drm_plane *plane,
>plane->state->fb->format->format != state->fb->format->format))
>   crtc_state->mode_changed = true;
>  
> - drm_atomic_helper_check_plane_damage(state->state, state);
> -
>   return 0;
>  }
>  
> @@ -521,38 +511,6 @@ void ingenic_drm_plane_config(struct device *dev,
>   }
>  }
>  
> -void ingenic_drm_sync_data(struct device *dev,
> -struct drm_plane_state *old_state,
> -struct drm_plane_state *state)
> -{
> - const struct drm_format_info *finfo = state->fb->format;
> - struct ingenic_drm *priv = dev_get_drvdata(dev);
> - struct drm_atomic_helper_damage_iter iter;
> - unsigned int offset, i;
> - struct drm_rect clip;
> - dma_addr_t paddr;
> - void *addr;
> -
> - if (!ingenic_drm_cached_gem_buf)
> - return;
> -
> - drm_atomic_helper_damage_iter_init(&iter, old_state, state);
> -
> - drm_atomic_for_each_plane_damage(&iter, &clip) {
> - for (i = 0; i < finfo->num_planes; i++) {
> - paddr = drm_fb_cma_get_gem_addr(state->fb, state, i);
> - addr = phys_to_virt(paddr);
> -
> - /* Ignore x1/x2 values, invalidate complete lines */
> - offset = clip.y1 * state->fb->pitches[i];
> -
> - dma_cache_sync(priv->dev, addr + offset,
> -(clip.y2 - clip.y1) * 
> state->fb->pitches[i],
> -DMA_TO_DEVICE);
> - }
> - }
> -}
> -
>  static void ingenic_drm_update_palette(struct ingenic_drm *priv,
>  const struct drm_color_lut *lut)
>  {
> @@ -581,8 +539,6 @@ static void ingenic_drm_plane_atomic_update(struct 
> drm_plane *plane,
>   if (state && state->fb) {
>   crtc_state = state->crtc->state;
>  
> - ingenic_drm_sync_data(priv->dev, oldstate, state);
> -
>   addr = drm_fb_cma_get_gem_addr(state->fb, state, 0);
>   width = state->src_w >> 16;
>   height = state->src_h >> 16;
> @@ -752,69 +708,7 @@ static void ingenic_drm_disable_vblank(struct drm_crtc 
> *crtc)
>   regmap_update_bits(priv->map, JZ_REG_LCD_CTRL, JZ_LCD_CTRL_EOF_IRQ, 0);
>  }
>  
> -static struct drm_framebuffer *
> -ingenic_drm_gem_fb_create(struct drm_device *dev, struct drm_file *file,
> -   const struct drm_mode_fb_cmd2 *mode_cmd)
> -{
> - if (ingenic_drm_cached_gem_buf)
> - return drm_gem_fb_create_with_dirty(dev, file, mode_cmd);
> -
> - return drm_gem_fb_create(dev, file, mode_cmd);
> -}
> -
> -static int ingenic_drm_gem_mmap(struct drm_gem_object *obj,
> - struct vm_area_struct *vma)
> -{
> - struct drm_gem_cma_object *cma_obj = to_drm_gem_cma_obj(obj);
> - struct device *dev = cma_obj->base.dev->dev;
> - unsigned long attrs;
> - int ret;
> -
> - if (ingenic_drm_cached_gem_buf)
> - attrs = DMA_ATTR_NON_CONSISTENT;
> - else

Re: [PATCH 04/14] drm/msm: Add priv->mm_lock to protect active/inactive lists

2020-10-04 Thread Daniel Vetter
On Sun, Oct 4, 2020 at 9:21 PM Rob Clark  wrote:
>
> From: Rob Clark 
>
> Rather than relying on the big dev->struct_mutex hammer, introduce a
> more specific lock for protecting the bo lists.
>
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/msm/msm_debugfs.c  |  7 +++
>  drivers/gpu/drm/msm/msm_drv.c  |  1 +
>  drivers/gpu/drm/msm/msm_drv.h  | 13 +++-
>  drivers/gpu/drm/msm/msm_gem.c  | 28 +++---
>  drivers/gpu/drm/msm/msm_gem_shrinker.c | 12 +++
>  drivers/gpu/drm/msm/msm_gpu.h  |  5 -
>  6 files changed, 52 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_debugfs.c 
> b/drivers/gpu/drm/msm/msm_debugfs.c
> index ee2e270f464c..64afbed89821 100644
> --- a/drivers/gpu/drm/msm/msm_debugfs.c
> +++ b/drivers/gpu/drm/msm/msm_debugfs.c
> @@ -112,6 +112,11 @@ static int msm_gem_show(struct drm_device *dev, struct 
> seq_file *m)
>  {
> struct msm_drm_private *priv = dev->dev_private;
> struct msm_gpu *gpu = priv->gpu;
> +   int ret;
> +
> +   ret = mutex_lock_interruptible(&priv->mm_lock);
> +   if (ret)
> +   return ret;
>
> if (gpu) {
> seq_printf(m, "Active Objects (%s):\n", gpu->name);
> @@ -121,6 +126,8 @@ static int msm_gem_show(struct drm_device *dev, struct 
> seq_file *m)
> seq_printf(m, "Inactive Objects:\n");
> msm_gem_describe_objects(&priv->inactive_list, m);
>
> +   mutex_unlock(&priv->mm_lock);
> +
> return 0;
>  }
>
> diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> index 49685571dc0e..dc6efc089285 100644
> --- a/drivers/gpu/drm/msm/msm_drv.c
> +++ b/drivers/gpu/drm/msm/msm_drv.c
> @@ -441,6 +441,7 @@ static int msm_drm_init(struct device *dev, struct 
> drm_driver *drv)
> init_llist_head(&priv->free_list);
>
> INIT_LIST_HEAD(&priv->inactive_list);
> +   mutex_init(&priv->mm_lock);

I highly recommend you drop a

fs_reclaim_acquire(GFP_KERNEL);
might_lock(&priv->mm_lock);
fs_reclaim_release(GFP_KERNEL);

in here to teach lockdep about your ordering against the shrinker.
Gives you full testing every boot, even if your shrinker never gets
called.
-Daniel
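
Concretely, the priming would sit right after the mutex is created in
msm_drm_init(); a sketch, assuming the three lines above are dropped in
verbatim:

    mutex_init(&priv->mm_lock);

    /* teach lockdep about the mm_lock vs reclaim ordering up front */
    fs_reclaim_acquire(GFP_KERNEL);
    might_lock(&priv->mm_lock);
    fs_reclaim_release(GFP_KERNEL);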

>
> drm_mode_config_init(ddev);
>
> diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
> index b9dd8f8f4887..50978e5db376 100644
> --- a/drivers/gpu/drm/msm/msm_drv.h
> +++ b/drivers/gpu/drm/msm/msm_drv.h
> @@ -174,8 +174,19 @@ struct msm_drm_private {
> struct msm_rd_state *hangrd;   /* debugfs to dump hanging submits */
> struct msm_perf_state *perf;
>
> -   /* list of GEM objects: */
> +   /*
> +* List of inactive GEM objects.  Every bo is either in the 
> inactive_list
> +* or gpu->active_list (for the gpu it is active on[1])
> +*
> +* These lists are protected by mm_lock.  If struct_mutex is 
> involved, it
> +* should be aquired prior to mm_lock.  One should *not* hold mm_lock 
> in
> +* get_pages()/vmap()/etc paths, as they can trigger the shrinker.
> +*
> +* [1] if someone ever added support for the old 2d cores, there 
> could be
> +* more than one gpu object
> +*/
> struct list_head inactive_list;
> +   struct mutex mm_lock;
>
> /* worker for delayed free of objects: */
> struct work_struct free_work;
> diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> index a870b3ad129d..b04ed8b52f9d 100644
> --- a/drivers/gpu/drm/msm/msm_gem.c
> +++ b/drivers/gpu/drm/msm/msm_gem.c
> @@ -746,13 +746,17 @@ int msm_gem_sync_object(struct drm_gem_object *obj,
>  void msm_gem_active_get(struct drm_gem_object *obj, struct msm_gpu *gpu)
>  {
> struct msm_gem_object *msm_obj = to_msm_bo(obj);
> -   WARN_ON(!mutex_is_locked(&obj->dev->struct_mutex));
> +   struct msm_drm_private *priv = obj->dev->dev_private;
> +
> +   might_sleep();
> WARN_ON(msm_obj->madv != MSM_MADV_WILLNEED);
>
> if (!atomic_fetch_inc(&msm_obj->active_count)) {
> +   mutex_lock(&priv->mm_lock);
> msm_obj->gpu = gpu;
> list_del_init(&msm_obj->mm_list);
> list_add_tail(&msm_obj->mm_list, &gpu->active_list);
> +   mutex_unlock(&priv->mm_lock);
> }
>  }
>
> @@ -761,12 +765,14 @@ void msm_gem_active_put(struct drm_gem_object *obj)
> struct msm_gem_object *msm_obj = to_msm_bo(obj);
> struct msm_drm_private *priv = obj->dev->dev_private;
>
> -   WARN_ON(!mutex_is_locked(&obj->dev->struct_mutex));
> +   might_sleep();
>
> if (!atomic_dec_return(&msm_obj->active_count)) {
> +   mutex_lock(&priv->mm_lock);
> msm_obj->gpu = NULL;
> list_del_init(&msm_obj->mm_list);
> list_add_tail(&msm_obj->mm_list, &priv->inactive_list);
> +   mutex_unlock(&priv->mm_

Re: [PATCH 04/14] drm/msm: Add priv->mm_lock to protect active/inactive lists

2020-10-04 Thread Rob Clark
On Sun, Oct 4, 2020 at 3:15 PM Daniel Vetter  wrote:
>
> On Sun, Oct 4, 2020 at 9:21 PM Rob Clark  wrote:
> >
> > From: Rob Clark 
> >
> > Rather than relying on the big dev->struct_mutex hammer, introduce a
> > more specific lock for protecting the bo lists.
> >
> > Signed-off-by: Rob Clark 
> > ---
> >  drivers/gpu/drm/msm/msm_debugfs.c  |  7 +++
> >  drivers/gpu/drm/msm/msm_drv.c  |  1 +
> >  drivers/gpu/drm/msm/msm_drv.h  | 13 +++-
> >  drivers/gpu/drm/msm/msm_gem.c  | 28 +++---
> >  drivers/gpu/drm/msm/msm_gem_shrinker.c | 12 +++
> >  drivers/gpu/drm/msm/msm_gpu.h  |  5 -
> >  6 files changed, 52 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_debugfs.c 
> > b/drivers/gpu/drm/msm/msm_debugfs.c
> > index ee2e270f464c..64afbed89821 100644
> > --- a/drivers/gpu/drm/msm/msm_debugfs.c
> > +++ b/drivers/gpu/drm/msm/msm_debugfs.c
> > @@ -112,6 +112,11 @@ static int msm_gem_show(struct drm_device *dev, struct 
> > seq_file *m)
> >  {
> > struct msm_drm_private *priv = dev->dev_private;
> > struct msm_gpu *gpu = priv->gpu;
> > +   int ret;
> > +
> > +   ret = mutex_lock_interruptible(&priv->mm_lock);
> > +   if (ret)
> > +   return ret;
> >
> > if (gpu) {
> > seq_printf(m, "Active Objects (%s):\n", gpu->name);
> > @@ -121,6 +126,8 @@ static int msm_gem_show(struct drm_device *dev, struct 
> > seq_file *m)
> > seq_printf(m, "Inactive Objects:\n");
> > msm_gem_describe_objects(&priv->inactive_list, m);
> >
> > +   mutex_unlock(&priv->mm_lock);
> > +
> > return 0;
> >  }
> >
> > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> > index 49685571dc0e..dc6efc089285 100644
> > --- a/drivers/gpu/drm/msm/msm_drv.c
> > +++ b/drivers/gpu/drm/msm/msm_drv.c
> > @@ -441,6 +441,7 @@ static int msm_drm_init(struct device *dev, struct 
> > drm_driver *drv)
> > init_llist_head(&priv->free_list);
> >
> > INIT_LIST_HEAD(&priv->inactive_list);
> > +   mutex_init(&priv->mm_lock);
>
> I highly recommend you drop a
>
> fs_reclaim_acquire(GFP_KERNEL);
> might_lock(&priv->mm_lock);
> fs_reclaim_release(GFP_KERNEL);
>
> in here to teach lockdep about your ordering against the shrinker.
> Gives you full testing every boot, even if your shrinker never gets
> called.

Good idea..

(tbf, I have tested this with android+lockdep, which is a pretty great
shrinker exercise.. but immediate notification of future problems is a
good plan)

BR,
-R
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] dt-bindings: Fix 'reg' size issues in zynqmp examples

2020-10-04 Thread Vinod Koul
On 28-09-20, 10:59, Rob Herring wrote:
> The default sizes in examples for 'reg' are 1 cell each. Fix the
> incorrect sizes in zynqmp examples:
> 
> Documentation/devicetree/bindings/dma/xilinx/xlnx,zynqmp-dpdma.example.dt.yaml:
>  example-0: dma-controller@fd4c:reg:0: [0, 4249616384, 0, 4096] is too 
> long
>   From schema: 
> /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml
> Documentation/devicetree/bindings/display/xlnx/xlnx,zynqmp-dpsub.example.dt.yaml:
>  example-0: display@fd4a:reg:0: [0, 4249485312, 0, 4096] is too long
>   From schema: 
> /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml
> Documentation/devicetree/bindings/display/xlnx/xlnx,zynqmp-dpsub.example.dt.yaml:
>  example-0: display@fd4a:reg:1: [0, 4249526272, 0, 4096] is too long
>   From schema: 
> /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml
> Documentation/devicetree/bindings/display/xlnx/xlnx,zynqmp-dpsub.example.dt.yaml:
>  example-0: display@fd4a:reg:2: [0, 4249530368, 0, 4096] is too long
>   From schema: 
> /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml
> Documentation/devicetree/bindings/display/xlnx/xlnx,zynqmp-dpsub.example.dt.yaml:
>  example-0: display@fd4a:reg:3: [0, 4249534464, 0, 4096] is too long
>   From schema: 
> /usr/local/lib/python3.8/dist-packages/dtschema/schemas/reg.yaml

Applied, thanks

-- 
~Vinod
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 2/2] panfrost: Add compatible string for bifrost

2020-10-04 Thread Tomeu Vizoso
On Fri, 19 Jun 2020 at 11:00, Steven Price  wrote:
>
> On 11/06/2020 09:58, Tomeu Vizoso wrote:
> > Mesa now supports some Bifrost devices, so enable it.
> >
> > Signed-off-by: Tomeu Vizoso 
>
> Reviewed-by: Steven Price 
>
> I've also dug out my Hikey960 (from the box it's been in since lockdown
> started), so I'll see if I can get things running on there; at the
> moment I'm seeing some DATA_INVALID_FAULT regressions running my hacked
> DDK :(

Hi!

Has this one fallen through the cracks?

Thanks,

Tomeu

>
> Steve
>
> > ---
> >   drivers/gpu/drm/panfrost/panfrost_drv.c | 1 +
> >   1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
> > b/drivers/gpu/drm/panfrost/panfrost_drv.c
> > index 882fecc33fdb..8ff8e140f91e 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> > @@ -677,6 +677,7 @@ static const struct of_device_id dt_match[] = {
> >   { .compatible = "arm,mali-t830", .data = &default_data, },
> >   { .compatible = "arm,mali-t860", .data = &default_data, },
> >   { .compatible = "arm,mali-t880", .data = &default_data, },
> > + { .compatible = "arm,mali-bifrost", .data = &default_data, },
> >   {}
> >   };
> >   MODULE_DEVICE_TABLE(of, dt_match);
> >
>
> ___
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel