Current dma-buf vmap semantics require that the mapped buffer remains
in place until the corresponding vunmap has completed.

For GEM-SHMEM, this used to be guaranteed by a pin operation while creating
an S/G table in import. GEM-SHMEN can now import dma-buf objects without
creating the S/G table, so the pin is missing. Leads to page-fault errors,
such as the one shown below.

[  102.101726] BUG: unable to handle page fault for address: ffffc90127000000
[...]
[  102.157102] RIP: 0010:udl_compress_hline16+0x219/0x940 [udl]
[...]
[  102.243250] Call Trace:
[  102.245695]  <TASK>
[  102.2477V95]  ? validate_chain+0x24e/0x5e0
[  102.251805]  ? __lock_acquire+0x568/0xae0
[  102.255807]  udl_render_hline+0x165/0x341 [udl]
[  102.260338]  ? __pfx_udl_render_hline+0x10/0x10 [udl]
[  102.265379]  ? local_clock_noinstr+0xb/0x100
[  102.269642]  ? __lock_release.isra.0+0x16c/0x2e0
[  102.274246]  ? mark_held_locks+0x40/0x70
[  102.278177]  udl_primary_plane_helper_atomic_update+0x43e/0x680 [udl]
[  102.284606]  ? __pfx_udl_primary_plane_helper_atomic_update+0x10/0x10 [udl]
[  102.291551]  ? lockdep_hardirqs_on_prepare.part.0+0x92/0x170
[  102.297208]  ? lockdep_hardirqs_on+0x88/0x130
[  102.301554]  ? _raw_spin_unlock_irq+0x24/0x50
[  102.305901]  ? wait_for_completion_timeout+0x2bb/0x3a0
[  102.311028]  ? drm_atomic_helper_calc_timestamping_constants+0x141/0x200
[  102.317714]  ? drm_atomic_helper_commit_planes+0x3b6/0x1030
[  102.323279]  drm_atomic_helper_commit_planes+0x3b6/0x1030
[  102.328664]  drm_atomic_helper_commit_tail+0x41/0xb0
[  102.333622]  commit_tail+0x204/0x330
[...]
[  102.529946] ---[ end trace 0000000000000000 ]---
[  102.651980] RIP: 0010:udl_compress_hline16+0x219/0x940 [udl]

In this stack strace, udl (based on GEM-SHMEM) imported and vmap'ed a
dma-buf from amdgpu. Amdgpu relocated the buffer, thereby invalidating the
mapping.

Provide a custom dma-buf vmap method in amdgpu that pins the object before
mapping it's buffer's pages into kernel address space. Do the opposite in
vunmap.

Note that dma-buf vmap differs from GEM vmap in how it handles relocation.
While dma-buf vmap keeps the buffer in place, GEM vmap requires the caller
to keep the buffer in place. Hence, this fix is in amdgpu's dma-buf code
instead of its GEM code.

A discussion of various approaches to solving the problem is available
at [1].

v2:
- only use mapable domains (Christian)
- try pinning to domains in prefered order

Signed-off-by: Thomas Zimmermann <tzimmerm...@suse.de>
Fixes: 660cd44659a0 ("drm/shmem-helper: Import dmabuf without mapping its 
sg_table")
Reported-by: Thomas Zimmermann <tzimmerm...@suse.de>
Closes: 
https://lore.kernel.org/dri-devel/ba1bdfb8-dbf7-4372-bdcb-df7e0511c...@suse.de/
Cc: Shixiong Ou <oushixi...@kylinos.cn>
Cc: Thomas Zimmermann <tzimmerm...@suse.de>
Cc: Maarten Lankhorst <maarten.lankho...@linux.intel.com>
Cc: Maxime Ripard <mrip...@kernel.org>
Cc: David Airlie <airl...@gmail.com>
Cc: Simona Vetter <sim...@ffwll.ch>
Cc: Sumit Semwal <sumit.sem...@linaro.org>
Cc: "Christian König" <christian.koe...@amd.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Link: 
https://lore.kernel.org/dri-devel/9792c6c3-a2b8-4b2b-b5ba-fba19b153...@suse.de/ 
# [1]
Signed-off-by: Thomas Zimmermann <tzimmerm...@suse.de>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 41 ++++++++++++++++++++-
 1 file changed, 39 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index 5743ebb2f1b7..471b41bd3e29 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -285,6 +285,43 @@ static int amdgpu_dma_buf_begin_cpu_access(struct dma_buf 
*dma_buf,
        return ret;
 }
 
+static int amdgpu_dma_buf_vmap(struct dma_buf *dma_buf, struct iosys_map *map)
+{
+       struct drm_gem_object *obj = dma_buf->priv;
+       struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
+       int ret;
+
+       /*
+        * Pin to keep buffer in place while it's vmap'ed. The actual
+        * domain is not that important as long as it's mapable. Using
+        * GTT should be compatible with most use cases. VRAM and CPU
+        * are the fallbacks if the buffer has already been pinned there.
+        */
+       ret = amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT);
+       if (ret) {
+               ret = amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_VRAM);
+               if (ret) {
+                       ret = amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_CPU);
+                       if (ret)
+                               return ret;
+               }
+       }
+       ret = drm_gem_dmabuf_vmap(dma_buf, map);
+       if (ret)
+               amdgpu_bo_unpin(bo);
+
+       return ret;
+}
+
+static void amdgpu_dma_buf_vunmap(struct dma_buf *dma_buf, struct iosys_map 
*map)
+{
+       struct drm_gem_object *obj = dma_buf->priv;
+       struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
+
+       drm_gem_dmabuf_vunmap(dma_buf, map);
+       amdgpu_bo_unpin(bo);
+}
+
 const struct dma_buf_ops amdgpu_dmabuf_ops = {
        .attach = amdgpu_dma_buf_attach,
        .pin = amdgpu_dma_buf_pin,
@@ -294,8 +331,8 @@ const struct dma_buf_ops amdgpu_dmabuf_ops = {
        .release = drm_gem_dmabuf_release,
        .begin_cpu_access = amdgpu_dma_buf_begin_cpu_access,
        .mmap = drm_gem_dmabuf_mmap,
-       .vmap = drm_gem_dmabuf_vmap,
-       .vunmap = drm_gem_dmabuf_vunmap,
+       .vmap = amdgpu_dma_buf_vmap,
+       .vunmap = amdgpu_dma_buf_vunmap,
 };
 
 /**
-- 
2.50.1

Reply via email to