[Intel-gfx] [PULL] drm-misc-next
dgpu: stop using pages with drm_prime_sg_to_page_addr_arrays drm/nouveau: stop using pages with drm_prime_sg_to_page_addr_arrays v2 drm/vmwgfx: switch to ttm_sg_tt_init drm/qxl: switch to ttm_sg_tt_init drm/ttm: nuke ttm_dma_tt_init drm/prime: split array import functions v4 drm/ttm/drivers: remove unecessary ttm_module.h include v2 drm/ttm: stop destroying pinned ghost object drm/ttm: cleanup BO size handling v3 drm/ttm: use pin_count more extensively drm/ttm: cleanup LRU handling further Chuhong Yuan (1): drm/fb-helper: Add missed unlocks in setcmap_legacy() Dafna Hirschfeld (2): drm/rockchip: for error print, use the correct device pointer drm/rockchip: fix typo in Kconfig 's/HDMI/dsi/' Dan Carpenter (3): drm/kmb: Remove an unnecessary NULL check gma500: clean up error handling in init drm/panel: khadas: Fix error code in khadas_ts050_panel_add() Daniel Vetter (9): drm/ttm: Warn on pinning without holding a reference drm/nouveau: Drop mutex_lock_nested for atomic dma-buf: Fix kerneldoc formatting drm/vkms: Unset preferred_depth drm/amdkfd: fix ttm size refactor fallout dma-buf: Remove kmap kerneldoc vestiges dma-buf: some kerneldoc formatting fixes dma-buf: begin/end_cpu might lock the dma_resv lock dma-buf: doc polish for pin/unpin Dave Stevenson (4): drm/vc4: dsi: Correct DSI register definition drm/vc4: dsi: Add support for DSI0 dt-bindings: Add compatible for BCM2711 DSI1 drm/vc4: dsi: Add configuration for BCM2711 DSI1 Douglas Anderson (7): drm: panel: simple: Fixup the struct panel_desc kernel doc drm: panel: simple: Defer unprepare delay till next prepare to shorten it drm: panel: simple: Allow specifying the delay from prepare to enable dt-bindings: dt-bindings: display: simple: Add BOE NV110WTM-N61 drm: panel: simple: Add BOE NV110WTM-N61 drm: panel: Fully transition panel_desc kerneldoc to inline style drm: panel: add flags to BOE NV110WTM-N61 Guido Günther (6): drm/panel: st7703: Use dev_err_probe drm/panel: mantix: Tweak init sequence drm/panel: mantix: Allow to specify default mode for different panels drm/panel: mantix: Support panel from Shenzhen Yashi Changhua Intelligent Technology Co dt-bindings: vendor-prefixes: Add ys vendor prefix dt-bindings: display: mantix: Add compatible for panel from YS Gurchetan Singh (3): drm/virtio: virtio_{blah} --> virtio_gpu_{blah} drm/virtio: rework virtio_fence_signaled drm/virtio: consider dma-fence context when signaling Jialin Zhang (1): drm/gma500: Fix error return code in psb_driver_load() Jonathan Liu (1): drm/rockchip: dw_hdmi: fix incorrect clock in vpll clock error message Jyri Sarha (2): drm/omap: Implement CTM property for CRTC using OVL managers CPR matrix drm/omap: Enable COLOR_ENCODING and COLOR_RANGE properties for planes Krzysztof Kozlowski (1): drm/ingenic: depend on COMMON_CLK to fix compile tests Laurent Pinchart (1): drm: Remove drmm_add_final_kfree() declaration from public headers Linus Walleij (2): dt-bindings: display: mcde: Convert to YAML schema drm/panel: s6e63m0: Fix init sequence again Luben Tuikov (4): drm/scheduler: "node" --> "list" gpu/drm: ring_mirror_list --> pending_list drm/scheduler: Essentialize the job done callback drm/sched: Add missing structure comment Maarten Lankhorst (1): Merge drm/drm-next into drm-misc-next Maxime Ripard (20): drm/vc4: hdmi: Don't poll for the infoframes status on setup drm/vc4: drv: Remove the DSI pointer in vc4_drv drm/vc4: dsi: Use snprintf for the PHY clocks instead of an array drm/vc4: dsi: Introduce a variant structure drm: Introduce an atomic_commit_setup function drm: Document use-after-free gotcha with private objects drm/vc4: Simplify a bit the global atomic_check drm/vc4: kms: Wait on previous FIFO users before a commit drm/vc4: kms: Remove unassigned_channels from the HVS state drm/vc4: kms: Remove async modeset semaphore drm/vc4: kms: Convert to atomic helpers drm/vc4: hvs: Align the HVS atomic hooks to the new API drm/vc4: Pass the atomic state to encoder hooks drm/vc4: hdmi: Take into account the clock doubling flag in atomic_check drm/vc4: hdmi: Don't access the connector state in reset if kmalloc fails drm/vc4: hdmi: Create a custom connector state drm/vc4: hdmi: Store pixel frequency in the connector state drm/vc4: hdmi: Use the connector state pixel rate for the PHY drm/vc4: hdmi: Limit the BCM2711 to the max without scrambling drm/vc4: hdmi: Enable 10/12 bpc output Neil Armstrong (2): dt-bindings: panel-simple-dsi: add Khadas TS050 pan
[Intel-gfx] [PATCH v6 40/64] drm/i915: Use a single page table lock for each gtt.
We may create page table objects on the fly, but we may need to wait with the ww lock held. Instead of waiting on a freed obj lock, ensure we have the same lock for each object to keep -EDEADLK working. This ensures that i915_vma_pin_ww can lock the page tables when required. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gt/intel_ggtt.c | 8 +- drivers/gpu/drm/i915/gt/intel_gtt.c | 38 ++- drivers/gpu/drm/i915/gt/intel_gtt.h | 5 drivers/gpu/drm/i915/gt/intel_ppgtt.c | 3 ++- drivers/gpu/drm/i915/i915_vma.c | 5 5 files changed, 56 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c index 28cff6fb3059..06a41c1dd0bf 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -624,7 +624,9 @@ static int init_aliasing_ppgtt(struct i915_ggtt *ggtt) if (err) goto err_ppgtt; + i915_gem_object_lock(ppgtt->vm.scratch[0], NULL); err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash); + i915_gem_object_unlock(ppgtt->vm.scratch[0]); if (err) goto err_stash; @@ -711,6 +713,7 @@ static void ggtt_cleanup_hw(struct i915_ggtt *ggtt) mutex_unlock(&ggtt->vm.mutex); i915_address_space_fini(&ggtt->vm); + dma_resv_fini(&ggtt->vm.resv); arch_phys_wc_del(ggtt->mtrr); @@ -1092,6 +1095,7 @@ static int ggtt_probe_hw(struct i915_ggtt *ggtt, struct intel_gt *gt) ggtt->vm.gt = gt; ggtt->vm.i915 = i915; ggtt->vm.dma = &i915->drm.pdev->dev; + dma_resv_init(&ggtt->vm.resv); if (INTEL_GEN(i915) <= 5) ret = i915_gmch_probe(ggtt); @@ -1099,8 +1103,10 @@ static int ggtt_probe_hw(struct i915_ggtt *ggtt, struct intel_gt *gt) ret = gen6_gmch_probe(ggtt); else ret = gen8_gmch_probe(ggtt); - if (ret) + if (ret) { + dma_resv_fini(&ggtt->vm.resv); return ret; + } if ((ggtt->vm.total - 1) >> 32) { drm_err(&i915->drm, diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c index 444d9bacfafd..941f8af016d6 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.c +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c @@ -13,16 +13,36 @@ struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz) { + struct drm_i915_gem_object *obj; + if (I915_SELFTEST_ONLY(should_fail(&vm->fault_attr, 1))) i915_gem_shrink_all(vm->i915); - return i915_gem_object_create_internal(vm->i915, sz); + obj = i915_gem_object_create_internal(vm->i915, sz); + /* ensure all dma objects have the same reservation class */ + if (!IS_ERR(obj)) + obj->base.resv = &vm->resv; + return obj; } int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj) { int err; + i915_gem_object_lock(obj, NULL); + err = i915_gem_object_pin_pages(obj); + i915_gem_object_unlock(obj); + if (err) + return err; + + i915_gem_object_make_unshrinkable(obj); + return 0; +} + +int pin_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj) +{ + int err; + err = i915_gem_object_pin_pages(obj); if (err) return err; @@ -56,6 +76,20 @@ void __i915_vm_close(struct i915_address_space *vm) mutex_unlock(&vm->mutex); } +/* lock the vm into the current ww, if we lock one, we lock all */ +int i915_vm_lock_objects(struct i915_address_space *vm, +struct i915_gem_ww_ctx *ww) +{ + if (vm->scratch[0]->base.resv == &vm->resv) { + return i915_gem_object_lock(vm->scratch[0], ww); + } else { + struct i915_ppgtt *ppgtt = i915_vm_to_ppgtt(vm); + + /* We borrowed the scratch page from ggtt, take the top level object */ + return i915_gem_object_lock(ppgtt->pd->pt.base, ww); + } +} + void i915_address_space_fini(struct i915_address_space *vm) { drm_mm_takedown(&vm->mm); @@ -69,6 +103,7 @@ static void __i915_vm_release(struct work_struct *work) vm->cleanup(vm); i915_address_space_fini(vm); + dma_resv_fini(&vm->resv); kfree(vm); } @@ -98,6 +133,7 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass) mutex_init(&vm->mutex); lockdep_set_subclass(&vm->mutex, subclass); i915_gem_shrinker_taints_mutex(vm->i915, &vm->mutex); + dma_resv_init(&vm->resv); GEM_BUG_ON(!vm->total); drm_mm_init(&vm->mm, 0, vm
[Intel-gfx] [PATCH v6 00/64] drm/i915: Remove obj->mm.lock!
Rebased on top of latest drm-tip. The userptr changes have been tested against beignet's selftests, and intel-compute-runtime with piglit. I added a few ERRs to check if the changed paths were hit, but didn't get any returned, and no new test failures, so no regressions. IGT needs fixes, as it still uses some of the deprecated calls. Test-with: 20201215084825.101358-1-maarten.lankho...@linux.intel.com --- Finally there, just needs a lot of fixes! A lot of places were calling certain calls without any object lock held, with the removal of mm.lock we can no longer do this, and have to fix it. Phys page handling has to be redone, as nothing protects obj->ops structure, we have to remove swapping it, and move HAS_STRUCT_PAGE to obj->flags instead. Userpointer locking is inverted, which we tried to get around with a workqueue. We correct the lock ordering and try to acquire userptr pages first before taking any ww locks. This is more compatible with the locking hierarchy, as we may need to acquire mmap_sem. The previous versione broke gem_exec_schedule@pi-shared/distinct-iova, but now that we only unbind when required, this is now fixed. We also have to fix some dma-work, the command parser and clflush are slightly reworked to put all memory allocations and pinning in the preparation, so the work could pass fence annotations. In a few places like igt_spinner and execlists, we move some part of init to the first pin, because we need to have the ww lock held and it makes it easier that way. Finally we convert all selftests, and then remove obj->mm.lock! Compared to last version, I moved gt_revoke slightly to fix a lockdep splat with reset mutex vs mmu notifier. Maarten Lankhorst (62): drm/i915: Do not share hwsp across contexts any more, v6 drm/i915: Pin timeline map after first timeline pin, v3. drm/i915: Move cmd parser pinning to execbuffer drm/i915: Add missing -EDEADLK handling to execbuf pinning, v2. drm/i915: Ensure we hold the object mutex in pin correctly. drm/i915: Add gem object locking to madvise. drm/i915: Move HAS_STRUCT_PAGE to obj->flags drm/i915: Rework struct phys attachment handling drm/i915: Convert i915_gem_object_attach_phys() to ww locking, v2. drm/i915: make lockdep slightly happier about execbuf. drm/i915: Disable userptr pread/pwrite support. drm/i915: No longer allow exporting userptr through dma-buf drm/i915: Reject more ioctls for userptr drm/i915: Reject UNSYNCHRONIZED for userptr, v2. drm/i915: Make compilation of userptr code depend on MMU_NOTIFIER. drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v5. drm/i915: Flatten obj->mm.lock drm/i915: Populate logical context during first pin. drm/i915: Make ring submission compatible with obj->mm.lock removal, v2. drm/i915: Handle ww locking in init_status_page drm/i915: Rework clflush to work correctly without obj->mm.lock. drm/i915: Pass ww ctx to intel_pin_to_display_plane drm/i915: Add object locking to vm_fault_cpu drm/i915: Move pinning to inside engine_wa_list_verify() drm/i915: Take reservation lock around i915_vma_pin. drm/i915: Make lrc_init_wa_ctx compatible with ww locking. drm/i915: Make __engine_unpark() compatible with ww locking. drm/i915: Take obj lock around set_domain ioctl drm/i915: Defer pin calls in buffer pool until first use by caller. drm/i915: Fix pread/pwrite to work with new locking rules. drm/i915: Fix workarounds selftest, part 1 drm/i915: Add igt_spinner_pin() to allow for ww locking around spinner. drm/i915: Add ww locking around vm_access() drm/i915: Increase ww locking for perf. drm/i915: Lock ww in ucode objects correctly drm/i915: Add ww locking to dma-buf ops. drm/i915: Add missing ww lock in intel_dsb_prepare. drm/i915: Fix ww locking in shmem_create_from_object drm/i915: Use a single page table lock for each gtt. drm/i915/selftests: Prepare huge_pages testcases for obj->mm.lock removal. drm/i915/selftests: Prepare client blit for obj->mm.lock removal. drm/i915/selftests: Prepare coherency tests for obj->mm.lock removal. drm/i915/selftests: Prepare context tests for obj->mm.lock removal. drm/i915/selftests: Prepare dma-buf tests for obj->mm.lock removal. drm/i915/selftests: Prepare execbuf tests for obj->mm.lock removal. drm/i915/selftests: Prepare mman testcases for obj->mm.lock removal. drm/i915/selftests: Prepare object tests for obj->mm.lock removal. drm/i915/selftests: Prepare object blit tests for obj->mm.lock removal. drm/i915/selftests: Prepare igt_gem_utils for obj->mm.lock removal drm/i915/selftests: Prepare context selftest for obj->mm.lock removal drm/i915/selftests: Prepare hangcheck for obj->mm.lock removal drm/i915/selftests: Prepare execlists and lrc selftests for obj->mm.lock removal drm/i915/selftests: Prepare mocs tests for obj->mm.lock removal
[Intel-gfx] [PATCH v6 46/64] drm/i915/selftests: Prepare execbuf tests for obj->mm.lock removal.
Also quite simple, a single call needs to use the unlocked version. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c index e1d50a5a1477..4df505e4c53a 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c @@ -116,7 +116,7 @@ static int igt_gpu_reloc(void *arg) if (IS_ERR(scratch)) return PTR_ERR(scratch); - map = i915_gem_object_pin_map(scratch, I915_MAP_WC); + map = i915_gem_object_pin_map_unlocked(scratch, I915_MAP_WC); if (IS_ERR(map)) { err = PTR_ERR(map); goto err_scratch; -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 23/64] drm/i915: Add object locking to vm_fault_cpu
Take a simple lock so we hold ww around (un)pin_pages as needed. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_mman.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c index c0034d811e50..163208a6260d 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c @@ -246,6 +246,9 @@ static vm_fault_t vm_fault_cpu(struct vm_fault *vmf) area->vm_flags & VM_WRITE)) return VM_FAULT_SIGBUS; + if (i915_gem_object_lock_interruptible(obj, NULL)) + return VM_FAULT_NOPAGE; + err = i915_gem_object_pin_pages(obj); if (err) goto out; @@ -269,6 +272,7 @@ static vm_fault_t vm_fault_cpu(struct vm_fault *vmf) i915_gem_object_unpin_pages(obj); out: + i915_gem_object_unlock(obj); return i915_error_to_vmf_fault(err); } -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 01/64] drm/i915: Do not share hwsp across contexts any more, v6
Instead of sharing pages with breadcrumbs, give each timeline a single page. This allows unrelated timelines not to share locks any more during command submission. As an additional benefit, seqno wraparound no longer requires i915_vma_pin, which means we no longer need to worry about a potential -EDEADLK at a point where we are ready to submit. Changes since v1: - Fix erroneous i915_vma_acquire that should be a i915_vma_release (ickle). - Extra check for completion in intel_read_hwsp(). Changes since v2: - Fix inconsistent indent in hwsp_alloc() (kbuild) - memset entire cacheline to 0. Changes since v3: - Do same in intel_timeline_reset_seqno(), and clflush for good measure. Changes since v4: - Use refcounting on timeline, instead of relying on i915_active. - Fix waiting on kernel requests. Changes since v5: - Bump amount of slots to maximum (256), for best wraparounds. - Add hwsp_offset to i915_request to fix potential wraparound hang. - Ensure timeline wrap test works with the changes. - Assign hwsp in intel_timeline_read_hwsp() within the rcu lock to fix a hang. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström #v1 Reported-by: kernel test robot --- drivers/gpu/drm/i915/gt/gen2_engine_cs.c | 2 +- drivers/gpu/drm/i915/gt/gen6_engine_cs.c | 8 +- drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 12 +- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 1 + drivers/gpu/drm/i915/gt/intel_gt_types.h | 4 - drivers/gpu/drm/i915/gt/intel_timeline.c | 422 -- .../gpu/drm/i915/gt/intel_timeline_types.h| 17 +- drivers/gpu/drm/i915/gt/selftest_engine_cs.c | 5 +- drivers/gpu/drm/i915/gt/selftest_timeline.c | 83 ++-- drivers/gpu/drm/i915/i915_request.c | 4 - drivers/gpu/drm/i915/i915_request.h | 17 +- 11 files changed, 160 insertions(+), 415 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/gen2_engine_cs.c b/drivers/gpu/drm/i915/gt/gen2_engine_cs.c index b491a64919c8..9646200d2792 100644 --- a/drivers/gpu/drm/i915/gt/gen2_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/gen2_engine_cs.c @@ -143,7 +143,7 @@ static u32 *__gen2_emit_breadcrumb(struct i915_request *rq, u32 *cs, int flush, int post) { GEM_BUG_ON(i915_request_active_timeline(rq)->hwsp_ggtt != rq->engine->status_page.vma); - GEM_BUG_ON(offset_in_page(i915_request_active_timeline(rq)->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR); + GEM_BUG_ON(offset_in_page(rq->hwsp_seqno) != I915_GEM_HWS_SEQNO_ADDR); *cs++ = MI_FLUSH; diff --git a/drivers/gpu/drm/i915/gt/gen6_engine_cs.c b/drivers/gpu/drm/i915/gt/gen6_engine_cs.c index ce38d1bcaba3..dd094e21bb51 100644 --- a/drivers/gpu/drm/i915/gt/gen6_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/gen6_engine_cs.c @@ -161,7 +161,7 @@ u32 *gen6_emit_breadcrumb_rcs(struct i915_request *rq, u32 *cs) PIPE_CONTROL_DC_FLUSH_ENABLE | PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_CS_STALL); - *cs++ = i915_request_active_timeline(rq)->hwsp_offset | + *cs++ = i915_request_active_offset(rq) | PIPE_CONTROL_GLOBAL_GTT; *cs++ = rq->fence.seqno; @@ -359,7 +359,7 @@ u32 *gen7_emit_breadcrumb_rcs(struct i915_request *rq, u32 *cs) PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_GLOBAL_GTT_IVB | PIPE_CONTROL_CS_STALL); - *cs++ = i915_request_active_timeline(rq)->hwsp_offset; + *cs++ = i915_request_active_offset(rq); *cs++ = rq->fence.seqno; *cs++ = MI_USER_INTERRUPT; @@ -374,7 +374,7 @@ u32 *gen7_emit_breadcrumb_rcs(struct i915_request *rq, u32 *cs) u32 *gen6_emit_breadcrumb_xcs(struct i915_request *rq, u32 *cs) { GEM_BUG_ON(i915_request_active_timeline(rq)->hwsp_ggtt != rq->engine->status_page.vma); - GEM_BUG_ON(offset_in_page(i915_request_active_timeline(rq)->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR); + GEM_BUG_ON(offset_in_page(rq->hwsp_seqno) != I915_GEM_HWS_SEQNO_ADDR); *cs++ = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX; *cs++ = I915_GEM_HWS_SEQNO_ADDR | MI_FLUSH_DW_USE_GTT; @@ -394,7 +394,7 @@ u32 *gen7_emit_breadcrumb_xcs(struct i915_request *rq, u32 *cs) int i; GEM_BUG_ON(i915_request_active_timeline(rq)->hwsp_ggtt != rq->engine->status_page.vma); - GEM_BUG_ON(offset_in_page(i915_request_active_timeline(rq)->hwsp_offset) != I915_GEM_HWS_SEQNO_ADDR); + GEM_BUG_ON(offset_in_page(rq->hwsp_seqno) != I915_GEM_HWS_SEQNO_ADDR); *cs++ = MI_FLUSH_DW | MI_INVALIDATE_TLB | MI_FLUSH_DW_OP_STOREDW | MI_FLUSH_DW_STORE_INDEX; diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c index 1972dd5dca00..33eaa0203b1d 100644 --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c @@ -3
[Intel-gfx] [PATCH v6 45/64] drm/i915/selftests: Prepare dma-buf tests for obj->mm.lock removal.
Use pin_pages_unlocked() where we don't have a lock. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c index b6d43880b0c1..dd74bc09ec88 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c @@ -194,7 +194,7 @@ static int igt_dmabuf_import_ownership(void *arg) dma_buf_put(dmabuf); - err = i915_gem_object_pin_pages(obj); + err = i915_gem_object_pin_pages_unlocked(obj); if (err) { pr_err("i915_gem_object_pin_pages failed with err=%d\n", err); goto out_obj; -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 50/64] drm/i915/selftests: Prepare igt_gem_utils for obj->mm.lock removal
igt_emit_store_dw needs to use the unlocked version, as it's not holding a lock. This fixes igt_gpu_fill_dw() which is used by some other selftests. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c index d6783061bc72..0b092c62bb34 100644 --- a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c +++ b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c @@ -55,7 +55,7 @@ igt_emit_store_dw(struct i915_vma *vma, if (IS_ERR(obj)) return ERR_CAST(obj); - cmd = i915_gem_object_pin_map(obj, I915_MAP_WC); + cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(cmd)) { err = PTR_ERR(cmd); goto err; -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 21/64] drm/i915: Rework clflush to work correctly without obj->mm.lock.
Pin in the caller, not in the work itself. This should also work better for dma-fence annotations. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_clflush.c | 15 +++ 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c index bc0223716906..daf9284ef1f5 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c @@ -27,15 +27,8 @@ static void __do_clflush(struct drm_i915_gem_object *obj) static int clflush_work(struct dma_fence_work *base) { struct clflush *clflush = container_of(base, typeof(*clflush), base); - struct drm_i915_gem_object *obj = clflush->obj; - int err; - err = i915_gem_object_pin_pages(obj); - if (err) - return err; - - __do_clflush(obj); - i915_gem_object_unpin_pages(obj); + __do_clflush(clflush->obj); return 0; } @@ -44,6 +37,7 @@ static void clflush_release(struct dma_fence_work *base) { struct clflush *clflush = container_of(base, typeof(*clflush), base); + i915_gem_object_unpin_pages(clflush->obj); i915_gem_object_put(clflush->obj); } @@ -63,6 +57,11 @@ static struct clflush *clflush_work_create(struct drm_i915_gem_object *obj) if (!clflush) return NULL; + if (__i915_gem_object_get_pages(obj) < 0) { + kfree(clflush); + return NULL; + } + dma_fence_work_init(&clflush->base, &clflush_ops); clflush->obj = i915_gem_object_get(obj); /* obj <-> clflush cycle */ -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 47/64] drm/i915/selftests: Prepare mman testcases for obj->mm.lock removal.
Ensure we hold the lock around put_pages, and use the unlocked wrappers for pinning pages and mappings. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c index 44908c68e331..5cf6df49c333 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c @@ -322,7 +322,7 @@ static int igt_partial_tiling(void *arg) if (IS_ERR(obj)) return PTR_ERR(obj); - err = i915_gem_object_pin_pages(obj); + err = i915_gem_object_pin_pages_unlocked(obj); if (err) { pr_err("Failed to allocate %u pages (%lu total), err=%d\n", nreal, obj->base.size / PAGE_SIZE, err); @@ -459,7 +459,7 @@ static int igt_smoke_tiling(void *arg) if (IS_ERR(obj)) return PTR_ERR(obj); - err = i915_gem_object_pin_pages(obj); + err = i915_gem_object_pin_pages_unlocked(obj); if (err) { pr_err("Failed to allocate %u pages (%lu total), err=%d\n", nreal, obj->base.size / PAGE_SIZE, err); @@ -798,7 +798,7 @@ static int wc_set(struct drm_i915_gem_object *obj) { void *vaddr; - vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC); + vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(vaddr)) return PTR_ERR(vaddr); @@ -814,7 +814,7 @@ static int wc_check(struct drm_i915_gem_object *obj) void *vaddr; int err = 0; - vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC); + vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(vaddr)) return PTR_ERR(vaddr); @@ -1316,7 +1316,9 @@ static int __igt_mmap_revoke(struct drm_i915_private *i915, } if (type != I915_MMAP_TYPE_GTT) { + i915_gem_object_lock(obj, NULL); __i915_gem_object_put_pages(obj); + i915_gem_object_unlock(obj); if (i915_gem_object_has_pages(obj)) { pr_err("Failed to put-pages object!\n"); err = -EINVAL; -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 30/64] drm/i915: Fix pread/pwrite to work with new locking rules.
We are removing obj->mm.lock, and need to take the reservation lock before we can pin pages. Move the pinning pages into the helper, and merge gtt pwrite/pread preparation and cleanup paths. The fence lock is also removed; it will conflict with fence annotations, because of memory allocations done when pagefaulting inside copy_*_user. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/Makefile | 1 - drivers/gpu/drm/i915/gem/i915_gem_fence.c | 95 - drivers/gpu/drm/i915/gem/i915_gem_object.h | 5 - drivers/gpu/drm/i915/i915_gem.c| 224 +++-- 4 files changed, 114 insertions(+), 211 deletions(-) delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_fence.c diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index f1c7c3246226..350274698037 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -137,7 +137,6 @@ gem-y += \ gem/i915_gem_dmabuf.o \ gem/i915_gem_domain.o \ gem/i915_gem_execbuffer.o \ - gem/i915_gem_fence.o \ gem/i915_gem_internal.o \ gem/i915_gem_object.o \ gem/i915_gem_object_blt.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_fence.c b/drivers/gpu/drm/i915/gem/i915_gem_fence.c deleted file mode 100644 index 8ab842c80f99.. --- a/drivers/gpu/drm/i915/gem/i915_gem_fence.c +++ /dev/null @@ -1,95 +0,0 @@ -/* - * SPDX-License-Identifier: MIT - * - * Copyright © 2019 Intel Corporation - */ - -#include "i915_drv.h" -#include "i915_gem_object.h" - -struct stub_fence { - struct dma_fence dma; - struct i915_sw_fence chain; -}; - -static int __i915_sw_fence_call -stub_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) -{ - struct stub_fence *stub = container_of(fence, typeof(*stub), chain); - - switch (state) { - case FENCE_COMPLETE: - dma_fence_signal(&stub->dma); - break; - - case FENCE_FREE: - dma_fence_put(&stub->dma); - break; - } - - return NOTIFY_DONE; -} - -static const char *stub_driver_name(struct dma_fence *fence) -{ - return DRIVER_NAME; -} - -static const char *stub_timeline_name(struct dma_fence *fence) -{ - return "object"; -} - -static void stub_release(struct dma_fence *fence) -{ - struct stub_fence *stub = container_of(fence, typeof(*stub), dma); - - i915_sw_fence_fini(&stub->chain); - - BUILD_BUG_ON(offsetof(typeof(*stub), dma)); - dma_fence_free(&stub->dma); -} - -static const struct dma_fence_ops stub_fence_ops = { - .get_driver_name = stub_driver_name, - .get_timeline_name = stub_timeline_name, - .release = stub_release, -}; - -struct dma_fence * -i915_gem_object_lock_fence(struct drm_i915_gem_object *obj) -{ - struct stub_fence *stub; - - assert_object_held(obj); - - stub = kmalloc(sizeof(*stub), GFP_KERNEL); - if (!stub) - return NULL; - - i915_sw_fence_init(&stub->chain, stub_notify); - dma_fence_init(&stub->dma, &stub_fence_ops, &stub->chain.wait.lock, - 0, 0); - - if (i915_sw_fence_await_reservation(&stub->chain, - obj->base.resv, NULL, true, - i915_fence_timeout(to_i915(obj->base.dev)), - I915_FENCE_GFP) < 0) - goto err; - - dma_resv_add_excl_fence(obj->base.resv, &stub->dma); - - return &stub->dma; - -err: - stub_release(&stub->dma); - return NULL; -} - -void i915_gem_object_unlock_fence(struct drm_i915_gem_object *obj, - struct dma_fence *fence) -{ - struct stub_fence *stub = container_of(fence, typeof(*stub), dma); - - i915_sw_fence_commit(&stub->chain); -} diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 0c357e91dce4..fbd01e25ca71 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -163,11 +163,6 @@ static inline void i915_gem_object_unlock(struct drm_i915_gem_object *obj) dma_resv_unlock(obj->base.resv); } -struct dma_fence * -i915_gem_object_lock_fence(struct drm_i915_gem_object *obj); -void i915_gem_object_unlock_fence(struct drm_i915_gem_object *obj, - struct dma_fence *fence); - static inline void i915_gem_object_set_readonly(struct drm_i915_gem_object *obj) { diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index c5d0f3887d29..dd811d0f96f2 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -306,7 +306,6 @@ i915_gem_shmem_pread(struct drm_i915
[Intel-gfx] [PATCH v6 41/64] drm/i915/selftests: Prepare huge_pages testcases for obj->mm.lock removal.
Straightforward conversion, just convert a bunch of calls to unlocked versions. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- .../gpu/drm/i915/gem/selftests/huge_pages.c | 28 ++- 1 file changed, 21 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c index 6c2241b7387b..dadd485bc52f 100644 --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c @@ -589,7 +589,7 @@ static int igt_mock_ppgtt_misaligned_dma(void *arg) goto out_put; } - err = i915_gem_object_pin_pages(obj); + err = i915_gem_object_pin_pages_unlocked(obj); if (err) goto out_put; @@ -653,15 +653,19 @@ static int igt_mock_ppgtt_misaligned_dma(void *arg) break; } + i915_gem_object_lock(obj, NULL); i915_gem_object_unpin_pages(obj); __i915_gem_object_put_pages(obj); + i915_gem_object_unlock(obj); i915_gem_object_put(obj); } return 0; out_unpin: + i915_gem_object_lock(obj, NULL); i915_gem_object_unpin_pages(obj); + i915_gem_object_unlock(obj); out_put: i915_gem_object_put(obj); @@ -675,8 +679,10 @@ static void close_object_list(struct list_head *objects, list_for_each_entry_safe(obj, on, objects, st_link) { list_del(&obj->st_link); + i915_gem_object_lock(obj, NULL); i915_gem_object_unpin_pages(obj); __i915_gem_object_put_pages(obj); + i915_gem_object_unlock(obj); i915_gem_object_put(obj); } } @@ -713,7 +719,7 @@ static int igt_mock_ppgtt_huge_fill(void *arg) break; } - err = i915_gem_object_pin_pages(obj); + err = i915_gem_object_pin_pages_unlocked(obj); if (err) { i915_gem_object_put(obj); break; @@ -889,7 +895,7 @@ static int igt_mock_ppgtt_64K(void *arg) if (IS_ERR(obj)) return PTR_ERR(obj); - err = i915_gem_object_pin_pages(obj); + err = i915_gem_object_pin_pages_unlocked(obj); if (err) goto out_object_put; @@ -943,8 +949,10 @@ static int igt_mock_ppgtt_64K(void *arg) } i915_vma_unpin(vma); + i915_gem_object_lock(obj, NULL); i915_gem_object_unpin_pages(obj); __i915_gem_object_put_pages(obj); + i915_gem_object_unlock(obj); i915_gem_object_put(obj); } } @@ -954,7 +962,9 @@ static int igt_mock_ppgtt_64K(void *arg) out_vma_unpin: i915_vma_unpin(vma); out_object_unpin: + i915_gem_object_lock(obj, NULL); i915_gem_object_unpin_pages(obj); + i915_gem_object_unlock(obj); out_object_put: i915_gem_object_put(obj); @@ -1024,7 +1034,7 @@ static int __cpu_check_vmap(struct drm_i915_gem_object *obj, u32 dword, u32 val) if (err) return err; - ptr = i915_gem_object_pin_map(obj, I915_MAP_WC); + ptr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(ptr)) return PTR_ERR(ptr); @@ -1304,7 +1314,7 @@ static int igt_ppgtt_smoke_huge(void *arg) return err; } - err = i915_gem_object_pin_pages(obj); + err = i915_gem_object_pin_pages_unlocked(obj); if (err) { if (err == -ENXIO || err == -E2BIG) { i915_gem_object_put(obj); @@ -1327,8 +1337,10 @@ static int igt_ppgtt_smoke_huge(void *arg) __func__, size, i); } out_unpin: + i915_gem_object_lock(obj, NULL); i915_gem_object_unpin_pages(obj); __i915_gem_object_put_pages(obj); + i915_gem_object_unlock(obj); out_put: i915_gem_object_put(obj); @@ -1402,7 +1414,7 @@ static int igt_ppgtt_sanity_check(void *arg) return err; } - err = i915_gem_object_pin_pages(obj); + err = i915_gem_object_pin_pages_unlocked(obj); if (err) { i915_gem_object_put(obj); goto out; @@ -1416,8 +1428,10 @@ static int igt_ppgtt_sanity_check(void *arg) err = igt_write_huge(c
[Intel-gfx] [PATCH v6 53/64] drm/i915/selftests: Prepare execlists and lrc selftests for obj->mm.lock removal
Convert normal functions to unlocked versions where needed. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gt/selftest_execlists.c | 18 +- drivers/gpu/drm/i915/gt/selftest_lrc.c | 16 2 files changed, 17 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c index c9f3fef11f8f..e815cb2ca429 100644 --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c @@ -986,7 +986,7 @@ static int live_timeslice_preempt(void *arg) goto err_obj; } - vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC); + vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(vaddr)) { err = PTR_ERR(vaddr); goto err_obj; @@ -1294,7 +1294,7 @@ static int live_timeslice_queue(void *arg) goto err_obj; } - vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC); + vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(vaddr)) { err = PTR_ERR(vaddr); goto err_obj; @@ -1541,7 +1541,7 @@ static int live_busywait_preempt(void *arg) goto err_ctx_lo; } - map = i915_gem_object_pin_map(obj, I915_MAP_WC); + map = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(map)) { err = PTR_ERR(map); goto err_obj; @@ -2732,7 +2732,7 @@ static int create_gang(struct intel_engine_cs *engine, if (err) goto err_obj; - cs = i915_gem_object_pin_map(obj, I915_MAP_WC); + cs = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(cs)) { err = PTR_ERR(cs); goto err_obj; @@ -3018,7 +3018,7 @@ static int live_preempt_gang(void *arg) * it will terminate the next lowest spinner until there * are no more spinners and the gang is complete. */ - cs = i915_gem_object_pin_map(rq->batch->obj, I915_MAP_WC); + cs = i915_gem_object_pin_map_unlocked(rq->batch->obj, I915_MAP_WC); if (!IS_ERR(cs)) { *cs = 0; i915_gem_object_unpin_map(rq->batch->obj); @@ -3083,7 +3083,7 @@ create_gpr_user(struct intel_engine_cs *engine, return ERR_PTR(err); } - cs = i915_gem_object_pin_map(obj, I915_MAP_WC); + cs = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(cs)) { i915_vma_put(vma); return ERR_CAST(cs); @@ -3293,7 +3293,7 @@ static int live_preempt_user(void *arg) if (IS_ERR(global)) return PTR_ERR(global); - result = i915_gem_object_pin_map(global->obj, I915_MAP_WC); + result = i915_gem_object_pin_map_unlocked(global->obj, I915_MAP_WC); if (IS_ERR(result)) { i915_vma_unpin_and_release(&global, 0); return PTR_ERR(result); @@ -3686,7 +3686,7 @@ static int live_preempt_smoke(void *arg) goto err_free; } - cs = i915_gem_object_pin_map(smoke.batch, I915_MAP_WB); + cs = i915_gem_object_pin_map_unlocked(smoke.batch, I915_MAP_WB); if (IS_ERR(cs)) { err = PTR_ERR(cs); goto err_batch; @@ -4291,7 +4291,7 @@ static int preserved_virtual_engine(struct intel_gt *gt, goto out_end; } - cs = i915_gem_object_pin_map(scratch->obj, I915_MAP_WB); + cs = i915_gem_object_pin_map_unlocked(scratch->obj, I915_MAP_WB); if (IS_ERR(cs)) { err = PTR_ERR(cs); goto out_end; diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c index 1474300a1a9e..196eebef52ac 100644 --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c @@ -625,7 +625,7 @@ static int __live_lrc_gpr(struct intel_engine_cs *engine, goto err_rq; } - cs = i915_gem_object_pin_map(scratch->obj, I915_MAP_WB); + cs = i915_gem_object_pin_map_unlocked(scratch->obj, I915_MAP_WB); if (IS_ERR(cs)) { err = PTR_ERR(cs); goto err_rq; @@ -919,7 +919,7 @@ store_context(struct intel_context *ce, struct i915_vma *scratch) if (IS_ERR(batch)) return batch; - cs = i915_gem_object_pin_map(batch->obj, I915_MAP_WC); + cs = i915_gem_object_pin_map_unlocked(batch->obj, I915_MAP_WC); if (IS_ERR(cs)) { i915_vma_put(batch); return ERR_CAST(cs); @@ -1083,7 +1083,7 @@ static struct i915_vma *load_context(struct intel_context *ce, u32 poison) if (IS_ERR(batch)) return batch; -
[Intel-gfx] [PATCH v6 34/64] drm/i915: Add ww locking around vm_access()
i915_gem_object_pin_map potentially needs a ww context, so ensure we have one we can revoke. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_mman.c | 24 ++-- 1 file changed, 22 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c index 163208a6260d..2561a2f1e54f 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c @@ -421,7 +421,9 @@ vm_access(struct vm_area_struct *area, unsigned long addr, { struct i915_mmap_offset *mmo = area->vm_private_data; struct drm_i915_gem_object *obj = mmo->obj; + struct i915_gem_ww_ctx ww; void *vaddr; + int err = 0; if (i915_gem_object_is_readonly(obj) && write) return -EACCES; @@ -430,10 +432,18 @@ vm_access(struct vm_area_struct *area, unsigned long addr, if (addr >= obj->base.size) return -EINVAL; + i915_gem_ww_ctx_init(&ww, true); +retry: + err = i915_gem_object_lock(obj, &ww); + if (err) + goto out; + /* As this is primarily for debugging, let's focus on simplicity */ vaddr = i915_gem_object_pin_map(obj, I915_MAP_FORCE_WC); - if (IS_ERR(vaddr)) - return PTR_ERR(vaddr); + if (IS_ERR(vaddr)) { + err = PTR_ERR(vaddr); + goto out; + } if (write) { memcpy(vaddr + addr, buf, len); @@ -443,6 +453,16 @@ vm_access(struct vm_area_struct *area, unsigned long addr, } i915_gem_object_unpin_map(obj); +out: + if (err == -EDEADLK) { + err = i915_gem_ww_ctx_backoff(&ww); + if (!err) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); + + if (err) + return err; return len; } -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 36/64] drm/i915: Lock ww in ucode objects correctly
In the ucode functions, the calls are done before userspace runs, when debugging using debugfs, or when creating semi-permanent mappings; we can safely use the unlocked versions that does the ww dance for us. Because there is no pin_pages_unlocked yet, add it as convenience function. This removes possible lockdep splats about missing resv lock for ucode. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_object.h | 2 ++ drivers/gpu/drm/i915/gem/i915_gem_pages.c | 20 drivers/gpu/drm/i915/gt/uc/intel_guc.c | 2 +- drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 4 ++-- drivers/gpu/drm/i915/gt/uc/intel_huc.c | 2 +- drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 2 +- 6 files changed, 27 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 105a49102827..3fe41180cc81 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -342,6 +342,8 @@ i915_gem_object_pin_pages(struct drm_i915_gem_object *obj) return __i915_gem_object_get_pages(obj); } +int i915_gem_object_pin_pages_unlocked(struct drm_i915_gem_object *obj); + static inline bool i915_gem_object_has_pages(struct drm_i915_gem_object *obj) { diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c index b86bef4fb602..8a6c0dd1bfdd 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -136,6 +136,26 @@ int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj) return err; } +int i915_gem_object_pin_pages_unlocked(struct drm_i915_gem_object *obj) +{ + struct i915_gem_ww_ctx ww; + int err; + + i915_gem_ww_ctx_init(&ww, true); +retry: + err = i915_gem_object_lock(obj, &ww); + if (!err) + err = i915_gem_object_pin_pages(obj); + + if (err == -EDEADLK) { + err = i915_gem_ww_ctx_backoff(&ww); + if (!err) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); + return err; +} + /* Immediately discard the backing storage */ void i915_gem_object_truncate(struct drm_i915_gem_object *obj) { diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c index 2a343a977987..a65661eb5d5d 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c @@ -694,7 +694,7 @@ int intel_guc_allocate_and_map_vma(struct intel_guc *guc, u32 size, if (IS_ERR(vma)) return PTR_ERR(vma); - vaddr = i915_gem_object_pin_map(vma->obj, I915_MAP_WB); + vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB); if (IS_ERR(vaddr)) { i915_vma_unpin_and_release(&vma, 0); return PTR_ERR(vaddr); diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c index c92f2c056db4..c36d5eb5bbb9 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c @@ -335,7 +335,7 @@ static int guc_log_map(struct intel_guc_log *log) * buffer pages, so that we can directly get the data * (up-to-date) from memory. */ - vaddr = i915_gem_object_pin_map(log->vma->obj, I915_MAP_WC); + vaddr = i915_gem_object_pin_map_unlocked(log->vma->obj, I915_MAP_WC); if (IS_ERR(vaddr)) return PTR_ERR(vaddr); @@ -744,7 +744,7 @@ int intel_guc_log_dump(struct intel_guc_log *log, struct drm_printer *p, if (!obj) return 0; - map = i915_gem_object_pin_map(obj, I915_MAP_WC); + map = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(map)) { DRM_DEBUG("Failed to pin object\n"); drm_puts(p, "(log data unaccessible)\n"); diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c index 65eeb44b397d..2126dd81ac38 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c @@ -82,7 +82,7 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc) if (IS_ERR(vma)) return PTR_ERR(vma); - vaddr = i915_gem_object_pin_map(vma->obj, I915_MAP_WB); + vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB); if (IS_ERR(vaddr)) { i915_vma_unpin_and_release(&vma, 0); return PTR_ERR(vaddr); diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c index 602f1a0bc587..f40bd455830e 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c @@ -542,7 +542,7 @@ int intel_uc_fw_init(struct intel_uc_fw *uc_fw) i
[Intel-gfx] [PATCH v6 39/64] drm/i915: Fix ww locking in shmem_create_from_object
Quick fix, just use the unlocked version. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gt/shmem_utils.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c b/drivers/gpu/drm/i915/gt/shmem_utils.c index 5982b62f913d..5d9bac7f39c4 100644 --- a/drivers/gpu/drm/i915/gt/shmem_utils.c +++ b/drivers/gpu/drm/i915/gt/shmem_utils.c @@ -39,7 +39,7 @@ struct file *shmem_create_from_object(struct drm_i915_gem_object *obj) return file; } - ptr = i915_gem_object_pin_map(obj, I915_MAP_WB); + ptr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB); if (IS_ERR(ptr)) return ERR_CAST(ptr); -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 56/64] drm/i915/selftests: Prepare timeline tests for obj->mm.lock removal
We can no longer call intel_timeline_pin with a null argument, so add a ww loop that locks the backing object. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gt/selftest_timeline.c | 30 + 1 file changed, 25 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_timeline.c b/drivers/gpu/drm/i915/gt/selftest_timeline.c index 2e2bea7ac7a5..bed05ca5d670 100644 --- a/drivers/gpu/drm/i915/gt/selftest_timeline.c +++ b/drivers/gpu/drm/i915/gt/selftest_timeline.c @@ -38,6 +38,26 @@ static unsigned long hwsp_cacheline(struct intel_timeline *tl) return (address + offset_in_page(tl->hwsp_offset)) / CACHELINE_BYTES; } +static int selftest_tl_pin(struct intel_timeline *tl) +{ + struct i915_gem_ww_ctx ww; + int err; + + i915_gem_ww_ctx_init(&ww, false); +retry: + err = i915_gem_object_lock(tl->hwsp_ggtt->obj, &ww); + if (!err) + err = intel_timeline_pin(tl, &ww); + + if (err == -EDEADLK) { + err = i915_gem_ww_ctx_backoff(&ww); + if (!err) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); + return err; +} + #define CACHELINES_PER_PAGE (PAGE_SIZE / CACHELINE_BYTES) struct mock_hwsp_freelist { @@ -79,7 +99,7 @@ static int __mock_hwsp_timeline(struct mock_hwsp_freelist *state, if (IS_ERR(tl)) return PTR_ERR(tl); - err = intel_timeline_pin(tl, NULL); + err = selftest_tl_pin(tl); if (err) { intel_timeline_put(tl); return err; @@ -465,7 +485,7 @@ checked_tl_write(struct intel_timeline *tl, struct intel_engine_cs *engine, u32 struct i915_request *rq; int err; - err = intel_timeline_pin(tl, NULL); + err = selftest_tl_pin(tl); if (err) { rq = ERR_PTR(err); goto out; @@ -665,7 +685,7 @@ static int live_hwsp_wrap(void *arg) if (!tl->has_initial_breadcrumb) goto out_free; - err = intel_timeline_pin(tl, NULL); + err = selftest_tl_pin(tl); if (err) goto out_free; @@ -812,13 +832,13 @@ static int setup_watcher(struct hwsp_watcher *w, struct intel_gt *gt) if (IS_ERR(obj)) return PTR_ERR(obj); - w->map = i915_gem_object_pin_map(obj, I915_MAP_WB); + w->map = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB); if (IS_ERR(w->map)) { i915_gem_object_put(obj); return PTR_ERR(w->map); } - vma = i915_gem_object_ggtt_pin_ww(obj, NULL, NULL, 0, 0, 0); + vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, 0); if (IS_ERR(vma)) { i915_gem_object_put(obj); return PTR_ERR(vma); -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 51/64] drm/i915/selftests: Prepare context selftest for obj->mm.lock removal
Only needs to convert a single call to the unlocked version. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gt/selftest_context.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c index db738d400168..d1f19fd04769 100644 --- a/drivers/gpu/drm/i915/gt/selftest_context.c +++ b/drivers/gpu/drm/i915/gt/selftest_context.c @@ -88,8 +88,8 @@ static int __live_context_size(struct intel_engine_cs *engine) if (err) goto err; - vaddr = i915_gem_object_pin_map(ce->state->obj, - i915_coherent_map_type(engine->i915)); + vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj, + i915_coherent_map_type(engine->i915)); if (IS_ERR(vaddr)) { err = PTR_ERR(vaddr); intel_context_unpin(ce); -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 15/64] drm/i915: Make compilation of userptr code depend on MMU_NOTIFIER.
Now that unsynchronized mappings are removed, the only time userptr works is when the MMU notifier is enabled. Put all of the userptr code behind a mmu notifier ifdef. Signed-off-by: Maarten Lankhorst --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 2 + drivers/gpu/drm/i915/gem/i915_gem_object.h| 4 ++ .../gpu/drm/i915/gem/i915_gem_object_types.h | 2 + drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 58 +++ drivers/gpu/drm/i915/i915_drv.h | 2 + 5 files changed, 31 insertions(+), 37 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 9c7480119d53..77de94e45f55 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -1971,8 +1971,10 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb, err = 0; } +#ifdef CONFIG_MMU_NOTIFIER if (!err) flush_workqueue(eb->i915->mm.userptr_wq); +#endif err_relock: i915_gem_ww_ctx_init(&eb->ww, true); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 2ba6d114182c..dc4f3175550d 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -537,7 +537,11 @@ void __i915_gem_object_invalidate_frontbuffer(struct drm_i915_gem_object *obj, static inline bool i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) { +#ifdef CONFIG_MMU_NOTIFIER return obj->userptr.mm; +#else + return false; +#endif } static inline void diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h index b53e44b06b09..6d3f451c15c6 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -289,6 +289,7 @@ struct drm_i915_gem_object { unsigned long *bit_17; union { +#ifdef CONFIG_MMU_NOTIFIER struct i915_gem_userptr { uintptr_t ptr; @@ -296,6 +297,7 @@ struct drm_i915_gem_object { struct i915_mmu_object *mmu_object; struct work_struct *work; } userptr; +#endif unsigned long scratch; u64 encode; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c index 241f865077b9..1183b28c084b 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c @@ -15,6 +15,8 @@ #include "i915_gem_object.h" #include "i915_scatterlist.h" +#if defined(CONFIG_MMU_NOTIFIER) + struct i915_mm_struct { struct mm_struct *mm; struct drm_i915_private *i915; @@ -24,7 +26,6 @@ struct i915_mm_struct { struct rcu_work work; }; -#if defined(CONFIG_MMU_NOTIFIER) #include struct i915_mmu_notifier { @@ -217,15 +218,11 @@ i915_mmu_notifier_find(struct i915_mm_struct *mm) } static int -i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj, - unsigned flags) +i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj) { struct i915_mmu_notifier *mn; struct i915_mmu_object *mo; - if (flags & I915_USERPTR_UNSYNCHRONIZED) - return -ENODEV; - if (GEM_WARN_ON(!obj->userptr.mm)) return -EINVAL; @@ -258,32 +255,6 @@ i915_mmu_notifier_free(struct i915_mmu_notifier *mn, kfree(mn); } -#else - -static void -__i915_gem_userptr_set_active(struct drm_i915_gem_object *obj, bool value) -{ -} - -static void -i915_gem_userptr_release__mmu_notifier(struct drm_i915_gem_object *obj) -{ -} - -static int -i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj, - unsigned flags) -{ - return -ENODEV; -} - -static void -i915_mmu_notifier_free(struct i915_mmu_notifier *mn, - struct mm_struct *mm) -{ -} - -#endif static struct i915_mm_struct * __i915_mm_struct_find(struct drm_i915_private *i915, struct mm_struct *real) @@ -725,6 +696,8 @@ static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = { .release = i915_gem_userptr_release, }; +#endif + /* * Creates a new mm object that wraps some normal memory from the process * context - user memory. @@ -765,12 +738,12 @@ i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { - static struct lock_class_key lock_class; + static struct lock_class_key __maybe_unused lock_class; struct drm_i915_private *dev_priv = to_i915(dev); struct drm_i915_gem_userptr *args = data; - struct drm_i915_gem_object *obj; - int ret; - u32 handle; + str
[Intel-gfx] [PATCH v6 48/64] drm/i915/selftests: Prepare object tests for obj->mm.lock removal.
Convert a single pin_pages call to use the unlocked version. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c index bf853c40ec65..740ee8086a27 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c @@ -47,7 +47,7 @@ static int igt_gem_huge(void *arg) if (IS_ERR(obj)) return PTR_ERR(obj); - err = i915_gem_object_pin_pages(obj); + err = i915_gem_object_pin_pages_unlocked(obj); if (err) { pr_err("Failed to allocate %u pages (%lu total), err=%d\n", nreal, obj->base.size / PAGE_SIZE, err); -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 32/64] drm/i915: Prepare for obj->mm.lock removal
From: Thomas Hellström Stolen objects need to lock, and we may call put_pages when refcount drops to 0, ensure all calls are handled correctly. Idea-from: Thomas Hellström Signed-off-by: Maarten Lankhorst Signed-off-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_object.h | 14 ++ drivers/gpu/drm/i915/gem/i915_gem_pages.c | 14 -- drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 10 +- 3 files changed, 35 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index a8e98acdfc04..105a49102827 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -118,6 +118,20 @@ i915_gem_object_put(struct drm_i915_gem_object *obj) #define assert_object_held(obj) dma_resv_assert_held((obj)->base.resv) +/* + * If more than one potential simultaneous locker, assert held. + */ +static inline void assert_object_held_shared(struct drm_i915_gem_object *obj) +{ + /* +* Note mm list lookup is protected by +* kref_get_unless_zero(). +*/ + if (IS_ENABLED(CONFIG_LOCKDEP) && + kref_read(&obj->base.refcount) > 0) + lockdep_assert_held(&obj->mm.lock); +} + static inline int __i915_gem_object_lock(struct drm_i915_gem_object *obj, struct i915_gem_ww_ctx *ww, bool intr) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c index e7e5898cb143..b86bef4fb602 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -18,7 +18,7 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, unsigned long supported = INTEL_INFO(i915)->page_sizes; int i; - lockdep_assert_held(&obj->mm.lock); + assert_object_held_shared(obj); if (i915_gem_object_is_volatile(obj)) obj->mm.madv = I915_MADV_DONTNEED; @@ -67,6 +67,7 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, struct list_head *list; unsigned long flags; + lockdep_assert_held(&obj->mm.lock); spin_lock_irqsave(&i915->mm.obj_lock, flags); i915->mm.shrink_count++; @@ -88,6 +89,8 @@ int i915_gem_object_get_pages(struct drm_i915_gem_object *obj) struct drm_i915_private *i915 = to_i915(obj->base.dev); int err; + assert_object_held_shared(obj); + if (unlikely(obj->mm.madv != I915_MADV_WILLNEED)) { drm_dbg(&i915->drm, "Attempting to obtain a purgeable object\n"); @@ -115,6 +118,8 @@ int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj) if (err) return err; + assert_object_held_shared(obj); + if (unlikely(!i915_gem_object_has_pages(obj))) { GEM_BUG_ON(i915_gem_object_has_pinned_pages(obj)); @@ -142,7 +147,7 @@ void i915_gem_object_truncate(struct drm_i915_gem_object *obj) /* Try to discard unwanted pages */ void i915_gem_object_writeback(struct drm_i915_gem_object *obj) { - lockdep_assert_held(&obj->mm.lock); + assert_object_held_shared(obj); GEM_BUG_ON(i915_gem_object_has_pages(obj)); if (obj->ops->writeback) @@ -173,6 +178,8 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj) { struct sg_table *pages; + assert_object_held_shared(obj); + pages = fetch_and_zero(&obj->mm.pages); if (IS_ERR_OR_NULL(pages)) return pages; @@ -200,6 +207,9 @@ int __i915_gem_object_put_pages_locked(struct drm_i915_gem_object *obj) if (i915_gem_object_has_pinned_pages(obj)) return -EBUSY; + /* May be called by shrinker from within get_pages() (on another bo) */ + assert_object_held_shared(obj); + i915_gem_object_release_mmap_offset(obj); /* diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c index 5372b888ba01..ce9086d3a647 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c @@ -643,11 +643,19 @@ __i915_gem_object_create_stolen(struct intel_memory_region *mem, cache_level = HAS_LLC(mem->i915) ? I915_CACHE_LLC : I915_CACHE_NONE; i915_gem_object_set_cache_coherency(obj, cache_level); + if (WARN_ON(!i915_gem_object_trylock(obj))) { + err = -EBUSY; + goto cleanup; + } + err = i915_gem_object_pin_pages(obj); - if (err) + if (err) { + i915_gem_object_unlock(obj); goto cleanup; + } i915_gem_object_i
[Intel-gfx] [PATCH v6 55/64] drm/i915/selftests: Prepare ring submission for obj->mm.lock removal
Use unlocked versions when the ww lock is not held. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gt/selftest_ring_submission.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_ring_submission.c b/drivers/gpu/drm/i915/gt/selftest_ring_submission.c index 3350e7c995bc..99609271c3a7 100644 --- a/drivers/gpu/drm/i915/gt/selftest_ring_submission.c +++ b/drivers/gpu/drm/i915/gt/selftest_ring_submission.c @@ -35,7 +35,7 @@ static struct i915_vma *create_wally(struct intel_engine_cs *engine) return ERR_PTR(err); } - cs = i915_gem_object_pin_map(obj, I915_MAP_WC); + cs = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(cs)) { i915_gem_object_put(obj); return ERR_CAST(cs); @@ -212,7 +212,7 @@ static int __live_ctx_switch_wa(struct intel_engine_cs *engine) if (IS_ERR(bb)) return PTR_ERR(bb); - result = i915_gem_object_pin_map(bb->obj, I915_MAP_WC); + result = i915_gem_object_pin_map_unlocked(bb->obj, I915_MAP_WC); if (IS_ERR(result)) { intel_context_put(bb->private); i915_vma_unpin_and_release(&bb, 0); -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 31/64] drm/i915: Fix workarounds selftest, part 1
pin_map needs the ww lock, so ensure we pin both before submission. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_object.h| 3 + drivers/gpu/drm/i915/gem/i915_gem_pages.c | 12 +++ .../gpu/drm/i915/gt/selftest_workarounds.c| 76 --- 3 files changed, 64 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index fbd01e25ca71..a8e98acdfc04 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -395,6 +395,9 @@ enum i915_map_type { void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object *obj, enum i915_map_type type); +void *__must_check i915_gem_object_pin_map_unlocked(struct drm_i915_gem_object *obj, + enum i915_map_type type); + void __i915_gem_object_flush_map(struct drm_i915_gem_object *obj, unsigned long offset, unsigned long size); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c index ffc3c3c474b1..e7e5898cb143 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -397,6 +397,18 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj, goto out_unlock; } +void *i915_gem_object_pin_map_unlocked(struct drm_i915_gem_object *obj, + enum i915_map_type type) +{ + void *ret; + + i915_gem_object_lock(obj, NULL); + ret = i915_gem_object_pin_map(obj, type); + i915_gem_object_unlock(obj); + + return ret; +} + void __i915_gem_object_flush_map(struct drm_i915_gem_object *obj, unsigned long offset, unsigned long size) diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c index 7ce7955c44d4..5cc78eada097 100644 --- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c +++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c @@ -112,7 +112,7 @@ read_nonprivs(struct intel_context *ce) i915_gem_object_set_cache_coherency(result, I915_CACHE_LLC); - cs = i915_gem_object_pin_map(result, I915_MAP_WB); + cs = i915_gem_object_pin_map_unlocked(result, I915_MAP_WB); if (IS_ERR(cs)) { err = PTR_ERR(cs); goto err_obj; @@ -218,7 +218,7 @@ static int check_whitelist(struct intel_context *ce) i915_gem_object_lock(results, NULL); intel_wedge_on_timeout(&wedge, engine->gt, HZ / 5) /* safety net! */ err = i915_gem_object_set_to_cpu_domain(results, false); - i915_gem_object_unlock(results); + if (intel_gt_is_wedged(engine->gt)) err = -EIO; if (err) @@ -246,6 +246,7 @@ static int check_whitelist(struct intel_context *ce) i915_gem_object_unpin_map(results); out_put: + i915_gem_object_unlock(results); i915_gem_object_put(results); return err; } @@ -502,6 +503,7 @@ static int check_dirty_whitelist(struct intel_context *ce) for (i = 0; i < engine->whitelist.count; i++) { u32 reg = i915_mmio_reg_offset(engine->whitelist.list[i].reg); + struct i915_gem_ww_ctx ww; u64 addr = scratch->node.start; struct i915_request *rq; u32 srm, lrm, rsvd; @@ -517,6 +519,29 @@ static int check_dirty_whitelist(struct intel_context *ce) ro_reg = ro_register(reg); + i915_gem_ww_ctx_init(&ww, false); +retry: + cs = NULL; + err = i915_gem_object_lock(scratch->obj, &ww); + if (!err) + err = i915_gem_object_lock(batch->obj, &ww); + if (!err) + err = intel_context_pin_ww(ce, &ww); + if (err) + goto out; + + cs = i915_gem_object_pin_map(batch->obj, I915_MAP_WC); + if (IS_ERR(cs)) { + err = PTR_ERR(cs); + goto out_ctx; + } + + results = i915_gem_object_pin_map(scratch->obj, I915_MAP_WB); + if (IS_ERR(results)) { + err = PTR_ERR(results); + goto out_unmap_batch; + } + /* Clear non priv flags */ reg &= RING_FORCE_TO_NONPRIV_ADDRESS_MASK; @@ -528,12 +553,6 @@ static int check_dirty_whitelist(struct intel_context *ce) pr_debug("%s: Writing garbage to %x\n", engine->name, reg); - cs = i915_gem_object_pin_map(batch->obj, I915_MAP_WC); -
[Intel-gfx] [PATCH v6 02/64] drm/i915: Pin timeline map after first timeline pin, v3.
We're starting to require the reservation lock for pinning, so wait until we have that. Update the selftests to handle this correctly, and ensure pin is called in live_hwsp_rollover_user() and mock_hwsp_freelist(). Changes since v1: - Fix NULL + XX arithmatic, use casts. (kbuild) Changes since v2: - Clear entire cacheline when pinning. Signed-off-by: Maarten Lankhorst Reported-by: kernel test robot --- drivers/gpu/drm/i915/gt/intel_timeline.c| 40 + drivers/gpu/drm/i915/gt/intel_timeline.h| 2 + drivers/gpu/drm/i915/gt/mock_engine.c | 22 ++- drivers/gpu/drm/i915/gt/selftest_timeline.c | 63 +++-- drivers/gpu/drm/i915/i915_selftest.h| 2 + 5 files changed, 84 insertions(+), 45 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c index 3d43f15f34f3..7509af471341 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline.c +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c @@ -53,14 +53,29 @@ static int __timeline_active(struct i915_active *active) return 0; } +I915_SELFTEST_EXPORT int +intel_timeline_pin_map(struct intel_timeline *timeline) +{ + struct drm_i915_gem_object *obj = timeline->hwsp_ggtt->obj; + u32 ofs = offset_in_page(timeline->hwsp_offset); + void *vaddr; + + vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB); + if (IS_ERR(vaddr)) + return PTR_ERR(vaddr); + + timeline->hwsp_map = vaddr; + timeline->hwsp_seqno = memset(vaddr + ofs, 0, CACHELINE_BYTES); + clflush(vaddr + ofs); + + return 0; +} + static int intel_timeline_init(struct intel_timeline *timeline, struct intel_gt *gt, struct i915_vma *hwsp, unsigned int offset) { - void *vaddr; - u32 *seqno; - kref_init(&timeline->kref); atomic_set(&timeline->pin_count, 0); @@ -77,14 +92,8 @@ static int intel_timeline_init(struct intel_timeline *timeline, timeline->hwsp_ggtt = hwsp; } - vaddr = i915_gem_object_pin_map(hwsp->obj, I915_MAP_WB); - if (IS_ERR(vaddr)) - return PTR_ERR(vaddr); - - timeline->hwsp_map = vaddr; - seqno = vaddr + timeline->hwsp_offset; - WRITE_ONCE(*seqno, 0); - timeline->hwsp_seqno = seqno; + timeline->hwsp_map = NULL; + timeline->hwsp_seqno = (void *)(long)timeline->hwsp_offset; GEM_BUG_ON(timeline->hwsp_offset >= hwsp->size); @@ -114,7 +123,8 @@ static void intel_timeline_fini(struct rcu_head *rcu) struct intel_timeline *timeline = container_of(rcu, struct intel_timeline, rcu); - i915_gem_object_unpin_map(timeline->hwsp_ggtt->obj); + if (timeline->hwsp_map) + i915_gem_object_unpin_map(timeline->hwsp_ggtt->obj); i915_vma_put(timeline->hwsp_ggtt); i915_active_fini(&timeline->active); @@ -174,6 +184,12 @@ int intel_timeline_pin(struct intel_timeline *tl, struct i915_gem_ww_ctx *ww) if (atomic_add_unless(&tl->pin_count, 1, 0)) return 0; + if (!tl->hwsp_map) { + err = intel_timeline_pin_map(tl); + if (err) + return err; + } + err = i915_ggtt_pin(tl->hwsp_ggtt, ww, 0, PIN_HIGH); if (err) return err; diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.h b/drivers/gpu/drm/i915/gt/intel_timeline.h index f502a619843f..c50543d3ed5d 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline.h +++ b/drivers/gpu/drm/i915/gt/intel_timeline.h @@ -110,4 +110,6 @@ void intel_gt_show_timelines(struct intel_gt *gt, const char *prefix, int indent)); +I915_SELFTEST_DECLARE(int intel_timeline_pin_map(struct intel_timeline *tl)); + #endif diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c index 2f830017c51d..effbac877eec 100644 --- a/drivers/gpu/drm/i915/gt/mock_engine.c +++ b/drivers/gpu/drm/i915/gt/mock_engine.c @@ -32,9 +32,20 @@ #include "mock_engine.h" #include "selftests/mock_request.h" -static void mock_timeline_pin(struct intel_timeline *tl) +static int mock_timeline_pin(struct intel_timeline *tl) { + int err; + + if (WARN_ON(!i915_gem_object_trylock(tl->hwsp_ggtt->obj))) + return -EBUSY; + + err = intel_timeline_pin_map(tl); + i915_gem_object_unlock(tl->hwsp_ggtt->obj); + if (err) + return err; + atomic_inc(&tl->pin_count); + return 0; } static void mock_timeline_unpin(struct intel_timeline *tl) @@ -152,6 +163,8 @@ static void mock_context_destroy(struct kref *ref) static int mock_conte
[Intel-gfx] [PATCH v6 42/64] drm/i915/selftests: Prepare client blit for obj->mm.lock removal.
Straightforward conversion, just convert a bunch of calls to unlocked versions. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c index 6a674a7994df..d36873885cc1 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c @@ -45,7 +45,7 @@ static int __igt_client_fill(struct intel_engine_cs *engine) goto err_flush; } - vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB); + vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB); if (IS_ERR(vaddr)) { err = PTR_ERR(vaddr); goto err_put; @@ -157,7 +157,7 @@ static int prepare_blit(const struct tiled_blits *t, u32 src_pitch, dst_pitch; u32 cmd, *cs; - cs = i915_gem_object_pin_map(batch, I915_MAP_WC); + cs = i915_gem_object_pin_map_unlocked(batch, I915_MAP_WC); if (IS_ERR(cs)) return PTR_ERR(cs); @@ -377,7 +377,7 @@ static int verify_buffer(const struct tiled_blits *t, y = i915_prandom_u32_max_state(t->height, prng); p = y * t->width + x; - vaddr = i915_gem_object_pin_map(buf->vma->obj, I915_MAP_WC); + vaddr = i915_gem_object_pin_map_unlocked(buf->vma->obj, I915_MAP_WC); if (IS_ERR(vaddr)) return PTR_ERR(vaddr); @@ -564,7 +564,7 @@ static int tiled_blits_prepare(struct tiled_blits *t, int err; int i; - map = i915_gem_object_pin_map(t->scratch.vma->obj, I915_MAP_WC); + map = i915_gem_object_pin_map_unlocked(t->scratch.vma->obj, I915_MAP_WC); if (IS_ERR(map)) return PTR_ERR(map); -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 14/64] drm/i915: Reject UNSYNCHRONIZED for userptr, v2.
We should not allow this any more, as it will break with the new userptr implementation, it could still be made to work, but there's no point in doing so. Inspection of the beignet opencl driver shows that it's only used when normal userptr is not available, which means for new kernels you will need CONFIG_I915_USERPTR. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 10 ++ 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c index 64a946d5f753..241f865077b9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c @@ -224,7 +224,7 @@ i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj, struct i915_mmu_object *mo; if (flags & I915_USERPTR_UNSYNCHRONIZED) - return capable(CAP_SYS_ADMIN) ? 0 : -EPERM; + return -ENODEV; if (GEM_WARN_ON(!obj->userptr.mm)) return -EINVAL; @@ -274,13 +274,7 @@ static int i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj, unsigned flags) { - if ((flags & I915_USERPTR_UNSYNCHRONIZED) == 0) - return -ENODEV; - - if (!capable(CAP_SYS_ADMIN)) - return -EPERM; - - return 0; + return -ENODEV; } static void -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 28/64] drm/i915: Take obj lock around set_domain ioctl
We need to lock the object to move it to the correct domain, add the missing lock. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_domain.c | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c index 51a33c4f61d0..e62f9e8dd339 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c @@ -516,6 +516,10 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, goto out; } + err = i915_gem_object_lock_interruptible(obj, NULL); + if (err) + goto out; + /* * Flush and acquire obj->pages so that we are coherent through * direct access in memory with previous cached writes through @@ -527,7 +531,7 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, */ err = i915_gem_object_pin_pages(obj); if (err) - goto out; + goto out_unlock; /* * Already in the desired write domain? Nothing for us to do! @@ -542,10 +546,6 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, if (READ_ONCE(obj->write_domain) == read_domains) goto out_unpin; - err = i915_gem_object_lock_interruptible(obj, NULL); - if (err) - goto out_unpin; - if (read_domains & I915_GEM_DOMAIN_WC) err = i915_gem_object_set_to_wc_domain(obj, write_domain); else if (read_domains & I915_GEM_DOMAIN_GTT) @@ -556,13 +556,15 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, /* And bump the LRU for this access */ i915_gem_object_bump_inactive_ggtt(obj); +out_unpin: + i915_gem_object_unpin_pages(obj); + +out_unlock: i915_gem_object_unlock(obj); - if (write_domain) + if (!err && write_domain) i915_gem_object_invalidate_frontbuffer(obj, ORIGIN_CPU); -out_unpin: - i915_gem_object_unpin_pages(obj); out: i915_gem_object_put(obj); return err; -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 29/64] drm/i915: Defer pin calls in buffer pool until first use by caller.
We need to take the obj lock to pin pages, so wait until the callers have done so, before making the object unshrinkable. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 2 + .../gpu/drm/i915/gem/i915_gem_object_blt.c| 6 +++ .../gpu/drm/i915/gt/intel_gt_buffer_pool.c| 47 +-- .../gpu/drm/i915/gt/intel_gt_buffer_pool.h| 5 ++ .../drm/i915/gt/intel_gt_buffer_pool_types.h | 1 + 5 files changed, 35 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index d4c065c1da06..1ac272f4634a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -1342,6 +1342,7 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb, err = PTR_ERR(cmd); goto err_pool; } + intel_gt_buffer_pool_mark_used(pool); memset32(cmd, 0, pool->obj->base.size / sizeof(u32)); @@ -2636,6 +2637,7 @@ static int eb_parse(struct i915_execbuffer *eb) err = PTR_ERR(shadow); goto err; } + intel_gt_buffer_pool_mark_used(pool); i915_gem_object_set_readonly(shadow->obj); shadow->private = pool; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c index 10cac9fac79b..13f681fc27db 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c @@ -55,6 +55,9 @@ struct i915_vma *intel_emit_vma_fill_blt(struct intel_context *ce, if (unlikely(err)) goto out_put; + /* we pinned the pool, mark it as such */ + intel_gt_buffer_pool_mark_used(pool); + cmd = i915_gem_object_pin_map(pool->obj, I915_MAP_WC); if (IS_ERR(cmd)) { err = PTR_ERR(cmd); @@ -277,6 +280,9 @@ struct i915_vma *intel_emit_vma_copy_blt(struct intel_context *ce, if (unlikely(err)) goto out_put; + /* we pinned the pool, mark it as such */ + intel_gt_buffer_pool_mark_used(pool); + cmd = i915_gem_object_pin_map(pool->obj, I915_MAP_WC); if (IS_ERR(cmd)) { err = PTR_ERR(cmd); diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c index 104cb30e8c13..030759305196 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c @@ -98,28 +98,6 @@ static void pool_free_work(struct work_struct *wrk) round_jiffies_up_relative(HZ)); } -static int pool_active(struct i915_active *ref) -{ - struct intel_gt_buffer_pool_node *node = - container_of(ref, typeof(*node), active); - struct dma_resv *resv = node->obj->base.resv; - int err; - - if (dma_resv_trylock(resv)) { - dma_resv_add_excl_fence(resv, NULL); - dma_resv_unlock(resv); - } - - err = i915_gem_object_pin_pages(node->obj); - if (err) - return err; - - /* Hide this pinned object from the shrinker until retired */ - i915_gem_object_make_unshrinkable(node->obj); - - return 0; -} - __i915_active_call static void pool_retire(struct i915_active *ref) { @@ -129,10 +107,13 @@ static void pool_retire(struct i915_active *ref) struct list_head *list = bucket_for_size(pool, node->obj->base.size); unsigned long flags; - i915_gem_object_unpin_pages(node->obj); + if (node->pinned) { + i915_gem_object_unpin_pages(node->obj); - /* Return this object to the shrinker pool */ - i915_gem_object_make_purgeable(node->obj); + /* Return this object to the shrinker pool */ + i915_gem_object_make_purgeable(node->obj); + node->pinned = false; + } GEM_BUG_ON(node->age); spin_lock_irqsave(&pool->lock, flags); @@ -144,6 +125,19 @@ static void pool_retire(struct i915_active *ref) round_jiffies_up_relative(HZ)); } +void intel_gt_buffer_pool_mark_used(struct intel_gt_buffer_pool_node *node) +{ + assert_object_held(node->obj); + + if (node->pinned) + return; + + __i915_gem_object_pin_pages(node->obj); + /* Hide this pinned object from the shrinker until retired */ + i915_gem_object_make_unshrinkable(node->obj); + node->pinned = true; +} + static struct intel_gt_buffer_pool_node * node_create(struct intel_gt_buffer_pool *pool, size_t sz) { @@ -158,7 +152,8 @@ node_create(struct intel_gt_buffer_pool *pool, size_t sz) node->age = 0; node->pool = pool; - i915_active_init(&node->active, pool_acti
[Intel-gfx] [PATCH v6 12/64] drm/i915: No longer allow exporting userptr through dma-buf
It doesn't make sense to export a memory address, we will prevent allowing access this way to different address spaces when we rework userptr handling, so best to explicitly disable it. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström -- Still needs an ack from relevant userspace that it won't break, but should be good. --- drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c index 8c3d1eb2f96a..44af6265948d 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c @@ -694,10 +694,9 @@ i915_gem_userptr_release(struct drm_i915_gem_object *obj) static int i915_gem_userptr_dmabuf_export(struct drm_i915_gem_object *obj) { - if (obj->userptr.mmu_object) - return 0; + drm_dbg(obj->base.dev, "Exporting userptr no longer allowed\n"); - return i915_gem_userptr_init__mmu_notifier(obj, 0); + return -EINVAL; } static int -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 62/64] drm/i915: Keep userpointer bindings if seqcount is unchanged, v2.
Instead of force unbinding and rebinding every time, we try to check if our notifier seqcount is still correct when pages are bound. This way we only rebind userptr when we need to, and prevent stalls. Changes since v1: - Missing mutex_unlock, reported by kbuild. Reported-by: kernel test robot Reported-by: Dan Carpenter Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 27 ++--- 1 file changed, 24 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c index 84c3e8dc1491..44fc0df89784 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c @@ -274,12 +274,33 @@ int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj) if (ret) return ret; - /* Make sure userptr is unbound for next attempt, so we don't use stale pages. */ - ret = i915_gem_object_userptr_unbind(obj, false); + /* optimistically try to preserve current pages while unlocked */ + if (i915_gem_object_has_pages(obj) && + !mmu_interval_check_retry(&obj->userptr.notifier, + obj->userptr.notifier_seq)) { + spin_lock(&i915->mm.notifier_lock); + if (obj->userptr.pvec && + !mmu_interval_read_retry(&obj->userptr.notifier, +obj->userptr.notifier_seq)) { + obj->userptr.page_ref++; + + /* We can keep using the current binding, this is the fastpath */ + ret = 1; + } + spin_unlock(&i915->mm.notifier_lock); + } + + if (!ret) { + /* Make sure userptr is unbound for next attempt, so we don't use stale pages. */ + ret = i915_gem_object_userptr_unbind(obj, false); + } i915_gem_object_unlock(obj); - if (ret) + if (ret < 0) return ret; + if (ret > 0) + return 0; + notifier_seq = mmu_interval_read_begin(&obj->userptr.notifier); pvec = kvmalloc_array(num_pages, sizeof(struct page *), GFP_KERNEL); -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 09/64] drm/i915: Convert i915_gem_object_attach_phys() to ww locking, v2.
Simple adding of i915_gem_object_lock, we may start to pass ww to get_pages() in the future, but that won't be the case here; We override shmem's get_pages() handling by calling i915_gem_object_get_pages_phys(), no ww is needed. Changes since v1: - Call shmem put pages directly, the callback would go down the phys free path. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_object.h | 3 ++- drivers/gpu/drm/i915/gem/i915_gem_phys.c | 12 ++-- drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 17 ++--- 3 files changed, 22 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index aa55afc0c858..fa699d45e5a9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -43,10 +43,11 @@ int i915_gem_object_pread_phys(struct drm_i915_gem_object *obj, const struct drm_i915_gem_pread *args); int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align); +void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, +struct sg_table *pages); void i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj, struct sg_table *pages); - void i915_gem_flush_free_objects(struct drm_i915_private *i915); struct sg_table * diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c index 4bdd0429c08b..144e4940eede 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c @@ -201,7 +201,7 @@ static int i915_gem_object_shmem_to_phys(struct drm_i915_gem_object *obj) __i915_gem_object_pin_pages(obj); if (!IS_ERR_OR_NULL(pages)) - i915_gem_shmem_ops.put_pages(obj, pages); + i915_gem_object_put_pages_shmem(obj, pages); i915_gem_object_release_memory_region(obj); return 0; @@ -232,7 +232,13 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align) if (err) return err; - mutex_lock_nested(&obj->mm.lock, I915_MM_GET_PAGES); + err = i915_gem_object_lock_interruptible(obj, NULL); + if (err) + return err; + + err = mutex_lock_interruptible_nested(&obj->mm.lock, I915_MM_GET_PAGES); + if (err) + goto err_unlock; if (unlikely(!i915_gem_object_has_struct_page(obj))) goto out; @@ -263,6 +269,8 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align) out: mutex_unlock(&obj->mm.lock); +err_unlock: + i915_gem_object_unlock(obj); return err; } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c index d590e0c3bd00..7a59fd1ea4e5 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c @@ -296,18 +296,12 @@ __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj, __start_cpu_write(obj); } -static void -shmem_put_pages(struct drm_i915_gem_object *obj, struct sg_table *pages) +void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages) { struct sgt_iter sgt_iter; struct pagevec pvec; struct page *page; - if (unlikely(!i915_gem_object_has_struct_page(obj))) { - i915_gem_object_put_pages_phys(obj, pages); - return; - } - __i915_gem_object_release_shmem(obj, pages, true); i915_gem_gtt_finish_pages(obj, pages); @@ -336,6 +330,15 @@ shmem_put_pages(struct drm_i915_gem_object *obj, struct sg_table *pages) kfree(pages); } +static void +shmem_put_pages(struct drm_i915_gem_object *obj, struct sg_table *pages) +{ + if (likely(i915_gem_object_has_struct_page(obj))) + i915_gem_object_put_pages_shmem(obj, pages); + else + i915_gem_object_put_pages_phys(obj, pages); +} + static int shmem_pwrite(struct drm_i915_gem_object *obj, const struct drm_i915_gem_pwrite *arg) -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 07/64] drm/i915: Move HAS_STRUCT_PAGE to obj->flags
We want to remove the changing of ops structure for attaching phys pages, so we need to kill off HAS_STRUCT_PAGE from ops->flags, and put it in the bo. This will remove a potential race of dereferencing the wrong obj->ops without ww mutex held. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_internal.c | 6 +++--- drivers/gpu/drm/i915/gem/i915_gem_lmem.c | 4 ++-- drivers/gpu/drm/i915/gem/i915_gem_mman.c | 7 +++ drivers/gpu/drm/i915/gem/i915_gem_object.c | 4 +++- drivers/gpu/drm/i915/gem/i915_gem_object.h | 5 +++-- drivers/gpu/drm/i915/gem/i915_gem_object_types.h | 8 +--- drivers/gpu/drm/i915/gem/i915_gem_pages.c| 5 ++--- drivers/gpu/drm/i915/gem/i915_gem_phys.c | 2 ++ drivers/gpu/drm/i915/gem/i915_gem_region.c | 4 +--- drivers/gpu/drm/i915/gem/i915_gem_region.h | 3 +-- drivers/gpu/drm/i915/gem/i915_gem_shmem.c| 8 drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 4 ++-- drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 6 +++--- drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c | 4 ++-- drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 10 +- drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c | 11 --- drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c | 12 drivers/gpu/drm/i915/gvt/dmabuf.c| 2 +- drivers/gpu/drm/i915/selftests/i915_gem_gtt.c| 2 +- drivers/gpu/drm/i915/selftests/mock_region.c | 4 ++-- 21 files changed, 62 insertions(+), 51 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c index 04e9c04545ad..36e3c2765f4c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c @@ -258,7 +258,7 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, } drm_gem_private_object_init(dev, &obj->base, dma_buf->size); - i915_gem_object_init(obj, &i915_gem_object_dmabuf_ops, &lock_class); + i915_gem_object_init(obj, &i915_gem_object_dmabuf_ops, &lock_class, 0); obj->base.import_attach = attach; obj->base.resv = dma_buf->resv; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_internal.c b/drivers/gpu/drm/i915/gem/i915_gem_internal.c index ad22f42541bd..21cc40897ca8 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_internal.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c @@ -138,8 +138,7 @@ static void i915_gem_object_put_pages_internal(struct drm_i915_gem_object *obj, static const struct drm_i915_gem_object_ops i915_gem_object_internal_ops = { .name = "i915_gem_object_internal", - .flags = I915_GEM_OBJECT_HAS_STRUCT_PAGE | -I915_GEM_OBJECT_IS_SHRINKABLE, + .flags = I915_GEM_OBJECT_IS_SHRINKABLE, .get_pages = i915_gem_object_get_pages_internal, .put_pages = i915_gem_object_put_pages_internal, }; @@ -178,7 +177,8 @@ i915_gem_object_create_internal(struct drm_i915_private *i915, return ERR_PTR(-ENOMEM); drm_gem_private_object_init(&i915->drm, &obj->base, size); - i915_gem_object_init(obj, &i915_gem_object_internal_ops, &lock_class); + i915_gem_object_init(obj, &i915_gem_object_internal_ops, &lock_class, +I915_BO_ALLOC_STRUCT_PAGE); /* * Mark the object as volatile, such that the pages are marked as diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c index 932ee21e6609..e953965f8263 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c @@ -45,13 +45,13 @@ __i915_gem_lmem_object_create(struct intel_memory_region *mem, return ERR_PTR(-ENOMEM); drm_gem_private_object_init(&i915->drm, &obj->base, size); - i915_gem_object_init(obj, &i915_gem_lmem_obj_ops, &lock_class); + i915_gem_object_init(obj, &i915_gem_lmem_obj_ops, &lock_class, flags); obj->read_domains = I915_GEM_DOMAIN_WC | I915_GEM_DOMAIN_GTT; i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE); - i915_gem_object_init_memory_region(obj, mem, flags); + i915_gem_object_init_memory_region(obj, mem); return obj; } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c index ec28a6cde49b..c0034d811e50 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c @@ -251,7 +251,7 @@ static vm_fault_t vm_fault_cpu(struct vm_fault *vmf) goto out; iomap = -1; - if (!i915_gem_object_type_has(o
[Intel-gfx] [PATCH v6 26/64] drm/i915: Make lrc_init_wa_ctx compatible with ww locking.
Make creation separate from pinning, in order to take the lock only once, and pin the mapping with the lock held. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gt/intel_lrc.c | 32 ++--- 1 file changed, 25 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 4e856947fb13..707981db189b 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1422,7 +1422,7 @@ gen10_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch) #define CTX_WA_BB_SIZE (PAGE_SIZE) -static int lrc_setup_wa_ctx(struct intel_engine_cs *engine) +static int lrc_create_wa_ctx(struct intel_engine_cs *engine) { struct drm_i915_gem_object *obj; struct i915_vma *vma; @@ -1438,10 +1438,6 @@ static int lrc_setup_wa_ctx(struct intel_engine_cs *engine) goto err; } - err = i915_ggtt_pin(vma, NULL, 0, PIN_HIGH); - if (err) - goto err; - engine->wa_ctx.vma = vma; return 0; @@ -1464,6 +1460,7 @@ int lrc_init_wa_ctx(struct intel_engine_cs *engine) &wa_ctx->indirect_ctx, &wa_ctx->per_ctx }; wa_bb_func_t wa_bb_fn[ARRAY_SIZE(wa_bb)]; + struct i915_gem_ww_ctx ww; void *batch, *batch_ptr; unsigned int i; int ret; @@ -1492,13 +1489,21 @@ int lrc_init_wa_ctx(struct intel_engine_cs *engine) return 0; } - ret = lrc_setup_wa_ctx(engine); + ret = lrc_create_wa_ctx(engine); if (ret) { drm_dbg(&engine->i915->drm, "Failed to setup context WA page: %d\n", ret); return ret; } + i915_gem_ww_ctx_init(&ww, true); +retry: + ret = i915_gem_object_lock(wa_ctx->vma->obj, &ww); + if (!ret) + ret = i915_ggtt_pin(wa_ctx->vma, &ww, 0, PIN_HIGH); + if (ret) + goto err; + batch = i915_gem_object_pin_map(wa_ctx->vma->obj, I915_MAP_WB); /* @@ -1522,8 +1527,21 @@ int lrc_init_wa_ctx(struct intel_engine_cs *engine) __i915_gem_object_flush_map(wa_ctx->vma->obj, 0, batch_ptr - batch); __i915_gem_object_release_map(wa_ctx->vma->obj); + if (ret) - lrc_fini_wa_ctx(engine); + i915_vma_unpin(wa_ctx->vma); +err: + if (ret == -EDEADLK) { + ret = i915_gem_ww_ctx_backoff(&ww); + if (!ret) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); + + if (ret && engine->wa_ctx.vma) { + i915_vma_put(engine->wa_ctx.vma); + engine->wa_ctx.vma = NULL; + } return ret; } -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 54/64] drm/i915/selftests: Prepare mocs tests for obj->mm.lock removal
Use pin_map_unlocked when we're not holding locks. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gt/selftest_mocs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_mocs.c b/drivers/gpu/drm/i915/gt/selftest_mocs.c index 2dd063bc656d..fc8004e8a763 100644 --- a/drivers/gpu/drm/i915/gt/selftest_mocs.c +++ b/drivers/gpu/drm/i915/gt/selftest_mocs.c @@ -80,7 +80,7 @@ static int live_mocs_init(struct live_mocs *arg, struct intel_gt *gt) if (IS_ERR(arg->scratch)) return PTR_ERR(arg->scratch); - arg->vaddr = i915_gem_object_pin_map(arg->scratch->obj, I915_MAP_WB); + arg->vaddr = i915_gem_object_pin_map_unlocked(arg->scratch->obj, I915_MAP_WB); if (IS_ERR(arg->vaddr)) { err = PTR_ERR(arg->vaddr); goto err_scratch; -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 58/64] drm/i915/selftests: Prepare memory region tests for obj->mm.lock removal
Use the unlocked variants for pin_map and pin_pages, and add lock around unpinning/putting pages. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- .../drm/i915/selftests/intel_memory_region.c | 18 +++--- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c index 75839db63bea..3b88504d7061 100644 --- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c +++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c @@ -31,10 +31,12 @@ static void close_objects(struct intel_memory_region *mem, struct drm_i915_gem_object *obj, *on; list_for_each_entry_safe(obj, on, objects, st_link) { + i915_gem_object_lock(obj, NULL); if (i915_gem_object_has_pinned_pages(obj)) i915_gem_object_unpin_pages(obj); /* No polluting the memory region between tests */ __i915_gem_object_put_pages(obj); + i915_gem_object_unlock(obj); list_del(&obj->st_link); i915_gem_object_put(obj); } @@ -69,7 +71,7 @@ static int igt_mock_fill(void *arg) break; } - err = i915_gem_object_pin_pages(obj); + err = i915_gem_object_pin_pages_unlocked(obj); if (err) { i915_gem_object_put(obj); break; @@ -109,7 +111,7 @@ igt_object_create(struct intel_memory_region *mem, if (IS_ERR(obj)) return obj; - err = i915_gem_object_pin_pages(obj); + err = i915_gem_object_pin_pages_unlocked(obj); if (err) goto put; @@ -123,8 +125,10 @@ igt_object_create(struct intel_memory_region *mem, static void igt_object_release(struct drm_i915_gem_object *obj) { + i915_gem_object_lock(obj, NULL); i915_gem_object_unpin_pages(obj); __i915_gem_object_put_pages(obj); + i915_gem_object_unlock(obj); list_del(&obj->st_link); i915_gem_object_put(obj); } @@ -433,7 +437,7 @@ static int igt_cpu_check(struct drm_i915_gem_object *obj, u32 dword, u32 val) if (err) return err; - ptr = i915_gem_object_pin_map(obj, I915_MAP_WC); + ptr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(ptr)) return PTR_ERR(ptr); @@ -538,7 +542,7 @@ static int igt_lmem_create(void *arg) if (IS_ERR(obj)) return PTR_ERR(obj); - err = i915_gem_object_pin_pages(obj); + err = i915_gem_object_pin_pages_unlocked(obj); if (err) goto out_put; @@ -577,7 +581,7 @@ static int igt_lmem_write_gpu(void *arg) goto out_file; } - err = i915_gem_object_pin_pages(obj); + err = i915_gem_object_pin_pages_unlocked(obj); if (err) goto out_put; @@ -649,7 +653,7 @@ static int igt_lmem_write_cpu(void *arg) if (IS_ERR(obj)) return PTR_ERR(obj); - vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC); + vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(vaddr)) { err = PTR_ERR(vaddr); goto out_put; @@ -753,7 +757,7 @@ create_region_for_mapping(struct intel_memory_region *mr, u64 size, u32 type, return obj; } - addr = i915_gem_object_pin_map(obj, type); + addr = i915_gem_object_pin_map_unlocked(obj, type); if (IS_ERR(addr)) { i915_gem_object_put(obj); if (PTR_ERR(addr) == -ENXIO) -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 17/64] drm/i915: Flatten obj->mm.lock
With userptr fixed, there is no need for all separate lockdep classes now, and we can remove all lockdep tricks used. A trylock in the shrinker is all we need now to flatten the locking hierarchy. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_object.c | 6 +--- drivers/gpu/drm/i915/gem/i915_gem_object.h | 20 ++-- drivers/gpu/drm/i915/gem/i915_gem_pages.c| 34 ++-- drivers/gpu/drm/i915/gem/i915_gem_phys.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 10 +++--- drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 2 +- 6 files changed, 27 insertions(+), 47 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 1393988bd5af..028a556ab1a5 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -62,7 +62,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj, const struct drm_i915_gem_object_ops *ops, struct lock_class_key *key, unsigned flags) { - __mutex_init(&obj->mm.lock, ops->name ?: "obj->mm.lock", key); + mutex_init(&obj->mm.lock); spin_lock_init(&obj->vma.lock); INIT_LIST_HEAD(&obj->vma.list); @@ -86,10 +86,6 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj, mutex_init(&obj->mm.get_page.lock); INIT_RADIX_TREE(&obj->mm.get_dma_page.radix, GFP_KERNEL | __GFP_NOWARN); mutex_init(&obj->mm.get_dma_page.lock); - - if (IS_ENABLED(CONFIG_LOCKDEP) && i915_gem_object_is_shrinkable(obj)) - i915_gem_shrinker_taints_mutex(to_i915(obj->base.dev), - &obj->mm.lock); } /** diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index db4ddbfcee40..aaa574545776 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -322,27 +322,10 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, int i915_gem_object_get_pages(struct drm_i915_gem_object *obj); int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj); -enum i915_mm_subclass { /* lockdep subclass for obj->mm.lock/struct_mutex */ - I915_MM_NORMAL = 0, - /* -* Only used by struct_mutex, when called "recursively" from -* direct-reclaim-esque. Safe because there is only every one -* struct_mutex in the entire system. -*/ - I915_MM_SHRINKER = 1, - /* -* Used for obj->mm.lock when allocating pages. Safe because the object -* isn't yet on any LRU, and therefore the shrinker can't deadlock on -* it. As soon as the object has pages, obj->mm.lock nests within -* fs_reclaim. -*/ - I915_MM_GET_PAGES = 1, -}; - static inline int __must_check i915_gem_object_pin_pages(struct drm_i915_gem_object *obj) { - might_lock_nested(&obj->mm.lock, I915_MM_GET_PAGES); + might_lock(&obj->mm.lock); if (atomic_inc_not_zero(&obj->mm.pages_pin_count)) return 0; @@ -386,6 +369,7 @@ i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj) } int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj); +int __i915_gem_object_put_pages_locked(struct drm_i915_gem_object *obj); void i915_gem_object_truncate(struct drm_i915_gem_object *obj); void i915_gem_object_writeback(struct drm_i915_gem_object *obj); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c index aefdbe36fe6f..ffc3c3c474b1 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -111,7 +111,7 @@ int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj) { int err; - err = mutex_lock_interruptible_nested(&obj->mm.lock, I915_MM_GET_PAGES); + err = mutex_lock_interruptible(&obj->mm.lock); if (err) return err; @@ -193,21 +193,13 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj) return pages; } -int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj) +int __i915_gem_object_put_pages_locked(struct drm_i915_gem_object *obj) { struct sg_table *pages; - int err; if (i915_gem_object_has_pinned_pages(obj)) return -EBUSY; - /* May be called by shrinker from within get_pages() (on another bo) */ - mutex_lock(&obj->mm.lock); - if (unlikely(atomic_read(&obj->mm.pages_pin_count))) { - err = -EBUSY; - goto unlock; - } - i915_gem_object_release_mmap_offset(obj); /* @@ -223,14 +215,22 @@ int __i915_gem_objec
[Intel-gfx] [PATCH v6 59/64] drm/i915/selftests: Prepare cs engine tests for obj->mm.lock removal
Same as other tests, use pin_map_unlocked. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gt/selftest_engine_cs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_cs.c b/drivers/gpu/drm/i915/gt/selftest_engine_cs.c index fd9414b0d1bd..625ebb7ccc4e 100644 --- a/drivers/gpu/drm/i915/gt/selftest_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/selftest_engine_cs.c @@ -76,7 +76,7 @@ static struct i915_vma *create_empty_batch(struct intel_context *ce) if (IS_ERR(obj)) return ERR_CAST(obj); - cs = i915_gem_object_pin_map(obj, I915_MAP_WB); + cs = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB); if (IS_ERR(cs)) { err = PTR_ERR(cs); goto err_put; @@ -212,7 +212,7 @@ static struct i915_vma *create_nop_batch(struct intel_context *ce) if (IS_ERR(obj)) return ERR_CAST(obj); - cs = i915_gem_object_pin_map(obj, I915_MAP_WB); + cs = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB); if (IS_ERR(cs)) { err = PTR_ERR(cs); goto err_put; -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 19/64] drm/i915: Make ring submission compatible with obj->mm.lock removal, v2.
We map the initial context during first pin. This allows us to remove pin_map from state allocation, which saves us a few retry loops. We won't need this until first pin anyway. intel_ring_submission_setup() is also reworked slightly to do all pinning in a single ww loop. Changes since v1: - Handle -EDEADLK backoff in intel_ring_submission_setup() better. - Handle smatch errors reported by Dan and testbot. Signed-off-by: Maarten Lankhorst Reported-by: kernel test robot Reported-by: Dan Carpenter Reviewed-by: Thomas Hellström --- .../gpu/drm/i915/gt/intel_ring_submission.c | 184 +++--- 1 file changed, 118 insertions(+), 66 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c index 4ea741f488a8..b44f7e64c986 100644 --- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c @@ -511,6 +511,26 @@ static void ring_context_destroy(struct kref *ref) intel_context_free(ce); } +static int ring_context_init_default_state(struct intel_context *ce, + struct i915_gem_ww_ctx *ww) +{ + struct drm_i915_gem_object *obj = ce->state->obj; + void *vaddr; + + vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB); + if (IS_ERR(vaddr)) + return PTR_ERR(vaddr); + + shmem_read(ce->engine->default_state, 0, + vaddr, ce->engine->context_size); + + i915_gem_object_flush_map(obj); + __i915_gem_object_release_map(obj); + + __set_bit(CONTEXT_VALID_BIT, &ce->flags); + return 0; +} + static int ring_context_pre_pin(struct intel_context *ce, struct i915_gem_ww_ctx *ww, void **unused) @@ -518,6 +538,13 @@ static int ring_context_pre_pin(struct intel_context *ce, struct i915_address_space *vm; int err = 0; + if (ce->engine->default_state && + !test_bit(CONTEXT_VALID_BIT, &ce->flags)) { + err = ring_context_init_default_state(ce, ww); + if (err) + return err; + } + vm = vm_alias(ce->vm); if (vm) err = gen6_ppgtt_pin(i915_vm_to_ppgtt((vm)), ww); @@ -573,22 +600,6 @@ alloc_context_vma(struct intel_engine_cs *engine) if (IS_IVYBRIDGE(i915)) i915_gem_object_set_cache_coherency(obj, I915_CACHE_L3_LLC); - if (engine->default_state) { - void *vaddr; - - vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB); - if (IS_ERR(vaddr)) { - err = PTR_ERR(vaddr); - goto err_obj; - } - - shmem_read(engine->default_state, 0, - vaddr, engine->context_size); - - i915_gem_object_flush_map(obj); - __i915_gem_object_release_map(obj); - } - vma = i915_vma_instance(obj, &engine->gt->ggtt->vm, NULL); if (IS_ERR(vma)) { err = PTR_ERR(vma); @@ -620,8 +631,6 @@ static int ring_context_alloc(struct intel_context *ce) return PTR_ERR(vma); ce->state = vma; - if (engine->default_state) - __set_bit(CONTEXT_VALID_BIT, &ce->flags); } return 0; @@ -1220,37 +1229,15 @@ static int gen7_ctx_switch_bb_setup(struct intel_engine_cs * const engine, return gen7_setup_clear_gpr_bb(engine, vma); } -static int gen7_ctx_switch_bb_init(struct intel_engine_cs *engine) +static int gen7_ctx_switch_bb_init(struct intel_engine_cs *engine, + struct i915_gem_ww_ctx *ww, + struct i915_vma *vma) { - struct drm_i915_gem_object *obj; - struct i915_vma *vma; - int size; int err; - size = gen7_ctx_switch_bb_setup(engine, NULL /* probe size */); - if (size <= 0) - return size; - - size = ALIGN(size, PAGE_SIZE); - obj = i915_gem_object_create_internal(engine->i915, size); - if (IS_ERR(obj)) - return PTR_ERR(obj); - - vma = i915_vma_instance(obj, engine->gt->vm, NULL); - if (IS_ERR(vma)) { - err = PTR_ERR(vma); - goto err_obj; - } - - vma->private = intel_context_create(engine); /* dummy residuals */ - if (IS_ERR(vma->private)) { - err = PTR_ERR(vma->private); - goto err_obj; - } - - err = i915_vma_pin(vma, 0, 0, PIN_USER | PIN_HIGH); + err = i915_vma_pin_ww(vma, ww, 0, 0, PIN_USER | PIN_HIGH); if (err) - goto err_private; + return err; err = i915_vma_sync(vma); if (err) @@ -1265,17 +1252,53 @
[Intel-gfx] [PATCH v6 24/64] drm/i915: Move pinning to inside engine_wa_list_verify()
This should be done as part of the ww loop, in order to remove a i915_vma_pin that needs ww held. Now only i915_ggtt_pin() callers remaining. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gt/intel_gtt.c| 14 +- drivers/gpu/drm/i915/gt/intel_gtt.h| 3 +++ drivers/gpu/drm/i915/gt/intel_workarounds.c| 10 -- drivers/gpu/drm/i915/gt/selftest_execlists.c | 5 +++-- drivers/gpu/drm/i915/gt/selftest_lrc.c | 2 +- drivers/gpu/drm/i915/gt/selftest_mocs.c| 3 ++- drivers/gpu/drm/i915/gt/selftest_workarounds.c | 6 +++--- 7 files changed, 33 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c index 04aa6601e984..444d9bacfafd 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.c +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c @@ -427,7 +427,6 @@ __vm_create_scratch_for_read(struct i915_address_space *vm, unsigned long size) { struct drm_i915_gem_object *obj; struct i915_vma *vma; - int err; obj = i915_gem_object_create_internal(vm->i915, PAGE_ALIGN(size)); if (IS_ERR(obj)) @@ -441,6 +440,19 @@ __vm_create_scratch_for_read(struct i915_address_space *vm, unsigned long size) return vma; } + return vma; +} + +struct i915_vma * +__vm_create_scratch_for_read_pinned(struct i915_address_space *vm, unsigned long size) +{ + struct i915_vma *vma; + int err; + + vma = __vm_create_scratch_for_read(vm, size); + if (IS_ERR(vma)) + return vma; + err = i915_vma_pin(vma, 0, 0, i915_vma_is_ggtt(vma) ? PIN_GLOBAL : PIN_USER); if (err) { diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h index 29c10fde8ce3..af90090c3d18 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.h +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h @@ -576,6 +576,9 @@ void i915_vm_free_pt_stash(struct i915_address_space *vm, struct i915_vma * __vm_create_scratch_for_read(struct i915_address_space *vm, unsigned long size); +struct i915_vma * +__vm_create_scratch_for_read_pinned(struct i915_address_space *vm, unsigned long size); + static inline struct sgt_dma { struct scatterlist *sg; dma_addr_t dma, max; diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index c21a9726326a..4a19f809b529 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -2205,10 +2205,15 @@ static int engine_wa_list_verify(struct intel_context *ce, if (err) goto err_pm; + err = i915_vma_pin_ww(vma, &ww, 0, 0, + i915_vma_is_ggtt(vma) ? PIN_GLOBAL : PIN_USER); + if (err) + goto err_unpin; + rq = i915_request_create(ce); if (IS_ERR(rq)) { err = PTR_ERR(rq); - goto err_unpin; + goto err_vma; } err = i915_request_await_object(rq, vma->obj, true); @@ -2249,6 +2254,8 @@ static int engine_wa_list_verify(struct intel_context *ce, err_rq: i915_request_put(rq); +err_vma: + i915_vma_unpin(vma); err_unpin: intel_context_unpin(ce); err_pm: @@ -2259,7 +2266,6 @@ static int engine_wa_list_verify(struct intel_context *ce, } i915_gem_ww_ctx_fini(&ww); intel_engine_pm_put(ce->engine); - i915_vma_unpin(vma); i915_vma_put(vma); return err; } diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c index bfa7fd5c2c91..c9f3fef11f8f 100644 --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c @@ -4225,8 +4225,9 @@ static int preserved_virtual_engine(struct intel_gt *gt, int err = 0; u32 *cs; - scratch = __vm_create_scratch_for_read(&siblings[0]->gt->ggtt->vm, - PAGE_SIZE); + scratch = + __vm_create_scratch_for_read_pinned(&siblings[0]->gt->ggtt->vm, + PAGE_SIZE); if (IS_ERR(scratch)) return PTR_ERR(scratch); diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c index 3485cb7c431d..1474300a1a9e 100644 --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c @@ -27,7 +27,7 @@ static struct i915_vma *create_scratch(struct intel_gt *gt) { - return __vm_create_scratch_for_read(>->ggtt->vm, PAGE_SIZE); + return __vm_create_scratch_for_read_pinned(>->ggtt->vm, PAGE_SIZE); } static bool is_active(struct i915_request *rq) diff --git a/drivers/gpu/drm/i915/gt/selftest_mocs.c b/drivers/gpu/drm/i915/gt/selftest_moc
[Intel-gfx] [PATCH v6 64/64] drm/i915: Avoid some false positives in assert_object_held()
From: Thomas Hellström In a ww transaction where we've already locked a reservation object, assert_object_held() might not throw a splat even if the object is unlocked. Improve on that situation by asserting that the reservation object's ww mutex is indeed locked. Signed-off-by: Thomas Hellström Cc: Matthew Auld --- drivers/gpu/drm/i915/gem/i915_gem_object.h | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 748b03aeb3c6..5a518373f6c5 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -116,7 +116,14 @@ i915_gem_object_put(struct drm_i915_gem_object *obj) __drm_gem_object_put(&obj->base); } -#define assert_object_held(obj) dma_resv_assert_held((obj)->base.resv) +#ifdef CONFIG_LOCKDEP +#define assert_object_held(obj) do { \ + dma_resv_assert_held((obj)->base.resv); \ + WARN_ON(!ww_mutex_is_locked(&(obj)->base.resv->lock)); \ + } while (0) +#else +#define assert_object_held(obj) do { } while (0) +#endif /* * If more than one potential simultaneous locker, assert held. -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 37/64] drm/i915: Add ww locking to dma-buf ops.
vmap is using pin_pages, but needs to use ww locking, add pin_pages_unlocked to correctly lock the mapping. Also add ww locking to begin/end cpu access. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 60 -- 1 file changed, 33 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c index 36e3c2765f4c..c4b01e819786 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c @@ -82,7 +82,7 @@ static int i915_gem_dmabuf_vmap(struct dma_buf *dma_buf, struct dma_buf_map *map struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf); void *vaddr; - vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB); + vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB); if (IS_ERR(vaddr)) return PTR_ERR(vaddr); @@ -123,42 +123,48 @@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_dire { struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf); bool write = (direction == DMA_BIDIRECTIONAL || direction == DMA_TO_DEVICE); + struct i915_gem_ww_ctx ww; int err; - err = i915_gem_object_pin_pages(obj); - if (err) - return err; - - err = i915_gem_object_lock_interruptible(obj, NULL); - if (err) - goto out; - - err = i915_gem_object_set_to_cpu_domain(obj, write); - i915_gem_object_unlock(obj); - -out: - i915_gem_object_unpin_pages(obj); + i915_gem_ww_ctx_init(&ww, true); +retry: + err = i915_gem_object_lock(obj, &ww); + if (!err) + err = i915_gem_object_pin_pages(obj); + if (!err) { + err = i915_gem_object_set_to_cpu_domain(obj, write); + i915_gem_object_unpin_pages(obj); + } + if (err == -EDEADLK) { + err = i915_gem_ww_ctx_backoff(&ww); + if (!err) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); return err; } static int i915_gem_end_cpu_access(struct dma_buf *dma_buf, enum dma_data_direction direction) { struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf); + struct i915_gem_ww_ctx ww; int err; - err = i915_gem_object_pin_pages(obj); - if (err) - return err; - - err = i915_gem_object_lock_interruptible(obj, NULL); - if (err) - goto out; - - err = i915_gem_object_set_to_gtt_domain(obj, false); - i915_gem_object_unlock(obj); - -out: - i915_gem_object_unpin_pages(obj); + i915_gem_ww_ctx_init(&ww, true); +retry: + err = i915_gem_object_lock(obj, &ww); + if (!err) + err = i915_gem_object_pin_pages(obj); + if (!err) { + err = i915_gem_object_set_to_gtt_domain(obj, false); + i915_gem_object_unpin_pages(obj); + } + if (err == -EDEADLK) { + err = i915_gem_ww_ctx_backoff(&ww); + if (!err) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); return err; } -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 60/64] drm/i915/selftests: Prepare gtt tests for obj->mm.lock removal
We need to lock the global gtt dma_resv, use i915_vm_lock_objects to handle this correctly. Add ww handling for this where required. Add the object lock around unpin/put pages, and use the unlocked versions of pin_pages and pin_map where required. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 92 ++- 1 file changed, 67 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c index d4e5b2df684d..83d4bbff1939 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c @@ -130,7 +130,7 @@ fake_dma_object(struct drm_i915_private *i915, u64 size) obj->cache_level = I915_CACHE_NONE; /* Preallocate the "backing storage" */ - if (i915_gem_object_pin_pages(obj)) + if (i915_gem_object_pin_pages_unlocked(obj)) goto err_obj; i915_gem_object_unpin_pages(obj); @@ -146,6 +146,7 @@ static int igt_ppgtt_alloc(void *arg) { struct drm_i915_private *dev_priv = arg; struct i915_ppgtt *ppgtt; + struct i915_gem_ww_ctx ww; u64 size, last, limit; int err = 0; @@ -171,6 +172,12 @@ static int igt_ppgtt_alloc(void *arg) limit = totalram_pages() << PAGE_SHIFT; limit = min(ppgtt->vm.total, limit); + i915_gem_ww_ctx_init(&ww, false); +retry: + err = i915_vm_lock_objects(&ppgtt->vm, &ww); + if (err) + goto err_ppgtt_cleanup; + /* Check we can allocate the entire range */ for (size = 4096; size <= limit; size <<= 2) { struct i915_vm_pt_stash stash = {}; @@ -215,6 +222,13 @@ static int igt_ppgtt_alloc(void *arg) } err_ppgtt_cleanup: + if (err == -EDEADLK) { + err = i915_gem_ww_ctx_backoff(&ww); + if (!err) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); + i915_vm_put(&ppgtt->vm); return err; } @@ -276,7 +290,7 @@ static int lowlevel_hole(struct i915_address_space *vm, GEM_BUG_ON(obj->base.size != BIT_ULL(size)); - if (i915_gem_object_pin_pages(obj)) { + if (i915_gem_object_pin_pages_unlocked(obj)) { i915_gem_object_put(obj); kfree(order); break; @@ -297,20 +311,36 @@ static int lowlevel_hole(struct i915_address_space *vm, if (vm->allocate_va_range) { struct i915_vm_pt_stash stash = {}; + struct i915_gem_ww_ctx ww; + int err; + + i915_gem_ww_ctx_init(&ww, false); +retry: + err = i915_vm_lock_objects(vm, &ww); + if (err) + goto alloc_vm_end; + err = -ENOMEM; if (i915_vm_alloc_pt_stash(vm, &stash, BIT_ULL(size))) - break; - - if (i915_vm_pin_pt_stash(vm, &stash)) { - i915_vm_free_pt_stash(vm, &stash); - break; - } + goto alloc_vm_end; - vm->allocate_va_range(vm, &stash, - addr, BIT_ULL(size)); + err = i915_vm_pin_pt_stash(vm, &stash); + if (!err) + vm->allocate_va_range(vm, &stash, + addr, BIT_ULL(size)); i915_vm_free_pt_stash(vm, &stash); +alloc_vm_end: + if (err == -EDEADLK) { + err = i915_gem_ww_ctx_backoff(&ww); + if (!err) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); + + if (err) + break; } mock_vma->pages = obj->mm.pages; @@ -1166,7 +1196,7 @@ static int igt_ggtt_page(void *arg) if (IS_ERR(obj)) return PTR_ERR(obj); - err = i915_gem_object_pin_pages(obj); + err = i915_gem_object_pin_pages_unlocked(obj); if (err) goto out_free; @@ -1333,7 +1363,7 @@ static int igt_gtt_r
[Intel-gfx] [PATCH v6 05/64] drm/i915: Ensure we hold the object mutex in pin correctly.
Currently we have a lot of places where we hold the gem object lock, but haven't yet been converted to the ww dance. Complain loudly about those places. i915_vma_pin shouldn't have the obj lock held, so we can do a ww dance, while i915_vma_pin_ww should. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström #irc --- drivers/gpu/drm/i915/gt/intel_renderstate.c | 2 +- drivers/gpu/drm/i915/i915_vma.c | 11 ++- drivers/gpu/drm/i915/i915_vma.h | 3 +++ 3 files changed, 14 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_renderstate.c b/drivers/gpu/drm/i915/gt/intel_renderstate.c index ca816ba22197..334c557673dd 100644 --- a/drivers/gpu/drm/i915/gt/intel_renderstate.c +++ b/drivers/gpu/drm/i915/gt/intel_renderstate.c @@ -197,7 +197,7 @@ int intel_renderstate_init(struct intel_renderstate *so, if (err) goto err_context; - err = i915_vma_pin(so->vma, 0, 0, PIN_GLOBAL | PIN_HIGH); + err = i915_vma_pin_ww(so->vma, &so->ww, 0, 0, PIN_GLOBAL | PIN_HIGH); if (err) goto err_context; diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index caa9b041616b..7310893086f7 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -865,6 +865,8 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww, #ifdef CONFIG_PROVE_LOCKING if (debug_locks && lockdep_is_held(&vma->vm->i915->drm.struct_mutex)) WARN_ON(!ww); + if (debug_locks && ww && vma->resv) + assert_vma_held(vma); #endif BUILD_BUG_ON(PIN_GLOBAL != I915_VMA_GLOBAL_BIND); @@ -1020,8 +1022,15 @@ int i915_ggtt_pin(struct i915_vma *vma, struct i915_gem_ww_ctx *ww, GEM_BUG_ON(!i915_vma_is_ggtt(vma)); +#ifdef CONFIG_LOCKDEP + WARN_ON(!ww && vma->resv && dma_resv_held(vma->resv)); +#endif + do { - err = i915_vma_pin_ww(vma, ww, 0, align, flags | PIN_GLOBAL); + if (ww) + err = i915_vma_pin_ww(vma, ww, 0, align, flags | PIN_GLOBAL); + else + err = i915_vma_pin(vma, 0, align, flags | PIN_GLOBAL); if (err != -ENOSPC) { if (!err) { err = i915_vma_wait_for_bind(vma); diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index 5b3a3c653454..838bbbeb11cc 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -243,6 +243,9 @@ i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww, static inline int __must_check i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) { +#ifdef CONFIG_LOCKDEP + WARN_ON_ONCE(vma->resv && dma_resv_held(vma->resv)); +#endif return i915_vma_pin_ww(vma, NULL, size, alignment, flags); } -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 10/64] drm/i915: make lockdep slightly happier about execbuf.
As soon as we install fences, we should stop allocating memory in order to prevent any potential deadlocks. This is required later on, when we start adding support for dma-fence annotations. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 24 ++- drivers/gpu/drm/i915/i915_active.c| 20 drivers/gpu/drm/i915/i915_vma.c | 8 --- drivers/gpu/drm/i915/i915_vma.h | 3 +++ 4 files changed, 36 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index d5666157dfa6..9c7480119d53 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -50,11 +50,12 @@ enum { #define DBG_FORCE_RELOC 0 /* choose one of the above! */ }; -#define __EXEC_OBJECT_HAS_PIN BIT(31) -#define __EXEC_OBJECT_HAS_FENCEBIT(30) -#define __EXEC_OBJECT_NEEDS_MAPBIT(29) -#define __EXEC_OBJECT_NEEDS_BIAS BIT(28) -#define __EXEC_OBJECT_INTERNAL_FLAGS (~0u << 28) /* all of the above */ +/* __EXEC_OBJECT_NO_RESERVE is BIT(31), defined in i915_vma.h */ +#define __EXEC_OBJECT_HAS_PIN BIT(30) +#define __EXEC_OBJECT_HAS_FENCEBIT(29) +#define __EXEC_OBJECT_NEEDS_MAPBIT(28) +#define __EXEC_OBJECT_NEEDS_BIAS BIT(27) +#define __EXEC_OBJECT_INTERNAL_FLAGS (~0u << 27) /* all of the above + */ #define __EXEC_OBJECT_RESERVED (__EXEC_OBJECT_HAS_PIN | __EXEC_OBJECT_HAS_FENCE) #define __EXEC_HAS_RELOC BIT(31) @@ -928,6 +929,12 @@ static int eb_validate_vmas(struct i915_execbuffer *eb) } } + if (!(ev->flags & EXEC_OBJECT_WRITE)) { + err = dma_resv_reserve_shared(vma->resv, 1); + if (err) + return err; + } + GEM_BUG_ON(drm_mm_node_allocated(&vma->node) && eb_vma_misplaced(&eb->exec[i], vma, ev->flags)); } @@ -2195,7 +2202,8 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb) } if (err == 0) - err = i915_vma_move_to_active(vma, eb->request, flags); + err = i915_vma_move_to_active(vma, eb->request, + flags | __EXEC_OBJECT_NO_RESERVE); } if (unlikely(err)) @@ -2447,6 +2455,10 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb, if (err) goto err_commit; + err = dma_resv_reserve_shared(shadow->resv, 1); + if (err) + goto err_commit; + /* Wait for all writes (and relocs) into the batch to complete */ err = i915_sw_fence_await_reservation(&pw->base.chain, pw->batch->resv, NULL, false, diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index ab4382841c6b..4e78de64b71b 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -293,18 +293,13 @@ static struct active_node *__active_lookup(struct i915_active *ref, u64 idx) static struct i915_active_fence * active_instance(struct i915_active *ref, u64 idx) { - struct active_node *node, *prealloc; + struct active_node *node; struct rb_node **p, *parent; node = __active_lookup(ref, idx); if (likely(node)) return &node->base; - /* Preallocate a replacement, just in case */ - prealloc = kmem_cache_alloc(global.slab_cache, GFP_KERNEL); - if (!prealloc) - return NULL; - spin_lock_irq(&ref->tree_lock); GEM_BUG_ON(i915_active_is_idle(ref)); @@ -314,10 +309,8 @@ active_instance(struct i915_active *ref, u64 idx) parent = *p; node = rb_entry(parent, struct active_node, node); - if (node->timeline == idx) { - kmem_cache_free(global.slab_cache, prealloc); + if (node->timeline == idx) goto out; - } if (node->timeline < idx) p = &parent->rb_right; @@ -325,7 +318,14 @@ active_instance(struct i915_active *ref, u64 idx) p = &parent->rb_left; } - node = prealloc; + /* +* XXX: We should preallocate this before i915_active_ref() is ever +* called, but we cannot call into fs_reclaim() anyway, so use GFP_ATOMIC. +*/ + node = kmem_cache_alloc(global.slab_cache, GFP_ATOMIC); + if (!node) + goto out; + __i915_active_fence_init(&node-&g
[Intel-gfx] [PATCH v6 63/64] drm/i915: Move gt_revoke() slightly
We get a lockdep splat when the reset mutex is held, because it can be taken from fence_wait. This conflicts with the mmu notifier we have, because we recurse between reset mutex and mmap lock -> mmu notifier. Remove this recursion by calling revoke_mmaps before taking the lock. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gt/intel_reset.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c index 9d177297db79..3c0807d9a86e 100644 --- a/drivers/gpu/drm/i915/gt/intel_reset.c +++ b/drivers/gpu/drm/i915/gt/intel_reset.c @@ -975,8 +975,6 @@ static int do_reset(struct intel_gt *gt, intel_engine_mask_t stalled_mask) { int err, i; - gt_revoke(gt); - err = __intel_gt_reset(gt, ALL_ENGINES); for (i = 0; err && i < RESET_MAX_RETRIES; i++) { msleep(10 * (i + 1)); @@ -1031,6 +1029,9 @@ void intel_gt_reset(struct intel_gt *gt, might_sleep(); GEM_BUG_ON(!test_bit(I915_RESET_BACKOFF, >->reset.flags)); + + gt_revoke(gt); + mutex_lock(>->reset.mutex); /* Clear any previous failed attempts at recovery. Time to try again. */ -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 27/64] drm/i915: Make __engine_unpark() compatible with ww locking.
Take the ww lock around engine_unpark. Because of the many many places where rpm is used, I chose the safest option and used a trylock to opportunistically take this lock for __engine_unpark. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gt/intel_engine_pm.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c index 2843db731b7d..f81bdd747fc9 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c @@ -27,12 +27,16 @@ static void dbg_poison_ce(struct intel_context *ce) int type = i915_coherent_map_type(ce->engine->i915); void *map; + if (!i915_gem_object_trylock(obj)) + return; + map = i915_gem_object_pin_map(obj, type); if (!IS_ERR(map)) { memset(map, CONTEXT_REDZONE, obj->base.size); i915_gem_object_flush_map(obj); i915_gem_object_unpin_map(obj); } + i915_gem_object_unlock(obj); } } -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 35/64] drm/i915: Increase ww locking for perf.
We need to lock a few more objects, some temporarily, add ww lock where needed. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/i915_perf.c | 56 1 file changed, 43 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 112ba5f2ce90..aac614204fc0 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1589,7 +1589,7 @@ static int alloc_oa_buffer(struct i915_perf_stream *stream) stream->oa_buffer.vma = vma; stream->oa_buffer.vaddr = - i915_gem_object_pin_map(bo, I915_MAP_WB); + i915_gem_object_pin_map_unlocked(bo, I915_MAP_WB); if (IS_ERR(stream->oa_buffer.vaddr)) { ret = PTR_ERR(stream->oa_buffer.vaddr); goto err_unpin; @@ -1643,6 +1643,7 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) const u32 base = stream->engine->mmio_base; #define CS_GPR(x) GEN8_RING_CS_GPR(base, x) u32 *batch, *ts0, *cs, *jump; + struct i915_gem_ww_ctx ww; int ret, i; enum { START_TS, @@ -1660,15 +1661,21 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) return PTR_ERR(bo); } + i915_gem_ww_ctx_init(&ww, true); +retry: + ret = i915_gem_object_lock(bo, &ww); + if (ret) + goto out_ww; + /* * We pin in GGTT because we jump into this buffer now because * multiple OA config BOs will have a jump to this address and it * needs to be fixed during the lifetime of the i915/perf stream. */ - vma = i915_gem_object_ggtt_pin(bo, NULL, 0, 0, PIN_HIGH); + vma = i915_gem_object_ggtt_pin_ww(bo, &ww, NULL, 0, 0, PIN_HIGH); if (IS_ERR(vma)) { ret = PTR_ERR(vma); - goto err_unref; + goto out_ww; } batch = cs = i915_gem_object_pin_map(bo, I915_MAP_WB); @@ -1802,12 +1809,19 @@ static int alloc_noa_wait(struct i915_perf_stream *stream) __i915_gem_object_release_map(bo); stream->noa_wait = vma; - return 0; + goto out_ww; err_unpin: i915_vma_unpin_and_release(&vma, 0); -err_unref: - i915_gem_object_put(bo); +out_ww: + if (ret == -EDEADLK) { + ret = i915_gem_ww_ctx_backoff(&ww); + if (!ret) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); + if (ret) + i915_gem_object_put(bo); return ret; } @@ -1850,6 +1864,7 @@ alloc_oa_config_buffer(struct i915_perf_stream *stream, { struct drm_i915_gem_object *obj; struct i915_oa_config_bo *oa_bo; + struct i915_gem_ww_ctx ww; size_t config_length = 0; u32 *cs; int err; @@ -1870,10 +1885,16 @@ alloc_oa_config_buffer(struct i915_perf_stream *stream, goto err_free; } + i915_gem_ww_ctx_init(&ww, true); +retry: + err = i915_gem_object_lock(obj, &ww); + if (err) + goto out_ww; + cs = i915_gem_object_pin_map(obj, I915_MAP_WB); if (IS_ERR(cs)) { err = PTR_ERR(cs); - goto err_oa_bo; + goto out_ww; } cs = write_cs_mi_lri(cs, @@ -1901,19 +1922,28 @@ alloc_oa_config_buffer(struct i915_perf_stream *stream, NULL); if (IS_ERR(oa_bo->vma)) { err = PTR_ERR(oa_bo->vma); - goto err_oa_bo; + goto out_ww; } oa_bo->oa_config = i915_oa_config_get(oa_config); llist_add(&oa_bo->node, &stream->oa_config_bos); - return oa_bo; +out_ww: + if (err == -EDEADLK) { + err = i915_gem_ww_ctx_backoff(&ww); + if (!err) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); -err_oa_bo: - i915_gem_object_put(obj); + if (err) + i915_gem_object_put(obj); err_free: - kfree(oa_bo); - return ERR_PTR(err); + if (err) { + kfree(oa_bo); + return ERR_PTR(err); + } + return oa_bo; } static struct i915_vma * -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 22/64] drm/i915: Pass ww ctx to intel_pin_to_display_plane
Instead of multiple lockings, lock the object once, and perform the ww dance around attach_phys and pin_pages. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/display/intel_display.c | 69 --- drivers/gpu/drm/i915/display/intel_display.h | 2 +- drivers/gpu/drm/i915/display/intel_fbdev.c| 2 +- drivers/gpu/drm/i915/display/intel_overlay.c | 34 +++-- drivers/gpu/drm/i915/gem/i915_gem_domain.c| 30 ++-- drivers/gpu/drm/i915/gem/i915_gem_object.h| 1 + drivers/gpu/drm/i915/gem/i915_gem_phys.c | 10 +-- .../drm/i915/gem/selftests/i915_gem_phys.c| 2 + 8 files changed, 86 insertions(+), 64 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index c8df42193287..af0f419be94d 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -2170,6 +2170,7 @@ static bool intel_plane_uses_fence(const struct intel_plane_state *plane_state) struct i915_vma * intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, + bool phys_cursor, const struct i915_ggtt_view *view, bool uses_fence, unsigned long *out_flags) @@ -2178,14 +2179,19 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, struct drm_i915_private *dev_priv = to_i915(dev); struct drm_i915_gem_object *obj = intel_fb_obj(fb); intel_wakeref_t wakeref; + struct i915_gem_ww_ctx ww; struct i915_vma *vma; unsigned int pinctl; u32 alignment; + int ret; if (drm_WARN_ON(dev, !i915_gem_object_is_framebuffer(obj))) return ERR_PTR(-EINVAL); - alignment = intel_surf_alignment(fb, 0); + if (phys_cursor) + alignment = intel_cursor_alignment(dev_priv); + else + alignment = intel_surf_alignment(fb, 0); if (drm_WARN_ON(dev, alignment && !is_power_of_2(alignment))) return ERR_PTR(-EINVAL); @@ -2220,14 +2226,26 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, if (HAS_GMCH(dev_priv)) pinctl |= PIN_MAPPABLE; - vma = i915_gem_object_pin_to_display_plane(obj, - alignment, view, pinctl); - if (IS_ERR(vma)) + i915_gem_ww_ctx_init(&ww, true); +retry: + ret = i915_gem_object_lock(obj, &ww); + if (!ret && phys_cursor) + ret = i915_gem_object_attach_phys(obj, alignment); + if (!ret) + ret = i915_gem_object_pin_pages(obj); + if (ret) goto err; - if (uses_fence && i915_vma_is_map_and_fenceable(vma)) { - int ret; + if (!ret) { + vma = i915_gem_object_pin_to_display_plane(obj, &ww, alignment, + view, pinctl); + if (IS_ERR(vma)) { + ret = PTR_ERR(vma); + goto err_unpin; + } + } + if (uses_fence && i915_vma_is_map_and_fenceable(vma)) { /* * Install a fence for tiled scan-out. Pre-i965 always needs a * fence, whereas 965+ only requires a fence if using @@ -2248,16 +2266,28 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, ret = i915_vma_pin_fence(vma); if (ret != 0 && INTEL_GEN(dev_priv) < 4) { i915_gem_object_unpin_from_display_plane(vma); - vma = ERR_PTR(ret); - goto err; + goto err_unpin; } + ret = 0; - if (ret == 0 && vma->fence) + if (vma->fence) *out_flags |= PLANE_HAS_FENCE; } i915_vma_get(vma); + +err_unpin: + i915_gem_object_unpin_pages(obj); err: + if (ret == -EDEADLK) { + ret = i915_gem_ww_ctx_backoff(&ww); + if (!ret) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); + if (ret) + vma = ERR_PTR(ret); + atomic_dec(&dev_priv->gpu_error.pending_fb_pin); intel_runtime_pm_put(&dev_priv->runtime_pm, wakeref); return vma; @@ -15598,19 +15628,11 @@ int intel_plane_pin_fb(struct intel_plane_state *plane_state) struct drm_i915_private *dev_priv = to_i915(plane->base.dev); struct drm_framebuffer *fb = plane_state->hw.fb; struct i915_vma *vma; + bool phys_cursor = + plane->id == PLANE_CURSOR && + INTEL_INFO(dev_priv)->display.cursor_needs_physical; - if (plane->id == PLANE_CURSOR && -
[Intel-gfx] [PATCH v6 49/64] drm/i915/selftests: Prepare object blit tests for obj->mm.lock removal.
Use some unlocked versions where we're not holding the ww lock. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c index 23b6e11bbc3e..ee9496f3d11d 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c @@ -262,7 +262,7 @@ static int igt_fill_blt_thread(void *arg) goto err_flush; } - vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB); + vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB); if (IS_ERR(vaddr)) { err = PTR_ERR(vaddr); goto err_put; @@ -380,7 +380,7 @@ static int igt_copy_blt_thread(void *arg) goto err_flush; } - vaddr = i915_gem_object_pin_map(src, I915_MAP_WB); + vaddr = i915_gem_object_pin_map_unlocked(src, I915_MAP_WB); if (IS_ERR(vaddr)) { err = PTR_ERR(vaddr); goto err_put_src; @@ -400,7 +400,7 @@ static int igt_copy_blt_thread(void *arg) goto err_put_src; } - vaddr = i915_gem_object_pin_map(dst, I915_MAP_WB); + vaddr = i915_gem_object_pin_map_unlocked(dst, I915_MAP_WB); if (IS_ERR(vaddr)) { err = PTR_ERR(vaddr); goto err_put_dst; -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 18/64] drm/i915: Populate logical context during first pin.
This allows us to remove pin_map from state allocation, which saves us a few retry loops. We won't need this until first pin, anyway. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- .../drm/i915/gt/intel_execlists_submission.c | 26 --- 1 file changed, 23 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index a5b442683c18..5f73611ed91b 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -2489,11 +2489,31 @@ static void execlists_submit_request(struct i915_request *request) spin_unlock_irqrestore(&engine->active.lock, flags); } +static int +__execlists_context_pre_pin(struct intel_context *ce, + struct intel_engine_cs *engine, + struct i915_gem_ww_ctx *ww, void **vaddr) +{ + int err; + + err = lrc_pre_pin(ce, engine, ww, vaddr); + if (err) + return err; + + if (!__test_and_set_bit(CONTEXT_INIT_BIT, &ce->flags)) { + lrc_init_state(ce, engine, *vaddr); + +__i915_gem_object_flush_map(ce->state->obj, 0, engine->context_size); + } + + return 0; +} + static int execlists_context_pre_pin(struct intel_context *ce, struct i915_gem_ww_ctx *ww, void **vaddr) { - return lrc_pre_pin(ce, ce->engine, ww, vaddr); + return __execlists_context_pre_pin(ce, ce->engine, ww, vaddr); } static int execlists_context_pin(struct intel_context *ce, void *vaddr) @@ -3389,8 +3409,8 @@ static int virtual_context_pre_pin(struct intel_context *ce, { struct virtual_engine *ve = container_of(ce, typeof(*ve), context); - /* Note: we must use a real engine class for setting up reg state */ - return lrc_pre_pin(ce, ve->siblings[0], ww, vaddr); +/* Note: we must use a real engine class for setting up reg state */ + return __execlists_context_pre_pin(ce, ve->siblings[0], ww, vaddr); } static int virtual_context_pin(struct intel_context *ce, void *vaddr) -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 38/64] drm/i915: Add missing ww lock in intel_dsb_prepare.
Because of the long lifetime of the mapping, we cannot wrap this in a simple limited ww lock. Just use the unlocked version of pin_map, because we'll likely release the mapping a lot later, in a different thread. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/display/intel_dsb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c index 566fa72427b3..857126822a88 100644 --- a/drivers/gpu/drm/i915/display/intel_dsb.c +++ b/drivers/gpu/drm/i915/display/intel_dsb.c @@ -293,7 +293,7 @@ void intel_dsb_prepare(struct intel_crtc_state *crtc_state) goto out; } - buf = i915_gem_object_pin_map(vma->obj, I915_MAP_WC); + buf = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WC); if (IS_ERR(buf)) { drm_err(&i915->drm, "Command buffer creation failed\n"); i915_vma_unpin_and_release(&vma, I915_VMA_RELEASE_MAP); -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 06/64] drm/i915: Add gem object locking to madvise.
Doesn't need the full ww lock, only checking if pages are bound. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström #irc --- drivers/gpu/drm/i915/i915_gem.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 17a4636ee542..209d244e9879 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1051,10 +1051,14 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data, if (!obj) return -ENOENT; - err = mutex_lock_interruptible(&obj->mm.lock); + err = i915_gem_object_lock_interruptible(obj, NULL); if (err) goto out; + err = mutex_lock_interruptible(&obj->mm.lock); + if (err) + goto out_ww; + if (i915_gem_object_has_pages(obj) && i915_gem_object_is_tiled(obj) && i915->quirks & QUIRK_PIN_SWIZZLED_PAGES) { @@ -1099,6 +1103,8 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data, args->retained = obj->mm.madv != __I915_MADV_PURGED; mutex_unlock(&obj->mm.lock); +out_ww: + i915_gem_object_unlock(obj); out: i915_gem_object_put(obj); return err; -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 57/64] drm/i915/selftests: Prepare i915_request tests for obj->mm.lock removal
Straightforward conversion by using unlocked versions. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/selftests/i915_request.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c index d2a678a2497e..9a9e92a775c8 100644 --- a/drivers/gpu/drm/i915/selftests/i915_request.c +++ b/drivers/gpu/drm/i915/selftests/i915_request.c @@ -620,7 +620,7 @@ static struct i915_vma *empty_batch(struct drm_i915_private *i915) if (IS_ERR(obj)) return ERR_CAST(obj); - cmd = i915_gem_object_pin_map(obj, I915_MAP_WB); + cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB); if (IS_ERR(cmd)) { err = PTR_ERR(cmd); goto err; @@ -782,7 +782,7 @@ static struct i915_vma *recursive_batch(struct drm_i915_private *i915) if (err) goto err; - cmd = i915_gem_object_pin_map(obj, I915_MAP_WC); + cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC); if (IS_ERR(cmd)) { err = PTR_ERR(cmd); goto err; @@ -817,7 +817,7 @@ static int recursive_batch_resolve(struct i915_vma *batch) { u32 *cmd; - cmd = i915_gem_object_pin_map(batch->obj, I915_MAP_WC); + cmd = i915_gem_object_pin_map_unlocked(batch->obj, I915_MAP_WC); if (IS_ERR(cmd)) return PTR_ERR(cmd); @@ -1070,8 +1070,8 @@ static int live_sequential_engines(void *arg) if (!request[idx]) break; - cmd = i915_gem_object_pin_map(request[idx]->batch->obj, - I915_MAP_WC); + cmd = i915_gem_object_pin_map_unlocked(request[idx]->batch->obj, + I915_MAP_WC); if (!IS_ERR(cmd)) { *cmd = MI_BATCH_BUFFER_END; -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 43/64] drm/i915/selftests: Prepare coherency tests for obj->mm.lock removal.
Straightforward conversion, just convert a bunch of calls to unlocked versions. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c index 654912abaeb4..7da9f1a53ab5 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c @@ -160,7 +160,7 @@ static int wc_set(struct context *ctx, unsigned long offset, u32 v) if (err) return err; - map = i915_gem_object_pin_map(ctx->obj, I915_MAP_WC); + map = i915_gem_object_pin_map_unlocked(ctx->obj, I915_MAP_WC); if (IS_ERR(map)) return PTR_ERR(map); @@ -183,7 +183,7 @@ static int wc_get(struct context *ctx, unsigned long offset, u32 *v) if (err) return err; - map = i915_gem_object_pin_map(ctx->obj, I915_MAP_WC); + map = i915_gem_object_pin_map_unlocked(ctx->obj, I915_MAP_WC); if (IS_ERR(map)) return PTR_ERR(map); -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 11/64] drm/i915: Disable userptr pread/pwrite support.
Userptr should not need the kernel for a userspace memcpy, userspace needs to call memcpy directly. Specifically, disable i915_gem_pwrite_ioctl() and i915_gem_pread_ioctl(). Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström -- Still needs an ack from relevant userspace that it won't break, but should be good. --- drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 20 drivers/gpu/drm/i915/i915_gem.c | 5 + 2 files changed, 25 insertions(+) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c index 30edc5a0a54e..8c3d1eb2f96a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c @@ -700,6 +700,24 @@ i915_gem_userptr_dmabuf_export(struct drm_i915_gem_object *obj) return i915_gem_userptr_init__mmu_notifier(obj, 0); } +static int +i915_gem_userptr_pwrite(struct drm_i915_gem_object *obj, + const struct drm_i915_gem_pwrite *args) +{ + drm_dbg(obj->base.dev, "pwrite to userptr no longer allowed\n"); + + return -EINVAL; +} + +static int +i915_gem_userptr_pread(struct drm_i915_gem_object *obj, + const struct drm_i915_gem_pread *args) +{ + drm_dbg(obj->base.dev, "pread from userptr no longer allowed\n"); + + return -EINVAL; +} + static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = { .name = "i915_gem_object_userptr", .flags = I915_GEM_OBJECT_IS_SHRINKABLE | @@ -708,6 +726,8 @@ static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = { .get_pages = i915_gem_userptr_get_pages, .put_pages = i915_gem_userptr_put_pages, .dmabuf_export = i915_gem_userptr_dmabuf_export, + .pwrite = i915_gem_userptr_pwrite, + .pread = i915_gem_userptr_pread, .release = i915_gem_userptr_release, }; diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 209d244e9879..84eb2a574813 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -502,6 +502,11 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data, } trace_i915_gem_object_pread(obj, args->offset, args->size); + ret = -ENODEV; + if (obj->ops->pread) + ret = obj->ops->pread(obj, args); + if (ret != -ENODEV) + goto out; ret = -ENODEV; if (obj->ops->pread) -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 33/64] drm/i915: Add igt_spinner_pin() to allow for ww locking around spinner.
By default, we assume that it's called inside igt_create_request to keep existing selftests working, but allow for manual pinning when passing a ww context. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/selftests/igt_spinner.c | 136 --- drivers/gpu/drm/i915/selftests/igt_spinner.h | 5 + 2 files changed, 95 insertions(+), 46 deletions(-) diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c index 83f6e5f31fb3..cfbbe415b57c 100644 --- a/drivers/gpu/drm/i915/selftests/igt_spinner.c +++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c @@ -12,8 +12,6 @@ int igt_spinner_init(struct igt_spinner *spin, struct intel_gt *gt) { - unsigned int mode; - void *vaddr; int err; memset(spin, 0, sizeof(*spin)); @@ -24,6 +22,7 @@ int igt_spinner_init(struct igt_spinner *spin, struct intel_gt *gt) err = PTR_ERR(spin->hws); goto err; } + i915_gem_object_set_cache_coherency(spin->hws, I915_CACHE_LLC); spin->obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE); if (IS_ERR(spin->obj)) { @@ -31,34 +30,83 @@ int igt_spinner_init(struct igt_spinner *spin, struct intel_gt *gt) goto err_hws; } - i915_gem_object_set_cache_coherency(spin->hws, I915_CACHE_LLC); - vaddr = i915_gem_object_pin_map(spin->hws, I915_MAP_WB); - if (IS_ERR(vaddr)) { - err = PTR_ERR(vaddr); - goto err_obj; - } - spin->seqno = memset(vaddr, 0xff, PAGE_SIZE); - - mode = i915_coherent_map_type(gt->i915); - vaddr = i915_gem_object_pin_map(spin->obj, mode); - if (IS_ERR(vaddr)) { - err = PTR_ERR(vaddr); - goto err_unpin_hws; - } - spin->batch = vaddr; - return 0; -err_unpin_hws: - i915_gem_object_unpin_map(spin->hws); -err_obj: - i915_gem_object_put(spin->obj); err_hws: i915_gem_object_put(spin->hws); err: return err; } +static void *igt_spinner_pin_obj(struct intel_context *ce, +struct i915_gem_ww_ctx *ww, +struct drm_i915_gem_object *obj, +unsigned int mode, struct i915_vma **vma) +{ + void *vaddr; + int ret; + + *vma = i915_vma_instance(obj, ce->vm, NULL); + if (IS_ERR(*vma)) + return ERR_CAST(*vma); + + ret = i915_gem_object_lock(obj, ww); + if (ret) + return ERR_PTR(ret); + + vaddr = i915_gem_object_pin_map(obj, mode); + + if (!ww) + i915_gem_object_unlock(obj); + + if (IS_ERR(vaddr)) + return vaddr; + + if (ww) + ret = i915_vma_pin_ww(*vma, ww, 0, 0, PIN_USER); + else + ret = i915_vma_pin(*vma, 0, 0, PIN_USER); + + if (ret) { + i915_gem_object_unpin_map(obj); + return ERR_PTR(ret); + } + + return vaddr; +} + +int igt_spinner_pin(struct igt_spinner *spin, + struct intel_context *ce, + struct i915_gem_ww_ctx *ww) +{ + void *vaddr; + + if (spin->ce && WARN_ON(spin->ce != ce)) + return -ENODEV; + spin->ce = ce; + + if (!spin->seqno) { + vaddr = igt_spinner_pin_obj(ce, ww, spin->hws, I915_MAP_WB, &spin->hws_vma); + if (IS_ERR(vaddr)) + return PTR_ERR(vaddr); + + spin->seqno = memset(vaddr, 0xff, PAGE_SIZE); + } + + if (!spin->batch) { + unsigned int mode = + i915_coherent_map_type(spin->gt->i915); + + vaddr = igt_spinner_pin_obj(ce, ww, spin->obj, mode, &spin->batch_vma); + if (IS_ERR(vaddr)) + return PTR_ERR(vaddr); + + spin->batch = vaddr; + } + + return 0; +} + static unsigned int seqno_offset(u64 fence) { return offset_in_page(sizeof(u32) * fence); @@ -103,27 +151,18 @@ igt_spinner_create_request(struct igt_spinner *spin, if (!intel_engine_can_store_dword(ce->engine)) return ERR_PTR(-ENODEV); - vma = i915_vma_instance(spin->obj, ce->vm, NULL); - if (IS_ERR(vma)) - return ERR_CAST(vma); - - hws = i915_vma_instance(spin->hws, ce->vm, NULL); - if (IS_ERR(hws)) - return ERR_CAST(hws); + if (!spin->batch) { + err = igt_spinner_pin(spin, ce, NULL); + if (err) + return ERR_PTR(err); + } - err = i915_vma_pin(vma, 0, 0, PIN_USER); - if (err) - return ERR_PTR(err); - - err = i915_vma_pin(hws, 0, 0, PIN_USER)
[Intel-gfx] [PATCH v6 03/64] drm/i915: Move cmd parser pinning to execbuffer
We need to get rid of allocations in the cmd parser, because it needs to be called from a signaling context, first move all pinning to execbuf, where we already hold all locks. Allocate jump_whitelist in the execbuffer, and add annotations around intel_engine_cmd_parser(), to ensure we only call the command parser without allocating any memory, or taking any locks we're not supposed to. Because i915_gem_object_get_page() may also allocate memory, add a path to i915_gem_object_get_sg() that prevents memory allocations, and walk the sg list manually. It should be similarly fast. This has the added benefit of being able to catch all memory allocation errors before the point of no return, and return -ENOMEM safely to the execbuf submitter. Signed-off-by: Maarten Lankhorst Acked-by: Thomas Hellström --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 74 - drivers/gpu/drm/i915/gem/i915_gem_object.h| 10 +- drivers/gpu/drm/i915/gem/i915_gem_pages.c | 21 +++- drivers/gpu/drm/i915/gt/intel_ggtt.c | 2 +- drivers/gpu/drm/i915/i915_cmd_parser.c| 104 -- drivers/gpu/drm/i915/i915_drv.h | 7 +- drivers/gpu/drm/i915/i915_memcpy.c| 2 +- drivers/gpu/drm/i915/i915_memcpy.h| 2 +- 8 files changed, 142 insertions(+), 80 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index cf9a6b4eb913..fe985e1f843d 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -28,6 +28,7 @@ #include "i915_sw_fence_work.h" #include "i915_trace.h" #include "i915_user_extensions.h" +#include "i915_memcpy.h" struct eb_vma { struct i915_vma *vma; @@ -2274,24 +2275,45 @@ struct eb_parse_work { struct i915_vma *trampoline; unsigned long batch_offset; unsigned long batch_length; + unsigned long *jump_whitelist; + const void *batch_map; + void *shadow_map; }; static int __eb_parse(struct dma_fence_work *work) { struct eb_parse_work *pw = container_of(work, typeof(*pw), base); + int ret; + bool cookie; - return intel_engine_cmd_parser(pw->engine, - pw->batch, - pw->batch_offset, - pw->batch_length, - pw->shadow, - pw->trampoline); + cookie = dma_fence_begin_signalling(); + ret = intel_engine_cmd_parser(pw->engine, + pw->batch, + pw->batch_offset, + pw->batch_length, + pw->shadow, + pw->jump_whitelist, + pw->shadow_map, + pw->batch_map); + dma_fence_end_signalling(cookie); + + return ret; } static void __eb_parse_release(struct dma_fence_work *work) { struct eb_parse_work *pw = container_of(work, typeof(*pw), base); + if (!IS_ERR_OR_NULL(pw->jump_whitelist)) + kfree(pw->jump_whitelist); + + if (pw->batch_map) + i915_gem_object_unpin_map(pw->batch->obj); + else + i915_gem_object_unpin_pages(pw->batch->obj); + + i915_gem_object_unpin_map(pw->shadow->obj); + if (pw->trampoline) i915_active_release(&pw->trampoline->active); i915_active_release(&pw->shadow->active); @@ -2341,6 +2363,8 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb, struct i915_vma *trampoline) { struct eb_parse_work *pw; + struct drm_i915_gem_object *batch = eb->batch->vma->obj; + bool needs_clflush; int err; GEM_BUG_ON(overflows_type(eb->batch_start_offset, pw->batch_offset)); @@ -2364,6 +2388,34 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb, goto err_shadow; } + pw->shadow_map = i915_gem_object_pin_map(shadow->obj, I915_MAP_FORCE_WB); + if (IS_ERR(pw->shadow_map)) { + err = PTR_ERR(pw->shadow_map); + goto err_trampoline; + } + + needs_clflush = + !(batch->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ); + + pw->batch_map = ERR_PTR(-ENODEV); + if (needs_clflush && i915_has_memcpy_from_wc()) + pw->batch_map = i915_gem_object_pin_map(batch, I915_MAP_WC); + + if (IS_ERR(pw->batch_map)) { + err = i915_gem_object_pin_pages(batch); + if (err) +
[Intel-gfx] [PATCH v6 25/64] drm/i915: Take reservation lock around i915_vma_pin.
We previously complained when ww == NULL. This function is now only used in selftests to pin an object, and ww locking is now fixed. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- .../i915/gem/selftests/i915_gem_coherency.c | 14 + drivers/gpu/drm/i915/i915_gem.c | 6 +- drivers/gpu/drm/i915/i915_vma.c | 4 +--- drivers/gpu/drm/i915/i915_vma.h | 20 +++ 4 files changed, 27 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c index 1117d2a44518..654912abaeb4 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c @@ -200,16 +200,14 @@ static int gpu_set(struct context *ctx, unsigned long offset, u32 v) u32 *cs; int err; + vma = i915_gem_object_ggtt_pin(ctx->obj, NULL, 0, 0, 0); + if (IS_ERR(vma)) + return PTR_ERR(vma); + i915_gem_object_lock(ctx->obj, NULL); err = i915_gem_object_set_to_gtt_domain(ctx->obj, true); if (err) - goto out_unlock; - - vma = i915_gem_object_ggtt_pin(ctx->obj, NULL, 0, 0, 0); - if (IS_ERR(vma)) { - err = PTR_ERR(vma); - goto out_unlock; - } + goto out_unpin; rq = intel_engine_create_kernel_request(ctx->engine); if (IS_ERR(rq)) { @@ -249,9 +247,7 @@ static int gpu_set(struct context *ctx, unsigned long offset, u32 v) i915_request_add(rq); out_unpin: i915_vma_unpin(vma); -out_unlock: i915_gem_object_unlock(ctx->obj); - return err; } diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index d138a8b18a96..c5d0f3887d29 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1016,7 +1016,11 @@ i915_gem_object_ggtt_pin_ww(struct drm_i915_gem_object *obj, return ERR_PTR(ret); } - ret = i915_vma_pin_ww(vma, ww, size, alignment, flags | PIN_GLOBAL); + if (ww) + ret = i915_vma_pin_ww(vma, ww, size, alignment, flags | PIN_GLOBAL); + else + ret = i915_vma_pin(vma, size, alignment, flags | PIN_GLOBAL); + if (ret) return ERR_PTR(ret); diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index 1ffda2aaa7a0..265e3a3079e2 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -863,9 +863,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww, int err; #ifdef CONFIG_PROVE_LOCKING - if (debug_locks && lockdep_is_held(&vma->vm->i915->drm.struct_mutex)) - WARN_ON(!ww); - if (debug_locks && ww && vma->resv) + if (debug_locks && !WARN_ON(!ww) && vma->resv) assert_vma_held(vma); #endif diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index 3c951d5428cf..cea6e7b8611b 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -246,10 +246,22 @@ i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww, static inline int __must_check i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) { -#ifdef CONFIG_LOCKDEP - WARN_ON_ONCE(vma->resv && dma_resv_held(vma->resv)); -#endif - return i915_vma_pin_ww(vma, NULL, size, alignment, flags); + struct i915_gem_ww_ctx ww; + int err; + + i915_gem_ww_ctx_init(&ww, true); +retry: + err = i915_gem_object_lock(vma->obj, &ww); + if (!err) + err = i915_vma_pin_ww(vma, &ww, size, alignment, flags); + if (err == -EDEADLK) { + err = i915_gem_ww_ctx_backoff(&ww); + if (!err) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); + + return err; } int i915_ggtt_pin(struct i915_vma *vma, struct i915_gem_ww_ctx *ww, -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 44/64] drm/i915/selftests: Prepare context tests for obj->mm.lock removal.
Straightforward conversion, just convert a bunch of calls to unlocked versions. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c index d3f87dc4eda3..5fef592390cb 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c @@ -1094,7 +1094,7 @@ __read_slice_count(struct intel_context *ce, if (ret < 0) return ret; - buf = i915_gem_object_pin_map(obj, I915_MAP_WB); + buf = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB); if (IS_ERR(buf)) { ret = PTR_ERR(buf); return ret; @@ -1511,7 +1511,7 @@ static int write_to_scratch(struct i915_gem_context *ctx, if (IS_ERR(obj)) return PTR_ERR(obj); - cmd = i915_gem_object_pin_map(obj, I915_MAP_WB); + cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB); if (IS_ERR(cmd)) { err = PTR_ERR(cmd); goto out; @@ -1622,7 +1622,7 @@ static int read_from_scratch(struct i915_gem_context *ctx, if (err) goto out_vm; - cmd = i915_gem_object_pin_map(obj, I915_MAP_WB); + cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB); if (IS_ERR(cmd)) { err = PTR_ERR(cmd); goto out; @@ -1658,7 +1658,7 @@ static int read_from_scratch(struct i915_gem_context *ctx, if (err) goto out_vm; - cmd = i915_gem_object_pin_map(obj, I915_MAP_WB); + cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB); if (IS_ERR(cmd)) { err = PTR_ERR(cmd); goto out; @@ -1715,7 +1715,7 @@ static int read_from_scratch(struct i915_gem_context *ctx, if (err) goto out_vm; - cmd = i915_gem_object_pin_map(obj, I915_MAP_WB); + cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB); if (IS_ERR(cmd)) { err = PTR_ERR(cmd); goto out_vm; -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 08/64] drm/i915: Rework struct phys attachment handling
Instead of creating a separate object type, we make changes to the shmem type, to clear struct page backing. This will allow us to ensure we never run into a race when we exchange obj->ops with other function pointers. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_object.h| 8 ++ drivers/gpu/drm/i915/gem/i915_gem_phys.c | 102 +- drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 22 +++- .../drm/i915/gem/selftests/i915_gem_phys.c| 6 -- 4 files changed, 78 insertions(+), 60 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 04c29ed93632..aa55afc0c858 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -37,7 +37,15 @@ void __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages, bool needs_clflush); +int i915_gem_object_pwrite_phys(struct drm_i915_gem_object *obj, + const struct drm_i915_gem_pwrite *args); +int i915_gem_object_pread_phys(struct drm_i915_gem_object *obj, + const struct drm_i915_gem_pread *args); + int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align); +void i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj, + struct sg_table *pages); + void i915_gem_flush_free_objects(struct drm_i915_private *i915); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c index 965590d3a570..4bdd0429c08b 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c @@ -76,6 +76,8 @@ static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj) intel_gt_chipset_flush(&to_i915(obj->base.dev)->gt); + /* We're no longer struct page backed */ + obj->flags &= ~I915_BO_ALLOC_STRUCT_PAGE; __i915_gem_object_set_pages(obj, st, sg->length); return 0; @@ -89,7 +91,7 @@ static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj) return -ENOMEM; } -static void +void i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj, struct sg_table *pages) { @@ -134,9 +136,8 @@ i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj, vaddr, dma); } -static int -phys_pwrite(struct drm_i915_gem_object *obj, - const struct drm_i915_gem_pwrite *args) +int i915_gem_object_pwrite_phys(struct drm_i915_gem_object *obj, + const struct drm_i915_gem_pwrite *args) { void *vaddr = sg_page(obj->mm.pages->sgl) + args->offset; char __user *user_data = u64_to_user_ptr(args->data_ptr); @@ -165,9 +166,8 @@ phys_pwrite(struct drm_i915_gem_object *obj, return 0; } -static int -phys_pread(struct drm_i915_gem_object *obj, - const struct drm_i915_gem_pread *args) +int i915_gem_object_pread_phys(struct drm_i915_gem_object *obj, + const struct drm_i915_gem_pread *args) { void *vaddr = sg_page(obj->mm.pages->sgl) + args->offset; char __user *user_data = u64_to_user_ptr(args->data_ptr); @@ -186,86 +186,82 @@ phys_pread(struct drm_i915_gem_object *obj, return 0; } -static void phys_release(struct drm_i915_gem_object *obj) +static int i915_gem_object_shmem_to_phys(struct drm_i915_gem_object *obj) { - fput(obj->base.filp); -} + struct sg_table *pages; + int err; -static const struct drm_i915_gem_object_ops i915_gem_phys_ops = { - .name = "i915_gem_object_phys", - .get_pages = i915_gem_object_get_pages_phys, - .put_pages = i915_gem_object_put_pages_phys, + pages = __i915_gem_object_unset_pages(obj); + + err = i915_gem_object_get_pages_phys(obj); + if (err) + goto err_xfer; - .pread = phys_pread, - .pwrite = phys_pwrite, + /* Perma-pin (until release) the physical set of pages */ + __i915_gem_object_pin_pages(obj); - .release = phys_release, -}; + if (!IS_ERR_OR_NULL(pages)) + i915_gem_shmem_ops.put_pages(obj, pages); + + i915_gem_object_release_memory_region(obj); + return 0; + +err_xfer: + if (!IS_ERR_OR_NULL(pages)) { + unsigned int sg_page_sizes = i915_sg_page_sizes(pages->sgl); + + __i915_gem_object_set_pages(obj, pages, sg_page_sizes); + } + return err; +} int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align) { - struct sg_table *pages; int err; if (align > obj->base.size) return -EINVAL; - if (obj-&g
[Intel-gfx] [PATCH v6 61/64] drm/i915: Finally remove obj->mm.lock.
With all callers and selftests fixed to use ww locking, we can now finally remove this lock. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_object.c| 2 - drivers/gpu/drm/i915/gem/i915_gem_object.h| 5 +-- .../gpu/drm/i915/gem/i915_gem_object_types.h | 1 - drivers/gpu/drm/i915/gem/i915_gem_pages.c | 43 --- drivers/gpu/drm/i915/gem/i915_gem_phys.c | 34 --- drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 37 +++- drivers/gpu/drm/i915/gem/i915_gem_shrinker.h | 4 +- drivers/gpu/drm/i915/gem/i915_gem_tiling.c| 2 - drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 3 +- drivers/gpu/drm/i915/i915_debugfs.c | 4 +- drivers/gpu/drm/i915/i915_gem.c | 8 +--- drivers/gpu/drm/i915/i915_gem_gtt.c | 2 +- 13 files changed, 55 insertions(+), 92 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 028a556ab1a5..08d806bbf48e 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -62,8 +62,6 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj, const struct drm_i915_gem_object_ops *ops, struct lock_class_key *key, unsigned flags) { - mutex_init(&obj->mm.lock); - spin_lock_init(&obj->vma.lock); INIT_LIST_HEAD(&obj->vma.list); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 3fe41180cc81..748b03aeb3c6 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -129,7 +129,7 @@ static inline void assert_object_held_shared(struct drm_i915_gem_object *obj) */ if (IS_ENABLED(CONFIG_LOCKDEP) && kref_read(&obj->base.refcount) > 0) - lockdep_assert_held(&obj->mm.lock); + assert_object_held(obj); } static inline int __i915_gem_object_lock(struct drm_i915_gem_object *obj, @@ -334,7 +334,7 @@ int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj); static inline int __must_check i915_gem_object_pin_pages(struct drm_i915_gem_object *obj) { - might_lock(&obj->mm.lock); + assert_object_held(obj); if (atomic_inc_not_zero(&obj->mm.pages_pin_count)) return 0; @@ -380,7 +380,6 @@ i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj) } int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj); -int __i915_gem_object_put_pages_locked(struct drm_i915_gem_object *obj); void i915_gem_object_truncate(struct drm_i915_gem_object *obj); void i915_gem_object_writeback(struct drm_i915_gem_object *obj); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h index 5234c1ed62d4..b172e8cc53ab 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -209,7 +209,6 @@ struct drm_i915_gem_object { * Protects the pages and their use. Do not use directly, but * instead go through the pin/unpin interfaces. */ - struct mutex lock; atomic_t pages_pin_count; atomic_t shrink_pin; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c index 8a6c0dd1bfdd..27d5261af949 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -67,7 +67,7 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, struct list_head *list; unsigned long flags; - lockdep_assert_held(&obj->mm.lock); + assert_object_held(obj); spin_lock_irqsave(&i915->mm.obj_lock, flags); i915->mm.shrink_count++; @@ -114,9 +114,7 @@ int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj) { int err; - err = mutex_lock_interruptible(&obj->mm.lock); - if (err) - return err; + assert_object_held(obj); assert_object_held_shared(obj); @@ -125,15 +123,13 @@ int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj) err = i915_gem_object_get_pages(obj); if (err) - goto unlock; + return err; smp_mb__before_atomic(); } atomic_inc(&obj->mm.pages_pin_count); -unlock: - mutex_unlock(&obj->mm.lock); - return err; + return 0; } int i915_gem_object_pin_pages_unlocked(struct drm_i915_gem_object *obj) @@ -220,7 +216,7 @@ __i915_gem_object_unset_pages(st
[Intel-gfx] [PATCH v6 52/64] drm/i915/selftests: Prepare hangcheck for obj->mm.lock removal
Convert a few calls to use the unlocked versions. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c index c28d1fcad673..403edf26a338 100644 --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c @@ -80,15 +80,15 @@ static int hang_init(struct hang *h, struct intel_gt *gt) } i915_gem_object_set_cache_coherency(h->hws, I915_CACHE_LLC); - vaddr = i915_gem_object_pin_map(h->hws, I915_MAP_WB); + vaddr = i915_gem_object_pin_map_unlocked(h->hws, I915_MAP_WB); if (IS_ERR(vaddr)) { err = PTR_ERR(vaddr); goto err_obj; } h->seqno = memset(vaddr, 0xff, PAGE_SIZE); - vaddr = i915_gem_object_pin_map(h->obj, - i915_coherent_map_type(gt->i915)); + vaddr = i915_gem_object_pin_map_unlocked(h->obj, + i915_coherent_map_type(gt->i915)); if (IS_ERR(vaddr)) { err = PTR_ERR(vaddr); goto err_unpin_hws; @@ -149,7 +149,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine) return ERR_CAST(obj); } - vaddr = i915_gem_object_pin_map(obj, i915_coherent_map_type(gt->i915)); + vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915)); if (IS_ERR(vaddr)) { i915_gem_object_put(obj); i915_vm_put(vm); -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 16/64] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v5.
Instead of doing what we do currently, which will never work with PROVE_LOCKING, do the same as AMD does, and something similar to relocation slowpath. When all locks are dropped, we acquire the pages for pinning. When the locks are taken, we transfer those pages in .get_pages() to the bo. As a final check before installing the fences, we ensure that the mmu notifier was not called; if it is, we return -EAGAIN to userspace to signal it has to start over. Changes since v1: - Unbinding is done in submit_init only. submit_begin() removed. - MMU_NOTFIER -> MMU_NOTIFIER Changes since v2: - Make i915->mm.notifier a spinlock. Changes since v3: - Add WARN_ON if there are any page references left, should have been 0. - Return 0 on success in submit_init(), bug from spinlock conversion. - Release pvec outside of notifier_lock (Thomas). Changes since v4: - Mention why we're clearing eb->[i + 1].vma in the code. (Thomas) - Actually check all invalidations in eb_move_to_gpu. (Thomas) - Do not wait when process is exiting to fix gem_ctx_persistence.userptr. Signed-off-by: Maarten Lankhorst --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 101 ++- drivers/gpu/drm/i915/gem/i915_gem_object.h| 35 +- .../gpu/drm/i915/gem/i915_gem_object_types.h | 10 +- drivers/gpu/drm/i915/gem/i915_gem_pages.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 765 ++ drivers/gpu/drm/i915/i915_drv.h | 9 +- drivers/gpu/drm/i915/i915_gem.c | 5 +- 7 files changed, 344 insertions(+), 583 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 77de94e45f55..d4c065c1da06 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -53,14 +53,16 @@ enum { /* __EXEC_OBJECT_NO_RESERVE is BIT(31), defined in i915_vma.h */ #define __EXEC_OBJECT_HAS_PIN BIT(30) #define __EXEC_OBJECT_HAS_FENCEBIT(29) -#define __EXEC_OBJECT_NEEDS_MAPBIT(28) -#define __EXEC_OBJECT_NEEDS_BIAS BIT(27) -#define __EXEC_OBJECT_INTERNAL_FLAGS (~0u << 27) /* all of the above + */ +#define __EXEC_OBJECT_USERPTR_INIT BIT(28) +#define __EXEC_OBJECT_NEEDS_MAPBIT(27) +#define __EXEC_OBJECT_NEEDS_BIAS BIT(26) +#define __EXEC_OBJECT_INTERNAL_FLAGS (~0u << 26) /* all of the above + */ #define __EXEC_OBJECT_RESERVED (__EXEC_OBJECT_HAS_PIN | __EXEC_OBJECT_HAS_FENCE) #define __EXEC_HAS_RELOC BIT(31) #define __EXEC_ENGINE_PINNED BIT(30) -#define __EXEC_INTERNAL_FLAGS (~0u << 30) +#define __EXEC_USERPTR_USEDBIT(29) +#define __EXEC_INTERNAL_FLAGS (~0u << 29) #define UPDATE PIN_OFFSET_FIXED #define BATCH_OFFSET_BIAS (256*1024) @@ -864,6 +866,26 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb) } eb_add_vma(eb, i, batch, vma); + + if (i915_gem_object_is_userptr(vma->obj)) { + err = i915_gem_object_userptr_submit_init(vma->obj); + if (err) { + if (i + 1 < eb->buffer_count) { + /* +* Execbuffer code expects last vma entry to be NULL, +* since we already initialized this entry, +* set the next value to NULL or we mess up +* cleanup handling. +*/ + eb->vma[i + 1].vma = NULL; + } + + return err; + } + + eb->vma[i].flags |= __EXEC_OBJECT_USERPTR_INIT; + eb->args->flags |= __EXEC_USERPTR_USED; + } } if (unlikely(eb->batch->flags & EXEC_OBJECT_WRITE)) { @@ -965,7 +987,7 @@ eb_get_vma(const struct i915_execbuffer *eb, unsigned long handle) } } -static void eb_release_vmas(struct i915_execbuffer *eb, bool final) +static void eb_release_vmas(struct i915_execbuffer *eb, bool final, bool release_userptr) { const unsigned int count = eb->buffer_count; unsigned int i; @@ -979,6 +1001,11 @@ static void eb_release_vmas(struct i915_execbuffer *eb, bool final) eb_unreserve_vma(ev); + if (release_userptr && ev->flags & __EXEC_OBJECT_USERPTR_INIT) { + ev->flags &= ~__EXEC_OBJECT_USERPTR_INIT; + i915_gem_object_userptr_submit_fini(vma->obj); + } + if (final) i915_vma_put(vma); } @@ -1916,6 +1943,31 @@ static int eb_prefault_relocations(const struct i915_execbuffer *e
[Intel-gfx] [PATCH v6 13/64] drm/i915: Reject more ioctls for userptr
There are a couple of ioctl's related to tiling and cache placement, that make no sense for userptr, reject those: - i915_gem_set_tiling_ioctl() Tiling should always be linear for userptr. Changing placement will fail with -ENXIO. - i915_gem_set_caching_ioctl() Userptr memory should always be cached. Changing will fail with -ENXIO. - i915_gem_set_domain_ioctl() Changed to be equivalent to gem_wait, which is correct for the cached linear userptr pointers. This is required because we cannot grab a reference to the pages in the rework, but waiting for idle will do the same. This plus the previous changes have been tested against beignet by using its own unit tests, and intel-video-compute by using piglit's opencl tests. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström -- Still needs an ack from relevant userspace that it won't break, but should be good. --- drivers/gpu/drm/i915/display/intel_display.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_domain.c | 4 +++- drivers/gpu/drm/i915/gem/i915_gem_object.h | 6 ++ drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 3 ++- 4 files changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 0189d379a55e..c8df42193287 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -16399,7 +16399,7 @@ static int intel_user_framebuffer_create_handle(struct drm_framebuffer *fb, struct drm_i915_gem_object *obj = intel_fb_obj(fb); struct drm_i915_private *i915 = to_i915(obj->base.dev); - if (obj->userptr.mm) { + if (i915_gem_object_is_userptr(obj)) { drm_dbg(&i915->drm, "attempting to use a userptr for a framebuffer, denied\n"); return -EINVAL; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c index fcce6909f201..c1d4bf62b3ea 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c @@ -528,7 +528,9 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, * considered to be outside of any cache domain. */ if (i915_gem_object_is_proxy(obj)) { - err = -ENXIO; + /* silently allow userptr to complete */ + if (!i915_gem_object_is_userptr(obj)) + err = -ENXIO; goto out; } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index fa699d45e5a9..2ba6d114182c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -534,6 +534,12 @@ void __i915_gem_object_flush_frontbuffer(struct drm_i915_gem_object *obj, void __i915_gem_object_invalidate_frontbuffer(struct drm_i915_gem_object *obj, enum fb_op_origin origin); +static inline bool +i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) +{ + return obj->userptr.mm; +} + static inline void i915_gem_object_flush_frontbuffer(struct drm_i915_gem_object *obj, enum fb_op_origin origin) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c index 44af6265948d..64a946d5f753 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c @@ -721,7 +721,8 @@ static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = { .name = "i915_gem_object_userptr", .flags = I915_GEM_OBJECT_IS_SHRINKABLE | I915_GEM_OBJECT_NO_MMAP | -I915_GEM_OBJECT_ASYNC_CANCEL, +I915_GEM_OBJECT_ASYNC_CANCEL | +I915_GEM_OBJECT_IS_PROXY, .get_pages = i915_gem_userptr_get_pages, .put_pages = i915_gem_userptr_put_pages, .dmabuf_export = i915_gem_userptr_dmabuf_export, -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 04/64] drm/i915: Add missing -EDEADLK handling to execbuf pinning, v2.
i915_vma_pin may fail with -EDEADLK when we start locking page tables, so ensure we handle this correctly. Changes since v1: - Drop -EDEADLK todo, this commit handles it. - Change eb_pin_vma from sort-of-bool + -EDEADLK to a proper int. (Matt) Cc: Matthew Brost Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 35 +-- 1 file changed, 24 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index fe985e1f843d..d5666157dfa6 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -420,13 +420,14 @@ static u64 eb_pin_flags(const struct drm_i915_gem_exec_object2 *entry, return pin_flags; } -static inline bool +static inline int eb_pin_vma(struct i915_execbuffer *eb, const struct drm_i915_gem_exec_object2 *entry, struct eb_vma *ev) { struct i915_vma *vma = ev->vma; u64 pin_flags; + int err; if (vma->node.size) pin_flags = vma->node.start; @@ -438,24 +439,29 @@ eb_pin_vma(struct i915_execbuffer *eb, pin_flags |= PIN_GLOBAL; /* Attempt to reuse the current location if available */ - /* TODO: Add -EDEADLK handling here */ - if (unlikely(i915_vma_pin_ww(vma, &eb->ww, 0, 0, pin_flags))) { + err = i915_vma_pin_ww(vma, &eb->ww, 0, 0, pin_flags); + if (err == -EDEADLK) + return err; + + if (unlikely(err)) { if (entry->flags & EXEC_OBJECT_PINNED) - return false; + return err; /* Failing that pick any _free_ space if suitable */ - if (unlikely(i915_vma_pin_ww(vma, &eb->ww, + err = i915_vma_pin_ww(vma, &eb->ww, entry->pad_to_size, entry->alignment, eb_pin_flags(entry, ev->flags) | -PIN_USER | PIN_NOEVICT))) - return false; +PIN_USER | PIN_NOEVICT); + if (unlikely(err)) + return err; } if (unlikely(ev->flags & EXEC_OBJECT_NEEDS_FENCE)) { - if (unlikely(i915_vma_pin_fence(vma))) { + err = i915_vma_pin_fence(vma); + if (unlikely(err)) { i915_vma_unpin(vma); - return false; + return err; } if (vma->fence) @@ -463,7 +469,10 @@ eb_pin_vma(struct i915_execbuffer *eb, } ev->flags |= __EXEC_OBJECT_HAS_PIN; - return !eb_vma_misplaced(entry, vma, ev->flags); + if (eb_vma_misplaced(entry, vma, ev->flags)) + return -EBADSLT; + + return 0; } static inline void @@ -899,7 +908,11 @@ static int eb_validate_vmas(struct i915_execbuffer *eb) if (err) return err; - if (eb_pin_vma(eb, entry, ev)) { + err = eb_pin_vma(eb, entry, ev); + if (err == -EDEADLK) + return err; + + if (!err) { if (entry->offset != vma->node.start) { entry->offset = vma->node.start | UPDATE; eb->args->flags |= __EXEC_HAS_RELOC; -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v6 20/64] drm/i915: Handle ww locking in init_status_page
Try to pin to ggtt first, and use a full ww loop to handle eviction correctly. Signed-off-by: Maarten Lankhorst Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 37 +++ 1 file changed, 24 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 1db608d1f26f..ace6fcbd1f38 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -618,6 +618,7 @@ static void cleanup_status_page(struct intel_engine_cs *engine) } static int pin_ggtt_status_page(struct intel_engine_cs *engine, + struct i915_gem_ww_ctx *ww, struct i915_vma *vma) { unsigned int flags; @@ -638,12 +639,13 @@ static int pin_ggtt_status_page(struct intel_engine_cs *engine, else flags = PIN_HIGH; - return i915_ggtt_pin(vma, NULL, 0, flags); + return i915_ggtt_pin(vma, ww, 0, flags); } static int init_status_page(struct intel_engine_cs *engine) { struct drm_i915_gem_object *obj; + struct i915_gem_ww_ctx ww; struct i915_vma *vma; void *vaddr; int ret; @@ -669,30 +671,39 @@ static int init_status_page(struct intel_engine_cs *engine) vma = i915_vma_instance(obj, &engine->gt->ggtt->vm, NULL); if (IS_ERR(vma)) { ret = PTR_ERR(vma); - goto err; + goto err_put; } + i915_gem_ww_ctx_init(&ww, true); +retry: + ret = i915_gem_object_lock(obj, &ww); + if (!ret && !HWS_NEEDS_PHYSICAL(engine->i915)) + ret = pin_ggtt_status_page(engine, &ww, vma); + if (ret) + goto err; + vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB); if (IS_ERR(vaddr)) { ret = PTR_ERR(vaddr); - goto err; + goto err_unpin; } engine->status_page.addr = memset(vaddr, 0, PAGE_SIZE); engine->status_page.vma = vma; - if (!HWS_NEEDS_PHYSICAL(engine->i915)) { - ret = pin_ggtt_status_page(engine, vma); - if (ret) - goto err_unpin; - } - - return 0; - err_unpin: - i915_gem_object_unpin_map(obj); + if (ret) + i915_vma_unpin(vma); err: - i915_gem_object_put(obj); + if (ret == -EDEADLK) { + ret = i915_gem_ww_ctx_backoff(&ww); + if (!ret) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); +err_put: + if (ret) + i915_gem_object_put(obj); return ret; } -- 2.30.0.rc1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PULL] drm-misc-next
ozlowski (1): drm/ingenic: depend on COMMON_CLK to fix compile tests Laurent Pinchart (1): drm: Remove drmm_add_final_kfree() declaration from public headers Linus Walleij (3): dt-bindings: display: mcde: Convert to YAML schema drm/panel: s6e63m0: Fix init sequence again drm/panel: s6e63m0: Support max-brightness Luben Tuikov (4): drm/scheduler: "node" --> "list" gpu/drm: ring_mirror_list --> pending_list drm/scheduler: Essentialize the job done callback drm/sched: Add missing structure comment Maarten Lankhorst (1): Merge drm/drm-next into drm-misc-next Maxime Ripard (20): drm/vc4: hdmi: Don't poll for the infoframes status on setup drm/vc4: drv: Remove the DSI pointer in vc4_drv drm/vc4: dsi: Use snprintf for the PHY clocks instead of an array drm/vc4: dsi: Introduce a variant structure drm: Introduce an atomic_commit_setup function drm: Document use-after-free gotcha with private objects drm/vc4: Simplify a bit the global atomic_check drm/vc4: kms: Wait on previous FIFO users before a commit drm/vc4: kms: Remove unassigned_channels from the HVS state drm/vc4: kms: Remove async modeset semaphore drm/vc4: kms: Convert to atomic helpers drm/vc4: hvs: Align the HVS atomic hooks to the new API drm/vc4: Pass the atomic state to encoder hooks drm/vc4: hdmi: Take into account the clock doubling flag in atomic_check drm/vc4: hdmi: Don't access the connector state in reset if kmalloc fails drm/vc4: hdmi: Create a custom connector state drm/vc4: hdmi: Store pixel frequency in the connector state drm/vc4: hdmi: Use the connector state pixel rate for the PHY drm/vc4: hdmi: Limit the BCM2711 to the max without scrambling drm/vc4: hdmi: Enable 10/12 bpc output Neil Armstrong (2): dt-bindings: panel-simple-dsi: add Khadas TS050 panel bindings drm: panel: add Khadas TS050 panel driver Nirmoy Das (1): drm/amdgpu: clean up bo in vce and vcn test Paul Cercueil (4): drm/ingenic: Add basic PM support drm/ingenic: Compute timings according to adjusted_mode->crtc_* drm/ingenic: Properly compute timings when using a 3x8-bit panel drm/ingenic: Add support for serial 8-bit delta-RGB panels Randy Dunlap (1): fbdev: aty: SPARC64 requires FB_ATY_CT Sam Ravnborg (35): video: Fix kernel-doc warnings in of_display_timing + of_videomode video: fbcon: Fix warnings by using pr_debug() in fbcon video: fbdev: s1d13xxxfb: Fix kernel-doc and set but not used warnings video: fbdev: aty: Delete unused variable in radeon_monitor video: fbdev: aty: Fix set but not used warnings video: fbdev: aty: Fix set but not used warnings in mach64_ct video: fbdev: sis: Fix defined but not used warnings video: fbdev: sis: Fix defined but not used warning of SiS_TVDelay video: fbdev: sis: Fix set but not used warnings in init.c video: fbdev: sis: Fix set but not used warnings in sis_main video: fbdev: via: Fix set but not used warning for mode_crt_table video: fbdev: tdfx: Fix set but not used warning in att_outb() video: fbdev: riva: Fix kernel-doc and set but not used warnings video: fbdev: pm2fb: Fix kernel-doc warnings video: fbdev: tgafb: Fix kernel-doc and set but not used warnings video: fbdev: mx3fb: Fix kernel-doc, set but not used and string warnings video: fbdev: sstfb: Updated logging to fix set but not used warnings video: fbdev: neofb: Fix set but not used warning for CursorMem video: fbdev: nvidia: Fix set but not used warnings video: fbdev: omapfb: Fix set but not used warnings in dsi video: fbdev: s3c-fb: Fix kernel-doc and set but not used warnings video: fbdev: uvesafb: Fix string related warnings video: fbdev: cirrusfb: Fix kernel-doc and set but not used warnings video: fbdev: hgafb: Fix kernel-doc warnings video: fbdev: core: Fix kernel-doc warnings in fbmon + fb_notify video: fbdev: omapfb: Fix set but not used warnings in hdmi*_core video: fbdev: uvesafb: Fix set but not used warning video: fbdev: sparc drivers: fix kernel-doc warnings for blank_mode video: fbdev: mmp: Fix kernel-doc warning for lcd_spi_write video: fbdev: wmt_ge_rops: Fix function not declared warnings video: fbdev: goldfishfb: Fix defined but not used warning video: fbdev: gbefb: Fix set but not used warning video: fbdev: efifb: Fix set but not used warning for screen_pitch video: fbdev: controlfb: Fix set but not used warnings video: fbdev: sis: Drop useless call to SiS_GetResInfo() Sebastian Reichel (49): Revert "drm/omap: dss: Remove unused omap_dss_device operations" drm/omap: drop unused dsi.configure_pins drm/omap: dsi: use MIPI_DSI_FMT_* instead of OMAP_
[Intel-gfx] [PATCH] drm/i915: Only set bind_async_flags when concurrent access wa is not active, v2.
We need to make the BSW workaround actually work. We correctly fixed the mutex nesting, but forgot to kill the worker. The worker is killed by clearing async_flags, and just running bind_vma synchronously. This still needs the stash, because we cannot allocate and pin with vm->mutex already held. Changes since v1: - Fix null pointer dereference when we forget to pass the work stash, it's still required to prealloc all on affected platforms. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gt/gen6_ppgtt.c | 4 +++- drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 4 +++- drivers/gpu/drm/i915/i915_vma.c | 4 ++-- 3 files changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c index 1aee5e6b1b23..de3aa79b788e 100644 --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c @@ -433,7 +433,9 @@ struct i915_ppgtt *gen6_ppgtt_create(struct intel_gt *gt) ppgtt->base.vm.pd_shift = ilog2(SZ_4K * SZ_4K / sizeof(gen6_pte_t)); ppgtt->base.vm.top = 1; - ppgtt->base.vm.bind_async_flags = I915_VMA_LOCAL_BIND; + if (!intel_vm_no_concurrent_access_wa(gt->i915)) + ppgtt->base.vm.bind_async_flags = I915_VMA_LOCAL_BIND; + ppgtt->base.vm.allocate_va_range = gen6_alloc_va_range; ppgtt->base.vm.clear_range = gen6_ppgtt_clear_range; ppgtt->base.vm.insert_entries = gen6_ppgtt_insert_entries; diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c index e3a8924d2286..aa58b0e48ae1 100644 --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c @@ -732,7 +732,9 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt) goto err_free_pd; } - ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND; + if (!intel_vm_no_concurrent_access_wa(gt->i915)) + ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND; + ppgtt->vm.insert_entries = gen8_ppgtt_insert; ppgtt->vm.allocate_va_range = gen8_ppgtt_alloc; ppgtt->vm.clear_range = gen8_ppgtt_clear; diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index b319fd3f91cc..d550ee911e68 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -436,7 +436,7 @@ int i915_vma_bind(struct i915_vma *vma, work->pinned = i915_gem_object_get(vma->obj); } } else { - vma->ops->bind_vma(vma->vm, NULL, vma, cache_level, bind_flags); + vma->ops->bind_vma(vma->vm, work ? &work->stash : NULL, vma, cache_level, bind_flags); } atomic_or(bind_flags, &vma->flags); @@ -895,7 +895,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww, if (flags & PIN_GLOBAL) wakeref = intel_runtime_pm_get(&vma->vm->i915->runtime_pm); - if (flags & vma->vm->bind_async_flags) { + if ((flags & vma->vm->bind_async_flags) || vma->vm->allocate_va_range) { /* lock VM */ err = i915_vm_lock_objects(vma->vm, ww); if (err) -- 2.31.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Only set bind_async_flags when concurrent access wa is not active, v3.
We need to make the BSW workaround actually work. We correctly fixed the mutex nesting, but forgot to kill the worker. The worker is killed by clearing async_flags, and just running bind_vma synchronously. This still needs the stash, because we cannot allocate and pin with vm->mutex already held. Changes since v1: - Fix null pointer dereference when we forget to pass the work stash, it's still required to prealloc all on affected platforms. Changes since v2: - Clear bind_async_flags correctly on ggtt w/a. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gt/gen6_ppgtt.c | 4 +++- drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 4 +++- drivers/gpu/drm/i915/gt/intel_ggtt.c | 2 -- drivers/gpu/drm/i915/i915_vma.c | 4 ++-- 4 files changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c index 1aee5e6b1b23..de3aa79b788e 100644 --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c @@ -433,7 +433,9 @@ struct i915_ppgtt *gen6_ppgtt_create(struct intel_gt *gt) ppgtt->base.vm.pd_shift = ilog2(SZ_4K * SZ_4K / sizeof(gen6_pte_t)); ppgtt->base.vm.top = 1; - ppgtt->base.vm.bind_async_flags = I915_VMA_LOCAL_BIND; + if (!intel_vm_no_concurrent_access_wa(gt->i915)) + ppgtt->base.vm.bind_async_flags = I915_VMA_LOCAL_BIND; + ppgtt->base.vm.allocate_va_range = gen6_alloc_va_range; ppgtt->base.vm.clear_range = gen6_ppgtt_clear_range; ppgtt->base.vm.insert_entries = gen6_ppgtt_insert_entries; diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c index e3a8924d2286..aa58b0e48ae1 100644 --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c @@ -732,7 +732,9 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt) goto err_free_pd; } - ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND; + if (!intel_vm_no_concurrent_access_wa(gt->i915)) + ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND; + ppgtt->vm.insert_entries = gen8_ppgtt_insert; ppgtt->vm.allocate_va_range = gen8_ppgtt_alloc; ppgtt->vm.clear_range = gen8_ppgtt_clear; diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c index 35069ca5d7de..aafcd0b2ab9b 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -914,8 +914,6 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt) if (intel_vm_no_concurrent_access_wa(i915)) { ggtt->vm.insert_entries = bxt_vtd_ggtt_insert_entries__BKL; ggtt->vm.insert_page= bxt_vtd_ggtt_insert_page__BKL; - ggtt->vm.bind_async_flags = - I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND; } ggtt->invalidate = gen8_ggtt_invalidate; diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index b319fd3f91cc..d550ee911e68 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -436,7 +436,7 @@ int i915_vma_bind(struct i915_vma *vma, work->pinned = i915_gem_object_get(vma->obj); } } else { - vma->ops->bind_vma(vma->vm, NULL, vma, cache_level, bind_flags); + vma->ops->bind_vma(vma->vm, work ? &work->stash : NULL, vma, cache_level, bind_flags); } atomic_or(bind_flags, &vma->flags); @@ -895,7 +895,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww, if (flags & PIN_GLOBAL) wakeref = intel_runtime_pm_get(&vma->vm->i915->runtime_pm); - if (flags & vma->vm->bind_async_flags) { + if ((flags & vma->vm->bind_async_flags) || vma->vm->allocate_va_range) { /* lock VM */ err = i915_vm_lock_objects(vma->vm, ww); if (err) -- 2.31.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH V3] x86/gpu: add JasperLake to gen11 early quirks
Op 08-06-2021 om 07:34 schreef Tejas Upadhyay: > Let's reserve JSL stolen memory for graphics. > > JasperLake is a gen11 platform which is compatible with > ICL/EHL changes. > > Required due to below reference patch: > > commit 24ea098b7c0d80b56d62a200608e0b029056baf6 > drm/i915/jsl: Split EHL/JSL platform info and PCI ids > > V2: > - Added maintainer list in cc > - Added patch ref in commit message > V1: > - Added Cc: x...@kernel.org > > Cc: Thomas Gleixner > Cc: Ingo Molnar > Cc: Borislav Petkov > Cc: "H. Peter Anvin" > Cc: x...@kernel.org > Cc: José Roberto de Souza > Signed-off-by: Tejas Upadhyay > --- > arch/x86/kernel/early-quirks.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c > index b553ffe9b985..38837dad46e6 100644 > --- a/arch/x86/kernel/early-quirks.c > +++ b/arch/x86/kernel/early-quirks.c > @@ -549,6 +549,7 @@ static const struct pci_device_id intel_early_ids[] > __initconst = { > INTEL_CNL_IDS(&gen9_early_ops), > INTEL_ICL_11_IDS(&gen11_early_ops), > INTEL_EHL_IDS(&gen11_early_ops), > + INTEL_JSL_IDS(&gen11_early_ops), > INTEL_TGL_12_IDS(&gen11_early_ops), > INTEL_RKL_IDS(&gen11_early_ops), > INTEL_ADLS_IDS(&gen11_early_ops), I cleaned up the commit message slightly, added cc stable and pushed. :) Thanks for patch. ~Maarten ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PULL] topic/i915-ttm
Pull request for drm-misc-next and drm-intel-gt-next. topic/i915-ttm-2021-06-11: drm-misc and drm-intel pull request for topic/i915-ttm: - Convert i915 lmem handling to ttm. - Add a patch to temporarily add a driver_private member to vma_node. - Use this to allow mixed object mmap handling for i915. The following changes since commit 1bd8a7dc28c1c410f1ceefae1f2a97c06d1a67c2: Merge tag 'exynos-drm-next-for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next (2021-06-11 14:19:12 +1000) are available in the Git repository at: git://anongit.freedesktop.org/drm/drm-misc tags/topic/i915-ttm-2021-06-11 for you to fetch changes up to cf3e3e86d77970211e0983130e896ae242601003: drm/i915: Use ttm mmap handling for ttm bo's. (2021-06-11 10:53:25 +0200) drm-misc and drm-intel pull request for topic/i915-ttm: - Convert i915 lmem handling to ttm. - Add a patch to temporarily add a driver_private member to vma_node. - Use this to allow mixed object mmap handling for i915. -------- Maarten Lankhorst (2): drm/vma: Add a driver_private member to vma_node. drm/i915: Use ttm mmap handling for ttm bo's. Thomas Hellström (2): drm/i915/ttm: Introduce a TTM i915 gem object backend drm/i915/lmem: Verify checks for lmem residency drivers/gpu/drm/drm_gem.c | 9 - drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/display/intel_display.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_create.c | 9 +- drivers/gpu/drm/i915/gem/i915_gem_lmem.c | 126 ++-- drivers/gpu/drm/i915/gem/i915_gem_lmem.h | 5 - drivers/gpu/drm/i915/gem/i915_gem_mman.c | 83 ++- drivers/gpu/drm/i915/gem/i915_gem_object.c | 143 +++-- drivers/gpu/drm/i915/gem/i915_gem_object.h | 19 +- drivers/gpu/drm/i915/gem/i915_gem_object_types.h | 30 +- drivers/gpu/drm/i915/gem/i915_gem_pages.c | 3 +- drivers/gpu/drm/i915/gem/i915_gem_region.c | 6 +- drivers/gpu/drm/i915/gem/i915_gem_ttm.c| 647 + drivers/gpu/drm/i915/gem/i915_gem_ttm.h| 48 ++ drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c | 90 +-- drivers/gpu/drm/i915/gt/intel_region_lmem.c| 3 +- drivers/gpu/drm/i915/i915_gem.c| 5 +- drivers/gpu/drm/i915/intel_memory_region.c | 1 - drivers/gpu/drm/i915/intel_memory_region.h | 1 - drivers/gpu/drm/i915/intel_region_ttm.c| 8 +- drivers/gpu/drm/i915/intel_region_ttm.h| 11 +- drivers/gpu/drm/i915/selftests/igt_mmap.c | 25 +- drivers/gpu/drm/i915/selftests/igt_mmap.h | 12 +- include/drm/drm_vma_manager.h | 2 +- 24 files changed, 1039 insertions(+), 250 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_ttm.c create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_ttm.h ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: allow DG1 autoprobe for CONFIG_BROKEN
Op 14-06-2021 om 11:22 schreef Matthew Auld: > Purely for CI so we can get some pre-merge results for DG1. This is > especially useful for cross driver TTM changes where CI can hopefully > catch regressions. This is similar to how we already handle the DG1 > specific uAPI, which are also hidden behind CONFIG_BROKEN. > > Signed-off-by: Matthew Auld > Cc: Thomas Hellström > Cc: Daniel Vetter > Cc: Dave Airlie > --- > drivers/gpu/drm/i915/i915_pci.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c > index 83b500bb170c..78742157aaa3 100644 > --- a/drivers/gpu/drm/i915/i915_pci.c > +++ b/drivers/gpu/drm/i915/i915_pci.c > @@ -1040,6 +1040,9 @@ static const struct pci_device_id pciidlist[] = { > INTEL_RKL_IDS(&rkl_info), > INTEL_ADLS_IDS(&adl_s_info), > INTEL_ADLP_IDS(&adl_p_info), > +#if IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_FAKE_LMEM) > + INTEL_DG1_IDS(&dg1_info), > +#endif > {0, 0, 0} > }; > MODULE_DEVICE_TABLE(pci, pciidlist); Reviewed-by: Maarten Lankhorst ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915/ttm: Fix memory leaks
Op 15-06-2021 om 14:24 schreef Thomas Hellström: > Fix two memory leaks introduced with the ttm backend. > > Fixes: 213d50927763 ("drm/i915/ttm: Introduce a TTM i915 gem object backend") > Signed-off-by: Thomas Hellström > --- > drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c > b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c > index 08b72c280cb5..8059cb61bc3c 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c > @@ -122,6 +122,7 @@ static void i915_ttm_tt_destroy(struct ttm_device *bdev, > struct ttm_tt *ttm) > struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); > > ttm_tt_destroy_common(bdev, ttm); > + ttm_tt_fini(ttm); > kfree(i915_tt); > } > > @@ -217,6 +218,7 @@ static void i915_ttm_delete_mem_notify(struct > ttm_buffer_object *bo) > > if (likely(obj)) { > /* This releases all gem object bindings to the backend. */ > + i915_ttm_free_cached_io_st(obj); > __i915_gem_free_object(obj); > } > } Reviewed-by: Maarten Lankhorst ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v4 03/12] drm/i915: Introduce a ww transaction helper
Op 15-06-2021 om 15:14 schreef Thomas Hellström: > Introduce a for_i915_gem_ww(){} utility to help make the code > around a ww transaction more readable. > > Signed-off-by: Thomas Hellström > Reviewed-by: Matthew Auld > --- > drivers/gpu/drm/i915/i915_gem_ww.h | 31 +- > 1 file changed, 30 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_ww.h > b/drivers/gpu/drm/i915/i915_gem_ww.h > index f2d8769e4118..f6b1a796667b 100644 > --- a/drivers/gpu/drm/i915/i915_gem_ww.h > +++ b/drivers/gpu/drm/i915/i915_gem_ww.h > @@ -11,11 +11,40 @@ struct i915_gem_ww_ctx { > struct ww_acquire_ctx ctx; > struct list_head obj_list; > struct drm_i915_gem_object *contended; > - bool intr; > + unsigned short intr; > + unsigned short loop; > }; > > void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ctx, bool intr); > void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ctx); > int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ctx); > void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj); > + > +/* Internal functions used by the inlines! Don't use. */ > +static inline int __i915_gem_ww_fini(struct i915_gem_ww_ctx *ww, int err) > +{ > + ww->loop = 0; > + if (err == -EDEADLK) { > + err = i915_gem_ww_ctx_backoff(ww); > + if (!err) > + ww->loop = 1; > + } > + > + if (!ww->loop) > + i915_gem_ww_ctx_fini(ww); > + > + return err; > +} > + > +static inline void > +__i915_gem_ww_init(struct i915_gem_ww_ctx *ww, bool intr) > +{ > + i915_gem_ww_ctx_init(ww, intr); > + ww->loop = 1; > +} > + > +#define for_i915_gem_ww(_ww, _err, _intr)\ > + for (__i915_gem_ww_init(_ww, _intr); (_ww)->loop; \ > + _err = __i915_gem_ww_fini(_ww, _err)) > + > #endif With some more macro abuse, we should be able to kill off ww->loop, for (err = ({i915_gem_ww_ctx_init(ww, intr), -EDEADLK}); err == -EDEADLK; err = (err == -EDEADLK && !(err = i915_gem_ww_ctx_backoff(ww))) ? -EDEADLK : err) ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Perform execbuffer object locking as a separate step
Op 15-06-2021 om 13:36 schreef Thomas Hellström: > To help avoid evicting already resident buffers from the batch we're > processing, perform locking as a separate step. > > Signed-off-by: Thomas Hellström > --- > .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 25 --- > 1 file changed, 21 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > index 201fed19d120..394eb40c95b5 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > @@ -922,21 +922,38 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb) > return err; > } > > -static int eb_validate_vmas(struct i915_execbuffer *eb) > +static int eb_lock_vmas(struct i915_execbuffer *eb) > { > unsigned int i; > int err; > > - INIT_LIST_HEAD(&eb->unbound); > - > for (i = 0; i < eb->buffer_count; i++) { > - struct drm_i915_gem_exec_object2 *entry = &eb->exec[i]; > struct eb_vma *ev = &eb->vma[i]; > struct i915_vma *vma = ev->vma; > > err = i915_gem_object_lock(vma->obj, &eb->ww); > if (err) > return err; > + } > + > + return 0; > +} > + > +static int eb_validate_vmas(struct i915_execbuffer *eb) > +{ > + unsigned int i; > + int err; > + > + INIT_LIST_HEAD(&eb->unbound); > + > + err = eb_lock_vmas(eb); > + if (err) > + return err; > + > + for (i = 0; i < eb->buffer_count; i++) { > + struct drm_i915_gem_exec_object2 *entry = &eb->exec[i]; > + struct eb_vma *ev = &eb->vma[i]; > + struct i915_vma *vma = ev->vma; > > err = eb_pin_vma(eb, entry, ev); > if (err == -EDEADLK) Reviewed-by: Maarten Lankhorst ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/3] drm/i915: Use intel_context->pin_mutex only for context allocation
Rename pin_mutex to alloc_mutex, and only use it for context allocation. This will allow us to simplify __intel_context_do_pin_ww, which no longer needs to run the pre_pin callback separately. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gt/intel_context.c | 40 ++- drivers/gpu/drm/i915/gt/intel_context.h | 17 +-- drivers/gpu/drm/i915/gt/intel_context_param.c | 49 --- drivers/gpu/drm/i915/gt/intel_context_sseu.c | 2 +- drivers/gpu/drm/i915/gt/intel_context_types.h | 2 +- 5 files changed, 62 insertions(+), 48 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 4033184f13b9..e6dab37c4266 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -53,7 +53,10 @@ int intel_context_alloc_state(struct intel_context *ce) { int err = 0; - if (mutex_lock_interruptible(&ce->pin_mutex)) + if (test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) + return 0; + + if (mutex_lock_interruptible(&ce->alloc_mutex)) return -EINTR; if (!test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) { @@ -66,11 +69,12 @@ int intel_context_alloc_state(struct intel_context *ce) if (unlikely(err)) goto unlock; + smp_mb__before_atomic(); set_bit(CONTEXT_ALLOC_BIT, &ce->flags); } unlock: - mutex_unlock(&ce->pin_mutex); + mutex_unlock(&ce->alloc_mutex); return err; } @@ -205,19 +209,11 @@ int __intel_context_do_pin_ww(struct intel_context *ce, { bool handoff = false; void *vaddr; - int err = 0; - - if (unlikely(!test_bit(CONTEXT_ALLOC_BIT, &ce->flags))) { - err = intel_context_alloc_state(ce); - if (err) - return err; - } + int err; - /* -* We always pin the context/ring/timeline here, to ensure a pin -* refcount for __intel_context_active(), which prevent a lock -* inversion of ce->pin_mutex vs dma_resv_lock(). -*/ + err = intel_context_alloc_state(ce); + if (err) + return err; err = i915_gem_object_lock(ce->timeline->hwsp_ggtt->obj, ww); if (!err && ce->ring->vma->obj) @@ -237,24 +233,20 @@ int __intel_context_do_pin_ww(struct intel_context *ce, if (err) goto err_release; - err = mutex_lock_interruptible(&ce->pin_mutex); - if (err) - goto err_post_unpin; - if (unlikely(intel_context_is_closed(ce))) { err = -ENOENT; - goto err_unlock; + goto err_post_unpin; } if (likely(!atomic_add_unless(&ce->pin_count, 1, 0))) { err = intel_context_active_acquire(ce); if (unlikely(err)) - goto err_unlock; + goto err_post_unpin; err = ce->ops->pin(ce, vaddr); if (err) { intel_context_active_release(ce); - goto err_unlock; + goto err_post_unpin; } CE_TRACE(ce, "pin ring:{start:%08x, head:%04x, tail:%04x}\n", @@ -268,8 +260,6 @@ int __intel_context_do_pin_ww(struct intel_context *ce, GEM_BUG_ON(!intel_context_is_pinned(ce)); /* no overflow! */ -err_unlock: - mutex_unlock(&ce->pin_mutex); err_post_unpin: if (!handoff) ce->ops->post_unpin(ce); @@ -381,7 +371,7 @@ intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine) spin_lock_init(&ce->signal_lock); INIT_LIST_HEAD(&ce->signals); - mutex_init(&ce->pin_mutex); + mutex_init(&ce->alloc_mutex); i915_active_init(&ce->active, __intel_context_active, __intel_context_retire, 0); @@ -393,7 +383,7 @@ void intel_context_fini(struct intel_context *ce) intel_timeline_put(ce->timeline); i915_vm_put(ce->vm); - mutex_destroy(&ce->pin_mutex); + mutex_destroy(&ce->alloc_mutex); i915_active_fini(&ce->active); } diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h index f83a73a2b39f..4cf6901fcb81 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.h +++ b/drivers/gpu/drm/i915/gt/intel_context.h @@ -49,9 +49,16 @@ int intel_context_reconfigure_sseu(struct intel_context *ce, * intel_context_is_pinned() remains stable. */ static inline int intel_context_lock_pinned(struct intel_context *ce) - __acquires(ce->pin_mutex) { - return mutex_lock_interruptible(&ce->pin_mutex); + int re
[Intel-gfx] [PATCH 3/3] drm/i915: Remove intel_context->ops->(pre_pin/post_unpin)
Now that intel_context->pin_mutex is gone, the reason for splitting pre_pin/post_unpin ops is also gone. Remove those ops, and handle this detail inside guc/execlists submission only. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gt/intel_context.c | 19 ++- drivers/gpu/drm/i915/gt/intel_context_types.h | 4 +- .../drm/i915/gt/intel_execlists_submission.c | 50 --- .../gpu/drm/i915/gt/intel_ring_submission.c | 16 +- drivers/gpu/drm/i915/gt/mock_engine.c | 14 +- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 25 ++ 6 files changed, 44 insertions(+), 84 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index e6dab37c4266..b630c1968794 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -207,8 +207,6 @@ static void intel_context_post_unpin(struct intel_context *ce) int __intel_context_do_pin_ww(struct intel_context *ce, struct i915_gem_ww_ctx *ww) { - bool handoff = false; - void *vaddr; int err; err = intel_context_alloc_state(ce); @@ -229,40 +227,32 @@ int __intel_context_do_pin_ww(struct intel_context *ce, if (err) goto err_ctx_unpin; - err = ce->ops->pre_pin(ce, ww, &vaddr); - if (err) - goto err_release; - if (unlikely(intel_context_is_closed(ce))) { err = -ENOENT; - goto err_post_unpin; + goto err_release; } if (likely(!atomic_add_unless(&ce->pin_count, 1, 0))) { err = intel_context_active_acquire(ce); if (unlikely(err)) - goto err_post_unpin; + goto err_release; - err = ce->ops->pin(ce, vaddr); + err = ce->ops->pin(ce, ww); if (err) { intel_context_active_release(ce); - goto err_post_unpin; + goto err_release; } CE_TRACE(ce, "pin ring:{start:%08x, head:%04x, tail:%04x}\n", i915_ggtt_offset(ce->ring->vma), ce->ring->head, ce->ring->tail); - handoff = true; smp_mb__before_atomic(); /* flush pin before it is visible */ atomic_inc(&ce->pin_count); } GEM_BUG_ON(!intel_context_is_pinned(ce)); /* no overflow! */ -err_post_unpin: - if (!handoff) - ce->ops->post_unpin(ce); err_release: i915_active_release(&ce->active); err_ctx_unpin: @@ -303,7 +293,6 @@ void intel_context_unpin(struct intel_context *ce) CE_TRACE(ce, "unpin\n"); ce->ops->unpin(ce); - ce->ops->post_unpin(ce); /* * Once released, we may asynchronously drop the active reference. diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h index b104d5e9d3b6..e10057901c6c 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_types.h +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h @@ -35,10 +35,8 @@ struct intel_context_ops { int (*alloc)(struct intel_context *ce); - int (*pre_pin)(struct intel_context *ce, struct i915_gem_ww_ctx *ww, void **vaddr); - int (*pin)(struct intel_context *ce, void *vaddr); + int (*pin)(struct intel_context *ce, struct i915_gem_ww_ctx *ww); void (*unpin)(struct intel_context *ce); - void (*post_unpin)(struct intel_context *ce); void (*enter)(struct intel_context *ce); void (*exit)(struct intel_context *ce); diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index cdb2126a159a..6a1a45ffe816 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -2509,35 +2509,38 @@ static void execlists_submit_request(struct i915_request *request) } static int -__execlists_context_pre_pin(struct intel_context *ce, - struct intel_engine_cs *engine, - struct i915_gem_ww_ctx *ww, void **vaddr) +__execlists_context_pin(struct intel_context *ce, + struct intel_engine_cs *engine, + struct i915_gem_ww_ctx *ww) { int err; + void *vaddr; - err = lrc_pre_pin(ce, engine, ww, vaddr); + err = lrc_pre_pin(ce, engine, ww, &vaddr); if (err) return err; if (!__test_and_set_bit(CONTEXT_INIT_BIT, &ce->flags)) { - lrc_init_state(ce, engine, *vaddr); + lrc_init_state(ce, engine, vaddr); __i915_gem_object_flush_map(ce-
[Intel-gfx] [PATCH 1/3] drm/i915/gt: Do not allow setting ring size for legacy ring submission
It doesn't work for legacy ring submission, and is in the best case ignored. In the worst case we end up freeing engine->legacy.ring for all other active engines, resulting in a use-after-free. Signed-off-by: Maarten Lankhorst Fixes: 88be76cdafc7 ("drm/i915: Allow userspace to specify ringsize on construction") Cc: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_context_param.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.c b/drivers/gpu/drm/i915/gt/intel_context_param.c index 65dcd090245d..412c36d1b1dd 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_param.c +++ b/drivers/gpu/drm/i915/gt/intel_context_param.c @@ -12,6 +12,9 @@ int intel_context_set_ring_size(struct intel_context *ce, long sz) { int err; + if (ce->engine->gt->submission_method == INTEL_SUBMISSION_RING) + return 0; + if (intel_context_lock_pinned(ce)) return -EINTR; -- 2.31.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.
_render_line Not tainted 5.15.0-rc5-CI-Trybot_8062+ #1 <4> [83.540019] Hardware name: Intel(R) Client Systems NUC11TNHi3/NUC11TNBi3, BIOS TNTGL357.0038.2020.1124.1648 11/24/2020 <4> [83.540023] Call Trace: <4> [83.540026] dump_stack_lvl+0x56/0x7b <4> [83.540030] check_noncircular+0x12e/0x150 <4> [83.540034] ? _raw_spin_unlock_irqrestore+0x50/0x60 <4> [83.540038] validate_chain+0xb37/0x1e70 <4> [83.540042] __lock_acquire+0x5a1/0xb70 <4> [83.540046] lock_acquire+0xd3/0x310 <4> [83.540049] ? __kmalloc_track_caller+0x56/0x270 <4> [83.540052] ? find_held_lock+0x2d/0x90 <4> [83.540055] ? dma_resv_get_fences+0x1c3/0x280 <4> [83.540058] fs_reclaim_acquire+0x9d/0xd0 <4> [83.540061] ? __kmalloc_track_caller+0x56/0x270 <4> [83.540064] __kmalloc_track_caller+0x56/0x270 <4> [83.540067] krealloc+0x48/0xa0 <4> [83.540070] dma_resv_get_fences+0x1c3/0x280 <4> [83.540074] i915_gem_object_wait+0x1ff/0x410 [i915] <4> [83.540143] i915_gem_evict_for_node+0x16b/0x440 [i915] <4> [83.540212] i915_gem_gtt_reserve+0xff/0x130 [i915] <4> [83.540281] i915_vma_pin_ww+0x765/0x970 [i915] <4> [83.540354] eb_validate_vmas+0x6fe/0x8e0 [i915] <4> [83.540420] i915_gem_do_execbuffer+0x9a6/0x20a0 [i915] <4> [83.540485] ? lockdep_hardirqs_on+0xbf/0x130 <4> [83.540490] ? __lock_acquire+0x5c0/0xb70 <4> [83.540495] i915_gem_execbuffer2_ioctl+0x11f/0x2c0 [i915] <4> [83.540559] ? i915_gem_do_execbuffer+0x20a0/0x20a0 [i915] <4> [83.540622] drm_ioctl_kernel+0xac/0x140 <4> [83.540625] drm_ioctl+0x201/0x3d0 <4> [83.540628] ? i915_gem_do_execbuffer+0x20a0/0x20a0 [i915] <4> [83.540691] __x64_sys_ioctl+0x6a/0xa0 <4> [83.540694] do_syscall_64+0x37/0xb0 <4> [83.540697] entry_SYSCALL_64_after_hwframe+0x44/0xae <4> [83.540700] RIP: 0033:0x7fc314edc50b Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/dma_resv_utils.c| 17 --- drivers/gpu/drm/i915/dma_resv_utils.h| 13 -- drivers/gpu/drm/i915/gem/i915_gem_wait.c | 56 3 files changed, 8 insertions(+), 78 deletions(-) delete mode 100644 drivers/gpu/drm/i915/dma_resv_utils.c delete mode 100644 drivers/gpu/drm/i915/dma_resv_utils.h diff --git a/drivers/gpu/drm/i915/dma_resv_utils.c b/drivers/gpu/drm/i915/dma_resv_utils.c deleted file mode 100644 index 7df91b7e4ca8.. --- a/drivers/gpu/drm/i915/dma_resv_utils.c +++ /dev/null @@ -1,17 +0,0 @@ -// SPDX-License-Identifier: MIT -/* - * Copyright © 2020 Intel Corporation - */ - -#include - -#include "dma_resv_utils.h" - -void dma_resv_prune(struct dma_resv *resv) -{ - if (dma_resv_trylock(resv)) { - if (dma_resv_test_signaled(resv, true)) - dma_resv_add_excl_fence(resv, NULL); - dma_resv_unlock(resv); - } -} diff --git a/drivers/gpu/drm/i915/dma_resv_utils.h b/drivers/gpu/drm/i915/dma_resv_utils.h deleted file mode 100644 index b9d8fb5f8367.. --- a/drivers/gpu/drm/i915/dma_resv_utils.h +++ /dev/null @@ -1,13 +0,0 @@ -/* SPDX-License-Identifier: MIT */ -/* - * Copyright © 2020 Intel Corporation - */ - -#ifndef DMA_RESV_UTILS_H -#define DMA_RESV_UTILS_H - -struct dma_resv; - -void dma_resv_prune(struct dma_resv *resv); - -#endif /* DMA_RESV_UTILS_H */ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c index f909aaa09d9c..e59304a76b2c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c @@ -10,7 +10,6 @@ #include "gt/intel_engine.h" -#include "dma_resv_utils.h" #include "i915_gem_ioctls.h" #include "i915_gem_object.h" @@ -37,56 +36,17 @@ i915_gem_object_wait_reservation(struct dma_resv *resv, unsigned int flags, long timeout) { - struct dma_fence *excl; - bool prune_fences = false; - - if (flags & I915_WAIT_ALL) { - struct dma_fence **shared; - unsigned int count, i; - int ret; + struct dma_resv_iter cursor; + struct dma_fence *fence; - ret = dma_resv_get_fences(resv, &excl, &count, &shared); - if (ret) - return ret; - - for (i = 0; i < count; i++) { - timeout = i915_gem_object_wait_fence(shared[i], -flags, timeout); - if (timeout < 0) - break; + dma_resv_iter_begin(&cursor, resv, flags & I915_WAIT_ALL); + dma_resv_for_each_fence_unlocked(&cursor, fence) { - dma_fence_put(shared[i]); - } - - for (; i < count; i++) -
[Intel-gfx] [PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.
_render_line Not tainted 5.15.0-rc5-CI-Trybot_8062+ #1 <4> [83.540019] Hardware name: Intel(R) Client Systems NUC11TNHi3/NUC11TNBi3, BIOS TNTGL357.0038.2020.1124.1648 11/24/2020 <4> [83.540023] Call Trace: <4> [83.540026] dump_stack_lvl+0x56/0x7b <4> [83.540030] check_noncircular+0x12e/0x150 <4> [83.540034] ? _raw_spin_unlock_irqrestore+0x50/0x60 <4> [83.540038] validate_chain+0xb37/0x1e70 <4> [83.540042] __lock_acquire+0x5a1/0xb70 <4> [83.540046] lock_acquire+0xd3/0x310 <4> [83.540049] ? __kmalloc_track_caller+0x56/0x270 <4> [83.540052] ? find_held_lock+0x2d/0x90 <4> [83.540055] ? dma_resv_get_fences+0x1c3/0x280 <4> [83.540058] fs_reclaim_acquire+0x9d/0xd0 <4> [83.540061] ? __kmalloc_track_caller+0x56/0x270 <4> [83.540064] __kmalloc_track_caller+0x56/0x270 <4> [83.540067] krealloc+0x48/0xa0 <4> [83.540070] dma_resv_get_fences+0x1c3/0x280 <4> [83.540074] i915_gem_object_wait+0x1ff/0x410 [i915] <4> [83.540143] i915_gem_evict_for_node+0x16b/0x440 [i915] <4> [83.540212] i915_gem_gtt_reserve+0xff/0x130 [i915] <4> [83.540281] i915_vma_pin_ww+0x765/0x970 [i915] <4> [83.540354] eb_validate_vmas+0x6fe/0x8e0 [i915] <4> [83.540420] i915_gem_do_execbuffer+0x9a6/0x20a0 [i915] <4> [83.540485] ? lockdep_hardirqs_on+0xbf/0x130 <4> [83.540490] ? __lock_acquire+0x5c0/0xb70 <4> [83.540495] i915_gem_execbuffer2_ioctl+0x11f/0x2c0 [i915] <4> [83.540559] ? i915_gem_do_execbuffer+0x20a0/0x20a0 [i915] <4> [83.540622] drm_ioctl_kernel+0xac/0x140 <4> [83.540625] drm_ioctl+0x201/0x3d0 <4> [83.540628] ? i915_gem_do_execbuffer+0x20a0/0x20a0 [i915] <4> [83.540691] __x64_sys_ioctl+0x6a/0xa0 <4> [83.540694] do_syscall_64+0x37/0xb0 <4> [83.540697] entry_SYSCALL_64_after_hwframe+0x44/0xae <4> [83.540700] RIP: 0033:0x7fc314edc50b Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/Makefile| 1 - drivers/gpu/drm/i915/dma_resv_utils.c| 17 --- drivers/gpu/drm/i915/dma_resv_utils.h| 13 -- drivers/gpu/drm/i915/gem/i915_gem_wait.c | 56 4 files changed, 8 insertions(+), 79 deletions(-) delete mode 100644 drivers/gpu/drm/i915/dma_resv_utils.c delete mode 100644 drivers/gpu/drm/i915/dma_resv_utils.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 21b05ed0e4e8..88bb326d9031 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -58,7 +58,6 @@ i915-y += i915_drv.o \ # core library code i915-y += \ - dma_resv_utils.o \ i915_memcpy.o \ i915_mm.o \ i915_sw_fence.o \ diff --git a/drivers/gpu/drm/i915/dma_resv_utils.c b/drivers/gpu/drm/i915/dma_resv_utils.c deleted file mode 100644 index 7df91b7e4ca8.. --- a/drivers/gpu/drm/i915/dma_resv_utils.c +++ /dev/null @@ -1,17 +0,0 @@ -// SPDX-License-Identifier: MIT -/* - * Copyright © 2020 Intel Corporation - */ - -#include - -#include "dma_resv_utils.h" - -void dma_resv_prune(struct dma_resv *resv) -{ - if (dma_resv_trylock(resv)) { - if (dma_resv_test_signaled(resv, true)) - dma_resv_add_excl_fence(resv, NULL); - dma_resv_unlock(resv); - } -} diff --git a/drivers/gpu/drm/i915/dma_resv_utils.h b/drivers/gpu/drm/i915/dma_resv_utils.h deleted file mode 100644 index b9d8fb5f8367.. --- a/drivers/gpu/drm/i915/dma_resv_utils.h +++ /dev/null @@ -1,13 +0,0 @@ -/* SPDX-License-Identifier: MIT */ -/* - * Copyright © 2020 Intel Corporation - */ - -#ifndef DMA_RESV_UTILS_H -#define DMA_RESV_UTILS_H - -struct dma_resv; - -void dma_resv_prune(struct dma_resv *resv); - -#endif /* DMA_RESV_UTILS_H */ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c index f909aaa09d9c..e59304a76b2c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c @@ -10,7 +10,6 @@ #include "gt/intel_engine.h" -#include "dma_resv_utils.h" #include "i915_gem_ioctls.h" #include "i915_gem_object.h" @@ -37,56 +36,17 @@ i915_gem_object_wait_reservation(struct dma_resv *resv, unsigned int flags, long timeout) { - struct dma_fence *excl; - bool prune_fences = false; - - if (flags & I915_WAIT_ALL) { - struct dma_fence **shared; - unsigned int count, i; - int ret; + struct dma_resv_iter cursor; + struct dma_fence *fence; - ret = dma_resv_get_fences(resv, &excl, &count, &shared); - if (ret) - return ret; - - for (i = 0; i < count; i++) { - timeout = i915_gem_object_wait_fence(shared[i], -
[Intel-gfx] [PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.
_render_line Not tainted 5.15.0-rc5-CI-Trybot_8062+ #1 <4> [83.540019] Hardware name: Intel(R) Client Systems NUC11TNHi3/NUC11TNBi3, BIOS TNTGL357.0038.2020.1124.1648 11/24/2020 <4> [83.540023] Call Trace: <4> [83.540026] dump_stack_lvl+0x56/0x7b <4> [83.540030] check_noncircular+0x12e/0x150 <4> [83.540034] ? _raw_spin_unlock_irqrestore+0x50/0x60 <4> [83.540038] validate_chain+0xb37/0x1e70 <4> [83.540042] __lock_acquire+0x5a1/0xb70 <4> [83.540046] lock_acquire+0xd3/0x310 <4> [83.540049] ? __kmalloc_track_caller+0x56/0x270 <4> [83.540052] ? find_held_lock+0x2d/0x90 <4> [83.540055] ? dma_resv_get_fences+0x1c3/0x280 <4> [83.540058] fs_reclaim_acquire+0x9d/0xd0 <4> [83.540061] ? __kmalloc_track_caller+0x56/0x270 <4> [83.540064] __kmalloc_track_caller+0x56/0x270 <4> [83.540067] krealloc+0x48/0xa0 <4> [83.540070] dma_resv_get_fences+0x1c3/0x280 <4> [83.540074] i915_gem_object_wait+0x1ff/0x410 [i915] <4> [83.540143] i915_gem_evict_for_node+0x16b/0x440 [i915] <4> [83.540212] i915_gem_gtt_reserve+0xff/0x130 [i915] <4> [83.540281] i915_vma_pin_ww+0x765/0x970 [i915] <4> [83.540354] eb_validate_vmas+0x6fe/0x8e0 [i915] <4> [83.540420] i915_gem_do_execbuffer+0x9a6/0x20a0 [i915] <4> [83.540485] ? lockdep_hardirqs_on+0xbf/0x130 <4> [83.540490] ? __lock_acquire+0x5c0/0xb70 <4> [83.540495] i915_gem_execbuffer2_ioctl+0x11f/0x2c0 [i915] <4> [83.540559] ? i915_gem_do_execbuffer+0x20a0/0x20a0 [i915] <4> [83.540622] drm_ioctl_kernel+0xac/0x140 <4> [83.540625] drm_ioctl+0x201/0x3d0 <4> [83.540628] ? i915_gem_do_execbuffer+0x20a0/0x20a0 [i915] <4> [83.540691] __x64_sys_ioctl+0x6a/0xa0 <4> [83.540694] do_syscall_64+0x37/0xb0 <4> [83.540697] entry_SYSCALL_64_after_hwframe+0x44/0xae <4> [83.540700] RIP: 0033:0x7fc314edc50b Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/Makefile| 1 - drivers/gpu/drm/i915/dma_resv_utils.c| 17 -- drivers/gpu/drm/i915/dma_resv_utils.h| 13 - drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 1 - drivers/gpu/drm/i915/gem/i915_gem_wait.c | 56 +++- 5 files changed, 8 insertions(+), 80 deletions(-) delete mode 100644 drivers/gpu/drm/i915/dma_resv_utils.c delete mode 100644 drivers/gpu/drm/i915/dma_resv_utils.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 21b05ed0e4e8..88bb326d9031 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -58,7 +58,6 @@ i915-y += i915_drv.o \ # core library code i915-y += \ - dma_resv_utils.o \ i915_memcpy.o \ i915_mm.o \ i915_sw_fence.o \ diff --git a/drivers/gpu/drm/i915/dma_resv_utils.c b/drivers/gpu/drm/i915/dma_resv_utils.c deleted file mode 100644 index 7df91b7e4ca8.. --- a/drivers/gpu/drm/i915/dma_resv_utils.c +++ /dev/null @@ -1,17 +0,0 @@ -// SPDX-License-Identifier: MIT -/* - * Copyright © 2020 Intel Corporation - */ - -#include - -#include "dma_resv_utils.h" - -void dma_resv_prune(struct dma_resv *resv) -{ - if (dma_resv_trylock(resv)) { - if (dma_resv_test_signaled(resv, true)) - dma_resv_add_excl_fence(resv, NULL); - dma_resv_unlock(resv); - } -} diff --git a/drivers/gpu/drm/i915/dma_resv_utils.h b/drivers/gpu/drm/i915/dma_resv_utils.h deleted file mode 100644 index b9d8fb5f8367.. --- a/drivers/gpu/drm/i915/dma_resv_utils.h +++ /dev/null @@ -1,13 +0,0 @@ -/* SPDX-License-Identifier: MIT */ -/* - * Copyright © 2020 Intel Corporation - */ - -#ifndef DMA_RESV_UTILS_H -#define DMA_RESV_UTILS_H - -struct dma_resv; - -void dma_resv_prune(struct dma_resv *resv); - -#endif /* DMA_RESV_UTILS_H */ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index c80e6c1d2bcb..5375f3f9f016 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -15,7 +15,6 @@ #include "gt/intel_gt_requests.h" -#include "dma_resv_utils.h" #include "i915_trace.h" static bool swap_available(void) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c index f909aaa09d9c..e59304a76b2c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c @@ -10,7 +10,6 @@ #include "gt/intel_engine.h" -#include "dma_resv_utils.h" #include "i915_gem_ioctls.h" #include "i915_gem_object.h" @@ -37,56 +36,17 @@ i915_gem_object_wait_reservation(struct dma_resv *resv, unsigned int flags, long timeout) { - struct dma_fence *excl; - bool prune_fences = false; -
Re: [Intel-gfx] [PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.
Op 13-10-2021 om 17:37 schreef Tvrtko Ursulin: > > On 13/10/2021 15:00, Daniel Vetter wrote: >> On Wed, Oct 13, 2021 at 02:32:03PM +0200, Maarten Lankhorst wrote: >>> No memory should be allocated when calling i915_gem_object_wait, >>> because it may be called to idle a BO when evicting memory. >>> >>> Fix this by using dma_resv_iter helpers to call >>> i915_gem_object_wait_fence() on each fence, which cleans up the code a lot. >>> Also remove dma_resv_prune, it's questionably. >>> >>> This will result in the following lockdep splat. >>> >>> <4> [83.538517] == >>> <4> [83.538520] WARNING: possible circular locking dependency detected >>> <4> [83.538522] 5.15.0-rc5-CI-Trybot_8062+ #1 Not tainted >>> <4> [83.538525] -- >>> <4> [83.538527] gem_render_line/5242 is trying to acquire lock: >>> <4> [83.538530] 8275b1e0 (fs_reclaim){+.+.}-{0:0}, at: >>> __kmalloc_track_caller+0x56/0x270 >>> <4> [83.538538] >>> but task is already holding lock: >>> <4> [83.538540] 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: >>> i915_vma_pin_ww+0x1c7/0x970 [i915] >>> <4> [83.538638] >>> which lock already depends on the new lock. >>> <4> [83.538642] >>> the existing dependency chain (in reverse order) is: >>> <4> [83.538645] >>> -> #1 (&vm->mutex/1){+.+.}-{3:3}: >>> <4> [83.538649] lock_acquire+0xd3/0x310 >>> <4> [83.538654] i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915] >>> <4> [83.538730] i915_address_space_init+0xf5/0x1b0 [i915] >>> <4> [83.538794] ppgtt_init+0x55/0x70 [i915] >>> <4> [83.538856] gen8_ppgtt_create+0x44/0x5d0 [i915] >>> <4> [83.538912] i915_ppgtt_create+0x28/0xf0 [i915] >>> <4> [83.538971] intel_gt_init+0x130/0x3b0 [i915] >>> <4> [83.539029] i915_gem_init+0x14b/0x220 [i915] >>> <4> [83.539100] i915_driver_probe+0x97e/0xdd0 [i915] >>> <4> [83.539149] i915_pci_probe+0x43/0x1d0 [i915] >>> <4> [83.539197] pci_device_probe+0x9b/0x110 >>> <4> [83.539201] really_probe+0x1b0/0x3b0 >>> <4> [83.539205] __driver_probe_device+0xf6/0x170 >>> <4> [83.539208] driver_probe_device+0x1a/0x90 >>> <4> [83.539210] __driver_attach+0x93/0x160 >>> <4> [83.539213] bus_for_each_dev+0x72/0xc0 >>> <4> [83.539216] bus_add_driver+0x14b/0x1f0 >>> <4> [83.539220] driver_register+0x66/0xb0 >>> <4> [83.539222] hdmi_get_spk_alloc+0x1f/0x50 [snd_hda_codec_hdmi] >>> <4> [83.539227] do_one_initcall+0x53/0x2e0 >>> <4> [83.539230] do_init_module+0x55/0x200 >>> <4> [83.539234] load_module+0x2700/0x2980 >>> <4> [83.539237] __do_sys_finit_module+0xaa/0x110 >>> <4> [83.539241] do_syscall_64+0x37/0xb0 >>> <4> [83.539244] entry_SYSCALL_64_after_hwframe+0x44/0xae >>> <4> [83.539247] >>> -> #0 (fs_reclaim){+.+.}-{0:0}: >>> <4> [83.539251] validate_chain+0xb37/0x1e70 >>> <4> [83.539254] __lock_acquire+0x5a1/0xb70 >>> <4> [83.539258] lock_acquire+0xd3/0x310 >>> <4> [83.539260] fs_reclaim_acquire+0x9d/0xd0 >>> <4> [83.539264] __kmalloc_track_caller+0x56/0x270 >>> <4> [83.539267] krealloc+0x48/0xa0 >>> <4> [83.539270] dma_resv_get_fences+0x1c3/0x280 >>> <4> [83.539274] i915_gem_object_wait+0x1ff/0x410 [i915] >>> <4> [83.539342] i915_gem_evict_for_node+0x16b/0x440 [i915] >>> <4> [83.539412] i915_gem_gtt_reserve+0xff/0x130 [i915] >>> <4> [83.539482] i915_vma_pin_ww+0x765/0x970 [i915] >>> <4> [83.539556] eb_validate_vmas+0x6fe/0x8e0 [i915] >>> <4> [83.539626] i915_gem_do_execbuffer+0x9a6/0x20a0 [i915] >>> <4> [83.539693] i915_gem_execbuffer2_ioctl+0x11f/0x2c0 [i915] >>> <4> [83.539759] drm_ioctl_kernel+0xac/0x140 >>> <4> [83.539763] drm_ioctl+0x201/0x3d0 >>> <4> [83.539766] __x64_sys_ioctl+0x6a/0xa0 >>> <4> [83.539769] do_syscall_64+0x37/0xb0 >>> <4> [83
Re: [Intel-gfx] [PATCH 20/28] drm/i915: use new iterator in i915_gem_object_wait_reservation
Op 05-10-2021 om 13:37 schreef Christian König: > Simplifying the code a bit. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/i915/gem/i915_gem_wait.c | 51 +--- > 1 file changed, 9 insertions(+), 42 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c > b/drivers/gpu/drm/i915/gem/i915_gem_wait.c > index f909aaa09d9c..a13193db1dba 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c > @@ -37,55 +37,22 @@ i915_gem_object_wait_reservation(struct dma_resv *resv, >unsigned int flags, >long timeout) > { > - struct dma_fence *excl; > - bool prune_fences = false; > - > - if (flags & I915_WAIT_ALL) { > - struct dma_fence **shared; > - unsigned int count, i; > - int ret; > + struct dma_resv_iter cursor; > + struct dma_fence *fence; > > - ret = dma_resv_get_fences(resv, &excl, &count, &shared); > - if (ret) > - return ret; > - > - for (i = 0; i < count; i++) { > - timeout = i915_gem_object_wait_fence(shared[i], > - flags, timeout); > - if (timeout < 0) > - break; > - > - dma_fence_put(shared[i]); > - } > - > - for (; i < count; i++) > - dma_fence_put(shared[i]); > - kfree(shared); > - > - /* > - * If both shared fences and an exclusive fence exist, > - * then by construction the shared fences must be later > - * than the exclusive fence. If we successfully wait for > - * all the shared fences, we know that the exclusive fence > - * must all be signaled. If all the shared fences are > - * signaled, we can prune the array and recover the > - * floating references on the fences/requests. > - */ > - prune_fences = count && timeout >= 0; > - } else { > - excl = dma_resv_get_excl_unlocked(resv); > + dma_resv_iter_begin(&cursor, resv, flags & I915_WAIT_ALL); > + dma_resv_for_each_fence_unlocked(&cursor, fence) { > + timeout = i915_gem_object_wait_fence(fence, flags, timeout); > + if (timeout < 0) > + break; > } > - > - if (excl && timeout >= 0) > - timeout = i915_gem_object_wait_fence(excl, flags, timeout); > - > - dma_fence_put(excl); > + dma_resv_iter_end(&cursor); > > /* >* Opportunistically prune the fences iff we know they have *all* been >* signaled. >*/ > - if (prune_fences) > + if (timeout > 0) > dma_resv_prune(resv); > > return timeout; When replying to tvrtko about correctness of the conversion, I just now noticed a logic bug here, the same logic bug also affects dma_resv_wait_timeout. long dma_resv_wait_timeout(struct dma_resv *obj, bool wait_all, bool intr, unsigned long timeout) { long ret = timeout ? timeout : 1; struct dma_resv_iter cursor; struct dma_fence *fence; dma_resv_iter_begin(&cursor, obj, wait_all); dma_resv_for_each_fence_unlocked(&cursor, fence) { ret = dma_fence_wait_timeout(fence, intr, ret); if (ret <= 0) { dma_resv_iter_end(&cursor); return ret; } } dma_resv_iter_end(&cursor); return ret; } It fails to handle the case correctly when timeout = 0, I think the original code probably did. dma_fence_wait_timeout should be called with timeout = 0 explicitly. Fixed code for inner loop: ret = dma_fence_wait_timeout(fence, intr, timeout); if (ret <= 0) break; if (timeout) timeout = ret; This bug also affects i915_gem_object_wait_reservation, so the whole series might need to be respinned, or at least checked, if more wait conversions are affected. ~Maarten
Re: [Intel-gfx] [PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.
Op 14-10-2021 om 10:37 schreef Tvrtko Ursulin: > > On 13/10/2021 11:41, Maarten Lankhorst wrote: >> No memory should be allocated when calling i915_gem_object_wait, >> because it may be called to idle a BO when evicting memory. >> >> Fix this by using dma_resv_iter helpers to call >> i915_gem_object_wait_fence() on each fence, which cleans up the code a lot. >> Also remove dma_resv_prune, it's questionably. >> >> This will result in the following lockdep splat. > > > >> @@ -37,56 +36,17 @@ i915_gem_object_wait_reservation(struct dma_resv *resv, >> unsigned int flags, >> long timeout) >> { >> - struct dma_fence *excl; >> - bool prune_fences = false; >> - >> - if (flags & I915_WAIT_ALL) { >> - struct dma_fence **shared; >> - unsigned int count, i; >> - int ret; >> + struct dma_resv_iter cursor; >> + struct dma_fence *fence; >> - ret = dma_resv_get_fences(resv, &excl, &count, &shared); >> - if (ret) >> - return ret; >> - >> - for (i = 0; i < count; i++) { >> - timeout = i915_gem_object_wait_fence(shared[i], >> - flags, timeout); >> - if (timeout < 0) >> - break; >> + dma_resv_iter_begin(&cursor, resv, flags & I915_WAIT_ALL); >> + dma_resv_for_each_fence_unlocked(&cursor, fence) { >> - dma_fence_put(shared[i]); >> - } >> - >> - for (; i < count; i++) >> - dma_fence_put(shared[i]); >> - kfree(shared); >> - >> - /* >> - * If both shared fences and an exclusive fence exist, >> - * then by construction the shared fences must be later >> - * than the exclusive fence. If we successfully wait for >> - * all the shared fences, we know that the exclusive fence >> - * must all be signaled. If all the shared fences are >> - * signaled, we can prune the array and recover the >> - * floating references on the fences/requests. >> - */ >> - prune_fences = count && timeout >= 0; >> - } else { >> - excl = dma_resv_get_excl_unlocked(resv); >> + timeout = i915_gem_object_wait_fence(fence, flags, timeout); >> + if (timeout <= 0) >> + break; > > You have another change in behaviour here, well a bug really. When userspace > passes in zero timeout you fail to report activity in other than the first > fence. Hmm, not necessarily, passing 0 to i915_gem_object_wait_fence timeout = 0 is a special case and means test only. It will return 1 on success. Of course it is still broken, I sent a reply to könig about it, hope it will get fixed and respun. :) ~Maarten
[Intel-gfx] [PULL] drm-misc-fixes
drm-misc-fixes-2021-10-14: drm-misc-fixes for v5.15-rc6: - Respun clock fixes for vc4/hdmi. - Cap connector_bad_edid()'s num_of_ext by num_blocks read. - Clamp fbdev size to max available height. - Hide hyper-v's hw pointer, to prevent double pointers. - Use the correct engine bit in nouveau's g84_fifo_chan_engine_fini. - Build fix for r128 on UML. - Add missing dependency for CONFIG_CRC32 to olimex-lcd-olinuxino. The following changes since commit f5a8703a9c418c6fc54eb772712dfe7641e3991c: drm/nouveau/debugfs: fix file release memory leak (2021-10-06 11:12:29 +0200) are available in the Git repository at: git://anongit.freedesktop.org/drm/drm-misc tags/drm-misc-fixes-2021-10-14 for you to fetch changes up to 6de148d82d9e790caf7622a002229df745fd2d94: drm/vc4: crtc: Make sure the HDMI controller is powered when disabling (2021-10-13 14:40:43 +0200) drm-misc-fixes for v5.15-rc6: - Respun clock fixes for vc4/hdmi. - Cap connector_bad_edid()'s num_of_ext by num_blocks read. - Clamp fbdev size to max available height. - Hide hyper-v's hw pointer, to prevent double pointers. - Use the correct engine bit in nouveau's g84_fifo_chan_engine_fini. - Build fix for r128 on UML. - Add missing dependency for CONFIG_CRC32 to olimex-lcd-olinuxino. Dexuan Cui (1): drm/hyperv: Fix double mouse pointers Douglas Anderson (1): drm/edid: In connector_bad_edid() cap num_of_ext by num_blocks read Marek Vasut (1): drm/nouveau/fifo: Reinstate the correct engine bit programming Maxime Ripard (11): clk: bcm-2835: Pick the closest clock rate clk: bcm-2835: Remove rounding up the dividers drm/vc4: hdmi: Set a default HSM rate drm/vc4: hdmi: Move the HSM clock enable to runtime_pm drm/vc4: hdmi: Make sure the controller is powered in detect drm/vc4: hdmi: Make sure the controller is powered up during bind drm/vc4: hdmi: Rework the pre_crtc_configure error handling drm/vc4: hdmi: Split the CEC disable / enable functions in two drm/vc4: hdmi: Make sure the device is powered with CEC drm/vc4: hdmi: Warn if we access the controller while disabled drm/vc4: crtc: Make sure the HDMI controller is powered when disabling Randy Dunlap (1): drm/r128: fix build for UML Thomas Zimmermann (1): drm/fbdev: Clamp fbdev surface size if too large Vegard Nossum (1): drm/panel: olimex-lcd-olinuxino: select CRC32 drivers/clk/bcm/clk-bcm2835.c | 13 +- drivers/gpu/drm/drm_edid.c | 15 +- drivers/gpu/drm/drm_fb_helper.c| 6 + drivers/gpu/drm/hyperv/hyperv_drm.h| 1 + drivers/gpu/drm/hyperv/hyperv_drm_modeset.c| 1 + drivers/gpu/drm/hyperv/hyperv_drm_proto.c | 54 +- drivers/gpu/drm/nouveau/nvkm/engine/fifo/chang84.c | 2 +- drivers/gpu/drm/panel/Kconfig | 1 + drivers/gpu/drm/r128/ati_pcigart.c | 2 +- drivers/gpu/drm/vc4/vc4_crtc.c | 19 +- drivers/gpu/drm/vc4/vc4_hdmi.c | 208 ++--- drivers/gpu/drm/vc4/vc4_hdmi_regs.h| 6 + 12 files changed, 248 insertions(+), 80 deletions(-)
Re: [Intel-gfx] [PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.
Op 14-10-2021 om 15:25 schreef Tvrtko Ursulin: > > On 14/10/2021 13:05, Maarten Lankhorst wrote: >> Op 14-10-2021 om 10:37 schreef Tvrtko Ursulin: >>> >>> On 13/10/2021 11:41, Maarten Lankhorst wrote: >>>> No memory should be allocated when calling i915_gem_object_wait, >>>> because it may be called to idle a BO when evicting memory. >>>> >>>> Fix this by using dma_resv_iter helpers to call >>>> i915_gem_object_wait_fence() on each fence, which cleans up the code a lot. >>>> Also remove dma_resv_prune, it's questionably. >>>> >>>> This will result in the following lockdep splat. >>> >>> >>> >>>> @@ -37,56 +36,17 @@ i915_gem_object_wait_reservation(struct dma_resv *resv, >>>> unsigned int flags, >>>> long timeout) >>>> { >>>> - struct dma_fence *excl; >>>> - bool prune_fences = false; >>>> - >>>> - if (flags & I915_WAIT_ALL) { >>>> - struct dma_fence **shared; >>>> - unsigned int count, i; >>>> - int ret; >>>> + struct dma_resv_iter cursor; >>>> + struct dma_fence *fence; >>>> - ret = dma_resv_get_fences(resv, &excl, &count, &shared); >>>> - if (ret) >>>> - return ret; >>>> - >>>> - for (i = 0; i < count; i++) { >>>> - timeout = i915_gem_object_wait_fence(shared[i], >>>> - flags, timeout); >>>> - if (timeout < 0) >>>> - break; >>>> + dma_resv_iter_begin(&cursor, resv, flags & I915_WAIT_ALL); >>>> + dma_resv_for_each_fence_unlocked(&cursor, fence) { >>>> - dma_fence_put(shared[i]); >>>> - } >>>> - >>>> - for (; i < count; i++) >>>> - dma_fence_put(shared[i]); >>>> - kfree(shared); >>>> - >>>> - /* >>>> - * If both shared fences and an exclusive fence exist, >>>> - * then by construction the shared fences must be later >>>> - * than the exclusive fence. If we successfully wait for >>>> - * all the shared fences, we know that the exclusive fence >>>> - * must all be signaled. If all the shared fences are >>>> - * signaled, we can prune the array and recover the >>>> - * floating references on the fences/requests. >>>> - */ >>>> - prune_fences = count && timeout >= 0; >>>> - } else { >>>> - excl = dma_resv_get_excl_unlocked(resv); >>>> + timeout = i915_gem_object_wait_fence(fence, flags, timeout); >>>> + if (timeout <= 0) >>>> + break; >>> >>> You have another change in behaviour here, well a bug really. When >>> userspace passes in zero timeout you fail to report activity in other than >>> the first fence. >> >> Hmm, not necessarily, passing 0 to i915_gem_object_wait_fence timeout = 0 is >> a special case and means test only. It will return 1 on success. > > I tried to enumerate the whole chain here. All for timeout == 0. Please > double check I did not make a mistake somewhere since there are many return > code inversions here. > > As building blocks for the whole "game" we have: > > 1. dma_fence_default_wait, it returns for states: > > not signaled -> 0 > signaled -> 1 > > 2. i915_request_wait > > not signaled -> -ETIME > signaled -> 0 > > Then i915_gem_object_wait_fence builds on top of it and has therefore these > possible outputs: > > signaled -> 0 > not signaled: > i915 path -> -ETIME > ext fence -> 0 > > So this looks a like problem already with 0 for signaled and not signaled. > Unless it is by design that the return value does not want to report external > fences? But it is not documented and it still waits on them so odd. > > Then in i915_gem_object_wait_reservation we have a loop: > > for (i = 0; i < count; i++) { > timeout = i915_gem_object_wait_fence(shared[i], > flags, timeout); > if (timeout < 0) > break; > > So short circuit happens only for i915 fences, by virtue of no n
[Intel-gfx] [PATCH 1/2] dma-buf: Fix dma_resv_wait_timeout handling of timeout = 0.
Commit ada5c48b11a3 ("dma-buf: use new iterator in dma_resv_wait_timeout") accidentally started mishandling timeout = 0, by forcing a blocking wait with timeout = 1 passed to fences. This is not intended, as timeout = 0 may be used for peeking, similar to test_signaled. Fixes: ada5c48b11a3 ("dma-buf: use new iterator in dma_resv_wait_timeout") Cc: Christian König Cc: Daniel Vetter Signed-off-by: Maarten Lankhorst --- drivers/dma-buf/dma-resv.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c index 9eb2baa387d4..70a8082660c5 100644 --- a/drivers/dma-buf/dma-resv.c +++ b/drivers/dma-buf/dma-resv.c @@ -617,18 +617,18 @@ EXPORT_SYMBOL_GPL(dma_resv_get_fences); long dma_resv_wait_timeout(struct dma_resv *obj, bool wait_all, bool intr, unsigned long timeout) { - long ret = timeout ? timeout : 1; + long ret = timeout ?: 1; struct dma_resv_iter cursor; struct dma_fence *fence; dma_resv_iter_begin(&cursor, obj, wait_all); dma_resv_for_each_fence_unlocked(&cursor, fence) { + ret = dma_fence_wait_timeout(fence, intr, timeout); + if (ret <= 0) + break; - ret = dma_fence_wait_timeout(fence, intr, ret); - if (ret <= 0) { - dma_resv_iter_end(&cursor); - return ret; - } + if (timeout) + timeout = ret; } dma_resv_iter_end(&cursor); -- 2.33.0
[Intel-gfx] [PATCH 0/2] dma-buf: Fix breakages from dma_resv_iter conversion.
Urgent fixes! dma_resv_test_signaled is completely broken, dma_resv_wait_timeout kind of broken. Maarten Lankhorst (2): dma-buf: Fix dma_resv_wait_timeout handling of timeout = 0. dma-buf: Fix dma_resv_test_signaled. drivers/dma-buf/dma-resv.c | 20 +++- 1 file changed, 11 insertions(+), 9 deletions(-) -- 2.33.0
[Intel-gfx] [PATCH 2/2] dma-buf: Fix dma_resv_test_signaled.
Commit 7fa828cb9265 ("dma-buf: use new iterator in dma_resv_test_signaled") accidentally forgot to test whether the dma-buf is actually signaled, breaking pretty much everything depending on it. Fixes: 7fa828cb9265 ("dma-buf: use new iterator in dma_resv_test_signaled") Cc: Christian König Cc: Daniel Vetter Signed-off-by: Maarten Lankhorst --- drivers/dma-buf/dma-resv.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c index 70a8082660c5..37ab2875e469 100644 --- a/drivers/dma-buf/dma-resv.c +++ b/drivers/dma-buf/dma-resv.c @@ -655,14 +655,16 @@ bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all) { struct dma_resv_iter cursor; struct dma_fence *fence; + bool ret = true; dma_resv_iter_begin(&cursor, obj, test_all); dma_resv_for_each_fence_unlocked(&cursor, fence) { - dma_resv_iter_end(&cursor); - return false; + ret = dma_fence_is_signaled(fence); + if (!ret) + break; } dma_resv_iter_end(&cursor); - return true; + return ret; } EXPORT_SYMBOL_GPL(dma_resv_test_signaled); -- 2.33.0
Re: [Intel-gfx] [PATCH 2/2] dma-buf: Fix dma_resv_test_signaled.
Op 15-10-2021 om 14:07 schreef Christian König: > Am 15.10.21 um 13:57 schrieb Maarten Lankhorst: >> Commit 7fa828cb9265 ("dma-buf: use new iterator in dma_resv_test_signaled") >> accidentally forgot to test whether the dma-buf is actually signaled, >> breaking >> pretty much everything depending on it. > > NAK, the dma_resv_for_each_fence_unlocked() returns only unsignaled fences. > So the code is correct as it is. That seems like it might cause some unexpected behavior when that function is called with one of the fence locks held, if it calls dma_fence_signal(). Could it be changed to only test the signaled bit, in which case this patch would still be useful? Or at least add some lockdep annotations, that fence->lock might be taken. So any hangs would at least be easy to spot with lockdep. ~Maarten
Re: [Intel-gfx] [PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.
Op 14-10-2021 om 15:56 schreef Tvrtko Ursulin: > > On 14/10/2021 14:45, Maarten Lankhorst wrote: >> Op 14-10-2021 om 15:25 schreef Tvrtko Ursulin: >>> >>> On 14/10/2021 13:05, Maarten Lankhorst wrote: >>>> Op 14-10-2021 om 10:37 schreef Tvrtko Ursulin: >>>>> >>>>> On 13/10/2021 11:41, Maarten Lankhorst wrote: >>>>>> No memory should be allocated when calling i915_gem_object_wait, >>>>>> because it may be called to idle a BO when evicting memory. >>>>>> >>>>>> Fix this by using dma_resv_iter helpers to call >>>>>> i915_gem_object_wait_fence() on each fence, which cleans up the code a >>>>>> lot. >>>>>> Also remove dma_resv_prune, it's questionably. >>>>>> >>>>>> This will result in the following lockdep splat. >>>>> >>>>> >>>>> >>>>>> @@ -37,56 +36,17 @@ i915_gem_object_wait_reservation(struct dma_resv >>>>>> *resv, >>>>>> unsigned int flags, >>>>>> long timeout) >>>>>> { >>>>>> - struct dma_fence *excl; >>>>>> - bool prune_fences = false; >>>>>> - >>>>>> - if (flags & I915_WAIT_ALL) { >>>>>> - struct dma_fence **shared; >>>>>> - unsigned int count, i; >>>>>> - int ret; >>>>>> + struct dma_resv_iter cursor; >>>>>> + struct dma_fence *fence; >>>>>> - ret = dma_resv_get_fences(resv, &excl, &count, &shared); >>>>>> - if (ret) >>>>>> - return ret; >>>>>> - >>>>>> - for (i = 0; i < count; i++) { >>>>>> - timeout = i915_gem_object_wait_fence(shared[i], >>>>>> - flags, timeout); >>>>>> - if (timeout < 0) >>>>>> - break; >>>>>> + dma_resv_iter_begin(&cursor, resv, flags & I915_WAIT_ALL); >>>>>> + dma_resv_for_each_fence_unlocked(&cursor, fence) { >>>>>> - dma_fence_put(shared[i]); >>>>>> - } >>>>>> - >>>>>> - for (; i < count; i++) >>>>>> - dma_fence_put(shared[i]); >>>>>> - kfree(shared); >>>>>> - >>>>>> - /* >>>>>> - * If both shared fences and an exclusive fence exist, >>>>>> - * then by construction the shared fences must be later >>>>>> - * than the exclusive fence. If we successfully wait for >>>>>> - * all the shared fences, we know that the exclusive fence >>>>>> - * must all be signaled. If all the shared fences are >>>>>> - * signaled, we can prune the array and recover the >>>>>> - * floating references on the fences/requests. >>>>>> - */ >>>>>> - prune_fences = count && timeout >= 0; >>>>>> - } else { >>>>>> - excl = dma_resv_get_excl_unlocked(resv); >>>>>> + timeout = i915_gem_object_wait_fence(fence, flags, timeout); >>>>>> + if (timeout <= 0) >>>>>> + break; >>>>> >>>>> You have another change in behaviour here, well a bug really. When >>>>> userspace passes in zero timeout you fail to report activity in other >>>>> than the first fence. >>>> >>>> Hmm, not necessarily, passing 0 to i915_gem_object_wait_fence timeout = 0 >>>> is a special case and means test only. It will return 1 on success. >>> >>> I tried to enumerate the whole chain here. All for timeout == 0. Please >>> double check I did not make a mistake somewhere since there are many return >>> code inversions here. >>> >>> As building blocks for the whole "game" we have: >>> >>> 1. dma_fence_default_wait, it returns for states: >>> not signaled -> 0 >>> signaled -> 1 >>> >>> 2. i915_request_wait >>> >>> not signaled -> -ETIME >>> signaled -> 0 >>&g
[Intel-gfx] [PULL] drm-misc-fixes
Hi Dave, New drm-misc-fixes without the vc4 changes. I feel that needs some more discussion first. drm-misc-fixes-2021-10-21-1: drm-misc-fixes for v5.15-rc7: - Rebased, to remove vc4 patches. - Fix mxsfb crash on unload. - Use correct sync parameters for Feixin K101-IM2BYL02. - Assorted kmb modeset/atomic fixes. The following changes since commit 519d81956ee277b4419c723adfb154603c2565ba: Linux 5.15-rc6 (2021-10-17 20:00:13 -1000) are available in the Git repository at: git://anongit.freedesktop.org/drm/drm-misc tags/drm-misc-fixes-2021-10-21-1 for you to fetch changes up to 74056092ff415e7e20ce2544689b32ee811c4f0b: drm/kmb: Enable ADV bridge after modeset (2021-10-21 11:08:09 +0200) drm-misc-fixes for v5.15-rc7: - Rebased, to remove vc4 patches. - Fix mxsfb crash on unload. - Use correct sync parameters for Feixin K101-IM2BYL02. - Assorted kmb modeset/atomic fixes. Anitha Chrisanthus (4): drm/kmb: Work around for higher system clock drm/kmb: Limit supported mode to 1080p drm/kmb: Corrected typo in handle_lcd_irq drm/kmb: Enable ADV bridge after modeset Dan Johansen (1): drm/panel: ilitek-ili9881c: Fix sync for Feixin K101-IM2BYL02 panel Edmund Dea (2): drm/kmb: Remove clearing DPHY regs drm/kmb: Disable change of plane parameters Marek Vasut (1): drm: mxsfb: Fix NULL pointer dereference crash on unload drivers/gpu/drm/kmb/kmb_crtc.c| 41 +++-- drivers/gpu/drm/kmb/kmb_drv.c | 2 +- drivers/gpu/drm/kmb/kmb_drv.h | 10 ++- drivers/gpu/drm/kmb/kmb_dsi.c | 25 +--- drivers/gpu/drm/kmb/kmb_dsi.h | 2 +- drivers/gpu/drm/kmb/kmb_plane.c | 43 ++- drivers/gpu/drm/kmb/kmb_plane.h | 6 drivers/gpu/drm/mxsfb/mxsfb_drv.c | 6 +++- drivers/gpu/drm/panel/panel-ilitek-ili9881c.c | 12 9 files changed, 123 insertions(+), 24 deletions(-)
[Intel-gfx] [PATCH 04/28] drm/i915: Remove unused bits of i915_vma/active api
When reworking the code to move the eviction fence to the object, the best code is removed code. Remove some functions that are unused, and change the function definition if it's only used in 1 place. Signed-off-by: Maarten Lankhorst Reviewed-by: Niranjana Vishwanathapura --- drivers/gpu/drm/i915/i915_active.c | 28 +++- drivers/gpu/drm/i915/i915_active.h | 17 + drivers/gpu/drm/i915/i915_vma.c| 2 +- drivers/gpu/drm/i915/i915_vma.h| 2 -- 4 files changed, 5 insertions(+), 44 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index 3103c1e1fd14..ee2b3a375362 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -426,8 +426,9 @@ replace_barrier(struct i915_active *ref, struct i915_active_fence *active) return true; } -int i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence) +int i915_active_add_request(struct i915_active *ref, struct i915_request *rq) { + struct dma_fence *fence = &rq->fence; struct i915_active_fence *active; int err; @@ -436,7 +437,7 @@ int i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence) if (err) return err; - active = active_instance(ref, idx); + active = active_instance(ref, i915_request_timeline(rq)->fence_context); if (!active) { err = -ENOMEM; goto out; @@ -477,29 +478,6 @@ __i915_active_set_fence(struct i915_active *ref, return prev; } -static struct i915_active_fence * -__active_fence(struct i915_active *ref, u64 idx) -{ - struct active_node *it; - - it = __active_lookup(ref, idx); - if (unlikely(!it)) { /* Contention with parallel tree builders! */ - spin_lock_irq(&ref->tree_lock); - it = __active_lookup(ref, idx); - spin_unlock_irq(&ref->tree_lock); - } - GEM_BUG_ON(!it); /* slot must be preallocated */ - - return &it->base; -} - -struct dma_fence * -__i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence) -{ - /* Only valid while active, see i915_active_acquire_for_context() */ - return __i915_active_set_fence(ref, __active_fence(ref, idx), fence); -} - struct dma_fence * i915_active_set_exclusive(struct i915_active *ref, struct dma_fence *f) { diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h index 5fcdb0e2bc9e..7eb44132183a 100644 --- a/drivers/gpu/drm/i915/i915_active.h +++ b/drivers/gpu/drm/i915/i915_active.h @@ -164,26 +164,11 @@ void __i915_active_init(struct i915_active *ref, __i915_active_init(ref, active, retire, flags, &__mkey, &__wkey); \ } while (0) -struct dma_fence * -__i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence); -int i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence); - -static inline int -i915_active_add_request(struct i915_active *ref, struct i915_request *rq) -{ - return i915_active_ref(ref, - i915_request_timeline(rq)->fence_context, - &rq->fence); -} +int i915_active_add_request(struct i915_active *ref, struct i915_request *rq); struct dma_fence * i915_active_set_exclusive(struct i915_active *ref, struct dma_fence *f); -static inline bool i915_active_has_exclusive(struct i915_active *ref) -{ - return rcu_access_pointer(ref->excl.fence); -} - int __i915_active_wait(struct i915_active *ref, int state); static inline int i915_active_wait(struct i915_active *ref) { diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index 90546fa58fc1..1187f1956c20 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -1220,7 +1220,7 @@ __i915_request_await_bind(struct i915_request *rq, struct i915_vma *vma) return __i915_request_await_exclusive(rq, &vma->active); } -int __i915_vma_move_to_active(struct i915_vma *vma, struct i915_request *rq) +static int __i915_vma_move_to_active(struct i915_vma *vma, struct i915_request *rq) { int err; diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index 648dbe744c96..b882fd7b5f99 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -55,8 +55,6 @@ static inline bool i915_vma_is_active(const struct i915_vma *vma) /* do not reserve memory to prevent deadlocks */ #define __EXEC_OBJECT_NO_RESERVE BIT(31) -int __must_check __i915_vma_move_to_active(struct i915_vma *vma, - struct i915_request *rq); int __must_check _i915_vma_move_to_active(struct i915_vma *vma, struct i915_request *rq, struct dma_fence *fence, -- 2.33.0
[Intel-gfx] [PATCH 02/28] drm/i915: use new iterator in i915_gem_object_wait_reservation
From: Christian König Simplifying the code a bit. Signed-off-by: Christian König [mlankhorst: Handle timeout = 0 correctly, use new i915_request_wait_timeout.] Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gem/i915_gem_wait.c | 65 1 file changed, 20 insertions(+), 45 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c index f909aaa09d9c..840c13706999 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c @@ -25,7 +25,7 @@ i915_gem_object_wait_fence(struct dma_fence *fence, return timeout; if (dma_fence_is_i915(fence)) - return i915_request_wait(to_request(fence), flags, timeout); + return i915_request_wait_timeout(to_request(fence), flags, timeout); return dma_fence_wait_timeout(fence, flags & I915_WAIT_INTERRUPTIBLE, @@ -37,58 +37,29 @@ i915_gem_object_wait_reservation(struct dma_resv *resv, unsigned int flags, long timeout) { - struct dma_fence *excl; - bool prune_fences = false; - - if (flags & I915_WAIT_ALL) { - struct dma_fence **shared; - unsigned int count, i; - int ret; - - ret = dma_resv_get_fences(resv, &excl, &count, &shared); - if (ret) - return ret; - - for (i = 0; i < count; i++) { - timeout = i915_gem_object_wait_fence(shared[i], -flags, timeout); - if (timeout < 0) - break; - - dma_fence_put(shared[i]); - } - - for (; i < count; i++) - dma_fence_put(shared[i]); - kfree(shared); + struct dma_resv_iter cursor; + struct dma_fence *fence; + long ret = timeout ?: 1; + + dma_resv_iter_begin(&cursor, resv, flags & I915_WAIT_ALL); + dma_resv_for_each_fence_unlocked(&cursor, fence) { + ret = i915_gem_object_wait_fence(fence, flags, timeout); + if (ret <= 0) + break; - /* -* If both shared fences and an exclusive fence exist, -* then by construction the shared fences must be later -* than the exclusive fence. If we successfully wait for -* all the shared fences, we know that the exclusive fence -* must all be signaled. If all the shared fences are -* signaled, we can prune the array and recover the -* floating references on the fences/requests. -*/ - prune_fences = count && timeout >= 0; - } else { - excl = dma_resv_get_excl_unlocked(resv); + if (timeout) + timeout = ret; } - - if (excl && timeout >= 0) - timeout = i915_gem_object_wait_fence(excl, flags, timeout); - - dma_fence_put(excl); + dma_resv_iter_end(&cursor); /* * Opportunistically prune the fences iff we know they have *all* been * signaled. */ - if (prune_fences) + if (timeout > 0) dma_resv_prune(resv); - return timeout; + return ret; } static void fence_set_priority(struct dma_fence *fence, @@ -196,7 +167,11 @@ i915_gem_object_wait(struct drm_i915_gem_object *obj, timeout = i915_gem_object_wait_reservation(obj->base.resv, flags, timeout); - return timeout < 0 ? timeout : 0; + + if (timeout < 0) + return timeout; + + return !timeout ? -ETIME : 0; } static inline unsigned long nsecs_to_jiffies_timeout(const u64 n) -- 2.33.0
[Intel-gfx] [PATCH 06/28] drm/i915: Remove gen6_ppgtt_unpin_all
gen6_ppgtt_unpin_all is unused, kill it. Signed-off-by: Maarten Lankhorst Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/gt/gen6_ppgtt.c | 11 --- drivers/gpu/drm/i915/gt/gen6_ppgtt.h | 1 - 2 files changed, 12 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c index 890191f286e3..9fdbd9d3372b 100644 --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c @@ -405,17 +405,6 @@ void gen6_ppgtt_unpin(struct i915_ppgtt *base) i915_vma_unpin(ppgtt->vma); } -void gen6_ppgtt_unpin_all(struct i915_ppgtt *base) -{ - struct gen6_ppgtt *ppgtt = to_gen6_ppgtt(base); - - if (!atomic_read(&ppgtt->pin_count)) - return; - - i915_vma_unpin(ppgtt->vma); - atomic_set(&ppgtt->pin_count, 0); -} - struct i915_ppgtt *gen6_ppgtt_create(struct intel_gt *gt) { struct i915_ggtt * const ggtt = gt->ggtt; diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.h b/drivers/gpu/drm/i915/gt/gen6_ppgtt.h index 6a61a5c3a85a..ab0eecb086dd 100644 --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.h +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.h @@ -71,7 +71,6 @@ static inline struct gen6_ppgtt *to_gen6_ppgtt(struct i915_ppgtt *base) int gen6_ppgtt_pin(struct i915_ppgtt *base, struct i915_gem_ww_ctx *ww); void gen6_ppgtt_unpin(struct i915_ppgtt *base); -void gen6_ppgtt_unpin_all(struct i915_ppgtt *base); void gen6_ppgtt_enable(struct intel_gt *gt); void gen7_ppgtt_enable(struct intel_gt *gt); struct i915_ppgtt *gen6_ppgtt_create(struct intel_gt *gt); -- 2.33.0
[Intel-gfx] [PATCH 01/28] drm/i915: Fix i915_request fence wait semantics
The i915_request fence wait behaves differently for timeout = 0 compared to expected dma-fence behavior. i915 behavior: - Unsignaled: -ETIME - Signaled: 0 (= timeout) Expected: - Unsignaled: 0 - Signaled: 1 Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/i915_request.c | 57 - drivers/gpu/drm/i915/i915_request.h | 5 +++ 2 files changed, 52 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 820a1f38b271..42cd17357771 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -96,9 +96,9 @@ static signed long i915_fence_wait(struct dma_fence *fence, bool interruptible, signed long timeout) { - return i915_request_wait(to_request(fence), -interruptible | I915_WAIT_PRIORITY, -timeout); + return i915_request_wait_timeout(to_request(fence), +interruptible | I915_WAIT_PRIORITY, +timeout); } struct kmem_cache *i915_request_slab_cache(void) @@ -1857,23 +1857,27 @@ static void request_wait_wake(struct dma_fence *fence, struct dma_fence_cb *cb) } /** - * i915_request_wait - wait until execution of request has finished + * i915_request_wait_timeout - wait until execution of request has finished * @rq: the request to wait upon * @flags: how to wait * @timeout: how long to wait in jiffies * - * i915_request_wait() waits for the request to be completed, for a + * i915_request_wait_timeout() waits for the request to be completed, for a * maximum of @timeout jiffies (with MAX_SCHEDULE_TIMEOUT implying an * unbounded wait). * * Returns the remaining time (in jiffies) if the request completed, which may - * be zero or -ETIME if the request is unfinished after the timeout expires. + * be zero if the request is unfinished after the timeout expires. + * If the timeout is 0, it will return 1 if the fence is signaled. + * * May return -EINTR is called with I915_WAIT_INTERRUPTIBLE and a signal is * pending before the request completes. + * + * NOTE: This function has the same wait semantics as dma-fence. */ -long i915_request_wait(struct i915_request *rq, - unsigned int flags, - long timeout) +long i915_request_wait_timeout(struct i915_request *rq, + unsigned int flags, + long timeout) { const int state = flags & I915_WAIT_INTERRUPTIBLE ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE; @@ -1883,7 +1887,7 @@ long i915_request_wait(struct i915_request *rq, GEM_BUG_ON(timeout < 0); if (dma_fence_is_signaled(&rq->fence)) - return timeout; + return timeout ?: 1; if (!timeout) return -ETIME; @@ -1992,6 +1996,39 @@ long i915_request_wait(struct i915_request *rq, return timeout; } +/** + * i915_request_wait - wait until execution of request has finished + * @rq: the request to wait upon + * @flags: how to wait + * @timeout: how long to wait in jiffies + * + * i915_request_wait() waits for the request to be completed, for a + * maximum of @timeout jiffies (with MAX_SCHEDULE_TIMEOUT implying an + * unbounded wait). + * + * Returns the remaining time (in jiffies) if the request completed, which may + * be zero or -ETIME if the request is unfinished after the timeout expires. + * May return -EINTR is called with I915_WAIT_INTERRUPTIBLE and a signal is + * pending before the request completes. + * + * NOTE: This function behaves differently from dma-fence wait semantics for + * timeout = 0. It returns 0 on success, and -ETIME if not signaled. + */ +long i915_request_wait(struct i915_request *rq, + unsigned int flags, + long timeout) +{ + long ret = i915_request_wait_timeout(rq, flags, timeout); + + if (!ret) + return -ETIME; + + if (ret > 0 && !timeout) + return 0; + + return ret; +} + static int print_sched_attr(const struct i915_sched_attr *attr, char *buf, int x, int len) { diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h index dc359242d1ae..3c6e8acd1457 100644 --- a/drivers/gpu/drm/i915/i915_request.h +++ b/drivers/gpu/drm/i915/i915_request.h @@ -414,6 +414,11 @@ void i915_request_unsubmit(struct i915_request *request); void i915_request_cancel(struct i915_request *rq, int error); +long i915_request_wait_timeout(struct i915_request *rq, + unsigned int flags, + long timeout) + __attribute__((nonnull(1))); + long i915_request_wait(struct i915_request *rq,
[Intel-gfx] [PATCH 08/28] drm/i915: Create a full object for mock_ring, v2.
This allows us to finally get rid of all the assumptions that vma->obj is NULL. Changes since v1: - Ensure the mock_ring vma is pinned to prevent a fault. - Pin it high to avoid failure in evict_for_vma selftest. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gt/mock_engine.c | 38 --- 1 file changed, 28 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c index 8b89215afe46..bb99fc03f503 100644 --- a/drivers/gpu/drm/i915/gt/mock_engine.c +++ b/drivers/gpu/drm/i915/gt/mock_engine.c @@ -35,9 +35,31 @@ static void mock_timeline_unpin(struct intel_timeline *tl) atomic_dec(&tl->pin_count); } +static struct i915_vma *create_ring_vma(struct i915_ggtt *ggtt, int size) +{ + struct i915_address_space *vm = &ggtt->vm; + struct drm_i915_private *i915 = vm->i915; + struct drm_i915_gem_object *obj; + struct i915_vma *vma; + + obj = i915_gem_object_create_internal(i915, size); + if (IS_ERR(obj)) + return ERR_CAST(obj); + + vma = i915_vma_instance(obj, vm, NULL); + if (IS_ERR(vma)) + goto err; + + return vma; + +err: + i915_gem_object_put(obj); + return vma; +} + static struct intel_ring *mock_ring(struct intel_engine_cs *engine) { - const unsigned long sz = PAGE_SIZE / 2; + const unsigned long sz = PAGE_SIZE; struct intel_ring *ring; ring = kzalloc(sizeof(*ring) + sz, GFP_KERNEL); @@ -50,15 +72,11 @@ static struct intel_ring *mock_ring(struct intel_engine_cs *engine) ring->vaddr = (void *)(ring + 1); atomic_set(&ring->pin_count, 1); - ring->vma = i915_vma_alloc(); - if (!ring->vma) { + ring->vma = create_ring_vma(engine->gt->ggtt, PAGE_SIZE); + if (IS_ERR(ring->vma)) { kfree(ring); return NULL; } - i915_active_init(&ring->vma->active, NULL, NULL, 0); - __set_bit(I915_VMA_GGTT_BIT, __i915_vma_flags(ring->vma)); - __set_bit(DRM_MM_NODE_ALLOCATED_BIT, &ring->vma->node.flags); - ring->vma->node.size = sz; intel_ring_update_space(ring); @@ -67,8 +85,7 @@ static struct intel_ring *mock_ring(struct intel_engine_cs *engine) static void mock_ring_free(struct intel_ring *ring) { - i915_active_fini(&ring->vma->active); - i915_vma_free(ring->vma); + i915_vma_put(ring->vma); kfree(ring); } @@ -125,6 +142,7 @@ static void mock_context_unpin(struct intel_context *ce) static void mock_context_post_unpin(struct intel_context *ce) { + i915_vma_unpin(ce->ring->vma); } static void mock_context_destroy(struct kref *ref) @@ -169,7 +187,7 @@ static int mock_context_alloc(struct intel_context *ce) static int mock_context_pre_pin(struct intel_context *ce, struct i915_gem_ww_ctx *ww, void **unused) { - return 0; + return i915_vma_pin_ww(ce->ring->vma, ww, 0, 0, PIN_GLOBAL | PIN_HIGH); } static int mock_context_pin(struct intel_context *ce, void *unused) -- 2.33.0
[Intel-gfx] [PATCH 03/28] drm/i915: Remove dma_resv_prune
The signaled bit is already used for quick testing if a fence is signaled. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/Makefile| 1 - drivers/gpu/drm/i915/dma_resv_utils.c| 17 - drivers/gpu/drm/i915/dma_resv_utils.h| 13 - drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 3 --- drivers/gpu/drm/i915/gem/i915_gem_wait.c | 8 5 files changed, 42 deletions(-) delete mode 100644 drivers/gpu/drm/i915/dma_resv_utils.c delete mode 100644 drivers/gpu/drm/i915/dma_resv_utils.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 467872cca027..b87e3ed10d86 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -60,7 +60,6 @@ i915-y += i915_drv.o \ # core library code i915-y += \ - dma_resv_utils.o \ i915_memcpy.o \ i915_mm.o \ i915_sw_fence.o \ diff --git a/drivers/gpu/drm/i915/dma_resv_utils.c b/drivers/gpu/drm/i915/dma_resv_utils.c deleted file mode 100644 index 7df91b7e4ca8.. --- a/drivers/gpu/drm/i915/dma_resv_utils.c +++ /dev/null @@ -1,17 +0,0 @@ -// SPDX-License-Identifier: MIT -/* - * Copyright © 2020 Intel Corporation - */ - -#include - -#include "dma_resv_utils.h" - -void dma_resv_prune(struct dma_resv *resv) -{ - if (dma_resv_trylock(resv)) { - if (dma_resv_test_signaled(resv, true)) - dma_resv_add_excl_fence(resv, NULL); - dma_resv_unlock(resv); - } -} diff --git a/drivers/gpu/drm/i915/dma_resv_utils.h b/drivers/gpu/drm/i915/dma_resv_utils.h deleted file mode 100644 index b9d8fb5f8367.. --- a/drivers/gpu/drm/i915/dma_resv_utils.h +++ /dev/null @@ -1,13 +0,0 @@ -/* SPDX-License-Identifier: MIT */ -/* - * Copyright © 2020 Intel Corporation - */ - -#ifndef DMA_RESV_UTILS_H -#define DMA_RESV_UTILS_H - -struct dma_resv; - -void dma_resv_prune(struct dma_resv *resv); - -#endif /* DMA_RESV_UTILS_H */ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index 5ab136ffdeb2..af3eb7fd951d 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -15,7 +15,6 @@ #include "gt/intel_gt_requests.h" -#include "dma_resv_utils.h" #include "i915_trace.h" static bool swap_available(void) @@ -229,8 +228,6 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww, i915_gem_object_unlock(obj); } - dma_resv_prune(obj->base.resv); - scanned += obj->base.size >> PAGE_SHIFT; skip: i915_gem_object_put(obj); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c index 840c13706999..1592d95c3ead 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c @@ -10,7 +10,6 @@ #include "gt/intel_engine.h" -#include "dma_resv_utils.h" #include "i915_gem_ioctls.h" #include "i915_gem_object.h" @@ -52,13 +51,6 @@ i915_gem_object_wait_reservation(struct dma_resv *resv, } dma_resv_iter_end(&cursor); - /* -* Opportunistically prune the fences iff we know they have *all* been -* signaled. -*/ - if (timeout > 0) - dma_resv_prune(resv); - return ret; } -- 2.33.0
[Intel-gfx] [PATCH 07/28] drm/i915: Create a dummy object for gen6 ppgtt
We currently have to special case vma->obj being NULL because of gen6 ppgtt and mock_engine. Fix gen6 ppgtt, so we may soon be able to remove a few checks. As the object only exists as a fake object pointing to ggtt, we have no backing storage, so no real object is created. It just has to look real enough. Also kill pin_mutex, it's not compatible with ww locking, and we can use the vm lock instead. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gem/i915_gem_internal.c | 44 --- drivers/gpu/drm/i915/gt/gen6_ppgtt.c | 122 +++ drivers/gpu/drm/i915/gt/gen6_ppgtt.h | 1 - drivers/gpu/drm/i915/i915_drv.h | 4 + 4 files changed, 99 insertions(+), 72 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_internal.c b/drivers/gpu/drm/i915/gem/i915_gem_internal.c index a57a6b7013c2..c5150a1ee3d2 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_internal.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c @@ -145,24 +145,10 @@ static const struct drm_i915_gem_object_ops i915_gem_object_internal_ops = { .put_pages = i915_gem_object_put_pages_internal, }; -/** - * i915_gem_object_create_internal: create an object with volatile pages - * @i915: the i915 device - * @size: the size in bytes of backing storage to allocate for the object - * - * Creates a new object that wraps some internal memory for private use. - * This object is not backed by swappable storage, and as such its contents - * are volatile and only valid whilst pinned. If the object is reaped by the - * shrinker, its pages and data will be discarded. Equally, it is not a full - * GEM object and so not valid for access from userspace. This makes it useful - * for hardware interfaces like ringbuffers (which are pinned from the time - * the request is written to the time the hardware stops accessing it), but - * not for contexts (which need to be preserved when not active for later - * reuse). Note that it is not cleared upon allocation. - */ struct drm_i915_gem_object * -i915_gem_object_create_internal(struct drm_i915_private *i915, - phys_addr_t size) +__i915_gem_object_create_internal(struct drm_i915_private *i915, + const struct drm_i915_gem_object_ops *ops, + phys_addr_t size) { static struct lock_class_key lock_class; struct drm_i915_gem_object *obj; @@ -179,7 +165,7 @@ i915_gem_object_create_internal(struct drm_i915_private *i915, return ERR_PTR(-ENOMEM); drm_gem_private_object_init(&i915->drm, &obj->base, size); - i915_gem_object_init(obj, &i915_gem_object_internal_ops, &lock_class, 0); + i915_gem_object_init(obj, ops, &lock_class, 0); obj->mem_flags |= I915_BO_FLAG_STRUCT_PAGE; /* @@ -199,3 +185,25 @@ i915_gem_object_create_internal(struct drm_i915_private *i915, return obj; } + +/** + * i915_gem_object_create_internal: create an object with volatile pages + * @i915: the i915 device + * @size: the size in bytes of backing storage to allocate for the object + * + * Creates a new object that wraps some internal memory for private use. + * This object is not backed by swappable storage, and as such its contents + * are volatile and only valid whilst pinned. If the object is reaped by the + * shrinker, its pages and data will be discarded. Equally, it is not a full + * GEM object and so not valid for access from userspace. This makes it useful + * for hardware interfaces like ringbuffers (which are pinned from the time + * the request is written to the time the hardware stops accessing it), but + * not for contexts (which need to be preserved when not active for later + * reuse). Note that it is not cleared upon allocation. + */ +struct drm_i915_gem_object * +i915_gem_object_create_internal(struct drm_i915_private *i915, + phys_addr_t size) +{ + return __i915_gem_object_create_internal(i915, &i915_gem_object_internal_ops, size); +} diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c index 9fdbd9d3372b..5caa1703716e 100644 --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c @@ -262,13 +262,10 @@ static void gen6_ppgtt_cleanup(struct i915_address_space *vm) { struct gen6_ppgtt *ppgtt = to_gen6_ppgtt(i915_vm_to_ppgtt(vm)); - __i915_vma_put(ppgtt->vma); - gen6_ppgtt_free_pd(ppgtt); free_scratch(vm); mutex_destroy(&ppgtt->flush); - mutex_destroy(&ppgtt->pin_mutex); free_pd(&ppgtt->base.vm, ppgtt->base.pd); } @@ -331,37 +328,6 @@ static const struct i915_vma_ops pd_vma_ops = { .unbind_vma = pd_vma_unbind, }; -static struct i915_vma *pd_vma_create(struct gen6_ppgtt *ppgtt, int size) -{ - struct i915_ggtt *ggtt = ppgtt->base.vm.gt->
[Intel-gfx] [PATCH 05/28] drm/i915: Slightly rework EXEC_OBJECT_CAPTURE handling, v2.
Use a single null-terminated array for simplicity instead of a linked list. This might slightly speed up execbuf when many vma's may be marked as capture, but definitely removes an allocation from a signaling path. We are not allowed to allocate memory in eb_move_to_gpu, but we can't enforce it yet through annotations. Changes since v1: - Rebase on top of multi-batchbuffer changes. Signed-off-by: Maarten Lankhorst Reviewed-by: Niranjana Vishwanathapura #v1 --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 29 --- drivers/gpu/drm/i915/i915_gpu_error.c | 9 +++--- drivers/gpu/drm/i915/i915_request.c | 9 ++ drivers/gpu/drm/i915/i915_request.h | 7 + 4 files changed, 26 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 9c323666bd7c..eaacadd2d2e5 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -265,6 +265,9 @@ struct i915_execbuffer { /* number of batches in execbuf IOCTL */ unsigned int num_batches; + /* Number of objects with EXEC_OBJECT_CAPTURE set */ + unsigned int capture_count; + /** list of vma not yet bound during reservation phase */ struct list_head unbound; @@ -909,6 +912,9 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb) goto err; } + if (eb->exec[i].flags & EXEC_OBJECT_CAPTURE) + eb->capture_count++; + err = eb_validate_vma(eb, &eb->exec[i], vma); if (unlikely(err)) { i915_vma_put(vma); @@ -1906,19 +1912,11 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb) assert_vma_held(vma); if (flags & EXEC_OBJECT_CAPTURE) { - struct i915_capture_list *capture; + eb->capture_count--; for_each_batch_create_order(eb, j) { - if (!eb->requests[j]) - break; - - capture = kmalloc(sizeof(*capture), GFP_KERNEL); - if (capture) { - capture->next = - eb->requests[j]->capture_list; - capture->vma = vma; - eb->requests[j]->capture_list = capture; - } + if (eb->requests[j]->capture_list) + eb->requests[j]->capture_list[eb->capture_count] = vma; } } @@ -3130,6 +3128,14 @@ eb_requests_create(struct i915_execbuffer *eb, struct dma_fence *in_fence, return out_fence; } + if (eb->capture_count) { + eb->requests[i]->capture_list = + kvcalloc(eb->capture_count + 1, + sizeof(*eb->requests[i]->capture_list), + GFP_KERNEL | __GFP_NOWARN); + } + + /* * Only the first request added (committed to backend) has to * take the in fences into account as all subsequent requests @@ -3197,6 +3203,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, eb.fences = NULL; eb.num_fences = 0; + eb.capture_count = 0; memset(eb.requests, 0, sizeof(struct i915_request *) * ARRAY_SIZE(eb.requests)); diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 2a2d7643b551..45104bb12a98 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1356,10 +1356,10 @@ capture_user(struct intel_engine_capture_vma *capture, const struct i915_request *rq, gfp_t gfp) { - struct i915_capture_list *c; + int i; - for (c = rq->capture_list; c; c = c->next) - capture = capture_vma(capture, c->vma, "user", gfp); + for (i = 0; rq->capture_list[i]; i++) + capture = capture_vma(capture, rq->capture_list[i], "user", gfp); return capture; } @@ -1407,7 +1407,8 @@ intel_engine_coredump_add_request(struct intel_engine_coredump *ee, * by userspace. */ vma = capture_vma(vma, rq->batch, "batch", gfp); - vma = capture_user(vma, rq, gfp); + if (rq->capture_list) + vma = capture_user(vma, rq, gfp); vma = capture_vma(vma, rq->ring->vma, "ring", gfp); vma = capture_vm