Re: [PATCH V2 2/3] dt-bindings: display: panel: Add NewVision NV3051D bindings
On Tue, 20 Sep 2022 09:59:04 -0500, Chris Morgan wrote: > From: Chris Morgan > > Add documentation for the NewVision NV3051D panel bindings. > Note that for the two expected consumers of this panel binding > the underlying LCD model is unknown. Name "anbernic,rg353p-panel" > is used because the hardware itself is known as "anbernic,rg353p". > > Signed-off-by: Chris Morgan > --- > .../display/panel/newvision,nv3051d.yaml | 55 +++ > 1 file changed, 55 insertions(+) > create mode 100644 > Documentation/devicetree/bindings/display/panel/newvision,nv3051d.yaml > My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check' on your patch (DT_CHECKER_FLAGS is new in v5.13): yamllint warnings/errors: dtschema/dtc warnings/errors: /builds/robherring/dt-review-ci/linux/Documentation/devicetree/bindings/display/panel/newvision,nv3051d.example.dtb: panel@0: compatible: ['anbernic,rg353p-panel'] is too short From schema: /builds/robherring/dt-review-ci/linux/Documentation/devicetree/bindings/display/panel/newvision,nv3051d.yaml doc reference errors (make refcheckdocs): See https://patchwork.ozlabs.org/patch/ This check can fail if there are any dependencies. The base for a patch series is generally the most recent rc1. If you already ran 'make dt_binding_check' and didn't see the above error(s), then make sure 'yamllint' is installed and dt-schema is up to date: pip3 install dtschema --upgrade Please check and re-submit.
[RFC v4 00/14] drm/i915/vm_bind: Add VM_BIND functionality
DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM buffer objects (BOs) or sections of a BOs at specified GPU virtual addresses on a specified address space (VM). Multiple mappings can map to the same physical pages of an object (aliasing). These mappings (also referred to as persistent mappings) will be persistent across multiple GPU submissions (execbuf calls) issued by the UMD, without user having to provide a list of all required mappings during each submission (as required by older execbuf mode). This patch series support VM_BIND version 1, as described by the param I915_PARAM_VM_BIND_VERSION. Add new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only works in vm_bind mode. The vm_bind mode only works with this new execbuf3 ioctl. The new execbuf3 ioctl will not have any execlist support and all the legacy support like relocations etc., are removed. TODOs: * Support out fence for VM_UNBIND ioctl. * Async VM_UNBIND support. * Cleanups and optimizations. NOTEs: * It is based on below VM_BIND design+uapi rfc. Documentation/gpu/rfc/i915_vm_bind.rst * The IGT RFC series is posted as, [RFC v2 0/8] vm_bind: Add VM_BIND validation support v4: Share code between legacy execbuf and execbuf3. Address review feedback from Thomas and Tvrtko. Reformat patches and some cleanups. Signed-off-by: Niranjana Vishwanathapura Niranjana Vishwanathapura (14): drm/i915/vm_bind: Expose vm lookup function drm/i915/vm_bind: Add __i915_sw_fence_await_reservation() drm/i915/vm_bind: Expose i915_gem_object_max_page_size() drm/i915/vm_bind: Implement bind and unbind of object drm/i915/vm_bind: Support for VM private BOs drm/i915/vm_bind: Handle persistent vmas drm/i915/vm_bind: Add out fence support drm/i915/vm_bind: Abstract out common execbuf functions drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl drm/i915/vm_bind: Update i915_vma_verify_bind_complete() drm/i915/vm_bind: Handle persistent vmas in execbuf3 drm/i915/vm_bind: userptr dma-resv changes drm/i915/vm_bind: Skip vma_lookup for persistent vmas drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode drivers/gpu/drm/i915/Makefile | 3 + drivers/gpu/drm/i915/display/intel_fb_pin.c | 2 +- .../drm/i915/display/intel_plane_initial.c| 2 +- drivers/gpu/drm/i915/gem/i915_gem_context.c | 16 +- drivers/gpu/drm/i915/gem/i915_gem_context.h | 3 + drivers/gpu/drm/i915/gem/i915_gem_create.c| 60 +- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c| 6 + .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 520 +-- .../gpu/drm/i915/gem/i915_gem_execbuffer3.c | 861 ++ .../drm/i915/gem/i915_gem_execbuffer_common.c | 530 +++ .../drm/i915/gem/i915_gem_execbuffer_common.h | 47 + drivers/gpu/drm/i915/gem/i915_gem_ioctls.h| 2 + drivers/gpu/drm/i915/gem/i915_gem_object.c| 3 + drivers/gpu/drm/i915/gem/i915_gem_object.h| 2 + .../gpu/drm/i915/gem/i915_gem_object_types.h | 3 + drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 3 + drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 17 + drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h | 31 + .../drm/i915/gem/i915_gem_vm_bind_object.c| 421 + .../gpu/drm/i915/gem/selftests/huge_pages.c | 16 +- .../i915/gem/selftests/i915_gem_client_blt.c | 2 +- .../drm/i915/gem/selftests/i915_gem_context.c | 12 +- .../drm/i915/gem/selftests/i915_gem_migrate.c | 2 +- .../drm/i915/gem/selftests/i915_gem_mman.c| 6 +- .../drm/i915/gem/selftests/igt_gem_utils.c| 2 +- drivers/gpu/drm/i915/gt/gen6_ppgtt.c | 2 +- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +- drivers/gpu/drm/i915/gt/intel_gt.c| 2 +- drivers/gpu/drm/i915/gt/intel_gtt.c | 20 +- drivers/gpu/drm/i915/gt/intel_gtt.h | 27 + drivers/gpu/drm/i915/gt/intel_lrc.c | 4 +- drivers/gpu/drm/i915/gt/intel_renderstate.c | 2 +- drivers/gpu/drm/i915/gt/intel_ring.c | 2 +- .../gpu/drm/i915/gt/intel_ring_submission.c | 4 +- drivers/gpu/drm/i915/gt/intel_timeline.c | 2 +- drivers/gpu/drm/i915/gt/mock_engine.c | 2 +- drivers/gpu/drm/i915/gt/selftest_engine_cs.c | 4 +- drivers/gpu/drm/i915/gt/selftest_execlists.c | 16 +- drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 6 +- drivers/gpu/drm/i915/gt/selftest_lrc.c| 2 +- .../drm/i915/gt/selftest_ring_submission.c| 2 +- drivers/gpu/drm/i915/gt/selftest_rps.c| 2 +- .../gpu/drm/i915/gt/selftest_workarounds.c| 4 +- drivers/gpu/drm/i915/gt/uc/intel_guc.c| 2 +- drivers/gpu/drm/i915/i915_driver.c| 4 + drivers/gpu/drm/i915/i915_gem.c | 2 +- drivers/gpu/drm/i915/i915_gem_gtt.c | 39 + drivers/gpu/drm/i915/i915_gem_gtt.h | 3 + drivers/gpu/drm/i915/i915_getparam.c | 3 + drivers/gpu/drm/i915/i915_perf.c | 2 +- drivers/gpu/drm/i915/i915_sw_fen
[RFC v4 07/14] drm/i915/vm_bind: Add out fence support
Add support for handling out fence for vm_bind call. Signed-off-by: Niranjana Vishwanathapura Signed-off-by: Andi Shyti --- drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h | 4 + .../drm/i915/gem/i915_gem_vm_bind_object.c| 81 +++ drivers/gpu/drm/i915/i915_vma.c | 6 +- drivers/gpu/drm/i915/i915_vma_types.h | 7 ++ 4 files changed, 97 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h index 4f3cfa1f6ef6..facba29ead04 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h @@ -6,6 +6,7 @@ #ifndef __I915_GEM_VM_BIND_H #define __I915_GEM_VM_BIND_H +#include #include #include @@ -24,4 +25,7 @@ int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data, void i915_gem_vm_unbind_all(struct i915_address_space *vm); +void i915_vm_bind_signal_fence(struct i915_vma *vma, + struct dma_fence * const fence); + #endif /* __I915_GEM_VM_BIND_H */ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c index 236f901b8b9c..5cd788404ee7 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c @@ -7,6 +7,8 @@ #include +#include + #include "gem/i915_gem_context.h" #include "gem/i915_gem_vm_bind.h" @@ -106,6 +108,75 @@ static void i915_gem_vm_bind_remove(struct i915_vma *vma, bool release_obj) i915_gem_object_put(vma->obj); } +static int i915_vm_bind_add_fence(struct drm_file *file, struct i915_vma *vma, + u32 handle, u64 point) +{ + struct drm_syncobj *syncobj; + + syncobj = drm_syncobj_find(file, handle); + if (!syncobj) { + DRM_DEBUG("Invalid syncobj handle provided\n"); + return -ENOENT; + } + + /* +* For timeline syncobjs we need to preallocate chains for +* later signaling. +*/ + if (point) { + vma->vm_bind_fence.chain_fence = dma_fence_chain_alloc(); + if (!vma->vm_bind_fence.chain_fence) { + drm_syncobj_put(syncobj); + return -ENOMEM; + } + } else { + vma->vm_bind_fence.chain_fence = NULL; + } + vma->vm_bind_fence.syncobj = syncobj; + vma->vm_bind_fence.value = point; + + return 0; +} + +static void i915_vm_bind_put_fence(struct i915_vma *vma) +{ + if (!vma->vm_bind_fence.syncobj) + return; + + drm_syncobj_put(vma->vm_bind_fence.syncobj); + dma_fence_chain_free(vma->vm_bind_fence.chain_fence); +} + +/** + * i915_vm_bind_signal_fence() - Add fence to vm_bind syncobj + * @vma: vma mapping requiring signaling + * @fence: fence to be added + * + * Associate specified @fence with the @vma's syncobj to be + * signaled after the @fence work completes. + */ +void i915_vm_bind_signal_fence(struct i915_vma *vma, + struct dma_fence * const fence) +{ + struct drm_syncobj *syncobj = vma->vm_bind_fence.syncobj; + + if (!syncobj) + return; + + if (vma->vm_bind_fence.chain_fence) { + drm_syncobj_add_point(syncobj, + vma->vm_bind_fence.chain_fence, + fence, vma->vm_bind_fence.value); + /* +* The chain's ownership is transferred to the +* timeline. +*/ + vma->vm_bind_fence.chain_fence = NULL; + } else { + drm_syncobj_replace_fence(syncobj, fence); + } +} + static int i915_gem_vm_unbind_vma(struct i915_address_space *vm, struct drm_i915_gem_vm_unbind *va) { @@ -233,6 +304,13 @@ static int i915_gem_vm_bind_obj(struct i915_address_space *vm, goto unlock_vm; } + if (va->fence.flags & I915_TIMELINE_FENCE_SIGNAL) { + ret = i915_vm_bind_add_fence(file, vma, va->fence.handle, +va->fence.value); + if (ret) + goto put_vma; + } + pin_flags = va->start | PIN_OFFSET_FIXED | PIN_USER; for_i915_gem_ww(&ww, ret, true) { @@ -257,6 +335,9 @@ static int i915_gem_vm_bind_obj(struct i915_address_space *vm, i915_gem_object_get(vma->obj); } + if (va->fence.flags & I915_TIMELINE_FENCE_SIGNAL) + i915_vm_bind_put_fence(vma); +put_vma: if (ret) i915_vma_destroy(vma); unlock_vm: diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index ff216e9a2c8d..f7d711e675d6 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -1540,8 +1540,12 @@ int i915_vma_pin
[RFC v4 08/14] drm/i915/vm_bind: Abstract out common execbuf functions
The new execbuf3 ioctl path and the legacy execbuf ioctl paths have many common functionalities. Share code between these two paths by abstracting out the common functionalities into a separate file where possible. Signed-off-by: Niranjana Vishwanathapura --- drivers/gpu/drm/i915/Makefile | 1 + .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 507 ++--- .../drm/i915/gem/i915_gem_execbuffer_common.c | 530 ++ .../drm/i915/gem/i915_gem_execbuffer_common.h | 47 ++ 4 files changed, 612 insertions(+), 473 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 9bf939ef18ea..bf952f478555 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -148,6 +148,7 @@ gem-y += \ gem/i915_gem_create.o \ gem/i915_gem_dmabuf.o \ gem/i915_gem_domain.o \ + gem/i915_gem_execbuffer_common.o \ gem/i915_gem_execbuffer.o \ gem/i915_gem_internal.o \ gem/i915_gem_object.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 33d989a20227..363b2a788cdf 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -9,8 +9,6 @@ #include #include -#include - #include "display/intel_frontbuffer.h" #include "gem/i915_gem_ioctls.h" @@ -28,6 +26,7 @@ #include "i915_file_private.h" #include "i915_gem_clflush.h" #include "i915_gem_context.h" +#include "i915_gem_execbuffer_common.h" #include "i915_gem_evict.h" #include "i915_gem_ioctls.h" #include "i915_trace.h" @@ -235,13 +234,6 @@ enum { * the batchbuffer in trusted mode, otherwise the ioctl is rejected. */ -struct eb_fence { - struct drm_syncobj *syncobj; /* Use with ptr_mask_bits() */ - struct dma_fence *dma_fence; - u64 value; - struct dma_fence_chain *chain_fence; -}; - struct i915_execbuffer { struct drm_i915_private *i915; /** i915 backpointer */ struct drm_file *file; /** per-file lookup tables and limits */ @@ -2446,164 +2438,29 @@ static const enum intel_engine_id user_ring_map[] = { [I915_EXEC_VEBOX] = VECS0 }; -static struct i915_request *eb_throttle(struct i915_execbuffer *eb, struct intel_context *ce) -{ - struct intel_ring *ring = ce->ring; - struct intel_timeline *tl = ce->timeline; - struct i915_request *rq; - - /* -* Completely unscientific finger-in-the-air estimates for suitable -* maximum user request size (to avoid blocking) and then backoff. -*/ - if (intel_ring_update_space(ring) >= PAGE_SIZE) - return NULL; - - /* -* Find a request that after waiting upon, there will be at least half -* the ring available. The hysteresis allows us to compete for the -* shared ring and should mean that we sleep less often prior to -* claiming our resources, but not so long that the ring completely -* drains before we can submit our next request. -*/ - list_for_each_entry(rq, &tl->requests, link) { - if (rq->ring != ring) - continue; - - if (__intel_ring_space(rq->postfix, - ring->emit, ring->size) > ring->size / 2) - break; - } - if (&rq->link == &tl->requests) - return NULL; /* weird, we will check again later for real */ - - return i915_request_get(rq); -} - -static int eb_pin_timeline(struct i915_execbuffer *eb, struct intel_context *ce, - bool throttle) -{ - struct intel_timeline *tl; - struct i915_request *rq = NULL; - - /* -* Take a local wakeref for preparing to dispatch the execbuf as -* we expect to access the hardware fairly frequently in the -* process, and require the engine to be kept awake between accesses. -* Upon dispatch, we acquire another prolonged wakeref that we hold -* until the timeline is idle, which in turn releases the wakeref -* taken on the engine, and the parent device. -*/ - tl = intel_context_timeline_lock(ce); - if (IS_ERR(tl)) - return PTR_ERR(tl); - - intel_context_enter(ce); - if (throttle) - rq = eb_throttle(eb, ce); - intel_context_timeline_unlock(tl); - - if (rq) { - bool nonblock = eb->file->filp->f_flags & O_NONBLOCK; - long timeout = nonblock ? 0 : MAX_SCHEDULE_TIMEOUT; - - if (i915_request_wait(rq, I915_WAIT_INTERRUPTIBLE, - timeout) < 0) { - i915_request_put(rq); - - /* -
[RFC v4 09/14] drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl
Implement new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only works in vm_bind mode. The vm_bind mode only works with this new execbuf3 ioctl. The new execbuf3 ioctl will not have any list of objects to validate bind as all required objects binding would have been requested by the userspace before submitting the execbuf3. Legacy features like relocations etc are not supported by execbuf3. Signed-off-by: Niranjana Vishwanathapura Signed-off-by: Andi Shyti --- drivers/gpu/drm/i915/Makefile | 1 + .../gpu/drm/i915/gem/i915_gem_execbuffer3.c | 575 ++ drivers/gpu/drm/i915/gem/i915_gem_ioctls.h| 2 + drivers/gpu/drm/i915/i915_driver.c| 1 + include/uapi/drm/i915_drm.h | 64 ++ 5 files changed, 643 insertions(+) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index bf952f478555..3473ee5825bb 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -150,6 +150,7 @@ gem-y += \ gem/i915_gem_domain.o \ gem/i915_gem_execbuffer_common.o \ gem/i915_gem_execbuffer.o \ + gem/i915_gem_execbuffer3.o \ gem/i915_gem_internal.o \ gem/i915_gem_object.o \ gem/i915_gem_lmem.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c new file mode 100644 index ..b6229b955e62 --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c @@ -0,0 +1,575 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2022 Intel Corporation + */ + +#include +#include +#include + +#include "gt/intel_context.h" +#include "gt/intel_gpu_commands.h" +#include "gt/intel_gt.h" +#include "gt/intel_gt_pm.h" +#include "gt/intel_ring.h" + +#include "i915_drv.h" +#include "i915_file_private.h" +#include "i915_gem_context.h" +#include "i915_gem_execbuffer_common.h" +#include "i915_gem_ioctls.h" +#include "i915_gem_vm_bind.h" +#include "i915_trace.h" + +#define __EXEC3_ENGINE_PINNED BIT_ULL(32) +#define __EXEC3_INTERNAL_FLAGS (~0ull << 32) + +/* Catch emission of unexpected errors for CI! */ +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) +#undef EINVAL +#define EINVAL ({ \ + DRM_DEBUG_DRIVER("EINVAL at %s:%d\n", __func__, __LINE__); \ + 22; \ +}) +#endif + +/** + * DOC: User command execution with execbuf3 ioctl + * + * A VM in VM_BIND mode will not support older execbuf mode of binding. + * The execbuf ioctl handling in VM_BIND mode differs significantly from the + * older execbuf2 ioctl (See struct drm_i915_gem_execbuffer2). + * Hence, a new execbuf3 ioctl has been added to support VM_BIND mode. (See + * struct drm_i915_gem_execbuffer3). The execbuf3 ioctl will not accept any + * execlist. Hence, no support for implicit sync. + * + * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only + * works with execbuf3 ioctl for submission. + * + * The execbuf3 ioctl directly specifies the batch addresses instead of as + * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not + * support many of the older features like in/out/submit fences, fence array, + * default gem context etc. (See struct drm_i915_gem_execbuffer3). + * + * In VM_BIND mode, VA allocation is completely managed by the user instead of + * the i915 driver. Hence all VA assignment, eviction are not applicable in + * VM_BIND mode. Also, for determining object activeness, VM_BIND mode will not + * be using the i915_vma active reference tracking. It will instead check the + * dma-resv object's fence list for that. + * + * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions, + * vma lookup table, implicit sync, vma active reference tracking etc., are not + * applicable for execbuf3 ioctl. + */ + +/** + * struct i915_execbuffer - execbuf struct for execbuf3 + * @i915: reference to the i915 instance we run on + * @file: drm file reference + * args: execbuf3 ioctl structure + * @gt: reference to the gt instance ioctl submitted for + * @context: logical state for the request + * @gem_context: callers context + * @requests: requests to be build + * @composite_fence: used for excl fence in dma_resv objects when > 1 BB submitted + * @ww: i915_gem_ww_ctx instance + * @num_batches: number of batches submitted + * @batch_addresses: addresses corresponds to the submitted batches + * @batches: references to the i915_vmas corresponding to the batches + */ +struct i915_execbuffer { + struct drm_i915_private *i915; + struct drm_file *file; + struct drm_i915_gem_execbuffer3 *args; + + struct intel_gt *gt; + struct intel_context *context; + struct i915_gem_context *gem_context; + + struct i915_request *requests[MAX_ENGINE_INSTANCE + 1]; + struct dma_fence *composite_fence; + + struct i915_gem_ww_ctx ww; + + unsigned int num_batches; + u64
[RFC v4 02/14] drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()
Add function __i915_sw_fence_await_reservation() for asynchronous wait on a dma-resv object with specified dma_resv_usage. This is required for async vma unbind with vm_bind. Signed-off-by: Niranjana Vishwanathapura --- drivers/gpu/drm/i915/i915_sw_fence.c | 25 ++--- drivers/gpu/drm/i915/i915_sw_fence.h | 7 ++- 2 files changed, 24 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c index 6fc0d1b89690..0ce8f4efc1ed 100644 --- a/drivers/gpu/drm/i915/i915_sw_fence.c +++ b/drivers/gpu/drm/i915/i915_sw_fence.c @@ -569,12 +569,11 @@ int __i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence, return ret; } -int i915_sw_fence_await_reservation(struct i915_sw_fence *fence, - struct dma_resv *resv, - const struct dma_fence_ops *exclude, - bool write, - unsigned long timeout, - gfp_t gfp) +int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence, + struct dma_resv *resv, + enum dma_resv_usage usage, + unsigned long timeout, + gfp_t gfp) { struct dma_resv_iter cursor; struct dma_fence *f; @@ -583,7 +582,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence, debug_fence_assert(fence); might_sleep_if(gfpflags_allow_blocking(gfp)); - dma_resv_iter_begin(&cursor, resv, dma_resv_usage_rw(write)); + dma_resv_iter_begin(&cursor, resv, usage); dma_resv_for_each_fence_unlocked(&cursor, f) { pending = i915_sw_fence_await_dma_fence(fence, f, timeout, gfp); @@ -598,6 +597,18 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence, return ret; } +int i915_sw_fence_await_reservation(struct i915_sw_fence *fence, + struct dma_resv *resv, + const struct dma_fence_ops *exclude, + bool write, + unsigned long timeout, + gfp_t gfp) +{ + return __i915_sw_fence_await_reservation(fence, resv, +dma_resv_usage_rw(write), +timeout, gfp); +} + #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) #include "selftests/lib_sw_fence.c" #include "selftests/i915_sw_fence.c" diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h b/drivers/gpu/drm/i915/i915_sw_fence.h index 619fc5a22f0c..3cf4b6e16f35 100644 --- a/drivers/gpu/drm/i915/i915_sw_fence.h +++ b/drivers/gpu/drm/i915/i915_sw_fence.h @@ -10,13 +10,13 @@ #define _I915_SW_FENCE_H_ #include +#include #include #include #include /* for NOTIFY_DONE */ #include struct completion; -struct dma_resv; struct i915_sw_fence; enum i915_sw_fence_notify { @@ -89,6 +89,11 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence, unsigned long timeout, gfp_t gfp); +int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence, + struct dma_resv *resv, + enum dma_resv_usage usage, + unsigned long timeout, + gfp_t gfp); int i915_sw_fence_await_reservation(struct i915_sw_fence *fence, struct dma_resv *resv, const struct dma_fence_ops *exclude, -- 2.21.0.rc0.32.g243a4c7e27
[RFC v4 04/14] drm/i915/vm_bind: Implement bind and unbind of object
Add uapi and implement support for bind and unbind of an object at the specified GPU virtual addresses. The vm_bind mode is not supported in legacy execbuf2 ioctl. It will be supported only in the newer execbuf3 ioctl. Signed-off-by: Niranjana Vishwanathapura Signed-off-by: Prathap Kumar Valsan Signed-off-by: Andi Shyti --- drivers/gpu/drm/i915/Makefile | 1 + .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 5 + drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h | 27 ++ .../drm/i915/gem/i915_gem_vm_bind_object.c| 308 ++ drivers/gpu/drm/i915/gt/intel_gtt.c | 10 + drivers/gpu/drm/i915/gt/intel_gtt.h | 17 + drivers/gpu/drm/i915/i915_driver.c| 3 + drivers/gpu/drm/i915/i915_vma.c | 3 +- drivers/gpu/drm/i915/i915_vma.h | 2 - drivers/gpu/drm/i915/i915_vma_types.h | 14 + include/uapi/drm/i915_drm.h | 167 ++ 11 files changed, 554 insertions(+), 3 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index a26edcdadc21..9bf939ef18ea 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -166,6 +166,7 @@ gem-y += \ gem/i915_gem_ttm_move.o \ gem/i915_gem_ttm_pm.o \ gem/i915_gem_userptr.o \ + gem/i915_gem_vm_bind_object.o \ gem/i915_gem_wait.o \ gem/i915_gemfs.o i915-y += \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index cd75b0ca2555..f85f10cf9c34 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -781,6 +781,11 @@ static int eb_select_context(struct i915_execbuffer *eb) if (unlikely(IS_ERR(ctx))) return PTR_ERR(ctx); + if (ctx->vm->vm_bind_mode) { + i915_gem_context_put(ctx); + return -EOPNOTSUPP; + } + eb->gem_context = ctx; if (i915_gem_context_has_full_ppgtt(ctx)) eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h new file mode 100644 index ..4f3cfa1f6ef6 --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2022 Intel Corporation + */ + +#ifndef __I915_GEM_VM_BIND_H +#define __I915_GEM_VM_BIND_H + +#include + +#include +#include + +#include "gt/intel_gtt.h" +#include "i915_vma_types.h" + +struct i915_vma * +i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va); + +int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data, +struct drm_file *file); + +void i915_gem_vm_unbind_all(struct i915_address_space *vm); + +#endif /* __I915_GEM_VM_BIND_H */ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c new file mode 100644 index ..c24e22657617 --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c @@ -0,0 +1,308 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2022 Intel Corporation + */ + +#include + +#include + +#include "gem/i915_gem_context.h" +#include "gem/i915_gem_vm_bind.h" + +#include "gt/intel_gpu_commands.h" + +#define START(node) ((node)->start) +#define LAST(node) ((node)->last) + +INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last, +START, LAST, static inline, i915_vm_bind_it) + +#undef START +#undef LAST + +/** + * DOC: VM_BIND/UNBIND ioctls + * + * DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM buffer + * objects (BOs) or sections of a BOs at specified GPU virtual addresses on a + * specified address space (VM). Multiple mappings can map to the same physical + * pages of an object (aliasing). These mappings (also referred to as persistent + * mappings) will be persistent across multiple GPU submissions (execbuf calls) + * issued by the UMD, without user having to provide a list of all required + * mappings during each submission (as required by older execbuf mode). + * + * The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for + * signaling the completion of bind/unbind operation. + * + * VM_BIND feature is advertised to user via I915_PARAM_VM_BIND_VERSION. + * User has to opt-in for VM_BIND mode of binding for an address space (VM) + * during VM creation time via I915_VM_CREATE_FLAGS_USE_VM_BIND extension. + * + * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently + * are not ordered. Furthermore, parts of the VM_BIND/UNBIND operations can be + * d
[RFC v4 01/14] drm/i915/vm_bind: Expose vm lookup function
Make i915_gem_vm_lookup() function non-static as it will be used by the vm_bind feature. Signed-off-by: Niranjana Vishwanathapura Signed-off-by: Andi Shyti --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 11 ++- drivers/gpu/drm/i915/gem/i915_gem_context.h | 3 +++ 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 0bcde53c50c6..f4e648ec01ed 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -346,7 +346,16 @@ static int proto_context_register(struct drm_i915_file_private *fpriv, return ret; } -static struct i915_address_space * +/** + * i915_gem_vm_lookup() - looks up for the VM reference given the vm id + * @file_priv: the private data associated with the user's file + * @id: the VM id + * + * Finds the VM reference associated to a specific id. + * + * Returns the VM pointer on success, NULL in case of failure. + */ +struct i915_address_space * i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id) { struct i915_address_space *vm; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h index e5b0f66ea1fe..899fa8f1e0fe 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h @@ -139,6 +139,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data, struct drm_file *file); +struct i915_address_space * +i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id); + struct i915_gem_context * i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id); -- 2.21.0.rc0.32.g243a4c7e27
[RFC v4 11/14] drm/i915/vm_bind: Handle persistent vmas in execbuf3
Handle persistent (VM_BIND) mappings during the request submission in the execbuf3 path. Signed-off-by: Niranjana Vishwanathapura Signed-off-by: Andi Shyti --- .../gpu/drm/i915/gem/i915_gem_execbuffer3.c | 189 +- 1 file changed, 188 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c index b6229b955e62..82a068d03440 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c @@ -4,6 +4,7 @@ */ #include +#include #include #include @@ -21,6 +22,7 @@ #include "i915_gem_vm_bind.h" #include "i915_trace.h" +#define __EXEC3_HAS_PINBIT_ULL(33) #define __EXEC3_ENGINE_PINNED BIT_ULL(32) #define __EXEC3_INTERNAL_FLAGS (~0ull << 32) @@ -44,7 +46,9 @@ * execlist. Hence, no support for implicit sync. * * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only - * works with execbuf3 ioctl for submission. + * works with execbuf3 ioctl for submission. All BOs mapped on that VM (through + * VM_BIND call) at the time of execbuf3 call are deemed required for that + * submission. * * The execbuf3 ioctl directly specifies the batch addresses instead of as * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not @@ -60,6 +64,13 @@ * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions, * vma lookup table, implicit sync, vma active reference tracking etc., are not * applicable for execbuf3 ioctl. + * + * During each execbuf submission, request fence is added to all VM_BIND mapped + * objects with DMA_RESV_USAGE_BOOKKEEP. The DMA_RESV_USAGE_BOOKKEEP usage will + * prevent over sync (See enum dma_resv_usage). Note that DRM_I915_GEM_WAIT and + * DRM_I915_GEM_BUSY ioctls do not check for DMA_RESV_USAGE_BOOKKEEP usage and + * hence should not be used for end of batch check. Instead, the execbuf3 + * timeline out fence should be used for end of batch check. */ /** @@ -129,6 +140,23 @@ eb_find_vma(struct i915_address_space *vm, u64 addr) return i915_gem_vm_bind_lookup_vma(vm, va); } +static void eb_scoop_unbound_vma_all(struct i915_address_space *vm) +{ + struct i915_vma *vma, *vn; + + /** +* Move all unbound vmas back into vm_bind_list so that they are +* revalidated. +*/ + spin_lock(&vm->vm_rebind_lock); + list_for_each_entry_safe(vma, vn, &vm->vm_rebind_list, vm_rebind_link) { + list_del_init(&vma->vm_rebind_link); + if (!list_empty(&vma->vm_bind_link)) + list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list); + } + spin_unlock(&vm->vm_rebind_lock); +} + static int eb_lookup_vma_all(struct i915_execbuffer *eb) { unsigned int i, current_batch = 0; @@ -143,14 +171,121 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb) ++current_batch; } + eb_scoop_unbound_vma_all(eb->context->vm); + + return 0; +} + +static int eb_lock_vma_all(struct i915_execbuffer *eb) +{ + struct i915_address_space *vm = eb->context->vm; + struct i915_vma *vma; + int err; + + err = i915_gem_object_lock(eb->context->vm->root_obj, &eb->ww); + if (err) + return err; + + list_for_each_entry(vma, &vm->non_priv_vm_bind_list, + non_priv_vm_bind_link) { + err = i915_gem_object_lock(vma->obj, &eb->ww); + if (err) + return err; + } + return 0; } +static void eb_release_persistent_vma_all(struct i915_execbuffer *eb, + bool final) +{ + struct i915_address_space *vm = eb->context->vm; + struct i915_vma *vma, *vn; + + lockdep_assert_held(&vm->vm_bind_lock); + + if (!(eb->args->flags & __EXEC3_HAS_PIN)) + return; + + assert_object_held(vm->root_obj); + + list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link) + __i915_vma_unpin(vma); + + eb->args->flags &= ~__EXEC3_HAS_PIN; + if (!final) + return; + + list_for_each_entry_safe(vma, vn, &vm->vm_bind_list, vm_bind_link) + if (i915_vma_verify_bind_complete(vma)) + list_move_tail(&vma->vm_bind_link, &vm->vm_bound_list); +} + static void eb_release_vma_all(struct i915_execbuffer *eb, bool final) { + eb_release_persistent_vma_all(eb, final); eb_unpin_engine(eb); } +static int eb_reserve_fence_for_persistent_vma_all(struct i915_execbuffer *eb) +{ + struct i915_address_space *vm = eb->context->vm; + struct i915_vma *vma; + int ret; + + ret = dma_resv_reserve_fences(vm->root_obj->base.resv, 1); + if (ret) + return ret; + + list_for_each_entry(vma, &vm->non_priv_vm_bind_list, +
[RFC v4 13/14] drm/i915/vm_bind: Skip vma_lookup for persistent vmas
vma_lookup is tied to segment of the object instead of section of VA space. Hence, it do not support aliasing (ie., multiple bindings to the same section of the object). Skip vma_lookup for persistent vmas as it supports aliasing. Signed-off-by: Niranjana Vishwanathapura Signed-off-by: Andi Shyti --- drivers/gpu/drm/i915/display/intel_fb_pin.c | 2 +- .../drm/i915/display/intel_plane_initial.c| 2 +- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 4 +- .../drm/i915/gem/i915_gem_vm_bind_object.c| 2 +- .../gpu/drm/i915/gem/selftests/huge_pages.c | 16 +++ .../i915/gem/selftests/i915_gem_client_blt.c | 2 +- .../drm/i915/gem/selftests/i915_gem_context.c | 12 ++--- .../drm/i915/gem/selftests/i915_gem_migrate.c | 2 +- .../drm/i915/gem/selftests/i915_gem_mman.c| 6 ++- .../drm/i915/gem/selftests/igt_gem_utils.c| 2 +- drivers/gpu/drm/i915/gt/gen6_ppgtt.c | 2 +- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +- drivers/gpu/drm/i915/gt/intel_gt.c| 2 +- drivers/gpu/drm/i915/gt/intel_gtt.c | 2 +- drivers/gpu/drm/i915/gt/intel_lrc.c | 4 +- drivers/gpu/drm/i915/gt/intel_renderstate.c | 2 +- drivers/gpu/drm/i915/gt/intel_ring.c | 2 +- .../gpu/drm/i915/gt/intel_ring_submission.c | 4 +- drivers/gpu/drm/i915/gt/intel_timeline.c | 2 +- drivers/gpu/drm/i915/gt/mock_engine.c | 2 +- drivers/gpu/drm/i915/gt/selftest_engine_cs.c | 4 +- drivers/gpu/drm/i915/gt/selftest_execlists.c | 16 +++ drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 6 +-- drivers/gpu/drm/i915/gt/selftest_lrc.c| 2 +- .../drm/i915/gt/selftest_ring_submission.c| 2 +- drivers/gpu/drm/i915/gt/selftest_rps.c| 2 +- .../gpu/drm/i915/gt/selftest_workarounds.c| 4 +- drivers/gpu/drm/i915/gt/uc/intel_guc.c| 2 +- drivers/gpu/drm/i915/i915_gem.c | 2 +- drivers/gpu/drm/i915/i915_perf.c | 2 +- drivers/gpu/drm/i915/i915_vma.c | 26 +++ drivers/gpu/drm/i915/i915_vma.h | 3 +- drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 44 +-- drivers/gpu/drm/i915/selftests/i915_request.c | 4 +- drivers/gpu/drm/i915/selftests/i915_vma.c | 2 +- drivers/gpu/drm/i915/selftests/igt_spinner.c | 2 +- .../drm/i915/selftests/intel_memory_region.c | 2 +- 37 files changed, 106 insertions(+), 93 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_fb_pin.c b/drivers/gpu/drm/i915/display/intel_fb_pin.c index c86e5d4ee016..5a718b247bb3 100644 --- a/drivers/gpu/drm/i915/display/intel_fb_pin.c +++ b/drivers/gpu/drm/i915/display/intel_fb_pin.c @@ -47,7 +47,7 @@ intel_pin_fb_obj_dpt(struct drm_framebuffer *fb, goto err; } - vma = i915_vma_instance(obj, vm, view); + vma = i915_vma_instance(obj, vm, view, false); if (IS_ERR(vma)) goto err; diff --git a/drivers/gpu/drm/i915/display/intel_plane_initial.c b/drivers/gpu/drm/i915/display/intel_plane_initial.c index 76be796df255..7667e2faa3fb 100644 --- a/drivers/gpu/drm/i915/display/intel_plane_initial.c +++ b/drivers/gpu/drm/i915/display/intel_plane_initial.c @@ -136,7 +136,7 @@ initial_plane_vma(struct drm_i915_private *i915, goto err_obj; } - vma = i915_vma_instance(obj, &to_gt(i915)->ggtt->vm, NULL); + vma = i915_vma_instance(obj, &to_gt(i915)->ggtt->vm, NULL, false); if (IS_ERR(vma)) goto err_obj; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 363b2a788cdf..0ee43cb601b5 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -876,7 +876,7 @@ static struct i915_vma *eb_lookup_vma(struct i915_execbuffer *eb, u32 handle) } } - vma = i915_vma_instance(obj, vm, NULL); + vma = i915_vma_instance(obj, vm, NULL, false); if (IS_ERR(vma)) { i915_gem_object_put(obj); return vma; @@ -2208,7 +2208,7 @@ shadow_batch_pin(struct i915_execbuffer *eb, struct i915_vma *vma; int err; - vma = i915_vma_instance(obj, vm, NULL); + vma = i915_vma_instance(obj, vm, NULL, false); if (IS_ERR(vma)) return vma; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c index 3087731cc0c0..4468603af6f1 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c @@ -252,7 +252,7 @@ static struct i915_vma *vm_bind_get_vma(struct i915_address_space *vm, view.type = I915_GTT_VIEW_PARTIAL; view.partial.offset = va->offset >> PAGE_SHIFT; view.partial.size = va->length >> PAGE_SHIFT; - vma = i915_vma_instance(obj, vm, &v
[RFC v4 03/14] drm/i915/vm_bind: Expose i915_gem_object_max_page_size()
Expose i915_gem_object_max_page_size() function non-static which will be used by the vm_bind feature. Signed-off-by: Niranjana Vishwanathapura Signed-off-by: Andi Shyti --- drivers/gpu/drm/i915/gem/i915_gem_create.c | 20 +++- drivers/gpu/drm/i915/gem/i915_gem_object.h | 2 ++ 2 files changed, 17 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c b/drivers/gpu/drm/i915/gem/i915_gem_create.c index 33673fe7ee0a..3b3ab4abb0a3 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_create.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c @@ -11,14 +11,24 @@ #include "pxp/intel_pxp.h" #include "i915_drv.h" +#include "i915_gem_context.h" #include "i915_gem_create.h" #include "i915_trace.h" #include "i915_user_extensions.h" -static u32 object_max_page_size(struct intel_memory_region **placements, - unsigned int n_placements) +/** + * i915_gem_object_max_page_size() - max of min_page_size of the regions + * @placements: list of regions + * @n_placements: number of the placements + * + * Calculates the max of the min_page_size of a list of placements passed in. + * + * Return: max of the min_page_size + */ +u32 i915_gem_object_max_page_size(struct intel_memory_region **placements, + unsigned int n_placements) { - u32 max_page_size = 0; + u32 max_page_size = I915_GTT_PAGE_SIZE_4K; int i; for (i = 0; i < n_placements; i++) { @@ -28,7 +38,6 @@ static u32 object_max_page_size(struct intel_memory_region **placements, max_page_size = max_t(u32, max_page_size, mr->min_page_size); } - GEM_BUG_ON(!max_page_size); return max_page_size; } @@ -99,7 +108,8 @@ __i915_gem_object_create_user_ext(struct drm_i915_private *i915, u64 size, i915_gem_flush_free_objects(i915); - size = round_up(size, object_max_page_size(placements, n_placements)); + size = round_up(size, i915_gem_object_max_page_size(placements, + n_placements)); if (size == 0) return ERR_PTR(-EINVAL); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 7317d4102955..8c97bddad921 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -47,6 +47,8 @@ static inline bool i915_gem_object_size_2big(u64 size) } void i915_gem_init__objects(struct drm_i915_private *i915); +u32 i915_gem_object_max_page_size(struct intel_memory_region **placements, + unsigned int n_placements); void i915_objects_module_exit(void); int i915_objects_module_init(void); -- 2.21.0.rc0.32.g243a4c7e27
[RFC v4 10/14] drm/i915/vm_bind: Update i915_vma_verify_bind_complete()
Ensure i915_vma_verify_bind_complete() handles case where bind is not initiated. Also make it non static, add documentation and move it out of CONFIG_DRM_I915_DEBUG_GEM. Signed-off-by: Niranjana Vishwanathapura Signed-off-by: Andi Shyti --- drivers/gpu/drm/i915/i915_vma.c | 16 +++- drivers/gpu/drm/i915/i915_vma.h | 1 + 2 files changed, 12 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index f7d711e675d6..24f171588f56 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -406,12 +406,21 @@ int i915_vma_sync(struct i915_vma *vma) return i915_vm_sync(vma->vm); } -#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) -static int i915_vma_verify_bind_complete(struct i915_vma *vma) +/** + * i915_vma_verify_bind_complete() - Check for the bind completion of the vma + * @vma: vma to check for bind completion + * + * Returns: 0 if the vma bind is completed. Error code otherwise. + */ +int i915_vma_verify_bind_complete(struct i915_vma *vma) { struct dma_fence *fence = i915_active_fence_get(&vma->active.excl); int err; + /* Ensure vma bind is initiated */ + if (!i915_vma_is_bound(vma, I915_VMA_BIND_MASK)) + return -EINVAL; + if (!fence) return 0; @@ -424,9 +433,6 @@ static int i915_vma_verify_bind_complete(struct i915_vma *vma) return err; } -#else -#define i915_vma_verify_bind_complete(_vma) 0 -#endif I915_SELFTEST_EXPORT void i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res, diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index aa536c9ce472..3a47db2d85f5 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -433,6 +433,7 @@ void i915_vma_make_purgeable(struct i915_vma *vma); int i915_vma_wait_for_bind(struct i915_vma *vma); int i915_vma_sync(struct i915_vma *vma); +int i915_vma_verify_bind_complete(struct i915_vma *vma); /** * i915_vma_get_current_resource - Get the current resource of the vma -- 2.21.0.rc0.32.g243a4c7e27
[RFC v4 05/14] drm/i915/vm_bind: Support for VM private BOs
Each VM creates a root_obj and shares it with all of its private objects to use it as dma_resv object. This has a performance advantage as it requires a single dma_resv object update for all private BOs vs list of dma_resv objects update for shared BOs, in the execbuf path. VM private BOs can be only mapped on specified VM and cannot be dmabuf exported. Also, they are supported only in vm_bind mode. Signed-off-by: Niranjana Vishwanathapura Signed-off-by: Andi Shyti --- drivers/gpu/drm/i915/gem/i915_gem_create.c| 40 ++- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c| 6 +++ .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 4 ++ drivers/gpu/drm/i915/gem/i915_gem_object.c| 3 ++ .../gpu/drm/i915/gem/i915_gem_object_types.h | 3 ++ drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 3 ++ .../drm/i915/gem/i915_gem_vm_bind_object.c| 9 + drivers/gpu/drm/i915/gt/intel_gtt.c | 4 ++ drivers/gpu/drm/i915/gt/intel_gtt.h | 2 + drivers/gpu/drm/i915/i915_vma.c | 1 + drivers/gpu/drm/i915/i915_vma_types.h | 2 + include/uapi/drm/i915_drm.h | 30 ++ 12 files changed, 105 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c b/drivers/gpu/drm/i915/gem/i915_gem_create.c index 3b3ab4abb0a3..692d95ef5d3e 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_create.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c @@ -253,6 +253,7 @@ struct create_ext { unsigned int n_placements; unsigned int placement_mask; unsigned long flags; + u32 vm_id; }; static void repr_placements(char *buf, size_t size, @@ -402,9 +403,24 @@ static int ext_set_protected(struct i915_user_extension __user *base, void *data return 0; } +static int ext_set_vm_private(struct i915_user_extension __user *base, + void *data) +{ + struct drm_i915_gem_create_ext_vm_private ext; + struct create_ext *ext_data = data; + + if (copy_from_user(&ext, base, sizeof(ext))) + return -EFAULT; + + ext_data->vm_id = ext.vm_id; + + return 0; +} + static const i915_user_extension_fn create_extensions[] = { [I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements, [I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected, + [I915_GEM_CREATE_EXT_VM_PRIVATE] = ext_set_vm_private, }; /** @@ -420,6 +436,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data, struct drm_i915_private *i915 = to_i915(dev); struct drm_i915_gem_create_ext *args = data; struct create_ext ext_data = { .i915 = i915 }; + struct i915_address_space *vm = NULL; struct drm_i915_gem_object *obj; int ret; @@ -433,6 +450,12 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data, if (ret) return ret; + if (ext_data.vm_id) { + vm = i915_gem_vm_lookup(file->driver_priv, ext_data.vm_id); + if (unlikely(!vm)) + return -ENOENT; + } + if (!ext_data.n_placements) { ext_data.placements[0] = intel_memory_region_by_type(i915, INTEL_MEMORY_SYSTEM); @@ -459,8 +482,21 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data, ext_data.placements, ext_data.n_placements, ext_data.flags); - if (IS_ERR(obj)) - return PTR_ERR(obj); + if (IS_ERR(obj)) { + ret = PTR_ERR(obj); + goto vm_put; + } + + if (vm) { + obj->base.resv = vm->root_obj->base.resv; + obj->priv_root = i915_gem_object_get(vm->root_obj); + i915_vm_put(vm); + } return i915_gem_publish(obj, file, &args->size, &args->handle); +vm_put: + if (vm) + i915_vm_put(vm); + + return ret; } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c index f5062d0c6333..6433173c3e84 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c @@ -218,6 +218,12 @@ struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags) struct drm_i915_gem_object *obj = to_intel_bo(gem_obj); DEFINE_DMA_BUF_EXPORT_INFO(exp_info); + if (obj->priv_root) { + drm_dbg(obj->base.dev, + "Exporting VM private objects is not allowed\n"); + return ERR_PTR(-EINVAL); + } + exp_info.ops = &i915_dmabuf_ops; exp_info.size = gem_obj->size; exp_info.flags = flags; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index f85f10cf9c34..33d989a20227 100644 --- a/dr
[RFC v4 06/14] drm/i915/vm_bind: Handle persistent vmas
Treat VM_BIND vmas as persistent across execbuf ioctl calls and handle them during the request submission in the execbuff path. Support eviction by maintaining a list of evicted persistent vmas for rebinding during next submission. Signed-off-by: Niranjana Vishwanathapura Signed-off-by: Andi Shyti --- .../drm/i915/gem/i915_gem_vm_bind_object.c| 7 +++ drivers/gpu/drm/i915/gt/intel_gtt.c | 2 + drivers/gpu/drm/i915/gt/intel_gtt.h | 4 ++ drivers/gpu/drm/i915/i915_gem_gtt.c | 39 drivers/gpu/drm/i915/i915_gem_gtt.h | 3 ++ drivers/gpu/drm/i915/i915_vma.c | 46 +++ drivers/gpu/drm/i915/i915_vma.h | 45 +- drivers/gpu/drm/i915/i915_vma_types.h | 17 +++ 8 files changed, 151 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c index 7ca6a41fc981..236f901b8b9c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c @@ -91,6 +91,12 @@ static void i915_gem_vm_bind_remove(struct i915_vma *vma, bool release_obj) { lockdep_assert_held(&vma->vm->vm_bind_lock); + spin_lock(&vma->vm->vm_rebind_lock); + if (!list_empty(&vma->vm_rebind_link)) + list_del_init(&vma->vm_rebind_link); + i915_vma_set_purged(vma); + spin_unlock(&vma->vm->vm_rebind_lock); + list_del_init(&vma->vm_bind_link); list_del_init(&vma->non_priv_vm_bind_link); i915_vm_bind_it_remove(vma, &vma->vm->va); @@ -181,6 +187,7 @@ static struct i915_vma *vm_bind_get_vma(struct i915_address_space *vm, vma->start = va->start; vma->last = va->start + va->length - 1; + i915_vma_set_persistent(vma); return vma; } diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c index da4f9dee0397..6db31197fa87 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.c +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c @@ -296,6 +296,8 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass) INIT_LIST_HEAD(&vm->non_priv_vm_bind_list); vm->root_obj = i915_gem_object_create_internal(vm->i915, PAGE_SIZE); GEM_BUG_ON(IS_ERR(vm->root_obj)); + INIT_LIST_HEAD(&vm->vm_rebind_list); + spin_lock_init(&vm->vm_rebind_lock); } void *__px_vaddr(struct drm_i915_gem_object *p) diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h index 3f2e87d3bf34..b73d35b4e05d 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.h +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h @@ -273,6 +273,10 @@ struct i915_address_space { struct list_head vm_bind_list; /** @vm_bound_list: List of vm_binding completed */ struct list_head vm_bound_list; + /* @vm_rebind_list: list of vmas to be rebinded */ + struct list_head vm_rebind_list; + /* @vm_rebind_lock: protects vm_rebound_list */ + spinlock_t vm_rebind_lock; /* @va: tree of persistent vmas */ struct rb_root_cached va; struct list_head non_priv_vm_bind_list; diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 329ff75b80b9..b7d0844de561 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -25,6 +25,45 @@ #include "i915_trace.h" #include "i915_vgpu.h" +/** + * i915_vm_sync() - Wait until address space is not in use + * @vm: address space + * + * Waits until all requests using the address space are complete. + * + * Returns: 0 if success, -ve err code upon failure + */ +int i915_vm_sync(struct i915_address_space *vm) +{ + int ret; + + /* Wait for all requests under this vm to finish */ + ret = dma_resv_wait_timeout(vm->root_obj->base.resv, + DMA_RESV_USAGE_BOOKKEEP, false, + MAX_SCHEDULE_TIMEOUT); + if (ret < 0) + return ret; + else if (ret > 0) + return 0; + else + return -ETIMEDOUT; +} + +/** + * i915_vm_is_active() - Check if address space is being used + * @vm: address space + * + * Check if any request using the specified address space is + * active. + * + * Returns: true if address space is active, false otherwise. + */ +bool i915_vm_is_active(const struct i915_address_space *vm) +{ + return !dma_resv_test_signaled(vm->root_obj->base.resv, + DMA_RESV_USAGE_BOOKKEEP); +} + int i915_gem_gtt_prepare_pages(struct drm_i915_gem_object *obj, struct sg_table *pages) { diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index 8c2f57eb5dda..a5bbdc59d9df 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -51,4 +51,7 @@ int
[RFC v4 14/14] drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode
Add getparam support for VM_BIND capability version. Add VM creation time flag to enable vm_bind_mode for the VM. Signed-off-by: Niranjana Vishwanathapura Signed-off-by: Andi Shyti --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 5 - drivers/gpu/drm/i915/i915_getparam.c| 3 +++ include/uapi/drm/i915_drm.h | 24 - 3 files changed, 30 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index f4e648ec01ed..e0ebf47d4d57 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1808,7 +1808,7 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void *data, if (!HAS_FULL_PPGTT(i915)) return -ENODEV; - if (args->flags) + if (args->flags & I915_VM_CREATE_FLAGS_UNKNOWN) return -EINVAL; ppgtt = i915_ppgtt_create(to_gt(i915), 0); @@ -1828,6 +1828,9 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void *data, if (err) goto err_put; + if (args->flags & I915_VM_CREATE_FLAGS_USE_VM_BIND) + ppgtt->vm.vm_bind_mode = true; + GEM_BUG_ON(id == 0); /* reserved for invalid/unassigned ppgtt */ args->vm_id = id; return 0; diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c index 342c8ca6414e..56ff5118d017 100644 --- a/drivers/gpu/drm/i915/i915_getparam.c +++ b/drivers/gpu/drm/i915/i915_getparam.c @@ -175,6 +175,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data, case I915_PARAM_PERF_REVISION: value = i915_perf_ioctl_version(); break; + case I915_PARAM_VM_BIND_VERSION: + value = GRAPHICS_VER(i915) >= 12 ? 1 : 0; + break; default: DRM_DEBUG("Unknown parameter %d\n", param->param); return -EINVAL; diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index eaeb80a3ede1..df0fb875276c 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -755,6 +755,27 @@ typedef struct drm_i915_irq_wait { /* Query if the kernel supports the I915_USERPTR_PROBE flag. */ #define I915_PARAM_HAS_USERPTR_PROBE 56 +/* + * VM_BIND feature version supported. + * + * The following versions of VM_BIND have been defined: + * + * 0: No VM_BIND support. + * + * 1: In VM_UNBIND calls, the UMD must specify the exact mappings created + *previously with VM_BIND, the ioctl will not support unbinding multiple + *mappings or splitting them. Similarly, VM_BIND calls will not replace + *any existing mappings. + * + * 2: The restrictions on unbinding partial or multiple mappings is + *lifted, Similarly, binding will replace any mappings in the given range. + * + * See struct drm_i915_gem_vm_bind and struct drm_i915_gem_vm_unbind. + * + * vm_bind versions are backward compatible. + */ +#define I915_PARAM_VM_BIND_VERSION 57 + /* Must be kept compact -- no holes and well documented */ /** @@ -2625,7 +2646,8 @@ struct drm_i915_gem_vm_control { /** @extensions: Zero-terminated chain of extensions. */ __u64 extensions; - /** @flags: reserved for future usage, currently MBZ */ +#define I915_VM_CREATE_FLAGS_USE_VM_BIND (1u << 0) +#define I915_VM_CREATE_FLAGS_UNKNOWN (-(I915_VM_CREATE_FLAGS_USE_VM_BIND << 1)) __u32 flags; /** @vm_id: Id of the VM created or to be destroyed */ -- 2.21.0.rc0.32.g243a4c7e27
[RFC v4 12/14] drm/i915/vm_bind: userptr dma-resv changes
For persistent (vm_bind) vmas of userptr BOs, handle the user page pinning by using the i915_gem_object_userptr_submit_init() /done() functions Signed-off-by: Niranjana Vishwanathapura Signed-off-by: Andi Shyti --- .../gpu/drm/i915/gem/i915_gem_execbuffer3.c | 99 +++ drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 17 .../drm/i915/gem/i915_gem_vm_bind_object.c| 16 +++ drivers/gpu/drm/i915/gt/intel_gtt.c | 2 + drivers/gpu/drm/i915/gt/intel_gtt.h | 4 + drivers/gpu/drm/i915/i915_vma_types.h | 2 + 6 files changed, 140 insertions(+) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c index 82a068d03440..7467e3daac5c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c @@ -22,6 +22,7 @@ #include "i915_gem_vm_bind.h" #include "i915_trace.h" +#define __EXEC3_USERPTR_USED BIT_ULL(34) #define __EXEC3_HAS_PINBIT_ULL(33) #define __EXEC3_ENGINE_PINNED BIT_ULL(32) #define __EXEC3_INTERNAL_FLAGS (~0ull << 32) @@ -144,6 +145,21 @@ static void eb_scoop_unbound_vma_all(struct i915_address_space *vm) { struct i915_vma *vma, *vn; +#ifdef CONFIG_MMU_NOTIFIER + /** +* Move all invalidated userptr vmas back into vm_bind_list so that +* they are looked up and revalidated. +*/ + spin_lock(&vm->userptr_invalidated_lock); + list_for_each_entry_safe(vma, vn, &vm->userptr_invalidated_list, +userptr_invalidated_link) { + list_del_init(&vma->userptr_invalidated_link); + if (!list_empty(&vma->vm_bind_link)) + list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list); + } + spin_unlock(&vm->userptr_invalidated_lock); +#endif + /** * Move all unbound vmas back into vm_bind_list so that they are * revalidated. @@ -157,10 +173,47 @@ static void eb_scoop_unbound_vma_all(struct i915_address_space *vm) spin_unlock(&vm->vm_rebind_lock); } +static int eb_lookup_persistent_userptr_vmas(struct i915_execbuffer *eb) +{ + struct i915_address_space *vm = eb->context->vm; + struct i915_vma *last_vma = NULL; + struct i915_vma *vma; + int err; + + lockdep_assert_held(&vm->vm_bind_lock); + + list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link) { + if (!i915_gem_object_is_userptr(vma->obj)) + continue; + + err = i915_gem_object_userptr_submit_init(vma->obj); + if (err) + return err; + + /** +* The above submit_init() call does the object unbind and +* hence adds vma into vm_rebind_list. Remove it from that +* list as it is already scooped for revalidation. +*/ + spin_lock(&vm->vm_rebind_lock); + if (!list_empty(&vma->vm_rebind_link)) + list_del_init(&vma->vm_rebind_link); + spin_unlock(&vm->vm_rebind_lock); + + last_vma = vma; + } + + if (last_vma) + eb->args->flags |= __EXEC3_USERPTR_USED; + + return 0; +} + static int eb_lookup_vma_all(struct i915_execbuffer *eb) { unsigned int i, current_batch = 0; struct i915_vma *vma; + int err = 0; for (i = 0; i < eb->num_batches; i++) { vma = eb_find_vma(eb->context->vm, eb->batch_addresses[i]); @@ -173,6 +226,10 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb) eb_scoop_unbound_vma_all(eb->context->vm); + err = eb_lookup_persistent_userptr_vmas(eb); + if (err) + return err; + return 0; } @@ -330,15 +387,57 @@ static void eb_move_all_persistent_vma_to_active(struct i915_execbuffer *eb) static int eb_move_to_gpu(struct i915_execbuffer *eb) { + int err = 0; + lockdep_assert_held(&eb->context->vm->vm_bind_lock); assert_object_held(eb->context->vm->root_obj); eb_move_all_persistent_vma_to_active(eb); +#ifdef CONFIG_MMU_NOTIFIER + /* Check for further userptr invalidations */ + spin_lock(&eb->context->vm->userptr_invalidated_lock); + if (!list_empty(&eb->context->vm->userptr_invalidated_list)) + err = -EAGAIN; + spin_unlock(&eb->context->vm->userptr_invalidated_lock); + + if (!err && (eb->args->flags & __EXEC3_USERPTR_USED)) { + struct i915_vma *vma; + + lockdep_assert_held(&eb->context->vm->vm_bind_lock); + assert_object_held(eb->context->vm->root_obj); + + read_lock(&eb->i915->mm.notifier_lock); + list_for_each_entry(vma, &eb->context->vm->vm_bind_list, + vm_bind_link) { + if (
Re: [Intel-gfx] [RFC PATCH v3 10/17] drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl
On Mon, Sep 05, 2022 at 04:08:57PM +0100, Tvrtko Ursulin wrote: On 02/09/2022 06:41, Niranjana Vishwanathapura wrote: On Thu, Sep 01, 2022 at 08:58:57AM +0100, Tvrtko Ursulin wrote: On 01/09/2022 06:09, Niranjana Vishwanathapura wrote: On Wed, Aug 31, 2022 at 08:38:48AM +0100, Tvrtko Ursulin wrote: On 27/08/2022 20:43, Andi Shyti wrote: From: Niranjana Vishwanathapura Implement new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only works in vm_bind mode. The vm_bind mode only works with this new execbuf3 ioctl. The new execbuf3 ioctl will not have any list of objects to validate bind as all required objects binding would have been requested by the userspace before submitting the execbuf3. And the legacy support like relocations etc are removed. Signed-off-by: Niranjana Vishwanathapura Signed-off-by: Ramalingam C Signed-off-by: Andi Shyti --- [snip] +static void signal_fence_array(const struct i915_execbuffer *eb, + struct dma_fence * const fence) +{ + unsigned int n; + + for (n = 0; n < eb->num_fences; n++) { + struct drm_syncobj *syncobj; + unsigned int flags; + + syncobj = ptr_unpack_bits(eb->fences[n].syncobj, &flags, 2); + if (!(flags & I915_TIMELINE_FENCE_SIGNAL)) + continue; + + if (eb->fences[n].chain_fence) { + drm_syncobj_add_point(syncobj, + eb->fences[n].chain_fence, + fence, + eb->fences[n].value); + /* + * The chain's ownership is transferred to the + * timeline. + */ + eb->fences[n].chain_fence = NULL; + } else { + drm_syncobj_replace_fence(syncobj, fence); + } + } +} Semi-random place to ask - how many of the code here is direct copy of existing functions from i915_gem_execbuffer.c? There seems to be some 100% copies at least. And then some more with small tweaks. Spend some time and try to figure out some code sharing? During VM_BIND design review, maintainers expressed thought on keeping execbuf3 completely separate and not touch the legacy execbuf path. Got a link so this maintainer can see what exactly was said? Just to make sure there isn't any misunderstanding on what "completely separate" means to different people. Here is one (search for copypaste/copy-paste) https://patchwork.freedesktop.org/patch/486608/?series=93447&rev=3 It is hard to search for old discussion threads. May be maintainers can provide feedback here directly. Dave, Daniel? :) Thanks. I had a read and don't see a fundamental conflict with what I said. Conclusion seemed to be to go with a new ioctl and implement code sharing where it makes sense. Which is what TODO in the cover letter acknowledges so there should be no disagreement really. I also think, execbuf3 should be fully separate. We can do some code sharing where is a close 100% copy (there is a TODO in cover letter). There are some changes like the timeline fence array handling here which looks similar, but the uapi is not exactly the same. Probably, we should keep them separate and not try to force code sharing at least at this point. Okay did not spot that TODO in the cover. But fair since it is RFC to be unfinished. I do however think it should be improved before considering the merge. Because looking at the patch, 100% copies are: for_each_batch_create_order for_each_batch_add_order eb_throttle eb_pin_timeline eb_pin_engine eb_put_engine __free_fence_array put_fence_array await_fence_array signal_fence_array retire_requests eb_request_add eb_requests_get eb_requests_put eb_find_context Quite a lot. Then there is a bunch of almost same functions which could be shared if there weren't two incompatible local struct i915_execbuffer's. Especially given when the out fence TODO item gets handled a chunk more will also become a 100% copy. There are difinitely a few which is 100% copies hence should have a shared code. But some are not. Like, fence_array stuff though looks very similar, the uapi structures are different between execbuf3 and legacy execbuf. The internal flags are also different (eg., __EXEC3_ENGINE_PINNED vs __EXEC_ENGINE_PINNED) which causes minor differences hence not a 100% copy. So, I am not convinced if it is worth carrying legacy stuff into execbuf3 code. I think we need to look at these on a case by case basis and see if abstracting common functionality to a separate shared code makes sense or it is better to keep the code separate. No one is suggesting to carry any legacy stuff into eb3. What I'd suggest is to start something like i915_gem_eb_common.h|c and stuff the 100% copies from the above list in there. Common struct eb with struct eb2 and eb3 inheriting from it should do the trick. Similarly eb->flags shouldn't be a hard problem to solve. Then you see what remains and whether it makes sense to consolidate further. Tvrtko,
Re: [RFC PATCH v3 04/17] drm/i915: Implement bind and unbind of object
On Mon, Sep 12, 2022 at 04:11:54PM +0300, Jani Nikula wrote: On Sat, 27 Aug 2022, Andi Shyti wrote: From: Niranjana Vishwanathapura Implement the bind and unbind of an object at the specified GPU virtual addresses. Signed-off-by: Niranjana Vishwanathapura Signed-off-by: Prathap Kumar Valsan Signed-off-by: Ramalingam C Signed-off-by: Andi Shyti --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h | 21 ++ .../drm/i915/gem/i915_gem_vm_bind_object.c| 322 ++ drivers/gpu/drm/i915/gt/intel_gtt.c | 10 + drivers/gpu/drm/i915/gt/intel_gtt.h | 9 + drivers/gpu/drm/i915/i915_driver.c| 1 + drivers/gpu/drm/i915/i915_vma.c | 3 +- drivers/gpu/drm/i915/i915_vma.h | 2 - drivers/gpu/drm/i915/i915_vma_types.h | 14 + include/uapi/drm/i915_drm.h | 163 + 10 files changed, 543 insertions(+), 3 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 522ef9b4aff32..4e1627e96c6e0 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -165,6 +165,7 @@ gem-y += \ gem/i915_gem_ttm_move.o \ gem/i915_gem_ttm_pm.o \ gem/i915_gem_userptr.o \ + gem/i915_gem_vm_bind_object.o \ gem/i915_gem_wait.o \ gem/i915_gemfs.o i915-y += \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h new file mode 100644 index 0..ebc493b7dafc1 --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2022 Intel Corporation + */ + +#ifndef __I915_GEM_VM_BIND_H +#define __I915_GEM_VM_BIND_H + +#include "i915_drv.h" Please only include what you need. Here, you'll only need to include for u64 and bool. For everything else, add forward declarations. Jani, Thanks. I have posted vm_bind v4 rfc series with your comments addressed. Regards, Niranjana + +struct i915_vma * +i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va); +void i915_gem_vm_bind_remove(struct i915_vma *vma, bool release_obj); + +int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data, +struct drm_file *file); + +void i915_gem_vm_unbind_vma_all(struct i915_address_space *vm); Blank line here please. +#endif /* __I915_GEM_VM_BIND_H */ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c new file mode 100644 index 0..dadd1d4b1761b --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c @@ -0,0 +1,322 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2022 Intel Corporation + */ + +#include + +#include "gem/i915_gem_vm_bind.h" +#include "gem/i915_gem_context.h" +#include "gt/gen8_engine_cs.h" + +#include "i915_drv.h" +#include "i915_gem_gtt.h" + +#define START(node) ((node)->start) +#define LAST(node) ((node)->last) + +INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last, +START, LAST, static inline, i915_vm_bind_it) + +#undef START +#undef LAST + +/** + * DOC: VM_BIND/UNBIND ioctls + * + * DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM buffer + * objects (BOs) or sections of a BOs at specified GPU virtual addresses on a + * specified address space (VM). Multiple mappings can map to the same physical + * pages of an object (aliasing). These mappings (also referred to as persistent + * mappings) will be persistent across multiple GPU submissions (execbuf calls) + * issued by the UMD, without user having to provide a list of all required + * mappings during each submission (as required by older execbuf mode). + * + * The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for + * signaling the completion of bind/unbind operation. + * + * VM_BIND feature is advertised to user via I915_PARAM_VM_BIND_VERSION. + * User has to opt-in for VM_BIND mode of binding for an address space (VM) + * during VM creation time via I915_VM_CREATE_FLAGS_USE_VM_BIND extension. + * + * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently + * are not ordered. Furthermore, parts of the VM_BIND/UNBIND operations can be + * done asynchronously, when valid out fence is specified. + * + * VM_BIND locking order is as below. + * + * 1) vm_bind_lock mutex will protect vm_bind lists. This lock is taken in + *vm_bind/vm_unbind ioctl calls, in the execbuf path and while releasing the + *mapping. + * + *In future, when GPU page faults are supported, we can potentially use a + *rwsem instead, so that
Re: [Intel-gfx] [RFC PATCH v3 07/17] drm/i915/vm_bind: Handle persistent vmas
On Mon, Sep 12, 2022 at 04:16:06PM +0300, Jani Nikula wrote: On Sat, 27 Aug 2022, Andi Shyti wrote: From: Niranjana Vishwanathapura Treat VM_BIND vmas as persistent across execbuf ioctl calls and handle them during the request submission in the execbuff path. Support eviction by maintaining a list of evicted persistent vmas for rebinding during next submission. Signed-off-by: Niranjana Vishwanathapura Signed-off-by: Ramalingam C Signed-off-by: Andi Shyti --- drivers/gpu/drm/i915/gem/i915_gem_object.c| 1 + .../drm/i915/gem/i915_gem_vm_bind_object.c| 8 +++ drivers/gpu/drm/i915/gt/intel_gtt.c | 2 + drivers/gpu/drm/i915/gt/intel_gtt.h | 4 ++ drivers/gpu/drm/i915/i915_gem_gtt.c | 38 + drivers/gpu/drm/i915/i915_gem_gtt.h | 3 + drivers/gpu/drm/i915/i915_vma.c | 50 +++-- drivers/gpu/drm/i915/i915_vma.h | 56 +++ drivers/gpu/drm/i915/i915_vma_types.h | 24 9 files changed, 169 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 389e9f157ca5e..825dce41f7113 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -38,6 +38,7 @@ #include "i915_gem_mman.h" #include "i915_gem_object.h" #include "i915_gem_ttm.h" +#include "i915_gem_vm_bind.h" Why do you add this here if you're not using anything from there? Addressed in v4 rfc series. #include "i915_memcpy.h" #include "i915_trace.h" diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c index 9ff929f187cfd..3b45529fe8d4c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c @@ -91,6 +91,13 @@ void i915_gem_vm_bind_remove(struct i915_vma *vma, bool release_obj) { lockdep_assert_held(&vma->vm->vm_bind_lock); + spin_lock(&vma->vm->vm_rebind_lock); + if (!list_empty(&vma->vm_rebind_link)) + list_del_init(&vma->vm_rebind_link); + i915_vma_set_purged(vma); + i915_vma_set_freed(vma); + spin_unlock(&vma->vm->vm_rebind_lock); + if (!list_empty(&vma->vm_bind_link)) { list_del_init(&vma->vm_bind_link); list_del_init(&vma->non_priv_vm_bind_link); @@ -190,6 +197,7 @@ static struct i915_vma *vm_bind_get_vma(struct i915_address_space *vm, vma->start = va->start; vma->last = va->start + va->length - 1; + i915_vma_set_persistent(vma); return vma; } diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c index c4f75826213ae..97cd0089b516d 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.c +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c @@ -296,6 +296,8 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass) INIT_LIST_HEAD(&vm->non_priv_vm_bind_list); vm->root_obj = i915_gem_object_create_internal(vm->i915, PAGE_SIZE); GEM_BUG_ON(IS_ERR(vm->root_obj)); + INIT_LIST_HEAD(&vm->vm_rebind_list); + spin_lock_init(&vm->vm_rebind_lock); } void *__px_vaddr(struct drm_i915_gem_object *p) diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h index 9a2665e4ec2e5..1f3b1967ec175 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.h +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h @@ -265,6 +265,10 @@ struct i915_address_space { struct list_head vm_bind_list; /** @vm_bound_list: List of vm_binding completed */ struct list_head vm_bound_list; + /* @vm_rebind_list: list of vmas to be rebinded */ + struct list_head vm_rebind_list; + /* @vm_rebind_lock: protects vm_rebound_list */ + spinlock_t vm_rebind_lock; /* @va: tree of persistent vmas */ struct rb_root_cached va; struct list_head non_priv_vm_bind_list; diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 329ff75b80b97..f083724163deb 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -25,6 +25,44 @@ #include "i915_trace.h" #include "i915_vgpu.h" +/** + * i915_vm_sync() - Wait for all requests on private vmas of a vm to be completed + * @vm: address space we need to wait for idle + * + * Waits till all requests of the vm_binded private objs are completed. + * + * Returns: 0 on success -ve errcode on failure + */ +int i915_vm_sync(struct i915_address_space *vm) +{ + int ret; + + /* Wait for all requests under this vm to finish */ + ret = dma_resv_wait_timeout(vm->root_obj->base.resv, + DMA_RESV_USAGE_BOOKKEEP, false, + MAX_SCHEDULE_TIMEOUT); + if (ret < 0) + return ret; + else if (ret > 0) + return 0; + e
Re: [PATCH V2 1/2] dt-bindings: display: panel: Add Samsung AMS495QA01 bindings
On 20/09/2022 19:09, Chris Morgan wrote: > From: Chris Morgan > > Add documentation for the Samsung AMS495QA01 panel. > > Signed-off-by: Chris Morgan > --- > .../display/panel/samsung,ams495qa01.yaml | 46 +++ > 1 file changed, 46 insertions(+) > create mode 100644 > Documentation/devicetree/bindings/display/panel/samsung,ams495qa01.yaml > > diff --git > a/Documentation/devicetree/bindings/display/panel/samsung,ams495qa01.yaml > b/Documentation/devicetree/bindings/display/panel/samsung,ams495qa01.yaml > new file mode 100644 > index ..08358cdad19c > --- /dev/null > +++ b/Documentation/devicetree/bindings/display/panel/samsung,ams495qa01.yaml > @@ -0,0 +1,46 @@ > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) > +%YAML 1.2 > +--- > +$id: http://devicetree.org/schemas/display/panel/samsung,ams495qa01.yaml# > +$schema: http://devicetree.org/meta-schemas/core.yaml# > + > +title: Samsung AMS495QA01 4.95in 960x544 DSI/SPI panel > + > +maintainers: > + - Chris Morgan > + > +allOf: > + - $ref: panel-common.yaml# > + > +properties: > + compatible: > +const: samsung,ams495qa01 Blank line. > + reg: true > + reset-gpios: true > + elvdd-supply: > +description: regulator that supplies voltage to the panel display > + enable-gpios: true > + port: true > + vdd-supply: > +description: regulator that supplies voltage to panel logic Best regards, Krzysztof
Re: [PATCH] drm/amdgpu: initialize r variable into amdgpu_cs_submit function
Hi Christian, On Tue, Sep 20, 2022 at 02:23:58PM +0200, Christian König wrote: > Am 20.09.22 um 14:22 schrieb Tommaso Merciai: > > The builds of arm64 allmodconfig with clang failed to build > > next-20220920 with the following result: > > > > 1190:3: error: variable 'r' is uninitialized when used here > > [-Werror,-Wuninitialized] > > note: initialize the variable 'r' to silence this warning > > > > This fix compilation error > > I've already send a patch to fix this to the mailing list 7 Minutes ago :) > > Please review or ack that one. Sorry, my bad. Don't see your patch :) Cheers, Tommaso > > Thanks, > Christian. > > > > > Signed-off-by: Tommaso Merciai > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > > index 58088c663125..efa3dc9b69fd 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > > @@ -1168,7 +1168,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser > > *p, > > struct amdgpu_bo_list_entry *e; > > struct amdgpu_job *job; > > uint64_t seq; > > - int r; > > + int r = 0; > > job = p->job; > > p->job = NULL; > -- Tommaso Merciai Embedded Linux Engineer tommaso.merc...@amarulasolutions.com __ Amarula Solutions SRL Via Le Canevare 30, 31100 Treviso, Veneto, IT T. +39 042 243 5310 i...@amarulasolutions.com www.amarulasolutions.com
[PATCH] drm/amdgpu: initialize r variable into amdgpu_cs_submit function
The builds of arm64 allmodconfig with clang failed to build next-20220920 with the following result: 1190:3: error: variable 'r' is uninitialized when used here [-Werror,-Wuninitialized] note: initialize the variable 'r' to silence this warning This fix compilation error Signed-off-by: Tommaso Merciai --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 58088c663125..efa3dc9b69fd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1168,7 +1168,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, struct amdgpu_bo_list_entry *e; struct amdgpu_job *job; uint64_t seq; - int r; + int r = 0; job = p->job; p->job = NULL; -- 2.25.1
Re: [PATCH v3 0/8] Support for NVDEC on Tegra234
On 20/09/2022 10:11, Mikko Perttunen wrote: > From: Mikko Perttunen > > v3: > * Updated patch 3 based on comments > > v2: > * Updated patches 1,3 based on comments > * Added Acked-by to patch 2 > > Original message: > > Hi all, > > this series adds support for the HW video decoder, NVDEC, > on Tegra234 (Orin). The main change is a switch from Falcon > to RISC-V for the internal microcontroller, which brings along > a change in how the engine is booted. Otherwise it is backwards > compatible with earlier versions. I asked you to describe the dependencies and patch merging strategy. It's still not here, so I assume there are no and I am taking patches relevant to me. Best regards, Krzysztof
Re: [PATCH v3 1/8] memory: tegra: Add API for retrieving carveout bounds
On 20/09/2022 10:11, Mikko Perttunen wrote: > From: Mikko Perttunen > > On Tegra234 NVDEC firmware is loaded from a secure carveout, where it > has been loaded by a bootloader. When booting NVDEC, we need to tell it > the address of this firmware, which we can determine by checking the > starting address of the carveout. As such, add an MC API to query the > bounds of carveouts, and add related information on Tegra234. > > Signed-off-by: Mikko Perttunen > --- > v2: > - Add check for 64-bit phys_addr_t. In practice phys_addr_t > is always 64 bits where this runs, but it avoids warnings in > compile test. > --- > drivers/memory/tegra/mc.c | 25 + > drivers/memory/tegra/tegra234.c | 5 + > include/soc/tegra/mc.h | 11 +++ I still need here Thierry's ack. Best regards, Krzysztof
Re: [PATCH v3 0/8] Support for NVDEC on Tegra234
On 9/21/22 10:26, Krzysztof Kozlowski wrote: On 20/09/2022 10:11, Mikko Perttunen wrote: From: Mikko Perttunen v3: * Updated patch 3 based on comments v2: * Updated patches 1,3 based on comments * Added Acked-by to patch 2 Original message: Hi all, this series adds support for the HW video decoder, NVDEC, on Tegra234 (Orin). The main change is a switch from Falcon to RISC-V for the internal microcontroller, which brings along a change in how the engine is booted. Otherwise it is backwards compatible with earlier versions. I asked you to describe the dependencies and patch merging strategy. It's still not here, so I assume there are no and I am taking patches relevant to me. Best regards, Krzysztof Sorry, I described it in the earlier email and forgot to add it to the cover letter.. Patch 8 does depend on patch 1 so it would be better to take the memory patch with it, or however works best from maintainership point of view (not my expertise). thanks, Mikko
Re: [PATCH] drm/i915: Do not cleanup obj with NULL bo->resource
Am 20.09.22 um 19:13 schrieb Matthew Auld: On 20/09/2022 18:06, Nirmoy Das wrote: For delayed BO release i915_ttm_delete_mem_notify() gets called twice, once with proper bo->resource and another time with NULL. We shouldn't do anything for the 2nd time as we already cleanedup the obj once. References: https://gitlab.freedesktop.org/drm/intel/-/issues/6850 Signed-off-by: Nirmoy Das Reviewed-by: Matthew Auld Christian, as per above it looks like ttm calls into the delete_mem_notify() hook twice if the object ends up on the delayed destroy list, is that expected/normal? Yeah, that's expected. IIRC some driver depended on this for some reason. I already wanted to change this behavior, but forgot to do so after the patch set which made bo->resource a pointer landed. Going to look into it once more. Thanks, Christian. --- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index 0544b0a4a43a..e3fc38dd5db0 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -511,7 +511,7 @@ static void i915_ttm_delete_mem_notify(struct ttm_buffer_object *bo) struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo); intel_wakeref_t wakeref = 0; - if (likely(obj)) { + if (bo->resource && likely(obj)) { /* ttm_bo_release() already has dma_resv_lock */ if (i915_ttm_cpu_maps_iomem(bo->resource)) wakeref = intel_runtime_pm_get(&to_i915(obj->base.dev)->runtime_pm);
Re: [PATCH v3 0/8] Support for NVDEC on Tegra234
On 21/09/2022 09:50, Mikko Perttunen wrote: > On 9/21/22 10:26, Krzysztof Kozlowski wrote: >> On 20/09/2022 10:11, Mikko Perttunen wrote: >>> From: Mikko Perttunen >>> >>> v3: >>> * Updated patch 3 based on comments >>> >>> v2: >>> * Updated patches 1,3 based on comments >>> * Added Acked-by to patch 2 >>> >>> Original message: >>> >>> Hi all, >>> >>> this series adds support for the HW video decoder, NVDEC, >>> on Tegra234 (Orin). The main change is a switch from Falcon >>> to RISC-V for the internal microcontroller, which brings along >>> a change in how the engine is booted. Otherwise it is backwards >>> compatible with earlier versions. >> >> I asked you to describe the dependencies and patch merging strategy. >> It's still not here, so I assume there are no and I am taking patches >> relevant to me. >> >> Best regards, >> Krzysztof > > Sorry, I described it in the earlier email and forgot to add it to the > cover letter.. Please keep it in cover letter. We all get too many emails and too many patchsets to remember. Plus, things can change and such dependency can disappear after some versions. Best regards, Krzysztof
Re: [PATCH 4/5] drm/damage-helper: Do partial updates if framebuffer has not been changed
Hi Ville Am 20.09.22 um 16:31 schrieb Ville Syrjälä: On Tue, Sep 20, 2022 at 03:56:18PM +0200, Thomas Zimmermann wrote: Set partial updates on a plane if the framebuffer has not been changed on an atomic commit. If such a plane has damage clips, the driver will use them; otherwise the update is effectively empty. Planes that change their framebuffer still perform a full update. I have a feeling you're sort of papering over some inefficient userspace that's asking updates on planes that don't actually need them. I'm not sure adding more code for that is a particularly great idea. Wouldn't it be better to just fix the userspace code? Some more context might be in order: The ast driver currently uses VRAM helpers, which leads to many problems; including blank screens from the low amount of video memory. The best solution is to switch SHMEM with BOs on system meory. The video memory is only the internal buffer for scanout. This update involves a mempcy from the BO to video memory. Ast's hardware is really slow, so it makes sense to reduce the updates to video memory to a minimum. There's support for cursor planes, which really makes a difference here. But doing any cursor-plane update (e.g., moving, animation) with SHMEM and the current damage helpers always results in a full-screen memcpy from the BO to the video memory for the primary plane. As the ast hardware is slow, performance goes down to a an estimated 5 frame per seconds. After moving the mouse, one can watch the mouse cursor follow along the screen for the next seconds. Userspace sends the movement information and DRM slowly processes them. The same can be observed for cursor animation. The problem is that each change to the cursor plane results in an atomic_update call for the primary plane. Not having damage information for the plane just means 'update everything'. Not a problem if redrawing consists of reprogramming the scanout address. For a memcpy it's not feasible. So can this be fixed in userspace? No realistically IMHO. I've seen this problem on Weston, Wayland-Gnome and X11-Gnome. And they are all problematic in their own way. (That's why there are multiple patches needed.) For example, X11 uses the legacy mouse ioctl, which one of the patches fixes. The other userspace needs the other heuristics. A potential userspace fix would mean to always set empty-damage information on all planes that don't get an update. And I don't consider X11 fixable TBH. Any property change on the plane could need a full plane update as well (eg. some color mangement stuff etc.). So you'd have to keep adding exceptions to that list whenever new properties are added. It's not about the occasional change of a property. It's the constant sluggish redraw when the interface is supposed to be snappy, such as mouse interaction. And I think the semantics of the ioctl(s) has so far been that basically any page flip (whether or not it actually changes the fb) still ends up doing whatever flushing is needed to guarantee the fb contents are up to date on the screen (if someone sneaked in some front buffer rendering in between). Ie. you could even use basically a nop page flip in place of dirtyfb. That's why full updates are still the default. Only in cases where it's save to avoid an update of unaffected planes, we do so. I know that we don't give performance guarantees to userspace. But using cursor/overlay planes should be faster then not using them. I think that's a reasonable assumption for userspace to make. Another thing the ioctls have always done is actually perform the commit whether anything changed or nor. That is, they still do all the same the same vblanky stuff and the commit takes the same amount of time. Not sure if your idea is to potentially short circuit the entire thing and make it take no time at all? The patches don't change the overall commit logics. All they do is to set damage updates to a size of 0 if a plane reliably does not need an update. A driver's atomic_update still runs, but the damage iterator does not return any damaged areas. Best regards Thomas -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH 04/18] phy: mediatek: ufs: use common register access helpers
Il 20/09/22 11:00, Chunfeng Yun ha scritto: No need define private register access helpers, use common ones defined in phy-mtk-io.h Signed-off-by: Chunfeng Yun Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH 05/18] phy: mediatek: pcie: use new helper to update register bits
Il 20/09/22 11:00, Chunfeng Yun ha scritto: The new helper will use FIELD_PREP() macro to prepare bits value according to mask, then we no need do it anymore. Signed-off-by: Chunfeng Yun Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH 08/18] phy: mediatek: hdmi: mt2701: use common helper to access registers
Il 20/09/22 11:00, Chunfeng Yun ha scritto: Use MediaTek phy's common helper to access registers, then we can remove hdmi's I/O helpers. Signed-off-by: Chunfeng Yun Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH 13/18] phy: mediatek: mipi: mt8173: use GENMASK to generate bits mask
Il 20/09/22 11:00, Chunfeng Yun ha scritto: Use GENMASK() macro to generate bits mask Signed-off-by: Chunfeng Yun Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH 06/18] phy: mediatek: hdmi: mt2701: use GENMASK and BIT to generate mask and bits
Il 20/09/22 11:00, Chunfeng Yun ha scritto: Use GENMASK() and BIT() macros to generate mask and bits Signed-off-by: Chunfeng Yun Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH 07/18] phy: mediatek: hdmi: mt2701: use FIELD_PREP to prepare bits field
Il 20/09/22 11:00, Chunfeng Yun ha scritto: Use FIELD_PREP() macro to prepare bits field value, then no need define macros of bits offset. Signed-off-by: Chunfeng Yun Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH 12/18] phy: mediatek: hdmi: remove register access helpers
Il 20/09/22 11:00, Chunfeng Yun ha scritto: Remove private register access helpers, use the common ones instead. Signed-off-by: Chunfeng Yun Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH 11/18] phy: mediatek: hdmi: mt8173: use common helper to access registers
Il 20/09/22 11:00, Chunfeng Yun ha scritto: Use MediaTek phy's common helper to access registers, then we can remove hdmi's I/O helpers. Signed-off-by: Chunfeng Yun Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH 09/18] phy: mediatek: hdmi: mt8173: use GENMASK to generate bits mask
Il 20/09/22 11:00, Chunfeng Yun ha scritto: Use GENMASK() macro to generate bits mask Signed-off-by: Chunfeng Yun Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH 10/18] phy: mediatek: hdmi: mt8173: use FIELD_PREP to prepare bits field
Il 20/09/22 11:00, Chunfeng Yun ha scritto: Use FIELD_PREP() macro to prepare bits field value, then no need define macros of bits offset. Signed-off-by: Chunfeng Yun Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH 15/18] phy: mediatek: mipi: mt8173: use common helper to access registers
Il 20/09/22 11:00, Chunfeng Yun ha scritto: Use MediaTek phy's common helper to access registers, then we can remove mipi-dsi's I/O helpers. Signed-off-by: Chunfeng Yun Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH 14/18] phy: mediatek: mipi: mt8173: use FIELD_PREP to prepare bits field
Il 20/09/22 11:00, Chunfeng Yun ha scritto: Use FIELD_PREP() macro to prepare bits field value, then no need define macros of bits offset. Signed-off-by: Chunfeng Yun Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH 17/18] phy: mediatek: mipi: mt8183: use common helper to access registers
Il 20/09/22 11:00, Chunfeng Yun ha scritto: Use MediaTek phy's common helper to access registers, then we can remove mipi-dsi's I/O helpers. Signed-off-by: Chunfeng Yun Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH 16/18] phy: mediatek: mipi: mt8183: use GENMASK to generate bits mask
Il 20/09/22 11:00, Chunfeng Yun ha scritto: Use GENMASK() macro to generate bits mask Signed-off-by: Chunfeng Yun Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH 18/18] phy: mediatek: mipi: remove register access helpers
Il 20/09/22 11:00, Chunfeng Yun ha scritto: Remove private register access helpers, use the common ones instead. Signed-off-by: Chunfeng Yun Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH 01/18] phy: mediatek: add a new helper to update bitfield
Il 20/09/22 11:00, Chunfeng Yun ha scritto: Due to FIELD_PREP() macro can be used to prepare a bitfield value, local ones can be remove; add the new helper to make bitfield update easier. Signed-off-by: Chunfeng Yun --- drivers/phy/mediatek/phy-mtk-io.h | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/phy/mediatek/phy-mtk-io.h b/drivers/phy/mediatek/phy-mtk-io.h index 500fcdab165d..a723d4afc9b4 100644 --- a/drivers/phy/mediatek/phy-mtk-io.h +++ b/drivers/phy/mediatek/phy-mtk-io.h @@ -8,6 +8,7 @@ #ifndef __PHY_MTK_H__ #define __PHY_MTK_H__ +#include #include static inline void mtk_phy_clear_bits(void __iomem *reg, u32 bits) @@ -35,4 +36,10 @@ static inline void mtk_phy_update_bits(void __iomem *reg, u32 mask, u32 val) writel(tmp, reg); } +/* field @mask should be constant and continuous */ "Field @mask shall be [...]" ^ +static inline void mtk_phy_update_field(void __iomem *reg, u32 mask, u32 val) ...so, (void __iomem *reg, const u32 mask, u32 val) +{ + mtk_phy_update_bits(reg, mask, FIELD_PREP(mask, val)); +} + #endif
Re: [Intel-gfx] [PATCH] drm/i915: Do not cleanup obj with NULL bo->resource
Hi Matt On 9/20/2022 7:06 PM, Nirmoy Das wrote: For delayed BO release i915_ttm_delete_mem_notify() gets called twice, once with proper bo->resource and another time with NULL. We shouldn't do anything for the 2nd time as we already cleanedup the obj once. References: https://gitlab.freedesktop.org/drm/intel/-/issues/6850 Please add the below Fixes before merging, I missed that. Fixes: ad74457a6b5a96 ("drm/i915/dgfx: Release mmap on rpm suspend") Thanks, Nirmoy Signed-off-by: Nirmoy Das --- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index 0544b0a4a43a..e3fc38dd5db0 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -511,7 +511,7 @@ static void i915_ttm_delete_mem_notify(struct ttm_buffer_object *bo) struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo); intel_wakeref_t wakeref = 0; - if (likely(obj)) { + if (bo->resource && likely(obj)) { /* ttm_bo_release() already has dma_resv_lock */ if (i915_ttm_cpu_maps_iomem(bo->resource)) wakeref = intel_runtime_pm_get(&to_i915(obj->base.dev)->runtime_pm);
Re: [PATCH 4/5] drm/damage-helper: Do partial updates if framebuffer has not been changed
On Wed, Sep 21, 2022 at 09:59:10AM +0200, Thomas Zimmermann wrote: > Hi Ville > > Am 20.09.22 um 16:31 schrieb Ville Syrjälä: > > On Tue, Sep 20, 2022 at 03:56:18PM +0200, Thomas Zimmermann wrote: > >> Set partial updates on a plane if the framebuffer has not been changed > >> on an atomic commit. If such a plane has damage clips, the driver will > >> use them; otherwise the update is effectively empty. Planes that change > >> their framebuffer still perform a full update. > > > > I have a feeling you're sort of papering over some inefficient > > userspace that's asking updates on planes that don't actually > > need them. I'm not sure adding more code for that is a particularly > > great idea. Wouldn't it be better to just fix the userspace code? > > Some more context might be in order: > > The ast driver currently uses VRAM helpers, which leads to many > problems; including blank screens from the low amount of video memory. > The best solution is to switch SHMEM with BOs on system meory. The video > memory is only the internal buffer for scanout. This update involves a > mempcy from the BO to video memory. > > Ast's hardware is really slow, so it makes sense to reduce the updates > to video memory to a minimum. There's support for cursor planes, which > really makes a difference here. > > But doing any cursor-plane update (e.g., moving, animation) with SHMEM > and the current damage helpers always results in a full-screen memcpy > from the BO to the video memory for the primary plane. As the ast > hardware is slow, performance goes down to a an estimated 5 frame per > seconds. After moving the mouse, one can watch the mouse cursor follow > along the screen for the next seconds. Userspace sends the movement > information and DRM slowly processes them. The same can be observed for > cursor animation. > > The problem is that each change to the cursor plane results in an > atomic_update call for the primary plane. Not having damage information > for the plane just means 'update everything'. Not a problem if redrawing > consists of reprogramming the scanout address. For a memcpy it's not > feasible. > > So can this be fixed in userspace? No realistically IMHO. I've seen this > problem on Weston, Wayland-Gnome and X11-Gnome. And they are all > problematic in their own way. (That's why there are multiple patches > needed.) For example, X11 uses the legacy mouse ioctl, which one of the > patches fixes. The other userspace needs the other heuristics. A > potential userspace fix would mean to always set empty-damage > information on all planes that don't get an update. And I don't consider > X11 fixable TBH. > > > > > Any property change on the plane could need a full plane > > update as well (eg. some color mangement stuff etc.). So you'd > > have to keep adding exceptions to that list whenever new > > properties are added. > > It's not about the occasional change of a property. It's the constant > sluggish redraw when the interface is supposed to be snappy, such as > mouse interaction. > > > > > And I think the semantics of the ioctl(s) has so far been that > > basically any page flip (whether or not it actually changes > > the fb) still ends up doing whatever flushing is needed to > > guarantee the fb contents are up to date on the screen (if > > someone sneaked in some front buffer rendering in between). > > Ie. you could even use basically a nop page flip in place > > of dirtyfb. > > That's why full updates are still the default. Only in cases where it's > save to avoid an update of unaffected planes, we do so. Umm. Maybe I misread the patch in my haste but it seemed that you consider the thing undamaged if the fb pointer was unchanged. That goes against what I wrote above. Though I don't really know if a there is software relying on that behaviuor. I suppose one idea could be to keep that behaviour for the legacy ioctls but not for atomic. Ee. any fb directly specified in a legacy setcrtc/setplane/pageflip is always considered fully damaged. But including an the same fb in the atomic ioctl does not imply damage. That would mean atomic has to rely on specifying damage explicitly, and any userspace that doesn't do that will be broken. Or we could introduce a client cap for it I guess, but that would require (minimal) userspace changes. > > I know that we don't give performance guarantees to userspace. But using > cursor/overlay planes should be faster then not using them. I think > that's a reasonable assumption for userspace to make. > > > > > Another thing the ioctls have always done is actually perform > > the commit whether anything changed or nor. That is, they > > still do all the same the same vblanky stuff and the commit > > takes the same amount of time. Not sure if your idea is > > to potentially short circuit the entire thing and make it > > take no time at all? > > The patches don't change the overall commit logics. All they do is to > set damage
Re: [PATCH 2/2] drm/panfrost: replace endian-specific types with generic ones
On 20/09/2022 23:13, Alyssa Rosenzweig wrote: > Tentative r-b, but we *do* need to make a decision on how we want to > handle endianness. I don't have strong feelings but the results of that > discussion should go in the commit message. Linux currently treats the dump objects specially - the headers are little endian. All the other (Panfrost) DRM structures are native endian (although I doubt anyone has tested it so I expect bugs). I've no particularly strong views on this, but since the dump objects are likely to be saved to disk and transferred between computers it makes sense to fix the endianness for those. The __le types currently mean sparse can warn if we screw up in the kernel, so it would be a shame to lose that type checking. Another option would be to extend the list of typedefs in include/uapi/drm/drm.h to include the __le types. We'd need wider buy-in for that change though. Finally etnaviv 'solves' the issue by not including the dump structures in the UABI header... Or of course we could just actually use native endian and detect from the magic which endian is in use. That would require ripping out the cpu_to_lexx() calls in Linux and making the user space tool more intelligent. I'm happy with that, but it's pushing the complexity onto Mesa. Steve > On Tue, Sep 20, 2022 at 10:15:45PM +0100, Adri??n Larumbe wrote: >> __le32 and __l64 endian-specific types aren't portable and not available on >> FreeBSD, for which there's a uAPI compatible reimplementation of Panfrost. >> >> Replace these specific types with more generic unsigned ones, to prevent >> FreeBSD Mesa build errors. >> >> Bug: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7252 >> Fixes: 730c2bf4ad39 ("drm/panfrost: Add support for devcoredump") >> Signed-off-by: Adri??n Larumbe >> --- >> include/uapi/drm/panfrost_drm.h | 30 +++--- >> 1 file changed, 15 insertions(+), 15 deletions(-) >> >> diff --git a/include/uapi/drm/panfrost_drm.h >> b/include/uapi/drm/panfrost_drm.h >> index bd77254be121..c1a10a9366a9 100644 >> --- a/include/uapi/drm/panfrost_drm.h >> +++ b/include/uapi/drm/panfrost_drm.h >> @@ -236,24 +236,24 @@ struct drm_panfrost_madvise { >> #define PANFROSTDUMP_BUF_TRAILER (PANFROSTDUMP_BUF_BO + 1) >> >> struct panfrost_dump_object_header { >> -__le32 magic; >> -__le32 type; >> -__le32 file_size; >> -__le32 file_offset; >> +__u32 magic; >> +__u32 type; >> +__u32 file_size; >> +__u32 file_offset; >> >> union { >> struct { >> -__le64 jc; >> -__le32 gpu_id; >> -__le32 major; >> -__le32 minor; >> -__le64 nbos; >> +__u64 jc; >> +__u32 gpu_id; >> +__u32 major; >> +__u32 minor; >> +__u64 nbos; >> } reghdr; >> >> struct { >> -__le32 valid; >> -__le64 iova; >> -__le32 data[2]; >> +__u32 valid; >> +__u64 iova; >> +__u32 data[2]; >> } bomap; >> >> /* >> @@ -261,14 +261,14 @@ struct panfrost_dump_object_header { >> * with new fields and also keep it 512-byte aligned >> */ >> >> -__le32 sizer[496]; >> +__u32 sizer[496]; >> }; >> }; >> >> /* Registers object, an array of these */ >> struct panfrost_dump_registers { >> -__le32 reg; >> -__le32 value; >> +__u32 reg; >> +__u32 value; >> }; >> >> #if defined(__cplusplus) >> -- >> 2.37.0 >>
Re: [PATCH] drm/mediatek: dsi: Move mtk_dsi_stop() call back to mtk_dsi_poweroff()
On 9/19/22 21:32, AngeloGioacchino Del Regno wrote: > Il 19/09/22 10:40, Hsin-Yi Wang ha scritto: >> On Mon, Sep 19, 2022 at 4:39 PM Nícolas F. R. A. Prado >> wrote: >>> >>> As the comment right before the mtk_dsi_stop() call advises, >>> mtk_dsi_stop() should only be called after >>> mtk_drm_crtc_atomic_disable(). That's because that function calls >>> drm_crtc_wait_one_vblank(), which requires the vblank irq to be enabled. >>> >>> Previously mtk_dsi_stop(), being in mtk_dsi_poweroff() and guarded by a >>> refcount, would only be called at the end of >>> mtk_drm_crtc_atomic_disable(), through the call to >>> mtk_crtc_ddp_hw_fini(). >>> Commit cde7e2e35c28 ("drm/mediatek: Separate poweron/poweroff from >>> enable/disable and define new funcs") moved the mtk_dsi_stop() call to >>> mtk_output_dsi_disable(), causing it to be called before >>> mtk_drm_crtc_atomic_disable(), and consequently generating vblank >>> timeout warnings during suspend. >>> >>> Move the mtk_dsi_stop() call back to mtk_dsi_poweroff() so that we have >>> a working vblank irq during mtk_drm_crtc_atomic_disable() and stop >>> getting vblank timeout warnings. >>> >>> Fixes: cde7e2e35c28 ("drm/mediatek: Separate poweron/poweroff from >>> enable/disable and define new funcs") >>> Signed-off-by: Nícolas F. R. A. Prado >>> >> Tested-by: Hsin-Yi Wang >> > > Reviewed-by: AngeloGioacchino Del Regno > > > Tested suspend/resume work properly on mt8188 and mt8186 . Tested-by: Allen-KH Cheng
Re: [Intel-gfx] [RFC v4 02/14] drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()
On 21/09/2022 08:09, Niranjana Vishwanathapura wrote: Add function __i915_sw_fence_await_reservation() for asynchronous wait on a dma-resv object with specified dma_resv_usage. This is required for async vma unbind with vm_bind. Signed-off-by: Niranjana Vishwanathapura --- drivers/gpu/drm/i915/i915_sw_fence.c | 25 ++--- drivers/gpu/drm/i915/i915_sw_fence.h | 7 ++- 2 files changed, 24 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c index 6fc0d1b89690..0ce8f4efc1ed 100644 --- a/drivers/gpu/drm/i915/i915_sw_fence.c +++ b/drivers/gpu/drm/i915/i915_sw_fence.c @@ -569,12 +569,11 @@ int __i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence, return ret; } -int i915_sw_fence_await_reservation(struct i915_sw_fence *fence, - struct dma_resv *resv, - const struct dma_fence_ops *exclude, - bool write, - unsigned long timeout, - gfp_t gfp) +int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence, + struct dma_resv *resv, + enum dma_resv_usage usage, + unsigned long timeout, + gfp_t gfp) { struct dma_resv_iter cursor; struct dma_fence *f; @@ -583,7 +582,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence, debug_fence_assert(fence); might_sleep_if(gfpflags_allow_blocking(gfp)); - dma_resv_iter_begin(&cursor, resv, dma_resv_usage_rw(write)); + dma_resv_iter_begin(&cursor, resv, usage); dma_resv_for_each_fence_unlocked(&cursor, f) { pending = i915_sw_fence_await_dma_fence(fence, f, timeout, gfp); @@ -598,6 +597,18 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence, return ret; } +int i915_sw_fence_await_reservation(struct i915_sw_fence *fence, + struct dma_resv *resv, + const struct dma_fence_ops *exclude, + bool write, + unsigned long timeout, + gfp_t gfp) +{ + return __i915_sw_fence_await_reservation(fence, resv, +dma_resv_usage_rw(write), +timeout, gfp); +} Drive by observation - it looked dodgy that you create a wrapper here which ignores one function parameter. On a more detailed look it seems no callers actually use exclude and it's even unused inside this function since 1b5bdf071e62 ("drm/i915: use the new iterator in i915_sw_fence_await_reservation v3"). So a cleanup patch before this one? Regards, Tvrtko + #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) #include "selftests/lib_sw_fence.c" #include "selftests/i915_sw_fence.c" diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h b/drivers/gpu/drm/i915/i915_sw_fence.h index 619fc5a22f0c..3cf4b6e16f35 100644 --- a/drivers/gpu/drm/i915/i915_sw_fence.h +++ b/drivers/gpu/drm/i915/i915_sw_fence.h @@ -10,13 +10,13 @@ #define _I915_SW_FENCE_H_ #include +#include #include #include #include /* for NOTIFY_DONE */ #include struct completion; -struct dma_resv; struct i915_sw_fence; enum i915_sw_fence_notify { @@ -89,6 +89,11 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence, unsigned long timeout, gfp_t gfp); +int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence, + struct dma_resv *resv, + enum dma_resv_usage usage, + unsigned long timeout, + gfp_t gfp); int i915_sw_fence_await_reservation(struct i915_sw_fence *fence, struct dma_resv *resv, const struct dma_fence_ops *exclude,
Re: [PATCH v3] drm/sched: Add FIFO sched policy to run queue v3
Inlined: On 2022-09-20 15:16, Andrey Grodzovsky wrote: > > On 2022-09-19 23:11, Luben Tuikov wrote: >> Please run this patch through checkpatch.pl, as it shows >> 12 warnings with it. Use these command line options: >> "--strict --show-types". >> >> Inlined: >> >> On 2022-09-13 16:40, Andrey Grodzovsky wrote: >>> Given many entities competing for same run queue on >>> the same scheduler and unacceptably long wait time for some >>> jobs waiting stuck in the run queue before being picked up are >>> observed (seen using GPUVis). >> Since the second part of this sentence is the result of the first, >> I'd say something like "When many entities ... we see unacceptably long ...". >> >>> The issue is due to the Round Robin policy used by schedulers >>> to pick up the next entity's job queue for execution. Under stress >>> of many entities and long job queus within entity some >> Spelling: "queues". >> >>> jobs could be stack for very long time in it's entity's >> "stuck", not "stack". >> >>> queue before being popped from the queue and executed >>> while for other entities with smaller job queues a job >>> might execute earlier even though that job arrived later >>> then the job in the long queue. >> "than". >> >>> >>> Fix: >>> Add FIFO selection policy to entities in run queue, chose next entity >>> on run queue in such order that if job on one entity arrived >>> earlier then job on another entity the first job will start >>> executing earlier regardless of the length of the entity's job >>> queue. >>> >>> v2: >>> Switch to rb tree structure for entities based on TS of >>> oldest job waiting in the job queue of an entity. Improves next >>> entity extraction to O(1). Entity TS update >>> O(log N) where N is the number of entities in the run-queue >>> >>> Drop default option in module control parameter. >>> >>> v3: >>> Various cosmetical fixes and minor refactoring of fifo update function. >>> Signed-off-by: Andrey Grodzovsky >>> Tested-by: Li Yunxiang (Teddy) >>> --- >>> drivers/gpu/drm/scheduler/sched_entity.c | 26 - >>> drivers/gpu/drm/scheduler/sched_main.c | 132 ++- >>> include/drm/gpu_scheduler.h | 35 ++ >>> 3 files changed, 187 insertions(+), 6 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c >>> b/drivers/gpu/drm/scheduler/sched_entity.c >>> index 6b25b2f4f5a3..f3ffce3c9304 100644 >>> --- a/drivers/gpu/drm/scheduler/sched_entity.c >>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c >>> @@ -73,6 +73,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity, >>> entity->priority = priority; >>> entity->sched_list = num_sched_list > 1 ? sched_list : NULL; >>> entity->last_scheduled = NULL; >>> + RB_CLEAR_NODE(&entity->rb_tree_node); >>> >>> if(num_sched_list) >>> entity->rq = &sched_list[0]->sched_rq[entity->priority]; >>> @@ -417,14 +418,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct >>> drm_sched_entity *entity) >>> >>> sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue)); >>> if (!sched_job) >>> - return NULL; >>> + goto skip; >>> >>> while ((entity->dependency = >>> drm_sched_job_dependency(sched_job, entity))) { >>> trace_drm_sched_job_wait_dep(sched_job, entity->dependency); >>> >>> - if (drm_sched_entity_add_dependency_cb(entity)) >>> - return NULL; >>> + if (drm_sched_entity_add_dependency_cb(entity)) { >>> + sched_job = NULL; >>> + goto skip; >>> + } >>> } >>> >>> /* skip jobs from entity that marked guilty */ >>> @@ -443,6 +446,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct >>> drm_sched_entity *entity) >>> smp_wmb(); >>> >>> spsc_queue_pop(&entity->job_queue); >>> + >>> + /* >>> +* It's when head job is extracted we can access the next job (or empty) >>> +* queue and update the entity location in the min heap accordingly. >>> +*/ >>> +skip: >>> + if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) >>> + drm_sched_rq_update_fifo(entity, >>> +(sched_job ? sched_job->submit_ts : >>> ktime_get())); >>> + >>> return sched_job; >>> } >>> >>> @@ -502,11 +515,13 @@ void drm_sched_entity_push_job(struct drm_sched_job >>> *sched_job) >>> { >>> struct drm_sched_entity *entity = sched_job->entity; >>> bool first; >>> + ktime_t ts = ktime_get(); >>> >>> trace_drm_sched_job(sched_job, entity); >>> atomic_inc(entity->rq->sched->score); >>> WRITE_ONCE(entity->last_user, current->group_leader); >>> first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node); >>> + sched_job->submit_ts = ts; >>> >>> /* first job wakes up scheduler */ >>> if (first) { >>> @@ -518,8 +533,13 @@ void drm_sched_entity_push_job(struct drm_sched_job >>> *sched_job) >>>
Re: [PATCH] drm/i915/gvt: fix double-free bug in split_2MB_gtt_entry
I've sent it using git send-email with another email account (zyytlz...@163.com) Regards, Zheng Wang Jani Nikula 于2022年9月19日周一 17:30写道: > > On Mon, 19 Sep 2022, Zheng Wang <1002992...@qq.com> wrote: > > From afe79848cb74cc8e45ab426d13fa2394c87e0422 Mon Sep 17 00:00:00 2001 > > From: xmzyshypnc <1002992...@qq.com> > > Date: Fri, 16 Sep 2022 23:48:23 +0800 > > Subject: [PATCH] drm/i915/gvt: fix double-free bug in split_2MB_gtt_entry > > > > There is a double-free security bug in split_2MB_gtt_entry. > > > > Here is a calling chain : > > ppgtt_populate_spt->ppgtt_populate_shadow_entry->split_2MB_gtt_entry. > > > > If intel_gvt_dma_map_guest_page failed, it will call > > ppgtt_invalidate_spt, which will finally call ppgtt_free_spt and > > kfree(spt). But the caller does not notice that, and it will call > > ppgtt_free_spt again in error path. > > > > Fix this by only freeing spt in ppgtt_invalidate_spt in good case. > > > > Signed-off-by: xmzyshypnc <1002992...@qq.com> > > Please use git send-email. The patch is whitespace broken and line > wrapped, making it unusable. > > BR, > Jani. > > > > --- > > drivers/gpu/drm/i915/gvt/gtt.c | 16 +--- > > 1 file changed, 9 insertions(+), 7 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c > > index ce0eb03709c3..550519f0acca 100644 > > --- a/drivers/gpu/drm/i915/gvt/gtt.c > > +++ b/drivers/gpu/drm/i915/gvt/gtt.c > > @@ -959,7 +959,7 @@ static inline int ppgtt_put_spt(struct > > intel_vgpu_ppgtt_spt *spt) > > return atomic_dec_return(&spt->refcount); > > } > > > > -static int ppgtt_invalidate_spt(struct intel_vgpu_ppgtt_spt *spt); > > +static int ppgtt_invalidate_spt(struct intel_vgpu_ppgtt_spt *spt, int > > is_error); > > > > static int ppgtt_invalidate_spt_by_shadow_entry(struct intel_vgpu *vgpu, > > struct intel_gvt_gtt_entry *e) > > @@ -995,7 +995,7 @@ static int > > ppgtt_invalidate_spt_by_shadow_entry(struct intel_vgpu *vgpu, > > ops->get_pfn(e)); > > return -ENXIO; > > } > > -return ppgtt_invalidate_spt(s); > > +return ppgtt_invalidate_spt(s, 0); > > } > > > > static inline void ppgtt_invalidate_pte(struct intel_vgpu_ppgtt_spt *spt, > > @@ -1016,7 +1016,7 @@ static inline void ppgtt_invalidate_pte(struct > > intel_vgpu_ppgtt_spt *spt, > > intel_gvt_dma_unmap_guest_page(vgpu, pfn << PAGE_SHIFT); > > } > > > > -static int ppgtt_invalidate_spt(struct intel_vgpu_ppgtt_spt *spt) > > +static int ppgtt_invalidate_spt(struct intel_vgpu_ppgtt_spt *spt, int > > is_error) > > { > > struct intel_vgpu *vgpu = spt->vgpu; > > struct intel_gvt_gtt_entry e; > > @@ -1059,9 +1059,11 @@ static int ppgtt_invalidate_spt(struct > > intel_vgpu_ppgtt_spt *spt) > > } > > } > > > > -trace_spt_change(spt->vgpu->id, "release", spt, > > +if (!is_error) { > > +trace_spt_change(spt->vgpu->id, "release", spt, > >spt->guest_page.gfn, spt->shadow_page.type); > > -ppgtt_free_spt(spt); > > +ppgtt_free_spt(spt); > > +} > > return 0; > > fail: > > gvt_vgpu_err("fail: shadow page %p shadow entry 0x%llx type %d\n", > > @@ -1215,7 +1217,7 @@ static int split_2MB_gtt_entry(struct intel_vgpu > > *vgpu, > > ret = intel_gvt_dma_map_guest_page(vgpu, start_gfn + sub_index, > > PAGE_SIZE, &dma_addr); > > if (ret) { > > -ppgtt_invalidate_spt(spt); > > +ppgtt_invalidate_spt(spt, 1); > > return ret; > > } > > sub_se.val64 = se->val64; > > @@ -1393,7 +1395,7 @@ static int ppgtt_handle_guest_entry_removal(struct > > intel_vgpu_ppgtt_spt *spt, > > ret = -ENXIO; > > goto fail; > > } > > -ret = ppgtt_invalidate_spt(s); > > +ret = ppgtt_invalidate_spt(s, 0); > > if (ret) > > goto fail; > > } else { > > -- > Jani Nikula, Intel Open Source Graphics Center
Re: [Intel-gfx] [RFC v4 03/14] drm/i915/vm_bind: Expose i915_gem_object_max_page_size()
On 21/09/2022 08:09, Niranjana Vishwanathapura wrote: Expose i915_gem_object_max_page_size() function non-static which will be used by the vm_bind feature. Signed-off-by: Niranjana Vishwanathapura Signed-off-by: Andi Shyti --- drivers/gpu/drm/i915/gem/i915_gem_create.c | 20 +++- drivers/gpu/drm/i915/gem/i915_gem_object.h | 2 ++ 2 files changed, 17 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c b/drivers/gpu/drm/i915/gem/i915_gem_create.c index 33673fe7ee0a..3b3ab4abb0a3 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_create.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c @@ -11,14 +11,24 @@ #include "pxp/intel_pxp.h" #include "i915_drv.h" +#include "i915_gem_context.h" I can't spot that you are adding any code which would need this? I915_GTT_PAGE_SIZE_4K? It is in intel_gtt.h. #include "i915_gem_create.h" #include "i915_trace.h" #include "i915_user_extensions.h" -static u32 object_max_page_size(struct intel_memory_region **placements, - unsigned int n_placements) +/** + * i915_gem_object_max_page_size() - max of min_page_size of the regions + * @placements: list of regions + * @n_placements: number of the placements + * + * Calculates the max of the min_page_size of a list of placements passed in. + * + * Return: max of the min_page_size + */ +u32 i915_gem_object_max_page_size(struct intel_memory_region **placements, + unsigned int n_placements) { - u32 max_page_size = 0; + u32 max_page_size = I915_GTT_PAGE_SIZE_4K; int i; for (i = 0; i < n_placements; i++) { @@ -28,7 +38,6 @@ static u32 object_max_page_size(struct intel_memory_region **placements, max_page_size = max_t(u32, max_page_size, mr->min_page_size); } - GEM_BUG_ON(!max_page_size); return max_page_size; } @@ -99,7 +108,8 @@ __i915_gem_object_create_user_ext(struct drm_i915_private *i915, u64 size, i915_gem_flush_free_objects(i915); - size = round_up(size, object_max_page_size(placements, n_placements)); + size = round_up(size, i915_gem_object_max_page_size(placements, + n_placements)); if (size == 0) return ERR_PTR(-EINVAL); Because of the changes above this path is now unreachable. I suppose it was meant to tell the user "you have supplied no placements"? But then GEM_BUG_ON (which you remove) used to be wrong. Regards, Tvrtko diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 7317d4102955..8c97bddad921 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -47,6 +47,8 @@ static inline bool i915_gem_object_size_2big(u64 size) } void i915_gem_init__objects(struct drm_i915_private *i915); +u32 i915_gem_object_max_page_size(struct intel_memory_region **placements, + unsigned int n_placements); void i915_objects_module_exit(void); int i915_objects_module_init(void);
[PATCH] PM: runtime: Return properly from rpm_resume() if dev->power.needs_force_resume flag is set
After a device transitions to sleep state through it's system suspend callback pm_runtime_force_suspend(), the device's driver may still try to do runtime PM for the device(runtime suspend first and then runtime resume) although runtime PM is disabled by that callback. The runtime PM operations would not touch the device effectively and the device is assumed to be resumed through it's system resume callback pm_runtime_force_resume(). The problem is that since the device's runtime PM status is RPM_SUSPENDED in the sleep state, rpm_resume() would not take the device as in already active status and would return -EACCES instead of 1. The error code -EACCES may make the device's driver put the device into runtime suspend status with an auto suspend delay which may happen after the device is active again with force resume. A real problematic case is the panel-simple.c driver(works with a downstream DRM device driver), where the optional enable_gpio(controlled by the runtime PM callbacks) will be disabled after the procedure with an auto suspend delay if a DRM atomic commit(triggered by a DRM device's system resume callback) tries to do runtime PM resume for the panel before the panel is active with force resume: 1) System suspend: - pm_runtime_force_suspend() - panel_simple_suspend() // enable_gpio is disabled 2) Runtime suspend with a DRM atomic commit: - panel_simple_unprepare() - pm_runtime_put_autosuspend() 3) Runtime resume with a DRM atomic commit: - panel_simple_prepare() - pm_runtime_get_sync() - rpm_resume() // return -EACCES - pm_runtime_put_autosuspend() // 1 second delay ... 4) System resume: - pm_runtime_force_resume() - panel_simple_resume() // enable_gpio is enabled 5) ... - pm_runtime_work() - rpm_suspend() - panel_simple_suspend() // enable_gpio is disabled Fix the issue by checking dev->power.needs_force_resume flag in rpm_resume() so that it returns 1 instead of -EACCES in this scenario, since the flag is set in pm_runtime_force_suspend(). Also, update the documentation to change the description of pm_runtime_resume() to reflect the new behavior of rpm_resume(). Cc: Rafael J. Wysocki Cc: Len Brown Cc: Pavel Machek Cc: Greg Kroah-Hartman Cc: Ulf Hansson Cc: Douglas Anderson Signed-off-by: Liu Ying --- Documentation/power/runtime_pm.rst | 14 +++--- drivers/base/power/runtime.c | 2 ++ 2 files changed, 9 insertions(+), 7 deletions(-) diff --git a/Documentation/power/runtime_pm.rst b/Documentation/power/runtime_pm.rst index 65b86e487afe..6266f0ac02a8 100644 --- a/Documentation/power/runtime_pm.rst +++ b/Documentation/power/runtime_pm.rst @@ -337,13 +337,13 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h: `int pm_runtime_resume(struct device *dev);` - execute the subsystem-level resume callback for the device; returns 0 on - success, 1 if the device's runtime PM status is already 'active' (also if - 'power.disable_depth' is nonzero, but the status was 'active' when it was - changing from 0 to 1) or error code on failure, where -EAGAIN means it may - be safe to attempt to resume the device again in future, but - 'power.runtime_error' should be checked additionally, and -EACCES means - that the callback could not be run, because 'power.disable_depth' was - different from 0 + success, 1 if the device's runtime PM status is assumed to be 'active' + with force resume or is already 'active' (also if 'power.disable_depth' is + nonzero, but the status was 'active' when it was changing from 0 to 1) or + error code on failure, where -EAGAIN means it may be safe to attempt to + resume the device again in future, but 'power.runtime_error' should be + checked additionally, and -EACCES means that the callback could not be + run, because 'power.disable_depth' was different from 0 `int pm_runtime_resume_and_get(struct device *dev);` - run pm_runtime_resume(dev) and if successful, increment the device's diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c index 997be3ac20a7..0bce66ea0036 100644 --- a/drivers/base/power/runtime.c +++ b/drivers/base/power/runtime.c @@ -762,6 +762,8 @@ static int rpm_resume(struct device *dev, int rpmflags) repeat: if (dev->power.runtime_error) { retval = -EINVAL; + } else if (dev->power.needs_force_resume) { + retval = 1; } else if (dev->power.disable_depth > 0) { if (dev->power.runtime_status == RPM_ACTIVE && dev->power.last_status == RPM_ACTIVE) -- 2.37.1
Re: [PATCH v2 2/7] firmware: raspberrypi: Move the clock IDs to the firmware header
Hi, On Tue, Sep 20, 2022 at 06:01:08PM +0200, Stefan Wahren wrote: > Am 20.09.22 um 14:50 schrieb Maxime Ripard: > > We'll need the clock IDs in more drivers than just the clock driver from > > now on, so let's move them in the firmware header. > > recently as i reviewed the clk-raspberrypi i noticed this, too. But from my > point of view the clock ids should go to include/dt-bindings/clock (like > bcm2835.h) because these clock ids are actually referenced in the DTS files > and we need to make sure they are in sync. AFAIR this would also result in > change from enum to defines. > > Sorry, i didn't had the time to send a patch for this. IMO, we need both, and this enum still belongs in the firmware header. We have two separate things, the firmware interface and the DT interface. The kernel is a consumer for both, but the fact that they match is an implementation detail. It might even change in the future for all we know. So having a header to use defines for the clock indices in the DT looks like a good idea to me, but I think we should keep that enum in the firmware header as well. Maxime signature.asc Description: PGP signature
Re: [PATCH 4/5] drm/damage-helper: Do partial updates if framebuffer has not been changed
Hi Am 21.09.22 um 10:37 schrieb Ville Syrjälä: On Wed, Sep 21, 2022 at 09:59:10AM +0200, Thomas Zimmermann wrote: Hi Ville Am 20.09.22 um 16:31 schrieb Ville Syrjälä: On Tue, Sep 20, 2022 at 03:56:18PM +0200, Thomas Zimmermann wrote: Set partial updates on a plane if the framebuffer has not been changed on an atomic commit. If such a plane has damage clips, the driver will use them; otherwise the update is effectively empty. Planes that change their framebuffer still perform a full update. I have a feeling you're sort of papering over some inefficient userspace that's asking updates on planes that don't actually need them. I'm not sure adding more code for that is a particularly great idea. Wouldn't it be better to just fix the userspace code? Some more context might be in order: The ast driver currently uses VRAM helpers, which leads to many problems; including blank screens from the low amount of video memory. The best solution is to switch SHMEM with BOs on system meory. The video memory is only the internal buffer for scanout. This update involves a mempcy from the BO to video memory. Ast's hardware is really slow, so it makes sense to reduce the updates to video memory to a minimum. There's support for cursor planes, which really makes a difference here. But doing any cursor-plane update (e.g., moving, animation) with SHMEM and the current damage helpers always results in a full-screen memcpy from the BO to the video memory for the primary plane. As the ast hardware is slow, performance goes down to a an estimated 5 frame per seconds. After moving the mouse, one can watch the mouse cursor follow along the screen for the next seconds. Userspace sends the movement information and DRM slowly processes them. The same can be observed for cursor animation. The problem is that each change to the cursor plane results in an atomic_update call for the primary plane. Not having damage information for the plane just means 'update everything'. Not a problem if redrawing consists of reprogramming the scanout address. For a memcpy it's not feasible. So can this be fixed in userspace? No realistically IMHO. I've seen this problem on Weston, Wayland-Gnome and X11-Gnome. And they are all problematic in their own way. (That's why there are multiple patches needed.) For example, X11 uses the legacy mouse ioctl, which one of the patches fixes. The other userspace needs the other heuristics. A potential userspace fix would mean to always set empty-damage information on all planes that don't get an update. And I don't consider X11 fixable TBH. Any property change on the plane could need a full plane update as well (eg. some color mangement stuff etc.). So you'd have to keep adding exceptions to that list whenever new properties are added. It's not about the occasional change of a property. It's the constant sluggish redraw when the interface is supposed to be snappy, such as mouse interaction. And I think the semantics of the ioctl(s) has so far been that basically any page flip (whether or not it actually changes the fb) still ends up doing whatever flushing is needed to guarantee the fb contents are up to date on the screen (if someone sneaked in some front buffer rendering in between). Ie. you could even use basically a nop page flip in place of dirtyfb. That's why full updates are still the default. Only in cases where it's save to avoid an update of unaffected planes, we do so. Umm. Maybe I misread the patch in my haste but it seemed that you consider the thing undamaged if the fb pointer was unchanged. That goes against what I wrote above. I try to resolve that in patch 5. The fb_dirty flag signals that the framebuffer's content changed in some way. Patches 4 and 5 could be seen as one change, but that might overload the resulting patch. (Maybe it's not well designed overall. :/) Though I don't really know if a there is software relying on that behaviuor. I suppose one idea could be to keep that behaviour for the legacy ioctls but not for atomic. Ee. any fb directly specified in a legacy setcrtc/setplane/pageflip is always considered fully damaged. But including an the same Assuming the specified fb to be damaged seems like a non-issue to me. The problem is with the other framebuffers: if userspace sends us a CURSOR_MOVE ioctl, we can safely assume the cursor fb to be damaged. But we don't want the primary plane's fb to be damaged. Or else we'd redraw the full primary plane. fb in the atomic ioctl does not imply damage. That would mean atomic has to rely on specifying damage explicitly, and any userspace that doesn't do that will be broken. The problem again is not in the damage information on planes that legitimately need a redraw. It's all the other planes that are being redrawn as well. Or is there some scenario that I don't see? Or we could introduce a client cap for it I guess, but that would require (minimal) userspace changes. I know that we don't give p
Re: [PATCH] drm/bridge: ti-sn65dsi83: Add and use hs_rate and lp_rate
On Mon, Sep 19, 2022 at 08:17:11PM +0200, Marek Vasut wrote: > On 9/19/22 15:43, Maxime Ripard wrote: > > Hi, > > Hello Maxime, > > > On Sun, Sep 18, 2022 at 02:56:00PM +0200, Marek Vasut wrote: > > > On 8/1/22 15:11, Marek Vasut wrote: > > > > Fill in hs_rate and lp_rate to struct mipi_dsi_device for this bridge > > > > and > > > > adjust DSI input frequency calculations such that they expect the DSI > > > > host > > > > to configure HS clock according to hs_rate. > > > > > > > > This is an optimization for the DSI burst mode case. In case the DSI > > > > device > > > > supports DSI burst mode, it is recommended to operate the DSI interface > > > > at > > > > the highest possible HS clock frequency which the DSI device supports. > > > > This > > > > permits the DSI host to send as short as possible bursts of data on the > > > > DSI > > > > link and keep the DSI data lanes in LP mode otherwise, which reduces > > > > power > > > > consumption. > > > > > > > Signed-off-by: Marek Vasut > > > > Cc: Jagan Teki > > > > Cc: Laurent Pinchart > > > > Cc: Linus Walleij > > > > Cc: Robert Foss > > > > Cc: Sam Ravnborg > > > > Cc: dri-devel@lists.freedesktop.org > > > > --- > > > >drivers/gpu/drm/bridge/ti-sn65dsi83.c | 25 + > > > >1 file changed, 13 insertions(+), 12 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c > > > > b/drivers/gpu/drm/bridge/ti-sn65dsi83.c > > > > index 14e7aa77e7584..b161f25c3a2f5 100644 > > > > --- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c > > > > +++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c > > > > @@ -286,8 +286,7 @@ static u8 sn65dsi83_get_lvds_range(struct sn65dsi83 > > > > *ctx, > > > > return (mode_clock - 12500) / 25000; > > > >} > > > > -static u8 sn65dsi83_get_dsi_range(struct sn65dsi83 *ctx, > > > > - const struct drm_display_mode *mode) > > > > +static u8 sn65dsi83_get_dsi_range(struct sn65dsi83 *ctx) > > > >{ > > > > /* > > > > * The encoding of the CHA_DSI_CLK_RANGE is as follows: > > > > @@ -303,20 +302,20 @@ static u8 sn65dsi83_get_dsi_range(struct > > > > sn65dsi83 *ctx, > > > > * DSI_CLK = mode clock * bpp / dsi_data_lanes / 2 > > > > * the 2 is there because the bus is DDR. > > > > */ > > > > - return DIV_ROUND_UP(clamp((unsigned int)mode->clock * > > > > - > > > > mipi_dsi_pixel_format_to_bpp(ctx->dsi->format) / > > > > - ctx->dsi->lanes / 2, 4U, 50U), > > > > 5000U); > > > > + return DIV_ROUND_UP(ctx->dsi->hs_rate, 500U); > > > >} > > > > -static u8 sn65dsi83_get_dsi_div(struct sn65dsi83 *ctx) > > > > +static u8 sn65dsi83_get_dsi_div(struct sn65dsi83 *ctx, > > > > + const struct drm_display_mode *mode) > > > >{ > > > > /* The divider is (DSI_CLK / LVDS_CLK) - 1, which really is: */ > > > > - unsigned int dsi_div = > > > > mipi_dsi_pixel_format_to_bpp(ctx->dsi->format); > > > > + unsigned int dsi_div; > > > > + int mode_clock = mode->clock; > > > > - dsi_div /= ctx->dsi->lanes; > > > > + if (ctx->lvds_dual_link) > > > > + mode_clock /= 2; > > > > - if (!ctx->lvds_dual_link) > > > > - dsi_div /= 2; > > > > + dsi_div = (ctx->dsi->hs_rate / mode_clock) / 1000; > > > > return dsi_div - 1; > > > >} > > > > @@ -397,9 +396,9 @@ static void sn65dsi83_atomic_enable(struct > > > > drm_bridge *bridge, > > > > > > > > REG_RC_LVDS_PLL_LVDS_CLK_RANGE(sn65dsi83_get_lvds_range(ctx, mode)) | > > > > REG_RC_LVDS_PLL_HS_CLK_SRC_DPHY); > > > > regmap_write(ctx->regmap, REG_DSI_CLK, > > > > - > > > > REG_DSI_CLK_CHA_DSI_CLK_RANGE(sn65dsi83_get_dsi_range(ctx, mode))); > > > > + > > > > REG_DSI_CLK_CHA_DSI_CLK_RANGE(sn65dsi83_get_dsi_range(ctx))); > > > > regmap_write(ctx->regmap, REG_RC_DSI_CLK, > > > > - > > > > REG_RC_DSI_CLK_DSI_CLK_DIVIDER(sn65dsi83_get_dsi_div(ctx))); > > > > + > > > > REG_RC_DSI_CLK_DSI_CLK_DIVIDER(sn65dsi83_get_dsi_div(ctx, mode))); > > > > /* Set number of DSI lanes and LVDS link config. */ > > > > regmap_write(ctx->regmap, REG_DSI_LANE, > > > > @@ -643,6 +642,8 @@ static int sn65dsi83_host_attach(struct sn65dsi83 > > > > *ctx) > > > > dsi->lanes = dsi_lanes; > > > > dsi->format = MIPI_DSI_FMT_RGB888; > > > > dsi->mode_flags = MIPI_DSI_MODE_VIDEO | > > > > MIPI_DSI_MODE_VIDEO_BURST; > > > > + dsi->hs_rate = 5; > > > > + dsi->lp_rate = 1600; > > > > Let's leave aside the comment from Dave that the host might choose a > > lower HS rate, we can indeed assume it's true for now. > > > > However.. Is there any guarantee that the host can even reach that > > frequency in the first place? IIRC, the ma
Re: [PATCH 2/2] drm/panfrost: replace endian-specific types with generic ones
On 2022-09-21 09:48, Steven Price wrote: On 20/09/2022 23:13, Alyssa Rosenzweig wrote: Tentative r-b, but we *do* need to make a decision on how we want to handle endianness. I don't have strong feelings but the results of that discussion should go in the commit message. Linux currently treats the dump objects specially - the headers are little endian. All the other (Panfrost) DRM structures are native endian (although I doubt anyone has tested it so I expect bugs). If there can be *any* native-endian data included in the dump, then the original endianness needs to be recorded to be able to analyse it correctly anyway. The dumping code can't know the granularity at which arbitrary BOs may or may not need to be byteswapped to make everything consistently LE. I've no particularly strong views on this, but since the dump objects are likely to be saved to disk and transferred between computers it makes sense to fix the endianness for those. The __le types currently mean sparse can warn if we screw up in the kernel, so it would be a shame to lose that type checking. Another option would be to extend the list of typedefs in include/uapi/drm/drm.h to include the __le types. We'd need wider buy-in for that change though. Finally etnaviv 'solves' the issue by not including the dump structures in the UABI header... Or of course we could just actually use native endian and detect from the magic which endian is in use. That would require ripping out the cpu_to_lexx() calls in Linux and making the user space tool more intelligent. I'm happy with that, but it's pushing the complexity onto Mesa. If there's a clearly identifiable header, then I'd say making the whole dump native-endian is probably the way to go. Unless and until anyone actually demands to be able to do cross-endian post-mortem GPU debugging, the realistic extent of the complexity in Mesa is that it doesn't recognise the foreign dump format and gives up, which I assume is already implemented :) Cheers, Robin. Steve On Tue, Sep 20, 2022 at 10:15:45PM +0100, Adri??n Larumbe wrote: __le32 and __l64 endian-specific types aren't portable and not available on FreeBSD, for which there's a uAPI compatible reimplementation of Panfrost. Replace these specific types with more generic unsigned ones, to prevent FreeBSD Mesa build errors. Bug: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7252 Fixes: 730c2bf4ad39 ("drm/panfrost: Add support for devcoredump") Signed-off-by: Adri??n Larumbe --- include/uapi/drm/panfrost_drm.h | 30 +++--- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h index bd77254be121..c1a10a9366a9 100644 --- a/include/uapi/drm/panfrost_drm.h +++ b/include/uapi/drm/panfrost_drm.h @@ -236,24 +236,24 @@ struct drm_panfrost_madvise { #define PANFROSTDUMP_BUF_TRAILER (PANFROSTDUMP_BUF_BO + 1) struct panfrost_dump_object_header { - __le32 magic; - __le32 type; - __le32 file_size; - __le32 file_offset; + __u32 magic; + __u32 type; + __u32 file_size; + __u32 file_offset; union { struct { - __le64 jc; - __le32 gpu_id; - __le32 major; - __le32 minor; - __le64 nbos; + __u64 jc; + __u32 gpu_id; + __u32 major; + __u32 minor; + __u64 nbos; } reghdr; struct { - __le32 valid; - __le64 iova; - __le32 data[2]; + __u32 valid; + __u64 iova; + __u32 data[2]; } bomap; /* @@ -261,14 +261,14 @@ struct panfrost_dump_object_header { * with new fields and also keep it 512-byte aligned */ - __le32 sizer[496]; + __u32 sizer[496]; }; }; /* Registers object, an array of these */ struct panfrost_dump_registers { - __le32 reg; - __le32 value; + __u32 reg; + __u32 value; }; #if defined(__cplusplus) -- 2.37.0
Re: [PATCH 4/5] drm/damage-helper: Do partial updates if framebuffer has not been changed
On Wed, Sep 21, 2022 at 11:39:51AM +0200, Thomas Zimmermann wrote: > Hi > > Am 21.09.22 um 10:37 schrieb Ville Syrjälä: > > On Wed, Sep 21, 2022 at 09:59:10AM +0200, Thomas Zimmermann wrote: > >> Hi Ville > >> > >> Am 20.09.22 um 16:31 schrieb Ville Syrjälä: > >>> On Tue, Sep 20, 2022 at 03:56:18PM +0200, Thomas Zimmermann wrote: > Set partial updates on a plane if the framebuffer has not been changed > on an atomic commit. If such a plane has damage clips, the driver will > use them; otherwise the update is effectively empty. Planes that change > their framebuffer still perform a full update. > >>> > >>> I have a feeling you're sort of papering over some inefficient > >>> userspace that's asking updates on planes that don't actually > >>> need them. I'm not sure adding more code for that is a particularly > >>> great idea. Wouldn't it be better to just fix the userspace code? > >> > >> Some more context might be in order: > >> > >> The ast driver currently uses VRAM helpers, which leads to many > >> problems; including blank screens from the low amount of video memory. > >> The best solution is to switch SHMEM with BOs on system meory. The video > >> memory is only the internal buffer for scanout. This update involves a > >> mempcy from the BO to video memory. > >> > >> Ast's hardware is really slow, so it makes sense to reduce the updates > >> to video memory to a minimum. There's support for cursor planes, which > >> really makes a difference here. > >> > >> But doing any cursor-plane update (e.g., moving, animation) with SHMEM > >> and the current damage helpers always results in a full-screen memcpy > >> from the BO to the video memory for the primary plane. As the ast > >> hardware is slow, performance goes down to a an estimated 5 frame per > >> seconds. After moving the mouse, one can watch the mouse cursor follow > >> along the screen for the next seconds. Userspace sends the movement > >> information and DRM slowly processes them. The same can be observed for > >> cursor animation. > >> > >> The problem is that each change to the cursor plane results in an > >> atomic_update call for the primary plane. Not having damage information > >> for the plane just means 'update everything'. Not a problem if redrawing > >> consists of reprogramming the scanout address. For a memcpy it's not > >> feasible. > >> > >> So can this be fixed in userspace? No realistically IMHO. I've seen this > >> problem on Weston, Wayland-Gnome and X11-Gnome. And they are all > >> problematic in their own way. (That's why there are multiple patches > >> needed.) For example, X11 uses the legacy mouse ioctl, which one of the > >> patches fixes. The other userspace needs the other heuristics. A > >> potential userspace fix would mean to always set empty-damage > >> information on all planes that don't get an update. And I don't consider > >> X11 fixable TBH. > >> > >>> > >>> Any property change on the plane could need a full plane > >>> update as well (eg. some color mangement stuff etc.). So you'd > >>> have to keep adding exceptions to that list whenever new > >>> properties are added. > >> > >> It's not about the occasional change of a property. It's the constant > >> sluggish redraw when the interface is supposed to be snappy, such as > >> mouse interaction. > >> > >>> > >>> And I think the semantics of the ioctl(s) has so far been that > >>> basically any page flip (whether or not it actually changes > >>> the fb) still ends up doing whatever flushing is needed to > >>> guarantee the fb contents are up to date on the screen (if > >>> someone sneaked in some front buffer rendering in between). > >>> Ie. you could even use basically a nop page flip in place > >>> of dirtyfb. > >> > >> That's why full updates are still the default. Only in cases where it's > >> save to avoid an update of unaffected planes, we do so. > > > > Umm. Maybe I misread the patch in my haste but it seemed > > that you consider the thing undamaged if the fb pointer > > was unchanged. That goes against what I wrote above. > > I try to resolve that in patch 5. The fb_dirty flag signals that the > framebuffer's content changed in some way. Patches 4 and 5 could be seen > as one change, but that might overload the resulting patch. (Maybe it's > not well designed overall. :/) > > > > > Though I don't really know if a there is software relying on > > that behaviuor. I suppose one idea could be to keep that > > behaviour for the legacy ioctls but not for atomic. Ee. any > > fb directly specified in a legacy setcrtc/setplane/pageflip > > is always considered fully damaged. But including an the same > > Assuming the specified fb to be damaged seems like a non-issue to me. > > The problem is with the other framebuffers: if userspace sends us a > CURSOR_MOVE ioctl, we can safely assume the cursor fb to be damaged. But > we don't want the primary plane's fb to be damaged. Or else we'd redraw > the full primary plane. > > > > fb in
Re: [Intel-gfx] [RFC v4 08/14] drm/i915/vm_bind: Abstract out common execbuf functions
On 21/09/2022 08:09, Niranjana Vishwanathapura wrote: The new execbuf3 ioctl path and the legacy execbuf ioctl paths have many common functionalities. Share code between these two paths by abstracting out the common functionalities into a separate file where possible. Looks like a good start to me. A couple comments/questions below. Signed-off-by: Niranjana Vishwanathapura --- drivers/gpu/drm/i915/Makefile | 1 + .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 507 ++--- .../drm/i915/gem/i915_gem_execbuffer_common.c | 530 ++ .../drm/i915/gem/i915_gem_execbuffer_common.h | 47 ++ 4 files changed, 612 insertions(+), 473 deletions(-) create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 9bf939ef18ea..bf952f478555 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -148,6 +148,7 @@ gem-y += \ gem/i915_gem_create.o \ gem/i915_gem_dmabuf.o \ gem/i915_gem_domain.o \ + gem/i915_gem_execbuffer_common.o \ gem/i915_gem_execbuffer.o \ gem/i915_gem_internal.o \ gem/i915_gem_object.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 33d989a20227..363b2a788cdf 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -9,8 +9,6 @@ #include #include -#include - #include "display/intel_frontbuffer.h" #include "gem/i915_gem_ioctls.h" @@ -28,6 +26,7 @@ #include "i915_file_private.h" #include "i915_gem_clflush.h" #include "i915_gem_context.h" +#include "i915_gem_execbuffer_common.h" #include "i915_gem_evict.h" #include "i915_gem_ioctls.h" #include "i915_trace.h" @@ -235,13 +234,6 @@ enum { * the batchbuffer in trusted mode, otherwise the ioctl is rejected. */ -struct eb_fence { - struct drm_syncobj *syncobj; /* Use with ptr_mask_bits() */ - struct dma_fence *dma_fence; - u64 value; - struct dma_fence_chain *chain_fence; -}; - struct i915_execbuffer { struct drm_i915_private *i915; /** i915 backpointer */ struct drm_file *file; /** per-file lookup tables and limits */ @@ -2446,164 +2438,29 @@ static const enum intel_engine_id user_ring_map[] = { [I915_EXEC_VEBOX] = VECS0 }; -static struct i915_request *eb_throttle(struct i915_execbuffer *eb, struct intel_context *ce) -{ - struct intel_ring *ring = ce->ring; - struct intel_timeline *tl = ce->timeline; - struct i915_request *rq; - - /* -* Completely unscientific finger-in-the-air estimates for suitable -* maximum user request size (to avoid blocking) and then backoff. -*/ - if (intel_ring_update_space(ring) >= PAGE_SIZE) - return NULL; - - /* -* Find a request that after waiting upon, there will be at least half -* the ring available. The hysteresis allows us to compete for the -* shared ring and should mean that we sleep less often prior to -* claiming our resources, but not so long that the ring completely -* drains before we can submit our next request. -*/ - list_for_each_entry(rq, &tl->requests, link) { - if (rq->ring != ring) - continue; - - if (__intel_ring_space(rq->postfix, - ring->emit, ring->size) > ring->size / 2) - break; - } - if (&rq->link == &tl->requests) - return NULL; /* weird, we will check again later for real */ - - return i915_request_get(rq); -} - -static int eb_pin_timeline(struct i915_execbuffer *eb, struct intel_context *ce, - bool throttle) -{ - struct intel_timeline *tl; - struct i915_request *rq = NULL; - - /* -* Take a local wakeref for preparing to dispatch the execbuf as -* we expect to access the hardware fairly frequently in the -* process, and require the engine to be kept awake between accesses. -* Upon dispatch, we acquire another prolonged wakeref that we hold -* until the timeline is idle, which in turn releases the wakeref -* taken on the engine, and the parent device. -*/ - tl = intel_context_timeline_lock(ce); - if (IS_ERR(tl)) - return PTR_ERR(tl); - - intel_context_enter(ce); - if (throttle) - rq = eb_throttle(eb, ce); - intel_context_timeline_unlock(tl); - - if (rq) { - bool nonblock = eb->file->filp->f_flags & O_NONBLOCK; - long timeout = nonblock ? 0 : MAX_SCHEDULE_TIMEOUT; - - if (i915_request_wait(rq, I915_WAIT_INTERRUPTI
Re: [PATCH v5 15/21] dma-buf: Move dma_buf_vmap() to dynamic locking specification
On 9/20/22 17:13, Sumit Semwal wrote: > Hi Dmitry, > > Thanks very much for the series. > > On Wed, 14 Sept 2022 at 00:59, Dmitry Osipenko > wrote: >> >> Move dma_buf_vmap/vunmap_unlocked() functions to the dynamic locking >> specification by asserting that the reservation lock is held. > Thanks for the patch; just a minor nit - I think you mean dma_buf_vmap > / vunmap() here, and not _unlocked? Yes, the _unlocked should be dropped here. Thank you for the review! -- Best regards, Dmitry
Re: [PATCH 4/5] drm/damage-helper: Do partial updates if framebuffer has not been changed
Hi Am 21.09.22 um 12:18 schrieb Ville Syrjälä: [...] Though I don't really know if a there is software relying on that behaviuor. I suppose one idea could be to keep that behaviour for the legacy ioctls but not for atomic. Ee. any fb directly specified in a legacy setcrtc/setplane/pageflip is always considered fully damaged. But including an the same Assuming the specified fb to be damaged seems like a non-issue to me. The problem is with the other framebuffers: if userspace sends us a CURSOR_MOVE ioctl, we can safely assume the cursor fb to be damaged. But we don't want the primary plane's fb to be damaged. Or else we'd redraw the full primary plane. fb in the atomic ioctl does not imply damage. That would mean atomic has to rely on specifying damage explicitly, and any userspace that doesn't do that will be broken. The problem again is not in the damage information on planes that legitimately need a redraw. It's all the other planes that are being redrawn as well. Or is there some scenario that I don't see? I thought we're talking about eg. a cursor update that also includes the primary plane because apparently userspace is lazy. I think what you're saying is that currently there is no damage information for the primary plane so you're attempting to infer it based on whether the fb property changed or not. Correct. And what I was saying is that IIRC historically specifying the same fb again has still implied full damage. Changing that behaviour may or may not break exising userspace. I cannot answer the question. The three cases I've seen at least worked with the new semantics. I think Daniel once mentioned that we already expect userspace to call the DIRTYFB ioctl after changing a framebuffer's content. Or we could introduce a client cap for it I guess, but that would require (minimal) userspace changes. I know that we don't give performance guarantees to userspace. But using cursor/overlay planes should be faster then not using them. I think that's a reasonable assumption for userspace to make. Another thing the ioctls have always done is actually perform the commit whether anything changed or nor. That is, they still do all the same the same vblanky stuff and the commit takes the same amount of time. Not sure if your idea is to potentially short circuit the entire thing and make it take no time at all? The patches don't change the overall commit logics. All they do is to set damage updates to a size of 0 if a plane reliably does not need an update. What I gathered is that you seemed to only consider the fb contents as something that needs damage handling. That is not the case for stuff like eDP PSR and DSI command mode displays where we need to upload a new full frame whenever also the non-damaged fb contents would get transformed by the hardware on the way to the remote frambuffer. That would mean any change in eg. scanout coordinates, color management, etc. There is support for changing scanout coordinates in drm_atomic_helper_damage_iter_init() in patch 2. In general, maybe the heuristic needs to be stricter to only handle updates that only change damage. For now, the problem only happens after converting ast to SHMEM. All the other SHMEM-based drivers use a single plane where the problem doesn't happen; or only reprogram the scanout address, which is fast. If the damage-handling logic imposes problems on other drivers, some of it could possibly be moved into ast itself. Maybe we have two clearly separate classes of uses case: 1. devices where only damage to the fb contents matter (eg. some kind of shadow fb that is the same size as the real fb). 2. devices where everything about the scanout process matters (eg. PSR) ? Maybe there should be helpers to deal with just the first case, and then some more helpers (or just driver code) to pile the rest on top as well when needed? Makes sense. I know that these plane-state are not good style, but with them in place, much of the heuristic could be moved from drm_atomic_helper_check_plane_damage() into the driver if necessary. Best regards Thomas -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH 1/7] drm/i915/hwmon: Add HWMON infrastructure
On 9/16/2022 8:30 PM, Badal Nilawar wrote: From: Dale B Stimson The i915 HWMON module will be used to expose voltage, power and energy values for dGfx. Here we set up i915 hwmon infrastructure including i915 hwmon registration, basic data structures and functions. v2: - Create HWMON infra patch (Ashutosh) - Fixed review comments (Jani) - Remove "select HWMON" from i915/Kconfig (Jani) v3: Use hwm_ prefix for static functions (Ashutosh) v4: s/#ifdef CONFIG_HWMON/#if IS_REACHABLE(CONFIG_HWMON)/ since the former doesn't work if hwmon is compiled as a module (Guenter) v5: Fixed review comments (Jani) Cc: Guenter Roeck Signed-off-by: Dale B Stimson Signed-off-by: Ashutosh Dixit Signed-off-by: Riana Tauro Signed-off-by: Badal Nilawar Acked-by: Guenter Roeck Reviewed-by: Ashutosh Dixit Reviewed-by: Anshuman Gupta --- drivers/gpu/drm/i915/Makefile | 3 + drivers/gpu/drm/i915/i915_driver.c | 5 ++ drivers/gpu/drm/i915/i915_drv.h| 2 + drivers/gpu/drm/i915/i915_hwmon.c | 136 + drivers/gpu/drm/i915/i915_hwmon.h | 20 + 5 files changed, 166 insertions(+) create mode 100644 drivers/gpu/drm/i915/i915_hwmon.c create mode 100644 drivers/gpu/drm/i915/i915_hwmon.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index a26edcdadc21..66a6023e61a6 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -209,6 +209,9 @@ i915-y += gt/uc/intel_uc.o \ # graphics system controller (GSC) support i915-y += gt/intel_gsc.o +# graphics hardware monitoring (HWMON) support +i915-$(CONFIG_HWMON) += i915_hwmon.o + # modesetting core code i915-y += \ display/hsw_ips.o \ diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c index c459eb362c47..75655adb7bd3 100644 --- a/drivers/gpu/drm/i915/i915_driver.c +++ b/drivers/gpu/drm/i915/i915_driver.c @@ -81,6 +81,7 @@ #include "i915_drm_client.h" #include "i915_drv.h" #include "i915_getparam.h" +#include "i915_hwmon.h" #include "i915_ioc32.h" #include "i915_ioctl.h" #include "i915_irq.h" @@ -763,6 +764,8 @@ static void i915_driver_register(struct drm_i915_private *dev_priv) for_each_gt(gt, dev_priv, i) intel_gt_driver_register(gt); + i915_hwmon_register(dev_priv); + intel_display_driver_register(dev_priv); intel_power_domains_enable(dev_priv); @@ -795,6 +798,8 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv) for_each_gt(gt, dev_priv, i) intel_gt_driver_unregister(gt); + i915_hwmon_unregister(dev_priv); + i915_perf_unregister(dev_priv); i915_pmu_unregister(dev_priv); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 9f9372931fd2..01a2caf42635 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -353,6 +353,8 @@ struct drm_i915_private { struct i915_perf perf; + struct i915_hwmon *hwmon; + /* Abstract the submission mechanism (legacy ringbuffer or execlists) away */ struct intel_gt gt0; diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c new file mode 100644 index ..103dd543a214 --- /dev/null +++ b/drivers/gpu/drm/i915/i915_hwmon.c @@ -0,0 +1,136 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2022 Intel Corporation + */ + +#include +#include +#include + +#include "i915_drv.h" +#include "i915_hwmon.h" +#include "i915_reg.h" +#include "intel_mchbar_regs.h" + +struct hwm_reg { +}; + +struct hwm_drvdata { + struct i915_hwmon *hwmon; + struct intel_uncore *uncore; + struct device *hwmon_dev; + char name[12]; +}; + +struct i915_hwmon { + struct hwm_drvdata ddat; + struct mutex hwmon_lock;/* counter overflow logic and rmw */ + struct hwm_reg rg; +}; + +static const struct hwmon_channel_info *hwm_info[] = { + NULL +}; + +static umode_t +hwm_is_visible(const void *drvdata, enum hwmon_sensor_types type, + u32 attr, int channel) +{ + switch (type) { + default: + return 0; + } +} + +static int +hwm_read(struct device *dev, enum hwmon_sensor_types type, u32 attr, +int channel, long *val) +{ + switch (type) { + default: + return -EOPNOTSUPP; + } +} + +static int +hwm_write(struct device *dev, enum hwmon_sensor_types type, u32 attr, + int channel, long val) +{ + switch (type) { + default: + return -EOPNOTSUPP; + } +} + +static const struct hwmon_ops hwm_ops = { + .is_visible = hwm_is_visible, + .read = hwm_read, + .write = hwm_write, +}; + +static const struct hwmon_chip_info hwm_chip_info = { + .ops = &hwm_ops, + .info = hwm_info, +}; + +static void +hwm_get_preregistration_info(struct drm_i915_private *i915) +{ +} + +void i9
Re: [PATCH 2/7] drm/i915/hwmon: Add HWMON current voltage support
On 9/16/2022 8:30 PM, Badal Nilawar wrote: From: Riana Tauro Use i915 HWMON subsystem to display current input voltage. v2: - Updated date and kernel version in feature description - Fixed review comments (Ashutosh) v3: Use macro HWMON_CHANNEL_INFO to define hwmon channel (Guenter) v4: - Fixed review comments (Ashutosh) - Use hwm_ prefix for static functions (Ashutosh) v5: - Added unit of voltage as millivolts (Ashutosh) - Updated date, kernel version in documentation Cc: Guenter Roeck Cc: Anshuman Gupta Signed-off-by: Riana Tauro Signed-off-by: Badal Nilawar Acked-by: Guenter Roeck Reviewed-by: Ashutosh Dixit Looks good to me. Reviewed-by: Anshuman Gupta --- .../ABI/testing/sysfs-driver-intel-i915-hwmon | 7 +++ drivers/gpu/drm/i915/gt/intel_gt_regs.h | 3 ++ drivers/gpu/drm/i915/i915_hwmon.c | 53 +++ 3 files changed, 63 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon new file mode 100644 index ..e2974f928e58 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon @@ -0,0 +1,7 @@ +What: /sys/devices/.../hwmon/hwmon/in0_input +Date: September 2022 +KernelVersion: 6 +Contact: dri-devel@lists.freedesktop.org +Description: RO. Current Voltage in millivolt. + + Only supported for particular Intel i915 graphics platforms. diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h index 2275ee47da95..65336514554d 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h @@ -1510,6 +1510,9 @@ #define VLV_RENDER_C0_COUNT _MMIO(0x138118) #define VLV_MEDIA_C0_COUNT_MMIO(0x13811c) +#define GEN12_RPSTAT1_MMIO(0x1381b4) +#define GEN12_VOLTAGE_MASK REG_GENMASK(10, 0) + #define GEN11_GT_INTR_DW(x) _MMIO(0x190018 + ((x) * 4)) #define GEN11_CSME (31) #define GEN11_GUNIT (28) diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c index 103dd543a214..45745afa5c5b 100644 --- a/drivers/gpu/drm/i915/i915_hwmon.c +++ b/drivers/gpu/drm/i915/i915_hwmon.c @@ -11,8 +11,16 @@ #include "i915_hwmon.h" #include "i915_reg.h" #include "intel_mchbar_regs.h" +#include "gt/intel_gt_regs.h" + +/* + * SF_* - scale factors for particular quantities according to hwmon spec. + * - voltage - millivolts + */ +#define SF_VOLTAGE 1000 struct hwm_reg { + i915_reg_t gt_perf_status; }; struct hwm_drvdata { @@ -29,14 +37,49 @@ struct i915_hwmon { }; static const struct hwmon_channel_info *hwm_info[] = { + HWMON_CHANNEL_INFO(in, HWMON_I_INPUT), NULL }; +static umode_t +hwm_in_is_visible(const struct hwm_drvdata *ddat, u32 attr) +{ + switch (attr) { + case hwmon_in_input: + return i915_mmio_reg_valid(ddat->hwmon->rg.gt_perf_status) ? 0444 : 0; + default: + return 0; + } +} + +static int +hwm_in_read(struct hwm_drvdata *ddat, u32 attr, long *val) +{ + struct i915_hwmon *hwmon = ddat->hwmon; + intel_wakeref_t wakeref; + u32 reg_value; + + switch (attr) { + case hwmon_in_input: + with_intel_runtime_pm(ddat->uncore->rpm, wakeref) + reg_value = intel_uncore_read(ddat->uncore, hwmon->rg.gt_perf_status); + /* HW register value in units of 2.5 millivolt */ + *val = DIV_ROUND_CLOSEST(REG_FIELD_GET(GEN12_VOLTAGE_MASK, reg_value) * 25, 10); + return 0; + default: + return -EOPNOTSUPP; + } +} + static umode_t hwm_is_visible(const void *drvdata, enum hwmon_sensor_types type, u32 attr, int channel) { + struct hwm_drvdata *ddat = (struct hwm_drvdata *)drvdata; + switch (type) { + case hwmon_in: + return hwm_in_is_visible(ddat, attr); default: return 0; } @@ -46,7 +89,11 @@ static int hwm_read(struct device *dev, enum hwmon_sensor_types type, u32 attr, int channel, long *val) { + struct hwm_drvdata *ddat = dev_get_drvdata(dev); + switch (type) { + case hwmon_in: + return hwm_in_read(ddat, attr, val); default: return -EOPNOTSUPP; } @@ -76,6 +123,12 @@ static const struct hwmon_chip_info hwm_chip_info = { static void hwm_get_preregistration_info(struct drm_i915_private *i915) { + struct i915_hwmon *hwmon = i915->hwmon; + + if (IS_DG1(i915) || IS_DG2(i915)) + hwmon->rg.gt_perf_status = GEN12_RPSTAT1; + else + hwmon->rg.gt_perf_status = INVALID_MMIO_REG;
Re: [PATCH v2 07/10] drm/ofdrm: Add ofdrm for Open Firmware framebuffers
Hi Am 26.07.22 um 15:17 schrieb Javier Martinez Canillas: Hello Thomas, On 7/20/22 16:27, Thomas Zimmermann wrote: Open Firmware provides basic display output via the 'display' node. DT platform code already provides a device that represents the node's framebuffer. Add a DRM driver for the device. The display mode and color format is pre-initialized by the system's firmware. Runtime modesetting via DRM is not possible. The display is useful during early boot stages or as error fallback. I'm not familiar with OF display but the driver looks good to me. Reviewed-by: Javier Martinez Canillas I just have a few questions below. [...] +static int ofdrm_primary_plane_helper_atomic_check(struct drm_plane *plane, + struct drm_atomic_state *new_state) +{ + struct drm_plane_state *new_plane_state = drm_atomic_get_new_plane_state(new_state, plane); + struct drm_crtc_state *new_crtc_state; + int ret; + + if (!new_plane_state->fb) + return 0; + + new_crtc_state = drm_atomic_get_new_crtc_state(new_state, new_plane_state->crtc); + + ret = drm_atomic_helper_check_plane_state(new_plane_state, new_crtc_state, + DRM_PLANE_HELPER_NO_SCALING, + DRM_PLANE_HELPER_NO_SCALING, + false, false); + if (ret) + return ret; + + return 0; +} This seems to be exactly the same check than used in the simpledrm driver. Maybe could be moved to the fwfb helper library too ? I've meanwhile dropped fwfb helpers. Color management requires specific code here, so there's no much to share anyway. [...] + +static void ofdrm_crtc_helper_atomic_disable(struct drm_crtc *crtc, +struct drm_atomic_state *old_state) +{ + /* +* Always enabled; disabling clears the screen in the +* primary plane's atomic_disable function. +*/ +} + Same comment than for simpledrm, are these no-op helpers really needed ? In simpledrm, I've meanwhile removed them. I'll do so here as well. [...] +static const struct of_device_id ofdrm_of_match_display[] = { + { .compatible = "display", }, + { }, +}; +MODULE_DEVICE_TABLE(of, ofdrm_of_match_display); + I don't see a binding for this in Documentation/devicetree/bindings/display. Do we need one or it's that only required for FDT and not Open Firmware DT ? No idea. The device is being created in drivers/of/platform.c. If offb didn't need these bindings, ofdrm probably won't need them either. Best regards Thomas -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev OpenPGP_signature Description: OpenPGP digital signature
Re: [Intel-gfx] [PATCH 3/7] drm/i915/hwmon: Power PL1 limit and TDP setting
On 21/09/2022 01:02, Dixit, Ashutosh wrote: On Fri, 16 Sep 2022 08:00:50 -0700, Badal Nilawar wrote: diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon index e2974f928e58..bc061238e35c 100644 --- a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon +++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon @@ -5,3 +5,23 @@ Contact: dri-devel@lists.freedesktop.org Description: RO. Current Voltage in millivolt. Only supported for particular Intel i915 graphics platforms. + +What: /sys/devices/.../hwmon/hwmon/power1_max +Date: September 2022 +KernelVersion: 6 Maybe we should ask someone but even if we merge this today to drm-tip this will appear in kernel.org Linus' version only in 6.2. So I think we should set this as 6.2 on all patches. Correct, if merged today it will appear in 6.2 so please change to that before merging. As for the date that's harder to predict and I am not really sure how best to handle it. Crystal ball predicts February 2023 fwiw so maybe go with that for now. Seems less important than the release for me anyway. Regards, Tvrtko Except for this, thanks for making the changes, this is: Reviewed-by: Ashutosh Dixit
Re: [PATCH v2 08/10] drm/ofdrm: Add CRTC state
Hi Am 26.07.22 um 15:36 schrieb Javier Martinez Canillas: On 7/20/22 16:27, Thomas Zimmermann wrote: Add a dedicated CRTC state to ofdrm to later store information for palette updates. Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/tiny/ofdrm.c | 62 ++-- Reviewed-by: Javier Martinez Canillas [...] +static void ofdrm_crtc_reset(struct drm_crtc *crtc) +{ + struct ofdrm_crtc_state *ofdrm_crtc_state; + + if (crtc->state) { + ofdrm_crtc_state_destroy(to_ofdrm_crtc_state(crtc->state)); + crtc->state = NULL; /* must be set to NULL here */ + } + + ofdrm_crtc_state = kzalloc(sizeof(*ofdrm_crtc_state), GFP_KERNEL); + if (!ofdrm_crtc_state) + return; + __drm_atomic_helper_crtc_reset(crtc, &ofdrm_crtc_state->base); +} + IMO this function is hard to read, I would instead write it as following: static void ofdrm_crtc_reset(struct drm_crtc *crtc) { struct ofdrm_crtc_state *ofdrm_crtc_state = kzalloc(sizeof(*ofdrm_crtc_state), GFP_KERNEL); if (!ofdrm_crtc_state) return; if (crtc->state) { ofdrm_crtc_state_destroy(to_ofdrm_crtc_state(crtc->state)); crtc->state = NULL; /* must be set to NULL here */ } __drm_atomic_helper_crtc_reset(crtc, &ofdrm_crtc_state->base); } Also with that form I think that the crtc->state = NULL could just be dropped ? I once had to add this line to a driver to make the DRM helpers work. But I cannot find any longer why. Maybe it's been resolved meanwhile. Best regards Thomas -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH 3/7] drm/i915/hwmon: Power PL1 limit and TDP setting
On 9/16/2022 8:30 PM, Badal Nilawar wrote: From: Dale B Stimson Use i915 HWMON to display/modify dGfx power PL1 limit and TDP setting. v2: - Fix review comments (Ashutosh) - Do not restore power1_max upon module unload/load sequence because on production systems modules are always loaded and not unloaded/reloaded (Ashutosh) - Fix review comments (Jani) - Remove endianness conversion (Ashutosh) v3: Add power1_rated_max (Ashutosh) v4: - Use macro HWMON_CHANNEL_INFO to define power channel (Guenter) - Update the date and kernel version in Documentation (Badal) v5: Use hwm_ prefix for static functions (Ashutosh) v6: - Fix review comments (Ashutosh) - Update date, kernel version in documentation Cc: Guenter Roeck Signed-off-by: Dale B Stimson Signed-off-by: Ashutosh Dixit Signed-off-by: Riana Tauro Signed-off-by: Badal Nilawar Acked-by: Guenter Roeck --- .../ABI/testing/sysfs-driver-intel-i915-hwmon | 20 +++ drivers/gpu/drm/i915/i915_hwmon.c | 158 +- drivers/gpu/drm/i915/i915_reg.h | 5 + drivers/gpu/drm/i915/intel_mchbar_regs.h | 6 + 4 files changed, 187 insertions(+), 2 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon index e2974f928e58..bc061238e35c 100644 --- a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon +++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon @@ -5,3 +5,23 @@ Contact: dri-devel@lists.freedesktop.org Description: RO. Current Voltage in millivolt. Only supported for particular Intel i915 graphics platforms. + +What: /sys/devices/.../hwmon/hwmon/power1_max +Date: September 2022 +KernelVersion: 6 +Contact: dri-devel@lists.freedesktop.org +Description: RW. Card reactive sustained (PL1/Tau) power limit in microwatts. + + The power controller will throttle the operating frequency + if the power averaged over a window (typically seconds) + exceeds this limit. + + Only supported for particular Intel i915 graphics platforms. + +What: /sys/devices/.../hwmon/hwmon/power1_rated_max +Date: September 2022 +KernelVersion: 6 +Contact: dri-devel@lists.freedesktop.org +Description: RO. Card default power limit (default TDP setting). + + Only supported for particular Intel i915 graphics platforms. diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c index 45745afa5c5b..5183cf51a49b 100644 --- a/drivers/gpu/drm/i915/i915_hwmon.c +++ b/drivers/gpu/drm/i915/i915_hwmon.c @@ -16,11 +16,16 @@ /* * SF_* - scale factors for particular quantities according to hwmon spec. * - voltage - millivolts + * - power - microwatts */ #define SF_VOLTAGE1000 +#define SF_POWER 100 struct hwm_reg { i915_reg_t gt_perf_status; + i915_reg_t pkg_power_sku_unit; + i915_reg_t pkg_power_sku; + i915_reg_t pkg_rapl_limit; }; struct hwm_drvdata { @@ -34,10 +39,68 @@ struct i915_hwmon { struct hwm_drvdata ddat; struct mutex hwmon_lock;/* counter overflow logic and rmw */ struct hwm_reg rg; + int scl_shift_power; }; +static void +hwm_locked_with_pm_intel_uncore_rmw(struct hwm_drvdata *ddat, + i915_reg_t reg, u32 clear, u32 set) +{ + struct i915_hwmon *hwmon = ddat->hwmon; + struct intel_uncore *uncore = ddat->uncore; + intel_wakeref_t wakeref; + + mutex_lock(&hwmon->hwmon_lock); + + with_intel_runtime_pm(uncore->rpm, wakeref) + intel_uncore_rmw(uncore, reg, clear, set); + + mutex_unlock(&hwmon->hwmon_lock); +} + +/* + * This function's return type of u64 allows for the case where the scaling + * of the field taken from the 32-bit register value might cause a result to + * exceed 32 bits. + */ +static u64 +hwm_field_read_and_scale(struct hwm_drvdata *ddat, i915_reg_t rgadr, +u32 field_msk, int nshift, u32 scale_factor) +{ + struct intel_uncore *uncore = ddat->uncore; + intel_wakeref_t wakeref; + u32 reg_value; + + with_intel_runtime_pm(uncore->rpm, wakeref) + reg_value = intel_uncore_read(uncore, rgadr); + + reg_value = REG_FIELD_GET(field_msk, reg_value); + + return mul_u64_u32_shr(reg_value, scale_factor, nshift); +} + +static void +hwm_field_scale_and_write(struct hwm_drvdata *ddat, i915_reg_t rgadr, + u32 field_msk, int nshift, + unsigned int scale_factor, long lval) +{ + u32 nval; + u32 bits_to_clear; + u32 bits_to_set; + + /* Computation in 64-bits to avoid overflow. Round to nearest. */ + nval = DIV_ROUND_CLOSEST_ULL((u64)lval << nshift, scale_factor); + + bits_to_clear = field_msk; + bits_t
Re: [PATCH v9 09/10] leds: flash: mt6370: Add MediaTek MT6370 flashlight support
On Wed, Sep 21, 2022 at 4:48 AM ChiaEn Wu wrote: > On Sun, Sep 18, 2022 at 3:22 AM Han Jingoo wrote: > > On Mon, Aug 29, 2022 ChiaEn Wu wrote: > > > +#define MT6370_ITORCH_MIN_uA 25000 > > > +#define MT6370_ITORCH_STEP_uA 12500 > > > +#define MT6370_ITORCH_MAX_uA 40 > > > +#define MT6370_ITORCH_DOUBLE_MAX_uA80 > > > +#define MT6370_ISTRB_MIN_uA5 > > > +#define MT6370_ISTRB_STEP_uA 12500 > > > +#define MT6370_ISTRB_MAX_uA150 > > > +#define MT6370_ISTRB_DOUBLE_MAX_uA 300 > > > > Use upper letters as below: For microseconds (and other -seconds) the common practice (I assume historically) is to use upper letters, indeed. But for current it's more natural to use small letters for unit multiplier as it's easier to read and understand. > > #define MT6370_ITORCH_MIN_UA 25000 > > #define MT6370_ITORCH_STEP_UA 12500 > > #define MT6370_ITORCH_MAX_UA 40 > > #define MT6370_ITORCH_DOUBLE_MAX_UA80 > > #define MT6370_ISTRB_MIN_UA5 > > #define MT6370_ISTRB_STEP_UA 12500 > > #define MT6370_ISTRB_MAX_UA150 > > #define MT6370_ISTRB_DOUBLE_MAX_UA 300 > > > > > +#define MT6370_STRBTO_MIN_US 64000 > > > +#define MT6370_STRBTO_STEP_US 32000 > > > +#define MT6370_STRBTO_MAX_US 2432000 > > Hi Jingoo, > > This coding style is in accordance with Andy's opinion in this mail: > https://lore.kernel.org/linux-arm-kernel/CAHp75Vciq4M4kVrabNV9vTLLcd1vR=bme8jledaf9mkrtpc...@mail.gmail.com/ True. -- With Best Regards, Andy Shevchenko
Re: [PATCH 4/7] drm/i915/hwmon: Show device level energy usage
On 9/16/2022 8:30 PM, Badal Nilawar wrote: From: Dale B Stimson Use i915 HWMON to display device level energy input. v2: - Updated the date and kernel version in feature description v3: - Cleaned up hwm_energy function and removed unused function i915_hwmon_energy_status_get (Ashutosh) - Updated date, kernel version in documentation Signed-off-by: Dale B Stimson Signed-off-by: Ashutosh Dixit Signed-off-by: Riana Tauro Signed-off-by: Badal Nilawar Acked-by: Guenter Roeck Reviewed-by: Ashutosh Dixit --- .../ABI/testing/sysfs-driver-intel-i915-hwmon | 8 ++ drivers/gpu/drm/i915/i915_hwmon.c | 107 +- drivers/gpu/drm/i915/i915_hwmon.h | 1 + drivers/gpu/drm/i915/intel_mchbar_regs.h | 2 + 4 files changed, 116 insertions(+), 2 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon index bc061238e35c..94101f818a70 100644 --- a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon +++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon @@ -25,3 +25,11 @@ Contact: dri-devel@lists.freedesktop.org Description: RO. Card default power limit (default TDP setting). Only supported for particular Intel i915 graphics platforms. + +What: /sys/devices/.../hwmon/hwmon/energy1_input +Date: September 2022 +KernelVersion: 6 +Contact: dri-devel@lists.freedesktop.org +Description: RO. Energy input of device in microjoules. + + Only supported for particular Intel i915 graphics platforms. diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c index 5183cf51a49b..a42cfad78bef 100644 --- a/drivers/gpu/drm/i915/i915_hwmon.c +++ b/drivers/gpu/drm/i915/i915_hwmon.c @@ -17,21 +17,30 @@ * SF_* - scale factors for particular quantities according to hwmon spec. * - voltage - millivolts * - power - microwatts + * - energy - microjoules */ #define SF_VOLTAGE1000 #define SF_POWER 100 +#define SF_ENERGY 100 struct hwm_reg { i915_reg_t gt_perf_status; i915_reg_t pkg_power_sku_unit; i915_reg_t pkg_power_sku; i915_reg_t pkg_rapl_limit; + i915_reg_t energy_status_all; +}; + +struct hwm_energy_info { + u32 reg_val_prev; + long accum_energy; /* Accumulated energy for energy1_input */ }; struct hwm_drvdata { struct i915_hwmon *hwmon; struct intel_uncore *uncore; struct device *hwmon_dev; + struct hwm_energy_info ei; /* Energy info for energy1_input */ char name[12]; }; @@ -40,6 +49,7 @@ struct i915_hwmon { struct mutex hwmon_lock;/* counter overflow logic and rmw */ struct hwm_reg rg; int scl_shift_power; + int scl_shift_energy; }; static void @@ -98,9 +108,60 @@ hwm_field_scale_and_write(struct hwm_drvdata *ddat, i915_reg_t rgadr, bits_to_clear, bits_to_set); } +/* + * hwm_energy - Obtain energy value + * + * The underlying energy hardware register is 32-bits and is subject to + * overflow. How long before overflow? For example, with an example + * scaling bit shift of 14 bits (see register *PACKAGE_POWER_SKU_UNIT) and + * a power draw of 1000 watts, the 32-bit counter will overflow in + * approximately 4.36 minutes. + * + * Examples: + *1 watt: (2^32 >> 14) /1 W / (60 * 60 * 24) secs/day -> 3 days + * 1000 watts: (2^32 >> 14) / 1000 W / 60 secs/min -> 4.36 minutes + * + * The function significantly increases overflow duration (from 4.36 + * minutes) by accumulating the energy register into a 'long' as allowed by + * the hwmon API. Using x86_64 128 bit arithmetic (see mul_u64_u32_shr()), + * a 'long' of 63 bits, SF_ENERGY of 1e6 (~20 bits) and + * hwmon->scl_shift_energy of 14 bits we have 57 (63 - 20 + 14) bits before + * energy1_input overflows. This at 1000 W is an overflow duration of 278 years. + */ +static int +hwm_energy(struct hwm_drvdata *ddat, long *energy) +{ + struct intel_uncore *uncore = ddat->uncore; + struct i915_hwmon *hwmon = ddat->hwmon; + struct hwm_energy_info *ei = &ddat->ei; + intel_wakeref_t wakeref; + i915_reg_t rgaddr; + u32 reg_val; + + rgaddr = hwmon->rg.energy_status_all; + + mutex_lock(&hwmon->hwmon_lock); + + with_intel_runtime_pm(uncore->rpm, wakeref) + reg_val = intel_uncore_read(uncore, rgaddr); + + if (reg_val >= ei->reg_val_prev) + ei->accum_energy += reg_val - ei->reg_val_prev; + else + ei->accum_energy += UINT_MAX - ei->reg_val_prev + reg_val; + ei->reg_val_prev = reg_val; + + *energy = mul_u64_u32_shr(ei->accum_energy, SF_ENERGY, + hwmon->scl_shift_energy); + mutex_unlock(&hwmon->hwmon_lock)
RE: [PATCH] virt: acrn: obtain pa from VMA with PFNMAP flag
Hi Daniel, Thank you for this info, we will fix this issue. Almost miss this mail, sorry! -Yonghua > -Original Message- > From: Daniel Vetter > Sent: Wednesday, August 10, 2022 20:20 > To: Huang, Yonghua > Cc: gre...@linuxfoundation.org; linux-ker...@vger.kernel.org; > sta...@vger.kernel.org; Chatre, Reinette ; Wang, > Zhi A ; Wang, Yu1 ; Li, Fei1 > ; Linux MM ; DRI Development de...@lists.freedesktop.org> > Subject: Re: [PATCH] virt: acrn: obtain pa from VMA with PFNMAP flag > > On Mon, Feb 28, 2022 at 05:22:12AM +0300, Yonghua Huang wrote: > > acrn_vm_ram_map can't pin the user pages with VM_PFNMAP flag by > > calling get_user_pages_fast(), the PA(physical pages) may be mapped > > by kernel driver and set PFNMAP flag. > > > > This patch fixes logic to setup EPT mapping for PFN mapped RAM region > > by checking the memory attribute before adding EPT mapping for them. > > > > Fixes: 88f537d5e8dd ("virt: acrn: Introduce EPT mapping management") > > Signed-off-by: Yonghua Huang > > Signed-off-by: Fei Li > > --- > > drivers/virt/acrn/mm.c | 24 > > 1 file changed, 24 insertions(+) > > > > diff --git a/drivers/virt/acrn/mm.c b/drivers/virt/acrn/mm.c index > > c4f2e15c8a2b..3b1b1e7a844b 100644 > > --- a/drivers/virt/acrn/mm.c > > +++ b/drivers/virt/acrn/mm.c > > @@ -162,10 +162,34 @@ int acrn_vm_ram_map(struct acrn_vm *vm, struct > acrn_vm_memmap *memmap) > > void *remap_vaddr; > > int ret, pinned; > > u64 user_vm_pa; > > + unsigned long pfn; > > + struct vm_area_struct *vma; > > > > if (!vm || !memmap) > > return -EINVAL; > > > > + mmap_read_lock(current->mm); > > + vma = vma_lookup(current->mm, memmap->vma_base); > > + if (vma && ((vma->vm_flags & VM_PFNMAP) != 0)) { > > + if ((memmap->vma_base + memmap->len) > vma->vm_end) { > > + mmap_read_unlock(current->mm); > > + return -EINVAL; > > + } > > + > > + ret = follow_pfn(vma, memmap->vma_base, &pfn); > > This races, don't use follow_pfn() and most definitely don't add new users. In > some cases follow_pte, but the pte/pfn is still only valid for as long as you > hold > the pte spinlock. > > > + mmap_read_unlock(current->mm); > > Definitely after here there's zero guarantees about this pfn and it could > point at > anything. > > Please fix, I tried pretty hard to get rid of follow_pfn(), but some of them > are > just too hard to fix (e.g. kvm needs a pretty hug rewrite to get it all > sorted). > > Cheers, Daniel > > > + if (ret < 0) { > > + dev_dbg(acrn_dev.this_device, > > + "Failed to lookup PFN at VMA:%pK.\n", (void > *)memmap->vma_base); > > + return ret; > > + } > > + > > + return acrn_mm_region_add(vm, memmap->user_vm_pa, > > +PFN_PHYS(pfn), memmap->len, > > +ACRN_MEM_TYPE_WB, memmap->attr); > > + } > > + mmap_read_unlock(current->mm); > > + > > /* Get the page number of the map region */ > > nr_pages = memmap->len >> PAGE_SHIFT; > > pages = vzalloc(nr_pages * sizeof(struct page *)); > > > > base-commit: 73878e5eb1bd3c9656685ca60bc3a49d17311e0c > > -- > > 2.25.1 > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
Re: [PATCH 2/2] drm/panfrost: replace endian-specific types with generic ones
> > Or of course we could just actually use native endian and detect from > > the magic which endian is in use. That would require ripping out the > > cpu_to_lexx() calls in Linux and making the user space tool more > > intelligent. I'm happy with that, but it's pushing the complexity onto Mesa. > > If there's a clearly identifiable header, then I'd say making the whole dump > native-endian is probably the way to go. Unless and until anyone actually > demands to be able to do cross-endian post-mortem GPU debugging, the > realistic extent of the complexity in Mesa is that it doesn't recognise the > foreign dump format and gives up, which I assume is already implemented :) +1 to this solution. Gets the complexity out of both kernel and Mesa, and in the vanishingly unlikely scenario that we need this functionality, we can add it to Mesa without kernel changes. As mesa panfrost maintainer I'll take those odds :+1:
Re: [PATCH v2 09/10] drm/ofdrm: Add per-model device function
Hi Am 05.08.22 um 02:22 schrieb Benjamin Herrenschmidt: On Tue, 2022-07-26 at 16:40 +0200, Michal Suchánek wrote: Hello, On Tue, Jul 26, 2022 at 03:38:37PM +0200, Javier Martinez Canillas wrote: On 7/20/22 16:27, Thomas Zimmermann wrote: Add a per-model device-function structure in preparation of adding color-management support. Detection of the individual models has been taken from fbdev's offb. Signed-off-by: Thomas Zimmermann --- Reviewed-by: Javier Martinez Canillas [...] +static bool is_avivo(__be32 vendor, __be32 device) +{ + /* This will match most R5xx */ + return (vendor == 0x1002) && + ((device >= 0x7100 && device < 0x7800) || (device >= 0x9400)); +} Maybe add some constant macros to not have these magic numbers ? This is based on the existing fbdev implementation's magic numbers: drivers/video/fbdev/offb.c: ((*did >= 0x7100 && *did < 0x7800) || Of course, it would be great if somebody knowledgeable could clarify those. I don't think anybody remembers :-) Vendor 0x1002 is PCI_VENDOR_ID_ATI, I do :) but the rest is basically ranges of PCI IDs for which we don't have symbolic constants. Should we add them to the official list in pci_ids.h? I cannot find 0x7800. The others are R520 and R600. Best regards Thomas Cheers, Ben. -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev OpenPGP_signature Description: OpenPGP digital signature
Re: [Intel-gfx] [PATCH 1/7] drm/i915/hwmon: Add HWMON infrastructure
Hi Badal, > +struct hwm_reg { > +}; > + > +struct hwm_drvdata { > + struct i915_hwmon *hwmon; > + struct intel_uncore *uncore; > + struct device *hwmon_dev; > + char name[12]; > +}; > + > +struct i915_hwmon { > + struct hwm_drvdata ddat; > + struct mutex hwmon_lock;/* counter overflow logic and > rmw */ > + struct hwm_reg rg; > +}; > + > +static const struct hwmon_channel_info *hwm_info[] = { > + NULL > +}; > + > +static umode_t > +hwm_is_visible(const void *drvdata, enum hwmon_sensor_types type, > +u32 attr, int channel) > +{ > + switch (type) { > + default: > + return 0; > + } > +} > + > +static int > +hwm_read(struct device *dev, enum hwmon_sensor_types type, u32 attr, > + int channel, long *val) > +{ > + switch (type) { > + default: > + return -EOPNOTSUPP; > + } > +} > + > +static int > +hwm_write(struct device *dev, enum hwmon_sensor_types type, u32 attr, > + int channel, long val) > +{ > + switch (type) { > + default: > + return -EOPNOTSUPP; > + } > +} > + > +static const struct hwmon_ops hwm_ops = { > + .is_visible = hwm_is_visible, > + .read = hwm_read, > + .write = hwm_write, > +}; > + > +static const struct hwmon_chip_info hwm_chip_info = { > + .ops = &hwm_ops, > + .info = hwm_info, > +}; what's the point for splitting so much? Can't you just send the hwmon driver all at once? With this patch you are not actually doing anything useful. In my opinion this should be squashed with the next ones. > +static void > +hwm_get_preregistration_info(struct drm_i915_private *i915) > +{ > +} > + > +void i915_hwmon_register(struct drm_i915_private *i915) > +{ > + struct device *dev = i915->drm.dev; > + struct i915_hwmon *hwmon; > + struct device *hwmon_dev; > + struct hwm_drvdata *ddat; > + > + /* hwmon is available only for dGfx */ > + if (!IS_DGFX(i915)) > + return; > + > + hwmon = kzalloc(sizeof(*hwmon), GFP_KERNEL); why don't we use devm_kzalloc? > + if (!hwmon) > + return; > + > + i915->hwmon = hwmon; > + mutex_init(&hwmon->hwmon_lock); > + ddat = &hwmon->ddat; > + > + ddat->hwmon = hwmon; > + ddat->uncore = &i915->uncore; > + snprintf(ddat->name, sizeof(ddat->name), "i915"); > + > + hwm_get_preregistration_info(i915); > + > + /* hwmon_dev points to device hwmon */ > + hwmon_dev = hwmon_device_register_with_info(dev, ddat->name, > + ddat, > + &hwm_chip_info, > + NULL); > + if (IS_ERR(hwmon_dev)) { > + mutex_destroy(&hwmon->hwmon_lock); there is not such a big need to destroy the mutex. Destroying mutexes is more useful when you actually are creating/destroying and there is some debug need. I don't think that's the case. With the devm_kzalloc this would be just a return. Andi > + i915->hwmon = NULL; > + kfree(hwmon); > + return; > + } > + > + ddat->hwmon_dev = hwmon_dev; > +} > + > +void i915_hwmon_unregister(struct drm_i915_private *i915) > +{ > + struct i915_hwmon *hwmon; > + struct hwm_drvdata *ddat; > + > + hwmon = fetch_and_zero(&i915->hwmon); > + if (!hwmon) > + return; > + > + ddat = &hwmon->ddat; > + if (ddat->hwmon_dev) > + hwmon_device_unregister(ddat->hwmon_dev); > + > + mutex_destroy(&hwmon->hwmon_lock); > + kfree(hwmon); > +} > diff --git a/drivers/gpu/drm/i915/i915_hwmon.h > b/drivers/gpu/drm/i915/i915_hwmon.h > new file mode 100644 > index ..7ca9cf2c34c9 > --- /dev/null > +++ b/drivers/gpu/drm/i915/i915_hwmon.h > @@ -0,0 +1,20 @@ > +/* SPDX-License-Identifier: MIT */ > + > +/* > + * Copyright © 2022 Intel Corporation > + */ > + > +#ifndef __I915_HWMON_H__ > +#define __I915_HWMON_H__ > + > +struct drm_i915_private; > + > +#if IS_REACHABLE(CONFIG_HWMON) > +void i915_hwmon_register(struct drm_i915_private *i915); > +void i915_hwmon_unregister(struct drm_i915_private *i915); > +#else > +static inline void i915_hwmon_register(struct drm_i915_private *i915) { }; > +static inline void i915_hwmon_unregister(struct drm_i915_private *i915) { }; > +#endif > + > +#endif /* __I915_HWMON_H__ */ > -- > 2.25.1
Re: [PATCH v2 10/10] drm/ofdrm: Support color management
Hi Am 05.08.22 um 02:19 schrieb Benjamin Herrenschmidt: On Wed, 2022-07-20 at 16:27 +0200, Thomas Zimmermann wrote: +#if !defined(CONFIG_PPC) +static inline void out_8(void __iomem *addr, int val) +{ } +static inline void out_le32(void __iomem *addr, int val) +{ } +static inline unsigned int in_le32(const void __iomem *addr) +{ + return 0; +} +#endif These guys could just be replaced with readb/writel/readl respectively (beware of the argument swap). I only added them for COMPILE_TEST. There appears to be no portable interface that implements out_le32() and in_le32()? Best regards Thomas Cheers, Ben. -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH v6,2/3] drm: mediatek: Adjust the dpi output format to MT8186
On Wed, 2022-09-21 at 09:35 +0800, CK Hu wrote: > Hi, Xinlei: > > On Wed, 2022-09-14 at 21:21 +0800, xinlei@mediatek.com wrote: > > From: Xinlei Lee > > > > Dpi output needs to adjust the output format to dual edge for > > MT8186. > > The bridge ic on MT8186 uses the output format of > > RGB888_dual_edge. > > I think different sink ic may support different output format, so > query > the sink information to decide which outout format. > > > Due > > to hardware changes, we need to modify the output format > > corresponding > > to the mmsys register. > > > > Co-developed-by: Jitao Shi > > Signed-off-by: Jitao Shi > > Signed-off-by: Xinlei Lee > > Reviewed-by: AngeloGioacchino Del Regno < > > angelogioacchino.delre...@collabora.com> > > Reviewed-by: Nís F. R. A. Prado > > --- > > drivers/gpu/drm/mediatek/mtk_dpi.c | 14 ++ > > 1 file changed, 14 insertions(+) > > > > diff --git a/drivers/gpu/drm/mediatek/mtk_dpi.c > > b/drivers/gpu/drm/mediatek/mtk_dpi.c > > index fb0b79704636..6e02f02f163c 100644 > > --- a/drivers/gpu/drm/mediatek/mtk_dpi.c > > +++ b/drivers/gpu/drm/mediatek/mtk_dpi.c > > @@ -14,6 +14,7 @@ > > #include > > #include > > #include > > +#include > > #include > > > > #include > > @@ -28,6 +29,7 @@ > > #include "mtk_disp_drv.h" > > #include "mtk_dpi_regs.h" > > #include "mtk_drm_ddp_comp.h" > > +#include "mtk_drm_drv.h" > > > > enum mtk_dpi_out_bit_num { > > MTK_DPI_OUT_BIT_NUM_8BITS, > > @@ -58,6 +60,11 @@ enum mtk_dpi_out_color_format { > > MTK_DPI_COLOR_FORMAT_YCBCR_422 > > }; > > > > +enum mtk_dpi_out_format_con { > > + MTK_DPI_RGB888_DDR_CON, > > + MTK_DPI_RGB565_SDR_CON > > +}; > > + > > struct mtk_dpi { > > struct drm_encoder encoder; > > struct drm_bridge bridge; > > @@ -80,6 +87,7 @@ struct mtk_dpi { > > struct pinctrl_state *pins_dpi; > > u32 output_fmt; > > int refcount; > > + struct device *mmsys_dev; > > }; > > > > static inline struct mtk_dpi *bridge_to_dpi(struct drm_bridge *b) > > @@ -133,6 +141,7 @@ struct mtk_dpi_yc_limit { > > * @yuv422_en_bit: Enable bit of yuv422. > > * @csc_enable_bit: Enable bit of CSC. > > * @pixels_per_iter: Quantity of transferred pixels per iteration. > > + * @edge_cfg_in_mmsys: If the edge configuration for DPI's output > > needs to be set in MMSYS. > > */ > > struct mtk_dpi_conf { > > unsigned int (*cal_factor)(int clock); > > @@ -151,6 +160,7 @@ struct mtk_dpi_conf { > > u32 yuv422_en_bit; > > u32 csc_enable_bit; > > u32 pixels_per_iter; > > + bool edge_cfg_in_mmsys; > > }; > > > > static void mtk_dpi_mask(struct mtk_dpi *dpi, u32 offset, u32 val, > > u32 mask) > > @@ -447,6 +457,8 @@ static void mtk_dpi_dual_edge(struct mtk_dpi > > *dpi) > > mtk_dpi_mask(dpi, DPI_OUTPUT_SETTING, > > dpi->output_fmt == > > MEDIA_BUS_FMT_RGB888_2X12_LE ? > > EDGE_SEL : 0, EDGE_SEL); > > + if (dpi->conf->edge_cfg_in_mmsys) > > + mtk_mmsys_ddp_dpi_fmt_config(dpi->mmsys_dev, > > MTK_DPI_RGB888_DDR_CON); > > Why do you set a DPI driver defined value MTK_DPI_RGB888_DDR_CON into > mmsys driver? I think you should set a value which mmsys driver > understand. > > Regards, > CK > > > } else { > > mtk_dpi_mask(dpi, DPI_DDR_SETTING, DDR_EN | DDR_4PHASE, > > 0); > > } > > @@ -776,8 +788,10 @@ static int mtk_dpi_bind(struct device *dev, > > struct device *master, void *data) > > { > > struct mtk_dpi *dpi = dev_get_drvdata(dev); > > struct drm_device *drm_dev = data; > > + struct mtk_drm_private *priv = drm_dev->dev_private; > > int ret; > > > > + dpi->mmsys_dev = priv->mmsys_dev; > > ret = drm_simple_encoder_init(drm_dev, &dpi->encoder, > > DRM_MODE_ENCODER_TMDS); > > if (ret) { > > Hi CK: Thanks for your review. Yes, different sink ICs may support other output formats. The current DRM architecture supports retrieving the output formats of all bridges (implemented through .atomic_check & .atomic_get_output_bus_fmts & .atomic_get_input_bus_fmts in dpi). If no unified output format is found It will use the default format output of soc (MEDIA_BUS_FMT_RGB888_2X12_LE is used in mt8186). The difference between MT8186 and other ICs is that when modifying the output format, we need to modify the mmsys_base + 0x400 register to take effect. Therefore, if there are other format sink ICs (RGB888_DDR/RGB888_SDR) in the future, the sink IC needs to add the func implementation mentioned above needs to be added. And the drm architecture will select the appropriate format to change the dpi output. I will put the registers that control mmsys in mtk-mmsys.h in the next release. Best Regards! Xinlei
Re: [PATCH v6,3/3] drm: mediatek: Add mt8186 dpi compatible to mtk_dpi.c
On Wed, 2022-09-21 at 09:42 +0800, CK Hu wrote: > Hi, Xinlei: > > On Wed, 2022-09-14 at 21:21 +0800, xinlei@mediatek.com wrote: > > From: Xinlei Lee > > > > Add the compatible because use edge_cfg_in_mmsys in mt8186. > > > > Signed-off-by: Xinlei Lee > > --- > > drivers/gpu/drm/mediatek/mtk_dpi.c | 21 + > > drivers/gpu/drm/mediatek/mtk_drm_drv.c | 2 ++ > > 2 files changed, 23 insertions(+) > > > > diff --git a/drivers/gpu/drm/mediatek/mtk_dpi.c > > b/drivers/gpu/drm/mediatek/mtk_dpi.c > > index 6e02f02f163c..52bcc4114afd 100644 > > --- a/drivers/gpu/drm/mediatek/mtk_dpi.c > > +++ b/drivers/gpu/drm/mediatek/mtk_dpi.c > > @@ -942,6 +942,24 @@ static const struct mtk_dpi_conf mt8183_conf = > > { > > .csc_enable_bit = CSC_ENABLE, > > }; > > > > +static const struct mtk_dpi_conf mt8186_conf = { > > + .cal_factor = mt8183_calculate_factor, > > + .reg_h_fre_con = 0xe0, > > + .max_clock_khz = 15, > > + .output_fmts = mt8183_output_fmts, > > + .num_output_fmts = ARRAY_SIZE(mt8183_output_fmts), > > + .edge_cfg_in_mmsys = true, > > According to the previous review, edge_cfg_in_mmsys is not necessary, > so mt8186_conf is identical to mt8192_conf. > > Regards, > CK > > > + .pixels_per_iter = 1, > > + .is_ck_de_pol = true, > > + .swap_input_support = true, > > + .support_direct_pin = true, > > + .dimension_mask = HPW_MASK, > > + .hvsize_mask = HSIZE_MASK, > > + .channel_swap_shift = CH_SWAP, > > + .yuv422_en_bit = YUV422_EN, > > + .csc_enable_bit = CSC_ENABLE, > > +}; > > + > > static const struct mtk_dpi_conf mt8192_conf = { > > .cal_factor = mt8183_calculate_factor, > > .reg_h_fre_con = 0xe0, > > @@ -1092,6 +1110,9 @@ static const struct of_device_id > > mtk_dpi_of_ids[] = { > > { .compatible = "mediatek,mt8183-dpi", > > .data = &mt8183_conf, > > }, > > + { .compatible = "mediatek,mt8186-dpi", > > + .data = &mt8186_conf, > > + }, > > { .compatible = "mediatek,mt8192-dpi", > > .data = &mt8192_conf, > > }, > > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c > > b/drivers/gpu/drm/mediatek/mtk_drm_drv.c > > index 546b79412815..3d32fbc66ac1 100644 > > --- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c > > +++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c > > @@ -646,6 +646,8 @@ static const struct of_device_id > > mtk_ddp_comp_dt_ids[] = { > > .data = (void *)MTK_DPI }, > > { .compatible = "mediatek,mt8183-dpi", > > .data = (void *)MTK_DPI }, > > + { .compatible = "mediatek,mt8186-dpi", > > + .data = (void *)MTK_DPI }, > > { .compatible = "mediatek,mt8192-dpi", > > .data = (void *)MTK_DPI }, > > { .compatible = "mediatek,mt8195-dp-intf", > > Hi CK: Thanks for your review. Because the difference between MT8186 and other ICs is that modifying the output format requires modifying the mmsys register to take effect. I think this mt8186_conf needs to be reserved so that we can distinguish it from the settings of other ICs when setting dpi_dual_edge. (the edge_cfg_in mmsys flag inside can prevent other ICs from setting the mmsys register) Best Regards! Xinlei
Re: [PATCH v2 00/41] drm: Analog TV Improvements
On Wed, Sep 07, 2022 at 06:44:53PM +0200, Noralf Trønnes wrote: > > > Den 07.09.2022 12.36, skrev Stefan Wahren: > > Hi Maxime, > > > > Am 05.09.22 um 16:57 schrieb Maxime Ripard: > >> On Fri, Sep 02, 2022 at 01:28:16PM +0200, Noralf Trønnes wrote: > >>> > >>> Den 01.09.2022 21.35, skrev Noralf Trønnes: > > I have finally found a workaround for my kernel hangs. > > Dom had a look at my kernel and found that the VideoCore was fine, and > he said this: > > > That suggests cause of lockup was on arm side rather than VC side. > > > > But it's hard to diagnose further. Once you've had a peripheral not > > respond, the AXI bus locks up and no further operations are possible. > > Usual causes of this are required clocks being stopped or domains > > disabled and then trying to access the hardware. > > > So when I got this on my 64-bit build: > > [ 166.702171] SError Interrupt on CPU1, code 0xbf02 -- > SError > [ 166.702187] CPU: 1 PID: 8 Comm: kworker/u8:0 Tainted: G W > 5.19.0-rc6-00096-gba7973977976-dirty #1 > [ 166.702200] Hardware name: Raspberry Pi 4 Model B Rev 1.1 (DT) > [ 166.702206] Workqueue: events_freezable_power_ > thermal_zone_device_check > [ 166.702231] pstate: 20c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS > BTYPE=--) > [ 166.702242] pc : regmap_mmio_read32le+0x10/0x28 > [ 166.702261] lr : regmap_mmio_read+0x44/0x70 > ... > [ 166.702606] bcm2711_get_temp+0x58/0xb0 [bcm2711_thermal] > > I wondered if that reg read was stalled due to a clock being stopped. > > Lo and behold, disabling runtime pm and keeping the vec clock running > all the time fixed it[1]. > > I don't know what the problem is, but at least I can now test this > patchset. > > [1] https://gist.github.com/notro/23b984e7fa05cfbda2db50a421cac065 > > >>> It turns out I didn't have to disable runtime pm: > >>> https://gist.github.com/notro/0adcfcb12460b54e54458afe11dc8ea2 > >> If the bcm2711_thermal IP needs that clock to be enabled, it should grab > >> a reference itself, but it looks like even the device tree binding > >> doesn't ask for one. > > The missing clock in the device tree binding is expected, because > > despite of the code there is not much information about the BCM2711 > > clock tree. But i'm skeptical that the AVS IP actually needs the VEC > > clock, i think it's more likely that the VEC clock parent is changed and > > that cause this issue. I could take care of the bcm2711 binding & driver > > if i know which clock is really necessary. > > Seems you're right, keeping the parent always enabled is enough: > > clk_prepare_enable(clk_get_parent(vec->clock)); // pllc_per > > I tried enabling just the grandparent clock as well, but that didn't help. Yeah, adding tracing to the clock framework shows that it indeed disables PLLC_PER. So there's probably some other device that depends on it but doesn't take a reference to it. I had a look, but it's not really obvious what that might be. This patch makes sure that the PLL*_PER are never disabled, could you test it? It seems to work for me. diff --git a/drivers/clk/bcm/clk-bcm2835.c b/drivers/clk/bcm/clk-bcm2835.c index 48a1eb9f2d55..3839261ee419 100644 --- a/drivers/clk/bcm/clk-bcm2835.c +++ b/drivers/clk/bcm/clk-bcm2835.c @@ -1675,7 +1675,7 @@ static const struct bcm2835_clk_desc clk_desc_array[] = { .load_mask = CM_PLLA_LOADPER, .hold_mask = CM_PLLA_HOLDPER, .fixed_divider = 1, - .flags = CLK_SET_RATE_PARENT), + .flags = CLK_IS_CRITICAL | CLK_SET_RATE_PARENT), [BCM2835_PLLA_DSI0] = REGISTER_PLL_DIV( SOC_ALL, .name = "plla_dsi0", @@ -1784,7 +1784,7 @@ static const struct bcm2835_clk_desc clk_desc_array[] = { .load_mask = CM_PLLC_LOADPER, .hold_mask = CM_PLLC_HOLDPER, .fixed_divider = 1, - .flags = CLK_SET_RATE_PARENT), + .flags = CLK_IS_CRITICAL | CLK_SET_RATE_PARENT), /* * PLLD is the display PLL, used to drive DSI display panels. @@ -1891,7 +1891,7 @@ static const struct bcm2835_clk_desc clk_desc_array[] = { .load_mask = CM_PLLH_LOADAUX, .hold_mask = 0, .fixed_divider = 1, - .flags = CLK_SET_RATE_PARENT), + .flags = CLK_IS_CRITICAL | CLK_SET_RATE_PARENT), [BCM2835_PLLH_PIX] = REGISTER_PLL_DIV( SOC_BCM2835, .name = "pllh_pix", Maxime signature.asc Description: PGP signature
Re: [PATCH v10 3/9] compiler_types.h: Add assert_type to catch type mis-match while compiling
On 9/13/22 3:01 PM, Kees Cook wrote: On Fri, Sep 09, 2022 at 07:59:07PM +0900, Gwan-gyeong Mun wrote: It adds assert_type and assert_typable macros to catch type mis-match while compiling. The existing typecheck() macro outputs build warnings, but the newly added assert_type() macro uses the _Static_assert() keyword (which is introduced in C11) to generate a build break when the types are different and can be used to detect explicit build errors. Unlike the assert_type() macro, assert_typable() macro allows a constant value as the second argument. Suggested-by: Kees Cook Signed-off-by: Gwan-gyeong Mun Cc: Thomas Hellström Cc: Matthew Auld Cc: Nirmoy Das Cc: Jani Nikula Cc: Andi Shyti Cc: Mauro Carvalho Chehab Cc: Andrzej Hajda Cc: Kees Cook --- include/linux/compiler_types.h | 39 ++ 1 file changed, 39 insertions(+) diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h index 4f2a819fd60a..19cc125918bb 100644 --- a/include/linux/compiler_types.h +++ b/include/linux/compiler_types.h @@ -294,6 +294,45 @@ struct ftrace_likely_data { /* Are two types/vars the same type (ignoring qualifiers)? */ #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b)) +/** + * assert_type - break compile if the first argument's data type and the second + * argument's data type are not the same + * + * @t1: data type or variable + * @t2: data type or variable + * + * The first and second arguments can be data types or variables or mixed (the + * first argument is the data type and the second argument is variable or vice + * versa). It determines whether the first argument's data type and the second + * argument's data type are the same while compiling, and it breaks compile if + * the two types are not the same. + * See also assert_typable(). + */ +#define assert_type(t1, t2) _Static_assert(__same_type(t1, t2)) + +/** + * assert_typable - break compile if the first argument's data type and the + * second argument's data type are not the same + * + * @t: data type or variable + * @n: data type or variable or constant value + * + * The first and second arguments can be data types or variables or mixed (the + * first argument is the data type and the second argument is variable or vice + * versa). Unlike the assert_type() macro, this macro allows a constant value + * as the second argument. And if the second argument is a constant value, it + * always passes. And it doesn't mean that the types are explicitly the same. + * When a constant value is used as the second argument, if you need an + * overflow check when assigning a constant value to a variable of the type of + * the first argument, you can use the overflows_type() macro. When a constant I wonder if the overflows_type() check should happen in this test? It seems weird that assert_typable(u8, 1024) would pass... Yes, that's right. If a constant is used as an argument here, it seems necessary to check whether an overflow occurs when the constant value is assigned to the target type or target variable. + * value is not used as a second argument, it determines whether the first + * argument's data type and the second argument's data type are the same while + * compiling, and it breaks compile if the two types are not the same. + * See also assert_type() and overflows_type(). + */ +#define assert_typable(t, n) _Static_assert(__builtin_constant_p(n) || \ + __same_type(t, typeof(n))) Totally untested -- I'm not sure if this gets the right semantics for constant expressoins, etc... static_assert(__builtin_choose_expression(__builtin_constant_p(n), \ overflows_type(n, typeof(t)), \ __same_type(t, typeof(n However, if we change the macro in the form below, the "error: expression in static assertion is not constant" error occurs due to the restriction [1][2] of _Static_assert() as you mentioned. ( overflows_type() internally uses the __builtin_add_overflow() builtin function [3], which returns a bool type.) #define assert_same_typable(t, n) static_assert( \ __builtin_choose_expr(__builtin_constant_p(n), \ overflows_type(n, typeof(t)) == false, \ __same_type(t, typeof(n Can I have your opinion on the new addition of overflows_type_return_const_expr(), which returns a constant value at compile time to check whether an overflow occurs when assigning a constant value to an argument type? If it is allowable to add the macro, I would try to use the macro that returns "an integer constant expression" after checking for overflow between the constant value and the argument type at compile time with reference to implemented in the previous version. [4] or [5] #define assert_same_typable(t, n) static_assert
Re: [PATCH v2 10/41] drm/modes: Add a function to generate analog display modes
Hi, Thanks again for your help On Sun, Sep 11, 2022 at 06:30:39AM +0200, kFYatek wrote: > W dniu 9.09.2022 o 16:00, Maxime Ripard pisze: > > On Wed, Sep 07, 2022 at 11:31:21PM +0200, Mateusz Kwiatkowski wrote: > >> The "canonical" modelines (at least for vc4's VEC, see the notes below): > >> > >> - (vfp==4, vsync==6, vbp==39) for 576i > >> - (vfp==7, vsync==6, vbp==32) for 480i > >> - (vfp==5, vsync==6, vbp==28) for 486i (full frame NTSC as originally > >> specified) > > > > It's not clear to me either how you come up with those timings? > > Well, experimentally ;) > > The values for 480i and 576i are the values currently used by the downstream > kernel, and those in turn has been copied from the firmware (or more > precisely, > I chose them so that the PV registers end up the same as the values set by the > firmware). > > I also checked with an oscilloscope that the waveforms look as they should. > VEC doesn't exactly handle the half-lines at the start and end of the odd > field > right, but otherwise, the blanking and sync pulses are where they should be. > > The 486i values has been constructed from the 480i ones according to the > calculations from cross-referencing SMPTE documents, see my previous e-mail. > > I know this is perhaps unsatisfactory ;) I don't have access to the VC4 > documentation, so this was _almost_ reverse engineering for me. It's not really that it's unsatisfactory, but the function here is supposed to be generic and thus generate a mode that is supposed to be a somewhat reasonable for a given set of parameters. If the vc4 driver needs some adjustments, then it needs to be out of that function and into the vc4 driver. And this is pretty much what I struggle with: I have a hard time (on top of everything else) figuring out what is supposed to be specific to vc4, and what isn't. I guess your 480i example, since it follows the spec, is fine, but I'm not sure for 576i. Maxime signature.asc Description: PGP signature
Re: [PATCH V2 2/3] dt-bindings: display: panel: Add NewVision NV3051D bindings
On Wed, Sep 21, 2022 at 08:51:34AM +0200, Krzysztof Kozlowski wrote: > On 20/09/2022 16:59, Chris Morgan wrote: > > From: Chris Morgan > > > > Add documentation for the NewVision NV3051D panel bindings. > > Note that for the two expected consumers of this panel binding > > the underlying LCD model is unknown. Name "anbernic,rg353p-panel" > > is used because the hardware itself is known as "anbernic,rg353p". > > > > Signed-off-by: Chris Morgan > > --- > > .../display/panel/newvision,nv3051d.yaml | 55 +++ > > 1 file changed, 55 insertions(+) > > create mode 100644 > > Documentation/devicetree/bindings/display/panel/newvision,nv3051d.yaml > > > > diff --git > > a/Documentation/devicetree/bindings/display/panel/newvision,nv3051d.yaml > > b/Documentation/devicetree/bindings/display/panel/newvision,nv3051d.yaml > > new file mode 100644 > > index ..d90bca4171c2 > > --- /dev/null > > +++ b/Documentation/devicetree/bindings/display/panel/newvision,nv3051d.yaml > > @@ -0,0 +1,55 @@ > > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) > > +%YAML 1.2 > > +--- > > +$id: > > https://nam12.safelinks.protection.outlook.com/?url=http%3A%2F%2Fdevicetree.org%2Fschemas%2Fdisplay%2Fpanel%2Fnewvision%2Cnv3051d.yaml%23&data=05%7C01%7C%7C844872fdf91b413aa65a08da9b9db9e7%7C84df9e7fe9f640afb435%7C1%7C0%7C637993398994635658%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=eTM2IFjR0TKPlQNYfoq3Poao8QYLSHRVaqiXtufJ7VA%3D&reserved=0 > > +$schema: > > https://nam12.safelinks.protection.outlook.com/?url=http%3A%2F%2Fdevicetree.org%2Fmeta-schemas%2Fcore.yaml%23&data=05%7C01%7C%7C844872fdf91b413aa65a08da9b9db9e7%7C84df9e7fe9f640afb435%7C1%7C0%7C637993398994635658%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=sjb7x2Z2wKu1C8mMBW1epwuXipe8V26zxpHcCAuKLZY%3D&reserved=0 > > + > > +title: NewVision NV3051D based LCD panel > > + > > +description: | > > + The NewVision NV3051D is a driver chip used to drive DSI panels. For now, > > + this driver only supports the 640x480 panels found in the Anbernic RG353 > > + based devices. > > + > > +maintainers: > > + - Chris Morgan > > + > > +allOf: > > + - $ref: panel-common.yaml# > > + > > +properties: > > + compatible: > > +items: > > + - enum: > > + - anbernic,rg353p-panel > > Are these vendor prefixs documented? Yes, they are in another patch series referenced in the cover letter. They were added for the Anbernic devicetrees and should (I believe) land in 6.1. > > > + - anbernic,rg353v-panel > > + - const: newvision,nv3051d > > Blank line. > Ack. > > + reg: true > > + backlight: true > > + port: true > > + reset-gpios: true > > + vdd-supply: > > +description: regulator that supplies the vdd voltage > > Skip description and make it just "true". It's kind of obvious. > Ack. > > + > > +required: > > + - compatible > > + - reg > > + - backlight > > + - vdd-supply > > Best regards, > Krzysztof
Re: [PATCH 3/7] drm/i915/hwmon: Power PL1 limit and TDP setting
On 21-09-2022 17:15, Gupta, Anshuman wrote: On 9/16/2022 8:30 PM, Badal Nilawar wrote: From: Dale B Stimson Use i915 HWMON to display/modify dGfx power PL1 limit and TDP setting. v2: - Fix review comments (Ashutosh) - Do not restore power1_max upon module unload/load sequence because on production systems modules are always loaded and not unloaded/reloaded (Ashutosh) - Fix review comments (Jani) - Remove endianness conversion (Ashutosh) v3: Add power1_rated_max (Ashutosh) v4: - Use macro HWMON_CHANNEL_INFO to define power channel (Guenter) - Update the date and kernel version in Documentation (Badal) v5: Use hwm_ prefix for static functions (Ashutosh) v6: - Fix review comments (Ashutosh) - Update date, kernel version in documentation Cc: Guenter Roeck Signed-off-by: Dale B Stimson Signed-off-by: Ashutosh Dixit Signed-off-by: Riana Tauro Signed-off-by: Badal Nilawar Acked-by: Guenter Roeck --- .../ABI/testing/sysfs-driver-intel-i915-hwmon | 20 +++ drivers/gpu/drm/i915/i915_hwmon.c | 158 +- drivers/gpu/drm/i915/i915_reg.h | 5 + drivers/gpu/drm/i915/intel_mchbar_regs.h | 6 + 4 files changed, 187 insertions(+), 2 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon index e2974f928e58..bc061238e35c 100644 --- a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon +++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon @@ -5,3 +5,23 @@ Contact: dri-devel@lists.freedesktop.org Description: RO. Current Voltage in millivolt. Only supported for particular Intel i915 graphics platforms. + +What: /sys/devices/.../hwmon/hwmon/power1_max +Date: September 2022 +KernelVersion: 6 +Contact: dri-devel@lists.freedesktop.org +Description: RW. Card reactive sustained (PL1/Tau) power limit in microwatts. + + The power controller will throttle the operating frequency + if the power averaged over a window (typically seconds) + exceeds this limit. + + Only supported for particular Intel i915 graphics platforms. + +What: /sys/devices/.../hwmon/hwmon/power1_rated_max +Date: September 2022 +KernelVersion: 6 +Contact: dri-devel@lists.freedesktop.org +Description: RO. Card default power limit (default TDP setting). + + Only supported for particular Intel i915 graphics platforms. diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c index 45745afa5c5b..5183cf51a49b 100644 --- a/drivers/gpu/drm/i915/i915_hwmon.c +++ b/drivers/gpu/drm/i915/i915_hwmon.c @@ -16,11 +16,16 @@ /* * SF_* - scale factors for particular quantities according to hwmon spec. * - voltage - millivolts + * - power - microwatts */ #define SF_VOLTAGE 1000 +#define SF_POWER 100 struct hwm_reg { i915_reg_t gt_perf_status; + i915_reg_t pkg_power_sku_unit; + i915_reg_t pkg_power_sku; + i915_reg_t pkg_rapl_limit; }; struct hwm_drvdata { @@ -34,10 +39,68 @@ struct i915_hwmon { struct hwm_drvdata ddat; struct mutex hwmon_lock; /* counter overflow logic and rmw */ struct hwm_reg rg; + int scl_shift_power; }; +static void +hwm_locked_with_pm_intel_uncore_rmw(struct hwm_drvdata *ddat, + i915_reg_t reg, u32 clear, u32 set) +{ + struct i915_hwmon *hwmon = ddat->hwmon; + struct intel_uncore *uncore = ddat->uncore; + intel_wakeref_t wakeref; + + mutex_lock(&hwmon->hwmon_lock); + + with_intel_runtime_pm(uncore->rpm, wakeref) + intel_uncore_rmw(uncore, reg, clear, set); + + mutex_unlock(&hwmon->hwmon_lock); +} + +/* + * This function's return type of u64 allows for the case where the scaling + * of the field taken from the 32-bit register value might cause a result to + * exceed 32 bits. + */ +static u64 +hwm_field_read_and_scale(struct hwm_drvdata *ddat, i915_reg_t rgadr, + u32 field_msk, int nshift, u32 scale_factor) +{ + struct intel_uncore *uncore = ddat->uncore; + intel_wakeref_t wakeref; + u32 reg_value; + + with_intel_runtime_pm(uncore->rpm, wakeref) + reg_value = intel_uncore_read(uncore, rgadr); + + reg_value = REG_FIELD_GET(field_msk, reg_value); + + return mul_u64_u32_shr(reg_value, scale_factor, nshift); +} + +static void +hwm_field_scale_and_write(struct hwm_drvdata *ddat, i915_reg_t rgadr, + u32 field_msk, int nshift, + unsigned int scale_factor, long lval) +{ + u32 nval; + u32 bits_to_clear; + u32 bits_to_set; + + /* Computation in 64-bits to avoid overflow. Round to nearest. */ + nval = DIV_ROUND_CLOSEST_ULL((u64)lval << nshift, scale_factor); + + bits_to_clear = field_msk; + bits_to_set = FIELD_PREP(field_msk, nval); + + hwm_locked_with_pm_intel_uncore_rmw(ddat, rgadr, + bits_to_clear, bits_to_set);
Re: [PATCH v2 10/41] drm/modes: Add a function to generate analog display modes
Hi, On Sun, Sep 11, 2022 at 06:48:50AM +0200, Mateusz Kwiatkowski wrote: > >> Those extra vbp lines will be treated as a black bar at the top of the > >> frame, > >> and extra vfp lines will be at the bottom of the frame. > >> > >> However if someone specifies e.g. 720x604, there's nothing more you could > >> remove from vfp, so your only option is to reduce vbp compared to the > >> standard > >> mode, so you'll end up with (vfp==4, vsync==6, vbp==11). The image will > >> not be > >> centered, the topmost lines will get cropped out, but that's the best we > >> can do > >> and if someone is requesting such resolution, they most likely want to > >> actually > >> access the VBI to e.g. emit teletext. > >> > >> Your current code always starts at (vfp==5 or 6, vsync=6, vbp==6) and then > >> increases both vfp and vbp proportionately. This puts vsync dead center in > >> the > >> VBI, which is not how it's supposed to be - and that in turn causes the > >> image > >> to be significantly shifted upwards. > >> > >> I hope this makes more sense to you now. > > > > I'm really struggling with this, so thanks for explaining this further > > (and patiently ;)) > > > > If I get this right, what you'd like to change is this part of the > > calculus (simplified a bit, and using PAL, 576i): > > > > vfp_min = params->vfp_lines.even + params->vfp_lines.odd; // 5 > > vbp_min = params->vbp_lines.even + params->vbp_lines.odd; // 6 > > vslen = params->vslen_lines.even + params->vslen_lines.odd; // 6 > > > > porches = params->num_lines - vactive - vslen; // 43 > > porches_rem = porches - vfp_min - vbp_min; // 32 > > > > vfp = vfp_min + (porches_rem / 2); // 21 > > vbp = porches - vfp; // 22 > > > > Which is indeed having sync centered. > > > > I initially changed it to: > > > > vfp = vfp_min; // 6 > > vbp = num_lines - vactive - vslen - vfp; // 38 > > > > Which is close enough for 576i, but at 480i/50Hz would end up with 134, > > so still fairly far off. > > > > I guess your suggestion would be along the line of: > > > > vfp_min = params->vfp_lines.even + params->vfp_lines.odd; // 5 > > vbp_min = params->vbp_lines.even + params->vbp_lines.odd; // 38 > > vslen = params->vslen_lines.even + params->vslen_lines.odd; // 6 > > > > porches = params->num_lines - vactive - vslen; // 0 > > porches_rem = porches - vfp_min - vbp_min; // 0 > > > > vfp = vfp_min + (porches_rem / 2); // 5 > > vbp = porches - vfp; // 38 > > > > Which is still close enough for 576i, but for 480i would end up with: > > > > porches = params->num_lines - vactive - vslen; // 139 > > porches_rem = porches - vfp_min - vbp_min; // 96 > > > > vfp = vfp_min + (porches_rem / 2); // 53 > > vbp = porches - vfp; // 86 > > > > Right? > > Yes. And if that's supposed to mean 480i in 50 Hz "PAL" mode, that's also > "close enough" to the values I suggested above. > > If you substitute values for true 60 Hz "NTSC" 480i, you should also get > values > that are "close enough" to the official spec. > > The only thing I'd conceptually change is that the 38 lines is not really > "vbp_min". It's more like "vbp_typ". As I mentioned above, we may want to > lower > this value if someone wants more active lines than the official 486/576. porches_rem is an int, so if vactive > (num_lines + vslen + vfp_min + vbp_min), porches_rem is going to be negative and we'll remove equally between vfp and vbp to match what's been asked So I believe this should work fine? Maxime signature.asc Description: PGP signature
Re: [PATCH 5/7] drm/i915/hwmon: Expose card reactive critical power
On 9/16/2022 8:30 PM, Badal Nilawar wrote: From: Ashutosh Dixit Expose the card reactive critical (I1) power. I1 is exposed as power1_crit in microwatts (typically for client products) or as curr1_crit in milliamperes (typically for server). v2: Add curr1_crit functionality (Ashutosh) v3: - Use HWMON_CHANNEL_INFO to define power1_crit, curr1_crit (Badal) v4: Use hwm_ prefix for static functions (Ashutosh) v5: Updated date, kernel version in documentation Cc: Sujaritha Sundaresan Signed-off-by: Ashutosh Dixit Signed-off-by: Badal Nilawar Acked-by: Guenter Roeck --- .../ABI/testing/sysfs-driver-intel-i915-hwmon | 26 + drivers/gpu/drm/i915/i915_hwmon.c | 95 ++- drivers/gpu/drm/i915/i915_reg.h | 6 ++ 3 files changed, 126 insertions(+), 1 deletion(-) diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon index 94101f818a70..cc70596fff44 100644 --- a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon +++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon @@ -26,6 +26,32 @@ Description: RO. Card default power limit (default TDP setting). Only supported for particular Intel i915 graphics platforms. +What: /sys/devices/.../hwmon/hwmon/power1_crit +Date: September 2022 +KernelVersion: 6 +Contact: dri-devel@lists.freedesktop.org +Description: RW. Card reactive critical (I1) power limit in microwatts. + + Card reactive critical (I1) power limit in microwatts is exposed + for client products. The power controller will throttle the + operating frequency if the power averaged over a window exceeds + this limit. + + Only supported for particular Intel i915 graphics platforms. + +What: /sys/devices/.../hwmon/hwmon/curr1_crit +Date: September 2022 +KernelVersion: 6 +Contact: dri-devel@lists.freedesktop.org +Description: RW. Card reactive critical (I1) power limit in milliamperes. + + Card reactive critical (I1) power limit in milliamperes is + exposed for server products. The power controller will throttle + the operating frequency if the power averaged over a window + exceeds this limit. + + Only supported for particular Intel i915 graphics platforms. + What: /sys/devices/.../hwmon/hwmon/energy1_input Date: September 2022 KernelVersion:6 diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c index a42cfad78bef..bd9ba312c474 100644 --- a/drivers/gpu/drm/i915/i915_hwmon.c +++ b/drivers/gpu/drm/i915/i915_hwmon.c @@ -11,16 +11,19 @@ #include "i915_hwmon.h" #include "i915_reg.h" #include "intel_mchbar_regs.h" +#include "intel_pcode.h" #include "gt/intel_gt_regs.h" /* * SF_* - scale factors for particular quantities according to hwmon spec. * - voltage - millivolts * - power - microwatts + * - curr - milliamperes * - energy - microjoules */ #define SF_VOLTAGE1000 #define SF_POWER 100 +#define SF_CURR1000 #define SF_ENERGY 100 struct hwm_reg { @@ -160,11 +163,25 @@ hwm_energy(struct hwm_drvdata *ddat, long *energy) static const struct hwmon_channel_info *hwm_info[] = { HWMON_CHANNEL_INFO(in, HWMON_I_INPUT), - HWMON_CHANNEL_INFO(power, HWMON_P_MAX | HWMON_P_RATED_MAX), + HWMON_CHANNEL_INFO(power, HWMON_P_MAX | HWMON_P_RATED_MAX | HWMON_P_CRIT), HWMON_CHANNEL_INFO(energy, HWMON_E_INPUT), + HWMON_CHANNEL_INFO(curr, HWMON_C_CRIT), NULL }; +/* I1 is exposed as power_crit or as curr_crit depending on bit 31 */ +static int hwm_pcode_read_i1(struct drm_i915_private *i915, u32 *uval) +{ + return snb_pcode_read_p(&i915->uncore, PCODE_POWER_SETUP, + POWER_SETUP_SUBCOMMAND_READ_I1, 0, uval); +} + +static int hwm_pcode_write_i1(struct drm_i915_private *i915, u32 uval) +{ + return snb_pcode_write_p(&i915->uncore, PCODE_POWER_SETUP, + POWER_SETUP_SUBCOMMAND_WRITE_I1, 0, uval); +} + static umode_t hwm_in_is_visible(const struct hwm_drvdata *ddat, u32 attr) { @@ -198,13 +215,18 @@ hwm_in_read(struct hwm_drvdata *ddat, u32 attr, long *val) static umode_t hwm_power_is_visible(const struct hwm_drvdata *ddat, u32 attr, int chan) { + struct drm_i915_private *i915 = ddat->uncore->i915; struct i915_hwmon *hwmon = ddat->hwmon; + u32 uval; switch (attr) { case hwmon_power_max: return i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit) ? 0664 : 0; case hwmon_power_rated_max: return i915_mmio_reg_valid(hwmon->rg.pkg_power_sku) ? 0444 : 0; + case hwmon_power_crit: + return (hwm_pcode_read_i1(i915, &uval) || + !(uval & POWER_S
Re: [Intel-gfx] [PATCH 1/7] drm/i915/hwmon: Add HWMON infrastructure
On 21-09-2022 18:14, Andi Shyti wrote: Hi Badal, +struct hwm_reg { +}; + +struct hwm_drvdata { + struct i915_hwmon *hwmon; + struct intel_uncore *uncore; + struct device *hwmon_dev; + char name[12]; +}; + +struct i915_hwmon { + struct hwm_drvdata ddat; + struct mutex hwmon_lock;/* counter overflow logic and rmw */ + struct hwm_reg rg; +}; + +static const struct hwmon_channel_info *hwm_info[] = { + NULL +}; + +static umode_t +hwm_is_visible(const void *drvdata, enum hwmon_sensor_types type, + u32 attr, int channel) +{ + switch (type) { + default: + return 0; + } +} + +static int +hwm_read(struct device *dev, enum hwmon_sensor_types type, u32 attr, +int channel, long *val) +{ + switch (type) { + default: + return -EOPNOTSUPP; + } +} + +static int +hwm_write(struct device *dev, enum hwmon_sensor_types type, u32 attr, + int channel, long val) +{ + switch (type) { + default: + return -EOPNOTSUPP; + } +} + +static const struct hwmon_ops hwm_ops = { + .is_visible = hwm_is_visible, + .read = hwm_read, + .write = hwm_write, +}; + +static const struct hwmon_chip_info hwm_chip_info = { + .ops = &hwm_ops, + .info = hwm_info, +}; what's the point for splitting so much? Can't you just send the hwmon driver all at once? With this patch you are not actually doing anything useful. In my opinion this should be squashed with the next ones. During discussion in cover letter of rev0 series we decided to create separate infrastructure patch, as we wanted to keep kconfig, i915 hwmon structures and new file addition in separate patch. Further feature wise we kept adding new patches. +static void +hwm_get_preregistration_info(struct drm_i915_private *i915) +{ +} + +void i915_hwmon_register(struct drm_i915_private *i915) +{ + struct device *dev = i915->drm.dev; + struct i915_hwmon *hwmon; + struct device *hwmon_dev; + struct hwm_drvdata *ddat; + + /* hwmon is available only for dGfx */ + if (!IS_DGFX(i915)) + return; + + hwmon = kzalloc(sizeof(*hwmon), GFP_KERNEL); why don't we use devm_kzalloc? + if (!hwmon) + return; + + i915->hwmon = hwmon; + mutex_init(&hwmon->hwmon_lock); + ddat = &hwmon->ddat; + + ddat->hwmon = hwmon; + ddat->uncore = &i915->uncore; + snprintf(ddat->name, sizeof(ddat->name), "i915"); + + hwm_get_preregistration_info(i915); + + /* hwmon_dev points to device hwmon */ + hwmon_dev = hwmon_device_register_with_info(dev, ddat->name, + ddat, + &hwm_chip_info, + NULL); + if (IS_ERR(hwmon_dev)) { + mutex_destroy(&hwmon->hwmon_lock); there is not such a big need to destroy the mutex. Destroying mutexes is more useful when you actually are creating/destroying and there is some debug need. I don't think that's the case. With the devm_kzalloc this would be just a return. I think we can switch to devm_kzalloc. Regards, Badal Andi + i915->hwmon = NULL; + kfree(hwmon); + return; + } + + ddat->hwmon_dev = hwmon_dev; +} + +void i915_hwmon_unregister(struct drm_i915_private *i915) +{ + struct i915_hwmon *hwmon; + struct hwm_drvdata *ddat; + + hwmon = fetch_and_zero(&i915->hwmon); + if (!hwmon) + return; + + ddat = &hwmon->ddat; + if (ddat->hwmon_dev) + hwmon_device_unregister(ddat->hwmon_dev); + + mutex_destroy(&hwmon->hwmon_lock); + kfree(hwmon); +} diff --git a/drivers/gpu/drm/i915/i915_hwmon.h b/drivers/gpu/drm/i915/i915_hwmon.h new file mode 100644 index ..7ca9cf2c34c9 --- /dev/null +++ b/drivers/gpu/drm/i915/i915_hwmon.h @@ -0,0 +1,20 @@ +/* SPDX-License-Identifier: MIT */ + +/* + * Copyright © 2022 Intel Corporation + */ + +#ifndef __I915_HWMON_H__ +#define __I915_HWMON_H__ + +struct drm_i915_private; + +#if IS_REACHABLE(CONFIG_HWMON) +void i915_hwmon_register(struct drm_i915_private *i915); +void i915_hwmon_unregister(struct drm_i915_private *i915); +#else +static inline void i915_hwmon_register(struct drm_i915_private *i915) { }; +static inline void i915_hwmon_unregister(struct drm_i915_private *i915) { }; +#endif + +#endif /* __I915_HWMON_H__ */ -- 2.25.1
Re: [PATCH V2 2/3] dt-bindings: display: panel: Add NewVision NV3051D bindings
On 21/09/2022 16:38, Chris Morgan wrote: >>> + compatible: >>> +items: >>> + - enum: >>> + - anbernic,rg353p-panel >> >> Are these vendor prefixs documented? > > Yes, they are in another patch series referenced in the cover letter. > They were added for the Anbernic devicetrees and should (I believe) > land in 6.1. OK... you still need to test your bindings. Your patch was clearly not tested before sending. :( Best regards, Krzysztof
Re: [Intel-gfx] [PATCH 1/7] drm/i915/hwmon: Add HWMON infrastructure
Hi Badal, > > > +struct hwm_reg { > > > +}; > > > + > > > +struct hwm_drvdata { > > > + struct i915_hwmon *hwmon; > > > + struct intel_uncore *uncore; > > > + struct device *hwmon_dev; > > > + char name[12]; > > > +}; > > > + > > > +struct i915_hwmon { > > > + struct hwm_drvdata ddat; > > > + struct mutex hwmon_lock;/* counter overflow logic and > > > rmw */ > > > + struct hwm_reg rg; > > > +}; > > > + > > > +static const struct hwmon_channel_info *hwm_info[] = { > > > + NULL > > > +}; > > > + > > > +static umode_t > > > +hwm_is_visible(const void *drvdata, enum hwmon_sensor_types type, > > > +u32 attr, int channel) > > > +{ > > > + switch (type) { > > > + default: > > > + return 0; > > > + } > > > +} > > > + > > > +static int > > > +hwm_read(struct device *dev, enum hwmon_sensor_types type, u32 attr, > > > + int channel, long *val) > > > +{ > > > + switch (type) { > > > + default: > > > + return -EOPNOTSUPP; > > > + } > > > +} > > > + > > > +static int > > > +hwm_write(struct device *dev, enum hwmon_sensor_types type, u32 attr, > > > + int channel, long val) > > > +{ > > > + switch (type) { > > > + default: > > > + return -EOPNOTSUPP; > > > + } > > > +} > > > + > > > +static const struct hwmon_ops hwm_ops = { > > > + .is_visible = hwm_is_visible, > > > + .read = hwm_read, > > > + .write = hwm_write, > > > +}; > > > + > > > +static const struct hwmon_chip_info hwm_chip_info = { > > > + .ops = &hwm_ops, > > > + .info = hwm_info, > > > +}; > > > > what's the point for splitting so much? Can't you just send the > > hwmon driver all at once? With this patch you are not actually > > doing anything useful. In my opinion this should be squashed with > > the next ones. > During discussion in cover letter of rev0 series we decided to create > separate infrastructure patch, as we wanted to keep kconfig, i915 hwmon > structures and new file addition in separate patch. Further feature wise we > kept adding new patches. I don't really like this patch splitting, but it's my fault I haven't reviewed it already in v1. Please, ignore then. Andi
Re: [PATCH V2 2/3] dt-bindings: display: panel: Add NewVision NV3051D bindings
On Wed, Sep 21, 2022 at 05:21:19PM +0200, Krzysztof Kozlowski wrote: > On 21/09/2022 16:38, Chris Morgan wrote: > >>> + compatible: > >>> +items: > >>> + - enum: > >>> + - anbernic,rg353p-panel > >> > >> Are these vendor prefixs documented? > > > > Yes, they are in another patch series referenced in the cover letter. > > They were added for the Anbernic devicetrees and should (I believe) > > land in 6.1. > > OK... you still need to test your bindings. Your patch was clearly not > tested before sending. :( I did: yamllint, make dt_binding_check (with DT_SCHEMA_FILES specified), and make dtbs_check (with DT_SCHEMA_FILES specified again). That's the proper testing flow correct? In this case it's the pre-requisite that's causing the issue as I see on a pristine master tree I'm warned about the missing vendor prefix for anbernic. Should I wait for that to go upstream before I submit this again? I'll make the other change about the space and the description of the vdd-supply when I resubmit. Are we good with the panel compatibles? I'm still not entirely sure the best thing to name them as I have no part number whatsoever except the driver IC. Thank you. > > Best regards, > Krzysztof >
[PATCH AUTOSEL 5.19 02/16] drm/gma500: Fix BUG: sleeping function called from invalid context errors
From: Hans de Goede [ Upstream commit 63e37a79f7bd939314997e29c2f5a9f0ef184281 ] gma_crtc_page_flip() was holding the event_lock spinlock while calling crtc_funcs->mode_set_base() which takes ww_mutex. The only reason to hold event_lock is to clear gma_crtc->page_flip_event on mode_set_base() errors. Instead unlock it after setting gma_crtc->page_flip_event and on errors re-take the lock and clear gma_crtc->page_flip_event it it is still set. This fixes the following WARN/stacktrace: [ 512.122953] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:870 [ 512.123004] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 1253, name: gnome-shell [ 512.123031] preempt_count: 1, expected: 0 [ 512.123048] RCU nest depth: 0, expected: 0 [ 512.123066] INFO: lockdep is turned off. [ 512.123080] irq event stamp: 0 [ 512.123094] hardirqs last enabled at (0): [<>] 0x0 [ 512.123134] hardirqs last disabled at (0): [] copy_process+0x9fc/0x1de0 [ 512.123176] softirqs last enabled at (0): [] copy_process+0x9fc/0x1de0 [ 512.123207] softirqs last disabled at (0): [<>] 0x0 [ 512.123233] Preemption disabled at: [ 512.123241] [<>] 0x0 [ 512.123275] CPU: 3 PID: 1253 Comm: gnome-shell Tainted: GW 5.19.0+ #1 [ 512.123304] Hardware name: Packard Bell dot s/SJE01_CT, BIOS V1.10 07/23/2013 [ 512.123323] Call Trace: [ 512.123346] [ 512.123370] dump_stack_lvl+0x5b/0x77 [ 512.123412] __might_resched.cold+0xff/0x13a [ 512.123458] ww_mutex_lock+0x1e/0xa0 [ 512.123495] psb_gem_pin+0x2c/0x150 [gma500_gfx] [ 512.123601] gma_pipe_set_base+0x76/0x240 [gma500_gfx] [ 512.123708] gma_crtc_page_flip+0x95/0x130 [gma500_gfx] [ 512.123808] drm_mode_page_flip_ioctl+0x57d/0x5d0 [ 512.123897] ? drm_mode_cursor2_ioctl+0x10/0x10 [ 512.123936] drm_ioctl_kernel+0xa1/0x150 [ 512.123984] drm_ioctl+0x21f/0x420 [ 512.124025] ? drm_mode_cursor2_ioctl+0x10/0x10 [ 512.124070] ? rcu_read_lock_bh_held+0xb/0x60 [ 512.124104] ? lock_release+0x1ef/0x2d0 [ 512.124161] __x64_sys_ioctl+0x8d/0xd0 [ 512.124203] do_syscall_64+0x58/0x80 [ 512.124239] ? do_syscall_64+0x67/0x80 [ 512.124267] ? trace_hardirqs_on_prepare+0x55/0xe0 [ 512.124300] ? do_syscall_64+0x67/0x80 [ 512.124340] ? rcu_read_lock_sched_held+0x10/0x80 [ 512.124377] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 512.124411] RIP: 0033:0x7fcc4a70740f [ 512.124442] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00 [ 512.124470] RSP: 002b:7ffda73f5390 EFLAGS: 0246 ORIG_RAX: 0010 [ 512.124503] RAX: ffda RBX: 55cc9e474500 RCX: 7fcc4a70740f [ 512.124524] RDX: 7ffda73f5420 RSI: c01864b0 RDI: 0009 [ 512.124544] RBP: 7ffda73f5420 R08: 55cc9c0b0cb0 R09: 0034 [ 512.124564] R10: R11: 0246 R12: c01864b0 [ 512.124584] R13: 0009 R14: 55cc9df484d0 R15: 55cc9af5d0c0 [ 512.124647] Signed-off-by: Hans de Goede Signed-off-by: Patrik Jakobsson Link: https://patchwork.freedesktop.org/patch/msgid/20220906203852.527663-2-hdego...@redhat.com Signed-off-by: Sasha Levin --- drivers/gpu/drm/gma500/gma_display.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/gma500/gma_display.c b/drivers/gpu/drm/gma500/gma_display.c index 34ec3fca09ba..12287c9bb4d8 100644 --- a/drivers/gpu/drm/gma500/gma_display.c +++ b/drivers/gpu/drm/gma500/gma_display.c @@ -531,15 +531,18 @@ int gma_crtc_page_flip(struct drm_crtc *crtc, WARN_ON(drm_crtc_vblank_get(crtc) != 0); gma_crtc->page_flip_event = event; + spin_unlock_irqrestore(&dev->event_lock, flags); /* Call this locked if we want an event at vblank interrupt. */ ret = crtc_funcs->mode_set_base(crtc, crtc->x, crtc->y, old_fb); if (ret) { - gma_crtc->page_flip_event = NULL; - drm_crtc_vblank_put(crtc); + spin_lock_irqsave(&dev->event_lock, flags); + if (gma_crtc->page_flip_event) { + gma_crtc->page_flip_event = NULL; + drm_crtc_vblank_put(crtc); + } + spin_unlock_irqrestore(&dev->event_lock, flags); } - - spin_unlock_irqrestore(&dev->event_lock, flags); } else { ret = crtc_funcs->mode_set_base(crtc, crtc->x, crtc->y, old_fb); } -- 2.35.1
[PATCH AUTOSEL 5.19 04/16] drm/gma500: Fix (vblank) IRQs not working after suspend/resume
From: Hans de Goede [ Upstream commit 235fdbc32d559db21e580f85035c59372704f09e ] Fix gnome-shell (and other page-flip users) hanging after suspend/resume because of the gma500's IRQs not working. This fixes 2 problems with the IRQ handling: 1. gma_power_off() calls gma_irq_uninstall() which does a free_irq(), but gma_power_on() called gma_irq_preinstall() + gma_irq_postinstall() which do not call request_irq. Replace the pre- + post-install calls with gma_irq_install() which does prep + request + post. 2. After fixing 1. IRQs still do not work on a Packard Bell Dot SC (Intel Atom N2600, cedarview) netbook. Cederview uses MSI interrupts and it seems that the BIOS re-configures things back to normal APIC based interrupts during S3 suspend. There is some MSI PCI-config registers save/restore code which tries to deal with this, but on the Packard Bell Dot SC this is not sufficient to restore MSI IRQ functionality after a suspend/resume. Replace the PCI-config registers save/restore with pci_disable_msi() on suspend + pci_enable_msi() on resume. Fixing e.g. gnome-shell hanging. Signed-off-by: Hans de Goede Signed-off-by: Patrik Jakobsson Link: https://patchwork.freedesktop.org/patch/msgid/20220906203852.527663-4-hdego...@redhat.com Signed-off-by: Sasha Levin --- drivers/gpu/drm/gma500/cdv_device.c | 4 +--- drivers/gpu/drm/gma500/oaktrail_device.c | 5 + drivers/gpu/drm/gma500/power.c | 8 +--- drivers/gpu/drm/gma500/psb_drv.c | 2 +- drivers/gpu/drm/gma500/psb_drv.h | 5 + drivers/gpu/drm/gma500/psb_irq.c | 15 --- drivers/gpu/drm/gma500/psb_irq.h | 2 +- 7 files changed, 18 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/gma500/cdv_device.c b/drivers/gpu/drm/gma500/cdv_device.c index dd32b484dd82..ce96234f3df2 100644 --- a/drivers/gpu/drm/gma500/cdv_device.c +++ b/drivers/gpu/drm/gma500/cdv_device.c @@ -581,11 +581,9 @@ static const struct psb_offset cdv_regmap[2] = { static int cdv_chip_setup(struct drm_device *dev) { struct drm_psb_private *dev_priv = to_drm_psb_private(dev); - struct pci_dev *pdev = to_pci_dev(dev->dev); INIT_WORK(&dev_priv->hotplug_work, cdv_hotplug_work_func); - if (pci_enable_msi(pdev)) - dev_warn(dev->dev, "Enabling MSI failed!\n"); + dev_priv->use_msi = true; dev_priv->regmap = cdv_regmap; gma_get_core_freq(dev); psb_intel_opregion_init(dev); diff --git a/drivers/gpu/drm/gma500/oaktrail_device.c b/drivers/gpu/drm/gma500/oaktrail_device.c index 5923a9c89312..f90e628cb482 100644 --- a/drivers/gpu/drm/gma500/oaktrail_device.c +++ b/drivers/gpu/drm/gma500/oaktrail_device.c @@ -501,12 +501,9 @@ static const struct psb_offset oaktrail_regmap[2] = { static int oaktrail_chip_setup(struct drm_device *dev) { struct drm_psb_private *dev_priv = to_drm_psb_private(dev); - struct pci_dev *pdev = to_pci_dev(dev->dev); int ret; - if (pci_enable_msi(pdev)) - dev_warn(dev->dev, "Enabling MSI failed!\n"); - + dev_priv->use_msi = true; dev_priv->regmap = oaktrail_regmap; ret = mid_chip_setup(dev); diff --git a/drivers/gpu/drm/gma500/power.c b/drivers/gpu/drm/gma500/power.c index b91de6d36e41..66873085d450 100644 --- a/drivers/gpu/drm/gma500/power.c +++ b/drivers/gpu/drm/gma500/power.c @@ -139,8 +139,6 @@ static void gma_suspend_pci(struct pci_dev *pdev) dev_priv->regs.saveBSM = bsm; pci_read_config_dword(pdev, 0xFC, &vbt); dev_priv->regs.saveVBT = vbt; - pci_read_config_dword(pdev, PSB_PCIx_MSI_ADDR_LOC, &dev_priv->msi_addr); - pci_read_config_dword(pdev, PSB_PCIx_MSI_DATA_LOC, &dev_priv->msi_data); pci_disable_device(pdev); pci_set_power_state(pdev, PCI_D3hot); @@ -168,9 +166,6 @@ static bool gma_resume_pci(struct pci_dev *pdev) pci_restore_state(pdev); pci_write_config_dword(pdev, 0x5c, dev_priv->regs.saveBSM); pci_write_config_dword(pdev, 0xFC, dev_priv->regs.saveVBT); - /* restoring MSI address and data in PCIx space */ - pci_write_config_dword(pdev, PSB_PCIx_MSI_ADDR_LOC, dev_priv->msi_addr); - pci_write_config_dword(pdev, PSB_PCIx_MSI_DATA_LOC, dev_priv->msi_data); ret = pci_enable_device(pdev); if (ret != 0) @@ -223,8 +218,7 @@ int gma_power_resume(struct device *_dev) mutex_lock(&power_mutex); gma_resume_pci(pdev); gma_resume_display(pdev); - gma_irq_preinstall(dev); - gma_irq_postinstall(dev); + gma_irq_install(dev); mutex_unlock(&power_mutex); return 0; } diff --git a/drivers/gpu/drm/gma500/psb_drv.c b/drivers/gpu/drm/gma500/psb_drv.c index 1d8744f3e702..54e756b48606 100644 --- a/drivers/gpu/drm/gma500/psb_drv.c +++ b/drivers/gpu/drm/gma500/psb_drv.c @@ -383,7 +383,7 @@ static int psb_driver_load(struct drm_device *dev, unsigned long flags)
[PATCH AUTOSEL 5.19 08/16] drm/amdgpu: change the alignment size of TMR BO to 1M
From: Yang Wang [ Upstream commit 36de13fdb04abef3ee03ade5129ab146de63983b ] align TMR BO size TO tmr size is not necessary, modify the size to 1M to avoid re-create BO fail when serious VRAM fragmentation. v2: add new macro PSP_TMR_ALIGNMENT for TMR BO alignment size Signed-off-by: Yang Wang Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c index 2b00f8fe15a8..7b8d4484c3c4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c @@ -748,7 +748,7 @@ static int psp_tmr_init(struct psp_context *psp) } pptr = amdgpu_sriov_vf(psp->adev) ? &tmr_buf : NULL; - ret = amdgpu_bo_create_kernel(psp->adev, tmr_size, PSP_TMR_SIZE(psp->adev), + ret = amdgpu_bo_create_kernel(psp->adev, tmr_size, PSP_TMR_ALIGNMENT, AMDGPU_GEM_DOMAIN_VRAM, &psp->tmr_bo, &psp->tmr_mc_addr, pptr); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h index e431f4994931..cd366c7f311f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h @@ -36,6 +36,7 @@ #define PSP_CMD_BUFFER_SIZE0x1000 #define PSP_1_MEG 0x10 #define PSP_TMR_SIZE(adev) ((adev)->asic_type == CHIP_ALDEBARAN ? 0x80 : 0x40) +#define PSP_TMR_ALIGNMENT 0x10 #define PSP_FW_NAME_LEN0x24 enum psp_shared_mem_size { -- 2.35.1
[PATCH AUTOSEL 5.19 06/16] drm/amd/pm: disable BACO entry/exit completely on several sienna cichlid cards
From: Guchun Chen [ Upstream commit 7c6fb61a400bf3218c6504cb2d48858f98822c9d ] To avoid hardware intermittent failures. Signed-off-by: Guchun Chen Reviewed-by: Lijo Lazar Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- .../gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c index 32bb6b1d9526..d13e455c8827 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c @@ -368,6 +368,17 @@ static void sienna_cichlid_check_bxco_support(struct smu_context *smu) smu_baco->platform_support = (val & RCC_BIF_STRAP0__STRAP_PX_CAPABLE_MASK) ? true : false; + + /* +* Disable BACO entry/exit completely on below SKUs to +* avoid hardware intermittent failures. +*/ + if (((adev->pdev->device == 0x73A1) && + (adev->pdev->revision == 0x00)) || + ((adev->pdev->device == 0x73BF) && + (adev->pdev->revision == 0xCF))) + smu_baco->platform_support = false; + } } -- 2.35.1
[PATCH AUTOSEL 5.19 07/16] drm/amdgpu: use dirty framebuffer helper
From: Hamza Mahfooz [ Upstream commit 66f99628eb24409cb8feb5061f78283c8b65f820 ] Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs struct. Signed-off-by: Hamza Mahfooz Acked-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c index 4dfd6724b3ca..3451147beda3 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c @@ -35,6 +35,7 @@ #include #include #include +#include #include #include #include @@ -493,6 +494,7 @@ bool amdgpu_display_ddc_probe(struct amdgpu_connector *amdgpu_connector, static const struct drm_framebuffer_funcs amdgpu_fb_funcs = { .destroy = drm_gem_fb_destroy, .create_handle = drm_gem_fb_create_handle, + .dirty = drm_atomic_helper_dirtyfb, }; uint32_t amdgpu_display_supported_domains(struct amdgpu_device *adev, -- 2.35.1
[PATCH AUTOSEL 5.19 03/16] drm/gma500: Fix WARN_ON(lock->magic != lock) error
From: Hans de Goede [ Upstream commit b6f25c3b94f2aadbf5cbef954db4073614943d74 ] psb_gem_unpin() calls dma_resv_lock() but the underlying ww_mutex gets destroyed by drm_gem_object_release() move the drm_gem_object_release() call in psb_gem_free_object() to after the unpin to fix the below warning: [ 79.693962] [ cut here ] [ 79.693992] DEBUG_LOCKS_WARN_ON(lock->magic != lock) [ 79.694015] WARNING: CPU: 0 PID: 240 at kernel/locking/mutex.c:582 __ww_mutex_lock.constprop.0+0x569/0xfb0 [ 79.694052] Modules linked in: rfcomm snd_seq_dummy snd_hrtimer qrtr bnep ath9k ath9k_common ath9k_hw snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel ath3k snd_intel_dspcfg mac80211 snd_intel_sdw_acpi btusb snd_hda_codec btrtl btbcm btintel btmtk bluetooth at24 snd_hda_core snd_hwdep uvcvideo snd_seq libarc4 videobuf2_vmalloc ath videobuf2_memops videobuf2_v4l2 videobuf2_common snd_seq_device videodev acer_wmi intel_powerclamp coretemp mc snd_pcm joydev sparse_keymap ecdh_generic pcspkr wmi_bmof cfg80211 i2c_i801 i2c_smbus snd_timer snd r8169 rfkill lpc_ich soundcore acpi_cpufreq zram rtsx_pci_sdmmc mmc_core serio_raw rtsx_pci gma500_gfx(E) video wmi ip6_tables ip_tables i2c_dev fuse [ 79.694436] CPU: 0 PID: 240 Comm: plymouthd Tainted: GW E 6.0.0-rc3+ #490 [ 79.694457] Hardware name: Packard Bell dot s/SJE01_CT, BIOS V1.10 07/23/2013 [ 79.694469] RIP: 0010:__ww_mutex_lock.constprop.0+0x569/0xfb0 [ 79.694496] Code: ff 85 c0 0f 84 15 fb ff ff 8b 05 ca 3c 11 01 85 c0 0f 85 07 fb ff ff 48 c7 c6 30 cb 84 aa 48 c7 c7 a3 e1 82 aa e8 ac 29 f8 ff <0f> 0b e9 ed fa ff ff e8 5b 83 8a ff 85 c0 74 10 44 8b 0d 98 3c 11 [ 79.694513] RSP: 0018:ad1dc048bbe0 EFLAGS: 00010282 [ 79.694623] RAX: 0028 RBX: RCX: [ 79.694636] RDX: 0001 RSI: aa8b0ffc RDI: [ 79.694650] RBP: ad1dc048bc80 R08: R09: ad1dc048ba90 [ 79.694662] R10: 0003 R11: aad62fe8 R12: 9ff302103138 [ 79.694675] R13: 9ff306ec8000 R14: 9ff307779078 R15: 9ff3014c0270 [ 79.694690] FS: 7ff1cccf1740() GS:9ff3bc20() knlGS: [ 79.694705] CS: 0010 DS: ES: CR0: 80050033 [ 79.694719] CR2: 559ecbcb4420 CR3: 1321 CR4: 06f0 [ 79.694734] Call Trace: [ 79.694749] [ 79.694761] ? __schedule+0x47f/0x1670 [ 79.694796] ? psb_gem_unpin+0x27/0x1a0 [gma500_gfx] [ 79.694830] ? lock_is_held_type+0xe3/0x140 [ 79.694864] ? ww_mutex_lock+0x38/0xa0 [ 79.694885] ? __cond_resched+0x1c/0x30 [ 79.694902] ww_mutex_lock+0x38/0xa0 [ 79.694925] psb_gem_unpin+0x27/0x1a0 [gma500_gfx] [ 79.694964] psb_gem_unpin+0x199/0x1a0 [gma500_gfx] [ 79.694996] drm_gem_object_release_handle+0x50/0x60 [ 79.695020] ? drm_gem_object_handle_put_unlocked+0xf0/0xf0 [ 79.695042] idr_for_each+0x4b/0xb0 [ 79.695066] ? _raw_spin_unlock_irqrestore+0x30/0x60 [ 79.695095] drm_gem_release+0x1c/0x30 [ 79.695118] drm_file_free.part.0+0x1ea/0x260 [ 79.695150] drm_release+0x6a/0x120 [ 79.695175] __fput+0x9f/0x260 [ 79.695203] task_work_run+0x59/0xa0 [ 79.695227] do_exit+0x387/0xbe0 [ 79.695250] ? seqcount_lockdep_reader_access.constprop.0+0x82/0x90 [ 79.695275] ? lockdep_hardirqs_on+0x7d/0x100 [ 79.695304] do_group_exit+0x33/0xb0 [ 79.695331] __x64_sys_exit_group+0x14/0x20 [ 79.695353] do_syscall_64+0x58/0x80 [ 79.695376] ? up_read+0x17/0x20 [ 79.695401] ? lock_is_held_type+0xe3/0x140 [ 79.695429] ? asm_exc_page_fault+0x22/0x30 [ 79.695450] ? lockdep_hardirqs_on+0x7d/0x100 [ 79.695473] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 79.695493] RIP: 0033:0x7ff1ccefe3f1 [ 79.695516] Code: Unable to access opcode bytes at RIP 0x7ff1ccefe3c7. [ 79.695607] RSP: 002b:7ffed4413378 EFLAGS: 0246 ORIG_RAX: 00e7 [ 79.695629] RAX: ffda RBX: 7ff1cd0159e0 RCX: 7ff1ccefe3f1 [ 79.695644] RDX: 003c RSI: 00e7 RDI: [ 79.695656] RBP: R08: ff80 R09: 7ff1cd020b20 [ 79.695671] R10: R11: 0246 R12: 7ff1cd0159e0 [ 79.695684] R13: R14: 7ff1cd01aee8 R15: 7ff1cd01af00 [ 79.695733] [ 79.695746] irq event stamp: 725979 [ 79.695757] hardirqs last enabled at (725979): [] finish_task_switch.isra.0+0xe4/0x3f0 [ 79.695780] hardirqs last disabled at (725978): [] __schedule+0xdd3/0x1670 [ 79.695803] softirqs last enabled at (725974): [] __irq_exit_rcu+0xed/0x160 [ 79.695825] softirqs last disabled at (725969): [] __irq_exit_rcu+0xed/0x160 [ 79.695845] ---[ end trace ]--- Signed-off-by: Hans de Goede Signed-off-by: Patrik Jakobsson Link: https://patchwork.freedesktop.org/patch/msgid/20220906203852.527663-3-hdego...@redhat.com Si
[PATCH AUTOSEL 5.15 05/10] drm/amdgpu: use dirty framebuffer helper
From: Hamza Mahfooz [ Upstream commit 66f99628eb24409cb8feb5061f78283c8b65f820 ] Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs struct. Signed-off-by: Hamza Mahfooz Acked-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c index 5c08047adb59..47fb722ab374 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c @@ -35,6 +35,7 @@ #include #include #include +#include #include #include #include @@ -490,6 +491,7 @@ bool amdgpu_display_ddc_probe(struct amdgpu_connector *amdgpu_connector, static const struct drm_framebuffer_funcs amdgpu_fb_funcs = { .destroy = drm_gem_fb_destroy, .create_handle = drm_gem_fb_create_handle, + .dirty = drm_atomic_helper_dirtyfb, }; uint32_t amdgpu_display_supported_domains(struct amdgpu_device *adev, -- 2.35.1
[PATCH AUTOSEL 5.19 12/16] drm/amd/display: Reduce number of arguments of dml31's CalculateWatermarksAndDRAMSpeedChangeSupport()
From: Nathan Chancellor [ Upstream commit 37934d4118e22bceb80141804391975078f31734 ] Most of the arguments are identical between the two call sites and they can be accessed through the 'struct vba_vars_st' pointer. This reduces the total amount of stack space that dml31_ModeSupportAndSystemConfigurationFull() uses by 240 bytes with LLVM 16 (2216 -> 1976), helping clear up the following clang warning: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6: error: stack frame size (2216) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than] void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib) ^ 1 error generated. Link: https://github.com/ClangBuiltLinux/linux/issues/1681 Reported-by: "Sudip Mukherjee (Codethink)" Tested-by: Maíra Canal Reviewed-by: Rodrigo Siqueira Signed-off-by: Nathan Chancellor Signed-off-by: Rodrigo Siqueira Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- .../dc/dml/dcn31/display_mode_vba_31.c| 248 -- 1 file changed, 52 insertions(+), 196 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c index e4b9fd31223c..586825d85d66 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c @@ -321,64 +321,28 @@ static void CalculateVupdateAndDynamicMetadataParameters( static void CalculateWatermarksAndDRAMSpeedChangeSupport( struct display_mode_lib *mode_lib, unsigned int PrefetchMode, - unsigned int NumberOfActivePlanes, - unsigned int MaxLineBufferLines, - unsigned int LineBufferSize, - unsigned int WritebackInterfaceBufferSize, double DCFCLK, double ReturnBW, - bool SynchronizedVBlank, - unsigned int dpte_group_bytes[], - unsigned int MetaChunkSize, double UrgentLatency, double ExtraLatency, - double WritebackLatency, - double WritebackChunkSize, double SOCCLK, - double DRAMClockChangeLatency, - double SRExitTime, - double SREnterPlusExitTime, - double SRExitZ8Time, - double SREnterPlusExitZ8Time, double DCFCLKDeepSleep, unsigned int DETBufferSizeY[], unsigned int DETBufferSizeC[], unsigned int SwathHeightY[], unsigned int SwathHeightC[], - unsigned int LBBitPerPixel[], double SwathWidthY[], double SwathWidthC[], - double HRatio[], - double HRatioChroma[], - unsigned int vtaps[], - unsigned int VTAPsChroma[], - double VRatio[], - double VRatioChroma[], - unsigned int HTotal[], - double PixelClock[], - unsigned int BlendingAndTiming[], unsigned int DPPPerPlane[], double BytePerPixelDETY[], double BytePerPixelDETC[], - double DSTXAfterScaler[], - double DSTYAfterScaler[], - bool WritebackEnable[], - enum source_format_class WritebackPixelFormat[], - double WritebackDestinationWidth[], - double WritebackDestinationHeight[], - double WritebackSourceHeight[], bool UnboundedRequestEnabled, int unsigned CompressedBufferSizeInkByte, enum clock_change_support *DRAMClockChangeSupport, - double *UrgentWatermark, - double *WritebackUrgentWatermark, - double *DRAMClockChangeWatermark, - double *WritebackDRAMClockChangeWatermark, double *StutterExitWatermark, double *StutterEnterPlusExitWatermark, double *Z8StutterExitWatermark, - double *Z8StutterEnterPlusExitWatermark, - double *MinActiveDRAMClockChangeLatencySupported); + double *Z8StutterEnterPlusExitWatermark); static void CalculateDCFCLKDeepSleep( struct display_mode_lib *mode_lib, @@ -3027,64 +2991,28 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman CalculateWatermarksAndDRAMSpeedChangeSupport( mode_lib, PrefetchMode, - v->NumberOfActivePlanes, - v->MaxLineBufferLines, - v->LineBufferSize, - v->WritebackInterfaceBufferSize, v->DCFCLK,
[PATCH AUTOSEL 5.19 09/16] drm/amdgpu: add HDP remap functionality to nbio 7.7
From: Alex Deucher [ Upstream commit 8c5708d3da37b8c7c3c22c7e945b9a76a7c9539b ] Was missing before and would have resulted in a write to a non-existant register. Normally APUs don't use HDP, but other asics could use this code and APUs do use the HDP when used in passthrough. Reviewed-by: Lijo Lazar Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c | 9 + 1 file changed, 9 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c index cdc0c9779848..6c1fd471a4c7 100644 --- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c @@ -28,6 +28,14 @@ #include "nbio/nbio_7_7_0_sh_mask.h" #include +static void nbio_v7_7_remap_hdp_registers(struct amdgpu_device *adev) +{ + WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_MEM_FLUSH_CNTL, +adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL); + WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_REG_FLUSH_CNTL, +adev->rmmio_remap.reg_offset + KFD_MMIO_REMAP_HDP_REG_FLUSH_CNTL); +} + static u32 nbio_v7_7_get_rev_id(struct amdgpu_device *adev) { u32 tmp; @@ -237,4 +245,5 @@ const struct amdgpu_nbio_funcs nbio_v7_7_funcs = { .ih_doorbell_range = nbio_v7_7_ih_doorbell_range, .ih_control = nbio_v7_7_ih_control, .init_registers = nbio_v7_7_init_registers, + .remap_hdp_registers = nbio_v7_7_remap_hdp_registers, }; -- 2.35.1
[PATCH AUTOSEL 5.19 15/16] drm/rockchip: Fix return type of cdn_dp_connector_mode_valid
From: Nathan Huckleberry [ Upstream commit b0b9408f132623dc88e78adb5282f74e4b64bb57 ] The mode_valid field in drm_connector_helper_funcs is expected to be of type: enum drm_mode_status (* mode_valid) (struct drm_connector *connector, struct drm_display_mode *mode); The mismatched return type breaks forward edge kCFI since the underlying function definition does not match the function hook definition. The return type of cdn_dp_connector_mode_valid should be changed from int to enum drm_mode_status. Reported-by: Dan Carpenter Link: https://github.com/ClangBuiltLinux/linux/issues/1703 Cc: l...@lists.linux.dev Signed-off-by: Nathan Huckleberry Reviewed-by: Nathan Chancellor Signed-off-by: Heiko Stuebner Link: https://patchwork.freedesktop.org/patch/msgid/2022091320.155149-1-nh...@google.com Signed-off-by: Sasha Levin --- drivers/gpu/drm/rockchip/cdn-dp-core.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/rockchip/cdn-dp-core.c b/drivers/gpu/drm/rockchip/cdn-dp-core.c index c204e9b95c1f..518ee13b1d6f 100644 --- a/drivers/gpu/drm/rockchip/cdn-dp-core.c +++ b/drivers/gpu/drm/rockchip/cdn-dp-core.c @@ -283,8 +283,9 @@ static int cdn_dp_connector_get_modes(struct drm_connector *connector) return ret; } -static int cdn_dp_connector_mode_valid(struct drm_connector *connector, - struct drm_display_mode *mode) +static enum drm_mode_status +cdn_dp_connector_mode_valid(struct drm_connector *connector, + struct drm_display_mode *mode) { struct cdn_dp_device *dp = connector_to_dp(connector); struct drm_display_info *display_info = &dp->connector.display_info; -- 2.35.1
[PATCH AUTOSEL 5.19 14/16] drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack usage
From: Nathan Chancellor [ Upstream commit 41012d715d5d7b9751ae84b8fb255e404ac9c5d0 ] This function consumes a lot of stack space and it blows up the size of dml30_ModeSupportAndSystemConfigurationFull() with clang: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3542:6: error: stack frame size (2200) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than] void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib) ^ 1 error generated. Commit a0f7e7f759cf ("drm/amd/display: fix i386 frame size warning") aimed to address this for i386 but it did not help x86_64. To reduce the amount of stack space that dml30_ModeSupportAndSystemConfigurationFull() uses, mark UseMinimumDCFCLK() as noinline, using the _for_stack variant for documentation. While this will increase the total amount of stack usage between the two functions (1632 and 1304 bytes respectively), it will make sure both stay below the limit of 2048 bytes for these files. The aforementioned change does help reduce UseMinimumDCFCLK()'s stack usage so it should not be reverted in favor of this change. Link: https://github.com/ClangBuiltLinux/linux/issues/1681 Reported-by: "Sudip Mukherjee (Codethink)" Tested-by: Maíra Canal Reviewed-by: Rodrigo Siqueira Signed-off-by: Nathan Chancellor Signed-off-by: Rodrigo Siqueira Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c index f47d82da115c..42a567e71439 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c @@ -6651,8 +6651,7 @@ static double CalculateUrgentLatency( return ret; } - -static void UseMinimumDCFCLK( +static noinline_for_stack void UseMinimumDCFCLK( struct display_mode_lib *mode_lib, int MaxInterDCNTileRepeaters, int MaxPrefetchMode, -- 2.35.1
[PATCH AUTOSEL 5.19 13/16] drm/amd/display: Reduce number of arguments of dml31's CalculateFlipSchedule()
From: Nathan Chancellor [ Upstream commit 21485d3da659b66c37d99071623af83ee1c6733d ] Most of the arguments are identical between the two call sites and they can be accessed through the 'struct vba_vars_st' pointer. This reduces the total amount of stack space that dml31_ModeSupportAndSystemConfigurationFull() uses by 112 bytes with LLVM 16 (1976 -> 1864), helping clear up the following clang warning: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6: error: stack frame size (2216) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than] void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib) ^ 1 error generated. Link: https://github.com/ClangBuiltLinux/linux/issues/1681 Reported-by: "Sudip Mukherjee (Codethink)" Tested-by: Maíra Canal Reviewed-by: Rodrigo Siqueira Signed-off-by: Nathan Chancellor Signed-off-by: Rodrigo Siqueira Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- .../dc/dml/dcn31/display_mode_vba_31.c| 172 +- 1 file changed, 47 insertions(+), 125 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c index 586825d85d66..40a672236198 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c @@ -261,33 +261,13 @@ static void CalculateRowBandwidth( static void CalculateFlipSchedule( struct display_mode_lib *mode_lib, + unsigned int k, double HostVMInefficiencyFactor, double UrgentExtraLatency, double UrgentLatency, - unsigned int GPUVMMaxPageTableLevels, - bool HostVMEnable, - unsigned int HostVMMaxNonCachedPageTableLevels, - bool GPUVMEnable, - double HostVMMinPageSize, double PDEAndMetaPTEBytesPerFrame, double MetaRowBytes, - double DPTEBytesPerRow, - double BandwidthAvailableForImmediateFlip, - unsigned int TotImmediateFlipBytes, - enum source_format_class SourcePixelFormat, - double LineTime, - double VRatio, - double VRatioChroma, - double Tno_bw, - bool DCCEnable, - unsigned int dpte_row_height, - unsigned int meta_row_height, - unsigned int dpte_row_height_chroma, - unsigned int meta_row_height_chroma, - double *DestinationLinesToRequestVMInImmediateFlip, - double *DestinationLinesToRequestRowInImmediateFlip, - double *final_flip_bw, - bool *ImmediateFlipSupportedForPipe); + double DPTEBytesPerRow); static double CalculateWriteBackDelay( enum source_format_class WritebackPixelFormat, double WritebackHRatio, @@ -2878,33 +2858,13 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman for (k = 0; k < v->NumberOfActivePlanes; ++k) { CalculateFlipSchedule( mode_lib, + k, HostVMInefficiencyFactor, v->UrgentExtraLatency, v->UrgentLatency, - v->GPUVMMaxPageTableLevels, - v->HostVMEnable, - v->HostVMMaxNonCachedPageTableLevels, - v->GPUVMEnable, - v->HostVMMinPageSize, v->PDEAndMetaPTEBytesFrame[k], v->MetaRowByte[k], - v->PixelPTEBytesPerRow[k], - v->BandwidthAvailableForImmediateFlip, - v->TotImmediateFlipBytes, - v->SourcePixelFormat[k], - v->HTotal[k] / v->PixelClock[k], - v->VRatio[k], - v->VRatioChroma[k], - v->Tno_bw[k], - v->DCCEnable[k], - v->dpte_row_height[k], - v->meta_row_height[k], - v->dpte_row_hei
[PATCH AUTOSEL 5.19 10/16] drm/amdgpu: Skip reset error status for psp v13_0_0
From: Candice Li [ Upstream commit 86875d558b91cb46f43be112799c06ecce60ec1e ] No need to reset error status since only umc ras supported on psp v13_0_0. Signed-off-by: Candice Li Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index dac202ae864d..9193ca5d6fe7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c @@ -1805,7 +1805,8 @@ static void amdgpu_ras_log_on_err_counter(struct amdgpu_device *adev) amdgpu_ras_query_error_status(adev, &info); if (adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 2) && - adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 4)) { + adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 4) && + adev->ip_versions[MP0_HWIP][0] != IP_VERSION(13, 0, 0)) { if (amdgpu_ras_reset_error_status(adev, info.head.block)) dev_warn(adev->dev, "Failed to reset error counter and error status"); } -- 2.35.1