date:20220921

Re: [PATCH V2 2/3] dt-bindings: display: panel: Add NewVision NV3051D bindings

2022-09-21 Thread Krzysztof Kozlowski

On Tue, 20 Sep 2022 09:59:04 -0500, Chris Morgan wrote:
> From: Chris Morgan 
> 
> Add documentation for the NewVision NV3051D panel bindings.
> Note that for the two expected consumers of this panel binding
> the underlying LCD model is unknown. Name "anbernic,rg353p-panel"
> is used because the hardware itself is known as "anbernic,rg353p".
> 
> Signed-off-by: Chris Morgan 
> ---
>  .../display/panel/newvision,nv3051d.yaml  | 55 +++
>  1 file changed, 55 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/display/panel/newvision,nv3051d.yaml
> 

My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
on your patch (DT_CHECKER_FLAGS is new in v5.13):

yamllint warnings/errors:

dtschema/dtc warnings/errors:
/builds/robherring/dt-review-ci/linux/Documentation/devicetree/bindings/display/panel/newvision,nv3051d.example.dtb:
 panel@0: compatible: ['anbernic,rg353p-panel'] is too short
From schema: 
/builds/robherring/dt-review-ci/linux/Documentation/devicetree/bindings/display/panel/newvision,nv3051d.yaml

doc reference errors (make refcheckdocs):

See https://patchwork.ozlabs.org/patch/

This check can fail if there are any dependencies. The base for a patch
series is generally the most recent rc1.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit.

[RFC v4 00/14] drm/i915/vm_bind: Add VM_BIND functionality

2022-09-21 Thread Niranjana Vishwanathapura

DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM
buffer objects (BOs) or sections of a BOs at specified GPU virtual
addresses on a specified address space (VM). Multiple mappings can map
to the same physical pages of an object (aliasing). These mappings (also
referred to as persistent mappings) will be persistent across multiple
GPU submissions (execbuf calls) issued by the UMD, without user having
to provide a list of all required mappings during each submission (as
required by older execbuf mode).

This patch series support VM_BIND version 1, as described by the param
I915_PARAM_VM_BIND_VERSION.

Add new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only works in
vm_bind mode. The vm_bind mode only works with this new execbuf3 ioctl.
The new execbuf3 ioctl will not have any execlist support and all the
legacy support like relocations etc., are removed.

TODOs:
* Support out fence for VM_UNBIND ioctl.
* Async VM_UNBIND support.
* Cleanups and optimizations.

NOTEs:
* It is based on below VM_BIND design+uapi rfc.
  Documentation/gpu/rfc/i915_vm_bind.rst

* The IGT RFC series is posted as,
  [RFC v2 0/8] vm_bind: Add VM_BIND validation support

v4: Share code between legacy execbuf and execbuf3.
Address review feedback from Thomas and Tvrtko.
Reformat patches and some cleanups.

Signed-off-by: Niranjana Vishwanathapura 

Niranjana Vishwanathapura (14):
  drm/i915/vm_bind: Expose vm lookup function
  drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()
  drm/i915/vm_bind: Expose i915_gem_object_max_page_size()
  drm/i915/vm_bind: Implement bind and unbind of object
  drm/i915/vm_bind: Support for VM private BOs
  drm/i915/vm_bind: Handle persistent vmas
  drm/i915/vm_bind: Add out fence support
  drm/i915/vm_bind: Abstract out common execbuf functions
  drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl
  drm/i915/vm_bind: Update i915_vma_verify_bind_complete()
  drm/i915/vm_bind: Handle persistent vmas in execbuf3
  drm/i915/vm_bind: userptr dma-resv changes
  drm/i915/vm_bind: Skip vma_lookup for persistent vmas
  drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode

 drivers/gpu/drm/i915/Makefile |   3 +
 drivers/gpu/drm/i915/display/intel_fb_pin.c   |   2 +-
 .../drm/i915/display/intel_plane_initial.c|   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  16 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_create.c|  60 +-
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|   6 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 520 +--
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 861 ++
 .../drm/i915/gem/i915_gem_execbuffer_common.c | 530 +++
 .../drm/i915/gem/i915_gem_execbuffer_common.h |  47 +
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 +
 drivers/gpu/drm/i915/gem/i915_gem_object.c|   3 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h|   2 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c   |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |  17 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  31 +
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 421 +
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  16 +-
 .../i915/gem/selftests/i915_gem_client_blt.c  |   2 +-
 .../drm/i915/gem/selftests/i915_gem_context.c |  12 +-
 .../drm/i915/gem/selftests/i915_gem_migrate.c |   2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|   6 +-
 .../drm/i915/gem/selftests/igt_gem_utils.c|   2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  |   2 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |   2 +-
 drivers/gpu/drm/i915/gt/intel_gt.c|   2 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  20 +-
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  27 +
 drivers/gpu/drm/i915/gt/intel_lrc.c   |   4 +-
 drivers/gpu/drm/i915/gt/intel_renderstate.c   |   2 +-
 drivers/gpu/drm/i915/gt/intel_ring.c  |   2 +-
 .../gpu/drm/i915/gt/intel_ring_submission.c   |   4 +-
 drivers/gpu/drm/i915/gt/intel_timeline.c  |   2 +-
 drivers/gpu/drm/i915/gt/mock_engine.c |   2 +-
 drivers/gpu/drm/i915/gt/selftest_engine_cs.c  |   4 +-
 drivers/gpu/drm/i915/gt/selftest_execlists.c  |  16 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |   6 +-
 drivers/gpu/drm/i915/gt/selftest_lrc.c|   2 +-
 .../drm/i915/gt/selftest_ring_submission.c|   2 +-
 drivers/gpu/drm/i915/gt/selftest_rps.c|   2 +-
 .../gpu/drm/i915/gt/selftest_workarounds.c|   4 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.c|   2 +-
 drivers/gpu/drm/i915/i915_driver.c|   4 +
 drivers/gpu/drm/i915/i915_gem.c   |   2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c   |  39 +
 drivers/gpu/drm/i915/i915_gem_gtt.h   |   3 +
 drivers/gpu/drm/i915/i915_getparam.c  |   3 +
 drivers/gpu/drm/i915/i915_perf.c  |   2 +-
 drivers/gpu/drm/i915/i915_sw_fen

[RFC v4 07/14] drm/i915/vm_bind: Add out fence support

2022-09-21 Thread Niranjana Vishwanathapura

Add support for handling out fence for vm_bind call.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  4 +
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 81 +++
 drivers/gpu/drm/i915/i915_vma.c   |  6 +-
 drivers/gpu/drm/i915/i915_vma_types.h |  7 ++
 4 files changed, 97 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
index 4f3cfa1f6ef6..facba29ead04 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -6,6 +6,7 @@
 #ifndef __I915_GEM_VM_BIND_H
 #define __I915_GEM_VM_BIND_H
 
+#include 
 #include 
 
 #include 
@@ -24,4 +25,7 @@ int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void 
*data,
 
 void i915_gem_vm_unbind_all(struct i915_address_space *vm);
 
+void i915_vm_bind_signal_fence(struct i915_vma *vma,
+  struct dma_fence * const fence);
+
 #endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index 236f901b8b9c..5cd788404ee7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -7,6 +7,8 @@
 
 #include 
 
+#include 
+
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_vm_bind.h"
 
@@ -106,6 +108,75 @@ static void i915_gem_vm_bind_remove(struct i915_vma *vma, 
bool release_obj)
i915_gem_object_put(vma->obj);
 }
 
+static int i915_vm_bind_add_fence(struct drm_file *file, struct i915_vma *vma,
+ u32 handle, u64 point)
+{
+   struct drm_syncobj *syncobj;
+
+   syncobj = drm_syncobj_find(file, handle);
+   if (!syncobj) {
+   DRM_DEBUG("Invalid syncobj handle provided\n");
+   return -ENOENT;
+   }
+
+   /*
+* For timeline syncobjs we need to preallocate chains for
+* later signaling.
+*/
+   if (point) {
+   vma->vm_bind_fence.chain_fence = dma_fence_chain_alloc();
+   if (!vma->vm_bind_fence.chain_fence) {
+   drm_syncobj_put(syncobj);
+   return -ENOMEM;
+   }
+   } else {
+   vma->vm_bind_fence.chain_fence = NULL;
+   }
+   vma->vm_bind_fence.syncobj = syncobj;
+   vma->vm_bind_fence.value = point;
+
+   return 0;
+}
+
+static void i915_vm_bind_put_fence(struct i915_vma *vma)
+{
+   if (!vma->vm_bind_fence.syncobj)
+   return;
+
+   drm_syncobj_put(vma->vm_bind_fence.syncobj);
+   dma_fence_chain_free(vma->vm_bind_fence.chain_fence);
+}
+
+/**
+ * i915_vm_bind_signal_fence() - Add fence to vm_bind syncobj
+ * @vma: vma mapping requiring signaling
+ * @fence: fence to be added
+ *
+ * Associate specified @fence with the @vma's syncobj to be
+ * signaled after the @fence work completes.
+ */
+void i915_vm_bind_signal_fence(struct i915_vma *vma,
+  struct dma_fence * const fence)
+{
+   struct drm_syncobj *syncobj = vma->vm_bind_fence.syncobj;
+
+   if (!syncobj)
+   return;
+
+   if (vma->vm_bind_fence.chain_fence) {
+   drm_syncobj_add_point(syncobj,
+ vma->vm_bind_fence.chain_fence,
+ fence, vma->vm_bind_fence.value);
+   /*
+* The chain's ownership is transferred to the
+* timeline.
+*/
+   vma->vm_bind_fence.chain_fence = NULL;
+   } else {
+   drm_syncobj_replace_fence(syncobj, fence);
+   }
+}
+
 static int i915_gem_vm_unbind_vma(struct i915_address_space *vm,
  struct drm_i915_gem_vm_unbind *va)
 {
@@ -233,6 +304,13 @@ static int i915_gem_vm_bind_obj(struct i915_address_space 
*vm,
goto unlock_vm;
}
 
+   if (va->fence.flags & I915_TIMELINE_FENCE_SIGNAL) {
+   ret = i915_vm_bind_add_fence(file, vma, va->fence.handle,
+va->fence.value);
+   if (ret)
+   goto put_vma;
+   }
+
pin_flags = va->start | PIN_OFFSET_FIXED | PIN_USER;
 
for_i915_gem_ww(&ww, ret, true) {
@@ -257,6 +335,9 @@ static int i915_gem_vm_bind_obj(struct i915_address_space 
*vm,
i915_gem_object_get(vma->obj);
}
 
+   if (va->fence.flags & I915_TIMELINE_FENCE_SIGNAL)
+   i915_vm_bind_put_fence(vma);
+put_vma:
if (ret)
i915_vma_destroy(vma);
 unlock_vm:
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index ff216e9a2c8d..f7d711e675d6 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -1540,8 +1540,12 @@ int i915_vma_pin

[RFC v4 08/14] drm/i915/vm_bind: Abstract out common execbuf functions

2022-09-21 Thread Niranjana Vishwanathapura

The new execbuf3 ioctl path and the legacy execbuf ioctl
paths have many common functionalities.
Share code between these two paths by abstracting out the
common functionalities into a separate file where possible.

Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 507 ++---
 .../drm/i915/gem/i915_gem_execbuffer_common.c | 530 ++
 .../drm/i915/gem/i915_gem_execbuffer_common.h |  47 ++
 4 files changed, 612 insertions(+), 473 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 9bf939ef18ea..bf952f478555 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -148,6 +148,7 @@ gem-y += \
gem/i915_gem_create.o \
gem/i915_gem_dmabuf.o \
gem/i915_gem_domain.o \
+   gem/i915_gem_execbuffer_common.o \
gem/i915_gem_execbuffer.o \
gem/i915_gem_internal.o \
gem/i915_gem_object.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 33d989a20227..363b2a788cdf 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -9,8 +9,6 @@
 #include 
 #include 
 
-#include 
-
 #include "display/intel_frontbuffer.h"
 
 #include "gem/i915_gem_ioctls.h"
@@ -28,6 +26,7 @@
 #include "i915_file_private.h"
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
+#include "i915_gem_execbuffer_common.h"
 #include "i915_gem_evict.h"
 #include "i915_gem_ioctls.h"
 #include "i915_trace.h"
@@ -235,13 +234,6 @@ enum {
  * the batchbuffer in trusted mode, otherwise the ioctl is rejected.
  */
 
-struct eb_fence {
-   struct drm_syncobj *syncobj; /* Use with ptr_mask_bits() */
-   struct dma_fence *dma_fence;
-   u64 value;
-   struct dma_fence_chain *chain_fence;
-};
-
 struct i915_execbuffer {
struct drm_i915_private *i915; /** i915 backpointer */
struct drm_file *file; /** per-file lookup tables and limits */
@@ -2446,164 +2438,29 @@ static const enum intel_engine_id user_ring_map[] = {
[I915_EXEC_VEBOX]   = VECS0
 };
 
-static struct i915_request *eb_throttle(struct i915_execbuffer *eb, struct 
intel_context *ce)
-{
-   struct intel_ring *ring = ce->ring;
-   struct intel_timeline *tl = ce->timeline;
-   struct i915_request *rq;
-
-   /*
-* Completely unscientific finger-in-the-air estimates for suitable
-* maximum user request size (to avoid blocking) and then backoff.
-*/
-   if (intel_ring_update_space(ring) >= PAGE_SIZE)
-   return NULL;
-
-   /*
-* Find a request that after waiting upon, there will be at least half
-* the ring available. The hysteresis allows us to compete for the
-* shared ring and should mean that we sleep less often prior to
-* claiming our resources, but not so long that the ring completely
-* drains before we can submit our next request.
-*/
-   list_for_each_entry(rq, &tl->requests, link) {
-   if (rq->ring != ring)
-   continue;
-
-   if (__intel_ring_space(rq->postfix,
-  ring->emit, ring->size) > ring->size / 2)
-   break;
-   }
-   if (&rq->link == &tl->requests)
-   return NULL; /* weird, we will check again later for real */
-
-   return i915_request_get(rq);
-}
-
-static int eb_pin_timeline(struct i915_execbuffer *eb, struct intel_context 
*ce,
-  bool throttle)
-{
-   struct intel_timeline *tl;
-   struct i915_request *rq = NULL;
-
-   /*
-* Take a local wakeref for preparing to dispatch the execbuf as
-* we expect to access the hardware fairly frequently in the
-* process, and require the engine to be kept awake between accesses.
-* Upon dispatch, we acquire another prolonged wakeref that we hold
-* until the timeline is idle, which in turn releases the wakeref
-* taken on the engine, and the parent device.
-*/
-   tl = intel_context_timeline_lock(ce);
-   if (IS_ERR(tl))
-   return PTR_ERR(tl);
-
-   intel_context_enter(ce);
-   if (throttle)
-   rq = eb_throttle(eb, ce);
-   intel_context_timeline_unlock(tl);
-
-   if (rq) {
-   bool nonblock = eb->file->filp->f_flags & O_NONBLOCK;
-   long timeout = nonblock ? 0 : MAX_SCHEDULE_TIMEOUT;
-
-   if (i915_request_wait(rq, I915_WAIT_INTERRUPTIBLE,
- timeout) < 0) {
-   i915_request_put(rq);
-
-   /*
-

[RFC v4 09/14] drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl

2022-09-21 Thread Niranjana Vishwanathapura

Implement new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only
works in vm_bind mode. The vm_bind mode only works with
this new execbuf3 ioctl.

The new execbuf3 ioctl will not have any list of objects to validate
bind as all required objects binding would have been requested by the
userspace before submitting the execbuf3.

Legacy features like relocations etc are not supported by execbuf3.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 575 ++
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 +
 drivers/gpu/drm/i915/i915_driver.c|   1 +
 include/uapi/drm/i915_drm.h   |  64 ++
 5 files changed, 643 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index bf952f478555..3473ee5825bb 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -150,6 +150,7 @@ gem-y += \
gem/i915_gem_domain.o \
gem/i915_gem_execbuffer_common.o \
gem/i915_gem_execbuffer.o \
+   gem/i915_gem_execbuffer3.o \
gem/i915_gem_internal.o \
gem/i915_gem_object.o \
gem/i915_gem_lmem.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
new file mode 100644
index ..b6229b955e62
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -0,0 +1,575 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+
+#include "gt/intel_context.h"
+#include "gt/intel_gpu_commands.h"
+#include "gt/intel_gt.h"
+#include "gt/intel_gt_pm.h"
+#include "gt/intel_ring.h"
+
+#include "i915_drv.h"
+#include "i915_file_private.h"
+#include "i915_gem_context.h"
+#include "i915_gem_execbuffer_common.h"
+#include "i915_gem_ioctls.h"
+#include "i915_gem_vm_bind.h"
+#include "i915_trace.h"
+
+#define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
+#define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
+
+/* Catch emission of unexpected errors for CI! */
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+#undef EINVAL
+#define EINVAL ({ \
+   DRM_DEBUG_DRIVER("EINVAL at %s:%d\n", __func__, __LINE__); \
+   22; \
+})
+#endif
+
+/**
+ * DOC: User command execution with execbuf3 ioctl
+ *
+ * A VM in VM_BIND mode will not support older execbuf mode of binding.
+ * The execbuf ioctl handling in VM_BIND mode differs significantly from the
+ * older execbuf2 ioctl (See struct drm_i915_gem_execbuffer2).
+ * Hence, a new execbuf3 ioctl has been added to support VM_BIND mode. (See
+ * struct drm_i915_gem_execbuffer3). The execbuf3 ioctl will not accept any
+ * execlist. Hence, no support for implicit sync.
+ *
+ * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
+ * works with execbuf3 ioctl for submission.
+ *
+ * The execbuf3 ioctl directly specifies the batch addresses instead of as
+ * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
+ * support many of the older features like in/out/submit fences, fence array,
+ * default gem context etc. (See struct drm_i915_gem_execbuffer3).
+ *
+ * In VM_BIND mode, VA allocation is completely managed by the user instead of
+ * the i915 driver. Hence all VA assignment, eviction are not applicable in
+ * VM_BIND mode. Also, for determining object activeness, VM_BIND mode will not
+ * be using the i915_vma active reference tracking. It will instead check the
+ * dma-resv object's fence list for that.
+ *
+ * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
+ * vma lookup table, implicit sync, vma active reference tracking etc., are not
+ * applicable for execbuf3 ioctl.
+ */
+
+/**
+ * struct i915_execbuffer - execbuf struct for execbuf3
+ * @i915: reference to the i915 instance we run on
+ * @file: drm file reference
+ * args: execbuf3 ioctl structure
+ * @gt: reference to the gt instance ioctl submitted for
+ * @context: logical state for the request
+ * @gem_context: callers context
+ * @requests: requests to be build
+ * @composite_fence: used for excl fence in dma_resv objects when > 1 BB 
submitted
+ * @ww: i915_gem_ww_ctx instance
+ * @num_batches: number of batches submitted
+ * @batch_addresses: addresses corresponds to the submitted batches
+ * @batches: references to the i915_vmas corresponding to the batches
+ */
+struct i915_execbuffer {
+   struct drm_i915_private *i915;
+   struct drm_file *file;
+   struct drm_i915_gem_execbuffer3 *args;
+
+   struct intel_gt *gt;
+   struct intel_context *context;
+   struct i915_gem_context *gem_context;
+
+   struct i915_request *requests[MAX_ENGINE_INSTANCE + 1];
+   struct dma_fence *composite_fence;
+
+   struct i915_gem_ww_ctx ww;
+
+   unsigned int num_batches;
+   u64

[RFC v4 02/14] drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()

2022-09-21 Thread Niranjana Vishwanathapura

Add function __i915_sw_fence_await_reservation() for
asynchronous wait on a dma-resv object with specified
dma_resv_usage. This is required for async vma unbind
with vm_bind.

Signed-off-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/i915_sw_fence.c | 25 ++---
 drivers/gpu/drm/i915/i915_sw_fence.h |  7 ++-
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c 
b/drivers/gpu/drm/i915/i915_sw_fence.c
index 6fc0d1b89690..0ce8f4efc1ed 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -569,12 +569,11 @@ int __i915_sw_fence_await_dma_fence(struct i915_sw_fence 
*fence,
return ret;
 }
 
-int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
-   struct dma_resv *resv,
-   const struct dma_fence_ops *exclude,
-   bool write,
-   unsigned long timeout,
-   gfp_t gfp)
+int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+ struct dma_resv *resv,
+ enum dma_resv_usage usage,
+ unsigned long timeout,
+ gfp_t gfp)
 {
struct dma_resv_iter cursor;
struct dma_fence *f;
@@ -583,7 +582,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence 
*fence,
debug_fence_assert(fence);
might_sleep_if(gfpflags_allow_blocking(gfp));
 
-   dma_resv_iter_begin(&cursor, resv, dma_resv_usage_rw(write));
+   dma_resv_iter_begin(&cursor, resv, usage);
dma_resv_for_each_fence_unlocked(&cursor, f) {
pending = i915_sw_fence_await_dma_fence(fence, f, timeout,
gfp);
@@ -598,6 +597,18 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence 
*fence,
return ret;
 }
 
+int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+   struct dma_resv *resv,
+   const struct dma_fence_ops *exclude,
+   bool write,
+   unsigned long timeout,
+   gfp_t gfp)
+{
+   return __i915_sw_fence_await_reservation(fence, resv,
+dma_resv_usage_rw(write),
+timeout, gfp);
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/lib_sw_fence.c"
 #include "selftests/i915_sw_fence.c"
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h 
b/drivers/gpu/drm/i915/i915_sw_fence.h
index 619fc5a22f0c..3cf4b6e16f35 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -10,13 +10,13 @@
 #define _I915_SW_FENCE_H_
 
 #include 
+#include 
 #include 
 #include 
 #include  /* for NOTIFY_DONE */
 #include 
 
 struct completion;
-struct dma_resv;
 struct i915_sw_fence;
 
 enum i915_sw_fence_notify {
@@ -89,6 +89,11 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence 
*fence,
  unsigned long timeout,
  gfp_t gfp);
 
+int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+ struct dma_resv *resv,
+ enum dma_resv_usage usage,
+ unsigned long timeout,
+ gfp_t gfp);
 int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
struct dma_resv *resv,
const struct dma_fence_ops *exclude,
-- 
2.21.0.rc0.32.g243a4c7e27

[RFC v4 04/14] drm/i915/vm_bind: Implement bind and unbind of object

2022-09-21 Thread Niranjana Vishwanathapura

Add uapi and implement support for bind and unbind of an
object at the specified GPU virtual addresses.

The vm_bind mode is not supported in legacy execbuf2 ioctl.
It will be supported only in the newer execbuf3 ioctl.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Prathap Kumar Valsan 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|   5 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  27 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 308 ++
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  10 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  17 +
 drivers/gpu/drm/i915/i915_driver.c|   3 +
 drivers/gpu/drm/i915/i915_vma.c   |   3 +-
 drivers/gpu/drm/i915/i915_vma.h   |   2 -
 drivers/gpu/drm/i915/i915_vma_types.h |  14 +
 include/uapi/drm/i915_drm.h   | 167 ++
 11 files changed, 554 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index a26edcdadc21..9bf939ef18ea 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -166,6 +166,7 @@ gem-y += \
gem/i915_gem_ttm_move.o \
gem/i915_gem_ttm_pm.o \
gem/i915_gem_userptr.o \
+   gem/i915_gem_vm_bind_object.o \
gem/i915_gem_wait.o \
gem/i915_gemfs.o
 i915-y += \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index cd75b0ca2555..f85f10cf9c34 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -781,6 +781,11 @@ static int eb_select_context(struct i915_execbuffer *eb)
if (unlikely(IS_ERR(ctx)))
return PTR_ERR(ctx);
 
+   if (ctx->vm->vm_bind_mode) {
+   i915_gem_context_put(ctx);
+   return -EOPNOTSUPP;
+   }
+
eb->gem_context = ctx;
if (i915_gem_context_has_full_ppgtt(ctx))
eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
new file mode 100644
index ..4f3cfa1f6ef6
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#ifndef __I915_GEM_VM_BIND_H
+#define __I915_GEM_VM_BIND_H
+
+#include 
+
+#include 
+#include 
+
+#include "gt/intel_gtt.h"
+#include "i915_vma_types.h"
+
+struct i915_vma *
+i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va);
+
+int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file);
+int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
+struct drm_file *file);
+
+void i915_gem_vm_unbind_all(struct i915_address_space *vm);
+
+#endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
new file mode 100644
index ..c24e22657617
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -0,0 +1,308 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+
+#include 
+
+#include "gem/i915_gem_context.h"
+#include "gem/i915_gem_vm_bind.h"
+
+#include "gt/intel_gpu_commands.h"
+
+#define START(node) ((node)->start)
+#define LAST(node) ((node)->last)
+
+INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last,
+START, LAST, static inline, i915_vm_bind_it)
+
+#undef START
+#undef LAST
+
+/**
+ * DOC: VM_BIND/UNBIND ioctls
+ *
+ * DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM buffer
+ * objects (BOs) or sections of a BOs at specified GPU virtual addresses on a
+ * specified address space (VM). Multiple mappings can map to the same physical
+ * pages of an object (aliasing). These mappings (also referred to as 
persistent
+ * mappings) will be persistent across multiple GPU submissions (execbuf calls)
+ * issued by the UMD, without user having to provide a list of all required
+ * mappings during each submission (as required by older execbuf mode).
+ *
+ * The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for
+ * signaling the completion of bind/unbind operation.
+ *
+ * VM_BIND feature is advertised to user via I915_PARAM_VM_BIND_VERSION.
+ * User has to opt-in for VM_BIND mode of binding for an address space (VM)
+ * during VM creation time via I915_VM_CREATE_FLAGS_USE_VM_BIND extension.
+ *
+ * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
+ * are not ordered. Furthermore, parts of the VM_BIND/UNBIND operations can be
+ * d

[RFC v4 01/14] drm/i915/vm_bind: Expose vm lookup function

2022-09-21 Thread Niranjana Vishwanathapura

Make i915_gem_vm_lookup() function non-static as it will be
used by the vm_bind feature.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 11 ++-
 drivers/gpu/drm/i915/gem/i915_gem_context.h |  3 +++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 0bcde53c50c6..f4e648ec01ed 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -346,7 +346,16 @@ static int proto_context_register(struct 
drm_i915_file_private *fpriv,
return ret;
 }
 
-static struct i915_address_space *
+/**
+ * i915_gem_vm_lookup() - looks up for the VM reference given the vm id
+ * @file_priv: the private data associated with the user's file
+ * @id: the VM id
+ *
+ * Finds the VM reference associated to a specific id.
+ *
+ * Returns the VM pointer on success, NULL in case of failure.
+ */
+struct i915_address_space *
 i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id)
 {
struct i915_address_space *vm;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index e5b0f66ea1fe..899fa8f1e0fe 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -139,6 +139,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, 
void *data,
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file);
 
+struct i915_address_space *
+i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id);
+
 struct i915_gem_context *
 i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id);
 
-- 
2.21.0.rc0.32.g243a4c7e27

[RFC v4 11/14] drm/i915/vm_bind: Handle persistent vmas in execbuf3

2022-09-21 Thread Niranjana Vishwanathapura

Handle persistent (VM_BIND) mappings during the request submission
in the execbuf3 path.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 189 +-
 1 file changed, 188 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
index b6229b955e62..82a068d03440 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -4,6 +4,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 
@@ -21,6 +22,7 @@
 #include "i915_gem_vm_bind.h"
 #include "i915_trace.h"
 
+#define __EXEC3_HAS_PINBIT_ULL(33)
 #define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
 #define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
 
@@ -44,7 +46,9 @@
  * execlist. Hence, no support for implicit sync.
  *
  * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
- * works with execbuf3 ioctl for submission.
+ * works with execbuf3 ioctl for submission. All BOs mapped on that VM (through
+ * VM_BIND call) at the time of execbuf3 call are deemed required for that
+ * submission.
  *
  * The execbuf3 ioctl directly specifies the batch addresses instead of as
  * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
@@ -60,6 +64,13 @@
  * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
  * vma lookup table, implicit sync, vma active reference tracking etc., are not
  * applicable for execbuf3 ioctl.
+ *
+ * During each execbuf submission, request fence is added to all VM_BIND mapped
+ * objects with DMA_RESV_USAGE_BOOKKEEP. The DMA_RESV_USAGE_BOOKKEEP usage will
+ * prevent over sync (See enum dma_resv_usage). Note that DRM_I915_GEM_WAIT and
+ * DRM_I915_GEM_BUSY ioctls do not check for DMA_RESV_USAGE_BOOKKEEP usage and
+ * hence should not be used for end of batch check. Instead, the execbuf3
+ * timeline out fence should be used for end of batch check.
  */
 
 /**
@@ -129,6 +140,23 @@ eb_find_vma(struct i915_address_space *vm, u64 addr)
return i915_gem_vm_bind_lookup_vma(vm, va);
 }
 
+static void eb_scoop_unbound_vma_all(struct i915_address_space *vm)
+{
+   struct i915_vma *vma, *vn;
+
+   /**
+* Move all unbound vmas back into vm_bind_list so that they are
+* revalidated.
+*/
+   spin_lock(&vm->vm_rebind_lock);
+   list_for_each_entry_safe(vma, vn, &vm->vm_rebind_list, vm_rebind_link) {
+   list_del_init(&vma->vm_rebind_link);
+   if (!list_empty(&vma->vm_bind_link))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list);
+   }
+   spin_unlock(&vm->vm_rebind_lock);
+}
+
 static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 {
unsigned int i, current_batch = 0;
@@ -143,14 +171,121 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb)
++current_batch;
}
 
+   eb_scoop_unbound_vma_all(eb->context->vm);
+
+   return 0;
+}
+
+static int eb_lock_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma;
+   int err;
+
+   err = i915_gem_object_lock(eb->context->vm->root_obj, &eb->ww);
+   if (err)
+   return err;
+
+   list_for_each_entry(vma, &vm->non_priv_vm_bind_list,
+   non_priv_vm_bind_link) {
+   err = i915_gem_object_lock(vma->obj, &eb->ww);
+   if (err)
+   return err;
+   }
+
return 0;
 }
 
+static void eb_release_persistent_vma_all(struct i915_execbuffer *eb,
+ bool final)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma, *vn;
+
+   lockdep_assert_held(&vm->vm_bind_lock);
+
+   if (!(eb->args->flags & __EXEC3_HAS_PIN))
+   return;
+
+   assert_object_held(vm->root_obj);
+
+   list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link)
+   __i915_vma_unpin(vma);
+
+   eb->args->flags &= ~__EXEC3_HAS_PIN;
+   if (!final)
+   return;
+
+   list_for_each_entry_safe(vma, vn, &vm->vm_bind_list, vm_bind_link)
+   if (i915_vma_verify_bind_complete(vma))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bound_list);
+}
+
 static void eb_release_vma_all(struct i915_execbuffer *eb, bool final)
 {
+   eb_release_persistent_vma_all(eb, final);
eb_unpin_engine(eb);
 }
 
+static int eb_reserve_fence_for_persistent_vma_all(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *vma;
+   int ret;
+
+   ret = dma_resv_reserve_fences(vm->root_obj->base.resv, 1);
+   if (ret)
+   return ret;
+
+   list_for_each_entry(vma, &vm->non_priv_vm_bind_list,
+

[RFC v4 13/14] drm/i915/vm_bind: Skip vma_lookup for persistent vmas

2022-09-21 Thread Niranjana Vishwanathapura

vma_lookup is tied to segment of the object instead of section
of VA space. Hence, it do not support aliasing (ie., multiple
bindings to the same section of the object).
Skip vma_lookup for persistent vmas as it supports aliasing.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/display/intel_fb_pin.c   |  2 +-
 .../drm/i915/display/intel_plane_initial.c|  2 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  4 +-
 .../drm/i915/gem/i915_gem_vm_bind_object.c|  2 +-
 .../gpu/drm/i915/gem/selftests/huge_pages.c   | 16 +++
 .../i915/gem/selftests/i915_gem_client_blt.c  |  2 +-
 .../drm/i915/gem/selftests/i915_gem_context.c | 12 ++---
 .../drm/i915/gem/selftests/i915_gem_migrate.c |  2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|  6 ++-
 .../drm/i915/gem/selftests/igt_gem_utils.c|  2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  |  2 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  2 +-
 drivers/gpu/drm/i915/gt/intel_gt.c|  2 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c   |  4 +-
 drivers/gpu/drm/i915/gt/intel_renderstate.c   |  2 +-
 drivers/gpu/drm/i915/gt/intel_ring.c  |  2 +-
 .../gpu/drm/i915/gt/intel_ring_submission.c   |  4 +-
 drivers/gpu/drm/i915/gt/intel_timeline.c  |  2 +-
 drivers/gpu/drm/i915/gt/mock_engine.c |  2 +-
 drivers/gpu/drm/i915/gt/selftest_engine_cs.c  |  4 +-
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 16 +++
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |  6 +--
 drivers/gpu/drm/i915/gt/selftest_lrc.c|  2 +-
 .../drm/i915/gt/selftest_ring_submission.c|  2 +-
 drivers/gpu/drm/i915/gt/selftest_rps.c|  2 +-
 .../gpu/drm/i915/gt/selftest_workarounds.c|  4 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.c|  2 +-
 drivers/gpu/drm/i915/i915_gem.c   |  2 +-
 drivers/gpu/drm/i915/i915_perf.c  |  2 +-
 drivers/gpu/drm/i915/i915_vma.c   | 26 +++
 drivers/gpu/drm/i915/i915_vma.h   |  3 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 44 +--
 drivers/gpu/drm/i915/selftests/i915_request.c |  4 +-
 drivers/gpu/drm/i915/selftests/i915_vma.c |  2 +-
 drivers/gpu/drm/i915/selftests/igt_spinner.c  |  2 +-
 .../drm/i915/selftests/intel_memory_region.c  |  2 +-
 37 files changed, 106 insertions(+), 93 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_fb_pin.c 
b/drivers/gpu/drm/i915/display/intel_fb_pin.c
index c86e5d4ee016..5a718b247bb3 100644
--- a/drivers/gpu/drm/i915/display/intel_fb_pin.c
+++ b/drivers/gpu/drm/i915/display/intel_fb_pin.c
@@ -47,7 +47,7 @@ intel_pin_fb_obj_dpt(struct drm_framebuffer *fb,
goto err;
}
 
-   vma = i915_vma_instance(obj, vm, view);
+   vma = i915_vma_instance(obj, vm, view, false);
if (IS_ERR(vma))
goto err;
 
diff --git a/drivers/gpu/drm/i915/display/intel_plane_initial.c 
b/drivers/gpu/drm/i915/display/intel_plane_initial.c
index 76be796df255..7667e2faa3fb 100644
--- a/drivers/gpu/drm/i915/display/intel_plane_initial.c
+++ b/drivers/gpu/drm/i915/display/intel_plane_initial.c
@@ -136,7 +136,7 @@ initial_plane_vma(struct drm_i915_private *i915,
goto err_obj;
}
 
-   vma = i915_vma_instance(obj, &to_gt(i915)->ggtt->vm, NULL);
+   vma = i915_vma_instance(obj, &to_gt(i915)->ggtt->vm, NULL, false);
if (IS_ERR(vma))
goto err_obj;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 363b2a788cdf..0ee43cb601b5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -876,7 +876,7 @@ static struct i915_vma *eb_lookup_vma(struct 
i915_execbuffer *eb, u32 handle)
}
}
 
-   vma = i915_vma_instance(obj, vm, NULL);
+   vma = i915_vma_instance(obj, vm, NULL, false);
if (IS_ERR(vma)) {
i915_gem_object_put(obj);
return vma;
@@ -2208,7 +2208,7 @@ shadow_batch_pin(struct i915_execbuffer *eb,
struct i915_vma *vma;
int err;
 
-   vma = i915_vma_instance(obj, vm, NULL);
+   vma = i915_vma_instance(obj, vm, NULL, false);
if (IS_ERR(vma))
return vma;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index 3087731cc0c0..4468603af6f1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -252,7 +252,7 @@ static struct i915_vma *vm_bind_get_vma(struct 
i915_address_space *vm,
view.type = I915_GTT_VIEW_PARTIAL;
view.partial.offset = va->offset >> PAGE_SHIFT;
view.partial.size = va->length >> PAGE_SHIFT;
-   vma = i915_vma_instance(obj, vm, &v

[RFC v4 03/14] drm/i915/vm_bind: Expose i915_gem_object_max_page_size()

2022-09-21 Thread Niranjana Vishwanathapura

Expose i915_gem_object_max_page_size() function non-static
which will be used by the vm_bind feature.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_create.c | 20 +++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  2 ++
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 33673fe7ee0a..3b3ab4abb0a3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -11,14 +11,24 @@
 #include "pxp/intel_pxp.h"
 
 #include "i915_drv.h"
+#include "i915_gem_context.h"
 #include "i915_gem_create.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 
-static u32 object_max_page_size(struct intel_memory_region **placements,
-   unsigned int n_placements)
+/**
+ * i915_gem_object_max_page_size() - max of min_page_size of the regions
+ * @placements:  list of regions
+ * @n_placements: number of the placements
+ *
+ * Calculates the max of the min_page_size of a list of placements passed in.
+ *
+ * Return: max of the min_page_size
+ */
+u32 i915_gem_object_max_page_size(struct intel_memory_region **placements,
+ unsigned int n_placements)
 {
-   u32 max_page_size = 0;
+   u32 max_page_size = I915_GTT_PAGE_SIZE_4K;
int i;
 
for (i = 0; i < n_placements; i++) {
@@ -28,7 +38,6 @@ static u32 object_max_page_size(struct intel_memory_region 
**placements,
max_page_size = max_t(u32, max_page_size, mr->min_page_size);
}
 
-   GEM_BUG_ON(!max_page_size);
return max_page_size;
 }
 
@@ -99,7 +108,8 @@ __i915_gem_object_create_user_ext(struct drm_i915_private 
*i915, u64 size,
 
i915_gem_flush_free_objects(i915);
 
-   size = round_up(size, object_max_page_size(placements, n_placements));
+   size = round_up(size, i915_gem_object_max_page_size(placements,
+   n_placements));
if (size == 0)
return ERR_PTR(-EINVAL);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 7317d4102955..8c97bddad921 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -47,6 +47,8 @@ static inline bool i915_gem_object_size_2big(u64 size)
 }
 
 void i915_gem_init__objects(struct drm_i915_private *i915);
+u32 i915_gem_object_max_page_size(struct intel_memory_region **placements,
+ unsigned int n_placements);
 
 void i915_objects_module_exit(void);
 int i915_objects_module_init(void);
-- 
2.21.0.rc0.32.g243a4c7e27

[RFC v4 10/14] drm/i915/vm_bind: Update i915_vma_verify_bind_complete()

2022-09-21 Thread Niranjana Vishwanathapura

Ensure i915_vma_verify_bind_complete() handles case where bind
is not initiated. Also make it non static, add documentation
and move it out of CONFIG_DRM_I915_DEBUG_GEM.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/i915_vma.c | 16 +++-
 drivers/gpu/drm/i915/i915_vma.h |  1 +
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index f7d711e675d6..24f171588f56 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -406,12 +406,21 @@ int i915_vma_sync(struct i915_vma *vma)
return i915_vm_sync(vma->vm);
 }
 
-#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
-static int i915_vma_verify_bind_complete(struct i915_vma *vma)
+/**
+ * i915_vma_verify_bind_complete() - Check for the bind completion of the vma
+ * @vma: vma to check for bind completion
+ *
+ * Returns: 0 if the vma bind is completed. Error code otherwise.
+ */
+int i915_vma_verify_bind_complete(struct i915_vma *vma)
 {
struct dma_fence *fence = i915_active_fence_get(&vma->active.excl);
int err;
 
+   /* Ensure vma bind is initiated */
+   if (!i915_vma_is_bound(vma, I915_VMA_BIND_MASK))
+   return -EINVAL;
+
if (!fence)
return 0;
 
@@ -424,9 +433,6 @@ static int i915_vma_verify_bind_complete(struct i915_vma 
*vma)
 
return err;
 }
-#else
-#define i915_vma_verify_bind_complete(_vma) 0
-#endif
 
 I915_SELFTEST_EXPORT void
 i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index aa536c9ce472..3a47db2d85f5 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -433,6 +433,7 @@ void i915_vma_make_purgeable(struct i915_vma *vma);
 
 int i915_vma_wait_for_bind(struct i915_vma *vma);
 int i915_vma_sync(struct i915_vma *vma);
+int i915_vma_verify_bind_complete(struct i915_vma *vma);
 
 /**
  * i915_vma_get_current_resource - Get the current resource of the vma
-- 
2.21.0.rc0.32.g243a4c7e27

[RFC v4 05/14] drm/i915/vm_bind: Support for VM private BOs

2022-09-21 Thread Niranjana Vishwanathapura

Each VM creates a root_obj and shares it with all of its private objects
to use it as dma_resv object. This has a performance advantage as it
requires a single dma_resv object update for all private BOs vs list of
dma_resv objects update for shared BOs, in the execbuf path.

VM private BOs can be only mapped on specified VM and cannot be dmabuf
exported. Also, they are supported only in vm_bind mode.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_create.c| 40 ++-
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|  6 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  4 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  3 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  3 ++
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c   |  3 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c|  9 +
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  4 ++
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  2 +
 drivers/gpu/drm/i915/i915_vma.c   |  1 +
 drivers/gpu/drm/i915/i915_vma_types.h |  2 +
 include/uapi/drm/i915_drm.h   | 30 ++
 12 files changed, 105 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 3b3ab4abb0a3..692d95ef5d3e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -253,6 +253,7 @@ struct create_ext {
unsigned int n_placements;
unsigned int placement_mask;
unsigned long flags;
+   u32 vm_id;
 };
 
 static void repr_placements(char *buf, size_t size,
@@ -402,9 +403,24 @@ static int ext_set_protected(struct i915_user_extension 
__user *base, void *data
return 0;
 }
 
+static int ext_set_vm_private(struct i915_user_extension __user *base,
+ void *data)
+{
+   struct drm_i915_gem_create_ext_vm_private ext;
+   struct create_ext *ext_data = data;
+
+   if (copy_from_user(&ext, base, sizeof(ext)))
+   return -EFAULT;
+
+   ext_data->vm_id = ext.vm_id;
+
+   return 0;
+}
+
 static const i915_user_extension_fn create_extensions[] = {
[I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,
[I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,
+   [I915_GEM_CREATE_EXT_VM_PRIVATE] = ext_set_vm_private,
 };
 
 /**
@@ -420,6 +436,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
struct drm_i915_private *i915 = to_i915(dev);
struct drm_i915_gem_create_ext *args = data;
struct create_ext ext_data = { .i915 = i915 };
+   struct i915_address_space *vm = NULL;
struct drm_i915_gem_object *obj;
int ret;
 
@@ -433,6 +450,12 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
if (ret)
return ret;
 
+   if (ext_data.vm_id) {
+   vm = i915_gem_vm_lookup(file->driver_priv, ext_data.vm_id);
+   if (unlikely(!vm))
+   return -ENOENT;
+   }
+
if (!ext_data.n_placements) {
ext_data.placements[0] =
intel_memory_region_by_type(i915, INTEL_MEMORY_SYSTEM);
@@ -459,8 +482,21 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
ext_data.placements,
ext_data.n_placements,
ext_data.flags);
-   if (IS_ERR(obj))
-   return PTR_ERR(obj);
+   if (IS_ERR(obj)) {
+   ret = PTR_ERR(obj);
+   goto vm_put;
+   }
+
+   if (vm) {
+   obj->base.resv = vm->root_obj->base.resv;
+   obj->priv_root = i915_gem_object_get(vm->root_obj);
+   i915_vm_put(vm);
+   }
 
return i915_gem_publish(obj, file, &args->size, &args->handle);
+vm_put:
+   if (vm)
+   i915_vm_put(vm);
+
+   return ret;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index f5062d0c6333..6433173c3e84 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -218,6 +218,12 @@ struct dma_buf *i915_gem_prime_export(struct 
drm_gem_object *gem_obj, int flags)
struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
 
+   if (obj->priv_root) {
+   drm_dbg(obj->base.dev,
+   "Exporting VM private objects is not allowed\n");
+   return ERR_PTR(-EINVAL);
+   }
+
exp_info.ops = &i915_dmabuf_ops;
exp_info.size = gem_obj->size;
exp_info.flags = flags;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index f85f10cf9c34..33d989a20227 100644
--- a/dr

[RFC v4 06/14] drm/i915/vm_bind: Handle persistent vmas

2022-09-21 Thread Niranjana Vishwanathapura

Treat VM_BIND vmas as persistent across execbuf ioctl calls and handle
them during the request submission in the execbuff path.

Support eviction by maintaining a list of evicted persistent vmas
for rebinding during next submission.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 .../drm/i915/gem/i915_gem_vm_bind_object.c|  7 +++
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  4 ++
 drivers/gpu/drm/i915/i915_gem_gtt.c   | 39 
 drivers/gpu/drm/i915/i915_gem_gtt.h   |  3 ++
 drivers/gpu/drm/i915/i915_vma.c   | 46 +++
 drivers/gpu/drm/i915/i915_vma.h   | 45 +-
 drivers/gpu/drm/i915/i915_vma_types.h | 17 +++
 8 files changed, 151 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index 7ca6a41fc981..236f901b8b9c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -91,6 +91,12 @@ static void i915_gem_vm_bind_remove(struct i915_vma *vma, 
bool release_obj)
 {
lockdep_assert_held(&vma->vm->vm_bind_lock);
 
+   spin_lock(&vma->vm->vm_rebind_lock);
+   if (!list_empty(&vma->vm_rebind_link))
+   list_del_init(&vma->vm_rebind_link);
+   i915_vma_set_purged(vma);
+   spin_unlock(&vma->vm->vm_rebind_lock);
+
list_del_init(&vma->vm_bind_link);
list_del_init(&vma->non_priv_vm_bind_link);
i915_vm_bind_it_remove(vma, &vma->vm->va);
@@ -181,6 +187,7 @@ static struct i915_vma *vm_bind_get_vma(struct 
i915_address_space *vm,
 
vma->start = va->start;
vma->last = va->start + va->length - 1;
+   i915_vma_set_persistent(vma);
 
return vma;
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index da4f9dee0397..6db31197fa87 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -296,6 +296,8 @@ void i915_address_space_init(struct i915_address_space *vm, 
int subclass)
INIT_LIST_HEAD(&vm->non_priv_vm_bind_list);
vm->root_obj = i915_gem_object_create_internal(vm->i915, PAGE_SIZE);
GEM_BUG_ON(IS_ERR(vm->root_obj));
+   INIT_LIST_HEAD(&vm->vm_rebind_list);
+   spin_lock_init(&vm->vm_rebind_lock);
 }
 
 void *__px_vaddr(struct drm_i915_gem_object *p)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 3f2e87d3bf34..b73d35b4e05d 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -273,6 +273,10 @@ struct i915_address_space {
struct list_head vm_bind_list;
/** @vm_bound_list: List of vm_binding completed */
struct list_head vm_bound_list;
+   /* @vm_rebind_list: list of vmas to be rebinded */
+   struct list_head vm_rebind_list;
+   /* @vm_rebind_lock: protects vm_rebound_list */
+   spinlock_t vm_rebind_lock;
/* @va: tree of persistent vmas */
struct rb_root_cached va;
struct list_head non_priv_vm_bind_list;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 329ff75b80b9..b7d0844de561 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -25,6 +25,45 @@
 #include "i915_trace.h"
 #include "i915_vgpu.h"
 
+/**
+ * i915_vm_sync() - Wait until address space is not in use
+ * @vm: address space
+ *
+ * Waits until all requests using the address space are complete.
+ *
+ * Returns: 0 if success, -ve err code upon failure
+ */
+int i915_vm_sync(struct i915_address_space *vm)
+{
+   int ret;
+
+   /* Wait for all requests under this vm to finish */
+   ret = dma_resv_wait_timeout(vm->root_obj->base.resv,
+   DMA_RESV_USAGE_BOOKKEEP, false,
+   MAX_SCHEDULE_TIMEOUT);
+   if (ret < 0)
+   return ret;
+   else if (ret > 0)
+   return 0;
+   else
+   return -ETIMEDOUT;
+}
+
+/**
+ * i915_vm_is_active() - Check if address space is being used
+ * @vm: address space
+ *
+ * Check if any request using the specified address space is
+ * active.
+ *
+ * Returns: true if address space is active, false otherwise.
+ */
+bool i915_vm_is_active(const struct i915_address_space *vm)
+{
+   return !dma_resv_test_signaled(vm->root_obj->base.resv,
+  DMA_RESV_USAGE_BOOKKEEP);
+}
+
 int i915_gem_gtt_prepare_pages(struct drm_i915_gem_object *obj,
   struct sg_table *pages)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h 
b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 8c2f57eb5dda..a5bbdc59d9df 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -51,4 +51,7 @@ int

[RFC v4 14/14] drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode

2022-09-21 Thread Niranjana Vishwanathapura

Add getparam support for VM_BIND capability version.
Add VM creation time flag to enable vm_bind_mode for the VM.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c |  5 -
 drivers/gpu/drm/i915/i915_getparam.c|  3 +++
 include/uapi/drm/i915_drm.h | 24 -
 3 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index f4e648ec01ed..e0ebf47d4d57 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1808,7 +1808,7 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void 
*data,
if (!HAS_FULL_PPGTT(i915))
return -ENODEV;
 
-   if (args->flags)
+   if (args->flags & I915_VM_CREATE_FLAGS_UNKNOWN)
return -EINVAL;
 
ppgtt = i915_ppgtt_create(to_gt(i915), 0);
@@ -1828,6 +1828,9 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void 
*data,
if (err)
goto err_put;
 
+   if (args->flags & I915_VM_CREATE_FLAGS_USE_VM_BIND)
+   ppgtt->vm.vm_bind_mode = true;
+
GEM_BUG_ON(id == 0); /* reserved for invalid/unassigned ppgtt */
args->vm_id = id;
return 0;
diff --git a/drivers/gpu/drm/i915/i915_getparam.c 
b/drivers/gpu/drm/i915/i915_getparam.c
index 342c8ca6414e..56ff5118d017 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -175,6 +175,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
case I915_PARAM_PERF_REVISION:
value = i915_perf_ioctl_version();
break;
+   case I915_PARAM_VM_BIND_VERSION:
+   value = GRAPHICS_VER(i915) >= 12 ? 1 : 0;
+   break;
default:
DRM_DEBUG("Unknown parameter %d\n", param->param);
return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index eaeb80a3ede1..df0fb875276c 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -755,6 +755,27 @@ typedef struct drm_i915_irq_wait {
 /* Query if the kernel supports the I915_USERPTR_PROBE flag. */
 #define I915_PARAM_HAS_USERPTR_PROBE 56
 
+/*
+ * VM_BIND feature version supported.
+ *
+ * The following versions of VM_BIND have been defined:
+ *
+ * 0: No VM_BIND support.
+ *
+ * 1: In VM_UNBIND calls, the UMD must specify the exact mappings created
+ *previously with VM_BIND, the ioctl will not support unbinding multiple
+ *mappings or splitting them. Similarly, VM_BIND calls will not replace
+ *any existing mappings.
+ *
+ * 2: The restrictions on unbinding partial or multiple mappings is
+ *lifted, Similarly, binding will replace any mappings in the given range.
+ *
+ * See struct drm_i915_gem_vm_bind and struct drm_i915_gem_vm_unbind.
+ *
+ * vm_bind versions are backward compatible.
+ */
+#define I915_PARAM_VM_BIND_VERSION 57
+
 /* Must be kept compact -- no holes and well documented */
 
 /**
@@ -2625,7 +2646,8 @@ struct drm_i915_gem_vm_control {
/** @extensions: Zero-terminated chain of extensions. */
__u64 extensions;
 
-   /** @flags: reserved for future usage, currently MBZ */
+#define I915_VM_CREATE_FLAGS_USE_VM_BIND   (1u << 0)
+#define I915_VM_CREATE_FLAGS_UNKNOWN   (-(I915_VM_CREATE_FLAGS_USE_VM_BIND << 
1))
__u32 flags;
 
/** @vm_id: Id of the VM created or to be destroyed */
-- 
2.21.0.rc0.32.g243a4c7e27

[RFC v4 12/14] drm/i915/vm_bind: userptr dma-resv changes

2022-09-21 Thread Niranjana Vishwanathapura

For persistent (vm_bind) vmas of userptr BOs, handle the user
page pinning by using the i915_gem_object_userptr_submit_init()
/done() functions

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 99 +++
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   | 17 
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 16 +++
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  4 +
 drivers/gpu/drm/i915/i915_vma_types.h |  2 +
 6 files changed, 140 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
index 82a068d03440..7467e3daac5c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -22,6 +22,7 @@
 #include "i915_gem_vm_bind.h"
 #include "i915_trace.h"
 
+#define __EXEC3_USERPTR_USED   BIT_ULL(34)
 #define __EXEC3_HAS_PINBIT_ULL(33)
 #define __EXEC3_ENGINE_PINNED  BIT_ULL(32)
 #define __EXEC3_INTERNAL_FLAGS (~0ull << 32)
@@ -144,6 +145,21 @@ static void eb_scoop_unbound_vma_all(struct 
i915_address_space *vm)
 {
struct i915_vma *vma, *vn;
 
+#ifdef CONFIG_MMU_NOTIFIER
+   /**
+* Move all invalidated userptr vmas back into vm_bind_list so that
+* they are looked up and revalidated.
+*/
+   spin_lock(&vm->userptr_invalidated_lock);
+   list_for_each_entry_safe(vma, vn, &vm->userptr_invalidated_list,
+userptr_invalidated_link) {
+   list_del_init(&vma->userptr_invalidated_link);
+   if (!list_empty(&vma->vm_bind_link))
+   list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list);
+   }
+   spin_unlock(&vm->userptr_invalidated_lock);
+#endif
+
/**
 * Move all unbound vmas back into vm_bind_list so that they are
 * revalidated.
@@ -157,10 +173,47 @@ static void eb_scoop_unbound_vma_all(struct 
i915_address_space *vm)
spin_unlock(&vm->vm_rebind_lock);
 }
 
+static int eb_lookup_persistent_userptr_vmas(struct i915_execbuffer *eb)
+{
+   struct i915_address_space *vm = eb->context->vm;
+   struct i915_vma *last_vma = NULL;
+   struct i915_vma *vma;
+   int err;
+
+   lockdep_assert_held(&vm->vm_bind_lock);
+
+   list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link) {
+   if (!i915_gem_object_is_userptr(vma->obj))
+   continue;
+
+   err = i915_gem_object_userptr_submit_init(vma->obj);
+   if (err)
+   return err;
+
+   /**
+* The above submit_init() call does the object unbind and
+* hence adds vma into vm_rebind_list. Remove it from that
+* list as it is already scooped for revalidation.
+*/
+   spin_lock(&vm->vm_rebind_lock);
+   if (!list_empty(&vma->vm_rebind_link))
+   list_del_init(&vma->vm_rebind_link);
+   spin_unlock(&vm->vm_rebind_lock);
+
+   last_vma = vma;
+   }
+
+   if (last_vma)
+   eb->args->flags |= __EXEC3_USERPTR_USED;
+
+   return 0;
+}
+
 static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 {
unsigned int i, current_batch = 0;
struct i915_vma *vma;
+   int err = 0;
 
for (i = 0; i < eb->num_batches; i++) {
vma = eb_find_vma(eb->context->vm, eb->batch_addresses[i]);
@@ -173,6 +226,10 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 
eb_scoop_unbound_vma_all(eb->context->vm);
 
+   err = eb_lookup_persistent_userptr_vmas(eb);
+   if (err)
+   return err;
+
return 0;
 }
 
@@ -330,15 +387,57 @@ static void eb_move_all_persistent_vma_to_active(struct 
i915_execbuffer *eb)
 
 static int eb_move_to_gpu(struct i915_execbuffer *eb)
 {
+   int err = 0;
+
lockdep_assert_held(&eb->context->vm->vm_bind_lock);
assert_object_held(eb->context->vm->root_obj);
 
eb_move_all_persistent_vma_to_active(eb);
 
+#ifdef CONFIG_MMU_NOTIFIER
+   /* Check for further userptr invalidations */
+   spin_lock(&eb->context->vm->userptr_invalidated_lock);
+   if (!list_empty(&eb->context->vm->userptr_invalidated_list))
+   err = -EAGAIN;
+   spin_unlock(&eb->context->vm->userptr_invalidated_lock);
+
+   if (!err && (eb->args->flags & __EXEC3_USERPTR_USED)) {
+   struct i915_vma *vma;
+
+   lockdep_assert_held(&eb->context->vm->vm_bind_lock);
+   assert_object_held(eb->context->vm->root_obj);
+
+   read_lock(&eb->i915->mm.notifier_lock);
+   list_for_each_entry(vma, &eb->context->vm->vm_bind_list,
+   vm_bind_link) {
+   if (

Re: [Intel-gfx] [RFC PATCH v3 10/17] drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl

2022-09-21 Thread Niranjana Vishwanathapura


On Mon, Sep 05, 2022 at 04:08:57PM +0100, Tvrtko Ursulin wrote:


On 02/09/2022 06:41, Niranjana Vishwanathapura wrote:

On Thu, Sep 01, 2022 at 08:58:57AM +0100, Tvrtko Ursulin wrote:



On 01/09/2022 06:09, Niranjana Vishwanathapura wrote:

On Wed, Aug 31, 2022 at 08:38:48AM +0100, Tvrtko Ursulin wrote:


On 27/08/2022 20:43, Andi Shyti wrote:

From: Niranjana Vishwanathapura 

Implement new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only
works in vm_bind mode. The vm_bind mode only works with
this new execbuf3 ioctl.

The new execbuf3 ioctl will not have any list of objects to validate
bind as all required objects binding would have been requested by the
userspace before submitting the execbuf3.

And the legacy support like relocations etc are removed.

Signed-off-by: Niranjana Vishwanathapura 


Signed-off-by: Ramalingam C 
Signed-off-by: Andi Shyti 
---


[snip]


+static void signal_fence_array(const struct i915_execbuffer *eb,
+   struct dma_fence * const fence)
+{
+    unsigned int n;
+
+    for (n = 0; n < eb->num_fences; n++) {
+    struct drm_syncobj *syncobj;
+    unsigned int flags;
+
+    syncobj = ptr_unpack_bits(eb->fences[n].syncobj, &flags, 2);
+    if (!(flags & I915_TIMELINE_FENCE_SIGNAL))
+    continue;
+
+    if (eb->fences[n].chain_fence) {
+    drm_syncobj_add_point(syncobj,
+  eb->fences[n].chain_fence,
+  fence,
+  eb->fences[n].value);
+    /*
+ * The chain's ownership is transferred to the
+ * timeline.
+ */
+    eb->fences[n].chain_fence = NULL;
+    } else {
+    drm_syncobj_replace_fence(syncobj, fence);
+    }
+    }
+}
Semi-random place to ask - how many of the code here is direct 
copy of existing functions from i915_gem_execbuffer.c? There 
seems to be some 100% copies at least. And then some more with 
small tweaks. Spend some time and try to figure out some code 
sharing?




During VM_BIND design review, maintainers expressed thought on keeping
execbuf3 completely separate and not touch the legacy execbuf path.


Got a link so this maintainer can see what exactly was said? Just 
to make sure there isn't any misunderstanding on what "completely 
separate" means to different people.


Here is one (search for copypaste/copy-paste)
https://patchwork.freedesktop.org/patch/486608/?series=93447&rev=3
It is hard to search for old discussion threads. May be maintainers
can provide feedback here directly. Dave, Daniel? :)


Thanks. I had a read and don't see a fundamental conflict with what I 
said. Conclusion seemed to be to go with a new ioctl and implement 
code sharing where it makes sense. Which is what TODO in the cover 
letter acknowledges so there should be no disagreement really.



I also think, execbuf3 should be fully separate. We can do some code
sharing where is a close 100% copy (there is a TODO in cover letter).
There are some changes like the timeline fence array handling here
which looks similar, but the uapi is not exactly the same. Probably,
we should keep them separate and not try to force code sharing at
least at this point.


Okay did not spot that TODO in the cover. But fair since it is RFC 
to be unfinished.


I do however think it should be improved before considering the 
merge. Because looking at the patch, 100% copies are:


for_each_batch_create_order
for_each_batch_add_order
eb_throttle
eb_pin_timeline
eb_pin_engine
eb_put_engine
__free_fence_array
put_fence_array
await_fence_array
signal_fence_array
retire_requests
eb_request_add
eb_requests_get
eb_requests_put
eb_find_context

Quite a lot.

Then there is a bunch of almost same functions which could be 
shared if there weren't two incompatible local struct 
i915_execbuffer's. Especially given when the out fence TODO item 
gets handled a chunk more will also become a 100% copy.




There are difinitely a few which is 100% copies hence should have a
shared code.
But some are not. Like, fence_array stuff though looks very similar,
the uapi structures are different between execbuf3 and legacy execbuf.
The internal flags are also different (eg., __EXEC3_ENGINE_PINNED vs
__EXEC_ENGINE_PINNED) which causes minor differences hence not a
100% copy.

So, I am not convinced if it is worth carrying legacy stuff into
execbuf3 code. I think we need to look at these on a case by case
basis and see if abstracting common functionality to a separate
shared code makes sense or it is better to keep the code separate.


No one is suggesting to carry any legacy stuff into eb3. What I'd 
suggest is to start something like i915_gem_eb_common.h|c and stuff 
the 100% copies from the above list in there.


Common struct eb with struct eb2 and eb3 inheriting from it should do 
the trick. Similarly eb->flags shouldn't be a hard problem to solve.


Then you see what remains and whether it makes sense to consolidate further.



Tvrtko,

Re: [RFC PATCH v3 04/17] drm/i915: Implement bind and unbind of object

2022-09-21 Thread Niranjana Vishwanathapura


On Mon, Sep 12, 2022 at 04:11:54PM +0300, Jani Nikula wrote:

On Sat, 27 Aug 2022, Andi Shyti  wrote:

From: Niranjana Vishwanathapura 

Implement the bind and unbind of an object at the specified GPU virtual
addresses.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Prathap Kumar Valsan 
Signed-off-by: Ramalingam C 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  21 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c| 322 ++
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  10 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |   9 +
 drivers/gpu/drm/i915/i915_driver.c|   1 +
 drivers/gpu/drm/i915/i915_vma.c   |   3 +-
 drivers/gpu/drm/i915/i915_vma.h   |   2 -
 drivers/gpu/drm/i915/i915_vma_types.h |  14 +
 include/uapi/drm/i915_drm.h   | 163 +
 10 files changed, 543 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 522ef9b4aff32..4e1627e96c6e0 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -165,6 +165,7 @@ gem-y += \
gem/i915_gem_ttm_move.o \
gem/i915_gem_ttm_pm.o \
gem/i915_gem_userptr.o \
+   gem/i915_gem_vm_bind_object.o \
gem/i915_gem_wait.o \
gem/i915_gemfs.o
 i915-y += \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
new file mode 100644
index 0..ebc493b7dafc1
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#ifndef __I915_GEM_VM_BIND_H
+#define __I915_GEM_VM_BIND_H
+
+#include "i915_drv.h"


Please only include what you need. Here, you'll only need to include
 for u64 and bool. For everything else, add forward
declarations.


Jani,

Thanks. I have posted vm_bind v4 rfc series with your comments
addressed.

Regards,
Niranjana




+
+struct i915_vma *
+i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va);
+void i915_gem_vm_bind_remove(struct i915_vma *vma, bool release_obj);
+
+int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file);
+int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
+struct drm_file *file);
+
+void i915_gem_vm_unbind_vma_all(struct i915_address_space *vm);


Blank line here please.


+#endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
new file mode 100644
index 0..dadd1d4b1761b
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -0,0 +1,322 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+
+#include "gem/i915_gem_vm_bind.h"
+#include "gem/i915_gem_context.h"
+#include "gt/gen8_engine_cs.h"
+
+#include "i915_drv.h"
+#include "i915_gem_gtt.h"
+
+#define START(node) ((node)->start)
+#define LAST(node) ((node)->last)
+
+INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last,
+START, LAST, static inline, i915_vm_bind_it)
+
+#undef START
+#undef LAST
+
+/**
+ * DOC: VM_BIND/UNBIND ioctls
+ *
+ * DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM buffer
+ * objects (BOs) or sections of a BOs at specified GPU virtual addresses on a
+ * specified address space (VM). Multiple mappings can map to the same physical
+ * pages of an object (aliasing). These mappings (also referred to as 
persistent
+ * mappings) will be persistent across multiple GPU submissions (execbuf calls)
+ * issued by the UMD, without user having to provide a list of all required
+ * mappings during each submission (as required by older execbuf mode).
+ *
+ * The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for
+ * signaling the completion of bind/unbind operation.
+ *
+ * VM_BIND feature is advertised to user via I915_PARAM_VM_BIND_VERSION.
+ * User has to opt-in for VM_BIND mode of binding for an address space (VM)
+ * during VM creation time via I915_VM_CREATE_FLAGS_USE_VM_BIND extension.
+ *
+ * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
+ * are not ordered. Furthermore, parts of the VM_BIND/UNBIND operations can be
+ * done asynchronously, when valid out fence is specified.
+ *
+ * VM_BIND locking order is as below.
+ *
+ * 1) vm_bind_lock mutex will protect vm_bind lists. This lock is taken in
+ *vm_bind/vm_unbind ioctl calls, in the execbuf path and while releasing 
the
+ *mapping.
+ *
+ *In future, when GPU page faults are supported, we can potentially use a
+ *rwsem instead, so that

Re: [Intel-gfx] [RFC PATCH v3 07/17] drm/i915/vm_bind: Handle persistent vmas

2022-09-21 Thread Niranjana Vishwanathapura


On Mon, Sep 12, 2022 at 04:16:06PM +0300, Jani Nikula wrote:

On Sat, 27 Aug 2022, Andi Shyti  wrote:

From: Niranjana Vishwanathapura 

Treat VM_BIND vmas as persistent across execbuf ioctl calls and handle
them during the request submission in the execbuff path.

Support eviction by maintaining a list of evicted persistent vmas
for rebinding during next submission.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Ramalingam C 
Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  1 +
 .../drm/i915/gem/i915_gem_vm_bind_object.c|  8 +++
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  4 ++
 drivers/gpu/drm/i915/i915_gem_gtt.c   | 38 +
 drivers/gpu/drm/i915/i915_gem_gtt.h   |  3 +
 drivers/gpu/drm/i915/i915_vma.c   | 50 +++--
 drivers/gpu/drm/i915/i915_vma.h   | 56 +++
 drivers/gpu/drm/i915/i915_vma_types.h | 24 
 9 files changed, 169 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 389e9f157ca5e..825dce41f7113 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -38,6 +38,7 @@
 #include "i915_gem_mman.h"
 #include "i915_gem_object.h"
 #include "i915_gem_ttm.h"
+#include "i915_gem_vm_bind.h"


Why do you add this here if you're not using anything from there?


Addressed in v4 rfc series.




 #include "i915_memcpy.h"
 #include "i915_trace.h"

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index 9ff929f187cfd..3b45529fe8d4c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -91,6 +91,13 @@ void i915_gem_vm_bind_remove(struct i915_vma *vma, bool 
release_obj)
 {
lockdep_assert_held(&vma->vm->vm_bind_lock);

+   spin_lock(&vma->vm->vm_rebind_lock);
+   if (!list_empty(&vma->vm_rebind_link))
+   list_del_init(&vma->vm_rebind_link);
+   i915_vma_set_purged(vma);
+   i915_vma_set_freed(vma);
+   spin_unlock(&vma->vm->vm_rebind_lock);
+
if (!list_empty(&vma->vm_bind_link)) {
list_del_init(&vma->vm_bind_link);
list_del_init(&vma->non_priv_vm_bind_link);
@@ -190,6 +197,7 @@ static struct i915_vma *vm_bind_get_vma(struct 
i915_address_space *vm,

vma->start = va->start;
vma->last = va->start + va->length - 1;
+   i915_vma_set_persistent(vma);

return vma;
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index c4f75826213ae..97cd0089b516d 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -296,6 +296,8 @@ void i915_address_space_init(struct i915_address_space *vm, 
int subclass)
INIT_LIST_HEAD(&vm->non_priv_vm_bind_list);
vm->root_obj = i915_gem_object_create_internal(vm->i915, PAGE_SIZE);
GEM_BUG_ON(IS_ERR(vm->root_obj));
+   INIT_LIST_HEAD(&vm->vm_rebind_list);
+   spin_lock_init(&vm->vm_rebind_lock);
 }

 void *__px_vaddr(struct drm_i915_gem_object *p)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 9a2665e4ec2e5..1f3b1967ec175 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -265,6 +265,10 @@ struct i915_address_space {
struct list_head vm_bind_list;
/** @vm_bound_list: List of vm_binding completed */
struct list_head vm_bound_list;
+   /* @vm_rebind_list: list of vmas to be rebinded */
+   struct list_head vm_rebind_list;
+   /* @vm_rebind_lock: protects vm_rebound_list */
+   spinlock_t vm_rebind_lock;
/* @va: tree of persistent vmas */
struct rb_root_cached va;
struct list_head non_priv_vm_bind_list;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 329ff75b80b97..f083724163deb 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -25,6 +25,44 @@
 #include "i915_trace.h"
 #include "i915_vgpu.h"

+/**
+ * i915_vm_sync() - Wait for all requests on private vmas of a vm to be 
completed
+ * @vm: address space we need to wait for idle
+ *
+ * Waits till all requests of the vm_binded private objs are completed.
+ *
+ * Returns: 0 on success -ve errcode on failure
+ */
+int i915_vm_sync(struct i915_address_space *vm)
+{
+   int ret;
+
+   /* Wait for all requests under this vm to finish */
+   ret = dma_resv_wait_timeout(vm->root_obj->base.resv,
+   DMA_RESV_USAGE_BOOKKEEP, false,
+   MAX_SCHEDULE_TIMEOUT);
+   if (ret < 0)
+   return ret;
+   else if (ret > 0)
+   return 0;
+   e

Re: [PATCH V2 1/2] dt-bindings: display: panel: Add Samsung AMS495QA01 bindings

2022-09-21 Thread Krzysztof Kozlowski

On 20/09/2022 19:09, Chris Morgan wrote:
> From: Chris Morgan 
> 
> Add documentation for the Samsung AMS495QA01 panel.
> 
> Signed-off-by: Chris Morgan 
> ---
>  .../display/panel/samsung,ams495qa01.yaml | 46 +++
>  1 file changed, 46 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/display/panel/samsung,ams495qa01.yaml
> 
> diff --git 
> a/Documentation/devicetree/bindings/display/panel/samsung,ams495qa01.yaml 
> b/Documentation/devicetree/bindings/display/panel/samsung,ams495qa01.yaml
> new file mode 100644
> index ..08358cdad19c
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/display/panel/samsung,ams495qa01.yaml
> @@ -0,0 +1,46 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/display/panel/samsung,ams495qa01.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Samsung AMS495QA01 4.95in 960x544 DSI/SPI panel
> +
> +maintainers:
> +  - Chris Morgan 
> +
> +allOf:
> +  - $ref: panel-common.yaml#
> +
> +properties:
> +  compatible:
> +const: samsung,ams495qa01

Blank line.


> +  reg: true
> +  reset-gpios: true
> +  elvdd-supply:
> +description: regulator that supplies voltage to the panel display
> +  enable-gpios: true
> +  port: true
> +  vdd-supply:
> +description: regulator that supplies voltage to panel logic


Best regards,
Krzysztof

Re: [PATCH] drm/amdgpu: initialize r variable into amdgpu_cs_submit function

2022-09-21 Thread Tommaso Merciai

Hi Christian,

On Tue, Sep 20, 2022 at 02:23:58PM +0200, Christian König wrote:
> Am 20.09.22 um 14:22 schrieb Tommaso Merciai:
> > The builds of arm64 allmodconfig with clang failed to build
> > next-20220920 with the following result:
> > 
> > 1190:3: error: variable 'r' is uninitialized when used here 
> > [-Werror,-Wuninitialized]
> > note: initialize the variable 'r' to silence this warning
> > 
> > This fix compilation error
> 
> I've already send a patch to fix this to the mailing list 7 Minutes ago :)
> 
> Please review or ack that one.

Sorry, my bad. Don't see your patch :)

Cheers,
Tommaso

> 
> Thanks,
> Christian.
> 
> > 
> > Signed-off-by: Tommaso Merciai 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > index 58088c663125..efa3dc9b69fd 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > @@ -1168,7 +1168,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser 
> > *p,
> > struct amdgpu_bo_list_entry *e;
> > struct amdgpu_job *job;
> > uint64_t seq;
> > -   int r;
> > +   int r = 0;
> > job = p->job;
> > p->job = NULL;
> 

-- 
Tommaso Merciai
Embedded Linux Engineer
tommaso.merc...@amarulasolutions.com
__

Amarula Solutions SRL
Via Le Canevare 30, 31100 Treviso, Veneto, IT
T. +39 042 243 5310
i...@amarulasolutions.com
www.amarulasolutions.com

[PATCH] drm/amdgpu: initialize r variable into amdgpu_cs_submit function

2022-09-21 Thread Tommaso Merciai

The builds of arm64 allmodconfig with clang failed to build
next-20220920 with the following result:

1190:3: error: variable 'r' is uninitialized when used here 
[-Werror,-Wuninitialized]
note: initialize the variable 'r' to silence this warning

This fix compilation error

Signed-off-by: Tommaso Merciai 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 58088c663125..efa3dc9b69fd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1168,7 +1168,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
struct amdgpu_bo_list_entry *e;
struct amdgpu_job *job;
uint64_t seq;
-   int r;
+   int r = 0;
 
job = p->job;
p->job = NULL;
-- 
2.25.1

Re: [PATCH v3 0/8] Support for NVDEC on Tegra234

2022-09-21 Thread Krzysztof Kozlowski

On 20/09/2022 10:11, Mikko Perttunen wrote:
> From: Mikko Perttunen 
> 
> v3:
> * Updated patch 3 based on comments
> 
> v2:
> * Updated patches 1,3 based on comments
> * Added Acked-by to patch 2
> 
> Original message:
> 
> Hi all,
> 
> this series adds support for the HW video decoder, NVDEC,
> on Tegra234 (Orin). The main change is a switch from Falcon
> to RISC-V for the internal microcontroller, which brings along
> a change in how the engine is booted. Otherwise it is backwards
> compatible with earlier versions.

I asked you to describe the dependencies and patch merging strategy.
It's still not here, so I assume there are no and I am taking patches
relevant to me.

Best regards,
Krzysztof

Re: [PATCH v3 1/8] memory: tegra: Add API for retrieving carveout bounds

2022-09-21 Thread Krzysztof Kozlowski

On 20/09/2022 10:11, Mikko Perttunen wrote:
> From: Mikko Perttunen 
> 
> On Tegra234 NVDEC firmware is loaded from a secure carveout, where it
> has been loaded by a bootloader. When booting NVDEC, we need to tell it
> the address of this firmware, which we can determine by checking the
> starting address of the carveout. As such, add an MC API to query the
> bounds of carveouts, and add related information on Tegra234.
> 
> Signed-off-by: Mikko Perttunen 
> ---
> v2:
> - Add check for 64-bit phys_addr_t. In practice phys_addr_t
>   is always 64 bits where this runs, but it avoids warnings in
>   compile test.
> ---
>  drivers/memory/tegra/mc.c   | 25 +
>  drivers/memory/tegra/tegra234.c |  5 +
>  include/soc/tegra/mc.h  | 11 +++

I still need here Thierry's ack.

Best regards,
Krzysztof

Re: [PATCH v3 0/8] Support for NVDEC on Tegra234

2022-09-21 Thread Mikko Perttunen


On 9/21/22 10:26, Krzysztof Kozlowski wrote:

On 20/09/2022 10:11, Mikko Perttunen wrote:

From: Mikko Perttunen 

v3:
* Updated patch 3 based on comments

v2:
* Updated patches 1,3 based on comments
* Added Acked-by to patch 2

Original message:

Hi all,

this series adds support for the HW video decoder, NVDEC,
on Tegra234 (Orin). The main change is a switch from Falcon
to RISC-V for the internal microcontroller, which brings along
a change in how the engine is booted. Otherwise it is backwards
compatible with earlier versions.


I asked you to describe the dependencies and patch merging strategy.
It's still not here, so I assume there are no and I am taking patches
relevant to me.

Best regards,
Krzysztof


Sorry, I described it in the earlier email and forgot to add it to the 
cover letter..


Patch 8 does depend on patch 1 so it would be better to take the memory 
patch with it, or however works best from maintainership point of view 
(not my expertise).


thanks,
Mikko

Re: [PATCH] drm/i915: Do not cleanup obj with NULL bo->resource

2022-09-21 Thread Christian König


Am 20.09.22 um 19:13 schrieb Matthew Auld:

On 20/09/2022 18:06, Nirmoy Das wrote:

For delayed BO release i915_ttm_delete_mem_notify()
gets called twice, once with proper bo->resource and
another time with NULL. We shouldn't do anything for
the 2nd time as we already cleanedup the obj once.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/6850
Signed-off-by: Nirmoy Das 


Reviewed-by: Matthew Auld 

Christian, as per above it looks like ttm calls into the 
delete_mem_notify() hook twice if the object ends up on the delayed 
destroy list, is that expected/normal?


Yeah, that's expected. IIRC some driver depended on this for some reason.

I already wanted to change this behavior, but forgot to do so after the 
patch set which made bo->resource a pointer landed.


Going to look into it once more.

Thanks,
Christian.




---
  drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c

index 0544b0a4a43a..e3fc38dd5db0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -511,7 +511,7 @@ static void i915_ttm_delete_mem_notify(struct 
ttm_buffer_object *bo)

  struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
  intel_wakeref_t wakeref = 0;
  -    if (likely(obj)) {
+    if (bo->resource && likely(obj)) {
  /* ttm_bo_release() already has dma_resv_lock */
  if (i915_ttm_cpu_maps_iomem(bo->resource))
  wakeref = 
intel_runtime_pm_get(&to_i915(obj->base.dev)->runtime_pm);

Re: [PATCH v3 0/8] Support for NVDEC on Tegra234

2022-09-21 Thread Krzysztof Kozlowski

On 21/09/2022 09:50, Mikko Perttunen wrote:
> On 9/21/22 10:26, Krzysztof Kozlowski wrote:
>> On 20/09/2022 10:11, Mikko Perttunen wrote:
>>> From: Mikko Perttunen 
>>>
>>> v3:
>>> * Updated patch 3 based on comments
>>>
>>> v2:
>>> * Updated patches 1,3 based on comments
>>> * Added Acked-by to patch 2
>>>
>>> Original message:
>>>
>>> Hi all,
>>>
>>> this series adds support for the HW video decoder, NVDEC,
>>> on Tegra234 (Orin). The main change is a switch from Falcon
>>> to RISC-V for the internal microcontroller, which brings along
>>> a change in how the engine is booted. Otherwise it is backwards
>>> compatible with earlier versions.
>>
>> I asked you to describe the dependencies and patch merging strategy.
>> It's still not here, so I assume there are no and I am taking patches
>> relevant to me.
>>
>> Best regards,
>> Krzysztof
> 
> Sorry, I described it in the earlier email and forgot to add it to the 
> cover letter..

Please keep it in cover letter. We all get too many emails and too many
patchsets to remember. Plus, things can change and such dependency can
disappear after some versions.

Best regards,
Krzysztof

Re: [PATCH 4/5] drm/damage-helper: Do partial updates if framebuffer has not been changed

2022-09-21 Thread Thomas Zimmermann


Hi Ville

Am 20.09.22 um 16:31 schrieb Ville Syrjälä:

On Tue, Sep 20, 2022 at 03:56:18PM +0200, Thomas Zimmermann wrote:

Set partial updates on a plane if the framebuffer has not been changed
on an atomic commit. If such a plane has damage clips, the driver will
use them; otherwise the update is effectively empty. Planes that change
their framebuffer still perform a full update.


I have a feeling you're sort of papering over some inefficient
userspace that's asking updates on planes that don't actually
need them. I'm not sure adding more code for that is a particularly
great idea. Wouldn't it be better to just fix the userspace code?


Some more context might be in order:

The ast driver currently uses VRAM helpers, which leads to many 
problems; including blank screens from the low amount of video memory. 
The best solution is to switch SHMEM with BOs on system meory. The video 
memory is only the internal buffer for scanout. This update involves a 
mempcy from the BO to video memory.


Ast's hardware is really slow, so it makes sense to reduce the updates 
to video memory to a minimum. There's support for cursor planes, which 
really makes a difference here.


But doing any cursor-plane update (e.g., moving, animation) with SHMEM 
and the current damage helpers always results in a full-screen memcpy 
from the BO to the video memory for the primary plane. As the ast 
hardware is slow, performance goes down to a an estimated 5 frame per 
seconds. After moving the mouse, one can watch the mouse cursor follow 
along the screen for the next seconds. Userspace sends the movement 
information and DRM slowly processes them. The same can be observed for 
cursor animation.


The problem is that each change to the cursor plane results in an 
atomic_update call for the primary plane. Not having damage information 
for the plane just means 'update everything'. Not a problem if redrawing 
consists of reprogramming the scanout address. For a memcpy it's not 
feasible.


So can this be fixed in userspace? No realistically IMHO. I've seen this 
problem on Weston, Wayland-Gnome and X11-Gnome. And they are all 
problematic in their own way. (That's why there are multiple patches 
needed.) For example, X11 uses the legacy mouse ioctl, which one of the 
patches fixes. The other userspace needs the other heuristics. A 
potential userspace fix would mean to always set empty-damage 
information on all planes that don't get an update. And I don't consider 
X11 fixable TBH.




Any property change on the plane could need a full plane
update as well (eg. some color mangement stuff etc.). So you'd
have to keep adding exceptions to that list whenever new
properties are added.


It's not about the occasional change of a property. It's the constant 
sluggish redraw when the interface is supposed to be snappy, such as 
mouse interaction.




And I think the semantics of the ioctl(s) has so far been that
basically any page flip (whether or not it actually changes
the fb) still ends up doing whatever flushing is needed to
guarantee the fb contents are up to date on the screen (if
someone sneaked in some front buffer rendering in between).
Ie. you could even use basically a nop page flip in place
of dirtyfb.


That's why full updates are still the default. Only in cases where it's 
save to avoid an update of unaffected planes, we do so.


I know that we don't give performance guarantees to userspace. But using 
cursor/overlay planes should be faster then not using them. I think 
that's a reasonable assumption for userspace to make.




Another thing the ioctls have always done is actually perform
the commit whether anything changed or nor. That is, they
still do all the same the same vblanky stuff and the commit
takes the same amount of time. Not sure if your idea is
to potentially short circuit the entire thing and make it
take no time at all?


The patches don't change the overall commit logics. All they do is to 
set damage updates to a size of 0 if a plane reliably does not need an 
update. A driver's atomic_update still runs, but the damage iterator 
does not return any damaged areas.


Best regards
Thomas






--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature

Re: [PATCH 04/18] phy: mediatek: ufs: use common register access helpers

2022-09-21 Thread AngeloGioacchino Del Regno


Il 20/09/22 11:00, Chunfeng Yun ha scritto:

No need define private register access helpers, use common ones defined
in phy-mtk-io.h

Signed-off-by: Chunfeng Yun 


Reviewed-by: AngeloGioacchino Del Regno

Re: [PATCH 05/18] phy: mediatek: pcie: use new helper to update register bits

2022-09-21 Thread AngeloGioacchino Del Regno


Il 20/09/22 11:00, Chunfeng Yun ha scritto:

The new helper will use FIELD_PREP() macro to prepare bits value
according to mask, then we no need do it anymore.

Signed-off-by: Chunfeng Yun 


Reviewed-by: AngeloGioacchino Del Regno

Re: [PATCH 08/18] phy: mediatek: hdmi: mt2701: use common helper to access registers

2022-09-21 Thread AngeloGioacchino Del Regno


Il 20/09/22 11:00, Chunfeng Yun ha scritto:

Use MediaTek phy's common helper to access registers, then we can remove
hdmi's I/O helpers.

Signed-off-by: Chunfeng Yun 


Reviewed-by: AngeloGioacchino Del Regno

Re: [PATCH 13/18] phy: mediatek: mipi: mt8173: use GENMASK to generate bits mask

2022-09-21 Thread AngeloGioacchino Del Regno


Il 20/09/22 11:00, Chunfeng Yun ha scritto:

Use GENMASK() macro to generate bits mask

Signed-off-by: Chunfeng Yun 


Reviewed-by: AngeloGioacchino Del Regno

Re: [PATCH 06/18] phy: mediatek: hdmi: mt2701: use GENMASK and BIT to generate mask and bits

2022-09-21 Thread AngeloGioacchino Del Regno


Il 20/09/22 11:00, Chunfeng Yun ha scritto:

Use GENMASK() and BIT() macros to generate mask and bits

Signed-off-by: Chunfeng Yun 


Reviewed-by: AngeloGioacchino Del Regno

Re: [PATCH 07/18] phy: mediatek: hdmi: mt2701: use FIELD_PREP to prepare bits field

2022-09-21 Thread AngeloGioacchino Del Regno


Il 20/09/22 11:00, Chunfeng Yun ha scritto:

Use FIELD_PREP() macro to prepare bits field value, then no need define
macros of bits offset.

Signed-off-by: Chunfeng Yun 


Reviewed-by: AngeloGioacchino Del Regno

Re: [PATCH 12/18] phy: mediatek: hdmi: remove register access helpers

2022-09-21 Thread AngeloGioacchino Del Regno


Il 20/09/22 11:00, Chunfeng Yun ha scritto:

Remove private register access helpers, use the common ones instead.

Signed-off-by: Chunfeng Yun 


Reviewed-by: AngeloGioacchino Del Regno

Re: [PATCH 11/18] phy: mediatek: hdmi: mt8173: use common helper to access registers

2022-09-21 Thread AngeloGioacchino Del Regno


Il 20/09/22 11:00, Chunfeng Yun ha scritto:

Use MediaTek phy's common helper to access registers, then we can remove
hdmi's I/O helpers.

Signed-off-by: Chunfeng Yun 


Reviewed-by: AngeloGioacchino Del Regno

Re: [PATCH 09/18] phy: mediatek: hdmi: mt8173: use GENMASK to generate bits mask

2022-09-21 Thread AngeloGioacchino Del Regno


Il 20/09/22 11:00, Chunfeng Yun ha scritto:

Use GENMASK() macro to generate bits mask

Signed-off-by: Chunfeng Yun 


Reviewed-by: AngeloGioacchino Del Regno

Re: [PATCH 10/18] phy: mediatek: hdmi: mt8173: use FIELD_PREP to prepare bits field

2022-09-21 Thread AngeloGioacchino Del Regno


Il 20/09/22 11:00, Chunfeng Yun ha scritto:

Use FIELD_PREP() macro to prepare bits field value, then no need define
macros of bits offset.

Signed-off-by: Chunfeng Yun 


Reviewed-by: AngeloGioacchino Del Regno

Re: [PATCH 15/18] phy: mediatek: mipi: mt8173: use common helper to access registers

2022-09-21 Thread AngeloGioacchino Del Regno


Il 20/09/22 11:00, Chunfeng Yun ha scritto:

Use MediaTek phy's common helper to access registers, then we can remove
mipi-dsi's I/O helpers.

Signed-off-by: Chunfeng Yun 


Reviewed-by: AngeloGioacchino Del Regno

Re: [PATCH 14/18] phy: mediatek: mipi: mt8173: use FIELD_PREP to prepare bits field

2022-09-21 Thread AngeloGioacchino Del Regno


Il 20/09/22 11:00, Chunfeng Yun ha scritto:

Use FIELD_PREP() macro to prepare bits field value, then no need define
macros of bits offset.

Signed-off-by: Chunfeng Yun 


Reviewed-by: AngeloGioacchino Del Regno

Re: [PATCH 17/18] phy: mediatek: mipi: mt8183: use common helper to access registers

2022-09-21 Thread AngeloGioacchino Del Regno


Il 20/09/22 11:00, Chunfeng Yun ha scritto:

Use MediaTek phy's common helper to access registers, then we can remove
mipi-dsi's I/O helpers.

Signed-off-by: Chunfeng Yun 


Reviewed-by: AngeloGioacchino Del Regno

Re: [PATCH 16/18] phy: mediatek: mipi: mt8183: use GENMASK to generate bits mask

2022-09-21 Thread AngeloGioacchino Del Regno


Il 20/09/22 11:00, Chunfeng Yun ha scritto:

Use GENMASK() macro to generate bits mask

Signed-off-by: Chunfeng Yun 


Reviewed-by: AngeloGioacchino Del Regno

Re: [PATCH 18/18] phy: mediatek: mipi: remove register access helpers

2022-09-21 Thread AngeloGioacchino Del Regno


Il 20/09/22 11:00, Chunfeng Yun ha scritto:

Remove private register access helpers, use the common ones instead.

Signed-off-by: Chunfeng Yun 


Reviewed-by: AngeloGioacchino Del Regno

Re: [PATCH 01/18] phy: mediatek: add a new helper to update bitfield

2022-09-21 Thread AngeloGioacchino Del Regno


Il 20/09/22 11:00, Chunfeng Yun ha scritto:

Due to FIELD_PREP() macro can be used to prepare a bitfield value,
local ones can be remove; add the new helper to make bitfield update
easier.

Signed-off-by: Chunfeng Yun 
---
  drivers/phy/mediatek/phy-mtk-io.h | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/drivers/phy/mediatek/phy-mtk-io.h 
b/drivers/phy/mediatek/phy-mtk-io.h
index 500fcdab165d..a723d4afc9b4 100644
--- a/drivers/phy/mediatek/phy-mtk-io.h
+++ b/drivers/phy/mediatek/phy-mtk-io.h
@@ -8,6 +8,7 @@
  #ifndef __PHY_MTK_H__
  #define __PHY_MTK_H__
  
+#include 

  #include 
  
  static inline void mtk_phy_clear_bits(void __iomem *reg, u32 bits)

@@ -35,4 +36,10 @@ static inline void mtk_phy_update_bits(void __iomem *reg, 
u32 mask, u32 val)
writel(tmp, reg);
  }
  
+/* field @mask should be constant and continuous */


"Field @mask shall be [...]"
 ^


+static inline void mtk_phy_update_field(void __iomem *reg, u32 mask, u32 val)


...so, (void __iomem *reg, const u32 mask, u32 val)


+{
+   mtk_phy_update_bits(reg, mask, FIELD_PREP(mask, val));
+}
+
  #endif

Re: [Intel-gfx] [PATCH] drm/i915: Do not cleanup obj with NULL bo->resource

2022-09-21 Thread Das, Nirmoy


Hi Matt

On 9/20/2022 7:06 PM, Nirmoy Das wrote:

For delayed BO release i915_ttm_delete_mem_notify()
gets called twice, once with proper bo->resource and
another time with NULL. We shouldn't do anything for
the 2nd time as we already cleanedup the obj once.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/6850



Please add the below Fixes before merging, I missed that.

Fixes: ad74457a6b5a96 ("drm/i915/dgfx: Release mmap on rpm suspend")

Thanks,
Nirmoy


Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index 0544b0a4a43a..e3fc38dd5db0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -511,7 +511,7 @@ static void i915_ttm_delete_mem_notify(struct 
ttm_buffer_object *bo)
struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
intel_wakeref_t wakeref = 0;
  
-	if (likely(obj)) {

+   if (bo->resource && likely(obj)) {
/* ttm_bo_release() already has dma_resv_lock */
if (i915_ttm_cpu_maps_iomem(bo->resource))
wakeref = 
intel_runtime_pm_get(&to_i915(obj->base.dev)->runtime_pm);

Re: [PATCH 4/5] drm/damage-helper: Do partial updates if framebuffer has not been changed

2022-09-21 Thread Ville Syrjälä

On Wed, Sep 21, 2022 at 09:59:10AM +0200, Thomas Zimmermann wrote:
> Hi Ville
> 
> Am 20.09.22 um 16:31 schrieb Ville Syrjälä:
> > On Tue, Sep 20, 2022 at 03:56:18PM +0200, Thomas Zimmermann wrote:
> >> Set partial updates on a plane if the framebuffer has not been changed
> >> on an atomic commit. If such a plane has damage clips, the driver will
> >> use them; otherwise the update is effectively empty. Planes that change
> >> their framebuffer still perform a full update.
> > 
> > I have a feeling you're sort of papering over some inefficient
> > userspace that's asking updates on planes that don't actually
> > need them. I'm not sure adding more code for that is a particularly
> > great idea. Wouldn't it be better to just fix the userspace code?
> 
> Some more context might be in order:
> 
> The ast driver currently uses VRAM helpers, which leads to many 
> problems; including blank screens from the low amount of video memory. 
> The best solution is to switch SHMEM with BOs on system meory. The video 
> memory is only the internal buffer for scanout. This update involves a 
> mempcy from the BO to video memory.
> 
> Ast's hardware is really slow, so it makes sense to reduce the updates 
> to video memory to a minimum. There's support for cursor planes, which 
> really makes a difference here.
> 
> But doing any cursor-plane update (e.g., moving, animation) with SHMEM 
> and the current damage helpers always results in a full-screen memcpy 
> from the BO to the video memory for the primary plane. As the ast 
> hardware is slow, performance goes down to a an estimated 5 frame per 
> seconds. After moving the mouse, one can watch the mouse cursor follow 
> along the screen for the next seconds. Userspace sends the movement 
> information and DRM slowly processes them. The same can be observed for 
> cursor animation.
> 
> The problem is that each change to the cursor plane results in an 
> atomic_update call for the primary plane. Not having damage information 
> for the plane just means 'update everything'. Not a problem if redrawing 
> consists of reprogramming the scanout address. For a memcpy it's not 
> feasible.
> 
> So can this be fixed in userspace? No realistically IMHO. I've seen this 
> problem on Weston, Wayland-Gnome and X11-Gnome. And they are all 
> problematic in their own way. (That's why there are multiple patches 
> needed.) For example, X11 uses the legacy mouse ioctl, which one of the 
> patches fixes. The other userspace needs the other heuristics. A 
> potential userspace fix would mean to always set empty-damage 
> information on all planes that don't get an update. And I don't consider 
> X11 fixable TBH.
> 
> > 
> > Any property change on the plane could need a full plane
> > update as well (eg. some color mangement stuff etc.). So you'd
> > have to keep adding exceptions to that list whenever new
> > properties are added.
> 
> It's not about the occasional change of a property. It's the constant 
> sluggish redraw when the interface is supposed to be snappy, such as 
> mouse interaction.
> 
> > 
> > And I think the semantics of the ioctl(s) has so far been that
> > basically any page flip (whether or not it actually changes
> > the fb) still ends up doing whatever flushing is needed to
> > guarantee the fb contents are up to date on the screen (if
> > someone sneaked in some front buffer rendering in between).
> > Ie. you could even use basically a nop page flip in place
> > of dirtyfb.
> 
> That's why full updates are still the default. Only in cases where it's 
> save to avoid an update of unaffected planes, we do so.

Umm. Maybe I misread the patch in my haste but it seemed 
that you consider the thing undamaged if the fb pointer
was unchanged. That goes against what I wrote above.

Though I don't really know if a there is software relying on
that behaviuor. I suppose one idea could be to keep that
behaviour for the legacy ioctls but not for atomic. Ee. any
fb directly specified in a legacy setcrtc/setplane/pageflip
is always considered fully damaged. But including an the same
fb in the atomic ioctl does not imply damage. That would
mean atomic has to rely on specifying damage explicitly, and
any userspace that doesn't do that will be broken.

Or we could introduce a client cap for it I guess, but that
would require (minimal) userspace changes.

> 
> I know that we don't give performance guarantees to userspace. But using 
> cursor/overlay planes should be faster then not using them. I think 
> that's a reasonable assumption for userspace to make.
> 
> > 
> > Another thing the ioctls have always done is actually perform
> > the commit whether anything changed or nor. That is, they
> > still do all the same the same vblanky stuff and the commit
> > takes the same amount of time. Not sure if your idea is
> > to potentially short circuit the entire thing and make it
> > take no time at all?
> 
> The patches don't change the overall commit logics. All they do is to 
> set damage

Re: [PATCH 2/2] drm/panfrost: replace endian-specific types with generic ones

2022-09-21 Thread Steven Price

On 20/09/2022 23:13, Alyssa Rosenzweig wrote:
> Tentative r-b, but we *do* need to make a decision on how we want to
> handle endianness. I don't have strong feelings but the results of that
> discussion should go in the commit message.

Linux currently treats the dump objects specially - the headers are
little endian. All the other (Panfrost) DRM structures are native endian
(although I doubt anyone has tested it so I expect bugs). I've no
particularly strong views on this, but since the dump objects are likely
to be saved to disk and transferred between computers it makes sense to
fix the endianness for those. The __le types currently mean sparse can
warn if we screw up in the kernel, so it would be a shame to lose that
type checking.

Another option would be to extend the list of typedefs in
include/uapi/drm/drm.h to include the __le types. We'd need wider buy-in
for that change though.

Finally etnaviv 'solves' the issue by not including the dump structures
in the UABI header...

Or of course we could just actually use native endian and detect from
the magic which endian is in use. That would require ripping out the
cpu_to_lexx() calls in Linux and making the user space tool more
intelligent. I'm happy with that, but it's pushing the complexity onto Mesa.

Steve

> On Tue, Sep 20, 2022 at 10:15:45PM +0100, Adri??n Larumbe wrote:
>> __le32 and __l64 endian-specific types aren't portable and not available on
>> FreeBSD, for which there's a uAPI compatible reimplementation of Panfrost.
>>
>> Replace these specific types with more generic unsigned ones, to prevent
>> FreeBSD Mesa build errors.
>>
>> Bug: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7252
>> Fixes: 730c2bf4ad39 ("drm/panfrost: Add support for devcoredump")
>> Signed-off-by: Adri??n Larumbe 
>> ---
>>  include/uapi/drm/panfrost_drm.h | 30 +++---
>>  1 file changed, 15 insertions(+), 15 deletions(-)
>>
>> diff --git a/include/uapi/drm/panfrost_drm.h 
>> b/include/uapi/drm/panfrost_drm.h
>> index bd77254be121..c1a10a9366a9 100644
>> --- a/include/uapi/drm/panfrost_drm.h
>> +++ b/include/uapi/drm/panfrost_drm.h
>> @@ -236,24 +236,24 @@ struct drm_panfrost_madvise {
>>  #define PANFROSTDUMP_BUF_TRAILER (PANFROSTDUMP_BUF_BO + 1)
>>  
>>  struct panfrost_dump_object_header {
>> -__le32 magic;
>> -__le32 type;
>> -__le32 file_size;
>> -__le32 file_offset;
>> +__u32 magic;
>> +__u32 type;
>> +__u32 file_size;
>> +__u32 file_offset;
>>  
>>  union {
>>  struct {
>> -__le64 jc;
>> -__le32 gpu_id;
>> -__le32 major;
>> -__le32 minor;
>> -__le64 nbos;
>> +__u64 jc;
>> +__u32 gpu_id;
>> +__u32 major;
>> +__u32 minor;
>> +__u64 nbos;
>>  } reghdr;
>>  
>>  struct {
>> -__le32 valid;
>> -__le64 iova;
>> -__le32 data[2];
>> +__u32 valid;
>> +__u64 iova;
>> +__u32 data[2];
>>  } bomap;
>>  
>>  /*
>> @@ -261,14 +261,14 @@ struct panfrost_dump_object_header {
>>   * with new fields and also keep it 512-byte aligned
>>   */
>>  
>> -__le32 sizer[496];
>> +__u32 sizer[496];
>>  };
>>  };
>>  
>>  /* Registers object, an array of these */
>>  struct panfrost_dump_registers {
>> -__le32 reg;
>> -__le32 value;
>> +__u32 reg;
>> +__u32 value;
>>  };
>>  
>>  #if defined(__cplusplus)
>> -- 
>> 2.37.0
>>

Re: [PATCH] drm/mediatek: dsi: Move mtk_dsi_stop() call back to mtk_dsi_poweroff()

2022-09-21 Thread Allen-KH Cheng




On 9/19/22 21:32, AngeloGioacchino Del Regno wrote:
> Il 19/09/22 10:40, Hsin-Yi Wang ha scritto:
>> On Mon, Sep 19, 2022 at 4:39 PM Nícolas F. R. A. Prado
>>  wrote:
>>>
>>> As the comment right before the mtk_dsi_stop() call advises,
>>> mtk_dsi_stop() should only be called after
>>> mtk_drm_crtc_atomic_disable(). That's because that function calls
>>> drm_crtc_wait_one_vblank(), which requires the vblank irq to be enabled.
>>>
>>> Previously mtk_dsi_stop(), being in mtk_dsi_poweroff() and guarded by a
>>> refcount, would only be called at the end of
>>> mtk_drm_crtc_atomic_disable(), through the call to
>>> mtk_crtc_ddp_hw_fini().
>>> Commit cde7e2e35c28 ("drm/mediatek: Separate poweron/poweroff from
>>> enable/disable and define new funcs") moved the mtk_dsi_stop() call to
>>> mtk_output_dsi_disable(), causing it to be called before
>>> mtk_drm_crtc_atomic_disable(), and consequently generating vblank
>>> timeout warnings during suspend.
>>>
>>> Move the mtk_dsi_stop() call back to mtk_dsi_poweroff() so that we have
>>> a working vblank irq during mtk_drm_crtc_atomic_disable() and stop
>>> getting vblank timeout warnings.
>>>
>>> Fixes: cde7e2e35c28 ("drm/mediatek: Separate poweron/poweroff from
>>> enable/disable and define new funcs")
>>> Signed-off-by: Nícolas F. R. A. Prado 
>>>
>> Tested-by: Hsin-Yi Wang 
>>
> 
> Reviewed-by: AngeloGioacchino Del Regno
> 
> 
> 

Tested suspend/resume work properly on mt8188 and mt8186  .

Tested-by: Allen-KH Cheng

Re: [Intel-gfx] [RFC v4 02/14] drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()

2022-09-21 Thread Tvrtko Ursulin




On 21/09/2022 08:09, Niranjana Vishwanathapura wrote:

Add function __i915_sw_fence_await_reservation() for
asynchronous wait on a dma-resv object with specified
dma_resv_usage. This is required for async vma unbind
with vm_bind.

Signed-off-by: Niranjana Vishwanathapura 
---
  drivers/gpu/drm/i915/i915_sw_fence.c | 25 ++---
  drivers/gpu/drm/i915/i915_sw_fence.h |  7 ++-
  2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c 
b/drivers/gpu/drm/i915/i915_sw_fence.c
index 6fc0d1b89690..0ce8f4efc1ed 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -569,12 +569,11 @@ int __i915_sw_fence_await_dma_fence(struct i915_sw_fence 
*fence,
return ret;
  }
  
-int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,

-   struct dma_resv *resv,
-   const struct dma_fence_ops *exclude,
-   bool write,
-   unsigned long timeout,
-   gfp_t gfp)
+int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+ struct dma_resv *resv,
+ enum dma_resv_usage usage,
+ unsigned long timeout,
+ gfp_t gfp)
  {
struct dma_resv_iter cursor;
struct dma_fence *f;
@@ -583,7 +582,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence 
*fence,
debug_fence_assert(fence);
might_sleep_if(gfpflags_allow_blocking(gfp));
  
-	dma_resv_iter_begin(&cursor, resv, dma_resv_usage_rw(write));

+   dma_resv_iter_begin(&cursor, resv, usage);
dma_resv_for_each_fence_unlocked(&cursor, f) {
pending = i915_sw_fence_await_dma_fence(fence, f, timeout,
gfp);
@@ -598,6 +597,18 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence 
*fence,
return ret;
  }
  
+int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,

+   struct dma_resv *resv,
+   const struct dma_fence_ops *exclude,
+   bool write,
+   unsigned long timeout,
+   gfp_t gfp)
+{
+   return __i915_sw_fence_await_reservation(fence, resv,
+dma_resv_usage_rw(write),
+timeout, gfp);
+}


Drive by observation - it looked dodgy that you create a wrapper here 
which ignores one function parameter.


On a more detailed look it seems no callers actually use exclude and 
it's even unused inside this function since 1b5bdf071e62 ("drm/i915: use 
the new iterator in i915_sw_fence_await_reservation v3").


So a cleanup patch before this one?

Regards,

Tvrtko



+
  #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
  #include "selftests/lib_sw_fence.c"
  #include "selftests/i915_sw_fence.c"
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h 
b/drivers/gpu/drm/i915/i915_sw_fence.h
index 619fc5a22f0c..3cf4b6e16f35 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -10,13 +10,13 @@
  #define _I915_SW_FENCE_H_
  
  #include 

+#include 
  #include 
  #include 
  #include  /* for NOTIFY_DONE */
  #include 
  
  struct completion;

-struct dma_resv;
  struct i915_sw_fence;
  
  enum i915_sw_fence_notify {

@@ -89,6 +89,11 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence 
*fence,
  unsigned long timeout,
  gfp_t gfp);
  
+int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence,

+ struct dma_resv *resv,
+ enum dma_resv_usage usage,
+ unsigned long timeout,
+ gfp_t gfp);
  int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
struct dma_resv *resv,
const struct dma_fence_ops *exclude,

Re: [PATCH v3] drm/sched: Add FIFO sched policy to run queue v3

2022-09-21 Thread Luben Tuikov

Inlined:

On 2022-09-20 15:16, Andrey Grodzovsky wrote:
> 
> On 2022-09-19 23:11, Luben Tuikov wrote:
>> Please run this patch through checkpatch.pl, as it shows
>> 12 warnings with it. Use these command line options:
>> "--strict --show-types".
>>
>> Inlined:
>>
>> On 2022-09-13 16:40, Andrey Grodzovsky wrote:
>>> Given many entities competing for same run queue on
>>> the same scheduler and unacceptably long wait time for some
>>> jobs waiting stuck in the run queue before being picked up are
>>> observed (seen using  GPUVis).
>> Since the second part of this sentence is the result of the first,
>> I'd say something like "When many entities ... we see unacceptably long ...".
>>
>>> The issue is due to the Round Robin policy used by schedulers
>>> to pick up the next entity's job queue for execution. Under stress
>>> of many entities and long job queus within entity some
>> Spelling: "queues".
>>
>>> jobs could be stack for very long time in it's entity's
>> "stuck", not "stack".
>>
>>> queue before being popped from the queue and executed
>>> while for other entities with smaller job queues a job
>>> might execute earlier even though that job arrived later
>>> then the job in the long queue.
>> "than".
>>
>>> 
>>> Fix:
>>> Add FIFO selection policy to entities in run queue, chose next entity
>>> on run queue in such order that if job on one entity arrived
>>> earlier then job on another entity the first job will start
>>> executing earlier regardless of the length of the entity's job
>>> queue.
>>> 
>>> v2:
>>> Switch to rb tree structure for entities based on TS of
>>> oldest job waiting in the job queue of an entity. Improves next
>>> entity extraction to O(1). Entity TS update
>>> O(log N) where N is the number of entities in the run-queue
>>> 
>>> Drop default option in module control parameter.
>>>
>>> v3:
>>> Various cosmetical fixes and minor refactoring of fifo update function.
>>> Signed-off-by: Andrey Grodzovsky 
>>> Tested-by: Li Yunxiang (Teddy) 
>>> ---
>>>   drivers/gpu/drm/scheduler/sched_entity.c |  26 -
>>>   drivers/gpu/drm/scheduler/sched_main.c   | 132 ++-
>>>   include/drm/gpu_scheduler.h  |  35 ++
>>>   3 files changed, 187 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
>>> b/drivers/gpu/drm/scheduler/sched_entity.c
>>> index 6b25b2f4f5a3..f3ffce3c9304 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>> @@ -73,6 +73,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
>>> entity->priority = priority;
>>> entity->sched_list = num_sched_list > 1 ? sched_list : NULL;
>>> entity->last_scheduled = NULL;
>>> +   RB_CLEAR_NODE(&entity->rb_tree_node);
>>>   
>>> if(num_sched_list)
>>> entity->rq = &sched_list[0]->sched_rq[entity->priority];
>>> @@ -417,14 +418,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
>>> drm_sched_entity *entity)
>>>   
>>> sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
>>> if (!sched_job)
>>> -   return NULL;
>>> +   goto skip;
>>>   
>>> while ((entity->dependency =
>>> drm_sched_job_dependency(sched_job, entity))) {
>>> trace_drm_sched_job_wait_dep(sched_job, entity->dependency);
>>>   
>>> -   if (drm_sched_entity_add_dependency_cb(entity))
>>> -   return NULL;
>>> +   if (drm_sched_entity_add_dependency_cb(entity)) {
>>> +   sched_job = NULL;
>>> +   goto skip;
>>> +   }
>>> }
>>>   
>>> /* skip jobs from entity that marked guilty */
>>> @@ -443,6 +446,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
>>> drm_sched_entity *entity)
>>> smp_wmb();
>>>   
>>> spsc_queue_pop(&entity->job_queue);
>>> +
>>> +   /*
>>> +* It's when head job is extracted we can access the next job (or empty)
>>> +* queue and update the entity location in the min heap accordingly.
>>> +*/
>>> +skip:
>>> +   if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
>>> +   drm_sched_rq_update_fifo(entity,
>>> +(sched_job ? sched_job->submit_ts : 
>>> ktime_get()));
>>> +
>>> return sched_job;
>>>   }
>>>   
>>> @@ -502,11 +515,13 @@ void drm_sched_entity_push_job(struct drm_sched_job 
>>> *sched_job)
>>>   {
>>> struct drm_sched_entity *entity = sched_job->entity;
>>> bool first;
>>> +   ktime_t ts =  ktime_get();
>>>   
>>> trace_drm_sched_job(sched_job, entity);
>>> atomic_inc(entity->rq->sched->score);
>>> WRITE_ONCE(entity->last_user, current->group_leader);
>>> first = spsc_queue_push(&entity->job_queue, &sched_job->queue_node);
>>> +   sched_job->submit_ts = ts;
>>>   
>>> /* first job wakes up scheduler */
>>> if (first) {
>>> @@ -518,8 +533,13 @@ void drm_sched_entity_push_job(struct drm_sched_job 
>>> *sched_job)
>>>

Re: [PATCH] drm/i915/gvt: fix double-free bug in split_2MB_gtt_entry

2022-09-21 Thread Zheng Hacker

I've sent it using git send-email with another email account (zyytlz...@163.com)

Regards,
Zheng Wang

Jani Nikula  于2022年9月19日周一 17:30写道：
>
> On Mon, 19 Sep 2022, Zheng Wang <1002992...@qq.com> wrote:
> >  From afe79848cb74cc8e45ab426d13fa2394c87e0422 Mon Sep 17 00:00:00 2001
> > From: xmzyshypnc <1002992...@qq.com>
> > Date: Fri, 16 Sep 2022 23:48:23 +0800
> > Subject: [PATCH] drm/i915/gvt: fix double-free bug in split_2MB_gtt_entry
> >
> > There is a double-free security bug in split_2MB_gtt_entry.
> >
> > Here is a calling chain :
> > ppgtt_populate_spt->ppgtt_populate_shadow_entry->split_2MB_gtt_entry.
> >
> > If intel_gvt_dma_map_guest_page failed, it will call
> > ppgtt_invalidate_spt, which will finally call ppgtt_free_spt and
> > kfree(spt). But the caller does not notice that, and it will call
> > ppgtt_free_spt again in error path.
> >
> > Fix this by only freeing spt in ppgtt_invalidate_spt in good case.
> >
> > Signed-off-by: xmzyshypnc <1002992...@qq.com>
>
> Please use git send-email. The patch is whitespace broken and line
> wrapped, making it unusable.
>
> BR,
> Jani.
>
>
> > ---
> >   drivers/gpu/drm/i915/gvt/gtt.c | 16 +---
> >   1 file changed, 9 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c
> > index ce0eb03709c3..550519f0acca 100644
> > --- a/drivers/gpu/drm/i915/gvt/gtt.c
> > +++ b/drivers/gpu/drm/i915/gvt/gtt.c
> > @@ -959,7 +959,7 @@ static inline int ppgtt_put_spt(struct
> > intel_vgpu_ppgtt_spt *spt)
> >   return atomic_dec_return(&spt->refcount);
> >   }
> >
> > -static int ppgtt_invalidate_spt(struct intel_vgpu_ppgtt_spt *spt);
> > +static int ppgtt_invalidate_spt(struct intel_vgpu_ppgtt_spt *spt, int
> > is_error);
> >
> >   static int ppgtt_invalidate_spt_by_shadow_entry(struct intel_vgpu *vgpu,
> >   struct intel_gvt_gtt_entry *e)
> > @@ -995,7 +995,7 @@ static int
> > ppgtt_invalidate_spt_by_shadow_entry(struct intel_vgpu *vgpu,
> >   ops->get_pfn(e));
> >   return -ENXIO;
> >   }
> > -return ppgtt_invalidate_spt(s);
> > +return ppgtt_invalidate_spt(s, 0);
> >   }
> >
> >   static inline void ppgtt_invalidate_pte(struct intel_vgpu_ppgtt_spt *spt,
> > @@ -1016,7 +1016,7 @@ static inline void ppgtt_invalidate_pte(struct
> > intel_vgpu_ppgtt_spt *spt,
> >   intel_gvt_dma_unmap_guest_page(vgpu, pfn << PAGE_SHIFT);
> >   }
> >
> > -static int ppgtt_invalidate_spt(struct intel_vgpu_ppgtt_spt *spt)
> > +static int ppgtt_invalidate_spt(struct intel_vgpu_ppgtt_spt *spt, int
> > is_error)
> >   {
> >   struct intel_vgpu *vgpu = spt->vgpu;
> >   struct intel_gvt_gtt_entry e;
> > @@ -1059,9 +1059,11 @@ static int ppgtt_invalidate_spt(struct
> > intel_vgpu_ppgtt_spt *spt)
> >   }
> >   }
> >
> > -trace_spt_change(spt->vgpu->id, "release", spt,
> > +if (!is_error) {
> > +trace_spt_change(spt->vgpu->id, "release", spt,
> >spt->guest_page.gfn, spt->shadow_page.type);
> > -ppgtt_free_spt(spt);
> > +ppgtt_free_spt(spt);
> > +}
> >   return 0;
> >   fail:
> >   gvt_vgpu_err("fail: shadow page %p shadow entry 0x%llx type %d\n",
> > @@ -1215,7 +1217,7 @@ static int split_2MB_gtt_entry(struct intel_vgpu
> > *vgpu,
> >   ret = intel_gvt_dma_map_guest_page(vgpu, start_gfn + sub_index,
> >  PAGE_SIZE, &dma_addr);
> >   if (ret) {
> > -ppgtt_invalidate_spt(spt);
> > +ppgtt_invalidate_spt(spt, 1);
> >   return ret;
> >   }
> >   sub_se.val64 = se->val64;
> > @@ -1393,7 +1395,7 @@ static int ppgtt_handle_guest_entry_removal(struct
> > intel_vgpu_ppgtt_spt *spt,
> >   ret = -ENXIO;
> >   goto fail;
> >   }
> > -ret = ppgtt_invalidate_spt(s);
> > +ret = ppgtt_invalidate_spt(s, 0);
> >   if (ret)
> >   goto fail;
> >   } else {
>
> --
> Jani Nikula, Intel Open Source Graphics Center

Re: [Intel-gfx] [RFC v4 03/14] drm/i915/vm_bind: Expose i915_gem_object_max_page_size()

2022-09-21 Thread Tvrtko Ursulin




On 21/09/2022 08:09, Niranjana Vishwanathapura wrote:

Expose i915_gem_object_max_page_size() function non-static
which will be used by the vm_bind feature.

Signed-off-by: Niranjana Vishwanathapura 
Signed-off-by: Andi Shyti 
---
  drivers/gpu/drm/i915/gem/i915_gem_create.c | 20 +++-
  drivers/gpu/drm/i915/gem/i915_gem_object.h |  2 ++
  2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 33673fe7ee0a..3b3ab4abb0a3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -11,14 +11,24 @@
  #include "pxp/intel_pxp.h"
  
  #include "i915_drv.h"

+#include "i915_gem_context.h"


I can't spot that you are adding any code which would need this? 
I915_GTT_PAGE_SIZE_4K? It is in intel_gtt.h.



  #include "i915_gem_create.h"
  #include "i915_trace.h"
  #include "i915_user_extensions.h"
  
-static u32 object_max_page_size(struct intel_memory_region **placements,

-   unsigned int n_placements)
+/**
+ * i915_gem_object_max_page_size() - max of min_page_size of the regions
+ * @placements:  list of regions
+ * @n_placements: number of the placements
+ *
+ * Calculates the max of the min_page_size of a list of placements passed in.
+ *
+ * Return: max of the min_page_size
+ */
+u32 i915_gem_object_max_page_size(struct intel_memory_region **placements,
+ unsigned int n_placements)
  {
-   u32 max_page_size = 0;
+   u32 max_page_size = I915_GTT_PAGE_SIZE_4K;
int i;
  
  	for (i = 0; i < n_placements; i++) {

@@ -28,7 +38,6 @@ static u32 object_max_page_size(struct intel_memory_region 
**placements,
max_page_size = max_t(u32, max_page_size, mr->min_page_size);
}
  
-	GEM_BUG_ON(!max_page_size);

return max_page_size;
  }
  
@@ -99,7 +108,8 @@ __i915_gem_object_create_user_ext(struct drm_i915_private *i915, u64 size,
  
  	i915_gem_flush_free_objects(i915);
  
-	size = round_up(size, object_max_page_size(placements, n_placements));

+   size = round_up(size, i915_gem_object_max_page_size(placements,
+   n_placements));
if (size == 0)
return ERR_PTR(-EINVAL);


Because of the changes above this path is now unreachable. I suppose it 
was meant to tell the user "you have supplied no placements"? But then 
GEM_BUG_ON (which you remove) used to be wrong.


Regards,

Tvrtko

  
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h

index 7317d4102955..8c97bddad921 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -47,6 +47,8 @@ static inline bool i915_gem_object_size_2big(u64 size)
  }
  
  void i915_gem_init__objects(struct drm_i915_private *i915);

+u32 i915_gem_object_max_page_size(struct intel_memory_region **placements,
+ unsigned int n_placements);
  
  void i915_objects_module_exit(void);

  int i915_objects_module_init(void);

[PATCH] PM: runtime: Return properly from rpm_resume() if dev->power.needs_force_resume flag is set

2022-09-21 Thread Liu Ying

After a device transitions to sleep state through it's system suspend
callback pm_runtime_force_suspend(), the device's driver may still try
to do runtime PM for the device(runtime suspend first and then runtime
resume) although runtime PM is disabled by that callback.  The runtime
PM operations would not touch the device effectively and the device is
assumed to be resumed through it's system resume callback
pm_runtime_force_resume().

The problem is that since the device's runtime PM status is RPM_SUSPENDED
in the sleep state, rpm_resume() would not take the device as in already
active status and would return -EACCES instead of 1.  The error code
-EACCES may make the device's driver put the device into runtime suspend
status with an auto suspend delay which may happen after the device is
active again with force resume.

A real problematic case is the panel-simple.c driver(works with a
downstream DRM device driver), where the optional enable_gpio(controlled
by the runtime PM callbacks) will be disabled after the procedure with
an auto suspend delay if a DRM atomic commit(triggered by a DRM device's
system resume callback) tries to do runtime PM resume for the panel
before the panel is active with force resume:

1) System suspend:
 - pm_runtime_force_suspend()
  - panel_simple_suspend() // enable_gpio is disabled

2) Runtime suspend with a DRM atomic commit:
 - panel_simple_unprepare()
  - pm_runtime_put_autosuspend()

3) Runtime resume with a DRM atomic commit:
 - panel_simple_prepare()
  - pm_runtime_get_sync()
- rpm_resume() // return -EACCES
  - pm_runtime_put_autosuspend() // 1 second delay ...

4) System resume:
 - pm_runtime_force_resume()
  - panel_simple_resume() // enable_gpio is enabled

5) ...
 - pm_runtime_work()
  - rpm_suspend()
   - panel_simple_suspend() // enable_gpio is disabled

Fix the issue by checking dev->power.needs_force_resume flag in
rpm_resume() so that it returns 1 instead of -EACCES in this scenario,
since the flag is set in pm_runtime_force_suspend().

Also, update the documentation to change the description of
pm_runtime_resume() to reflect the new behavior of rpm_resume().

Cc: Rafael J. Wysocki 
Cc: Len Brown 
Cc: Pavel Machek 
Cc: Greg Kroah-Hartman 
Cc: Ulf Hansson 
Cc: Douglas Anderson 
Signed-off-by: Liu Ying 
---
 Documentation/power/runtime_pm.rst | 14 +++---
 drivers/base/power/runtime.c   |  2 ++
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/Documentation/power/runtime_pm.rst 
b/Documentation/power/runtime_pm.rst
index 65b86e487afe..6266f0ac02a8 100644
--- a/Documentation/power/runtime_pm.rst
+++ b/Documentation/power/runtime_pm.rst
@@ -337,13 +337,13 @@ drivers/base/power/runtime.c and 
include/linux/pm_runtime.h:
 
   `int pm_runtime_resume(struct device *dev);`
 - execute the subsystem-level resume callback for the device; returns 0 on
-  success, 1 if the device's runtime PM status is already 'active' (also if
-  'power.disable_depth' is nonzero, but the status was 'active' when it was
-  changing from 0 to 1) or error code on failure, where -EAGAIN means it 
may
-  be safe to attempt to resume the device again in future, but
-  'power.runtime_error' should be checked additionally, and -EACCES means
-  that the callback could not be run, because 'power.disable_depth' was
-  different from 0
+  success, 1 if the device's runtime PM status is assumed to be 'active'
+  with force resume or is already 'active' (also if 'power.disable_depth' 
is
+  nonzero, but the status was 'active' when it was changing from 0 to 1) or
+  error code on failure, where -EAGAIN means it may be safe to attempt to
+  resume the device again in future, but 'power.runtime_error' should be
+  checked additionally, and -EACCES means that the callback could not be
+  run, because 'power.disable_depth' was different from 0
 
   `int pm_runtime_resume_and_get(struct device *dev);`
 - run pm_runtime_resume(dev) and if successful, increment the device's
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 997be3ac20a7..0bce66ea0036 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -762,6 +762,8 @@ static int rpm_resume(struct device *dev, int rpmflags)
  repeat:
if (dev->power.runtime_error) {
retval = -EINVAL;
+   } else if (dev->power.needs_force_resume) {
+   retval = 1;
} else if (dev->power.disable_depth > 0) {
if (dev->power.runtime_status == RPM_ACTIVE &&
dev->power.last_status == RPM_ACTIVE)
-- 
2.37.1

Re: [PATCH v2 2/7] firmware: raspberrypi: Move the clock IDs to the firmware header

2022-09-21 Thread Maxime Ripard

Hi,

On Tue, Sep 20, 2022 at 06:01:08PM +0200, Stefan Wahren wrote:
> Am 20.09.22 um 14:50 schrieb Maxime Ripard:
> > We'll need the clock IDs in more drivers than just the clock driver from
> > now on, so let's move them in the firmware header.
> 
> recently as i reviewed the clk-raspberrypi i noticed this, too. But from my
> point of view the clock ids should go to include/dt-bindings/clock (like
> bcm2835.h) because these clock ids are actually referenced in the DTS files
> and we need to make sure they are in sync. AFAIR this would also result in
> change from enum to defines.
> 
> Sorry, i didn't had the time to send a patch for this.

IMO, we need both, and this enum still belongs in the firmware header.
We have two separate things, the firmware interface and the DT
interface. The kernel is a consumer for both, but the fact that they
match is an implementation detail. It might even change in the future
for all we know.

So having a header to use defines for the clock indices in the DT looks
like a good idea to me, but I think we should keep that enum in the
firmware header as well.

Maxime

signature.asc
Description: PGP signature

Re: [PATCH 4/5] drm/damage-helper: Do partial updates if framebuffer has not been changed

2022-09-21 Thread Thomas Zimmermann


Hi

Am 21.09.22 um 10:37 schrieb Ville Syrjälä:

On Wed, Sep 21, 2022 at 09:59:10AM +0200, Thomas Zimmermann wrote:

Hi Ville

Am 20.09.22 um 16:31 schrieb Ville Syrjälä:

On Tue, Sep 20, 2022 at 03:56:18PM +0200, Thomas Zimmermann wrote:

Set partial updates on a plane if the framebuffer has not been changed
on an atomic commit. If such a plane has damage clips, the driver will
use them; otherwise the update is effectively empty. Planes that change
their framebuffer still perform a full update.


I have a feeling you're sort of papering over some inefficient
userspace that's asking updates on planes that don't actually
need them. I'm not sure adding more code for that is a particularly
great idea. Wouldn't it be better to just fix the userspace code?


Some more context might be in order:

The ast driver currently uses VRAM helpers, which leads to many
problems; including blank screens from the low amount of video memory.
The best solution is to switch SHMEM with BOs on system meory. The video
memory is only the internal buffer for scanout. This update involves a
mempcy from the BO to video memory.

Ast's hardware is really slow, so it makes sense to reduce the updates
to video memory to a minimum. There's support for cursor planes, which
really makes a difference here.

But doing any cursor-plane update (e.g., moving, animation) with SHMEM
and the current damage helpers always results in a full-screen memcpy
from the BO to the video memory for the primary plane. As the ast
hardware is slow, performance goes down to a an estimated 5 frame per
seconds. After moving the mouse, one can watch the mouse cursor follow
along the screen for the next seconds. Userspace sends the movement
information and DRM slowly processes them. The same can be observed for
cursor animation.

The problem is that each change to the cursor plane results in an
atomic_update call for the primary plane. Not having damage information
for the plane just means 'update everything'. Not a problem if redrawing
consists of reprogramming the scanout address. For a memcpy it's not
feasible.

So can this be fixed in userspace? No realistically IMHO. I've seen this
problem on Weston, Wayland-Gnome and X11-Gnome. And they are all
problematic in their own way. (That's why there are multiple patches
needed.) For example, X11 uses the legacy mouse ioctl, which one of the
patches fixes. The other userspace needs the other heuristics. A
potential userspace fix would mean to always set empty-damage
information on all planes that don't get an update. And I don't consider
X11 fixable TBH.



Any property change on the plane could need a full plane
update as well (eg. some color mangement stuff etc.). So you'd
have to keep adding exceptions to that list whenever new
properties are added.


It's not about the occasional change of a property. It's the constant
sluggish redraw when the interface is supposed to be snappy, such as
mouse interaction.



And I think the semantics of the ioctl(s) has so far been that
basically any page flip (whether or not it actually changes
the fb) still ends up doing whatever flushing is needed to
guarantee the fb contents are up to date on the screen (if
someone sneaked in some front buffer rendering in between).
Ie. you could even use basically a nop page flip in place
of dirtyfb.


That's why full updates are still the default. Only in cases where it's
save to avoid an update of unaffected planes, we do so.


Umm. Maybe I misread the patch in my haste but it seemed
that you consider the thing undamaged if the fb pointer
was unchanged. That goes against what I wrote above.


I try to resolve that in patch 5. The fb_dirty flag signals that the 
framebuffer's content changed in some way. Patches 4 and 5 could be seen 
as one change, but that might overload the resulting patch. (Maybe it's 
not well designed overall. :/)




Though I don't really know if a there is software relying on
that behaviuor. I suppose one idea could be to keep that
behaviour for the legacy ioctls but not for atomic. Ee. any
fb directly specified in a legacy setcrtc/setplane/pageflip
is always considered fully damaged. But including an the same


Assuming the specified fb to be damaged seems like a non-issue to me.

The problem is with the other framebuffers: if userspace sends us a 
CURSOR_MOVE ioctl, we can safely assume the cursor fb to be damaged. But 
we don't want the primary plane's fb to be damaged. Or else we'd redraw 
the full primary plane.




fb in the atomic ioctl does not imply damage. That would
mean atomic has to rely on specifying damage explicitly, and
any userspace that doesn't do that will be broken.


The problem again is not in the damage information on planes that 
legitimately need a redraw. It's all the other planes that are being 
redrawn as well. Or is there some scenario that I don't see?




Or we could introduce a client cap for it I guess, but that
would require (minimal) userspace changes.



I know that we don't give p

Re: [PATCH] drm/bridge: ti-sn65dsi83: Add and use hs_rate and lp_rate

2022-09-21 Thread Maxime Ripard

On Mon, Sep 19, 2022 at 08:17:11PM +0200, Marek Vasut wrote:
> On 9/19/22 15:43, Maxime Ripard wrote:
> > Hi,
> 
> Hello Maxime,
> 
> > On Sun, Sep 18, 2022 at 02:56:00PM +0200, Marek Vasut wrote:
> > > On 8/1/22 15:11, Marek Vasut wrote:
> > > > Fill in hs_rate and lp_rate to struct mipi_dsi_device for this bridge 
> > > > and
> > > > adjust DSI input frequency calculations such that they expect the DSI 
> > > > host
> > > > to configure HS clock according to hs_rate.
> > > > 
> > > > This is an optimization for the DSI burst mode case. In case the DSI 
> > > > device
> > > > supports DSI burst mode, it is recommended to operate the DSI interface 
> > > > at
> > > > the highest possible HS clock frequency which the DSI device supports. 
> > > > This
> > > > permits the DSI host to send as short as possible bursts of data on the 
> > > > DSI
> > > > link and keep the DSI data lanes in LP mode otherwise, which reduces 
> > > > power
> > > > consumption.
> > > 
> > > > Signed-off-by: Marek Vasut 
> > > > Cc: Jagan Teki 
> > > > Cc: Laurent Pinchart 
> > > > Cc: Linus Walleij 
> > > > Cc: Robert Foss 
> > > > Cc: Sam Ravnborg 
> > > > Cc: dri-devel@lists.freedesktop.org
> > > > ---
> > > >drivers/gpu/drm/bridge/ti-sn65dsi83.c | 25 +
> > > >1 file changed, 13 insertions(+), 12 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c 
> > > > b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
> > > > index 14e7aa77e7584..b161f25c3a2f5 100644
> > > > --- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
> > > > +++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
> > > > @@ -286,8 +286,7 @@ static u8 sn65dsi83_get_lvds_range(struct sn65dsi83 
> > > > *ctx,
> > > > return (mode_clock - 12500) / 25000;
> > > >}
> > > > -static u8 sn65dsi83_get_dsi_range(struct sn65dsi83 *ctx,
> > > > - const struct drm_display_mode *mode)
> > > > +static u8 sn65dsi83_get_dsi_range(struct sn65dsi83 *ctx)
> > > >{
> > > > /*
> > > >  * The encoding of the CHA_DSI_CLK_RANGE is as follows:
> > > > @@ -303,20 +302,20 @@ static u8 sn65dsi83_get_dsi_range(struct 
> > > > sn65dsi83 *ctx,
> > > >  *  DSI_CLK = mode clock * bpp / dsi_data_lanes / 2
> > > >  * the 2 is there because the bus is DDR.
> > > >  */
> > > > -   return DIV_ROUND_UP(clamp((unsigned int)mode->clock *
> > > > -   
> > > > mipi_dsi_pixel_format_to_bpp(ctx->dsi->format) /
> > > > -   ctx->dsi->lanes / 2, 4U, 50U), 
> > > > 5000U);
> > > > +   return DIV_ROUND_UP(ctx->dsi->hs_rate, 500U);
> > > >}
> > > > -static u8 sn65dsi83_get_dsi_div(struct sn65dsi83 *ctx)
> > > > +static u8 sn65dsi83_get_dsi_div(struct sn65dsi83 *ctx,
> > > > +   const struct drm_display_mode *mode)
> > > >{
> > > > /* The divider is (DSI_CLK / LVDS_CLK) - 1, which really is: */
> > > > -   unsigned int dsi_div = 
> > > > mipi_dsi_pixel_format_to_bpp(ctx->dsi->format);
> > > > +   unsigned int dsi_div;
> > > > +   int mode_clock = mode->clock;
> > > > -   dsi_div /= ctx->dsi->lanes;
> > > > +   if (ctx->lvds_dual_link)
> > > > +   mode_clock /= 2;
> > > > -   if (!ctx->lvds_dual_link)
> > > > -   dsi_div /= 2;
> > > > +   dsi_div = (ctx->dsi->hs_rate / mode_clock) / 1000;
> > > > return dsi_div - 1;
> > > >}
> > > > @@ -397,9 +396,9 @@ static void sn65dsi83_atomic_enable(struct 
> > > > drm_bridge *bridge,
> > > >  
> > > > REG_RC_LVDS_PLL_LVDS_CLK_RANGE(sn65dsi83_get_lvds_range(ctx, mode)) |
> > > >  REG_RC_LVDS_PLL_HS_CLK_SRC_DPHY);
> > > > regmap_write(ctx->regmap, REG_DSI_CLK,
> > > > -
> > > > REG_DSI_CLK_CHA_DSI_CLK_RANGE(sn65dsi83_get_dsi_range(ctx, mode)));
> > > > +
> > > > REG_DSI_CLK_CHA_DSI_CLK_RANGE(sn65dsi83_get_dsi_range(ctx)));
> > > > regmap_write(ctx->regmap, REG_RC_DSI_CLK,
> > > > -
> > > > REG_RC_DSI_CLK_DSI_CLK_DIVIDER(sn65dsi83_get_dsi_div(ctx)));
> > > > +
> > > > REG_RC_DSI_CLK_DSI_CLK_DIVIDER(sn65dsi83_get_dsi_div(ctx, mode)));
> > > > /* Set number of DSI lanes and LVDS link config. */
> > > > regmap_write(ctx->regmap, REG_DSI_LANE,
> > > > @@ -643,6 +642,8 @@ static int sn65dsi83_host_attach(struct sn65dsi83 
> > > > *ctx)
> > > > dsi->lanes = dsi_lanes;
> > > > dsi->format = MIPI_DSI_FMT_RGB888;
> > > > dsi->mode_flags = MIPI_DSI_MODE_VIDEO | 
> > > > MIPI_DSI_MODE_VIDEO_BURST;
> > > > +   dsi->hs_rate = 5;
> > > > +   dsi->lp_rate = 1600;
> > 
> > Let's leave aside the comment from Dave that the host might choose a
> > lower HS rate, we can indeed assume it's true for now.
> > 
> > However.. Is there any guarantee that the host can even reach that
> > frequency in the first place? IIRC, the ma

Re: [PATCH 2/2] drm/panfrost: replace endian-specific types with generic ones

2022-09-21 Thread Robin Murphy


On 2022-09-21 09:48, Steven Price wrote:

On 20/09/2022 23:13, Alyssa Rosenzweig wrote:

Tentative r-b, but we *do* need to make a decision on how we want to
handle endianness. I don't have strong feelings but the results of that
discussion should go in the commit message.


Linux currently treats the dump objects specially - the headers are
little endian. All the other (Panfrost) DRM structures are native endian
(although I doubt anyone has tested it so I expect bugs).


If there can be *any* native-endian data included in the dump, then the 
original endianness needs to be recorded to be able to analyse it 
correctly anyway. The dumping code can't know the granularity at which 
arbitrary BOs may or may not need to be byteswapped to make everything 
consistently LE.



I've no
particularly strong views on this, but since the dump objects are likely
to be saved to disk and transferred between computers it makes sense to
fix the endianness for those. The __le types currently mean sparse can
warn if we screw up in the kernel, so it would be a shame to lose that
type checking.

Another option would be to extend the list of typedefs in
include/uapi/drm/drm.h to include the __le types. We'd need wider buy-in
for that change though.

Finally etnaviv 'solves' the issue by not including the dump structures
in the UABI header...

Or of course we could just actually use native endian and detect from
the magic which endian is in use. That would require ripping out the
cpu_to_lexx() calls in Linux and making the user space tool more
intelligent. I'm happy with that, but it's pushing the complexity onto Mesa.


If there's a clearly identifiable header, then I'd say making the whole 
dump native-endian is probably the way to go. Unless and until anyone 
actually demands to be able to do cross-endian post-mortem GPU 
debugging, the realistic extent of the complexity in Mesa is that it 
doesn't recognise the foreign dump format and gives up, which I assume 
is already implemented :)


Cheers,
Robin.



Steve


On Tue, Sep 20, 2022 at 10:15:45PM +0100, Adri??n Larumbe wrote:

__le32 and __l64 endian-specific types aren't portable and not available on
FreeBSD, for which there's a uAPI compatible reimplementation of Panfrost.

Replace these specific types with more generic unsigned ones, to prevent
FreeBSD Mesa build errors.

Bug: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7252
Fixes: 730c2bf4ad39 ("drm/panfrost: Add support for devcoredump")
Signed-off-by: Adri??n Larumbe 
---
  include/uapi/drm/panfrost_drm.h | 30 +++---
  1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
index bd77254be121..c1a10a9366a9 100644
--- a/include/uapi/drm/panfrost_drm.h
+++ b/include/uapi/drm/panfrost_drm.h
@@ -236,24 +236,24 @@ struct drm_panfrost_madvise {
  #define PANFROSTDUMP_BUF_TRAILER (PANFROSTDUMP_BUF_BO + 1)
  
  struct panfrost_dump_object_header {

-   __le32 magic;
-   __le32 type;
-   __le32 file_size;
-   __le32 file_offset;
+   __u32 magic;
+   __u32 type;
+   __u32 file_size;
+   __u32 file_offset;
  
  	union {

struct {
-   __le64 jc;
-   __le32 gpu_id;
-   __le32 major;
-   __le32 minor;
-   __le64 nbos;
+   __u64 jc;
+   __u32 gpu_id;
+   __u32 major;
+   __u32 minor;
+   __u64 nbos;
} reghdr;
  
  		struct {

-   __le32 valid;
-   __le64 iova;
-   __le32 data[2];
+   __u32 valid;
+   __u64 iova;
+   __u32 data[2];
} bomap;
  
  		/*

@@ -261,14 +261,14 @@ struct panfrost_dump_object_header {
 * with new fields and also keep it 512-byte aligned
 */
  
-		__le32 sizer[496];

+   __u32 sizer[496];
};
  };
  
  /* Registers object, an array of these */

  struct panfrost_dump_registers {
-   __le32 reg;
-   __le32 value;
+   __u32 reg;
+   __u32 value;
  };
  
  #if defined(__cplusplus)

--
2.37.0

Re: [PATCH 4/5] drm/damage-helper: Do partial updates if framebuffer has not been changed

2022-09-21 Thread Ville Syrjälä

On Wed, Sep 21, 2022 at 11:39:51AM +0200, Thomas Zimmermann wrote:
> Hi
> 
> Am 21.09.22 um 10:37 schrieb Ville Syrjälä:
> > On Wed, Sep 21, 2022 at 09:59:10AM +0200, Thomas Zimmermann wrote:
> >> Hi Ville
> >>
> >> Am 20.09.22 um 16:31 schrieb Ville Syrjälä:
> >>> On Tue, Sep 20, 2022 at 03:56:18PM +0200, Thomas Zimmermann wrote:
>  Set partial updates on a plane if the framebuffer has not been changed
>  on an atomic commit. If such a plane has damage clips, the driver will
>  use them; otherwise the update is effectively empty. Planes that change
>  their framebuffer still perform a full update.
> >>>
> >>> I have a feeling you're sort of papering over some inefficient
> >>> userspace that's asking updates on planes that don't actually
> >>> need them. I'm not sure adding more code for that is a particularly
> >>> great idea. Wouldn't it be better to just fix the userspace code?
> >>
> >> Some more context might be in order:
> >>
> >> The ast driver currently uses VRAM helpers, which leads to many
> >> problems; including blank screens from the low amount of video memory.
> >> The best solution is to switch SHMEM with BOs on system meory. The video
> >> memory is only the internal buffer for scanout. This update involves a
> >> mempcy from the BO to video memory.
> >>
> >> Ast's hardware is really slow, so it makes sense to reduce the updates
> >> to video memory to a minimum. There's support for cursor planes, which
> >> really makes a difference here.
> >>
> >> But doing any cursor-plane update (e.g., moving, animation) with SHMEM
> >> and the current damage helpers always results in a full-screen memcpy
> >> from the BO to the video memory for the primary plane. As the ast
> >> hardware is slow, performance goes down to a an estimated 5 frame per
> >> seconds. After moving the mouse, one can watch the mouse cursor follow
> >> along the screen for the next seconds. Userspace sends the movement
> >> information and DRM slowly processes them. The same can be observed for
> >> cursor animation.
> >>
> >> The problem is that each change to the cursor plane results in an
> >> atomic_update call for the primary plane. Not having damage information
> >> for the plane just means 'update everything'. Not a problem if redrawing
> >> consists of reprogramming the scanout address. For a memcpy it's not
> >> feasible.
> >>
> >> So can this be fixed in userspace? No realistically IMHO. I've seen this
> >> problem on Weston, Wayland-Gnome and X11-Gnome. And they are all
> >> problematic in their own way. (That's why there are multiple patches
> >> needed.) For example, X11 uses the legacy mouse ioctl, which one of the
> >> patches fixes. The other userspace needs the other heuristics. A
> >> potential userspace fix would mean to always set empty-damage
> >> information on all planes that don't get an update. And I don't consider
> >> X11 fixable TBH.
> >>
> >>>
> >>> Any property change on the plane could need a full plane
> >>> update as well (eg. some color mangement stuff etc.). So you'd
> >>> have to keep adding exceptions to that list whenever new
> >>> properties are added.
> >>
> >> It's not about the occasional change of a property. It's the constant
> >> sluggish redraw when the interface is supposed to be snappy, such as
> >> mouse interaction.
> >>
> >>>
> >>> And I think the semantics of the ioctl(s) has so far been that
> >>> basically any page flip (whether or not it actually changes
> >>> the fb) still ends up doing whatever flushing is needed to
> >>> guarantee the fb contents are up to date on the screen (if
> >>> someone sneaked in some front buffer rendering in between).
> >>> Ie. you could even use basically a nop page flip in place
> >>> of dirtyfb.
> >>
> >> That's why full updates are still the default. Only in cases where it's
> >> save to avoid an update of unaffected planes, we do so.
> > 
> > Umm. Maybe I misread the patch in my haste but it seemed
> > that you consider the thing undamaged if the fb pointer
> > was unchanged. That goes against what I wrote above.
> 
> I try to resolve that in patch 5. The fb_dirty flag signals that the 
> framebuffer's content changed in some way. Patches 4 and 5 could be seen 
> as one change, but that might overload the resulting patch. (Maybe it's 
> not well designed overall. :/)
> 
> > 
> > Though I don't really know if a there is software relying on
> > that behaviuor. I suppose one idea could be to keep that
> > behaviour for the legacy ioctls but not for atomic. Ee. any
> > fb directly specified in a legacy setcrtc/setplane/pageflip
> > is always considered fully damaged. But including an the same
> 
> Assuming the specified fb to be damaged seems like a non-issue to me.
> 
> The problem is with the other framebuffers: if userspace sends us a 
> CURSOR_MOVE ioctl, we can safely assume the cursor fb to be damaged. But 
> we don't want the primary plane's fb to be damaged. Or else we'd redraw 
> the full primary plane.
> 
> 
> > fb in

Re: [Intel-gfx] [RFC v4 08/14] drm/i915/vm_bind: Abstract out common execbuf functions

2022-09-21 Thread Tvrtko Ursulin




On 21/09/2022 08:09, Niranjana Vishwanathapura wrote:

The new execbuf3 ioctl path and the legacy execbuf ioctl
paths have many common functionalities.
Share code between these two paths by abstracting out the
common functionalities into a separate file where possible.


Looks like a good start to me. A couple comments/questions below.


Signed-off-by: Niranjana Vishwanathapura 
---
  drivers/gpu/drm/i915/Makefile |   1 +
  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 507 ++---
  .../drm/i915/gem/i915_gem_execbuffer_common.c | 530 ++
  .../drm/i915/gem/i915_gem_execbuffer_common.h |  47 ++
  4 files changed, 612 insertions(+), 473 deletions(-)
  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 9bf939ef18ea..bf952f478555 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -148,6 +148,7 @@ gem-y += \
gem/i915_gem_create.o \
gem/i915_gem_dmabuf.o \
gem/i915_gem_domain.o \
+   gem/i915_gem_execbuffer_common.o \
gem/i915_gem_execbuffer.o \
gem/i915_gem_internal.o \
gem/i915_gem_object.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 33d989a20227..363b2a788cdf 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -9,8 +9,6 @@
  #include 
  #include 
  
-#include 

-
  #include "display/intel_frontbuffer.h"
  
  #include "gem/i915_gem_ioctls.h"

@@ -28,6 +26,7 @@
  #include "i915_file_private.h"
  #include "i915_gem_clflush.h"
  #include "i915_gem_context.h"
+#include "i915_gem_execbuffer_common.h"
  #include "i915_gem_evict.h"
  #include "i915_gem_ioctls.h"
  #include "i915_trace.h"
@@ -235,13 +234,6 @@ enum {
   * the batchbuffer in trusted mode, otherwise the ioctl is rejected.
   */
  
-struct eb_fence {

-   struct drm_syncobj *syncobj; /* Use with ptr_mask_bits() */
-   struct dma_fence *dma_fence;
-   u64 value;
-   struct dma_fence_chain *chain_fence;
-};
-
  struct i915_execbuffer {
struct drm_i915_private *i915; /** i915 backpointer */
struct drm_file *file; /** per-file lookup tables and limits */
@@ -2446,164 +2438,29 @@ static const enum intel_engine_id user_ring_map[] = {
[I915_EXEC_VEBOX]   = VECS0
  };
  
-static struct i915_request *eb_throttle(struct i915_execbuffer *eb, struct intel_context *ce)

-{
-   struct intel_ring *ring = ce->ring;
-   struct intel_timeline *tl = ce->timeline;
-   struct i915_request *rq;
-
-   /*
-* Completely unscientific finger-in-the-air estimates for suitable
-* maximum user request size (to avoid blocking) and then backoff.
-*/
-   if (intel_ring_update_space(ring) >= PAGE_SIZE)
-   return NULL;
-
-   /*
-* Find a request that after waiting upon, there will be at least half
-* the ring available. The hysteresis allows us to compete for the
-* shared ring and should mean that we sleep less often prior to
-* claiming our resources, but not so long that the ring completely
-* drains before we can submit our next request.
-*/
-   list_for_each_entry(rq, &tl->requests, link) {
-   if (rq->ring != ring)
-   continue;
-
-   if (__intel_ring_space(rq->postfix,
-  ring->emit, ring->size) > ring->size / 2)
-   break;
-   }
-   if (&rq->link == &tl->requests)
-   return NULL; /* weird, we will check again later for real */
-
-   return i915_request_get(rq);
-}
-
-static int eb_pin_timeline(struct i915_execbuffer *eb, struct intel_context 
*ce,
-  bool throttle)
-{
-   struct intel_timeline *tl;
-   struct i915_request *rq = NULL;
-
-   /*
-* Take a local wakeref for preparing to dispatch the execbuf as
-* we expect to access the hardware fairly frequently in the
-* process, and require the engine to be kept awake between accesses.
-* Upon dispatch, we acquire another prolonged wakeref that we hold
-* until the timeline is idle, which in turn releases the wakeref
-* taken on the engine, and the parent device.
-*/
-   tl = intel_context_timeline_lock(ce);
-   if (IS_ERR(tl))
-   return PTR_ERR(tl);
-
-   intel_context_enter(ce);
-   if (throttle)
-   rq = eb_throttle(eb, ce);
-   intel_context_timeline_unlock(tl);
-
-   if (rq) {
-   bool nonblock = eb->file->filp->f_flags & O_NONBLOCK;
-   long timeout = nonblock ? 0 : MAX_SCHEDULE_TIMEOUT;
-
-   if (i915_request_wait(rq, I915_WAIT_INTERRUPTI

Re: [PATCH v5 15/21] dma-buf: Move dma_buf_vmap() to dynamic locking specification

2022-09-21 Thread Dmitry Osipenko

On 9/20/22 17:13, Sumit Semwal wrote:
> Hi Dmitry,
> 
> Thanks very much for the series.
> 
> On Wed, 14 Sept 2022 at 00:59, Dmitry Osipenko
>  wrote:
>>
>> Move dma_buf_vmap/vunmap_unlocked() functions to the dynamic locking
>> specification by asserting that the reservation lock is held.
> Thanks for the patch; just a minor nit - I think you mean dma_buf_vmap
> / vunmap() here, and not _unlocked?

Yes, the _unlocked should be dropped here. Thank you for the review!

-- 
Best regards,
Dmitry

Re: [PATCH 4/5] drm/damage-helper: Do partial updates if framebuffer has not been changed

2022-09-21 Thread Thomas Zimmermann


Hi

Am 21.09.22 um 12:18 schrieb Ville Syrjälä:
[...]




Though I don't really know if a there is software relying on
that behaviuor. I suppose one idea could be to keep that
behaviour for the legacy ioctls but not for atomic. Ee. any
fb directly specified in a legacy setcrtc/setplane/pageflip
is always considered fully damaged. But including an the same


Assuming the specified fb to be damaged seems like a non-issue to me.

The problem is with the other framebuffers: if userspace sends us a
CURSOR_MOVE ioctl, we can safely assume the cursor fb to be damaged. But
we don't want the primary plane's fb to be damaged. Or else we'd redraw
the full primary plane.



fb in the atomic ioctl does not imply damage. That would
mean atomic has to rely on specifying damage explicitly, and
any userspace that doesn't do that will be broken.


The problem again is not in the damage information on planes that
legitimately need a redraw. It's all the other planes that are being
redrawn as well. Or is there some scenario that I don't see?


I thought we're talking about eg. a cursor update that also
includes the primary plane because apparently userspace is lazy.

I think what you're saying is that currently there is no
damage information for the primary plane so you're attempting
to infer it based on whether the fb property changed or not.


Correct.



And what I was saying is that IIRC historically specifying
the same fb again has still implied full damage. Changing
that behaviour may or may not break exising userspace.


I cannot answer the question. The three cases I've seen at least worked 
with the new semantics. I think Daniel once mentioned that we already 
expect userspace to call the DIRTYFB ioctl after changing a 
framebuffer's content.








Or we could introduce a client cap for it I guess, but that
would require (minimal) userspace changes.



I know that we don't give performance guarantees to userspace. But using
cursor/overlay planes should be faster then not using them. I think
that's a reasonable assumption for userspace to make.



Another thing the ioctls have always done is actually perform
the commit whether anything changed or nor. That is, they
still do all the same the same vblanky stuff and the commit
takes the same amount of time. Not sure if your idea is
to potentially short circuit the entire thing and make it
take no time at all?


The patches don't change the overall commit logics. All they do is to
set damage updates to a size of 0 if a plane reliably does not need an
update.


What I gathered is that you seemed to only consider the fb
contents as something that needs damage handling. That is not
the case for stuff like eDP PSR and DSI command mode displays
where we need to upload a new full frame whenever also the
non-damaged fb contents would get transformed by the hardware
on the way to the remote frambuffer. That would mean any change
in eg. scanout coordinates, color management, etc.


There is support for changing scanout coordinates in
drm_atomic_helper_damage_iter_init() in patch 2. In general, maybe the
heuristic needs to be stricter to only handle updates that only change
damage.

For now, the problem only happens after converting ast to SHMEM. All the
other SHMEM-based drivers use a single plane where the problem doesn't
happen; or only reprogram the scanout address, which is fast. If the
damage-handling logic imposes problems on other drivers, some of it
could possibly be moved into ast itself.


Maybe we have two clearly separate classes of uses case:
1. devices where only damage to the fb contents matter (eg. some kind of
shadow fb that is the same size as the real fb).
2. devices where everything about the scanout process matters (eg. PSR)
?

Maybe there should be helpers to deal with just the first case,
and then some more helpers (or just driver code) to pile the rest
on top as well when needed?


Makes sense.

I know that these plane-state are not good style, but with them in 
place, much of the heuristic could be moved from 
drm_atomic_helper_check_plane_damage() into the driver if necessary.


Best regards
Thomas





--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature

Re: [PATCH 1/7] drm/i915/hwmon: Add HWMON infrastructure

2022-09-21 Thread Gupta, Anshuman





On 9/16/2022 8:30 PM, Badal Nilawar wrote:

From: Dale B Stimson 

The i915 HWMON module will be used to expose voltage, power and energy
values for dGfx. Here we set up i915 hwmon infrastructure including i915
hwmon registration, basic data structures and functions.

v2:
   - Create HWMON infra patch (Ashutosh)
   - Fixed review comments (Jani)
   - Remove "select HWMON" from i915/Kconfig (Jani)
v3: Use hwm_ prefix for static functions (Ashutosh)
v4: s/#ifdef CONFIG_HWMON/#if IS_REACHABLE(CONFIG_HWMON)/ since the former
 doesn't work if hwmon is compiled as a module (Guenter)
v5: Fixed review comments (Jani)

Cc: Guenter Roeck 
Signed-off-by: Dale B Stimson 
Signed-off-by: Ashutosh Dixit 
Signed-off-by: Riana Tauro 
Signed-off-by: Badal Nilawar 
Acked-by: Guenter Roeck 
Reviewed-by: Ashutosh Dixit 

Reviewed-by: Anshuman Gupta 


---
  drivers/gpu/drm/i915/Makefile  |   3 +
  drivers/gpu/drm/i915/i915_driver.c |   5 ++
  drivers/gpu/drm/i915/i915_drv.h|   2 +
  drivers/gpu/drm/i915/i915_hwmon.c  | 136 +
  drivers/gpu/drm/i915/i915_hwmon.h  |  20 +
  5 files changed, 166 insertions(+)
  create mode 100644 drivers/gpu/drm/i915/i915_hwmon.c
  create mode 100644 drivers/gpu/drm/i915/i915_hwmon.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index a26edcdadc21..66a6023e61a6 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -209,6 +209,9 @@ i915-y += gt/uc/intel_uc.o \
  # graphics system controller (GSC) support
  i915-y += gt/intel_gsc.o
  
+# graphics hardware monitoring (HWMON) support

+i915-$(CONFIG_HWMON) += i915_hwmon.o
+
  # modesetting core code
  i915-y += \
display/hsw_ips.o \
diff --git a/drivers/gpu/drm/i915/i915_driver.c 
b/drivers/gpu/drm/i915/i915_driver.c
index c459eb362c47..75655adb7bd3 100644
--- a/drivers/gpu/drm/i915/i915_driver.c
+++ b/drivers/gpu/drm/i915/i915_driver.c
@@ -81,6 +81,7 @@
  #include "i915_drm_client.h"
  #include "i915_drv.h"
  #include "i915_getparam.h"
+#include "i915_hwmon.h"
  #include "i915_ioc32.h"
  #include "i915_ioctl.h"
  #include "i915_irq.h"
@@ -763,6 +764,8 @@ static void i915_driver_register(struct drm_i915_private 
*dev_priv)
for_each_gt(gt, dev_priv, i)
intel_gt_driver_register(gt);
  
+	i915_hwmon_register(dev_priv);

+
intel_display_driver_register(dev_priv);
  
  	intel_power_domains_enable(dev_priv);

@@ -795,6 +798,8 @@ static void i915_driver_unregister(struct drm_i915_private 
*dev_priv)
for_each_gt(gt, dev_priv, i)
intel_gt_driver_unregister(gt);
  
+	i915_hwmon_unregister(dev_priv);

+
i915_perf_unregister(dev_priv);
i915_pmu_unregister(dev_priv);
  
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h

index 9f9372931fd2..01a2caf42635 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -353,6 +353,8 @@ struct drm_i915_private {
  
  	struct i915_perf perf;
  
+	struct i915_hwmon *hwmon;

+
/* Abstract the submission mechanism (legacy ringbuffer or execlists) 
away */
struct intel_gt gt0;
  
diff --git a/drivers/gpu/drm/i915/i915_hwmon.c b/drivers/gpu/drm/i915/i915_hwmon.c

new file mode 100644
index ..103dd543a214
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -0,0 +1,136 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+
+#include "i915_drv.h"
+#include "i915_hwmon.h"
+#include "i915_reg.h"
+#include "intel_mchbar_regs.h"
+
+struct hwm_reg {
+};
+
+struct hwm_drvdata {
+   struct i915_hwmon *hwmon;
+   struct intel_uncore *uncore;
+   struct device *hwmon_dev;
+   char name[12];
+};
+
+struct i915_hwmon {
+   struct hwm_drvdata ddat;
+   struct mutex hwmon_lock;/* counter overflow logic and 
rmw */
+   struct hwm_reg rg;
+};
+
+static const struct hwmon_channel_info *hwm_info[] = {
+   NULL
+};
+
+static umode_t
+hwm_is_visible(const void *drvdata, enum hwmon_sensor_types type,
+  u32 attr, int channel)
+{
+   switch (type) {
+   default:
+   return 0;
+   }
+}
+
+static int
+hwm_read(struct device *dev, enum hwmon_sensor_types type, u32 attr,
+int channel, long *val)
+{
+   switch (type) {
+   default:
+   return -EOPNOTSUPP;
+   }
+}
+
+static int
+hwm_write(struct device *dev, enum hwmon_sensor_types type, u32 attr,
+ int channel, long val)
+{
+   switch (type) {
+   default:
+   return -EOPNOTSUPP;
+   }
+}
+
+static const struct hwmon_ops hwm_ops = {
+   .is_visible = hwm_is_visible,
+   .read = hwm_read,
+   .write = hwm_write,
+};
+
+static const struct hwmon_chip_info hwm_chip_info = {
+   .ops = &hwm_ops,
+   .info = hwm_info,
+};
+
+static void
+hwm_get_preregistration_info(struct drm_i915_private *i915)
+{
+}
+
+void i9

Re: [PATCH 2/7] drm/i915/hwmon: Add HWMON current voltage support

2022-09-21 Thread Gupta, Anshuman





On 9/16/2022 8:30 PM, Badal Nilawar wrote:

From: Riana Tauro 

Use i915 HWMON subsystem to display current input voltage.

v2:
   - Updated date and kernel version in feature description
   - Fixed review comments (Ashutosh)
v3: Use macro HWMON_CHANNEL_INFO to define hwmon channel (Guenter)
v4:
   - Fixed review comments (Ashutosh)
   - Use hwm_ prefix for static functions (Ashutosh)
v5:
   - Added unit of voltage as millivolts (Ashutosh)
   - Updated date, kernel version in documentation

Cc: Guenter Roeck 
Cc: Anshuman Gupta 
Signed-off-by: Riana Tauro 
Signed-off-by: Badal Nilawar 
Acked-by: Guenter Roeck 
Reviewed-by: Ashutosh Dixit 

Looks good to me.
Reviewed-by: Anshuman Gupta 

---
  .../ABI/testing/sysfs-driver-intel-i915-hwmon |  7 +++
  drivers/gpu/drm/i915/gt/intel_gt_regs.h   |  3 ++
  drivers/gpu/drm/i915/i915_hwmon.c | 53 +++
  3 files changed, 63 insertions(+)
  create mode 100644 Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon

diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon 
b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
new file mode 100644
index ..e2974f928e58
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
@@ -0,0 +1,7 @@
+What:  /sys/devices/.../hwmon/hwmon/in0_input
+Date:  September 2022
+KernelVersion: 6
+Contact:   dri-devel@lists.freedesktop.org
+Description:   RO. Current Voltage in millivolt.
+
+   Only supported for particular Intel i915 graphics platforms.
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h 
b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 2275ee47da95..65336514554d 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -1510,6 +1510,9 @@
  #define VLV_RENDER_C0_COUNT   _MMIO(0x138118)
  #define VLV_MEDIA_C0_COUNT_MMIO(0x13811c)
  
+#define GEN12_RPSTAT1_MMIO(0x1381b4)

+#define   GEN12_VOLTAGE_MASK   REG_GENMASK(10, 0)
+
  #define GEN11_GT_INTR_DW(x)   _MMIO(0x190018 + ((x) * 4))
  #define   GEN11_CSME  (31)
  #define   GEN11_GUNIT (28)
diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
b/drivers/gpu/drm/i915/i915_hwmon.c
index 103dd543a214..45745afa5c5b 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -11,8 +11,16 @@
  #include "i915_hwmon.h"
  #include "i915_reg.h"
  #include "intel_mchbar_regs.h"
+#include "gt/intel_gt_regs.h"
+
+/*
+ * SF_* - scale factors for particular quantities according to hwmon spec.
+ * - voltage  - millivolts
+ */
+#define SF_VOLTAGE 1000
  
  struct hwm_reg {

+   i915_reg_t gt_perf_status;
  };
  
  struct hwm_drvdata {

@@ -29,14 +37,49 @@ struct i915_hwmon {
  };
  
  static const struct hwmon_channel_info *hwm_info[] = {

+   HWMON_CHANNEL_INFO(in, HWMON_I_INPUT),
NULL
  };
  
+static umode_t

+hwm_in_is_visible(const struct hwm_drvdata *ddat, u32 attr)
+{
+   switch (attr) {
+   case hwmon_in_input:
+   return i915_mmio_reg_valid(ddat->hwmon->rg.gt_perf_status) ? 
0444 : 0;
+   default:
+   return 0;
+   }
+}
+
+static int
+hwm_in_read(struct hwm_drvdata *ddat, u32 attr, long *val)
+{
+   struct i915_hwmon *hwmon = ddat->hwmon;
+   intel_wakeref_t wakeref;
+   u32 reg_value;
+
+   switch (attr) {
+   case hwmon_in_input:
+   with_intel_runtime_pm(ddat->uncore->rpm, wakeref)
+   reg_value = intel_uncore_read(ddat->uncore, 
hwmon->rg.gt_perf_status);
+   /* HW register value in units of 2.5 millivolt */
+   *val = DIV_ROUND_CLOSEST(REG_FIELD_GET(GEN12_VOLTAGE_MASK, 
reg_value) * 25, 10);
+   return 0;
+   default:
+   return -EOPNOTSUPP;
+   }
+}
+
  static umode_t
  hwm_is_visible(const void *drvdata, enum hwmon_sensor_types type,
   u32 attr, int channel)
  {
+   struct hwm_drvdata *ddat = (struct hwm_drvdata *)drvdata;
+
switch (type) {
+   case hwmon_in:
+   return hwm_in_is_visible(ddat, attr);
default:
return 0;
}
@@ -46,7 +89,11 @@ static int
  hwm_read(struct device *dev, enum hwmon_sensor_types type, u32 attr,
 int channel, long *val)
  {
+   struct hwm_drvdata *ddat = dev_get_drvdata(dev);
+
switch (type) {
+   case hwmon_in:
+   return hwm_in_read(ddat, attr, val);
default:
return -EOPNOTSUPP;
}
@@ -76,6 +123,12 @@ static const struct hwmon_chip_info hwm_chip_info = {
  static void
  hwm_get_preregistration_info(struct drm_i915_private *i915)
  {
+   struct i915_hwmon *hwmon = i915->hwmon;
+
+   if (IS_DG1(i915) || IS_DG2(i915))
+   hwmon->rg.gt_perf_status = GEN12_RPSTAT1;
+   else
+   hwmon->rg.gt_perf_status = INVALID_MMIO_REG;

Re: [PATCH v2 07/10] drm/ofdrm: Add ofdrm for Open Firmware framebuffers

2022-09-21 Thread Thomas Zimmermann


Hi

Am 26.07.22 um 15:17 schrieb Javier Martinez Canillas:

Hello Thomas,

On 7/20/22 16:27, Thomas Zimmermann wrote:

Open Firmware provides basic display output via the 'display' node.
DT platform code already provides a device that represents the node's
framebuffer. Add a DRM driver for the device. The display mode and
color format is pre-initialized by the system's firmware. Runtime
modesetting via DRM is not possible. The display is useful during
early boot stages or as error fallback.



I'm not familiar with OF display but the driver looks good to me.

Reviewed-by: Javier Martinez Canillas 

I just have a few questions below.

[...]


+static int ofdrm_primary_plane_helper_atomic_check(struct drm_plane *plane,
+  struct drm_atomic_state 
*new_state)
+{
+   struct drm_plane_state *new_plane_state = 
drm_atomic_get_new_plane_state(new_state, plane);
+   struct drm_crtc_state *new_crtc_state;
+   int ret;
+
+   if (!new_plane_state->fb)
+   return 0;
+
+   new_crtc_state = drm_atomic_get_new_crtc_state(new_state, 
new_plane_state->crtc);
+
+   ret = drm_atomic_helper_check_plane_state(new_plane_state, 
new_crtc_state,
+ DRM_PLANE_HELPER_NO_SCALING,
+ DRM_PLANE_HELPER_NO_SCALING,
+ false, false);
+   if (ret)
+   return ret;
+
+   return 0;
+}


This seems to be exactly the same check than used in the simpledrm driver.
Maybe could be moved to the fwfb helper library too ?


I've meanwhile dropped fwfb helpers. Color management requires specific 
code here, so there's no much to share anyway.




[...]


+
+static void ofdrm_crtc_helper_atomic_disable(struct drm_crtc *crtc,
+struct drm_atomic_state *old_state)
+{
+   /*
+* Always enabled; disabling clears the screen in the
+* primary plane's atomic_disable function.
+*/
+}
+


Same comment than for simpledrm, are these no-op helpers really needed ?


In simpledrm, I've meanwhile removed them. I'll do so here as well.



[...]


+static const struct of_device_id ofdrm_of_match_display[] = {
+   { .compatible = "display", },
+   { },
+};
+MODULE_DEVICE_TABLE(of, ofdrm_of_match_display);
+


I don't see a binding for this in Documentation/devicetree/bindings/display.
Do we need one or it's that only required for FDT and not Open Firmware DT ?



No idea. The device is being created in drivers/of/platform.c. If offb 
didn't need these bindings, ofdrm probably won't need them either.


Best regards
Thomas


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature

Re: [Intel-gfx] [PATCH 3/7] drm/i915/hwmon: Power PL1 limit and TDP setting

2022-09-21 Thread Tvrtko Ursulin




On 21/09/2022 01:02, Dixit, Ashutosh wrote:

On Fri, 16 Sep 2022 08:00:50 -0700, Badal Nilawar wrote:


diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon 
b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
index e2974f928e58..bc061238e35c 100644
--- a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
+++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
@@ -5,3 +5,23 @@ Contact:   dri-devel@lists.freedesktop.org
  Description:  RO. Current Voltage in millivolt.

Only supported for particular Intel i915 graphics platforms.
+
+What:  /sys/devices/.../hwmon/hwmon/power1_max
+Date:  September 2022
+KernelVersion: 6


Maybe we should ask someone but even if we merge this today to drm-tip this
will appear in kernel.org Linus' version only in 6.2. So I think we should
set this as 6.2 on all patches.


Correct, if merged today it will appear in 6.2 so please change to that 
before merging.


As for the date that's harder to predict and I am not really sure how 
best to handle it. Crystal ball predicts February 2023 fwiw so maybe go 
with that for now. Seems less important than the release for me anyway.


Regards,

Tvrtko


Except for this, thanks for making the changes, this is:

Reviewed-by: Ashutosh Dixit

Re: [PATCH v2 08/10] drm/ofdrm: Add CRTC state

2022-09-21 Thread Thomas Zimmermann


Hi

Am 26.07.22 um 15:36 schrieb Javier Martinez Canillas:

On 7/20/22 16:27, Thomas Zimmermann wrote:

Add a dedicated CRTC state to ofdrm to later store information for
palette updates.

Signed-off-by: Thomas Zimmermann 
---
  drivers/gpu/drm/tiny/ofdrm.c | 62 ++--
  


Reviewed-by: Javier Martinez Canillas 

[...]


+static void ofdrm_crtc_reset(struct drm_crtc *crtc)
+{
+   struct ofdrm_crtc_state *ofdrm_crtc_state;
+
+   if (crtc->state) {
+   ofdrm_crtc_state_destroy(to_ofdrm_crtc_state(crtc->state));
+   crtc->state = NULL; /* must be set to NULL here */
+   }
+
+   ofdrm_crtc_state = kzalloc(sizeof(*ofdrm_crtc_state), GFP_KERNEL);
+   if (!ofdrm_crtc_state)
+   return;
+   __drm_atomic_helper_crtc_reset(crtc, &ofdrm_crtc_state->base);
+}
+


IMO this function is hard to read, I would instead write it as following:

static void ofdrm_crtc_reset(struct drm_crtc *crtc)
{
 struct ofdrm_crtc_state *ofdrm_crtc_state = 
kzalloc(sizeof(*ofdrm_crtc_state), GFP_KERNEL);

if (!ofdrm_crtc_state)
return;

 if (crtc->state) {
 ofdrm_crtc_state_destroy(to_ofdrm_crtc_state(crtc->state));
crtc->state = NULL; /* must be set to NULL here */
}

 __drm_atomic_helper_crtc_reset(crtc, &ofdrm_crtc_state->base);
}

Also with that form I think that the crtc->state = NULL could just be dropped ?


I once had to add this line to a driver to make the DRM helpers work. 
But I cannot find any longer why. Maybe it's been resolved meanwhile.


Best regards
Thomas





--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature

Re: [PATCH 3/7] drm/i915/hwmon: Power PL1 limit and TDP setting

2022-09-21 Thread Gupta, Anshuman





On 9/16/2022 8:30 PM, Badal Nilawar wrote:

From: Dale B Stimson 

Use i915 HWMON to display/modify dGfx power PL1 limit and TDP setting.

v2:
   - Fix review comments (Ashutosh)
   - Do not restore power1_max upon module unload/load sequence
 because on production systems modules are always loaded
 and not unloaded/reloaded (Ashutosh)
   - Fix review comments (Jani)
   - Remove endianness conversion (Ashutosh)
v3: Add power1_rated_max (Ashutosh)
v4:
   - Use macro HWMON_CHANNEL_INFO to define power channel (Guenter)
   - Update the date and kernel version in Documentation (Badal)
v5: Use hwm_ prefix for static functions (Ashutosh)
v6:
   - Fix review comments (Ashutosh)
   - Update date, kernel version in documentation

Cc: Guenter Roeck 
Signed-off-by: Dale B Stimson 
Signed-off-by: Ashutosh Dixit 
Signed-off-by: Riana Tauro 
Signed-off-by: Badal Nilawar 
Acked-by: Guenter Roeck 
---
  .../ABI/testing/sysfs-driver-intel-i915-hwmon |  20 +++
  drivers/gpu/drm/i915/i915_hwmon.c | 158 +-
  drivers/gpu/drm/i915/i915_reg.h   |   5 +
  drivers/gpu/drm/i915/intel_mchbar_regs.h  |   6 +
  4 files changed, 187 insertions(+), 2 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon 
b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
index e2974f928e58..bc061238e35c 100644
--- a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
+++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
@@ -5,3 +5,23 @@ Contact:   dri-devel@lists.freedesktop.org
  Description:  RO. Current Voltage in millivolt.
  
  		Only supported for particular Intel i915 graphics platforms.

+
+What:  /sys/devices/.../hwmon/hwmon/power1_max
+Date:  September 2022
+KernelVersion: 6
+Contact:   dri-devel@lists.freedesktop.org
+Description:   RW. Card reactive sustained  (PL1/Tau) power limit in 
microwatts.
+
+   The power controller will throttle the operating frequency
+   if the power averaged over a window (typically seconds)
+   exceeds this limit.
+
+   Only supported for particular Intel i915 graphics platforms.
+
+What:  /sys/devices/.../hwmon/hwmon/power1_rated_max
+Date:  September 2022
+KernelVersion: 6
+Contact:   dri-devel@lists.freedesktop.org
+Description:   RO. Card default power limit (default TDP setting).
+
+   Only supported for particular Intel i915 graphics platforms.
diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
b/drivers/gpu/drm/i915/i915_hwmon.c
index 45745afa5c5b..5183cf51a49b 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -16,11 +16,16 @@
  /*
   * SF_* - scale factors for particular quantities according to hwmon spec.
   * - voltage  - millivolts
+ * - power  - microwatts
   */
  #define SF_VOLTAGE1000
+#define SF_POWER   100
  
  struct hwm_reg {

i915_reg_t gt_perf_status;
+   i915_reg_t pkg_power_sku_unit;
+   i915_reg_t pkg_power_sku;
+   i915_reg_t pkg_rapl_limit;
  };
  
  struct hwm_drvdata {

@@ -34,10 +39,68 @@ struct i915_hwmon {
struct hwm_drvdata ddat;
struct mutex hwmon_lock;/* counter overflow logic and 
rmw */
struct hwm_reg rg;
+   int scl_shift_power;
  };
  
+static void

+hwm_locked_with_pm_intel_uncore_rmw(struct hwm_drvdata *ddat,
+   i915_reg_t reg, u32 clear, u32 set)
+{
+   struct i915_hwmon *hwmon = ddat->hwmon;
+   struct intel_uncore *uncore = ddat->uncore;
+   intel_wakeref_t wakeref;
+
+   mutex_lock(&hwmon->hwmon_lock);
+
+   with_intel_runtime_pm(uncore->rpm, wakeref)
+   intel_uncore_rmw(uncore, reg, clear, set);
+
+   mutex_unlock(&hwmon->hwmon_lock);
+}
+
+/*
+ * This function's return type of u64 allows for the case where the scaling
+ * of the field taken from the 32-bit register value might cause a result to
+ * exceed 32 bits.
+ */
+static u64
+hwm_field_read_and_scale(struct hwm_drvdata *ddat, i915_reg_t rgadr,
+u32 field_msk, int nshift, u32 scale_factor)
+{
+   struct intel_uncore *uncore = ddat->uncore;
+   intel_wakeref_t wakeref;
+   u32 reg_value;
+
+   with_intel_runtime_pm(uncore->rpm, wakeref)
+   reg_value = intel_uncore_read(uncore, rgadr);
+
+   reg_value = REG_FIELD_GET(field_msk, reg_value);
+
+   return mul_u64_u32_shr(reg_value, scale_factor, nshift);
+}
+
+static void
+hwm_field_scale_and_write(struct hwm_drvdata *ddat, i915_reg_t rgadr,
+ u32 field_msk, int nshift,
+ unsigned int scale_factor, long lval)
+{
+   u32 nval;
+   u32 bits_to_clear;
+   u32 bits_to_set;
+
+   /* Computation in 64-bits to avoid overflow. Round to nearest. */
+   nval = DIV_ROUND_CLOSEST_ULL((u64)lval << nshift, scale_factor);
+
+   bits_to_clear = field_msk;
+   bits_t

Re: [PATCH v9 09/10] leds: flash: mt6370: Add MediaTek MT6370 flashlight support

2022-09-21 Thread Andy Shevchenko

On Wed, Sep 21, 2022 at 4:48 AM ChiaEn Wu  wrote:
> On Sun, Sep 18, 2022 at 3:22 AM Han Jingoo  wrote:
> > On Mon, Aug 29, 2022 ChiaEn Wu  wrote:

> > > +#define MT6370_ITORCH_MIN_uA   25000
> > > +#define MT6370_ITORCH_STEP_uA  12500
> > > +#define MT6370_ITORCH_MAX_uA   40
> > > +#define MT6370_ITORCH_DOUBLE_MAX_uA80
> > > +#define MT6370_ISTRB_MIN_uA5
> > > +#define MT6370_ISTRB_STEP_uA   12500
> > > +#define MT6370_ISTRB_MAX_uA150
> > > +#define MT6370_ISTRB_DOUBLE_MAX_uA 300
> >
> > Use upper letters as below:

For microseconds (and other -seconds) the common practice (I assume
historically) is to use upper letters, indeed. But for current it's
more natural to use small letters for unit multiplier as it's easier
to read and understand.

> > #define MT6370_ITORCH_MIN_UA   25000
> > #define MT6370_ITORCH_STEP_UA  12500
> > #define MT6370_ITORCH_MAX_UA   40
> > #define MT6370_ITORCH_DOUBLE_MAX_UA80
> > #define MT6370_ISTRB_MIN_UA5
> > #define MT6370_ISTRB_STEP_UA   12500
> > #define MT6370_ISTRB_MAX_UA150
> > #define MT6370_ISTRB_DOUBLE_MAX_UA 300
> >
> > > +#define MT6370_STRBTO_MIN_US   64000
> > > +#define MT6370_STRBTO_STEP_US  32000
> > > +#define MT6370_STRBTO_MAX_US   2432000
>
> Hi Jingoo,
>
> This coding style is in accordance with Andy's opinion in this mail:
> https://lore.kernel.org/linux-arm-kernel/CAHp75Vciq4M4kVrabNV9vTLLcd1vR=bme8jledaf9mkrtpc...@mail.gmail.com/

True.

-- 
With Best Regards,
Andy Shevchenko

Re: [PATCH 4/7] drm/i915/hwmon: Show device level energy usage

2022-09-21 Thread Gupta, Anshuman





On 9/16/2022 8:30 PM, Badal Nilawar wrote:

From: Dale B Stimson 

Use i915 HWMON to display device level energy input.

v2:
   - Updated the date and kernel version in feature description
v3:
   - Cleaned up hwm_energy function and removed unused function
 i915_hwmon_energy_status_get (Ashutosh)
   - Updated date, kernel version in documentation

Signed-off-by: Dale B Stimson 
Signed-off-by: Ashutosh Dixit 
Signed-off-by: Riana Tauro 
Signed-off-by: Badal Nilawar 
Acked-by: Guenter Roeck 
Reviewed-by: Ashutosh Dixit 
---
  .../ABI/testing/sysfs-driver-intel-i915-hwmon |   8 ++
  drivers/gpu/drm/i915/i915_hwmon.c | 107 +-
  drivers/gpu/drm/i915/i915_hwmon.h |   1 +
  drivers/gpu/drm/i915/intel_mchbar_regs.h  |   2 +
  4 files changed, 116 insertions(+), 2 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon 
b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
index bc061238e35c..94101f818a70 100644
--- a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
+++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
@@ -25,3 +25,11 @@ Contact: dri-devel@lists.freedesktop.org
  Description:  RO. Card default power limit (default TDP setting).
  
  		Only supported for particular Intel i915 graphics platforms.

+
+What:  /sys/devices/.../hwmon/hwmon/energy1_input
+Date:  September 2022
+KernelVersion: 6
+Contact:   dri-devel@lists.freedesktop.org
+Description:   RO. Energy input of device in microjoules.
+
+   Only supported for particular Intel i915 graphics platforms.
diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
b/drivers/gpu/drm/i915/i915_hwmon.c
index 5183cf51a49b..a42cfad78bef 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -17,21 +17,30 @@
   * SF_* - scale factors for particular quantities according to hwmon spec.
   * - voltage  - millivolts
   * - power  - microwatts
+ * - energy - microjoules
   */
  #define SF_VOLTAGE1000
  #define SF_POWER  100
+#define SF_ENERGY  100
  
  struct hwm_reg {

i915_reg_t gt_perf_status;
i915_reg_t pkg_power_sku_unit;
i915_reg_t pkg_power_sku;
i915_reg_t pkg_rapl_limit;
+   i915_reg_t energy_status_all;
+};
+
+struct hwm_energy_info {
+   u32 reg_val_prev;
+   long accum_energy;  /* Accumulated energy for 
energy1_input */
  };
  
  struct hwm_drvdata {

struct i915_hwmon *hwmon;
struct intel_uncore *uncore;
struct device *hwmon_dev;
+   struct hwm_energy_info ei;  /*  Energy info for 
energy1_input */
char name[12];
  };
  
@@ -40,6 +49,7 @@ struct i915_hwmon {

struct mutex hwmon_lock;/* counter overflow logic and 
rmw */
struct hwm_reg rg;
int scl_shift_power;
+   int scl_shift_energy;
  };
  
  static void

@@ -98,9 +108,60 @@ hwm_field_scale_and_write(struct hwm_drvdata *ddat, 
i915_reg_t rgadr,
bits_to_clear, bits_to_set);
  }
  
+/*

+ * hwm_energy - Obtain energy value
+ *
+ * The underlying energy hardware register is 32-bits and is subject to
+ * overflow. How long before overflow? For example, with an example
+ * scaling bit shift of 14 bits (see register *PACKAGE_POWER_SKU_UNIT) and
+ * a power draw of 1000 watts, the 32-bit counter will overflow in
+ * approximately 4.36 minutes.
+ *
+ * Examples:
+ *1 watt:  (2^32 >> 14) /1 W / (60 * 60 * 24) secs/day -> 3 days
+ * 1000 watts: (2^32 >> 14) / 1000 W / 60 secs/min -> 4.36 minutes
+ *
+ * The function significantly increases overflow duration (from 4.36
+ * minutes) by accumulating the energy register into a 'long' as allowed by
+ * the hwmon API. Using x86_64 128 bit arithmetic (see mul_u64_u32_shr()),
+ * a 'long' of 63 bits, SF_ENERGY of 1e6 (~20 bits) and
+ * hwmon->scl_shift_energy of 14 bits we have 57 (63 - 20 + 14) bits before
+ * energy1_input overflows. This at 1000 W is an overflow duration of 278 
years.
+ */
+static int
+hwm_energy(struct hwm_drvdata *ddat, long *energy)
+{
+   struct intel_uncore *uncore = ddat->uncore;
+   struct i915_hwmon *hwmon = ddat->hwmon;
+   struct hwm_energy_info *ei = &ddat->ei;
+   intel_wakeref_t wakeref;
+   i915_reg_t rgaddr;
+   u32 reg_val;
+
+   rgaddr = hwmon->rg.energy_status_all;
+
+   mutex_lock(&hwmon->hwmon_lock);
+
+   with_intel_runtime_pm(uncore->rpm, wakeref)
+   reg_val = intel_uncore_read(uncore, rgaddr);
+
+   if (reg_val >= ei->reg_val_prev)
+   ei->accum_energy += reg_val - ei->reg_val_prev;
+   else
+   ei->accum_energy += UINT_MAX - ei->reg_val_prev + reg_val;
+   ei->reg_val_prev = reg_val;
+
+   *energy = mul_u64_u32_shr(ei->accum_energy, SF_ENERGY,
+ hwmon->scl_shift_energy);
+   mutex_unlock(&hwmon->hwmon_lock)

RE: [PATCH] virt: acrn: obtain pa from VMA with PFNMAP flag

2022-09-21 Thread Huang, Yonghua

Hi Daniel,

 Thank you for this info, we will fix this issue.

Almost miss this mail, sorry!

-Yonghua

> -Original Message-
> From: Daniel Vetter 
> Sent: Wednesday, August 10, 2022 20:20
> To: Huang, Yonghua 
> Cc: gre...@linuxfoundation.org; linux-ker...@vger.kernel.org;
> sta...@vger.kernel.org; Chatre, Reinette ; Wang,
> Zhi A ; Wang, Yu1 ; Li, Fei1
> ; Linux MM ; DRI Development  de...@lists.freedesktop.org>
> Subject: Re: [PATCH] virt: acrn: obtain pa from VMA with PFNMAP flag
> 
> On Mon, Feb 28, 2022 at 05:22:12AM +0300, Yonghua Huang wrote:
> >  acrn_vm_ram_map can't pin the user pages with VM_PFNMAP flag  by
> > calling get_user_pages_fast(), the PA(physical pages)  may be mapped
> > by kernel driver and set PFNMAP flag.
> >
> >  This patch fixes logic to setup EPT mapping for PFN mapped RAM region
> > by checking the memory attribute before adding EPT mapping for them.
> >
> > Fixes: 88f537d5e8dd ("virt: acrn: Introduce EPT mapping management")
> > Signed-off-by: Yonghua Huang 
> > Signed-off-by: Fei Li 
> > ---
> >  drivers/virt/acrn/mm.c | 24 
> >  1 file changed, 24 insertions(+)
> >
> > diff --git a/drivers/virt/acrn/mm.c b/drivers/virt/acrn/mm.c index
> > c4f2e15c8a2b..3b1b1e7a844b 100644
> > --- a/drivers/virt/acrn/mm.c
> > +++ b/drivers/virt/acrn/mm.c
> > @@ -162,10 +162,34 @@ int acrn_vm_ram_map(struct acrn_vm *vm, struct
> acrn_vm_memmap *memmap)
> > void *remap_vaddr;
> > int ret, pinned;
> > u64 user_vm_pa;
> > +   unsigned long pfn;
> > +   struct vm_area_struct *vma;
> >
> > if (!vm || !memmap)
> > return -EINVAL;
> >
> > +   mmap_read_lock(current->mm);
> > +   vma = vma_lookup(current->mm, memmap->vma_base);
> > +   if (vma && ((vma->vm_flags & VM_PFNMAP) != 0)) {
> > +   if ((memmap->vma_base + memmap->len) > vma->vm_end) {
> > +   mmap_read_unlock(current->mm);
> > +   return -EINVAL;
> > +   }
> > +
> > +   ret = follow_pfn(vma, memmap->vma_base, &pfn);
> 
> This races, don't use follow_pfn() and most definitely don't add new users. In
> some cases follow_pte, but the pte/pfn is still only valid for as long as you 
> hold
> the pte spinlock.
> 
> > +   mmap_read_unlock(current->mm);
> 
> Definitely after here there's zero guarantees about this pfn and it could 
> point at
> anything.
> 
> Please fix, I tried pretty hard to get rid of follow_pfn(), but some of them 
> are
> just too hard to fix (e.g. kvm needs a pretty hug rewrite to get it all 
> sorted).
> 
> Cheers, Daniel
> 
> > +   if (ret < 0) {
> > +   dev_dbg(acrn_dev.this_device,
> > +   "Failed to lookup PFN at VMA:%pK.\n", (void
> *)memmap->vma_base);
> > +   return ret;
> > +   }
> > +
> > +   return acrn_mm_region_add(vm, memmap->user_vm_pa,
> > +PFN_PHYS(pfn), memmap->len,
> > +ACRN_MEM_TYPE_WB, memmap->attr);
> > +   }
> > +   mmap_read_unlock(current->mm);
> > +
> > /* Get the page number of the map region */
> > nr_pages = memmap->len >> PAGE_SHIFT;
> > pages = vzalloc(nr_pages * sizeof(struct page *));
> >
> > base-commit: 73878e5eb1bd3c9656685ca60bc3a49d17311e0c
> > --
> > 2.25.1
> >
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

Re: [PATCH 2/2] drm/panfrost: replace endian-specific types with generic ones

2022-09-21 Thread Alyssa Rosenzweig

> > Or of course we could just actually use native endian and detect from
> > the magic which endian is in use. That would require ripping out the
> > cpu_to_lexx() calls in Linux and making the user space tool more
> > intelligent. I'm happy with that, but it's pushing the complexity onto Mesa.
> 
> If there's a clearly identifiable header, then I'd say making the whole dump
> native-endian is probably the way to go. Unless and until anyone actually
> demands to be able to do cross-endian post-mortem GPU debugging, the
> realistic extent of the complexity in Mesa is that it doesn't recognise the
> foreign dump format and gives up, which I assume is already implemented :)

+1 to this solution. Gets the complexity out of both kernel and Mesa,
and in the vanishingly unlikely scenario that we need this
functionality, we can add it to Mesa without kernel changes. As mesa
panfrost maintainer I'll take those odds :+1:

Re: [PATCH v2 09/10] drm/ofdrm: Add per-model device function

2022-09-21 Thread Thomas Zimmermann


Hi

Am 05.08.22 um 02:22 schrieb Benjamin Herrenschmidt:

On Tue, 2022-07-26 at 16:40 +0200, Michal Suchánek wrote:

Hello,

On Tue, Jul 26, 2022 at 03:38:37PM +0200, Javier Martinez Canillas wrote:

On 7/20/22 16:27, Thomas Zimmermann wrote:

Add a per-model device-function structure in preparation of adding
color-management support. Detection of the individual models has been
taken from fbdev's offb.

Signed-off-by: Thomas Zimmermann 
---


Reviewed-by: Javier Martinez Canillas 

[...]


+static bool is_avivo(__be32 vendor, __be32 device)
+{
+   /* This will match most R5xx */
+   return (vendor == 0x1002) &&
+  ((device >= 0x7100 && device < 0x7800) || (device >= 0x9400));
+}


Maybe add some constant macros to not have these magic numbers ?


This is based on the existing fbdev implementation's magic numbers:

drivers/video/fbdev/offb.c: ((*did >= 0x7100 && *did < 0x7800) 
||

Of course, it would be great if somebody knowledgeable could clarify
those.


I don't think anybody remembers :-) Vendor 0x1002 is PCI_VENDOR_ID_ATI,


I do :)


but the rest is basically ranges of PCI IDs for which we don't have
symbolic constants.


Should we add them to the official list in pci_ids.h?  I cannot find 
0x7800. The others are R520 and R600.


Best regards
Thomas



Cheers,
Ben.



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature

Re: [Intel-gfx] [PATCH 1/7] drm/i915/hwmon: Add HWMON infrastructure

2022-09-21 Thread Andi Shyti

Hi Badal,

> +struct hwm_reg {
> +};
> +
> +struct hwm_drvdata {
> + struct i915_hwmon *hwmon;
> + struct intel_uncore *uncore;
> + struct device *hwmon_dev;
> + char name[12];
> +};
> +
> +struct i915_hwmon {
> + struct hwm_drvdata ddat;
> + struct mutex hwmon_lock;/* counter overflow logic and 
> rmw */
> + struct hwm_reg rg;
> +};
> +
> +static const struct hwmon_channel_info *hwm_info[] = {
> + NULL
> +};
> +
> +static umode_t
> +hwm_is_visible(const void *drvdata, enum hwmon_sensor_types type,
> +u32 attr, int channel)
> +{
> + switch (type) {
> + default:
> + return 0;
> + }
> +}
> +
> +static int
> +hwm_read(struct device *dev, enum hwmon_sensor_types type, u32 attr,
> +  int channel, long *val)
> +{
> + switch (type) {
> + default:
> + return -EOPNOTSUPP;
> + }
> +}
> +
> +static int
> +hwm_write(struct device *dev, enum hwmon_sensor_types type, u32 attr,
> +   int channel, long val)
> +{
> + switch (type) {
> + default:
> + return -EOPNOTSUPP;
> + }
> +}
> +
> +static const struct hwmon_ops hwm_ops = {
> + .is_visible = hwm_is_visible,
> + .read = hwm_read,
> + .write = hwm_write,
> +};
> +
> +static const struct hwmon_chip_info hwm_chip_info = {
> + .ops = &hwm_ops,
> + .info = hwm_info,
> +};

what's the point for splitting so much? Can't you just send the
hwmon driver all at once? With this patch you are not actually
doing anything useful. In my opinion this should be squashed with
the next ones.

> +static void
> +hwm_get_preregistration_info(struct drm_i915_private *i915)
> +{
> +}
> +
> +void i915_hwmon_register(struct drm_i915_private *i915)
> +{
> + struct device *dev = i915->drm.dev;
> + struct i915_hwmon *hwmon;
> + struct device *hwmon_dev;
> + struct hwm_drvdata *ddat;
> +
> + /* hwmon is available only for dGfx */
> + if (!IS_DGFX(i915))
> + return;
> +
> + hwmon = kzalloc(sizeof(*hwmon), GFP_KERNEL);

why don't we use devm_kzalloc?

> + if (!hwmon)
> + return;
> +
> + i915->hwmon = hwmon;
> + mutex_init(&hwmon->hwmon_lock);
> + ddat = &hwmon->ddat;
> +
> + ddat->hwmon = hwmon;
> + ddat->uncore = &i915->uncore;
> + snprintf(ddat->name, sizeof(ddat->name), "i915");
> +
> + hwm_get_preregistration_info(i915);
> +
> + /*  hwmon_dev points to device hwmon */
> + hwmon_dev = hwmon_device_register_with_info(dev, ddat->name,
> + ddat,
> + &hwm_chip_info,
> + NULL);
> + if (IS_ERR(hwmon_dev)) {
> + mutex_destroy(&hwmon->hwmon_lock);

there is not such a big need to destroy the mutex. Destroying
mutexes is more useful when you actually are creating/destroying
and there is some debug need. I don't think that's the case.

With the devm_kzalloc this would be just a return.

Andi

> + i915->hwmon = NULL;
> + kfree(hwmon);
> + return;
> + }
> +
> + ddat->hwmon_dev = hwmon_dev;
> +}
> +
> +void i915_hwmon_unregister(struct drm_i915_private *i915)
> +{
> + struct i915_hwmon *hwmon;
> + struct hwm_drvdata *ddat;
> +
> + hwmon = fetch_and_zero(&i915->hwmon);
> + if (!hwmon)
> + return;
> +
> + ddat = &hwmon->ddat;
> + if (ddat->hwmon_dev)
> + hwmon_device_unregister(ddat->hwmon_dev);
> +
> + mutex_destroy(&hwmon->hwmon_lock);
> + kfree(hwmon);
> +}
> diff --git a/drivers/gpu/drm/i915/i915_hwmon.h 
> b/drivers/gpu/drm/i915/i915_hwmon.h
> new file mode 100644
> index ..7ca9cf2c34c9
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_hwmon.h
> @@ -0,0 +1,20 @@
> +/* SPDX-License-Identifier: MIT */
> +
> +/*
> + * Copyright © 2022 Intel Corporation
> + */
> +
> +#ifndef __I915_HWMON_H__
> +#define __I915_HWMON_H__
> +
> +struct drm_i915_private;
> +
> +#if IS_REACHABLE(CONFIG_HWMON)
> +void i915_hwmon_register(struct drm_i915_private *i915);
> +void i915_hwmon_unregister(struct drm_i915_private *i915);
> +#else
> +static inline void i915_hwmon_register(struct drm_i915_private *i915) { };
> +static inline void i915_hwmon_unregister(struct drm_i915_private *i915) { };
> +#endif
> +
> +#endif /* __I915_HWMON_H__ */
> -- 
> 2.25.1

Re: [PATCH v2 10/10] drm/ofdrm: Support color management

2022-09-21 Thread Thomas Zimmermann


Hi

Am 05.08.22 um 02:19 schrieb Benjamin Herrenschmidt:

On Wed, 2022-07-20 at 16:27 +0200, Thomas Zimmermann wrote:

+#if !defined(CONFIG_PPC)
+static inline void out_8(void __iomem *addr, int val)
+{ }
+static inline void out_le32(void __iomem *addr, int val)
+{ }
+static inline unsigned int in_le32(const void __iomem *addr)
+{
+   return 0;
+}
+#endif


These guys could just be replaced with readb/writel/readl respectively
(beware of the argument swap).


I only added them for COMPILE_TEST. There appears to be no portable 
interface that implements out_le32() and in_le32()?


Best regards
Thomas



Cheers,
Ben.



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature

Re: [PATCH v6,2/3] drm: mediatek: Adjust the dpi output format to MT8186

2022-09-21 Thread xinlei . lee

On Wed, 2022-09-21 at 09:35 +0800, CK Hu wrote:
> Hi, Xinlei:
> 
> On Wed, 2022-09-14 at 21:21 +0800, xinlei@mediatek.com wrote:
> > From: Xinlei Lee 
> > 
> > Dpi output needs to adjust the output format to dual edge for
> > MT8186.
> > The bridge ic on MT8186 uses the output format of
> > RGB888_dual_edge. 
> 
> I think different sink ic may support different output format, so
> query
> the sink information to decide which outout format.
> 
> > Due
> > to hardware changes, we need to modify the output format
> > corresponding
> > to the mmsys register.
> > 
> > Co-developed-by: Jitao Shi 
> > Signed-off-by: Jitao Shi 
> > Signed-off-by: Xinlei Lee 
> > Reviewed-by: AngeloGioacchino Del Regno <
> > angelogioacchino.delre...@collabora.com>
> > Reviewed-by: Nís F. R. A. Prado 
> > ---
> >  drivers/gpu/drm/mediatek/mtk_dpi.c | 14 ++
> >  1 file changed, 14 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/mediatek/mtk_dpi.c
> > b/drivers/gpu/drm/mediatek/mtk_dpi.c
> > index fb0b79704636..6e02f02f163c 100644
> > --- a/drivers/gpu/drm/mediatek/mtk_dpi.c
> > +++ b/drivers/gpu/drm/mediatek/mtk_dpi.c
> > @@ -14,6 +14,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  
> >  #include 
> > @@ -28,6 +29,7 @@
> >  #include "mtk_disp_drv.h"
> >  #include "mtk_dpi_regs.h"
> >  #include "mtk_drm_ddp_comp.h"
> > +#include "mtk_drm_drv.h"
> >  
> >  enum mtk_dpi_out_bit_num {
> > MTK_DPI_OUT_BIT_NUM_8BITS,
> > @@ -58,6 +60,11 @@ enum mtk_dpi_out_color_format {
> > MTK_DPI_COLOR_FORMAT_YCBCR_422
> >  };
> >  
> > +enum mtk_dpi_out_format_con {
> > +   MTK_DPI_RGB888_DDR_CON,
> > +   MTK_DPI_RGB565_SDR_CON
> > +};
> > +
> >  struct mtk_dpi {
> > struct drm_encoder encoder;
> > struct drm_bridge bridge;
> > @@ -80,6 +87,7 @@ struct mtk_dpi {
> > struct pinctrl_state *pins_dpi;
> > u32 output_fmt;
> > int refcount;
> > +   struct device *mmsys_dev;
> >  };
> >  
> >  static inline struct mtk_dpi *bridge_to_dpi(struct drm_bridge *b)
> > @@ -133,6 +141,7 @@ struct mtk_dpi_yc_limit {
> >   * @yuv422_en_bit: Enable bit of yuv422.
> >   * @csc_enable_bit: Enable bit of CSC.
> >   * @pixels_per_iter: Quantity of transferred pixels per iteration.
> > + * @edge_cfg_in_mmsys: If the edge configuration for DPI's output
> > needs to be set in MMSYS.
> >   */
> >  struct mtk_dpi_conf {
> > unsigned int (*cal_factor)(int clock);
> > @@ -151,6 +160,7 @@ struct mtk_dpi_conf {
> > u32 yuv422_en_bit;
> > u32 csc_enable_bit;
> > u32 pixels_per_iter;
> > +   bool edge_cfg_in_mmsys;
> >  };
> >  
> >  static void mtk_dpi_mask(struct mtk_dpi *dpi, u32 offset, u32 val,
> > u32 mask)
> > @@ -447,6 +457,8 @@ static void mtk_dpi_dual_edge(struct mtk_dpi
> > *dpi)
> > mtk_dpi_mask(dpi, DPI_OUTPUT_SETTING,
> >  dpi->output_fmt ==
> > MEDIA_BUS_FMT_RGB888_2X12_LE ?
> >  EDGE_SEL : 0, EDGE_SEL);
> > +   if (dpi->conf->edge_cfg_in_mmsys)
> > +   mtk_mmsys_ddp_dpi_fmt_config(dpi->mmsys_dev,
> > MTK_DPI_RGB888_DDR_CON);
> 
> Why do you set a DPI driver defined value MTK_DPI_RGB888_DDR_CON into
> mmsys driver? I think you should set a value which mmsys driver
> understand.
> 
> Regards,
> CK
> 
> > } else {
> > mtk_dpi_mask(dpi, DPI_DDR_SETTING, DDR_EN | DDR_4PHASE,
> > 0);
> > }
> > @@ -776,8 +788,10 @@ static int mtk_dpi_bind(struct device *dev,
> > struct device *master, void *data)
> >  {
> > struct mtk_dpi *dpi = dev_get_drvdata(dev);
> > struct drm_device *drm_dev = data;
> > +   struct mtk_drm_private *priv = drm_dev->dev_private;
> > int ret;
> >  
> > +   dpi->mmsys_dev = priv->mmsys_dev;
> > ret = drm_simple_encoder_init(drm_dev, &dpi->encoder,
> >   DRM_MODE_ENCODER_TMDS);
> > if (ret) {
> 
> 
Hi CK:

Thanks for your review.

Yes, different sink ICs may support other output formats. 
The current DRM architecture supports retrieving the output formats of
all bridges (implemented through .atomic_check &
.atomic_get_output_bus_fmts & .atomic_get_input_bus_fmts in dpi).
If no unified output format is found It will use the default format
output of soc (MEDIA_BUS_FMT_RGB888_2X12_LE is used in mt8186).

The difference between MT8186 and other ICs is that when modifying the
output format, we need to modify the mmsys_base + 0x400 register to
take effect. 
Therefore, if there are other format sink ICs (RGB888_DDR/RGB888_SDR)
in the future, the sink IC needs to add the func implementation
mentioned above needs to be added. 
And the drm architecture will select the appropriate format to change
the dpi output.

I will put the registers that control mmsys in mtk-mmsys.h in the next
release.

Best Regards!
Xinlei

Re: [PATCH v6,3/3] drm: mediatek: Add mt8186 dpi compatible to mtk_dpi.c

2022-09-21 Thread xinlei . lee

On Wed, 2022-09-21 at 09:42 +0800, CK Hu wrote:
> Hi, Xinlei:
> 
> On Wed, 2022-09-14 at 21:21 +0800, xinlei@mediatek.com wrote:
> > From: Xinlei Lee 
> > 
> > Add the compatible because use edge_cfg_in_mmsys in mt8186.
> > 
> > Signed-off-by: Xinlei Lee 
> > ---
> >  drivers/gpu/drm/mediatek/mtk_dpi.c | 21 +
> >  drivers/gpu/drm/mediatek/mtk_drm_drv.c |  2 ++
> >  2 files changed, 23 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/mediatek/mtk_dpi.c
> > b/drivers/gpu/drm/mediatek/mtk_dpi.c
> > index 6e02f02f163c..52bcc4114afd 100644
> > --- a/drivers/gpu/drm/mediatek/mtk_dpi.c
> > +++ b/drivers/gpu/drm/mediatek/mtk_dpi.c
> > @@ -942,6 +942,24 @@ static const struct mtk_dpi_conf mt8183_conf =
> > {
> > .csc_enable_bit = CSC_ENABLE,
> >  };
> >  
> > +static const struct mtk_dpi_conf mt8186_conf = {
> > +   .cal_factor = mt8183_calculate_factor,
> > +   .reg_h_fre_con = 0xe0,
> > +   .max_clock_khz = 15,
> > +   .output_fmts = mt8183_output_fmts,
> > +   .num_output_fmts = ARRAY_SIZE(mt8183_output_fmts),
> > +   .edge_cfg_in_mmsys = true,
> 
> According to the previous review, edge_cfg_in_mmsys is not necessary,
> so mt8186_conf is identical to mt8192_conf.
> 
> Regards,
> CK
> 
> > +   .pixels_per_iter = 1,
> > +   .is_ck_de_pol = true,
> > +   .swap_input_support = true,
> > +   .support_direct_pin = true,
> > +   .dimension_mask = HPW_MASK,
> > +   .hvsize_mask = HSIZE_MASK,
> > +   .channel_swap_shift = CH_SWAP,
> > +   .yuv422_en_bit = YUV422_EN,
> > +   .csc_enable_bit = CSC_ENABLE,
> > +};
> > +
> >  static const struct mtk_dpi_conf mt8192_conf = {
> > .cal_factor = mt8183_calculate_factor,
> > .reg_h_fre_con = 0xe0,
> > @@ -1092,6 +1110,9 @@ static const struct of_device_id
> > mtk_dpi_of_ids[] = {
> > { .compatible = "mediatek,mt8183-dpi",
> >   .data = &mt8183_conf,
> > },
> > +   { .compatible = "mediatek,mt8186-dpi",
> > + .data = &mt8186_conf,
> > +   },
> > { .compatible = "mediatek,mt8192-dpi",
> >   .data = &mt8192_conf,
> > },
> > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> > b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> > index 546b79412815..3d32fbc66ac1 100644
> > --- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> > +++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> > @@ -646,6 +646,8 @@ static const struct of_device_id
> > mtk_ddp_comp_dt_ids[] = {
> >   .data = (void *)MTK_DPI },
> > { .compatible = "mediatek,mt8183-dpi",
> >   .data = (void *)MTK_DPI },
> > +   { .compatible = "mediatek,mt8186-dpi",
> > + .data = (void *)MTK_DPI },
> > { .compatible = "mediatek,mt8192-dpi",
> >   .data = (void *)MTK_DPI },
> > { .compatible = "mediatek,mt8195-dp-intf",
> 
> 

Hi CK:

Thanks for your review.

Because the difference between MT8186 and other ICs is that modifying
the output format requires modifying the mmsys register to take effect.

I think this mt8186_conf needs to be reserved so that we can
distinguish it from the settings of other ICs when setting
dpi_dual_edge.
(the edge_cfg_in mmsys flag inside can prevent other ICs from setting
the mmsys register)

Best Regards!
Xinlei

Re: [PATCH v2 00/41] drm: Analog TV Improvements

2022-09-21 Thread Maxime Ripard

On Wed, Sep 07, 2022 at 06:44:53PM +0200, Noralf Trønnes wrote:
> 
> 
> Den 07.09.2022 12.36, skrev Stefan Wahren:
> > Hi Maxime,
> > 
> > Am 05.09.22 um 16:57 schrieb Maxime Ripard:
> >> On Fri, Sep 02, 2022 at 01:28:16PM +0200, Noralf Trønnes wrote:
> >>>
> >>> Den 01.09.2022 21.35, skrev Noralf Trønnes:
> 
>  I have finally found a workaround for my kernel hangs.
> 
>  Dom had a look at my kernel and found that the VideoCore was fine, and
>  he said this:
> 
> > That suggests cause of lockup was on arm side rather than VC side.
> >
> > But it's hard to diagnose further. Once you've had a peripheral not
> > respond, the AXI bus locks up and no further operations are possible.
> > Usual causes of this are required clocks being stopped or domains
> > disabled and then trying to access the hardware.
> >
>  So when I got this on my 64-bit build:
> 
>  [  166.702171] SError Interrupt on CPU1, code 0xbf02 --
>  SError
>  [  166.702187] CPU: 1 PID: 8 Comm: kworker/u8:0 Tainted: G    W
>   5.19.0-rc6-00096-gba7973977976-dirty #1
>  [  166.702200] Hardware name: Raspberry Pi 4 Model B Rev 1.1 (DT)
>  [  166.702206] Workqueue: events_freezable_power_
>  thermal_zone_device_check
>  [  166.702231] pstate: 20c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS
>  BTYPE=--)
>  [  166.702242] pc : regmap_mmio_read32le+0x10/0x28
>  [  166.702261] lr : regmap_mmio_read+0x44/0x70
>  ...
>  [  166.702606]  bcm2711_get_temp+0x58/0xb0 [bcm2711_thermal]
> 
>  I wondered if that reg read was stalled due to a clock being stopped.
> 
>  Lo and behold, disabling runtime pm and keeping the vec clock running
>  all the time fixed it[1].
> 
>  I don't know what the problem is, but at least I can now test this
>  patchset.
> 
>  [1] https://gist.github.com/notro/23b984e7fa05cfbda2db50a421cac065
> 
> >>> It turns out I didn't have to disable runtime pm:
> >>> https://gist.github.com/notro/0adcfcb12460b54e54458afe11dc8ea2
> >> If the bcm2711_thermal IP needs that clock to be enabled, it should grab
> >> a reference itself, but it looks like even the device tree binding
> >> doesn't ask for one.
> > The missing clock in the device tree binding is expected, because
> > despite of the code there is not much information about the BCM2711
> > clock tree. But i'm skeptical that the AVS IP actually needs the VEC
> > clock, i think it's more likely that the VEC clock parent is changed and
> > that cause this issue. I could take care of the bcm2711 binding & driver
> > if i know which clock is really necessary.
> 
> Seems you're right, keeping the parent always enabled is enough:
> 
>   clk_prepare_enable(clk_get_parent(vec->clock)); // pllc_per
> 
> I tried enabling just the grandparent clock as well, but that didn't help.

Yeah, adding tracing to the clock framework shows that it indeed
disables PLLC_PER. So there's probably some other device that depends on
it but doesn't take a reference to it.

I had a look, but it's not really obvious what that might be.

This patch makes sure that the PLL*_PER are never disabled, could you
test it? It seems to work for me.


diff --git a/drivers/clk/bcm/clk-bcm2835.c b/drivers/clk/bcm/clk-bcm2835.c
index 48a1eb9f2d55..3839261ee419 100644
--- a/drivers/clk/bcm/clk-bcm2835.c
+++ b/drivers/clk/bcm/clk-bcm2835.c
@@ -1675,7 +1675,7 @@ static const struct bcm2835_clk_desc clk_desc_array[] = {
.load_mask = CM_PLLA_LOADPER,
.hold_mask = CM_PLLA_HOLDPER,
.fixed_divider = 1,
-   .flags = CLK_SET_RATE_PARENT),
+   .flags = CLK_IS_CRITICAL | CLK_SET_RATE_PARENT),
[BCM2835_PLLA_DSI0] = REGISTER_PLL_DIV(
SOC_ALL,
.name = "plla_dsi0",
@@ -1784,7 +1784,7 @@ static const struct bcm2835_clk_desc clk_desc_array[] = {
.load_mask = CM_PLLC_LOADPER,
.hold_mask = CM_PLLC_HOLDPER,
.fixed_divider = 1,
-   .flags = CLK_SET_RATE_PARENT),
+   .flags = CLK_IS_CRITICAL | CLK_SET_RATE_PARENT),

/*
 * PLLD is the display PLL, used to drive DSI display panels.
@@ -1891,7 +1891,7 @@ static const struct bcm2835_clk_desc clk_desc_array[] = {
.load_mask = CM_PLLH_LOADAUX,
.hold_mask = 0,
.fixed_divider = 1,
-   .flags = CLK_SET_RATE_PARENT),
+   .flags = CLK_IS_CRITICAL | CLK_SET_RATE_PARENT),
[BCM2835_PLLH_PIX]  = REGISTER_PLL_DIV(
SOC_BCM2835,
.name = "pllh_pix",


Maxime


signature.asc
Description: PGP signature

Re: [PATCH v10 3/9] compiler_types.h: Add assert_type to catch type mis-match while compiling

2022-09-21 Thread Gwan-gyeong Mun





On 9/13/22 3:01 PM, Kees Cook wrote:

On Fri, Sep 09, 2022 at 07:59:07PM +0900, Gwan-gyeong Mun wrote:

It adds assert_type and assert_typable macros to catch type mis-match while
compiling. The existing typecheck() macro outputs build warnings, but the
newly added assert_type() macro uses the _Static_assert() keyword (which is
introduced in C11) to generate a build break when the types are different
and can be used to detect explicit build errors.
Unlike the assert_type() macro, assert_typable() macro allows a constant
value as the second argument.

Suggested-by: Kees Cook 
Signed-off-by: Gwan-gyeong Mun 
Cc: Thomas Hellström 
Cc: Matthew Auld 
Cc: Nirmoy Das 
Cc: Jani Nikula 
Cc: Andi Shyti 
Cc: Mauro Carvalho Chehab 
Cc: Andrzej Hajda 
Cc: Kees Cook 
---
  include/linux/compiler_types.h | 39 ++
  1 file changed, 39 insertions(+)

diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 4f2a819fd60a..19cc125918bb 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -294,6 +294,45 @@ struct ftrace_likely_data {
  /* Are two types/vars the same type (ignoring qualifiers)? */
  #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
  
+/**

+ * assert_type - break compile if the first argument's data type and the second
+ *   argument's data type are not the same
+ *
+ * @t1: data type or variable
+ * @t2: data type or variable
+ *
+ * The first and second arguments can be data types or variables or mixed (the
+ * first argument is the data type and the second argument is variable or vice
+ * versa). It determines whether the first argument's data type and the second
+ * argument's data type are the same while compiling, and it breaks compile if
+ * the two types are not the same.
+ * See also assert_typable().
+ */
+#define assert_type(t1, t2) _Static_assert(__same_type(t1, t2))
+
+/**
+ * assert_typable - break compile if the first argument's data type and the
+ *  second argument's data type are not the same
+ *
+ * @t: data type or variable
+ * @n: data type or variable or constant value
+ *
+ * The first and second arguments can be data types or variables or mixed (the
+ * first argument is the data type and the second argument is variable or vice
+ * versa). Unlike the assert_type() macro, this macro allows a constant value
+ * as the second argument. And if the second argument is a constant value, it
+ * always passes. And it doesn't mean that the types are explicitly the same.
+ * When a constant value is used as the second argument, if you need an
+ * overflow check when assigning a constant value to a variable of the type of
+ * the first argument, you can use the overflows_type() macro. When a constant


I wonder if the overflows_type() check should happen in this test? It
seems weird that assert_typable(u8, 1024) would pass...

Yes, that's right. If a constant is used as an argument here, it seems 
necessary to check whether an overflow occurs when the constant value is 
assigned to the target type or target variable.



+ * value is not used as a second argument, it determines whether the first
+ * argument's data type and the second argument's data type are the same while
+ * compiling, and it breaks compile if the two types are not the same.
+ * See also assert_type() and overflows_type().
+ */
+#define assert_typable(t, n) _Static_assert(__builtin_constant_p(n) || \
+   __same_type(t, typeof(n)))


Totally untested -- I'm not sure if this gets the right semantics for
constant expressoins, etc...

static_assert(__builtin_choose_expression(__builtin_constant_p(n), \
overflows_type(n, typeof(t)), \
__same_type(t, typeof(n


However, if we change the macro in the form below, the "error: 
expression in static assertion is not constant" error occurs due to the 
restriction [1][2] of _Static_assert() as you mentioned.
( overflows_type() internally uses the __builtin_add_overflow() builtin 
function [3], which returns a bool type.)


#define assert_same_typable(t, n) static_assert( \
__builtin_choose_expr(__builtin_constant_p(n),   \
  overflows_type(n, typeof(t)) == false, \
  __same_type(t, typeof(n

Can I have your opinion on the new addition of 
overflows_type_return_const_expr(), which returns a constant value at 
compile time to check whether an overflow occurs when assigning a 
constant value to an argument type?
If it is allowable to add the macro,   I would try to use the macro that 
returns "an integer constant expression" after checking for overflow 
between the constant value and the argument type at compile time with 
reference to implemented in the previous version. [4] or [5]


#define assert_same_typable(t, n) static_assert

Re: [PATCH v2 10/41] drm/modes: Add a function to generate analog display modes

2022-09-21 Thread Maxime Ripard

Hi,

Thanks again for your help

On Sun, Sep 11, 2022 at 06:30:39AM +0200, kFYatek wrote:
> W dniu 9.09.2022 o 16:00, Maxime Ripard pisze:
> > On Wed, Sep 07, 2022 at 11:31:21PM +0200, Mateusz Kwiatkowski wrote:
> >> The "canonical" modelines (at least for vc4's VEC, see the notes below):
> >>
> >> - (vfp==4, vsync==6, vbp==39) for 576i
> >> - (vfp==7, vsync==6, vbp==32) for 480i
> >> - (vfp==5, vsync==6, vbp==28) for 486i (full frame NTSC as originally 
> >> specified)
> >
> > It's not clear to me either how you come up with those timings?
> 
> Well, experimentally ;)
> 
> The values for 480i and 576i are the values currently used by the downstream
> kernel, and those in turn has been copied from the firmware (or more 
> precisely,
> I chose them so that the PV registers end up the same as the values set by the
> firmware).
> 
> I also checked with an oscilloscope that the waveforms look as they should.
> VEC doesn't exactly handle the half-lines at the start and end of the odd 
> field
> right, but otherwise, the blanking and sync pulses are where they should be.
> 
> The 486i values has been constructed from the 480i ones according to the
> calculations from cross-referencing SMPTE documents, see my previous e-mail.
> 
> I know this is perhaps unsatisfactory ;) I don't have access to the VC4
> documentation, so this was _almost_ reverse engineering for me.

It's not really that it's unsatisfactory, but the function here is
supposed to be generic and thus generate a mode that is supposed to be a
somewhat reasonable for a given set of parameters.

If the vc4 driver needs some adjustments, then it needs to be out of
that function and into the vc4 driver. And this is pretty much what I
struggle with: I have a hard time (on top of everything else) figuring
out what is supposed to be specific to vc4, and what isn't.

I guess your 480i example, since it follows the spec, is fine, but I'm
not sure for 576i.
Maxime


signature.asc
Description: PGP signature

Re: [PATCH V2 2/3] dt-bindings: display: panel: Add NewVision NV3051D bindings

2022-09-21 Thread Chris Morgan

On Wed, Sep 21, 2022 at 08:51:34AM +0200, Krzysztof Kozlowski wrote:
> On 20/09/2022 16:59, Chris Morgan wrote:
> > From: Chris Morgan 
> > 
> > Add documentation for the NewVision NV3051D panel bindings.
> > Note that for the two expected consumers of this panel binding
> > the underlying LCD model is unknown. Name "anbernic,rg353p-panel"
> > is used because the hardware itself is known as "anbernic,rg353p".
> > 
> > Signed-off-by: Chris Morgan 
> > ---
> >  .../display/panel/newvision,nv3051d.yaml  | 55 +++
> >  1 file changed, 55 insertions(+)
> >  create mode 100644 
> > Documentation/devicetree/bindings/display/panel/newvision,nv3051d.yaml
> > 
> > diff --git 
> > a/Documentation/devicetree/bindings/display/panel/newvision,nv3051d.yaml 
> > b/Documentation/devicetree/bindings/display/panel/newvision,nv3051d.yaml
> > new file mode 100644
> > index ..d90bca4171c2
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/display/panel/newvision,nv3051d.yaml
> > @@ -0,0 +1,55 @@
> > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > +%YAML 1.2
> > +---
> > +$id: 
> > https://nam12.safelinks.protection.outlook.com/?url=http%3A%2F%2Fdevicetree.org%2Fschemas%2Fdisplay%2Fpanel%2Fnewvision%2Cnv3051d.yaml%23&data=05%7C01%7C%7C844872fdf91b413aa65a08da9b9db9e7%7C84df9e7fe9f640afb435%7C1%7C0%7C637993398994635658%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=eTM2IFjR0TKPlQNYfoq3Poao8QYLSHRVaqiXtufJ7VA%3D&reserved=0
> > +$schema: 
> > https://nam12.safelinks.protection.outlook.com/?url=http%3A%2F%2Fdevicetree.org%2Fmeta-schemas%2Fcore.yaml%23&data=05%7C01%7C%7C844872fdf91b413aa65a08da9b9db9e7%7C84df9e7fe9f640afb435%7C1%7C0%7C637993398994635658%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=sjb7x2Z2wKu1C8mMBW1epwuXipe8V26zxpHcCAuKLZY%3D&reserved=0
> > +
> > +title: NewVision NV3051D based LCD panel
> > +
> > +description: |
> > +  The NewVision NV3051D is a driver chip used to drive DSI panels. For now,
> > +  this driver only supports the 640x480 panels found in the Anbernic RG353
> > +  based devices.
> > +
> > +maintainers:
> > +  - Chris Morgan 
> > +
> > +allOf:
> > +  - $ref: panel-common.yaml#
> > +
> > +properties:
> > +  compatible:
> > +items:
> > +  - enum:
> > +  - anbernic,rg353p-panel
> 
> Are these vendor prefixs documented?

Yes, they are in another patch series referenced in the cover letter.
They were added for the Anbernic devicetrees and should (I believe)
land in 6.1.

> 
> > +  - anbernic,rg353v-panel
> > +  - const: newvision,nv3051d
> 
> Blank line.
> 

Ack.

> > +  reg: true
> > +  backlight: true
> > +  port: true
> > +  reset-gpios: true
> > +  vdd-supply:
> > +description: regulator that supplies the vdd voltage
> 
> Skip description and make it just "true". It's kind of obvious.
> 

Ack.

> > +
> > +required:
> > +  - compatible
> > +  - reg
> > +  - backlight
> > +  - vdd-supply
> 
> Best regards,
> Krzysztof

Re: [PATCH 3/7] drm/i915/hwmon: Power PL1 limit and TDP setting

2022-09-21 Thread Nilawar, Badal





On 21-09-2022 17:15, Gupta, Anshuman wrote:



On 9/16/2022 8:30 PM, Badal Nilawar wrote:

From: Dale B Stimson 

Use i915 HWMON to display/modify dGfx power PL1 limit and TDP setting.

v2:
   - Fix review comments (Ashutosh)
   - Do not restore power1_max upon module unload/load sequence
 because on production systems modules are always loaded
 and not unloaded/reloaded (Ashutosh)
   - Fix review comments (Jani)
   - Remove endianness conversion (Ashutosh)
v3: Add power1_rated_max (Ashutosh)
v4:
   - Use macro HWMON_CHANNEL_INFO to define power channel (Guenter)
   - Update the date and kernel version in Documentation (Badal)
v5: Use hwm_ prefix for static functions (Ashutosh)
v6:
   - Fix review comments (Ashutosh)
   - Update date, kernel version in documentation

Cc: Guenter Roeck 
Signed-off-by: Dale B Stimson 
Signed-off-by: Ashutosh Dixit 
Signed-off-by: Riana Tauro 
Signed-off-by: Badal Nilawar 
Acked-by: Guenter Roeck 
---
  .../ABI/testing/sysfs-driver-intel-i915-hwmon |  20 +++
  drivers/gpu/drm/i915/i915_hwmon.c | 158 +-
  drivers/gpu/drm/i915/i915_reg.h   |   5 +
  drivers/gpu/drm/i915/intel_mchbar_regs.h  |   6 +
  4 files changed, 187 insertions(+), 2 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon 
b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon

index e2974f928e58..bc061238e35c 100644
--- a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
+++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
@@ -5,3 +5,23 @@ Contact:    dri-devel@lists.freedesktop.org
  Description:    RO. Current Voltage in millivolt.
  Only supported for particular Intel i915 graphics platforms.
+
+What:    /sys/devices/.../hwmon/hwmon/power1_max
+Date:    September 2022
+KernelVersion:    6
+Contact:    dri-devel@lists.freedesktop.org
+Description:    RW. Card reactive sustained  (PL1/Tau) power limit in 
microwatts.

+
+    The power controller will throttle the operating frequency
+    if the power averaged over a window (typically seconds)
+    exceeds this limit.
+
+    Only supported for particular Intel i915 graphics platforms.
+
+What:    /sys/devices/.../hwmon/hwmon/power1_rated_max
+Date:    September 2022
+KernelVersion:    6
+Contact:    dri-devel@lists.freedesktop.org
+Description:    RO. Card default power limit (default TDP setting).
+
+    Only supported for particular Intel i915 graphics platforms.
diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
b/drivers/gpu/drm/i915/i915_hwmon.c

index 45745afa5c5b..5183cf51a49b 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -16,11 +16,16 @@
  /*
   * SF_* - scale factors for particular quantities according to hwmon 
spec.

   * - voltage  - millivolts
+ * - power  - microwatts
   */
  #define SF_VOLTAGE    1000
+#define SF_POWER    100
  struct hwm_reg {
  i915_reg_t gt_perf_status;
+    i915_reg_t pkg_power_sku_unit;
+    i915_reg_t pkg_power_sku;
+    i915_reg_t pkg_rapl_limit;
  };
  struct hwm_drvdata {
@@ -34,10 +39,68 @@ struct i915_hwmon {
  struct hwm_drvdata ddat;
  struct mutex hwmon_lock;    /* counter overflow logic and 
rmw */

  struct hwm_reg rg;
+    int scl_shift_power;
  };
+static void
+hwm_locked_with_pm_intel_uncore_rmw(struct hwm_drvdata *ddat,
+    i915_reg_t reg, u32 clear, u32 set)
+{
+    struct i915_hwmon *hwmon = ddat->hwmon;
+    struct intel_uncore *uncore = ddat->uncore;
+    intel_wakeref_t wakeref;
+
+    mutex_lock(&hwmon->hwmon_lock);
+
+    with_intel_runtime_pm(uncore->rpm, wakeref)
+    intel_uncore_rmw(uncore, reg, clear, set);
+
+    mutex_unlock(&hwmon->hwmon_lock);
+}
+
+/*
+ * This function's return type of u64 allows for the case where the 
scaling
+ * of the field taken from the 32-bit register value might cause a 
result to

+ * exceed 32 bits.
+ */
+static u64
+hwm_field_read_and_scale(struct hwm_drvdata *ddat, i915_reg_t rgadr,
+ u32 field_msk, int nshift, u32 scale_factor)
+{
+    struct intel_uncore *uncore = ddat->uncore;
+    intel_wakeref_t wakeref;
+    u32 reg_value;
+
+    with_intel_runtime_pm(uncore->rpm, wakeref)
+    reg_value = intel_uncore_read(uncore, rgadr);
+
+    reg_value = REG_FIELD_GET(field_msk, reg_value);
+
+    return mul_u64_u32_shr(reg_value, scale_factor, nshift);
+}
+
+static void
+hwm_field_scale_and_write(struct hwm_drvdata *ddat, i915_reg_t rgadr,
+  u32 field_msk, int nshift,
+  unsigned int scale_factor, long lval)
+{
+    u32 nval;
+    u32 bits_to_clear;
+    u32 bits_to_set;
+
+    /* Computation in 64-bits to avoid overflow. Round to nearest. */
+    nval = DIV_ROUND_CLOSEST_ULL((u64)lval << nshift, scale_factor);
+
+    bits_to_clear = field_msk;
+    bits_to_set = FIELD_PREP(field_msk, nval);
+
+    hwm_locked_with_pm_intel_uncore_rmw(ddat, rgadr,
+    bits_to_clear, bits_to_set);

Re: [PATCH v2 10/41] drm/modes: Add a function to generate analog display modes

2022-09-21 Thread Maxime Ripard

Hi,

On Sun, Sep 11, 2022 at 06:48:50AM +0200, Mateusz Kwiatkowski wrote:
> >> Those extra vbp lines will be treated as a black bar at the top of the 
> >> frame,
> >> and extra vfp lines will be at the bottom of the frame.
> >>
> >> However if someone specifies e.g. 720x604, there's nothing more you could
> >> remove from vfp, so your only option is to reduce vbp compared to the 
> >> standard
> >> mode, so you'll end up with (vfp==4, vsync==6, vbp==11). The image will 
> >> not be
> >> centered, the topmost lines will get cropped out, but that's the best we 
> >> can do
> >> and if someone is requesting such resolution, they most likely want to 
> >> actually
> >> access the VBI to e.g. emit teletext.
> >>
> >> Your current code always starts at (vfp==5 or 6, vsync=6, vbp==6) and then
> >> increases both vfp and vbp proportionately. This puts vsync dead center in 
> >> the
> >> VBI, which is not how it's supposed to be - and that in turn causes the 
> >> image
> >> to be significantly shifted upwards.
> >>
> >> I hope this makes more sense to you now.
> >
> > I'm really struggling with this, so thanks for explaining this further
> > (and patiently ;))
> >
> > If I get this right, what you'd like to change is this part of the
> > calculus (simplified a bit, and using PAL, 576i):
> >
> >   vfp_min = params->vfp_lines.even + params->vfp_lines.odd; // 5
> >   vbp_min = params->vbp_lines.even + params->vbp_lines.odd; // 6
> >   vslen = params->vslen_lines.even + params->vslen_lines.odd; // 6
> >
> >   porches = params->num_lines - vactive - vslen; // 43
> >   porches_rem = porches - vfp_min - vbp_min; // 32
> >
> >   vfp = vfp_min + (porches_rem / 2); // 21
> >   vbp = porches - vfp; // 22
> >
> > Which is indeed having sync centered.
> >
> > I initially changed it to:
> >
> >   vfp = vfp_min; // 6
> >   vbp = num_lines - vactive - vslen - vfp; // 38
> >
> > Which is close enough for 576i, but at 480i/50Hz would end up with 134,
> > so still fairly far off.
> >
> > I guess your suggestion would be along the line of:
> >
> >   vfp_min = params->vfp_lines.even + params->vfp_lines.odd; // 5
> >   vbp_min = params->vbp_lines.even + params->vbp_lines.odd; // 38
> >   vslen = params->vslen_lines.even + params->vslen_lines.odd; // 6
> >
> >   porches = params->num_lines - vactive - vslen; // 0
> >   porches_rem = porches - vfp_min - vbp_min; // 0
> >
> >   vfp = vfp_min + (porches_rem / 2); // 5
> >   vbp = porches - vfp; // 38
> >
> > Which is still close enough for 576i, but for 480i would end up with:
> >
> >   porches = params->num_lines - vactive - vslen; // 139
> >   porches_rem = porches - vfp_min - vbp_min; // 96
> >
> >   vfp = vfp_min + (porches_rem / 2); // 53
> >   vbp = porches - vfp; // 86
> >
> > Right?
> 
> Yes. And if that's supposed to mean 480i in 50 Hz "PAL" mode, that's also
> "close enough" to the values I suggested above.
> 
> If you substitute values for true 60 Hz "NTSC" 480i, you should also get 
> values
> that are "close enough" to the official spec.
> 
> The only thing I'd conceptually change is that the 38 lines is not really
> "vbp_min". It's more like "vbp_typ". As I mentioned above, we may want to 
> lower
> this value if someone wants more active lines than the official 486/576.

porches_rem is an int, so if vactive > (num_lines + vslen + vfp_min +
vbp_min), porches_rem is going to be negative and we'll remove equally
between vfp and vbp to match what's been asked

So I believe this should work fine?

Maxime


signature.asc
Description: PGP signature

Re: [PATCH 5/7] drm/i915/hwmon: Expose card reactive critical power

2022-09-21 Thread Gupta, Anshuman





On 9/16/2022 8:30 PM, Badal Nilawar wrote:

From: Ashutosh Dixit 

Expose the card reactive critical (I1) power. I1 is exposed as
power1_crit in microwatts (typically for client products) or as
curr1_crit in milliamperes (typically for server).

v2: Add curr1_crit functionality (Ashutosh)
v3:
   - Use HWMON_CHANNEL_INFO to define power1_crit, curr1_crit (Badal)
v4: Use hwm_ prefix for static functions (Ashutosh)
v5: Updated date, kernel version in documentation

Cc: Sujaritha Sundaresan 
Signed-off-by: Ashutosh Dixit 
Signed-off-by: Badal Nilawar 
Acked-by: Guenter Roeck 
---
  .../ABI/testing/sysfs-driver-intel-i915-hwmon | 26 +
  drivers/gpu/drm/i915/i915_hwmon.c | 95 ++-
  drivers/gpu/drm/i915/i915_reg.h   |  6 ++
  3 files changed, 126 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon 
b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
index 94101f818a70..cc70596fff44 100644
--- a/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
+++ b/Documentation/ABI/testing/sysfs-driver-intel-i915-hwmon
@@ -26,6 +26,32 @@ Description: RO. Card default power limit (default TDP 
setting).
  
  		Only supported for particular Intel i915 graphics platforms.
  
+What:		/sys/devices/.../hwmon/hwmon/power1_crit

+Date:  September 2022
+KernelVersion: 6
+Contact:   dri-devel@lists.freedesktop.org
+Description:   RW. Card reactive critical (I1) power limit in microwatts.
+
+   Card reactive critical (I1) power limit in microwatts is exposed
+   for client products. The power controller will throttle the
+   operating frequency if the power averaged over a window exceeds
+   this limit.
+
+   Only supported for particular Intel i915 graphics platforms.
+
+What:  /sys/devices/.../hwmon/hwmon/curr1_crit
+Date:  September 2022
+KernelVersion: 6
+Contact:   dri-devel@lists.freedesktop.org
+Description:   RW. Card reactive critical (I1) power limit in milliamperes.
+
+   Card reactive critical (I1) power limit in milliamperes is
+   exposed for server products. The power controller will throttle
+   the operating frequency if the power averaged over a window
+   exceeds this limit.
+
+   Only supported for particular Intel i915 graphics platforms.
+
  What: /sys/devices/.../hwmon/hwmon/energy1_input
  Date: September 2022
  KernelVersion:6
diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
b/drivers/gpu/drm/i915/i915_hwmon.c
index a42cfad78bef..bd9ba312c474 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -11,16 +11,19 @@
  #include "i915_hwmon.h"
  #include "i915_reg.h"
  #include "intel_mchbar_regs.h"
+#include "intel_pcode.h"
  #include "gt/intel_gt_regs.h"
  
  /*

   * SF_* - scale factors for particular quantities according to hwmon spec.
   * - voltage  - millivolts
   * - power  - microwatts
+ * - curr   - milliamperes
   * - energy - microjoules
   */
  #define SF_VOLTAGE1000
  #define SF_POWER  100
+#define SF_CURR1000
  #define SF_ENERGY 100
  
  struct hwm_reg {

@@ -160,11 +163,25 @@ hwm_energy(struct hwm_drvdata *ddat, long *energy)
  
  static const struct hwmon_channel_info *hwm_info[] = {

HWMON_CHANNEL_INFO(in, HWMON_I_INPUT),
-   HWMON_CHANNEL_INFO(power, HWMON_P_MAX | HWMON_P_RATED_MAX),
+   HWMON_CHANNEL_INFO(power, HWMON_P_MAX | HWMON_P_RATED_MAX | 
HWMON_P_CRIT),
HWMON_CHANNEL_INFO(energy, HWMON_E_INPUT),
+   HWMON_CHANNEL_INFO(curr, HWMON_C_CRIT),
NULL
  };
  
+/* I1 is exposed as power_crit or as curr_crit depending on bit 31 */

+static int hwm_pcode_read_i1(struct drm_i915_private *i915, u32 *uval)
+{
+   return snb_pcode_read_p(&i915->uncore, PCODE_POWER_SETUP,
+   POWER_SETUP_SUBCOMMAND_READ_I1, 0, uval);
+}
+
+static int hwm_pcode_write_i1(struct drm_i915_private *i915, u32 uval)
+{
+   return  snb_pcode_write_p(&i915->uncore, PCODE_POWER_SETUP,
+ POWER_SETUP_SUBCOMMAND_WRITE_I1, 0, uval);
+}
+
  static umode_t
  hwm_in_is_visible(const struct hwm_drvdata *ddat, u32 attr)
  {
@@ -198,13 +215,18 @@ hwm_in_read(struct hwm_drvdata *ddat, u32 attr, long *val)
  static umode_t
  hwm_power_is_visible(const struct hwm_drvdata *ddat, u32 attr, int chan)
  {
+   struct drm_i915_private *i915 = ddat->uncore->i915;
struct i915_hwmon *hwmon = ddat->hwmon;
+   u32 uval;
  
  	switch (attr) {

case hwmon_power_max:
return i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit) ? 0664 : 0;
case hwmon_power_rated_max:
return i915_mmio_reg_valid(hwmon->rg.pkg_power_sku) ? 0444 : 0;
+   case hwmon_power_crit:
+   return (hwm_pcode_read_i1(i915, &uval) ||
+   !(uval & POWER_S

Re: [Intel-gfx] [PATCH 1/7] drm/i915/hwmon: Add HWMON infrastructure

2022-09-21 Thread Nilawar, Badal





On 21-09-2022 18:14, Andi Shyti wrote:

Hi Badal,


+struct hwm_reg {
+};
+
+struct hwm_drvdata {
+   struct i915_hwmon *hwmon;
+   struct intel_uncore *uncore;
+   struct device *hwmon_dev;
+   char name[12];
+};
+
+struct i915_hwmon {
+   struct hwm_drvdata ddat;
+   struct mutex hwmon_lock;/* counter overflow logic and 
rmw */
+   struct hwm_reg rg;
+};
+
+static const struct hwmon_channel_info *hwm_info[] = {
+   NULL
+};
+
+static umode_t
+hwm_is_visible(const void *drvdata, enum hwmon_sensor_types type,
+  u32 attr, int channel)
+{
+   switch (type) {
+   default:
+   return 0;
+   }
+}
+
+static int
+hwm_read(struct device *dev, enum hwmon_sensor_types type, u32 attr,
+int channel, long *val)
+{
+   switch (type) {
+   default:
+   return -EOPNOTSUPP;
+   }
+}
+
+static int
+hwm_write(struct device *dev, enum hwmon_sensor_types type, u32 attr,
+ int channel, long val)
+{
+   switch (type) {
+   default:
+   return -EOPNOTSUPP;
+   }
+}
+
+static const struct hwmon_ops hwm_ops = {
+   .is_visible = hwm_is_visible,
+   .read = hwm_read,
+   .write = hwm_write,
+};
+
+static const struct hwmon_chip_info hwm_chip_info = {
+   .ops = &hwm_ops,
+   .info = hwm_info,
+};


what's the point for splitting so much? Can't you just send the
hwmon driver all at once? With this patch you are not actually
doing anything useful. In my opinion this should be squashed with
the next ones.
During discussion in cover letter of rev0 series we decided to create 
separate infrastructure patch, as we wanted to keep kconfig, i915 hwmon 
structures and new file addition in separate patch. Further feature wise 
we kept adding new patches.



+static void
+hwm_get_preregistration_info(struct drm_i915_private *i915)
+{
+}
+
+void i915_hwmon_register(struct drm_i915_private *i915)
+{
+   struct device *dev = i915->drm.dev;
+   struct i915_hwmon *hwmon;
+   struct device *hwmon_dev;
+   struct hwm_drvdata *ddat;
+
+   /* hwmon is available only for dGfx */
+   if (!IS_DGFX(i915))
+   return;
+
+   hwmon = kzalloc(sizeof(*hwmon), GFP_KERNEL);


why don't we use devm_kzalloc?


+   if (!hwmon)
+   return;
+
+   i915->hwmon = hwmon;
+   mutex_init(&hwmon->hwmon_lock);
+   ddat = &hwmon->ddat;
+
+   ddat->hwmon = hwmon;
+   ddat->uncore = &i915->uncore;
+   snprintf(ddat->name, sizeof(ddat->name), "i915");
+
+   hwm_get_preregistration_info(i915);
+
+   /*  hwmon_dev points to device hwmon */
+   hwmon_dev = hwmon_device_register_with_info(dev, ddat->name,
+   ddat,
+   &hwm_chip_info,
+   NULL);
+   if (IS_ERR(hwmon_dev)) {
+   mutex_destroy(&hwmon->hwmon_lock);


there is not such a big need to destroy the mutex. Destroying
mutexes is more useful when you actually are creating/destroying
and there is some debug need. I don't think that's the case.

With the devm_kzalloc this would be just a return.

I think we can switch to devm_kzalloc.

Regards,
Badal


Andi


+   i915->hwmon = NULL;
+   kfree(hwmon);
+   return;
+   }
+
+   ddat->hwmon_dev = hwmon_dev;
+}
+
+void i915_hwmon_unregister(struct drm_i915_private *i915)
+{
+   struct i915_hwmon *hwmon;
+   struct hwm_drvdata *ddat;
+
+   hwmon = fetch_and_zero(&i915->hwmon);
+   if (!hwmon)
+   return;
+
+   ddat = &hwmon->ddat;
+   if (ddat->hwmon_dev)
+   hwmon_device_unregister(ddat->hwmon_dev);
+
+   mutex_destroy(&hwmon->hwmon_lock);
+   kfree(hwmon);
+}
diff --git a/drivers/gpu/drm/i915/i915_hwmon.h 
b/drivers/gpu/drm/i915/i915_hwmon.h
new file mode 100644
index ..7ca9cf2c34c9
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_hwmon.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: MIT */
+
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#ifndef __I915_HWMON_H__
+#define __I915_HWMON_H__
+
+struct drm_i915_private;
+
+#if IS_REACHABLE(CONFIG_HWMON)
+void i915_hwmon_register(struct drm_i915_private *i915);
+void i915_hwmon_unregister(struct drm_i915_private *i915);
+#else
+static inline void i915_hwmon_register(struct drm_i915_private *i915) { };
+static inline void i915_hwmon_unregister(struct drm_i915_private *i915) { };
+#endif
+
+#endif /* __I915_HWMON_H__ */
--
2.25.1

Re: [PATCH V2 2/3] dt-bindings: display: panel: Add NewVision NV3051D bindings

2022-09-21 Thread Krzysztof Kozlowski

On 21/09/2022 16:38, Chris Morgan wrote:
>>> +  compatible:
>>> +items:
>>> +  - enum:
>>> +  - anbernic,rg353p-panel
>>
>> Are these vendor prefixs documented?
> 
> Yes, they are in another patch series referenced in the cover letter.
> They were added for the Anbernic devicetrees and should (I believe)
> land in 6.1.

OK... you still need to test your bindings. Your patch was clearly not
tested before sending. :(

Best regards,
Krzysztof

Re: [Intel-gfx] [PATCH 1/7] drm/i915/hwmon: Add HWMON infrastructure

2022-09-21 Thread Andi Shyti

Hi Badal,

> > > +struct hwm_reg {
> > > +};
> > > +
> > > +struct hwm_drvdata {
> > > + struct i915_hwmon *hwmon;
> > > + struct intel_uncore *uncore;
> > > + struct device *hwmon_dev;
> > > + char name[12];
> > > +};
> > > +
> > > +struct i915_hwmon {
> > > + struct hwm_drvdata ddat;
> > > + struct mutex hwmon_lock;/* counter overflow logic and 
> > > rmw */
> > > + struct hwm_reg rg;
> > > +};
> > > +
> > > +static const struct hwmon_channel_info *hwm_info[] = {
> > > + NULL
> > > +};
> > > +
> > > +static umode_t
> > > +hwm_is_visible(const void *drvdata, enum hwmon_sensor_types type,
> > > +u32 attr, int channel)
> > > +{
> > > + switch (type) {
> > > + default:
> > > + return 0;
> > > + }
> > > +}
> > > +
> > > +static int
> > > +hwm_read(struct device *dev, enum hwmon_sensor_types type, u32 attr,
> > > +  int channel, long *val)
> > > +{
> > > + switch (type) {
> > > + default:
> > > + return -EOPNOTSUPP;
> > > + }
> > > +}
> > > +
> > > +static int
> > > +hwm_write(struct device *dev, enum hwmon_sensor_types type, u32 attr,
> > > +   int channel, long val)
> > > +{
> > > + switch (type) {
> > > + default:
> > > + return -EOPNOTSUPP;
> > > + }
> > > +}
> > > +
> > > +static const struct hwmon_ops hwm_ops = {
> > > + .is_visible = hwm_is_visible,
> > > + .read = hwm_read,
> > > + .write = hwm_write,
> > > +};
> > > +
> > > +static const struct hwmon_chip_info hwm_chip_info = {
> > > + .ops = &hwm_ops,
> > > + .info = hwm_info,
> > > +};
> > 
> > what's the point for splitting so much? Can't you just send the
> > hwmon driver all at once? With this patch you are not actually
> > doing anything useful. In my opinion this should be squashed with
> > the next ones.

> During discussion in cover letter of rev0 series we decided to create
> separate infrastructure patch, as we wanted to keep kconfig, i915 hwmon
> structures and new file addition in separate patch. Further feature wise we
> kept adding new patches.

I don't really like this patch splitting, but it's my fault I
haven't reviewed it already in v1. Please, ignore then.

Andi

Re: [PATCH V2 2/3] dt-bindings: display: panel: Add NewVision NV3051D bindings

2022-09-21 Thread Chris Morgan

On Wed, Sep 21, 2022 at 05:21:19PM +0200, Krzysztof Kozlowski wrote:
> On 21/09/2022 16:38, Chris Morgan wrote:
> >>> +  compatible:
> >>> +items:
> >>> +  - enum:
> >>> +  - anbernic,rg353p-panel
> >>
> >> Are these vendor prefixs documented?
> > 
> > Yes, they are in another patch series referenced in the cover letter.
> > They were added for the Anbernic devicetrees and should (I believe)
> > land in 6.1.
> 
> OK... you still need to test your bindings. Your patch was clearly not
> tested before sending. :(

I did: yamllint, make dt_binding_check (with DT_SCHEMA_FILES specified), and
make dtbs_check (with DT_SCHEMA_FILES specified again). That's the proper
testing flow correct? In this case it's the pre-requisite that's causing
the issue as I see on a pristine master tree I'm warned about the missing
vendor prefix for anbernic. Should I wait for that to go upstream before
I submit this again?

I'll make the other change about the space and the description of the
vdd-supply when I resubmit. Are we good with the panel compatibles? I'm
still not entirely sure the best thing to name them as I have no part
number whatsoever except the driver IC.

Thank you.

> 
> Best regards,
> Krzysztof
>

[PATCH AUTOSEL 5.19 02/16] drm/gma500: Fix BUG: sleeping function called from invalid context errors

2022-09-21 Thread Sasha Levin

From: Hans de Goede 

[ Upstream commit 63e37a79f7bd939314997e29c2f5a9f0ef184281 ]

gma_crtc_page_flip() was holding the event_lock spinlock while calling
crtc_funcs->mode_set_base() which takes ww_mutex.

The only reason to hold event_lock is to clear gma_crtc->page_flip_event
on mode_set_base() errors.

Instead unlock it after setting gma_crtc->page_flip_event and on
errors re-take the lock and clear gma_crtc->page_flip_event it
it is still set.

This fixes the following WARN/stacktrace:

[  512.122953] BUG: sleeping function called from invalid context at 
kernel/locking/mutex.c:870
[  512.123004] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 1253, 
name: gnome-shell
[  512.123031] preempt_count: 1, expected: 0
[  512.123048] RCU nest depth: 0, expected: 0
[  512.123066] INFO: lockdep is turned off.
[  512.123080] irq event stamp: 0
[  512.123094] hardirqs last  enabled at (0): [<>] 0x0
[  512.123134] hardirqs last disabled at (0): [] 
copy_process+0x9fc/0x1de0
[  512.123176] softirqs last  enabled at (0): [] 
copy_process+0x9fc/0x1de0
[  512.123207] softirqs last disabled at (0): [<>] 0x0
[  512.123233] Preemption disabled at:
[  512.123241] [<>] 0x0
[  512.123275] CPU: 3 PID: 1253 Comm: gnome-shell Tainted: GW 
5.19.0+ #1
[  512.123304] Hardware name: Packard Bell dot s/SJE01_CT, BIOS V1.10 07/23/2013
[  512.123323] Call Trace:
[  512.123346]  
[  512.123370]  dump_stack_lvl+0x5b/0x77
[  512.123412]  __might_resched.cold+0xff/0x13a
[  512.123458]  ww_mutex_lock+0x1e/0xa0
[  512.123495]  psb_gem_pin+0x2c/0x150 [gma500_gfx]
[  512.123601]  gma_pipe_set_base+0x76/0x240 [gma500_gfx]
[  512.123708]  gma_crtc_page_flip+0x95/0x130 [gma500_gfx]
[  512.123808]  drm_mode_page_flip_ioctl+0x57d/0x5d0
[  512.123897]  ? drm_mode_cursor2_ioctl+0x10/0x10
[  512.123936]  drm_ioctl_kernel+0xa1/0x150
[  512.123984]  drm_ioctl+0x21f/0x420
[  512.124025]  ? drm_mode_cursor2_ioctl+0x10/0x10
[  512.124070]  ? rcu_read_lock_bh_held+0xb/0x60
[  512.124104]  ? lock_release+0x1ef/0x2d0
[  512.124161]  __x64_sys_ioctl+0x8d/0xd0
[  512.124203]  do_syscall_64+0x58/0x80
[  512.124239]  ? do_syscall_64+0x67/0x80
[  512.124267]  ? trace_hardirqs_on_prepare+0x55/0xe0
[  512.124300]  ? do_syscall_64+0x67/0x80
[  512.124340]  ? rcu_read_lock_sched_held+0x10/0x80
[  512.124377]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  512.124411] RIP: 0033:0x7fcc4a70740f
[  512.124442] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 
00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 
00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[  512.124470] RSP: 002b:7ffda73f5390 EFLAGS: 0246 ORIG_RAX: 
0010
[  512.124503] RAX: ffda RBX: 55cc9e474500 RCX: 7fcc4a70740f
[  512.124524] RDX: 7ffda73f5420 RSI: c01864b0 RDI: 0009
[  512.124544] RBP: 7ffda73f5420 R08: 55cc9c0b0cb0 R09: 0034
[  512.124564] R10:  R11: 0246 R12: c01864b0
[  512.124584] R13: 0009 R14: 55cc9df484d0 R15: 55cc9af5d0c0
[  512.124647]  

Signed-off-by: Hans de Goede 
Signed-off-by: Patrik Jakobsson 
Link: 
https://patchwork.freedesktop.org/patch/msgid/20220906203852.527663-2-hdego...@redhat.com
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/gma500/gma_display.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/gma500/gma_display.c 
b/drivers/gpu/drm/gma500/gma_display.c
index 34ec3fca09ba..12287c9bb4d8 100644
--- a/drivers/gpu/drm/gma500/gma_display.c
+++ b/drivers/gpu/drm/gma500/gma_display.c
@@ -531,15 +531,18 @@ int gma_crtc_page_flip(struct drm_crtc *crtc,
WARN_ON(drm_crtc_vblank_get(crtc) != 0);
 
gma_crtc->page_flip_event = event;
+   spin_unlock_irqrestore(&dev->event_lock, flags);
 
/* Call this locked if we want an event at vblank interrupt. */
ret = crtc_funcs->mode_set_base(crtc, crtc->x, crtc->y, old_fb);
if (ret) {
-   gma_crtc->page_flip_event = NULL;
-   drm_crtc_vblank_put(crtc);
+   spin_lock_irqsave(&dev->event_lock, flags);
+   if (gma_crtc->page_flip_event) {
+   gma_crtc->page_flip_event = NULL;
+   drm_crtc_vblank_put(crtc);
+   }
+   spin_unlock_irqrestore(&dev->event_lock, flags);
}
-
-   spin_unlock_irqrestore(&dev->event_lock, flags);
} else {
ret = crtc_funcs->mode_set_base(crtc, crtc->x, crtc->y, old_fb);
}
-- 
2.35.1

[PATCH AUTOSEL 5.19 04/16] drm/gma500: Fix (vblank) IRQs not working after suspend/resume

2022-09-21 Thread Sasha Levin

From: Hans de Goede 

[ Upstream commit 235fdbc32d559db21e580f85035c59372704f09e ]

Fix gnome-shell (and other page-flip users) hanging after suspend/resume
because of the gma500's IRQs not working.

This fixes 2 problems with the IRQ handling:

1. gma_power_off() calls gma_irq_uninstall() which does a free_irq(), but
   gma_power_on() called gma_irq_preinstall() + gma_irq_postinstall() which
   do not call request_irq. Replace the pre- + post-install calls with
   gma_irq_install() which does prep + request + post.

2. After fixing 1. IRQs still do not work on a Packard Bell Dot SC (Intel
   Atom N2600, cedarview) netbook.

   Cederview uses MSI interrupts and it seems that the BIOS re-configures
   things back to normal APIC based interrupts during S3 suspend. There is
   some MSI PCI-config registers save/restore code which tries to deal with
   this, but on the Packard Bell Dot SC this is not sufficient to restore
   MSI IRQ functionality after a suspend/resume.

   Replace the PCI-config registers save/restore with pci_disable_msi() on
   suspend + pci_enable_msi() on resume. Fixing e.g. gnome-shell hanging.

Signed-off-by: Hans de Goede 
Signed-off-by: Patrik Jakobsson 
Link: 
https://patchwork.freedesktop.org/patch/msgid/20220906203852.527663-4-hdego...@redhat.com
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/gma500/cdv_device.c  |  4 +---
 drivers/gpu/drm/gma500/oaktrail_device.c |  5 +
 drivers/gpu/drm/gma500/power.c   |  8 +---
 drivers/gpu/drm/gma500/psb_drv.c |  2 +-
 drivers/gpu/drm/gma500/psb_drv.h |  5 +
 drivers/gpu/drm/gma500/psb_irq.c | 15 ---
 drivers/gpu/drm/gma500/psb_irq.h |  2 +-
 7 files changed, 18 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/gma500/cdv_device.c 
b/drivers/gpu/drm/gma500/cdv_device.c
index dd32b484dd82..ce96234f3df2 100644
--- a/drivers/gpu/drm/gma500/cdv_device.c
+++ b/drivers/gpu/drm/gma500/cdv_device.c
@@ -581,11 +581,9 @@ static const struct psb_offset cdv_regmap[2] = {
 static int cdv_chip_setup(struct drm_device *dev)
 {
struct drm_psb_private *dev_priv = to_drm_psb_private(dev);
-   struct pci_dev *pdev = to_pci_dev(dev->dev);
INIT_WORK(&dev_priv->hotplug_work, cdv_hotplug_work_func);
 
-   if (pci_enable_msi(pdev))
-   dev_warn(dev->dev, "Enabling MSI failed!\n");
+   dev_priv->use_msi = true;
dev_priv->regmap = cdv_regmap;
gma_get_core_freq(dev);
psb_intel_opregion_init(dev);
diff --git a/drivers/gpu/drm/gma500/oaktrail_device.c 
b/drivers/gpu/drm/gma500/oaktrail_device.c
index 5923a9c89312..f90e628cb482 100644
--- a/drivers/gpu/drm/gma500/oaktrail_device.c
+++ b/drivers/gpu/drm/gma500/oaktrail_device.c
@@ -501,12 +501,9 @@ static const struct psb_offset oaktrail_regmap[2] = {
 static int oaktrail_chip_setup(struct drm_device *dev)
 {
struct drm_psb_private *dev_priv = to_drm_psb_private(dev);
-   struct pci_dev *pdev = to_pci_dev(dev->dev);
int ret;
 
-   if (pci_enable_msi(pdev))
-   dev_warn(dev->dev, "Enabling MSI failed!\n");
-
+   dev_priv->use_msi = true;
dev_priv->regmap = oaktrail_regmap;
 
ret = mid_chip_setup(dev);
diff --git a/drivers/gpu/drm/gma500/power.c b/drivers/gpu/drm/gma500/power.c
index b91de6d36e41..66873085d450 100644
--- a/drivers/gpu/drm/gma500/power.c
+++ b/drivers/gpu/drm/gma500/power.c
@@ -139,8 +139,6 @@ static void gma_suspend_pci(struct pci_dev *pdev)
dev_priv->regs.saveBSM = bsm;
pci_read_config_dword(pdev, 0xFC, &vbt);
dev_priv->regs.saveVBT = vbt;
-   pci_read_config_dword(pdev, PSB_PCIx_MSI_ADDR_LOC, &dev_priv->msi_addr);
-   pci_read_config_dword(pdev, PSB_PCIx_MSI_DATA_LOC, &dev_priv->msi_data);
 
pci_disable_device(pdev);
pci_set_power_state(pdev, PCI_D3hot);
@@ -168,9 +166,6 @@ static bool gma_resume_pci(struct pci_dev *pdev)
pci_restore_state(pdev);
pci_write_config_dword(pdev, 0x5c, dev_priv->regs.saveBSM);
pci_write_config_dword(pdev, 0xFC, dev_priv->regs.saveVBT);
-   /* restoring MSI address and data in PCIx space */
-   pci_write_config_dword(pdev, PSB_PCIx_MSI_ADDR_LOC, dev_priv->msi_addr);
-   pci_write_config_dword(pdev, PSB_PCIx_MSI_DATA_LOC, dev_priv->msi_data);
ret = pci_enable_device(pdev);
 
if (ret != 0)
@@ -223,8 +218,7 @@ int gma_power_resume(struct device *_dev)
mutex_lock(&power_mutex);
gma_resume_pci(pdev);
gma_resume_display(pdev);
-   gma_irq_preinstall(dev);
-   gma_irq_postinstall(dev);
+   gma_irq_install(dev);
mutex_unlock(&power_mutex);
return 0;
 }
diff --git a/drivers/gpu/drm/gma500/psb_drv.c b/drivers/gpu/drm/gma500/psb_drv.c
index 1d8744f3e702..54e756b48606 100644
--- a/drivers/gpu/drm/gma500/psb_drv.c
+++ b/drivers/gpu/drm/gma500/psb_drv.c
@@ -383,7 +383,7 @@ static int psb_driver_load(struct drm_device *dev, unsigned 
long flags)

[PATCH AUTOSEL 5.19 08/16] drm/amdgpu: change the alignment size of TMR BO to 1M

2022-09-21 Thread Sasha Levin

From: Yang Wang 

[ Upstream commit 36de13fdb04abef3ee03ade5129ab146de63983b ]

align TMR BO size TO tmr size is not necessary,
modify the size to 1M to avoid re-create BO fail
when serious VRAM fragmentation.

v2:
add new macro PSP_TMR_ALIGNMENT for TMR BO alignment size

Signed-off-by: Yang Wang 
Reviewed-by: Hawking Zhang 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 2b00f8fe15a8..7b8d4484c3c4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -748,7 +748,7 @@ static int psp_tmr_init(struct psp_context *psp)
}
 
pptr = amdgpu_sriov_vf(psp->adev) ? &tmr_buf : NULL;
-   ret = amdgpu_bo_create_kernel(psp->adev, tmr_size, 
PSP_TMR_SIZE(psp->adev),
+   ret = amdgpu_bo_create_kernel(psp->adev, tmr_size, PSP_TMR_ALIGNMENT,
  AMDGPU_GEM_DOMAIN_VRAM,
  &psp->tmr_bo, &psp->tmr_mc_addr, pptr);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
index e431f4994931..cd366c7f311f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
@@ -36,6 +36,7 @@
 #define PSP_CMD_BUFFER_SIZE0x1000
 #define PSP_1_MEG  0x10
 #define PSP_TMR_SIZE(adev) ((adev)->asic_type == CHIP_ALDEBARAN ? 0x80 
: 0x40)
+#define PSP_TMR_ALIGNMENT  0x10
 #define PSP_FW_NAME_LEN0x24
 
 enum psp_shared_mem_size {
-- 
2.35.1

[PATCH AUTOSEL 5.19 06/16] drm/amd/pm: disable BACO entry/exit completely on several sienna cichlid cards

2022-09-21 Thread Sasha Levin

From: Guchun Chen 

[ Upstream commit 7c6fb61a400bf3218c6504cb2d48858f98822c9d ]

To avoid hardware intermittent failures.

Signed-off-by: Guchun Chen 
Reviewed-by: Lijo Lazar 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 32bb6b1d9526..d13e455c8827 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -368,6 +368,17 @@ static void sienna_cichlid_check_bxco_support(struct 
smu_context *smu)
smu_baco->platform_support =
(val & RCC_BIF_STRAP0__STRAP_PX_CAPABLE_MASK) ? true :
false;
+
+   /*
+* Disable BACO entry/exit completely on below SKUs to
+* avoid hardware intermittent failures.
+*/
+   if (((adev->pdev->device == 0x73A1) &&
+   (adev->pdev->revision == 0x00)) ||
+   ((adev->pdev->device == 0x73BF) &&
+   (adev->pdev->revision == 0xCF)))
+   smu_baco->platform_support = false;
+
}
 }
 
-- 
2.35.1

[PATCH AUTOSEL 5.19 07/16] drm/amdgpu: use dirty framebuffer helper

2022-09-21 Thread Sasha Levin

From: Hamza Mahfooz 

[ Upstream commit 66f99628eb24409cb8feb5061f78283c8b65f820 ]

Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use
drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs
struct.

Signed-off-by: Hamza Mahfooz 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 4dfd6724b3ca..3451147beda3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -493,6 +494,7 @@ bool amdgpu_display_ddc_probe(struct amdgpu_connector 
*amdgpu_connector,
 static const struct drm_framebuffer_funcs amdgpu_fb_funcs = {
.destroy = drm_gem_fb_destroy,
.create_handle = drm_gem_fb_create_handle,
+   .dirty = drm_atomic_helper_dirtyfb,
 };
 
 uint32_t amdgpu_display_supported_domains(struct amdgpu_device *adev,
-- 
2.35.1

[PATCH AUTOSEL 5.19 03/16] drm/gma500: Fix WARN_ON(lock->magic != lock) error

2022-09-21 Thread Sasha Levin

From: Hans de Goede 

[ Upstream commit b6f25c3b94f2aadbf5cbef954db4073614943d74 ]

psb_gem_unpin() calls dma_resv_lock() but the underlying ww_mutex
gets destroyed by drm_gem_object_release() move the
drm_gem_object_release() call in psb_gem_free_object() to after
the unpin to fix the below warning:

[   79.693962] [ cut here ]
[   79.693992] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
[   79.694015] WARNING: CPU: 0 PID: 240 at kernel/locking/mutex.c:582 
__ww_mutex_lock.constprop.0+0x569/0xfb0
[   79.694052] Modules linked in: rfcomm snd_seq_dummy snd_hrtimer qrtr bnep 
ath9k ath9k_common ath9k_hw snd_hda_codec_realtek snd_hda_codec_generic 
ledtrig_audio snd_hda_codec_hdmi snd_hda_intel ath3k snd_intel_dspcfg mac80211 
snd_intel_sdw_acpi btusb snd_hda_codec btrtl btbcm btintel btmtk bluetooth at24 
snd_hda_core snd_hwdep uvcvideo snd_seq libarc4 videobuf2_vmalloc ath 
videobuf2_memops videobuf2_v4l2 videobuf2_common snd_seq_device videodev 
acer_wmi intel_powerclamp coretemp mc snd_pcm joydev sparse_keymap ecdh_generic 
pcspkr wmi_bmof cfg80211 i2c_i801 i2c_smbus snd_timer snd r8169 rfkill lpc_ich 
soundcore acpi_cpufreq zram rtsx_pci_sdmmc mmc_core serio_raw rtsx_pci 
gma500_gfx(E) video wmi ip6_tables ip_tables i2c_dev fuse
[   79.694436] CPU: 0 PID: 240 Comm: plymouthd Tainted: GW   E  
6.0.0-rc3+ #490
[   79.694457] Hardware name: Packard Bell dot s/SJE01_CT, BIOS V1.10 07/23/2013
[   79.694469] RIP: 0010:__ww_mutex_lock.constprop.0+0x569/0xfb0
[   79.694496] Code: ff 85 c0 0f 84 15 fb ff ff 8b 05 ca 3c 11 01 85 c0 0f 85 
07 fb ff ff 48 c7 c6 30 cb 84 aa 48 c7 c7 a3 e1 82 aa e8 ac 29 f8 ff <0f> 0b e9 
ed fa ff ff e8 5b 83 8a ff 85 c0 74 10 44 8b 0d 98 3c 11
[   79.694513] RSP: 0018:ad1dc048bbe0 EFLAGS: 00010282
[   79.694623] RAX: 0028 RBX:  RCX: 
[   79.694636] RDX: 0001 RSI: aa8b0ffc RDI: 
[   79.694650] RBP: ad1dc048bc80 R08:  R09: ad1dc048ba90
[   79.694662] R10: 0003 R11: aad62fe8 R12: 9ff302103138
[   79.694675] R13: 9ff306ec8000 R14: 9ff307779078 R15: 9ff3014c0270
[   79.694690] FS:  7ff1cccf1740() GS:9ff3bc20() 
knlGS:
[   79.694705] CS:  0010 DS:  ES:  CR0: 80050033
[   79.694719] CR2: 559ecbcb4420 CR3: 1321 CR4: 06f0
[   79.694734] Call Trace:
[   79.694749]  
[   79.694761]  ? __schedule+0x47f/0x1670
[   79.694796]  ? psb_gem_unpin+0x27/0x1a0 [gma500_gfx]
[   79.694830]  ? lock_is_held_type+0xe3/0x140
[   79.694864]  ? ww_mutex_lock+0x38/0xa0
[   79.694885]  ? __cond_resched+0x1c/0x30
[   79.694902]  ww_mutex_lock+0x38/0xa0
[   79.694925]  psb_gem_unpin+0x27/0x1a0 [gma500_gfx]
[   79.694964]  psb_gem_unpin+0x199/0x1a0 [gma500_gfx]
[   79.694996]  drm_gem_object_release_handle+0x50/0x60
[   79.695020]  ? drm_gem_object_handle_put_unlocked+0xf0/0xf0
[   79.695042]  idr_for_each+0x4b/0xb0
[   79.695066]  ? _raw_spin_unlock_irqrestore+0x30/0x60
[   79.695095]  drm_gem_release+0x1c/0x30
[   79.695118]  drm_file_free.part.0+0x1ea/0x260
[   79.695150]  drm_release+0x6a/0x120
[   79.695175]  __fput+0x9f/0x260
[   79.695203]  task_work_run+0x59/0xa0
[   79.695227]  do_exit+0x387/0xbe0
[   79.695250]  ? seqcount_lockdep_reader_access.constprop.0+0x82/0x90
[   79.695275]  ? lockdep_hardirqs_on+0x7d/0x100
[   79.695304]  do_group_exit+0x33/0xb0
[   79.695331]  __x64_sys_exit_group+0x14/0x20
[   79.695353]  do_syscall_64+0x58/0x80
[   79.695376]  ? up_read+0x17/0x20
[   79.695401]  ? lock_is_held_type+0xe3/0x140
[   79.695429]  ? asm_exc_page_fault+0x22/0x30
[   79.695450]  ? lockdep_hardirqs_on+0x7d/0x100
[   79.695473]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   79.695493] RIP: 0033:0x7ff1ccefe3f1
[   79.695516] Code: Unable to access opcode bytes at RIP 0x7ff1ccefe3c7.
[   79.695607] RSP: 002b:7ffed4413378 EFLAGS: 0246 ORIG_RAX: 
00e7
[   79.695629] RAX: ffda RBX: 7ff1cd0159e0 RCX: 7ff1ccefe3f1
[   79.695644] RDX: 003c RSI: 00e7 RDI: 
[   79.695656] RBP:  R08: ff80 R09: 7ff1cd020b20
[   79.695671] R10:  R11: 0246 R12: 7ff1cd0159e0
[   79.695684] R13:  R14: 7ff1cd01aee8 R15: 7ff1cd01af00
[   79.695733]  
[   79.695746] irq event stamp: 725979
[   79.695757] hardirqs last  enabled at (725979): [] 
finish_task_switch.isra.0+0xe4/0x3f0
[   79.695780] hardirqs last disabled at (725978): [] 
__schedule+0xdd3/0x1670
[   79.695803] softirqs last  enabled at (725974): [] 
__irq_exit_rcu+0xed/0x160
[   79.695825] softirqs last disabled at (725969): [] 
__irq_exit_rcu+0xed/0x160
[   79.695845] ---[ end trace  ]---

Signed-off-by: Hans de Goede 
Signed-off-by: Patrik Jakobsson 
Link: 
https://patchwork.freedesktop.org/patch/msgid/20220906203852.527663-3-hdego...@redhat.com
Si

[PATCH AUTOSEL 5.15 05/10] drm/amdgpu: use dirty framebuffer helper

2022-09-21 Thread Sasha Levin

From: Hamza Mahfooz 

[ Upstream commit 66f99628eb24409cb8feb5061f78283c8b65f820 ]

Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use
drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs
struct.

Signed-off-by: Hamza Mahfooz 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 5c08047adb59..47fb722ab374 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -490,6 +491,7 @@ bool amdgpu_display_ddc_probe(struct amdgpu_connector 
*amdgpu_connector,
 static const struct drm_framebuffer_funcs amdgpu_fb_funcs = {
.destroy = drm_gem_fb_destroy,
.create_handle = drm_gem_fb_create_handle,
+   .dirty = drm_atomic_helper_dirtyfb,
 };
 
 uint32_t amdgpu_display_supported_domains(struct amdgpu_device *adev,
-- 
2.35.1

[PATCH AUTOSEL 5.19 12/16] drm/amd/display: Reduce number of arguments of dml31's CalculateWatermarksAndDRAMSpeedChangeSupport()

2022-09-21 Thread Sasha Levin

From: Nathan Chancellor 

[ Upstream commit 37934d4118e22bceb80141804391975078f31734 ]

Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer. This reduces
the total amount of stack space that
dml31_ModeSupportAndSystemConfigurationFull() uses by 240 bytes with
LLVM 16 (2216 -> 1976), helping clear up the following clang warning:

  
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6:
 error: stack frame size (2216) exceeds limit (2048) in 
'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib 
*mode_lib)
  ^
  1 error generated.

Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" 
Tested-by: Maíra Canal 
Reviewed-by: Rodrigo Siqueira 
Signed-off-by: Nathan Chancellor 
Signed-off-by: Rodrigo Siqueira 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../dc/dml/dcn31/display_mode_vba_31.c| 248 --
 1 file changed, 52 insertions(+), 196 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
index e4b9fd31223c..586825d85d66 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
@@ -321,64 +321,28 @@ static void CalculateVupdateAndDynamicMetadataParameters(
 static void CalculateWatermarksAndDRAMSpeedChangeSupport(
struct display_mode_lib *mode_lib,
unsigned int PrefetchMode,
-   unsigned int NumberOfActivePlanes,
-   unsigned int MaxLineBufferLines,
-   unsigned int LineBufferSize,
-   unsigned int WritebackInterfaceBufferSize,
double DCFCLK,
double ReturnBW,
-   bool SynchronizedVBlank,
-   unsigned int dpte_group_bytes[],
-   unsigned int MetaChunkSize,
double UrgentLatency,
double ExtraLatency,
-   double WritebackLatency,
-   double WritebackChunkSize,
double SOCCLK,
-   double DRAMClockChangeLatency,
-   double SRExitTime,
-   double SREnterPlusExitTime,
-   double SRExitZ8Time,
-   double SREnterPlusExitZ8Time,
double DCFCLKDeepSleep,
unsigned int DETBufferSizeY[],
unsigned int DETBufferSizeC[],
unsigned int SwathHeightY[],
unsigned int SwathHeightC[],
-   unsigned int LBBitPerPixel[],
double SwathWidthY[],
double SwathWidthC[],
-   double HRatio[],
-   double HRatioChroma[],
-   unsigned int vtaps[],
-   unsigned int VTAPsChroma[],
-   double VRatio[],
-   double VRatioChroma[],
-   unsigned int HTotal[],
-   double PixelClock[],
-   unsigned int BlendingAndTiming[],
unsigned int DPPPerPlane[],
double BytePerPixelDETY[],
double BytePerPixelDETC[],
-   double DSTXAfterScaler[],
-   double DSTYAfterScaler[],
-   bool WritebackEnable[],
-   enum source_format_class WritebackPixelFormat[],
-   double WritebackDestinationWidth[],
-   double WritebackDestinationHeight[],
-   double WritebackSourceHeight[],
bool UnboundedRequestEnabled,
int unsigned CompressedBufferSizeInkByte,
enum clock_change_support *DRAMClockChangeSupport,
-   double *UrgentWatermark,
-   double *WritebackUrgentWatermark,
-   double *DRAMClockChangeWatermark,
-   double *WritebackDRAMClockChangeWatermark,
double *StutterExitWatermark,
double *StutterEnterPlusExitWatermark,
double *Z8StutterExitWatermark,
-   double *Z8StutterEnterPlusExitWatermark,
-   double *MinActiveDRAMClockChangeLatencySupported);
+   double *Z8StutterEnterPlusExitWatermark);
 
 static void CalculateDCFCLKDeepSleep(
struct display_mode_lib *mode_lib,
@@ -3027,64 +2991,28 @@ static void 
DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
CalculateWatermarksAndDRAMSpeedChangeSupport(
mode_lib,
PrefetchMode,
-   v->NumberOfActivePlanes,
-   v->MaxLineBufferLines,
-   v->LineBufferSize,
-   v->WritebackInterfaceBufferSize,
v->DCFCLK,

[PATCH AUTOSEL 5.19 09/16] drm/amdgpu: add HDP remap functionality to nbio 7.7

2022-09-21 Thread Sasha Levin

From: Alex Deucher 

[ Upstream commit 8c5708d3da37b8c7c3c22c7e945b9a76a7c9539b ]

Was missing before and would have resulted in a write to
a non-existant register. Normally APUs don't use HDP, but
other asics could use this code and APUs do use the HDP
when used in passthrough.

Reviewed-by: Lijo Lazar 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c 
b/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c
index cdc0c9779848..6c1fd471a4c7 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c
@@ -28,6 +28,14 @@
 #include "nbio/nbio_7_7_0_sh_mask.h"
 #include 
 
+static void nbio_v7_7_remap_hdp_registers(struct amdgpu_device *adev)
+{
+   WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_MEM_FLUSH_CNTL,
+adev->rmmio_remap.reg_offset + 
KFD_MMIO_REMAP_HDP_MEM_FLUSH_CNTL);
+   WREG32_SOC15(NBIO, 0, regBIF_BX0_REMAP_HDP_REG_FLUSH_CNTL,
+adev->rmmio_remap.reg_offset + 
KFD_MMIO_REMAP_HDP_REG_FLUSH_CNTL);
+}
+
 static u32 nbio_v7_7_get_rev_id(struct amdgpu_device *adev)
 {
u32 tmp;
@@ -237,4 +245,5 @@ const struct amdgpu_nbio_funcs nbio_v7_7_funcs = {
.ih_doorbell_range = nbio_v7_7_ih_doorbell_range,
.ih_control = nbio_v7_7_ih_control,
.init_registers = nbio_v7_7_init_registers,
+   .remap_hdp_registers = nbio_v7_7_remap_hdp_registers,
 };
-- 
2.35.1

[PATCH AUTOSEL 5.19 15/16] drm/rockchip: Fix return type of cdn_dp_connector_mode_valid

2022-09-21 Thread Sasha Levin

From: Nathan Huckleberry 

[ Upstream commit b0b9408f132623dc88e78adb5282f74e4b64bb57 ]

The mode_valid field in drm_connector_helper_funcs is expected to be of
type:
enum drm_mode_status (* mode_valid) (struct drm_connector *connector,
 struct drm_display_mode *mode);

The mismatched return type breaks forward edge kCFI since the underlying
function definition does not match the function hook definition.

The return type of cdn_dp_connector_mode_valid should be changed from
int to enum drm_mode_status.

Reported-by: Dan Carpenter 
Link: https://github.com/ClangBuiltLinux/linux/issues/1703
Cc: l...@lists.linux.dev
Signed-off-by: Nathan Huckleberry 
Reviewed-by: Nathan Chancellor 
Signed-off-by: Heiko Stuebner 
Link: 
https://patchwork.freedesktop.org/patch/msgid/2022091320.155149-1-nh...@google.com
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/rockchip/cdn-dp-core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/cdn-dp-core.c 
b/drivers/gpu/drm/rockchip/cdn-dp-core.c
index c204e9b95c1f..518ee13b1d6f 100644
--- a/drivers/gpu/drm/rockchip/cdn-dp-core.c
+++ b/drivers/gpu/drm/rockchip/cdn-dp-core.c
@@ -283,8 +283,9 @@ static int cdn_dp_connector_get_modes(struct drm_connector 
*connector)
return ret;
 }
 
-static int cdn_dp_connector_mode_valid(struct drm_connector *connector,
-  struct drm_display_mode *mode)
+static enum drm_mode_status
+cdn_dp_connector_mode_valid(struct drm_connector *connector,
+   struct drm_display_mode *mode)
 {
struct cdn_dp_device *dp = connector_to_dp(connector);
struct drm_display_info *display_info = &dp->connector.display_info;
-- 
2.35.1

[PATCH AUTOSEL 5.19 14/16] drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack usage

2022-09-21 Thread Sasha Levin

From: Nathan Chancellor 

[ Upstream commit 41012d715d5d7b9751ae84b8fb255e404ac9c5d0 ]

This function consumes a lot of stack space and it blows up the size of
dml30_ModeSupportAndSystemConfigurationFull() with clang:

  
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3542:6:
 error: stack frame size (2200) exceeds limit (2048) in 
'dml30_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib 
*mode_lib)
   ^
  1 error generated.

Commit a0f7e7f759cf ("drm/amd/display: fix i386 frame size warning")
aimed to address this for i386 but it did not help x86_64.

To reduce the amount of stack space that
dml30_ModeSupportAndSystemConfigurationFull() uses, mark
UseMinimumDCFCLK() as noinline, using the _for_stack variant for
documentation. While this will increase the total amount of stack usage
between the two functions (1632 and 1304 bytes respectively), it will
make sure both stay below the limit of 2048 bytes for these files. The
aforementioned change does help reduce UseMinimumDCFCLK()'s stack usage
so it should not be reverted in favor of this change.

Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" 
Tested-by: Maíra Canal 
Reviewed-by: Rodrigo Siqueira 
Signed-off-by: Nathan Chancellor 
Signed-off-by: Rodrigo Siqueira 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
index f47d82da115c..42a567e71439 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
@@ -6651,8 +6651,7 @@ static double CalculateUrgentLatency(
return ret;
 }
 
-
-static void UseMinimumDCFCLK(
+static noinline_for_stack void UseMinimumDCFCLK(
struct display_mode_lib *mode_lib,
int MaxInterDCNTileRepeaters,
int MaxPrefetchMode,
-- 
2.35.1

[PATCH AUTOSEL 5.19 13/16] drm/amd/display: Reduce number of arguments of dml31's CalculateFlipSchedule()

2022-09-21 Thread Sasha Levin

From: Nathan Chancellor 

[ Upstream commit 21485d3da659b66c37d99071623af83ee1c6733d ]

Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer. This reduces
the total amount of stack space that
dml31_ModeSupportAndSystemConfigurationFull() uses by 112 bytes with
LLVM 16 (1976 -> 1864), helping clear up the following clang warning:

  
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6:
 error: stack frame size (2216) exceeds limit (2048) in 
'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib 
*mode_lib)
  ^
  1 error generated.

Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" 
Tested-by: Maíra Canal 
Reviewed-by: Rodrigo Siqueira 
Signed-off-by: Nathan Chancellor 
Signed-off-by: Rodrigo Siqueira 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../dc/dml/dcn31/display_mode_vba_31.c| 172 +-
 1 file changed, 47 insertions(+), 125 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
index 586825d85d66..40a672236198 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
@@ -261,33 +261,13 @@ static void CalculateRowBandwidth(
 
 static void CalculateFlipSchedule(
struct display_mode_lib *mode_lib,
+   unsigned int k,
double HostVMInefficiencyFactor,
double UrgentExtraLatency,
double UrgentLatency,
-   unsigned int GPUVMMaxPageTableLevels,
-   bool HostVMEnable,
-   unsigned int HostVMMaxNonCachedPageTableLevels,
-   bool GPUVMEnable,
-   double HostVMMinPageSize,
double PDEAndMetaPTEBytesPerFrame,
double MetaRowBytes,
-   double DPTEBytesPerRow,
-   double BandwidthAvailableForImmediateFlip,
-   unsigned int TotImmediateFlipBytes,
-   enum source_format_class SourcePixelFormat,
-   double LineTime,
-   double VRatio,
-   double VRatioChroma,
-   double Tno_bw,
-   bool DCCEnable,
-   unsigned int dpte_row_height,
-   unsigned int meta_row_height,
-   unsigned int dpte_row_height_chroma,
-   unsigned int meta_row_height_chroma,
-   double *DestinationLinesToRequestVMInImmediateFlip,
-   double *DestinationLinesToRequestRowInImmediateFlip,
-   double *final_flip_bw,
-   bool *ImmediateFlipSupportedForPipe);
+   double DPTEBytesPerRow);
 static double CalculateWriteBackDelay(
enum source_format_class WritebackPixelFormat,
double WritebackHRatio,
@@ -2878,33 +2858,13 @@ static void 
DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
for (k = 0; k < v->NumberOfActivePlanes; ++k) {
CalculateFlipSchedule(
mode_lib,
+   k,
HostVMInefficiencyFactor,
v->UrgentExtraLatency,
v->UrgentLatency,
-   v->GPUVMMaxPageTableLevels,
-   v->HostVMEnable,
-   
v->HostVMMaxNonCachedPageTableLevels,
-   v->GPUVMEnable,
-   v->HostVMMinPageSize,
v->PDEAndMetaPTEBytesFrame[k],
v->MetaRowByte[k],
-   v->PixelPTEBytesPerRow[k],
-   
v->BandwidthAvailableForImmediateFlip,
-   v->TotImmediateFlipBytes,
-   v->SourcePixelFormat[k],
-   v->HTotal[k] / v->PixelClock[k],
-   v->VRatio[k],
-   v->VRatioChroma[k],
-   v->Tno_bw[k],
-   v->DCCEnable[k],
-   v->dpte_row_height[k],
-   v->meta_row_height[k],
-   v->dpte_row_hei

[PATCH AUTOSEL 5.19 10/16] drm/amdgpu: Skip reset error status for psp v13_0_0

2022-09-21 Thread Sasha Levin

From: Candice Li 

[ Upstream commit 86875d558b91cb46f43be112799c06ecce60ec1e ]

No need to reset error status since only umc ras supported on psp v13_0_0.

Signed-off-by: Candice Li 
Reviewed-by: Hawking Zhang 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index dac202ae864d..9193ca5d6fe7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1805,7 +1805,8 @@ static void amdgpu_ras_log_on_err_counter(struct 
amdgpu_device *adev)
amdgpu_ras_query_error_status(adev, &info);
 
if (adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 2) &&
-   adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 4)) {
+   adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 4) &&
+   adev->ip_versions[MP0_HWIP][0] != IP_VERSION(13, 0, 0)) {
if (amdgpu_ras_reset_error_status(adev, 
info.head.block))
dev_warn(adev->dev, "Failed to reset error 
counter and error status");
}
-- 
2.35.1

1 2 >

1 - 100 of 169 matches

Mail list logo