[bug report] vmwgfx: Implement fence objects

2021-09-10 Thread Dan Carpenter
Hello Thomas Hellstrom,

The patch ae2a104058e2: "vmwgfx: Implement fence objects" from Sep 1,
2011, leads to the following
Smatch static checker warning:

drivers/dma-buf/dma-fence.c:790 dma_fence_default_wait()
warn: user controlled unbound timeout

drivers/gpu/drm/vmwgfx/vmwgfx_fence.c
   784  int vmw_fence_obj_wait_ioctl(struct drm_device *dev, void *data,
   785   struct drm_file *file_priv)
   786  {
   787  struct drm_vmw_fence_wait_arg *arg =
   788  (struct drm_vmw_fence_wait_arg *)data;
   789  unsigned long timeout;
   790  struct ttm_base_object *base;
   791  struct vmw_fence_obj *fence;
   792  struct ttm_object_file *tfile = vmw_fpriv(file_priv)->tfile;
   793  int ret;
   794  uint64_t wait_timeout = ((uint64_t)arg->timeout_us * HZ);

timeout comes from the ioctl.

   795  
   796  /*
   797   * 64-bit division not present on 32-bit systems, so do an
   798   * approximation. (Divide by 100).
   799   */
   800  
   801  wait_timeout = (wait_timeout >> 20) + (wait_timeout >> 24) -
   802  (wait_timeout >> 26);
   803  
   804  if (!arg->cookie_valid) {
   805  arg->cookie_valid = 1;
   806  arg->kernel_cookie = jiffies + wait_timeout;
   807  }
   808  
   809  base = vmw_fence_obj_lookup(tfile, arg->handle);
   810  if (IS_ERR(base))
   811  return PTR_ERR(base);
   812  
   813  fence = &(container_of(base, struct vmw_user_fence, base)->fence);
   814  
   815  timeout = jiffies;
   816  if (time_after_eq(timeout, (unsigned long)arg->kernel_cookie)) {
   817  ret = ((vmw_fence_obj_signaled(fence)) ?
   818 0 : -EBUSY);
   819  goto out;
   820  }
   821  
   822  timeout = (unsigned long)arg->kernel_cookie - timeout;
   823  
   824  ret = vmw_fence_obj_wait(fence, arg->lazy, true, timeout);

This is a new Smatch warning, meant to try to figure out places which
can trigger syzbot "task hung" warnings.  I don't know if an upper bound
on the timeout is appropriate here because this is a new, experimental
check...

   825  
   826  out:
   827  ttm_base_object_unref(&base);

regards,
dan carpenter


Re: i915 ttm_tt shmem backend

2021-09-10 Thread Thomas Hellström

Hi,

On 9/9/21 4:56 PM, Matthew Auld wrote:

Hi Christian,

We are looking into using shmem as a ttm_tt backend in i915 for cached
system memory objects. We would also like to make such objects visible
to the i915-gem shrinker, so that they may be swapped out or discarded
when under memory pressure.

One idea for handling this is roughly something like:
- Add a new TTM_PAGE_FLAG_SHMEM flag, or similar.
- Skip the ttm_pages_allocated accounting on such objects, similar to
how FLAG_SG is already handled.
- Skip all the page->mapping and page->index related bits, like in
tt_add_mapping, since it looks like these are set and used by shmem.
Not sure what functionally this might break, but looks like it's maybe
only driver specific?


IIRC the page->mapping and index are needed when doing dirty-tracking 
using mkwrite, and were used by vmwgfx at some point when doing fb_defio 
on top of TTM buffers. I don't think vmwgfx does that anymore, but it 
still does dirty-tracking.


/Thomas




Re: i915 ttm_tt shmem backend

2021-09-10 Thread Christian König

On 09.09.21 at 18:56, Matthew Auld wrote:

On Thu, 9 Sept 2021 at 17:43, Koenig, Christian
 wrote:

Hi Matthew,

this doesn't work, I've already tried something similar.

TTM uses the reverse lookup functionality when migrating BOs between system and 
device memory. And that doesn't seem to work with pages from a shmem file.

Hmm, what do you mean by reverse lookup functionality? Could you
please point out where that is in the TTM code?


When TTM moves a buffer it must make sure that the buffer is not 
accessed by the CPU while moving it.


For this the standard reverse lookup functionality of the Linux kernel 
is used to figure out in which page tables a page is mapped and to mark 
those entries as invalid. Accessing the buffer object will then cause a 
page fault which in turn waits for the buffer move to finish.


But when you back the pages in a TT object with pages from a shmem file, 
this reverse lookup functionality doesn't work for some reason. I 
couldn't figure out what exactly was going wrong and didn't look 
deeper; I assumed it's because page->mapping and page->index were not 
set up correctly. Thomas or Daniel might know more.


Apart from that your approach sounds like pretty much what I tried as well.

Regards,
Christian.




Regards,
Christian.


From: Matthew Auld 
Sent: Thursday, 9 September 2021 16:56
To: Christian König ; Koenig, Christian 
Cc: Thomas Hellström ; ML dri-devel 
Subject: i915 ttm_tt shmem backend

Hi Christian,

We are looking into using shmem as a ttm_tt backend in i915 for cached
system memory objects. We would also like to make such objects visible
to the i915-gem shrinker, so that they may be swapped out or discarded
when under memory pressure.

One idea for handling this is roughly something like:
- Add a new TTM_PAGE_FLAG_SHMEM flag, or similar.
- Skip the ttm_pages_allocated accounting on such objects, similar to
how FLAG_SG is already handled.
- Skip all the page->mapping and page->index related bits, like in
tt_add_mapping, since it looks like these are set and used by shmem.
Not sure what functionally this might break, but looks like it's maybe
only driver specific?
- Skip calling into ttm_bo_swap_out/in and just have
ttm_populate/unpopulate handle this directly for such objects.
- Make such objects visible to the i915-gem shrinker.

Does this approach look acceptable?




Re: i915 ttm_tt shmem backend

2021-09-10 Thread Thomas Hellström
Perhaps some background and goal is worth mentioning here.


On Thu, 2021-09-09 at 17:56 +0100, Matthew Auld wrote:
> On Thu, 9 Sept 2021 at 17:43, Koenig, Christian
>  wrote:
> > 
> > Hi Matthew,
> > 
> > this doesn't work, I've already tried something similar.
> > 
> > TTM uses the reverse lookup functionality when migrating BOs
> > between system and device memory. And that doesn't seem to work
> > with pages from a shmem file.
> 
> Hmm, what do you mean by reverse lookup functionality? Could you
> please point out where that is in the TTM code?

I think this is in unmap_mapping_range() where, if we use VM_MIXEDMAP,
there is a reverse lookup on the PTEs that point to real pages. Now
that we move over to VM_PFNMAP, that problem should go away since core
vm never has a page to investigate. Probably this is why things work
on non-TTM i915 GEM.

@Christian: Some background here:
First, I think that there might be things like the above that will pose
problems, and we may or may not be able to overcome those, but more
important is that we agree with you that *if* we make it work, it is
something that you as a maintainer of TTM can accept from a design and
maintainability point of view.

The approach would be similar to the buddy allocator, we adapt some
driver code to TTM in a way that it may be reused with other drivers,
and if other drivers are interested, we'd assist in moving to core TTM.
In essence it'd be a TTM shmem page pool with full shrinking ability
for cached pages only.

What we're really after here is the ability to shrink that doesn't
regress much w.r.t. the elaborate shrinker that's in i915 today, which
is power-management aware and is also able to start shmem writebacks to
avoid shmem just caching the pages instead of giving them back to the
system (IIRC it was partly the lack of this that blocked earlier TTM
shrinking efforts).

And since it doesn't really matter whether the shrinker sits in core
TTM or in a driver, I think a future goal might be a set of TTM
shrinker helpers that makes sure we shrink the right TTM object, and
perhaps a simple implementation that is typically used by simple
drivers and other drivers can build on that for a more elaborate power-
management aware shrinker.

/Thomas



> 
> > 
> > Regards,
> > Christian.
> > 
> > 
> > From: Matthew Auld 
> > Sent: Thursday, 9 September 2021 16:56
> > To: Christian König ; Koenig, Christian 
> > Cc: Thomas Hellström ; ML dri-devel 
> > Subject: i915 ttm_tt shmem backend
> > 
> > Hi Christian,
> > 
> > We are looking into using shmem as a ttm_tt backend in i915 for
> > cached
> > system memory objects. We would also like to make such objects
> > visible
> > to the i915-gem shrinker, so that they may be swapped out or
> > discarded
> > when under memory pressure.
> > 
> > One idea for handling this is roughly something like:
> > - Add a new TTM_PAGE_FLAG_SHMEM flag, or similar.
> > - Skip the ttm_pages_allocated accounting on such objects, similar
> > to
> > how FLAG_SG is already handled.
> > - Skip all the page->mapping and page->index related bits, like in
> > tt_add_mapping, since it looks like these are set and used by
> > shmem.
> > Not sure what functionally this might break, but looks like it's
> > maybe
> > only driver specific?
> > - Skip calling into ttm_bo_swap_out/in and just have
> > ttm_populate/unpopulate handle this directly for such objects.
> > - Make such objects visible to the i915-gem shrinker.
> > 
> > Does this approach look acceptable?




Re: [Intel-gfx] [PATCH v8 10/17] drm/i915/pxp: interfaces for using protected objects

2021-09-10 Thread Daniele Ceraolo Spurio




On 9/9/2021 2:07 PM, Rodrigo Vivi wrote:

On Thu, Sep 09, 2021 at 05:29:08AM -0700, Daniele Ceraolo Spurio wrote:

This API allows user mode to create protected buffers and to mark
contexts as making use of such objects. Only when using contexts
marked in such a way is the execution guaranteed to work as expected.

Contexts can only be marked as using protected content at creation time
(i.e. the parameter is immutable) and they must be both bannable and not
recoverable. Given that the protected session gets invalidated on
suspend, contexts created this way hold a runtime pm wakeref until
they're either destroyed or invalidated.

All protected objects and contexts will be considered invalid when the
PXP session is destroyed and all new submissions using them will be
rejected. All intel contexts within the invalidated gem contexts will be
marked banned. Userspace can detect that an invalidation has occurred via
the RESET_STATS ioctl, where we report it the same way as a ban due to a
hang.

v5: squash patches, rebase on proto_ctx, update kerneldoc

v6: rebase on obj create_ext changes

v7: Use session counter to check if an object is valid, hold wakeref in
 context, don't add a new flag to RESET_STATS (Daniel)

v8: don't increase guilty count for contexts banned during pxp
 invalidation (Rodrigo)

Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Bommu Krishnaiah 
Cc: Rodrigo Vivi 
Cc: Chris Wilson 
Cc: Lionel Landwerlin 
Cc: Jason Ekstrand 
Cc: Daniel Vetter 
---
  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 98 ---
  drivers/gpu/drm/i915/gem/i915_gem_context.h   |  6 ++
  .../gpu/drm/i915/gem/i915_gem_context_types.h | 28 ++
  drivers/gpu/drm/i915/gem/i915_gem_create.c| 72 ++
  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 18 
  drivers/gpu/drm/i915/gem/i915_gem_object.c|  1 +
  drivers/gpu/drm/i915/gem/i915_gem_object.h|  6 ++
  .../gpu/drm/i915/gem/i915_gem_object_types.h  |  8 ++
  .../gpu/drm/i915/gem/selftests/mock_context.c |  4 +-
  drivers/gpu/drm/i915/pxp/intel_pxp.c  | 77 +++
  drivers/gpu/drm/i915/pxp/intel_pxp.h  | 12 +++
  drivers/gpu/drm/i915/pxp/intel_pxp_session.c  |  6 ++
  drivers/gpu/drm/i915/pxp/intel_pxp_types.h|  9 ++
  include/uapi/drm/i915_drm.h   | 52 +-
  14 files changed, 362 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index c2ab0e22db0a..8d2d4dbdab7c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -77,6 +77,8 @@
  #include "gt/intel_gpu_commands.h"
  #include "gt/intel_ring.h"
  
+#include "pxp/intel_pxp.h"

+
  #include "i915_gem_context.h"
  #include "i915_trace.h"
  #include "i915_user_extensions.h"
@@ -186,10 +188,13 @@ static int validate_priority(struct drm_i915_private 
*i915,
return 0;
  }
  
-static void proto_context_close(struct i915_gem_proto_context *pc)

+static void proto_context_close(struct drm_i915_private *i915,
+   struct i915_gem_proto_context *pc)
  {
int i;
  
+	if (pc->pxp_wakeref)

+   intel_runtime_pm_put(&i915->runtime_pm, pc->pxp_wakeref);

now that you do this we can remove the intel_pxp_invalidate from the 
runtime_suspend.


ok




if (pc->vm)
i915_vm_put(pc->vm);
if (pc->user_engines) {
@@ -241,6 +246,33 @@ static int proto_context_set_persistence(struct 
drm_i915_private *i915,
return 0;
  }
  
+static int proto_context_set_protected(struct drm_i915_private *i915,

+  struct i915_gem_proto_context *pc,
+  bool protected)
+{
+   int ret = 0;
+
+   if (!intel_pxp_is_enabled(&i915->gt.pxp)) {
+   ret = -ENODEV;
+   } else if (!protected) {
+   pc->uses_protected_content = false;
+   } else if ((pc->user_flags & BIT(UCONTEXT_RECOVERABLE)) ||
+  !(pc->user_flags & BIT(UCONTEXT_BANNABLE))) {
+   ret = -EPERM;
+   } else {
+   pc->uses_protected_content = true;
+
+   /*
+* protected context usage requires the PXP session to be up,
+* which in turn requires the device to be active.
+*/
+   pc->pxp_wakeref = intel_runtime_pm_get(&i915->runtime_pm);
+   ret = intel_pxp_wait_for_arb_start(&i915->gt.pxp);
+   }
+
+   return ret;
+}
+
  static struct i915_gem_proto_context *
  proto_context_create(struct drm_i915_private *i915, unsigned int flags)
  {
@@ -269,7 +301,7 @@ proto_context_create(struct drm_i915_private *i915, 
unsigned int flags)
return pc;
  
  proto_close:

-   proto_context_close(pc);
+   proto_context_close(i915, pc);
return err;
  }
  
@@ -693,6 +725,8 @@ static int set_proto_ctx_param(struct drm_i915_file_pr

[PATCH] drm/i915: Add ww context to intel_dpt_pin

2021-09-10 Thread Maarten Lankhorst
Ensure i915_vma_pin_iomap and vma_unpin are done with dpt->obj lock held.

I don't think there's much of a point in merging intel_dpt_pin() with
intel_pin_fb_obj_dpt(), they touch different objects.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/display/intel_dpt.c | 40 +++-
 1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c
index de62bd77b15e..edd6f1aa2626 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -121,32 +121,42 @@ struct i915_vma *intel_dpt_pin(struct i915_address_space 
*vm)
intel_wakeref_t wakeref;
struct i915_vma *vma;
void __iomem *iomem;
+   struct i915_gem_ww_ctx ww;
+   int err;
 
wakeref = intel_runtime_pm_get(&i915->runtime_pm);
atomic_inc(&i915->gpu_error.pending_fb_pin);
 
-   vma = i915_gem_object_ggtt_pin(dpt->obj, NULL, 0, 4096,
-  HAS_LMEM(i915) ? 0 : PIN_MAPPABLE);
-   if (IS_ERR(vma))
-   goto err;
+   for_i915_gem_ww(&ww, err, true) {
+   err = i915_gem_object_lock(dpt->obj, &ww);
+   if (err)
+   continue;
 
-   iomem = i915_vma_pin_iomap(vma);
-   i915_vma_unpin(vma);
-   if (IS_ERR(iomem)) {
-   vma = ERR_CAST(iomem);
-   goto err;
-   }
+   vma = i915_gem_object_ggtt_pin_ww(dpt->obj, &ww, NULL, 0, 4096,
+ HAS_LMEM(i915) ? 0 : PIN_MAPPABLE);
+   if (IS_ERR(vma)) {
+   err = PTR_ERR(vma);
+   continue;
+   }
+
+   iomem = i915_vma_pin_iomap(vma);
+   i915_vma_unpin(vma);
 
-   dpt->vma = vma;
-   dpt->iomem = iomem;
+   if (IS_ERR(iomem)) {
+   err = PTR_ERR(iomem);
+   continue;
+   }
 
-   i915_vma_get(vma);
+   dpt->vma = vma;
+   dpt->iomem = iomem;
+
+   i915_vma_get(vma);
+   }
 
-err:
atomic_dec(&i915->gpu_error.pending_fb_pin);
intel_runtime_pm_put(&i915->runtime_pm, wakeref);
 
-   return vma;
+   return err ? ERR_PTR(err) : vma;
 }
 
 void intel_dpt_unpin(struct i915_address_space *vm)
-- 
2.33.0



Re: i915 ttm_tt shmem backend

2021-09-10 Thread Christian König




On 10.09.21 at 10:08, Thomas Hellström wrote:

Perhaps some background and goal is worth mentioning here.


On Thu, 2021-09-09 at 17:56 +0100, Matthew Auld wrote:

On Thu, 9 Sept 2021 at 17:43, Koenig, Christian
 wrote:

Hi Matthew,

this doesn't work, I've already tried something similar.

TTM uses the reverse lookup functionality when migrating BOs
between system and device memory. And that doesn't seem to work
with pages from a shmem file.

Hmm, what do you mean by reverse lookup functionality? Could you
please point out where that is in the TTM code?

I think this is in unmap_mapping_range() where, if we use VM_MIXEDMAP,
there is a reverse lookup on the PTEs that point to real pages. Now
that we move over to VM_PFNMAP, that problem should go away since core
vm never has a page to investigate. Probably this is why things work
on non-TTM i915 GEM.


Yeah, that was really likely the root problem. I didn't keep 
investigating after realizing that my approach wouldn't work.



@Christian: Some background here:
First, I think that there might be things like the above that will pose
problems, and we may or may not be able to overcome those, but more
important is that we agree with you that *if* we make it work, it is
something that you as a maintainer of TTM can accept from a design and
maintainability point of view.

The approach would be similar to the buddy allocator, we adapt some
driver code to TTM in a way that it may be reused with other drivers,
and if other drivers are interested, we'd assist in moving to core TTM.
In essence it'd be a TTM shmem page pool with full shrinking ability
for cached pages only.

What we're really after here is the ability to shrink that doesn't
regress much w.r.t. the elaborate shrinker that's in i915 today, which
is power-management aware and is also able to start shmem writebacks to
avoid shmem just caching the pages instead of giving them back to the
system (IIRC it was partly the lack of this that blocked earlier TTM
shrinking efforts).

And since it doesn't really matter whether the shrinker sits in core
TTM or in a driver, I think a future goal might be a set of TTM
shrinker helpers that makes sure we shrink the right TTM object, and
perhaps a simple implementation that is typically used by simple
drivers and other drivers can build on that for a more elaborate power-
management aware shrinker.


That's understandable, but I think not necessarily what we should aim 
for in the long term.


First of all I would really like to move more of the functionality from 
ttm_pool.c into the core memory management, especially the handling of 
uncached and write-combined memory.


That's essentially completely architecture-dependent and currently 
implemented in an extremely awkward way. Either Daniel's suggestion of 
having a GFP_WC or Christoph's approach of moving all this into the DMA 
API is the way to go here.


As long as i915 has no interest in USWC support, implementing their own 
shmem file backend sounds fine to me, but I have strong doubts that this 
will be of use to anybody else.


Christian.



/Thomas




Regards,
Christian.


From: Matthew Auld 
Sent: Thursday, 9 September 2021 16:56
To: Christian König ; Koenig, Christian 
Cc: Thomas Hellström ; ML dri-devel 
Subject: i915 ttm_tt shmem backend

Hi Christian,

We are looking into using shmem as a ttm_tt backend in i915 for
cached
system memory objects. We would also like to make such objects
visible
to the i915-gem shrinker, so that they may be swapped out or
discarded
when under memory pressure.

One idea for handling this is roughly something like:
- Add a new TTM_PAGE_FLAG_SHMEM flag, or similar.
- Skip the ttm_pages_allocated accounting on such objects, similar
to
how FLAG_SG is already handled.
- Skip all the page->mapping and page->index related bits, like in
tt_add_mapping, since it looks like these are set and used by
shmem.
Not sure what functionally this might break, but looks like it's
maybe
only driver specific?
- Skip calling into ttm_bo_swap_out/in and just have
ttm_populate/unpopulate handle this directly for such objects.
- Make such objects visible to the i915-gem shrinker.

Does this approach look acceptable?






[PATCH 01/14] dma-buf: add dma_resv_for_each_fence_unlocked

2021-09-10 Thread Christian König
Abstract the complexity of iterating over all the fences
in a dma_resv object.

The new loop handles the whole RCU and retry dance and
returns only fences where we can be sure we grabbed the
right one.

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-resv.c | 63 ++
 include/linux/dma-resv.h   | 36 ++
 2 files changed, 99 insertions(+)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 84fbe60629e3..213a9b7251ca 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -323,6 +323,69 @@ void dma_resv_add_excl_fence(struct dma_resv *obj, struct 
dma_fence *fence)
 }
 EXPORT_SYMBOL(dma_resv_add_excl_fence);
 
+/**
+ * dma_resv_walk_unlocked - walk over fences in a dma_resv obj
+ * @obj: the dma_resv object
+ * @cursor: cursor to record the current position
+ * @all_fences: true returns also the shared fences
+ * @first: if we should start over
+ *
+ * Return all the fences in the dma_resv object which are not yet signaled.
+ * The returned fence has an extra local reference so will stay alive.
+ * If a concurrent modify is detected the whole iterator is started over again.
+ */
+struct dma_fence *dma_resv_walk_unlocked(struct dma_resv *obj,
+struct dma_resv_cursor *cursor,
+bool all_fences, bool first)
+{
+   struct dma_fence *fence = NULL;
+
+   do {
+   /* Drop the reference from the previous round */
+   dma_fence_put(fence);
+
+   cursor->is_first = first;
+   if (first) {
+   cursor->seq = read_seqcount_begin(&obj->seq);
+   cursor->index = -1;
+   cursor->fences = dma_resv_shared_list(obj);
+   cursor->is_exclusive = true;
+
+   fence = dma_resv_excl_fence(obj);
+   if (fence && test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
+ &fence->flags))
+   fence = NULL;
+   } else {
+   fence = NULL;
+   }
+
+   if (fence) {
+   fence = dma_fence_get_rcu(fence);
+   } else if (all_fences && cursor->fences) {
+   struct dma_resv_list *fences = cursor->fences;
+
+   cursor->is_exclusive = false;
+   while (++cursor->index < fences->shared_count) {
+   fence = rcu_dereference(fences->shared[
+   cursor->index]);
+   if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
+ &fence->flags))
+   break;
+   }
+   if (cursor->index < fences->shared_count)
+   fence = dma_fence_get_rcu(fence);
+   else
+   fence = NULL;
+   }
+
+   /* For the eventually next round */
+   first = true;
+   } while (read_seqcount_retry(&obj->seq, cursor->seq));
+
+   return fence;
+}
+EXPORT_SYMBOL_GPL(dma_resv_walk_unlocked);
+
 /**
  * dma_resv_copy_fences - Copy all fences from src to dst.
  * @dst: the destination reservation object
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index 9100dd3dc21f..f5b91c292ee0 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -149,6 +149,39 @@ struct dma_resv {
struct dma_resv_list __rcu *fence;
 };
 
+/**
+ * struct dma_resv_cursor - current position into the dma_resv fences
+ * @seq: sequence number to check
+ * @index: index into the shared fences
+ * @shared: the shared fences
+ * @is_first: true if this is the first returned fence
+ * @is_exclusive: if the current fence is the exclusive one
+ */
+struct dma_resv_cursor {
+   unsigned int seq;
+   unsigned int index;
+   struct dma_resv_list *fences;
+   bool is_first;
+   bool is_exclusive;
+};
+
+/**
+ * dma_resv_for_each_fence_unlocked - fence iterator
+ * @obj: a dma_resv object pointer
+ * @cursor: a struct dma_resv_cursor pointer
+ * @all_fences: true if all fences should be returned
+ * @fence: the current fence
+ *
+ * Iterate over the fences in a struct dma_resv object without holding the
+ * dma_resv::lock. The RCU read side lock must be held when using this, but can
+ * be dropped and re-taken as necessary inside the loop. @all_fences controls
+ * if the shared fences are returned as well.
+ */
+#define dma_resv_for_each_fence_unlocked(obj, cursor, all_fences, fence)\
+   for (fence = dma_resv_walk_unlocked(obj, cursor, all_fences, true); \
+fence; dma_fence_put(fence),   \
+fence = dma_resv_walk_unlocked(obj, curso

[PATCH 02/14] dma-buf: add dma_resv_for_each_fence

2021-09-10 Thread Christian König
A simpler version of the iterator to be used when the dma_resv object is
locked.

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-resv.c | 38 ++
 include/linux/dma-resv.h   | 18 ++
 2 files changed, 56 insertions(+)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 213a9b7251ca..8cbccaae169d 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -323,6 +323,44 @@ void dma_resv_add_excl_fence(struct dma_resv *obj, struct 
dma_fence *fence)
 }
 EXPORT_SYMBOL(dma_resv_add_excl_fence);
 
+/**
+ * dma_resv_walk - walk over fences in a dma_resv obj
+ * @obj: the dma_resv object
+ * @cursor: cursor to record the current position
+ * @all_fences: true returns also the shared fences
+ * @first: if we should start over
+ *
+ * Return all the fences in the dma_resv object while holding the
+ * dma_resv::lock.
+ */
+struct dma_fence *dma_resv_walk(struct dma_resv *obj,
+   struct dma_resv_cursor *cursor,
+   bool all_fences, bool first)
+{
+   dma_resv_assert_held(obj);
+
+   cursor->is_first = first;
+   if (first) {
+   struct dma_fence *fence;
+
+   cursor->index = -1;
+   cursor->fences = dma_resv_shared_list(obj);
+   cursor->is_exclusive = true;
+
+   fence = dma_resv_excl_fence(obj);
+   if (fence)
+   return fence;
+   }
+
+   if (!all_fences || !cursor->fences ||
+   ++cursor->index >= cursor->fences->shared_count)
+   return NULL;
+
+   return rcu_dereference_protected(cursor->fences->shared[cursor->index],
+dma_resv_held(obj));
+}
+EXPORT_SYMBOL_GPL(dma_resv_walk);
+
 /**
  * dma_resv_walk_unlocked - walk over fences in a dma_resv obj
  * @obj: the dma_resv object
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index f5b91c292ee0..6f9bb7e4c538 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -165,6 +165,21 @@ struct dma_resv_cursor {
bool is_exclusive;
 };
 
+/**
+ * dma_resv_for_each_fence - fence iterator
+ * @obj: a dma_resv object pointer
+ * @cursor: a struct dma_resv_cursor pointer
+ * @all_fences: true if all fences should be returned
+ * @fence: the current fence
+ *
+ * Iterate over the fences in a struct dma_resv object while holding the
+ * dma_resv::lock. @all_fences controls if the shared fences are returned as
+ * well.
+ */
+#define dma_resv_for_each_fence(obj, cursor, all_fences, fence)  \
+   for (fence = dma_resv_walk(obj, cursor, all_fences, true); fence; \
+fence = dma_resv_walk(obj, cursor, all_fences, false))
+
 /**
  * dma_resv_for_each_fence_unlocked - fence iterator
  * @obj: a dma_resv object pointer
@@ -399,6 +414,9 @@ void dma_resv_fini(struct dma_resv *obj);
 int dma_resv_reserve_shared(struct dma_resv *obj, unsigned int num_fences);
 void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence);
 void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence);
+struct dma_fence *dma_resv_walk(struct dma_resv *obj,
+   struct dma_resv_cursor *cursor,
+   bool all_fences, bool first);
struct dma_fence *dma_resv_walk_unlocked(struct dma_resv *obj,
 struct dma_resv_cursor *cursor,
 bool all_fences, bool first);
-- 
2.25.1



[PATCH 03/14] dma-buf: use new iterator in dma_resv_copy_fences

2021-09-10 Thread Christian König
This makes the function much simpler since the complex
retry logic is now handled elsewhere.

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-resv.c | 81 +++---
 1 file changed, 32 insertions(+), 49 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 8cbccaae169d..9a9c0bba772b 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -433,74 +433,57 @@ EXPORT_SYMBOL_GPL(dma_resv_walk_unlocked);
  */
 int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src)
 {
-   struct dma_resv_list *src_list, *dst_list;
-   struct dma_fence *old, *new;
-   unsigned int i;
+   struct dma_resv_cursor cursor;
+   struct dma_resv_list *list;
+   struct dma_fence *f, *excl;
 
dma_resv_assert_held(dst);
 
-   rcu_read_lock();
-   src_list = dma_resv_shared_list(src);
+   list = NULL;
+   excl = NULL;
 
-retry:
-   if (src_list) {
-   unsigned int shared_count = src_list->shared_count;
+   rcu_read_lock();
+   dma_resv_for_each_fence_unlocked(dst, &cursor, true, f) {
 
-   rcu_read_unlock();
+   if (cursor.is_first) {
+   dma_resv_list_free(list);
+   dma_fence_put(excl);
 
-   dst_list = dma_resv_list_alloc(shared_count);
-   if (!dst_list)
-   return -ENOMEM;
+   if (cursor.fences) {
+   unsigned int cnt = cursor.fences->shared_count;
 
-   rcu_read_lock();
-   src_list = dma_resv_shared_list(src);
-   if (!src_list || src_list->shared_count > shared_count) {
-   kfree(dst_list);
-   goto retry;
-   }
+   rcu_read_unlock();
+   list = dma_resv_list_alloc(cnt);
+   if (!list)
+   return -ENOMEM;
 
-   dst_list->shared_count = 0;
-   for (i = 0; i < src_list->shared_count; ++i) {
-   struct dma_fence __rcu **dst;
-   struct dma_fence *fence;
+   list->shared_count = 0;
+   rcu_read_lock();
 
-   fence = rcu_dereference(src_list->shared[i]);
-   if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
-&fence->flags))
-   continue;
-
-   if (!dma_fence_get_rcu(fence)) {
-   dma_resv_list_free(dst_list);
-   src_list = dma_resv_shared_list(src);
-   goto retry;
+   } else {
+   list = NULL;
}
+   excl = NULL;
+   }
 
-   if (dma_fence_is_signaled(fence)) {
-   dma_fence_put(fence);
-   continue;
-   }
+   if (cursor.is_exclusive)
+   excl = f;
+   else
+   RCU_INIT_POINTER(list->shared[list->shared_count++], f);
 
-   dst = &dst_list->shared[dst_list->shared_count++];
-   rcu_assign_pointer(*dst, fence);
-   }
-   } else {
-   dst_list = NULL;
+   /* Don't drop the reference */
+   f = NULL;
}
 
-   new = dma_fence_get_rcu_safe(&src->fence_excl);
rcu_read_unlock();
 
-   src_list = dma_resv_shared_list(dst);
-   old = dma_resv_excl_fence(dst);
-
write_seqcount_begin(&dst->seq);
-   /* write_seqcount_begin provides the necessary memory barrier */
-   RCU_INIT_POINTER(dst->fence_excl, new);
-   RCU_INIT_POINTER(dst->fence, dst_list);
+   excl = rcu_replace_pointer(dst->fence_excl, excl, dma_resv_held(dst));
+   list = rcu_replace_pointer(dst->fence, list, dma_resv_held(dst));
write_seqcount_end(&dst->seq);
 
-   dma_resv_list_free(src_list);
-   dma_fence_put(old);
+   dma_resv_list_free(list);
+   dma_fence_put(excl);
 
return 0;
 }
-- 
2.25.1



[PATCH 04/14] dma-buf: use new iterator in dma_resv_get_fences

2021-09-10 Thread Christian König
This makes the function much simpler since the complex
retry logic is now handled elsewhere.

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-resv.c | 110 +
 1 file changed, 37 insertions(+), 73 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 9a9c0bba772b..2dfb04e6a62f 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -493,99 +493,63 @@ EXPORT_SYMBOL(dma_resv_copy_fences);
  * dma_resv_get_fences - Get an object's shared and exclusive
  * fences without update side lock held
  * @obj: the reservation object
- * @pfence_excl: the returned exclusive fence (or NULL)
- * @pshared_count: the number of shared fences returned
- * @pshared: the array of shared fence ptrs returned (array is krealloc'd to
+ * @fence_excl: the returned exclusive fence (or NULL)
+ * @shared_count: the number of shared fences returned
+ * @shared: the array of shared fence ptrs returned (array is krealloc'd to
  * the required size, and must be freed by caller)
  *
  * Retrieve all fences from the reservation object. If the pointer for the
  * exclusive fence is not specified the fence is put into the array of the
  * shared fences as well. Returns either zero or -ENOMEM.
  */
-int dma_resv_get_fences(struct dma_resv *obj, struct dma_fence **pfence_excl,
-   unsigned int *pshared_count,
-   struct dma_fence ***pshared)
+int dma_resv_get_fences(struct dma_resv *obj, struct dma_fence **fence_excl,
+   unsigned int *shared_count, struct dma_fence ***shared)
 {
-   struct dma_fence **shared = NULL;
-   struct dma_fence *fence_excl;
-   unsigned int shared_count;
-   int ret = 1;
-
-   do {
-   struct dma_resv_list *fobj;
-   unsigned int i, seq;
-   size_t sz = 0;
-
-   shared_count = i = 0;
-
-   rcu_read_lock();
-   seq = read_seqcount_begin(&obj->seq);
-
-   fence_excl = dma_resv_excl_fence(obj);
-   if (fence_excl && !dma_fence_get_rcu(fence_excl))
-   goto unlock;
+   struct dma_resv_cursor cursor;
+   struct dma_fence *fence;
 
-   fobj = dma_resv_shared_list(obj);
-   if (fobj)
-   sz += sizeof(*shared) * fobj->shared_max;
+   *shared_count = 0;
+   *shared = NULL;
 
-   if (!pfence_excl && fence_excl)
-   sz += sizeof(*shared);
+   if (fence_excl)
+   *fence_excl = NULL;
 
-   if (sz) {
-   struct dma_fence **nshared;
+   rcu_read_lock();
+   dma_resv_for_each_fence_unlocked(obj, &cursor, true, fence) {
 
-   nshared = krealloc(shared, sz,
-  GFP_NOWAIT | __GFP_NOWARN);
-   if (!nshared) {
-   rcu_read_unlock();
+   if (cursor.is_first) {
+   unsigned int count;
 
-   dma_fence_put(fence_excl);
-   fence_excl = NULL;
+   while (*shared_count)
+   dma_fence_put((*shared)[--(*shared_count)]);
 
-   nshared = krealloc(shared, sz, GFP_KERNEL);
-   if (nshared) {
-   shared = nshared;
-   continue;
-   }
+   if (fence_excl)
+   dma_fence_put(*fence_excl);
 
-   ret = -ENOMEM;
-   break;
-   }
-   shared = nshared;
-   shared_count = fobj ? fobj->shared_count : 0;
-   for (i = 0; i < shared_count; ++i) {
-   shared[i] = rcu_dereference(fobj->shared[i]);
-   if (!dma_fence_get_rcu(shared[i]))
-   break;
-   }
-   }
+   count = cursor.fences ? cursor.fences->shared_count : 0;
+   count += fence_excl ? 0 : 1;
+   rcu_read_unlock();
 
-   if (i != shared_count || read_seqcount_retry(&obj->seq, seq)) {
-   while (i--)
-   dma_fence_put(shared[i]);
-   dma_fence_put(fence_excl);
-   goto unlock;
+   /* Eventually re-allocate the array */
+   *shared = krealloc_array(*shared, count,
+sizeof(*shared),
+GFP_KERNEL);
+   if (count && !*shared)
+   return -ENOMEM;
+

[PATCH 07/14] drm/i915: use the new iterator in i915_gem_busy_ioctl

2021-09-10 Thread Christian König
This makes the function much simpler since the complex
retry logic is now handled elsewhere.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/i915/gem/i915_gem_busy.c | 30 +++-
 1 file changed, 9 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c 
b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
index 6234e17259c1..c6c6d747b33e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
@@ -82,8 +82,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
 {
struct drm_i915_gem_busy *args = data;
struct drm_i915_gem_object *obj;
-   struct dma_resv_list *list;
-   unsigned int seq;
+   struct dma_resv_cursor cursor;
+   struct dma_fence *fence;
int err;
 
err = -ENOENT;
@@ -109,28 +109,16 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
 * to report the overall busyness. This is what the wait-ioctl does.
 *
 */
-retry:
-   seq = raw_read_seqcount(&obj->base.resv->seq);
-
-   /* Translate the exclusive fence to the READ *and* WRITE engine */
-   args->busy = busy_check_writer(dma_resv_excl_fence(obj->base.resv));
-
-   /* Translate shared fences to READ set of engines */
-   list = dma_resv_shared_list(obj->base.resv);
-   if (list) {
-   unsigned int shared_count = list->shared_count, i;
-
-   for (i = 0; i < shared_count; ++i) {
-   struct dma_fence *fence =
-   rcu_dereference(list->shared[i]);
-
+   args->busy = false;
+   dma_resv_for_each_fence_unlocked(obj->base.resv, &cursor, true, fence) {
+   if (cursor.is_exclusive)
+   /* Translate the exclusive fence to the READ *and* WRITE engine */
+   args->busy = busy_check_writer(fence);
+   else
+   /* Translate shared fences to READ set of engines */
args->busy |= busy_check_reader(fence);
-   }
}
 
-   if (args->busy && read_seqcount_retry(&obj->base.resv->seq, seq))
-   goto retry;
-
err = 0;
 out:
rcu_read_unlock();
-- 
2.25.1



[PATCH 06/14] dma-buf: use new iterator in dma_resv_test_signaled

2021-09-10 Thread Christian König
This makes the function much simpler since the complex
retry logic is now handled elsewhere.

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-resv.c | 54 +-
 1 file changed, 7 insertions(+), 47 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 645cf52a6a6c..cde5d448d029 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -593,22 +593,6 @@ long dma_resv_wait_timeout(struct dma_resv *obj, bool 
wait_all, bool intr,
 EXPORT_SYMBOL_GPL(dma_resv_wait_timeout);
 
 
-static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
-{
-   struct dma_fence *fence, *lfence = passed_fence;
-   int ret = 1;
-
-   if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &lfence->flags)) {
-   fence = dma_fence_get_rcu(lfence);
-   if (!fence)
-   return -1;
-
-   ret = !!dma_fence_is_signaled(fence);
-   dma_fence_put(fence);
-   }
-   return ret;
-}
-
 /**
  * dma_resv_test_signaled - Test if a reservation object's fences have been
  * signaled.
@@ -625,43 +609,19 @@ static inline int dma_resv_test_signaled_single(struct 
dma_fence *passed_fence)
  */
 bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
 {
+   struct dma_resv_cursor cursor;
struct dma_fence *fence;
-   unsigned int seq;
-   int ret;
 
rcu_read_lock();
-retry:
-   ret = true;
-   seq = read_seqcount_begin(&obj->seq);
-
-   if (test_all) {
-   struct dma_resv_list *fobj = dma_resv_shared_list(obj);
-   unsigned int i, shared_count;
-
-   shared_count = fobj ? fobj->shared_count : 0;
-   for (i = 0; i < shared_count; ++i) {
-   fence = rcu_dereference(fobj->shared[i]);
-   ret = dma_resv_test_signaled_single(fence);
-   if (ret < 0)
-   goto retry;
-   else if (!ret)
-   break;
+   dma_resv_for_each_fence_unlocked(obj, &cursor, test_all, fence) {
+   if (!dma_fence_is_signaled(fence)) {
+   rcu_read_unlock();
+   dma_fence_put(fence);
+   return false;
}
}
-
-   fence = dma_resv_excl_fence(obj);
-   if (ret && fence) {
-   ret = dma_resv_test_signaled_single(fence);
-   if (ret < 0)
-   goto retry;
-
-   }
-
-   if (read_seqcount_retry(&obj->seq, seq))
-   goto retry;
-
rcu_read_unlock();
-   return ret;
+   return true;
 }
 EXPORT_SYMBOL_GPL(dma_resv_test_signaled);
 
-- 
2.25.1



[PATCH 05/14] dma-buf: use new iterator in dma_resv_wait_timeout

2021-09-10 Thread Christian König
This makes the function much simpler since the complex
retry logic is now handled elsewhere.

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-resv.c | 64 +-
 1 file changed, 7 insertions(+), 57 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 2dfb04e6a62f..645cf52a6a6c 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -571,74 +571,24 @@ long dma_resv_wait_timeout(struct dma_resv *obj, bool 
wait_all, bool intr,
   unsigned long timeout)
 {
long ret = timeout ? timeout : 1;
-   unsigned int seq, shared_count;
+   struct dma_resv_cursor cursor;
struct dma_fence *fence;
-   int i;
 
-retry:
-   shared_count = 0;
-   seq = read_seqcount_begin(&obj->seq);
rcu_read_lock();
-   i = -1;
-
-   fence = dma_resv_excl_fence(obj);
-   if (fence && !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
-   if (!dma_fence_get_rcu(fence))
-   goto unlock_retry;
+   dma_resv_for_each_fence_unlocked(obj, &cursor, wait_all, fence) {
+   rcu_read_unlock();
 
-   if (dma_fence_is_signaled(fence)) {
+   ret = dma_fence_wait_timeout(fence, intr, ret);
+   if (ret <= 0) {
dma_fence_put(fence);
-   fence = NULL;
+   return ret;
}
 
-   } else {
-   fence = NULL;
-   }
-
-   if (wait_all) {
-   struct dma_resv_list *fobj = dma_resv_shared_list(obj);
-
-   if (fobj)
-   shared_count = fobj->shared_count;
-
-   for (i = 0; !fence && i < shared_count; ++i) {
-   struct dma_fence *lfence;
-
-   lfence = rcu_dereference(fobj->shared[i]);
-   if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
-&lfence->flags))
-   continue;
-
-   if (!dma_fence_get_rcu(lfence))
-   goto unlock_retry;
-
-   if (dma_fence_is_signaled(lfence)) {
-   dma_fence_put(lfence);
-   continue;
-   }
-
-   fence = lfence;
-   break;
-   }
+   rcu_read_lock();
}
-
rcu_read_unlock();
-   if (fence) {
-   if (read_seqcount_retry(&obj->seq, seq)) {
-   dma_fence_put(fence);
-   goto retry;
-   }
 
-   ret = dma_fence_wait_timeout(fence, intr, ret);
-   dma_fence_put(fence);
-   if (ret > 0 && wait_all && (i + 1 < shared_count))
-   goto retry;
-   }
return ret;
-
-unlock_retry:
-   rcu_read_unlock();
-   goto retry;
 }
 EXPORT_SYMBOL_GPL(dma_resv_wait_timeout);
 
-- 
2.25.1



[PATCH 08/14] drm/ttm: use the new iterator in ttm_bo_flush_all_fences

2021-09-10 Thread Christian König
This is probably a fix since we didn't even grab a reference to the
fences.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 0a3127436f61..5dd0c3dfec3c 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -269,19 +269,11 @@ static int ttm_bo_individualize_resv(struct 
ttm_buffer_object *bo)
 static void ttm_bo_flush_all_fences(struct ttm_buffer_object *bo)
 {
struct dma_resv *resv = &bo->base._resv;
-   struct dma_resv_list *fobj;
+   struct dma_resv_cursor cursor;
struct dma_fence *fence;
-   int i;
 
rcu_read_lock();
-   fobj = dma_resv_shared_list(resv);
-   fence = dma_resv_excl_fence(resv);
-   if (fence && !fence->ops->signaled)
-   dma_fence_enable_sw_signaling(fence);
-
-   for (i = 0; fobj && i < fobj->shared_count; ++i) {
-   fence = rcu_dereference(fobj->shared[i]);
-
+   dma_resv_for_each_fence_unlocked(resv, &cursor, true, fence) {
if (!fence->ops->signaled)
dma_fence_enable_sw_signaling(fence);
}
-- 
2.25.1



[PATCH 10/14] drm/amdgpu: use the new iterator in amdgpu_sync_resv

2021-09-10 Thread Christian König
Simplifying the code a bit.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 44 
 1 file changed, 14 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index 862eb3c1c4c5..031ba20debb9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -252,41 +252,25 @@ int amdgpu_sync_resv(struct amdgpu_device *adev, struct 
amdgpu_sync *sync,
 struct dma_resv *resv, enum amdgpu_sync_mode mode,
 void *owner)
 {
-   struct dma_resv_list *flist;
+   struct dma_resv_cursor cursor;
struct dma_fence *f;
-   unsigned i;
-   int r = 0;
+   int r;
 
if (resv == NULL)
return -EINVAL;
 
-   /* always sync to the exclusive fence */
-   f = dma_resv_excl_fence(resv);
-   dma_fence_chain_for_each(f, f) {
-   struct dma_fence_chain *chain = to_dma_fence_chain(f);
-
-   if (amdgpu_sync_test_fence(adev, mode, owner, chain ?
-  chain->fence : f)) {
-   r = amdgpu_sync_fence(sync, f);
-   dma_fence_put(f);
-   if (r)
-   return r;
-   break;
-   }
-   }
-
-   flist = dma_resv_shared_list(resv);
-   if (!flist)
-   return 0;
-
-   for (i = 0; i < flist->shared_count; ++i) {
-   f = rcu_dereference_protected(flist->shared[i],
- dma_resv_held(resv));
-
-   if (amdgpu_sync_test_fence(adev, mode, owner, f)) {
-   r = amdgpu_sync_fence(sync, f);
-   if (r)
-   return r;
+   dma_resv_for_each_fence(resv, &cursor, true, f) {
+   dma_fence_chain_for_each(f, f) {
+   struct dma_fence_chain *chain = to_dma_fence_chain(f);
+
+   if (amdgpu_sync_test_fence(adev, mode, owner, chain ?
+  chain->fence : f)) {
+   r = amdgpu_sync_fence(sync, f);
+   dma_fence_put(f);
+   if (r)
+   return r;
+   break;
+   }
}
}
return 0;
-- 
2.25.1



[PATCH 09/14] drm/etnaviv: use new iterator in etnaviv_gem_describe

2021-09-10 Thread Christian König
Instead of hand rolling the logic.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/etnaviv/etnaviv_gem.c | 27 +--
 1 file changed, 9 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index b8fa6ed3dd73..6808dbef5c79 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -437,19 +437,17 @@ int etnaviv_gem_wait_bo(struct etnaviv_gpu *gpu, struct 
drm_gem_object *obj,
 static void etnaviv_gem_describe_fence(struct dma_fence *fence,
const char *type, struct seq_file *m)
 {
-   if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
-   seq_printf(m, "\t%9s: %s %s seq %llu\n",
-  type,
-  fence->ops->get_driver_name(fence),
-  fence->ops->get_timeline_name(fence),
-  fence->seqno);
+   seq_printf(m, "\t%9s: %s %s seq %llu\n", type,
+  fence->ops->get_driver_name(fence),
+  fence->ops->get_timeline_name(fence),
+  fence->seqno);
 }
 
 static void etnaviv_gem_describe(struct drm_gem_object *obj, struct seq_file 
*m)
 {
struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj);
struct dma_resv *robj = obj->resv;
-   struct dma_resv_list *fobj;
+   struct dma_resv_cursor cursor;
struct dma_fence *fence;
unsigned long off = drm_vma_node_start(&obj->vma_node);
 
@@ -459,19 +457,12 @@ static void etnaviv_gem_describe(struct drm_gem_object 
*obj, struct seq_file *m)
off, etnaviv_obj->vaddr, obj->size);
 
rcu_read_lock();
-   fobj = dma_resv_shared_list(robj);
-   if (fobj) {
-   unsigned int i, shared_count = fobj->shared_count;
-
-   for (i = 0; i < shared_count; i++) {
-   fence = rcu_dereference(fobj->shared[i]);
+   dma_resv_for_each_fence_unlocked(robj, &cursor, true, fence) {
+   if (cursor.is_exclusive)
+   etnaviv_gem_describe_fence(fence, "Exclusive", m);
+   else
etnaviv_gem_describe_fence(fence, "Shared", m);
-   }
}
-
-   fence = dma_resv_excl_fence(robj);
-   if (fence)
-   etnaviv_gem_describe_fence(fence, "Exclusive", m);
rcu_read_unlock();
 }
 
-- 
2.25.1



[PATCH 14/14] drm/radeon: use new iterator in radeon_sync_resv

2021-09-10 Thread Christian König
Simplifying the code a bit.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/radeon/radeon_sync.c | 22 +++---
 1 file changed, 3 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_sync.c 
b/drivers/gpu/drm/radeon/radeon_sync.c
index 9257b60144c4..14a4d8135bad 100644
--- a/drivers/gpu/drm/radeon/radeon_sync.c
+++ b/drivers/gpu/drm/radeon/radeon_sync.c
@@ -91,33 +91,17 @@ int radeon_sync_resv(struct radeon_device *rdev,
 struct dma_resv *resv,
 bool shared)
 {
-   struct dma_resv_list *flist;
-   struct dma_fence *f;
+   struct dma_resv_cursor cursor;
struct radeon_fence *fence;
-   unsigned i;
+   struct dma_fence *f;
int r = 0;
 
-   /* always sync to the exclusive fence */
-   f = dma_resv_excl_fence(resv);
-   fence = f ? to_radeon_fence(f) : NULL;
-   if (fence && fence->rdev == rdev)
-   radeon_sync_fence(sync, fence);
-   else if (f)
-   r = dma_fence_wait(f, true);
-
-   flist = dma_resv_shared_list(resv);
-   if (shared || !flist || r)
-   return r;
-
-   for (i = 0; i < flist->shared_count; ++i) {
-   f = rcu_dereference_protected(flist->shared[i],
- dma_resv_held(resv));
+   dma_resv_for_each_fence(resv, &cursor, shared, f) {
fence = to_radeon_fence(f);
if (fence && fence->rdev == rdev)
radeon_sync_fence(sync, fence);
else
r = dma_fence_wait(f, true);
-
if (r)
break;
}
-- 
2.25.1



[PATCH 13/14] drm/nouveau: use the new iterator in nouveau_fence_sync

2021-09-10 Thread Christian König
Simplifying the code a bit.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 48 +++--
 1 file changed, 12 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c 
b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 05d0b3eb3690..dc8d7ca1e239 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -339,14 +339,15 @@ nouveau_fence_wait(struct nouveau_fence *fence, bool 
lazy, bool intr)
 }
 
 int
-nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, bool 
exclusive, bool intr)
+nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan,
+  bool exclusive, bool intr)
 {
struct nouveau_fence_chan *fctx = chan->fence;
-   struct dma_fence *fence;
struct dma_resv *resv = nvbo->bo.base.resv;
-   struct dma_resv_list *fobj;
+   struct dma_resv_cursor cursor;
+   struct dma_fence *fence;
struct nouveau_fence *f;
-   int ret = 0, i;
+   int ret;
 
if (!exclusive) {
ret = dma_resv_reserve_shared(resv, 1);
@@ -355,10 +356,7 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct 
nouveau_channel *chan, bool e
return ret;
}
 
-   fobj = dma_resv_shared_list(resv);
-   fence = dma_resv_excl_fence(resv);
-
-   if (fence) {
+   dma_resv_for_each_fence(resv, &cursor, exclusive, fence) {
struct nouveau_channel *prev = NULL;
bool must_wait = true;
 
@@ -366,41 +364,19 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct 
nouveau_channel *chan, bool e
if (f) {
rcu_read_lock();
prev = rcu_dereference(f->channel);
-   if (prev && (prev == chan || fctx->sync(f, prev, chan) == 0))
+   if (prev && (prev == chan ||
+fctx->sync(f, prev, chan) == 0))
must_wait = false;
rcu_read_unlock();
}
 
-   if (must_wait)
+   if (must_wait) {
ret = dma_fence_wait(fence, intr);
-
-   return ret;
-   }
-
-   if (!exclusive || !fobj)
-   return ret;
-
-   for (i = 0; i < fobj->shared_count && !ret; ++i) {
-   struct nouveau_channel *prev = NULL;
-   bool must_wait = true;
-
-   fence = rcu_dereference_protected(fobj->shared[i],
-   dma_resv_held(resv));
-
-   f = nouveau_local_fence(fence, chan->drm);
-   if (f) {
-   rcu_read_lock();
-   prev = rcu_dereference(f->channel);
-   if (prev && (prev == chan || fctx->sync(f, prev, chan) == 0))
-   must_wait = false;
-   rcu_read_unlock();
+   if (ret)
+   return ret;
}
-
-   if (must_wait)
-   ret = dma_fence_wait(fence, intr);
}
-
-   return ret;
+   return 0;
 }
 
 void
-- 
2.25.1



[PATCH 11/14] drm/amdgpu: use new iterator in amdgpu_ttm_bo_eviction_valuable

2021-09-10 Thread Christian König
Simplifying the code a bit.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 14 --
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 489e22190e29..0a927006ba9c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1332,10 +1332,9 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct 
ttm_buffer_object *bo,
const struct ttm_place *place)
 {
unsigned long num_pages = bo->resource->num_pages;
+   struct dma_resv_cursor resv_cursor;
struct amdgpu_res_cursor cursor;
-   struct dma_resv_list *flist;
struct dma_fence *f;
-   int i;
 
/* Swapout? */
if (bo->resource->mem_type == TTM_PL_SYSTEM)
@@ -1349,14 +1348,9 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct 
ttm_buffer_object *bo,
 * If true, then return false as any KFD process needs all its BOs to
 * be resident to run successfully
 */
-   flist = dma_resv_shared_list(bo->base.resv);
-   if (flist) {
-   for (i = 0; i < flist->shared_count; ++i) {
-   f = rcu_dereference_protected(flist->shared[i],
-   dma_resv_held(bo->base.resv));
-   if (amdkfd_fence_check_mm(f, current->mm))
-   return false;
-   }
+   dma_resv_for_each_fence(bo->base.resv, &resv_cursor, true, f) {
+   if (amdkfd_fence_check_mm(f, current->mm))
+   return false;
}
 
switch (bo->resource->mem_type) {
-- 
2.25.1



[PATCH 12/14] drm/msm: use new iterator in msm_gem_describe

2021-09-10 Thread Christian König
Simplifying the code a bit. Also drop the RCU read side lock since the
object is locked anyway.

Untested since I can't get the driver to compile on !ARM.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/msm/msm_gem.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 5db07fc287ad..8ee4e8881b03 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -906,7 +906,7 @@ void msm_gem_describe(struct drm_gem_object *obj, struct 
seq_file *m,
 {
struct msm_gem_object *msm_obj = to_msm_bo(obj);
struct dma_resv *robj = obj->resv;
-   struct dma_resv_list *fobj;
+   struct dma_resv_cursor cursor;
struct dma_fence *fence;
struct msm_gem_vma *vma;
uint64_t off = drm_vma_node_start(&obj->vma_node);
@@ -981,22 +981,13 @@ void msm_gem_describe(struct drm_gem_object *obj, struct 
seq_file *m,
seq_puts(m, "\n");
}
 
-   rcu_read_lock();
-   fobj = dma_resv_shared_list(robj);
-   if (fobj) {
-   unsigned int i, shared_count = fobj->shared_count;
-
-   for (i = 0; i < shared_count; i++) {
-   fence = rcu_dereference(fobj->shared[i]);
+   dma_resv_for_each_fence(robj, &cursor, true, fence) {
+   if (cursor.is_exclusive)
+   describe_fence(fence, "Exclusive", m);
+   else
describe_fence(fence, "Shared", m);
-   }
}
 
-   fence = dma_resv_excl_fence(robj);
-   if (fence)
-   describe_fence(fence, "Exclusive", m);
-   rcu_read_unlock();
-
msm_gem_unlock(obj);
 }
 
-- 
2.25.1



Re: [Intel-gfx] [PATCH 05/27] drm/i915: Add GT PM unpark worker

2021-09-10 Thread Tvrtko Ursulin



On 20/08/2021 23:44, Matthew Brost wrote:

Sometimes it is desirable to queue work up for later if the GT PM isn't
held and run that work on next GT PM unpark.


Sounds maybe plausible, but it depends how much work can happen on 
unpark and whether it can have too much of a negative impact on latency 
for interactive loads? Or from a reverse angle, why the work wouldn't be 
done on parking?


Also, what kind of mechanism do you have for dealing with too much stuff 
being put on this list? Can there be pressure which triggers (or would need 
to trigger) these deregistrations to happen at runtime (no park/unpark 
transitions)?



Implemented with a list in the GT of all pending work, workqueues in
the list, a callback to add a workqueue to the list, and finally a
wakeref post_get callback that iterates / drains the list + queues the
workqueues.

First user of this is deregistration of GuC contexts.


Does first imply there are more incoming?


Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/Makefile |  1 +
  drivers/gpu/drm/i915/gt/intel_gt.c|  3 ++
  drivers/gpu/drm/i915/gt/intel_gt_pm.c |  8 
  .../gpu/drm/i915/gt/intel_gt_pm_unpark_work.c | 35 
  .../gpu/drm/i915/gt/intel_gt_pm_unpark_work.h | 40 +++
  drivers/gpu/drm/i915/gt/intel_gt_types.h  | 10 +
  drivers/gpu/drm/i915/gt/uc/intel_guc.h|  8 ++--
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 15 +--
  drivers/gpu/drm/i915/intel_wakeref.c  |  5 +++
  drivers/gpu/drm/i915/intel_wakeref.h  |  1 +
  10 files changed, 119 insertions(+), 7 deletions(-)
  create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.c
  create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 642a5b5a1b81..579bdc069f25 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -103,6 +103,7 @@ gt-y += \
gt/intel_gt_clock_utils.o \
gt/intel_gt_irq.o \
gt/intel_gt_pm.o \
+   gt/intel_gt_pm_unpark_work.o \
gt/intel_gt_pm_irq.o \
gt/intel_gt_requests.o \
gt/intel_gtt.o \
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index 62d40c986642..7e690e74baa2 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -29,6 +29,9 @@ void intel_gt_init_early(struct intel_gt *gt, struct 
drm_i915_private *i915)
  
	spin_lock_init(&gt->irq_lock);
 
+	spin_lock_init(&gt->pm_unpark_work_lock);

+   INIT_LIST_HEAD(&gt->pm_unpark_work_list);
+
	INIT_LIST_HEAD(&gt->closed_vma);
	spin_lock_init(&gt->closed_lock);
  
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c

index dea8e2479897..564c11a3748b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -90,6 +90,13 @@ static int __gt_unpark(struct intel_wakeref *wf)
return 0;
  }
  
+static void __gt_unpark_work_queue(struct intel_wakeref *wf)

+{
+   struct intel_gt *gt = container_of(wf, typeof(*gt), wakeref);
+
+   intel_gt_pm_unpark_work_queue(gt);
+}
+
  static int __gt_park(struct intel_wakeref *wf)
  {
struct intel_gt *gt = container_of(wf, typeof(*gt), wakeref);
@@ -118,6 +125,7 @@ static int __gt_park(struct intel_wakeref *wf)
  
  static const struct intel_wakeref_ops wf_ops = {

.get = __gt_unpark,
+   .post_get = __gt_unpark_work_queue,
.put = __gt_park,
  };
  
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.c b/drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.c

new file mode 100644
index ..23162dbd0c35
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_unpark_work.c
@@ -0,0 +1,35 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#include "i915_drv.h"
+#include "intel_runtime_pm.h"
+#include "intel_gt_pm.h"
+
+void intel_gt_pm_unpark_work_queue(struct intel_gt *gt)
+{
+   struct intel_gt_pm_unpark_work *work, *next;
+   unsigned long flags;
+
+   spin_lock_irqsave(&gt->pm_unpark_work_lock, flags);
+   list_for_each_entry_safe(work, next,
+&gt->pm_unpark_work_list, link) {
+   list_del_init(&work->link);
+   queue_work(system_unbound_wq, &work->worker);
+   }
+   spin_unlock_irqrestore(&gt->pm_unpark_work_lock, flags);
+}
+
+void intel_gt_pm_unpark_work_add(struct intel_gt *gt,
+struct intel_gt_pm_unpark_work *work)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&gt->pm_unpark_work_lock, flags);
+   if (intel_gt_pm_is_awake(gt))
+   queue_work(system_unbound_wq, &work->worker);
+   else if (list_empty(&work->link))


What's the list_empty check for, something can race by design?


+   list_add_tail(&work->link, &gt->pm_unpark_work_li

Re: i915 ttm_tt shmem backend

2021-09-10 Thread Thomas Hellström
On Fri, 2021-09-10 at 10:25 +0200, Christian König wrote:
> 
> 
> Am 10.09.21 um 10:08 schrieb Thomas Hellström:
> > Perhaps some background and goal is worth mentioning here.
> > 
> > 
> > On Thu, 2021-09-09 at 17:56 +0100, Matthew Auld wrote:
> > > On Thu, 9 Sept 2021 at 17:43, Koenig, Christian
> > >  wrote:
> > > > Hi Matthew,
> > > > 
> > > > this doesn't work, I've already tried something similar.
> > > > 
> > > > TTM uses the reverse lookup functionality when migrating BOs
> > > > between system and device memory. And that doesn't seem to work
> > > > with pages from a shmem file.
> > > Hmm, what do you mean by reverse lookup functionality? Could you
> > > please point out where that is in the TTM code?
> > I think this is in unmap_mapping_range() where, if we use
> > VM_MIXEDMAP,
> > there is a reverse lookup on the PTEs that point to real pages. Now
> > that we move over to VM_PFNMAP, that problem should go away since
> > core
> > vm never has a page to investigate. Probably this is why things
> > works
> > on non-TTM i915 GEM.
> 
> Yeah, that was really likely the root problem. I didn't kept 
> investigating after realizing that my approach wouldn't work.
> 
> > @Christian: Some background here:
> > First I think that there might be things like the above that will
> > pose
> > problems, and we may or may not be able to overcome those but more
> > importantly is that we agree with you that *if* we make it work, it
> > is
> > something that you as a maintainer of TTM can accept from a design-
> > and
> > maintainabiltiy point of view.
> > 
> > The approach would be similar to the buddy allocator, we adapt some
> > driver code to TTM in a way that it may be reused with other
> > drivers,
> > and if other drivers are interested, we'd assist in moving to core
> > TTM.
> > In essence it'd be a TTM shmem page pool with full shrinking
> > ability
> > for cached pages only.
> > 
> > What we're really after here is the ability to shrink that doesn't
> > regress much w r t the elaborate shrinker that's in i915 today that
> > is
> > power management aware and is also able to start shmem writebacks
> > to
> > avoid shmem just caching the pages instead of giving them back to
> > the
> > system (IIRC it was partly the lack of this that blocked earlier
> > TTM
> > shrinking efforts).
> > 
> > And since it doesn't really matter whether the shrinker sits in
> > core
> > TTM or in a driver, I think a future goal might be a set of TTM
> > shrinker helpers that makes sure we shrink the right TTM object,
> > and
> > perhaps a simple implementation that is typically used by simple
> > drivers and other drivers can build on that for a more elaborate
> > power-
> > management aware shrinker.
> 
> That's understandable, but I think not necessary what we should aim
> for 
> in the long term.
> 
> First of all I would really like to move more of the functionality
> from 
> ttm_pool.c into the core memory management, especially handling of 
> uncached and write combined memory.
> 
> That's essentially completely architecture dependent and currently 
> implemented extremely awkwardly. Either Daniel's suggestion of having a 
> GFP_WC or Christoph's approach of moving all this into the DMA API is
> the 
> way to go here.
> 
> As long as i915 has no interest in USWC support implementing their
> own 
> shmemfile backend sounds fine to me, but I have strong doubt that
> this 
> will be of use to anybody else.

OK. Sounds fine. In situations where we use WC system memory we will
use what's in TTM today. BTW on the shrinking approach for WC pages,
does Christoph's DMA API solution envision some kind of support for
this?

/Thomas

> 
> Christian.
> 
> > 
> > /Thomas
> > 
> > 
> > 
> > > > Regards,
> > > > Christian.
> > > > 
> > > > 
> > > > Von: Matthew Auld 
> > > > Gesendet: Donnerstag, 9. September 2021 16:56
> > > > An: Christian König ; Koenig,
> > > > Christian 
> > > > Cc: Thomas Hellström ; ML
> > > > dri-
> > > > devel 
> > > > Betreff: i915 ttm_tt shmem backend
> > > > 
> > > > Hi Christian,
> > > > 
> > > > We are looking into using shmem as a ttm_tt backend in i915 for
> > > > cached
> > > > system memory objects. We would also like to make such objects
> > > > visible
> > > > to the i915-gem shrinker, so that they may be swapped out or
> > > > discarded
> > > > when under memory pressure.
> > > > 
> > > > One idea for handling this is roughly something like:
> > > > - Add a new TTM_PAGE_FLAG_SHMEM flag, or similar.
> > > > - Skip the ttm_pages_allocated accounting on such objects,
> > > > similar
> > > > to
> > > > how FLAG_SG is already handled.
> > > > - Skip all the page->mapping and page->index related bits, like
> > > > in
> > > > tt_add_mapping, since it looks like these are set and used by
> > > > shmem.
> > > > Not sure what functionally this might break, but looks like
> > > > it's
> > > > maybe
> > > > only driver specific?
> > > > - Skip calling into ttm_bo_swap_out/in and just

RE: [PATCH] drm/ttm: add a BUG_ON in ttm_set_driver_manager when array bounds are exceeded

2021-09-10 Thread Chen, Guchun
[Public]

Hi Christian and Xinhui,

Thanks for your suggestion. The cause is that I saw data corruption in several 
proprietary use cases. Would BUILD_BUG_ON behave differently across gcc 
versions?

Anyway, WARN_ON is fine to me, and I will send a new patch set soon to address 
this.

Regards,
Guchun

From: Koenig, Christian 
Sent: Friday, September 10, 2021 2:37 PM
To: Pan, Xinhui ; amd-...@lists.freedesktop.org; 
dri-devel@lists.freedesktop.org; Deucher, Alexander 
; Chen, Guchun 
Cc: Shi, Leslie 
Subject: Re: [PATCH] drm/ttm: add a BUG_ON in ttm_set_driver_manager when array 
bounds are exceeded

Yeah, that's a good point.

If build_bug_on() doesn't work for some reason then we at least need to lower 
this to a WARN_ON.

A BUG_ON() is only justified if it prevents severe data corruption, catches a 
NULL pointer dereference earlier, or similar.

Regards,
Christian.
On 10.09.21 at 06:36, Pan, Xinhui wrote:

[AMD Official Use Only]

looks good to me.
But maybe build_bug_on works too, and it would be more reasonable for detecting such wrong usage.

From: Chen, Guchun 
Sent: Friday, September 10, 2021 12:30:14 PM
To: amd-...@lists.freedesktop.org 
; 
dri-devel@lists.freedesktop.org 
; 
Koenig, Christian ; 
Pan, Xinhui ; Deucher, Alexander 

Cc: Chen, Guchun ; Shi, Leslie 

Subject: [PATCH] drm/ttm: add a BUG_ON in ttm_set_driver_manager when array 
bounds

Vendors may define their own memory types on top of TTM_PL_PRIV,
but call ttm_set_driver_manager directly without checking the
mem_type value when setting up the memory manager. So add a check
to catch the case when the index exceeds the array bounds.

Signed-off-by: Leslie Shi 
Signed-off-by: Guchun Chen 
---
 include/drm/ttm/ttm_device.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
index 7a0f561c57ee..24ad76ca8022 100644
--- a/include/drm/ttm/ttm_device.h
+++ b/include/drm/ttm/ttm_device.h
@@ -308,6 +308,7 @@ ttm_manager_type(struct ttm_device *bdev, int mem_type)
 static inline void ttm_set_driver_manager(struct ttm_device *bdev, int type,
   struct ttm_resource_manager *manager)
 {
+   BUG_ON(type >= TTM_NUM_MEM_TYPES);
 bdev->man_drv[type] = manager;
 }

--
2.17.1



Re: i915 ttm_tt shmem backend

2021-09-10 Thread Christian König

On 10.09.21 at 10:40, Thomas Hellström wrote:

On Fri, 2021-09-10 at 10:25 +0200, Christian König wrote:


On 10.09.21 at 10:08, Thomas Hellström wrote:

Perhaps some background and goal is worth mentioning here.


On Thu, 2021-09-09 at 17:56 +0100, Matthew Auld wrote:

On Thu, 9 Sept 2021 at 17:43, Koenig, Christian
 wrote:

Hi Matthew,

this doesn't work, I've already tried something similar.

TTM uses the reverse lookup functionality when migrating BOs
between system and device memory. And that doesn't seem to work
with pages from a shmem file.

Hmm, what do you mean by reverse lookup functionality? Could you
please point out where that is in the TTM code?

I think this is in unmap_mapping_range() where, if we use VM_MIXEDMAP,
there is a reverse lookup on the PTEs that point to real pages. Now
that we move over to VM_PFNMAP, that problem should go away since core
vm never has a page to investigate. Probably this is why things work
on non-TTM i915 GEM.

Yeah, that was really likely the root problem. I didn't keep
investigating after realizing that my approach wouldn't work.


@Christian: Some background here:
First, I think there might be things like the above that will pose
problems, and we may or may not be able to overcome those. But more
important is that we agree with you that *if* we make it work, it is
something that you as a maintainer of TTM can accept from a design
and maintainability point of view.

The approach would be similar to the buddy allocator: we adapt some
driver code to TTM in a way that it may be reused with other drivers,
and if other drivers are interested, we'd assist in moving it to core
TTM. In essence it'd be a TTM shmem page pool with full shrinking
ability for cached pages only.

What we're really after here is shrinking ability that doesn't
regress much w.r.t. the elaborate shrinker that's in i915 today,
which is power-management aware and is also able to start shmem
writeback, so that shmem doesn't just cache the pages instead of
giving them back to the system (IIRC it was partly the lack of this
that blocked earlier TTM shrinking efforts).

And since it doesn't really matter whether the shrinker sits in core
TTM or in a driver, I think a future goal might be a set of TTM
shrinker helpers that make sure we shrink the right TTM object, plus
a simple implementation typically used by simple drivers, which
other drivers can build on for a more elaborate power-management
aware shrinker.

That's understandable, but I think not necessarily what we should aim
for in the long term.

First of all I would really like to move more of the functionality
from
ttm_pool.c into the core memory management, especially handling of
uncached and write combined memory.

That's essentially completely architecture dependent and currently
implemented extremely awkwardly. Either Daniel's suggestion of having
a GFP_WC or Christoph's approach of moving all this into the DMA API
is the way to go here.

As long as i915 has no interest in USWC support, implementing their
own shmemfile backend sounds fine to me, but I have strong doubts
that this will be of use to anybody else.

OK. Sounds fine. In situations where we use WC system memory we will
use what's in TTM today. BTW, on the shrinking approach for WC pages,
does Christoph's DMA API solution envision some kind of support for
this?


Not Christoph's DMA API solution, but what I have in mind for the TTM 
shrinker should work.


Essentially a shmemfile per device should help in solving most of the 
issues we ran into.


Christian.



/Thomas


Christian.


/Thomas




Regards,
Christian.


From: Matthew Auld 
Sent: Thursday, 9 September 2021 16:56
To: Christian König ; Koenig,
Christian 
Cc: Thomas Hellström ; ML
dri-
devel 
Subject: i915 ttm_tt shmem backend

Hi Christian,

We are looking into using shmem as a ttm_tt backend in i915 for
cached
system memory objects. We would also like to make such objects
visible
to the i915-gem shrinker, so that they may be swapped out or
discarded
when under memory pressure.

One idea for handling this is roughly something like:
- Add a new TTM_PAGE_FLAG_SHMEM flag, or similar.
- Skip the ttm_pages_allocated accounting on such objects, similar to
how FLAG_SG is already handled.
- Skip all the page->mapping and page->index related bits, like in
tt_add_mapping, since it looks like these are set and used by shmem.
Not sure what functionality this might break, but it looks like it's
maybe only driver-specific?
- Skip calling into ttm_bo_swap_out/in and just have
ttm_populate/unpopulate handle this directly for such objects.
- Make such objects visible to the i915-gem shrinker.

Does this approach look acceptable?






Re: [Intel-gfx] [PATCH v8 16/17] drm/i915/pxp: add PXP documentation

2021-09-10 Thread Daniele Ceraolo Spurio




On 9/9/2021 2:25 PM, Rodrigo Vivi wrote:

On Thu, Sep 09, 2021 at 05:29:14AM -0700, Daniele Ceraolo Spurio wrote:

Now that all the pieces are in place we can add a description of how the
feature works. Also convert the comments in struct intel_pxp into
kerneldoc.

Signed-off-by: Daniele Ceraolo Spurio 
Cc: Daniel Vetter 
Cc: Rodrigo Vivi 
---
  Documentation/gpu/i915.rst |  8 
  drivers/gpu/drm/i915/pxp/intel_pxp.c   | 28 +
  drivers/gpu/drm/i915/pxp/intel_pxp_types.h | 47 --
  3 files changed, 71 insertions(+), 12 deletions(-)

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index 101dde3eb1ea..78ecb9d5ec20 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -471,6 +471,14 @@ Object Tiling IOCTLs
  .. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_tiling.c
 :doc: buffer object tiling
  
+Protected Objects
+-----------------
+
+.. kernel-doc:: drivers/gpu/drm/i915/pxp/intel_pxp.c
+   :doc: PXP
+
+.. kernel-doc:: drivers/gpu/drm/i915/pxp/intel_pxp_types.h
+
  Microcontrollers
  
  
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c b/drivers/gpu/drm/i915/pxp/intel_pxp.c

index d8815e91e091..4e095a9a9f07 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -11,6 +11,34 @@
  #include "gt/intel_context.h"
  #include "i915_drv.h"
  
+/**

+ * DOC: PXP
+ *
+ * PXP (Protected Xe Path) is a Gen12+ feature that allows execution and

We should start avoiding the + naming to identify this-and-newer. This will
soon conflict with some other Xe naming.

what about something like:

PXP (Protected Xe Path) is a feature available in Gen12 and newer platforms. It
allows...


ok




+ * flip to display of protected (i.e. encrypted) objects. The SW support is
+ * enabled via the CONFIG_DRM_I915_PXP kconfig.
+ *
+ * Some of the PXP setup operations are performed by the Management Engine,
+ * which is handled by the mei driver; communication between i915 and mei is
+ * performed via the mei_pxp component module.

I believe this is kind of secondary so it should go below the context buffer
and flag information. Is there any MEI mandatory command or something we should
also make sure we document here?


no commands the user cares about, I only mentioned the module dependency 
because mei_pxp has to be compiled in for stuff to work.



+ *
+ * Objects can opt-in to PXP encryption at creation time via the
+ * I915_GEM_CREATE_EXT_PROTECTED_CONTENT create_ext flag. For objects to be
+ * correctly protected they must be used in conjunction with a context created
+ * with the I915_CONTEXT_PARAM_PROTECTED_CONTENT flag. See the documentation
+ * of those two uapi flags for details and restrictions.

Instead of pointing to see their documentation, could we add some concrete
example of usage in this section? Our goal is to have documentation that
exemplifies how the UMD could really use them, without having to go to IGT
or Mesa code to check for examples.


All the other usage examples are in the uapi file, so I'm going to add 
the pxp ones there as well for consistency. The user needs to check that 
documentation anyway for the struct definitions etc.


Daniele




+ *
+ * Protected objects are tied to a pxp session; currently we only support one
+ * session, which i915 manages and whose index is available in the uapi
+ * (I915_PROTECTED_CONTENT_DEFAULT_SESSION) for use in instructions targeting
+ * protected objects.
+ * The session is invalidated by the HW when certain events occur (e.g.
+ * suspend/resume). When this happens, all the objects that were used with the
+ * session are marked as invalid and all contexts marked as using protected
+ * content are banned. Any further attempt at using them in an execbuf call is
+ * rejected, while flips are converted to black frames.
+ */
+
  /* KCR register definitions */
  #define KCR_INIT _MMIO(0x320f0)
  
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_types.h b/drivers/gpu/drm/i915/pxp/intel_pxp_types.h

index ae24064bb57e..73ef7d1754e1 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_types.h
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_types.h
@@ -16,42 +16,65 @@
  struct intel_context;
  struct i915_pxp_component;
  
+/**

+ * struct intel_pxp - pxp state
+ */
  struct intel_pxp {
+   /**
+* @pxp_component: i915_pxp_component struct of the bound mei_pxp
+* module. Only set and cleared inside component bind/unbind functions,
+* which are protected by &tee_mutex.
+*/
struct i915_pxp_component *pxp_component;
+   /**
+* @pxp_component_added: track if the pxp component has been added.
+* Set and cleared in tee init and fini functions respectively.
+*/
bool pxp_component_added;
  
+	/** @ce: kernel-owned context used for PXP operations */

struct intel_context *ce;
  
-	/*

+   /** @arb_mutex: protects arb session start */
+   struct mutex arb_mutex;

[PATCH] drm/ttm: add a WARN_ON in ttm_set_driver_manager when array bounds (v2)

2021-09-10 Thread Guchun Chen
Vendors may define their own memory types on top of TTM_PL_PRIV,
but call ttm_set_driver_manager directly without checking the
mem_type value when setting up the memory manager. So add a check
to catch the case when the index exceeds the array bounds.

v2: lower check level to WARN_ON

Signed-off-by: Leslie Shi 
Signed-off-by: Guchun Chen 
---
 include/drm/ttm/ttm_device.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
index 07d722950d5b..aa79953c807c 100644
--- a/include/drm/ttm/ttm_device.h
+++ b/include/drm/ttm/ttm_device.h
@@ -291,6 +291,7 @@ ttm_manager_type(struct ttm_device *bdev, int mem_type)
 static inline void ttm_set_driver_manager(struct ttm_device *bdev, int type,
  struct ttm_resource_manager *manager)
 {
+   WARN_ON(type >= TTM_NUM_MEM_TYPES);
bdev->man_drv[type] = manager;
 }
 
-- 
2.17.1



[PATCH v4 00/24] drm/bridge: Make panel and bridge probe order consistent

2021-09-10 Thread Maxime Ripard
Hi,

We've encountered an issue with the RaspberryPi DSI panel that prevented the
whole display driver from probing.

The issue is described in detail in the commit 7213246a803f ("drm/vc4: dsi:
Only register our component once a DSI device is attached"), but the basic idea
is that since the panel is probed through i2c, there's no synchronization
between its probe and the registration of the MIPI-DSI host it's attached to.

We initially moved the component framework registration to the MIPI-DSI Host
attach hook to make sure we register our component only when we have a DSI
device attached to our MIPI-DSI host, and then look up our DSI device in our
bind hook.

However, all the DSI bridges controlled through i2c are only registering their
associated DSI device in their bridge attach hook, meaning with our change
above, we never got that far, and therefore ended up in the same situation as
the one we were trying to fix for panels.

The best practice to avoid those issues is to register a device's functions only
after all its dependencies are live. We also shouldn't wait any longer than
necessary, to play nice with the other components that are waiting for us, so in
our case that would mean moving the DSI device registration to the bridge probe.

I also had a look at all the DSI hosts, and it seems that exynos, kirin and msm
would be affected by this and wouldn't probe anymore after those changes.
Exynos and kirin seem simple enough for a mechanical change (that still needs
to be tested), but the changes in msm seemed far more involved and I wasn't
comfortable doing them.

Let me know what you think,
Maxime

---

Changes from v3:
  - Converted exynos and kirin
  - Converted all the affected bridge drivers
  - Reworded the documentation a bit

Changes from v2:
  - Changed the approach as suggested by Andrzej, and aligned the bridge on the
panel this time.
  - Fixed some typos

Changes from v1:
  - Change the name of drm_of_get_next function to drm_of_get_bridge
  - Mention the revert of 87154ff86bf6 and squash the two patches that were
reverting that commit
  - Add some documentation
  - Make drm_panel_attach and _detach succeed when no callback is there

Maxime Ripard (24):
  drm/bridge: Add documentation sections
  drm/bridge: Document the probe issue with MIPI-DSI bridges
  drm/mipi-dsi: Create devm device registration
  drm/mipi-dsi: Create devm device attachment
  drm/bridge: adv7533: Switch to devm MIPI-DSI helpers
  drm/bridge: adv7511: Register and attach our DSI device at probe
  drm/bridge: anx7625: Switch to devm MIPI-DSI helpers
  drm/bridge: anx7625: Register and attach our DSI device at probe
  drm/bridge: lt8912b: Switch to devm MIPI-DSI helpers
  drm/bridge: lt8912b: Register and attach our DSI device at probe
  drm/bridge: lt9611: Switch to devm MIPI-DSI helpers
  drm/bridge: lt9611: Register and attach our DSI device at probe
  drm/bridge: lt9611uxc: Switch to devm MIPI-DSI helpers
  drm/bridge: lt9611uxc: Register and attach our DSI device at probe
  drm/bridge: ps8640: Switch to devm MIPI-DSI helpers
  drm/bridge: ps8640: Register and attach our DSI device at probe
  drm/bridge: sn65dsi83: Switch to devm MIPI-DSI helpers
  drm/bridge: sn65dsi83: Register and attach our DSI device at probe
  drm/bridge: sn65dsi86: Switch to devm MIPI-DSI helpers
  drm/bridge: sn65dsi86: Register and attach our DSI device at probe
  drm/bridge: tc358775: Switch to devm MIPI-DSI helpers
  drm/bridge: tc358775: Register and attach our DSI device at probe
  drm/kirin: dsi: Adjust probe order
  drm/exynos: dsi: Adjust probe order

 Documentation/gpu/drm-kms-helpers.rst|  12 +++
 drivers/gpu/drm/bridge/adv7511/adv7511.h |   1 -
 drivers/gpu/drm/bridge/adv7511/adv7511_drv.c |  15 ++-
 drivers/gpu/drm/bridge/adv7511/adv7533.c |  20 +---
 drivers/gpu/drm/bridge/analogix/anx7625.c|  40 
 drivers/gpu/drm/bridge/lontium-lt8912b.c |  31 ++
 drivers/gpu/drm/bridge/lontium-lt9611.c  |  62 +---
 drivers/gpu/drm/bridge/lontium-lt9611uxc.c   |  65 +---
 drivers/gpu/drm/bridge/parade-ps8640.c   | 101 ++-
 drivers/gpu/drm/bridge/tc358775.c|  50 +
 drivers/gpu/drm/bridge/ti-sn65dsi83.c|  86 
 drivers/gpu/drm/bridge/ti-sn65dsi86.c|  94 -
 drivers/gpu/drm/drm_bridge.c |  69 -
 drivers/gpu/drm/drm_mipi_dsi.c   |  81 +++
 drivers/gpu/drm/exynos/exynos_drm_dsi.c  |  19 ++--
 drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c |  27 +++--
 include/drm/drm_mipi_dsi.h   |   4 +
 17 files changed, 460 insertions(+), 317 deletions(-)

-- 
2.31.1



[PATCH v4 01/24] drm/bridge: Add documentation sections

2021-09-10 Thread Maxime Ripard
The bridge documentation overview is quite packed already, and we'll add
some more documentation that isn't part of an overview at all.

Let's add some sections to the documentation to separate the various bits.

Reviewed-by: Andrzej Hajda 
Reviewed-by: Sam Ravnborg 
Signed-off-by: Maxime Ripard 
---
 Documentation/gpu/drm-kms-helpers.rst |  6 ++
 drivers/gpu/drm/drm_bridge.c  | 14 +-
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/Documentation/gpu/drm-kms-helpers.rst 
b/Documentation/gpu/drm-kms-helpers.rst
index 389892f36185..10f8df7aecc0 100644
--- a/Documentation/gpu/drm-kms-helpers.rst
+++ b/Documentation/gpu/drm-kms-helpers.rst
@@ -151,6 +151,12 @@ Overview
 .. kernel-doc:: drivers/gpu/drm/drm_bridge.c
:doc: overview
 
+Display Driver Integration
+--------------------------
+
+.. kernel-doc:: drivers/gpu/drm/drm_bridge.c
+   :doc: display driver integration
+
 Bridge Operations
 -
 
diff --git a/drivers/gpu/drm/drm_bridge.c b/drivers/gpu/drm/drm_bridge.c
index a8ed66751c2d..baff74ea4a33 100644
--- a/drivers/gpu/drm/drm_bridge.c
+++ b/drivers/gpu/drm/drm_bridge.c
@@ -49,6 +49,15 @@
  * Chaining multiple bridges to the output of a bridge, or the same bridge to
  * the output of different bridges, is not supported.
  *
+ * &drm_bridge, like &drm_panel, aren't &drm_mode_object entities like planes,
+ * CRTCs, encoders or connectors and hence are not visible to userspace. They
+ * just provide additional hooks to get the desired output at the end of the
+ * encoder chain.
+ */
+
+/**
+ * DOC: display driver integration
+ *
  * Display drivers are responsible for linking encoders with the first bridge
  * in the chains. This is done by acquiring the appropriate bridge with
  * of_drm_find_bridge() or drm_of_find_panel_or_bridge(), or creating it for a
@@ -85,11 +94,6 @@
  * helper to create the &drm_connector, or implement it manually on top of the
  * connector-related operations exposed by the bridge (see the overview
  * documentation of bridge operations for more details).
- *
- * &drm_bridge, like &drm_panel, aren't &drm_mode_object entities like planes,
- * CRTCs, encoders or connectors and hence are not visible to userspace. They
- * just provide additional hooks to get the desired output at the end of the
- * encoder chain.
  */
 
 static DEFINE_MUTEX(bridge_lock);
-- 
2.31.1



[PATCH v4 02/24] drm/bridge: Document the probe issue with MIPI-DSI bridges

2021-09-10 Thread Maxime Ripard
Interactions between bridges, panels, MIPI-DSI host and the component
framework are not trivial and can lead to probing issues when
implementing a display driver. Let's document the various cases we need
to consider, and the solution to support all the cases.

Signed-off-by: Maxime Ripard 
---
 Documentation/gpu/drm-kms-helpers.rst |  6 +++
 drivers/gpu/drm/drm_bridge.c  | 57 +++
 2 files changed, 63 insertions(+)

diff --git a/Documentation/gpu/drm-kms-helpers.rst 
b/Documentation/gpu/drm-kms-helpers.rst
index 10f8df7aecc0..ec2f65b31930 100644
--- a/Documentation/gpu/drm-kms-helpers.rst
+++ b/Documentation/gpu/drm-kms-helpers.rst
@@ -157,6 +157,12 @@ Display Driver Integration
 .. kernel-doc:: drivers/gpu/drm/drm_bridge.c
:doc: display driver integration
 
+Special Care with MIPI-DSI bridges
+----------------------------------
+
+.. kernel-doc:: drivers/gpu/drm/drm_bridge.c
+   :doc: special care dsi
+
 Bridge Operations
 -
 
diff --git a/drivers/gpu/drm/drm_bridge.c b/drivers/gpu/drm/drm_bridge.c
index baff74ea4a33..7cc2d2f94ae3 100644
--- a/drivers/gpu/drm/drm_bridge.c
+++ b/drivers/gpu/drm/drm_bridge.c
@@ -96,6 +96,63 @@
  * documentation of bridge operations for more details).
  */
 
+/**
+ * DOC: special care dsi
+ *
+ * The interaction between the bridges and other frameworks involved in
+ * the probing of the upstream driver and the bridge driver can be
+ * challenging. Indeed, there are multiple cases that need to be
+ * considered:
+ *
+ * - The upstream driver doesn't use the component framework and isn't a
+ *   MIPI-DSI host. In this case, the bridge driver will probe at some
+ *   point and the upstream driver should try to probe again by returning
+ *   EPROBE_DEFER as long as the bridge driver hasn't probed.
+ *
+ * - The upstream driver doesn't use the component framework, but is a
+ *   MIPI-DSI host. The bridge device uses the MIPI-DCS commands to be
+ *   controlled. In this case, the bridge device is a child of the
+ *   display device, and when it probes it is assured that the display
+ *   device (and MIPI-DSI host) is present. The upstream driver will be
+ *   assured that the bridge driver is connected between the
+ *   &mipi_dsi_host_ops.attach and &mipi_dsi_host_ops.detach operations.
+ *   Therefore, it must run mipi_dsi_host_register() in its probe
+ *   function, and then run drm_bridge_attach() in its
+ *   &mipi_dsi_host_ops.attach hook.
+ *
+ * - The upstream driver uses the component framework and is a MIPI-DSI
+ *   host. The bridge device uses the MIPI-DCS commands to be
+ *   controlled. This is the same situation as above, and the driver
+ *   can run mipi_dsi_host_register() in either its probe or bind hook.
+ *
+ * - The upstream driver uses the component framework and is a MIPI-DSI
+ *   host. The bridge device uses a separate bus (such as I2C) to be
+ *   controlled. In this case, there's no correlation between the probe
+ *   of the bridge and upstream drivers, so care must be taken to avoid
+ *   an endless EPROBE_DEFER loop, with each driver waiting for the
+ *   other to probe.
+ *
+ * The ideal pattern to cover the last item (and all the others in the
+ * MIPI-DSI host driver case) is to split the operations like this:
+ *
+ * - The MIPI-DSI host driver must run mipi_dsi_host_register() in its
+ *   probe hook. It will make sure that the MIPI-DSI host sticks around,
+ *   and that the driver's bind can be called.
+ *
+ * - In its probe hook, the bridge driver must try to find its MIPI-DSI
+ *   host, register as a MIPI-DSI device and attach the MIPI-DSI device
+ *   to its host. The bridge driver is now functional.
+ *
+ * - In its &struct mipi_dsi_host_ops.attach hook, the MIPI-DSI host can
+ *   now add its component. Its bind hook will now be called and since
+ *   the bridge driver is attached and registered, we can now look for
+ *   and attach it.
+ *
+ * At this point, we're now certain that both the upstream driver and
+ * the bridge driver are functional and we can't have a deadlock-like
+ * situation when probing.
+ */
+
 static DEFINE_MUTEX(bridge_lock);
 static LIST_HEAD(bridge_list);
 
-- 
2.31.1



[PATCH v4 03/24] drm/mipi-dsi: Create devm device registration

2021-09-10 Thread Maxime Ripard
Devices that take their data through the MIPI-DSI bus but are controlled
through a secondary bus like I2C have to register a secondary device on
the MIPI-DSI bus through the mipi_dsi_device_register_full() function.

At removal or when an error occurs, that device needs to be removed
through a call to mipi_dsi_device_unregister().

Let's create a device-managed variant of the registration function that
will automatically unregister the device at unbind.

Reviewed-by: Andrzej Hajda 
Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/drm_mipi_dsi.c | 46 ++
 include/drm/drm_mipi_dsi.h |  3 +++
 2 files changed, 49 insertions(+)

diff --git a/drivers/gpu/drm/drm_mipi_dsi.c b/drivers/gpu/drm/drm_mipi_dsi.c
index 5dd475e82995..ddf67463eaa1 100644
--- a/drivers/gpu/drm/drm_mipi_dsi.c
+++ b/drivers/gpu/drm/drm_mipi_dsi.c
@@ -246,6 +246,52 @@ void mipi_dsi_device_unregister(struct mipi_dsi_device 
*dsi)
 }
 EXPORT_SYMBOL(mipi_dsi_device_unregister);
 
+static void devm_mipi_dsi_device_unregister(void *arg)
+{
+   struct mipi_dsi_device *dsi = arg;
+
+   mipi_dsi_device_unregister(dsi);
+}
+
+/**
+ * devm_mipi_dsi_device_register_full - create a managed MIPI DSI device
+ * @dev: device to tie the MIPI-DSI device lifetime to
+ * @host: DSI host to which this device is connected
+ * @info: pointer to template containing DSI device information
+ *
+ * Create a MIPI DSI device by using the device information provided by
+ * mipi_dsi_device_info template
+ *
+ * This is the managed version of mipi_dsi_device_register_full() which
+ * automatically calls mipi_dsi_device_unregister() when @dev is
+ * unbound.
+ *
+ * Returns:
+ * A pointer to the newly created MIPI DSI device, or, a pointer encoded
+ * with an error
+ */
+struct mipi_dsi_device *
+devm_mipi_dsi_device_register_full(struct device *dev,
+  struct mipi_dsi_host *host,
+  const struct mipi_dsi_device_info *info)
+{
+   struct mipi_dsi_device *dsi;
+   int ret;
+
+   dsi = mipi_dsi_device_register_full(host, info);
+   if (IS_ERR(dsi))
+   return dsi;
+
+   ret = devm_add_action_or_reset(dev,
+  devm_mipi_dsi_device_unregister,
+  dsi);
+   if (ret)
+   return ERR_PTR(ret);
+
+   return dsi;
+}
+EXPORT_SYMBOL_GPL(devm_mipi_dsi_device_register_full);
+
 static DEFINE_MUTEX(host_lock);
 static LIST_HEAD(host_list);
 
diff --git a/include/drm/drm_mipi_dsi.h b/include/drm/drm_mipi_dsi.h
index af7ba8071eb0..d0032e435e08 100644
--- a/include/drm/drm_mipi_dsi.h
+++ b/include/drm/drm_mipi_dsi.h
@@ -227,6 +227,9 @@ struct mipi_dsi_device *
 mipi_dsi_device_register_full(struct mipi_dsi_host *host,
  const struct mipi_dsi_device_info *info);
 void mipi_dsi_device_unregister(struct mipi_dsi_device *dsi);
+struct mipi_dsi_device *
+devm_mipi_dsi_device_register_full(struct device *dev, struct mipi_dsi_host 
*host,
+  const struct mipi_dsi_device_info *info);
 struct mipi_dsi_device *of_find_mipi_dsi_device_by_node(struct device_node 
*np);
 int mipi_dsi_attach(struct mipi_dsi_device *dsi);
 int mipi_dsi_detach(struct mipi_dsi_device *dsi);
-- 
2.31.1



[PATCH v4 04/24] drm/mipi-dsi: Create devm device attachment

2021-09-10 Thread Maxime Ripard
MIPI-DSI devices need to call mipi_dsi_attach() when their probe is done
to attach against their host.

However, at removal or when an error occurs, that attachment needs to be
undone through a call to mipi_dsi_detach().

Let's create a device-managed variant of the attachment function that
will automatically detach the device at unbind.

Reviewed-by: Andrzej Hajda 
Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/drm_mipi_dsi.c | 35 ++
 include/drm/drm_mipi_dsi.h |  1 +
 2 files changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/drm_mipi_dsi.c b/drivers/gpu/drm/drm_mipi_dsi.c
index ddf67463eaa1..18cef04df2f2 100644
--- a/drivers/gpu/drm/drm_mipi_dsi.c
+++ b/drivers/gpu/drm/drm_mipi_dsi.c
@@ -391,6 +391,41 @@ int mipi_dsi_detach(struct mipi_dsi_device *dsi)
 }
 EXPORT_SYMBOL(mipi_dsi_detach);
 
+static void devm_mipi_dsi_detach(void *arg)
+{
+   struct mipi_dsi_device *dsi = arg;
+
+   mipi_dsi_detach(dsi);
+}
+
+/**
+ * devm_mipi_dsi_attach - Attach a MIPI-DSI device to its DSI Host
+ * @dev: device to tie the MIPI-DSI device attachment lifetime to
+ * @dsi: DSI peripheral
+ *
+ * This is the managed version of mipi_dsi_attach() which automatically
+ * calls mipi_dsi_detach() when @dev is unbound.
+ *
+ * Returns:
+ * 0 on success, a negative error code on failure.
+ */
+int devm_mipi_dsi_attach(struct device *dev,
+struct mipi_dsi_device *dsi)
+{
+   int ret;
+
+   ret = mipi_dsi_attach(dsi);
+   if (ret)
+   return ret;
+
+   ret = devm_add_action_or_reset(dev, devm_mipi_dsi_detach, dsi);
+   if (ret)
+   return ret;
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(devm_mipi_dsi_attach);
+
 static ssize_t mipi_dsi_device_transfer(struct mipi_dsi_device *dsi,
struct mipi_dsi_msg *msg)
 {
diff --git a/include/drm/drm_mipi_dsi.h b/include/drm/drm_mipi_dsi.h
index d0032e435e08..147e51b6d241 100644
--- a/include/drm/drm_mipi_dsi.h
+++ b/include/drm/drm_mipi_dsi.h
@@ -233,6 +233,7 @@ devm_mipi_dsi_device_register_full(struct device *dev, 
struct mipi_dsi_host *hos
 struct mipi_dsi_device *of_find_mipi_dsi_device_by_node(struct device_node 
*np);
 int mipi_dsi_attach(struct mipi_dsi_device *dsi);
 int mipi_dsi_detach(struct mipi_dsi_device *dsi);
+int devm_mipi_dsi_attach(struct device *dev, struct mipi_dsi_device *dsi);
 int mipi_dsi_shutdown_peripheral(struct mipi_dsi_device *dsi);
 int mipi_dsi_turn_on_peripheral(struct mipi_dsi_device *dsi);
 int mipi_dsi_set_maximum_return_packet_size(struct mipi_dsi_device *dsi,
-- 
2.31.1



[PATCH v4 05/24] drm/bridge: adv7533: Switch to devm MIPI-DSI helpers

2021-09-10 Thread Maxime Ripard
Let's switch to the new devm MIPI-DSI function to register and attach
our secondary device. This also avoids leaking the device when we detach
the bridge.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/adv7511/adv7511.h |  1 -
 drivers/gpu/drm/bridge/adv7511/adv7511_drv.c |  2 --
 drivers/gpu/drm/bridge/adv7511/adv7533.c | 20 
 3 files changed, 4 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511.h 
b/drivers/gpu/drm/bridge/adv7511/adv7511.h
index 05e3abb5a0c9..592ecfcf00ca 100644
--- a/drivers/gpu/drm/bridge/adv7511/adv7511.h
+++ b/drivers/gpu/drm/bridge/adv7511/adv7511.h
@@ -401,7 +401,6 @@ void adv7533_mode_set(struct adv7511 *adv, const struct 
drm_display_mode *mode);
 int adv7533_patch_registers(struct adv7511 *adv);
 int adv7533_patch_cec_registers(struct adv7511 *adv);
 int adv7533_attach_dsi(struct adv7511 *adv);
-void adv7533_detach_dsi(struct adv7511 *adv);
 int adv7533_parse_dt(struct device_node *np, struct adv7511 *adv);
 
 #ifdef CONFIG_DRM_I2C_ADV7511_AUDIO
diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c 
b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
index 76555ae64e9c..9e3585f23cf1 100644
--- a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
+++ b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
@@ -1307,8 +1307,6 @@ static int adv7511_remove(struct i2c_client *i2c)
 {
struct adv7511 *adv7511 = i2c_get_clientdata(i2c);
 
-   if (adv7511->type == ADV7533 || adv7511->type == ADV7535)
-   adv7533_detach_dsi(adv7511);
i2c_unregister_device(adv7511->i2c_cec);
clk_disable_unprepare(adv7511->cec_clk);
 
diff --git a/drivers/gpu/drm/bridge/adv7511/adv7533.c 
b/drivers/gpu/drm/bridge/adv7511/adv7533.c
index 59d718bde8c4..eb7579dec40a 100644
--- a/drivers/gpu/drm/bridge/adv7511/adv7533.c
+++ b/drivers/gpu/drm/bridge/adv7511/adv7533.c
@@ -153,11 +153,10 @@ int adv7533_attach_dsi(struct adv7511 *adv)
return -EPROBE_DEFER;
}
 
-   dsi = mipi_dsi_device_register_full(host, &info);
+   dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
if (IS_ERR(dsi)) {
dev_err(dev, "failed to create dsi device\n");
-   ret = PTR_ERR(dsi);
-   goto err_dsi_device;
+   return PTR_ERR(dsi);
}
 
adv->dsi = dsi;
@@ -167,24 +166,13 @@ int adv7533_attach_dsi(struct adv7511 *adv)
dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_SYNC_PULSE |
  MIPI_DSI_MODE_NO_EOT_PACKET | MIPI_DSI_MODE_VIDEO_HSE;
 
-   ret = mipi_dsi_attach(dsi);
+   ret = devm_mipi_dsi_attach(dev, dsi);
if (ret < 0) {
dev_err(dev, "failed to attach dsi to host\n");
-   goto err_dsi_attach;
+   return ret;
}
 
return 0;
-
-err_dsi_attach:
-   mipi_dsi_device_unregister(dsi);
-err_dsi_device:
-   return ret;
-}
-
-void adv7533_detach_dsi(struct adv7511 *adv)
-{
-   mipi_dsi_detach(adv->dsi);
-   mipi_dsi_device_unregister(adv->dsi);
 }
 
 int adv7533_parse_dt(struct device_node *np, struct adv7511 *adv)
-- 
2.31.1



[PATCH v4 06/24] drm/bridge: adv7511: Register and attach our DSI device at probe

2021-09-10 Thread Maxime Ripard
In order to avoid any probe ordering issue, the best practice is to move
the secondary MIPI-DSI device registration and attachment to the
MIPI-DSI host at probe time. Let's do this.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/adv7511/adv7511_drv.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
index 9e3585f23cf1..f8e5da148599 100644
--- a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
+++ b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
@@ -910,9 +910,6 @@ static int adv7511_bridge_attach(struct drm_bridge *bridge,
return ret;
}
 
-   if (adv->type == ADV7533 || adv->type == ADV7535)
-   ret = adv7533_attach_dsi(adv);
-
if (adv->i2c_main->irq)
regmap_write(adv->regmap, ADV7511_REG_INT_ENABLE(0),
 ADV7511_INT0_HPD);
@@ -1288,8 +1285,18 @@ static int adv7511_probe(struct i2c_client *i2c, const struct i2c_device_id *id)
drm_bridge_add(&adv7511->bridge);
 
adv7511_audio_init(dev, adv7511);
+
+   if (adv7511->type == ADV7533 || adv7511->type == ADV7535) {
+   ret = adv7533_attach_dsi(adv7511);
+   if (ret)
+   goto err_unregister_audio;
+   }
+
return 0;
 
+err_unregister_audio:
+   adv7511_audio_exit(adv7511);
+   drm_bridge_remove(&adv7511->bridge);
 err_unregister_cec:
i2c_unregister_device(adv7511->i2c_cec);
clk_disable_unprepare(adv7511->cec_clk);
-- 
2.31.1



[PATCH v4 07/24] drm/bridge: anx7625: Switch to devm MIPI-DSI helpers

2021-09-10 Thread Maxime Ripard
Let's switch to the new devm MIPI-DSI function to register and attach
our secondary device.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/analogix/anx7625.c | 20 +---
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/bridge/analogix/anx7625.c b/drivers/gpu/drm/bridge/analogix/anx7625.c
index 1a871f6b6822..4adeb2bad03a 100644
--- a/drivers/gpu/drm/bridge/analogix/anx7625.c
+++ b/drivers/gpu/drm/bridge/analogix/anx7625.c
@@ -1316,6 +1316,7 @@ static int anx7625_attach_dsi(struct anx7625_data *ctx)
.channel = 0,
.node = NULL,
};
+   int ret;
 
DRM_DEV_DEBUG_DRIVER(dev, "attach dsi\n");
 
@@ -1325,7 +1326,7 @@ static int anx7625_attach_dsi(struct anx7625_data *ctx)
return -EINVAL;
}
 
-   dsi = mipi_dsi_device_register_full(host, &info);
+   dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
if (IS_ERR(dsi)) {
DRM_DEV_ERROR(dev, "fail to create dsi device.\n");
return -EINVAL;
@@ -1337,10 +1338,10 @@ static int anx7625_attach_dsi(struct anx7625_data *ctx)
MIPI_DSI_MODE_VIDEO_SYNC_PULSE  |
MIPI_DSI_MODE_VIDEO_HSE;
 
-   if (mipi_dsi_attach(dsi) < 0) {
+   ret = devm_mipi_dsi_attach(dev, dsi);
+   if (ret) {
DRM_DEV_ERROR(dev, "fail to attach dsi to host.\n");
-   mipi_dsi_device_unregister(dsi);
-   return -EINVAL;
+   return ret;
}
 
ctx->dsi = dsi;
@@ -1350,16 +1351,6 @@ static int anx7625_attach_dsi(struct anx7625_data *ctx)
return 0;
 }
 
-static void anx7625_bridge_detach(struct drm_bridge *bridge)
-{
-   struct anx7625_data *ctx = bridge_to_anx7625(bridge);
-
-   if (ctx->dsi) {
-   mipi_dsi_detach(ctx->dsi);
-   mipi_dsi_device_unregister(ctx->dsi);
-   }
-}
-
 static int anx7625_bridge_attach(struct drm_bridge *bridge,
 enum drm_bridge_attach_flags flags)
 {
@@ -1624,7 +1615,6 @@ static struct edid *anx7625_bridge_get_edid(struct drm_bridge *bridge,
 
 static const struct drm_bridge_funcs anx7625_bridge_funcs = {
.attach = anx7625_bridge_attach,
-   .detach = anx7625_bridge_detach,
.disable = anx7625_bridge_disable,
.mode_valid = anx7625_bridge_mode_valid,
.mode_set = anx7625_bridge_mode_set,
-- 
2.31.1



[PATCH v4 08/24] drm/bridge: anx7625: Register and attach our DSI device at probe

2021-09-10 Thread Maxime Ripard
In order to avoid any probe ordering issue, the best practice is to move
the secondary MIPI-DSI device registration and attachment to the
MIPI-DSI host at probe time. Let's do this.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/analogix/anx7625.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/bridge/analogix/anx7625.c b/drivers/gpu/drm/bridge/analogix/anx7625.c
index 4adeb2bad03a..d0317651cd75 100644
--- a/drivers/gpu/drm/bridge/analogix/anx7625.c
+++ b/drivers/gpu/drm/bridge/analogix/anx7625.c
@@ -1367,12 +1367,6 @@ static int anx7625_bridge_attach(struct drm_bridge *bridge,
return -ENODEV;
}
 
-   err = anx7625_attach_dsi(ctx);
-   if (err) {
-   DRM_DEV_ERROR(dev, "Fail to attach to dsi : %d\n", err);
-   return err;
-   }
-
if (ctx->pdata.panel_bridge) {
err = drm_bridge_attach(bridge->encoder,
ctx->pdata.panel_bridge,
@@ -1845,10 +1839,24 @@ static int anx7625_i2c_probe(struct i2c_client *client,
platform->bridge.type = DRM_MODE_CONNECTOR_eDP;
drm_bridge_add(&platform->bridge);
 
+   ret = anx7625_attach_dsi(platform);
+   if (ret) {
+   DRM_DEV_ERROR(dev, "Fail to attach to dsi : %d\n", ret);
+   goto unregister_bridge;
+   }
+
DRM_DEV_DEBUG_DRIVER(dev, "probe done\n");
 
return 0;
 
+unregister_bridge:
+   drm_bridge_remove(&platform->bridge);
+
+   if (!platform->pdata.low_power_mode)
+   pm_runtime_put_sync_suspend(&client->dev);
+
+   anx7625_unregister_i2c_dummy_clients(platform);
+
 free_wq:
if (platform->workqueue)
destroy_workqueue(platform->workqueue);
-- 
2.31.1



[PATCH v4 09/24] drm/bridge: lt8912b: Switch to devm MIPI-DSI helpers

2021-09-10 Thread Maxime Ripard
Let's switch to the new devm MIPI-DSI function to register and attach
our secondary device.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/lontium-lt8912b.c | 20 
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/bridge/lontium-lt8912b.c b/drivers/gpu/drm/bridge/lontium-lt8912b.c
index 1b0c7eaf6c84..cc968d65936b 100644
--- a/drivers/gpu/drm/bridge/lontium-lt8912b.c
+++ b/drivers/gpu/drm/bridge/lontium-lt8912b.c
@@ -472,11 +472,11 @@ static int lt8912_attach_dsi(struct lt8912 *lt)
return -EPROBE_DEFER;
}
 
-   dsi = mipi_dsi_device_register_full(host, &info);
+   dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
if (IS_ERR(dsi)) {
ret = PTR_ERR(dsi);
dev_err(dev, "failed to create dsi device (%d)\n", ret);
-   goto err_dsi_device;
+   return ret;
}
 
lt->dsi = dsi;
@@ -489,24 +489,13 @@ static int lt8912_attach_dsi(struct lt8912 *lt)
  MIPI_DSI_MODE_LPM |
  MIPI_DSI_MODE_NO_EOT_PACKET;
 
-   ret = mipi_dsi_attach(dsi);
+   ret = devm_mipi_dsi_attach(dev, dsi);
if (ret < 0) {
dev_err(dev, "failed to attach dsi to host\n");
-   goto err_dsi_attach;
+   return ret;
}
 
return 0;
-
-err_dsi_attach:
-   mipi_dsi_device_unregister(dsi);
-err_dsi_device:
-   return ret;
-}
-
-static void lt8912_detach_dsi(struct lt8912 *lt)
-{
-   mipi_dsi_detach(lt->dsi);
-   mipi_dsi_device_unregister(lt->dsi);
 }
 
 static int lt8912_bridge_connector_init(struct drm_bridge *bridge)
@@ -573,7 +562,6 @@ static void lt8912_bridge_detach(struct drm_bridge *bridge)
struct lt8912 *lt = bridge_to_lt8912(bridge);
 
if (lt->is_attached) {
-   lt8912_detach_dsi(lt);
lt8912_hard_power_off(lt);
drm_connector_unregister(&lt->connector);
drm_connector_cleanup(&lt->connector);
-- 
2.31.1



[PATCH v4 10/24] drm/bridge: lt8912b: Register and attach our DSI device at probe

2021-09-10 Thread Maxime Ripard
In order to avoid any probe ordering issue, the best practice is to move
the secondary MIPI-DSI device registration and attachment to the
MIPI-DSI host at probe time. Let's do this.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/lontium-lt8912b.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/bridge/lontium-lt8912b.c b/drivers/gpu/drm/bridge/lontium-lt8912b.c
index cc968d65936b..c642d1e02b2f 100644
--- a/drivers/gpu/drm/bridge/lontium-lt8912b.c
+++ b/drivers/gpu/drm/bridge/lontium-lt8912b.c
@@ -544,10 +544,6 @@ static int lt8912_bridge_attach(struct drm_bridge *bridge,
if (ret)
goto error;
 
-   ret = lt8912_attach_dsi(lt);
-   if (ret)
-   goto error;
-
lt->is_attached = true;
 
return 0;
@@ -706,8 +702,15 @@ static int lt8912_probe(struct i2c_client *client,
 
drm_bridge_add(&lt->bridge);
 
+   ret = lt8912_attach_dsi(lt);
+   if (ret)
+   goto err_attach;
+
return 0;
 
+err_attach:
+   drm_bridge_remove(&lt->bridge);
+   lt8912_free_i2c(lt);
 err_i2c:
lt8912_put_dt(lt);
 err_dt_parse:
-- 
2.31.1



[PATCH v4 11/24] drm/bridge: lt9611: Switch to devm MIPI-DSI helpers

2021-09-10 Thread Maxime Ripard
Let's switch to the new devm MIPI-DSI function to register and attach
our secondary device.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/lontium-lt9611.c | 24 
 1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/bridge/lontium-lt9611.c b/drivers/gpu/drm/bridge/lontium-lt9611.c
index 29b1ce2140ab..654131aca5ed 100644
--- a/drivers/gpu/drm/bridge/lontium-lt9611.c
+++ b/drivers/gpu/drm/bridge/lontium-lt9611.c
@@ -760,6 +760,7 @@ static struct mipi_dsi_device *lt9611_attach_dsi(struct lt9611 *lt9611,
const struct mipi_dsi_device_info info = { "lt9611", 0, NULL };
struct mipi_dsi_device *dsi;
struct mipi_dsi_host *host;
+   struct device *dev = lt9611->dev;
int ret;
 
host = of_find_mipi_dsi_host_by_node(dsi_node);
@@ -768,7 +769,7 @@ static struct mipi_dsi_device *lt9611_attach_dsi(struct lt9611 *lt9611,
return ERR_PTR(-EPROBE_DEFER);
}
 
-   dsi = mipi_dsi_device_register_full(host, &info);
+   dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
if (IS_ERR(dsi)) {
dev_err(lt9611->dev, "failed to create dsi device\n");
return dsi;
@@ -779,29 +780,15 @@ static struct mipi_dsi_device *lt9611_attach_dsi(struct lt9611 *lt9611,
dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_SYNC_PULSE |
  MIPI_DSI_MODE_VIDEO_HSE;
 
-   ret = mipi_dsi_attach(dsi);
+   ret = devm_mipi_dsi_attach(dev, dsi);
if (ret < 0) {
-   dev_err(lt9611->dev, "failed to attach dsi to host\n");
-   mipi_dsi_device_unregister(dsi);
+   dev_err(dev, "failed to attach dsi to host\n");
return ERR_PTR(ret);
}
 
return dsi;
 }
 
-static void lt9611_bridge_detach(struct drm_bridge *bridge)
-{
-   struct lt9611 *lt9611 = bridge_to_lt9611(bridge);
-
-   if (lt9611->dsi1) {
-   mipi_dsi_detach(lt9611->dsi1);
-   mipi_dsi_device_unregister(lt9611->dsi1);
-   }
-
-   mipi_dsi_detach(lt9611->dsi0);
-   mipi_dsi_device_unregister(lt9611->dsi0);
-}
-
 static int lt9611_connector_init(struct drm_bridge *bridge, struct lt9611 *lt9611)
 {
int ret;
@@ -855,9 +842,7 @@ static int lt9611_bridge_attach(struct drm_bridge *bridge,
return 0;
 
 err_unregister_dsi0:
-   lt9611_bridge_detach(bridge);
drm_connector_cleanup(&lt9611->connector);
-   mipi_dsi_device_unregister(lt9611->dsi0);
 
return ret;
 }
@@ -952,7 +937,6 @@ static void lt9611_bridge_hpd_enable(struct drm_bridge *bridge)
 
 static const struct drm_bridge_funcs lt9611_bridge_funcs = {
.attach = lt9611_bridge_attach,
-   .detach = lt9611_bridge_detach,
.mode_valid = lt9611_bridge_mode_valid,
.enable = lt9611_bridge_enable,
.disable = lt9611_bridge_disable,
-- 
2.31.1



[PATCH v4 12/24] drm/bridge: lt9611: Register and attach our DSI device at probe

2021-09-10 Thread Maxime Ripard
In order to avoid any probe ordering issue, the best practice is to move
the secondary MIPI-DSI device registration and attachment to the
MIPI-DSI host at probe time. Let's do this.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/lontium-lt9611.c | 38 -
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/bridge/lontium-lt9611.c b/drivers/gpu/drm/bridge/lontium-lt9611.c
index 654131aca5ed..d2f45a0f79c8 100644
--- a/drivers/gpu/drm/bridge/lontium-lt9611.c
+++ b/drivers/gpu/drm/bridge/lontium-lt9611.c
@@ -825,26 +825,7 @@ static int lt9611_bridge_attach(struct drm_bridge *bridge,
return ret;
}
 
-   /* Attach primary DSI */
-   lt9611->dsi0 = lt9611_attach_dsi(lt9611, lt9611->dsi0_node);
-   if (IS_ERR(lt9611->dsi0))
-   return PTR_ERR(lt9611->dsi0);
-
-   /* Attach secondary DSI, if specified */
-   if (lt9611->dsi1_node) {
-   lt9611->dsi1 = lt9611_attach_dsi(lt9611, lt9611->dsi1_node);
-   if (IS_ERR(lt9611->dsi1)) {
-   ret = PTR_ERR(lt9611->dsi1);
-   goto err_unregister_dsi0;
-   }
-   }
-
return 0;
-
-err_unregister_dsi0:
-   drm_connector_cleanup(&lt9611->connector);
-
-   return ret;
 }
 
 static enum drm_mode_status lt9611_bridge_mode_valid(struct drm_bridge *bridge,
@@ -1165,10 +1146,29 @@ static int lt9611_probe(struct i2c_client *client,
 
drm_bridge_add(&lt9611->bridge);
 
+   /* Attach primary DSI */
+   lt9611->dsi0 = lt9611_attach_dsi(lt9611, lt9611->dsi0_node);
+   if (IS_ERR(lt9611->dsi0)) {
+   ret = PTR_ERR(lt9611->dsi0);
+   goto err_remove_bridge;
+   }
+
+   /* Attach secondary DSI, if specified */
+   if (lt9611->dsi1_node) {
+   lt9611->dsi1 = lt9611_attach_dsi(lt9611, lt9611->dsi1_node);
+   if (IS_ERR(lt9611->dsi1)) {
+   ret = PTR_ERR(lt9611->dsi1);
+   goto err_remove_bridge;
+   }
+   }
+
lt9611_enable_hpd_interrupts(lt9611);
 
return lt9611_audio_init(dev, lt9611);
 
+err_remove_bridge:
+   drm_bridge_remove(&lt9611->bridge);
+
 err_disable_regulators:
regulator_bulk_disable(ARRAY_SIZE(lt9611->supplies), lt9611->supplies);
 
-- 
2.31.1



[PATCH v4 13/24] drm/bridge: lt9611uxc: Switch to devm MIPI-DSI helpers

2021-09-10 Thread Maxime Ripard
Let's switch to the new devm MIPI-DSI function to register and attach
our secondary device.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/lontium-lt9611uxc.c | 38 +-
 1 file changed, 8 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/bridge/lontium-lt9611uxc.c b/drivers/gpu/drm/bridge/lontium-lt9611uxc.c
index 3cac16db970f..e5083bdf4c89 100644
--- a/drivers/gpu/drm/bridge/lontium-lt9611uxc.c
+++ b/drivers/gpu/drm/bridge/lontium-lt9611uxc.c
@@ -257,17 +257,18 @@ static struct mipi_dsi_device *lt9611uxc_attach_dsi(struct lt9611uxc *lt9611uxc,
const struct mipi_dsi_device_info info = { "lt9611uxc", 0, NULL };
struct mipi_dsi_device *dsi;
struct mipi_dsi_host *host;
+   struct device *dev = lt9611uxc->dev;
int ret;
 
host = of_find_mipi_dsi_host_by_node(dsi_node);
if (!host) {
-   dev_err(lt9611uxc->dev, "failed to find dsi host\n");
+   dev_err(dev, "failed to find dsi host\n");
return ERR_PTR(-EPROBE_DEFER);
}
 
-   dsi = mipi_dsi_device_register_full(host, &info);
+   dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
if (IS_ERR(dsi)) {
-   dev_err(lt9611uxc->dev, "failed to create dsi device\n");
+   dev_err(dev, "failed to create dsi device\n");
return dsi;
}
 
@@ -276,10 +277,9 @@ static struct mipi_dsi_device *lt9611uxc_attach_dsi(struct lt9611uxc *lt9611uxc,
dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_SYNC_PULSE |
  MIPI_DSI_MODE_VIDEO_HSE;
 
-   ret = mipi_dsi_attach(dsi);
+   ret = devm_mipi_dsi_attach(dev, dsi);
if (ret < 0) {
-   dev_err(lt9611uxc->dev, "failed to attach dsi to host\n");
-   mipi_dsi_device_unregister(dsi);
+   dev_err(dev, "failed to attach dsi to host\n");
return ERR_PTR(ret);
}
 
@@ -352,19 +352,6 @@ static int lt9611uxc_connector_init(struct drm_bridge *bridge, struct lt9611uxc
	return drm_connector_attach_encoder(&lt9611uxc->connector, bridge->encoder);
 }
 
-static void lt9611uxc_bridge_detach(struct drm_bridge *bridge)
-{
-   struct lt9611uxc *lt9611uxc = bridge_to_lt9611uxc(bridge);
-
-   if (lt9611uxc->dsi1) {
-   mipi_dsi_detach(lt9611uxc->dsi1);
-   mipi_dsi_device_unregister(lt9611uxc->dsi1);
-   }
-
-   mipi_dsi_detach(lt9611uxc->dsi0);
-   mipi_dsi_device_unregister(lt9611uxc->dsi0);
-}
-
 static int lt9611uxc_bridge_attach(struct drm_bridge *bridge,
   enum drm_bridge_attach_flags flags)
 {
@@ -385,19 +372,11 @@ static int lt9611uxc_bridge_attach(struct drm_bridge *bridge,
/* Attach secondary DSI, if specified */
if (lt9611uxc->dsi1_node) {
lt9611uxc->dsi1 = lt9611uxc_attach_dsi(lt9611uxc, lt9611uxc->dsi1_node);
-   if (IS_ERR(lt9611uxc->dsi1)) {
-   ret = PTR_ERR(lt9611uxc->dsi1);
-   goto err_unregister_dsi0;
-   }
+   if (IS_ERR(lt9611uxc->dsi1))
+   return PTR_ERR(lt9611uxc->dsi1);
}
 
return 0;
-
-err_unregister_dsi0:
-   mipi_dsi_detach(lt9611uxc->dsi0);
-   mipi_dsi_device_unregister(lt9611uxc->dsi0);
-
-   return ret;
 }
 
 static enum drm_mode_status
@@ -541,7 +520,6 @@ static struct edid *lt9611uxc_bridge_get_edid(struct drm_bridge *bridge,
 
 static const struct drm_bridge_funcs lt9611uxc_bridge_funcs = {
.attach = lt9611uxc_bridge_attach,
-   .detach = lt9611uxc_bridge_detach,
.mode_valid = lt9611uxc_bridge_mode_valid,
.mode_set = lt9611uxc_bridge_mode_set,
.detect = lt9611uxc_bridge_detect,
-- 
2.31.1



[PATCH v4 14/24] drm/bridge: lt9611uxc: Register and attach our DSI device at probe

2021-09-10 Thread Maxime Ripard
In order to avoid any probe ordering issue, the best practice is to move
the secondary MIPI-DSI device registration and attachment to the
MIPI-DSI host at probe time. Let's do this.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/lontium-lt9611uxc.c | 31 +-
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/bridge/lontium-lt9611uxc.c b/drivers/gpu/drm/bridge/lontium-lt9611uxc.c
index e5083bdf4c89..78c4175e0a12 100644
--- a/drivers/gpu/drm/bridge/lontium-lt9611uxc.c
+++ b/drivers/gpu/drm/bridge/lontium-lt9611uxc.c
@@ -364,18 +364,6 @@ static int lt9611uxc_bridge_attach(struct drm_bridge *bridge,
return ret;
}
 
-   /* Attach primary DSI */
-   lt9611uxc->dsi0 = lt9611uxc_attach_dsi(lt9611uxc, lt9611uxc->dsi0_node);
-   if (IS_ERR(lt9611uxc->dsi0))
-   return PTR_ERR(lt9611uxc->dsi0);
-
-   /* Attach secondary DSI, if specified */
-   if (lt9611uxc->dsi1_node) {
-   lt9611uxc->dsi1 = lt9611uxc_attach_dsi(lt9611uxc, lt9611uxc->dsi1_node);
-   if (IS_ERR(lt9611uxc->dsi1))
-   return PTR_ERR(lt9611uxc->dsi1);
-   }
-
return 0;
 }
 
@@ -955,8 +943,27 @@ static int lt9611uxc_probe(struct i2c_client *client,
 
drm_bridge_add(&lt9611uxc->bridge);
 
+   /* Attach primary DSI */
+   lt9611uxc->dsi0 = lt9611uxc_attach_dsi(lt9611uxc, lt9611uxc->dsi0_node);
+   if (IS_ERR(lt9611uxc->dsi0)) {
+   ret = PTR_ERR(lt9611uxc->dsi0);
+   goto err_remove_bridge;
+   }
+
+   /* Attach secondary DSI, if specified */
+   if (lt9611uxc->dsi1_node) {
+   lt9611uxc->dsi1 = lt9611uxc_attach_dsi(lt9611uxc, lt9611uxc->dsi1_node);
+   if (IS_ERR(lt9611uxc->dsi1)) {
+   ret = PTR_ERR(lt9611uxc->dsi1);
+   goto err_remove_bridge;
+   }
+   }
+
return lt9611uxc_audio_init(dev, lt9611uxc);
 
+err_remove_bridge:
+   drm_bridge_remove(&lt9611uxc->bridge);
+
 err_disable_regulators:
regulator_bulk_disable(ARRAY_SIZE(lt9611uxc->supplies), lt9611uxc->supplies);
 
-- 
2.31.1



[PATCH v4 15/24] drm/bridge: ps8640: Switch to devm MIPI-DSI helpers

2021-09-10 Thread Maxime Ripard
Let's switch to the new devm MIPI-DSI function to register and attach
our secondary device. This also avoids leaking the device on removal.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/parade-ps8640.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/bridge/parade-ps8640.c b/drivers/gpu/drm/bridge/parade-ps8640.c
index 685e9c38b2db..c943045f3370 100644
--- a/drivers/gpu/drm/bridge/parade-ps8640.c
+++ b/drivers/gpu/drm/bridge/parade-ps8640.c
@@ -243,7 +243,7 @@ static int ps8640_bridge_attach(struct drm_bridge *bridge,
if (!host)
return -ENODEV;
 
-   dsi = mipi_dsi_device_register_full(host, &info);
+   dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
if (IS_ERR(dsi)) {
dev_err(dev, "failed to create dsi device\n");
ret = PTR_ERR(dsi);
@@ -257,17 +257,13 @@ static int ps8640_bridge_attach(struct drm_bridge *bridge,
  MIPI_DSI_MODE_VIDEO_SYNC_PULSE;
dsi->format = MIPI_DSI_FMT_RGB888;
dsi->lanes = NUM_MIPI_LANES;
-   ret = mipi_dsi_attach(dsi);
+   ret = devm_mipi_dsi_attach(dev, dsi);
if (ret)
-   goto err_dsi_attach;
+   return ret;
 
/* Attach the panel-bridge to the dsi bridge */
return drm_bridge_attach(bridge->encoder, ps_bridge->panel_bridge,
 &ps_bridge->bridge, flags);
-
-err_dsi_attach:
-   mipi_dsi_device_unregister(dsi);
-   return ret;
 }
 
 static struct edid *ps8640_bridge_get_edid(struct drm_bridge *bridge,
-- 
2.31.1



[PATCH v4 16/24] drm/bridge: ps8640: Register and attach our DSI device at probe

2021-09-10 Thread Maxime Ripard
In order to avoid any probe ordering issue, the best practice is to move
the secondary MIPI-DSI device registration and attachment to the
MIPI-DSI host at probe time. Let's do this.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/parade-ps8640.c | 97 +++---
 1 file changed, 55 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/bridge/parade-ps8640.c b/drivers/gpu/drm/bridge/parade-ps8640.c
index c943045f3370..8d161b6cdbb2 100644
--- a/drivers/gpu/drm/bridge/parade-ps8640.c
+++ b/drivers/gpu/drm/bridge/parade-ps8640.c
@@ -215,52 +215,10 @@ static int ps8640_bridge_attach(struct drm_bridge *bridge,
enum drm_bridge_attach_flags flags)
 {
struct ps8640 *ps_bridge = bridge_to_ps8640(bridge);
-   struct device *dev = &ps_bridge->page[0]->dev;
-   struct device_node *in_ep, *dsi_node;
-   struct mipi_dsi_device *dsi;
-   struct mipi_dsi_host *host;
-   int ret;
-   const struct mipi_dsi_device_info info = { .type = "ps8640",
-  .channel = 0,
-  .node = NULL,
-};
 
if (!(flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR))
return -EINVAL;
 
-   /* port@0 is ps8640 dsi input port */
-   in_ep = of_graph_get_endpoint_by_regs(dev->of_node, 0, -1);
-   if (!in_ep)
-   return -ENODEV;
-
-   dsi_node = of_graph_get_remote_port_parent(in_ep);
-   of_node_put(in_ep);
-   if (!dsi_node)
-   return -ENODEV;
-
-   host = of_find_mipi_dsi_host_by_node(dsi_node);
-   of_node_put(dsi_node);
-   if (!host)
-   return -ENODEV;
-
-   dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
-   if (IS_ERR(dsi)) {
-   dev_err(dev, "failed to create dsi device\n");
-   ret = PTR_ERR(dsi);
-   return ret;
-   }
-
-   ps_bridge->dsi = dsi;
-
-   dsi->host = host;
-   dsi->mode_flags = MIPI_DSI_MODE_VIDEO |
- MIPI_DSI_MODE_VIDEO_SYNC_PULSE;
-   dsi->format = MIPI_DSI_FMT_RGB888;
-   dsi->lanes = NUM_MIPI_LANES;
-   ret = devm_mipi_dsi_attach(dev, dsi);
-   if (ret)
-   return ret;
-
/* Attach the panel-bridge to the dsi bridge */
return drm_bridge_attach(bridge->encoder, ps_bridge->panel_bridge,
 &ps_bridge->bridge, flags);
@@ -307,6 +265,53 @@ static const struct drm_bridge_funcs ps8640_bridge_funcs = {
.pre_enable = ps8640_pre_enable,
 };
 
+static int ps8640_bridge_host_attach(struct device *dev, struct ps8640 *ps_bridge)
+{
+   struct device_node *in_ep, *dsi_node;
+   struct mipi_dsi_device *dsi;
+   struct mipi_dsi_host *host;
+   int ret;
+   const struct mipi_dsi_device_info info = { .type = "ps8640",
+  .channel = 0,
+  .node = NULL,
+};
+
+   /* port@0 is ps8640 dsi input port */
+   in_ep = of_graph_get_endpoint_by_regs(dev->of_node, 0, -1);
+   if (!in_ep)
+   return -ENODEV;
+
+   dsi_node = of_graph_get_remote_port_parent(in_ep);
+   of_node_put(in_ep);
+   if (!dsi_node)
+   return -ENODEV;
+
+   host = of_find_mipi_dsi_host_by_node(dsi_node);
+   of_node_put(dsi_node);
+   if (!host)
+   return -EPROBE_DEFER;
+
+   dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
+   if (IS_ERR(dsi)) {
+   dev_err(dev, "failed to create dsi device\n");
+   return PTR_ERR(dsi);
+   }
+
+   ps_bridge->dsi = dsi;
+
+   dsi->host = host;
+   dsi->mode_flags = MIPI_DSI_MODE_VIDEO |
+ MIPI_DSI_MODE_VIDEO_SYNC_PULSE;
+   dsi->format = MIPI_DSI_FMT_RGB888;
+   dsi->lanes = NUM_MIPI_LANES;
+
+   ret = devm_mipi_dsi_attach(dev, dsi);
+   if (ret)
+   return ret;
+
+   return 0;
+}
+
 static int ps8640_probe(struct i2c_client *client)
 {
struct device *dev = &client->dev;
@@ -373,7 +378,15 @@ static int ps8640_probe(struct i2c_client *client)
 
drm_bridge_add(&ps_bridge->bridge);
 
+   ret = ps8640_bridge_host_attach(dev, ps_bridge);
+   if (ret)
+   goto err_bridge_remove;
+
return 0;
+
+err_bridge_remove:
+   drm_bridge_remove(&ps_bridge->bridge);
+   return ret;
 }
 
 static int ps8640_remove(struct i2c_client *client)
-- 
2.31.1



[PATCH v4 17/24] drm/bridge: sn65dsi83: Switch to devm MIPI-DSI helpers

2021-09-10 Thread Maxime Ripard
Let's switch to the new devm MIPI-DSI function to register and attach
our secondary device. This also avoids leaking the device when we detach
the bridge but don't remove its driver.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/ti-sn65dsi83.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
index a32f70bc68ea..db4d39082705 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
@@ -262,7 +262,7 @@ static int sn65dsi83_attach(struct drm_bridge *bridge,
return -EPROBE_DEFER;
}
 
-   dsi = mipi_dsi_device_register_full(host, &info);
+   dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
if (IS_ERR(dsi)) {
return dev_err_probe(dev, PTR_ERR(dsi),
 "failed to create dsi device\n");
@@ -274,18 +274,14 @@ static int sn65dsi83_attach(struct drm_bridge *bridge,
dsi->format = MIPI_DSI_FMT_RGB888;
dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST;
 
-   ret = mipi_dsi_attach(dsi);
+   ret = devm_mipi_dsi_attach(dev, dsi);
if (ret < 0) {
dev_err(dev, "failed to attach dsi to host\n");
-   goto err_dsi_attach;
+   return ret;
}
 
return drm_bridge_attach(bridge->encoder, ctx->panel_bridge,
 &ctx->bridge, flags);
-
-err_dsi_attach:
-   mipi_dsi_device_unregister(dsi);
-   return ret;
 }
 
 static void sn65dsi83_atomic_pre_enable(struct drm_bridge *bridge,
@@ -697,8 +693,6 @@ static int sn65dsi83_remove(struct i2c_client *client)
 {
struct sn65dsi83 *ctx = i2c_get_clientdata(client);
 
-   mipi_dsi_detach(ctx->dsi);
-   mipi_dsi_device_unregister(ctx->dsi);
drm_bridge_remove(&ctx->bridge);
of_node_put(ctx->host_node);
 
-- 
2.31.1



[PATCH v4 18/24] drm/bridge: sn65dsi83: Register and attach our DSI device at probe

2021-09-10 Thread Maxime Ripard
In order to avoid any probe ordering issue, the best practice is to move
the secondary MIPI-DSI device registration and attachment to the
MIPI-DSI host at probe time. Let's do this.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/ti-sn65dsi83.c | 80 +++
 1 file changed, 46 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
index db4d39082705..f951eb19767b 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
@@ -245,40 +245,6 @@ static int sn65dsi83_attach(struct drm_bridge *bridge,
enum drm_bridge_attach_flags flags)
 {
struct sn65dsi83 *ctx = bridge_to_sn65dsi83(bridge);
-   struct device *dev = ctx->dev;
-   struct mipi_dsi_device *dsi;
-   struct mipi_dsi_host *host;
-   int ret = 0;
-
-   const struct mipi_dsi_device_info info = {
-   .type = "sn65dsi83",
-   .channel = 0,
-   .node = NULL,
-   };
-
-   host = of_find_mipi_dsi_host_by_node(ctx->host_node);
-   if (!host) {
-   dev_err(dev, "failed to find dsi host\n");
-   return -EPROBE_DEFER;
-   }
-
-   dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
-   if (IS_ERR(dsi)) {
-   return dev_err_probe(dev, PTR_ERR(dsi),
-"failed to create dsi device\n");
-   }
-
-   ctx->dsi = dsi;
-
-   dsi->lanes = ctx->dsi_lanes;
-   dsi->format = MIPI_DSI_FMT_RGB888;
-   dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST;
-
-   ret = devm_mipi_dsi_attach(dev, dsi);
-   if (ret < 0) {
-   dev_err(dev, "failed to attach dsi to host\n");
-   return ret;
-   }
 
return drm_bridge_attach(bridge->encoder, ctx->panel_bridge,
 &ctx->bridge, flags);
@@ -646,6 +612,44 @@ static int sn65dsi83_parse_dt(struct sn65dsi83 *ctx, enum sn65dsi83_model model)
return 0;
 }
 
+static int sn65dsi83_host_attach(struct sn65dsi83 *ctx)
+{
+   struct device *dev = ctx->dev;
+   struct mipi_dsi_device *dsi;
+   struct mipi_dsi_host *host;
+   const struct mipi_dsi_device_info info = {
+   .type = "sn65dsi83",
+   .channel = 0,
+   .node = NULL,
+   };
+   int ret;
+
+   host = of_find_mipi_dsi_host_by_node(ctx->host_node);
+   if (!host) {
+   dev_err(dev, "failed to find dsi host\n");
+   return -EPROBE_DEFER;
+   }
+
+   dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
+   if (IS_ERR(dsi))
+   return dev_err_probe(dev, PTR_ERR(dsi),
+"failed to create dsi device\n");
+
+   ctx->dsi = dsi;
+
+   dsi->lanes = ctx->dsi_lanes;
+   dsi->format = MIPI_DSI_FMT_RGB888;
+   dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST;
+
+   ret = devm_mipi_dsi_attach(dev, dsi);
+   if (ret < 0) {
+   dev_err(dev, "failed to attach dsi to host: %d\n", ret);
+   return ret;
+   }
+
+   return 0;
+}
+
 static int sn65dsi83_probe(struct i2c_client *client,
   const struct i2c_device_id *id)
 {
@@ -686,7 +690,15 @@ static int sn65dsi83_probe(struct i2c_client *client,
ctx->bridge.of_node = dev->of_node;
drm_bridge_add(&ctx->bridge);
 
+   ret = sn65dsi83_host_attach(ctx);
+   if (ret)
+   goto err_remove_bridge;
+
return 0;
+
+err_remove_bridge:
+   drm_bridge_remove(&ctx->bridge);
+   return ret;
 }
 
 static int sn65dsi83_remove(struct i2c_client *client)
-- 
2.31.1



[PATCH v4 19/24] drm/bridge: sn65dsi86: Switch to devm MIPI-DSI helpers

2021-09-10 Thread Maxime Ripard
Let's switch to the new devm MIPI-DSI function to register and attach
our secondary device. This also avoids leaking the device when we detach
the bridge.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 22 +++---
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index 41d48a393e7f..b5662269ff95 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -674,6 +674,7 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
struct ti_sn65dsi86 *pdata = bridge_to_ti_sn65dsi86(bridge);
struct mipi_dsi_host *host;
struct mipi_dsi_device *dsi;
+   struct device *dev = pdata->dev;
const struct mipi_dsi_device_info info = { .type = "ti_sn_bridge",
   .channel = 0,
   .node = NULL,
@@ -713,7 +714,7 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
goto err_dsi_host;
}
 
-   dsi = mipi_dsi_device_register_full(host, &info);
+   dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
if (IS_ERR(dsi)) {
DRM_ERROR("failed to create dsi device\n");
ret = PTR_ERR(dsi);
@@ -726,16 +727,16 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
dsi->mode_flags = MIPI_DSI_MODE_VIDEO;
 
/* check if continuous dsi clock is required or not */
-   pm_runtime_get_sync(pdata->dev);
+   pm_runtime_get_sync(dev);
regmap_read(pdata->regmap, SN_DPPLL_SRC_REG, &val);
-   pm_runtime_put_autosuspend(pdata->dev);
+   pm_runtime_put_autosuspend(dev);
if (!(val & DPPLL_CLK_SRC_DSICLK))
dsi->mode_flags |= MIPI_DSI_CLOCK_NON_CONTINUOUS;
 
-   ret = mipi_dsi_attach(dsi);
+   ret = devm_mipi_dsi_attach(dev, dsi);
if (ret < 0) {
DRM_ERROR("failed to attach dsi to host\n");
-   goto err_dsi_attach;
+   goto err_dsi_host;
}
pdata->dsi = dsi;
 
@@ -746,14 +747,10 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
ret = drm_bridge_attach(bridge->encoder, pdata->next_bridge,
&pdata->bridge, flags);
if (ret < 0)
-   goto err_dsi_detach;
+   goto err_dsi_host;
 
return 0;
 
-err_dsi_detach:
-   mipi_dsi_detach(dsi);
-err_dsi_attach:
-   mipi_dsi_device_unregister(dsi);
 err_dsi_host:
drm_connector_cleanup(&pdata->connector);
 err_conn_init:
@@ -1236,11 +1233,6 @@ static void ti_sn_bridge_remove(struct auxiliary_device *adev)
if (!pdata)
return;
 
-   if (pdata->dsi) {
-   mipi_dsi_detach(pdata->dsi);
-   mipi_dsi_device_unregister(pdata->dsi);
-   }
-
drm_bridge_remove(&pdata->bridge);
 
of_node_put(pdata->host_node);
-- 
2.31.1



[PATCH v4 21/24] drm/bridge: tc358775: Switch to devm MIPI-DSI helpers

2021-09-10 Thread Maxime Ripard
Let's switch to the new devm MIPI-DSI functions to register and attach
our secondary device. This also avoids leaking the device when we detach
the bridge.
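
As a side note for readers unfamiliar with the devm pattern this patch relies on:
a device-managed registration queues its own cleanup against the device, so the
error labels and the explicit detach/unregister calls can disappear. A minimal
toy sketch of that idea (all names here are stand-ins, not the real kernel API):

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ACTIONS 8

/* Miniature of the devm action list: cleanup callbacks registered
 * against a device run automatically, in reverse order, when the
 * device goes away. */
struct toy_device {
    void (*action[MAX_ACTIONS])(void *);
    void *data[MAX_ACTIONS];
    int nactions;
};

static int toy_devm_add_action(struct toy_device *dev,
                               void (*fn)(void *), void *data)
{
    if (dev->nactions >= MAX_ACTIONS)
        return -1;
    dev->action[dev->nactions] = fn;
    dev->data[dev->nactions] = data;
    dev->nactions++;
    return 0;
}

static void toy_device_release(struct toy_device *dev)
{
    /* Run the queued cleanups in reverse registration order. */
    while (dev->nactions > 0) {
        dev->nactions--;
        dev->action[dev->nactions](dev->data[dev->nactions]);
    }
}

/* Stand-ins for mipi_dsi_device_unregister() and mipi_dsi_detach(). */
static int toy_unregistered, toy_detached;
static void toy_unregister(void *unused) { (void)unused; toy_unregistered = 1; }
static void toy_detach(void *unused) { (void)unused; toy_detached = 1; }

/* The attach path only queues cleanups; no error labels needed. */
static int toy_attach(struct toy_device *dev)
{
    if (toy_devm_add_action(dev, toy_unregister, NULL)) /* register_full */
        return -1;
    if (toy_devm_add_action(dev, toy_detach, NULL))     /* attach */
        return -1;
    return 0;
}
```

Releasing the toy device runs detach before unregister, the same reverse order
the removed error labels used to enforce by hand.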

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/tc358775.c | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/bridge/tc358775.c 
b/drivers/gpu/drm/bridge/tc358775.c
index 2272adcc5b4a..35e66d1b6456 100644
--- a/drivers/gpu/drm/bridge/tc358775.c
+++ b/drivers/gpu/drm/bridge/tc358775.c
@@ -610,11 +610,10 @@ static int tc_bridge_attach(struct drm_bridge *bridge,
return -EPROBE_DEFER;
}
 
-   dsi = mipi_dsi_device_register_full(host, &info);
+   dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
if (IS_ERR(dsi)) {
dev_err(dev, "failed to create dsi device\n");
-   ret = PTR_ERR(dsi);
-   goto err_dsi_device;
+   return PTR_ERR(dsi);
}
 
tc->dsi = dsi;
@@ -623,19 +622,15 @@ static int tc_bridge_attach(struct drm_bridge *bridge,
dsi->format = MIPI_DSI_FMT_RGB888;
dsi->mode_flags = MIPI_DSI_MODE_VIDEO;
 
-   ret = mipi_dsi_attach(dsi);
+   ret = devm_mipi_dsi_attach(dev, dsi);
if (ret < 0) {
dev_err(dev, "failed to attach dsi to host\n");
-   goto err_dsi_attach;
+   return ret;
}
 
/* Attach the panel-bridge to the dsi bridge */
return drm_bridge_attach(bridge->encoder, tc->panel_bridge,
 &tc->bridge, flags);
-err_dsi_attach:
-   mipi_dsi_device_unregister(dsi);
-err_dsi_device:
-   return ret;
 }
 
 static const struct drm_bridge_funcs tc_bridge_funcs = {
-- 
2.31.1



[PATCH v4 20/24] drm/bridge: sn65dsi86: Register and attach our DSI device at probe

2021-09-10 Thread Maxime Ripard
In order to avoid any probe ordering issue, the best practice is to move
the secondary MIPI-DSI device registration and attachment to the
MIPI-DSI host at probe time. Let's do this.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 74 ++-
 1 file changed, 38 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index b5662269ff95..7f71329536a2 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -667,58 +667,27 @@ static struct ti_sn65dsi86 *bridge_to_ti_sn65dsi86(struct 
drm_bridge *bridge)
return container_of(bridge, struct ti_sn65dsi86, bridge);
 }
 
-static int ti_sn_bridge_attach(struct drm_bridge *bridge,
-  enum drm_bridge_attach_flags flags)
+static int ti_sn_attach_host(struct ti_sn65dsi86 *pdata)
 {
int ret, val;
-   struct ti_sn65dsi86 *pdata = bridge_to_ti_sn65dsi86(bridge);
struct mipi_dsi_host *host;
struct mipi_dsi_device *dsi;
struct device *dev = pdata->dev;
const struct mipi_dsi_device_info info = { .type = "ti_sn_bridge",
   .channel = 0,
   .node = NULL,
-};
+   };
 
-   if (flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR) {
-   DRM_ERROR("Fix bridge driver to make connector optional!");
-   return -EINVAL;
-   }
-
-   pdata->aux.drm_dev = bridge->dev;
-   ret = drm_dp_aux_register(&pdata->aux);
-   if (ret < 0) {
-   drm_err(bridge->dev, "Failed to register DP AUX channel: %d\n", 
ret);
-   return ret;
-   }
-
-   ret = ti_sn_bridge_connector_init(pdata);
-   if (ret < 0)
-   goto err_conn_init;
-
-   /*
-* TODO: ideally finding host resource and dsi dev registration needs
-* to be done in bridge probe. But some existing DSI host drivers will
-* wait for any of the drm_bridge/drm_panel to get added to the global
-* bridge/panel list, before completing their probe. So if we do the
-* dsi dev registration part in bridge probe, before populating in
-* the global bridge list, then it will cause deadlock as dsi host probe
-* will never complete, neither our bridge probe. So keeping it here
-* will satisfy most of the existing host drivers. Once the host driver
-* is fixed we can move the below code to bridge probe safely.
-*/
host = of_find_mipi_dsi_host_by_node(pdata->host_node);
if (!host) {
DRM_ERROR("failed to find dsi host\n");
-   ret = -ENODEV;
-   goto err_dsi_host;
+   return -ENODEV;
}
 
dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
if (IS_ERR(dsi)) {
DRM_ERROR("failed to create dsi device\n");
-   ret = PTR_ERR(dsi);
-   goto err_dsi_host;
+   return PTR_ERR(dsi);
}
 
/* TODO: setting to 4 MIPI lanes always for now */
@@ -736,10 +705,35 @@ static int ti_sn_bridge_attach(struct drm_bridge *bridge,
ret = devm_mipi_dsi_attach(dev, dsi);
if (ret < 0) {
DRM_ERROR("failed to attach dsi to host\n");
-   goto err_dsi_host;
+   return ret;
}
pdata->dsi = dsi;
 
+   return 0;
+}
+
+static int ti_sn_bridge_attach(struct drm_bridge *bridge,
+  enum drm_bridge_attach_flags flags)
+{
+   struct ti_sn65dsi86 *pdata = bridge_to_ti_sn65dsi86(bridge);
+   int ret;
+
+   if (flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR) {
+   DRM_ERROR("Fix bridge driver to make connector optional!");
+   return -EINVAL;
+   }
+
+   pdata->aux.drm_dev = bridge->dev;
+   ret = drm_dp_aux_register(&pdata->aux);
+   if (ret < 0) {
+   drm_err(bridge->dev, "Failed to register DP AUX channel: %d\n", 
ret);
+   return ret;
+   }
+
+   ret = ti_sn_bridge_connector_init(pdata);
+   if (ret < 0)
+   goto err_conn_init;
+
/* We never want the next bridge to *also* create a connector: */
flags |= DRM_BRIDGE_ATTACH_NO_CONNECTOR;
 
@@ -1223,7 +1217,15 @@ static int ti_sn_bridge_probe(struct auxiliary_device 
*adev,
 
drm_bridge_add(&pdata->bridge);
 
+   ret = ti_sn_attach_host(pdata);
+   if (ret)
+   goto err_remove_bridge;
+
return 0;
+
+err_remove_bridge:
+   drm_bridge_remove(&pdata->bridge);
+   return ret;
 }
 
 static void ti_sn_bridge_remove(struct auxiliary_device *adev)
-- 
2.31.1



[PATCH v4 22/24] drm/bridge: tc358775: Register and attach our DSI device at probe

2021-09-10 Thread Maxime Ripard
In order to avoid any probe ordering issue, the best practice is to move
the secondary MIPI-DSI device registration and attachment to the
MIPI-DSI host at probe time. Let's do this.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/bridge/tc358775.c | 37 +--
 1 file changed, 25 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/bridge/tc358775.c 
b/drivers/gpu/drm/bridge/tc358775.c
index 35e66d1b6456..2c76331b251d 100644
--- a/drivers/gpu/drm/bridge/tc358775.c
+++ b/drivers/gpu/drm/bridge/tc358775.c
@@ -594,11 +594,26 @@ static int tc_bridge_attach(struct drm_bridge *bridge,
enum drm_bridge_attach_flags flags)
 {
struct tc_data *tc = bridge_to_tc(bridge);
+
+   /* Attach the panel-bridge to the dsi bridge */
+   return drm_bridge_attach(bridge->encoder, tc->panel_bridge,
+&tc->bridge, flags);
+}
+
+static const struct drm_bridge_funcs tc_bridge_funcs = {
+   .attach = tc_bridge_attach,
+   .pre_enable = tc_bridge_pre_enable,
+   .enable = tc_bridge_enable,
+   .mode_valid = tc_mode_valid,
+   .post_disable = tc_bridge_post_disable,
+};
+
+static int tc_attach_host(struct tc_data *tc)
+{
struct device *dev = &tc->i2c->dev;
struct mipi_dsi_host *host;
struct mipi_dsi_device *dsi;
int ret;
-
const struct mipi_dsi_device_info info = { .type = "tc358775",
.channel = 0,
.node = NULL,
@@ -628,19 +643,9 @@ static int tc_bridge_attach(struct drm_bridge *bridge,
return ret;
}
 
-   /* Attach the panel-bridge to the dsi bridge */
-   return drm_bridge_attach(bridge->encoder, tc->panel_bridge,
-&tc->bridge, flags);
+   return 0;
 }
 
-static const struct drm_bridge_funcs tc_bridge_funcs = {
-   .attach = tc_bridge_attach,
-   .pre_enable = tc_bridge_pre_enable,
-   .enable = tc_bridge_enable,
-   .mode_valid = tc_mode_valid,
-   .post_disable = tc_bridge_post_disable,
-};
-
 static int tc_probe(struct i2c_client *client, const struct i2c_device_id *id)
 {
struct device *dev = &client->dev;
@@ -704,7 +709,15 @@ static int tc_probe(struct i2c_client *client, const 
struct i2c_device_id *id)
 
i2c_set_clientdata(client, tc);
 
+   ret = tc_attach_host(tc);
+   if (ret)
+   goto err_bridge_remove;
+
return 0;
+
+err_bridge_remove:
+   drm_bridge_remove(&tc->bridge);
+   return ret;
 }
 
 static int tc_remove(struct i2c_client *client)
-- 
2.31.1



[PATCH v4 23/24] drm/kirin: dsi: Adjust probe order

2021-09-10 Thread Maxime Ripard
Without proper care and an agreement on how DSI host and device
drivers register their MIPI-DSI entities and potential components, we can
end up in a situation where the drivers can never probe.

Most drivers were taking evasive maneuvers to try to work around this,
but not all of them were following the same conventions, resulting in
various incompatibilities between DSI hosts and devices.

Now that we have a sequence agreed upon and documented, let's convert
kirin to it.
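
The agreed sequence can be sketched with a toy model: the DSI host registers
itself at probe time, and the component is only added from the host's attach
callback once the peripheral shows up. This is an illustrative reduction with
made-up names, not the kirin driver itself:

```c
#include <assert.h>

#define TOY_EPROBE_DEFER (-517) /* same value as the kernel's -EPROBE_DEFER */

static int host_registered;  /* mipi_dsi_host_register() done */
static int component_added;  /* component_add() done */

/* Host driver probe: register the DSI host unconditionally. */
static int toy_host_probe(void)
{
    host_registered = 1;
    return 0;
}

/* Host's .attach callback, invoked when the peripheral attaches:
 * only now is the drm component added. */
static void toy_host_attach_cb(void)
{
    component_added = 1;
}

/* Peripheral probe: defers until the host exists, then attaches. */
static int toy_peripheral_probe(void)
{
    if (!host_registered)
        return TOY_EPROBE_DEFER; /* retried later by the driver core */
    toy_host_attach_cb();
    return 0;
}
```

A peripheral probing before the host simply defers and is retried later, so
neither side has to wait forever for the other inside its own probe.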

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c | 27 +++-
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c 
b/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
index 952cfdb1961d..be20c2ffe798 100644
--- a/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
+++ b/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
@@ -720,10 +720,13 @@ static int dw_drm_encoder_init(struct device *dev,
return 0;
 }
 
+static const struct component_ops dsi_ops;
 static int dsi_host_attach(struct mipi_dsi_host *host,
   struct mipi_dsi_device *mdsi)
 {
struct dw_dsi *dsi = host_to_dsi(host);
+   struct device *dev = host->dev;
+   int ret;
 
if (mdsi->lanes < 1 || mdsi->lanes > 4) {
DRM_ERROR("dsi device params invalid\n");
@@ -734,13 +737,20 @@ static int dsi_host_attach(struct mipi_dsi_host *host,
dsi->format = mdsi->format;
dsi->mode_flags = mdsi->mode_flags;
 
+   ret = component_add(dev, &dsi_ops);
+   if (ret)
+   return ret;
+
return 0;
 }
 
 static int dsi_host_detach(struct mipi_dsi_host *host,
   struct mipi_dsi_device *mdsi)
 {
-   /* do nothing */
+   struct device *dev = host->dev;
+
+   component_del(dev, &dsi_ops);
+
return 0;
 }
 
@@ -785,10 +795,6 @@ static int dsi_bind(struct device *dev, struct device 
*master, void *data)
if (ret)
return ret;
 
-   ret = dsi_host_init(dev, dsi);
-   if (ret)
-   return ret;
-
ret = dsi_bridge_init(drm_dev, dsi);
if (ret)
return ret;
@@ -859,12 +865,19 @@ static int dsi_probe(struct platform_device *pdev)
 
platform_set_drvdata(pdev, data);
 
-   return component_add(&pdev->dev, &dsi_ops);
+   ret = dsi_host_init(&pdev->dev, dsi);
+   if (ret)
+   return ret;
+
+   return 0;
 }
 
 static int dsi_remove(struct platform_device *pdev)
 {
-   component_del(&pdev->dev, &dsi_ops);
+   struct dsi_data *data = platform_get_drvdata(pdev);
+   struct dw_dsi *dsi = &data->dsi;
+
+   mipi_dsi_host_unregister(&dsi->host);
 
return 0;
 }
-- 
2.31.1



[PATCH v4 24/24] drm/exynos: dsi: Adjust probe order

2021-09-10 Thread Maxime Ripard
Without proper care and an agreement on how DSI host and device
drivers register their MIPI-DSI entities and potential components, we can
end up in a situation where the drivers can never probe.

Most drivers were taking evasive maneuvers to try to work around this,
but not all of them were following the same conventions, resulting in
various incompatibilities between DSI hosts and devices.

Now that we have a sequence agreed upon and documented, let's convert
exynos to it.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/exynos/exynos_drm_dsi.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c 
b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
index e39fac889edc..dfda2b259c44 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
@@ -1529,6 +1529,7 @@ static const struct drm_encoder_helper_funcs 
exynos_dsi_encoder_helper_funcs = {
 
 MODULE_DEVICE_TABLE(of, exynos_dsi_of_match);
 
+static const struct component_ops exynos_dsi_component_ops;
 static int exynos_dsi_host_attach(struct mipi_dsi_host *host,
  struct mipi_dsi_device *device)
 {
@@ -1536,6 +1537,7 @@ static int exynos_dsi_host_attach(struct mipi_dsi_host 
*host,
struct drm_encoder *encoder = &dsi->encoder;
struct drm_device *drm = encoder->dev;
struct drm_bridge *out_bridge;
+   struct device *dev = host->dev;
 
out_bridge  = of_drm_find_bridge(device->dev.of_node);
if (out_bridge) {
@@ -1585,7 +1587,7 @@ static int exynos_dsi_host_attach(struct mipi_dsi_host 
*host,
if (drm->mode_config.poll_enabled)
drm_kms_helper_hotplug_event(drm);
 
-   return 0;
+   return component_add(dev, &exynos_dsi_component_ops);
 }
 
 static int exynos_dsi_host_detach(struct mipi_dsi_host *host,
@@ -1593,6 +1595,9 @@ static int exynos_dsi_host_detach(struct mipi_dsi_host 
*host,
 {
struct exynos_dsi *dsi = host_to_dsi(host);
struct drm_device *drm = dsi->encoder.dev;
+   struct device *dev = host->dev;
+
+   component_del(dev, &exynos_dsi_component_ops);
 
if (dsi->panel) {
mutex_lock(&drm->mode_config.mutex);
@@ -1716,7 +1721,7 @@ static int exynos_dsi_bind(struct device *dev, struct 
device *master,
of_node_put(in_bridge_node);
}
 
-   return mipi_dsi_host_register(&dsi->dsi_host);
+   return 0;
 }
 
 static void exynos_dsi_unbind(struct device *dev, struct device *master,
@@ -1726,8 +1731,6 @@ static void exynos_dsi_unbind(struct device *dev, struct 
device *master,
struct drm_encoder *encoder = &dsi->encoder;
 
exynos_dsi_disable(encoder);
-
-   mipi_dsi_host_unregister(&dsi->dsi_host);
 }
 
 static const struct component_ops exynos_dsi_component_ops = {
@@ -1821,7 +1824,7 @@ static int exynos_dsi_probe(struct platform_device *pdev)
 
pm_runtime_enable(dev);
 
-   ret = component_add(dev, &exynos_dsi_component_ops);
+   ret = mipi_dsi_host_register(&dsi->dsi_host);
if (ret)
goto err_disable_runtime;
 
@@ -1835,10 +1838,12 @@ static int exynos_dsi_probe(struct platform_device 
*pdev)
 
 static int exynos_dsi_remove(struct platform_device *pdev)
 {
+   struct exynos_dsi *dsi = platform_get_drvdata(pdev);
+
+   mipi_dsi_host_unregister(&dsi->dsi_host);
+
pm_runtime_disable(&pdev->dev);
 
-   component_del(&pdev->dev, &exynos_dsi_component_ops);
-
return 0;
 }
 
-- 
2.31.1



Re: [Intel-gfx] [PATCH 08/27] drm/i915: Add logical engine mapping

2021-09-10 Thread Tvrtko Ursulin



On 20/08/2021 23:44, Matthew Brost wrote:

Add logical engine mapping. This is required for split-frame, as
workloads need to be placed on engines in a logically contiguous manner.

v2:
  (Daniel Vetter)
   - Add kernel doc for new fields

Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/gt/intel_engine_cs.c | 60 ---
  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  5 ++
  .../drm/i915/gt/intel_execlists_submission.c  |  1 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  2 +-
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 21 +--
  5 files changed, 60 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 0d9105a31d84..4d790f9a65dd 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -290,7 +290,8 @@ static void nop_irq_handler(struct intel_engine_cs *engine, 
u16 iir)
GEM_DEBUG_WARN_ON(iir);
  }
  
-static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)

+static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
+ u8 logical_instance)
  {
const struct engine_info *info = &intel_engines[id];
struct drm_i915_private *i915 = gt->i915;
@@ -334,6 +335,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id)
  
  	engine->class = info->class;

engine->instance = info->instance;
+   engine->logical_mask = BIT(logical_instance);
__sprint_engine_name(engine);
  
  	engine->props.heartbeat_interval_ms =

@@ -572,6 +574,37 @@ static intel_engine_mask_t init_engine_mask(struct 
intel_gt *gt)
return info->engine_mask;
  }
  
+static void populate_logical_ids(struct intel_gt *gt, u8 *logical_ids,

+u8 class, const u8 *map, u8 num_instances)
+{
+   int i, j;
+   u8 current_logical_id = 0;
+
+   for (j = 0; j < num_instances; ++j) {
+   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
+   if (!HAS_ENGINE(gt, i) ||
+   intel_engines[i].class != class)
+   continue;
+
+   if (intel_engines[i].instance == map[j]) {
+   logical_ids[intel_engines[i].instance] =
+   current_logical_id++;
+   break;
+   }
+   }
+   }
+}
+
+static void setup_logical_ids(struct intel_gt *gt, u8 *logical_ids, u8 class)
+{
+   int i;
+   u8 map[MAX_ENGINE_INSTANCE + 1];
+
+   for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
+   map[i] = i;


What's the point of the map array since it is 1:1 with instance?
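
To illustrate what the helper computes with the identity map, here is a toy
reproduction of populate_logical_ids(): present instances of a class receive
contiguous logical ids even when a middle physical instance is fused off
(simplified stand-in types, not the real i915 structures):

```c
#include <assert.h>

#define TOY_MAX_INSTANCE 4

struct toy_engine { int class; int instance; int present; };

/* Class 0 with physical instance 1 fused off. */
static const struct toy_engine toy_engines[] = {
    { 0, 0, 1 }, { 0, 1, 0 }, { 0, 2, 1 }, { 0, 3, 1 },
};
#define N_TOY_ENGINES ((int)(sizeof(toy_engines) / sizeof(toy_engines[0])))

static void toy_populate_logical_ids(int class, int *logical_ids)
{
    int current_logical_id = 0;
    int i, j;

    /* Identity map: map[j] == j, so instances are visited in order. */
    for (j = 0; j < TOY_MAX_INSTANCE; j++) {
        for (i = 0; i < N_TOY_ENGINES; i++) {
            if (!toy_engines[i].present ||
                toy_engines[i].class != class)
                continue;
            if (toy_engines[i].instance == j) {
                logical_ids[toy_engines[i].instance] =
                    current_logical_id++;
                break;
            }
        }
    }
}
```

Instances {0, 2, 3} end up with logical ids {0, 1, 2}; the slot for the
fused-off instance is never written.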


+   populate_logical_ids(gt, logical_ids, class, map, ARRAY_SIZE(map));
+}
+
  /**
   * intel_engines_init_mmio() - allocate and prepare the Engine Command 
Streamers
   * @gt: pointer to struct intel_gt
@@ -583,7 +616,8 @@ int intel_engines_init_mmio(struct intel_gt *gt)
struct drm_i915_private *i915 = gt->i915;
const unsigned int engine_mask = init_engine_mask(gt);
unsigned int mask = 0;
-   unsigned int i;
+   unsigned int i, class;
+   u8 logical_ids[MAX_ENGINE_INSTANCE + 1];
int err;
  
  	drm_WARN_ON(&i915->drm, engine_mask == 0);

@@ -593,15 +627,23 @@ int intel_engines_init_mmio(struct intel_gt *gt)
if (i915_inject_probe_failure(i915))
return -ENODEV;
  
-	for (i = 0; i < ARRAY_SIZE(intel_engines); i++) {

-   if (!HAS_ENGINE(gt, i))
-   continue;
+   for (class = 0; class < MAX_ENGINE_CLASS + 1; ++class) {
+   setup_logical_ids(gt, logical_ids, class);
  
-		err = intel_engine_setup(gt, i);

-   if (err)
-   goto cleanup;
+   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
+   u8 instance = intel_engines[i].instance;
+
+   if (intel_engines[i].class != class ||
+   !HAS_ENGINE(gt, i))
+   continue;
  
-		mask |= BIT(i);

+   err = intel_engine_setup(gt, i,
+logical_ids[instance]);
+   if (err)
+   goto cleanup;
+
+   mask |= BIT(i);


I still think there is a less clunky way to set this up in less code and 
more readable at the same time. Like doing it in two passes so you can 
iterate the gt->engine_class[] array instead of having to implement a skip 
condition (both on class and HAS_ENGINE, in two places) and also avoid 
walking the flat intel_engines array repeatedly.



+   }
}
  
  	/*

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index ed91bcff20eb..fddf35546b58 10

Re: [Intel-gfx] [PATCH 23/27] drm/i915/guc: Implement no mid batch preemption for multi-lrc

2021-09-10 Thread Tvrtko Ursulin



On 20/08/2021 23:44, Matthew Brost wrote:

For some users of multi-lrc, e.g. split frame, it isn't safe to preempt
mid BB. To safely enable preemption at the BB boundary, a handshake
between the parent and child is needed. This is implemented via custom
emit_bb_start & emit_fini_breadcrumb functions and enabled by default
if a context is configured via the set parallel extension.


FWIW I think it's wrong to hardcode the requirements of a particular 
hardware generation's fixed media pipeline into the uapi. IMO the better 
solution was when the concept of parallel submission was decoupled from the 
no-preemption mid-batch preambles. Otherwise we might as well call the 
extension I915_CONTEXT_ENGINES_EXT_MEDIA_SPLIT_FRAME_SUBMIT or something.


Regards,

Tvrtko

Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/gt/intel_context.c   |   2 +-
  drivers/gpu/drm/i915/gt/intel_context_types.h |   3 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |   2 +-
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 283 +-
  4 files changed, 287 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index 5615be32879c..2de62649e275 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -561,7 +561,7 @@ void intel_context_bind_parent_child(struct intel_context 
*parent,
GEM_BUG_ON(intel_context_is_child(child));
GEM_BUG_ON(intel_context_is_parent(child));
  
-	parent->guc_number_children++;

+   child->guc_child_index = parent->guc_number_children++;
list_add_tail(&child->guc_child_link,
  &parent->guc_child_list);
child->parent = parent;
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 713d85b0b364..727f91e7f7c2 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -246,6 +246,9 @@ struct intel_context {
/** @guc_number_children: number of children if parent */
u8 guc_number_children;
  
+		/** @guc_child_index: index into guc_child_list if child */

+   u8 guc_child_index;
+
/**
 * @parent_page: page in context used by parent for work queue,
 * work queue descriptor
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index 6cd26dc060d1..9f61cfa5566a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -188,7 +188,7 @@ struct guc_process_desc {
u32 wq_status;
u32 engine_presence;
u32 priority;
-   u32 reserved[30];
+   u32 reserved[36];
  } __packed;
  
  #define CONTEXT_REGISTRATION_FLAG_KMD	BIT(0)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 91330525330d..1a18f99bf12a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -11,6 +11,7 @@
  #include "gt/intel_context.h"
  #include "gt/intel_engine_pm.h"
  #include "gt/intel_engine_heartbeat.h"
+#include "gt/intel_gpu_commands.h"
  #include "gt/intel_gt.h"
  #include "gt/intel_gt_irq.h"
  #include "gt/intel_gt_pm.h"
@@ -366,10 +367,14 @@ static struct i915_priolist *to_priolist(struct rb_node 
*rb)
  
  /*

   * When using multi-lrc submission an extra page in the context state is
- * reserved for the process descriptor and work queue.
+ * reserved for the process descriptor, work queue, and preempt BB boundary
+ * handshake between the parent + children contexts.
   *
   * The layout of this page is below:
   * 0  guc_process_desc
+ * + sizeof(struct guc_process_desc)   child go
+ * + CACHELINE_BYTES   child join ...
+ * + CACHELINE_BYTES ...
   * ...unused
   * PAGE_SIZE / 2  work queue start
   * ...work queue
@@ -1785,6 +1790,30 @@ static int deregister_context(struct intel_context *ce, 
u32 guc_id, bool loop)
return __guc_action_deregister_context(guc, guc_id, loop);
  }
  
+static inline void clear_children_join_go_memory(struct intel_context *ce)

+{
+   u32 *mem = (u32 *)(__get_process_desc(ce) + 1);
+   u8 i;
+
+   for (i = 0; i < ce->guc_number_children + 1; ++i)
+   mem[i * (CACHELINE_BYTES / sizeof(u32))] = 0;
+}
+
+static inline u32 get_children_go_value(struct intel_context *ce)
+{
+   u32 *mem = (u32 *)(__get_process_desc(ce) + 1);
+
+   return mem[0];
+}
+
+static inline u32 get_children_join_value(struct intel_context *ce,
+ u8 child_index)
+{
+   u32 *mem = (u32 *)(__get_process_desc(ce) + 1);
+
+

Re: [PATCH RESEND] drm/i915: Mark GPU wedging on driver unregister unrecoverable

2021-09-10 Thread Michał Winiarski

On 03.09.2021 16:28, Janusz Krzysztofik wrote:

The GPU wedged flag, now set on driver unregister to prevent further
use of the GPU, can then be cleared unintentionally when
__intel_gt_unset_wedged() is called before the flag is finally marked
unrecoverable.  We need to have it marked unrecoverable earlier.
Implement that by replacing the call to intel_gt_set_wedged() in
intel_gt_driver_unregister() with intel_gt_set_wedged_on_fini().

With the above in place, intel_gt_set_wedged_on_fini() is now called
twice on driver remove, the second time from __intel_gt_disable().  This
seems harmless, while dropping intel_gt_set_wedged_on_fini() from
__intel_gt_disable() proved to break some driver probe error unwind
paths as well as the mock selftest exit path.

Signed-off-by: Janusz Krzysztofik 
Cc: Michał Winiarski 


Reviewed-by: Michał Winiarski 

-Michał


---
Resending with Cc: dri-devel@lists.freedesktop.org as requested.

Thanks,
Janusz

  drivers/gpu/drm/i915/gt/intel_gt.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index 62d40c986642..173b53cb2b47 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -750,7 +750,7 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
 * all in-flight requests so that we can quickly unbind the active
 * resources.
 */
-   intel_gt_set_wedged(gt);
+   intel_gt_set_wedged_on_fini(gt);
  
  	/* Scrub all HW state upon release */

with_intel_runtime_pm(gt->uncore->rpm, wakeref)





[Bug 213391] AMDGPU retries page fault with some specific processes amdgpu and sometimes followed [gfxhub0] retry page fault until *ERROR* ring gfx timeout, but soft recovered

2021-09-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213391

--- Comment #36 from Lahfa Samy (s...@lahfa.xyz) ---
Did anyone test whether this has been fixed in newer firmware updates, or
should we still stay on version 20210315.3568f96-3?


Re: [Intel-gfx] [PATCH 5/6] drm/i915/uncore: Drop gen11 mmio read handlers

2021-09-10 Thread Tvrtko Ursulin



On 10/09/2021 06:33, Matt Roper wrote:

Consolidate down to just a single 'fwtable' implementation.  For reads
we don't need to worry about shadow tables.  Also, the
NEEDS_FORCE_WAKE() check we previously had in the fwtable implementation
can be dropped --- if a register is outside that range on one of the old
platforms, then it won't belong to any forcewake range and 0 will be
returned anyway.

Signed-off-by: Matt Roper 
---
  drivers/gpu/drm/i915/intel_uncore.c | 45 +++--
  1 file changed, 17 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index c181e74fbf43..95398cb69722 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -935,14 +935,6 @@ static const struct intel_forcewake_range 
__vlv_fw_ranges[] = {
  };
  
  #define __fwtable_reg_read_fw_domains(uncore, offset) \

-({ \
-   enum forcewake_domains __fwd = 0; \
-   if (NEEDS_FORCE_WAKE((offset))) \
-   __fwd = find_fw_domain(uncore, offset); \
-   __fwd; \
-})
-
-#define __gen11_fwtable_reg_read_fw_domains(uncore, offset) \
find_fw_domain(uncore, offset)


Looks like you can drop this macro and just call find_fw_domain or you 
think there is value to keep it?


Regards,

Tvrtko

  
  /* *Must* be sorted by offset! See intel_shadow_table_check(). */

@@ -1577,33 +1569,30 @@ static inline void __force_wake_auto(struct 
intel_uncore *uncore,
___force_wake_auto(uncore, fw_domains);
  }
  
-#define __gen_read(func, x) \

+#define __gen_fwtable_read(x) \
  static u##x \
-func##_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { \
+fwtable_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) \
+{ \
enum forcewake_domains fw_engine; \
GEN6_READ_HEADER(x); \
-   fw_engine = __##func##_reg_read_fw_domains(uncore, offset); \
+   fw_engine = __fwtable_reg_read_fw_domains(uncore, offset); \
if (fw_engine) \
__force_wake_auto(uncore, fw_engine); \
val = __raw_uncore_read##x(uncore, reg); \
GEN6_READ_FOOTER; \
  }
  
-#define __gen_reg_read_funcs(func) \

-static enum forcewake_domains \
-func##_reg_read_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) { \
-   return __##func##_reg_read_fw_domains(uncore, 
i915_mmio_reg_offset(reg)); \
-} \
-\
-__gen_read(func, 8) \
-__gen_read(func, 16) \
-__gen_read(func, 32) \
-__gen_read(func, 64)
+static enum forcewake_domains
+fwtable_reg_read_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) {
+   return __fwtable_reg_read_fw_domains(uncore, i915_mmio_reg_offset(reg));
+}
  
-__gen_reg_read_funcs(gen11_fwtable);

-__gen_reg_read_funcs(fwtable);
+__gen_fwtable_read(8)
+__gen_fwtable_read(16)
+__gen_fwtable_read(32)
+__gen_fwtable_read(64)
  
-#undef __gen_reg_read_funcs

+#undef __gen_fwtable_read
  #undef GEN6_READ_FOOTER
  #undef GEN6_READ_HEADER
  
@@ -2069,22 +2058,22 @@ static int uncore_forcewake_init(struct intel_uncore *uncore)

ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER(i915) >= 12) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen12_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER(i915) == 11) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen11_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (IS_GRAPHICS_VER(i915, 9, 10)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __gen9_fw_ranges);
ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs);



Re: [Intel-gfx] [PATCH 1/6] drm/i915/uncore: Convert gen6/gen7 read operations to fwtable

2021-09-10 Thread Tvrtko Ursulin




On 10/09/2021 06:33, Matt Roper wrote:

On gen6-gen8 (except vlv/chv) we don't use a forcewake lookup table; we
simply check whether the register offset is < 0x4, and return
FORCEWAKE_RENDER if it is.  To prepare for upcoming refactoring, let's
define a single-entry forcewake table from [0x0, 0x3] and switch
these platforms over to use the fwtable reader functions.
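
The equivalence being relied on can be sketched as follows: a linear walk over
a one-entry range table returns the render domain below the cutoff and 0 above
it, just like the old open-coded check. The cutoff value here is a placeholder
and the real code uses a bsearch rather than this linear scan:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_FORCEWAKE_RENDER 0x1
#define TOY_GEN6_FW_END      0xffff /* hypothetical cutoff, not the real bound */

struct toy_fw_range { uint32_t start, end, domains; };

/* Single-entry table standing in for __gen6_fw_ranges[]. */
static const struct toy_fw_range toy_gen6_fw_ranges[] = {
    { 0x0, TOY_GEN6_FW_END, TOY_FORCEWAKE_RENDER },
};

static uint32_t toy_find_fw_domains(const struct toy_fw_range *t, int n,
                                    uint32_t offset)
{
    int i;

    for (i = 0; i < n; i++)
        if (offset >= t[i].start && offset <= t[i].end)
            return t[i].domains;
    return 0; /* outside every range: no forcewake taken */
}
```

An offset past the cutoff falls outside the single entry and yields 0, which is
why an explicit NEEDS_FORCE_WAKE()-style check on top of the table lookup is
redundant.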

Signed-off-by: Matt Roper 
---
  drivers/gpu/drm/i915/intel_uncore.c | 11 ---
  1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index f9767054dbdf..7f92f12d95f2 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1064,6 +1064,10 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, 
i915_reg_t reg)
__fwd; \
  })
  


Is __gen6_reg_read_fw_domains left orphaned somewhere around here or in 
a later patch?


Regards,

Tvrtko


+static const struct intel_forcewake_range __gen6_fw_ranges[] = {
+   GEN_FW_RANGE(0x0, 0x3, FORCEWAKE_RENDER),
+};
+
  /* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
  static const struct intel_forcewake_range __chv_fw_ranges[] = {
GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER),
@@ -1623,7 +1627,6 @@ __gen_read(func, 64)
  
  __gen_reg_read_funcs(gen11_fwtable);

  __gen_reg_read_funcs(fwtable);
-__gen_reg_read_funcs(gen6);
  
  #undef __gen_reg_read_funcs

  #undef GEN6_READ_FOOTER
@@ -2111,15 +2114,17 @@ static int uncore_forcewake_init(struct intel_uncore 
*uncore)
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (GRAPHICS_VER(i915) == 8) {
+   ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen8);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen6);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (IS_VALLEYVIEW(i915)) {
ASSIGN_FW_DOMAINS_TABLE(uncore, __vlv_fw_ranges);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6);
ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
} else if (IS_GRAPHICS_VER(i915, 6, 7)) {
+   ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6);
-   ASSIGN_READ_MMIO_VFUNCS(uncore, gen6);
+   ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
}
  
  	uncore->pmic_bus_access_nb.notifier_call = i915_pmic_bus_access_notifier;




Re: [Intel-gfx] [PATCH 0/6] i915: Simplify mmio handling & add new DG2 shadow table

2021-09-10 Thread Tvrtko Ursulin



On 10/09/2021 06:33, Matt Roper wrote:

Our uncore MMIO functions for reading/writing registers have become very
complicated over time.  There's significant macro magic used to generate
several nearly-identical functions that only really differ in terms of
which platform-specific shadow register table they should check on write
operations.  We can significantly simplify our MMIO handlers by storing
a reference to the current platform's shadow table within the 'struct
intel_uncore' the same way we already do for forcewake; this allows us
to consolidate the multiple variants of each 'write' function down to
just a single 'fwtable' version that gets the shadow table out of the
uncore struct rather than hardcoding the name of a specific platform's
table.  We can do similar consolidation on the MMIO read side by
creating a single-entry forcewake table to replace the open-coded range
check they had been using previously.

The final patch of the series adds a new shadow table for DG2; this
becomes quite clean and simple now, given the refactoring in the first
five patches.


Tidy and it ends up saving kernel binary size.

However I am undecided yet, because one thing to note is that the trade 
off is source code and kernel text consolidation at the expense of more 
indirect calls at runtime and larger common read/write functions.


To expand, current code generates a bunch of per gen functions but in 
doing so it manages to inline a bunch of checks like NEEDS_FORCE_WAKE 
and BSEARCH (from find_fw_domain) so at runtime each platform mmio 
read/write does not have to do indirect calls to do lookups.


It may matter a lot in the grand scheme of things but this trade off is 
something to note in the cover letter I think.


Regards,

Tvrtko


Matt Roper (6):
   drm/i915/uncore: Convert gen6/gen7 read operations to fwtable
   drm/i915/uncore: Associate shadow table with uncore
   drm/i915/uncore: Replace gen8 write functions with general fwtable
   drm/i915/uncore: Drop gen11/gen12 mmio write handlers
   drm/i915/uncore: Drop gen11 mmio read handlers
   drm/i915/dg2: Add DG2-specific shadow register table

  drivers/gpu/drm/i915/intel_uncore.c   | 190 ++
  drivers/gpu/drm/i915/intel_uncore.h   |   7 +
  drivers/gpu/drm/i915/selftests/intel_uncore.c |   1 +
  3 files changed, 110 insertions(+), 88 deletions(-)



Re: [PATCH v5 1/3] dt-bindings: Add YAML bindings for NVDEC

2021-09-10 Thread Rob Herring
On Fri, 10 Sep 2021 13:42:45 +0300, Mikko Perttunen wrote:
> Add YAML device tree bindings for NVDEC, now in a more appropriate
> place compared to the old textual Host1x bindings.
> 
> Signed-off-by: Mikko Perttunen 
> ---
> v5:
> * Changed from nvidia,instance to nvidia,host1x-class optional
>   property.
> * Added dma-coherent
> v4:
> * Fix incorrect compatibility string in 'if' condition
> v3:
> * Drop host1x bindings
> * Change read2 to read-1 in interconnect names
> v2:
> * Fix issues pointed out in v1
> * Add T194 nvidia,instance property
> ---
>  .../gpu/host1x/nvidia,tegra210-nvdec.yaml | 104 ++
>  MAINTAINERS   |   1 +
>  2 files changed, 105 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml
> 

My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
on your patch (DT_CHECKER_FLAGS is new in v5.13):

yamllint warnings/errors:
./Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml:104:1: [warning] too many blank lines (2 > 1) (empty-lines)

dtschema/dtc warnings/errors:

doc reference errors (make refcheckdocs):

See https://patchwork.ozlabs.org/patch/1526459

This check can fail if there are any dependencies. The base for a patch
series is generally the most recent rc1.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit.



[PATCH 0/3] drm/bridge: Create a function to abstract panels away

2021-09-10 Thread Maxime Ripard
Hi,

This series used to be part of the DSI probe order series, but got removed
since it wasn't useful there anymore.

However, I still believe there is value in moving towards merging bridges and
panels by making encoders (or upstream bridges) manipulate only bridges.

The first patch creates a new helper that does just this: it looks for a
bridge or a panel, and if a panel is found, creates a panel_bridge and returns
that bridge instead.

The next two patches convert the vc4 encoders to use it.

If it's accepted, I plan on converting all the relevant users over time.

Let me know what you think,
Maxime

Maxime Ripard (3):
  drm/bridge: Add a function to abstract away panels
  drm/vc4: dpi: Switch to devm_drm_of_get_bridge
  drm/vc4: dsi: Switch to devm_drm_of_get_bridge

 drivers/gpu/drm/drm_bridge.c  | 42 +++
 drivers/gpu/drm/drm_of.c  |  3 +++
 drivers/gpu/drm/vc4/vc4_dpi.c | 15 -
 drivers/gpu/drm/vc4/vc4_drv.c |  2 ++
 drivers/gpu/drm/vc4/vc4_dsi.c | 28 ---
 include/drm/drm_bridge.h  |  2 ++
 6 files changed, 53 insertions(+), 39 deletions(-)

-- 
2.31.1



[PATCH 1/3] drm/bridge: Add a function to abstract away panels

2021-09-10 Thread Maxime Ripard
Display drivers so far need to have a lot of boilerplate to first
retrieve either the panel or bridge that they are connected to using
drm_of_find_panel_or_bridge(), and then either deal with each with ad-hoc
functions or create a drm panel bridge through drm_panel_bridge_add.

In order to reduce that boilerplate and hopefully create a path of least
resistance towards using the DRM panel bridge layer, let's create the
devm_drm_of_get_bridge() helper.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/drm_bridge.c | 42 
 drivers/gpu/drm/drm_of.c |  3 +++
 include/drm/drm_bridge.h |  2 ++
 3 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_bridge.c b/drivers/gpu/drm/drm_bridge.c
index a8ed66751c2d..10ddca4638b0 100644
--- a/drivers/gpu/drm/drm_bridge.c
+++ b/drivers/gpu/drm/drm_bridge.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "drm_crtc_internal.h"
@@ -51,10 +52,8 @@
  *
  * Display drivers are responsible for linking encoders with the first bridge
  * in the chains. This is done by acquiring the appropriate bridge with
- * of_drm_find_bridge() or drm_of_find_panel_or_bridge(), or creating it for a
- * panel with drm_panel_bridge_add_typed() (or the managed version
- * devm_drm_panel_bridge_add_typed()). Once acquired, the bridge shall be
- * attached to the encoder with a call to drm_bridge_attach().
+ * devm_drm_of_get_bridge(). Once acquired, the bridge shall be attached to the
+ * encoder with a call to drm_bridge_attach().
  *
  * Bridges are responsible for linking themselves with the next bridge in the
  * chain, if any. This is done the same way as for encoders, with the call to
@@ -1233,6 +1232,41 @@ struct drm_bridge *of_drm_find_bridge(struct device_node *np)
return NULL;
 }
 EXPORT_SYMBOL(of_drm_find_bridge);
+
+/**
+ * devm_drm_of_get_bridge - Return next bridge in the chain
+ * @dev: device to tie the bridge lifetime to
+ * @np: device tree node containing encoder output ports
+ * @port: port in the device tree node
+ * @endpoint: endpoint in the device tree node
+ *
+ * Given a DT node's port and endpoint number, finds the connected node
+ * and returns the associated bridge if any, or creates and returns a
+ * drm panel bridge instance if a panel is connected.
+ *
+ * Returns a pointer to the bridge if successful, or an error pointer
+ * otherwise.
+ */
+struct drm_bridge *devm_drm_of_get_bridge(struct device *dev,
+ struct device_node *np,
+ unsigned int port,
+ unsigned int endpoint)
+{
+   struct drm_bridge *bridge;
+   struct drm_panel *panel;
+   int ret;
+
+   ret = drm_of_find_panel_or_bridge(np, port, endpoint,
+ &panel, &bridge);
+   if (ret)
+   return ERR_PTR(ret);
+
+   if (panel)
+   bridge = devm_drm_panel_bridge_add(dev, panel);
+
+   return bridge;
+}
+EXPORT_SYMBOL(devm_drm_of_get_bridge);
 #endif
 
 MODULE_AUTHOR("Ajay Kumar ");
diff --git a/drivers/gpu/drm/drm_of.c b/drivers/gpu/drm/drm_of.c
index 997b8827fed2..37c34146eea8 100644
--- a/drivers/gpu/drm/drm_of.c
+++ b/drivers/gpu/drm/drm_of.c
@@ -231,6 +231,9 @@ EXPORT_SYMBOL_GPL(drm_of_encoder_active_endpoint);
  * return either the associated struct drm_panel or drm_bridge device. Either
  * @panel or @bridge must not be NULL.
  *
+ * This function is deprecated and should not be used in new drivers. Use
+ * devm_drm_of_get_bridge() instead.
+ *
  * Returns zero if successful, or one of the standard error codes if it fails.
  */
 int drm_of_find_panel_or_bridge(const struct device_node *np,
diff --git a/include/drm/drm_bridge.h b/include/drm/drm_bridge.h
index 46bdfa48c413..f70c88ca96ef 100644
--- a/include/drm/drm_bridge.h
+++ b/include/drm/drm_bridge.h
@@ -911,6 +911,8 @@ struct drm_bridge *devm_drm_panel_bridge_add(struct device *dev,
 struct drm_bridge *devm_drm_panel_bridge_add_typed(struct device *dev,
   struct drm_panel *panel,
   u32 connector_type);
+struct drm_bridge *devm_drm_of_get_bridge(struct device *dev, struct device_node *node,
+   unsigned int port, unsigned int endpoint);
 struct drm_connector *drm_panel_bridge_connector(struct drm_bridge *bridge);
 #endif
 
-- 
2.31.1



[PATCH 2/3] drm/vc4: dpi: Switch to devm_drm_of_get_bridge

2021-09-10 Thread Maxime Ripard
The new devm_drm_of_get_bridge removes most of the boilerplate we
have to deal with. Let's switch to it.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/vc4/vc4_dpi.c | 15 ---
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_dpi.c b/drivers/gpu/drm/vc4/vc4_dpi.c
index a90f2545baee..c180eb60bee8 100644
--- a/drivers/gpu/drm/vc4/vc4_dpi.c
+++ b/drivers/gpu/drm/vc4/vc4_dpi.c
@@ -229,26 +229,19 @@ static const struct of_device_id vc4_dpi_dt_match[] = {
 static int vc4_dpi_init_bridge(struct vc4_dpi *dpi)
 {
struct device *dev = &dpi->pdev->dev;
-   struct drm_panel *panel;
struct drm_bridge *bridge;
-   int ret;
 
-   ret = drm_of_find_panel_or_bridge(dev->of_node, 0, 0,
- &panel, &bridge);
-   if (ret) {
+   bridge = devm_drm_of_get_bridge(dev, dev->of_node, 0, 0);
+   if (IS_ERR(bridge)) {
/* If nothing was connected in the DT, that's not an
 * error.
 */
-   if (ret == -ENODEV)
+   if (PTR_ERR(bridge) == -ENODEV)
return 0;
else
-   return ret;
+   return PTR_ERR(bridge);
}
 
-   if (panel)
-   bridge = drm_panel_bridge_add_typed(panel,
-   DRM_MODE_CONNECTOR_DPI);
-
return drm_bridge_attach(dpi->encoder, bridge, NULL, 0);
 }
 
-- 
2.31.1



[PATCH 3/3] drm/vc4: dsi: Switch to devm_drm_of_get_bridge

2021-09-10 Thread Maxime Ripard
The new devm_drm_of_get_bridge removes most of the boilerplate we
have to deal with. Let's switch to it.

Signed-off-by: Maxime Ripard 
---
 drivers/gpu/drm/vc4/vc4_drv.c |  2 ++
 drivers/gpu/drm/vc4/vc4_dsi.c | 28 
 2 files changed, 6 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index 16abc3a3d601..96c526f1022e 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -25,7 +25,9 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/drivers/gpu/drm/vc4/vc4_dsi.c b/drivers/gpu/drm/vc4/vc4_dsi.c
index a185027911ce..a229da58962a 100644
--- a/drivers/gpu/drm/vc4/vc4_dsi.c
+++ b/drivers/gpu/drm/vc4/vc4_dsi.c
@@ -1497,7 +1497,6 @@ static int vc4_dsi_bind(struct device *dev, struct device *master, void *data)
struct drm_device *drm = dev_get_drvdata(master);
struct vc4_dsi *dsi = dev_get_drvdata(dev);
struct vc4_dsi_encoder *vc4_dsi_encoder;
-   struct drm_panel *panel;
const struct of_device_id *match;
dma_cap_mask_t dma_mask;
int ret;
@@ -1609,27 +1608,9 @@ static int vc4_dsi_bind(struct device *dev, struct device *master, void *data)
return ret;
}
 
-   ret = drm_of_find_panel_or_bridge(dev->of_node, 0, 0,
- &panel, &dsi->bridge);
-   if (ret) {
-   /* If the bridge or panel pointed by dev->of_node is not
-* enabled, just return 0 here so that we don't prevent the DRM
-* dev from being registered. Of course that means the DSI
-* encoder won't be exposed, but that's not a problem since
-* nothing is connected to it.
-*/
-   if (ret == -ENODEV)
-   return 0;
-
-   return ret;
-   }
-
-   if (panel) {
-   dsi->bridge = devm_drm_panel_bridge_add_typed(dev, panel,
- DRM_MODE_CONNECTOR_DSI);
-   if (IS_ERR(dsi->bridge))
-   return PTR_ERR(dsi->bridge);
-   }
+   dsi->bridge = devm_drm_of_get_bridge(dev, dev->of_node, 0, 0);
+   if (IS_ERR(dsi->bridge))
+   return PTR_ERR(dsi->bridge);
 
/* The esc clock rate is supposed to always be 100Mhz. */
ret = clk_set_rate(dsi->escape_clock, 100 * 100);
@@ -1667,8 +1648,7 @@ static void vc4_dsi_unbind(struct device *dev, struct device *master,
 {
struct vc4_dsi *dsi = dev_get_drvdata(dev);
 
-   if (dsi->bridge)
-   pm_runtime_disable(dev);
+   pm_runtime_disable(dev);
 
/*
 * Restore the bridge_chain so the bridge detach procedure can happen
-- 
2.31.1



[RFC PATCH] drm/ttm: Add a private member to the struct ttm_resource

2021-09-10 Thread Thomas Hellström
Both the provider (resource manager) and the consumer (the TTM driver)
want to subclass struct ttm_resource. Since this is left for the resource
manager, we need to provide a private pointer for the TTM driver.

Provide a struct ttm_resource_private for the driver to subclass for
data with the same lifetime as the struct ttm_resource: In the i915 case
it will, for example, be an sg-table and radix tree into the LMEM
/VRAM pages that currently are awkwardly attached to the GEM object.

Provide an ops structure for associated operations (which is only destroy()
at the moment). It might seem pointless to provide a separate ops structure,
but Linus has previously made it clear that that's the norm.

After careful audit one could perhaps also on a per-driver basis
replace the delete_mem_notify() TTM driver callback with the above
destroy function.

Cc: Matthew Auld 
Cc: König Christian 
Signed-off-by: Thomas Hellström 
---
 drivers/gpu/drm/ttm/ttm_resource.c | 10 +++---
 include/drm/ttm/ttm_resource.h | 28 
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
index 2431717376e7..973e7c50bfed 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -57,13 +57,17 @@ int ttm_resource_alloc(struct ttm_buffer_object *bo,
 void ttm_resource_free(struct ttm_buffer_object *bo, struct ttm_resource **res)
 {
struct ttm_resource_manager *man;
+   struct ttm_resource *resource = *res;
 
-   if (!*res)
+   if (!resource)
return;
 
-   man = ttm_manager_type(bo->bdev, (*res)->mem_type);
-   man->func->free(man, *res);
*res = NULL;
+   if (resource->priv)
+   resource->priv->ops.destroy(resource->priv);
+
+   man = ttm_manager_type(bo->bdev, resource->mem_type);
+   man->func->free(man, resource);
 }
 EXPORT_SYMBOL(ttm_resource_free);
 
diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h
index 140b6b9a8bbe..5a22c9a29c05 100644
--- a/include/drm/ttm/ttm_resource.h
+++ b/include/drm/ttm/ttm_resource.h
@@ -44,6 +44,7 @@ struct dma_buf_map;
 struct io_mapping;
 struct sg_table;
 struct scatterlist;
+struct ttm_resource_private;
 
 struct ttm_resource_manager_func {
/**
@@ -153,6 +154,32 @@ struct ttm_bus_placement {
enum ttm_cachingcaching;
 };
 
+/**
+ * struct ttm_resource_private_ops - Operations for a struct
+ * ttm_resource_private
+ *
+ * Not much benefit to keep this as a separate struct with only a single member,
+ * but keeping a separate ops struct is the norm.
+ */
+struct ttm_resource_private_ops {
+   /**
+* destroy() - Callback to destroy the private data
+* @priv - The private data to destroy
+*/
+   void (*destroy) (struct ttm_resource_private *priv);
+};
+
+/**
+ * struct ttm_resource_private - TTM driver private data
+ * @ops: Pointer to struct ttm_resource_private_ops with associated operations
+ *
+ * Intended to be subclassed to hold, for example, cached data sharing the
+ * lifetime with a struct ttm_resource.
+ */
+struct ttm_resource_private {
+   const struct ttm_resource_private_ops ops;
+};
+
 /**
  * struct ttm_resource
  *
@@ -171,6 +198,7 @@ struct ttm_resource {
uint32_t mem_type;
uint32_t placement;
struct ttm_bus_placement bus;
+   struct ttm_resource_private *priv;
 };
 
 /**
-- 
2.31.1



Re: [PATCH v2 3/6] drm/i915 Implement LMEM backup and restore for suspend / resume

2021-09-10 Thread Thomas Hellström



On 9/6/21 6:55 PM, Thomas Hellström wrote:

Just evict unpinned objects to system. For pinned LMEM objects,
make a backup system object and blit the contents to that.

Backup is performed in three steps:
1: Opportunistically evict evictable objects using the gpu blitter.
2: After gt idle, evict evictable objects using the gpu blitter. This will
be modified in an upcoming patch to backup pinned objects that are not used
by the blitter itself.
3: Backup remaining pinned objects using memcpy.

Also move uC suspend to after 2) to make sure we have a functional GuC
during 2) if using GuC submission.

v2:
- Major refactor to make sure gem_exec_suspend@hang-SX subtests work, and
   suspend / resume works with a slightly modified GuC submission enabling
   patch series.

Signed-off-by: Thomas Hellström 
---
  drivers/gpu/drm/i915/Makefile |   1 +
  .../gpu/drm/i915/gem/i915_gem_object_types.h  |   1 +
  drivers/gpu/drm/i915/gem/i915_gem_pm.c|  92 +++-
  drivers/gpu/drm/i915/gem/i915_gem_pm.h|   3 +-
  drivers/gpu/drm/i915/gem/i915_gem_ttm.c   |  29 ++-
  drivers/gpu/drm/i915/gem/i915_gem_ttm.h   |  10 +
  drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c| 205 ++
  drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.h|  24 ++
  drivers/gpu/drm/i915/gt/intel_gt_pm.c |   4 +-
  drivers/gpu/drm/i915/i915_drv.c   |  10 +-
  drivers/gpu/drm/i915/i915_drv.h   |   2 +-
  11 files changed, 364 insertions(+), 17 deletions(-)
  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index c36c8a4f0716..3379a0a6c91e 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -155,6 +155,7 @@ gem-y += \
gem/i915_gem_throttle.o \
gem/i915_gem_tiling.o \
gem/i915_gem_ttm.o \
+   gem/i915_gem_ttm_pm.o \
gem/i915_gem_userptr.o \
gem/i915_gem_wait.o \
gem/i915_gemfs.o
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 2471f36aaff3..734cc8e16481 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -534,6 +534,7 @@ struct drm_i915_gem_object {
struct {
struct sg_table *cached_io_st;
struct i915_gem_object_page_iter get_io_page;
+   struct drm_i915_gem_object *backup;
bool created:1;
} ttm;
  
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c

index 8b9d7d14c4bd..9746c255ddcc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
@@ -5,6 +5,7 @@
   */
  
  #include "gem/i915_gem_pm.h"

+#include "gem/i915_gem_ttm_pm.h"
  #include "gt/intel_gt.h"
  #include "gt/intel_gt_pm.h"
  #include "gt/intel_gt_requests.h"
@@ -39,7 +40,79 @@ void i915_gem_suspend(struct drm_i915_private *i915)
i915_gem_drain_freed_objects(i915);
  }
  
-void i915_gem_suspend_late(struct drm_i915_private *i915)

+static int lmem_restore(struct drm_i915_private *i915, bool allow_gpu)
+{
+   struct intel_memory_region *mr;
+   int ret = 0, id;
+
+   for_each_memory_region(mr, i915, id) {
+   if (mr->type == INTEL_MEMORY_LOCAL) {
+   ret = i915_ttm_restore_region(mr, allow_gpu);
+   if (ret)
+   break;
+   }
+   }
+
+   return ret;
+}
+
+static int lmem_suspend(struct drm_i915_private *i915, bool allow_gpu,
+   bool backup_pinned)
+{
+   struct intel_memory_region *mr;
+   int ret = 0, id;
+
+   for_each_memory_region(mr, i915, id) {
+   if (mr->type == INTEL_MEMORY_LOCAL) {
+   ret = i915_ttm_backup_region(mr, allow_gpu, backup_pinned);
+   if (ret)
+   break;
+   }
+   }
+
+   return ret;
+}
+
+static void lmem_recover(struct drm_i915_private *i915)
+{
+   struct intel_memory_region *mr;
+   int id;
+
+   for_each_memory_region(mr, i915, id)
+   if (mr->type == INTEL_MEMORY_LOCAL)
+   i915_ttm_recover_region(mr);
+}
+
+int i915_gem_backup_suspend(struct drm_i915_private *i915)
+{
+   int ret;
+
+   /* Opportunistically try to evict unpinned objects */
+   ret = lmem_suspend(i915, true, false);
+   if (ret)
+   goto out_recover;
+
+   i915_gem_suspend(i915);
+
+   /*
+* More objects may have become unpinned as requests were
+* retired. Now try to evict again. The gt may be wedged here
+* in which case we automatically fall back to memcpy.
+*/
+
+   ret = lmem_suspend(i915, true, false);
+   if (ret)
+  

[Bug 213391] AMDGPU retries page fault with some specific processes amdgpu and sometimes followed [gfxhub0] retry page fault until *ERROR* ring gfx timeout, but soft recovered

2021-09-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213391

--- Comment #37 from Michel Dänzer (mic...@daenzer.net) ---
(In reply to Lahfa Samy from comment #36)
> Did anyone test whether this has been fixed in newer firmware updates, or
> should we still stay on version 20210315.3568f96-3 ?

It's fixed in upstream linux-firmware 20210818.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [Intel-gfx] [PATCH v5] drm/i915: Use Transparent Hugepages when IOMMU is enabled

2021-09-10 Thread Tvrtko Ursulin



On 09/09/2021 17:17, Rodrigo Vivi wrote:

On Thu, Sep 09, 2021 at 12:44:48PM +0100, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

Usage of Transparent Hugepages was disabled in 9987da4b5dcf
("drm/i915: Disable THP until we have a GPU read BW W/A"), but since it
appears the majority of performance regressions reported with an enabled IOMMU
can be almost eliminated by turning them on, let's just do that.

To err on the side of safety we keep the current default in cases where
IOMMU is not active, and only when it is default to the "huge=within_size"
mode. Although there probably would be wins to enable them throughout,
more extensive testing across benchmarks and platforms would need to be
done.

With the patch and IOMMU enabled my local testing on a small Skylake part
shows OglVSTangent regression being reduced from ~14% (IOMMU on versus
IOMMU off) to ~2% (same comparison but with THP on).

More detailed testing done in the below referenced Gitlab issue by Eero:

Skylake GT4e:

Performance drops from enabling IOMMU:

 30-35% SynMark CSDof
 20-25% Unigine Heaven, MemBW GPU write, SynMark VSTangent
 ~20% GLB Egypt  (1/2 screen window)
 10-15% GLB T-Rex (1/2 screen window)
 8-10% GfxBench T-Rex, MemBW GPU blit
 7-8% SynMark DeferredAA + TerrainFly* + ZBuffer
 6-7% GfxBench Manhattan 3.0 + 3.1, SynMark TexMem128 & CSCloth
 5-6% GfxBench CarChase, Unigine Valley
 3-5% GfxBench Vulkan & GL AztecRuins + ALU2, MemBW GPU texture,
  SynMark Fill*, Deferred, TerrainPan*
 1-2% Most of the other tests

With the patch drops become:

 20-25% SynMark TexMem*
 15-20% GLB Egypt (1/2 screen window)
 10-15% GLB T-Rex (1/2 screen window)
 4-7% GfxBench T-Rex, GpuTest Triangle
 1-8% GfxBench ALU2 (offscreen 1%, onscreen 8%)
 3% GfxBench Manhattan 3.0, SynMark CSDof
 2-3% Unigine Heaven + Valley, MemBW GPU texture
 1-3 GfxBench Manhattan 3.1 + CarChase + Vulkan & GL AztecRuins

Broxton:

Performance drops from IOMMU, without patch:

 30% MemBW GPU write
 25% SynMark ZBuffer + Fill*
 20% MemBW GPU blit
 15% MemBW GPU blend, GpuTest Triangle
 10-15% MemBW GPU texture
 10% GLB Egypt, Unigine Heaven (had hangs), SynMark TerrainFly*
 7-9% GLB T-Rex, GfxBench Manhattan 3.0 + T-Rex,
  SynMark Deferred* + TexMem*
 6-8% GfxBench CarChase, Unigine Valley,
  SynMark CSCloth + ShMapVsm + TerrainPan*
 5-6% GfxBench Manhattan 3.1 + GL AztecRuins,
  SynMark CSDof + TexFilterTri
 2-4% GfxBench ALU2, SynMark DrvRes + GSCloth + ShMapPcf + Batch[0-5] +
  TexFilterAniso, GpuTest GiMark + 32-bit Julia

And with patch:

 15-20% MemBW GPU texture
 10% SynMark TexMem*
 8-9% GLB Egypt (1/2 screen window)
 4-5% GLB T-Rex (1/2 screen window)
 3-6% GfxBench Manhattan 3.0, GpuTest FurMark,
  SynMark Deferred + TexFilterTri
 3-4% GfxBench Manhattan 3.1 + T-Rex, SynMark VSInstancing
 2-4% GpuTest Triangle, SynMark DeferredAA
 2-3% Unigine Heaven + Valley
 1-3% SynMark Terrain*
 1-2% GfxBench CarChase, SynMark TexFilterAniso + ZBuffer

Tigerlake-H:

 20-25% MemBW GPU texture
 15-20% GpuTest Triangle
 13-15% SynMark TerrainFly* + DeferredAA + HdrBloom
 8-10% GfxBench Manhattan 3.1, SynMark TerrainPan* + DrvRes
 6-7% GfxBench Manhattan 3.0, SynMark TexMem*
 4-8% GLB onscreen Fill + T-Rex + Egypt (more in onscreen than
  offscreen versions of T-Rex/Egypt)
 4-6% GfxBench CarChase + GLES AztecRuins + ALU2, GpuTest 32-bit Julia,
  SynMark CSDof + DrvState
 3-5% GfxBench T-Rex + Egypt, Unigine Heaven + Valley, GpuTest Plot3D
 1-7% Media tests
 2-3% MemBW GPU blit
 1-3% Most of the rest of 3D tests

With the patch:

 6-8% MemBW GPU blend => the only regression in these tests (compared
  to IOMMU without THP)
 4-6% SynMark DrvState (not impacted) + HdrBloom (improved)
 3-4% GLB T-Rex
 ~3% GLB Egypt, SynMark DrvRes
 1-3% GfxBench T-Rex + Egypt, SynMark TexFilterTri
 1-2% GfxBench CarChase + GLES AztecRuins, Unigine Valley,
 GpuTest Triangle
 ~1% GfxBench Manhattan 3.0/3.1, Unigine Heaven

Perf of several tests actually improved with IOMMU + THP, compared to no
IOMMU / no THP:

 10-15% SynMark Batch[0-3]
 5-10% MemBW GPU texture, SynMark ShMapVsm
 3-4% SynMark Fill* + Geom*
 2-3% SynMark TexMem512 + CSCloth
 1-2% SynMark TexMem128 + DeferredAA

As a summary across all platforms, these are the benchmarks where enabling
THP on top of IOMMU enabled brings regressions:

  * Skylake GT4e:
20-25% SynMark TexMem*
(whereas all MemBW GPU tests either improve or are not affected)

  * Broxton J4205:
7% MemBW GPU texture
2-3% SynMark TexMem*

  * Tigerlake-H:
7% MemBW GPU blend

Other benchmarks show either lowering of regressions or improvements.

v2:
  * Add Kconfig dependency to transparent hugepages and some help text.
  * Move to helper for e

Re: [PATCH v2 5/9] vfio/mdev: Consolidate all the device_api sysfs into the core code

2021-09-10 Thread Jason Gunthorpe
On Fri, Sep 10, 2021 at 01:10:46PM +0100, Christoph Hellwig wrote:
> On Thu, Sep 09, 2021 at 04:38:45PM -0300, Jason Gunthorpe wrote:
> > Every driver just emits a static string, simply feed it through the ops
> > and provide a standard sysfs show function.
> 
> Looks sensible.  But can you make the attribute optional and add a
> comment marking it deprecated?  Because it really is completely useless.
> We don't version userspace APIs, userspae has to discover new features
> individually by e.g. finding new sysfs files or just trying new ioctls.

To be honest I have no idea what side effects that would have..

device code search tells me libvirt reads it and stuffs it into some
XML

Something called mdevctl touches it, feeds it into some JSON and
other stuff..

qemu has some VFIO_DEVICE_API_* constants but it is all dead code

I agree it shouldn't have been there in the first place

Cornelia? Alex? Any thoughts?

Jason


Re: [PATCH v3 1/6] drm/vc4: select PM

2021-09-10 Thread Dave Stevenson
On Thu, 19 Aug 2021 at 14:59, Maxime Ripard  wrote:
>
> We already depend on runtime PM to get the power domains and clocks for
> most of the devices supported by the vc4 driver, so let's just select it
> to make sure it's there, and remove the ifdef.
>
> Signed-off-by: Maxime Ripard 

Reviewed-by: Dave Stevenson 

> ---
>  drivers/gpu/drm/vc4/Kconfig| 1 +
>  drivers/gpu/drm/vc4/vc4_hdmi.c | 2 --
>  2 files changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/vc4/Kconfig b/drivers/gpu/drm/vc4/Kconfig
> index 118e8a426b1a..f774ab340863 100644
> --- a/drivers/gpu/drm/vc4/Kconfig
> +++ b/drivers/gpu/drm/vc4/Kconfig
> @@ -9,6 +9,7 @@ config DRM_VC4
> select DRM_KMS_CMA_HELPER
> select DRM_GEM_CMA_HELPER
> select DRM_PANEL_BRIDGE
> +   select PM
> select SND_PCM
> select SND_PCM_ELD
> select SND_SOC_GENERIC_DMAENGINE_PCM
> diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
> index c2876731ee2d..602203b2d8e1 100644
> --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> @@ -2107,7 +2107,6 @@ static int vc5_hdmi_init_resources(struct vc4_hdmi 
> *vc4_hdmi)
> return 0;
>  }
>
> -#ifdef CONFIG_PM
>  static int vc4_hdmi_runtime_suspend(struct device *dev)
>  {
> struct vc4_hdmi *vc4_hdmi = dev_get_drvdata(dev);
> @@ -2128,7 +2127,6 @@ static int vc4_hdmi_runtime_resume(struct device *dev)
>
> return 0;
>  }
> -#endif
>
>  static int vc4_hdmi_bind(struct device *dev, struct device *master, void 
> *data)
>  {
> --
> 2.31.1
>


Re: [PATCH] drm/vc4: hdmi: Remove unused struct

2021-09-10 Thread Dave Stevenson
On Thu, 19 Aug 2021 at 15:08, Maxime Ripard  wrote:
>
> Commit c7d30623540b ("drm/vc4: hdmi: Remove unused struct") removed the
> references to the vc4_hdmi_audio_widgets and vc4_hdmi_audio_routes
> structures, but not the structures themselves, resulting in two warnings.
> Remove them.
>
> Fixes: c7d30623540b ("drm/vc4: hdmi: Remove unused struct")
> Reported-by: kernel test robot 
> Signed-off-by: Maxime Ripard 

Reviewed-by: Dave Stevenson 

> ---
>  drivers/gpu/drm/vc4/vc4_hdmi.c | 8 
>  1 file changed, 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
> index b7dc32a0c9bb..1e2d976e8736 100644
> --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> @@ -1403,14 +1403,6 @@ static int vc4_hdmi_audio_prepare(struct device *dev, 
> void *data,
> return 0;
>  }
>
> -static const struct snd_soc_dapm_widget vc4_hdmi_audio_widgets[] = {
> -   SND_SOC_DAPM_OUTPUT("TX"),
> -};
> -
> -static const struct snd_soc_dapm_route vc4_hdmi_audio_routes[] = {
> -   { "TX", NULL, "Playback" },
> -};
> -
>  static const struct snd_soc_component_driver vc4_hdmi_audio_cpu_dai_comp = {
> .name = "vc4-hdmi-cpu-dai-component",
>  };
> --
> 2.31.1
>


Re: [Intel-gfx] [PATCH 0/6] i915: Simplify mmio handling & add new DG2 shadow table

2021-09-10 Thread Matt Roper
On Fri, Sep 10, 2021 at 02:03:44PM +0100, Tvrtko Ursulin wrote:
> 
> On 10/09/2021 06:33, Matt Roper wrote:
> > Our uncore MMIO functions for reading/writing registers have become very
> > complicated over time.  There's significant macro magic used to generate
> > several nearly-identical functions that only really differ in terms of
> > which platform-specific shadow register table they should check on write
> > operations.  We can significantly simplify our MMIO handlers by storing
> > a reference to the current platform's shadow table within the 'struct
> > intel_uncore' the same way we already do for forcewake; this allows us
> > to consolidate the multiple variants of each 'write' function down to
> > just a single 'fwtable' version that gets the shadow table out of the
> > uncore struct rather than hardcoding the name of a specific platform's
> > table.  We can do similar consolidation on the MMIO read side by
> > creating a single-entry forcewake table to replace the open-coded range
> > check they had been using previously.
> > 
> > The final patch of the series adds a new shadow table for DG2; this
> > becomes quite clean and simple now, given the refactoring in the first
> > five patches.
> 
> Tidy and it ends up saving kernel binary size.
> 
> However I am undecided yet, because one thing to note is that the trade off
> is source code and kernel text consolidation at the expense of more indirect
> calls at runtime and larger common read/write functions.
> 
> To expand, current code generates a bunch of per gen functions but in doing
> so it manages to inline a bunch of checks like NEEDS_FORCE_WAKE and BSEARCH
> (from find_fw_domain) so at runtime each platform mmio read/write does not
> have to do indirect calls to do lookups.
> 
> It may not matter a lot in the grand scheme of things, but this trade-off is
> something to note in the cover letter I think.

That's true.  However it seems like if the extra indirect calls are good
enough for our forcewake lookups (which are called more frequently and
have to search through much larger tables) then using the same strategy
for shadow registers should be less of a concern.  Plus most of the
timing-critical parts of the code don't call through this at all; they
just grab an explicit forcewake and then issue a bunch of *_fw()
operations that skip all the per-register forcewake and shadow handling.

But you're right that this is something I should mention more clearly in
the cover letter.
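For readers following along, the consolidation being discussed can be sketched outside the kernel as a per-platform range table stored in the uncore-like struct plus one generic bsearch check, replacing per-gen hardcoded variants. All names below are illustrative stand-ins, not the actual i915 code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* An inclusive register-offset range, like an i915 shadow/forcewake entry. */
struct reg_range {
    uint32_t start;
    uint32_t end;
};

struct fake_uncore {
    /* Per-platform shadow table, stored once at init (as done for forcewake). */
    const struct reg_range *shadow_table;
    size_t shadow_count;
};

static int range_cmp(const void *key, const void *elem)
{
    uint32_t offset = *(const uint32_t *)key;
    const struct reg_range *r = elem;

    if (offset < r->start)
        return -1;
    if (offset > r->end)
        return 1;
    return 0;
}

/*
 * One generic "fwtable"-style check replaces the per-platform variants:
 * instead of hardcoding a table name per gen, bsearch whatever sorted
 * table the struct was initialized with (the indirect lookup discussed
 * in the thread).
 */
int is_shadowed(const struct fake_uncore *uncore, uint32_t offset)
{
    return bsearch(&offset, uncore->shadow_table, uncore->shadow_count,
                   sizeof(struct reg_range), range_cmp) != NULL;
}
```

The trade-off noted above is visible here: the per-gen approach can inline the range check, while the consolidated version always goes through the table lookup.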


Matt

> 
> Regards,
> 
> Tvrtko
> 
> > Matt Roper (6):
> >drm/i915/uncore: Convert gen6/gen7 read operations to fwtable
> >drm/i915/uncore: Associate shadow table with uncore
> >drm/i915/uncore: Replace gen8 write functions with general fwtable
> >drm/i915/uncore: Drop gen11/gen12 mmio write handlers
> >drm/i915/uncore: Drop gen11 mmio read handlers
> >drm/i915/dg2: Add DG2-specific shadow register table
> > 
> >   drivers/gpu/drm/i915/intel_uncore.c   | 190 ++
> >   drivers/gpu/drm/i915/intel_uncore.h   |   7 +
> >   drivers/gpu/drm/i915/selftests/intel_uncore.c |   1 +
> >   3 files changed, 110 insertions(+), 88 deletions(-)
> > 

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795


Re: [RFC PATCH] drm/ttm: Add a private member to the struct ttm_resource

2021-09-10 Thread Christian König




Am 10.09.21 um 15:15 schrieb Thomas Hellström:

Both the provider (resource manager) and the consumer (the TTM driver)
want to subclass struct ttm_resource. Since this is left for the resource
manager, we need to provide a private pointer for the TTM driver.

Provide a struct ttm_resource_private for the driver to subclass for
data with the same lifetime as the struct ttm_resource: in the i915 case
it will, for example, be an sg-table and a radix tree into the LMEM/VRAM
pages that currently are awkwardly attached to the GEM object.

Provide an ops structure for associated ops (which is only destroy() ATM).
It might seem pointless to provide a separate ops structure, but Linus
has previously made it clear that that's the norm.

After a careful audit one could perhaps also, on a per-driver basis,
replace the delete_mem_notify() TTM driver callback with the above
destroy function.


Well this is a really big NAK to this approach.

If you need to attach some additional information to the resource then 
implement your own resource manager like everybody else does.


Regards,
Christian.



Cc: Matthew Auld 
Cc: König Christian 
Signed-off-by: Thomas Hellström 
---
  drivers/gpu/drm/ttm/ttm_resource.c | 10 +++---
  include/drm/ttm/ttm_resource.h | 28 
  2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
index 2431717376e7..973e7c50bfed 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -57,13 +57,17 @@ int ttm_resource_alloc(struct ttm_buffer_object *bo,
  void ttm_resource_free(struct ttm_buffer_object *bo, struct ttm_resource **res)
  {
struct ttm_resource_manager *man;
+   struct ttm_resource *resource = *res;
  
-	if (!*res)

+   if (!resource)
return;
  
-	man = ttm_manager_type(bo->bdev, (*res)->mem_type);

-   man->func->free(man, *res);
*res = NULL;
+   if (resource->priv)
+   resource->priv->ops.destroy(resource->priv);
+
+   man = ttm_manager_type(bo->bdev, resource->mem_type);
+   man->func->free(man, resource);
  }
  EXPORT_SYMBOL(ttm_resource_free);
  
diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h

index 140b6b9a8bbe..5a22c9a29c05 100644
--- a/include/drm/ttm/ttm_resource.h
+++ b/include/drm/ttm/ttm_resource.h
@@ -44,6 +44,7 @@ struct dma_buf_map;
  struct io_mapping;
  struct sg_table;
  struct scatterlist;
+struct ttm_resource_private;
  
  struct ttm_resource_manager_func {

/**
@@ -153,6 +154,32 @@ struct ttm_bus_placement {
enum ttm_cachingcaching;
  };
  
+/**

+ * struct ttm_resource_private_ops - Operations for a struct
+ * ttm_resource_private
+ *
+ * Not much benefit to keep this as a separate struct with only a single member,
+ * but keeping a separate ops struct is the norm.
+ */
+struct ttm_resource_private_ops {
+   /**
+* destroy() - Callback to destroy the private data
+* @priv - The private data to destroy
+*/
+   void (*destroy) (struct ttm_resource_private *priv);
+};
+
+/**
+ * struct ttm_resource_private - TTM driver private data
+ * @ops: Pointer to struct ttm_resource_private_ops with associated operations
+ *
+ * Intended to be subclassed to hold, for example, cached data sharing the
+ * lifetime with a struct ttm_resource.
+ */
+struct ttm_resource_private {
+   const struct ttm_resource_private_ops ops;
+};
+
  /**
   * struct ttm_resource
   *
@@ -171,6 +198,7 @@ struct ttm_resource {
uint32_t mem_type;
uint32_t placement;
struct ttm_bus_placement bus;
+   struct ttm_resource_private *priv;
  };
  
  /**



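The subclassing pattern the RFC proposes can be shown standalone as a userspace mock that mirrors the proposed ttm_resource_private API (the const is dropped from ops here so the hook can be assigned at runtime, and none of the `my_driver_*` helper names exist in TTM; this is a sketch of the pattern, not the kernel code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Mock of the proposed API; the real structs live in ttm_resource.h. */
struct ttm_resource_private;

struct ttm_resource_private_ops {
    void (*destroy)(struct ttm_resource_private *priv);
};

struct ttm_resource_private {
    struct ttm_resource_private_ops ops;
};

/*
 * A driver subclass: embed the base struct and recover the containing
 * object with container_of-style pointer arithmetic in the destroy hook.
 */
struct my_driver_private {
    struct ttm_resource_private base;
    int *cached_data;   /* stands in for an sg-table / radix tree */
};

static void my_driver_private_destroy(struct ttm_resource_private *priv)
{
    struct my_driver_private *p =
        (struct my_driver_private *)((char *)priv -
            offsetof(struct my_driver_private, base));

    free(p->cached_data);
    free(p);
}

struct my_driver_private *my_driver_private_create(void)
{
    struct my_driver_private *p = calloc(1, sizeof(*p));

    p->base.ops.destroy = my_driver_private_destroy;
    p->cached_data = calloc(4, sizeof(int));
    return p;
}
```

In the RFC, ttm_resource_free() would invoke `priv->ops.destroy(priv)` before handing the resource back to the manager, which is what the test below simulates.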

[PATCH v5 2/3] arm64: tegra: Add NVDEC to Tegra186/194 device trees

2021-09-10 Thread Mikko Perttunen
Add a device tree node for NVDEC on Tegra186, and
device tree nodes for NVDEC and NVDEC1 on Tegra194.

Signed-off-by: Mikko Perttunen 
---
v5:
* Change from nvidia,instance to nvidia,host1x-class
v4:
* Add dma-coherent markers
v3:
* Change read2 to read-1
v2:
* Add NVDECSRD1 memory client
* Add also to T194 (both NVDEC0/1)
---
 arch/arm64/boot/dts/nvidia/tegra186.dtsi | 16 ++
 arch/arm64/boot/dts/nvidia/tegra194.dtsi | 38 
 2 files changed, 54 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra186.dtsi b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
index d02f6bf3e2ca..4f2f21242b2c 100644
--- a/arch/arm64/boot/dts/nvidia/tegra186.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
@@ -1342,6 +1342,22 @@ dsib: dsi@1540 {
power-domains = <&bpmp TEGRA186_POWER_DOMAIN_DISP>;
};
 
+   nvdec@1548 {
+   compatible = "nvidia,tegra186-nvdec";
+   reg = <0x1548 0x4>;
+   clocks = <&bpmp TEGRA186_CLK_NVDEC>;
+   clock-names = "nvdec";
+   resets = <&bpmp TEGRA186_RESET_NVDEC>;
+   reset-names = "nvdec";
+
+   power-domains = <&bpmp TEGRA186_POWER_DOMAIN_NVDEC>;
+   interconnects = <&mc TEGRA186_MEMORY_CLIENT_NVDECSRD &emc>,
+   <&mc TEGRA186_MEMORY_CLIENT_NVDECSRD1 &emc>,
+   <&mc TEGRA186_MEMORY_CLIENT_NVDECSWR &emc>;
+   interconnect-names = "dma-mem", "read-1", "write";
+   iommus = <&smmu TEGRA186_SID_NVDEC>;
+   };
+
sor0: sor@1554 {
compatible = "nvidia,tegra186-sor";
reg = <0x1554 0x1>;
diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index 5ba7a4519b95..04e883aa7aa2 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -1412,6 +1412,25 @@ host1x@13e0 {
interconnect-names = "dma-mem";
iommus = <&smmu TEGRA194_SID_HOST1X>;
 
+   nvdec@1514 {
+   compatible = "nvidia,tegra194-nvdec";
+   reg = <0x1514 0x0004>;
+   clocks = <&bpmp TEGRA194_CLK_NVDEC1>;
+   clock-names = "nvdec";
+   resets = <&bpmp TEGRA194_RESET_NVDEC1>;
+   reset-names = "nvdec";
+
+   power-domains = <&bpmp TEGRA194_POWER_DOMAIN_NVDECB>;
+   interconnects = <&mc TEGRA194_MEMORY_CLIENT_NVDEC1SRD &emc>,
+   <&mc TEGRA194_MEMORY_CLIENT_NVDEC1SRD1 &emc>,
+   <&mc TEGRA194_MEMORY_CLIENT_NVDEC1SWR &emc>;
+   interconnect-names = "dma-mem", "read-1", "write";
+   iommus = <&smmu TEGRA194_SID_NVDEC1>;
+   dma-coherent;
+
+   nvidia,host1x-class = <0xf5>;
+   };
+
display-hub@1520 {
compatible = "nvidia,tegra194-display";
reg = <0x1520 0x0004>;
@@ -1525,6 +1544,25 @@ vic@1534 {
iommus = <&smmu TEGRA194_SID_VIC>;
};
 
+   nvdec@1548 {
+   compatible = "nvidia,tegra194-nvdec";
+   reg = <0x1548 0x0004>;
+   clocks = <&bpmp TEGRA194_CLK_NVDEC>;
+   clock-names = "nvdec";
+   resets = <&bpmp TEGRA194_RESET_NVDEC>;
+   reset-names = "nvdec";
+
+   power-domains = <&bpmp TEGRA194_POWER_DOMAIN_NVDECA>;
+   interconnects = <&mc TEGRA194_MEMORY_CLIENT_NVDECSRD &emc>,
+   <&mc TEGRA194_MEMORY_CLIENT_NVDECSRD1 &emc>,
+   <&mc TEGRA194_MEMORY_CLIENT_NVDECSWR &emc>;
+   interconnect-names = "dma-mem", "read-1", "write";
+   iommus = <&smmu TEGRA194_SID_NVDEC>;
+   dma-coherent;
+
+   nvidia,host1x-class = <0xf0>;
+   };
+
dpaux0: dpaux@155c {
compatible = "nvidia,tegra194-dpaux";
reg = <0x155c 0x1>;
-- 
2.32.0



[PATCH v5 3/3] drm/tegra: Add NVDEC driver

2021-09-10 Thread Mikko Perttunen
Add support for booting and using NVDEC on Tegra210, Tegra186
and Tegra194 to the Host1x and TegraDRM drivers. Booting in
secure mode is not currently supported.

Signed-off-by: Mikko Perttunen 
---
v5:
* Remove num_instances
* Change from nvidia,instance to nvidia,host1x-class
v3:
* Change num_instances to unsigned int
* Remove unnecessary '= 0' initializer
* Populate num_instances data
* Fix instance number check
v2:
* Use devm_platform_get_and_ioremap_resource
* Remove reset handling, done by power domain code
* Assume runtime PM is enabled
---
 drivers/gpu/drm/tegra/Makefile |   3 +-
 drivers/gpu/drm/tegra/drm.c|   4 +
 drivers/gpu/drm/tegra/drm.h|   1 +
 drivers/gpu/drm/tegra/nvdec.c  | 464 +
 drivers/gpu/host1x/dev.c   |  18 ++
 include/linux/host1x.h |   2 +
 6 files changed, 491 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/tegra/nvdec.c

diff --git a/drivers/gpu/drm/tegra/Makefile b/drivers/gpu/drm/tegra/Makefile
index 5d2039f0c734..b248c631f790 100644
--- a/drivers/gpu/drm/tegra/Makefile
+++ b/drivers/gpu/drm/tegra/Makefile
@@ -24,7 +24,8 @@ tegra-drm-y := \
gr2d.o \
gr3d.o \
falcon.o \
-   vic.o
+   vic.o \
+   nvdec.o
 
 tegra-drm-y += trace.o
 
diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index b20fd0833661..5f5afd7ba37e 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -1337,15 +1337,18 @@ static const struct of_device_id host1x_drm_subdevs[] = {
{ .compatible = "nvidia,tegra210-sor", },
{ .compatible = "nvidia,tegra210-sor1", },
{ .compatible = "nvidia,tegra210-vic", },
+   { .compatible = "nvidia,tegra210-nvdec", },
{ .compatible = "nvidia,tegra186-display", },
{ .compatible = "nvidia,tegra186-dc", },
{ .compatible = "nvidia,tegra186-sor", },
{ .compatible = "nvidia,tegra186-sor1", },
{ .compatible = "nvidia,tegra186-vic", },
+   { .compatible = "nvidia,tegra186-nvdec", },
{ .compatible = "nvidia,tegra194-display", },
{ .compatible = "nvidia,tegra194-dc", },
{ .compatible = "nvidia,tegra194-sor", },
{ .compatible = "nvidia,tegra194-vic", },
+   { .compatible = "nvidia,tegra194-nvdec", },
{ /* sentinel */ }
 };
 
@@ -1369,6 +1372,7 @@ static struct platform_driver * const drivers[] = {
&tegra_gr2d_driver,
&tegra_gr3d_driver,
&tegra_vic_driver,
+   &tegra_nvdec_driver,
 };
 
 static int __init host1x_drm_init(void)
diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
index 8b28327c931c..fc0a19554eac 100644
--- a/drivers/gpu/drm/tegra/drm.h
+++ b/drivers/gpu/drm/tegra/drm.h
@@ -202,5 +202,6 @@ extern struct platform_driver tegra_sor_driver;
 extern struct platform_driver tegra_gr2d_driver;
 extern struct platform_driver tegra_gr3d_driver;
 extern struct platform_driver tegra_vic_driver;
+extern struct platform_driver tegra_nvdec_driver;
 
 #endif /* HOST1X_DRM_H */
diff --git a/drivers/gpu/drm/tegra/nvdec.c b/drivers/gpu/drm/tegra/nvdec.c
new file mode 100644
index ..c3b6fe7fb454
--- /dev/null
+++ b/drivers/gpu/drm/tegra/nvdec.c
@@ -0,0 +1,464 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2015-2021, NVIDIA Corporation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "drm.h"
+#include "falcon.h"
+#include "vic.h"
+
+struct nvdec_config {
+   const char *firmware;
+   unsigned int version;
+   bool supports_sid;
+};
+
+struct nvdec {
+   struct falcon falcon;
+
+   void __iomem *regs;
+   struct tegra_drm_client client;
+   struct host1x_channel *channel;
+   struct device *dev;
+   struct clk *clk;
+
+   /* Platform configuration */
+   const struct nvdec_config *config;
+};
+
+static inline struct nvdec *to_nvdec(struct tegra_drm_client *client)
+{
+   return container_of(client, struct nvdec, client);
+}
+
+static void nvdec_writel(struct nvdec *nvdec, u32 value, unsigned int offset)
+{
+   writel(value, nvdec->regs + offset);
+}
+
+static int nvdec_boot(struct nvdec *nvdec)
+{
+#ifdef CONFIG_IOMMU_API
+   struct iommu_fwspec *spec = dev_iommu_fwspec_get(nvdec->dev);
+#endif
+   int err;
+
+#ifdef CONFIG_IOMMU_API
+   if (nvdec->config->supports_sid && spec) {
+   u32 value;
+
value = TRANSCFG_ATT(1, TRANSCFG_SID_FALCON) | TRANSCFG_ATT(0, TRANSCFG_SID_HW);
+   nvdec_writel(nvdec, value, VIC_TFBIF_TRANSCFG);
+
+   if (spec->num_ids > 0) {
+   value = spec->ids[0] & 0x;
+
+   nvdec_writel(nvdec, value, VIC_THI_STREAMID0);
+   nvdec_writel(nvdec, value, VIC_THI_STREAMID1);
+   }
+   }
+#endif
+
+   err = falcon_boot(&nvdec->falcon);
+   if (

[PATCH v5 1/3] dt-bindings: Add YAML bindings for NVDEC

2021-09-10 Thread Mikko Perttunen
Add YAML device tree bindings for NVDEC, now in a more appropriate
place compared to the old textual Host1x bindings.

Signed-off-by: Mikko Perttunen 
---
v5:
* Changed from nvidia,instance to nvidia,host1x-class optional
  property.
* Added dma-coherent
v4:
* Fix incorrect compatibility string in 'if' condition
v3:
* Drop host1x bindings
* Change read2 to read-1 in interconnect names
v2:
* Fix issues pointed out in v1
* Add T194 nvidia,instance property
---
 .../gpu/host1x/nvidia,tegra210-nvdec.yaml | 104 ++
 MAINTAINERS   |   1 +
 2 files changed, 105 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml

diff --git a/Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml b/Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml
new file mode 100644
index ..f1f8d083d736
--- /dev/null
+++ b/Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml
@@ -0,0 +1,104 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: "http://devicetree.org/schemas/gpu/host1x/nvidia,tegra210-nvdec.yaml#";
+$schema: "http://devicetree.org/meta-schemas/core.yaml#";
+
+title: Device tree binding for NVIDIA Tegra NVDEC
+
+description: |
+  NVDEC is the hardware video decoder present on NVIDIA Tegra210
+  and newer chips. It is located on the Host1x bus and typically
+  programmed through Host1x channels.
+
+maintainers:
+  - Thierry Reding 
+  - Mikko Perttunen 
+
+properties:
+  $nodename:
+pattern: "^nvdec@[0-9a-f]*$"
+
+  compatible:
+enum:
+  - nvidia,tegra210-nvdec
+  - nvidia,tegra186-nvdec
+  - nvidia,tegra194-nvdec
+
+  reg:
+maxItems: 1
+
+  clocks:
+maxItems: 1
+
+  clock-names:
+items:
+  - const: nvdec
+
+  resets:
+maxItems: 1
+
+  reset-names:
+items:
+  - const: nvdec
+
+  power-domains:
+maxItems: 1
+
+  iommus:
+maxItems: 1
+
+  dma-coherent: true
+
+  interconnects:
+items:
+  - description: DMA read memory client
+  - description: DMA read 2 memory client
+  - description: DMA write memory client
+
+  interconnect-names:
+items:
+  - const: dma-mem
+  - const: read-1
+  - const: write
+
+  nvidia,host1x-class:
description: Host1x class of the engine. If not specified, defaults to 0xf0.
+$ref: /schemas/types.yaml#/definitions/uint32
+
+required:
+  - compatible
+  - reg
+  - clocks
+  - clock-names
+  - resets
+  - reset-names
+  - power-domains
+
+additionalProperties: false
+
+examples:
+  - |
+#include 
+#include 
+#include 
+#include 
+#include 
+
+nvdec@1548 {
+compatible = "nvidia,tegra186-nvdec";
+reg = <0x1548 0x4>;
+clocks = <&bpmp TEGRA186_CLK_NVDEC>;
+clock-names = "nvdec";
+resets = <&bpmp TEGRA186_RESET_NVDEC>;
+reset-names = "nvdec";
+
+power-domains = <&bpmp TEGRA186_POWER_DOMAIN_NVDEC>;
+interconnects = <&mc TEGRA186_MEMORY_CLIENT_NVDECSRD &emc>,
+<&mc TEGRA186_MEMORY_CLIENT_NVDECSRD1 &emc>,
+<&mc TEGRA186_MEMORY_CLIENT_NVDECSWR &emc>;
+interconnect-names = "dma-mem", "read-1", "write";
+iommus = <&smmu TEGRA186_SID_NVDEC>;
+};
+
+
diff --git a/MAINTAINERS b/MAINTAINERS
index 69932194e1ba..ce9e360639d5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6230,6 +6230,7 @@ L:linux-te...@vger.kernel.org
 S: Supported
 T: git git://anongit.freedesktop.org/tegra/linux.git
F: Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.txt
+F: Documentation/devicetree/bindings/gpu/host1x/
 F: drivers/gpu/drm/tegra/
 F: drivers/gpu/host1x/
 F: include/linux/host1x.h
-- 
2.32.0



[PATCH v5 0/3] NVIDIA Tegra NVDEC support

2021-09-10 Thread Mikko Perttunen
Here's the v5 of the NVDEC support series, containing the
following changes:

* Changed from nvidia,instance property to nvidia,host1x-class
  property.
* Set additionalProperties to false in DT bindings.
* Added dma-coherent property to DT bindings.

NVDEC hardware documentation can be found at
https://github.com/NVIDIA/open-gpu-doc/tree/master/classes/video

and example userspace can be found at
https://github.com/cyndis/vaapi-tegra-driver

Thanks,
Mikko

Mikko Perttunen (3):
  dt-bindings: Add YAML bindings for NVDEC
  arm64: tegra: Add NVDEC to Tegra186/194 device trees
  drm/tegra: Add NVDEC driver

 .../gpu/host1x/nvidia,tegra210-nvdec.yaml | 104 
 MAINTAINERS   |   1 +
 arch/arm64/boot/dts/nvidia/tegra186.dtsi  |  16 +
 arch/arm64/boot/dts/nvidia/tegra194.dtsi  |  38 ++
 drivers/gpu/drm/tegra/Makefile|   3 +-
 drivers/gpu/drm/tegra/drm.c   |   4 +
 drivers/gpu/drm/tegra/drm.h   |   1 +
 drivers/gpu/drm/tegra/nvdec.c | 464 ++
 drivers/gpu/host1x/dev.c  |  18 +
 include/linux/host1x.h|   2 +
 10 files changed, 650 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/devicetree/bindings/gpu/host1x/nvidia,tegra210-nvdec.yaml
 create mode 100644 drivers/gpu/drm/tegra/nvdec.c

-- 
2.32.0



Re: [PATCH v3 2/8] mm: Introduce a function to check for confidential computing features

2021-09-10 Thread Borislav Petkov
On Wed, Sep 08, 2021 at 05:58:33PM -0500, Tom Lendacky wrote:
> In prep for other confidential computing technologies, introduce a generic

preparation

> helper function, cc_platform_has(), that can be used to check for specific
> active confidential computing attributes, like memory encryption. This is
> intended to eliminate having to add multiple technology-specific checks to
> the code (e.g. if (sev_active() || tdx_active())).

...

> diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
> new file mode 100644
> index ..253f3ea66cd8
> --- /dev/null
> +++ b/include/linux/cc_platform.h
> @@ -0,0 +1,88 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Confidential Computing Platform Capability checks
> + *
> + * Copyright (C) 2021 Advanced Micro Devices, Inc.
> + *
> + * Author: Tom Lendacky 
> + */
> +
> +#ifndef _CC_PLATFORM_H

_LINUX_CC_PLATFORM_H

> +#define _CC_PLATFORM_H

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


[PATCH 1/1] lib, stackdepot: Add helper to print stack entries into buffer.

2021-09-10 Thread Imran Khan
To print stack entries into a buffer, users of stackdepot first get a
list of stack entries using stack_depot_fetch and then print this list
into a buffer using stack_trace_snprint. Provide a helper in stackdepot
for this purpose. Also change the above-mentioned users to use this
helper.

Signed-off-by: Imran Khan 
Suggested-by: Vlastimil Babka 
---
 drivers/gpu/drm/drm_dp_mst_topology.c   |  5 +
 drivers/gpu/drm/drm_mm.c|  5 +
 drivers/gpu/drm/i915/i915_vma.c |  5 +
 drivers/gpu/drm/i915/intel_runtime_pm.c | 20 +---
 include/linux/stackdepot.h  |  3 +++
 lib/stackdepot.c| 23 +++
 mm/page_owner.c |  5 +
 7 files changed, 35 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index 86d13d6bc463..2d1adab9e360 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -1668,13 +1668,10 @@ __dump_topology_ref_history(struct 
drm_dp_mst_topology_ref_history *history,
for (i = 0; i < history->len; i++) {
const struct drm_dp_mst_topology_ref_entry *entry =
&history->entries[i];
-   ulong *entries;
-   uint nr_entries;
u64 ts_nsec = entry->ts_nsec;
u32 rem_nsec = do_div(ts_nsec, 10);
 
-   nr_entries = stack_depot_fetch(entry->backtrace, &entries);
-   stack_trace_snprint(buf, PAGE_SIZE, entries, nr_entries, 4);
+   stack_depot_snprint(entry->backtrace, buf, PAGE_SIZE, 4);
 
drm_printf(&p, "  %d %ss (last at %5llu.%06u):\n%s",
   entry->count,
diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 93d48a6f04ab..ca04d7f6f7b5 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -118,8 +118,6 @@ static noinline void save_stack(struct drm_mm_node *node)
 static void show_leaks(struct drm_mm *mm)
 {
struct drm_mm_node *node;
-   unsigned long *entries;
-   unsigned int nr_entries;
char *buf;
 
buf = kmalloc(BUFSZ, GFP_KERNEL);
@@ -133,8 +131,7 @@ static void show_leaks(struct drm_mm *mm)
continue;
}
 
-   nr_entries = stack_depot_fetch(node->stack, &entries);
-   stack_trace_snprint(buf, BUFSZ, entries, nr_entries, 0);
+   stack_depot_snprint(node->stack, buf, BUFSZ, 0);
DRM_ERROR("node [%08llx + %08llx]: inserted at\n%s",
  node->start, node->size, buf);
}
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 4b7fc4647e46..f2d9ed375109 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -56,8 +56,6 @@ void i915_vma_free(struct i915_vma *vma)
 
 static void vma_print_allocator(struct i915_vma *vma, const char *reason)
 {
-   unsigned long *entries;
-   unsigned int nr_entries;
char buf[512];
 
if (!vma->node.stack) {
@@ -66,8 +64,7 @@ static void vma_print_allocator(struct i915_vma *vma, const 
char *reason)
return;
}
 
-   nr_entries = stack_depot_fetch(vma->node.stack, &entries);
-   stack_trace_snprint(buf, sizeof(buf), entries, nr_entries, 0);
+   stack_depot_snprint(vma->node.stack, buf, sizeof(buf), 0);
DRM_DEBUG_DRIVER("vma.node [%08llx + %08llx] %s: inserted at %s\n",
 vma->node.start, vma->node.size, reason, buf);
 }
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index eaf7688f517d..cc312f0a05eb 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -65,16 +65,6 @@ static noinline depot_stack_handle_t __save_depot_stack(void)
return stack_depot_save(entries, n, GFP_NOWAIT | __GFP_NOWARN);
 }
 
-static void __print_depot_stack(depot_stack_handle_t stack,
-   char *buf, int sz, int indent)
-{
-   unsigned long *entries;
-   unsigned int nr_entries;
-
-   nr_entries = stack_depot_fetch(stack, &entries);
-   stack_trace_snprint(buf, sz, entries, nr_entries, indent);
-}
-
 static void init_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm)
 {
spin_lock_init(&rpm->debug.lock);
@@ -146,12 +136,12 @@ static void untrack_intel_runtime_pm_wakeref(struct 
intel_runtime_pm *rpm,
if (!buf)
return;
 
-   __print_depot_stack(stack, buf, PAGE_SIZE, 2);
+   stack_depot_snprint(stack, buf, PAGE_SIZE, 2);
DRM_DEBUG_DRIVER("wakeref %x from\n%s", stack, buf);
 
stack = READ_ONCE(rpm->debug.last_release);
if (stack) {
-   __print_depot_stack(stack, buf, PAGE_SIZE, 2);
+  

[PATCH 0/1] lib, stackdepot: Add helper to print stack entries into buffer.

2021-09-10 Thread Imran Khan
This change is in response to discussion at [1].
The patch has been created on top of my earlier changes [2] and [3].
If needed I can resend all of these patches together, though my
earlier patches have been Acked.

[1] https://lore.kernel.org/lkml/e6f6fb85-1d83-425b-9e36-b5784cc9e...@suse.cz/
[2] https://lore.kernel.org/lkml/fe94ffd8-d235-87d8-9c3d-80f7f73e0...@suse.cz/
[3] https://lore.kernel.org/lkml/85f4f073-0b5a-9052-0ba9-74d450608...@suse.cz/

Imran Khan (1):
  lib, stackdepot: Add helper to print stack entries into buffer.

 drivers/gpu/drm/drm_dp_mst_topology.c   |  5 +
 drivers/gpu/drm/drm_mm.c|  5 +
 drivers/gpu/drm/i915/i915_vma.c |  5 +
 drivers/gpu/drm/i915/intel_runtime_pm.c | 20 +---
 include/linux/stackdepot.h  |  3 +++
 lib/stackdepot.c| 23 +++
 mm/page_owner.c |  5 +
 7 files changed, 35 insertions(+), 31 deletions(-)

-- 
2.30.2
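The consolidation this series performs — collapsing every caller's fetch-then-snprint pair into one helper — can be mocked in userspace roughly as follows. `mock_fetch`/`mock_snprint` stand in for stack_depot_fetch()/stack_trace_snprint() and are not the kernel implementations; only the shape of the new helper mirrors the patch:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Standalone mock of the stackdepot flow; real code is in lib/stackdepot.c. */
typedef unsigned int depot_stack_handle_t;

static unsigned long fake_entries[3] = { 0x1000, 0x2000, 0x3000 };

/* Stand-in for stack_depot_fetch(): returns the entry list and its length. */
static unsigned int mock_fetch(depot_stack_handle_t h, unsigned long **e)
{
    (void)h;
    *e = fake_entries;
    return 3;
}

/* Stand-in for stack_trace_snprint(): one indented hex address per line. */
static int mock_snprint(char *buf, size_t size, unsigned long *e,
                        unsigned int n, int spaces)
{
    int total = 0;
    unsigned int i;

    for (i = 0; i < n; i++)
        total += snprintf(buf + total, size - total, "%*c%#lx\n",
                          spaces + 1, ' ', e[i]);
    return total;
}

/*
 * The helper collapses the fetch-then-print pair every caller used to
 * open-code, mirroring the stack_depot_snprint() this patch adds.
 */
int depot_snprint(depot_stack_handle_t h, char *buf, size_t size, int spaces)
{
    unsigned long *entries;
    unsigned int n = mock_fetch(h, &entries);

    return n ? mock_snprint(buf, size, entries, n, spaces) : 0;
}
```

Each converted call site in the diff (drm_mm, i915_vma, page_owner, ...) follows this one-call shape instead of declaring its own `entries`/`nr_entries` locals.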



Re: [Intel-gfx] [PATCH 0/6] i915: Simplify mmio handling & add new DG2 shadow table

2021-09-10 Thread Tvrtko Ursulin



On 10/09/2021 15:24, Matt Roper wrote:

On Fri, Sep 10, 2021 at 02:03:44PM +0100, Tvrtko Ursulin wrote:


On 10/09/2021 06:33, Matt Roper wrote:

Our uncore MMIO functions for reading/writing registers have become very
complicated over time.  There's significant macro magic used to generate
several nearly-identical functions that only really differ in terms of
which platform-specific shadow register table they should check on write
operations.  We can significantly simplify our MMIO handlers by storing
a reference to the current platform's shadow table within the 'struct
intel_uncore' the same way we already do for forcewake; this allows us
to consolidate the multiple variants of each 'write' function down to
just a single 'fwtable' version that gets the shadow table out of the
uncore struct rather than hardcoding the name of a specific platform's
table.  We can do similar consolidation on the MMIO read side by
creating a single-entry forcewake table to replace the open-coded range
check they had been using previously.

The final patch of the series adds a new shadow table for DG2; this
becomes quite clean and simple now, given the refactoring in the first
five patches.


Tidy and it ends up saving kernel binary size.

However I am still undecided, because one thing to note is that the trade-off
is source code and kernel text consolidation at the expense of more indirect
calls at runtime and larger common read/write functions.

To expand, current code generates a bunch of per gen functions but in doing
so it manages to inline a bunch of checks like NEEDS_FORCE_WAKE and BSEARCH
(from find_fw_domain) so at runtime each platform mmio read/write does not
have to do indirect calls to do lookups.

It may not matter a lot in the grand scheme of things, but this trade-off is
something to note in the cover letter I think.


That's true.  However it seems like if the extra indirect calls are good
enough for our forcewake lookups (which are called more frequently and
have to search through much larger tables) then using the same strategy
for shadow registers should be less of a concern.  Plus most of the
timing-critical parts of the code don't call through this at all; they
just grab an explicit forcewake and then issue a bunch of *_fw()
operations that skip all the per-register forcewake and shadow handling.


With lookups you mean intel_uncore_forcewake_for_reg? Yeah I don't have 
a good idea of how many of those followed by "_fw" accessors we have vs 
"un-optimized" access. But it's a good point.


I was mostly coming from the point of view of old platforms like gen6, 
where with this series reads go from inlined checks (NEEDS_FORCE_WAKE) 
to always calling find_fw_domain. Just because it is a bit unfortunate 
to burden old CPUs (they are not getting any faster) with executing more 
code. It's not nice when old hardware gets slower and slower with 
software updates. :) But whether this case would be measurable at 
all... probably not. Unless some compounding effect, like "death by a 
thousand cuts", came into play.


Regards,

Tvrtko


But you're right that this is something I should mention more clearly in
the cover letter.


Matt



Regards,

Tvrtko


Matt Roper (6):
drm/i915/uncore: Convert gen6/gen7 read operations to fwtable
drm/i915/uncore: Associate shadow table with uncore
drm/i915/uncore: Replace gen8 write functions with general fwtable
drm/i915/uncore: Drop gen11/gen12 mmio write handlers
drm/i915/uncore: Drop gen11 mmio read handlers
drm/i915/dg2: Add DG2-specific shadow register table

   drivers/gpu/drm/i915/intel_uncore.c   | 190 ++
   drivers/gpu/drm/i915/intel_uncore.h   |   7 +
   drivers/gpu/drm/i915/selftests/intel_uncore.c |   1 +
   3 files changed, 110 insertions(+), 88 deletions(-)





Re: [PATCH v2] kernel/locking: Add context to ww_mutex_trylock.

2021-09-10 Thread Peter Zijlstra
On Thu, Sep 09, 2021 at 11:32:18AM +0200, Maarten Lankhorst wrote:
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index d456579d0952..791c28005eef 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -736,6 +736,44 @@ __ww_mutex_lock(struct mutex *lock, unsigned int state, 
> unsigned int subclass,
>   return __mutex_lock_common(lock, state, subclass, NULL, ip, ww_ctx, 
> true);
>  }
>  
> +/**
> + * ww_mutex_trylock - tries to acquire the w/w mutex with optional acquire context
> + * @lock: mutex to lock
> + * @ctx: optional w/w acquire context
> + *
> + * Trylocks a mutex with the optional acquire context; no deadlock detection is
> + * possible. Returns 1 if the mutex has been acquired successfully, 0 otherwise.
> + *
> + * Unlike ww_mutex_lock, no deadlock handling is performed. However, if a @ctx is
> + * specified, -EALREADY and -EDEADLK handling may happen in calls to ww_mutex_lock.
> + *
> + * A mutex acquired with this function must be released with ww_mutex_unlock.
> + */
> +int __sched
> +ww_mutex_trylock(struct ww_mutex *ww, struct ww_acquire_ctx *ctx)
> +{
> + bool locked;
> +
> + if (!ctx)
> + return mutex_trylock(&ww->base);
> +
> +#ifdef CONFIG_DEBUG_MUTEXES
> + DEBUG_LOCKS_WARN_ON(ww->base.magic != &ww->base);
> +#endif
> +
> + preempt_disable();
> + locked = __mutex_trylock(&ww->base);
> +
> + if (locked) {
> + ww_mutex_set_context_fastpath(ww, ctx);
> + mutex_acquire_nest(&ww->base.dep_map, 0, 1, &ctx->dep_map, 
> _RET_IP_);
> + }
> + preempt_enable();
> +
> + return locked;
> +}
> +EXPORT_SYMBOL(ww_mutex_trylock);
> +
>  #ifdef CONFIG_DEBUG_LOCK_ALLOC
>  void __sched
>  mutex_lock_nested(struct mutex *lock, unsigned int subclass)

> diff --git a/kernel/locking/ww_rt_mutex.c b/kernel/locking/ww_rt_mutex.c
> index 3f1fff7d2780..c4cb863edb4c 100644
> --- a/kernel/locking/ww_rt_mutex.c
> +++ b/kernel/locking/ww_rt_mutex.c
> @@ -50,6 +50,18 @@ __ww_rt_mutex_lock(struct ww_mutex *lock, struct 
> ww_acquire_ctx *ww_ctx,
>   return ret;
>  }
>  
> +int __sched
> +ww_mutex_trylock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
> +{
> + int locked = rt_mutex_trylock(&lock->base);
> +
> + if (locked && ctx)
> + ww_mutex_set_context_fastpath(lock, ctx);
> +
> + return locked;
> +}
> +EXPORT_SYMBOL(ww_mutex_trylock);
> +
>  int __sched
>  ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
>  {

That doesn't look right, how's this for you?

---
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -94,6 +94,9 @@ static inline unsigned long __owner_flag
return owner & MUTEX_FLAGS;
 }
 
+/*
+ * Returns: __mutex_owner(lock) on failure or NULL on success.
+ */
  static inline struct task_struct *__mutex_trylock_common(struct mutex *lock, bool handoff)
 {
unsigned long owner, curr = (unsigned long)current;
@@ -736,6 +739,47 @@ __ww_mutex_lock(struct mutex *lock, unsi
return __mutex_lock_common(lock, state, subclass, NULL, ip, ww_ctx, 
true);
 }
 
+/**
+ * ww_mutex_trylock - tries to acquire the w/w mutex with optional acquire context
+ * @ww: mutex to lock
+ * @ww_ctx: optional w/w acquire context
+ *
+ * Trylocks a mutex with the optional acquire context; no deadlock detection is
+ * possible. Returns 1 if the mutex has been acquired successfully, 0 otherwise.
+ *
+ * Unlike ww_mutex_lock, no deadlock handling is performed. However, if a @ctx is
+ * specified, -EALREADY handling may happen in calls to ww_mutex_trylock.
+ *
+ * A mutex acquired with this function must be released with ww_mutex_unlock.
+ */
+int ww_mutex_trylock(struct ww_mutex *ww, struct ww_acquire_ctx *ww_ctx)
+{
+   if (!ww_ctx)
+   return mutex_trylock(&ww->base);
+
+   MUTEX_WARN_ON(ww->base.magic != &ww->base);
+
+   if (unlikely(ww_ctx == READ_ONCE(ww->ctx)))
+   return -EALREADY;
+
+   /*
+* Reset the wounded flag after a kill. No other process can
+* race and wound us here, since they can't have a valid owner
+* pointer if we don't have any locks held.
+*/
+   if (ww_ctx->acquired == 0)
+   ww_ctx->wounded = 0;
+
+   if (__mutex_trylock(&ww->base)) {
+   ww_mutex_set_context_fastpath(ww, ww_ctx);
+   mutex_acquire_nest(&ww->base.dep_map, 0, 1, &ww_ctx->dep_map, _RET_IP_);
+   return 1;
+   }
+
+   return 0;
+}
+EXPORT_SYMBOL(ww_mutex_trylock);
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 void __sched
 mutex_lock_nested(struct mutex *lock, unsigned int subclass)
--- a/kernel/locking/ww_rt_mutex.c
+++ b/kernel/locking/ww_rt_mutex.c
@@ -9,6 +9,34 @@
 #define WW_RT
 #include "rtmutex.c"
 
+int ww_mutex_trylock(struct ww_mutex *lock, struct ww_acquire_ctx *ww_ctx)
+{
+   struct rt_mutex *rtm = &lock->base;
+
+   if (!ww_ctx)
+   return rt_mutex_trylock(rtm);
+
+   if (unlikely(ww_

Re: [Intel-gfx] [PATCH 0/6] i915: Simplify mmio handling & add new DG2 shadow table

2021-09-10 Thread Matt Roper
On Fri, Sep 10, 2021 at 04:03:50PM +0100, Tvrtko Ursulin wrote:
> 
> On 10/09/2021 15:24, Matt Roper wrote:
> > On Fri, Sep 10, 2021 at 02:03:44PM +0100, Tvrtko Ursulin wrote:
> > > 
> > > On 10/09/2021 06:33, Matt Roper wrote:
> > > > Our uncore MMIO functions for reading/writing registers have become very
> > > > complicated over time.  There's significant macro magic used to generate
> > > > several nearly-identical functions that only really differ in terms of
> > > > which platform-specific shadow register table they should check on write
> > > > operations.  We can significantly simplify our MMIO handlers by storing
> > > > a reference to the current platform's shadow table within the 'struct
> > > > intel_uncore' the same way we already do for forcewake; this allows us
> > > > to consolidate the multiple variants of each 'write' function down to
> > > > just a single 'fwtable' version that gets the shadow table out of the
> > > > uncore struct rather than hardcoding the name of a specific platform's
> > > > table.  We can do similar consolidation on the MMIO read side by
> > > > creating a single-entry forcewake table to replace the open-coded range
> > > > check they had been using previously.
> > > > 
> > > > The final patch of the series adds a new shadow table for DG2; this
> > > > becomes quite clean and simple now, given the refactoring in the first
> > > > five patches.
> > > 
> > > Tidy and it ends up saving kernel binary size.
> > > 
> > > However I am undecided yet, because one thing to note is that the trade 
> > > off
> > > is source code and kernel text consolidation at the expense of more 
> > > indirect
> > > calls at runtime and larger common read/write functions.
> > > 
> > > To expand, current code generates a bunch of per gen functions but in 
> > > doing
> > > so it manages to inline a bunch of checks like NEEDS_FORCE_WAKE and 
> > > BSEARCH
> > > (from find_fw_domain) so at runtime each platform mmio read/write does not
> > > have to do indirect calls to do lookups.
> > > 
> > > It may not matter a lot in the grand scheme of things, but this trade off is
> > > something to note in the cover letter I think.
> > 
> > That's true.  However it seems like if the extra indirect calls are good
> > enough for our forcewake lookups (which are called more frequently and
> > have to search through much larger tables) then using the same strategy
> > for shadow registers should be less of a concern.  Plus most of
> > timing-critical parts of the code don't call through this at all; they
> > just grab an explicit forcewake and then issue a bunch of *_fw()
> > operations that skip all the per-register forcewake and shadow handling.
> 
> With lookups you mean intel_uncore_forcewake_for_reg? Yeah I don't have a
> good idea of how many of those followed by "_fw" accessors we have vs
> "un-optimized" access. But it's a good point.
> 
> I was mostly coming from the point of view of old platforms like gen6, where
> with this series reads go from inlined checks (NEEDS_FORCE_WAKE) to always
> calling find_fw_domain. Just because it is a bit unfortunate to burden old
> CPUs (they are not getting any faster) with executing more code. It's not
> nice when old hardware gets slower and slower with software updates. :) But
> whether or not this case would at all be measurable.. probably not. Unless
> some compounding effects, like "death by thousand cuts", would come into
> play.

Chris pointed out in an offline mail that NEEDS_FORCE_WAKE does cut
out a lot of display MMIO lookups.  So I think it might be worth adding
that back, but also adding an "|| GEN11_BSD_RING_BASE" so that it will
still be accurate for the newer platforms too.

But I think another thing to consider here would be that we might want
to switch our intel_de_{read,write} wrappers to call raw mmio directly,
to completely bypass forcewake and shadow logic.


Matt

> 
> Regards,
> 
> Tvrtko
> 
> > But you're right that this is something I should mention more clearly in
> > the cover letter.
> > 
> > 
> > Matt
> > 
> > > 
> > > Regards,
> > > 
> > > Tvrtko
> > > 
> > > > Matt Roper (6):
> > > > drm/i915/uncore: Convert gen6/gen7 read operations to fwtable
> > > > drm/i915/uncore: Associate shadow table with uncore
> > > > drm/i915/uncore: Replace gen8 write functions with general fwtable
> > > > drm/i915/uncore: Drop gen11/gen12 mmio write handlers
> > > > drm/i915/uncore: Drop gen11 mmio read handlers
> > > > drm/i915/dg2: Add DG2-specific shadow register table
> > > > 
> > > >drivers/gpu/drm/i915/intel_uncore.c   | 190 
> > > > ++
> > > >drivers/gpu/drm/i915/intel_uncore.h   |   7 +
> > > >drivers/gpu/drm/i915/selftests/intel_uncore.c |   1 +
> > > >3 files changed, 110 insertions(+), 88 deletions(-)
> > > > 
> > 

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795


Re: [PATCH v2 1/3] video: fbdev: asiliantfb: Error out if 'pixclock' equals zero

2021-09-10 Thread Geert Uytterhoeven
Hi Zheyu,

On Mon, Jul 26, 2021 at 12:04 PM Zheyu Ma  wrote:
> The userspace program could pass any values to the driver through
> ioctl() interface. If the driver doesn't check the value of 'pixclock',
> it may cause divide error.
>
> Fix this by checking whether 'pixclock' is zero first.
>
> The following log reveals it:
>
> [   43.861711] divide error:  [#1] PREEMPT SMP KASAN PTI
> [   43.861737] CPU: 2 PID: 11764 Comm: i740 Not tainted 
> 5.14.0-rc2-00513-gac532c9bbcfb-dirty #224
> [   43.861756] RIP: 0010:asiliantfb_check_var+0x4e/0x730
> [   43.861843] Call Trace:
> [   43.861848]  ? asiliantfb_remove+0x190/0x190
> [   43.861858]  fb_set_var+0x2e4/0xeb0
> [   43.861866]  ? fb_blank+0x1a0/0x1a0
> [   43.861873]  ? lock_acquire+0x1ef/0x530
> [   43.861884]  ? lock_release+0x810/0x810
> [   43.861892]  ? lock_is_held_type+0x100/0x140
> [   43.861903]  ? ___might_sleep+0x1ee/0x2d0
> [   43.861914]  ? __mutex_lock+0x620/0x1190
> [   43.861921]  ? do_fb_ioctl+0x313/0x700
> [   43.861929]  ? mutex_lock_io_nested+0xfa0/0xfa0
> [   43.861936]  ? __this_cpu_preempt_check+0x1d/0x30
> [   43.861944]  ? _raw_spin_unlock_irqrestore+0x46/0x60
> [   43.861952]  ? lockdep_hardirqs_on+0x59/0x100
> [   43.861959]  ? _raw_spin_unlock_irqrestore+0x46/0x60
> [   43.861967]  ? trace_hardirqs_on+0x6a/0x1c0
> [   43.861978]  do_fb_ioctl+0x31e/0x700
>
> Signed-off-by: Zheyu Ma 

Thanks for your patch!

> ---
> Changes in v2:
> - Make commit log more descriptive
> ---
>  drivers/video/fbdev/asiliantfb.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/video/fbdev/asiliantfb.c 
> b/drivers/video/fbdev/asiliantfb.c
> index 3e006da47752..84c56f525889 100644
> --- a/drivers/video/fbdev/asiliantfb.c
> +++ b/drivers/video/fbdev/asiliantfb.c
> @@ -227,6 +227,9 @@ static int asiliantfb_check_var(struct fb_var_screeninfo 
> *var,
>  {
> unsigned long Ftarget, ratio, remainder;
>
> +   if (!var->pixclock)
> +   return -EINVAL;

While this fixes the crash, it is not correct: according to the
fbdev API, invalid values must be rounded up to a supported value,
if possible.  -EINVAL should only be returned if rounding up values
in fb_var_screeninfo cannot give a valid mode.

The same comment applies to the other patches in this series:
[PATCH v2 2/3] video: fbdev: kyro: Error out if 'pixclock' equals zero
[PATCH v2 3/3] video: fbdev: riva: Error out if 'pixclock' equals zero

> +
> ratio = 100 / var->pixclock;
> remainder = 100 % var->pixclock;
> Ftarget = 100 * ratio + (100 * remainder) / var->pixclock;

Gr{oetje,eeting}s,

Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [RFC PATCH] drm/ttm: Add a private member to the struct ttm_resource

2021-09-10 Thread Thomas Hellström
On Fri, 2021-09-10 at 16:40 +0200, Christian König wrote:
> 
> 
> Am 10.09.21 um 15:15 schrieb Thomas Hellström:
> > Both the provider (resource manager) and the consumer (the TTM
> > driver)
> > want to subclass struct ttm_resource. Since this is left for the
> > resource
> > manager, we need to provide a private pointer for the TTM driver.
> > 
> > Provide a struct ttm_resource_private for the driver to subclass
> > for
> > data with the same lifetime as the struct ttm_resource: In the i915
> > case
> > it will, for example, be an sg-table and radix tree into the LMEM
> > /VRAM pages that currently are awkwardly attached to the GEM
> > object.
> > 
> > Provide an ops structure for associated ops (Which is only
> > destroy() ATM)
> > It might seem pointless to provide a separate ops structure, but
> > Linus
> > has previously made it clear that that's the norm.
> > 
> > After careful audit one could perhaps also on a per-driver basis
> > replace the delete_mem_notify() TTM driver callback with the above
> > destroy function.
> 
> Well this is a really big NAK to this approach.
> 
> If you need to attach some additional information to the resource
> then 
> implement your own resource manager like everybody else does.

Well, this was the long discussion we had back when the resource
managers started to derive from struct ttm_resource, and I was under the
impression that we had come to an agreement about the different
use-cases here; this was my main concern.

I mean, it's a pretty big layer violation to do that for this use-case.
The TTM resource manager doesn't want to know about this data at all,
it's private to the ttm resource user layer and the resource manager
works perfectly well without it. (I assume the other drivers that
implement their own resource managers need the data that the
subclassing provides?)

The fundamental problem here is that there are two layers wanting to
subclass struct ttm_resource. That means one layer gets to do that, and
the second gets to use a private pointer (which in turn can provide yet
another private pointer to a potential third layer). With your
suggestion, the second layer is instead forced to subclass each
subclassed instance the first layer provides.

Ofc we can do that, but it does indeed feel pretty awkward.

In any case, if you still think that's the approach we should go for,
I'd need to add init() and fini() members to the ttm_range_manager_func
struct to allow subclassing without having to unnecessarily copy the
full code? 

Thanks,
Thomas










> 
> Regards,
> Christian.
> 
> > 
> > Cc: Matthew Auld 
> > Cc: König Christian 
> > Signed-off-by: Thomas Hellström 
> > ---
> >   drivers/gpu/drm/ttm/ttm_resource.c | 10 +++---
> >   include/drm/ttm/ttm_resource.h | 28
> > 
> >   2 files changed, 35 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/ttm/ttm_resource.c
> > b/drivers/gpu/drm/ttm/ttm_resource.c
> > index 2431717376e7..973e7c50bfed 100644
> > --- a/drivers/gpu/drm/ttm/ttm_resource.c
> > +++ b/drivers/gpu/drm/ttm/ttm_resource.c
> > @@ -57,13 +57,17 @@ int ttm_resource_alloc(struct ttm_buffer_object
> > *bo,
> >   void ttm_resource_free(struct ttm_buffer_object *bo, struct
> > ttm_resource **res)
> >   {
> > struct ttm_resource_manager *man;
> > +   struct ttm_resource *resource = *res;
> >   
> > -   if (!*res)
> > +   if (!resource)
> > return;
> >   
> > -   man = ttm_manager_type(bo->bdev, (*res)->mem_type);
> > -   man->func->free(man, *res);
> > *res = NULL;
> > +   if (resource->priv)
> > +   resource->priv->ops.destroy(resource->priv);
> > +
> > +   man = ttm_manager_type(bo->bdev, resource->mem_type);
> > +   man->func->free(man, resource);
> >   }
> >   EXPORT_SYMBOL(ttm_resource_free);
> >   
> > diff --git a/include/drm/ttm/ttm_resource.h
> > b/include/drm/ttm/ttm_resource.h
> > index 140b6b9a8bbe..5a22c9a29c05 100644
> > --- a/include/drm/ttm/ttm_resource.h
> > +++ b/include/drm/ttm/ttm_resource.h
> > @@ -44,6 +44,7 @@ struct dma_buf_map;
> >   struct io_mapping;
> >   struct sg_table;
> >   struct scatterlist;
> > +struct ttm_resource_private;
> >   
> >   struct ttm_resource_manager_func {
> > /**
> > @@ -153,6 +154,32 @@ struct ttm_bus_placement {
> > enum ttm_cachingcaching;
> >   };
> >   
> > +/**
> > + * struct ttm_resource_private_ops - Operations for a struct
> > + * ttm_resource_private
> > + *
> > + * Not much benefit to keep this as a separate struct with only a
> > single member,
> > + * but keeping a separate ops struct is the norm.
> > + */
> > +struct ttm_resource_private_ops {
> > +   /**
> > +    * destroy() - Callback to destroy the private data
> > +    * @priv - The private data to destroy
> > +    */
> > +   void (*destroy) (struct ttm_resource_private *priv);
> > +};
> > +
> > +/**
> > + * struct ttm_resource_private - TTM drive

[PATCH v9 00/17] drm/i915: Introduce Intel PXP

2021-09-10 Thread Daniele Ceraolo Spurio
PXP (Protected Xe Path) is an i915 component, available on
GEN12 and newer platforms, that helps to establish the hardware
protected session and manage the status of the alive software session,
as well as its life cycle.

changes from v8:
- comments/docs improvements
- remove rpm put race (pxp_inval vs context_close)
- don't call pxp_invalidate on rpm suspend because it's redundant

Tested with: https://patchwork.freedesktop.org/series/87570/

Cc: Gaurav Kumar 
Cc: Chris Wilson 
Cc: Rodrigo Vivi 
Cc: Joonas Lahtinen 
Cc: Juston Li 
Cc: Alan Previn 
Cc: Lionel Landwerlin 
Cc: Jason Ekstrand 
Cc: Daniel Vetter 

Anshuman Gupta (2):
  drm/i915/pxp: Add plane decryption support
  drm/i915/pxp: black pixels on pxp disabled

Daniele Ceraolo Spurio (9):
  drm/i915/pxp: Define PXP component interface
  drm/i915/pxp: define PXP device flag and kconfig
  drm/i915/pxp: allocate a vcs context for pxp usage
  drm/i915/pxp: set KCR reg init
  drm/i915/pxp: interfaces for using protected objects
  drm/i915/pxp: start the arb session on demand
  drm/i915/pxp: add pxp debugfs
  drm/i915/pxp: add PXP documentation
  drm/i915/pxp: enable PXP for integrated Gen12

Huang, Sean Z (5):
  drm/i915/pxp: Implement funcs to create the TEE channel
  drm/i915/pxp: Create the arbitrary session after boot
  drm/i915/pxp: Implement arb session teardown
  drm/i915/pxp: Implement PXP irq handler
  drm/i915/pxp: Enable PXP power management

Vitaly Lubart (1):
  mei: pxp: export pavp client to me client bus

 Documentation/gpu/i915.rst|   8 +
 drivers/gpu/drm/i915/Kconfig  |  11 +
 drivers/gpu/drm/i915/Makefile |  10 +
 drivers/gpu/drm/i915/display/intel_display.c  |  34 +++
 .../drm/i915/display/intel_display_types.h|   6 +
 .../drm/i915/display/skl_universal_plane.c|  49 ++-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 100 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |   6 +
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  28 ++
 drivers/gpu/drm/i915/gem/i915_gem_create.c|  72 +++--
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  18 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.c|   1 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h|   6 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   8 +
 .../gpu/drm/i915/gem/selftests/mock_context.c |   4 +-
 drivers/gpu/drm/i915/gt/debugfs_gt.c  |   2 +
 drivers/gpu/drm/i915/gt/intel_engine.h|   2 +
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |  22 +-
 drivers/gpu/drm/i915/gt/intel_gt.c|   5 +
 drivers/gpu/drm/i915/gt/intel_gt_irq.c|   7 +
 drivers/gpu/drm/i915/gt/intel_gt_pm.c |  15 +-
 drivers/gpu/drm/i915/gt/intel_gt_types.h  |   3 +
 drivers/gpu/drm/i915/i915_drv.c   |   2 +
 drivers/gpu/drm/i915/i915_drv.h   |   3 +
 drivers/gpu/drm/i915/i915_pci.c   |   2 +
 drivers/gpu/drm/i915/i915_reg.h   |  48 +++
 drivers/gpu/drm/i915/intel_device_info.h  |   1 +
 drivers/gpu/drm/i915/pxp/intel_pxp.c  | 288 ++
 drivers/gpu/drm/i915/pxp/intel_pxp.h  |  67 
 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c  | 141 +
 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.h  |  15 +
 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c  |  78 +
 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h  |  21 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_irq.c  | 100 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_irq.h  |  32 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_pm.c   |  46 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_pm.h   |  23 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_session.c  | 175 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_session.h  |  15 +
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c  | 172 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.h  |  17 ++
 .../drm/i915/pxp/intel_pxp_tee_interface.h|  37 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_types.h|  83 +
 drivers/misc/mei/Kconfig  |   2 +
 drivers/misc/mei/Makefile |   1 +
 drivers/misc/mei/pxp/Kconfig  |  13 +
 drivers/misc/mei/pxp/Makefile |   7 +
 drivers/misc/mei/pxp/mei_pxp.c| 229 ++
 drivers/misc/mei/pxp/mei_pxp.h|  18 ++
 include/drm/i915_component.h  |   1 +
 include/drm/i915_pxp_tee_interface.h  |  42 +++
 include/uapi/drm/i915_drm.h   |  99 +-
 52 files changed, 2153 insertions(+), 42 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp.h
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.h
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_debugfs.h
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
 create mode 100644 drivers/gpu/drm/i915/pxp/int

[PATCH v9 01/17] drm/i915/pxp: Define PXP component interface

2021-09-10 Thread Daniele Ceraolo Spurio
This will be used for communication between the i915 driver and the mei
one. Defining it in a stand-alone patch to avoid circular dependencies
between the patches modifying the 2 drivers.

Split out from an original patch from Huang, Sean Z

v2: rename the component struct (Rodrigo)

Signed-off-by: Daniele Ceraolo Spurio 
Cc: Rodrigo Vivi 
Reviewed-by: Rodrigo Vivi 
---
 include/drm/i915_component.h |  1 +
 include/drm/i915_pxp_tee_interface.h | 42 
 2 files changed, 43 insertions(+)
 create mode 100644 include/drm/i915_pxp_tee_interface.h

diff --git a/include/drm/i915_component.h b/include/drm/i915_component.h
index 55c3b123581b..c1e2a43d2d1e 100644
--- a/include/drm/i915_component.h
+++ b/include/drm/i915_component.h
@@ -29,6 +29,7 @@
 enum i915_component_type {
I915_COMPONENT_AUDIO = 1,
I915_COMPONENT_HDCP,
+   I915_COMPONENT_PXP
 };
 
 /* MAX_PORT is the number of port
diff --git a/include/drm/i915_pxp_tee_interface.h 
b/include/drm/i915_pxp_tee_interface.h
new file mode 100644
index ..af593ec64469
--- /dev/null
+++ b/include/drm/i915_pxp_tee_interface.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#ifndef _I915_PXP_TEE_INTERFACE_H_
+#define _I915_PXP_TEE_INTERFACE_H_
+
+#include 
+#include 
+
+/**
+ * struct i915_pxp_component_ops - ops for PXP services.
+ * @owner: Module providing the ops
+ * @send: sends data to PXP
+ * @receive: receives data from PXP
+ */
+struct i915_pxp_component_ops {
+   /**
+* @owner: owner of the module providing the ops
+*/
+   struct module *owner;
+
+   int (*send)(struct device *dev, const void *message, size_t size);
+   int (*recv)(struct device *dev, void *buffer, size_t size);
+};
+
+/**
+ * struct i915_pxp_component - Used for communication between i915 and TEE
+ * drivers for the PXP services
+ * @tee_dev: device that provides the PXP service from TEE Bus.
+ * @pxp_ops: Ops implemented by TEE driver, used by i915 driver.
+ */
+struct i915_pxp_component {
+   struct device *tee_dev;
+   const struct i915_pxp_component_ops *ops;
+
+   /* To protect the above members. */
+   struct mutex mutex;
+};
+
+#endif /* _I915_TEE_PXP_INTERFACE_H_ */
-- 
2.25.1



[PATCH v9 02/17] mei: pxp: export pavp client to me client bus

2021-09-10 Thread Daniele Ceraolo Spurio
From: Vitaly Lubart 

Export PAVP client to work with i915 driver,
for binding it uses kernel component framework.

v2:drop debug prints, refactor match code to match mei_hdcp (Tomas)

Signed-off-by: Vitaly Lubart 
Signed-off-by: Tomas Winkler 
Signed-off-by: Daniele Ceraolo Spurio 
Reviewed-by: Rodrigo Vivi 
---
 drivers/misc/mei/Kconfig   |   2 +
 drivers/misc/mei/Makefile  |   1 +
 drivers/misc/mei/pxp/Kconfig   |  13 ++
 drivers/misc/mei/pxp/Makefile  |   7 +
 drivers/misc/mei/pxp/mei_pxp.c | 229 +
 drivers/misc/mei/pxp/mei_pxp.h |  18 +++
 6 files changed, 270 insertions(+)
 create mode 100644 drivers/misc/mei/pxp/Kconfig
 create mode 100644 drivers/misc/mei/pxp/Makefile
 create mode 100644 drivers/misc/mei/pxp/mei_pxp.c
 create mode 100644 drivers/misc/mei/pxp/mei_pxp.h

diff --git a/drivers/misc/mei/Kconfig b/drivers/misc/mei/Kconfig
index f5fd5b786607..0e0bcd0da852 100644
--- a/drivers/misc/mei/Kconfig
+++ b/drivers/misc/mei/Kconfig
@@ -47,3 +47,5 @@ config INTEL_MEI_TXE
  Intel Bay Trail
 
 source "drivers/misc/mei/hdcp/Kconfig"
+source "drivers/misc/mei/pxp/Kconfig"
+
diff --git a/drivers/misc/mei/Makefile b/drivers/misc/mei/Makefile
index f1c76f7ee804..d8e5165917f2 100644
--- a/drivers/misc/mei/Makefile
+++ b/drivers/misc/mei/Makefile
@@ -26,3 +26,4 @@ mei-$(CONFIG_EVENT_TRACING) += mei-trace.o
 CFLAGS_mei-trace.o = -I$(src)
 
 obj-$(CONFIG_INTEL_MEI_HDCP) += hdcp/
+obj-$(CONFIG_INTEL_MEI_PXP) += pxp/
diff --git a/drivers/misc/mei/pxp/Kconfig b/drivers/misc/mei/pxp/Kconfig
new file mode 100644
index ..4029b96afc04
--- /dev/null
+++ b/drivers/misc/mei/pxp/Kconfig
@@ -0,0 +1,13 @@
+
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2020, Intel Corporation. All rights reserved.
+#
+config INTEL_MEI_PXP
+   tristate "Intel PXP services of ME Interface"
+   select INTEL_MEI_ME
+   depends on DRM_I915
+   help
+ MEI Support for PXP Services on Intel platforms.
+
+ Enables the ME FW services required for PXP support through
+ I915 display driver of Intel.
diff --git a/drivers/misc/mei/pxp/Makefile b/drivers/misc/mei/pxp/Makefile
new file mode 100644
index ..0329950d5794
--- /dev/null
+++ b/drivers/misc/mei/pxp/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (c) 2020, Intel Corporation. All rights reserved.
+#
+# Makefile - PXP client driver for Intel MEI Bus Driver.
+
+obj-$(CONFIG_INTEL_MEI_PXP) += mei_pxp.o
diff --git a/drivers/misc/mei/pxp/mei_pxp.c b/drivers/misc/mei/pxp/mei_pxp.c
new file mode 100644
index ..f7380d387bab
--- /dev/null
+++ b/drivers/misc/mei/pxp/mei_pxp.c
@@ -0,0 +1,229 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright © 2020 - 2021 Intel Corporation
+ */
+
+/**
+ * DOC: MEI_PXP Client Driver
+ *
+ * The mei_pxp driver acts as a translation layer between PXP
+ * protocol  implementer (I915) and ME FW by translating PXP
+ * negotiation messages to ME FW command payloads and vice versa.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "mei_pxp.h"
+
+/**
+ * mei_pxp_send_message() - Sends a PXP message to ME FW.
+ * @dev: device corresponding to the mei_cl_device
+ * @message: a message buffer to send
+ * @size: size of the message
+ * Return: 0 on Success, <0 on Failure
+ */
+static int
+mei_pxp_send_message(struct device *dev, const void *message, size_t size)
+{
+   struct mei_cl_device *cldev;
+   ssize_t byte;
+
+   if (!dev || !message)
+   return -EINVAL;
+
+   cldev = to_mei_cl_device(dev);
+
+   /* temporary drop const qualifier till the API is fixed */
+   byte = mei_cldev_send(cldev, (u8 *)message, size);
+   if (byte < 0) {
+   dev_dbg(dev, "mei_cldev_send failed. %zd\n", byte);
+   return byte;
+   }
+
+   return 0;
+}
+
+/**
+ * mei_pxp_receive_message() - Receives a PXP message from ME FW.
+ * @dev: device corresponding to the mei_cl_device
+ * @buffer: a message buffer to contain the received message
+ * @size: size of the buffer
+ * Return: bytes received on Success, <0 on Failure
+ */
+static int
+mei_pxp_receive_message(struct device *dev, void *buffer, size_t size)
+{
+   struct mei_cl_device *cldev;
+   ssize_t byte;
+
+   if (!dev || !buffer)
+   return -EINVAL;
+
+   cldev = to_mei_cl_device(dev);
+
+   byte = mei_cldev_recv(cldev, buffer, size);
+   if (byte < 0) {
+   dev_dbg(dev, "mei_cldev_recv failed. %zd\n", byte);
+   return byte;
+   }
+
+   return byte;
+}
+
+static const struct i915_pxp_component_ops mei_pxp_ops = {
+   .owner = THIS_MODULE,
+   .send = mei_pxp_send_message,
+   .recv = mei_pxp_receive_message,
+};
+
+static int mei_component_master_bind(struct device *dev)
+{
+   struct mei_cl_device *cldev = to_mei_cl_device(dev);
+   struct i915_pxp_component *comp_master = mei_

[PATCH v9 03/17] drm/i915/pxp: define PXP device flag and kconfig

2021-09-10 Thread Daniele Ceraolo Spurio
Ahead of the PXP implementation, define the relevant define flag and
kconfig option.

v2: flip kconfig default to N. Some machines have IFWIs that do not
support PXP, so we need it to be an opt-in until we add support to query
the caps from the mei device.

Signed-off-by: Daniele Ceraolo Spurio 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/Kconfig | 11 +++
 drivers/gpu/drm/i915/i915_drv.h  |  3 +++
 drivers/gpu/drm/i915/intel_device_info.h |  1 +
 3 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index f960f5d7664e..5987c3d5d9fb 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -131,6 +131,17 @@ config DRM_I915_GVT_KVMGT
  Choose this option if you want to enable KVMGT support for
  Intel GVT-g.
 
+config DRM_I915_PXP
+   bool "Enable Intel PXP support for Intel Gen12+ platform"
+   depends on DRM_I915
+   depends on INTEL_MEI && INTEL_MEI_PXP
+   default n
+   help
+ PXP (Protected Xe Path) is an i915 component, available on GEN12+
+ GPUs, that helps to establish the hardware protected session and
+ manage the status of the alive software session, as well as its life
+ cycle.
+
 menu "drm/i915 Debugging"
 depends on DRM_I915
 depends on EXPERT
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 37c1ca266bcd..447a248f14aa 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1678,6 +1678,9 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 
 #define HAS_GLOBAL_MOCS_REGISTERS(dev_priv)
(INTEL_INFO(dev_priv)->has_global_mocs)
 
+#define HAS_PXP(dev_priv) ((IS_ENABLED(CONFIG_DRM_I915_PXP) && \
+  INTEL_INFO(dev_priv)->has_pxp) && \
+  VDBOX_MASK(&dev_priv->gt))
 
 #define HAS_GMCH(dev_priv) (INTEL_INFO(dev_priv)->display.has_gmch)
 
diff --git a/drivers/gpu/drm/i915/intel_device_info.h 
b/drivers/gpu/drm/i915/intel_device_info.h
index d328bb95c49b..8e6f48d1eb7b 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -133,6 +133,7 @@ enum intel_ppgtt_type {
func(has_logical_ring_elsq); \
func(has_mslices); \
func(has_pooled_eu); \
+   func(has_pxp); \
func(has_rc6); \
func(has_rc6p); \
func(has_rps); \
-- 
2.25.1



[PATCH v9 04/17] drm/i915/pxp: allocate a vcs context for pxp usage

2021-09-10 Thread Daniele Ceraolo Spurio
The context is required to send the session termination commands to the
VCS, which will be implemented in a follow-up patch. We can also use the
presence of the context as a check of pxp initialization completion.

v2: use perma-pinned context (Chris)
v3: rename pinned_context functions (Chris)
v4: split export of pinned_context functions to a separate patch (Rodrigo)

Signed-off-by: Daniele Ceraolo Spurio 
Cc: Chris Wilson 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/Makefile  |  4 ++
 drivers/gpu/drm/i915/gt/intel_engine.h |  2 +
 drivers/gpu/drm/i915/gt/intel_gt.c |  5 ++
 drivers/gpu/drm/i915/gt/intel_gt_types.h   |  3 ++
 drivers/gpu/drm/i915/pxp/intel_pxp.c   | 62 ++
 drivers/gpu/drm/i915/pxp/intel_pxp.h   | 35 
 drivers/gpu/drm/i915/pxp/intel_pxp_types.h | 15 ++
 7 files changed, 126 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp.h
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_types.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index c36c8a4f0716..23f5bc268962 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -281,6 +281,10 @@ i915-y += \
 
 i915-y += i915_perf.o
 
+# Protected execution platform (PXP) support
+i915-$(CONFIG_DRM_I915_PXP) += \
+   pxp/intel_pxp.o
+
 # Post-mortem debug and GPU hang state capture
 i915-$(CONFIG_DRM_I915_CAPTURE_ERROR) += i915_gpu_error.o
 i915-$(CONFIG_DRM_I915_SELFTEST) += \
diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h 
b/drivers/gpu/drm/i915/gt/intel_engine.h
index 87579affb952..eed4634c08cd 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -175,6 +175,8 @@ intel_write_status_page(struct intel_engine_cs *engine, int 
reg, u32 value)
 #define I915_GEM_HWS_SEQNO 0x40
 #define I915_GEM_HWS_SEQNO_ADDR(I915_GEM_HWS_SEQNO * 
sizeof(u32))
 #define I915_GEM_HWS_MIGRATE   (0x42 * sizeof(u32))
+#define I915_GEM_HWS_PXP   0x60
+#define I915_GEM_HWS_PXP_ADDR  (I915_GEM_HWS_PXP * sizeof(u32))
 #define I915_GEM_HWS_SCRATCH   0x80
 
 #define I915_HWS_CSB_BUF0_INDEX0x10
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index 2aeaae036a6f..da30919b7e99 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -21,6 +21,7 @@
 #include "intel_uncore.h"
 #include "intel_pm.h"
 #include "shmem_utils.h"
+#include "pxp/intel_pxp.h"
 
 void intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915)
 {
@@ -712,6 +713,8 @@ int intel_gt_init(struct intel_gt *gt)
 
intel_migrate_init(>->migrate, gt);
 
+   intel_pxp_init(>->pxp);
+
goto out_fw;
 err_gt:
__intel_gt_disable(gt);
@@ -747,6 +750,8 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
 
intel_rps_driver_unregister(>->rps);
 
+   intel_pxp_fini(>->pxp);
+
/*
 * Upon unregistering the device to prevent any new users, cancel
 * all in-flight requests so that we can quickly unbind the active
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h 
b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index 6fdcde64c180..8001a61f42e5 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -26,6 +26,7 @@
 #include "intel_rps_types.h"
 #include "intel_migrate_types.h"
 #include "intel_wakeref.h"
+#include "pxp/intel_pxp_types.h"
 
 struct drm_i915_private;
 struct i915_ggtt;
@@ -196,6 +197,8 @@ struct intel_gt {
struct {
u8 uc_index;
} mocs;
+
+   struct intel_pxp pxp;
 };
 
 enum intel_gt_scratch_field {
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp.c
new file mode 100644
index ..7b2053902146
--- /dev/null
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -0,0 +1,62 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright(c) 2020 Intel Corporation.
+ */
+#include "intel_pxp.h"
+#include "gt/intel_context.h"
+#include "i915_drv.h"
+
+static int create_vcs_context(struct intel_pxp *pxp)
+{
+   static struct lock_class_key pxp_lock;
+   struct intel_gt *gt = pxp_to_gt(pxp);
+   struct intel_engine_cs *engine;
+   struct intel_context *ce;
+
+   /*
+* Find the first VCS engine present. We're guaranteed there is one
+* if we're in this function due to the check in has_pxp
+*/
+   for (engine = gt->engine_class[VIDEO_DECODE_CLASS][0]; !engine; engine++);
+   GEM_BUG_ON(!engine || engine->class != VIDEO_DECODE_CLASS);
+
+   ce = intel_engine_create_pinned_context(engine, engine->gt->vm, SZ_4K,
+   I915_GEM_HWS_PXP_ADDR,
+   &pxp_lock, "pxp_context");
+   if (IS_ERR(ce)) {

[PATCH v9 07/17] drm/i915/pxp: Create the arbitrary session after boot

2021-09-10 Thread Daniele Ceraolo Spurio
From: "Huang, Sean Z" 

Create the arbitrary session, with the fixed session id 0xf, after
system boot, for the case where an application allocates a protected
buffer without establishing any protection session, because the
hardware requires at least one live session for protected buffer
creation. This arbitrary session will need to be re-created after a
teardown or power event, because the hardware encryption key won't be
valid after such cases.

The session ID is exposed as part of the uapi so it can be used as part
of userspace commands.

v2: use gt->uncore->rpm (Chris)
v3: s/arb_is_in_play/arb_is_valid (Chris), move set-up to the new
init_hw function
v4: move interface defs to separate header, set arb_is_valid to false
on fini (Rodrigo)
v5: handle async component binding

Signed-off-by: Huang, Sean Z 
Signed-off-by: Daniele Ceraolo Spurio 
Cc: Chris Wilson 
Cc: Rodrigo Vivi 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/Makefile |  1 +
 drivers/gpu/drm/i915/pxp/intel_pxp.c  |  7 ++
 drivers/gpu/drm/i915/pxp/intel_pxp.h  |  5 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_session.c  | 74 
 drivers/gpu/drm/i915/pxp/intel_pxp_session.h  | 15 
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c  | 87 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.h  |  3 +
 .../drm/i915/pxp/intel_pxp_tee_interface.h| 37 
 drivers/gpu/drm/i915/pxp/intel_pxp_types.h| 10 +++
 include/uapi/drm/i915_drm.h   |  3 +
 10 files changed, 242 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_session.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_session.h
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_tee_interface.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index d39bd0cefc64..405e04f4dd59 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -284,6 +284,7 @@ i915-y += i915_perf.o
 # Protected execution platform (PXP) support
 i915-$(CONFIG_DRM_I915_PXP) += \
pxp/intel_pxp.o \
+   pxp/intel_pxp_session.o \
pxp/intel_pxp_tee.o
 
 # Post-mortem debug and GPU hang state capture
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index 66a98feb33ab..e1370f323126 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -3,6 +3,7 @@
  * Copyright(c) 2020 Intel Corporation.
  */
 #include "intel_pxp.h"
+#include "intel_pxp_session.h"
 #include "intel_pxp_tee.h"
 #include "gt/intel_context.h"
 #include "i915_drv.h"
@@ -65,6 +66,8 @@ void intel_pxp_init(struct intel_pxp *pxp)
if (!HAS_PXP(gt->i915))
return;
 
+   mutex_init(&pxp->tee_mutex);
+
ret = create_vcs_context(pxp);
if (ret)
return;
@@ -86,6 +89,8 @@ void intel_pxp_fini(struct intel_pxp *pxp)
if (!intel_pxp_is_enabled(pxp))
return;
 
+   pxp->arb_is_valid = false;
+
intel_pxp_tee_component_fini(pxp);
 
destroy_vcs_context(pxp);
@@ -94,6 +99,8 @@ void intel_pxp_fini(struct intel_pxp *pxp)
 void intel_pxp_init_hw(struct intel_pxp *pxp)
 {
kcr_pxp_enable(pxp_to_gt(pxp));
+
+   intel_pxp_create_arb_session(pxp);
 }
 
 void intel_pxp_fini_hw(struct intel_pxp *pxp)
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.h 
b/drivers/gpu/drm/i915/pxp/intel_pxp.h
index 5427c3b28aa9..8eeb65af78b1 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.h
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.h
@@ -19,6 +19,11 @@ static inline bool intel_pxp_is_enabled(const struct 
intel_pxp *pxp)
return pxp->ce;
 }
 
+static inline bool intel_pxp_is_active(const struct intel_pxp *pxp)
+{
+   return pxp->arb_is_valid;
+}
+
 #ifdef CONFIG_DRM_I915_PXP
 void intel_pxp_init(struct intel_pxp *pxp);
 void intel_pxp_fini(struct intel_pxp *pxp);
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_session.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp_session.c
new file mode 100644
index ..3331868f354c
--- /dev/null
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_session.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright(c) 2020, Intel Corporation. All rights reserved.
+ */
+
+#include "drm/i915_drm.h"
+#include "i915_drv.h"
+
+#include "intel_pxp.h"
+#include "intel_pxp_session.h"
+#include "intel_pxp_tee.h"
+#include "intel_pxp_types.h"
+
+#define ARB_SESSION I915_PROTECTED_CONTENT_DEFAULT_SESSION /* shorter define */
+
+#define GEN12_KCR_SIP _MMIO(0x32260) /* KCR hwdrm session in play 0-31 */
+
+static bool intel_pxp_session_is_in_play(struct intel_pxp *pxp, u32 id)
+{
+   struct intel_gt *gt = pxp_to_gt(pxp);
+   intel_wakeref_t wakeref;
+   u32 sip = 0;
+
+   with_intel_runtime_pm(gt->uncore->rpm, wakeref)
+   sip = intel_uncore_read(gt->uncore, GEN12_KCR_SIP);
+
+   return sip & BIT(id);
+}
+
+static int pxp_wait_for_session_state(struct intel_pxp *pxp, u32 id, bool 
in_

[PATCH v9 05/17] drm/i915/pxp: Implement funcs to create the TEE channel

2021-09-10 Thread Daniele Ceraolo Spurio
From: "Huang, Sean Z" 

Implement the functions to create the TEE channel, so the kernel can
send TEE commands directly to the TEE for creating the arbitrary
(default) session.

v2: fix locking, don't pollute dev_priv (Chris)

v3: wait for mei PXP component to be bound.

v4: drop the wait, as the component might be bound after i915 load
completes. We'll instead check when sending a tee message.

v5: fix an issue with mei_pxp module removal

v6: don't use fetch_and_zero in fini (Rodrigo)

Signed-off-by: Huang, Sean Z 
Signed-off-by: Daniele Ceraolo Spurio 
Cc: Chris Wilson 
---
 drivers/gpu/drm/i915/Makefile  |  3 +-
 drivers/gpu/drm/i915/pxp/intel_pxp.c   | 13 
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c   | 79 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.h   | 14 
 drivers/gpu/drm/i915/pxp/intel_pxp_types.h |  6 ++
 5 files changed, 114 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_tee.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 23f5bc268962..d39bd0cefc64 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -283,7 +283,8 @@ i915-y += i915_perf.o
 
 # Protected execution platform (PXP) support
 i915-$(CONFIG_DRM_I915_PXP) += \
-   pxp/intel_pxp.o
+   pxp/intel_pxp.o \
+   pxp/intel_pxp_tee.o
 
 # Post-mortem debug and GPU hang state capture
 i915-$(CONFIG_DRM_I915_CAPTURE_ERROR) += i915_gpu_error.o
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index 7b2053902146..400deaea2d8a 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -3,6 +3,7 @@
  * Copyright(c) 2020 Intel Corporation.
  */
 #include "intel_pxp.h"
+#include "intel_pxp_tee.h"
 #include "gt/intel_context.h"
 #include "i915_drv.h"
 
@@ -50,7 +51,16 @@ void intel_pxp_init(struct intel_pxp *pxp)
if (ret)
return;
 
+   ret = intel_pxp_tee_component_init(pxp);
+   if (ret)
+   goto out_context;
+
	drm_info(&gt->i915->drm, "Protected Xe Path (PXP) protected content support initialized\n");
+
+   return;
+
+out_context:
+   destroy_vcs_context(pxp);
 }
 
 void intel_pxp_fini(struct intel_pxp *pxp)
@@ -58,5 +68,8 @@ void intel_pxp_fini(struct intel_pxp *pxp)
if (!intel_pxp_is_enabled(pxp))
return;
 
+   intel_pxp_tee_component_fini(pxp);
+
destroy_vcs_context(pxp);
+
 }
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
new file mode 100644
index ..f1d8de832653
--- /dev/null
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
@@ -0,0 +1,79 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright(c) 2020 Intel Corporation.
+ */
+
+#include 
+#include "drm/i915_pxp_tee_interface.h"
+#include "drm/i915_component.h"
+#include "i915_drv.h"
+#include "intel_pxp.h"
+#include "intel_pxp_tee.h"
+
+static inline struct intel_pxp *i915_dev_to_pxp(struct device *i915_kdev)
+{
+   return &kdev_to_i915(i915_kdev)->gt.pxp;
+}
+
+/**
+ * i915_pxp_tee_component_bind - bind function to pass the function pointers to pxp_tee
+ * @i915_kdev: pointer to i915 kernel device
+ * @tee_kdev: pointer to tee kernel device
+ * @data: pointer to pxp_tee_master containing the function pointers
+ *
+ * This bind function is called during the system boot or resume from system sleep.
+ *
+ * Return: return 0 if successful.
+ */
+static int i915_pxp_tee_component_bind(struct device *i915_kdev,
+  struct device *tee_kdev, void *data)
+{
+   struct intel_pxp *pxp = i915_dev_to_pxp(i915_kdev);
+
+   pxp->pxp_component = data;
+   pxp->pxp_component->tee_dev = tee_kdev;
+
+   return 0;
+}
+
+static void i915_pxp_tee_component_unbind(struct device *i915_kdev,
+ struct device *tee_kdev, void *data)
+{
+   struct intel_pxp *pxp = i915_dev_to_pxp(i915_kdev);
+
+   pxp->pxp_component = NULL;
+}
+
+static const struct component_ops i915_pxp_tee_component_ops = {
+   .bind   = i915_pxp_tee_component_bind,
+   .unbind = i915_pxp_tee_component_unbind,
+};
+
+int intel_pxp_tee_component_init(struct intel_pxp *pxp)
+{
+   int ret;
+   struct intel_gt *gt = pxp_to_gt(pxp);
+   struct drm_i915_private *i915 = gt->i915;
+
+   ret = component_add_typed(i915->drm.dev, &i915_pxp_tee_component_ops,
+ I915_COMPONENT_PXP);
+   if (ret < 0) {
+   drm_err(&i915->drm, "Failed to add PXP component (%d)\n", ret);
+   return ret;
+   }
+
+   pxp->pxp_component_added = true;
+
+   return 0;
+}
+
+void intel_pxp_tee_component_fini(struct intel_pxp *pxp)
+{
+   struct drm_i915_private *i915 = pxp_to_gt(pxp)->i915;
+
+   if (!pxp->pxp_component_added)
+   return;
+
+   

[PATCH v9 06/17] drm/i915/pxp: set KCR reg init

2021-09-10 Thread Daniele Ceraolo Spurio
The setting is required by the hardware to allow us to do further
protection operations, such as sending commands to the GPU or TEE. The
register needs to be re-programmed on resume, so for simplicity we
bundle the programming with the component binding, which is
automatically called on resume.

Further HW set-up operations will be added in the same location in
follow-up patches, so get ready for them by using a couple of
init/fini_hw wrappers instead of calling the KCR funcs directly.

v3: move programming to component binding function, rework commit msg

Signed-off-by: Huang, Sean Z 
Signed-off-by: Daniele Ceraolo Spurio 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/pxp/intel_pxp.c | 27 
 drivers/gpu/drm/i915/pxp/intel_pxp.h |  3 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c |  5 +
 3 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index 400deaea2d8a..66a98feb33ab 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -7,6 +7,24 @@
 #include "gt/intel_context.h"
 #include "i915_drv.h"
 
+/* KCR register definitions */
+#define KCR_INIT _MMIO(0x320f0)
+
+/* Setting KCR Init bit is required after system boot */
+#define KCR_INIT_ALLOW_DISPLAY_ME_WRITES REG_BIT(14)
+
+static void kcr_pxp_enable(struct intel_gt *gt)
+{
+   intel_uncore_write(gt->uncore, KCR_INIT,
+			   _MASKED_BIT_ENABLE(KCR_INIT_ALLOW_DISPLAY_ME_WRITES));
+}
+
+static void kcr_pxp_disable(struct intel_gt *gt)
+{
+   intel_uncore_write(gt->uncore, KCR_INIT,
+			   _MASKED_BIT_DISABLE(KCR_INIT_ALLOW_DISPLAY_ME_WRITES));
+}
+
 static int create_vcs_context(struct intel_pxp *pxp)
 {
static struct lock_class_key pxp_lock;
@@ -71,5 +89,14 @@ void intel_pxp_fini(struct intel_pxp *pxp)
intel_pxp_tee_component_fini(pxp);
 
destroy_vcs_context(pxp);
+}
+
+void intel_pxp_init_hw(struct intel_pxp *pxp)
+{
+   kcr_pxp_enable(pxp_to_gt(pxp));
+}
 
+void intel_pxp_fini_hw(struct intel_pxp *pxp)
+{
+   kcr_pxp_disable(pxp_to_gt(pxp));
 }
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.h 
b/drivers/gpu/drm/i915/pxp/intel_pxp.h
index e87550fb9821..5427c3b28aa9 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.h
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.h
@@ -22,6 +22,9 @@ static inline bool intel_pxp_is_enabled(const struct 
intel_pxp *pxp)
 #ifdef CONFIG_DRM_I915_PXP
 void intel_pxp_init(struct intel_pxp *pxp);
 void intel_pxp_fini(struct intel_pxp *pxp);
+
+void intel_pxp_init_hw(struct intel_pxp *pxp);
+void intel_pxp_fini_hw(struct intel_pxp *pxp);
 #else
 static inline void intel_pxp_init(struct intel_pxp *pxp)
 {
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
index f1d8de832653..0c0c7946e6a0 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
@@ -33,6 +33,9 @@ static int i915_pxp_tee_component_bind(struct device 
*i915_kdev,
pxp->pxp_component = data;
pxp->pxp_component->tee_dev = tee_kdev;
 
+   /* the component is required to fully start the PXP HW */
+   intel_pxp_init_hw(pxp);
+
return 0;
 }
 
@@ -41,6 +44,8 @@ static void i915_pxp_tee_component_unbind(struct device 
*i915_kdev,
 {
struct intel_pxp *pxp = i915_dev_to_pxp(i915_kdev);
 
+   intel_pxp_fini_hw(pxp);
+
pxp->pxp_component = NULL;
 }
 
-- 
2.25.1



[PATCH v9 08/17] drm/i915/pxp: Implement arb session teardown

2021-09-10 Thread Daniele Ceraolo Spurio
From: "Huang, Sean Z" 

Teardown is triggered when the display topology changes and no longer
meets the secure playback requirement, at which point the hardware
trashes all the encryption keys for display. Additionally, we want to
emit a teardown operation to make sure we're clean on boot and resume.

v2: emit in the ring, use high prio request (Chris)
v3: better defines, stalling flush, cleaned up and renamed submission
funcs (Chris)

Signed-off-by: Huang, Sean Z 
Signed-off-by: Daniele Ceraolo Spurio 
Cc: Chris Wilson 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/Makefile|   1 +
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  22 ++-
 drivers/gpu/drm/i915/pxp/intel_pxp.c |   7 +-
 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c | 141 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.h |  15 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_session.c |  29 
 drivers/gpu/drm/i915/pxp/intel_pxp_session.h |   1 +
 7 files changed, 212 insertions(+), 4 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_cmd.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 405e04f4dd59..4fb663de344d 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -284,6 +284,7 @@ i915-y += i915_perf.o
 # Protected execution platform (PXP) support
 i915-$(CONFIG_DRM_I915_PXP) += \
pxp/intel_pxp.o \
+   pxp/intel_pxp_cmd.o \
pxp/intel_pxp_session.o \
pxp/intel_pxp_tee.o
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h 
b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index 1c3af0fc0456..ec2a0a566c40 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -28,10 +28,13 @@
 #define INSTR_26_TO_24_MASK0x700
 #define   INSTR_26_TO_24_SHIFT 24
 
+#define __INSTR(client) ((client) << INSTR_CLIENT_SHIFT)
+
 /*
  * Memory interface instructions used by the kernel
  */
-#define MI_INSTR(opcode, flags) (((opcode) << 23) | (flags))
+#define MI_INSTR(opcode, flags) \
+   (__INSTR(INSTR_MI_CLIENT) | (opcode) << 23 | (flags))
 /* Many MI commands use bit 22 of the header dword for GGTT vs PPGTT */
 #define  MI_GLOBAL_GTT(1<<22)
 
@@ -57,6 +60,7 @@
 #define MI_SUSPEND_FLUSH   MI_INSTR(0x0b, 0)
 #define   MI_SUSPEND_FLUSH_EN  (1<<0)
 #define MI_SET_APPID   MI_INSTR(0x0e, 0)
+#define   MI_SET_APPID_SESSION_ID(x)   ((x) << 0)
 #define MI_OVERLAY_FLIPMI_INSTR(0x11, 0)
 #define   MI_OVERLAY_CONTINUE  (0x0<<21)
 #define   MI_OVERLAY_ON(0x1<<21)
@@ -146,6 +150,7 @@
 #define MI_STORE_REGISTER_MEM_GEN8   MI_INSTR(0x24, 2)
 #define   MI_SRM_LRM_GLOBAL_GTT(1<<22)
 #define MI_FLUSH_DWMI_INSTR(0x26, 1) /* for GEN6 */
+#define   MI_FLUSH_DW_PROTECTED_MEM_EN (1<<22)
 #define   MI_FLUSH_DW_STORE_INDEX  (1<<21)
 #define   MI_INVALIDATE_TLB(1<<18)
 #define   MI_FLUSH_DW_OP_STOREDW   (1<<14)
@@ -272,6 +277,19 @@
 #define   MI_MATH_REG_ZF   0x32
 #define   MI_MATH_REG_CF   0x33
 
+/*
+ * Media instructions used by the kernel
+ */
+#define MEDIA_INSTR(pipe, op, sub_op, flags) \
+   (__INSTR(INSTR_RC_CLIENT) | (pipe) << INSTR_SUBCLIENT_SHIFT | \
+   (op) << INSTR_26_TO_24_SHIFT | (sub_op) << 16 | (flags))
+
+#define MFX_WAIT   MEDIA_INSTR(1, 0, 0, 0)
+#define  MFX_WAIT_DW0_MFX_SYNC_CONTROL_FLAGREG_BIT(8)
+#define  MFX_WAIT_DW0_PXP_SYNC_CONTROL_FLAGREG_BIT(9)
+
+#define CRYPTO_KEY_EXCHANGEMEDIA_INSTR(2, 6, 9, 0)
+
 /*
  * Commands used only by the command parser
  */
@@ -328,8 +346,6 @@
 #define GFX_OP_3DSTATE_BINDING_TABLE_EDIT_PS \
((0x3<<29)|(0x3<<27)|(0x0<<24)|(0x47<<16))
 
-#define MFX_WAIT  ((0x3<<29)|(0x1<<27)|(0x0<<16))
-
 #define COLOR_BLT ((0x2<<29)|(0x40<<22))
 #define SRC_COPY_BLT  ((0x2<<29)|(0x43<<22))
 
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index e1370f323126..26176d43a02d 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -98,9 +98,14 @@ void intel_pxp_fini(struct intel_pxp *pxp)
 
 void intel_pxp_init_hw(struct intel_pxp *pxp)
 {
+   int ret;
+
kcr_pxp_enable(pxp_to_gt(pxp));
 
-   intel_pxp_create_arb_session(pxp);
+   /* always emit a full termination to clean the state */
+   ret = intel_pxp_terminate_arb_session_and_global(pxp);
+   if (!ret)
+   intel_pxp_create_arb_session(pxp);
 }
 
 void intel_pxp_fini_hw(struct intel_pxp *pxp)
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c
new file mode 100644
index ..80678dafde15
--- /dev/null
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_cmd.c
@@ -0,0 +1,141 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright(c) 2020, Intel Corporation. All rights reserved.
+ */
+
+#in

[PATCH v9 10/17] drm/i915/pxp: interfaces for using protected objects

2021-09-10 Thread Daniele Ceraolo Spurio
This API allows user mode to create protected buffers and to mark
contexts as making use of such objects. Only when using contexts
marked in such a way is execution guaranteed to work as expected.

Contexts can only be marked as using protected content at creation time
(i.e. the parameter is immutable) and they must be both bannable and not
recoverable. Given that the protected session gets invalidated on
suspend, contexts created this way hold a runtime pm wakeref until
they're either destroyed or invalidated.

All protected objects and contexts will be considered invalid when the
PXP session is destroyed and all new submissions using them will be
rejected. All intel contexts within the invalidated gem contexts will be
marked banned. Userspace can detect that an invalidation has occurred via
the RESET_STATS ioctl, where we report it the same way as a ban due to a
hang.

v5: squash patches, rebase on proto_ctx, update kerneldoc

v6: rebase on obj create_ext changes

v7: Use session counter to check if an object is valid, hold wakeref in
context, don't add a new flag to RESET_STATS (Daniel)

v8: don't increase guilty count for contexts banned during pxp
invalidation (Rodrigo)

v9: better comments, avoid wakeref put race between pxp_inval and
context_close, add usage examples (Rodrigo)

Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Bommu Krishnaiah 
Cc: Rodrigo Vivi 
Cc: Chris Wilson 
Cc: Lionel Landwerlin 
Cc: Jason Ekstrand 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 98 ---
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |  6 ++
 .../gpu/drm/i915/gem/i915_gem_context_types.h | 28 ++
 drivers/gpu/drm/i915/gem/i915_gem_create.c| 72 ++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 18 
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  1 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  6 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  8 ++
 .../gpu/drm/i915/gem/selftests/mock_context.c |  4 +-
 drivers/gpu/drm/i915/pxp/intel_pxp.c  | 78 +++
 drivers/gpu/drm/i915/pxp/intel_pxp.h  | 12 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_session.c  |  6 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_types.h|  9 ++
 include/uapi/drm/i915_drm.h   | 96 +-
 14 files changed, 407 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index c2ab0e22db0a..3418be4f727f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -77,6 +77,8 @@
 #include "gt/intel_gpu_commands.h"
 #include "gt/intel_ring.h"
 
+#include "pxp/intel_pxp.h"
+
 #include "i915_gem_context.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
@@ -186,10 +188,13 @@ static int validate_priority(struct drm_i915_private 
*i915,
return 0;
 }
 
-static void proto_context_close(struct i915_gem_proto_context *pc)
+static void proto_context_close(struct drm_i915_private *i915,
+   struct i915_gem_proto_context *pc)
 {
int i;
 
+   if (pc->pxp_wakeref)
+   intel_runtime_pm_put(&i915->runtime_pm, pc->pxp_wakeref);
if (pc->vm)
i915_vm_put(pc->vm);
if (pc->user_engines) {
@@ -241,6 +246,33 @@ static int proto_context_set_persistence(struct 
drm_i915_private *i915,
return 0;
 }
 
+static int proto_context_set_protected(struct drm_i915_private *i915,
+  struct i915_gem_proto_context *pc,
+  bool protected)
+{
+   int ret = 0;
+
+   if (!intel_pxp_is_enabled(&i915->gt.pxp)) {
+   ret = -ENODEV;
+   } else if (!protected) {
+   pc->uses_protected_content = false;
+   } else if ((pc->user_flags & BIT(UCONTEXT_RECOVERABLE)) ||
+  !(pc->user_flags & BIT(UCONTEXT_BANNABLE))) {
+   ret = -EPERM;
+   } else {
+   pc->uses_protected_content = true;
+
+   /*
+* protected context usage requires the PXP session to be up,
+* which in turn requires the device to be active.
+*/
+   pc->pxp_wakeref = intel_runtime_pm_get(&i915->runtime_pm);
+   ret = intel_pxp_wait_for_arb_start(&i915->gt.pxp);
+   }
+
+   return ret;
+}
+
 static struct i915_gem_proto_context *
 proto_context_create(struct drm_i915_private *i915, unsigned int flags)
 {
@@ -269,7 +301,7 @@ proto_context_create(struct drm_i915_private *i915, 
unsigned int flags)
return pc;
 
 proto_close:
-   proto_context_close(pc);
+   proto_context_close(i915, pc);
return err;
 }
 
@@ -693,6 +725,8 @@ static int set_proto_ctx_param(struct drm_i915_file_private 
*fpriv,
ret = -EPERM;
else if (args->value)
pc->user_flag

[PATCH v9 14/17] drm/i915/pxp: black pixels on pxp disabled

2021-09-10 Thread Daniele Ceraolo Spurio
From: Anshuman Gupta 

When a protected surface has been flipped and the PXP session is
disabled, display black pixels by using plane color CTM correction.

v2:
- Display black pixels in async flip too.

v3:
- Removed the black pixels logic for async flip. [Ville]
- Used plane state to force black pixels. [Ville]

v4 (Daniele): update pxp_is_borked check.

v5: rebase on top of v9 plane decryption moving the decrypt check
(Juston)

Cc: Ville Syrjälä 
Cc: Gaurav Kumar 
Cc: Shankar Uma 
Signed-off-by: Anshuman Gupta 
Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Juston Li 
Reviewed-by: Rodrigo Vivi 
Reviewed-by: Uma Shankar 
---
 drivers/gpu/drm/i915/display/intel_display.c  | 12 -
 .../drm/i915/display/intel_display_types.h|  3 ++
 .../drm/i915/display/skl_universal_plane.c| 36 ++-
 drivers/gpu/drm/i915/i915_reg.h   | 46 +++
 4 files changed, 94 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index 7c19a7b0676a..755f3e32516d 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -9003,6 +9003,11 @@ static bool bo_has_valid_encryption(struct 
drm_i915_gem_object *obj)
return intel_pxp_key_check(&i915->gt.pxp, obj, false) == 0;
 }
 
+static bool pxp_is_borked(struct drm_i915_gem_object *obj)
+{
+	return i915_gem_object_is_protected(obj) && !bo_has_valid_encryption(obj);
+}
+
 static int intel_atomic_check_planes(struct intel_atomic_state *state)
 {
struct drm_i915_private *dev_priv = to_i915(state->base.dev);
@@ -9064,10 +9069,13 @@ static int intel_atomic_check_planes(struct 
intel_atomic_state *state)
		new_plane_state = intel_atomic_get_new_plane_state(state, plane);
		old_plane_state = intel_atomic_get_old_plane_state(state, plane);
fb = new_plane_state->hw.fb;
-   if (fb)
+   if (fb) {
			new_plane_state->decrypt = bo_has_valid_encryption(intel_fb_obj(fb));
-   else
+			new_plane_state->force_black = pxp_is_borked(intel_fb_obj(fb));
+   } else {
new_plane_state->decrypt = old_plane_state->decrypt;
+			new_plane_state->force_black = old_plane_state->force_black;
+   }
}
 
return 0;
diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h 
b/drivers/gpu/drm/i915/display/intel_display_types.h
index d75c8bd39abc..9fa4ef06e377 100644
--- a/drivers/gpu/drm/i915/display/intel_display_types.h
+++ b/drivers/gpu/drm/i915/display/intel_display_types.h
@@ -628,6 +628,9 @@ struct intel_plane_state {
/* Plane pxp decryption state */
bool decrypt;
 
+   /* Plane state to display black pixels when pxp is borked */
+   bool force_black;
+
/* plane control register */
u32 ctl;
 
diff --git a/drivers/gpu/drm/i915/display/skl_universal_plane.c 
b/drivers/gpu/drm/i915/display/skl_universal_plane.c
index 55e3f093b951..c4adcb3e12b3 100644
--- a/drivers/gpu/drm/i915/display/skl_universal_plane.c
+++ b/drivers/gpu/drm/i915/display/skl_universal_plane.c
@@ -1002,6 +1002,33 @@ static u32 skl_surf_address(const struct 
intel_plane_state *plane_state,
}
 }
 
+static void intel_load_plane_csc_black(struct intel_plane *intel_plane)
+{
+   struct drm_i915_private *dev_priv = to_i915(intel_plane->base.dev);
+   enum pipe pipe = intel_plane->pipe;
+   enum plane_id plane = intel_plane->id;
+   u16 postoff = 0;
+
+   drm_dbg_kms(&dev_priv->drm, "plane color CTM to black  %s:%d\n",
+   intel_plane->base.name, plane);
+   intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 0), 0);
+   intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 1), 0);
+
+   intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 2), 0);
+   intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 3), 0);
+
+   intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 4), 0);
+   intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 5), 0);
+
+   intel_de_write_fw(dev_priv, PLANE_CSC_PREOFF(pipe, plane, 0), 0);
+   intel_de_write_fw(dev_priv, PLANE_CSC_PREOFF(pipe, plane, 1), 0);
+   intel_de_write_fw(dev_priv, PLANE_CSC_PREOFF(pipe, plane, 2), 0);
+
+   intel_de_write_fw(dev_priv, PLANE_CSC_POSTOFF(pipe, plane, 0), postoff);
+   intel_de_write_fw(dev_priv, PLANE_CSC_POSTOFF(pipe, plane, 1), postoff);
+   intel_de_write_fw(dev_priv, PLANE_CSC_POSTOFF(pipe, plane, 2), postoff);
+}
+
 static void
 skl_program_plane(struct intel_plane *plane,
  const struct intel_crtc_state *crtc_state,
@@ -1115,14 +1142,21 @@ skl_program_plane(struct intel_plane *plane,
 */
intel_de_write_fw(dev_priv, PLANE_CTL(pipe, plane_id), plane_ctl);
plane_surf = intel_plane_ggtt_offs

[PATCH v9 17/17] drm/i915/pxp: enable PXP for integrated Gen12

2021-09-10 Thread Daniele Ceraolo Spurio
Note that discrete cards can support PXP as well, but we haven't
tested it on those yet, so we're keeping it disabled for now.

Signed-off-by: Daniele Ceraolo Spurio 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/i915_pci.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index d4a6a9dcf182..169837de395d 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -865,6 +865,7 @@ static const struct intel_device_info jsl_info = {
}, \
TGL_CURSOR_OFFSETS, \
.has_global_mocs = 1, \
+   .has_pxp = 1, \
.display.has_dsb = 1
 
 static const struct intel_device_info tgl_info = {
@@ -891,6 +892,7 @@ static const struct intel_device_info rkl_info = {
 #define DGFX_FEATURES \
.memory_regions = REGION_SMEM | REGION_LMEM | REGION_STOLEN_LMEM, \
.has_llc = 0, \
+   .has_pxp = 0, \
.has_snoop = 1, \
.is_dgfx = 1
 
-- 
2.25.1



[PATCH v9 11/17] drm/i915/pxp: start the arb session on demand

2021-09-10 Thread Daniele Ceraolo Spurio
Now that we can handle destruction and re-creation of the arb session,
we can postpone the start of the session to the first submission that
requires it, to avoid keeping it running with no user.

Signed-off-by: Daniele Ceraolo Spurio 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c  |  4 ++-
 drivers/gpu/drm/i915/pxp/intel_pxp.c | 37 +---
 drivers/gpu/drm/i915/pxp/intel_pxp.h |  5 +--
 drivers/gpu/drm/i915/pxp/intel_pxp_irq.c |  2 +-
 drivers/gpu/drm/i915/pxp/intel_pxp_session.c |  6 ++--
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c | 10 +-
 drivers/gpu/drm/i915/pxp/intel_pxp_types.h   |  2 ++
 7 files changed, 37 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 3418be4f727f..f1a6cfc33148 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -267,7 +267,9 @@ static int proto_context_set_protected(struct 
drm_i915_private *i915,
 * which in turn requires the device to be active.
 */
pc->pxp_wakeref = intel_runtime_pm_get(&i915->runtime_pm);
-   ret = intel_pxp_wait_for_arb_start(&i915->gt.pxp);
+
+   if (!intel_pxp_is_active(&i915->gt.pxp))
+   ret = intel_pxp_start(&i915->gt.pxp);
}
 
return ret;
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index e49e60567a56..e183ac479e8b 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -79,6 +79,7 @@ void intel_pxp_init(struct intel_pxp *pxp)
init_completion(&pxp->termination);
complete_all(&pxp->termination);
 
+   mutex_init(&pxp->arb_mutex);
INIT_WORK(&pxp->session_work, intel_pxp_session_work);
 
ret = create_vcs_context(pxp);
@@ -115,7 +116,7 @@ void intel_pxp_mark_termination_in_progress(struct 
intel_pxp *pxp)
reinit_completion(&pxp->termination);
 }
 
-static void intel_pxp_queue_termination(struct intel_pxp *pxp)
+static void pxp_queue_termination(struct intel_pxp *pxp)
 {
struct intel_gt *gt = pxp_to_gt(pxp);
 
@@ -134,31 +135,41 @@ static void intel_pxp_queue_termination(struct intel_pxp 
*pxp)
  * the arb session is restarted from the irq work when we receive the
  * termination completion interrupt
  */
-int intel_pxp_wait_for_arb_start(struct intel_pxp *pxp)
+int intel_pxp_start(struct intel_pxp *pxp)
 {
+   int ret = 0;
+
if (!intel_pxp_is_enabled(pxp))
-   return 0;
+   return -ENODEV;
+
+   mutex_lock(&pxp->arb_mutex);
+
+   if (pxp->arb_is_valid)
+   goto unlock;
+
+   pxp_queue_termination(pxp);
 
if (!wait_for_completion_timeout(&pxp->termination,
-msecs_to_jiffies(100)))
-   return -ETIMEDOUT;
+   msecs_to_jiffies(100))) {
+   ret = -ETIMEDOUT;
+   goto unlock;
+   }
+
+   /* make sure the compiler doesn't optimize the double access */
+   barrier();
 
if (!pxp->arb_is_valid)
-   return -EIO;
+   ret = -EIO;
 
-   return 0;
+unlock:
+   mutex_unlock(&pxp->arb_mutex);
+   return ret;
 }
 
 void intel_pxp_init_hw(struct intel_pxp *pxp)
 {
kcr_pxp_enable(pxp_to_gt(pxp));
intel_pxp_irq_enable(pxp);
-
-   /*
-* the session could've been attacked while we weren't loaded, so
-* handle it as if it was and re-create it.
-*/
-   intel_pxp_queue_termination(pxp);
 }
 
 void intel_pxp_fini_hw(struct intel_pxp *pxp)
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.h b/drivers/gpu/drm/i915/pxp/intel_pxp.h
index f942bdd2af0c..424fe00a91fb 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.h
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.h
@@ -34,7 +34,8 @@ void intel_pxp_init_hw(struct intel_pxp *pxp);
 void intel_pxp_fini_hw(struct intel_pxp *pxp);
 
 void intel_pxp_mark_termination_in_progress(struct intel_pxp *pxp);
-int intel_pxp_wait_for_arb_start(struct intel_pxp *pxp);
+
+int intel_pxp_start(struct intel_pxp *pxp);
 
 int intel_pxp_key_check(struct intel_pxp *pxp, struct drm_i915_gem_object *obj);
 
@@ -48,7 +49,7 @@ static inline void intel_pxp_fini(struct intel_pxp *pxp)
 {
 }
 
-static inline int intel_pxp_wait_for_arb_start(struct intel_pxp *pxp)
+static inline int intel_pxp_start(struct intel_pxp *pxp)
 {
return -ENODEV;
 }
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c b/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
index 46eca1e81b9b..340f20d130a8 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
@@ -31,7 +31,7 @@ void intel_pxp_irq_handler(struct intel_pxp *pxp, u16 iir)
   GEN12_DISPLAY_APP_TERMINATED_PER_FW_REQ_INTERRUPT)) {
  

[PATCH v9 09/17] drm/i915/pxp: Implement PXP irq handler

2021-09-10 Thread Daniele Ceraolo Spurio
From: "Huang, Sean Z" 

The HW will generate a teardown interrupt when session termination is
required, which requires i915 to submit a terminating batch. Once the HW
is done with the termination it will generate another interrupt, at
which point it is safe to re-create the session.

Since the termination and re-creation flow is something we want to
trigger from the driver as well, use a common work function that can be
called both from the irq handler and from the driver set-up flows, which
has the added benefit of allowing us to skip any extra locks because
the work itself serializes the operations.

v2: use struct completion instead of bool (Chris)
v3: drop locks, clean up functions and improve comments (Chris),
move to common work function.
v4: improve comments, simplify wait logic (Rodrigo)
v5: unconditionally set interrupts, rename state_attacked var (Rodrigo)

Signed-off-by: Huang, Sean Z 
Signed-off-by: Daniele Ceraolo Spurio 
Cc: Chris Wilson 
Cc: Rodrigo Vivi 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/Makefile|  1 +
 drivers/gpu/drm/i915/gt/intel_gt_irq.c   |  7 ++
 drivers/gpu/drm/i915/i915_reg.h  |  1 +
 drivers/gpu/drm/i915/pxp/intel_pxp.c | 66 +++--
 drivers/gpu/drm/i915/pxp/intel_pxp.h |  8 ++
 drivers/gpu/drm/i915/pxp/intel_pxp_irq.c | 99 
 drivers/gpu/drm/i915/pxp/intel_pxp_irq.h | 32 +++
 drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 54 ++-
 drivers/gpu/drm/i915/pxp/intel_pxp_session.h |  5 +-
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c |  8 +-
 drivers/gpu/drm/i915/pxp/intel_pxp_types.h   | 18 
 11 files changed, 283 insertions(+), 16 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_irq.c
 create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_irq.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 4fb663de344d..b22b8c195bb8 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -285,6 +285,7 @@ i915-y += i915_perf.o
 i915-$(CONFIG_DRM_I915_PXP) += \
pxp/intel_pxp.o \
pxp/intel_pxp_cmd.o \
+   pxp/intel_pxp_irq.o \
pxp/intel_pxp_session.o \
pxp/intel_pxp_tee.o
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
index b2de83be4d97..699a74582d32 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
@@ -13,6 +13,7 @@
 #include "intel_lrc_reg.h"
 #include "intel_uncore.h"
 #include "intel_rps.h"
+#include "pxp/intel_pxp_irq.h"
 
 static void guc_irq_handler(struct intel_guc *guc, u16 iir)
 {
@@ -64,6 +65,9 @@ gen11_other_irq_handler(struct intel_gt *gt, const u8 instance,
if (instance == OTHER_GTPM_INSTANCE)
 		return gen11_rps_irq_handler(&gt->rps, iir);
 
+   if (instance == OTHER_KCR_INSTANCE)
+		return intel_pxp_irq_handler(&gt->pxp, iir);
+
WARN_ONCE(1, "unhandled other interrupt instance=0x%x, iir=0x%x\n",
  instance, iir);
 }
@@ -196,6 +200,9 @@ void gen11_gt_irq_reset(struct intel_gt *gt)
intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_MASK,  ~0);
intel_uncore_write(uncore, GEN11_GUC_SG_INTR_ENABLE, 0);
intel_uncore_write(uncore, GEN11_GUC_SG_INTR_MASK,  ~0);
+
+   intel_uncore_write(uncore, GEN11_CRYPTO_RSVD_INTR_ENABLE, 0);
+   intel_uncore_write(uncore, GEN11_CRYPTO_RSVD_INTR_MASK,  ~0);
 }
 
 void gen11_gt_irq_postinstall(struct intel_gt *gt)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index c2853cc005ee..84bc884bd474 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -8117,6 +8117,7 @@ enum {
 /* irq instances for OTHER_CLASS */
 #define OTHER_GUC_INSTANCE 0
 #define OTHER_GTPM_INSTANCE1
+#define OTHER_KCR_INSTANCE 4
 
 #define GEN11_INTR_IDENTITY_REG(x) _MMIO(0x190060 + ((x) * 4))
 
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index 26176d43a02d..b0c7edc10cc3 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -2,7 +2,9 @@
 /*
  * Copyright(c) 2020 Intel Corporation.
  */
+#include <linux/workqueue.h>
 #include "intel_pxp.h"
+#include "intel_pxp_irq.h"
 #include "intel_pxp_session.h"
 #include "intel_pxp_tee.h"
 #include "gt/intel_context.h"
@@ -68,6 +70,16 @@ void intel_pxp_init(struct intel_pxp *pxp)
 
mutex_init(&pxp->tee_mutex);
 
+   /*
+* we'll use the completion to check if there is a termination pending,
+* so we start it as completed and we reinit it when a termination
+* is triggered.
+*/
+   init_completion(&pxp->termination);
+   complete_all(&pxp->termination);
+
+   INIT_WORK(&pxp->session_work, intel_pxp_session_work);
+
ret = create_vcs_context(pxp);
if (ret)
return;
@@ -96,19 +108,61 @@ void intel_pxp_fini(struct in

[PATCH v9 13/17] drm/i915/pxp: Add plane decryption support

2021-09-10 Thread Daniele Ceraolo Spurio
From: Anshuman Gupta 

Add support to enable/disable the PLANE_SURF Decryption Request bit.
Plane decryption is enabled only when the following conditions are met:
1. The PXP session is enabled.
2. The buffer object is protected.

v2:
- Used gen fb obj user_flags instead of gem_object_metadata. [Krishna]

v3:
- intel_pxp_gem_object_status() API changes.

v4: use intel_pxp_is_active (Daniele)

v5: rebase and use the new protected object status checker (Daniele)

v6: used plane state for plane_decryption to handle async flip
as suggested by Ville.

v7: check pxp session while plane decrypt state computation. [Ville]
removed pointless code. [Ville]

v8 (Daniele): update PXP check

v9: move decrypt check after icl_check_nv12_planes() when overlays
have fb set (Juston)

v10 (Daniele): update PXP check again to match rework in earlier patches and
don't consider protection valid if the object has not been used in an
execbuf beforehand.

Cc: Bommu Krishnaiah 
Cc: Huang Sean Z 
Cc: Gaurav Kumar 
Cc: Ville Syrjälä 
Signed-off-by: Anshuman Gupta 
Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Juston Li 
Reviewed-by: Rodrigo Vivi 
Reviewed-by: Uma Shankar  #v9
---
 drivers/gpu/drm/i915/display/intel_display.c  | 26 +++
 .../drm/i915/display/intel_display_types.h|  3 +++
 .../drm/i915/display/skl_universal_plane.c| 15 ---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  2 +-
 drivers/gpu/drm/i915/i915_reg.h   |  1 +
 drivers/gpu/drm/i915/pxp/intel_pxp.c  |  9 ---
 drivers/gpu/drm/i915/pxp/intel_pxp.h  |  7 +++--
 7 files changed, 54 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index a7ca38613f89..7c19a7b0676a 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -71,6 +71,8 @@
 #include "gt/intel_rps.h"
 #include "gt/gen8_ppgtt.h"
 
+#include "pxp/intel_pxp.h"
+
 #include "g4x_dp.h"
 #include "g4x_hdmi.h"
 #include "i915_drv.h"
@@ -8994,13 +8996,23 @@ static int intel_bigjoiner_add_affected_planes(struct intel_atomic_state *state)
return 0;
 }
 
+static bool bo_has_valid_encryption(struct drm_i915_gem_object *obj)
+{
+   struct drm_i915_private *i915 = to_i915(obj->base.dev);
+
+   return intel_pxp_key_check(&i915->gt.pxp, obj, false) == 0;
+}
+
 static int intel_atomic_check_planes(struct intel_atomic_state *state)
 {
struct drm_i915_private *dev_priv = to_i915(state->base.dev);
struct intel_crtc_state *old_crtc_state, *new_crtc_state;
struct intel_plane_state *plane_state;
struct intel_plane *plane;
+   struct intel_plane_state *new_plane_state;
+   struct intel_plane_state *old_plane_state;
struct intel_crtc *crtc;
+   const struct drm_framebuffer *fb;
int i, ret;
 
ret = icl_add_linked_planes(state);
@@ -9048,6 +9060,16 @@ static int intel_atomic_check_planes(struct intel_atomic_state *state)
return ret;
}
 
+   for_each_new_intel_plane_in_state(state, plane, plane_state, i) {
+		new_plane_state = intel_atomic_get_new_plane_state(state, plane);
+		old_plane_state = intel_atomic_get_old_plane_state(state, plane);
+   fb = new_plane_state->hw.fb;
+   if (fb)
+			new_plane_state->decrypt = bo_has_valid_encryption(intel_fb_obj(fb));
+   else
+   new_plane_state->decrypt = old_plane_state->decrypt;
+   }
+
return 0;
 }
 
@@ -9334,6 +9356,10 @@ static int intel_atomic_check_async(struct intel_atomic_state *state)
drm_dbg_kms(&i915->drm, "Color range cannot be changed 
in async flip\n");
return -EINVAL;
}
+
+		/* plane decryption is allowed to change only in synchronous flips */
+   if (old_plane_state->decrypt != new_plane_state->decrypt)
+   return -EINVAL;
}
 
return 0;
diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h b/drivers/gpu/drm/i915/display/intel_display_types.h
index e9e806d90eec..d75c8bd39abc 100644
--- a/drivers/gpu/drm/i915/display/intel_display_types.h
+++ b/drivers/gpu/drm/i915/display/intel_display_types.h
@@ -625,6 +625,9 @@ struct intel_plane_state {
 
struct intel_fb_view view;
 
+   /* Plane pxp decryption state */
+   bool decrypt;
+
/* plane control register */
u32 ctl;
 
diff --git a/drivers/gpu/drm/i915/display/skl_universal_plane.c b/drivers/gpu/drm/i915/display/skl_universal_plane.c
index 724e7b04f3b6..55e3f093b951 100644
--- a/drivers/gpu/drm/i915/display/skl_universal_plane.c
+++ b/drivers/gpu/drm/i915/display/skl_universal_plane.c
@@ -18,6 +18,7 @@
 #include "intel_sprite.h"
 #include "skl_scaler.h"
 #include "skl_universal_plane.h"
+#include "pxp/intel_pxp.h"
 
