Re: [Intel-gfx] [PATCH 03/46] drm/i915/guc: Don't return -EAGAIN to user when guc_ids exhausted

2021-08-05 Thread Daniel Vetter
On Tue, Aug 03, 2021 at 03:29:00PM -0700, Matthew Brost wrote:
> Rather than returning -EAGAIN to the user when no guc_ids are available,
> implement a fair sharing algorithm in the kernel which blocks submissions
> until guc_ids become available. Submissions are released one at a time,
> based on priority, until the guc_id pressure is released to ensure fair
> sharing of the guc_ids. Once the pressure is fully released, the normal
> guc_id allocation (at request creation time in guc_request_alloc) can
> resume as this allocation path should be significantly faster and a fair
> sharing algorithm isn't needed when guc_ids are plentiful.
> 
> The fair sharing algorithm is implemented by forcing all submissions to
> the tasklet which serializes submissions, dequeuing one at a time.
> 
> If the submission doesn't have a guc_id and a new guc_id can't be found,
> two lists are searched, one list with contexts that are not pinned but
> still registered with the guc (searched first) and another list with
> contexts that are pinned but do not have any submissions actively in
> flight (scheduling enabled + registered, searched second). If no
> guc_ids can be found we kick a workqueue which will retire requests
> hopefully freeing a guc_id. The workqueue + tasklet ping-pong back and
> forth until a guc_id can be found.
> 
> Once a guc_id is found, we may have to disable context scheduling
> depending on which list the context is stolen from. When we disable
> scheduling, we block the tasklet from executing until the completion G2H
> returns. The disable scheduling must be issued from the workqueue
> because of the locking structure. When we deregister a context, we also
> do the same thing (waiting on the G2H) but we can safely issue the
> deregister H2G from the tasklet.
> 
> Once all the G2H have returned we can trigger a submission on the
> context.
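
To make the dequeue flow above a bit easier to follow, here is a
simplified sketch (helper names other than guc_add_request() are made
up; the fields and STALL_* values are from the patch below):

	static void guc_dequeue_one_sketch(struct intel_guc *guc)
	{
		struct i915_request *rq = pop_highest_priority_request(guc);

		if (!context_has_guc_id(rq->context) &&
		    !steal_guc_id(guc, rq->context)) {
			/*
			 * No guc_id anywhere: stall and kick the retire
			 * worker, which retires requests to free a guc_id
			 * and then re-kicks this tasklet.
			 */
			guc->stalled_rq = rq;
			guc->submission_stall_reason = STALL_GUC_ID_TASKLET;
			queue_work(system_unbound_wq, &guc->retire_worker);
			return;
		}

		guc_add_request(guc, rq);
	}
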
> 
> Signed-off-by: Matthew Brost 
> ---
>  drivers/gpu/drm/i915/gt/intel_context_types.h |   3 +
>  drivers/gpu/drm/i915/gt/uc/intel_guc.h|  26 +-
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 805 --
>  drivers/gpu/drm/i915/i915_request.h   |   6 +
>  4 files changed, 754 insertions(+), 86 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
> b/drivers/gpu/drm/i915/gt/intel_context_types.h
> index e54351a170e2..8ed964ef967b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
> @@ -185,6 +185,9 @@ struct intel_context {
>   /* GuC LRC descriptor reference count */
>   atomic_t guc_id_ref;
>  
> + /* Number of rq submitted without a guc_id */
> + u16 guc_num_rq_submit_no_id;
> +
>   /*
>* GuC ID link - in list when unpinned but guc_id still valid in GuC
>*/
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> index 1d7cb118e70f..e76579396efd 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> @@ -33,7 +33,28 @@ struct intel_guc {
>  
>   /* Global engine used to submit requests to GuC */
>   struct i915_sched_engine *sched_engine;
> - struct i915_request *stalled_request;
> +
> + /* Global state related to submission tasklet */
> + struct i915_request *stalled_rq;
> + struct intel_context *stalled_context;
> + struct work_struct retire_worker;
> + unsigned long flags;
> + int total_num_rq_with_no_guc_id;
> +
> + /*
> +  * Submission stall reason. See intel_guc_submission.c for detailed
> +  * description.
> +  */

I think documenting this kind of stuff inline as kerneldoc is neater, and
closer to where it's generally needed. Source navigation tools point you
here, not to a comment that's buried somewhere else.
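
Something along these lines, for example (just a sketch of the style I
mean, not suggested wording):

	/**
	 * @submission_stall_reason: reason the submission tasklet is
	 * currently stalled, waiting for a G2H or the retire worker.
	 * See the state machine description at the top of
	 * intel_guc_submission.c for how the states chain together.
	 */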

> + enum {
> + STALL_NONE,
> + STALL_GUC_ID_WORKQUEUE,
> + STALL_GUC_ID_TASKLET,
> + STALL_SCHED_DISABLE,
> + STALL_REGISTER_CONTEXT,
> + STALL_DEREGISTER_CONTEXT,
> + STALL_MOVE_LRC_TAIL,
> + STALL_ADD_REQUEST,
> + } submission_stall_reason;
>  
>   /* intel_guc_recv interrupt related state */
>   spinlock_t irq_lock;
> @@ -55,7 +76,8 @@ struct intel_guc {
>   struct ida guc_ids;
>   u32 num_guc_ids;
>   u32 max_guc_ids;
> - struct list_head guc_id_list;
> + struct list_head guc_id_list_no_ref;
> + struct list_head guc_id_list_unpinned;
>  
>   bool submission_supported;
>   bool submission_selected;
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 3b555c05c01c..f42a707f60ca 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -59,6 +59,25 @@
>   * ELSP context descriptor dword into Work Item.
>   * See guc_add_request()
>   *
> + * GuC flow control state machine

Re: [Intel-gfx] [PATCH 04/46] drm/i915/guc: Don't allow requests not ready to consume all guc_ids

2021-08-05 Thread Daniel Vetter
On Tue, Aug 03, 2021 at 03:29:01PM -0700, Matthew Brost wrote:
> Add a heuristic which checks if over half of the available guc_ids are
> currently consumed by requests not ready to be submitted. If this
> heuristic is true at request creation time (normal guc_id allocation
> location) force all submissions + guc_ids allocations to tasklet.
> 
> Signed-off-by: Matthew Brost 
> ---
>  drivers/gpu/drm/i915/gt/intel_context_types.h |  3 ++
>  drivers/gpu/drm/i915/gt/intel_reset.c |  9 
>  drivers/gpu/drm/i915/gt/uc/intel_guc.h|  1 +
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 53 +--
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.h |  2 +
>  5 files changed, 65 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
> b/drivers/gpu/drm/i915/gt/intel_context_types.h
> index 8ed964ef967b..c01530d7dc67 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
> @@ -188,6 +188,9 @@ struct intel_context {
>   /* Number of rq submitted without a guc_id */
>   u16 guc_num_rq_submit_no_id;
>  
> + /* GuC number of requests not ready */
> + atomic_t guc_num_rq_not_ready;

atomic_t by default is unordered. This needs some gigantic comments and
explainers for why this is totally ok and we don't need barriers here.
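
E.g. something like this (assuming the heuristic really is the only
consumer, which is how I read the patch):

	/*
	 * @guc_num_rq_not_ready: unordered atomic_t is ok here because
	 * the aggregate guc->num_guc_ids_not_ready only feeds the
	 * too_many_guc_ids_not_ready() heuristic. A stale value at
	 * worst makes the heuristic kick in one request early or late,
	 * it never corrupts state, hence no barriers are needed.
	 */
	atomic_t guc_num_rq_not_ready;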

I think this is a good excuse to convert all the docs here into kerneldoc.
-Daniel

> +
>   /*
>* GuC ID link - in list when unpinned but guc_id still valid in GuC
>*/
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c 
> b/drivers/gpu/drm/i915/gt/intel_reset.c
> index 91200c43951f..ea763138197f 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -22,6 +22,7 @@
>  #include "intel_reset.h"
>  
>  #include "uc/intel_guc.h"
> +#include "uc/intel_guc_submission.h"
>  
>  #define RESET_MAX_RETRIES 3
>  
> @@ -850,6 +851,14 @@ static void nop_submit_request(struct i915_request 
> *request)
>  {
>   RQ_TRACE(request, "-EIO\n");
>  
> + /*
> +  * XXX: Kinda ugly to check for GuC submission here but this function is
> +  * going away once we switch to the DRM scheduler so we can live with
> +  * this for now.
> +  */
> + if (intel_engine_uses_guc(request->engine))
> + intel_guc_decr_num_rq_not_ready(request->context);
> +
>   request = i915_request_mark_eio(request);
>   if (request) {
>   i915_request_submit(request);
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> index e76579396efd..917352c9f323 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> @@ -76,6 +76,7 @@ struct intel_guc {
>   struct ida guc_ids;
>   u32 num_guc_ids;
>   u32 max_guc_ids;
> + atomic_t num_guc_ids_not_ready;
>   struct list_head guc_id_list_no_ref;
>   struct list_head guc_id_list_unpinned;
>  
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index f42a707f60ca..ba750fc87af1 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -1384,6 +1384,41 @@ static inline void queue_request(struct 
> i915_sched_engine *sched_engine,
>   kick_tasklet(&rq->engine->gt->uc.guc);
>  }
>  
> +/* Macro to tweak heuristic, using a simple over 50% not ready for now */
> +#define TOO_MANY_GUC_IDS_NOT_READY(avail, consumed) \
> + ((consumed) > (avail) / 2)
> +static bool too_many_guc_ids_not_ready(struct intel_guc *guc,
> +struct intel_context *ce)
> +{
> + u32 available_guc_ids, guc_ids_consumed;
> +
> + available_guc_ids = guc->num_guc_ids;
> + guc_ids_consumed = atomic_read(&guc->num_guc_ids_not_ready);
> +
> + if (TOO_MANY_GUC_IDS_NOT_READY(available_guc_ids, guc_ids_consumed)) {
> + set_and_update_guc_ids_exhausted(guc);
> + return true;
> + }
> +
> + return false;
> +}
> +
> +static void incr_num_rq_not_ready(struct intel_context *ce)
> +{
> + struct intel_guc *guc = ce_to_guc(ce);
> +
> + if (!atomic_fetch_add(1, &ce->guc_num_rq_not_ready))
> + atomic_inc(&guc->num_guc_ids_not_ready);
> +}
> +
> +void intel_guc_decr_num_rq_not_ready(struct intel_context *ce)
> +{
> + struct intel_guc *guc = ce_to_guc(ce);
> +
> + if (atomic_fetch_add(-1, &ce->guc_num_rq_not_ready) == 1)
> + atomic_dec(&guc->num_guc_ids_not_ready);
> +}
> +
>  static bool need_tasklet(struct intel_guc *guc, struct intel_context *ce)
>  {
>   struct i915_sched_engine * const sched_engine =
> @@ -1430,6 +1465,8 @@ static void guc_submit_request(struct i915_request *rq)
>   kick_tasklet(guc);
>  
>   spin_unlock_irqrestore(&sched_engine->lock, flags);
> +
> + intel_guc_decr_num_rq_not_ready(rq->

Re: [Intel-gfx] [PATCH v5 07/12] drm/i915/gt: Pipelined page migration

2021-08-05 Thread Daniel Vetter
On Thu, Jun 17, 2021 at 8:30 AM Thomas Hellström
 wrote:
> From: Chris Wilson 
>
> If we pipeline the PTE updates and then do the copy of those pages
> within a single unpreemptible command packet, we can submit the copies
> and leave them to be scheduled without having to synchronously wait
> under a global lock. In order to manage migration, we need to
> preallocate the page tables (and keep them pinned and available for use
> at any time), causing a bottleneck for migrations as all clients must
> contend on the limited resources. By inlining the ppGTT updates and
> performing the blit atomically, each client only owns the PTE while in
> use, and so we can reschedule individual operations however we see fit.
> And most importantly, we do not need to take a global lock on the shared
> vm, and wait until the operation is complete before releasing the lock
> for others to claim the PTE for themselves.
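
Conceptually each chunk of the transfer is then emitted like this
(sketch with hypothetical, heavily simplified helper arguments):

	/* bind the PTEs and blit within one unpreemptible window, so
	 * the PTEs are only owned for the duration of the copy
	 */
	do {
		emit_pte(rq, src_chunk);	/* source PTEs, inline */
		emit_pte(rq, dst_chunk);	/* destination PTEs */
		emit_copy(rq, CHUNK_SZ);	/* the actual blit */
	} while (advance_chunks(&src_chunk, &dst_chunk));
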
>
> Signed-off-by: Chris Wilson 
> Co-developed-by: Thomas Hellström 
> Signed-off-by: Thomas Hellström 
> Reviewed-by: Matthew Auld 
> ---
> v2:
> - Add a TODO for huge LMEM ptes (Pointed out by Matthew Auld)
> - Use intel_engine_destroy_pinned_context() to properly take the pinned
>   context timeline off the engine list. (CI warning).
> v3:
> - Remove an obsolete GEM_BUG_ON (Pointed out by Matthew Auld)
> - Fix the size argument in allocate_va_range() to not include the base
>   (Pointed out by Matthew Auld)

I stumbled over this because I was chasing some intel_context->vm
users, and have a few comments and questions.

First, we generally keep the patch changelog above the cut; it's often
fairly useful information when browsing old history.

> ---
>  drivers/gpu/drm/i915/Makefile |   1 +
>  drivers/gpu/drm/i915/gt/intel_engine.h|   1 +
>  drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |   2 +
>  drivers/gpu/drm/i915/gt/intel_migrate.c   | 542 ++
>  drivers/gpu/drm/i915/gt/intel_migrate.h   |  45 ++
>  drivers/gpu/drm/i915/gt/intel_migrate_types.h |  15 +
>  drivers/gpu/drm/i915/gt/intel_ring.h  |   1 +
>  drivers/gpu/drm/i915/gt/selftest_migrate.c| 291 ++
>  .../drm/i915/selftests/i915_live_selftests.h  |   1 +
>  9 files changed, 899 insertions(+)
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_migrate.c
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_migrate.h
>  create mode 100644 drivers/gpu/drm/i915/gt/intel_migrate_types.h
>  create mode 100644 drivers/gpu/drm/i915/gt/selftest_migrate.c
>
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index dde698f3bff4..5e10e0628c56 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -108,6 +108,7 @@ gt-y += \
> gt/intel_gtt.o \
> gt/intel_llc.o \
> gt/intel_lrc.o \
> +   gt/intel_migrate.o \
> gt/intel_mocs.o \
> gt/intel_ppgtt.o \
> gt/intel_rc6.o \
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h 
> b/drivers/gpu/drm/i915/gt/intel_engine.h
> index 36ea9eb52bb5..62f7440bc111 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
> @@ -188,6 +188,7 @@ intel_write_status_page(struct intel_engine_cs *engine, 
> int reg, u32 value)
>  #define I915_GEM_HWS_PREEMPT_ADDR  (I915_GEM_HWS_PREEMPT * sizeof(u32))
>  #define I915_GEM_HWS_SEQNO 0x40
>  #define I915_GEM_HWS_SEQNO_ADDR(I915_GEM_HWS_SEQNO * 
> sizeof(u32))
> +#define I915_GEM_HWS_MIGRATE   (0x42 * sizeof(u32))
>  #define I915_GEM_HWS_SCRATCH   0x80
>
>  #define I915_HWS_CSB_BUF0_INDEX0x10
> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h 
> b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> index 2694dbb9967e..1c3af0fc0456 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> @@ -123,8 +123,10 @@
>  #define   MI_SEMAPHORE_SAD_NEQ_SDD (5 << 12)
>  #define   MI_SEMAPHORE_TOKEN_MASK  REG_GENMASK(9, 5)
>  #define   MI_SEMAPHORE_TOKEN_SHIFT 5
> +#define MI_STORE_DATA_IMM  MI_INSTR(0x20, 0)
>  #define MI_STORE_DWORD_IMM MI_INSTR(0x20, 1)
>  #define MI_STORE_DWORD_IMM_GEN4MI_INSTR(0x20, 2)
> +#define MI_STORE_QWORD_IMM_GEN8 (MI_INSTR(0x20, 3) | REG_BIT(21))
>  #define   MI_MEM_VIRTUAL   (1 << 22) /* 945,g33,965 */
>  #define   MI_USE_GGTT  (1 << 22) /* g4x+ */
>  #define MI_STORE_DWORD_INDEX   MI_INSTR(0x21, 1)
> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c 
> b/drivers/gpu/drm/i915/gt/intel_migrate.c
> new file mode 100644
> index ..e2e860063e7b
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
> @@ -0,0 +1,542 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2020 Intel Corporation
> + */
> +
> +#include "i915_drv.h"
> +#include "intel_context.h"
> +#include "intel_gpu_commands.h"
> +#include "intel_gt.h"
> +#include "intel_gtt.h"
> +#include "intel_migrate.h"
> 

Re: [Intel-gfx] [RESEND PATCH v2 2/2] drm: add lockdep assert to drm_is_current_master_locked

2021-08-05 Thread Daniel Vetter
On Mon, Aug 02, 2021 at 06:59:57PM +0800, Desmond Cheong Zhi Xi wrote:
> In drm_is_current_master_locked, accessing drm_file.master should be
> protected by either drm_file.master_lookup_lock or
> drm_device.master_mutex. This was previously awkward to assert with
> lockdep.
> 
> Following patch ("locking/lockdep: Provide lockdep_assert{,_once}()
> helpers"), this assertion is now convenient. So we add in the
> assertion and explain this lock design in the kerneldoc.
> 
> Signed-off-by: Desmond Cheong Zhi Xi 
> Acked-by: Boqun Feng 
> Acked-by: Waiman Long 
> Acked-by: Peter Zijlstra (Intel) 

Both patches pushed to drm-misc-next, thanks.
-Daniel

> ---
>  drivers/gpu/drm/drm_auth.c | 6 +++---
>  include/drm/drm_file.h | 4 
>  2 files changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
> index 9c24b8cc8e36..6f4d7ff23c80 100644
> --- a/drivers/gpu/drm/drm_auth.c
> +++ b/drivers/gpu/drm/drm_auth.c
> @@ -63,9 +63,9 @@
>  
>  static bool drm_is_current_master_locked(struct drm_file *fpriv)
>  {
> - /* Either drm_device.master_mutex or drm_file.master_lookup_lock
> -  * should be held here.
> -  */
> + lockdep_assert_once(lockdep_is_held(&fpriv->master_lookup_lock) ||
> + lockdep_is_held(&fpriv->minor->dev->master_mutex));
> +
>   return fpriv->is_master && drm_lease_owner(fpriv->master) == 
> fpriv->minor->dev->master;
>  }
>  
> diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
> index 726cfe0ff5f5..a3acb7ac3550 100644
> --- a/include/drm/drm_file.h
> +++ b/include/drm/drm_file.h
> @@ -233,6 +233,10 @@ struct drm_file {
>* this only matches &drm_device.master if the master is the currently
>* active one.
>*
> +  * To update @master, both &drm_device.master_mutex and
> +  * @master_lookup_lock need to be held, therefore holding either of
> +  * them is safe and enough for the read side.
> +  *
>* When dereferencing this pointer, either hold struct
>* &drm_device.master_mutex for the duration of the pointer's use, or
>* use drm_file_get_master() if struct &drm_device.master_mutex is not
> -- 
> 2.25.1
> 
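
For anyone landing here later, the two read-side patterns the new
kerneldoc allows look roughly like this (sketch, use_master() is a
stand-in for whatever the caller does):

	/* Option 1: hold master_mutex across the access. */
	mutex_lock(&dev->master_mutex);
	use_master(fpriv->master);
	mutex_unlock(&dev->master_mutex);

	/* Option 2: take a reference under master_lookup_lock via
	 * drm_file_get_master(), then use it without master_mutex.
	 */
	master = drm_file_get_master(fpriv);
	if (master) {
		use_master(master);
		drm_master_put(&master);
	}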

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH] drm/i915/userptr: Probe existence of backing struct pages upon creation

2021-08-05 Thread Maarten Lankhorst
On 03-08-2021 at 17:57, Maarten Lankhorst wrote:
> On 2021-08-03 at 17:45, Jason Ekstrand wrote:
>> On Tue, Aug 3, 2021 at 10:09 AM Daniel Vetter  wrote:
>>> On Wed, Jul 28, 2021 at 4:22 PM Matthew Auld
>>>  wrote:
 On Mon, 26 Jul 2021 at 17:10, Tvrtko Ursulin
  wrote:
> On 26/07/2021 16:14, Jason Ekstrand wrote:
>> On Mon, Jul 26, 2021 at 3:31 AM Maarten Lankhorst
>>  wrote:
>>> On 23-07-2021 at 13:34, Matthew Auld wrote:
 From: Chris Wilson 

 Jason Ekstrand requested a more efficient method than 
 userptr+set-domain
 to determine if the userptr object was backed by a complete set of 
 pages
 upon creation. To be more efficient than simply populating the userptr
 using get_user_pages() (as done by the call to set-domain or execbuf),
 we can walk the tree of vm_area_struct and check for gaps or vma not
 backed by struct page (VM_PFNMAP). The question is how to handle
 VM_MIXEDMAP which may be either struct page or pfn backed...

 With discrete we are going to drop support for set_domain(), so 
 offering
 a way to probe the pages, without having to resort to dummy batches has
 been requested.

 v2:
 - add new query param for the PROBE flag, so userspace can easily
check if the kernel supports it (Jason).
 - use mmap_read_{lock, unlock}.
 - add some kernel-doc.
 v3:
 - In the docs also mention that PROBE doesn't guarantee that the pages
will remain valid by the time they are actually used (Tvrtko).
 - Add a small comment for the hole finding logic (Jason).
 - Move the param next to all the other params which just return true.

 Testcase: igt/gem_userptr_blits/probe
 Signed-off-by: Chris Wilson 
 Signed-off-by: Matthew Auld 
 Cc: Thomas Hellström 
 Cc: Maarten Lankhorst 
 Cc: Tvrtko Ursulin 
 Cc: Jordan Justen 
 Cc: Kenneth Graunke 
 Cc: Jason Ekstrand 
 Cc: Daniel Vetter 
 Cc: Ramalingam C 
 Reviewed-by: Tvrtko Ursulin 
 Acked-by: Kenneth Graunke 
 Reviewed-by: Jason Ekstrand 
 ---
   drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 41 
 -
   drivers/gpu/drm/i915/i915_getparam.c|  1 +
   include/uapi/drm/i915_drm.h | 20 ++
   3 files changed, 61 insertions(+), 1 deletion(-)

 diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c 
 b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
 index 56edfeff8c02..468a7a617fbf 100644
 --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
 +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
 @@ -422,6 +422,34 @@ static const struct drm_i915_gem_object_ops 
 i915_gem_userptr_ops = {

   #endif

 +static int
 +probe_range(struct mm_struct *mm, unsigned long addr, unsigned long 
 len)
 +{
 + const unsigned long end = addr + len;
 + struct vm_area_struct *vma;
 + int ret = -EFAULT;
 +
 + mmap_read_lock(mm);
 + for (vma = find_vma(mm, addr); vma; vma = vma->vm_next) {
 + /* Check for holes, note that we also update the addr 
 below */
 + if (vma->vm_start > addr)
 + break;
 +
 + if (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
 + break;
 +
 + if (vma->vm_end >= end) {
 + ret = 0;
 + break;
 + }
 +
 + addr = vma->vm_end;
 + }
 + mmap_read_unlock(mm);
 +
 + return ret;
 +}
 +
   /*
* Creates a new mm object that wraps some normal memory from the 
 process
* context - user memory.
 @@ -477,7 +505,8 @@ i915_gem_userptr_ioctl(struct drm_device *dev,
}

if (args->flags & ~(I915_USERPTR_READ_ONLY |
 - I915_USERPTR_UNSYNCHRONIZED))
 + I915_USERPTR_UNSYNCHRONIZED |
 + I915_USERPTR_PROBE))
return -EINVAL;

if (i915_gem_object_size_2big(args->user_size))
 @@ -504,6 +533,16 @@ i915_gem_userptr_ioctl(struct drm_device *dev,
return -ENODEV;
}

 + if (args->flags & I915_USERPTR_PROBE) {
 + /*
 +  * Check that the range pointed to represents real struct
 +  * pages and not iomapping
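
For context, the intended usage from userspace is roughly this (sketch,
assuming libdrm's drmIoctl(); error handling elided):

	struct drm_i915_gem_userptr arg = {
		.user_ptr = (uintptr_t)ptr,
		.user_size = size,
		/* fail creation with -EFAULT if the range isn't fully
		 * backed by struct pages right now
		 */
		.flags = I915_USERPTR_PROBE,
	};

	ret = drmIoctl(fd, DRM_IOCTL_I915_GEM_USERPTR, &arg);

As the updated docs say, a successful probe is no guarantee that the
pages will still be valid by the time they are actually used.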

[Intel-gfx] [PATCH] drm/i915: Update small joiner ram size

2021-08-05 Thread Vandita Kulkarni
XeLPD supports a larger small joiner RAM.

Signed-off-by: Vandita Kulkarni 
---
 drivers/gpu/drm/i915/display/intel_dp.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
b/drivers/gpu/drm/i915/display/intel_dp.c
index 75d4ebc66941..d174f0d6e7cd 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -461,7 +461,9 @@ u32 intel_dp_mode_to_fec_clock(u32 mode_clock)
 static int
 small_joiner_ram_size_bits(struct drm_i915_private *i915)
 {
-   if (DISPLAY_VER(i915) >= 11)
+   if (DISPLAY_VER(i915) >= 13)
+   return 17280 * 8;
+   else if (DISPLAY_VER(i915) >= 11)
return 7680 * 8;
else
return 6144 * 8;
-- 
2.32.0



[Intel-gfx] [PULL] drm-misc-next

2021-08-05 Thread Maarten Lankhorst
drm-misc-next-2021-08-05:
drm-misc-next for v5.15:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
- Assorted docbook updates.
- Unbreak damage selftests.
- Define DRM_FORMAT_MAX_PLANES, maximum planes for a planar format.
- Add gem fb vmap/vunmap helpers, use them in gud and vkms drivers.

Driver Changes:
- Bridge fixes for ti-sn65dsi86.
- Use a full-featured driver for ATNA33XC20 to get backlight right,
  instead of the simple panel driver.
- Assorted fixes to pl111,.
- Support E Ink VB3300-KCA panel.
- Add support for Gopher 2b LCD and ilitek ili9341 panels.
The following changes since commit 04d505de7f82c8f2daa6139b460b05dc01e354e0:

  Merge tag 'amd-drm-next-5.15-2021-07-29' of 
https://gitlab.freedesktop.org/agd5f/linux into drm-next (2021-07-30 16:48:35 
+1000)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-misc tags/drm-misc-next-2021-08-05

for you to fetch changes up to 5a04227326b04c15b015181772f5c853172fdb68:

  drm/panel: Add ilitek ili9341 panel driver (2021-08-05 11:09:23 +0200)


drm-misc-next for v5.15:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
- Assorted docbook updates.
- Unbreak damage selftests.
- Define DRM_FORMAT_MAX_PLANES, maximum planes for a planar format.
- Add gem fb vmap/vunmap helpers, use them in gud and vkms drivers.

Driver Changes:
- Bridge fixes for ti-sn65dsi86.
- Use a full-featured driver for ATNA33XC20 to get backlight right,
  instead of the simple panel driver.
- Assorted fixes to pl111,.
- Support E Ink VB3300-KCA panel.
- Add support for Gopher 2b LCD and ilitek ili9341 panels.


Alistair Francis (1):
  drm/panel: Add support for E Ink VB3300-KCA

Artjom Vejsel (2):
  dt-bindings: Add DT bindings for QiShenglong Gopher 2b panel
  drm/panel-simple: add Gopher 2b LCD panel

Cai Huoqing (2):
  drm/pl111: Remove unused including 
  drm: Fix typo in comments

Daniel Vetter (1):
  drm: Fix oops in damage self-tests by mocking damage property

Desmond Cheong Zhi Xi (1):
  drm: clean up unused kerneldoc in drm_lease.c

Dillon Min (2):
  dt-bindings: display: panel: Add ilitek ili9341 panel bindings
  drm/panel: Add ilitek ili9341 panel driver

Douglas Anderson (6):
  drm/dp: Don't zero PWMGEN_BIT_COUNT when driver_pwm_freq_hz not specified
  drm/bridge: ti-sn65dsi86: Fix power off sequence
  drm/bridge: ti-sn65dsi86: Add some 100 us delays
  Revert "drm/panel-simple: Add Samsung ATNA33XC20"
  Revert "drm/panel-simple: Support for delays between GPIO & regulator"
  drm/panel: atna33xc20: Introduce the Samsung ATNA33XC20 panel

Gregory Williams (1):
  DRM: ast: Fixed coding style issues of ast_mode.c

Simon Ser (2):
  drm/connector: add ref to drm_connector_get in iter docs
  drm: document drm_mode_get_property

Thomas Zimmermann (5):
  drm: Define DRM_FORMAT_MAX_PLANES
  drm/gem: Provide drm_gem_fb_{vmap,vunmap}()
  drm/gem: Clear mapping addresses for unused framebuffer planes
  drm/gud: Map framebuffer BOs with drm_gem_fb_vmap()
  drm/vkms: Map output framebuffer BOs with drm_gem_fb_vmap()

 .../bindings/display/panel/ilitek,ili9341.yaml |  78 ++
 .../bindings/display/panel/panel-simple.yaml   |   4 +
 .../devicetree/bindings/vendor-prefixes.yaml   |   2 +
 Documentation/gpu/drm-kms.rst  |   2 +
 drivers/gpu/drm/ast/ast_mode.c |  31 +-
 drivers/gpu/drm/bridge/ti-sn65dsi86.c  |  17 +-
 drivers/gpu/drm/drm_aperture.c |   2 +-
 drivers/gpu/drm/drm_atomic.c   |   2 +-
 drivers/gpu/drm/drm_atomic_helper.c|  10 +-
 drivers/gpu/drm/drm_atomic_uapi.c  |   6 +-
 drivers/gpu/drm/drm_auth.c |   2 +-
 drivers/gpu/drm/drm_bridge.c   |   2 +-
 drivers/gpu/drm/drm_bufs.c |   2 +-
 drivers/gpu/drm/drm_cache.c|   2 +-
 drivers/gpu/drm/drm_damage_helper.c|   2 +-
 drivers/gpu/drm/drm_dp_helper.c|  18 +-
 drivers/gpu/drm/drm_drv.c  |   4 +-
 drivers/gpu/drm/drm_dsc.c  |   2 +-
 drivers/gpu/drm/drm_edid.c |   4 +-
 drivers/gpu/drm/drm_fb_helper.c|   2 +-
 drivers/gpu/drm/drm_file.c |   6 +-
 drivers/gpu/drm/drm_format_helper.c|   2 +-
 drivers/gpu/drm/drm_framebuffer.c  |   2 +-
 drivers/gpu/drm/drm_gem.c  |   4 +-
 drivers/gpu/drm/drm_gem_atomic_helper.c|  39 +-
 drivers/gpu/drm/drm_gem_framebuffer_helper.c   |  95 ++-
 drivers/gpu/drm/drm_gem_shmem_helper.c |   2 +-
 drivers/gpu/drm/drm_gem_vram_helper.c  |   2 +-
 drivers/gpu/drm/drm_hdcp.c

[Intel-gfx] [PATCH v3 5/8] drm/i915: Use i915_gem_context_get_eb_vm in intel_context_set_gem

2021-08-05 Thread Daniel Vetter
Since

commit ccbc1b97948ab671335e950271e39766729736c3
Author: Jason Ekstrand 
Date:   Thu Jul 8 10:48:30 2021 -0500

drm/i915/gem: Don't allow changing the VM on running contexts (v4)

the gem_ctx->vm can't change anymore. Plus we always set the
intel_context->vm, so might as well use the helper we have for that.

This makes it very clear that we always overwrite intel_context->vm
for userspace contexts, since the default is gt->vm, which is
explicitly reserved for kernel context use. It would be good to split
things up a bit further and avoid any possibility for an accident
where we run kernel stuff in userspace vm or the other way round.

Signed-off-by: Daniel Vetter 
Cc: Jon Bloomfield 
Cc: Chris Wilson 
Cc: Maarten Lankhorst 
Cc: Joonas Lahtinen 
Cc: Daniel Vetter 
Cc: "Thomas Hellström" 
Cc: Matthew Auld 
Cc: Lionel Landwerlin 
Cc: Dave Airlie 
Cc: Jason Ekstrand 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index a80b06c98dba..fd24a1236682 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -784,16 +784,8 @@ static int intel_context_set_gem(struct intel_context *ce,
 
ce->ring_size = SZ_16K;
 
-   if (rcu_access_pointer(ctx->vm)) {
-   struct i915_address_space *vm;
-
-   rcu_read_lock();
-   vm = context_get_vm_rcu(ctx); /* hmm */
-   rcu_read_unlock();
-
-   i915_vm_put(ce->vm);
-   ce->vm = vm;
-   }
+   i915_vm_put(ce->vm);
+   ce->vm = i915_gem_context_get_eb_vm(ctx);
 
if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
intel_engine_has_timeslices(ce->engine) &&
-- 
2.32.0



[Intel-gfx] [PATCH v3 2/8] drm/i915: Rename i915_gem_context_get_vm_rcu to i915_gem_context_get_eb_vm

2021-08-05 Thread Daniel Vetter
The important part isn't so much that this does an rcu lookup - that's
more an implementation detail, which will also be removed.

The thing that makes this different from other functions is that it's
gettting you the vm that batchbuffers will run in for that gem
context, which is either a full ppgtt stored in gem->ctx, or the ggtt.

We'll make more use of this function later on.
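
In other words the semantics boil down to this (sketch, ignoring the
__rcu annotation and kref dance that later patches in this series
remove):

	/* the vm batchbuffers run in: the context's full ppgtt if it
	 * has one, the ggtt otherwise, with a reference for the caller
	 */
	vm = ctx->vm ?: &ctx->i915->ggtt.vm;
	return i915_vm_get(vm);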

Signed-off-by: Daniel Vetter 
Cc: Jon Bloomfield 
Cc: Chris Wilson 
Cc: Maarten Lankhorst 
Cc: Joonas Lahtinen 
Cc: Daniel Vetter 
Cc: "Thomas Hellström" 
Cc: Matthew Auld 
Cc: Lionel Landwerlin 
Cc: Dave Airlie 
Cc: Jason Ekstrand 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.h   | 2 +-
 drivers/gpu/drm/i915/gem/selftests/huge_pages.c   | 4 ++--
 drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 4 ++--
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 2 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  | 2 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 4 ++--
 drivers/gpu/drm/i915/selftests/i915_vma.c | 2 +-
 7 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index 18060536b0c2..da6e8b506d96 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -155,7 +155,7 @@ i915_gem_context_vm(struct i915_gem_context *ctx)
 }
 
 static inline struct i915_address_space *
-i915_gem_context_get_vm_rcu(struct i915_gem_context *ctx)
+i915_gem_context_get_eb_vm(struct i915_gem_context *ctx)
 {
struct i915_address_space *vm;
 
diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c 
b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index a094f3ce1a90..6c68fe26bb32 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -1456,7 +1456,7 @@ static int igt_tmpfs_fallback(void *arg)
struct i915_gem_context *ctx = arg;
struct drm_i915_private *i915 = ctx->i915;
struct vfsmount *gemfs = i915->mm.gemfs;
-   struct i915_address_space *vm = i915_gem_context_get_vm_rcu(ctx);
+   struct i915_address_space *vm = i915_gem_context_get_eb_vm(ctx);
struct drm_i915_gem_object *obj;
struct i915_vma *vma;
u32 *vaddr;
@@ -1512,7 +1512,7 @@ static int igt_shrink_thp(void *arg)
 {
struct i915_gem_context *ctx = arg;
struct drm_i915_private *i915 = ctx->i915;
-   struct i915_address_space *vm = i915_gem_context_get_vm_rcu(ctx);
+   struct i915_address_space *vm = i915_gem_context_get_eb_vm(ctx);
struct drm_i915_gem_object *obj;
struct i915_gem_engines_iter it;
struct intel_context *ce;
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index 8eb5050f8cb3..d436ce7fa25c 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -1528,7 +1528,7 @@ static int write_to_scratch(struct i915_gem_context *ctx,
 
intel_gt_chipset_flush(engine->gt);
 
-   vm = i915_gem_context_get_vm_rcu(ctx);
+   vm = i915_gem_context_get_eb_vm(ctx);
vma = i915_vma_instance(obj, vm, NULL);
if (IS_ERR(vma)) {
err = PTR_ERR(vma);
@@ -1607,7 +1607,7 @@ static int read_from_scratch(struct i915_gem_context *ctx,
if (GRAPHICS_VER(i915) >= 8) {
const u32 GPR0 = engine->mmio_base + 0x600;
 
-   vm = i915_gem_context_get_vm_rcu(ctx);
+   vm = i915_gem_context_get_eb_vm(ctx);
vma = i915_vma_instance(obj, vm, NULL);
if (IS_ERR(vma)) {
err = PTR_ERR(vma);
diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c 
b/drivers/gpu/drm/i915/gt/selftest_execlists.c
index f12ffe797639..b3863abc51f5 100644
--- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -3493,7 +3493,7 @@ static int smoke_submit(struct preempt_smoke *smoke,
if (batch) {
struct i915_address_space *vm;
 
-   vm = i915_gem_context_get_vm_rcu(ctx);
+   vm = i915_gem_context_get_eb_vm(ctx);
vma = i915_vma_instance(batch, vm, NULL);
i915_vm_put(vm);
if (IS_ERR(vma))
diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c 
b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index 08f011f893b2..6023c418ee8a 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -117,7 +117,7 @@ static struct i915_request *
 hang_create_request(struct hang *h, struct intel_engine_cs *engine)
 {
struct intel_gt *gt = h->gt;
-   struct i915_address_space *vm = i915_gem_context_get_vm_rcu(h->ctx);
+   struct i915_address_space *vm = i915_gem_context_get_eb_vm(h->ctx);
st

[Intel-gfx] [PATCH v3 1/8] drm/i915: Drop code to handle set-vm races from execbuf

2021-08-05 Thread Daniel Vetter
Changing the vm from a finalized gem ctx is no longer possible, which
means we don't have to check for that anymore.

I was pondering whether to keep the check as a WARN_ON, but things go
boom real bad real fast if the vm of a vma is wrong. Plus we'd need to
also get the ggtt vm for !full-ppgtt platforms. Ditching it all seemed
like a better idea.

References: ccbc1b97948a ("drm/i915/gem: Don't allow changing the VM on running 
contexts (v4)")
Signed-off-by: Daniel Vetter 
Cc: Jon Bloomfield 
Cc: Chris Wilson 
Cc: Maarten Lankhorst 
Cc: Joonas Lahtinen 
Cc: Daniel Vetter 
Cc: "Thomas Hellström" 
Cc: Matthew Auld 
Cc: Lionel Landwerlin 
Cc: Dave Airlie 
Cc: Jason Ekstrand 
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index e809aca00f72..905b1cbd22d5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -775,11 +775,7 @@ static int __eb_add_lut(struct i915_execbuffer *eb,
/* Check that the context hasn't been closed in the meantime */
err = -EINTR;
if (!mutex_lock_interruptible(&ctx->lut_mutex)) {
-   struct i915_address_space *vm = rcu_access_pointer(ctx->vm);
-
-   if (unlikely(vm && vma->vm != vm))
-   err = -EAGAIN; /* user racing with ctx set-vm */
-   else if (likely(!i915_gem_context_is_closed(ctx)))
+   if (likely(!i915_gem_context_is_closed(ctx)))
err = radix_tree_insert(&ctx->handles_vma, handle, vma);
else
err = -ENOENT;
-- 
2.32.0



[Intel-gfx] [PATCH v3 7/8] drm/i915: use xa_lock/unlock for fpriv->vm_xa lookups

2021-08-05 Thread Daniel Vetter
We don't need the absolute speed of rcu for this. And
i915_address_space in general doesn't need rcu protection anywhere else,
after we've made gem contexts and engines a lot more immutable.

Note that this semantically reverts

commit aabbe344dc3ca5f7d8263a02608ba6179e8a4499
Author: Chris Wilson 
Date:   Fri Aug 30 19:03:25 2019 +0100

drm/i915: Use RCU for unlocked vm_idr lookup

except we have the conversion from idr to xarray in between.

Signed-off-by: Daniel Vetter 
Cc: Jon Bloomfield 
Cc: Chris Wilson 
Cc: Maarten Lankhorst 
Cc: Joonas Lahtinen 
Cc: Daniel Vetter 
Cc: "Thomas Hellström" 
Cc: Matthew Auld 
Cc: Lionel Landwerlin 
Cc: Dave Airlie 
Cc: Jason Ekstrand 
---
 drivers/gpu/drm/i915/i915_drv.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1488d166d91c..df2d723c894a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1880,11 +1880,11 @@ i915_gem_vm_lookup(struct drm_i915_file_private 
*file_priv, u32 id)
 {
struct i915_address_space *vm;
 
-   rcu_read_lock();
+   xa_lock(&file_priv->vm_xa);
vm = xa_load(&file_priv->vm_xa, id);
if (vm && !kref_get_unless_zero(&vm->ref))
vm = NULL;
-   rcu_read_unlock();
+   xa_unlock(&file_priv->vm_xa);
 
return vm;
 }
-- 
2.32.0



[Intel-gfx] [PATCH v3 4/8] drm/i915: Add i915_gem_context_is_full_ppgtt

2021-08-05 Thread Daniel Vetter
And use it anywhere we have open-coded checks for ctx->vm that really
only check for full ppgtt.

Plus for paranoia add a GEM_BUG_ON that checks it's really only set
when we have full ppgtt, just in case. gem_context->vm is different
since it's NULL in ggtt mode, unlike intel_context->vm or gt->vm,
which is always set.

v2: 0day found a testcase that I missed.

Signed-off-by: Daniel Vetter 
Cc: Jon Bloomfield 
Cc: Chris Wilson 
Cc: Maarten Lankhorst 
Cc: Joonas Lahtinen 
Cc: Daniel Vetter 
Cc: "Thomas Hellström" 
Cc: Matthew Auld 
Cc: Lionel Landwerlin 
Cc: Dave Airlie 
Cc: Jason Ekstrand 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 2 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.h   | 7 +++
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c| 2 +-
 drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 6 +++---
 4 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 6263563e15d6..a80b06c98dba 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1581,7 +1581,7 @@ static int get_ppgtt(struct drm_i915_file_private 
*file_priv,
int err;
u32 id;
 
-   if (!rcu_access_pointer(ctx->vm))
+   if (!i915_gem_context_is_full_ppgtt(ctx))
return -ENODEV;
 
rcu_read_lock();
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index da6e8b506d96..37536a260e6e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -154,6 +154,13 @@ i915_gem_context_vm(struct i915_gem_context *ctx)
return rcu_dereference_protected(ctx->vm, lockdep_is_held(&ctx->mutex));
 }
 
+static inline bool i915_gem_context_is_full_ppgtt(struct i915_gem_context *ctx)
+{
+   GEM_BUG_ON(!!rcu_access_pointer(ctx->vm) != HAS_FULL_PPGTT(ctx->i915));
+
+   return !!rcu_access_pointer(ctx->vm);
+}
+
 static inline struct i915_address_space *
 i915_gem_context_get_eb_vm(struct i915_gem_context *ctx)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 905b1cbd22d5..40f08948f0b2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -749,7 +749,7 @@ static int eb_select_context(struct i915_execbuffer *eb)
return PTR_ERR(ctx);
 
eb->gem_context = ctx;
-   if (rcu_access_pointer(ctx->vm))
+   if (i915_gem_context_is_full_ppgtt(ctx))
eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
 
return 0;
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index d436ce7fa25c..0708b9cdeb9f 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -704,7 +704,7 @@ static int igt_ctx_exec(void *arg)
pr_err("Failed to fill dword %lu [%lu/%lu] with 
gpu (%s) [full-ppgtt? %s], err=%d\n",
   ndwords, dw, max_dwords(obj),
   engine->name,
-  yesno(!!rcu_access_pointer(ctx->vm)),
+  
yesno(i915_gem_context_is_full_ppgtt(ctx)),
   err);
intel_context_put(ce);
kernel_context_close(ctx);
@@ -838,7 +838,7 @@ static int igt_shared_ctx_exec(void *arg)
pr_err("Failed to fill dword %lu [%lu/%lu] with 
gpu (%s) [full-ppgtt? %s], err=%d\n",
   ndwords, dw, max_dwords(obj),
   engine->name,
-  yesno(!!rcu_access_pointer(ctx->vm)),
+  
yesno(i915_gem_context_is_full_ppgtt(ctx)),
   err);
intel_context_put(ce);
kernel_context_close(ctx);
@@ -1417,7 +1417,7 @@ static int igt_ctx_readonly(void *arg)
pr_err("Failed to fill dword %lu [%lu/%lu] with 
gpu (%s) [full-ppgtt? %s], err=%d\n",
   ndwords, dw, max_dwords(obj),
   ce->engine->name,
-  yesno(!!ctx_vm(ctx)),
+  
yesno(i915_gem_context_is_full_ppgtt(ctx)),
   err);
i915_gem_context_unlock_engines(ctx);
goto out_file;
-- 
2.32.0



[Intel-gfx] [PATCH v3 8/8] drm/i915: Stop rcu support for i915_address_space

2021-08-05 Thread Daniel Vetter
The full audit is quite a bit of work:

- i915_dpt has very simple lifetime (somehow we create a display pagetable vm
  per object, so it's _very_ simple, there's only ever a single vma in there),
  and uses i915_vm_close(), which internally does a i915_vm_put(). No rcu.

  Aside: wtf is i915_dpt doing in the intel_display.c garbage collector as a new
  feature, instead of added as a separate file with some clean-ish interface.

  Also, i915_dpt unfortunately re-introduces some coding patterns from
  pre-dma_resv_lock conversion times.

- i915_gem_proto_ctx is fully refcounted and no rcu, all protected by
  fpriv->proto_context_lock.

- i915_gem_context is itself rcu protected, and that might leak to anything it
  points at. Before

commit cf977e18610e66e48c31619e7e0cfa871be9eada
Author: Chris Wilson 
Date:   Wed Dec 2 11:21:40 2020 +

drm/i915/gem: Spring clean debugfs

  and

commit db80a1294c231b6ac725085f046bb2931e00c9db
Author: Chris Wilson 
Date:   Mon Jan 18 11:08:54 2021 +

drm/i915/gem: Remove per-client stats from debugfs/i915_gem_objects

  we had a bunch of debugfs files that relied on rcu protecting everything, but
  those are gone now. The main one was removed even earlier with

  There doesn't seem to be anything left that's actually protecting
  stuff now that the ctx->vm itself is invariant. See

commit ccbc1b97948ab671335e950271e39766729736c3
Author: Jason Ekstrand 
Date:   Thu Jul 8 10:48:30 2021 -0500

drm/i915/gem: Don't allow changing the VM on running contexts (v4)

  Note that we drop the vm refcount before the final release of the gem context
  refcount, so this is all very dangerous even without rcu. Note that aside from
  later on creating new engines (a defunct feature) and debug output we're never
  looked at gem_ctx->vm for anything functional, hence why this is ok.
  Fingers crossed.

  Preceding patches removed all vestiges of rcu use from gem_ctx->vm
  dereferencing to make it clear it's really not used.

  The gem_ctx->rcu protection was introduced in

commit a4e7ccdac38ec8335d9e4e2656c1a041c77feae1
Author: Chris Wilson 
Date:   Fri Oct 4 14:40:09 2019 +0100

drm/i915: Move context management under GEM

  The commit message is somewhat entertaining because it fails to
  mention this fact completely, and compensates that by an in-commit
  changelog entry that claims that ctx->vm is protected by ctx->mutex.
  Which was the case _before_ this commit, but no longer after it.

- intel_context holds a full reference. Unfortunately intel_context is also rcu
  protected and the reference to the ->vm is dropped before the
  rcu barrier - only the kfree is delayed. So again we need to check
  whether that leaks anywhere on the intel_context->vm. RCU is only
  used to protect intel_context sitting on the breadcrumb lists, which
  don't look at the vm anywhere, so we are fine.

  Nothing else relies on rcu protection of intel_context and hence is
  fully protected by the kref refcount alone, which protects
  intel_context->vm in turn.

  The breadcrumbs rcu usage was added in

commit c744d50363b714783bbc88d986cc16def13710f7
Author: Chris Wilson 
Date:   Thu Nov 26 14:04:06 2020 +

drm/i915/gt: Split the breadcrumb spinlock between global and 
contexts

  its parent commit added the intel_context rcu protection:

commit 14d1eaf08845c534963c83f754afe0cb14cb2512
Author: Chris Wilson 
Date:   Thu Nov 26 14:04:05 2020 +

drm/i915/gt: Protect context lifetime with RCU

  giving some credence to my claim that I've actually caught them all.

- drm_i915_gem_object's shares_resv_from pointer has a full refcount to the
  dma_resv, which is a sub-refcount that's released after the final
  i915_vm_put() has been called. Safe.

  Aside: Maybe we should have a struct dma_resv_shared which is just dma_resv +
  kref as a stand-alone thing. It's a pretty useful pattern which other drivers
  might want to copy.
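
  Something like this, roughly (sketch only, no such helper exists in
  dma-buf today; allocation would pair kref_init() with dma_resv_init()):

	struct dma_resv_shared {
		struct kref ref;
		struct dma_resv resv;
	};

	static void dma_resv_shared_release(struct kref *ref)
	{
		struct dma_resv_shared *s =
			container_of(ref, struct dma_resv_shared, ref);

		dma_resv_fini(&s->resv);
		kfree(s);
	}

	static void dma_resv_shared_put(struct dma_resv_shared *s)
	{
		kref_put(&s->ref, dma_resv_shared_release);
	}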

  For a bit more context see

commit 4d8151ae5329cf50781a02fd2298a909589a5bab
Author: Thomas Hellström 
Date:   Tue Jun 1 09:46:41 2021 +0200

drm/i915: Don't free shared locks while shared

- the fpriv->vm_xa was relying on rcu_read_lock for lookup, but that
  was updated in a prep patch too to just be a spinlock-protected
  lookup.

- intel_gt->vm is set at driver load in intel_gt_init() and released
  in intel_gt_driver_release(). There seems to be some issue that
  in some error paths this is called twice, but otherwise no rcu to be
  found anywhere. This was added in the below commit, which
  unfortunately doesn't explain why this complication exists.

commit e6ba76480299a0d77c51d846f7467b1673aad25b
Author: Chris Wilson 
Date:   Sat Dec 21 16:03:24 2019 +

drm/i915: Remove i915->kernel_context

[Intel-gfx] [PATCH v3 3/8] drm/i915: Use i915_gem_context_get_eb_vm in ctx_getparam

2021-08-05 Thread Daniel Vetter
Consolidates the "which is the vm my execbuf runs in" code a bit. We
do some get/put which isn't really required, but all the other users
want the refcounting, and I figured doing a function just for this
getparam to avoid 2 atomics is a bit much.

Signed-off-by: Daniel Vetter 
Cc: Jon Bloomfield 
Cc: Chris Wilson 
Cc: Maarten Lankhorst 
Cc: Joonas Lahtinen 
Cc: Daniel Vetter 
Cc: "Thomas Hellström" 
Cc: Matthew Auld 
Cc: Lionel Landwerlin 
Cc: Dave Airlie 
Cc: Jason Ekstrand 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index cff72679ad7c..6263563e15d6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -2124,6 +2124,7 @@ int i915_gem_context_getparam_ioctl(struct drm_device 
*dev, void *data,
struct drm_i915_file_private *file_priv = file->driver_priv;
struct drm_i915_gem_context_param *args = data;
struct i915_gem_context *ctx;
+   struct i915_address_space *vm;
int ret = 0;
 
ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
@@ -2133,12 +2134,10 @@ int i915_gem_context_getparam_ioctl(struct drm_device 
*dev, void *data,
switch (args->param) {
case I915_CONTEXT_PARAM_GTT_SIZE:
args->size = 0;
-   rcu_read_lock();
-   if (rcu_access_pointer(ctx->vm))
-   args->value = rcu_dereference(ctx->vm)->total;
-   else
-   args->value = to_i915(dev)->ggtt.vm.total;
-   rcu_read_unlock();
+   vm = i915_gem_context_get_eb_vm(ctx);
+   args->value = vm->total;
+   i915_vm_put(vm);
+
break;
 
case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE:
-- 
2.32.0



[Intel-gfx] [PATCH v3 6/8] drm/i915: Drop __rcu from gem_context->vm

2021-08-05 Thread Daniel Vetter
It's been invariant since

commit ccbc1b97948ab671335e950271e39766729736c3
Author: Jason Ekstrand 
Date:   Thu Jul 8 10:48:30 2021 -0500

drm/i915/gem: Don't allow changing the VM on running contexts (v4)

this just completes the deed. I've tried to split out prep work for
more careful review as much as possible, this is what's left:

- get_ppgtt gets simplified since we don't need to grab a temporary
  reference - we can rely on the temporary reference for the gem_ctx
  while we inspect the vm. The new vm_id still needs a full
  i915_vm_open ofc. This also removes the final caller of context_get_vm_rcu

- A pile of selftests can now just look at ctx->vm instead of
  rcu_dereference_protected( , true) or similar things.

- All callers of i915_gem_context_vm also disappear.

- I've changed the hugepage selftest to set scrub_64K without any
  locking, because when we inspect that setting we're also not taking
  any locks either. It works because it's a selftest that's careful
  (single threaded gives you nice ordering) and not a live driver
  where races can happen from anywhere.

These can only be split up further if we have some intermediate state
with a bunch more rcu_dereference_protected(ctx->vm, true), just to
shut up lockdep and sparse.

The conversion to __rcu happened in

commit a4e7ccdac38ec8335d9e4e2656c1a041c77feae1
Author: Chris Wilson 
Date:   Fri Oct 4 14:40:09 2019 +0100

drm/i915: Move context management under GEM

Note that we're not breaking the actual bugfix in there: The real
bugfix is pushing the i915_vm_relase onto a separate worker, to avoid
locking inversion issues. The rcu conversion was just thrown in for
entertainment value on top (no, a vm lookup isn't even close to anything
that's a hotpath where removing the single spinlock can be measured).

Signed-off-by: Daniel Vetter 
Cc: Jon Bloomfield 
Cc: Chris Wilson 
Cc: Maarten Lankhorst 
Cc: Joonas Lahtinen 
Cc: Daniel Vetter 
Cc: "Thomas Hellström" 
Cc: Matthew Auld 
Cc: Lionel Landwerlin 
Cc: Dave Airlie 
Cc: Jason Ekstrand 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 53 ++-
 drivers/gpu/drm/i915/gem/i915_gem_context.h   | 14 ++---
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  2 +-
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  4 +-
 .../drm/i915/gem/selftests/i915_gem_context.c | 24 -
 drivers/gpu/drm/i915/i915_trace.h |  2 +-
 drivers/gpu/drm/i915/selftests/i915_vma.c |  2 +-
 7 files changed, 21 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index fd24a1236682..2f3cc73d4710 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -735,44 +735,6 @@ static int set_proto_ctx_param(struct 
drm_i915_file_private *fpriv,
return ret;
 }
 
-static struct i915_address_space *
-context_get_vm_rcu(struct i915_gem_context *ctx)
-{
-   GEM_BUG_ON(!rcu_access_pointer(ctx->vm));
-
-   do {
-   struct i915_address_space *vm;
-
-   /*
-* We do not allow downgrading from full-ppgtt [to a shared
-* global gtt], so ctx->vm cannot become NULL.
-*/
-   vm = rcu_dereference(ctx->vm);
-   if (!kref_get_unless_zero(&vm->ref))
-   continue;
-
-   /*
-* This ppgtt may have be reallocated between
-* the read and the kref, and reassigned to a third
-* context. In order to avoid inadvertent sharing
-* of this ppgtt with that third context (and not
-* src), we have to confirm that we have the same
-* ppgtt after passing through the strong memory
-* barrier implied by a successful
-* kref_get_unless_zero().
-*
-* Once we have acquired the current ppgtt of ctx,
-* we no longer care if it is released from ctx, as
-* it cannot be reallocated elsewhere.
-*/
-
-   if (vm == rcu_access_pointer(ctx->vm))
-   return rcu_pointer_handoff(vm);
-
-   i915_vm_put(vm);
-   } while (1);
-}
-
 static int intel_context_set_gem(struct intel_context *ce,
 struct i915_gem_context *ctx,
 struct intel_sseu sseu)
@@ -1193,7 +1155,7 @@ static void context_close(struct i915_gem_context *ctx)
 
set_closed_name(ctx);
 
-   vm = i915_gem_context_vm(ctx);
+   vm = ctx->vm;
if (vm)
i915_vm_close(vm);
 
@@ -1350,7 +1312,7 @@ i915_gem_create_context(struct drm_i915_private *i915,
vm = &ppgtt->vm;
}
if (vm) {
-   RCU_INIT_POINTER(ctx->vm, i915_vm_open(vm));
+   ctx->vm = i915_vm_open(vm);
 
/* i915_vm_open() tak

[Intel-gfx] [PATCH v3 0/8] remove rcu support from i915_address_space

2021-08-05 Thread Daniel Vetter
Hi all,

My seemingly trivial but totally not cleanup patch at the end now leaks,
so clearly the fixup in v2 did improve things but I still don't understand
that. Anyway that was fairly orthogonal, so I dropped it for later.

v1: 
https://lore.kernel.org/dri-devel/20210802154806.3710472-1-daniel.vet...@ffwll.ch/
v2: 
https://lore.kernel.org/dri-devel/20210804142522.4113021-1-daniel.vet...@ffwll.ch/

Cheers, Daniel

Daniel Vetter (8):
  drm/i915: Drop code to handle set-vm races from execbuf
  drm/i915: Rename i915_gem_context_get_vm_rcu to
i915_gem_context_get_eb_vm
  drm/i915: Use i915_gem_context_get_eb_vm in ctx_getparam
  drm/i915: Add i915_gem_context_is_full_ppgtt
  drm/i915: Use i915_gem_context_get_eb_vm in intel_context_set_gem
  drm/i915: Drop __rcu from gem_context->vm
  drm/i915: use xa_lock/unlock for fpriv->vm_xa lookups
  drm/i915: Stop rcu support for i915_address_space

 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 78 ---
 drivers/gpu/drm/i915/gem/i915_gem_context.h   | 13 ++--
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  2 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  8 +-
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  8 +-
 .../drm/i915/gem/selftests/i915_gem_context.c | 34 
 drivers/gpu/drm/i915/gt/intel_ggtt.c  |  1 -
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  6 +-
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_execlists.c  |  2 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |  2 +-
 drivers/gpu/drm/i915/i915_drv.h   |  4 +-
 drivers/gpu/drm/i915/i915_trace.h |  2 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  4 +-
 drivers/gpu/drm/i915/selftests/i915_vma.c |  4 +-
 15 files changed, 52 insertions(+), 118 deletions(-)

-- 
2.32.0



[Intel-gfx] [PATCH v5 00/20] drm/sched dependency handling and implicit sync fixes

2021-08-05 Thread Daniel Vetter
Hi all,

Two big changes:
- bikeshed repainted in new paint, pls don't touch, it's all fresh! The
  functions are now called _add_dependency and _add_implicit_dependencies.

- msm conversion, which includes a bugfix for the msm drm/sched
  conversion. I think it would be really good if the first two patches
  could land asap, but that means testing by some of the other drivers.
  Etnaviv especially is pending some testing/reviewed-by.

In general please review and test.

Thanks, Daniel

Daniel Vetter (20):
  drm/sched: Split drm_sched_job_init
  drm/msm: Fix drm/sched point of no return rules
  drm/sched: Barriers are needed for entity->last_scheduled
  drm/sched: Add dependency tracking
  drm/sched: drop entity parameter from drm_sched_push_job
  drm/sched: improve docs around drm_sched_entity
  drm/panfrost: use scheduler dependency tracking
  drm/lima: use scheduler dependency tracking
  drm/v3d: Move drm_sched_job_init to v3d_job_init
  drm/v3d: Use scheduler dependency handling
  drm/etnaviv: Use scheduler dependency handling
  drm/msm: Use scheduler dependency handling
  drm/gem: Delete gem array fencing helpers
  drm/sched: Don't store self-dependencies
  drm/sched: Check locking in drm_sched_job_await_implicit
  drm/msm: Don't break exclusive fence ordering
  drm/etnaviv: Don't break exclusive fence ordering
  drm/i915: delete exclude argument from i915_sw_fence_await_reservation
  drm/i915: Don't break exclusive fence ordering
  dma-resv: Give the docs a do-over

 Documentation/gpu/drm-mm.rst  |   3 +
 drivers/dma-buf/dma-resv.c|  24 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c|   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c   |   4 +-
 drivers/gpu/drm/drm_gem.c |  96 -
 drivers/gpu/drm/etnaviv/etnaviv_gem.h |   5 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c  |  66 +++---
 drivers/gpu/drm/etnaviv/etnaviv_sched.c   |  65 +-
 drivers/gpu/drm/etnaviv/etnaviv_sched.h   |   3 +-
 drivers/gpu/drm/i915/display/intel_display.c  |   4 +-
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c   |   2 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|   6 +-
 drivers/gpu/drm/i915/i915_sw_fence.c  |   6 +-
 drivers/gpu/drm/i915/i915_sw_fence.h  |   1 -
 drivers/gpu/drm/lima/lima_gem.c   |   9 +-
 drivers/gpu/drm/lima/lima_sched.c |  28 +--
 drivers/gpu/drm/lima/lima_sched.h |   6 +-
 drivers/gpu/drm/msm/msm_gem.h |   5 -
 drivers/gpu/drm/msm/msm_gem_submit.c  |  36 ++--
 drivers/gpu/drm/msm/msm_ringbuffer.c  |  12 --
 drivers/gpu/drm/panfrost/panfrost_drv.c   |  16 +-
 drivers/gpu/drm/panfrost/panfrost_job.c   |  40 +---
 drivers/gpu/drm/panfrost/panfrost_job.h   |   5 +-
 drivers/gpu/drm/scheduler/sched_entity.c  | 140 +++--
 drivers/gpu/drm/scheduler/sched_fence.c   |  19 +-
 drivers/gpu/drm/scheduler/sched_main.c| 182 -
 drivers/gpu/drm/v3d/v3d_drv.h |   6 +-
 drivers/gpu/drm/v3d/v3d_gem.c | 114 +--
 drivers/gpu/drm/v3d/v3d_sched.c   |  44 +---
 include/drm/drm_gem.h |   5 -
 include/drm/gpu_scheduler.h   | 188 +++---
 include/linux/dma-buf.h   |   7 +
 include/linux/dma-resv.h  | 104 +-
 33 files changed, 693 insertions(+), 562 deletions(-)

-- 
2.32.0



[Intel-gfx] [PATCH v5 02/20] drm/msm: Fix drm/sched point of no return rules

2021-08-05 Thread Daniel Vetter
Originally drm_sched_job_init was the point of no return, after which
drivers must submit a job. I've split that up, which allows us to fix
this issue pretty easily.

Only thing we have to take care of is to not skip to error paths after
that. Other drivers do this the same for out-fence and similar things.

Fixes: 1d8a5ca436ee ("drm/msm: Conversion to drm scheduler")
Cc: Rob Clark 
Cc: Rob Clark 
Cc: Sean Paul 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-arm-...@vger.kernel.org
Cc: dri-de...@lists.freedesktop.org
Cc: freedr...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Signed-off-by: Daniel Vetter 
---
 drivers/gpu/drm/msm/msm_gem_submit.c | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index 6d6c44f0e1f3..d0ed4ddc509e 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -52,9 +52,6 @@ static struct msm_gem_submit *submit_create(struct drm_device 
*dev,
return ERR_PTR(ret);
}
 
-   /* FIXME: this is way too early */
-   drm_sched_job_arm(&job->base);
-
xa_init_flags(&submit->deps, XA_FLAGS_ALLOC);
 
kref_init(&submit->ref);
@@ -883,6 +880,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
 
submit->user_fence = dma_fence_get(&submit->base.s_fence->finished);
 
+   /* point of no return, we _have_ to submit no matter what */
+   drm_sched_job_arm(&submit->base);
+
/*
 * Allocate an id which can be used by WAIT_FENCE ioctl to map back
 * to the underlying fence.
@@ -892,17 +892,16 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void 
*data,
if (submit->fence_id < 0) {
ret = submit->fence_id = 0;
submit->fence_id = 0;
-   goto out;
}
 
-   if (args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
+   if (ret == 0 && args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
struct sync_file *sync_file = 
sync_file_create(submit->user_fence);
if (!sync_file) {
ret = -ENOMEM;
-   goto out;
+   } else {
+   fd_install(out_fence_fd, sync_file->file);
+   args->fence_fd = out_fence_fd;
}
-   fd_install(out_fence_fd, sync_file->file);
-   args->fence_fd = out_fence_fd;
}
 
submit_attach_object_fences(submit);
-- 
2.32.0



[Intel-gfx] [PATCH v5 01/20] drm/sched: Split drm_sched_job_init

2021-08-05 Thread Daniel Vetter
This is a very confusingly named function, because not only does it
init an object, it also arms it and provides the point of no return for
pushing a job into the scheduler. It would be nice if that were a bit
clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.
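
The resulting driver flow, as the rest of this series uses it (minimal
sketch, error unwinding elided):

	ret = drm_sched_job_init(&job->base, entity, owner);
	if (ret)
		return ret;		/* bailing out is still fine here */

	/* fallible setup: look up BOs, add dependencies, ... */

	drm_sched_job_arm(&job->base);	/* point of no return */
	fence = dma_fence_get(&job->base.s_fence->finished);
	drm_sched_entity_push_job(&job->base, entity);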

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

v3: Emma noticed that I leak the memory allocated in
drm_sched_job_init if we bail out before the point of no return in
subsequent driver patches. To be able to fix this, change
drm_sched_job_cleanup() so it can handle being called both before and
after drm_sched_job_arm().

Also improve the kerneldoc for this.

v4:
- Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
  usual (Melissa)

- Christian pointed out that drm_sched_entity_select_rq() also needs
  to be moved into drm_sched_job_arm, which made me realize that the
  job->id definitely needs to be moved too.

  Shuffle things to fit between job_init and job_arm.

v5:
Reshuffle the split between init/arm once more, amdgpu abuses
drm_sched.ready to signal gpu reset failures. Also document this
somewhat. (Christian)

v6:
Rebase on top of the msm drm/sched support. Note that the
drm_sched_job_init() call is completely misplaced, and hence also the
split-out drm_sched_entity_push_job(). I've put in a FIXME which the next
patch will address.

Acked-by: Melissa Wen 
Cc: Melissa Wen 
Acked-by: Emma Anholt 
Acked-by: Steven Price  (v2)
Reviewed-by: Boris Brezillon  (v5)
Signed-off-by: Daniel Vetter 
Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Qiang Yu 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Masahiro Yamada 
Cc: Kees Cook 
Cc: Adam Borowski 
Cc: Nick Terrell 
Cc: Mauro Carvalho Chehab 
Cc: Paul Menzel 
Cc: Sami Tolvanen 
Cc: Viresh Kumar 
Cc: Alex Deucher 
Cc: Dave Airlie 
Cc: Nirmoy Das 
Cc: Deepak R Varma 
Cc: Lee Jones 
Cc: Kevin Wang 
Cc: Chen Li 
Cc: Luben Tuikov 
Cc: "Marek Olšák" 
Cc: Dennis Li 
Cc: Maarten Lankhorst 
Cc: Andrey Grodzovsky 
Cc: Sonny Jiang 
Cc: Boris Brezillon 
Cc: Tian Tao 
Cc: etna...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: Emma Anholt 
Cc: Rob Clark 
Cc: Sean Paul 
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +
 drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 +
 drivers/gpu/drm/lima/lima_sched.c|  2 +
 drivers/gpu/drm/msm/msm_gem_submit.c |  3 ++
 drivers/gpu/drm/panfrost/panfrost_job.c  |  2 +
 drivers/gpu/drm/scheduler/sched_entity.c |  6 +--
 drivers/gpu/drm/scheduler/sched_fence.c  | 19 ---
 drivers/gpu/drm/scheduler/sched_main.c   | 69 
 drivers/gpu/drm/v3d/v3d_gem.c|  2 +
 include/drm/gpu_scheduler.h  |  7 ++-
 11 files changed, 94 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 139cd3bf1ad6..32e80bc6af22 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
if (r)
goto error_unlock;
 
+   drm_sched_job_arm(&job->base);
+
/* No memory allocation is allowed while holding the notifier lock.
 * The lock is held until amdgpu_cs_submit is finished and fence is
 * added to BOs.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index d33e6d97cc89..5ddb955d2315 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
if (r)
return r;
 
+   drm_sched_job_arm(&job->base);
+
*f = dma_fence_get(&job->base.s_fence->finished);
amdgpu_job_free_resources(job);
drm_sched_entity_push_job(&job->base, entity);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index feb6da1b6ceb..05f412204118 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity 
*sched_entity,
if (ret)
goto out_unlock;
 
+   drm_sched_job_arm(&submit->sched_job);
+
submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
submit->out_fence, 0,
diff --git a/drivers/gpu/drm/lima/li

[Intel-gfx] [PATCH v5 04/20] drm/sched: Add dependency tracking

2021-08-05 Thread Daniel Vetter
Instead of just a callback, we can glue in the gem helpers that
panfrost, v3d and lima currently use. There really aren't that many
ways to skin this cat.

v2/3: Rebased.

v4: Repaint this shed. The functions are now called _add_dependency()
and _add_implicit_dependency()
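
Typical driver usage, as the conversions later in this series do it
(sketch only):

	/* explicit in-fence, e.g. from a syncobj; the call consumes the
	 * fence reference in both the success and the error case
	 */
	ret = drm_sched_job_add_dependency(&job->base, fence);

	/* implicit sync from a BO's dma_resv, reservation held by caller */
	ret = drm_sched_job_add_implicit_dependencies(&job->base, obj, write);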

Reviewed-by: Boris Brezillon  (v3)
Reviewed-by: Steven Price  (v1)
Acked-by: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Andrey Grodzovsky 
Cc: Lee Jones 
Cc: Nirmoy Das 
Cc: Boris Brezillon 
Cc: Luben Tuikov 
Cc: Alex Deucher 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/scheduler/sched_entity.c |  18 +++-
 drivers/gpu/drm/scheduler/sched_main.c   | 104 +++
 include/drm/gpu_scheduler.h  |  33 ++-
 3 files changed, 149 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 89e3f6eaf519..381fbf462ea7 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -211,6 +211,19 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence 
*f,
job->sched->ops->free_job(job);
 }
 
+static struct dma_fence *
+drm_sched_job_dependency(struct drm_sched_job *job,
+struct drm_sched_entity *entity)
+{
+   if (!xa_empty(&job->dependencies))
+   return xa_erase(&job->dependencies, job->last_dependency++);
+
+   if (job->sched->ops->dependency)
+   return job->sched->ops->dependency(job, entity);
+
+   return NULL;
+}
+
 /**
  * drm_sched_entity_kill_jobs - Make sure all remaining jobs are killed
  *
@@ -229,7 +242,7 @@ static void drm_sched_entity_kill_jobs(struct 
drm_sched_entity *entity)
struct drm_sched_fence *s_fence = job->s_fence;
 
/* Wait for all dependencies to avoid data corruptions */
-   while ((f = job->sched->ops->dependency(job, entity)))
+   while ((f = drm_sched_job_dependency(job, entity)))
dma_fence_wait(f, false);
 
drm_sched_fence_scheduled(s_fence);
@@ -419,7 +432,6 @@ static bool drm_sched_entity_add_dependency_cb(struct 
drm_sched_entity *entity)
  */
 struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
 {
-   struct drm_gpu_scheduler *sched = entity->rq->sched;
struct drm_sched_job *sched_job;
 
sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
@@ -427,7 +439,7 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
drm_sched_entity *entity)
return NULL;
 
while ((entity->dependency =
-   sched->ops->dependency(sched_job, entity))) {
+   drm_sched_job_dependency(sched_job, entity))) {
trace_drm_sched_job_wait_dep(sched_job, entity->dependency);
 
if (drm_sched_entity_add_dependency_cb(entity))
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 454cb6164bdc..f77456929139 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -603,6 +603,8 @@ int drm_sched_job_init(struct drm_sched_job *job,
 
INIT_LIST_HEAD(&job->list);
 
+   xa_init_flags(&job->dependencies, XA_FLAGS_ALLOC);
+
return 0;
 }
 EXPORT_SYMBOL(drm_sched_job_init);
@@ -637,6 +639,99 @@ void drm_sched_job_arm(struct drm_sched_job *job)
 }
 EXPORT_SYMBOL(drm_sched_job_arm);
 
+/**
+ * drm_sched_job_add_dependency - adds the fence as a job dependency
+ * @job: scheduler job to add the dependencies to
+ * @fence: the dma_fence to add to the list of dependencies.
+ *
+ * Note that @fence is consumed in both the success and error cases.
+ *
+ * Returns:
+ * 0 on success, or an error on failing to expand the array.
+ */
+int drm_sched_job_add_dependency(struct drm_sched_job *job,
+struct dma_fence *fence)
+{
+   struct dma_fence *entry;
+   unsigned long index;
+   u32 id = 0;
+   int ret;
+
+   if (!fence)
+   return 0;
+
+   /* Deduplicate if we already depend on a fence from the same context.
+* This lets the size of the array of deps scale with the number of
+* engines involved, rather than the number of BOs.
+*/
+   xa_for_each(&job->dependencies, index, entry) {
+   if (entry->context != fence->context)
+   continue;
+
+   if (dma_fence_is_later(fence, entry)) {
+   dma_fence_put(entry);
+   xa_store(&job->dependencies, index, fence, GFP_KERNEL);
+   } else {
+   dma_fence_put(fence);
+   }
+   return 0;
+   }
+
+   ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, 
GFP_KERNEL);
+   if (ret != 0)
+ 

[Intel-gfx] [PATCH v5 03/20] drm/sched: Barriers are needed for entity->last_scheduled

2021-08-05 Thread Daniel Vetter
It might be good enough on x86 with just READ_ONCE, but the write side
should then at least be WRITE_ONCE because x86 has total store order.

It's definitely not enough on arm.

Fix this properly, which means:
- explain the need for the barrier in both places
- point at the other side in each comment

Also pull out the !sched_list case as the first check, so that the
code flow is clearer.

While at it sprinkle some comments around because it was very
non-obvious to me what's actually going on here and why.

Note that we really need full barriers here: at first I thought
store-release and load-acquire on ->last_scheduled would be enough,
but we actually require ordering between that and the queue state.
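
The pairing, condensed from the two sides in the diff:

	/* scheduler thread, drm_sched_entity_pop_job(): */
	entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished);
	smp_wmb();		/* publish before the queue becomes empty */
	spsc_queue_pop(&entity->job_queue);

	/* submit side, drm_sched_entity_select_rq(): */
	if (spsc_queue_count(&entity->job_queue))
		return;		/* queue non-empty, stay on this engine */
	smp_rmb();		/* pairs with the smp_wmb() above */
	fence = entity->last_scheduled;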

v2: Put smp_rmb() in the right place and fix up comment (Andrey)

Acked-by: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Steven Price 
Cc: Daniel Vetter 
Cc: Andrey Grodzovsky 
Cc: Lee Jones 
Cc: Boris Brezillon 
---
 drivers/gpu/drm/scheduler/sched_entity.c | 27 ++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index f7347c284886..89e3f6eaf519 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -439,8 +439,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
drm_sched_entity *entity)
dma_fence_set_error(&sched_job->s_fence->finished, -ECANCELED);
 
dma_fence_put(entity->last_scheduled);
+
entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished);
 
+   /*
+* If the queue is empty we allow drm_sched_entity_select_rq() to
+* locklessly access ->last_scheduled. This only works if we set the
+* pointer before we dequeue and if we have a write barrier here.
+*/
+   smp_wmb();
+
spsc_queue_pop(&entity->job_queue);
return sched_job;
 }
@@ -459,10 +467,25 @@ void drm_sched_entity_select_rq(struct drm_sched_entity 
*entity)
struct drm_gpu_scheduler *sched;
struct drm_sched_rq *rq;
 
-   if (spsc_queue_count(&entity->job_queue) || !entity->sched_list)
+   /* single possible engine and already selected */
+   if (!entity->sched_list)
+   return;
+
+   /* queue non-empty, stay on the same engine */
+   if (spsc_queue_count(&entity->job_queue))
return;
 
-   fence = READ_ONCE(entity->last_scheduled);
+   /*
+* Only when the queue is empty are we guaranteed that the scheduler
+* thread cannot change ->last_scheduled. To enforce ordering we need
+* a read barrier here. See drm_sched_entity_pop_job() for the other
+* side.
+*/
+   smp_rmb();
+
+   fence = entity->last_scheduled;
+
+   /* stay on the same engine if the previous job hasn't finished */
if (fence && !dma_fence_is_signaled(fence))
return;
 
-- 
2.32.0



[Intel-gfx] [PATCH v5 05/20] drm/sched: drop entity parameter from drm_sched_push_job

2021-08-05 Thread Daniel Vetter
Originally a job was only bound to the queue when we pushed it, but
now that's done in drm_sched_job_init, making that parameter entirely
redundant.

Remove it.

The same applies to the context parameter in
lima_sched_context_queue_task, simplify that too.

v2:
Rebase on top of msm adopting drm/sched

Acked-by: Emma Anholt 
Acked-by: Melissa Wen 
Reviewed-by: Steven Price  (v1)
Reviewed-by: Boris Brezillon  (v1)
Signed-off-by: Daniel Vetter 
Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Qiang Yu 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: Emma Anholt 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Alex Deucher 
Cc: Nirmoy Das 
Cc: Dave Airlie 
Cc: Chen Li 
Cc: Lee Jones 
Cc: Deepak R Varma 
Cc: Kevin Wang 
Cc: Luben Tuikov 
Cc: "Marek Olšák" 
Cc: Maarten Lankhorst 
Cc: Andrey Grodzovsky 
Cc: Dennis Li 
Cc: Boris Brezillon 
Cc: etna...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: Rob Clark 
Cc: Sean Paul 
Cc: Melissa Wen 
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  | 2 +-
 drivers/gpu/drm/etnaviv/etnaviv_sched.c  | 2 +-
 drivers/gpu/drm/lima/lima_gem.c  | 3 +--
 drivers/gpu/drm/lima/lima_sched.c| 5 ++---
 drivers/gpu/drm/lima/lima_sched.h| 3 +--
 drivers/gpu/drm/msm/msm_gem_submit.c | 2 +-
 drivers/gpu/drm/panfrost/panfrost_job.c  | 2 +-
 drivers/gpu/drm/scheduler/sched_entity.c | 6 ++
 drivers/gpu/drm/v3d/v3d_gem.c| 2 +-
 include/drm/gpu_scheduler.h  | 3 +--
 11 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 32e80bc6af22..1d8a914108af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1267,7 +1267,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
 
trace_amdgpu_cs_ioctl(job);
amdgpu_vm_bo_trace_cs(&fpriv->vm, &p->ticket);
-   drm_sched_entity_push_job(&job->base, entity);
+   drm_sched_entity_push_job(&job->base);
 
amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 5ddb955d2315..b86099c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -174,7 +174,7 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
 
*f = dma_fence_get(&job->base.s_fence->finished);
amdgpu_job_free_resources(job);
-   drm_sched_entity_push_job(&job->base, entity);
+   drm_sched_entity_push_job(&job->base);
 
return 0;
 }
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 05f412204118..180bb633d5c5 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -178,7 +178,7 @@ int etnaviv_sched_push_job(struct drm_sched_entity 
*sched_entity,
/* the scheduler holds on to the job now */
kref_get(&submit->refcount);
 
-   drm_sched_entity_push_job(&submit->sched_job, sched_entity);
+   drm_sched_entity_push_job(&submit->sched_job);
 
 out_unlock:
mutex_unlock(&submit->gpu->fence_lock);
diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index de62966243cd..c528f40981bb 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -359,8 +359,7 @@ int lima_gem_submit(struct drm_file *file, struct 
lima_submit *submit)
goto err_out2;
}
 
-   fence = lima_sched_context_queue_task(
-   submit->ctx->context + submit->pipe, submit->task);
+   fence = lima_sched_context_queue_task(submit->task);
 
for (i = 0; i < submit->nr_bos; i++) {
if (submit->bos[i].flags & LIMA_SUBMIT_BO_WRITE)
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index 38f755580507..e968b5a8f0b0 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -177,13 +177,12 @@ void lima_sched_context_fini(struct lima_sched_pipe *pipe,
drm_sched_entity_fini(&context->base);
 }
 
-struct dma_fence *lima_sched_context_queue_task(struct lima_sched_context 
*context,
-   struct lima_sched_task *task)
+struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
 {
struct dma_fence *fence = dma_fence_get(&task->base.s_fence->finished);
 
trace_lima_task_submit(task);
-   drm_sched_entity_push_job(&task->base, &context->base);
+   drm_sched_entity_push_job(&task->base);
return fence;
 }
 
diff --git a/drivers/gpu/drm/lima/lima_sched

[Intel-gfx] [PATCH v5 06/20] drm/sched: improve docs around drm_sched_entity

2021-08-05 Thread Daniel Vetter
I found a few too many things that are tricky and not documented, so I
started typing.

I found a few more things that looked broken while typing, see the
varios FIXME in drm_sched_entity.

Also some of the usual cleanups:
- actually include sched_entity.c declarations, that was lost in the
  move here: 620e762f9a98 ("drm/scheduler: move entity handling into
  separate file")

- Ditch the kerneldoc for internal functions, keep the comments where
  they're describing more than what the function name already implies.

- Switch drm_sched_entity to inline docs.

Acked-by: Melissa Wen 
Reviewed-by: Boris Brezillon  (v1)
Signed-off-by: Daniel Vetter 
Cc: Lucas Stach 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: "Christian König" 
Cc: Boris Brezillon 
Cc: Steven Price 
Cc: Emma Anholt 
Cc: Lee Jones 
Cc: Andrey Grodzovsky 
---
 Documentation/gpu/drm-mm.rst |   3 +
 drivers/gpu/drm/scheduler/sched_entity.c |  85 -
 include/drm/gpu_scheduler.h  | 145 ++-
 3 files changed, 146 insertions(+), 87 deletions(-)

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index d5a73fa2c9ef..0198fa43d254 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -504,3 +504,6 @@ Scheduler Function References
 
 .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
:export:
+
+.. kernel-doc:: drivers/gpu/drm/scheduler/sched_entity.c
+   :export:
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index e4d33db1eb45..27e1573af96e 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -45,8 +45,14 @@
  * @guilty: atomic_t set to 1 when a job on this queue
  *  is found to be guilty causing a timeout
  *
- * Note: the sched_list should have at least one element to schedule
- *   the entity
+ * Note that the &sched_list must have at least one element to schedule the 
entity.
+ *
+ * For changing @priority later on at runtime see
+ * drm_sched_entity_set_priority(). For changing the set of schedulers
+ * @sched_list at runtime see drm_sched_entity_modify_sched().
+ *
+ * An entity is cleaned up by calling drm_sched_entity_fini(). See also
+ * drm_sched_entity_destroy().
  *
  * Returns 0 on success or a negative error code on failure.
  */
@@ -92,6 +98,11 @@ EXPORT_SYMBOL(drm_sched_entity_init);
  * @sched_list: the list of new drm scheds which will replace
  *  existing entity->sched_list
  * @num_sched_list: number of drm sched in sched_list
+ *
+ * Note that this must be called under the same common lock for @entity as
+ * drm_sched_job_arm() and drm_sched_entity_push_job(), or the driver needs to
+ * guarantee through some other means that this is never called while new jobs
+ * can be pushed to @entity.
  */
 void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
struct drm_gpu_scheduler **sched_list,
@@ -104,13 +115,6 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity 
*entity,
 }
 EXPORT_SYMBOL(drm_sched_entity_modify_sched);
 
-/**
- * drm_sched_entity_is_idle - Check if entity is idle
- *
- * @entity: scheduler entity
- *
- * Returns true if the entity does not have any unscheduled jobs.
- */
 static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
 {
rmb(); /* for list_empty to work without lock */
@@ -123,13 +127,7 @@ static bool drm_sched_entity_is_idle(struct 
drm_sched_entity *entity)
return false;
 }
 
-/**
- * drm_sched_entity_is_ready - Check if entity is ready
- *
- * @entity: scheduler entity
- *
- * Return true if entity could provide a job.
- */
+/* Return true if entity could provide a job. */
 bool drm_sched_entity_is_ready(struct drm_sched_entity *entity)
 {
if (spsc_queue_peek(&entity->job_queue) == NULL)
@@ -192,14 +190,7 @@ long drm_sched_entity_flush(struct drm_sched_entity 
*entity, long timeout)
 }
 EXPORT_SYMBOL(drm_sched_entity_flush);
 
-/**
- * drm_sched_entity_kill_jobs_cb - helper for drm_sched_entity_kill_jobs
- *
- * @f: signaled fence
- * @cb: our callback structure
- *
- * Signal the scheduler finished fence when the entity in question is killed.
- */
+/* Signal the scheduler finished fence when the entity in question is killed. 
*/
 static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
  struct dma_fence_cb *cb)
 {
@@ -224,14 +215,6 @@ drm_sched_job_dependency(struct drm_sched_job *job,
return NULL;
 }
 
-/**
- * drm_sched_entity_kill_jobs - Make sure all remaining jobs are killed
- *
- * @entity: entity which is cleaned up
- *
- * Makes sure that all remaining jobs in an entity are killed before it is
- * destroyed.
- */
 static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity)
 {
struct drm_sched_job *job;
@@ -273,9 +256,11 @@ static void drm

[Intel-gfx] [PATCH v5 07/20] drm/panfrost: use scheduler dependency tracking

2021-08-05 Thread Daniel Vetter
Just deletes some code that's now more shared.

Note that thanks to the split into drm_sched_job_init/arm we can now
easily pull the _init() part from under the submission lock way ahead
where we're adding the sync file in-fences as dependencies.
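
Condensed, the new submit flow in panfrost_ioctl_submit() then looks
like this (see the full diff below):

	ret = drm_sched_job_init(&job->base,
				 &job->file_priv->sched_entity[slot],
				 NULL);		/* early, no lock held */
	ret = panfrost_copy_in_sync(dev, file, args, job);
			/* in-fences become job dependencies */
	...
	/* only drm_sched_job_arm() + push remain under pfdev->sched_lock */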

v2: Correctly clean up the partially set up job, now that job_init()
and job_arm() are apart (Emma).

v3: Rebased over renamed functions for adding dependencies

Acked-by: Emma Anholt 
Reviewed-by: Steven Price  (v3)
Signed-off-by: Daniel Vetter 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: Emma Anholt 
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 16 ---
 drivers/gpu/drm/panfrost/panfrost_job.c | 38 -
 drivers/gpu/drm/panfrost/panfrost_job.h |  5 +---
 3 files changed, 18 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 1ffaef5ec5ff..16212b6b202e 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -218,7 +218,7 @@ panfrost_copy_in_sync(struct drm_device *dev,
if (ret)
goto fail;
 
-   ret = drm_gem_fence_array_add(&job->deps, fence);
+   ret = drm_sched_job_add_dependency(&job->base, fence);
 
if (ret)
goto fail;
@@ -236,7 +236,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
void *data,
struct drm_panfrost_submit *args = data;
struct drm_syncobj *sync_out = NULL;
struct panfrost_job *job;
-   int ret = 0;
+   int ret = 0, slot;
 
if (!args->jc)
return -EINVAL;
@@ -258,14 +258,20 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
void *data,
 
kref_init(&job->refcount);
 
-   xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
-
job->pfdev = pfdev;
job->jc = args->jc;
job->requirements = args->requirements;
job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev);
job->file_priv = file->driver_priv;
 
+   slot = panfrost_job_get_slot(job);
+
+   ret = drm_sched_job_init(&job->base,
+&job->file_priv->sched_entity[slot],
+NULL);
+   if (ret)
+   goto fail_job_put;
+
ret = panfrost_copy_in_sync(dev, file, args, job);
if (ret)
goto fail_job;
@@ -283,6 +289,8 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
void *data,
drm_syncobj_replace_fence(sync_out, job->render_done_fence);
 
 fail_job:
+   drm_sched_job_cleanup(&job->base);
+fail_job_put:
panfrost_job_put(job);
 fail_out_sync:
if (sync_out)
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index 4bc962763e1f..a98f507dc779 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -102,7 +102,7 @@ static struct dma_fence *panfrost_fence_create(struct 
panfrost_device *pfdev, in
return &fence->base;
 }
 
-static int panfrost_job_get_slot(struct panfrost_job *job)
+int panfrost_job_get_slot(struct panfrost_job *job)
 {
/* JS0: fragment jobs.
 * JS1: vertex/tiler jobs
@@ -242,13 +242,14 @@ static void panfrost_job_hw_submit(struct panfrost_job 
*job, int js)
 
 static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
  int bo_count,
- struct xarray *deps)
+ struct drm_sched_job *job)
 {
int i, ret;
 
for (i = 0; i < bo_count; i++) {
/* panfrost always uses write mode in its current uapi */
-   ret = drm_gem_fence_array_add_implicit(deps, bos[i], true);
+   ret = drm_sched_job_add_implicit_dependencies(job, bos[i],
+ true);
if (ret)
return ret;
}
@@ -269,31 +270,21 @@ static void panfrost_attach_object_fences(struct 
drm_gem_object **bos,
 int panfrost_job_push(struct panfrost_job *job)
 {
struct panfrost_device *pfdev = job->pfdev;
-   int slot = panfrost_job_get_slot(job);
-   struct drm_sched_entity *entity = &job->file_priv->sched_entity[slot];
struct ww_acquire_ctx acquire_ctx;
int ret = 0;
 
-
ret = drm_gem_lock_reservations(job->bos, job->bo_count,
&acquire_ctx);
if (ret)
return ret;
 
mutex_lock(&pfdev->sched_lock);
-
-   ret = drm_sched_job_init(&job->base, entity, NULL);
-   if (ret) {
-   mutex_unlock(&pfdev->sched_lock);
-   goto unlock;
-   }
-
drm_sched

[Intel-gfx] [PATCH v5 09/20] drm/v3d: Move drm_sched_job_init to v3d_job_init

2021-08-05 Thread Daniel Vetter
Prep work for using the scheduler dependency handling. We need to call
drm_sched_job_init earlier so we can use the new drm_sched_job_await*
functions (since renamed to drm_sched_job_add_*dependency()) for
dependency handling here.

v2: Slightly better commit message and rebase to include the
drm_sched_job_arm() call (Emma).

v3: Cleanup jobs under construction correctly (Emma)

v4: Rebase over perfmon patch

Reviewed-by: Melissa Wen  (v3)
Acked-by: Emma Anholt 
Cc: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: Emma Anholt 
---
 drivers/gpu/drm/v3d/v3d_drv.h   |  1 +
 drivers/gpu/drm/v3d/v3d_gem.c   | 86 ++---
 drivers/gpu/drm/v3d/v3d_sched.c | 15 +++---
 3 files changed, 44 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 270134779073..c1d433b4cf93 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -379,6 +379,7 @@ int v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
 struct drm_file *file_priv);
 int v3d_wait_bo_ioctl(struct drm_device *dev, void *data,
  struct drm_file *file_priv);
+void v3d_job_cleanup(struct v3d_job *job);
 void v3d_job_put(struct v3d_job *job);
 void v3d_reset(struct v3d_dev *v3d);
 void v3d_invalidate_caches(struct v3d_dev *v3d);
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 957228bef29c..42587248c54e 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -397,6 +397,12 @@ v3d_render_job_free(struct kref *ref)
v3d_job_free(ref);
 }
 
+void v3d_job_cleanup(struct v3d_job *job)
+{
+   drm_sched_job_cleanup(&job->base);
+   v3d_job_put(job);
+}
+
 void v3d_job_put(struct v3d_job *job)
 {
kref_put(&job->refcount, job->free);
@@ -438,9 +444,10 @@ v3d_wait_bo_ioctl(struct drm_device *dev, void *data,
 static int
 v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 struct v3d_job *job, void (*free)(struct kref *ref),
-u32 in_sync)
+u32 in_sync, enum v3d_queue queue)
 {
struct dma_fence *in_fence = NULL;
+   struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
int ret;
 
job->v3d = v3d;
@@ -451,35 +458,33 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file 
*file_priv,
return ret;
 
xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
+   ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
+v3d_priv);
+   if (ret)
+   goto fail;
 
ret = drm_syncobj_find_fence(file_priv, in_sync, 0, 0, &in_fence);
if (ret == -EINVAL)
-   goto fail;
+   goto fail_job;
 
ret = drm_gem_fence_array_add(&job->deps, in_fence);
if (ret)
-   goto fail;
+   goto fail_job;
 
kref_init(&job->refcount);
 
return 0;
+fail_job:
+   drm_sched_job_cleanup(&job->base);
 fail:
xa_destroy(&job->deps);
pm_runtime_put_autosuspend(v3d->drm.dev);
return ret;
 }
 
-static int
-v3d_push_job(struct v3d_file_priv *v3d_priv,
-struct v3d_job *job, enum v3d_queue queue)
+static void
+v3d_push_job(struct v3d_job *job)
 {
-   int ret;
-
-   ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
-v3d_priv);
-   if (ret)
-   return ret;
-
drm_sched_job_arm(&job->base);
 
job->done_fence = dma_fence_get(&job->base.s_fence->finished);
@@ -488,8 +493,6 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
kref_get(&job->refcount);
 
drm_sched_entity_push_job(&job->base);
-
-   return 0;
 }
 
 static void
@@ -564,7 +567,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
INIT_LIST_HEAD(&render->unref_list);
 
ret = v3d_job_init(v3d, file_priv, &render->base,
-  v3d_render_job_free, args->in_sync_rcl);
+  v3d_render_job_free, args->in_sync_rcl, V3D_RENDER);
if (ret) {
kfree(render);
return ret;
@@ -578,7 +581,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
}
 
ret = v3d_job_init(v3d, file_priv, &bin->base,
-  v3d_job_free, args->in_sync_bcl);
+  v3d_job_free, args->in_sync_bcl, V3D_BIN);
if (ret) {
v3d_job_put(&render->base);
kfree(bin);
@@ -600,7 +603,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
goto fail;
}
 
-   ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0);
+   ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0, 
V3D_CACHE_CLEAN);
if (ret) {
kfree(clean_job);
clean_job = NULL;
@@ -635,9 +638,7 

[Intel-gfx] [PATCH v5 08/20] drm/lima: use scheduler dependency tracking

2021-08-05 Thread Daniel Vetter
Nothing special going on here.

Aside from reviewing the code, it seems like drm_sched_job_arm() should
be moved into lima_sched_context_queue_task and put under some mutex
together with drm_sched_push_job(), roughly as sketched below. See the
kerneldoc for drm_sched_push_job().
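
Sketch only; the mutex is an assumption, lima has no such common submit
lock today:

	mutex_lock(&submit_lock);		/* assumed common lock */
	drm_sched_job_arm(&task->base);
	drm_sched_entity_push_job(&task->base);
	mutex_unlock(&submit_lock);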

v2: Rebase over renamed functions to add dependencies.

Signed-off-by: Daniel Vetter 
Cc: Qiang Yu 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/lima/lima_gem.c   |  6 --
 drivers/gpu/drm/lima/lima_sched.c | 21 -
 drivers/gpu/drm/lima/lima_sched.h |  3 ---
 3 files changed, 4 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index c528f40981bb..640acc060467 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -267,7 +267,9 @@ static int lima_gem_sync_bo(struct lima_sched_task *task, 
struct lima_bo *bo,
if (explicit)
return 0;
 
-   return drm_gem_fence_array_add_implicit(&task->deps, &bo->base.base, 
write);
+   return drm_sched_job_add_implicit_dependencies(&task->base,
+  &bo->base.base,
+  write);
 }
 
 static int lima_gem_add_deps(struct drm_file *file, struct lima_submit *submit)
@@ -285,7 +287,7 @@ static int lima_gem_add_deps(struct drm_file *file, struct 
lima_submit *submit)
if (err)
return err;
 
-   err = drm_gem_fence_array_add(&submit->task->deps, fence);
+   err = drm_sched_job_add_dependency(&submit->task->base, fence);
if (err) {
dma_fence_put(fence);
return err;
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index e968b5a8f0b0..99d5f6f1a882 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -134,24 +134,15 @@ int lima_sched_task_init(struct lima_sched_task *task,
task->num_bos = num_bos;
task->vm = lima_vm_get(vm);
 
-   xa_init_flags(&task->deps, XA_FLAGS_ALLOC);
-
return 0;
 }
 
 void lima_sched_task_fini(struct lima_sched_task *task)
 {
-   struct dma_fence *fence;
-   unsigned long index;
int i;
 
drm_sched_job_cleanup(&task->base);
 
-   xa_for_each(&task->deps, index, fence) {
-   dma_fence_put(fence);
-   }
-   xa_destroy(&task->deps);
-
if (task->bos) {
for (i = 0; i < task->num_bos; i++)
drm_gem_object_put(&task->bos[i]->base.base);
@@ -186,17 +177,6 @@ struct dma_fence *lima_sched_context_queue_task(struct 
lima_sched_task *task)
return fence;
 }
 
-static struct dma_fence *lima_sched_dependency(struct drm_sched_job *job,
-  struct drm_sched_entity *entity)
-{
-   struct lima_sched_task *task = to_lima_task(job);
-
-   if (!xa_empty(&task->deps))
-   return xa_erase(&task->deps, task->last_dep++);
-
-   return NULL;
-}
-
 static int lima_pm_busy(struct lima_device *ldev)
 {
int ret;
@@ -472,7 +452,6 @@ static void lima_sched_free_job(struct drm_sched_job *job)
 }
 
 static const struct drm_sched_backend_ops lima_sched_ops = {
-   .dependency = lima_sched_dependency,
.run_job = lima_sched_run_job,
.timedout_job = lima_sched_timedout_job,
.free_job = lima_sched_free_job,
diff --git a/drivers/gpu/drm/lima/lima_sched.h 
b/drivers/gpu/drm/lima/lima_sched.h
index ac70006b0e26..6a11764d87b3 100644
--- a/drivers/gpu/drm/lima/lima_sched.h
+++ b/drivers/gpu/drm/lima/lima_sched.h
@@ -23,9 +23,6 @@ struct lima_sched_task {
struct lima_vm *vm;
void *frame;
 
-   struct xarray deps;
-   unsigned long last_dep;
-
struct lima_bo **bos;
int num_bos;
 
-- 
2.32.0



[Intel-gfx] [PATCH v5 10/20] drm/v3d: Use scheduler dependency handling

2021-08-05 Thread Daniel Vetter
With the prep work out of the way this isn't tricky anymore.

Aside: The chaining of the various jobs is a bit awkward, with the
possibility of failure in bad places. I think with the
drm_sched_job_init/arm split and maybe preloading the
job->dependencies xarray this should be fixable.
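
The awkward spot, condensed from v3d_submit_cl_ioctl() below: the
dependency between the chained jobs is added after the first job has
already been pushed, where failure is hard to unwind:

	v3d_push_job(&bin->base);	/* bin job is committed now */
	ret = drm_sched_job_add_dependency(&render->base.base,
					   dma_fence_get(bin->base.done_fence));
	if (ret)
		goto fail_unreserve;	/* failure in a bad place */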

v2: Rebase over renamed function names for adding dependencies.

Reviewed-by: Melissa Wen  (v1)
Acked-by: Emma Anholt 
Cc: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: Emma Anholt 
---
 drivers/gpu/drm/v3d/v3d_drv.h   |  5 -
 drivers/gpu/drm/v3d/v3d_gem.c   | 26 +-
 drivers/gpu/drm/v3d/v3d_sched.c | 29 +
 3 files changed, 10 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index c1d433b4cf93..b900a050d5e2 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -234,11 +234,6 @@ struct v3d_job {
struct drm_gem_object **bo;
u32 bo_count;
 
-   /* Array of struct dma_fence * to block on before submitting this job.
-*/
-   struct xarray deps;
-   unsigned long last_dep;
-
/* v3d fence to be signaled by IRQ handler when the job is complete. */
struct dma_fence *irq_fence;
 
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 42587248c54e..a3529809d547 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -259,8 +259,8 @@ v3d_lock_bo_reservations(struct v3d_job *job,
return ret;
 
for (i = 0; i < job->bo_count; i++) {
-   ret = drm_gem_fence_array_add_implicit(&job->deps,
-  job->bo[i], true);
+   ret = drm_sched_job_add_implicit_dependencies(&job->base,
+ job->bo[i], true);
if (ret) {
drm_gem_unlock_reservations(job->bo, job->bo_count,
acquire_ctx);
@@ -356,8 +356,6 @@ static void
 v3d_job_free(struct kref *ref)
 {
struct v3d_job *job = container_of(ref, struct v3d_job, refcount);
-   unsigned long index;
-   struct dma_fence *fence;
int i;
 
for (i = 0; i < job->bo_count; i++) {
@@ -366,11 +364,6 @@ v3d_job_free(struct kref *ref)
}
kvfree(job->bo);
 
-   xa_for_each(&job->deps, index, fence) {
-   dma_fence_put(fence);
-   }
-   xa_destroy(&job->deps);
-
dma_fence_put(job->irq_fence);
dma_fence_put(job->done_fence);
 
@@ -457,7 +450,6 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file 
*file_priv,
if (ret < 0)
return ret;
 
-   xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
 v3d_priv);
if (ret)
@@ -467,7 +459,7 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file 
*file_priv,
if (ret == -EINVAL)
goto fail_job;
 
-   ret = drm_gem_fence_array_add(&job->deps, in_fence);
+   ret = drm_sched_job_add_dependency(&job->base, in_fence);
if (ret)
goto fail_job;
 
@@ -477,7 +469,6 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file 
*file_priv,
 fail_job:
drm_sched_job_cleanup(&job->base);
 fail:
-   xa_destroy(&job->deps);
pm_runtime_put_autosuspend(v3d->drm.dev);
return ret;
 }
@@ -640,8 +631,8 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
v3d_perfmon_get(bin->base.perfmon);
v3d_push_job(&bin->base);
 
-   ret = drm_gem_fence_array_add(&render->base.deps,
- 
dma_fence_get(bin->base.done_fence));
+   ret = drm_sched_job_add_dependency(&render->base.base,
+  
dma_fence_get(bin->base.done_fence));
if (ret)
goto fail_unreserve;
}
@@ -651,7 +642,8 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
if (clean_job) {
struct dma_fence *render_fence =
dma_fence_get(render->base.done_fence);
-   ret = drm_gem_fence_array_add(&clean_job->deps, render_fence);
+   ret = drm_sched_job_add_dependency(&clean_job->base,
+  render_fence);
if (ret)
goto fail_unreserve;
clean_job->perfmon = render->base.perfmon;
@@ -853,8 +845,8 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
mutex_lock(&v3d->sched_lock);
v3d_push_job(&job->base);
 
-   ret = drm_gem_fence_array_add(&clean_job->deps,
- dma_fence_get(job->base.done_fence));
+   ret = drm_sched_job_add_dependency(&clean_job->base,
+ 

[Intel-gfx] [PATCH v5 11/20] drm/etnaviv: Use scheduler dependency handling

2021-08-05 Thread Daniel Vetter
We need to pull the drm_sched_job_init much earlier, but that's very
minor surgery.

v2: Actually fix up cleanup paths by calling drm_sched_job_init, which
I wanted to do in the previous round (and did, for all other drivers).
Spotted by Lucas.

v3: Rebase over renamed functions to add dependencies.

Signed-off-by: Daniel Vetter 
Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: etna...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/etnaviv/etnaviv_gem.h|  5 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 60 ++-
 drivers/gpu/drm/etnaviv/etnaviv_sched.c  | 63 +---
 drivers/gpu/drm/etnaviv/etnaviv_sched.h  |  3 +-
 4 files changed, 37 insertions(+), 94 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.h 
b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
index 98e60df882b6..63688e6e4580 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
@@ -80,9 +80,6 @@ struct etnaviv_gem_submit_bo {
u64 va;
struct etnaviv_gem_object *obj;
struct etnaviv_vram_mapping *mapping;
-   struct dma_fence *excl;
-   unsigned int nr_shared;
-   struct dma_fence **shared;
 };
 
 /* Created per submit-ioctl, to track bo's and cmdstream bufs, etc,
@@ -95,7 +92,7 @@ struct etnaviv_gem_submit {
struct etnaviv_file_private *ctx;
struct etnaviv_gpu *gpu;
struct etnaviv_iommu_context *mmu_context, *prev_mmu_context;
-   struct dma_fence *out_fence, *in_fence;
+   struct dma_fence *out_fence;
int out_fence_id;
struct list_head node; /* GPU active submit list */
struct etnaviv_cmdbuf cmdbuf;
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 4dd7d9d541c0..e3d43678eb09 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -188,16 +188,11 @@ static int submit_fence_sync(struct etnaviv_gem_submit 
*submit)
if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT)
continue;
 
-   if (bo->flags & ETNA_SUBMIT_BO_WRITE) {
-   ret = dma_resv_get_fences(robj, &bo->excl,
- &bo->nr_shared,
- &bo->shared);
-   if (ret)
-   return ret;
-   } else {
-   bo->excl = dma_resv_get_excl_unlocked(robj);
-   }
-
+   ret = 
drm_sched_job_add_implicit_dependencies(&submit->sched_job,
+ &bo->obj->base,
+ bo->flags & 
ETNA_SUBMIT_BO_WRITE);
+   if (ret)
+   return ret;
}
 
return ret;
@@ -403,8 +398,6 @@ static void submit_cleanup(struct kref *kref)
 
wake_up_all(&submit->gpu->fence_event);
 
-   if (submit->in_fence)
-   dma_fence_put(submit->in_fence);
if (submit->out_fence) {
/* first remove from IDR, so fence can not be found anymore */
mutex_lock(&submit->gpu->fence_lock);
@@ -529,7 +522,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void 
*data,
ret = etnaviv_cmdbuf_init(priv->cmdbuf_suballoc, &submit->cmdbuf,
  ALIGN(args->stream_size, 8) + 8);
if (ret)
-   goto err_submit_objects;
+   goto err_submit_put;
 
submit->ctx = file->driver_priv;
etnaviv_iommu_context_get(submit->ctx->mmu);
@@ -537,51 +530,62 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void 
*data,
submit->exec_state = args->exec_state;
submit->flags = args->flags;
 
+   ret = drm_sched_job_init(&submit->sched_job,
+&ctx->sched_entity[args->pipe],
+submit->ctx);
+   if (ret)
+   goto err_submit_put;
+
ret = submit_lookup_objects(submit, file, bos, args->nr_bos);
if (ret)
-   goto err_submit_objects;
+   goto err_submit_job;
 
if ((priv->mmu_global->version != ETNAVIV_IOMMU_V2) &&
!etnaviv_cmd_validate_one(gpu, stream, args->stream_size / 4,
  relocs, args->nr_relocs)) {
ret = -EINVAL;
-   goto err_submit_objects;
+   goto err_submit_job;
}
 
if (args->flags & ETNA_SUBMIT_FENCE_FD_IN) {
-   submit->in_fence = sync_file_get_fence(args->fence_fd);
-   if (!submit->in_fence) {
+   struct dma_fence *in_fence = 
sync_file_get_fence(args->fence_fd);
+   if (!in_fence) {
ret = -EINVA

[Intel-gfx] [PATCH v5 14/20] drm/sched: Don't store self-dependencies

2021-08-05 Thread Daniel Vetter
This is essentially part of drm_sched_dependency_optimized(), which
only amdgpu seems to make use of. Use it a bit more.

This would mean that as-is amdgpu can't use the dependency helpers, at
least not with the current approach amdgpu has for deciding whether a
vm_flush is needed. Since amdgpu also has very special rules around
implicit fencing it can't use those helpers either, and adding a
drm_sched_job_await_fence_always or similar for amdgpu wouldn't be too
onerous. That way the special-case handling for amdgpu sticks out even
more, and we have a higher chance that reviewers who go across all
drivers won't miss it.
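
For reference, the two contexts checked below come from the scheduler
fences: an entity allocates two fence contexts, one for the scheduled
fence and one for the finished fence, so either context identifies a
fence from this same entity:

	/* in drm_sched_fence_create(), existing code for orientation: */
	dma_fence_init(&fence->scheduled, ..., entity->fence_context, seq);
	dma_fence_init(&fence->finished, ..., entity->fence_context + 1, seq);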

Reviewed-by: Lucas Stach 
Acked-by: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Daniel Vetter 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky 
Cc: Alex Deucher 
---
 drivers/gpu/drm/scheduler/sched_main.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index f77456929139..49e507f91ec0 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -660,6 +660,13 @@ int drm_sched_job_add_dependency(struct drm_sched_job *job,
if (!fence)
return 0;
 
+   /* if it's a fence from us it's guaranteed to be earlier */
+   if (fence->context == job->entity->fence_context ||
+   fence->context == job->entity->fence_context + 1) {
+   dma_fence_put(fence);
+   return 0;
+   }
+
/* Deduplicate if we already depend on a fence from the same context.
 * This lets the size of the array of deps scale with the number of
 * engines involved, rather than the number of BOs.
-- 
2.32.0



[Intel-gfx] [PATCH v5 17/20] drm/etnaviv: Don't break exclusive fence ordering

2021-08-05 Thread Daniel Vetter
There's only one exclusive slot, and we must not break the ordering.
Adding a new exclusive fence drops all previous fences from the
dma_resv. To avoid violating the signalling order we err on the side of
over-synchronizing by waiting for the existing fences, even if
userspace asked us to ignore them.

A better fix would be to use a dma_fence_chain or _array like e.g.
amdgpu now uses, but it probably makes sense to lift this into
dma-resv.c code as a proper concept, so that drivers don't have to
hack up their own solution each on their own. Hence go with the simple
fix for now.

Another option is the fence import ioctl from Jason:

https://lore.kernel.org/dri-devel/20210610210925.642582-7-ja...@jlekstrand.net/

v2: Improve commit message per Lucas' suggestion.

Signed-off-by: Daniel Vetter 
Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: etna...@lists.freedesktop.org
---
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index e3d43678eb09..8d1703da971a 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -178,19 +178,21 @@ static int submit_fence_sync(struct etnaviv_gem_submit 
*submit)
for (i = 0; i < submit->nr_bos; i++) {
struct etnaviv_gem_submit_bo *bo = &submit->bos[i];
struct dma_resv *robj = bo->obj->base.resv;
+   bool write = bo->flags & ETNA_SUBMIT_BO_WRITE;
 
-   if (!(bo->flags & ETNA_SUBMIT_BO_WRITE)) {
+   if (!(write)) {
ret = dma_resv_reserve_shared(robj, 1);
if (ret)
return ret;
}
 
-   if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT)
+   /* exclusive fences must be ordered */
+   if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT && !write)
continue;
 
ret = 
drm_sched_job_add_implicit_dependencies(&submit->sched_job,
  &bo->obj->base,
- bo->flags & 
ETNA_SUBMIT_BO_WRITE);
+ write);
if (ret)
return ret;
}
-- 
2.32.0



[Intel-gfx] [PATCH v5 15/20] drm/sched: Check locking in drm_sched_job_await_implicit

2021-08-05 Thread Daniel Vetter
You really need to hold the reservation here or all kinds of funny
things can happen between grabbing the dependencies and inserting the
new fences.
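
The expected calling pattern, sketched:

	dma_resv_lock(obj->resv, NULL);	/* or drm_gem_lock_reservations() */
	ret = drm_sched_job_add_implicit_dependencies(job, obj, write);
	/* ... later install the job's own fences, then ... */
	dma_resv_unlock(obj->resv);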

Acked-by: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Daniel Vetter 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky 
Cc: Alex Deucher 
---
 drivers/gpu/drm/scheduler/sched_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 49e507f91ec0..1abb40b07324 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -715,6 +715,8 @@ int drm_sched_job_add_implicit_dependencies(struct 
drm_sched_job *job,
struct dma_fence **fences;
unsigned int i, fence_count;
 
+   dma_resv_assert_held(obj->resv);
+
if (!write) {
struct dma_fence *fence = dma_resv_get_excl_unlocked(obj->resv);
 
-- 
2.32.0



[Intel-gfx] [PATCH v5 13/20] drm/gem: Delete gem array fencing helpers

2021-08-05 Thread Daniel Vetter
Integrated into the scheduler now and all users converted over.

Signed-off-by: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/drm_gem.c | 96 ---
 include/drm/drm_gem.h |  5 --
 2 files changed, 101 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 09c820045859..37e2e2820f08 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1272,99 +1272,3 @@ drm_gem_unlock_reservations(struct drm_gem_object 
**objs, int count,
ww_acquire_fini(acquire_ctx);
 }
 EXPORT_SYMBOL(drm_gem_unlock_reservations);
-
-/**
- * drm_gem_fence_array_add - Adds the fence to an array of fences to be
- * waited on, deduplicating fences from the same context.
- *
- * @fence_array: array of dma_fence * for the job to block on.
- * @fence: the dma_fence to add to the list of dependencies.
- *
- * This functions consumes the reference for @fence both on success and error
- * cases.
- *
- * Returns:
- * 0 on success, or an error on failing to expand the array.
- */
-int drm_gem_fence_array_add(struct xarray *fence_array,
-   struct dma_fence *fence)
-{
-   struct dma_fence *entry;
-   unsigned long index;
-   u32 id = 0;
-   int ret;
-
-   if (!fence)
-   return 0;
-
-   /* Deduplicate if we already depend on a fence from the same context.
-* This lets the size of the array of deps scale with the number of
-* engines involved, rather than the number of BOs.
-*/
-   xa_for_each(fence_array, index, entry) {
-   if (entry->context != fence->context)
-   continue;
-
-   if (dma_fence_is_later(fence, entry)) {
-   dma_fence_put(entry);
-   xa_store(fence_array, index, fence, GFP_KERNEL);
-   } else {
-   dma_fence_put(fence);
-   }
-   return 0;
-   }
-
-   ret = xa_alloc(fence_array, &id, fence, xa_limit_32b, GFP_KERNEL);
-   if (ret != 0)
-   dma_fence_put(fence);
-
-   return ret;
-}
-EXPORT_SYMBOL(drm_gem_fence_array_add);
-
-/**
- * drm_gem_fence_array_add_implicit - Adds the implicit dependencies tracked
- * in the GEM object's reservation object to an array of dma_fences for use in
- * scheduling a rendering job.
- *
- * This should be called after drm_gem_lock_reservations() on your array of
- * GEM objects used in the job but before updating the reservations with your
- * own fences.
- *
- * @fence_array: array of dma_fence * for the job to block on.
- * @obj: the gem object to add new dependencies from.
- * @write: whether the job might write the object (so we need to depend on
- * shared fences in the reservation object).
- */
-int drm_gem_fence_array_add_implicit(struct xarray *fence_array,
-struct drm_gem_object *obj,
-bool write)
-{
-   int ret;
-   struct dma_fence **fences;
-   unsigned int i, fence_count;
-
-   if (!write) {
-   struct dma_fence *fence =
-   dma_resv_get_excl_unlocked(obj->resv);
-
-   return drm_gem_fence_array_add(fence_array, fence);
-   }
-
-   ret = dma_resv_get_fences(obj->resv, NULL,
-   &fence_count, &fences);
-   if (ret || !fence_count)
-   return ret;
-
-   for (i = 0; i < fence_count; i++) {
-   ret = drm_gem_fence_array_add(fence_array, fences[i]);
-   if (ret)
-   break;
-   }
-
-   for (; i < fence_count; i++)
-   dma_fence_put(fences[i]);
-   kfree(fences);
-   return ret;
-}
-EXPORT_SYMBOL(drm_gem_fence_array_add_implicit);
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index 35e7f44c2a75..e55a767188af 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -407,11 +407,6 @@ int drm_gem_lock_reservations(struct drm_gem_object 
**objs, int count,
  struct ww_acquire_ctx *acquire_ctx);
 void drm_gem_unlock_reservations(struct drm_gem_object **objs, int count,
 struct ww_acquire_ctx *acquire_ctx);
-int drm_gem_fence_array_add(struct xarray *fence_array,
-   struct dma_fence *fence);
-int drm_gem_fence_array_add_implicit(struct xarray *fence_array,
-struct drm_gem_object *obj,
-bool write);
 int drm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev,
u32 handle, u64 *offset);
 
-- 
2.32.0



[Intel-gfx] [PATCH v5 16/20] drm/msm: Don't break exclusive fence ordering

2021-08-05 Thread Daniel Vetter
There's only one exclusive slot, and we must not break the ordering.

Adding a new exclusive fence drops all previous fences from the
dma_resv. To avoid violating the signalling order we err on the side of
over-synchronizing by waiting for the existing fences, even if
userspace asked us to ignore them.

A better fix would be to use a dma_fence_chain or _array like e.g.
amdgpu now uses, but
- msm has a synchronous dma_fence_wait for anything from another
  context, so doesn't seem to care much,
- and it probably makes sense to lift this into dma-resv.c code as a
  proper concept, so that drivers don't have to hack up their own
  solution each on their own.

v2: Improve commit message per Lucas' suggestion.

Cc: Lucas Stach 
Signed-off-by: Daniel Vetter 
Cc: Rob Clark 
Cc: Sean Paul 
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
---
 drivers/gpu/drm/msm/msm_gem_submit.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index fb5a2eab27a2..66633dfd58a2 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -330,7 +330,8 @@ static int submit_fence_sync(struct msm_gem_submit *submit, 
bool no_implicit)
return ret;
}
 
-   if (no_implicit)
+   /* exclusive fences must be ordered */
+   if (no_implicit && !write)
continue;
 
ret = drm_sched_job_add_implicit_dependencies(&submit->base,
-- 
2.32.0



[Intel-gfx] [PATCH v5 18/20] drm/i915: delete exclude argument from i915_sw_fence_await_reservation

2021-08-05 Thread Daniel Vetter
No longer used, the last user disappeared with

commit d07f0e59b2c762584478920cd2d11fba2980a94a
Author: Chris Wilson 
Date:   Fri Oct 28 13:58:44 2016 +0100

drm/i915: Move GEM activity tracking into a common struct reservation_object

Signed-off-by: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: "Thomas Hellström" 
Cc: Jason Ekstrand 
---
 drivers/gpu/drm/i915/display/intel_display.c | 4 ++--
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c  | 2 +-
 drivers/gpu/drm/i915/i915_sw_fence.c | 6 +-
 drivers/gpu/drm/i915/i915_sw_fence.h | 1 -
 4 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index 86b86deca701..0ec736026132 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -11248,7 +11248,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
 */
if (intel_crtc_needs_modeset(crtc_state)) {
ret = 
i915_sw_fence_await_reservation(&state->commit_ready,
- 
old_obj->base.resv, NULL,
+ 
old_obj->base.resv,
  false, 0,
  GFP_KERNEL);
if (ret < 0)
@@ -11282,7 +11282,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
struct dma_fence *fence;
 
ret = i915_sw_fence_await_reservation(&state->commit_ready,
- obj->base.resv, NULL,
+ obj->base.resv,
  false,
  
i915_fence_timeout(dev_priv),
  GFP_KERNEL);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c 
b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
index f0435c6feb68..fde88fa90780 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
@@ -104,7 +104,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object 
*obj,
clflush = clflush_work_create(obj);
if (clflush) {
i915_sw_fence_await_reservation(&clflush->base.chain,
-   obj->base.resv, NULL, true,
+   obj->base.resv, true,

i915_fence_timeout(to_i915(obj->base.dev)),
I915_FENCE_GFP);
dma_resv_add_excl_fence(obj->base.resv, &clflush->base.dma);
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c 
b/drivers/gpu/drm/i915/i915_sw_fence.c
index c589a681da77..91711a46b1c7 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -567,7 +567,6 @@ int __i915_sw_fence_await_dma_fence(struct i915_sw_fence 
*fence,
 
 int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
struct dma_resv *resv,
-   const struct dma_fence_ops *exclude,
bool write,
unsigned long timeout,
gfp_t gfp)
@@ -587,9 +586,6 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence 
*fence,
return ret;
 
for (i = 0; i < count; i++) {
-   if (shared[i]->ops == exclude)
-   continue;
-
pending = i915_sw_fence_await_dma_fence(fence,
shared[i],
timeout,
@@ -609,7 +605,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence 
*fence,
excl = dma_resv_get_excl_unlocked(resv);
}
 
-   if (ret >= 0 && excl && excl->ops != exclude) {
+   if (ret >= 0 && excl) {
pending = i915_sw_fence_await_dma_fence(fence,
excl,
timeout,
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h 
b/drivers/gpu/drm/i915/i915_sw_fence.h
index 30a863353ee6..6572f01668e4 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -86,7 +86,6 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
 
 int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
struct dma_resv *resv,
-   const struct dma_fence_ops *exclude,
bool write,
  

[Intel-gfx] [PATCH v5 12/20] drm/msm: Use scheduler dependency handling

2021-08-05 Thread Daniel Vetter
drm_sched_job_init is already at the right place, so this boils down
to deleting code.

Signed-off-by: Daniel Vetter 
Cc: Rob Clark 
Cc: Sean Paul 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/msm/msm_gem.h|  5 -
 drivers/gpu/drm/msm/msm_gem_submit.c | 19 +--
 drivers/gpu/drm/msm/msm_ringbuffer.c | 12 
 3 files changed, 5 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index f9e3ffb2309a..8bf0ac707fd7 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -312,11 +312,6 @@ struct msm_gem_submit {
struct ww_acquire_ctx ticket;
uint32_t seqno; /* Sequence number of the submit on the ring */
 
-   /* Array of struct dma_fence * to block on before submitting this job.
-*/
-   struct xarray deps;
-   unsigned long last_dep;
-
/* Hw fence, which is created when the scheduler executes the job, and
 * is signaled when the hw finishes (via seqno write from cmdstream)
 */
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index 96cea0ba4cfd..fb5a2eab27a2 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -52,8 +52,6 @@ static struct msm_gem_submit *submit_create(struct drm_device 
*dev,
return ERR_PTR(ret);
}
 
-   xa_init_flags(&submit->deps, XA_FLAGS_ALLOC);
-
kref_init(&submit->ref);
submit->dev = dev;
submit->aspace = queue->ctx->aspace;
@@ -72,8 +70,6 @@ void __msm_gem_submit_destroy(struct kref *kref)
 {
struct msm_gem_submit *submit =
container_of(kref, struct msm_gem_submit, ref);
-   unsigned long index;
-   struct dma_fence *fence;
unsigned i;
 
if (submit->fence_id) {
@@ -82,12 +78,6 @@ void __msm_gem_submit_destroy(struct kref *kref)
mutex_unlock(&submit->queue->lock);
}
 
-   xa_for_each (&submit->deps, index, fence) {
-   dma_fence_put(fence);
-   }
-
-   xa_destroy(&submit->deps);
-
dma_fence_put(submit->user_fence);
dma_fence_put(submit->hw_fence);
 
@@ -343,8 +333,9 @@ static int submit_fence_sync(struct msm_gem_submit *submit, 
bool no_implicit)
if (no_implicit)
continue;
 
-   ret = drm_gem_fence_array_add_implicit(&submit->deps, obj,
-   write);
+   ret = drm_sched_job_add_implicit_dependencies(&submit->base,
+ obj,
+ write);
if (ret)
break;
}
@@ -588,7 +579,7 @@ static struct drm_syncobj **msm_parse_deps(struct 
msm_gem_submit *submit,
if (ret)
break;
 
-   ret = drm_gem_fence_array_add(&submit->deps, fence);
+   ret = drm_sched_job_add_dependency(&submit->base, fence);
if (ret)
break;
 
@@ -798,7 +789,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
goto out_unlock;
}
 
-   ret = drm_gem_fence_array_add(&submit->deps, in_fence);
+   ret = drm_sched_job_add_dependency(&submit->base, in_fence);
if (ret)
goto out_unlock;
}
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c 
b/drivers/gpu/drm/msm/msm_ringbuffer.c
index bd54c1412649..652b1dedd7c1 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.c
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
@@ -11,17 +11,6 @@ static uint num_hw_submissions = 8;
 MODULE_PARM_DESC(num_hw_submissions, "The max # of jobs to write into 
ringbuffer (default 8)");
 module_param(num_hw_submissions, uint, 0600);
 
-static struct dma_fence *msm_job_dependency(struct drm_sched_job *job,
-   struct drm_sched_entity *s_entity)
-{
-   struct msm_gem_submit *submit = to_msm_submit(job);
-
-   if (!xa_empty(&submit->deps))
-   return xa_erase(&submit->deps, submit->last_dep++);
-
-   return NULL;
-}
-
 static struct dma_fence *msm_job_run(struct drm_sched_job *job)
 {
struct msm_gem_submit *submit = to_msm_submit(job);
@@ -52,7 +41,6 @@ static void msm_job_free(struct drm_sched_job *job)
 }
 
 const struct drm_sched_backend_ops msm_sched_ops = {
-   .dependency = msm_job_dependency,
.run_job = msm_job_run,
.free_job = msm_job_free
 };
-- 
2.32.0

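For drivers doing the same conversion, the generic pattern boils down
to the following (a minimal sketch, error unwinding trimmed; submit,
in_fence, obj and write stand in for driver-specific state):

	/* explicit in-fence, e.g. from a sync_file or syncobj; the
	 * scheduler takes over the fence reference */
	ret = drm_sched_job_add_dependency(&submit->base, in_fence);
	if (ret)
		return ret;

	/* implicit sync: pull the relevant fences out of the BO's dma_resv */
	ret = drm_sched_job_add_implicit_dependencies(&submit->base,
						      obj, write);
	if (ret)
		return ret;

	/* drm_sched now awaits all collected fences before run_job() */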


[Intel-gfx] [PATCH v5 20/20] dma-resv: Give the docs a do-over

2021-08-05 Thread Daniel Vetter
Specifically document the new/clarified rules around how the shared
fences do not have any ordering requirements against the exclusive
fence.

But also document all the things a bit better: given how central
struct dma_resv is to dynamic buffer management, the docs have been very
inadequate.

- Lots more links to other pieces of the puzzle. Unfortunately
  ttm_buffer_object has no docs, so no links :-(

- Explain/complain a bit about dma_resv_locking_ctx(). I still don't
  like that one, but fixing the ttm call chains is going to be
  horrible. Plus we want to plug in real slowpath locking when we do
  that anyway.

- Main part of the patch is some actual docs for struct dma_resv.

Overall I think we still have a lot of bad naming in this area (e.g.
dma_resv.fence is singular, but contains the multiple shared fences),
but I think that's more indicative of how the semantics and rules are
just not great.

Another thing that's really awkward is how chaining exclusive fences
right now means direct dma_resv.fence_excl pointer access with an
rcu_assign_pointer. Not so great either.

v2:
- Fix a pile of typos (Matt, Jason)
- Hammer it in that breaking the rules leads to use-after-free issues
  around dma-buf sharing (Christian)

Reviewed-by: Christian König 
Cc: Jason Ekstrand 
Cc: Matthew Auld 
Reviewed-by: Matthew Auld 
Signed-off-by: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/dma-buf/dma-resv.c |  24 ++---
 include/linux/dma-buf.h|   7 +++
 include/linux/dma-resv.h   | 104 +++--
 3 files changed, 124 insertions(+), 11 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index e744fd87c63c..84fbe60629e3 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -48,6 +48,8 @@
  * write operations) or N shared fences (read operations).  The RCU
  * mechanism is used to protect read access to fences from locked
  * write-side updates.
+ *
+ * See struct dma_resv for more details.
  */
 
 DEFINE_WD_CLASS(reservation_ww_class);
@@ -137,7 +139,11 @@ EXPORT_SYMBOL(dma_resv_fini);
  * @num_fences: number of fences we want to add
  *
  * Should be called before dma_resv_add_shared_fence().  Must
- * be called with obj->lock held.
+ * be called with @obj locked through dma_resv_lock().
+ *
+ * Note that the preallocated slots need to be re-reserved if @obj is unlocked
+ * at any time before calling dma_resv_add_shared_fence(). This is validated
+ * when CONFIG_DEBUG_MUTEXES is enabled.
  *
  * RETURNS
  * Zero for success, or -errno
@@ -234,8 +240,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
  * @obj: the reservation object
  * @fence: the shared fence to add
  *
- * Add a fence to a shared slot, obj->lock must be held, and
+ * Add a fence to a shared slot, @obj must be locked with dma_resv_lock(), and
  * dma_resv_reserve_shared() has been called.
+ *
+ * See also &dma_resv.fence for a discussion of the semantics.
  */
 void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
 {
@@ -278,9 +286,11 @@ EXPORT_SYMBOL(dma_resv_add_shared_fence);
 /**
  * dma_resv_add_excl_fence - Add an exclusive fence.
  * @obj: the reservation object
- * @fence: the shared fence to add
+ * @fence: the exclusive fence to add
  *
- * Add a fence to the exclusive slot.  The obj->lock must be held.
+ * Add a fence to the exclusive slot. @obj must be locked with dma_resv_lock().
+ * Note that this function replaces all fences attached to @obj, see also
+ * &dma_resv.fence_excl for a discussion of the semantics.
  */
 void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
 {
@@ -609,9 +619,11 @@ static inline int dma_resv_test_signaled_single(struct 
dma_fence *passed_fence)
  * fence
  *
  * Callers are not required to hold specific locks, but maybe hold
- * dma_resv_lock() already
+ * dma_resv_lock() already.
+ *
  * RETURNS
- * true if all fences signaled, else false
+ *
+ * True if all fences signaled, else false.
  */
 bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
 {
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 678b2006be78..fc62b5f9980c 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -420,6 +420,13 @@ struct dma_buf {
 * - Dynamic importers should set fences for any access that they can't
 *   disable immediately from their &dma_buf_attach_ops.move_notify
 *   callback.
+*
+* IMPORTANT:
+*
+* All drivers must obey the struct dma_resv rules, specifically the
+* rules for updating fences, see &dma_resv.fence_excl and
+* &dma_resv.fence. If these dependency rules are broken access tracking
+* can be lost resulting in use after free issues.
 */
struct dma_resv *resv;
 
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index e1ca2080a1ff..9100dd3dc21f 1
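
(To make the documented update rules concrete, the expected driver-side
sequence is roughly the following sketch; obj, fence and write are
placeholders, not taken from the patch:)

	dma_resv_lock(obj->resv, NULL);

	if (write) {
		/* exclusive slot: this replaces ALL fences on the object */
		dma_resv_add_excl_fence(obj->resv, fence);
	} else {
		/* shared slot: reserve and add under the same lock, so
		 * the preallocated slot can't be invalidated in between */
		ret = dma_resv_reserve_shared(obj->resv, 1);
		if (ret == 0)
			dma_resv_add_shared_fence(obj->resv, fence);
	}

	dma_resv_unlock(obj->resv);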

[Intel-gfx] [PATCH v5 19/20] drm/i915: Don't break exclusive fence ordering

2021-08-05 Thread Daniel Vetter
There's only one exclusive slot, and we must not break the ordering.
Adding a new exclusive fence drops all previous fences from the
dma_resv. To avoid violating the signalling order we err on the side of
over-synchronizing by waiting for the existing fences, even if
userspace asked us to ignore them.

A better fix would be to use a dma_fence_chain or _array like e.g.
amdgpu now uses, but it probably makes sense to lift this into
dma-resv.c code as a proper concept, so that drivers don't have to each
hack up their own solution. Hence go with the simple fix for now.

Another option is the fence import ioctl from Jason:

https://lore.kernel.org/dri-devel/20210610210925.642582-7-ja...@jlekstrand.net/

v2: Improve commit message per Lucas' suggestion.

Cc: Lucas Stach 
Signed-off-by: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: "Thomas Hellström" 
Cc: Jason Ekstrand 
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 1ed7475de454..25ba2765d27d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2240,6 +2240,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
struct i915_vma *vma = ev->vma;
unsigned int flags = ev->flags;
struct drm_i915_gem_object *obj = vma->obj;
+   bool async, write;
 
assert_vma_held(vma);
 
@@ -2271,7 +2272,10 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
flags &= ~EXEC_OBJECT_ASYNC;
}
 
-   if (err == 0 && !(flags & EXEC_OBJECT_ASYNC)) {
+   async = flags & EXEC_OBJECT_ASYNC;
+   write = flags & EXEC_OBJECT_WRITE;
+
+   if (err == 0 && (!async || write)) {
err = i915_request_await_object
(eb->request, obj, flags & EXEC_OBJECT_WRITE);
}
-- 
2.32.0

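(The await decision this change produces, spelled out as an
illustrative comment:)

	/*
	 * EXEC_OBJECT_ASYNC | EXEC_OBJECT_WRITE | await existing fences?
	 * ------------------+-------------------+-----------------------
	 *        no         |        no         | yes (implicit sync)
	 *        no         |        yes        | yes (implicit sync)
	 *        yes        |        no         | no (user opted out)
	 *        yes        |        yes        | yes (the new exclusive
	 *                   |                   | fence must not drop
	 *                   |                   | unsignaled fences)
	 */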


Re: [Intel-gfx] [PATCH 24/33] drm/i915/guc: Implement banned contexts for GuC submission

2021-08-05 Thread Tvrtko Ursulin



On 27/07/2021 01:23, Matthew Brost wrote:

When using GuC submission, if a context gets banned disable scheduling
and mark all inflight requests as complete.

Cc: John Harrison 
Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
  drivers/gpu/drm/i915/gem/i915_gem_context.c   |   2 +-
  drivers/gpu/drm/i915/gt/intel_context.h   |  13 ++
  drivers/gpu/drm/i915/gt/intel_context_types.h |   2 +
  drivers/gpu/drm/i915/gt/intel_reset.c |  32 +---
  .../gpu/drm/i915/gt/intel_ring_submission.c   |  20 +++
  drivers/gpu/drm/i915/gt/uc/intel_guc.h|   2 +
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 151 --
  drivers/gpu/drm/i915/i915_trace.h |  10 ++
  8 files changed, 195 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index e3df01a201d7..05c3ee191710 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1084,7 +1084,7 @@ static void kill_engines(struct i915_gem_engines 
*engines, bool ban)
for_each_gem_engine(ce, engines, it) {
struct intel_engine_cs *engine;
  
-		if (ban && intel_context_set_banned(ce))
+		if (ban && intel_context_ban(ce, NULL))
 			continue;
  
  		/*

diff --git a/drivers/gpu/drm/i915/gt/intel_context.h 
b/drivers/gpu/drm/i915/gt/intel_context.h
index 2ed9bf5f91a5..814d9277096a 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -16,6 +16,7 @@
  #include "intel_engine_types.h"
  #include "intel_ring_types.h"
  #include "intel_timeline_types.h"
+#include "i915_trace.h"
  
  #define CE_TRACE(ce, fmt, ...) do {	\

const struct intel_context *ce__ = (ce);\
@@ -243,6 +244,18 @@ static inline bool intel_context_set_banned(struct 
intel_context *ce)
return test_and_set_bit(CONTEXT_BANNED, &ce->flags);
  }
  
+static inline bool intel_context_ban(struct intel_context *ce,
+				     struct i915_request *rq)
+{
+   bool ret = intel_context_set_banned(ce);
+
+   trace_intel_context_ban(ce);
+   if (ce->ops->ban)
+   ce->ops->ban(ce, rq);


Do you want to skip this call if already banned?

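I.e. something like this (untested sketch):

	bool ret = intel_context_set_banned(ce);

	trace_intel_context_ban(ce);
	if (!ret && ce->ops->ban)
		ce->ops->ban(ce, rq);

	return ret;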

+
+   return ret;
+}
+
  static inline bool
  intel_context_force_single_submission(const struct intel_context *ce)
  {
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 035108c10b2c..57c19ee3e313 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -35,6 +35,8 @@ struct intel_context_ops {
  
  	int (*alloc)(struct intel_context *ce);
  
+	void (*ban)(struct intel_context *ce, struct i915_request *rq);
+
int (*pre_pin)(struct intel_context *ce, struct i915_gem_ww_ctx *ww, 
void **vaddr);
int (*pin)(struct intel_context *ce, void *vaddr);
void (*unpin)(struct intel_context *ce);
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c 
b/drivers/gpu/drm/i915/gt/intel_reset.c
index 4d281bc8a38c..91200c43951f 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -22,7 +22,6 @@
  #include "intel_reset.h"
  
  #include "uc/intel_guc.h"

-#include "uc/intel_guc_submission.h"
  
  #define RESET_MAX_RETRIES 3
  
@@ -39,21 +38,6 @@ static void rmw_clear_fw(struct intel_uncore *uncore, i915_reg_t reg, u32 clr)

intel_uncore_rmw_fw(uncore, reg, clr, 0);
  }
  
-static void skip_context(struct i915_request *rq)
-{
-   struct intel_context *hung_ctx = rq->context;
-
-   list_for_each_entry_from_rcu(rq, &hung_ctx->timeline->requests, link) {
-   if (!i915_request_is_active(rq))
-   return;
-
-   if (rq->context == hung_ctx) {
-   i915_request_set_error_once(rq, -EIO);
-   __i915_request_skip(rq);
-   }
-   }
-}


More importantly I must be missing something - this code has been moved 
to ring_context_ban - what am I not seeing on the execlists side of things?!


Regards,

Tvrtko


-
  static void client_mark_guilty(struct i915_gem_context *ctx, bool banned)
  {
struct drm_i915_file_private *file_priv = ctx->file_priv;
@@ -88,10 +72,8 @@ static bool mark_guilty(struct i915_request *rq)
bool banned;
int i;
  
-	if (intel_context_is_closed(rq->context)) {
-		intel_context_set_banned(rq->context);
+	if (intel_context_is_closed(rq->context))
 		return true;
-	}
  
  	rcu_read_lock();

ctx = rcu_dereference(rq->context->gem_context);
@@ -123,11 +105,9 @@ static bool mark_guilty(struct i915_request *rq)
banned = !i915_gem_context_is_recoverable(ctx);
if (time_before(jiffies, prev_hang + CONTEXT_FAST_HANG_JIFFIES))
banned = tru

[Intel-gfx] [PATCH] drm/i915: Be more gentle when exiting non-persistent contexts

2021-08-05 Thread Tvrtko Ursulin
From: Tvrtko Ursulin 

When a non-persistent context exits we currently mark it as banned in
order to trigger fast termination of any outstanding GPU jobs it may have
left running.

In doing so we apply a very strict 1ms limit in which the left-over job
has to preempt before we issue an engine reset.

Some workloads are not able to cleanly preempt in that time window and it
can be argued that it would instead be better to give them a bit more
grace, since avoiding engine resets is generally preferable.

To achieve this the patch splits handling of banned contexts from simply
closed non-persistent ones and then applies different timeouts for both
and also extends the criteria which determines if a request should be
scheduled back in after preemption or not.

A 15ms preempt timeout grace period is given to exited non-persistent
contexts, which has been empirically shown to satisfy customer
requirements while still providing reasonably quick cleanup post exit.

v2:
 * Streamline fast path checks.

v3:
 * Simplify by using only schedulable status.
 * Increase timeout to 20ms.

v4:
 * Fix live_execlists selftest.

v5:
 * Fix logic in kill_engines.

v6:
 * Rebase.

Signed-off-by: Tvrtko Ursulin 
Cc: Chris Wilson 
Cc: Zhen Han 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 22 +--
 drivers/gpu/drm/i915/gt/intel_context.c   |  2 ++
 drivers/gpu/drm/i915/gt/intel_context.h   | 17 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |  1 +
 .../drm/i915/gt/intel_execlists_submission.c  | 11 --
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 20 +++--
 drivers/gpu/drm/i915/i915_request.c   |  2 +-
 7 files changed, 57 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index cff72679ad7c..21fe5d4057ab 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1065,7 +1065,8 @@ static struct intel_engine_cs *active_engine(struct 
intel_context *ce)
return engine;
 }
 
-static void kill_engines(struct i915_gem_engines *engines, bool ban)
+static void
+kill_engines(struct i915_gem_engines *engines, bool ban, bool persistent)
 {
struct i915_gem_engines_iter it;
struct intel_context *ce;
@@ -1079,8 +1080,15 @@ static void kill_engines(struct i915_gem_engines 
*engines, bool ban)
 */
for_each_gem_engine(ce, engines, it) {
struct intel_engine_cs *engine;
+   bool skip = false;
+
+   if (ban)
+   skip = intel_context_ban(ce, NULL);
+   else if (!persistent)
+   skip = !intel_context_clear_schedulable(ce);
 
-   if (ban && intel_context_ban(ce, NULL))
+   /* Already previously banned or made non-schedulable? */
+   if (skip)
continue;
 
/*
@@ -1093,7 +1101,7 @@ static void kill_engines(struct i915_gem_engines 
*engines, bool ban)
engine = active_engine(ce);
 
/* First attempt to gracefully cancel the context */
-   if (engine && !__cancel_engine(engine) && ban)
+   if (engine && !__cancel_engine(engine) && (ban || !persistent))
/*
 * If we are unable to send a preemptive pulse to bump
 * the context from the GPU, we have to resort to a full
@@ -1105,8 +1113,6 @@ static void kill_engines(struct i915_gem_engines 
*engines, bool ban)
 
 static void kill_context(struct i915_gem_context *ctx)
 {
-   bool ban = (!i915_gem_context_is_persistent(ctx) ||
-   !ctx->i915->params.enable_hangcheck);
struct i915_gem_engines *pos, *next;
 
spin_lock_irq(&ctx->stale.lock);
@@ -1119,7 +1125,8 @@ static void kill_context(struct i915_gem_context *ctx)
 
spin_unlock_irq(&ctx->stale.lock);
 
-   kill_engines(pos, ban);
+   kill_engines(pos, !ctx->i915->params.enable_hangcheck,
+i915_gem_context_is_persistent(ctx));
 
spin_lock_irq(&ctx->stale.lock);
GEM_BUG_ON(i915_sw_fence_signaled(&pos->fence));
@@ -1165,7 +1172,8 @@ static void engines_idle_release(struct i915_gem_context 
*ctx,
 
 kill:
if (list_empty(&engines->link)) /* raced, already closed */
-   kill_engines(engines, true);
+   kill_engines(engines, true,
+i915_gem_context_is_persistent(ctx));
 
i915_sw_fence_commit(&engines->fence);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index 745e84c72c90..bc1701ef1578 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -382,6 +382,8 @@ intel_context_init(struct intel_context *ce, struct 
intel_engine_cs *engine)
ce-

[Intel-gfx] [CI v2] drm/i915: Tweaked Wa_14010685332 for all PCHs

2021-08-05 Thread Anshuman Gupta
The dispcnlunit1_cp_xosc_clkreq clock was observed to be active on the
TGL-H platform despite the original Wa_14010685332 sequence, thus
blocking entry to the deeper s0ix state.

The tweaked Wa_14010685332 sequence fixes this issue, therefore use the
tweaked sequence for every PCH since PCH_CNP.

v2:
- removed RKL from comment and simplified condition. [Rodrigo]

Fixes: b896898c7369 ("drm/i915: Tweaked Wa_14010685332 for PCHs used on gen11 
platforms")
Cc: Matt Roper 
Cc: Rodrigo Vivi 
Cc: Imre Deak 
Signed-off-by: Anshuman Gupta 
Reviewed-by: Rodrigo Vivi 
---
 .../drm/i915/display/intel_display_power.c| 16 +++---
 drivers/gpu/drm/i915/i915_irq.c   | 21 ---
 2 files changed, 8 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c 
b/drivers/gpu/drm/i915/display/intel_display_power.c
index 5da293369f30..cce1a926fcc1 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -6329,13 +6329,13 @@ void intel_display_power_suspend_late(struct 
drm_i915_private *i915)
if (DISPLAY_VER(i915) >= 11 || IS_GEMINILAKE(i915) ||
IS_BROXTON(i915)) {
bxt_enable_dc9(i915);
-   /* Tweaked Wa_14010685332:icp,jsp,mcc */
-   if (INTEL_PCH_TYPE(i915) >= PCH_ICP && INTEL_PCH_TYPE(i915) <= 
PCH_MCC)
-   intel_de_rmw(i915, SOUTH_CHICKEN1,
-SBCLK_RUN_REFCLK_DIS, 
SBCLK_RUN_REFCLK_DIS);
} else if (IS_HASWELL(i915) || IS_BROADWELL(i915)) {
hsw_enable_pc8(i915);
}
+
+   /* Tweaked Wa_14010685332:cnp,icp,jsp,mcc,tgp,adp */
+   if (INTEL_PCH_TYPE(i915) >= PCH_CNP && INTEL_PCH_TYPE(i915) < PCH_DG1)
+   intel_de_rmw(i915, SOUTH_CHICKEN1, SBCLK_RUN_REFCLK_DIS, 
SBCLK_RUN_REFCLK_DIS);
 }
 
 void intel_display_power_resume_early(struct drm_i915_private *i915)
@@ -6344,13 +6344,13 @@ void intel_display_power_resume_early(struct 
drm_i915_private *i915)
IS_BROXTON(i915)) {
gen9_sanitize_dc_state(i915);
bxt_disable_dc9(i915);
-   /* Tweaked Wa_14010685332:icp,jsp,mcc */
-   if (INTEL_PCH_TYPE(i915) >= PCH_ICP && INTEL_PCH_TYPE(i915) <= 
PCH_MCC)
-   intel_de_rmw(i915, SOUTH_CHICKEN1, 
SBCLK_RUN_REFCLK_DIS, 0);
-
} else if (IS_HASWELL(i915) || IS_BROADWELL(i915)) {
hsw_disable_pc8(i915);
}
+
+   /* Tweaked Wa_14010685332:cnp,icp,jsp,mcc,tgp,adp */
+   if (INTEL_PCH_TYPE(i915) >= PCH_CNP && INTEL_PCH_TYPE(i915) < PCH_DG1)
+   intel_de_rmw(i915, SOUTH_CHICKEN1, SBCLK_RUN_REFCLK_DIS, 0);
 }
 
 void intel_display_power_suspend(struct drm_i915_private *i915)
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 17d336218b67..9bc4f4a8e12e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -3079,24 +3079,6 @@ static void valleyview_irq_reset(struct drm_i915_private 
*dev_priv)
spin_unlock_irq(&dev_priv->irq_lock);
 }
 
-static void cnp_display_clock_wa(struct drm_i915_private *dev_priv)
-{
-   struct intel_uncore *uncore = &dev_priv->uncore;
-
-   /*
-* Wa_14010685332:cnp/cmp,tgp,adp
-* TODO: Clarify which platforms this applies to
-* TODO: Figure out if this workaround can be applied in the s0ix 
suspend/resume handlers as
-* on earlier platforms and whether the workaround is also needed for 
runtime suspend/resume
-*/
-   if (INTEL_PCH_TYPE(dev_priv) == PCH_CNP ||
-   (INTEL_PCH_TYPE(dev_priv) >= PCH_TGP && INTEL_PCH_TYPE(dev_priv) < 
PCH_DG1)) {
-   intel_uncore_rmw(uncore, SOUTH_CHICKEN1, SBCLK_RUN_REFCLK_DIS,
-SBCLK_RUN_REFCLK_DIS);
-   intel_uncore_rmw(uncore, SOUTH_CHICKEN1, SBCLK_RUN_REFCLK_DIS, 
0);
-   }
-}
-
 static void gen8_display_irq_reset(struct drm_i915_private *dev_priv)
 {
struct intel_uncore *uncore = &dev_priv->uncore;
@@ -3130,7 +3112,6 @@ static void gen8_irq_reset(struct drm_i915_private 
*dev_priv)
if (HAS_PCH_SPLIT(dev_priv))
ibx_irq_reset(dev_priv);
 
-   cnp_display_clock_wa(dev_priv);
 }
 
 static void gen11_display_irq_reset(struct drm_i915_private *dev_priv)
@@ -3174,8 +3155,6 @@ static void gen11_display_irq_reset(struct 
drm_i915_private *dev_priv)
 
if (INTEL_PCH_TYPE(dev_priv) >= PCH_ICP)
GEN3_IRQ_RESET(uncore, SDE);
-
-   cnp_display_clock_wa(dev_priv);
 }
 
 static void gen11_irq_reset(struct drm_i915_private *dev_priv)
-- 
2.26.2



[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Update small joiner ram size

2021-08-05 Thread Patchwork
== Series Details ==

Series: drm/i915: Update small joiner ram size
URL   : https://patchwork.freedesktop.org/series/93410/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10449 -> Patchwork_20771


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/index.html

Known issues


  Here are the changes found in Patchwork_20771 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@gem_exec_suspend@basic-s3:
- fi-tgl-1115g4:  [PASS][1] -> [FAIL][2] ([i915#1888])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html

  
 Possible fixes 

  * igt@kms_chamelium@dp-crc-fast:
- fi-kbl-7500u:   [FAIL][3] ([i915#1372]) -> [PASS][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/fi-kbl-7500u/igt@kms_chamel...@dp-crc-fast.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/fi-kbl-7500u/igt@kms_chamel...@dp-crc-fast.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [fdo#112080]: https://bugs.freedesktop.org/show_bug.cgi?id=112080
  [i915#1372]: https://gitlab.freedesktop.org/drm/intel/issues/1372
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#3301]: https://gitlab.freedesktop.org/drm/intel/issues/3301
  [i915#3844]: https://gitlab.freedesktop.org/drm/intel/issues/3844
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533
  [i915#579]: https://gitlab.freedesktop.org/drm/intel/issues/579


Participating hosts (39 -> 35)
--

  Additional (1): fi-jsl-1 
  Missing(5): fi-ilk-m540 fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 
fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_10449 -> Patchwork_20771

  CI-20190529: 20190529
  CI_DRM_10449: b0b7ea6dcb6afb51059e3ae01afece47c41fd0c1 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6160: 4287344dd6a39d9036c5fb9a047a7d8f10bee981 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20771: 76ad0b5c9154228fa572485a518c25a1e8fe1c4d @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

76ad0b5c9154 drm/i915: Update small joiner ram size

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/index.html


Re: [Intel-gfx] [PATCH v3 07/14] vfio/platform: Use open_device() instead of open coding a refcnt scheme

2021-08-05 Thread Eric Auger
Hi Jason,

On 7/29/21 2:49 AM, Jason Gunthorpe wrote:
> Platform simply wants to run some code when the device is first
> opened/last closed. Use the core framework and locking for this.  Aside
> from removing a bit of code this narrows the locking scope from a global
> lock.
>
> Signed-off-by: Jason Gunthorpe 
> Signed-off-by: Yishai Hadas 
> Reviewed-by: Cornelia Huck 
> Reviewed-by: Christoph Hellwig 
Reviewed-by: Eric Auger 

Thanks

Eric

> ---
>  drivers/vfio/platform/vfio_platform_common.c  | 79 ---
>  drivers/vfio/platform/vfio_platform_private.h |  1 -
>  2 files changed, 32 insertions(+), 48 deletions(-)
>
> diff --git a/drivers/vfio/platform/vfio_platform_common.c 
> b/drivers/vfio/platform/vfio_platform_common.c
> index bdde8605178cd2..6af7ce7d619c25 100644
> --- a/drivers/vfio/platform/vfio_platform_common.c
> +++ b/drivers/vfio/platform/vfio_platform_common.c
> @@ -218,65 +218,52 @@ static int vfio_platform_call_reset(struct 
> vfio_platform_device *vdev,
>   return -EINVAL;
>  }
>  
> -static void vfio_platform_release(struct vfio_device *core_vdev)
> +static void vfio_platform_close_device(struct vfio_device *core_vdev)
>  {
>   struct vfio_platform_device *vdev =
>   container_of(core_vdev, struct vfio_platform_device, vdev);
> + const char *extra_dbg = NULL;
> + int ret;
>  
> - mutex_lock(&driver_lock);
> -
> - if (!(--vdev->refcnt)) {
> - const char *extra_dbg = NULL;
> - int ret;
> -
> - ret = vfio_platform_call_reset(vdev, &extra_dbg);
> - if (ret && vdev->reset_required) {
> - dev_warn(vdev->device, "reset driver is required and 
> reset call failed in release (%d) %s\n",
> -  ret, extra_dbg ? extra_dbg : "");
> - WARN_ON(1);
> - }
> - pm_runtime_put(vdev->device);
> - vfio_platform_regions_cleanup(vdev);
> - vfio_platform_irq_cleanup(vdev);
> + ret = vfio_platform_call_reset(vdev, &extra_dbg);
> + if (WARN_ON(ret && vdev->reset_required)) {
> + dev_warn(
> + vdev->device,
> + "reset driver is required and reset call failed in 
> release (%d) %s\n",
> + ret, extra_dbg ? extra_dbg : "");
>   }
> -
> - mutex_unlock(&driver_lock);
> + pm_runtime_put(vdev->device);
> + vfio_platform_regions_cleanup(vdev);
> + vfio_platform_irq_cleanup(vdev);
>  }
>  
> -static int vfio_platform_open(struct vfio_device *core_vdev)
> +static int vfio_platform_open_device(struct vfio_device *core_vdev)
>  {
>   struct vfio_platform_device *vdev =
>   container_of(core_vdev, struct vfio_platform_device, vdev);
> + const char *extra_dbg = NULL;
>   int ret;
>  
> - mutex_lock(&driver_lock);
> -
> - if (!vdev->refcnt) {
> - const char *extra_dbg = NULL;
> -
> - ret = vfio_platform_regions_init(vdev);
> - if (ret)
> - goto err_reg;
> + ret = vfio_platform_regions_init(vdev);
> + if (ret)
> + return ret;
>  
> - ret = vfio_platform_irq_init(vdev);
> - if (ret)
> - goto err_irq;
> + ret = vfio_platform_irq_init(vdev);
> + if (ret)
> + goto err_irq;
>  
> - ret = pm_runtime_get_sync(vdev->device);
> - if (ret < 0)
> - goto err_rst;
> + ret = pm_runtime_get_sync(vdev->device);
> + if (ret < 0)
> + goto err_rst;
>  
> - ret = vfio_platform_call_reset(vdev, &extra_dbg);
> - if (ret && vdev->reset_required) {
> - dev_warn(vdev->device, "reset driver is required and 
> reset call failed in open (%d) %s\n",
> -  ret, extra_dbg ? extra_dbg : "");
> - goto err_rst;
> - }
> + ret = vfio_platform_call_reset(vdev, &extra_dbg);
> + if (ret && vdev->reset_required) {
> + dev_warn(
> + vdev->device,
> + "reset driver is required and reset call failed in open 
> (%d) %s\n",
> + ret, extra_dbg ? extra_dbg : "");
> + goto err_rst;
>   }
> -
> - vdev->refcnt++;
> -
> - mutex_unlock(&driver_lock);
>   return 0;
>  
>  err_rst:
> @@ -284,8 +271,6 @@ static int vfio_platform_open(struct vfio_device 
> *core_vdev)
>   vfio_platform_irq_cleanup(vdev);
>  err_irq:
>   vfio_platform_regions_cleanup(vdev);
> -err_reg:
> - mutex_unlock(&driver_lock);
>   return ret;
>  }
>  
> @@ -616,8 +601,8 @@ static int vfio_platform_mmap(struct vfio_device 
> *core_vdev, struct vm_area_stru
>  
>  static const struct vfio_device_ops vfio_platform_ops = {
>   .name   = "vfio-platform",
> - .open   = vfio_platform_open,
> - .release= vfio_platform_releas
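
(For reference, the ops table this truncated hunk converges on, sketched
with the new callback names from this series; the remaining hooks are
assumed unchanged:)

	static const struct vfio_device_ops vfio_platform_ops = {
		.name		= "vfio-platform",
		.open_device	= vfio_platform_open_device,
		.close_device	= vfio_platform_close_device,
		/* ...ioctl/read/write/mmap hooks as before... */
	};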

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for remove rcu support from i915_address_space (rev4)

2021-08-05 Thread Patchwork
== Series Details ==

Series: remove rcu support from i915_address_space (rev4)
URL   : https://patchwork.freedesktop.org/series/93314/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
0e6cb5de1e74 drm/i915: Drop code to handle set-vm races from execbuf
-:17: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#17: 
References: ccbc1b97948a ("drm/i915/gem: Don't allow changing the VM on running 
contexts (v4)")

-:17: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit ccbc1b97948a ("drm/i915/gem: 
Don't allow changing the VM on running contexts (v4)")'
#17: 
References: ccbc1b97948a ("drm/i915/gem: Don't allow changing the VM on running 
contexts (v4)")

-:46: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 1 errors, 2 warnings, 0 checks, 12 lines checked
1bd404a5dddc drm/i915: Rename i915_gem_context_get_vm_rcu to 
i915_gem_context_get_eb_vm
-:148: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 80 lines checked
37d02c39555f drm/i915: Use i915_gem_context_get_eb_vm in ctx_getparam
-:54: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 23 lines checked
22042c81dd12 drm/i915: Add i915_gem_context_is_full_ppgtt
-:105: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 53 lines checked
47f45eed8f19 drm/i915: Use i915_gem_context_get_eb_vm in intel_context_set_gem
-:12: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit ccbc1b97948a ("drm/i915/gem: 
Don't allow changing the VM on running contexts (v4)")'
#12: 
commit ccbc1b97948ab671335e950271e39766729736c3

-:61: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 1 errors, 1 warnings, 0 checks, 18 lines checked
c1770ae51252 drm/i915: Drop __rcu from gem_context->vm
-:11: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit ccbc1b97948a ("drm/i915/gem: 
Don't allow changing the VM on running contexts (v4)")'
#11: 
commit ccbc1b97948ab671335e950271e39766729736c3

-:23: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#23: 
  i915_vm_open ofc. This also removes the final caller of context_get_vm_rcu

-:42: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit a4e7ccdac38e ("drm/i915: Move 
context management under GEM")'
#42: 
commit a4e7ccdac38ec8335d9e4e2656c1a041c77feae1

-:345: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 2 errors, 2 warnings, 0 checks, 232 lines checked
fea76eb28a60 drm/i915: use xa_lock/unlock for fpriv->vm_xa lookups
-:15: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit aabbe344dc3c ("drm/i915: Use RCU 
for unlocked vm_idr lookup")'
#15: 
commit aabbe344dc3ca5f7d8263a02608ba6179e8a4499

-:52: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 1 errors, 1 warnings, 0 checks, 13 lines checked
be11100f27ee drm/i915: Stop rcu support for i915_address_space
-:11: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#11: 
- i915_dpt has very simple lifetime (somehow we create a display pagetable vm

-:27: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit cf977e18610e ("drm/i915/gem: 
Spring clean debugfs")'
#27: 
commit cf977e18610e66e48c31619e7e0cfa871be9eada

-:35: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit db80a1294c23 ("drm/i915/gem: 
Remove per-client stats from debugfs/i915_gem_objects")'
#35: 
commit db80a1294c231b6ac725085f046bb2931e00c9db

-:47: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit ccbc1b97948a ("drm/i915/gem: 
Don't allow changing the VM on running contexts (v4)")'
#47: 
commit ccbc1b97948ab671335e950271e39766729736c3

-:59: WARNING:TYPO_SPELLING: 'Preceeding' may be misspelled - perhaps 
'Preceding'?
#59: 
  Preceeding patches removed all vestiges of rcu use from gem_ctx->vm
  ^^

-:64:

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for remove rcu support from i915_address_space (rev4)

2021-08-05 Thread Patchwork
== Series Details ==

Series: remove rcu support from i915_address_space (rev4)
URL   : https://patchwork.freedesktop.org/series/93314/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-drivers/gpu/drm/i915/gem/i915_gem_context.c:1364:34:expected struct 
i915_address_space *vm
-drivers/gpu/drm/i915/gem/i915_gem_context.c:1364:34:got struct 
i915_address_space [noderef] __rcu *vm
-drivers/gpu/drm/i915/gem/i915_gem_context.c:1364:34: warning: incorrect type 
in argument 1 (different address spaces)
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:expected struct 
i915_address_space [noderef] __rcu *vm
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:got struct 
i915_address_space *
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25: warning: incorrect 
type in assignment (different address spaces)
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:expected struct 
i915_address_space *vm
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:got struct 
i915_address_space [noderef] __rcu *vm
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34: warning: incorrect 
type in argument 1 (different address spaces)




[Intel-gfx] ✗ Fi.CI.BAT: failure for remove rcu support from i915_address_space (rev4)

2021-08-05 Thread Patchwork
== Series Details ==

Series: remove rcu support from i915_address_space (rev4)
URL   : https://patchwork.freedesktop.org/series/93314/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10449 -> Patchwork_20772


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20772 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20772, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20772/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20772:

### IGT changes ###

 Possible regressions 

  * igt@i915_selftest@live@gt_lrc:
- fi-rkl-guc: [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20772/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html

  
Known issues


  Here are the changes found in Patchwork_20772 that come from known issues:

### IGT changes ###

 Possible fixes 

  * igt@kms_chamelium@dp-crc-fast:
- fi-kbl-7500u:   [FAIL][3] ([i915#1372]) -> [PASS][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/fi-kbl-7500u/igt@kms_chamel...@dp-crc-fast.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20772/fi-kbl-7500u/igt@kms_chamel...@dp-crc-fast.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [fdo#112080]: https://bugs.freedesktop.org/show_bug.cgi?id=112080
  [i915#1372]: https://gitlab.freedesktop.org/drm/intel/issues/1372
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#3301]: https://gitlab.freedesktop.org/drm/intel/issues/3301
  [i915#3844]: https://gitlab.freedesktop.org/drm/intel/issues/3844
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533
  [i915#579]: https://gitlab.freedesktop.org/drm/intel/issues/579


Participating hosts (39 -> 35)
--

  Additional (1): fi-jsl-1 
  Missing(5): fi-ilk-m540 fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 
fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_10449 -> Patchwork_20772

  CI-20190529: 20190529
  CI_DRM_10449: b0b7ea6dcb6afb51059e3ae01afece47c41fd0c1 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6160: 4287344dd6a39d9036c5fb9a047a7d8f10bee981 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20772: be11100f27eef517b2b401b8afe9401e9e599d0f @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

be11100f27ee drm/i915: Stop rcu support for i915_address_space
fea76eb28a60 drm/i915: use xa_lock/unlock for fpriv->vm_xa lookups
c1770ae51252 drm/i915: Drop __rcu from gem_context->vm
47f45eed8f19 drm/i915: Use i915_gem_context_get_eb_vm in intel_context_set_gem
22042c81dd12 drm/i915: Add i915_gem_context_is_full_ppgtt
37d02c39555f drm/i915: Use i915_gem_context_get_eb_vm in ctx_getparam
1bd404a5dddc drm/i915: Rename i915_gem_context_get_vm_rcu to 
i915_gem_context_get_eb_vm
0e6cb5de1e74 drm/i915: Drop code to handle set-vm races from execbuf

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20772/index.html


Re: [Intel-gfx] [PATCH v5 14/20] drm/sched: Don't store self-dependencies

2021-08-05 Thread Daniel Vetter
On Thu, Aug 5, 2021 at 3:18 PM Christian König  wrote:
>
>
>
> Am 05.08.21 um 12:46 schrieb Daniel Vetter:
> > This is essentially part of drm_sched_dependency_optimized(), which
> > only amdgpu seems to make use of. Use it a bit more.
> >
> > This would mean that as-is amdgpu can't use the dependency helpers, at
> > least not with the current approach amdgpu has for deciding whether a
> > vm_flush is needed. Since amdgpu also has very special rules around
> > implicit fencing it can't use those helpers either, and adding a
> > drm_sched_job_await_fence_always or similar for amdgpu wouldn't be too
> > onerous. That way the special case handling for amdgpu sticks even
> > more out and we have higher chances that reviewers that go across all
> > drivers won't miss it.
>
> Well you should probably drop the sentence about the vm_flush, this is
> completely unrelated.
>
> Additional to that I still don't think that this is a good idea.
> Dependency handling is something completely driver specific.
>
> E.g. even when you have submitted jobs back to back they still might
> need a cache flush in between, and that is not only the case for amdgpu.
>
> What you can do is optimize while looking at the fences later on, and
> then note that you have done so and what the last hw fence you used is.

Out of 6 drivers using drm/sched 5 can use this. When we get i915
over, that one will be added to the list. amdgpu can't use any of this
anyway due to the vm_id allocation requirements, which is why I
mention that. Also note that all the callbacks are still there, so you
can just ignore this all and still build your own. Like amdgpu does.

So I'm not sure what exactly your objection is, aside from "this doesn't
fit for amdgpu", which a) I know b) the commit message explains c)
doesn't actually hurt amdgpu in the slightest. And we still get the
benefit that for most drivers it's a nice optimization.
-Daniel

> Regards,
> Christian.
>
> >
> > Reviewed-by: Lucas Stach 
> > Acked-by: Melissa Wen 
> > Signed-off-by: Daniel Vetter 
> > Cc: "Christian König" 
> > Cc: Daniel Vetter 
> > Cc: Luben Tuikov 
> > Cc: Andrey Grodzovsky 
> > Cc: Alex Deucher 
> > ---
> >   drivers/gpu/drm/scheduler/sched_main.c | 7 +++
> >   1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> > b/drivers/gpu/drm/scheduler/sched_main.c
> > index f77456929139..49e507f91ec0 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -660,6 +660,13 @@ int drm_sched_job_add_dependency(struct drm_sched_job 
> > *job,
> >   if (!fence)
> >   return 0;
> >
> > + /* if it's a fence from us it's guaranteed to be earlier */
> > + if (fence->context == job->entity->fence_context ||
> > + fence->context == job->entity->fence_context + 1) {
> > + dma_fence_put(fence);
> > + return 0;
> > + }
> > +
> >   /* Deduplicate if we already depend on a fence from the same context.
> >* This lets the size of the array of deps scale with the number of
> >* engines involved, rather than the number of BOs.
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

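For reference on the context / context + 1 check: every scheduler
entity allocates two consecutive fence contexts, one for the scheduled
and one for the finished fence of its jobs. Roughly, per the entity and
fence init code (a sketch, not verbatim):

	/* drm_sched_entity_init() */
	entity->fence_context = dma_fence_context_alloc(2);

	/* drm_sched_fence_init() */
	dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
		       &fence->lock, entity->fence_context, seq);
	dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
		       &fence->lock, entity->fence_context + 1, seq);

A fence matching either context is therefore from an earlier job on the
same entity and guaranteed to signal first, so it can be dropped as a
dependency.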

Re: [Intel-gfx] [PATCH v5 15/20] drm/sched: Check locking in drm_sched_job_await_implicit

2021-08-05 Thread Daniel Vetter
On Thu, Aug 5, 2021 at 3:19 PM Christian König  wrote:
>
> Am 05.08.21 um 12:47 schrieb Daniel Vetter:
> > You really need to hold the reservation here or all kinds of funny
> > things can happen between grabbing the dependencies and inserting the
> > new fences.
> >
> > Acked-by: Melissa Wen 
> > Signed-off-by: Daniel Vetter 
> > Cc: "Christian König" 
> > Cc: Daniel Vetter 
> > Cc: Luben Tuikov 
> > Cc: Andrey Grodzovsky 
> > Cc: Alex Deucher 
>
> The function name in the subject line should be updated, apart from that
> feel free to add my rb to this patch.

Fixed locally and r-b added, I think the later parts of this series
will need to be resent anyway. Thanks for your review.
-Daniel

>
> Christian.
>
> > ---
> >   drivers/gpu/drm/scheduler/sched_main.c | 2 ++
> >   1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> > b/drivers/gpu/drm/scheduler/sched_main.c
> > index 49e507f91ec0..1abb40b07324 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -715,6 +715,8 @@ int drm_sched_job_add_implicit_dependencies(struct 
> > drm_sched_job *job,
> >   struct dma_fence **fences;
> >   unsigned int i, fence_count;
> >
> > + dma_resv_assert_held(obj->resv);
> > +
> >   if (!write) {
> >   struct dma_fence *fence = 
> > dma_resv_get_excl_unlocked(obj->resv);
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

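The caller-side pattern the assert enforces, sketched with placeholder
names (shared-fence path shown; the exclusive one is analogous):

	dma_resv_lock(obj->resv, NULL);

	ret = dma_resv_reserve_shared(obj->resv, 1);
	if (ret == 0)
		ret = drm_sched_job_add_implicit_dependencies(job, obj,
							      write);

	/* publish the new fence without dropping the lock in between,
	 * so no foreign fence can sneak into the window */
	if (ret == 0)
		dma_resv_add_shared_fence(obj->resv,
					  &job->s_fence->finished);

	dma_resv_unlock(obj->resv);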

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/sched dependency handling and implicit sync fixes

2021-08-05 Thread Patchwork
== Series Details ==

Series: drm/sched dependency handling and implicit sync fixes
URL   : https://patchwork.freedesktop.org/series/93415/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
fbee9b54db37 drm/sched: Split drm_sched_job_init
-:237: WARNING:UNSPECIFIED_INT: Prefer 'unsigned int' to bare use of 'unsigned'
#237: FILE: drivers/gpu/drm/scheduler/sched_fence.c:173:
+   unsigned seq;

-:333: WARNING:AVOID_BUG: Avoid crashing the kernel - try using WARN_ON & 
recovery code rather than BUG() or BUG_ON()
#333: FILE: drivers/gpu/drm/scheduler/sched_main.c:623:
+   BUG_ON(!entity);

-:402: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#402: FILE: include/drm/gpu_scheduler.h:391:
+struct drm_sched_fence *drm_sched_fence_alloc(

-:410: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 3 warnings, 1 checks, 249 lines checked
e8aa4762329c drm/msm: Fix drm/sched point of no return rules
-:74: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 39 lines checked
590f3db49271 drm/sched: Barriers are needed for entity->last_scheduled
-:88: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 43 lines checked
bcde98968570 drm/sched: Add dependency tracking
-:195: CHECK:LINE_SPACING: Please don't use multiple blank lines
#195: FILE: drivers/gpu/drm/scheduler/sched_main.c:729:
+
+

-:271: WARNING:TYPO_SPELLING: 'ommitted' may be misspelled - perhaps 'omitted'?
#271: FILE: include/drm/gpu_scheduler.h:244:
+* drm_sched_job_add_implicit_dependencies() this can be ommitted and
 

-:286: CHECK:LINE_SPACING: Please don't use multiple blank lines
#286: FILE: include/drm/gpu_scheduler.h:378:
+
+

-:289: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 2 warnings, 2 checks, 230 lines checked
4578cd7fba03 drm/sched: drop entity parameter from drm_sched_push_job
-:227: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 110 lines checked
3ed6103f784e drm/sched: improve docs around drm_sched_entity
-:17: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit 620e762f9a98 ("drm/scheduler: 
move entity handling into separate file")'
#17: 
  move here: 620e762f9a98 ("drm/scheduler: move entity handling into

-:413: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 1 errors, 1 warnings, 0 checks, 346 lines checked
42d75ea77670 drm/panfrost: use scheduler dependency tracking
-:214: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 158 lines checked
1074a5fc4e39 drm/lima: use scheduler dependency tracking
-:118: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 75 lines checked
e94f03075601 drm/v3d: Move drm_sched_job_init to v3d_job_init
-:344: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 288 lines checked
43177403c5b4 drm/v3d: Use scheduler dependency handling
-:207: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 162 lines checked
94e8ea6bdac4 drm/etnaviv: Use scheduler dependency handling
-:13: WARNING:REPEATED_WORD: Possible repeated word: 'to'
#13: 
I wanted to to in the previous round (and did, for all other drivers).

-:122: WARNING:LINE_SPACING: Missing a blank line after declarations
#122: FILE: drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:552:
+   struct dma_fence *in_fence = 
sync_file_get_fence(args->fence_fd);
+   if (!in_fence) {

-:297: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 3 warnings, 0 checks, 243 lines checked
914d59644238 drm/msm: Use scheduler dependency handling
-:132: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 chec

Re: [Intel-gfx] [PATCH v5 01/20] drm/sched: Split drm_sched_job_init

2021-08-05 Thread Daniel Vetter
On Thu, Aug 5, 2021 at 3:44 PM Christian König  wrote:
> Am 05.08.21 um 12:46 schrieb Daniel Vetter:
> > This is a very confusingly named function, because not only does it
> > init an object, it arms it and provides a point of no return for
> > pushing a job into the scheduler. It would be nice if that's a bit
> > clearer in the interface.
> >
> > But the real reason is that I want to push the dependency tracking
> > helpers into the scheduler code, and that means drm_sched_job_init
> > must be called a lot earlier, without arming the job.
> >
> > v2:
> > - don't change .gitignore (Steven)
> > - don't forget v3d (Emma)
> >
> > v3: Emma noticed that I leak the memory allocated in
> > drm_sched_job_init if we bail out before the point of no return in
> > subsequent driver patches. To be able to fix this change
> > drm_sched_job_cleanup() so it can handle being called both before and
> > after drm_sched_job_arm().
> >
> > Also improve the kerneldoc for this.
> >
> > v4:
> > - Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
> >usual (Melissa)
> >
> > - Christian pointed out that drm_sched_entity_select_rq() also needs
> >to be moved into drm_sched_job_arm, which made me realize that the
> >job->id definitely needs to be moved too.
> >
> >Shuffle things to fit between job_init and job_arm.
> >
> > v5:
> > Reshuffle the split between init/arm once more, amdgpu abuses
> > drm_sched.ready to signal gpu reset failures. Also document this
> > somewhat. (Christian)
> >
> > v6:
> > Rebase on top of the msm drm/sched support. Note that the
> > drm_sched_job_init() call is completely misplaced, and hence also the
> > split-out drm_sched_entity_push_job(). I've put in a FIXME which the next
> > patch will address.
> >
> > Acked-by: Melissa Wen 
> > Cc: Melissa Wen 
> > Acked-by: Emma Anholt 
> > Acked-by: Steven Price  (v2)
> > Reviewed-by: Boris Brezillon  (v5)
> > Signed-off-by: Daniel Vetter 
>
> At least the amdgpu parts look ok of hand, but I can't judge the rest I
> think.

The thing that really scares me here and that I got wrong a few times
is the cleanup for drm_sched_job at the various points. Can you give
those parts in drm/scheduler/ a full review pls, just to make sure? I
can note that in the tag ofc, I'd just like a bit more confidence here
that it's not busted :-)

> So only Acked-by: Christian König 

Thanks, Daniel

>
> > Cc: Lucas Stach 
> > Cc: Russell King 
> > Cc: Christian Gmeiner 
> > Cc: Qiang Yu 
> > Cc: Rob Herring 
> > Cc: Tomeu Vizoso 
> > Cc: Steven Price 
> > Cc: Alyssa Rosenzweig 
> > Cc: David Airlie 
> > Cc: Daniel Vetter 
> > Cc: Sumit Semwal 
> > Cc: "Christian König" 
> > Cc: Masahiro Yamada 
> > Cc: Kees Cook 
> > Cc: Adam Borowski 
> > Cc: Nick Terrell 
> > Cc: Mauro Carvalho Chehab 
> > Cc: Paul Menzel 
> > Cc: Sami Tolvanen 
> > Cc: Viresh Kumar 
> > Cc: Alex Deucher 
> > Cc: Dave Airlie 
> > Cc: Nirmoy Das 
> > Cc: Deepak R Varma 
> > Cc: Lee Jones 
> > Cc: Kevin Wang 
> > Cc: Chen Li 
> > Cc: Luben Tuikov 
> > Cc: "Marek Olšák" 
> > Cc: Dennis Li 
> > Cc: Maarten Lankhorst 
> > Cc: Andrey Grodzovsky 
> > Cc: Sonny Jiang 
> > Cc: Boris Brezillon 
> > Cc: Tian Tao 
> > Cc: etna...@lists.freedesktop.org
> > Cc: l...@lists.freedesktop.org
> > Cc: linux-me...@vger.kernel.org
> > Cc: linaro-mm-...@lists.linaro.org
> > Cc: Emma Anholt 
> > Cc: Rob Clark 
> > Cc: Sean Paul 
> > Cc: linux-arm-...@vger.kernel.org
> > Cc: freedr...@lists.freedesktop.org
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 +
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +
> >   drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 +
> >   drivers/gpu/drm/lima/lima_sched.c|  2 +
> >   drivers/gpu/drm/msm/msm_gem_submit.c |  3 ++
> >   drivers/gpu/drm/panfrost/panfrost_job.c  |  2 +
> >   drivers/gpu/drm/scheduler/sched_entity.c |  6 +--
> >   drivers/gpu/drm/scheduler/sched_fence.c  | 19 ---
> >   drivers/gpu/drm/scheduler/sched_main.c   | 69 
> >   drivers/gpu/drm/v3d/v3d_gem.c|  2 +
> >   include/drm/gpu_scheduler.h  |  7 ++-
> >   11 files changed, 94 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > index 139cd3bf1ad6..32e80bc6af22 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser 
> > *p,
> >   if (r)
> >   goto error_unlock;
> >
> > + drm_sched_job_arm(&job->base);
> > +
> >   /* No memory allocation is allowed while holding the notifier lock.
> >* The lock is held until amdgpu_cs_submit is finished and fence is
> >* added to BOs.
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > index d33e6d97cc89..5ddb955d2315 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > +++ b/drivers/gpu/
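
A minimal sketch of the driver-side flow this init/arm split enables,
using the interfaces named in the patch (drm_sched_entity_push_job()
still takes the entity at this point in the series; a later patch drops
that parameter). driver_collect_dependencies() is a hypothetical
placeholder for whatever dependency gathering a driver does, not a real
API:

int driver_submit(struct driver_job *job, struct drm_sched_entity *entity)
{
        int ret;

        /* Pure initialization; no fences are published yet. */
        ret = drm_sched_job_init(&job->base, entity, NULL);
        if (ret)
                return ret;

        /* Bailing out is still safe at this point. */
        ret = driver_collect_dependencies(job);         /* hypothetical */
        if (ret)
                goto err_cleanup;

        /* Point of no return: arm allocates the fences, push commits. */
        drm_sched_job_arm(&job->base);
        drm_sched_entity_push_job(&job->base, entity);
        return 0;

err_cleanup:
        /* Per this patch, safe to call before drm_sched_job_arm() too. */
        drm_sched_job_cleanup(&job->base);
        return ret;
}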

[Intel-gfx] ✗ Fi.CI.BUILD: failure for Provide core infrastructure for managing open/release (rev8)

2021-08-05 Thread Patchwork
== Series Details ==

Series: Provide core infrastructure for managing open/release (rev8)
URL   : https://patchwork.freedesktop.org/series/92556/
State : failure

== Summary ==

Applying: vfio/samples: Remove module get/put
Applying: vfio/mbochs: Fix missing error unwind of mbochs_used_mbytes
Applying: vfio: Introduce a vfio_uninit_group_dev() API call
Applying: vfio: Provide better generic support for open/release vfio_device_ops
Applying: vfio/samples: Delete useless open/close
Applying: vfio/fsl: Move to the device set infrastructure
Applying: vfio/platform: Use open_device() instead of open coding a refcnt 
scheme
Applying: vfio/pci: Move to the device set infrastructure
Applying: vfio/pci: Change vfio_pci_try_bus_reset() to use the dev_set
Applying: vfio/pci: Reorganize VFIO_DEVICE_PCI_HOT_RESET to use the device set
error: sha1 information is lacking or useless (drivers/vfio/pci/vfio_pci.c).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0010 vfio/pci: Reorganize VFIO_DEVICE_PCI_HOT_RESET to use the 
device set
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".




[Intel-gfx] ✗ Fi.CI.BAT: failure for drm/sched dependency handling and implicit sync fixes

2021-08-05 Thread Patchwork
== Series Details ==

Series: drm/sched dependency handling and implicit sync fixes
URL   : https://patchwork.freedesktop.org/series/93415/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10450 -> Patchwork_20773


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20773 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20773, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20773/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20773:

### IGT changes ###

 Possible regressions 

  * igt@i915_selftest@live@workarounds:
- fi-rkl-guc: [PASS][1] -> [DMESG-FAIL][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10450/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20773/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html

  
Known issues


  Here are the changes found in Patchwork_20773 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_basic@query-info:
- fi-bsw-kefka:   NOTRUN -> [SKIP][3] ([fdo#109271]) +17 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20773/fi-bsw-kefka/igt@amdgpu/amd_ba...@query-info.html

  * igt@runner@aborted:
- fi-rkl-guc: NOTRUN -> [FAIL][4] ([i915#3462])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20773/fi-rkl-guc/igt@run...@aborted.html

  
 Possible fixes 

  * igt@i915_selftest@live@execlists:
- fi-bsw-kefka:   [INCOMPLETE][5] ([i915#2940]) -> [PASS][6]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10450/fi-bsw-kefka/igt@i915_selftest@l...@execlists.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20773/fi-bsw-kefka/igt@i915_selftest@l...@execlists.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#2940]: https://gitlab.freedesktop.org/drm/intel/issues/2940
  [i915#3462]: https://gitlab.freedesktop.org/drm/intel/issues/3462


Participating hosts (40 -> 34)
--

  Missing(6): fi-ilk-m540 fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 
fi-bdw-samus bat-jsl-1 


Build changes
-

  * Linux: CI_DRM_10450 -> Patchwork_20773

  CI-20190529: 20190529
  CI_DRM_10450: 51d9c8293e8446e921b74d996982ade862fcfa5c @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6160: 4287344dd6a39d9036c5fb9a047a7d8f10bee981 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20773: 2836aa5b3f16292d5043778e200038dc658fa8b1 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

2836aa5b3f16 dma-resv: Give the docs a do-over
c023ae95f5f1 drm/i915: Don't break exclusive fence ordering
9a678298fbf9 drm/i915: delete exclude argument from 
i915_sw_fence_await_reservation
8d1e08eee56b drm/etnaviv: Don't break exclusive fence ordering
0691454484ff drm/msm: Don't break exclusive fence ordering
7768c5a01737 drm/sched: Check locking in drm_sched_job_await_implicit
a0a15e60e8a4 drm/sched: Don't store self-dependencies
c4596cb48171 drm/gem: Delete gem array fencing helpers
914d59644238 drm/msm: Use scheduler dependency handling
94e8ea6bdac4 drm/etnaviv: Use scheduler dependency handling
43177403c5b4 drm/v3d: Use scheduler dependency handling
e94f03075601 drm/v3d: Move drm_sched_job_init to v3d_job_init
1074a5fc4e39 drm/lima: use scheduler dependency tracking
42d75ea77670 drm/panfrost: use scheduler dependency tracking
3ed6103f784e drm/sched: improve docs around drm_sched_entity
4578cd7fba03 drm/sched: drop entity parameter from drm_sched_push_job
bcde98968570 drm/sched: Add dependency tracking
590f3db49271 drm/sched: Barriers are needed for entity->last_scheduled
e8aa4762329c drm/msm: Fix drm/sched point of no return rules
fbee9b54db37 drm/sched: Split drm_sched_job_init

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20773/index.html


[Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Be more gentle when exiting non-persistent contexts

2021-08-05 Thread Patchwork
== Series Details ==

Series: drm/i915: Be more gentle when exiting non-persistent contexts
URL   : https://patchwork.freedesktop.org/series/93420/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10450 -> Patchwork_20775


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20775 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20775, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20775:

### IGT changes ###

 Possible regressions 

  * igt@i915_selftest@live@gt_lrc:
- fi-rkl-guc: [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10450/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html

  
Known issues


  Here are the changes found in Patchwork_20775 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_basic@query-info:
- fi-bsw-kefka:   NOTRUN -> [SKIP][3] ([fdo#109271]) +17 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-bsw-kefka/igt@amdgpu/amd_ba...@query-info.html

  * igt@gem_exec_fence@basic-busy@bcs0:
- fi-kbl-soraka:  NOTRUN -> [SKIP][4] ([fdo#109271]) +26 similar issues
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-kbl-soraka/igt@gem_exec_fence@basic-b...@bcs0.html

  * igt@gem_huc_copy@huc-copy:
- fi-kbl-soraka:  NOTRUN -> [SKIP][5] ([fdo#109271] / [i915#2190])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-kbl-soraka/igt@gem_huc_c...@huc-copy.html

  * igt@i915_pm_rpm@basic-rte:
- fi-kbl-soraka:  NOTRUN -> [FAIL][6] ([i915#579])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-kbl-soraka/igt@i915_pm_...@basic-rte.html

  * igt@i915_selftest@live@gt_pm:
- fi-kbl-soraka:  NOTRUN -> [DMESG-FAIL][7] ([i915#1886] / [i915#2291])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-kbl-soraka/igt@i915_selftest@live@gt_pm.html

  * igt@i915_selftest@live@late_gt_pm:
- fi-bsw-nick:[PASS][8] -> [DMESG-FAIL][9] ([i915#2927])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10450/fi-bsw-nick/igt@i915_selftest@live@late_gt_pm.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-bsw-nick/igt@i915_selftest@live@late_gt_pm.html

  * igt@kms_chamelium@common-hpd-after-suspend:
- fi-kbl-soraka:  NOTRUN -> [SKIP][10] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-kbl-soraka/igt@kms_chamel...@common-hpd-after-suspend.html

  * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d:
- fi-kbl-soraka:  NOTRUN -> [SKIP][11] ([fdo#109271] / [i915#533])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-kbl-soraka/igt@kms_pipe_crc_ba...@compare-crc-sanitycheck-pipe-d.html

  * igt@runner@aborted:
- fi-bsw-nick:NOTRUN -> [FAIL][12] ([fdo#109271] / [i915#1436])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-bsw-nick/igt@run...@aborted.html

  
 Possible fixes 

  * igt@i915_selftest@live@execlists:
- fi-bsw-kefka:   [INCOMPLETE][13] ([i915#2940]) -> [PASS][14]
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10450/fi-bsw-kefka/igt@i915_selftest@l...@execlists.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/fi-bsw-kefka/igt@i915_selftest@l...@execlists.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1436]: https://gitlab.freedesktop.org/drm/intel/issues/1436
  [i915#1886]: https://gitlab.freedesktop.org/drm/intel/issues/1886
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2291]: https://gitlab.freedesktop.org/drm/intel/issues/2291
  [i915#2927]: https://gitlab.freedesktop.org/drm/intel/issues/2927
  [i915#2940]: https://gitlab.freedesktop.org/drm/intel/issues/2940
  [i915#3303]: https://gitlab.freedesktop.org/drm/intel/issues/3303
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533
  [i915#579]: https://gitlab.freedesktop.org/drm/intel/issues/579


Participating hosts (40 -> 35)
--

  Additional (1): fi-kbl-soraka 
  Missing(6): fi-ilk-m540 fi-hsw-420

Re: [Intel-gfx] [PATCH v5 14/20] drm/sched: Don't store self-dependencies

2021-08-05 Thread Daniel Vetter
On Thu, Aug 5, 2021 at 3:57 PM Christian König  wrote:
> Am 05.08.21 um 15:25 schrieb Daniel Vetter:
> > On Thu, Aug 5, 2021 at 3:18 PM Christian König  
> > wrote:
> >>
> >>
> >> Am 05.08.21 um 12:46 schrieb Daniel Vetter:
> >>> This is essentially part of drm_sched_dependency_optimized(), which
> >>> only amdgpu seems to make use of. Use it a bit more.
> >>>
> >>> This would mean that as-is amdgpu can't use the dependency helpers, at
> >>> least not with the current approach amdgpu has for deciding whether a
> >>> vm_flush is needed. Since amdgpu also has very special rules around
> >>> implicit fencing it can't use those helpers either, and adding a
> >>> drm_sched_job_await_fence_always or similar for amdgpu wouldn't be too
> >>> onerous. That way the special case handling for amdgpu sticks out
> >>> even more and we have higher chances that reviewers who go across all
> >>> drivers won't miss it.
> >> Well you should probably drop the sentence about the vm_flush, this is
> >> completely unrelated.
> >>
> >> In addition to that I still don't think that this is a good idea.
> >> Dependency handling is something completely driver specific.
> >>
> >> E.g. even when you have submitted jobs back to back they still might
> >> need a cache flush in between, and that is not only the case for amdgpu.
> >>
> >> What you can do is optimize for this while looking at the fences later
> >> on, and then note that you have done so and which hw fence you used
> >> last instead.
> > Out of 6 drivers using drm/sched 5 can use this. When we get i915
> > over, that one will be added to the list. amdgpu can't use any of this
> > anyway due to the vm_id allocation requirements, which is why I
> > mention that. Also note that all the callbacks are still there, so you
> > can just ignore this all and still build your own. Like amdgpu does.
>
> The VMID allocation stuff is rather easy to handle, that's why I noted
> we should remove that sentence.
>
> The problematic stuff is handling the cache flush and pipeline sync
> which you make impossible with this here.

Well the vmid is tied to the flush, but yeah the commit message
doesn't make this clear.

> > So I'm not sure what exactly your objection is, aside from "this doesn't
> > fit for amdgpu", which a) I know b) the commit message explains c)
> > doesn't actually hurt amdgpu in the slightest. And we still get the
> > benefit that for most drivers it's a nice optimization.
>
> Well exactly that's what I wanted to avoid. We still can use this in
> amdgpu even with the VMID allocation stuff and I still hope to do so.
>
> Can't we add this as a wrapper or similar?

This patch is not the only thing that will prevent you from using
these helpers, because amdgpu also needs to keep track of all the
fences in the xarray, which these helpers don't - they get cleared out
as we hand them off to the scheduler. So it's more surgery than just
not having this, and I'm honestly not sure it's worth it since you'd
need to duplicate quite a bit more than just the functions to add
dependencies.
-Daniel


> Christian.
>
> > -Daniel
> >
> >> Regards,
> >> Christian.
> >>
> >>> Reviewed-by: Lucas Stach 
> >>> Acked-by: Melissa Wen 
> >>> Signed-off-by: Daniel Vetter 
> >>> Cc: "Christian König" 
> >>> Cc: Daniel Vetter 
> >>> Cc: Luben Tuikov 
> >>> Cc: Andrey Grodzovsky 
> >>> Cc: Alex Deucher 
> >>> ---
> >>>drivers/gpu/drm/scheduler/sched_main.c | 7 +++
> >>>1 file changed, 7 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> >>> b/drivers/gpu/drm/scheduler/sched_main.c
> >>> index f77456929139..49e507f91ec0 100644
> >>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>> @@ -660,6 +660,13 @@ int drm_sched_job_add_dependency(struct 
> >>> drm_sched_job *job,
> >>>if (!fence)
> >>>return 0;
> >>>
> >>> + /* if it's a fence from us it's guaranteed to be earlier */
> >>> + if (fence->context == job->entity->fence_context ||
> >>> + fence->context == job->entity->fence_context + 1) {
> >>> + dma_fence_put(fence);
> >>> + return 0;
> >>> + }
> >>> +
> >>>/* Deduplicate if we already depend on a fence from the same 
> >>> context.
> >>> * This lets the size of the array of deps scale with the number of
> >>> * engines involved, rather than the number of BOs.
> >
>


--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
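
For context on the fence->context check in the hunk above: each
scheduler entity reserves two consecutive dma-fence contexts, one for
its scheduled fences and one for its finished fences. A rough sketch of
where those come from, based on the sched_entity/sched_fence code of
this era (simplified, not verbatim):

/* In drm_sched_entity_init(): two consecutive contexts are reserved. */
entity->fence_context = dma_fence_context_alloc(2);

/* In drm_sched_fence_init(): both fences of a job draw from them. */
dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
               &fence->lock, entity->fence_context, seq);
dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
               &fence->lock, entity->fence_context + 1, seq);

/*
 * So a dependency fence whose context matches either value was emitted
 * by this same entity and is guaranteed to signal no later than the new
 * job, which is why drm_sched_job_add_dependency() can drop it.
 */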


Re: [Intel-gfx] [PATCH v5 01/20] drm/sched: Split drm_sched_job_init

2021-08-05 Thread Daniel Vetter
On Thu, Aug 5, 2021 at 4:47 PM Christian König  wrote:
>
> Am 05.08.21 um 16:07 schrieb Daniel Vetter:
> > On Thu, Aug 5, 2021 at 3:44 PM Christian König  
> > wrote:
> >> Am 05.08.21 um 12:46 schrieb Daniel Vetter:
> >>> This is a very confusingly named function, because not just does it
> >>> init an object, it arms it and provides a point of no return for
> >>> pushing a job into the scheduler. It would be nice if that's a bit
> >>> clearer in the interface.
> >>>
> >>> But the real reason is that I want to push the dependency tracking
> >>> helpers into the scheduler code, and that means drm_sched_job_init
> >>> must be called a lot earlier, without arming the job.
> >>>
> >>> v2:
> >>> - don't change .gitignore (Steven)
> >>> - don't forget v3d (Emma)
> >>>
> >>> v3: Emma noticed that I leak the memory allocated in
> >>> drm_sched_job_init if we bail out before the point of no return in
> >>> subsequent driver patches. To be able to fix this change
> >>> drm_sched_job_cleanup() so it can handle being called both before and
> >>> after drm_sched_job_arm().
> >>>
> >>> Also improve the kerneldoc for this.
> >>>
> >>> v4:
> >>> - Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
> >>> usual (Melissa)
> >>>
> >>> - Christian pointed out that drm_sched_entity_select_rq() also needs
> >>> to be moved into drm_sched_job_arm, which made me realize that the
> >>> job->id definitely needs to be moved too.
> >>>
> >>> Shuffle things to fit between job_init and job_arm.
> >>>
> >>> v5:
> >>> Reshuffle the split between init/arm once more, amdgpu abuses
> >>> drm_sched.ready to signal gpu reset failures. Also document this
> >>> somewhat. (Christian)
> >>>
> >>> v6:
> >>> Rebase on top of the msm drm/sched support. Note that the
> >>> drm_sched_job_init() call is completely misplaced, and hence also the
> >>> split-out drm_sched_entity_push_job(). I've put in a FIXME which the next
> >>> patch will address.
> >>>
> >>> Acked-by: Melissa Wen 
> >>> Cc: Melissa Wen 
> >>> Acked-by: Emma Anholt 
> >>> Acked-by: Steven Price  (v2)
> >>> Reviewed-by: Boris Brezillon  (v5)
> >>> Signed-off-by: Daniel Vetter 
> >> At least the amdgpu parts look ok off hand, but I can't judge the rest I
> >> think.
> > The thing that really scares me here, and that I got wrong a few times,
> > is the cleanup for drm_sched_job at the various points. Can you give
> > those parts in drm/scheduler/ a full review pls, just to make sure? I
> > can note that in the tag ofc, I'd just like a bit more confidence here
> > that it's not busted :-)
>
> I can take another look, but I won't have time for that in the next two
> weeks - vacation and kid starting school.

Hm ok I'll ask others, since this is kinda needed for the msm fix. At
least the msm design relies on this split being present, so fixing it
without this split here would be a pile of rather pointless work.
-Daniel

> Christian.
>
> >
> >> So only Acked-by: Christian König 
> > Thanks, Daniel
> >
> >>> Cc: Lucas Stach 
> >>> Cc: Russell King 
> >>> Cc: Christian Gmeiner 
> >>> Cc: Qiang Yu 
> >>> Cc: Rob Herring 
> >>> Cc: Tomeu Vizoso 
> >>> Cc: Steven Price 
> >>> Cc: Alyssa Rosenzweig 
> >>> Cc: David Airlie 
> >>> Cc: Daniel Vetter 
> >>> Cc: Sumit Semwal 
> >>> Cc: "Christian König" 
> >>> Cc: Masahiro Yamada 
> >>> Cc: Kees Cook 
> >>> Cc: Adam Borowski 
> >>> Cc: Nick Terrell 
> >>> Cc: Mauro Carvalho Chehab 
> >>> Cc: Paul Menzel 
> >>> Cc: Sami Tolvanen 
> >>> Cc: Viresh Kumar 
> >>> Cc: Alex Deucher 
> >>> Cc: Dave Airlie 
> >>> Cc: Nirmoy Das 
> >>> Cc: Deepak R Varma 
> >>> Cc: Lee Jones 
> >>> Cc: Kevin Wang 
> >>> Cc: Chen Li 
> >>> Cc: Luben Tuikov 
> >>> Cc: "Marek Olšák" 
> >>> Cc: Dennis Li 
> >>> Cc: Maarten Lankhorst 
> >>> Cc: Andrey Grodzovsky 
> >>> Cc: Sonny Jiang 
> >>> Cc: Boris Brezillon 
> >>> Cc: Tian Tao 
> >>> Cc: etna...@lists.freedesktop.org
> >>> Cc: l...@lists.freedesktop.org
> >>> Cc: linux-me...@vger.kernel.org
> >>> Cc: linaro-mm-...@lists.linaro.org
> >>> Cc: Emma Anholt 
> >>> Cc: Rob Clark 
> >>> Cc: Sean Paul 
> >>> Cc: linux-arm-...@vger.kernel.org
> >>> Cc: freedr...@lists.freedesktop.org
> >>> ---
> >>>drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 +
> >>>drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +
> >>>drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 +
> >>>drivers/gpu/drm/lima/lima_sched.c|  2 +
> >>>drivers/gpu/drm/msm/msm_gem_submit.c |  3 ++
> >>>drivers/gpu/drm/panfrost/panfrost_job.c  |  2 +
> >>>drivers/gpu/drm/scheduler/sched_entity.c |  6 +--
> >>>drivers/gpu/drm/scheduler/sched_fence.c  | 19 ---
> >>>drivers/gpu/drm/scheduler/sched_main.c   | 69 
> >>>drivers/gpu/drm/v3d/v3d_gem.c|  2 +
> >>>include/drm/gpu_scheduler.h  |  7 ++-
> >>>11 files changed, 94 insertions(+), 22 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>> 

Re: [Intel-gfx] [PATCH v4 05/18] drm/i915/dg2: Add SQIDI steering

2021-08-05 Thread Matt Roper
On Wed, Aug 04, 2021 at 01:22:17PM -0700, Lucas De Marchi wrote:
> On Thu, Jul 29, 2021 at 09:59:55AM -0700, Matt Roper wrote:
> > Although DG2_G10 platforms will always have all SQIDI's present and
> > don't need steering for registers in a SQIDI MMIO range, this isn't true
> > for DG2_G11 platforms; only SQIDI's 2 and 3 can be used on those.
> > 
> > We handle SQIDI ranges a bit differently from other types of explicit
> > steering.  The SQIDI ranges belong to either the MCFG unit or the SF
> > unit, both of which have their own dedicated steering registers and do
> > not use the typical 0xFDC steering control that all other types of
> > ranges use.  Thus we only need to worry about picking a valid initial
> > value for the MCFG and SF steering registers (0xFD0 and 0xFD8
> > resepectively) at driver init; they won't change after we set them up so
> 
> respectively
> 
> > we don't need to worry about re-steering them explicitly at runtime.
> > 
> > Given that any SQIDI value should work fine for DG2-G10 and XeHP SDV,
> > while only values of 2 and 3 are valid for DG2-G11, we'll just
> > initialize the MCFG and SF steering registers to a constant value of "2"
> > for all XeHP-based platforms for simplicity --- that will work in all
> > cases.
> > 
> > Bspec: 66534
> > Cc: Radhakrishna Sripada 
> > Signed-off-by: Matt Roper 
> > ---
> > drivers/gpu/drm/i915/gt/intel_workarounds.c | 28 +
> > drivers/gpu/drm/i915/i915_reg.h |  2 ++
> > 2 files changed, 25 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
> > b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > index 8717337a6c81..6895b083523d 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > @@ -889,17 +889,24 @@ cfl_gt_workarounds_init(struct drm_i915_private 
> > *i915, struct i915_wa_list *wal)
> > GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
> > }
> > 
> > -static void __add_mcr_wa(struct drm_i915_private *i915, struct 
> > i915_wa_list *wal,
> > -unsigned slice, unsigned subslice)
> > +static void __set_mcr_steering(struct i915_wa_list *wal,
> > +  i915_reg_t steering_reg,
> > +  unsigned int slice, unsigned int subslice)
> > {
> > u32 mcr, mcr_mask;
> > 
> > mcr = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
> > mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
> > 
> > -   drm_dbg(&i915->drm, "MCR slice/subslice = %x\n", mcr);
> > +   wa_write_clr_set(wal, steering_reg, mcr_mask, mcr);
> > +}
> > +
> > +static void __add_mcr_wa(struct drm_i915_private *i915, struct 
> > i915_wa_list *wal,
> > +unsigned int slice, unsigned int subslice)
> > +{
> > +   drm_dbg(&i915->drm, "MCR slice=0x%x, subslice=0x%x\n", slice, subslice);
> 
> maybe we could leave the debug message in __set_mcr_steering() and add
> what steering register we are setting? Up to you.
> 

I've got a separate patch that adds more clear steering debug
information via a drm_printer and then prints it both in the dmesg log
and in a new debugfs node.  The patch depends on some debugfs changes
that haven't shown up yet so I didn't include it here, but I'll rebase
and send it soon if the debugfs changes don't happen first.


Matt

> 
> Reviewed-by: Lucas De Marchi 
> 
> 
> > 
> > -   wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
> > +   __set_mcr_steering(wal, GEN8_MCR_SELECTOR, slice, subslice);
> > }
> > 
> > static void
> > @@ -953,7 +960,6 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list 
> > *wal)
> >  * - L3 Bank (fusable)
> >  * - MSLICE (fusable)
> >  * - LNCF (sub-unit within mslice; always present if mslice is present)
> > -* - SQIDI (always on)
> >  *
> >  * We'll do our default/implicit steering based on GSLICE (in the
> >  * sliceid field) and DSS (in the subsliceid field).  If we can
> > @@ -1003,6 +1009,18 @@ xehp_init_mcr(struct intel_gt *gt, struct 
> > i915_wa_list *wal)
> > WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
> > 
> > __add_mcr_wa(i915, wal, slice, subslice);
> > +
> > +   /*
> > +* SQIDI ranges are special because they use different steering
> > +* registers than everything else we work with.  On XeHP SDV and
> > +* DG2-G10, any value in the steering registers will work fine since
> > +* all instances are present, but DG2-G11 only has SQIDI instances at
> > +* ID's 2 and 3, so we need to steer to one of those.  For simplicity
> > +* we'll just steer to a hardcoded "2" since that value will work
> > +* everywhere.
> > +*/
> > +   __set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
> > +   __set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
> > }
> > 
> > static void
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h 
> > b/drivers/gpu/drm/i915/i915_reg.h
> > index f4113e7e8271..39ce6befff52 100644
> > --- a/driv

[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/dp: Use max params for older panels

2021-08-05 Thread Patchwork
== Series Details ==

Series: drm/i915/dp: Use max params for older panels
URL   : https://patchwork.freedesktop.org/series/93390/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10445_full -> Patchwork_20769_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Known issues


  Here are the changes found in Patchwork_20769_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@feature_discovery@display-2x:
- shard-iclb: NOTRUN -> [SKIP][1] ([i915#1839])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-iclb3/igt@feature_discov...@display-2x.html

  * igt@gem_ctx_persistence@hostile:
- shard-snb:  NOTRUN -> [SKIP][2] ([fdo#109271] / [i915#1099]) +1 
similar issue
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-snb6/igt@gem_ctx_persiste...@hostile.html

  * igt@gem_ctx_persistence@legacy-engines-hang@render:
- shard-tglb: [PASS][3] -> [FAIL][4] ([i915#2410])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-tglb2/igt@gem_ctx_persistence@legacy-engines-h...@render.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-tglb6/igt@gem_ctx_persistence@legacy-engines-h...@render.html

  * igt@gem_eio@unwedge-stress:
- shard-tglb: [PASS][5] -> [TIMEOUT][6] ([i915#2369] / [i915#3063] 
/ [i915#3648])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-tglb5/igt@gem_...@unwedge-stress.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-tglb8/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_fair@basic-deadline:
- shard-apl:  NOTRUN -> [FAIL][7] ([i915#2846])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-apl8/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-none@vcs1:
- shard-kbl:  [PASS][8] -> [FAIL][9] ([i915#2842])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-kbl3/igt@gem_exec_fair@basic-n...@vcs1.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-kbl4/igt@gem_exec_fair@basic-n...@vcs1.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
- shard-tglb: [PASS][10] -> [FAIL][11] ([i915#2842]) +1 similar 
issue
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-tglb6/igt@gem_exec_fair@basic-pace-sh...@rcs0.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-tglb2/igt@gem_exec_fair@basic-pace-sh...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
- shard-iclb: NOTRUN -> [FAIL][12] ([i915#2842]) +1 similar issue
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-iclb4/igt@gem_exec_fair@basic-p...@vcs1.html

  * igt@gem_exec_fair@basic-pace@vecs0:
- shard-kbl:  [PASS][13] -> [SKIP][14] ([fdo#109271])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-kbl7/igt@gem_exec_fair@basic-p...@vecs0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-kbl6/igt@gem_exec_fair@basic-p...@vecs0.html

  * igt@gem_exec_schedule@u-semaphore-user:
- shard-snb:  NOTRUN -> [SKIP][15] ([fdo#109271]) +209 similar 
issues
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-snb5/igt@gem_exec_sched...@u-semaphore-user.html

  * igt@gem_mmap_gtt@cpuset-medium-copy-odd:
- shard-iclb: [PASS][16] -> [FAIL][17] ([i915#2428])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-iclb3/igt@gem_mmap_...@cpuset-medium-copy-odd.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-iclb7/igt@gem_mmap_...@cpuset-medium-copy-odd.html

  * igt@gem_pwrite@basic-exhaustion:
- shard-snb:  NOTRUN -> [WARN][18] ([i915#2658])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-snb5/igt@gem_pwr...@basic-exhaustion.html

  * igt@i915_pm_lpsp@kms-lpsp@kms-lpsp-dp:
- shard-apl:  NOTRUN -> [SKIP][19] ([fdo#109271] / [i915#1937])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-apl2/igt@i915_pm_lpsp@kms-l...@kms-lpsp-dp.html

  * igt@i915_suspend@forcewake:
- shard-apl:  NOTRUN -> [DMESG-WARN][20] ([i915#180])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-apl8/igt@i915_susp...@forcewake.html

  * igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-0-hflip:
- shard-skl:  NOTRUN -> [SKIP][21] ([fdo#109271] / [i915#3777])
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-skl1/igt@kms_big...@y-tiled-max-hw-stride-32bpp-rotate-0-hflip.html

  * igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-180-async-flip:
- shard-skl:  NOTRUN -> [FAIL][22] ([i915#3722])
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20769/shard-skl5/igt@kms_big...@y-tiled-max-hw-stride-32bpp-rotate-180

Re: [Intel-gfx] [PATCH v3 09/14] vfio/pci: Change vfio_pci_try_bus_reset() to use the dev_set

2021-08-05 Thread Jason Gunthorpe
On Tue, Aug 03, 2021 at 10:52:25AM -0600, Alex Williamson wrote:
> On Tue, 3 Aug 2021 13:41:52 -0300
> Jason Gunthorpe  wrote:
> > On Tue, Aug 03, 2021 at 10:34:06AM -0600, Alex Williamson wrote:
> > > I think the vfio_pci_find_reset_target() function needs to be re-worked
> > > to just tell us true/false that it's ok to reset the provided device,
> > > not to anoint an arbitrary target device.  Thanks,  
> > 
> > Yes, though this logic is confusing, why do we need to check if any
> > device needs a reset at this point? If we are being asked to reset
> > vdev shouldn't vdev needs_reset?
> > 
> > Or is the function more of a 'synchronize pending reset' kind of
> > thing?
> 
> Yes, the latter.  For instance think about a multi-function PCI device
> such as a GPU.  The functions have dramatically different capabilities,
> some might have function level reset abilities and others not.  We want
> to be able to trigger a bus reset as the last device of the set is
> released, no matter the order they're released and no matter the
> capabilities of the device we're currently processing.  Thanks,

I worked on this for a while; I think this is much clearer about what
this algorithm is trying to do:

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 5d6db93d6c680f..e418bcbb68facc 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -223,7 +223,7 @@ static void vfio_pci_probe_mmaps(struct vfio_pci_device 
*vdev)
}
 }
 
-static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev);
+static bool vfio_pci_dev_set_try_reset(struct vfio_device_set *dev_set);
 static void vfio_pci_disable(struct vfio_pci_device *vdev);
 static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void *data);
 
@@ -404,6 +404,9 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev)
struct vfio_pci_ioeventfd *ioeventfd, *ioeventfd_tmp;
int i, bar;
 
+   /* For needs_reset */
+   lockdep_assert_held(&vdev->vdev.dev_set->lock);
+
/* Stop the device from further DMA */
pci_clear_master(pdev);
 
@@ -487,9 +490,7 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev)
 out:
pci_disable_device(pdev);
 
-   vfio_pci_try_bus_reset(vdev);
-
-   if (!disable_idle_d3)
+   if (!vfio_pci_dev_set_try_reset(vdev->vdev.dev_set) && !disable_idle_d3)
vfio_pci_set_power_state(vdev, PCI_D3hot);
 }
 
@@ -2145,36 +2146,6 @@ static struct pci_driver vfio_pci_driver = {
.err_handler= &vfio_err_handlers,
 };
 
-static int vfio_pci_get_unused_devs(struct pci_dev *pdev, void *data)
-{
-   struct vfio_devices *devs = data;
-   struct vfio_device *device;
-   struct vfio_pci_device *vdev;
-
-   if (devs->cur_index == devs->max_index)
-   return -ENOSPC;
-
-   device = vfio_device_get_from_dev(&pdev->dev);
-   if (!device)
-   return -EINVAL;
-
-   if (pci_dev_driver(pdev) != &vfio_pci_driver) {
-   vfio_device_put(device);
-   return -EBUSY;
-   }
-
-   vdev = container_of(device, struct vfio_pci_device, vdev);
-
-   /* Fault if the device is not unused */
-   if (device->open_count) {
-   vfio_device_put(device);
-   return -EBUSY;
-   }
-
-   devs->devices[devs->cur_index++] = vdev;
-   return 0;
-}
-
 static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void *data)
 {
struct vfio_devices *devs = data;
@@ -2208,79 +2179,86 @@ static int vfio_pci_try_zap_and_vma_lock_cb(struct 
pci_dev *pdev, void *data)
return 0;
 }
 
+static int vfio_pci_is_device_in_set(struct pci_dev *pdev, void *data)
+{
+   struct vfio_device_set *dev_set = data;
+   struct vfio_device *cur;
+
+   lockdep_assert_held(&dev_set->lock);
+
+   list_for_each_entry(cur, &dev_set->device_list, dev_set_list)
+   if (cur->dev == &pdev->dev)
+   return 0;
+   return -EBUSY;
+}
+
+static bool vfio_pci_dev_set_needs_reset(struct vfio_device_set *dev_set)
+{
+   struct vfio_pci_device *cur;
+   bool needs_reset = false;
+
+   list_for_each_entry(cur, &dev_set->device_list, vdev.dev_set_list) {
+   /* No VFIO device in the set can have an open device FD */
+   if (cur->vdev.open_count)
+   return false;
+   needs_reset |= cur->needs_reset;
+   }
+   return needs_reset;
+}
+
 /*
- * If a bus or slot reset is available for the provided device and:
+ * If a bus or slot reset is available for the provided dev_set and:
  *  - All of the devices affected by that bus or slot reset are unused
- *(!refcnt)
  *  - At least one of the affected devices is marked dirty via
  *needs_reset (such as by lack of FLR support)
- * Then attempt to perform that bus or slot reset.  Callers are required
- * to hold vdev->dev_set->lock, protecting the bus/slot reset group f
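
A rough sketch of how the two helpers above might compose into the
try-reset entry point. vfio_pci_for_each_slot_or_bus(),
pci_probe_reset_slot() and pci_reset_bus() follow the existing
vfio-pci/PCI core code, but the exact shape below is an assumption, not
the final patch:

static bool vfio_pci_dev_set_try_reset(struct vfio_device_set *dev_set)
{
        struct vfio_pci_device *cur;
        struct pci_dev *pdev;
        bool reset_done = false;

        lockdep_assert_held(&dev_set->lock);

        /* Some device must be dirty and no device FD may be open. */
        if (!vfio_pci_dev_set_needs_reset(dev_set))
                return false;

        pdev = list_first_entry(&dev_set->device_list,
                                struct vfio_pci_device,
                                vdev.dev_set_list)->pdev;

        /* Every device hit by the bus/slot reset must be in this set. */
        if (vfio_pci_for_each_slot_or_bus(pdev, vfio_pci_is_device_in_set,
                                          dev_set,
                                          !pci_probe_reset_slot(pdev->slot)))
                return false;

        if (!pci_reset_bus(pdev))
                reset_done = true;

        list_for_each_entry(cur, &dev_set->device_list, vdev.dev_set_list)
                if (reset_done)
                        cur->needs_reset = false;

        return reset_done;
}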

Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Be more gentle when exiting non-persistent contexts

2021-08-05 Thread Tvrtko Ursulin



On 05/08/2021 16:04, Patchwork wrote:

*Patch Details*
*Series:*  drm/i915: Be more gentle when exiting non-persistent contexts
*URL:*     https://patchwork.freedesktop.org/series/93420/
*State:*   failure
*Details:* https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/index.html

  CI Bug Log - changes from CI_DRM_10450 -> Patchwork_20775

Summary

  **FAILURE**

  Serious unknown changes coming with Patchwork_20775 absolutely need to be
  verified manually.

  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20775, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL:
  https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/index.html

Possible new issues

  Here are the unknown changes that may have been introduced in
  Patchwork_20775:

### IGT changes ###

 Possible regressions 

  * igt@i915_selftest@live@gt_lrc:
    - fi-rkl-guc: [PASS] -> [DMESG-WARN]

<6> [233.928677] i915: Running intel_lrc_live_selftests/live_lrc_isolation
<3> [233.988780] i915 :00:02.0: [drm] *ERROR* rcs0 context redzone 
overwritten!

Something GuC specific by the look of it, or at least I haven't found the same 
signature elsewhere. But in any case it is not related to this patch.

Regards,

Tvrtko




Known issues

  Here are the changes found in Patchwork_20775 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_basic@query-info:
    - fi-bsw-kefka:   NOTRUN -> [SKIP] ([fdo#109271]) +17 similar issues

  * igt@gem_exec_fence@basic-busy@bcs0:
    - fi-kbl-soraka:  NOTRUN -> [SKIP] ([fdo#109271]) +26 similar issues

  * igt@gem_huc_copy@huc-copy:
    - fi-kbl-soraka:  NOTRUN -> [SKIP] ([fdo#109271] / [i915#2190])

  * igt@i915_pm_rpm@basic-rte:
    - fi-kbl-soraka:  NOTRUN -> [FAIL] ([i915#579])

  * igt@i915_selftest@live@gt_pm:
    - fi-kbl-soraka:  NOTRUN -> [DMESG-FAIL] ([i915#1886] / [i915#2291])

  * igt@i915_selftest@live@late_gt_pm:
    - fi-bsw-nick:    [PASS] -> [DMESG-FAIL] ([i915#2927])

  * igt@kms_chamelium@common-hpd-after-suspend:
    - fi-kbl-soraka:  NOTRUN -> [SKIP] ([fdo#109271] / [fdo#111827]) +8
      similar issues

  * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d:
    - fi-kbl-soraka:  NOTRUN -> [SKIP] ([fdo#109271] / [i915#533])

  * igt@runner@aborted:
    - fi-bsw-nick:    NOTRUN -> [FAIL] ([fdo#109271] / [i915#1436])

 Possible fixe

Re: [Intel-gfx] [PATCH] drm/i915: Be more gentle when exiting non-persistent contexts

2021-08-05 Thread Matthew Brost
On Thu, Aug 05, 2021 at 01:05:09PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin 
> 
> When a non-persistent context exits we currently mark it as banned in
> order to trigger fast termination of any outstanding GPU jobs it may have
> left running.
> 
> In doing so we apply a very strict 1ms limit in which the left over job
> has to preempt before we issue an engine reset.
> 
> Some workloads are not able to cleanly preempt in that time window and it
> can be argued that it would instead be better to give them a bit more
> grace since avoiding engine resets is generally preferable.
> 
> To achieve this the patch splits handling of banned contexts from simply
> closed non-persistent ones and then applies different timeouts for both
> and also extends the criteria which determines if a request should be
> scheduled back in after preemption or not.
> 
> 15ms preempt timeout grace is given to exited non-persistent contexts
> which has been empirically tested to satisfy customers' requirements
> and still provides reasonably quick cleanup post exit.
> 

I think you need to rework your thinking here a bit as this is a very
execlists-specific solution and the GuC needs to be considered.

> v2:
>  * Streamline fast path checks.
> 
> v3:
>  * Simplify by using only schedulable status.
>  * Increase timeout to 20ms.
> 
> v4:
>  * Fix live_execlists selftest.
> 
> v5:
>  * Fix logic in kill_engines.
> 
> v6:
>  * Rebase.
> 
> Signed-off-by: Tvrtko Ursulin 
> Cc: Chris Wilson 
> Cc: Zhen Han 
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 22 +--
>  drivers/gpu/drm/i915/gt/intel_context.c   |  2 ++
>  drivers/gpu/drm/i915/gt/intel_context.h   | 17 +-
>  drivers/gpu/drm/i915/gt/intel_context_types.h |  1 +
>  .../drm/i915/gt/intel_execlists_submission.c  | 11 --
>  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 20 +++--
>  drivers/gpu/drm/i915/i915_request.c   |  2 +-
>  7 files changed, 57 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index cff72679ad7c..21fe5d4057ab 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1065,7 +1065,8 @@ static struct intel_engine_cs *active_engine(struct 
> intel_context *ce)
>   return engine;
>  }
>  
> -static void kill_engines(struct i915_gem_engines *engines, bool ban)
> +static void
> +kill_engines(struct i915_gem_engines *engines, bool ban, bool persistent)
>  {
>   struct i915_gem_engines_iter it;
>   struct intel_context *ce;
> @@ -1079,8 +1080,15 @@ static void kill_engines(struct i915_gem_engines 
> *engines, bool ban)
>*/
>   for_each_gem_engine(ce, engines, it) {
>   struct intel_engine_cs *engine;
> + bool skip = false;
> +
> + if (ban)
> + skip = intel_context_ban(ce, NULL);
> + else if (!persistent)
> + skip = !intel_context_clear_schedulable(ce);

schedulable doesn't hook into the backend at all, while
intel_context_ban does. In the case of GuC submission intel_context_ban
changes the preemption timeout to 1 us and disables scheduling, resulting
in the context getting kicked off the hardware immediately. You likely
need to update intel_context_clear_schedulable to use the same vfunc as
intel_context_ban() but accept an argument for the value of the
preemption timeout: for a ban use a lower value, for clearing
schedulable use a higher value.

>  
> - if (ban && intel_context_ban(ce, NULL))
> + /* Already previously banned or made non-schedulable? */
> + if (skip)
>   continue;
>  
>   /*
> @@ -1093,7 +1101,7 @@ static void kill_engines(struct i915_gem_engines 
> *engines, bool ban)
>   engine = active_engine(ce);
>  
>   /* First attempt to gracefully cancel the context */
> - if (engine && !__cancel_engine(engine) && ban)
> + if (engine && !__cancel_engine(engine) && (ban || !persistent))
>   /*
>* If we are unable to send a preemptive pulse to bump
>* the context from the GPU, we have to resort to a full
> @@ -1105,8 +1113,6 @@ static void kill_engines(struct i915_gem_engines 
> *engines, bool ban)
>  
>  static void kill_context(struct i915_gem_context *ctx)
>  {
> - bool ban = (!i915_gem_context_is_persistent(ctx) ||
> - !ctx->i915->params.enable_hangcheck);
>   struct i915_gem_engines *pos, *next;
>  
>   spin_lock_irq(&ctx->stale.lock);
> @@ -1119,7 +1125,8 @@ static void kill_context(struct i915_gem_context *ctx)
>  
>   spin_unlock_irq(&ctx->stale.lock);
>  
> - kill_engines(pos, ban);
> + kill_engines(pos, !ctx->i915->params.enable_hangcheck,
> + 
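
A rough sketch of the single backend hook Matt suggests above; every
name below (intel_context_revoke(), the ce->ops->revoke vfunc, the
timeout values) is illustrative, not existing i915 API:

/* One entry point for both banning and non-persistent context exit. */
static bool intel_context_revoke(struct intel_context *ce,
                                 struct i915_request *rq,
                                 unsigned int preempt_timeout_ms)
{
        bool ret = intel_context_set_banned(ce);

        /*
         * The submission backend (execlists or GuC) applies the requested
         * preemption timeout and kicks the context off the hardware, e.g.
         * by disabling scheduling in the GuC case.
         */
        if (ce->ops->revoke)
                ce->ops->revoke(ce, rq, preempt_timeout_ms);

        return ret;
}

/*
 * Callers then express the urgency directly, e.g.:
 *   intel_context_revoke(ce, NULL, 1);   - banned: terminate ASAP
 *   intel_context_revoke(ce, NULL, 20);  - closed, non-persistent: grace
 */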

[Intel-gfx] [PATCH v5 3/9] drm/i915/dg2: Report INSTDONE_GEOM values in error state

2021-08-05 Thread Matt Roper
Xe_HPG adds some additional INSTDONE_GEOM debug registers; the Mesa team
has indicated that having these reported in the error state would be
useful for debugging GPU hangs.  These registers are replicated per-DSS
with gslice steering.

Cc: Lionel Landwerlin 
Signed-off-by: Matt Roper 
Acked-by: Lionel Landwerlin 
Reviewed-by: Matt Atwood 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c|  7 +++
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  3 +++
 drivers/gpu/drm/i915/i915_gpu_error.c| 10 --
 drivers/gpu/drm/i915/i915_reg.h  |  1 +
 4 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 58ed67894b3d..332efea696a5 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1202,6 +1202,13 @@ void intel_engine_get_instdone(const struct 
intel_engine_cs *engine,
  GEN7_ROW_INSTDONE);
}
}
+
+   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
+   for_each_instdone_gslice_dss_xehp(i915, sseu, iter, 
slice, subslice)
+   instdone->geom_svg[slice][subslice] =
+   read_subslice_reg(engine, slice, 
subslice,
+ 
XEHPG_INSTDONE_GEOM_SVG);
+   }
} else if (GRAPHICS_VER(i915) >= 7) {
instdone->instdone =
intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 0b4846b01626..bfbfe53c23dd 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -69,6 +69,9 @@ struct intel_instdone {
u32 slice_common_extra[2];
u32 sampler[GEN_MAX_GSLICES][I915_MAX_SUBSLICES];
u32 row[GEN_MAX_GSLICES][I915_MAX_SUBSLICES];
+
+   /* Added in XeHPG */
+   u32 geom_svg[GEN_MAX_GSLICES][I915_MAX_SUBSLICES];
 };
 
 /*
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 8230bc3ac8a9..91d5da7b0a2b 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -431,6 +431,7 @@ static void error_print_instdone(struct 
drm_i915_error_state_buf *m,
const struct sseu_dev_info *sseu = &ee->engine->gt->info.sseu;
int slice;
int subslice;
+   int iter;
 
err_printf(m, "  INSTDONE: 0x%08x\n",
   ee->instdone.instdone);
@@ -445,8 +446,6 @@ static void error_print_instdone(struct 
drm_i915_error_state_buf *m,
return;
 
if (GRAPHICS_VER_FULL(m->i915) >= IP_VER(12, 50)) {
-   int iter;
-
for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, 
subslice)
err_printf(m, "  SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
   slice, subslice,
@@ -471,6 +470,13 @@ static void error_print_instdone(struct 
drm_i915_error_state_buf *m,
if (GRAPHICS_VER(m->i915) < 12)
return;
 
+   if (GRAPHICS_VER_FULL(m->i915) >= IP_VER(12, 55)) {
+   for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, 
subslice)
+   err_printf(m, "  GEOM_SVGUNIT_INSTDONE[%d][%d]: 
0x%08x\n",
+  slice, subslice,
+  ee->instdone.geom_svg[slice][subslice]);
+   }
+
err_printf(m, "  SC_INSTDONE_EXTRA: 0x%08x\n",
   ee->instdone.slice_common_extra[0]);
err_printf(m, "  SC_INSTDONE_EXTRA2: 0x%08x\n",
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 167eaa87501b..8bfd646fc403 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2686,6 +2686,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define GEN12_SC_INSTDONE_EXTRA2   _MMIO(0x7108)
 #define GEN7_SAMPLER_INSTDONE  _MMIO(0xe160)
 #define GEN7_ROW_INSTDONE  _MMIO(0xe164)
+#define XEHPG_INSTDONE_GEOM_SVG_MMIO(0x666c)
 #define MCFG_MCR_SELECTOR  _MMIO(0xfd0)
 #define SF_MCR_SELECTOR_MMIO(0xfd8)
 #define GEN8_MCR_SELECTOR  _MMIO(0xfdc)
-- 
2.25.4



[Intel-gfx] [PATCH v5 4/9] drm/i915/xehpsdv: Add compute DSS type

2021-08-05 Thread Matt Roper
From: Stuart Summers 

Starting in XeHP, the concept of slice has been removed in favor of
DSS (Dual-Subslice) masks for various workload types. These workloads have
been divided into those enabled for geometry and those enabled for compute.

i915 currently maintains a single set of S/SS/EU masks for the device.
The goal of this patch set is to minimize the amount of impact to prior
generations while still giving the user maximum flexibility.

v2:
 - Generalize a comment about uapi access to geometry/compute masks; the
   proposed uapi has changed since the comment was first written, and
   will show up in a future series once the userspace code is published.
   (Lucas)

Bspec: 33117, 33118, 20376
Cc: Daniele Ceraolo Spurio 
Cc: Matt Roper 
Cc: Lucas De Marchi 
Signed-off-by: Stuart Summers 
Signed-off-by: Steve Hampson 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_sseu.c | 66 +---
 drivers/gpu/drm/i915/gt/intel_sseu.h |  5 ++-
 drivers/gpu/drm/i915/i915_reg.h  |  3 +-
 include/uapi/drm/i915_drm.h  |  3 --
 4 files changed, 55 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c 
b/drivers/gpu/drm/i915/gt/intel_sseu.c
index bbd272943c3f..9cf157a2454f 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -46,11 +46,11 @@ u32 intel_sseu_get_subslices(const struct sseu_dev_info 
*sseu, u8 slice)
 }
 
 void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
- u32 ss_mask)
+ u8 *subslice_mask, u32 ss_mask)
 {
int offset = slice * sseu->ss_stride;
 
-   memcpy(&sseu->subslice_mask[offset], &ss_mask, sseu->ss_stride);
+   memcpy(&subslice_mask[offset], &ss_mask, sseu->ss_stride);
 }
 
 unsigned int
@@ -100,14 +100,24 @@ static u16 compute_eu_total(const struct sseu_dev_info 
*sseu)
return total;
 }
 
-static void gen11_compute_sseu_info(struct sseu_dev_info *sseu,
-   u8 s_en, u32 ss_en, u16 eu_en)
+static u32 get_ss_stride_mask(struct sseu_dev_info *sseu, u8 s, u32 ss_en)
+{
+   u32 ss_mask;
+
+   ss_mask = ss_en >> (s * sseu->max_subslices);
+   ss_mask &= GENMASK(sseu->max_subslices - 1, 0);
+
+   return ss_mask;
+}
+
+static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u8 s_en,
+   u32 g_ss_en, u32 c_ss_en, u16 eu_en)
 {
int s, ss;
 
-   /* ss_en represents entire subslice mask across all slices */
+   /* g_ss_en/c_ss_en represent entire subslice mask across all slices */
GEM_BUG_ON(sseu->max_slices * sseu->max_subslices >
-  sizeof(ss_en) * BITS_PER_BYTE);
+  sizeof(g_ss_en) * BITS_PER_BYTE);
 
for (s = 0; s < sseu->max_slices; s++) {
if ((s_en & BIT(s)) == 0)
@@ -115,7 +125,22 @@ static void gen11_compute_sseu_info(struct sseu_dev_info 
*sseu,
 
sseu->slice_mask |= BIT(s);
 
-   intel_sseu_set_subslices(sseu, s, ss_en);
+   /*
+* XeHP introduces the concept of compute vs geometry DSS. To
+* reduce variation between GENs around subslice usage, store a
+* mask for both the geometry and compute enabled masks since
+* userspace will need to be able to query these masks
+* independently.  Also compute a total enabled subslice count
+* for the purposes of selecting subslices to use in a
+* particular GEM context.
+*/
+   intel_sseu_set_subslices(sseu, s, sseu->compute_subslice_mask,
+get_ss_stride_mask(sseu, s, c_ss_en));
+   intel_sseu_set_subslices(sseu, s, sseu->geometry_subslice_mask,
+get_ss_stride_mask(sseu, s, g_ss_en));
+   intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
+get_ss_stride_mask(sseu, s,
+   g_ss_en | c_ss_en));
 
for (ss = 0; ss < sseu->max_subslices; ss++)
if (intel_sseu_has_subslice(sseu, s, ss))
@@ -129,7 +154,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 {
struct sseu_dev_info *sseu = >->info.sseu;
struct intel_uncore *uncore = gt->uncore;
-   u32 dss_en;
+   u32 g_dss_en, c_dss_en = 0;
u16 eu_en = 0;
u8 eu_en_fuse;
u8 s_en;
@@ -145,10 +170,12 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 * across the entire device. Then calculate out the DSS for each
 * workload type within that software slice.
 */
-   if (IS_DG2(gt->i915) || IS_XEHPSDV(gt->i915))
+   if (IS_DG2(gt->i915) || IS_XEHPSDV(gt->i915)) {
intel_sseu_set_info(sseu, 1, 32, 16);
-   else
+   sse

[Intel-gfx] [PATCH v5 2/9] drm/i915/xehp: Loop over all gslices for INSTDONE processing

2021-08-05 Thread Matt Roper
We no longer have traditional slices on Xe_HP platforms, but the
INSTDONE registers are replicated according to the gslice
representation, which is similar.  We can mostly re-use the existing
instdone code with just a few modifications:

 * Create an alternate instdone loop macro that will iterate over the
   flat DSS space, but still provide the gslice/dss steering values for
   compatibility with the legacy code.

 * We should allocate INSTDONE storage space according to the maximum
   number of gslices rather than the maximum number of legacy slices to
   ensure we have enough storage space to hold all of the values.  XeHP
   design has 8 gslices, whereas older platforms never had more than 3
   slices.

Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c| 48 +++-
 drivers/gpu/drm/i915/gt/intel_engine_types.h | 12 -
 drivers/gpu/drm/i915/gt/intel_sseu.h |  7 +++
 drivers/gpu/drm/i915/i915_gpu_error.c| 32 +
 4 files changed, 66 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 0d9105a31d84..58ed67894b3d 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1163,16 +1163,16 @@ void intel_engine_get_instdone(const struct 
intel_engine_cs *engine,
u32 mmio_base = engine->mmio_base;
int slice;
int subslice;
+   int iter;
 
memset(instdone, 0, sizeof(*instdone));
 
-   switch (GRAPHICS_VER(i915)) {
-   default:
+   if (GRAPHICS_VER(i915) >= 8) {
instdone->instdone =
intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
 
if (engine->id != RCS0)
-   break;
+   return;
 
instdone->slice_common =
intel_uncore_read(uncore, GEN7_SC_INSTDONE);
@@ -1182,21 +1182,32 @@ void intel_engine_get_instdone(const struct 
intel_engine_cs *engine,
instdone->slice_common_extra[1] =
intel_uncore_read(uncore, 
GEN12_SC_INSTDONE_EXTRA2);
}
-   for_each_instdone_slice_subslice(i915, sseu, slice, subslice) {
-   instdone->sampler[slice][subslice] =
-   read_subslice_reg(engine, slice, subslice,
- GEN7_SAMPLER_INSTDONE);
-   instdone->row[slice][subslice] =
-   read_subslice_reg(engine, slice, subslice,
- GEN7_ROW_INSTDONE);
+
+   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
+   for_each_instdone_gslice_dss_xehp(i915, sseu, iter, 
slice, subslice) {
+   instdone->sampler[slice][subslice] =
+   read_subslice_reg(engine, slice, 
subslice,
+ 
GEN7_SAMPLER_INSTDONE);
+   instdone->row[slice][subslice] =
+   read_subslice_reg(engine, slice, 
subslice,
+ GEN7_ROW_INSTDONE);
+   }
+   } else {
+   for_each_instdone_slice_subslice(i915, sseu, slice, 
subslice) {
+   instdone->sampler[slice][subslice] =
+   read_subslice_reg(engine, slice, 
subslice,
+ 
GEN7_SAMPLER_INSTDONE);
+   instdone->row[slice][subslice] =
+   read_subslice_reg(engine, slice, 
subslice,
+ GEN7_ROW_INSTDONE);
+   }
}
-   break;
-   case 7:
+   } else if (GRAPHICS_VER(i915) >= 7) {
instdone->instdone =
intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
 
if (engine->id != RCS0)
-   break;
+   return;
 
instdone->slice_common =
intel_uncore_read(uncore, GEN7_SC_INSTDONE);
@@ -1204,22 +1215,15 @@ void intel_engine_get_instdone(const struct 
intel_engine_cs *engine,
intel_uncore_read(uncore, GEN7_SAMPLER_INSTDONE);
instdone->row[0][0] =
intel_uncore_read(uncore, GEN7_ROW_INSTDONE);
-
-   break;
-   case 6:
-   case 5:
-   case 4:
+   } else if (GRAPHICS_VER(i915) >= 4) {
instdone->instdone =
intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
if (engine->id == RCS0)
/* HACK: Using the wrong struct mem
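
The alternate loop macro itself lands in the intel_sseu.h hunk, which
is cut off in this excerpt. Roughly, it walks the flat DSS space while
deriving gslice/DSS steering values from the flat index; a sketch along
these lines, assuming GEN_DSS_PER_GSLICE is defined alongside it:

#define for_each_instdone_gslice_dss_xehp(i915_, sseu_, iter_, gslice_, dss_) \
        for ((iter_) = 0, (gslice_) = 0, (dss_) = 0; \
             (iter_) < GEN_MAX_SUBSLICES; \
             (iter_)++, (gslice_) = (iter_) / GEN_DSS_PER_GSLICE, \
             (dss_) = (iter_) % GEN_DSS_PER_GSLICE) \
                for_each_if(intel_sseu_has_subslice((sseu_), 0, (iter_)))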

[Intel-gfx] [PATCH v5 8/9] drm/i915/dg2: Maintain backward-compatible nested batch behavior

2021-08-05 Thread Matt Roper
For tgl+, the per-context setting of MI_MODE[12] determines whether
the bits of a nested MI_BATCH_BUFFER_START instruction should be
interpreted in the traditional manner or whether they should
instead use a new tgl+ meaning that breaks backward compatibility, but
allows nesting into 3rd-level batchbuffers.  For previous platforms,
the hardware default for this register bit is to maintain
backward-compatible behavior unless a context intentionally opts into
the new behavior; however Xe_HPG flips the hardware default behavior.

From a SW perspective, we want to maintain the backward-compatible
behavior for userspace, so we'll apply a fake workaround to set it back
to the legacy behavior on platforms where the hardware default is to
break compatibility.  At the moment there is no Linux userspace that
utilizes third-level batchbuffers, so this will avoid userspace needing
to make any changes; using the legacy meaning is the correct thing to
do.  If/when we have userspace consumers that want to utilize
third-level batch nesting, we can provide a context parameter to allow
them to opt-in.

Bspec: 45974, 45718
Cc: John Harrison 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 39 +++--
 drivers/gpu/drm/i915/i915_reg.h |  1 +
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index aae609d7d85d..97b3cd81b721 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -644,6 +644,37 @@ static void dg1_ctx_workarounds_init(struct 
intel_engine_cs *engine,
 DG1_HZ_READ_SUPPRESSION_OPTIMIZATION_DISABLE);
 }
 
+static void fakewa_disable_nestedbb_mode(struct intel_engine_cs *engine,
+struct i915_wa_list *wal)
+{
+   /*
+* This is a "fake" workaround defined by software to ensure we
+* maintain reliable, backward-compatible behavior for userspace with
+* regards to how nested MI_BATCH_BUFFER_START commands are handled.
+*
+* The per-context setting of MI_MODE[12] determines whether the bits
+* of a nested MI_BATCH_BUFFER_START instruction should be interpreted
+* in the traditional manner or whether they should instead use a new
+* tgl+ meaning that breaks backward compatibility, but allows nesting
+* into 3rd-level batchbuffers.  When this new capability was first
+* added in TGL, it remained off by default unless a context
+* intentionally opted in to the new behavior.  However Xe_HPG now
+* flips this on by default and requires that we explicitly opt out if
+* we don't want the new behavior.
+*
+* From a SW perspective, we want to maintain the backward-compatible
+* behavior for userspace, so we'll apply a fake workaround to set it
+* back to the legacy behavior on platforms where the hardware default
+* is to break compatibility.  At the moment there is no Linux
+* userspace that utilizes third-level batchbuffers, so using the
+* legacy meaning is the correct thing to do and avoids the need
+* for any userspace changes.  If/when we have userspace
+* consumers that want to utilize third-level batch nesting, we can
+* provide a context parameter to allow them to opt-in.
+*/
+   wa_masked_dis(wal, RING_MI_MODE(engine->mmio_base), TGL_NESTED_BB_EN);
+}
+
 static void
 __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
   struct i915_wa_list *wal,
@@ -651,11 +682,15 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
 {
struct drm_i915_private *i915 = engine->i915;
 
+   wa_init_start(wal, name, engine->name);
+
+   /* Applies to all engines */
+   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55))
+   fakewa_disable_nestedbb_mode(engine, wal);
+
if (engine->class != RENDER_CLASS)
return;
 
-   wa_init_start(wal, name, engine->name);
-
if (IS_DG1(i915))
dg1_ctx_workarounds_init(engine, wal);
else if (GRAPHICS_VER(i915) == 12)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 77f6dcaba2b9..269685955fbd 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2821,6 +2821,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define MI_MODE_MMIO(0x209c)
 # define VS_TIMER_DISPATCH (1 << 6)
 # define MI_FLUSH_ENABLE   (1 << 12)
+# define TGL_NESTED_BB_EN  (1 << 12)
 # define ASYNC_FLIP_PERF_DISABLE   (1 << 14)
 # define MODE_IDLE (1 << 9)
 # define STOP_RING  

[Intel-gfx] [PATCH v5 6/9] drm/i915/xehpsdv: Read correct RP_STATE_CAP register

2021-08-05 Thread Matt Roper
The RP_STATE_CAP register is no longer part of the MCHBAR on XEHPSDV; this
register is now a per-tile register at GTTMMADDR offset 0x250014.

Cc: Rodrigo Vivi 
Signed-off-by: Matt Roper 
Signed-off-by: Lucas De Marchi 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/gt/intel_rps.c | 4 +++-
 drivers/gpu/drm/i915/i915_reg.h | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c
index a3e69eba376f..3489f5f0cac1 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -2141,7 +2141,9 @@ u32 intel_rps_read_state_cap(struct intel_rps *rps)
struct drm_i915_private *i915 = rps_to_i915(rps);
struct intel_uncore *uncore = rps_to_uncore(rps);
 
-   if (IS_GEN9_LP(i915))
+   if (IS_XEHPSDV(i915))
+   return intel_uncore_read(uncore, XEHPSDV_RP_STATE_CAP);
+   else if (IS_GEN9_LP(i915))
return intel_uncore_read(uncore, BXT_RP_STATE_CAP);
else
return intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index f8d3cd11eced..77f6dcaba2b9 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -4115,6 +4115,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define   RPN_CAP_MASK REG_GENMASK(23, 16)
 #define BXT_RP_STATE_CAP_MMIO(0x138170)
 #define GEN9_RP_STATE_LIMITS   _MMIO(0x138148)
+#define XEHPSDV_RP_STATE_CAP   _MMIO(0x250014)
 
 /*
  * Logical Context regs
-- 
2.25.4



[Intel-gfx] [PATCH v5 1/9] drm/i915/dg2: Add support for new DG2-G11 revid 0x5

2021-08-05 Thread Matt Roper
The bspec has been updated with a new revision 0x5 that translates to B1
GT stepping and C0 display stepping.

Bspec: 44477
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_step.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/intel_step.c 
b/drivers/gpu/drm/i915/intel_step.c
index b5fb961e1b62..6cf967631395 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -118,6 +118,7 @@ static const struct intel_step_info 
dg2_g10_revid_step_tbl[] = {
 static const struct intel_step_info dg2_g11_revid_step_tbl[] = {
[0x0] = { .gt_step = STEP_A0, .display_step = STEP_B0 },
[0x4] = { .gt_step = STEP_B0, .display_step = STEP_C0 },
+   [0x5] = { .gt_step = STEP_B1, .display_step = STEP_C0 },
 };
 
 void intel_step_init(struct drm_i915_private *i915)
-- 
2.25.4



[Intel-gfx] [PATCH v5 5/9] drm/i915/xehpsdv: factor out function to read RP_STATE_CAP

2021-08-05 Thread Matt Roper
From: Lucas De Marchi 

Instead of maintaining the same if ladder in 3 different places, add a
function to read RP_STATE_CAP.

Signed-off-by: Lucas De Marchi 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/debugfs_gt_pm.c |  8 +++-
 drivers/gpu/drm/i915/gt/intel_rps.c | 17 -
 drivers/gpu/drm/i915/gt/intel_rps.h |  1 +
 drivers/gpu/drm/i915/i915_debugfs.c |  8 +++-
 4 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c 
b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c
index d6f5836396f8..f6733f279890 100644
--- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c
@@ -309,13 +309,11 @@ static int frequency_show(struct seq_file *m, void 
*unused)
int max_freq;
 
rp_state_limits = intel_uncore_read(uncore, 
GEN6_RP_STATE_LIMITS);
-   if (IS_GEN9_LP(i915)) {
-   rp_state_cap = intel_uncore_read(uncore, 
BXT_RP_STATE_CAP);
+   rp_state_cap = intel_rps_read_state_cap(rps);
+   if (IS_GEN9_LP(i915))
gt_perf_status = intel_uncore_read(uncore, 
BXT_GT_PERF_STATUS);
-   } else {
-   rp_state_cap = intel_uncore_read(uncore, 
GEN6_RP_STATE_CAP);
+   else
gt_perf_status = intel_uncore_read(uncore, 
GEN6_GT_PERF_STATUS);
-   }
 
/* RPSTAT1 is in the GT power well */
intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c
index d812b27835f8..a3e69eba376f 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -996,20 +996,16 @@ int intel_rps_set(struct intel_rps *rps, u8 val)
 static void gen6_rps_init(struct intel_rps *rps)
 {
struct drm_i915_private *i915 = rps_to_i915(rps);
-   struct intel_uncore *uncore = rps_to_uncore(rps);
+   u32 rp_state_cap = intel_rps_read_state_cap(rps);
 
/* All of these values are in units of 50MHz */
 
/* static values from HW: RP0 > RP1 > RPn (min_freq) */
if (IS_GEN9_LP(i915)) {
-   u32 rp_state_cap = intel_uncore_read(uncore, BXT_RP_STATE_CAP);
-
rps->rp0_freq = (rp_state_cap >> 16) & 0xff;
rps->rp1_freq = (rp_state_cap >>  8) & 0xff;
rps->min_freq = (rp_state_cap >>  0) & 0xff;
} else {
-   u32 rp_state_cap = intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
-
rps->rp0_freq = (rp_state_cap >>  0) & 0xff;
rps->rp1_freq = (rp_state_cap >>  8) & 0xff;
rps->min_freq = (rp_state_cap >> 16) & 0xff;
@@ -2140,6 +2136,17 @@ int intel_rps_set_min_frequency(struct intel_rps *rps, 
u32 val)
return set_min_freq(rps, val);
 }
 
+u32 intel_rps_read_state_cap(struct intel_rps *rps)
+{
+   struct drm_i915_private *i915 = rps_to_i915(rps);
+   struct intel_uncore *uncore = rps_to_uncore(rps);
+
+   if (IS_GEN9_LP(i915))
+   return intel_uncore_read(uncore, BXT_RP_STATE_CAP);
+   else
+   return intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
+}
+
 /* External interface for intel_ips.ko */
 
 static struct drm_i915_private __rcu *ips_mchdev;
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.h 
b/drivers/gpu/drm/i915/gt/intel_rps.h
index 4213bcce1667..11960d64ca82 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.h
+++ b/drivers/gpu/drm/i915/gt/intel_rps.h
@@ -41,6 +41,7 @@ u32 intel_rps_get_rp1_frequency(struct intel_rps *rps);
 u32 intel_rps_get_rpn_frequency(struct intel_rps *rps);
 u32 intel_rps_read_punit_req(struct intel_rps *rps);
 u32 intel_rps_read_punit_req_frequency(struct intel_rps *rps);
+u32 intel_rps_read_state_cap(struct intel_rps *rps);
 
 void gen5_rps_irq_handler(struct intel_rps *rps);
 void gen6_rps_irq_handler(struct intel_rps *rps, u32 pm_iir);
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 44969f5dde50..eec0d349ea6a 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -420,13 +420,11 @@ static int i915_frequency_info(struct seq_file *m, void 
*unused)
int max_freq;
 
rp_state_limits = intel_uncore_read(&dev_priv->uncore, 
GEN6_RP_STATE_LIMITS);
-   if (IS_GEN9_LP(dev_priv)) {
-   rp_state_cap = intel_uncore_read(&dev_priv->uncore, 
BXT_RP_STATE_CAP);
+   rp_state_cap = intel_rps_read_state_cap(rps);
+   if (IS_GEN9_LP(dev_priv))
gt_perf_status = intel_uncore_read(&dev_priv->uncore, 
BXT_GT_PERF_STATUS);
-   } else {
-   rp_state_cap = intel_uncore_read(&dev_priv->uncore, 
GEN6_RP_STATE_CAP);
+   else
gt_perf_status = intel_uncore_read(&dev_priv->uncore

[Intel-gfx] [PATCH v5 0/9] Begin enabling Xe_HP SDV and DG2 platforms

2021-08-05 Thread Matt Roper
This series provides some of the initial enablement patches for two
upcoming discrete GPUs:
 * XeHP SDV:  Xe_HP (version 12.50) graphics IP, no display IP
 * DG2:  Xe_HPG (version 12.55) graphics IP, Xe_LPD (version 13) display IP

Both platforms will need additional enablement patches beyond what's
present in this series before they're truly usable, including various
LMEM and GuC work that's already happening separately.  The new
features/functionality that these platforms bring (such as multi-tile
support, dedicated compute engines, etc.) may be referenced in passing
in some of these patches but will be fully enabled in future series.

v2:
 - General rebase and incorporation of r-b's.
 - Re-order intel_gt_info and intel_device_info structures to eliminate
   some unnecessary padding after the size change of
   intel_engine_mask_t.  (Tvrtko)
 - Use 'intel_step' mechanisms for revid->stepping mapping.  (Jani)
 - Drop the DSC patches for now; they need some rework.  (Jani)

v3:
 - About 20 of the patches have landed upstream now.  Rebase and resend
   the rest.  Some of these are already reviewed, but have dependencies
   on other unreviewed patches (e.g., the new engine definitions, the
   initial SNPS PHY support, etc.).

v4:
 - Several more patches have landed upstream; rebase and re-send the
   rest.  Some of the remaining patches are reviewed but still have
   dependencies on non-reviewed patches, so the order is shuffled this
   time to group patches by dependency rather than by xehp vs xehpsdv vs
   dg2.
 - Minor cleanup to "drm/i915/xehp: handle new steering options"
   suggested by Caz.

v5:
 - Rebase remaining patches after several more have landed upstream.
 - Drop the two MOCS patches for now; we need to wait for some prep work
   from Ayaz to land before we apply those.
 - Make a comment about uapi in the compute DSS patch more general; the
   uapi itself will show up in a future series once the corresponding
   userspace driver code is published.  (Lucas)
 - Add an extra patch for a new DG2-G11 stepping that has appeared.

Cc: Rodrigo Vivi 
Cc: Lucas De Marchi 
Cc: James Ausmus 


Akeem G Abodunrin (1):
  drm/i915/dg2: Add new LRI reg offsets

Ankit Nautiyal (1):
  drm/i915/dg2: Configure PCON in DP pre-enable path

Lucas De Marchi (1):
  drm/i915/xehpsdv: factor out function to read RP_STATE_CAP

Matt Roper (5):
  drm/i915/dg2: Add support for new DG2-G11 revid 0x5
  drm/i915/xehp: Loop over all gslices for INSTDONE processing
  drm/i915/dg2: Report INSTDONE_GEOM values in error state
  drm/i915/xehpsdv: Read correct RP_STATE_CAP register
  drm/i915/dg2: Maintain backward-compatible nested batch behavior

Stuart Summers (1):
  drm/i915/xehpsdv: Add compute DSS type

 drivers/gpu/drm/i915/display/intel_ddi.c |  3 +
 drivers/gpu/drm/i915/gt/debugfs_gt_pm.c  |  8 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c| 55 -
 drivers/gpu/drm/i915/gt/intel_engine_types.h | 15 +++-
 drivers/gpu/drm/i915/gt/intel_lrc.c  | 85 +++-
 drivers/gpu/drm/i915/gt/intel_rps.c  | 19 +++--
 drivers/gpu/drm/i915/gt/intel_rps.h  |  1 +
 drivers/gpu/drm/i915/gt/intel_sseu.c | 66 +++
 drivers/gpu/drm/i915/gt/intel_sseu.h | 12 ++-
 drivers/gpu/drm/i915/gt/intel_workarounds.c  | 39 -
 drivers/gpu/drm/i915/i915_debugfs.c  |  8 +-
 drivers/gpu/drm/i915/i915_gpu_error.c| 36 +++--
 drivers/gpu/drm/i915/i915_reg.h  |  6 +-
 drivers/gpu/drm/i915/intel_step.c|  1 +
 include/uapi/drm/i915_drm.h  |  3 -
 15 files changed, 284 insertions(+), 73 deletions(-)

-- 
2.25.4



[Intel-gfx] [PATCH v5 7/9] drm/i915/dg2: Add new LRI reg offsets

2021-08-05 Thread Matt Roper
From: Akeem G Abodunrin 

New LRI register offsets were introduced for DG2. This patch adds
those extra registers and creates a new register table of offsets to
compare against the HW-generated context image - especially for the
gt_lrc test.  It also updates the general purpose register with the
scratch offset for DG2, in order to use it in the live_lrc_fixed
selftest.

Cc: Chris P Wilson 
Cc: Prathap Kumar Valsan 
Signed-off-by: Akeem G Abodunrin 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 85 -
 1 file changed, 83 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index bb4af4977920..6ba8daea2f56 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -226,6 +226,40 @@ static const u8 gen12_xcs_offsets[] = {
END
 };
 
+static const u8 dg2_xcs_offsets[] = {
+   NOP(1),
+   LRI(15, POSTED),
+   REG16(0x244),
+   REG(0x034),
+   REG(0x030),
+   REG(0x038),
+   REG(0x03c),
+   REG(0x168),
+   REG(0x140),
+   REG(0x110),
+   REG(0x1c0),
+   REG(0x1c4),
+   REG(0x1c8),
+   REG(0x180),
+   REG16(0x2b4),
+   REG(0x120),
+   REG(0x124),
+
+   NOP(1),
+   LRI(9, POSTED),
+   REG16(0x3a8),
+   REG16(0x28c),
+   REG16(0x288),
+   REG16(0x284),
+   REG16(0x280),
+   REG16(0x27c),
+   REG16(0x278),
+   REG16(0x274),
+   REG16(0x270),
+
+   END
+};
+
 static const u8 gen8_rcs_offsets[] = {
NOP(1),
LRI(14, POSTED),
@@ -525,6 +559,49 @@ static const u8 xehp_rcs_offsets[] = {
END
 };
 
+static const u8 dg2_rcs_offsets[] = {
+   NOP(1),
+   LRI(15, POSTED),
+   REG16(0x244),
+   REG(0x034),
+   REG(0x030),
+   REG(0x038),
+   REG(0x03c),
+   REG(0x168),
+   REG(0x140),
+   REG(0x110),
+   REG(0x1c0),
+   REG(0x1c4),
+   REG(0x1c8),
+   REG(0x180),
+   REG16(0x2b4),
+   REG(0x120),
+   REG(0x124),
+
+   NOP(1),
+   LRI(9, POSTED),
+   REG16(0x3a8),
+   REG16(0x28c),
+   REG16(0x288),
+   REG16(0x284),
+   REG16(0x280),
+   REG16(0x27c),
+   REG16(0x278),
+   REG16(0x274),
+   REG16(0x270),
+
+   LRI(3, POSTED),
+   REG(0x1b0),
+   REG16(0x5a8),
+   REG16(0x5ac),
+
+   NOP(6),
+   LRI(1, 0),
+   REG(0x0c8),
+
+   END
+};
+
 #undef END
 #undef REG16
 #undef REG
@@ -543,7 +620,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs 
*engine)
   !intel_engine_has_relative_mmio(engine));
 
if (engine->class == RENDER_CLASS) {
-   if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
+   if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55))
+   return dg2_rcs_offsets;
+   else if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
return xehp_rcs_offsets;
else if (GRAPHICS_VER(engine->i915) >= 12)
return gen12_rcs_offsets;
@@ -554,7 +633,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs 
*engine)
else
return gen8_rcs_offsets;
} else {
-   if (GRAPHICS_VER(engine->i915) >= 12)
+   if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55))
+   return dg2_xcs_offsets;
+   else if (GRAPHICS_VER(engine->i915) >= 12)
return gen12_xcs_offsets;
else if (GRAPHICS_VER(engine->i915) >= 9)
return gen9_xcs_offsets;
-- 
2.25.4



[Intel-gfx] [PATCH v5 9/9] drm/i915/dg2: Configure PCON in DP pre-enable path

2021-08-05 Thread Matt Roper
From: Ankit Nautiyal 

Add the functions to configure the HDMI2.1 PCON for DG2 before DP
link training.

Signed-off-by: Ankit Nautiyal 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/display/intel_ddi.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c 
b/drivers/gpu/drm/i915/display/intel_ddi.c
index d8162951b78f..e932fd0fe7e2 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -2402,6 +2402,7 @@ static void dg2_ddi_pre_enable_dp(struct 
intel_atomic_state *state,
if (!is_mst)
intel_dp_set_power(intel_dp, DP_SET_POWER_D0);
 
+   intel_dp_configure_protocol_converter(intel_dp, crtc_state);
intel_dp_sink_set_decompression_state(intel_dp, crtc_state, true);
/*
 * DDI FEC: "anticipates enabling FEC encoding sets the FEC_READY bit
@@ -2409,6 +2410,8 @@ static void dg2_ddi_pre_enable_dp(struct 
intel_atomic_state *state,
 * training
 */
intel_dp_sink_set_fec_ready(intel_dp, crtc_state);
+   intel_dp_check_frl_training(intel_dp);
+   intel_dp_pcon_dsc_configure(intel_dp, crtc_state);
 
/*
 * 5.h Follow DisplayPort specification training sequence (see notes for
-- 
2.25.4



Re: [Intel-gfx] [PATCH v5 1/9] drm/i915/dg2: Add support for new DG2-G11 revid 0x5

2021-08-05 Thread Lucas De Marchi

On Thu, Aug 05, 2021 at 09:36:39AM -0700, Matt Roper wrote:

The bspec has been updated with a new revision 0x5 that translates to B1
GT stepping and C0 display stepping.

Bspec: 44477
Signed-off-by: Matt Roper 



Reviewed-by: Lucas De Marchi 

Lucas De Marchi


---
drivers/gpu/drm/i915/intel_step.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/intel_step.c 
b/drivers/gpu/drm/i915/intel_step.c
index b5fb961e1b62..6cf967631395 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -118,6 +118,7 @@ static const struct intel_step_info 
dg2_g10_revid_step_tbl[] = {
static const struct intel_step_info dg2_g11_revid_step_tbl[] = {
[0x0] = { .gt_step = STEP_A0, .display_step = STEP_B0 },
[0x4] = { .gt_step = STEP_B0, .display_step = STEP_C0 },
+   [0x5] = { .gt_step = STEP_B1, .display_step = STEP_C0 },
};

void intel_step_init(struct drm_i915_private *i915)
--
2.25.4



Re: [Intel-gfx] [PATCH v5 14/20] drm/sched: Don't store self-dependencies

2021-08-05 Thread Christian König




Am 05.08.21 um 12:46 schrieb Daniel Vetter:

This is essentially part of drm_sched_dependency_optimized(), which
only amdgpu seems to make use of. Use it a bit more.

This would mean that as-is amdgpu can't use the dependency helpers, at
least not with the current approach amdgpu has for deciding whether a
vm_flush is needed. Since amdgpu also has very special rules around
implicit fencing it can't use those helpers either, and adding a
drm_sched_job_await_fence_always or similar for amdgpu wouldn't be too
onerous. That way the special-case handling for amdgpu sticks out even
more, and we have a higher chance that reviewers who go across all
drivers won't miss it.


Well, you should probably drop the sentence about the vm_flush; it is
completely unrelated.


In addition to that, I still don't think that this is a good idea.
Dependency handling is completely driver-specific.


E.g. even when you have submitted jobs back to back, they might still
need a cache flush in between, and that is not unique to amdgpu.


What you can do is optimize for this while looking at the fences later
on, and then note that you have done so and which hw fence you used
last instead.


Regards,
Christian.



Reviewed-by: Lucas Stach 
Acked-by: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Daniel Vetter 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky 
Cc: Alex Deucher 
---
  drivers/gpu/drm/scheduler/sched_main.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index f77456929139..49e507f91ec0 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -660,6 +660,13 @@ int drm_sched_job_add_dependency(struct drm_sched_job *job,
if (!fence)
return 0;
  
+	/* if it's a fence from us it's guaranteed to be earlier */

+   if (fence->context == job->entity->fence_context ||
+   fence->context == job->entity->fence_context + 1) {
+   dma_fence_put(fence);
+   return 0;
+   }
+
/* Deduplicate if we already depend on a fence from the same context.
 * This lets the size of the array of deps scale with the number of
 * engines involved, rather than the number of BOs.




Re: [Intel-gfx] [PATCH v5 15/20] drm/sched: Check locking in drm_sched_job_await_implicit

2021-08-05 Thread Christian König

Am 05.08.21 um 12:47 schrieb Daniel Vetter:

You really need to hold the reservation here or all kinds of funny
things can happen between grabbing the dependencies and inserting the
new fences.
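
For illustration, a minimal sketch of the calling pattern this check
enforces (hypothetical driver code: job, obj and write are
placeholders, and publishing the finished fence this way is just one
possible variant):

    dma_resv_lock(obj->resv, NULL);
    ret = drm_sched_job_add_implicit_dependencies(job, obj, write);
    if (!ret)
            /* publish the job's fence while still holding the lock */
            dma_resv_add_excl_fence(obj->resv, &job->s_fence->finished);
    dma_resv_unlock(obj->resv);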

Acked-by: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Daniel Vetter 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky 
Cc: Alex Deucher 


The function name in the subject line should be updated; apart from
that, feel free to add my rb to this patch.


Christian.


---
  drivers/gpu/drm/scheduler/sched_main.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 49e507f91ec0..1abb40b07324 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -715,6 +715,8 @@ int drm_sched_job_add_implicit_dependencies(struct 
drm_sched_job *job,
struct dma_fence **fences;
unsigned int i, fence_count;
  
+	dma_resv_assert_held(obj->resv);

+
if (!write) {
struct dma_fence *fence = dma_resv_get_excl_unlocked(obj->resv);
  




Re: [Intel-gfx] [PATCH v5 4/9] drm/i915/xehpsdv: Add compute DSS type

2021-08-05 Thread Lucas De Marchi

On Thu, Aug 05, 2021 at 09:36:42AM -0700, Matt Roper wrote:

From: Stuart Summers 

Starting in XeHP, the concept of slice has been removed in favor of
DSS (Dual-Subslice) masks for various workload types. These workloads have
been divided into those enabled for geometry and those enabled for compute.

i915 currently maintains a single set of S/SS/EU masks for the device.
The goal of this patch set is to minimize the impact on prior
generations while still giving the user maximum flexibility.

v2:
- Generalize a comment about uapi access to geometry/compute masks; the
  proposed uapi has changed since the comment was first written, and
  will show up in a future series once the userspace code is published.
  (Lucas)

Bspec: 33117, 33118, 20376
Cc: Daniele Ceraolo Spurio 
Cc: Matt Roper 
Cc: Lucas De Marchi 
Signed-off-by: Stuart Summers 
Signed-off-by: Steve Hampson 
Signed-off-by: Matt Roper 
---
drivers/gpu/drm/i915/gt/intel_sseu.c | 66 +---
drivers/gpu/drm/i915/gt/intel_sseu.h |  5 ++-
drivers/gpu/drm/i915/i915_reg.h  |  3 +-
include/uapi/drm/i915_drm.h  |  3 --
4 files changed, 55 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c 
b/drivers/gpu/drm/i915/gt/intel_sseu.c
index bbd272943c3f..9cf157a2454f 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -46,11 +46,11 @@ u32 intel_sseu_get_subslices(const struct sseu_dev_info 
*sseu, u8 slice)
}

void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
- u32 ss_mask)
+ u8 *subslice_mask, u32 ss_mask)
{
int offset = slice * sseu->ss_stride;

-   memcpy(&sseu->subslice_mask[offset], &ss_mask, sseu->ss_stride);
+   memcpy(&subslice_mask[offset], &ss_mask, sseu->ss_stride);
}

unsigned int
@@ -100,14 +100,24 @@ static u16 compute_eu_total(const struct sseu_dev_info 
*sseu)
return total;
}

-static void gen11_compute_sseu_info(struct sseu_dev_info *sseu,
-   u8 s_en, u32 ss_en, u16 eu_en)
+static u32 get_ss_stride_mask(struct sseu_dev_info *sseu, u8 s, u32 ss_en)
+{
+   u32 ss_mask;
+
+   ss_mask = ss_en >> (s * sseu->max_subslices);
+   ss_mask &= GENMASK(sseu->max_subslices - 1, 0);
+
+   return ss_mask;
+}
+
+static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u8 s_en,
+   u32 g_ss_en, u32 c_ss_en, u16 eu_en)
{
int s, ss;

-   /* ss_en represents entire subslice mask across all slices */
+   /* g_ss_en/c_ss_en represent entire subslice mask across all slices */
GEM_BUG_ON(sseu->max_slices * sseu->max_subslices >
-  sizeof(ss_en) * BITS_PER_BYTE);
+  sizeof(g_ss_en) * BITS_PER_BYTE);

for (s = 0; s < sseu->max_slices; s++) {
if ((s_en & BIT(s)) == 0)
@@ -115,7 +125,22 @@ static void gen11_compute_sseu_info(struct sseu_dev_info 
*sseu,

sseu->slice_mask |= BIT(s);

-   intel_sseu_set_subslices(sseu, s, ss_en);
+   /*
+* XeHP introduces the concept of compute vs geometry DSS. To
+* reduce variation between GENs around subslice usage, store
+* both the geometry and the compute enabled masks, since
+* userspace will need to be able to query these masks
+* independently.  Also compute a total enabled subslice count
+* for the purposes of selecting subslices to use in a
+* particular GEM context.
+*/
+   intel_sseu_set_subslices(sseu, s, sseu->compute_subslice_mask,
+get_ss_stride_mask(sseu, s, c_ss_en));
+   intel_sseu_set_subslices(sseu, s, sseu->geometry_subslice_mask,
+get_ss_stride_mask(sseu, s, g_ss_en));
+   intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
+get_ss_stride_mask(sseu, s,
+   g_ss_en | c_ss_en));

for (ss = 0; ss < sseu->max_subslices; ss++)
if (intel_sseu_has_subslice(sseu, s, ss))
@@ -129,7 +154,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
{
struct sseu_dev_info *sseu = >->info.sseu;
struct intel_uncore *uncore = gt->uncore;
-   u32 dss_en;
+   u32 g_dss_en, c_dss_en = 0;
u16 eu_en = 0;
u8 eu_en_fuse;
u8 s_en;
@@ -145,10 +170,12 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 * across the entire device. Then calculate out the DSS for each
 * workload type within that software slice.
 */
-   if (IS_DG2(gt->i915) || IS_XEHPSDV(gt->i915))
+   if (IS_DG2(gt->i915) || IS_XEHPSDV(gt->i915)) {
intel_sseu_set_info(sseu, 1, 32, 1

[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: fix i915_globals_exit() section mismatch error

2021-08-05 Thread Patchwork
== Series Details ==

Series: drm/i915: fix i915_globals_exit() section mismatch error
URL   : https://patchwork.freedesktop.org/series/93398/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10445_full -> Patchwork_20770_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20770_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20770_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20770_full:

### IGT changes ###

 Possible regressions 

  * igt@gem_exec_endless@dispatch@vcs0:
- shard-tglb: NOTRUN -> [INCOMPLETE][1]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-tglb1/igt@gem_exec_endless@dispa...@vcs0.html

  * igt@i915_pm_sseu@full-enable:
- shard-glk:  NOTRUN -> [FAIL][2]
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-glk6/igt@i915_pm_s...@full-enable.html

  
Known issues


  Here are the changes found in Patchwork_20770_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@feature_discovery@display-2x:
- shard-iclb: NOTRUN -> [SKIP][3] ([i915#1839])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-iclb2/igt@feature_discov...@display-2x.html

  * igt@gem_create@create-massive:
- shard-snb:  NOTRUN -> [DMESG-WARN][4] ([i915#3002])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-snb2/igt@gem_cre...@create-massive.html

  * igt@gem_ctx_persistence@hostile:
- shard-snb:  NOTRUN -> [SKIP][5] ([fdo#109271] / [i915#1099]) +1 
similar issue
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-snb7/igt@gem_ctx_persiste...@hostile.html

  * igt@gem_ctx_persistence@legacy-engines-hang@render:
- shard-tglb: [PASS][6] -> [FAIL][7] ([i915#2410])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-tglb2/igt@gem_ctx_persistence@legacy-engines-h...@render.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-tglb7/igt@gem_ctx_persistence@legacy-engines-h...@render.html

  * igt@gem_exec_fair@basic-deadline:
- shard-glk:  [PASS][8] -> [FAIL][9] ([i915#2846])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-glk9/igt@gem_exec_f...@basic-deadline.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-glk7/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-none-share@rcs0:
- shard-apl:  [PASS][10] -> [SKIP][11] ([fdo#109271])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-apl3/igt@gem_exec_fair@basic-none-sh...@rcs0.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-apl1/igt@gem_exec_fair@basic-none-sh...@rcs0.html

  * igt@gem_exec_fair@basic-none@vcs1:
- shard-iclb: NOTRUN -> [FAIL][12] ([i915#2842])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-iclb2/igt@gem_exec_fair@basic-n...@vcs1.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
- shard-tglb: [PASS][13] -> [FAIL][14] ([i915#2842]) +2 similar 
issues
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-tglb6/igt@gem_exec_fair@basic-pace-sh...@rcs0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-tglb7/igt@gem_exec_fair@basic-pace-sh...@rcs0.html

  * igt@gem_pwrite@basic-exhaustion:
- shard-snb:  NOTRUN -> [WARN][15] ([i915#2658])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-snb6/igt@gem_pwr...@basic-exhaustion.html
- shard-kbl:  NOTRUN -> [WARN][16] ([i915#2658])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-kbl2/igt@gem_pwr...@basic-exhaustion.html

  * igt@gen9_exec_parse@batch-invalid-length:
- shard-snb:  NOTRUN -> [SKIP][17] ([fdo#109271]) +269 similar 
issues
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-snb7/igt@gen9_exec_pa...@batch-invalid-length.html

  * igt@i915_pm_dc@dc5-dpms:
- shard-kbl:  NOTRUN -> [FAIL][18] ([i915#545])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-kbl2/igt@i915_pm...@dc5-dpms.html

  * igt@i915_pm_lpsp@kms-lpsp@kms-lpsp-dp:
- shard-apl:  NOTRUN -> [SKIP][19] ([fdo#109271] / [i915#1937])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20770/shard-apl1/igt@i915_pm_lpsp@kms-l...@kms-lpsp-dp.html

  * igt@i915_pm_sseu@full-enable:
- shard-skl:  [PASS][20] -> [FAIL][21] ([i915#3650])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10445/shard-

Re: [Intel-gfx] [PATCH v3 09/14] vfio/pci: Change vfio_pci_try_bus_reset() to use the dev_set

2021-08-05 Thread Alex Williamson
On Thu, 5 Aug 2021 08:47:01 -0300
Jason Gunthorpe  wrote:

> On Tue, Aug 03, 2021 at 10:52:25AM -0600, Alex Williamson wrote:
> > On Tue, 3 Aug 2021 13:41:52 -0300
> > Jason Gunthorpe  wrote:  
> > > On Tue, Aug 03, 2021 at 10:34:06AM -0600, Alex Williamson wrote:  
> > > > I think the vfio_pci_find_reset_target() function needs to be re-worked
> > > > to just tell us true/false that it's ok to reset the provided device,
> > > > not to anoint an arbitrary target device.  Thanks,
> > > 
> > > Yes, though this logic is confusing, why do we need to check if any
> > > device needs a reset at this point? If we are being asked to reset
> > > vdev shouldn't vdev needs_reset?
> > > 
> > > Or is the function more of a 'synchronize pending reset' kind of
> > > thing?  
> > 
> > Yes, the latter.  For instance think about a multi-function PCI device
> > such as a GPU.  The functions have dramatically different capabilities,
> > some might have function level reset abilities and others not.  We want
> > to be able to trigger a bus reset as the last device of the set is
> > released, no matter the order they're released and no matter the
> > capabilities of the device we're currently processing.  Thanks,  
> 
> I worked on this for awhile, I think this is much clearer about what
> this algorithm is trying to do:
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 5d6db93d6c680f..e418bcbb68facc 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -223,7 +223,7 @@ static void vfio_pci_probe_mmaps(struct vfio_pci_device 
> *vdev)
>   }
>  }
>  
> -static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev);
> +static bool vfio_pci_dev_set_try_reset(struct vfio_device_set *dev_set);
>  static void vfio_pci_disable(struct vfio_pci_device *vdev);
>  static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void 
> *data);
>  
> @@ -404,6 +404,9 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev)
>   struct vfio_pci_ioeventfd *ioeventfd, *ioeventfd_tmp;
>   int i, bar;
>  
> + /* For needs_reset */
> + lockdep_assert_held(&vdev->vdev.dev_set->lock);
> +
>   /* Stop the device from further DMA */
>   pci_clear_master(pdev);
>  
> @@ -487,9 +490,7 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev)
>  out:
>   pci_disable_device(pdev);
>  
> - vfio_pci_try_bus_reset(vdev);
> -
> - if (!disable_idle_d3)
> + if (!vfio_pci_dev_set_try_reset(vdev->vdev.dev_set) && !disable_idle_d3)
>   vfio_pci_set_power_state(vdev, PCI_D3hot);
>  }
>  
> @@ -2145,36 +2146,6 @@ static struct pci_driver vfio_pci_driver = {
>   .err_handler= &vfio_err_handlers,
>  };
>  
> -static int vfio_pci_get_unused_devs(struct pci_dev *pdev, void *data)
> -{
> - struct vfio_devices *devs = data;
> - struct vfio_device *device;
> - struct vfio_pci_device *vdev;
> -
> - if (devs->cur_index == devs->max_index)
> - return -ENOSPC;
> -
> - device = vfio_device_get_from_dev(&pdev->dev);
> - if (!device)
> - return -EINVAL;
> -
> - if (pci_dev_driver(pdev) != &vfio_pci_driver) {
> - vfio_device_put(device);
> - return -EBUSY;
> - }
> -
> - vdev = container_of(device, struct vfio_pci_device, vdev);
> -
> - /* Fault if the device is not unused */
> - if (device->open_count) {
> - vfio_device_put(device);
> - return -EBUSY;
> - }
> -
> - devs->devices[devs->cur_index++] = vdev;
> - return 0;
> -}
> -
>  static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void *data)
>  {
>   struct vfio_devices *devs = data;
> @@ -2208,79 +2179,86 @@ static int vfio_pci_try_zap_and_vma_lock_cb(struct 
> pci_dev *pdev, void *data)
>   return 0;
>  }
>  
> +static int vfio_pci_is_device_in_set(struct pci_dev *pdev, void *data)
> +{
> + struct vfio_device_set *dev_set = data;
> + struct vfio_device *cur;
> +
> + lockdep_assert_held(&dev_set->lock);
> +
> + list_for_each_entry(cur, &dev_set->device_list, dev_set_list)
> + if (cur->dev == &pdev->dev)
> + return 0;
> + return -EBUSY;
> +}
> +
> +static bool vfio_pci_dev_set_needs_reset(struct vfio_device_set *dev_set)

Slight nit on the name here since we're essentially combining
needs_reset along with the notion of the device being unused.  I'm not
sure, maybe "should_reset"?  Otherwise it looks ok.  Thanks,

Alex

> +{
> + struct vfio_pci_device *cur;
> + bool needs_reset = false;
> +
> + list_for_each_entry(cur, &dev_set->device_list, vdev.dev_set_list) {
> + /* No VFIO device in the set can have an open device FD */
> + if (cur->vdev.open_count)
> + return false;
> + needs_reset |= cur->needs_reset;
> + }
> + return needs_reset;
> +}
> +
>  /*
> - * If a bus or slot reset is available for the provid

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Begin enabling Xe_HP SDV and DG2 platforms (rev9)

2021-08-05 Thread Patchwork
== Series Details ==

Series: Begin enabling Xe_HP SDV and DG2 platforms (rev9)
URL   : https://patchwork.freedesktop.org/series/92135/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
515352629f0e drm/i915/dg2: Add support for new DG2-G11 revid 0x5
b8058f4b6221 drm/i915/xehp: Loop over all gslices for INSTDONE processing
-:135: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'iter_' - possible 
side-effects?
#135: FILE: drivers/gpu/drm/i915/gt/intel_engine_types.h:582:
+#define for_each_instdone_gslice_dss_xehp(dev_priv_, sseu_, iter_, gslice_, 
dss_) \
+   for ((iter_) = 0, (gslice_) = 0, (dss_) = 0; \
+(iter_) < GEN_MAX_SUBSLICES; \
+(iter_)++, (gslice_) = (iter_) / GEN_DSS_PER_GSLICE, \
+(dss_) = (iter_) % GEN_DSS_PER_GSLICE) \
+   for_each_if(intel_sseu_has_subslice((sseu_), 0, (iter_)))

-:135: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'gslice_' - possible 
side-effects?
#135: FILE: drivers/gpu/drm/i915/gt/intel_engine_types.h:582:
+#define for_each_instdone_gslice_dss_xehp(dev_priv_, sseu_, iter_, gslice_, 
dss_) \
+   for ((iter_) = 0, (gslice_) = 0, (dss_) = 0; \
+(iter_) < GEN_MAX_SUBSLICES; \
+(iter_)++, (gslice_) = (iter_) / GEN_DSS_PER_GSLICE, \
+(dss_) = (iter_) % GEN_DSS_PER_GSLICE) \
+   for_each_if(intel_sseu_has_subslice((sseu_), 0, (iter_)))

-:135: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'dss_' - possible 
side-effects?
#135: FILE: drivers/gpu/drm/i915/gt/intel_engine_types.h:582:
+#define for_each_instdone_gslice_dss_xehp(dev_priv_, sseu_, iter_, gslice_, 
dss_) \
+   for ((iter_) = 0, (gslice_) = 0, (dss_) = 0; \
+(iter_) < GEN_MAX_SUBSLICES; \
+(iter_)++, (gslice_) = (iter_) / GEN_DSS_PER_GSLICE, \
+(dss_) = (iter_) % GEN_DSS_PER_GSLICE) \
+   for_each_if(intel_sseu_has_subslice((sseu_), 0, (iter_)))

total: 0 errors, 0 warnings, 3 checks, 164 lines checked
8b84974c7782 drm/i915/dg2: Report INSTDONE_GEOM values in error state
0b355c3ad3ca drm/i915/xehpsdv: Add compute DSS type
f61fbc6c1d71 drm/i915/xehpsdv: factor out function to read RP_STATE_CAP
7c5f8a298512 drm/i915/xehpsdv: Read correct RP_STATE_CAP register
de625d7a0adc drm/i915/dg2: Add new LRI reg offsets
af46376d4b74 drm/i915/dg2: Maintain backward-compatible nested batch behavior
e8ef3eecff4f drm/i915/dg2: Configure PCON in DP pre-enable path




[Intel-gfx] ✗ Fi.CI.SPARSE: warning for Begin enabling Xe_HP SDV and DG2 platforms (rev9)

2021-08-05 Thread Patchwork
== Series Details ==

Series: Begin enabling Xe_HP SDV and DG2 platforms (rev9)
URL   : https://patchwork.freedesktop.org/series/92135/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-
+drivers/gpu/drm/i915/display/intel_display.c:1901:21:expected struct 
i915_vma *[assigned] vma
+drivers/gpu/drm/i915/display/intel_display.c:1901:21:got void [noderef] 
__iomem *[assigned] iomem
+drivers/gpu/drm/i915/display/intel_display.c:1901:21: warning: incorrect type 
in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1410:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1410:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1410:34: warning: incorrect type 
in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:expected struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:got struct 
i915_address_space *
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25: warning: incorrect 
type in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34: warning: incorrect 
type in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_reset.c:1392:5: warning: context imbalance in 
'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/gt/intel_ring_submission.c:1268:24: warning: Using plain 
integer as NULL pointer
+drivers/gpu/drm/i915/i915_perf.c:1442:15: warning: memset with byte count of 
16777216
+drivers/gpu/drm/i915/i915_perf.c:1496:15: warning: memset with byte count of 
16777216
+./include/asm-generic/bitops/find.h:112:45: warning: shift count is negative 
(-262080)
+./include/asm-generic/bitops/find.h:32:31: warning: shift count is negative 
(-262080)
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: w

Re: [Intel-gfx] [PATCH v5 01/20] drm/sched: Split drm_sched_job_init

2021-08-05 Thread Christian König

Am 05.08.21 um 12:46 schrieb Daniel Vetter:

This is a very confusingly named function, because not only does it
init an object, it also arms it and provides a point of no return for
pushing a job into the scheduler. It would be nice if that were a bit
clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.
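
The resulting driver-side flow looks roughly like this (a sketch only;
error handling and driver-specific steps elided):

    ret = drm_sched_job_init(&job->base, entity, owner);
    if (ret)
            return ret;
    /* ... allocate resources, gather dependencies; bailing out here is
     * still allowed and pairs with drm_sched_job_cleanup() ...
     */
    drm_sched_job_arm(&job->base);  /* the point of no return */
    drm_sched_entity_push_job(&job->base, entity);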

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

v3: Emma noticed that I leak the memory allocated in
drm_sched_job_init if we bail out before the point of no return in
subsequent driver patches. To be able to fix this change
drm_sched_job_cleanup() so it can handle being called both before and
after drm_sched_job_arm().

Also improve the kerneldoc for this.

v4:
- Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
   usual (Melissa)

- Christian pointed out that drm_sched_entity_select_rq() also needs
   to be moved into drm_sched_job_arm, which made me realize that the
   job->id definitely needs to be moved too.

   Shuffle things to fit between job_init and job_arm.

v5:
Reshuffle the split between init/arm once more, amdgpu abuses
drm_sched.ready to signal gpu reset failures. Also document this
somewhat. (Christian)

v6:
Rebase on top of the msm drm/sched support. Note that the
drm_sched_job_init() call is completely misplaced, and hence also the
split-out drm_sched_entity_push_job(). I've put in a FIXME which the next
patch will address.

Acked-by: Melissa Wen 
Cc: Melissa Wen 
Acked-by: Emma Anholt 
Acked-by: Steven Price  (v2)
Reviewed-by: Boris Brezillon  (v5)
Signed-off-by: Daniel Vetter 


At least the amdgpu parts look ok offhand, but I can't judge the rest,
I think.


So only Acked-by: Christian König 


Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Qiang Yu 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Masahiro Yamada 
Cc: Kees Cook 
Cc: Adam Borowski 
Cc: Nick Terrell 
Cc: Mauro Carvalho Chehab 
Cc: Paul Menzel 
Cc: Sami Tolvanen 
Cc: Viresh Kumar 
Cc: Alex Deucher 
Cc: Dave Airlie 
Cc: Nirmoy Das 
Cc: Deepak R Varma 
Cc: Lee Jones 
Cc: Kevin Wang 
Cc: Chen Li 
Cc: Luben Tuikov 
Cc: "Marek Olšák" 
Cc: Dennis Li 
Cc: Maarten Lankhorst 
Cc: Andrey Grodzovsky 
Cc: Sonny Jiang 
Cc: Boris Brezillon 
Cc: Tian Tao 
Cc: etna...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: Emma Anholt 
Cc: Rob Clark 
Cc: Sean Paul 
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +
  drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 +
  drivers/gpu/drm/lima/lima_sched.c|  2 +
  drivers/gpu/drm/msm/msm_gem_submit.c |  3 ++
  drivers/gpu/drm/panfrost/panfrost_job.c  |  2 +
  drivers/gpu/drm/scheduler/sched_entity.c |  6 +--
  drivers/gpu/drm/scheduler/sched_fence.c  | 19 ---
  drivers/gpu/drm/scheduler/sched_main.c   | 69 
  drivers/gpu/drm/v3d/v3d_gem.c|  2 +
  include/drm/gpu_scheduler.h  |  7 ++-
  11 files changed, 94 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 139cd3bf1ad6..32e80bc6af22 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
if (r)
goto error_unlock;
  
+	drm_sched_job_arm(&job->base);

+
/* No memory allocation is allowed while holding the notifier lock.
 * The lock is held until amdgpu_cs_submit is finished and fence is
 * added to BOs.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index d33e6d97cc89..5ddb955d2315 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
if (r)
return r;
  
+	drm_sched_job_arm(&job->base);

+
*f = dma_fence_get(&job->base.s_fence->finished);
amdgpu_job_free_resources(job);
drm_sched_entity_push_job(&job->base, entity);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index feb6da1b6ceb..05f412204118 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity 
*sched_entity,
if (ret)
goto out_unlock;
  
+	drm_sched_job_arm(&submit->sched_job);

+
submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);

Re: [Intel-gfx] [PATCH v5 03/20] drm/sched: Barriers are needed for entity->last_scheduled

2021-08-05 Thread Christian König




Am 05.08.21 um 12:46 schrieb Daniel Vetter:

It might be good enough on x86 with just READ_ONCE, but the write side
should then at least be WRITE_ONCE because x86 has total store order.

It's definitely not enough on arm.

Fix this properly, which means
- explain the need for the barrier in both places
- point at the other side in each comment

Also pull out the !sched_list case as the first check, so that the
code flow is clearer.

While at it, sprinkle some comments around because it was very
non-obvious to me what's actually going on here and why.

Note that we really need full barriers here; at first I thought
store-release and load-acquire on ->last_scheduled would be enough,
but we actually require ordering between that and the queue state.
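
In sketch form, the pairing being relied on (mirroring the hunks
below):

    /* scheduler thread, drm_sched_entity_pop_job() */
    entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished);
    smp_wmb();      /* publish before the dequeue */
    spsc_queue_pop(&entity->job_queue);

    /* submit path, drm_sched_entity_select_rq() */
    if (spsc_queue_count(&entity->job_queue))
            return; /* queue non-empty, don't touch last_scheduled */
    smp_rmb();      /* pairs with the smp_wmb() above */
    fence = entity->last_scheduled;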

v2: Put smp_rmp() in the right place and fix up comment (Andrey)

Acked-by: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Steven Price 
Cc: Daniel Vetter 
Cc: Andrey Grodzovsky 
Cc: Lee Jones 
Cc: Boris Brezillon 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/scheduler/sched_entity.c | 27 ++--
  1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index f7347c284886..89e3f6eaf519 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -439,8 +439,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
drm_sched_entity *entity)
dma_fence_set_error(&sched_job->s_fence->finished, -ECANCELED);
  
  	dma_fence_put(entity->last_scheduled);

+
entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished);
  
+	/*

+* If the queue is empty we allow drm_sched_entity_select_rq() to
+* locklessly access ->last_scheduled. This only works if we set the
+* pointer before we dequeue and if we add a write barrier here.
+*/
+   smp_wmb();
+
spsc_queue_pop(&entity->job_queue);
return sched_job;
  }
@@ -459,10 +467,25 @@ void drm_sched_entity_select_rq(struct drm_sched_entity 
*entity)
struct drm_gpu_scheduler *sched;
struct drm_sched_rq *rq;
  
-	if (spsc_queue_count(&entity->job_queue) || !entity->sched_list)

+   /* single possible engine and already selected */
+   if (!entity->sched_list)
+   return;
+
+   /* queue non-empty, stay on the same engine */
+   if (spsc_queue_count(&entity->job_queue))
return;
  
-	fence = READ_ONCE(entity->last_scheduled);

+   /*
+* Only when the queue is empty are we guaranteed that the scheduler
+* thread cannot change ->last_scheduled. To enforce ordering we need
+* a read barrier here. See drm_sched_entity_pop_job() for the other
+* side.
+*/
+   smp_rmb();
+
+   fence = entity->last_scheduled;
+
+   /* stay on the same engine if the previous job hasn't finished */
if (fence && !dma_fence_is_signaled(fence))
return;
  




Re: [Intel-gfx] [PATCH v5 04/20] drm/sched: Add dependency tracking

2021-08-05 Thread Christian König

Am 05.08.21 um 12:46 schrieb Daniel Vetter:

Instead of just a callback, we can glue in the gem helpers that
panfrost, v3d and lima currently use. There really aren't that many
ways to skin this cat.

v2/3: Rebased.

v4: Repaint this shed. The functions are now called _add_dependency()
and _add_implicit_dependency()
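
Typical driver usage then becomes (a sketch: in_fence, obj, write and
err_cleanup are placeholders; note that drm_sched_job_add_dependency()
consumes the fence in both the success and error cases):

    /* explicit in-fence, e.g. a syncobj fence from userspace */
    ret = drm_sched_job_add_dependency(&job->base, in_fence);
    if (ret)
            goto err_cleanup;

    /* implicit fences from the BO's reservation object */
    ret = drm_sched_job_add_implicit_dependencies(&job->base, obj, write);
    if (ret)
            goto err_cleanup;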

Reviewed-by: Boris Brezillon  (v3)
Reviewed-by: Steven Price  (v1)
Acked-by: Melissa Wen 
Signed-off-by: Daniel Vetter 


Reviewed-by: Christian König 


Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Andrey Grodzovsky 
Cc: Lee Jones 
Cc: Nirmoy Das 
Cc: Boris Brezillon 
Cc: Luben Tuikov 
Cc: Alex Deucher 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
  drivers/gpu/drm/scheduler/sched_entity.c |  18 +++-
  drivers/gpu/drm/scheduler/sched_main.c   | 104 +++
  include/drm/gpu_scheduler.h  |  33 ++-
  3 files changed, 149 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 89e3f6eaf519..381fbf462ea7 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -211,6 +211,19 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence 
*f,
job->sched->ops->free_job(job);
  }
  
+static struct dma_fence *

+drm_sched_job_dependency(struct drm_sched_job *job,
+struct drm_sched_entity *entity)
+{
+   if (!xa_empty(&job->dependencies))
+   return xa_erase(&job->dependencies, job->last_dependency++);
+
+   if (job->sched->ops->dependency)
+   return job->sched->ops->dependency(job, entity);
+
+   return NULL;
+}
+
  /**
   * drm_sched_entity_kill_jobs - Make sure all remaining jobs are killed
   *
@@ -229,7 +242,7 @@ static void drm_sched_entity_kill_jobs(struct 
drm_sched_entity *entity)
struct drm_sched_fence *s_fence = job->s_fence;
  
  		/* Wait for all dependencies to avoid data corruptions */

-   while ((f = job->sched->ops->dependency(job, entity)))
+   while ((f = drm_sched_job_dependency(job, entity)))
dma_fence_wait(f, false);
  
  		drm_sched_fence_scheduled(s_fence);

@@ -419,7 +432,6 @@ static bool drm_sched_entity_add_dependency_cb(struct 
drm_sched_entity *entity)
   */
  struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity 
*entity)
  {
-   struct drm_gpu_scheduler *sched = entity->rq->sched;
struct drm_sched_job *sched_job;
  
  	sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));

@@ -427,7 +439,7 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
drm_sched_entity *entity)
return NULL;
  
  	while ((entity->dependency =

-   sched->ops->dependency(sched_job, entity))) {
+   drm_sched_job_dependency(sched_job, entity))) {
trace_drm_sched_job_wait_dep(sched_job, entity->dependency);
  
  		if (drm_sched_entity_add_dependency_cb(entity))

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 454cb6164bdc..f77456929139 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -603,6 +603,8 @@ int drm_sched_job_init(struct drm_sched_job *job,
  
  	INIT_LIST_HEAD(&job->list);
  
+	xa_init_flags(&job->dependencies, XA_FLAGS_ALLOC);

+
return 0;
  }
  EXPORT_SYMBOL(drm_sched_job_init);
@@ -637,6 +639,99 @@ void drm_sched_job_arm(struct drm_sched_job *job)
  }
  EXPORT_SYMBOL(drm_sched_job_arm);
  
+/**

+ * drm_sched_job_add_dependency - adds the fence as a job dependency
+ * @job: scheduler job to add the dependencies to
+ * @fence: the dma_fence to add to the list of dependencies.
+ *
+ * Note that @fence is consumed in both the success and error cases.
+ *
+ * Returns:
+ * 0 on success, or an error on failing to expand the array.
+ */
+int drm_sched_job_add_dependency(struct drm_sched_job *job,
+struct dma_fence *fence)
+{
+   struct dma_fence *entry;
+   unsigned long index;
+   u32 id = 0;
+   int ret;
+
+   if (!fence)
+   return 0;
+
+   /* Deduplicate if we already depend on a fence from the same context.
+* This lets the size of the array of deps scale with the number of
+* engines involved, rather than the number of BOs.
+*/
+   xa_for_each(&job->dependencies, index, entry) {
+   if (entry->context != fence->context)
+   continue;
+
+   if (dma_fence_is_later(fence, entry)) {
+   dma_fence_put(entry);
+   xa_store(&job->dependencies, index, fence, GFP_KERNEL);
+   } else {
+   dma_fence_put(fence);
+   }
+   return 0;
+   }
+
+   ret = xa_alloc(&job->dependencies, &id, fen

Re: [Intel-gfx] [PATCH v5 05/20] drm/sched: drop entity parameter from drm_sched_push_job

2021-08-05 Thread Christian König

Am 05.08.21 um 12:46 schrieb Daniel Vetter:

Originally a job was only bound to the queue when we pushed it, but
now that's done in drm_sched_job_init, making that parameter entirely
redundant.

Remove it.

The same applies to the context parameter in
lima_sched_context_queue_task; simplify that too.

v2:
Rebase on top of msm adopting drm/sched

Acked-by: Emma Anholt 
Acked-by: Melissa Wen 
Reviewed-by: Steven Price  (v1)
Reviewed-by: Boris Brezillon  (v1)
Signed-off-by: Daniel Vetter 


Reviewed-by: Christian König 


Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Qiang Yu 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: Emma Anholt 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Alex Deucher 
Cc: Nirmoy Das 
Cc: Dave Airlie 
Cc: Chen Li 
Cc: Lee Jones 
Cc: Deepak R Varma 
Cc: Kevin Wang 
Cc: Luben Tuikov 
Cc: "Marek Olšák" 
Cc: Maarten Lankhorst 
Cc: Andrey Grodzovsky 
Cc: Dennis Li 
Cc: Boris Brezillon 
Cc: etna...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: Rob Clark 
Cc: Sean Paul 
Cc: Melissa Wen 
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   | 2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  | 2 +-
  drivers/gpu/drm/etnaviv/etnaviv_sched.c  | 2 +-
  drivers/gpu/drm/lima/lima_gem.c  | 3 +--
  drivers/gpu/drm/lima/lima_sched.c| 5 ++---
  drivers/gpu/drm/lima/lima_sched.h| 3 +--
  drivers/gpu/drm/msm/msm_gem_submit.c | 2 +-
  drivers/gpu/drm/panfrost/panfrost_job.c  | 2 +-
  drivers/gpu/drm/scheduler/sched_entity.c | 6 ++
  drivers/gpu/drm/v3d/v3d_gem.c| 2 +-
  include/drm/gpu_scheduler.h  | 3 +--
  11 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 32e80bc6af22..1d8a914108af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1267,7 +1267,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
  
  	trace_amdgpu_cs_ioctl(job);

amdgpu_vm_bo_trace_cs(&fpriv->vm, &p->ticket);
-   drm_sched_entity_push_job(&job->base, entity);
+   drm_sched_entity_push_job(&job->base);
  
  	amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm);
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c

index 5ddb955d2315..b86099c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -174,7 +174,7 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
  
  	*f = dma_fence_get(&job->base.s_fence->finished);

amdgpu_job_free_resources(job);
-   drm_sched_entity_push_job(&job->base, entity);
+   drm_sched_entity_push_job(&job->base);
  
  	return 0;

  }
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 05f412204118..180bb633d5c5 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -178,7 +178,7 @@ int etnaviv_sched_push_job(struct drm_sched_entity 
*sched_entity,
/* the scheduler holds on to the job now */
kref_get(&submit->refcount);
  
-	drm_sched_entity_push_job(&submit->sched_job, sched_entity);

+   drm_sched_entity_push_job(&submit->sched_job);
  
  out_unlock:

mutex_unlock(&submit->gpu->fence_lock);
diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index de62966243cd..c528f40981bb 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -359,8 +359,7 @@ int lima_gem_submit(struct drm_file *file, struct 
lima_submit *submit)
goto err_out2;
}
  
-	fence = lima_sched_context_queue_task(
-			submit->ctx->context + submit->pipe, submit->task);
+   fence = lima_sched_context_queue_task(submit->task);
  
  	for (i = 0; i < submit->nr_bos; i++) {

if (submit->bos[i].flags & LIMA_SUBMIT_BO_WRITE)
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index 38f755580507..e968b5a8f0b0 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -177,13 +177,12 @@ void lima_sched_context_fini(struct lima_sched_pipe *pipe,
drm_sched_entity_fini(&context->base);
  }
  
-struct dma_fence *lima_sched_context_queue_task(struct lima_sched_context *context,
-						struct lima_sched_task *task)
+struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
  {
struct dma_fence *fence = dma_fence_get(&task->base.s_fence->finished);
  
  	trace_lima_task_submit(task);

-   drm_sched_entity_push_job(&task->base, &context->base);
+	drm_sched_entity_push_job(&task->base);

Re: [Intel-gfx] [PATCH v5 14/20] drm/sched: Don't store self-dependencies

2021-08-05 Thread Christian König

On 05.08.21 at 15:25, Daniel Vetter wrote:

On Thu, Aug 5, 2021 at 3:18 PM Christian König  wrote:



On 05.08.21 at 12:46, Daniel Vetter wrote:

This is essentially part of drm_sched_dependency_optimized(), which
only amdgpu seems to make use of. Use it a bit more.

This would mean that as-is amdgpu can't use the dependency helpers, at
least not with the current approach amdgpu has for deciding whether a
vm_flush is needed. Since amdgpu also has very special rules around
implicit fencing it can't use those helpers either, and adding a
drm_sched_job_await_fence_always or similar for amdgpu wouldn't be too
onerous. That way the special-case handling for amdgpu sticks out even
more, and we have higher chances that reviewers who go across all
drivers won't miss it.

Well you should probably drop the sentence about the vm_flush, this is
completely unrelated.

Additionally, I still don't think that this is a good idea.
Dependency handling is something completely driver-specific.

E.g. even when you have submitted jobs back to back they still might
need a cache flush in between, and that is not only the case for amdgpu.

What you can do is optimize while looking at the fences later on, and
then note that you have done so and which hw fence you last used
instead.

Out of 6 drivers using drm/sched 5 can use this. When we get i915
over, that one will be added to the list. amdgpu can't use any of this
anyway due to the vm_id allocation requirements, which is why I
mention that. Also note that all the callbacks are still there, so you
can just ignore this all and still build your own. Like amdgpu does.


The VMID allocation stuff is rather easy to handle, that's why I noted 
we should remove that sentence.


The problematic stuff is handling the cache flush and pipeline sync 
which you make impossible with this here.



So I'm not sure what exactly your objection is, aside from "this doesn't
fit for amdgpu", which a) I know b) the commit message explains c)
doesn't actually hurt amdgpu in the slightest. And we still get the
benefit that for most drivers it's a nice optimization.


Well exactly that's what I wanted to avoid. We still can use this in 
amdgpu even with the VMID allocation stuff and I still hope to do so.


Can't we add this as a wrapper or similar?

Christian.


-Daniel


Regards,
Christian.


Reviewed-by: Lucas Stach 
Acked-by: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Daniel Vetter 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky 
Cc: Alex Deucher 
---
   drivers/gpu/drm/scheduler/sched_main.c | 7 +++
   1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index f77456929139..49e507f91ec0 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -660,6 +660,13 @@ int drm_sched_job_add_dependency(struct drm_sched_job *job,
   if (!fence)
   return 0;

+ /* if it's a fence from us it's guaranteed to be earlier */
+ if (fence->context == job->entity->fence_context ||
+ fence->context == job->entity->fence_context + 1) {
+ dma_fence_put(fence);
+ return 0;
+ }
+
   /* Deduplicate if we already depend on a fence from the same context.
* This lets the size of the array of deps scale with the number of
* engines involved, rather than the number of BOs.
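
Context for the two compares above, for readers following along: a
scheduler entity allocates two dma_fence contexts back to back, one for
its scheduled fences and one for its finished fences. Roughly, from
drm_sched_entity_init():

	entity->fence_context = dma_fence_context_alloc(2);
	/* scheduled fences use fence_context,
	 * finished fences use fence_context + 1
	 */

which is why both fence_context and fence_context + 1 identify fences
emitted by the entity itself.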






[Intel-gfx] ✓ Fi.CI.BAT: success for Begin enabling Xe_HP SDV and DG2 platforms (rev9)

2021-08-05 Thread Patchwork
== Series Details ==

Series: Begin enabling Xe_HP SDV and DG2 platforms (rev9)
URL   : https://patchwork.freedesktop.org/series/92135/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10451 -> Patchwork_20776


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20776:

### IGT changes ###

 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@i915_selftest@live@gt_timelines:
- {fi-ehl-2}: [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/fi-ehl-2/igt@i915_selftest@live@gt_timelines.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/fi-ehl-2/igt@i915_selftest@live@gt_timelines.html

  
Known issues


  Here are the changes found in Patchwork_20776 that come from known issues:

### IGT changes ###

 Possible fixes 

  * igt@kms_chamelium@hdmi-hpd-fast:
- fi-icl-u2:  [DMESG-WARN][3] ([i915#2203] / [i915#2868]) -> 
[PASS][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/fi-icl-u2/igt@kms_chamel...@hdmi-hpd-fast.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/fi-icl-u2/igt@kms_chamel...@hdmi-hpd-fast.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#2203]: https://gitlab.freedesktop.org/drm/intel/issues/2203
  [i915#2868]: https://gitlab.freedesktop.org/drm/intel/issues/2868


Participating hosts (40 -> 34)
--

  Missing(6): fi-ilk-m540 fi-hsw-4200u fi-tgl-1115g4 fi-bsw-cyan 
fi-bdw-samus bat-jsl-1 


Build changes
-

  * Linux: CI_DRM_10451 -> Patchwork_20776

  CI-20190529: 20190529
  CI_DRM_10451: 3bea0ad83735904d380d83bcca30557268acf887 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6160: 4287344dd6a39d9036c5fb9a047a7d8f10bee981 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20776: e8ef3eecff4fdf295eeb9d88287bd5fe99f1ad11 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

e8ef3eecff4f drm/i915/dg2: Configure PCON in DP pre-enable path
af46376d4b74 drm/i915/dg2: Maintain backward-compatible nested batch behavior
de625d7a0adc drm/i915/dg2: Add new LRI reg offsets
7c5f8a298512 drm/i915/xehpsdv: Read correct RP_STATE_CAP register
f61fbc6c1d71 drm/i915/xehpsdv: factor out function to read RP_STATE_CAP
0b355c3ad3ca drm/i915/xehpsdv: Add compute DSS type
8b84974c7782 drm/i915/dg2: Report INSTDONE_GEOM values in error state
b8058f4b6221 drm/i915/xehp: Loop over all gslices for INSTDONE processing
515352629f0e drm/i915/dg2: Add support for new DG2-G11 revid 0x5

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/index.html


Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Be more gentle when exiting non-persistent contexts

2021-08-05 Thread Matthew Brost
On Thu, Aug 05, 2021 at 05:10:29PM +0100, Tvrtko Ursulin wrote:
> 
> On 05/08/2021 16:04, Patchwork wrote:
> > *Patch Details*
> > *Series:*   drm/i915: Be more gentle when exiting non-persistent contexts
> > *URL:*  https://patchwork.freedesktop.org/series/93420/
> > 
> > *State:*failure
> > *Details:*
> > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/index.html
> > 
> > 
> > 
> >   CI Bug Log - changes from CI_DRM_10450 -> Patchwork_20775
> > 
> > 
> > Summary
> > 
> > *FAILURE*
> > 
> > Serious unknown changes coming with Patchwork_20775 absolutely need to be
> > verified manually.
> > 
> > If you think the reported changes have nothing to do with the changes
> > introduced in Patchwork_20775, please notify your bug team to allow them
> > to document this new failure mode, which will reduce false positives in CI.
> > 
> > External URL:
> > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20775/index.html
> > 
> > 
> > Possible new issues
> > 
> > Here are the unknown changes that may have been introduced in
> > Patchwork_20775:
> > 
> > 
> >   IGT changes
> > 
> > 
> > Possible regressions
> > 
> >   * igt@i915_selftest@live@gt_lrc:
> >   o fi-rkl-guc: PASS -> DMESG-WARN
> > 
> 
> <6> [233.928677] i915: Running intel_lrc_live_selftests/live_lrc_isolation
> <3> [233.988780] i915 :00:02.0: [drm] *ERROR* rcs0 context redzone 
> overwritten!
> 
> Something GuC specific by the look of it, or at least I haven't found the 
> same signature elsewhere. But in any case it is not related to this patch.
> 

Not sure what this is about. Ran this locally on a RKL machine and it
passed just fine for me. Something to keep an eye on as CI gets fully
enabled with GuC submission.

Also BTW, speaking of CI & GuC submission, it isn't all that great yet.
Maybe ping me when you have the next rev of this patch and I can run a
series of tests with GuC submission related to banning / persistence.

Matt

> Regards,
> 
> Tvrtko
> 
> > 
> > 
> > Known issues
> > 
> > Here are the changes found in Patchwork_20775 that come from known issues:
> > 
> > 
> >   IGT changes
> > 
> > 
> > Issues hit
> > 
> >   * igt@amdgpu/amd_basic@query-info:
> >   o fi-bsw-kefka: NOTRUN -> SKIP (fdo#109271) +17 similar issues
> > 
> >   * igt@gem_exec_fence@basic-busy@bcs0:
> >   o fi-kbl-soraka: NOTRUN -> SKIP (fdo#109271) +26 similar issues
> > 
> >   * igt@gem_huc_copy@huc-copy:
> >   o fi-kbl-soraka: NOTRUN -> SKIP (fdo#109271 / i915#2190)
> > 
> >   * igt@i915_pm_rpm@basic-rte:
> >   o fi-kbl-soraka: NOTRUN -> FAIL (i915#579)
> > 
> >   * igt@i915_selftest@live@gt_pm:
> >   o fi-kbl-soraka: NOTRUN -> DMESG-FAIL (i915#1886 / i915#2291)
> > 
> >   * igt@i915_selftest@live@late_gt_pm:
> >   o fi-bsw-nick: PASS -> DMESG-FAIL (i915#2927)
> > 
> >   * igt@kms_chamelium@common-hpd-after-suspend:
> >   o fi-kbl-soraka: NOTRUN -> SKIP
> > 

Re: [Intel-gfx] [PATCH 4/4] DO_NOT_MERGE: drm/i915/display: Enable PSR2 selective fetch by default

2021-08-05 Thread Gwan-gyeong Mun




On 8/3/21 8:18 PM, Souza, Jose wrote:

On Tue, 2021-08-03 at 14:17 +0300, Gwan-gyeong Mun wrote:


On 7/31/21 3:10 AM, José Roberto de Souza wrote:

Only to execute tests with PSR2 selective fetch enabled and check what
is broken.

IGT tests known to fail with this:
- kms_cursor_legacy: all tests that check if evasion happened; I have
a fix for it making cursor_slowpath() return true for display 12+.

- kms_psr2_su: The pageflip test; it needs to have the damage clip set,
otherwise it will update the whole screen and the selective blocks
will not match what is expected.


kms_psr2_su is a test case for intel PSR2 HW tracking and kms_psr2_sf is
used as a test for intel PSR2 manual tracking. Is it necessary to modify
kms_psr2_su for testing PSR2 manual tracking?


kms_psr2_su is to test that PSR2 is sending selective updates; just by
adding a couple of lines we can make it work with selective fetch.


- kms_psr: psr2_*_(mmap_gtt, mmap_cpu, blt and render), all those
tests should be dropped or skipped for display 12+.


Could you explain in more detail why we need to skip on display 12+?


These are the tests that would end up calling intel_psr_invalidate/flush().



Thanks for the explanation.
And there is an issue confirmed in local tests, so I leave additional 
comments.



Signed-off-by: José Roberto de Souza 
---
   drivers/gpu/drm/i915/display/intel_psr.c | 9 -
   drivers/gpu/drm/i915/i915_params.h   | 2 +-
   2 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_psr.c 
b/drivers/gpu/drm/i915/display/intel_psr.c
index 894a2d35668a2..e128f0c2aeecc 100644
--- a/drivers/gpu/drm/i915/display/intel_psr.c
+++ b/drivers/gpu/drm/i915/display/intel_psr.c
@@ -877,15 +877,6 @@ static bool intel_psr2_config_valid(struct intel_dp 
*intel_dp,
   return false;
   }

-/*
- * We are missing the implementation of some workarounds to enabled PSR2
- * in Alderlake_P, until ready PSR2 should be kept disabled.
- */
-if (IS_ALDERLAKE_P(dev_priv)) {
-drm_dbg_kms(&dev_priv->drm, "PSR2 is missing the implementation of 
workarounds\n");
-return false;
-}
-
   if (!transcoder_has_psr2(dev_priv, crtc_state->cpu_transcoder)) {
   drm_dbg_kms(&dev_priv->drm,
   "PSR2 not supported in transcoder %s\n",
diff --git a/drivers/gpu/drm/i915/i915_params.h 
b/drivers/gpu/drm/i915/i915_params.h
index f27eceb82c0f5..8d725b64592d8 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -55,7 +55,7 @@ struct drm_printer;
   param(int, enable_fbc, -1, 0600) \
   param(int, enable_psr, -1, 0600) \
   param(bool, psr_safest_params, false, 0400) \
-param(bool, enable_psr2_sel_fetch, false, 0400) \
+param(bool, enable_psr2_sel_fetch, true, 0400) \
If we do not modify this part and do not enable it by default at boot 
time as shown in the original code below,

param(bool, enable_psr2_sel_fetch, false, 0400) \

when we execute the kms_psr2_sf test case of igt, the FIFO underrun as 
below still occurs.


i915 0000:00:02.0: [drm] *ERROR* CPU pipe A FIFO underrun: port,transcoder,

When a PSR2 panel is used, PSR1 is enabled by default as long as
enable_psr2_sel_fetch stays disabled.
And when kms_psr2_sf is executed, the mode is changed to PSR2; when
kms_psr2_sf terminates, PSR2 is deactivated and PSR1 is re-enabled.
At this point, I suspect there is a problem.



   param(int, disable_power_well, -1, 0400) \
   param(int, enable_ips, 1, 0600) \
   param(int, invert_brightness, 0, 0600) \
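
For anyone reproducing this locally without carrying the patch: the same
effect as flipping the default can be had by setting the module parameter
at boot, e.g. i915.enable_psr2_sel_fetch=1 on the kernel command line or
via modprobe.d. That is standard module-parameter plumbing, nothing added
by this series.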





Re: [Intel-gfx] [PATCH] drm/i915: Update small joiner ram size

2021-08-05 Thread Navare, Manasi
On Thu, Aug 05, 2021 at 03:49:37PM +0530, Vandita Kulkarni wrote:
> XeLPD supports a larger small joiner RAM.
> 
> Signed-off-by: Vandita Kulkarni 
> ---
>  drivers/gpu/drm/i915/display/intel_dp.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
> b/drivers/gpu/drm/i915/display/intel_dp.c
> index 75d4ebc66941..d174f0d6e7cd 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp.c
> +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> @@ -461,7 +461,9 @@ u32 intel_dp_mode_to_fec_clock(u32 mode_clock)
>  static int
>  small_joiner_ram_size_bits(struct drm_i915_private *i915)
>  {
> - if (DISPLAY_VER(i915) >= 11)
> + if (DISPLAY_VER(i915) >= 13)
> + return 17280 * 8;

Verified from the Bspec, looks good to me.

Reviewed-by: Manasi Navare 

Manasi

> + else if (DISPLAY_VER(i915) >= 11)
>   return 7680 * 8;
>   else
>   return 6144 * 8;
> -- 
> 2.32.0
> 


Re: [Intel-gfx] [PATCH v4 3/4] drm/shmem-helpers: Allocate wc pages on x86

2021-08-05 Thread Thomas Zimmermann

Hi

On 23.07.21 at 09:36, Daniel Vetter wrote:


The real fix is to get at the architecture-specific wc allocator, which is
currently not something that's exposed, but hidden within the dma api. I
think having this stick out like this is better than hiding it behind fake
generic code (like we do with drm_clflush, which defacto also only really
works on x86).

Also note that ttm has the exact same ifdef in its page allocator, but it
does fall back to using dma_alloc_coherent on other platforms.


If this fixes a real problem and there's no full solution yet, let's 
take what we have. So if you can extract the essence of this comment 
into a TODO comment that tells how to fix the issue, feel free to add my


Acked-by: Thomas Zimmermann 

Best regards
Thomas


-Daniel


Best regards
Thomas


+
shmem->pages = pages;
return 0;
@@ -203,6 +212,11 @@ static void drm_gem_shmem_put_pages_locked(struct 
drm_gem_shmem_object *shmem)
if (--shmem->pages_use_count > 0)
return;
+#ifdef CONFIG_X86
+   if (shmem->map_wc)
+   set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
+#endif
+
drm_gem_put_pages(obj, shmem->pages,
  shmem->pages_mark_dirty_on_put,
  shmem->pages_mark_accessed_on_put);
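
The hunk quoted above is the release side; the allocation side of the
patch is the mirror image. A sketch of that counterpart, matching the
approach under discussion (shmem, pages and obj as in the surrounding
code):

#ifdef CONFIG_X86
	if (shmem->map_wc)
		set_pages_array_wc(pages, obj->size >> PAGE_SHIFT);
#endif

set_pages_array_wc()/set_pages_array_wb() are the x86-only helpers from
asm/set_memory.h, which is exactly why the ifdef is needed.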



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer










[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for remove rcu support from i915_address_space (rev5)

2021-08-05 Thread Patchwork
== Series Details ==

Series: remove rcu support from i915_address_space (rev5)
URL   : https://patchwork.freedesktop.org/series/93314/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
0b21f453cdfb drm/i915: Drop code to handle set-vm races from execbuf
-:17: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#17: 
References: ccbc1b97948a ("drm/i915/gem: Don't allow changing the VM on running 
contexts (v4)")

-:17: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit ccbc1b97948a ("drm/i915/gem: 
Don't allow changing the VM on running contexts (v4)")'
#17: 
References: ccbc1b97948a ("drm/i915/gem: Don't allow changing the VM on running 
contexts (v4)")

-:46: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 1 errors, 2 warnings, 0 checks, 12 lines checked
d23b18f97bd9 drm/i915: Rename i915_gem_context_get_vm_rcu to 
i915_gem_context_get_eb_vm
-:148: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 80 lines checked
07bcc7e00033 drm/i915: Use i915_gem_context_get_eb_vm in ctx_getparam
-:54: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 23 lines checked
b2e1515de24d drm/i915: Add i915_gem_context_is_full_ppgtt
-:105: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 53 lines checked
ba6c948d717d drm/i915: Use i915_gem_context_get_eb_vm in intel_context_set_gem
-:12: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit ccbc1b97948a ("drm/i915/gem: 
Don't allow changing the VM on running contexts (v4)")'
#12: 
commit ccbc1b97948ab671335e950271e39766729736c3

-:61: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 1 errors, 1 warnings, 0 checks, 18 lines checked
04ad005ea013 drm/i915: Drop __rcu from gem_context->vm
-:11: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit ccbc1b97948a ("drm/i915/gem: 
Don't allow changing the VM on running contexts (v4)")'
#11: 
commit ccbc1b97948ab671335e950271e39766729736c3

-:23: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#23: 
  i915_vm_open ofc. This also removes the final caller of context_get_vm_rcu

-:42: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit a4e7ccdac38e ("drm/i915: Move 
context management under GEM")'
#42: 
commit a4e7ccdac38ec8335d9e4e2656c1a041c77feae1

-:345: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 2 errors, 2 warnings, 0 checks, 232 lines checked
89f791357ea9 drm/i915: use xa_lock/unlock for fpriv->vm_xa lookups
-:15: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit aabbe344dc3c ("drm/i915: Use RCU 
for unlocked vm_idr lookup")'
#15: 
commit aabbe344dc3ca5f7d8263a02608ba6179e8a4499

-:52: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 1 errors, 1 warnings, 0 checks, 13 lines checked
57adc91192e3 drm/i915: Stop rcu support for i915_address_space
-:11: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#11: 
- i915_dpt has very simple lifetime (somehow we create a display pagetable vm

-:27: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit cf977e18610e ("drm/i915/gem: 
Spring clean debugfs")'
#27: 
commit cf977e18610e66e48c31619e7e0cfa871be9eada

-:35: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit db80a1294c23 ("drm/i915/gem: 
Remove per-client stats from debugfs/i915_gem_objects")'
#35: 
commit db80a1294c231b6ac725085f046bb2931e00c9db

-:47: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit ccbc1b97948a ("drm/i915/gem: 
Don't allow changing the VM on running contexts (v4)")'
#47: 
commit ccbc1b97948ab671335e950271e39766729736c3

-:59: WARNING:TYPO_SPELLING: 'Preceeding' may be misspelled - perhaps 
'Preceding'?
#59: 
  Preceeding patches removed all vestiges of rcu use from gem_ctx->vm
  ^^

-:64:

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for remove rcu support from i915_address_space (rev5)

2021-08-05 Thread Patchwork
== Series Details ==

Series: remove rcu support from i915_address_space (rev5)
URL   : https://patchwork.freedesktop.org/series/93314/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-drivers/gpu/drm/i915/gem/i915_gem_context.c:1364:34:expected struct 
i915_address_space *vm
-drivers/gpu/drm/i915/gem/i915_gem_context.c:1364:34:got struct 
i915_address_space [noderef] __rcu *vm
-drivers/gpu/drm/i915/gem/i915_gem_context.c:1364:34: warning: incorrect type 
in argument 1 (different address spaces)
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:expected struct 
i915_address_space [noderef] __rcu *vm
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:got struct 
i915_address_space *
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25: warning: incorrect 
type in assignment (different address spaces)
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:expected struct 
i915_address_space *vm
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:got struct 
i915_address_space [noderef] __rcu *vm
-drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34: warning: incorrect 
type in argument 1 (different address spaces)




Re: [Intel-gfx] [PATCH v5 01/20] drm/sched: Split drm_sched_job_init

2021-08-05 Thread Christian König

On 05.08.21 at 16:07, Daniel Vetter wrote:

On Thu, Aug 5, 2021 at 3:44 PM Christian König  wrote:

On 05.08.21 at 12:46, Daniel Vetter wrote:

This is a very confusingly named function, because not only does it
init an object, it also arms it and provides a point of no return for
pushing a job into the scheduler. It would be nice if that's a bit
clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

v3: Emma noticed that I leak the memory allocated in
drm_sched_job_init if we bail out before the point of no return in
subsequent driver patches. To be able to fix this change
drm_sched_job_cleanup() so it can handle being called both before and
after drm_sched_job_arm().

Also improve the kerneldoc for this.

v4:
- Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
usual (Melissa)

- Christian pointed out that drm_sched_entity_select_rq() also needs
to be moved into drm_sched_job_arm, which made me realize that the
job->id definitely needs to be moved too.

Shuffle things to fit between job_init and job_arm.

v5:
Reshuffle the split between init/arm once more, amdgpu abuses
drm_sched.ready to signal gpu reset failures. Also document this
somewhat. (Christian)

v6:
Rebase on top of the msm drm/sched support. Note that the
drm_sched_job_init() call is completely misplaced, and hence also the
split-out drm_sched_entity_push_job(). I've put in a FIXME which the next
patch will address.

Acked-by: Melissa Wen 
Cc: Melissa Wen 
Acked-by: Emma Anholt 
Acked-by: Steven Price  (v2)
Reviewed-by: Boris Brezillon  (v5)
Signed-off-by: Daniel Vetter 

At least the amdgpu parts look ok offhand, but I can't judge the rest, I
think.

The thing that really scares me here and that I got wrong a few times
is the cleanup for drm_sched_job at the various points. Can you give
those parts in drm/scheduler/ a full review pls, just to make sure? I
can note that in the tag ofc, I'd just like a bit more confidence here
that it's not busted :-)


I can take another look, but I won't have time for that in the next two 
weeks - vacation and kid starting school.


Christian.




So only Acked-by: Christian König 

Thanks, Daniel


Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Qiang Yu 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Masahiro Yamada 
Cc: Kees Cook 
Cc: Adam Borowski 
Cc: Nick Terrell 
Cc: Mauro Carvalho Chehab 
Cc: Paul Menzel 
Cc: Sami Tolvanen 
Cc: Viresh Kumar 
Cc: Alex Deucher 
Cc: Dave Airlie 
Cc: Nirmoy Das 
Cc: Deepak R Varma 
Cc: Lee Jones 
Cc: Kevin Wang 
Cc: Chen Li 
Cc: Luben Tuikov 
Cc: "Marek Olšák" 
Cc: Dennis Li 
Cc: Maarten Lankhorst 
Cc: Andrey Grodzovsky 
Cc: Sonny Jiang 
Cc: Boris Brezillon 
Cc: Tian Tao 
Cc: etna...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: Emma Anholt 
Cc: Rob Clark 
Cc: Sean Paul 
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 +
   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +
   drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 +
   drivers/gpu/drm/lima/lima_sched.c|  2 +
   drivers/gpu/drm/msm/msm_gem_submit.c |  3 ++
   drivers/gpu/drm/panfrost/panfrost_job.c  |  2 +
   drivers/gpu/drm/scheduler/sched_entity.c |  6 +--
   drivers/gpu/drm/scheduler/sched_fence.c  | 19 ---
   drivers/gpu/drm/scheduler/sched_main.c   | 69 
   drivers/gpu/drm/v3d/v3d_gem.c|  2 +
   include/drm/gpu_scheduler.h  |  7 ++-
   11 files changed, 94 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 139cd3bf1ad6..32e80bc6af22 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
   if (r)
   goto error_unlock;

+ drm_sched_job_arm(&job->base);
+
   /* No memory allocation is allowed while holding the notifier lock.
* The lock is held until amdgpu_cs_submit is finished and fence is
* added to BOs.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index d33e6d97cc89..5ddb955d2315 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
   if (r)
   return r;

+ drm_sched_job_arm(&job->base);
+
   *f = dma_fence_get(&job->base.s_fence->finished);
   amdgpu_job_free_resources(job)
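
The message is truncated here. To summarize the flow the split enables, a
driver submission path becomes roughly the following sketch (names
abbreviated, error handling trimmed):

	ret = drm_sched_job_init(&job->base, entity, owner);
	if (ret)
		return ret;

	/* fallible setup: locks, dependency gathering, allocations ... */

	drm_sched_job_arm(&job->base);	/* point of no return */
	fence = dma_fence_get(&job->base.s_fence->finished);
	drm_sched_entity_push_job(&job->base, entity);

with drm_sched_job_cleanup() usable on the error paths both before and
after the arm, as the v3 note above describes. The entity parameter to
push_job is dropped later in this series.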

[Intel-gfx] ✗ Fi.CI.BAT: failure for remove rcu support from i915_address_space (rev5)

2021-08-05 Thread Patchwork
== Series Details ==

Series: remove rcu support from i915_address_space (rev5)
URL   : https://patchwork.freedesktop.org/series/93314/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10451 -> Patchwork_20777


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20777 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20777, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20777:

### IGT changes ###

 Possible regressions 

  * igt@i915_selftest@live@gt_lrc:
- fi-rkl-guc: [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html

  
Known issues


  Here are the changes found in Patchwork_20777 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@i915_selftest@live@execlists:
- fi-bsw-nick:[PASS][3] -> [INCOMPLETE][4] ([i915#2940])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/fi-bsw-nick/igt@i915_selftest@l...@execlists.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/fi-bsw-nick/igt@i915_selftest@l...@execlists.html

  * igt@kms_chamelium@dp-crc-fast:
- fi-kbl-7500u:   [PASS][5] -> [FAIL][6] ([i915#1372])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/fi-kbl-7500u/igt@kms_chamel...@dp-crc-fast.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/fi-kbl-7500u/igt@kms_chamel...@dp-crc-fast.html

  * igt@runner@aborted:
- fi-bsw-nick:NOTRUN -> [FAIL][7] ([fdo#109271] / [i915#1436])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/fi-bsw-nick/igt@run...@aborted.html

  
 Possible fixes 

  * igt@kms_chamelium@hdmi-hpd-fast:
- fi-icl-u2:  [DMESG-WARN][8] ([i915#2203] / [i915#2868]) -> 
[PASS][9]
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/fi-icl-u2/igt@kms_chamel...@hdmi-hpd-fast.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/fi-icl-u2/igt@kms_chamel...@hdmi-hpd-fast.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#1372]: https://gitlab.freedesktop.org/drm/intel/issues/1372
  [i915#1436]: https://gitlab.freedesktop.org/drm/intel/issues/1436
  [i915#2203]: https://gitlab.freedesktop.org/drm/intel/issues/2203
  [i915#2868]: https://gitlab.freedesktop.org/drm/intel/issues/2868
  [i915#2940]: https://gitlab.freedesktop.org/drm/intel/issues/2940


Participating hosts (40 -> 35)
--

  Missing(5): fi-ilk-m540 fi-hsw-4200u fi-bsw-cyan bat-jsl-1 fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_10451 -> Patchwork_20777

  CI-20190529: 20190529
  CI_DRM_10451: 3bea0ad83735904d380d83bcca30557268acf887 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6160: 4287344dd6a39d9036c5fb9a047a7d8f10bee981 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20777: 57adc91192e34f34d12cce813f1991033826e70c @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

57adc91192e3 drm/i915: Stop rcu support for i915_address_space
89f791357ea9 drm/i915: use xa_lock/unlock for fpriv->vm_xa lookups
04ad005ea013 drm/i915: Drop __rcu from gem_context->vm
ba6c948d717d drm/i915: Use i915_gem_context_get_eb_vm in intel_context_set_gem
b2e1515de24d drm/i915: Add i915_gem_context_is_full_ppgtt
07bcc7e00033 drm/i915: Use i915_gem_context_get_eb_vm in ctx_getparam
d23b18f97bd9 drm/i915: Rename i915_gem_context_get_vm_rcu to 
i915_gem_context_get_eb_vm
0b21f453cdfb drm/i915: Drop code to handle set-vm races from execbuf

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/index.html


Re: [Intel-gfx] [PATCH] drm/aperture: Pass DRM driver structure instead of driver name

2021-08-05 Thread Dmitry Baryshkov

On 29/06/2021 16:58, Thomas Zimmermann wrote:

Print the name of the DRM driver when taking over fbdev devices. Makes
the output to dmesg more consistent. Note that the driver name is only
used for printing a string to the kernel log. No UAPI is affected by this
change.

Signed-off-by: Thomas Zimmermann 
---


[...]


  drivers/gpu/drm/msm/msm_fbdev.c   |  2 +-


Reviewed-by: Dmitry Baryshkov 


  drivers/gpu/drm/nouveau/nouveau_drm.c |  2 +-
  drivers/gpu/drm/qxl/qxl_drv.c |  2 +-
  drivers/gpu/drm/radeon/radeon_drv.c   |  2 +-
  drivers/gpu/drm/rockchip/rockchip_drm_drv.c   |  2 +-
  drivers/gpu/drm/sun4i/sun4i_drv.c |  2 +-
  drivers/gpu/drm/tegra/drm.c   |  2 +-
  drivers/gpu/drm/tiny/cirrus.c |  2 +-
  drivers/gpu/drm/vboxvideo/vbox_drv.c  |  2 +-
  drivers/gpu/drm/vc4/vc4_drv.c |  2 +-
  drivers/gpu/drm/virtio/virtgpu_drv.c  |  2 +-
  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c   |  2 +-
  include/drm/drm_aperture.h| 14 +-
  23 files changed, 43 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 6f30c525caac..accf9c1b967a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1278,7 +1278,7 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
  #endif
  
  	/* Get rid of things like offb */

-   ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, 
"amdgpudrmfb");
+   ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, 
&amdgpu_kms_driver);
if (ret)
return ret;
  
diff --git a/drivers/gpu/drm/armada/armada_drv.c b/drivers/gpu/drm/armada/armada_drv.c

index dab0a1f0983b..31925ae3ab72 100644
--- a/drivers/gpu/drm/armada/armada_drv.c
+++ b/drivers/gpu/drm/armada/armada_drv.c
@@ -95,7 +95,7 @@ static int armada_drm_bind(struct device *dev)
}
  
  	/* Remove early framebuffers */

-   ret = drm_aperture_remove_framebuffers(false, "armada-drm-fb");
+   ret = drm_aperture_remove_framebuffers(false, &armada_drm_driver);
if (ret) {
dev_err(dev, "[" DRM_NAME ":%s] can't kick out simple-fb: %d\n",
__func__, ret);
diff --git a/drivers/gpu/drm/ast/ast_drv.c b/drivers/gpu/drm/ast/ast_drv.c
index 5aa452b4efe6..86d5cd7b6318 100644
--- a/drivers/gpu/drm/ast/ast_drv.c
+++ b/drivers/gpu/drm/ast/ast_drv.c
@@ -100,7 +100,7 @@ static int ast_remove_conflicting_framebuffers(struct 
pci_dev *pdev)
primary = pdev->resource[PCI_ROM_RESOURCE].flags & 
IORESOURCE_ROM_SHADOW;
  #endif
  
-	return drm_aperture_remove_conflicting_framebuffers(base, size, primary, "astdrmfb");

+   return drm_aperture_remove_conflicting_framebuffers(base, size, primary, 
&ast_driver);
  }
  
  static int ast_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)

diff --git a/drivers/gpu/drm/bochs/bochs_drv.c 
b/drivers/gpu/drm/bochs/bochs_drv.c
index c828cadbabff..0d232b44ecd7 100644
--- a/drivers/gpu/drm/bochs/bochs_drv.c
+++ b/drivers/gpu/drm/bochs/bochs_drv.c
@@ -110,7 +110,7 @@ static int bochs_pci_probe(struct pci_dev *pdev,
return -ENOMEM;
}
  
-	ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, "bochsdrmfb");

+   ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, 
&bochs_driver);
if (ret)
return ret;
  
diff --git a/drivers/gpu/drm/drm_aperture.c b/drivers/gpu/drm/drm_aperture.c

index 9335d9d6cf9a..9ac39cf11694 100644
--- a/drivers/gpu/drm/drm_aperture.c
+++ b/drivers/gpu/drm/drm_aperture.c
@@ -33,6 +33,10 @@
   *
   * .. code-block:: c
   *
+ * static const struct drm_driver example_driver = {
+ * ...
+ * };
+ *
   *static int remove_conflicting_framebuffers(struct pci_dev *pdev)
   *{
   *bool primary = false;
@@ -46,7 +50,7 @@
   *#endif
   *
   *	return drm_aperture_remove_conflicting_framebuffers(base, size, primary,
- *							    "example driver");
+ *							    &example_driver);
   *}
   *
   *static int probe(struct pci_dev *pdev)
@@ -274,7 +278,7 @@ static void drm_aperture_detach_drivers(resource_size_t 
base, resource_size_t si
   * @base: the aperture's base address in physical memory
   * @size: aperture size in bytes
   * @primary: also kick vga16fb if present
- * @name: requesting driver name
+ * @req_driver: requesting DRM driver
   *
   * This function removes graphics device drivers which use memory range 
described by
   * @base and @size.
@@ -283,7 +287,7 @@ static void drm_aperture_detach_drivers(resource_size_t 
base, resource_size_t si
   * 0 on success, or a negative errno code otherwise
   */
  int drm_aperture_remove_conflicting_framebuffers(resource_size_t ba

Re: [Intel-gfx] [PATCH v5 07/20] drm/panfrost: use scheduler dependency tracking

2021-08-05 Thread Alyssa Rosenzweig
Acked-by: Alyssa Rosenzweig 

On Thu, Aug 05, 2021 at 12:46:52PM +0200, Daniel Vetter wrote:
> Just deletes some code that's now more shared.
> 
> Note that thanks to the split into drm_sched_job_init/arm we can now
> easily pull the _init() part from under the submission lock way ahead
> where we're adding the sync file in-fences as dependencies.
> 
> v2: Correctly clean up the partially set up job, now that job_init()
> and job_arm() are apart (Emma).
> 
> v3: Rebased over renamed functions for adding dependencies
> 
> Acked-by: Emma Anholt 
> Reviewed-by: Steven Price  (v3)
> Signed-off-by: Daniel Vetter 
> Cc: Rob Herring 
> Cc: Tomeu Vizoso 
> Cc: Steven Price 
> Cc: Alyssa Rosenzweig 
> Cc: Sumit Semwal 
> Cc: "Christian K??nig" 
> Cc: linux-me...@vger.kernel.org
> Cc: linaro-mm-...@lists.linaro.org
> Cc: Emma Anholt 
> ---
>  drivers/gpu/drm/panfrost/panfrost_drv.c | 16 ---
>  drivers/gpu/drm/panfrost/panfrost_job.c | 38 -
>  drivers/gpu/drm/panfrost/panfrost_job.h |  5 +---
>  3 files changed, 18 insertions(+), 41 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
> b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 1ffaef5ec5ff..16212b6b202e 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -218,7 +218,7 @@ panfrost_copy_in_sync(struct drm_device *dev,
>   if (ret)
>   goto fail;
>  
> - ret = drm_gem_fence_array_add(&job->deps, fence);
> + ret = drm_sched_job_add_dependency(&job->base, fence);
>  
>   if (ret)
>   goto fail;
> @@ -236,7 +236,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
> void *data,
>   struct drm_panfrost_submit *args = data;
>   struct drm_syncobj *sync_out = NULL;
>   struct panfrost_job *job;
> - int ret = 0;
> + int ret = 0, slot;
>  
>   if (!args->jc)
>   return -EINVAL;
> @@ -258,14 +258,20 @@ static int panfrost_ioctl_submit(struct drm_device 
> *dev, void *data,
>  
>   kref_init(&job->refcount);
>  
> - xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
> -
>   job->pfdev = pfdev;
>   job->jc = args->jc;
>   job->requirements = args->requirements;
>   job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev);
>   job->file_priv = file->driver_priv;
>  
> + slot = panfrost_job_get_slot(job);
> +
> + ret = drm_sched_job_init(&job->base,
> +  &job->file_priv->sched_entity[slot],
> +  NULL);
> + if (ret)
> + goto fail_job_put;
> +
>   ret = panfrost_copy_in_sync(dev, file, args, job);
>   if (ret)
>   goto fail_job;
> @@ -283,6 +289,8 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
> void *data,
>   drm_syncobj_replace_fence(sync_out, job->render_done_fence);
>  
>  fail_job:
> + drm_sched_job_cleanup(&job->base);
> +fail_job_put:
>   panfrost_job_put(job);
>  fail_out_sync:
>   if (sync_out)
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
> b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 4bc962763e1f..a98f507dc779 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -102,7 +102,7 @@ static struct dma_fence *panfrost_fence_create(struct 
> panfrost_device *pfdev, in
>   return &fence->base;
>  }
>  
> -static int panfrost_job_get_slot(struct panfrost_job *job)
> +int panfrost_job_get_slot(struct panfrost_job *job)
>  {
>   /* JS0: fragment jobs.
>* JS1: vertex/tiler jobs
> @@ -242,13 +242,14 @@ static void panfrost_job_hw_submit(struct panfrost_job 
> *job, int js)
>  
>  static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
> int bo_count,
> -   struct xarray *deps)
> +   struct drm_sched_job *job)
>  {
>   int i, ret;
>  
>   for (i = 0; i < bo_count; i++) {
>   /* panfrost always uses write mode in its current uapi */
> - ret = drm_gem_fence_array_add_implicit(deps, bos[i], true);
> + ret = drm_sched_job_add_implicit_dependencies(job, bos[i],
> +   true);
>   if (ret)
>   return ret;
>   }
> @@ -269,31 +270,21 @@ static void panfrost_attach_object_fences(struct 
> drm_gem_object **bos,
>  int panfrost_job_push(struct panfrost_job *job)
>  {
>   struct panfrost_device *pfdev = job->pfdev;
> - int slot = panfrost_job_get_slot(job);
> - struct drm_sched_entity *entity = &job->file_priv->sched_entity[slot];
>   struct ww_acquire_ctx acquire_ctx;
>   int ret = 0;
>  
> -
>   ret = drm_gem_lock_reservations(job->bos, job->bo_count,
>   &acquire_ctx);
>   if (ret)
>
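
Truncated by the archive. For reference, the helper the patch switches to
has this shape in the series (prototype only):

	int drm_sched_job_add_implicit_dependencies(struct drm_sched_job *job,
						    struct drm_gem_object *obj,
						    bool write);

i.e. the same semantics as the drm_gem_fence_array_add_implicit() call it
replaces, with the dependencies stored on the job instead of a
driver-side xarray.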

Re: [Intel-gfx] [PATCH 4/4] DO_NOT_MERGE: drm/i915/display: Enable PSR2 selective fetch by default

2021-08-05 Thread Souza, Jose
On Thu, 2021-08-05 at 21:26 +0300, Gwan-gyeong Mun wrote:
> 
> On 8/3/21 8:18 PM, Souza, Jose wrote:
> > On Tue, 2021-08-03 at 14:17 +0300, Gwan-gyeong Mun wrote:
> > > 
> > > On 7/31/21 3:10 AM, José Roberto de Souza wrote:
> > > > Only to execute tests with PSR2 selective fetch enabled and check what
> > > > is broken.
> > > > 
> > > > IGT tests known to fail with this:
> > > > - kms_cursor_legacy: all tests that check if evasion happened; I have
> > > > a fix for it making cursor_slowpath() return true for display 12+.
> > > > 
> > > > - kms_psr2_su: The pageflip test; it needs to have the damage clip set,
> > > > otherwise it will update the whole screen and the selective blocks
> > > > will not match what is expected.
> > > > 
> > > kms_psr2_su is a test case for intel PSR2 HW tracking and kms_psr2_sf is
> > > used as a test for intel PSR2 manual tracking. Is it necessary to modify
> > > kms_psr2_su for testing PSR2 manual tracking?
> > 
> > kms_psr2_su is to test that PSR2 is sending selective updates; just by
> > adding a couple of lines we can make it work with selective fetch.
> > 
> > > > - kms_psr: psr2_*_(mmap_gtt, mmap_cpu, blt and render), all those
> > > > tests should be dropped or skipped for display 12+.
> > > > 
> > > Could you explain in more detail why we need to skip on display 12+?
> > 
> > These are the tests that would end up calling intel_psr_invalidate/flush().
> > 
> 
> Thanks for the explanation.
> And there is an issue confirmed in local tests, so I leave additional 
> comments.
> > > 
> > > > Signed-off-by: José Roberto de Souza 
> > > > ---
> > > >drivers/gpu/drm/i915/display/intel_psr.c | 9 -
> > > >drivers/gpu/drm/i915/i915_params.h   | 2 +-
> > > >2 files changed, 1 insertion(+), 10 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/display/intel_psr.c 
> > > > b/drivers/gpu/drm/i915/display/intel_psr.c
> > > > index 894a2d35668a2..e128f0c2aeecc 100644
> > > > --- a/drivers/gpu/drm/i915/display/intel_psr.c
> > > > +++ b/drivers/gpu/drm/i915/display/intel_psr.c
> > > > @@ -877,15 +877,6 @@ static bool intel_psr2_config_valid(struct 
> > > > intel_dp *intel_dp,
> > > >return false;
> > > >}
> > > > 
> > > > -/*
> > > > - * We are missing the implementation of some workarounds to enabled 
> > > > PSR2
> > > > - * in Alderlake_P, until ready PSR2 should be kept disabled.
> > > > - */
> > > > -if (IS_ALDERLAKE_P(dev_priv)) {
> > > > -drm_dbg_kms(&dev_priv->drm, "PSR2 is missing the implementation of 
> > > > workarounds\n");
> > > > -return false;
> > > > -}
> > > > -
> > > >if (!transcoder_has_psr2(dev_priv, crtc_state->cpu_transcoder)) {
> > > >drm_dbg_kms(&dev_priv->drm,
> > > >"PSR2 not supported in transcoder %s\n",
> > > > diff --git a/drivers/gpu/drm/i915/i915_params.h 
> > > > b/drivers/gpu/drm/i915/i915_params.h
> > > > index f27eceb82c0f5..8d725b64592d8 100644
> > > > --- a/drivers/gpu/drm/i915/i915_params.h
> > > > +++ b/drivers/gpu/drm/i915/i915_params.h
> > > > @@ -55,7 +55,7 @@ struct drm_printer;
> > > >param(int, enable_fbc, -1, 0600) \
> > > >param(int, enable_psr, -1, 0600) \
> > > >param(bool, psr_safest_params, false, 0400) \
> > > > -param(bool, enable_psr2_sel_fetch, false, 0400) \
> > > > +param(bool, enable_psr2_sel_fetch, true, 0400) \
> If we do not modify this part and do not enable it by default at boot 
> time as shown in the original code below,
> param(bool, enable_psr2_sel_fetch, false, 0400) \
> 
> when we execute the kms_psr2_sf test case of igt, the FIFO underrun as 
> below still occurs.
> 
> i915 0000:00:02.0: [drm] *ERROR* CPU pipe A FIFO underrun: port,transcoder,
> 
> When a PSR2 panel is used, PSR1 is enabled by default as long as
> enable_psr2_sel_fetch stays disabled.
> And when kms_psr2_sf is executed, the mode is changed to PSR2; when
> kms_psr2_sf terminates, PSR2 is deactivated and PSR1 is re-enabled.
> At this point, I suspect there is a problem.

I was able to reproduce this even with enable_psr2_sel_fetch set to true.
Added some debug messages to intel_psr_exit() and intel_psr_activate();
those functions are not called and the underrun still happens.

Could be a regression recently introduced, because I was not seeing this
underrun a few weeks ago.
Anyway, this underrun happens with and without these patches (just doing
the changes to allow PSR2 on Alderlake-P in intel_psr2_config_valid()).

> 
> > > >param(int, disable_power_well, -1, 0400) \
> > > >param(int, enable_ips, 1, 0600) \
> > > >param(int, invert_brightness, 0, 0600) \
> > > > 
> > 



Re: [Intel-gfx] [PATCH v5 02/20] drm/msm: Fix drm/sched point of no return rules

2021-08-05 Thread Rob Clark
On Thu, Aug 5, 2021 at 3:47 AM Daniel Vetter  wrote:
>
> Originally drm_sched_job_init was the point of no return, after which
> drivers must submit a job. I've split that up, which allows us to fix
> this issue pretty easily.
>
> Only thing we have to take care of is to not skip to error paths after
> that. Other drivers do this the same for out-fence and similar things.
>
> Fixes: 1d8a5ca436ee ("drm/msm: Conversion to drm scheduler")
> Cc: Rob Clark 
> Cc: Rob Clark 
> Cc: Sean Paul 
> Cc: Sumit Semwal 
> Cc: "Christian König" 
> Cc: linux-arm-...@vger.kernel.org
> Cc: dri-de...@lists.freedesktop.org
> Cc: freedr...@lists.freedesktop.org
> Cc: linux-me...@vger.kernel.org
> Cc: linaro-mm-...@lists.linaro.org
> Signed-off-by: Daniel Vetter 
> ---
>  drivers/gpu/drm/msm/msm_gem_submit.c | 15 +++
>  1 file changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
> b/drivers/gpu/drm/msm/msm_gem_submit.c
> index 6d6c44f0e1f3..d0ed4ddc509e 100644
> --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> @@ -52,9 +52,6 @@ static struct msm_gem_submit *submit_create(struct 
> drm_device *dev,
> return ERR_PTR(ret);
> }
>
> -   /* FIXME: this is way too early */
> -   drm_sched_job_arm(&job->base);
> -
> xa_init_flags(&submit->deps, XA_FLAGS_ALLOC);
>
> kref_init(&submit->ref);
> @@ -883,6 +880,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void 
> *data,
>
> submit->user_fence = dma_fence_get(&submit->base.s_fence->finished);
>
> +   /* point of no return, we _have_ to submit no matter what */
> +   drm_sched_job_arm(&submit->base);
> +
> /*
>  * Allocate an id which can be used by WAIT_FENCE ioctl to map back
>  * to the underlying fence.
> @@ -892,17 +892,16 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void 
> *data,
> if (submit->fence_id < 0) {
> ret = submit->fence_id = 0;
> submit->fence_id = 0;
> -   goto out;
> }
>
> -   if (args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
> +   if (ret == 0 && args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
> struct sync_file *sync_file = 
> sync_file_create(submit->user_fence);
> if (!sync_file) {
> ret = -ENOMEM;
> -   goto out;
> +   } else {
> +   fd_install(out_fence_fd, sync_file->file);
> +   args->fence_fd = out_fence_fd;
> }
> -   fd_install(out_fence_fd, sync_file->file);
> -   args->fence_fd = out_fence_fd;

I wonder if instead we should (approximately) undo "drm/msm/submit:
Simplify out-fence-fd handling" so that the point where it could fail
is moved up ahead of the drm_sched_job_arm()?

Also, does the dma_fence_get() work before drm_sched_job_arm()?  From
a quick look, it looks like it won't, but I'm still playing catchup
and haven't had a chance to look at your entire series.  If it doesn't
work before drm_sched_job_arm(), then there is really no way to
prevent a error path between the fence-init and job-submit.

But, prior to your series, wouldn't a failure after
drm_sched_job_init() but before the job is submitted just burn a
fence-id, and otherwise carry on its merry way?

BR,
-R

> }
>
> submit_attach_object_fences(submit);
> --
> 2.32.0
>
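
On the dma_fence_get() question above: in patch 01 of this series the
scheduler fence is allocated at init time but only initialized at arm
time. A rough sketch of that split (not a verbatim quote from the patch):

	/* in drm_sched_job_init() */
	job->s_fence = drm_sched_fence_alloc(entity, owner);

	/* in drm_sched_job_arm() */
	drm_sched_fence_init(job->s_fence, job->entity);

so taking a reference on s_fence->finished before drm_sched_job_arm()
would indeed operate on a not-yet-initialized fence.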


[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Provide core infrastructure for managing open/release (rev9)

2021-08-05 Thread Patchwork
== Series Details ==

Series: Provide core infrastructure for managing open/release (rev9)
URL   : https://patchwork.freedesktop.org/series/92556/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
5d99eb3c1b1c vfio/samples: Remove module get/put
-:57: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 0 checks, 31 lines checked
fb9d431ac9b4 vfio/mbochs: Fix missing error unwind of mbochs_used_mbytes
-:12: WARNING:BAD_SIGN_OFF: Co-developed-by: must be immediately followed by 
Signed-off-by:
#12: 
Co-developed-by: Alex Williamson 
Reviewed-by: Christoph Hellwig 
-:103: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 2 warnings, 0 checks, 78 lines checked
14a1353260e5 vfio: Introduce a vfio_uninit_group_dev() API call
2085f40d11f9 vfio: Provide better generic support for open/release 
vfio_device_ops
-:260: CHECK:MULTIPLE_ASSIGNMENTS: multiple assignments should be avoided
#260: FILE: drivers/vfio/vfio.c:1483:
+   fdno = ret = get_unused_fd_flags(O_CLOEXEC);

-:358: CHECK:UNCOMMENTED_DEFINITION: struct mutex definition without comment
#358: FILE: include/linux/vfio.h:25:
+   struct mutex lock;

-:402: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 2 checks, 325 lines checked
abe0ffd61cfa vfio/samples: Delete useless open/close
-:98: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 0 checks, 66 lines checked
5d682856d46f vfio/fsl: Move to the device set infrastructure
-:300: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 0 checks, 256 lines checked
d06f6c571ebe vfio/platform: Use open_device() instead of open coding a refcnt 
scheme
-:51: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#51: FILE: drivers/vfio/platform/vfio_platform_common.c:230:
+   dev_warn(

-:105: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#105: FILE: drivers/vfio/platform/vfio_platform_common.c:261:
+   dev_warn(

-:149: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 2 checks, 120 lines checked
aaf4591d0469 vfio/pci: Move to the device set infrastructure
1d22f1ed155e vfio/pci: Change vfio_pci_try_bus_reset() to use the dev_set
-:276: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 0 checks, 231 lines checked
1309c4bd1733 vfio/pci: Reorganize VFIO_DEVICE_PCI_HOT_RESET to use the device 
set
-:21: WARNING:BAD_SIGN_OFF: Non-standard signature: Reviewed-off-by:
#21: 
Reviewed-off-by: Christoph Hellwig 

-:309: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 2 warnings, 0 checks, 274 lines checked
c5854ee895cf vfio/mbochs: Fix close when multiple device FDs are open
-:37: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 0 checks, 16 lines checked
486294d2c582 vfio/ap, ccw: Fix open/close when multiple device FDs are open
-:84: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 0 checks, 52 lines checked
b75c4b452009 vfio/gvt: Fix open/close when multiple device FDs are open
-:52: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 0 checks, 26 lines checked
16a862df1e07 vfio: Remove struct vfio_device_ops open/release
-:143: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Jason Gunthorpe ' != 'Signed-off-by: Jason 
Gunthorpe '

total: 0 errors, 1 warnings, 0 checks, 107 lines checked




[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915: Update small joiner ram size

2021-08-05 Thread Patchwork
== Series Details ==

Series: drm/i915: Update small joiner ram size
URL   : https://patchwork.freedesktop.org/series/93410/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10449_full -> Patchwork_20771_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Known issues


  Here are the changes found in Patchwork_20771_full that come from known 
issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_isolation@preservation-s3@rcs0:
- shard-skl:  [PASS][1] -> [INCOMPLETE][2] ([i915#198])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/shard-skl1/igt@gem_ctx_isolation@preservation...@rcs0.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-skl8/igt@gem_ctx_isolation@preservation...@rcs0.html

  * igt@gem_ctx_persistence@legacy-engines-hostile-preempt:
- shard-snb:  NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#1099]) +1 
similar issue
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-snb7/igt@gem_ctx_persiste...@legacy-engines-hostile-preempt.html

  * igt@gem_eio@in-flight-contexts-10ms:
- shard-tglb: [PASS][4] -> [TIMEOUT][5] ([i915#3063])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/shard-tglb1/igt@gem_...@in-flight-contexts-10ms.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-tglb2/igt@gem_...@in-flight-contexts-10ms.html

  * igt@gem_eio@unwedge-stress:
- shard-tglb: [PASS][6] -> [TIMEOUT][7] ([i915#2369] / [i915#3063] 
/ [i915#3648])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/shard-tglb5/igt@gem_...@unwedge-stress.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-tglb6/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_fair@basic-none@vecs0:
- shard-apl:  NOTRUN -> [FAIL][8] ([i915#2842] / [i915#3468])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-apl6/igt@gem_exec_fair@basic-n...@vecs0.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
- shard-tglb: [PASS][9] -> [FAIL][10] ([i915#2842])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/shard-tglb2/igt@gem_exec_fair@basic-pace-sh...@rcs0.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-tglb7/igt@gem_exec_fair@basic-pace-sh...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
- shard-kbl:  [PASS][11] -> [SKIP][12] ([fdo#109271])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/shard-kbl6/igt@gem_exec_fair@basic-p...@vcs1.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-kbl3/igt@gem_exec_fair@basic-p...@vcs1.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
- shard-glk:  [PASS][13] -> [FAIL][14] ([i915#2842]) +1 similar 
issue
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/shard-glk7/igt@gem_exec_fair@basic-throt...@rcs0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-glk3/igt@gem_exec_fair@basic-throt...@rcs0.html
- shard-iclb: [PASS][15] -> [FAIL][16] ([i915#2842])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10449/shard-iclb1/igt@gem_exec_fair@basic-throt...@rcs0.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-iclb5/igt@gem_exec_fair@basic-throt...@rcs0.html

  * igt@gem_render_copy@linear-to-vebox-y-tiled:
- shard-apl:  NOTRUN -> [SKIP][17] ([fdo#109271]) +223 similar 
issues
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-apl6/igt@gem_render_c...@linear-to-vebox-y-tiled.html

  * igt@gem_userptr_blits@input-checking:
- shard-apl:  NOTRUN -> [DMESG-WARN][18] ([i915#3002])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-apl2/igt@gem_userptr_bl...@input-checking.html

  * igt@gem_userptr_blits@unsync-unmap-cycles:
- shard-iclb: NOTRUN -> [SKIP][19] ([i915#3297])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-iclb7/igt@gem_userptr_bl...@unsync-unmap-cycles.html

  * igt@gen7_exec_parse@oacontrol-tracking:
- shard-iclb: NOTRUN -> [SKIP][20] ([fdo#109289])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-iclb7/igt@gen7_exec_pa...@oacontrol-tracking.html

  * igt@i915_pm_dc@dc6-psr:
- shard-tglb: NOTRUN -> [FAIL][21] ([i915#454])
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-tglb7/igt@i915_pm...@dc6-psr.html

  * igt@i915_pm_rpm@basic-rte:
- shard-apl:  NOTRUN -> [FAIL][22] ([i915#579])
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-apl6/igt@i915_pm_...@basic-rte.html

  * igt@i915_pm_rpm@gem-idle:
- shard-tglb: NOTRUN -> [SKIP][23] ([i915#579])
   [23]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20771/shard-tglb7/igt@i915_pm_...@gem-idle.html

  * igt@i915_pm

[Intel-gfx] ✗ Fi.CI.BAT: failure for Provide core infrastructure for managing open/release (rev9)

2021-08-05 Thread Patchwork
== Series Details ==

Series: Provide core infrastructure for managing open/release (rev9)
URL   : https://patchwork.freedesktop.org/series/92556/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10453 -> Patchwork_20778


Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_20778 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20778, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_20778:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_pm_rps@basic-api:
- fi-rkl-guc: [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10453/fi-rkl-guc/igt@i915_pm_...@basic-api.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-rkl-guc/igt@i915_pm_...@basic-api.html

  * igt@runner@aborted:
- fi-rkl-guc: NOTRUN -> [FAIL][3]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-rkl-guc/igt@run...@aborted.html

  
Known issues
------------

  Here are the changes found in Patchwork_20778 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_basic@cs-sdma:
- fi-kbl-7500u:   NOTRUN -> [SKIP][4] ([fdo#109271]) +30 similar issues
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-7500u/igt@amdgpu/amd_ba...@cs-sdma.html

  * igt@gem_exec_fence@basic-busy@bcs0:
- fi-kbl-soraka:  NOTRUN -> [SKIP][5] ([fdo#109271]) +26 similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-soraka/igt@gem_exec_fence@basic-b...@bcs0.html

  * igt@gem_huc_copy@huc-copy:
- fi-kbl-soraka:  NOTRUN -> [SKIP][6] ([fdo#109271] / [i915#2190])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-soraka/igt@gem_huc_c...@huc-copy.html
- fi-kbl-7500u:   NOTRUN -> [SKIP][7] ([fdo#109271] / [i915#2190])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-7500u/igt@gem_huc_c...@huc-copy.html

  * igt@i915_pm_rpm@basic-rte:
- fi-kbl-7500u:   NOTRUN -> [FAIL][8] ([i915#579])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-7500u/igt@i915_pm_...@basic-rte.html
- fi-kbl-soraka:  NOTRUN -> [FAIL][9] ([i915#579])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-soraka/igt@i915_pm_...@basic-rte.html

  * igt@i915_selftest@live@gt_pm:
- fi-kbl-soraka:  NOTRUN -> [DMESG-FAIL][10] ([i915#1886] / [i915#2291])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-soraka/igt@i915_selftest@live@gt_pm.html

  * igt@kms_chamelium@common-hpd-after-suspend:
- fi-kbl-soraka:  NOTRUN -> [SKIP][11] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-soraka/igt@kms_chamel...@common-hpd-after-suspend.html

  * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d:
- fi-kbl-7500u:   NOTRUN -> [SKIP][12] ([fdo#109271] / [i915#533])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-7500u/igt@kms_pipe_crc_ba...@compare-crc-sanitycheck-pipe-d.html
- fi-kbl-soraka:  NOTRUN -> [SKIP][13] ([fdo#109271] / [i915#533])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-kbl-soraka/igt@kms_pipe_crc_ba...@compare-crc-sanitycheck-pipe-d.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@hangcheck:
- {fi-hsw-gt1}:   [DMESG-WARN][14] ([i915#3303]) -> [PASS][15]
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10453/fi-hsw-gt1/igt@i915_selftest@l...@hangcheck.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20778/fi-hsw-gt1/igt@i915_selftest@l...@hangcheck.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1886]: https://gitlab.freedesktop.org/drm/intel/issues/1886
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2291]: https://gitlab.freedesktop.org/drm/intel/issues/2291
  [i915#3303]: https://gitlab.freedesktop.org/drm/intel/issues/3303
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533
  [i915#579]: https://gitlab.freedesktop.org/drm/intel/issues/579


Participating hosts (35 -> 34)
------------------------------

  Additional (2): fi-kbl-soraka fi-kbl-7500u 
  Missing(3): fi-bdw-samus fi-bsw-cyan bat-jsl-1 



Re: [Intel-gfx] [PATCH v3 09/14] vfio/pci: Change vfio_pci_try_bus_reset() to use the dev_set

2021-08-05 Thread Jason Gunthorpe
On Thu, Aug 05, 2021 at 11:33:11AM -0600, Alex Williamson wrote:
> > +static int vfio_pci_is_device_in_set(struct pci_dev *pdev, void *data)
> > +{
> > +   struct vfio_device_set *dev_set = data;
> > +   struct vfio_device *cur;
> > +
> > +   lockdep_assert_held(&dev_set->lock);
> > +
> > +   list_for_each_entry(cur, &dev_set->device_list, dev_set_list)
> > +   if (cur->dev == &pdev->dev)
> > +   return 0;
> > +   return -EBUSY;
> > +}
> > +
> > +static bool vfio_pci_dev_set_needs_reset(struct vfio_device_set *dev_set)
> 
> Slight nit on the name here since we're essentially combining
> needs_reset along with the notion of the device being unused.  I'm not
> sure, maybe "should_reset"?  Otherwise it looks ok.  Thanks,

What I did is add a new function vfio_pci_dev_set_resettable() which
pulls in three parts of logic that can be shared with the
VFIO_DEVICE_PCI_HOT_RESET change in the next patch. That leaves this
function as purely needs_reset.

In turn, the VFIO_DEVICE_PCI_HOT_RESET patch gets the same treatment
and becomes a dev_set-centric API just like this one.

I'll send it as a v4.
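For reference, a minimal sketch of the shape such a helper could take
(reconstructed from this discussion; the exact checks and the reuse of
vfio_pci_for_each_slot_or_bus() are assumptions, not the final v4
code):

static struct pci_dev *
vfio_pci_dev_set_resettable(struct vfio_device_set *dev_set)
{
	struct pci_dev *pdev;

	lockdep_assert_held(&dev_set->lock);

	/*
	 * Part 1: all devices in a dev_set share the same slot/bus
	 * reset domain, so any member can stand in for the whole set
	 * when probing reset support.
	 */
	pdev = list_first_entry(&dev_set->device_list,
				struct vfio_pci_device,
				vdev.dev_set_list)->pdev;

	/* Part 2: a slot or bus reset must actually be available */
	if (pci_probe_reset_slot(pdev->slot) &&
	    pci_probe_reset_bus(pdev->bus))
		return NULL;

	/*
	 * Part 3: every device touched by the reset must be owned by
	 * this dev_set, otherwise the reset would disturb other users.
	 */
	if (vfio_pci_for_each_slot_or_bus(pdev, vfio_pci_is_device_in_set,
					  dev_set,
					  !pci_probe_reset_slot(pdev->slot)))
		return NULL;

	return pdev;
}

A caller that gets a non-NULL pdev back can hand it to pci_reset_bus()
knowing the entire reset group is contained within the dev_set.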

Thanks,
Jason


[Intel-gfx] [PATCH v2] drm/i915/gvt: Fix cached atomics setting for Windows VM

2021-08-05 Thread Zhenyu Wang
We've seen a recent regression with a host and a Windows VM running
simultaneously, causing a GPU hang or even a crash. Bisecting finally
landed on commit 58586680ffad ("drm/i915: Disable atomics in L3 for
gen9"); the difference in cached atomics behavior appears to be what
caused the regression.

This adds a new scratch register handler and adds those registers to
the mmio save/restore list for context switch. No GPU hang is produced
with this change.

Cc: sta...@vger.kernel.org # 5.12+
Cc: "Xu, Terrence" 
Cc: "Bloomfield, Jon" 
Cc: "Ekstrand, Jason" 
Fixes: 58586680ffad ("drm/i915: Disable atomics in L3 for gen9")
Signed-off-by: Zhenyu Wang 
---
 drivers/gpu/drm/i915/gvt/handlers.c | 1 +
 drivers/gpu/drm/i915/gvt/mmio_context.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
index 06024d321a1a..cde0a477fb49 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -3149,6 +3149,7 @@ static int init_bdw_mmio_info(struct intel_gvt *gvt)
MMIO_DFH(_MMIO(0xb100), D_BDW, F_CMD_ACCESS, NULL, NULL);
MMIO_DFH(_MMIO(0xb10c), D_BDW, F_CMD_ACCESS, NULL, NULL);
MMIO_D(_MMIO(0xb110), D_BDW);
+   MMIO_D(GEN9_SCRATCH_LNCF1, D_BDW_PLUS);
 
MMIO_F(_MMIO(0x24d0), 48, F_CMD_ACCESS | F_CMD_WRITE_PATCH, 0, 0,
D_BDW_PLUS, NULL, force_nonpriv_write);
diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c b/drivers/gpu/drm/i915/gvt/mmio_context.c
index b8ac80765461..f776c470914d 100644
--- a/drivers/gpu/drm/i915/gvt/mmio_context.c
+++ b/drivers/gpu/drm/i915/gvt/mmio_context.c
@@ -105,6 +105,8 @@ static struct engine_mmio gen9_engine_mmio_list[] __cacheline_aligned = {
	{RCS0, COMMON_SLICE_CHICKEN2, 0xffff, true}, /* 0x7014 */
	{RCS0, GEN9_CS_DEBUG_MODE1, 0xffff, false}, /* 0x20ec */
	{RCS0, GEN8_L3SQCREG4, 0, false}, /* 0xb118 */
+	{RCS0, GEN9_SCRATCH1, 0, false}, /* 0xb11c */
+	{RCS0, GEN9_SCRATCH_LNCF1, 0, false}, /* 0xb008 */
	{RCS0, GEN7_HALF_SLICE_CHICKEN1, 0xffff, true}, /* 0xe100 */
	{RCS0, HALF_SLICE_CHICKEN2, 0xffff, true}, /* 0xe180 */
	{RCS0, HALF_SLICE_CHICKEN3, 0xffff, true}, /* 0xe184 */
-- 
2.32.0.rc2


