[Intel-gfx] [PATCH] drm/i915/debugfs: DISPLAY_VER 13 lpsp capability

2021-07-13 Thread Anshuman Gupta
Extend i915_lpsp_capability debugfs to DG2,ADLP and future platforms.

v2: commit log modification.

Cc: Animesh Manna 
Signed-off-by: Anshuman Gupta 
---
 drivers/gpu/drm/i915/display/intel_display_debugfs.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_display_debugfs.c b/drivers/gpu/drm/i915/display/intel_display_debugfs.c
index d5af5708c9da..65832c4d962f 100644
--- a/drivers/gpu/drm/i915/display/intel_display_debugfs.c
+++ b/drivers/gpu/drm/i915/display/intel_display_debugfs.c
@@ -2256,6 +2256,11 @@ static int i915_lpsp_capability_show(struct seq_file *m, void *data)
 	if (connector->status != connector_status_connected)
 		return -ENODEV;
 
+	if (DISPLAY_VER(i915) >= 13) {
+		LPSP_CAPABLE(encoder->port <= PORT_B);
+		return 0;
+	}
+
 	switch (DISPLAY_VER(i915)) {
 	case 12:
 		/*
-- 
2.26.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
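
For readers skimming the hunk above, the new early return can be modelled in plain C. This is only a sketch: the enum values and the helper name lpsp_capable_ver13() are hypothetical stand-ins, not the i915 definitions, and the real code reports the result through the LPSP_CAPABLE() macro rather than a return value.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the i915 port enum; only the ordering matters. */
enum port { PORT_A, PORT_B, PORT_C, PORT_D };

/*
 * Mirrors only the new early-return branch of the patch: on display
 * version 13 and later, the connector is reported as LPSP capable when
 * its encoder sits on port A or B. *handled tells the caller whether
 * this branch applied at all.
 */
static bool lpsp_capable_ver13(int display_ver, enum port port, bool *handled)
{
	*handled = display_ver >= 13;
	if (!*handled)
		return false;	/* caller falls through to the per-version switch */
	return port <= PORT_B;
}
```

With this model, version 13 on PORT_C is handled and reports not capable, while version 12 is left to the existing switch statement.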


[Intel-gfx] [PULL] drm-misc-fixes

2021-07-13 Thread Thomas Zimmermann
Hi Dave and Daniel,

these two fixes in drm-misc-fixes got lost during last cycle. Sending them
now.

Best regards
Thomas

drm-misc-fixes-2021-07-13:
Short summary of fixes pull:

 * dma-buf: Fix fence leak in sync_file_merge() error code
 * drm/panel: nt35510: Don't fail on DSI reads
The following changes since commit d330099115597bbc238d6758a4930e72b49ea9ba:

  drm/nouveau: fix dma_address check for CPU/GPU sync (2021-06-24 15:40:44 +0200)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-misc tags/drm-misc-fixes-2021-07-13

for you to fetch changes up to ffe000217c5068c5da07ccb1c0f8cce7ad767435:

  dma-buf/sync_file: Don't leak fences on merge failure (2021-07-12 13:34:49 +0200)




Jason Ekstrand (1):
  dma-buf/sync_file: Don't leak fences on merge failure

Linus Walleij (1):
  drm/panel: nt35510: Do not fail if DSI read fails

 drivers/dma-buf/sync_file.c   | 13 +++--
 drivers/gpu/drm/panel/panel-novatek-nt35510.c |  4 +---
 2 files changed, 8 insertions(+), 9 deletions(-)

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



Re: [Intel-gfx] [PATCH 1/8] drm/i915: Explicitly track DRM clients

2021-07-13 Thread Tvrtko Ursulin



On 12/07/2021 17:12, Daniel Vetter wrote:

On Mon, Jul 12, 2021 at 04:51:42PM +0100, Tvrtko Ursulin wrote:


On 12/07/2021 15:42, Daniel Vetter wrote:

On Mon, Jul 12, 2021 at 01:17:12PM +0100, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

Tracking DRM clients more explicitly will allow later patches to
accumulate past and current GPU usage in a centralised place and also
consolidate access to owning task pid/name.

Unique client id is also assigned for the purpose of distinguishing/
consolidating between multiple file descriptors owned by the same process.

v2:
   Chris Wilson:
   * Enclose new members into dedicated structs.
   * Protect against failed sysfs registration.

v3:
   * sysfs_attr_init.

v4:
   * Fix for internal clients.

v5:
   * Use cyclic ida for client id. (Chris)
   * Do not leak pid reference. (Chris)
   * Tidy code with some locals.

v6:
   * Use xa_alloc_cyclic to simplify locking. (Chris)
   * No need to unregister individual sysfs files. (Chris)
   * Rebase on top of fpriv kref.
   * Track client closed status and reflect in sysfs.

v7:
   * Make drm_client more standalone concept.

v8:
   * Simplify sysfs show. (Chris)
   * Always track name and pid.

v9:
   * Fix cyclic id assignment.

v10:
   * No need for a mutex around xa_alloc_cyclic.
   * Refactor sysfs into own function.
   * Unregister sysfs before freeing pid and name.
   * Move clients setup into own function.

v11:
   * Call clients init directly from driver init. (Chris)

v12:
   * Do not fail client add on id wrap. (Maciej)

v13 (Lucas): Rebase.

v14:
   * Dropped sysfs bits.

Signed-off-by: Tvrtko Ursulin 
Reviewed-by: Chris Wilson  # v11
Reviewed-by: Aravind Iddamsetty  # v11
Signed-off-by: Chris Wilson 


On the implementation: I'm not clear why this is a separate object. All
that seems to achieve is make the lifetime fun we have in here even more
annoying, for no real gain?

What's the reason for this separate i915_drm_client struct? The commit
message talks about de-duping these within the same process, but with
fdinfo I'm not seeing the relevance of this anymore.


AFAIR I started with the new fields directly in file_priv (note file_priv
then needed to be freed via RCU due to sysfs access to it!), but then the idea
there was to consolidate new members into a separate struct.


Yeah separate struct makes sense for this stuff, just to
encapsulate/document things a bit. It's the entire scaffolding around it
that I don't think makes sense anymore with the design switch to fdinfo.


So if I just drop the client name updating and lock/RCU used to query 
said client data locklessly you would be happy with that?



Plan was (and still is in internal) that the concept for DRM client will
gain more users/usefulness and would benefit from encapsulation from the
start.

For instance at patch 3 in the series it does consolidate i915 users of
ctx->pid to go via ctx->client (process name as well). Those are async entry
points (compared to file_priv lifetime) from error capture and debugfs. Hm
no, debugfs is there no more, only error capture remains.

As you say since the change of direction to use fdinfo, the asynchronous
entry path into those members from sysfs is gone. Hence if they were moved
back to file_priv, and assuming ctx->pid/name changes to be undone, then
file_priv could remain being immediately freed on file close. Or perhaps we
lose easy pid/name update for files passed over sockets. I'd have to think
about that a bit deeper.

But essentially I think ctx->client is a cleaner design than ctx->pid and
given error capture and debugfs can be async to file_priv lifetime that's a
benefit for me.



From a quick check it's just for debug printing when a ctx hung/got

banned, and for that we do want the pid - users won't have an
understanding of a drm_client. I think pid is actually what we want there.


With regards to de-duping multiple fdinfo entries via client->id - that is
actually the opposite from what you suggest. Whereas with the sysfs approach
we had one entry per client, with fdinfo we have duplicates. So client->id
is essential for userspace to correctly account per client statistics.


Uh why? Like if you use fdinfo and have a bunch of duplicate drm_file,
then your parsing tool can aggregate them all together under the same pid.
No need to do that in the kernel.


It's not done in the kernel. It's just userspace which needs a unique key.


If the problem is that we can't tell apart a dup'ed fd from multiple
open() calls, then I think that should be solved by dropping the hash of
the drm_file pointer into the fdinfo.


Yes hash would work as long as fdinfo is the only way in since then 
lifetime rules are aligned. Or I just keep the id as is since I am 
keeping the client encapsulation, which is simpler.
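
To make the userspace side of this argument concrete, here is a rough sketch of how a parsing tool could use a per-client key to avoid double counting dup'ed file descriptors. Everything in it is an assumption for illustration: struct fdinfo_sample, its field names and total_busy_ns() are hypothetical, and the thread does not define the actual fdinfo keys.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record parsed from one /proc/<pid>/fdinfo/<fd> entry. */
struct fdinfo_sample {
	uint32_t client_id;	/* assumed unique per DRM client */
	uint64_t busy_ns;	/* assumed per-client usage counter */
};

/*
 * Sum usage across samples, counting each client id once. Duplicated
 * fds of one client report the same counter, so later copies are
 * skipped. Quadratic for brevity; a real tool would likely use a hash.
 */
static uint64_t total_busy_ns(const struct fdinfo_sample *s, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++) {
		int seen = 0;

		for (size_t j = 0; j < i; j++) {
			if (s[j].client_id == s[i].client_id) {
				seen = 1;
				break;
			}
		}
		if (!seen)
			total += s[i].busy_ns;
	}
	return total;
}
```

Aggregating by pid alone would add the same client's counter twice in this situation, which is why some unique key (an id or a pointer hash) is needed.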



Also, with the fdinfo approach, why do we still need to even track the
pid? That can be all figured out from proc now, with much cleaner
semantics.


Not sure what you mean here. As explained above pid i

Re: [Intel-gfx] [PATCH 23/47] drm/i915/guc: Update GuC debugfs to support new GuC

2021-07-13 Thread Michal Wajdeczko



On 24.06.2021 09:04, Matthew Brost wrote:
> Update GuC debugfs to support the new GuC structures.
> 
> Signed-off-by: John Harrison 
> Signed-off-by: Matthew Brost 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 22 
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  3 ++
>  .../gpu/drm/i915/gt/uc/intel_guc_debugfs.c| 23 +++-
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 52 +++
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.h |  4 ++
>  drivers/gpu/drm/i915/i915_debugfs.c   |  1 +
>  6 files changed, 104 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> index e0f92e28350c..4ed074df88e5 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> @@ -1135,3 +1135,25 @@ void intel_guc_ct_event_handler(struct intel_guc_ct 
> *ct)
>  
>   ct_try_receive_message(ct);
>  }
> +
> +void intel_guc_log_ct_info(struct intel_guc_ct *ct,

this is not "guc log" function, it is "guc ct" one, so:

  void intel_guc_ct_print_info(struct intel_guc_ct *ct,

> +struct drm_printer *p)
> +{
> + if (!ct->enabled) {
> + drm_puts(p, "CT disabled\n");

nit: maybe

  drm_printf(p, "CT %s\n", enableddisabled(false));

> + return;
> + }
> +
> + drm_printf(p, "H2G Space: %u\n",
> +atomic_read(&ct->ctbs.send.space) * 4);

don't you want to print size ?
or GGTT offset ?

> + drm_printf(p, "Head: %u\n",
> +ct->ctbs.send.desc->head);
> + drm_printf(p, "Tail: %u\n",
> +ct->ctbs.send.desc->tail);
> + drm_printf(p, "G2H Space: %u\n",
> +atomic_read(&ct->ctbs.recv.space) * 4);
> + drm_printf(p, "Head: %u\n",
> +ct->ctbs.recv.desc->head);
> + drm_printf(p, "Tail: %u\n",
> +ct->ctbs.recv.desc->tail);

hmm, what about adding helper:

  static void dump_ctb(struct intel_guc_ct_buffer *ctb, struct drm_printer *p)
  {
	drm_printf(p, "Size: %u\n", ctb->size);
	drm_printf(p, "Space: %u\n", atomic_read(&ctb->space) * 4);
	drm_printf(p, "Head: %u\n", ctb->desc->head);
	drm_printf(p, "Tail: %u\n", ctb->desc->tail);
  }

and then:

drm_printf(p, "H2G:\n");
dump_ctb(&ct->ctbs.send, p);
drm_printf(p, "G2H:\n");
dump_ctb(&ct->ctbs.recv, p);

or

dump_ctb(&ct->ctbs.send, "H2G", p);
dump_ctb(&ct->ctbs.recv, "G2H", p);
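
The named variant suggested above can be tried out in userspace with a simplified stand-in. This is a sketch only: struct ctb is a hypothetical reduction of struct intel_guc_ct_buffer, and snprintf() into a caller buffer replaces the drm_printer. As in the suggestion, Space is multiplied by 4 and Size is printed as stored.

```c
#include <stdio.h>

/* Hypothetical, simplified stand-in for struct intel_guc_ct_buffer. */
struct ctb {
	unsigned int size;
	unsigned int space;	/* multiplied by 4 when printed, as above */
	unsigned int head;
	unsigned int tail;
};

/* Format one buffer under a caller-supplied name ("H2G" or "G2H"). */
static int dump_ctb(const struct ctb *ctb, const char *name,
		    char *out, size_t len)
{
	return snprintf(out, len,
			"%s:\nSize: %u\nSpace: %u\nHead: %u\nTail: %u\n",
			name, ctb->size, ctb->space * 4, ctb->head, ctb->tail);
}
```

Passing the name into the helper keeps the send and receive dumps in exactly the same format, which is the point of the suggestion.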


> +}
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> index ab1b79ab960b..f62eb06b32fc 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
> @@ -16,6 +16,7 @@
>  
>  struct i915_vma;
>  struct intel_guc;
> +struct drm_printer;
>  
>  /**
>   * DOC: Command Transport (CT).
> @@ -106,4 +107,6 @@ int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 
> *action, u32 len,
> u32 *response_buf, u32 response_buf_size, u32 flags);
>  void intel_guc_ct_event_handler(struct intel_guc_ct *ct);
>  
> +void intel_guc_log_ct_info(struct intel_guc_ct *ct, struct drm_printer *p);
> +
>  #endif /* _INTEL_GUC_CT_H_ */
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_debugfs.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_debugfs.c
> index fe7cb7b29a1e..62b9ce0fafaa 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_debugfs.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_debugfs.c
> @@ -9,6 +9,8 @@
>  #include "intel_guc.h"
>  #include "intel_guc_debugfs.h"
>  #include "intel_guc_log_debugfs.h"
> +#include "gt/uc/intel_guc_ct.h"
> +#include "gt/uc/intel_guc_submission.h"
>  
>  static int guc_info_show(struct seq_file *m, void *data)
>  {
> @@ -22,16 +24,35 @@ static int guc_info_show(struct seq_file *m, void *data)
>   drm_puts(&p, "\n");
>   intel_guc_log_info(&guc->log, &p);
>  
> - /* Add more as required ... */
> + if (!intel_guc_submission_is_used(guc))
> + return 0;
> +
> + intel_guc_log_ct_info(&guc->ct, &p);
> + intel_guc_log_submission_info(guc, &p);
>  
>   return 0;
>  }
>  DEFINE_GT_DEBUGFS_ATTRIBUTE(guc_info);
>  
> +static int guc_registered_contexts_show(struct seq_file *m, void *data)
> +{
> + struct intel_guc *guc = m->private;
> + struct drm_printer p = drm_seq_file_printer(m);
> +
> + if (!intel_guc_submission_is_used(guc))
> + return -ENODEV;
> +
> + intel_guc_log_context_info(guc, &p);
> +
> + return 0;
> +}
> +DEFINE_GT_DEBUGFS_ATTRIBUTE(guc_registered_contexts);
> +
>  void intel_guc_debugfs_register(struct intel_guc *guc, struct dentry *root)
>  {
>   static const struct debugfs_gt_file files[] = {
>   { "guc_info", &guc_info_fops, NULL },
> + { "guc_registered_contexts", &guc_registered_contexts_fops, 
> NULL },
>   };
>  
>   if (!intel_guc_is_supported(guc))
> diff --git 

[Intel-gfx] [PATCH] Revert "drm/vgem: Implement mmap as GEM object function"

2021-07-13 Thread Thomas Zimmermann
Commit 375cca1cfeb5 ("drm/vgem: Implement mmap as GEM object function")
broke several IGT tests in vgem_basic. [1] Attempts to fix the issue
have not worked out so far. [2][3] Revert the original patch for now.

Note that there is a patch that converts vgem to shmem helpers. [4]
Merging this change would be preferable to modifying vgem's mmap code.

[1] https://intel-gfx-ci.01.org/tree/drm-tip/igt@vgem_ba...@unload.html
[2] 
https://lore.kernel.org/intel-gfx/20210709154256.12005-1-tzimmerm...@suse.de/
[3] https://lore.kernel.org/intel-gfx/20210712123321.3658-1-tzimmerm...@suse.de/
[4] https://patchwork.freedesktop.org/series/90671/

Signed-off-by: Thomas Zimmermann 
Fixes: 375cca1cfeb5 ("drm/vgem: Implement mmap as GEM object function")
Reviewed-by: Daniel Vetter 
Cc: Thomas Zimmermann 
Cc: Christian König 
Cc: Daniel Vetter 
Cc: "Christian König" 
Cc: Jason Gunthorpe 
Cc: Melissa Wen 
Cc: Qinglang Miao 
Cc: Gerd Hoffmann 
---
 drivers/gpu/drm/vgem/vgem_drv.c | 46 +
 1 file changed, 41 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index df634aa52638..bf38a7e319d1 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -239,7 +239,32 @@ static struct drm_ioctl_desc vgem_ioctls[] = {
DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, 
DRM_RENDER_ALLOW),
 };

-DEFINE_DRM_GEM_FOPS(vgem_driver_fops);
+static int vgem_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+   unsigned long flags = vma->vm_flags;
+   int ret;
+
+   ret = drm_gem_mmap(filp, vma);
+   if (ret)
+   return ret;
+
+   /* Keep the WC mmaping set by drm_gem_mmap() but our pages
+* are ordinary and not special.
+*/
+   vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP;
+   return 0;
+}
+
+static const struct file_operations vgem_driver_fops = {
+   .owner  = THIS_MODULE,
+   .open   = drm_open,
+   .mmap   = vgem_mmap,
+   .poll   = drm_poll,
+   .read   = drm_read,
+   .unlocked_ioctl = drm_ioctl,
+   .compat_ioctl   = drm_compat_ioctl,
+   .release= drm_release,
+};

 static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo)
 {
@@ -362,12 +387,24 @@ static void vgem_prime_vunmap(struct drm_gem_object *obj, 
struct dma_buf_map *ma
vgem_unpin_pages(bo);
 }

-static int vgem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct 
*vma)
+static int vgem_prime_mmap(struct drm_gem_object *obj,
+  struct vm_area_struct *vma)
 {
+   int ret;
+
+   if (obj->size < vma->vm_end - vma->vm_start)
+   return -EINVAL;
+
+   if (!obj->filp)
+   return -ENODEV;
+
+   ret = call_mmap(obj->filp, vma);
+   if (ret)
+   return ret;
+
vma_set_file(vma, obj->filp);
vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
vma->vm_page_prot = 
pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
-   vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);

return 0;
 }
@@ -379,7 +416,6 @@ static const struct drm_gem_object_funcs 
vgem_gem_object_funcs = {
.get_sg_table = vgem_prime_get_sg_table,
.vmap = vgem_prime_vmap,
.vunmap = vgem_prime_vunmap,
-   .mmap = vgem_prime_mmap,
.vm_ops = &vgem_gem_vm_ops,
 };

@@ -397,7 +433,7 @@ static const struct drm_driver vgem_driver = {
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
.gem_prime_import = vgem_prime_import,
.gem_prime_import_sg_table = vgem_prime_import_sg_table,
-   .gem_prime_mmap = drm_gem_prime_mmap,
+   .gem_prime_mmap = vgem_prime_mmap,

.name   = DRIVER_NAME,
.desc   = DRIVER_DESC,
--
2.32.0



Re: [Intel-gfx] [PATCH 24/47] drm/i915/guc: Add several request trace points

2021-07-13 Thread Tvrtko Ursulin



On 24/06/2021 08:04, Matthew Brost wrote:

Add trace points for request dependencies and GuC submit. Extended
existing request trace points to include submit fence value, guc_id,
and ring tail value.

Cc: John Harrison 
Signed-off-by: Matthew Brost 
---
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  3 ++
  drivers/gpu/drm/i915/i915_request.c   |  3 ++
  drivers/gpu/drm/i915/i915_trace.h | 39 ++-
  3 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 89b3c7e5d15b..c2327eebc09c 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -422,6 +422,7 @@ static int guc_dequeue_one_context(struct intel_guc *guc)
guc->stalled_request = last;
return false;
}
+   trace_i915_request_guc_submit(last);
}
  
  	guc->stalled_request = NULL;

@@ -642,6 +643,8 @@ static int guc_bypass_tasklet_submit(struct intel_guc *guc,
ret = guc_add_request(guc, rq);
if (ret == -EBUSY)
guc->stalled_request = rq;
+   else
+   trace_i915_request_guc_submit(rq);
  
  	return ret;

  }
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index d92c9f25c9f4..7f7aa096e873 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1344,6 +1344,9 @@ __i915_request_await_execution(struct i915_request *to,
return err;
}
  
+	trace_i915_request_dep_to(to);

+   trace_i915_request_dep_from(from);


Are those two guaranteed to be atomic, i.e. no other dep_to/dep_from can 
get injected in the middle of them, and if so what guarantees that?


Actually we had an internal discussion going in November 2019 on these 
very tracepoints which I think was left hanging in the air.


There I was suggesting you create a single tracepoint in the format of 
"from -> to", so it's clear without any doubt what is going on.


I also suggested this should be outside the GuC patch since it is 
backend agnostic.


I also asked why only this flavour of dependencies and not all. You said 
this was the handy one for debugging GuC backend issues. I said in that 
case you should name it trace_i915_request_await_request so it is 
clearer it does not cover all dependencies.


As it stands it is a bit misleadingly named, has that question mark 
around atomicity, and also is not GuC specific. So really I wouldn't 
think it passes the bar in the current state.

Regards,

Tvrtko

P.S. The same discussion from 2019 also talked about 
trace_i915_request_guc_submit and how exactly it aligns with the existing 
request_in tracepoint. You were saying the new one is handy because it 
corresponds with H2G, as the last request_in of the group would trigger 
it. I was saying that then you could either know implicitly that the last 
request_in triggers H2G, or that you could consider adding explicit H2G 
tracepoints.
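
To illustrate the "from -> to" idea in isolation, here is a small userspace sketch. It is not a tracepoint definition: struct req_id and format_await_request() are hypothetical, and the ctx:seqno layout is merely an assumption borrowed from the fields the existing request tracepoints print.

```c
#include <stdio.h>

/* Hypothetical minimal request identity: fence context and seqno. */
struct req_id {
	unsigned long long ctx;
	unsigned int seqno;
};

/*
 * Emit the dependency as one record so that both ends are always
 * reported together, regardless of what other events are logged
 * concurrently.
 */
static int format_await_request(char *out, size_t len,
				const struct req_id *from,
				const struct req_id *to)
{
	return snprintf(out, len, "%llu:%u -> %llu:%u",
			from->ctx, from->seqno, to->ctx, to->seqno);
}
```

With a single record there is nothing to pair up afterwards, so the atomicity question about two back-to-back events does not arise.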



+
/* Couple the dependency tree for PI on this exposed to->fence */
if (to->engine->sched_engine->schedule) {
err = i915_sched_node_add_dependency(&to->sched,
diff --git a/drivers/gpu/drm/i915/i915_trace.h 
b/drivers/gpu/drm/i915/i915_trace.h
index 6778ad2a14a4..b02d04b6c8f6 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -794,22 +794,27 @@ DECLARE_EVENT_CLASS(i915_request,
TP_STRUCT__entry(
 __field(u32, dev)
 __field(u64, ctx)
+__field(u32, guc_id)
 __field(u16, class)
 __field(u16, instance)
 __field(u32, seqno)
+__field(u32, tail)
 ),
  
  	TP_fast_assign(

   __entry->dev = rq->engine->i915->drm.primary->index;
   __entry->class = rq->engine->uabi_class;
   __entry->instance = rq->engine->uabi_instance;
+  __entry->guc_id = rq->context->guc_id;
   __entry->ctx = rq->fence.context;
   __entry->seqno = rq->fence.seqno;
+  __entry->tail = rq->tail;
   ),
  
-	TP_printk("dev=%u, engine=%u:%u, ctx=%llu, seqno=%u",

+   TP_printk("dev=%u, engine=%u:%u, guc_id=%u, ctx=%llu, seqno=%u, 
tail=%u",
  __entry->dev, __entry->class, __entry->instance,
- __entry->ctx, __entry->seqno)
+ __entry->guc_id, __entry->ctx, __entry->seqno,
+ __entry->tail)
  );
  
  DEFINE_EVENT(i915_request, i915_request_add,

@@ -818,6 +823,21 @@ DEFINE_EVENT(i915_request, i915_reques

Re: [Intel-gfx] [PATCH v4 02/18] drm/sched: Barriers are needed for entity->last_scheduled

2021-07-13 Thread Daniel Vetter
On Tue, Jul 13, 2021 at 9:25 AM Christian König
 wrote:
> Am 13.07.21 um 08:50 schrieb Daniel Vetter:
> > On Tue, Jul 13, 2021 at 8:35 AM Christian König
> >  wrote:
> >> Am 12.07.21 um 19:53 schrieb Daniel Vetter:
> >>> It might be good enough on x86 with just READ_ONCE, but the write side
> >>> should then at least be WRITE_ONCE because x86 has total store order.
> >>>
> >>> It's definitely not enough on arm.
> >>>
> >>> Fix this properly, which means
> >>> - explain the need for the barrier in both places
> >>> - point at the other side in each comment
> >>>
> >>> Also pull out the !sched_list case as the first check, so that the
> >>> code flow is clearer.
> >>>
> >>> While at it sprinkle some comments around because it was very
> >>> non-obvious to me what's actually going on here and why.
> >>>
> >>> Note that we really need full barriers here, at first I thought
> >>> store-release and load-acquire on ->last_scheduled would be enough,
> >>> but we actually require ordering between that and the queue state.
> >>>
> >>> v2: Put smp_rmb() in the right place and fix up comment (Andrey)
> >>>
> >>> Signed-off-by: Daniel Vetter 
> >>> Cc: "Christian König" 
> >>> Cc: Steven Price 
> >>> Cc: Daniel Vetter 
> >>> Cc: Andrey Grodzovsky 
> >>> Cc: Lee Jones 
> >>> Cc: Boris Brezillon 
> >>> ---
> >>>drivers/gpu/drm/scheduler/sched_entity.c | 27 ++--
> >>>1 file changed, 25 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
> >>> b/drivers/gpu/drm/scheduler/sched_entity.c
> >>> index f7347c284886..89e3f6eaf519 100644
> >>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> >>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> >>> @@ -439,8 +439,16 @@ struct drm_sched_job 
> >>> *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
> >>>dma_fence_set_error(&sched_job->s_fence->finished, 
> >>> -ECANCELED);
> >>>
> >>>dma_fence_put(entity->last_scheduled);
> >>> +
> >>>entity->last_scheduled = 
> >>> dma_fence_get(&sched_job->s_fence->finished);
> >>>
> >>> + /*
> >>> +  * If the queue is empty we allow drm_sched_entity_select_rq() to
> >>> +  * locklessly access ->last_scheduled. This only works if we set the
> >>> +  * pointer before we dequeue and if we have a write barrier here.
> >>> +  */
> >>> + smp_wmb();
> >>> +
> >> Again, conceptually those barriers should be part of the spsc_queue
> >> container and not externally.
> > That would be extremely unusual api. Let's assume that your queue is
> > very dumb, and protected by a simple lock. That's about the maximum
> > any user could expect.
> >
> > But then you still need barriers here, because linux locks (spinlock,
> > mutex) are defined to be one-way barriers: Stuff that's inside is
> > guaranteed to be done inside, but stuff outside of the locked region
> > can leak in. They're load-acquire/store-release barriers. So not good
> > enough.
> >
> > You really need to have barriers here, and they really all need to be
> > documented properly. And yes that's a shit-ton of work in drm/sched,
> > because it's full of yolo lockless stuff.
> >
> > The other case you could make is that this works like a wakeup queue,
> > or similar. The rules there are:
> > - wake_up (i.e. pushing something into the queue) is a store-release barrier
> > - being woken up (i.e. popping an entry) is a load-acquire barrier
> > Which is obviously needed because otherwise you don't have coherency
> > for the data queued up. And again not the barriers you're looking for
> > here.
>
> Exactly that was the idea, yes.
>
> > Either way, we'd still need the comments, because it's still lockless
> > trickery, and every single one of that needs to have a comment on both
> > sides to explain what's going on.
> >
> > Essentially replace spsc_queue with an llist underneath, and that's
> > the amount of barriers a data structure should provide. Anything else
> > is asking your datastructure to paper over bugs in your users.
> >
> > This is similar to how atomic_t is by default completely unordered,
> > and users need to add barriers as needed, with comments.
>
> My main problem is as always that kernel atomics work differently than
> userspace atomics.
>
> > I think this is all to make sure people don't just write lockless algorithms
> > because it's a cool idea, but are forced to think this all through.
> > Which seems to not have happened very consistently for drm/sched, so I
> > guess needs to be fixed.
>
> Well at least initially that was all perfectly thought through. The
> problem is nobody is really maintaining that stuff.
>
> > I'm definitely not going to hide all that by making the spsc_queue
> > stuff provide random unjustified barriers just because that would
> > paper over drm/sched bugs. We need to fix the actual bugs, and
> > preferably all of them. I've found a few, but I wasn't involved in
> > drm/sched thus far, so best I can do is discover them as we go.
>
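
As a rough userspace analogy for the pairing discussed in this thread, the sketch below uses C11 fences where the patch uses smp_wmb() and smp_rmb(). It is an assumption-laden illustration, not the drm/sched code: struct entity, pop_job() and last_scheduled_if_idle() are hypothetical names, the queue is reduced to a counter, and C11 fences are similar to, but not the same as, the kernel barriers.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical miniature of the entity state discussed in the thread. */
struct entity {
	_Atomic(void *) last_scheduled;
	atomic_int queued;	/* number of jobs still in the queue */
};

/* Writer side: publish last_scheduled, then dequeue the job. */
static void pop_job(struct entity *e, void *fence)
{
	atomic_store_explicit(&e->last_scheduled, fence, memory_order_relaxed);
	/* Pairs with the acquire fence in last_scheduled_if_idle(). */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_sub_explicit(&e->queued, 1, memory_order_relaxed);
}

/* Reader side: only trust last_scheduled once the queue looks empty. */
static void *last_scheduled_if_idle(struct entity *e)
{
	if (atomic_load_explicit(&e->queued, memory_order_relaxed) != 0)
		return NULL;
	/* Pairs with the release fence in pop_job(). */
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&e->last_scheduled, memory_order_relaxed);
}
```

Note how each fence carries a comment pointing at its counterpart, which is the documentation style the patch asks for on both sides of a lockless access.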

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/debugfs: xelpd lpsp capability (rev2)

2021-07-13 Thread Patchwork
== Series Details ==

Series: drm/i915/debugfs: xelpd lpsp capability (rev2)
URL   : https://patchwork.freedesktop.org/series/92364/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10337 -> Patchwork_20583


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20583/index.html


Changes
---

  No changes found


Participating hosts (37 -> 36)
--

  Missing(1): fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_10337 -> Patchwork_20583

  CI-20190529: 20190529
  CI_DRM_10337: 52d04d593394807e36200b0875a6e91c8d6af770 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6135: 3bf28f9dffd41b85c262d4e6664ffbdf5b7d9a93 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20583: 5bd004fcf54682092c8bb77e296af2766808508c @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

5bd004fcf546 drm/i915/debugfs: DISPLAY_VER 13 lpsp capability

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20583/index.html


Re: [Intel-gfx] [PATCH v5] drm/i915: Be more gentle with exiting non-persistent context

2021-07-13 Thread Tvrtko Ursulin



Ping for any reviewers? This fixes a customer issue on heavily loaded 
transcode boxes by avoiding false GPU hang reports upon pressing Ctrl-C.


Regards,

Tvrtko

On 16/06/2021 11:09, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

When a non-persistent context exits we currently mark it as banned in
order to trigger fast termination of any outstanding GPU jobs it may have
left running.

In doing so we apply a very strict 1ms limit in which the left over job
has to preempt before we issue an engine reset.

Some workloads are not able to cleanly preempt in that time window and it
can be argued that it would instead be better to give them a bit more
grace since avoiding engine resets is generally preferable.

To achieve this the patch splits handling of banned contexts from simply
exited non-persistent ones and then applies different timeouts for both
and also extends the criteria which determines if a request should be
scheduled back in after preemption or not.

A 15ms preempt timeout grace is given to exited non-persistent contexts,
which has been empirically tested to satisfy customers' requirements
and still provides reasonably quick cleanup post exit.
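
The split between banned and merely exited contexts can be sketched as a small state model. This is an illustration under assumptions: struct ctx_state and the three helpers are hypothetical stand-ins for the intel_context flag operations, and the return conventions are inferred from the "Already previously banned or made non-schedulable?" comment in the patch.

```c
#include <stdbool.h>

/* Hypothetical per-context flags standing in for intel_context state. */
struct ctx_state {
	bool banned;
	bool schedulable;
};

/* Returns the previous value, like a test-and-set. */
static bool set_banned(struct ctx_state *c)
{
	bool was = c->banned;

	c->banned = true;
	return was;
}

/* Returns true if the context was schedulable before the call. */
static bool clear_schedulable(struct ctx_state *c)
{
	bool was = c->schedulable;

	c->schedulable = false;
	return was;
}

/*
 * Mirrors the skip logic of the patched kill_engines(): returns true
 * when the loop body should go on to cancel the context's active
 * engine, false when an earlier call already banned the context or
 * made it non-schedulable.
 */
static bool needs_kill(struct ctx_state *c, bool ban, bool persistent)
{
	bool skip = false;

	if (ban)
		skip = set_banned(c);
	else if (!persistent)
		skip = !clear_schedulable(c);

	return !skip;
}
```

In this model an exited non-persistent context is only made non-schedulable, so it can be given the longer preempt timeout, while a ban keeps the strict one.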

v2:
  * Streamline fast path checks.

v3:
  * Simplify by using only schedulable status.
  * Increase timeout to 20ms.

v4:
  * Fix live_execlists selftest.

v5:
  * Fix logic in kill_engines.

Signed-off-by: Tvrtko Ursulin 
Cc: Chris Wilson 
Cc: Zhen Han 
---
  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 22 +--
  drivers/gpu/drm/i915/gt/intel_context.c   |  2 ++
  drivers/gpu/drm/i915/gt/intel_context.h   | 17 +-
  drivers/gpu/drm/i915/gt/intel_context_types.h |  1 +
  .../drm/i915/gt/intel_execlists_submission.c  | 11 --
  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 20 +++--
  drivers/gpu/drm/i915/i915_request.c   |  2 +-
  7 files changed, 57 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 7720b8c22c81..6289d82d55d1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -426,7 +426,8 @@ static struct intel_engine_cs *active_engine(struct 
intel_context *ce)
return engine;
  }
  
-static void kill_engines(struct i915_gem_engines *engines, bool ban)

+static void
+kill_engines(struct i915_gem_engines *engines, bool ban, bool persistent)
  {
struct i915_gem_engines_iter it;
struct intel_context *ce;
@@ -440,8 +441,15 @@ static void kill_engines(struct i915_gem_engines *engines, 
bool ban)
 */
for_each_gem_engine(ce, engines, it) {
struct intel_engine_cs *engine;
+   bool skip = false;
  
-		if (ban && intel_context_set_banned(ce))

+   if (ban)
+   skip = intel_context_set_banned(ce);
+   else if (!persistent)
+   skip = !intel_context_clear_schedulable(ce);
+
+   /* Already previously banned or made non-schedulable? */
+   if (skip)
continue;
  
  		/*

@@ -454,7 +462,7 @@ static void kill_engines(struct i915_gem_engines *engines, 
bool ban)
engine = active_engine(ce);
  
  		/* First attempt to gracefully cancel the context */

-   if (engine && !__cancel_engine(engine) && ban)
+   if (engine && !__cancel_engine(engine) && (ban || !persistent))
/*
 * If we are unable to send a preemptive pulse to bump
 * the context from the GPU, we have to resort to a full
@@ -466,8 +474,6 @@ static void kill_engines(struct i915_gem_engines *engines, 
bool ban)
  
  static void kill_context(struct i915_gem_context *ctx)

  {
-   bool ban = (!i915_gem_context_is_persistent(ctx) ||
-   !ctx->i915->params.enable_hangcheck);
struct i915_gem_engines *pos, *next;
  
  	spin_lock_irq(&ctx->stale.lock);

@@ -480,7 +486,8 @@ static void kill_context(struct i915_gem_context *ctx)
  
  		spin_unlock_irq(&ctx->stale.lock);
  
-		kill_engines(pos, ban);

+   kill_engines(pos, !ctx->i915->params.enable_hangcheck,
+i915_gem_context_is_persistent(ctx));
  
  		spin_lock_irq(&ctx->stale.lock);

GEM_BUG_ON(i915_sw_fence_signaled(&pos->fence));
@@ -526,7 +533,8 @@ static void engines_idle_release(struct i915_gem_context 
*ctx,
  
  kill:

if (list_empty(&engines->link)) /* raced, already closed */
-   kill_engines(engines, true);
+   kill_engines(engines, true,
+i915_gem_context_is_persistent(ctx));
  
  	i915_sw_fence_commit(&engines->fence);

  }
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index 4033184f13b9..9d539f48d7c6 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Revert "drm/vgem: Implement mmap as GEM object function"

2021-07-13 Thread Patchwork
== Series Details ==

Series: Revert "drm/vgem: Implement mmap as GEM object function"
URL   : https://patchwork.freedesktop.org/series/92467/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
80530052a67f Revert "drm/vgem: Implement mmap as GEM object function"
-:17: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#17:
[2] https://lore.kernel.org/intel-gfx/20210709154256.12005-1-tzimmerm...@suse.de/

total: 0 errors, 1 warnings, 0 checks, 74 lines checked




[Intel-gfx] ✗ Fi.CI.BAT: failure for Revert "drm/vgem: Implement mmap as GEM object function"

2021-07-13 Thread Patchwork
== Series Details ==

Series: Revert "drm/vgem: Implement mmap as GEM object function"
URL   : https://patchwork.freedesktop.org/series/92467/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10337 -> Patchwork_20584


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20584 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20584, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20584:

### IGT changes ###

 Possible regressions 

  * igt@gem_exec_suspend@basic-s3:
- fi-kbl-8809g:   [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10337/fi-kbl-8809g/igt@gem_exec_susp...@basic-s3.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/fi-kbl-8809g/igt@gem_exec_susp...@basic-s3.html

  
Known issues


  Here are the changes found in Patchwork_20584 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_basic@cs-compute:
- fi-cfl-guc: NOTRUN -> [SKIP][3] ([fdo#109271]) +17 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/fi-cfl-guc/igt@amdgpu/amd_ba...@cs-compute.html
- fi-skl-guc: NOTRUN -> [SKIP][4] ([fdo#109271]) +17 similar issues
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/fi-skl-guc/igt@amdgpu/amd_ba...@cs-compute.html
- fi-elk-e7500:   NOTRUN -> [SKIP][5] ([fdo#109271]) +18 similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/fi-elk-e7500/igt@amdgpu/amd_ba...@cs-compute.html

  * igt@amdgpu/amd_basic@cs-gfx:
- fi-hsw-4770: NOTRUN -> [SKIP][6] ([fdo#109271] / [fdo#109315]) +17 similar issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/fi-hsw-4770/igt@amdgpu/amd_ba...@cs-gfx.html
- fi-skl-6700k2:  NOTRUN -> [SKIP][7] ([fdo#109271]) +17 similar issues
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/fi-skl-6700k2/igt@amdgpu/amd_ba...@cs-gfx.html
- fi-kbl-soraka:  NOTRUN -> [SKIP][8] ([fdo#109271]) +9 similar issues
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/fi-kbl-soraka/igt@amdgpu/amd_ba...@cs-gfx.html

  * igt@amdgpu/amd_basic@cs-sdma:
- fi-kbl-guc: NOTRUN -> [SKIP][9] ([fdo#109271]) +17 similar issues
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/fi-kbl-guc/igt@amdgpu/amd_ba...@cs-sdma.html
- fi-kbl-7500u:   NOTRUN -> [SKIP][10] ([fdo#109271]) +17 similar issues
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/fi-kbl-7500u/igt@amdgpu/amd_ba...@cs-sdma.html

  * igt@amdgpu/amd_basic@memory-alloc:
- fi-cml-u2:  NOTRUN -> [SKIP][11] ([fdo#109315]) +17 similar issues
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/fi-cml-u2/igt@amdgpu/amd_ba...@memory-alloc.html

  * igt@amdgpu/amd_basic@query-info:
- fi-bsw-kefka:   NOTRUN -> [SKIP][12] ([fdo#109271]) +17 similar issues
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/fi-bsw-kefka/igt@amdgpu/amd_ba...@query-info.html
- fi-glk-dsi: NOTRUN -> [SKIP][13] ([fdo#109271]) +17 similar issues
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/fi-glk-dsi/igt@amdgpu/amd_ba...@query-info.html
- fi-tgl-y:   NOTRUN -> [SKIP][14] ([fdo#109315])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/fi-tgl-y/igt@amdgpu/amd_ba...@query-info.html

  * igt@amdgpu/amd_basic@semaphore:
- fi-icl-y:   NOTRUN -> [SKIP][15] ([fdo#109315]) +17 similar issues
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/fi-icl-y/igt@amdgpu/amd_ba...@semaphore.html
- fi-bsw-nick:NOTRUN -> [SKIP][16] ([fdo#109271]) +17 similar issues
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/fi-bsw-nick/igt@amdgpu/amd_ba...@semaphore.html
- fi-bdw-5557u:   NOTRUN -> [SKIP][17] ([fdo#109271]) +17 similar issues
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/fi-bdw-5557u/igt@amdgpu/amd_ba...@semaphore.html

  * igt@amdgpu/amd_basic@userptr:
- fi-bxt-dsi: NOTRUN -> [SKIP][18] ([fdo#109271]) +17 similar issues
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/fi-bxt-dsi/igt@amdgpu/amd_ba...@userptr.html

  * igt@amdgpu/amd_cs_nop@fork-compute0:
- fi-ivb-3770:NOTRUN -> [SKIP][19] ([fdo#109271]) +18 similar issues
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20584/fi-ivb-3770/igt@amdgpu/amd_cs_...@fork-compute0.html

  * igt@amdgpu/amd

Re: [Intel-gfx] [PATCH v4 02/18] drm/sched: Barriers are needed for entity->last_scheduled

2021-07-13 Thread Christian König

Am 12.07.21 um 19:53 schrieb Daniel Vetter:

It might be good enough on x86 with just READ_ONCE, but the write side
should then at least be WRITE_ONCE because x86 has total store order.

It's definitely not enough on arm.

Fix this properly, which means
- explain the need for the barrier in both places
- point at the other side in each comment

Also pull out the !sched_list case as the first check, so that the
code flow is clearer.

While at it sprinkle some comments around because it was very
non-obvious to me what's actually going on here and why.

Note that we really need full barriers here. At first I thought
store-release and load-acquire on ->last_scheduled would be enough,
but we actually require ordering between that and the queue state.

v2: Put smp_rmb() in the right place and fix up comment (Andrey)

Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Steven Price 
Cc: Daniel Vetter 
Cc: Andrey Grodzovsky 
Cc: Lee Jones 
Cc: Boris Brezillon 
---
  drivers/gpu/drm/scheduler/sched_entity.c | 27 ++--
  1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index f7347c284886..89e3f6eaf519 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -439,8 +439,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
drm_sched_entity *entity)
dma_fence_set_error(&sched_job->s_fence->finished, -ECANCELED);
  
  	dma_fence_put(entity->last_scheduled);

+
entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished);
  
+	/*

+* If the queue is empty we allow drm_sched_entity_select_rq() to
+* locklessly access ->last_scheduled. This only works if we set the
+* pointer before we dequeue and if we have a write barrier here.
+*/
+   smp_wmb();
+


Again, conceptually those barriers should be part of the spsc_queue
container and not applied externally.


Regards,
Christian.


spsc_queue_pop(&entity->job_queue);
return sched_job;
  }
@@ -459,10 +467,25 @@ void drm_sched_entity_select_rq(struct drm_sched_entity 
*entity)
struct drm_gpu_scheduler *sched;
struct drm_sched_rq *rq;
  
-	if (spsc_queue_count(&entity->job_queue) || !entity->sched_list)

+   /* single possible engine and already selected */
+   if (!entity->sched_list)
+   return;
+
+   /* queue non-empty, stay on the same engine */
+   if (spsc_queue_count(&entity->job_queue))
return;
  
-	fence = READ_ONCE(entity->last_scheduled);

+   /*
+* Only when the queue is empty are we guaranteed that the scheduler
+* thread cannot change ->last_scheduled. To enforce ordering we need
+* a read barrier here. See drm_sched_entity_pop_job() for the other
+* side.
+*/
+   smp_rmb();
+
+   fence = entity->last_scheduled;
+
+   /* stay on the same engine if the previous job hasn't finished */
if (fence && !dma_fence_is_signaled(fence))
return;
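
The ordering the quoted comments describe (publish ->last_scheduled first, then dequeue; check the queue first, then read the pointer) can be sketched in userspace roughly as follows. This is only an analogy, not the scheduler's code: C11 fences stand in for smp_wmb()/smp_rmb(), a plain counter stands in for the spsc_queue, and every toy_* name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a fence: only tracks whether it has signaled. */
struct toy_fence {
	atomic_bool signaled;
};

/* Hypothetical stand-in for the entity: a job counter plus the last fence. */
struct toy_entity {
	atomic_int queue_count;                     /* models spsc_queue_count() */
	_Atomic(struct toy_fence *) last_scheduled; /* models ->last_scheduled */
};

/* Scheduler-thread side: publish the fence first, then dequeue the job. */
static void toy_pop_job(struct toy_entity *e, struct toy_fence *f)
{
	atomic_store_explicit(&e->last_scheduled, f, memory_order_relaxed);
	/* Plays the role of smp_wmb(): pointer store ordered before the dequeue. */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_sub_explicit(&e->queue_count, 1, memory_order_relaxed);
}

/* Submitter side: may only pick a new run queue once the entity is idle. */
static bool toy_can_switch_rq(struct toy_entity *e)
{
	struct toy_fence *f;

	/* Queue non-empty: stay on the same engine. */
	if (atomic_load_explicit(&e->queue_count, memory_order_relaxed) != 0)
		return false;

	/* Plays the role of smp_rmb(): count read ordered before the pointer read. */
	atomic_thread_fence(memory_order_acquire);

	f = atomic_load_explicit(&e->last_scheduled, memory_order_relaxed);

	/* Stay on the same engine if the previous job has not finished. */
	return !f || atomic_load(&f->signaled);
}
```

Keeping the two fences next to the counter operations, as here, is also roughly what moving the barriers into the queue container would look like: the caller would no longer need to know about them.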
  




Re: [Intel-gfx] [PATCH v4 01/18] drm/sched: Split drm_sched_job_init

2021-07-13 Thread Christian König

Am 12.07.21 um 19:53 schrieb Daniel Vetter:

This is a very confusingly named function, because it does not just
init an object, it also arms it and provides a point of no return for
pushing a job into the scheduler. It would be nice if that's a bit
clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

v3: Emma noticed that I leak the memory allocated in
drm_sched_job_init if we bail out before the point of no return in
subsequent driver patches. To be able to fix this, change
drm_sched_job_cleanup() so it can handle being called both before and
after drm_sched_job_arm().

Also improve the kerneldoc for this.

v4:
- Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
   usual (Melissa)

- Christian pointed out that drm_sched_entity_select_rq() also needs
   to be moved into drm_sched_job_arm, which made me realize that the
   job->id definitely needs to be moved too.


As far as I can see you still have drm_sched_entity_select_rq() in 
drm_sched_job_init()?


Christian.



   Shuffle things to fit between job_init and job_arm.

v5:
Reshuffle the split between init/arm once more, amdgpu abuses
drm_sched.ready to signal gpu reset failures. Also document this
somewhat. (Christian)

Cc: Melissa Wen 
Acked-by: Steven Price  (v2)
Signed-off-by: Daniel Vetter 
Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Qiang Yu 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Masahiro Yamada 
Cc: Kees Cook 
Cc: Adam Borowski 
Cc: Nick Terrell 
Cc: Mauro Carvalho Chehab 
Cc: Paul Menzel 
Cc: Sami Tolvanen 
Cc: Viresh Kumar 
Cc: Alex Deucher 
Cc: Dave Airlie 
Cc: Nirmoy Das 
Cc: Deepak R Varma 
Cc: Lee Jones 
Cc: Kevin Wang 
Cc: Chen Li 
Cc: Luben Tuikov 
Cc: "Marek Olšák" 
Cc: Dennis Li 
Cc: Maarten Lankhorst 
Cc: Andrey Grodzovsky 
Cc: Sonny Jiang 
Cc: Boris Brezillon 
Cc: Tian Tao 
Cc: etna...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: Emma Anholt 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +
  drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 +
  drivers/gpu/drm/lima/lima_sched.c|  2 +
  drivers/gpu/drm/panfrost/panfrost_job.c  |  2 +
  drivers/gpu/drm/scheduler/sched_entity.c |  6 +--
  drivers/gpu/drm/scheduler/sched_fence.c  | 19 ---
  drivers/gpu/drm/scheduler/sched_main.c   | 69 
  drivers/gpu/drm/v3d/v3d_gem.c|  2 +
  include/drm/gpu_scheduler.h  |  7 ++-
  10 files changed, 91 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index c5386d13eb4a..a4ec092af9a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
if (r)
goto error_unlock;
  
+	drm_sched_job_arm(&job->base);

+
/* No memory allocation is allowed while holding the notifier lock.
 * The lock is held until amdgpu_cs_submit is finished and fence is
 * added to BOs.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index d33e6d97cc89..5ddb955d2315 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
if (r)
return r;
  
+	drm_sched_job_arm(&job->base);

+
*f = dma_fence_get(&job->base.s_fence->finished);
amdgpu_job_free_resources(job);
drm_sched_entity_push_job(&job->base, entity);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index feb6da1b6ceb..05f412204118 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity 
*sched_entity,
if (ret)
goto out_unlock;
  
+	drm_sched_job_arm(&submit->sched_job);

+
submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
submit->out_fence, 0,
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index dba8329937a3..38f755580507 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
return err;
}
  
+	drm_sched_job_ar

[Intel-gfx] [PATCH 3/5] drm/i915: convert drm_i915_gem_object to kernel-doc

2021-07-13 Thread Matthew Auld
Before we can pull in the previous kernel doc for the caching bits, we
first get to add kernel doc for all of drm_i915_gem_object so this
actually builds.

Signed-off-by: Matthew Auld 
Cc: Daniel Vetter 
---
 .../gpu/drm/i915/gem/i915_gem_object_types.h  | 422 +++---
 1 file changed, 366 insertions(+), 56 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 02c3529b774c..da2194290436 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -174,24 +174,75 @@ struct i915_gem_object_page_iter {
struct mutex lock; /* protects this cache */
 };
 
-struct drm_i915_gem_object {
-   /*
-* We might have reason to revisit the below since it wastes
-* a lot of space for non-ttm gem objects.
-* In any case, always use the accessors for the ttm_buffer_object
-* when accessing it.
+/**
+ * struct i915_page_sizes - Track the various pieces we need to
+ * both track and construct huge GTT entries, when binding the
+ * object.
+ */
+struct i915_page_sizes {
+   /**
+* @phys:
+*
+* The sg mask of the pages sg_table, i.e. the
+* mask of the lengths for each sg entry.
 */
+   unsigned int phys;
+
+   /**
+* @sg:
+*
+* The gtt page sizes we are allowed to use given
+* the sg mask and the supported page sizes. This will
+* express the smallest unit we can use for the whole
+* object, as well as the larger sizes we may be able to
+* use opportunistically.
+*/
+   unsigned int sg;
+
+   /**
+* @gtt:
+*
+* The actual gtt page size usage. Since we can
+* have multiple vma associated with this object we need
+* to prevent any trampling of state, hence a copy of
+* this struct also lives in each vma, therefore the gtt
+* value here should only be read/write through the vma.
+*/
+   unsigned int gtt;
+};
+
+/**
+ * struct drm_i915_gem_object - Our core GEM object which extends the base
+ * struct drm_gem_object behaviour.
+ */
+struct drm_i915_gem_object {
union {
+   /** @base: The base DRM GEM object. */
struct drm_gem_object base;
+
+   /**
+* @__do_not_access:
+*
+* The base TTM object, if we are using the TTM backend. Note
+* that this also embeds its own DRM_GEM base object.
+*
+* We might have reason to revisit the below since it wastes a
+* lot of space for non-ttm gem objects.  In any case, always
+* use the accessors for the ttm_buffer_object when accessing
+* it.
+*/
struct ttm_buffer_object __do_not_access;
};
 
+   /**
+* @ops: The struct drm_i915_gem_object_ops interface implemented by the
+* object instance.
+*/
const struct drm_i915_gem_object_ops *ops;
 
+   /** @vma: Track all the struct i915_vma instances for this object. */
struct {
-   /**
-* @vma.lock: protect the list/tree of vmas
-*/
+   /** @vma.lock: protect the list/tree of vmas */
spinlock_t lock;
 
/**
@@ -224,7 +275,9 @@ struct drm_i915_gem_object {
 * this translation from object to context->handles_vma.
 */
struct list_head lut_list;
-   spinlock_t lut_lock; /* guards lut_list */
+
+   /** @lut_lock: Guards the lut_list */
+   spinlock_t lut_lock;
 
/**
 * @obj_link: Link into @i915_gem_ww_ctx.obj_list
@@ -234,29 +287,123 @@ struct drm_i915_gem_object {
 * when i915_gem_ww_ctx_backoff() or i915_gem_ww_ctx_fini() are called.
 */
struct list_head obj_link;
-   /**
-* @shared_resv_from: The object shares the resv from this vm.
-*/
+
+   /** @shares_resv_from: The object shares the resv from this vm. */
struct i915_address_space *shares_resv_from;
 
union {
+   /** @rcu: Embedded rcu_head */
struct rcu_head rcu;
+
+   /**
+* @freed:
+*
+* When objects need to be destroyed we batch them together into
+* an llist, for a separate worker thread to then pick up and
+* process.
+*/
struct llist_node freed;
};
 
/**
-* Whether the object is currently in the GGTT mmap.
+* @userfault_count: Whether the object is currently in the GGTT mmap.
 */
unsigned int userfault_count;
+   /**
+* @userfault_link:
+*
+* We need to maintain the list of all objects which might have been
+

[Intel-gfx] [PATCH 2/5] drm/i915/uapi: convert drm_i915_gem_madvise to kernel-doc

2021-07-13 Thread Matthew Auld
Add some kernel doc for this. We can then just reference this later when
documenting madv in the kernel.

Signed-off-by: Matthew Auld 
Cc: Daniel Vetter 
---
 include/uapi/drm/i915_drm.h | 50 +++--
 1 file changed, 42 insertions(+), 8 deletions(-)

diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index e334a8b14ef2..a839085b6577 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1492,20 +1492,54 @@ struct drm_i915_get_pipe_from_crtc_id {
__u32 pipe;
 };
 
-#define I915_MADV_WILLNEED 0
-#define I915_MADV_DONTNEED 1
-#define __I915_MADV_PURGED 2 /* internal state */
-
+/**
+ * struct drm_i915_gem_madvise - Update the madvise hint for the object.
+ *
+ * The kernel uses this to know when it can safely discard the backing pages for
+ * an object, when under memory pressure.
+ */
 struct drm_i915_gem_madvise {
-   /** Handle of the buffer to change the backing store advice */
+   /**
+* @handle: Handle of the buffer to change the backing store advice for.
+*/
__u32 handle;
 
-   /* Advice: either the buffer will be needed again in the near future,
-* or wont be and could be discarded under memory pressure.
+   /**
+* @madv: The madvise hint to set for the object.
+*
+* Supported values:
+*
+* I915_MADV_WILLNEED:
+*
+* The buffer will be needed again in the near future. By default all
+* objects are set as I915_MADV_WILLNEED. Once the pages become
+* dirty, the kernel is no longer allowed to simply discard the pages,
+* and instead can only resort to swapping the pages out, if under
+* memory pressure, where the page contents must persist when swapping
+* the pages back in.
+*
+* I915_MADV_DONTNEED:
+*
+* The buffer won't be needed. The pages and their contents can be
+* discarded under memory pressure.
+*
+* Note that if the pages were discarded then the kernel updates the
+* internal madvise value of the object to __I915_MADV_PURGED, which
+* effectively kills the object, since all further requests to allocate
+* pages for the object will be rejected. At this point a new object is
+* needed. This will be reflected in @retained.
 */
+#define I915_MADV_WILLNEED 0
+#define I915_MADV_DONTNEED 1
+#define __I915_MADV_PURGED 2 /* internal state */
__u32 madv;
 
-   /** Whether the backing store still exists. */
+   /**
+* @retained: Whether the backing store still exists.
+*
+* Set to false if the kernel purged the object and marked the object as
+* __I915_MADV_PURGED.
+*/
__u32 retained;
 };
 
-- 
2.26.3
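
The hint lifecycle documented in this patch (WILLNEED by default, DONTNEED allows discarding, a purged object cannot be revived and reports retained as false) can be illustrated with a small toy model. This is a sketch under stated assumptions, not the i915 implementation: struct toy_obj and the toy_* helpers are hypothetical, and only the three define values are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Values mirror the uapi defines quoted in the patch. */
#define I915_MADV_WILLNEED 0
#define I915_MADV_DONTNEED 1
#define __I915_MADV_PURGED 2 /* internal state */

/* Hypothetical toy object: just the hint and whether pages are present. */
struct toy_obj {
	unsigned int madv;
	bool has_pages;
};

/* Models the ioctl: records the hint and returns the "retained" value. */
static unsigned int toy_madvise(struct toy_obj *obj, unsigned int madv)
{
	/* A purged object stays purged; userspace cannot revive it. */
	if (obj->madv != __I915_MADV_PURGED)
		obj->madv = madv;
	return obj->madv != __I915_MADV_PURGED;
}

/* Models memory pressure: only DONTNEED objects may have pages discarded. */
static bool toy_shrink(struct toy_obj *obj)
{
	if (obj->madv != I915_MADV_DONTNEED)
		return false; /* WILLNEED pages could at most be swapped out */
	obj->has_pages = false;
	obj->madv = __I915_MADV_PURGED;
	return true;
}

/* Models a later attempt to allocate pages for the object. */
static int toy_get_pages(struct toy_obj *obj)
{
	if (obj->madv == __I915_MADV_PURGED)
		return -1; /* rejected: a new object is needed */
	obj->has_pages = true;
	return 0;
}
```

In this model a caller that marks a buffer DONTNEED and later flips it back to WILLNEED has to check the returned retained value before reusing the buffer, which matches the usage the kernel-doc describes.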



[Intel-gfx] [PATCH 1/5] drm/i915: document caching related bits

2021-07-13 Thread Matthew Auld
Try to document the object caching related bits, like cache_coherent and
cache_dirty.

Suggested-by: Daniel Vetter 
Signed-off-by: Matthew Auld 
---
 .../gpu/drm/i915/gem/i915_gem_object_types.h  | 135 +-
 drivers/gpu/drm/i915/i915_drv.h   |   9 --
 2 files changed, 131 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index ef3de2ae9723..02c3529b774c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -92,6 +92,57 @@ struct drm_i915_gem_object_ops {
const char *name; /* friendly name for debug, e.g. lockdep classes */
 };
 
+/**
+ * enum i915_cache_level - The supported GTT caching values for system memory
+ * pages.
+ *
+ * These translate to some special GTT PTE bits when binding pages into some
+ * address space. It also determines whether an object, or rather its pages are
+ * coherent with the GPU, when also reading or writing through the CPU cache
+ * with those pages.
+ *
+ * Userspace can also control this through struct drm_i915_gem_caching.
+ */
+enum i915_cache_level {
+   /**
+* @I915_CACHE_NONE:
+*
+* Not coherent with the CPU cache. If the cache is dirty and we need
+* the underlying pages to be coherent with some later GPU access then
+* we need to manually flush the pages.
+*
+* Note that on shared-LLC platforms reads through the CPU cache are
+* still coherent even with this setting. See also
+* I915_BO_CACHE_COHERENT_FOR_READ for more details.
+*/
+   I915_CACHE_NONE = 0,
+   /**
+* @I915_CACHE_LLC:
+*
+* Coherent with the CPU cache. If the cache is dirty, then the GPU will
+* ensure that access remains coherent, when both reading and writing
+* through the CPU cache.
+*
+* Applies to both platforms with shared-LLC(HAS_LLC), and snooping
+* based platforms(HAS_SNOOP).
+*/
+   I915_CACHE_LLC,
+   /**
+* @I915_CACHE_L3_LLC:
+*
+* gen7+, L3 sits between the domain specific caches, eg sampler/render
+* caches, and the large Last-Level-Cache. LLC is coherent with the CPU,
+* but L3 is only visible to the GPU.
+*/
+   I915_CACHE_L3_LLC,
+   /**
+* @I915_CACHE_WT:
+*
+* hsw:gt3e Write-through for scanout buffers.
+*/
+   I915_CACHE_WT,
+};
+
 enum i915_map_type {
I915_MAP_WB = 0,
I915_MAP_WC,
@@ -228,14 +279,90 @@ struct drm_i915_gem_object {
unsigned int mem_flags;
 #define I915_BO_FLAG_STRUCT_PAGE BIT(0) /* Object backed by struct pages */
 #define I915_BO_FLAG_IOMEM   BIT(1) /* Object backed by IO memory */
-   /*
-* Is the object to be mapped as read-only to the GPU
-* Only honoured if hardware has relevant pte bit
+   /**
+* @cache_level: The desired GTT caching level.
+*
+* See enum i915_cache_level for possible values, along with what
+* each does.
 */
unsigned int cache_level:3;
-   unsigned int cache_coherent:2;
+   /**
+* @cache_coherent:
+*
+* Track whether the pages are coherent with the GPU if reading or
+* writing through the CPU cache.
+*
+* This largely depends on the @cache_level, for example if the object
+* is marked as I915_CACHE_LLC, then GPU access is coherent for both
+* reads and writes through the CPU cache.
+*
+* Note that on platforms with shared-LLC support(HAS_LLC) reads through
+* the CPU cache are always coherent, regardless of the @cache_level. On
+* snooping based platforms this is not the case, unless the full
+* I915_CACHE_LLC or similar setting is used.
+*
+* As a result of this we need to track coherency separately for reads
+* and writes, in order to avoid superfluous flushing on shared-LLC
+* platforms, for reads.
+*
+* I915_BO_CACHE_COHERENT_FOR_READ:
+*
+* When reading through the CPU cache, the GPU is still coherent. Note
+* that no data has actually been modified here, so it might seem
+* strange that we care about this.
+*
+* As an example, if some object is mapped on the CPU with write-back
+* caching, and we read some page, then the cache likely now contains
+* the data from that read. At this point the cache and main memory
+* match up, so all good. But next the GPU needs to write some data to
+* that same page. Now if the @cache_level is I915_CACHE_NONE and the
+* platform doesn't have the shared-LLC, then the GPU will
+* effectively skip invalidating the cache(or however that works
+* internally) when writing the new value.  This i

[Intel-gfx] [PATCH 4/5] drm/i915: pull in some more kernel-doc

2021-07-13 Thread Matthew Auld
Pull in the kernel-doc for drm_i915_gem_object.

Signed-off-by: Matthew Auld 
Cc: Daniel Vetter 
---
 Documentation/gpu/i915.rst | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index 204ebdaadb45..77558084e989 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -387,6 +387,13 @@ GEM BO Management Implementation Details
 .. kernel-doc:: drivers/gpu/drm/i915/i915_vma_types.h
:doc: Virtual Memory Address
 
+GEM Buffer Object
+-----------------
+This section documents our core GEM object, and related bits.
+
+.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+   :internal:
+
 Buffer Object Eviction
 ----------------------
 
-- 
2.26.3



[Intel-gfx] [PATCH 5/5] drm/i915/ehl: unconditionally flush the pages on acquire

2021-07-13 Thread Matthew Auld
EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
possible for userspace to bypass the GTT caching bits set by the kernel,
as per the given object cache_level. This is troublesome since the heavy
flush we apply when first acquiring the pages is skipped if the kernel
thinks the object is coherent with the GPU. As a result it might be
possible to bypass the cache and read the contents of the page directly,
which could be stale data. If it's just a case of userspace shooting
themselves in the foot then so be it, but since i915 takes the stance of
always zeroing memory before handing it to userspace, we need to prevent
this.

v2: this time actually set cache_dirty in put_pages()
v3: move to get_pages() which looks simpler

BSpec: 34007
References: 046091758b50 ("Revert "drm/i915/ehl: Update MOCS table for EHL"")
Signed-off-by: Matthew Auld 
Cc: Tejas Upadhyay 
Cc: Francisco Jerez 
Cc: Lucas De Marchi 
Cc: Jon Bloomfield 
Cc: Chris Wilson 
Cc: Matt Roper 
Cc: Daniel Vetter 
---
 .../gpu/drm/i915/gem/i915_gem_object_types.h   |  6 ++
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c  | 18 ++
 2 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index da2194290436..7089d1b222c5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -522,6 +522,12 @@ struct drm_i915_gem_object {
 * I915_BO_CACHE_COHERENT_FOR_WRITE, i.e that the GPU will be coherent
 * for both reads and writes though the CPU cache. So pretty much this
 * should only be needed for I915_CACHE_NONE objects.
+*
+* Update: Some bonkers hardware decided to add the 'Bypass LLC' MOCS
+* entry, which defeats our @cache_coherent tracking, since userspace
+* can freely bypass the CPU cache when touching the pages with the GPU,
+* where the kernel is completely unaware. On such platforms we need to
+* apply the sledgehammer-on-acquire regardless of the @cache_coherent.
 */
unsigned int cache_dirty:1;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 6a04cce188fc..11f072193f3b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -182,6 +182,24 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj)
if (i915_gem_object_needs_bit17_swizzle(obj))
i915_gem_object_do_bit_17_swizzle(obj, st);
 
+   /*
+* EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
+* possible for userspace to bypass the GTT caching bits set by the
+* kernel, as per the given object cache_level. This is troublesome
+* since the heavy flush we apply when first gathering the pages is
+* skipped if the kernel thinks the object is coherent with the GPU. As
+* a result it might be possible to bypass the cache and read the
+* contents of the page directly, which could be stale data. If it's
+* just a case of userspace shooting themselves in the foot then so be
+* it, but since i915 takes the stance of always zeroing memory before
+* handing it to userspace, we need to prevent this.
+*
+* By setting cache_dirty here we make the clflush in set_pages
+* unconditional on such platforms.
+*/
+   if (IS_JSL_EHL(i915) && obj->flags & I915_BO_ALLOC_USER)
+   obj->cache_dirty = true;
+
__i915_gem_object_set_pages(obj, st, sg_page_sizes);
 
return 0;
-- 
2.26.3
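
How the override in this patch interacts with the coherency tracking described in the series can be sketched as a toy model. All toy_* names are hypothetical. Two things here are assumptions rather than something the quoted text states outright: that every level other than "none" is treated as coherent for both reads and writes, and that cache_dirty defaults to "not write-coherent".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of enum i915_cache_level. */
enum toy_cache_level {
	TOY_CACHE_NONE = 0,
	TOY_CACHE_LLC,
	TOY_CACHE_L3_LLC,
	TOY_CACHE_WT,
};

#define TOY_COHERENT_FOR_READ  (1u << 0)
#define TOY_COHERENT_FOR_WRITE (1u << 1)

/* Hypothetical platform description. */
struct toy_platform {
	bool has_llc;    /* shared-LLC platform */
	bool is_jsl_ehl; /* platform exposing the 'Bypass LLC' MOCS entry */
};

/* Hypothetical buffer object: only the caching related bits. */
struct toy_bo {
	enum toy_cache_level cache_level;
	unsigned int cache_coherent;
	bool cache_dirty;
	bool user_alloc; /* models I915_BO_ALLOC_USER */
};

/* Derive the coherency bits from the cache level and the platform. */
static void toy_set_coherency(struct toy_bo *bo, const struct toy_platform *p,
			      enum toy_cache_level level)
{
	bo->cache_level = level;

	if (level != TOY_CACHE_NONE)
		bo->cache_coherent = TOY_COHERENT_FOR_READ |
				     TOY_COHERENT_FOR_WRITE;
	else if (p->has_llc)
		bo->cache_coherent = TOY_COHERENT_FOR_READ;
	else
		bo->cache_coherent = 0;

	/* Assumed default: flush on acquire only if not write-coherent. */
	bo->cache_dirty = !(bo->cache_coherent & TOY_COHERENT_FOR_WRITE);
}

/* Models the hunk added to shmem_get_pages(): force the flush on JSL/EHL. */
static void toy_shmem_get_pages(struct toy_bo *bo, const struct toy_platform *p)
{
	if (p->is_jsl_ehl && bo->user_alloc)
		bo->cache_dirty = true;
}
```

The point of the model is the last function: on the affected platforms the flush decision no longer depends on cache_coherent for userspace objects, so a write-coherent object still ends up with cache_dirty set.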



[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/debugfs: xelpd lpsp capability (rev2)

2021-07-13 Thread Patchwork
== Series Details ==

Series: drm/i915/debugfs: xelpd lpsp capability (rev2)
URL   : https://patchwork.freedesktop.org/series/92364/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10337_full -> Patchwork_20583_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20583_full:

### IGT changes ###

 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@kms_ccs@pipe-a-bad-rotation-90-yf_tiled_ccs:
- {shard-rkl}:[FAIL][1] ([i915#3678]) -> [SKIP][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10337/shard-rkl-2/igt@kms_ccs@pipe-a-bad-rotation-90-yf_tiled_ccs.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20583/shard-rkl-6/igt@kms_ccs@pipe-a-bad-rotation-90-yf_tiled_ccs.html

  * igt@kms_cursor_legacy@pipe-c-torture-bo:
- {shard-rkl}:[PASS][3] -> [INCOMPLETE][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10337/shard-rkl-1/igt@kms_cursor_leg...@pipe-c-torture-bo.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20583/shard-rkl-2/igt@kms_cursor_leg...@pipe-c-torture-bo.html

  * igt@sysfs_timeslice_duration@timeout@rcs0:
- {shard-rkl}:[PASS][5] -> [FAIL][6] +5 similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10337/shard-rkl-1/igt@sysfs_timeslice_duration@time...@rcs0.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20583/shard-rkl-2/igt@sysfs_timeslice_duration@time...@rcs0.html

  
Known issues


  Here are the changes found in Patchwork_20583_full that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@gem_create@create-massive:
- shard-apl:  NOTRUN -> [DMESG-WARN][7] ([i915#3002])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20583/shard-apl2/igt@gem_cre...@create-massive.html

  * igt@gem_exec_fair@basic-deadline:
- shard-glk:  [PASS][8] -> [FAIL][9] ([i915#2846])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10337/shard-glk8/igt@gem_exec_f...@basic-deadline.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20583/shard-glk4/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-none-share@rcs0:
- shard-iclb: [PASS][10] -> [FAIL][11] ([i915#2842])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10337/shard-iclb8/igt@gem_exec_fair@basic-none-sh...@rcs0.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20583/shard-iclb8/igt@gem_exec_fair@basic-none-sh...@rcs0.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
- shard-tglb: [PASS][12] -> [FAIL][13] ([i915#2842])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10337/shard-tglb1/igt@gem_exec_fair@basic-pace-sh...@rcs0.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20583/shard-tglb2/igt@gem_exec_fair@basic-pace-sh...@rcs0.html

  * igt@gem_exec_whisper@basic-contexts-forked:
- shard-glk:  [PASS][14] -> [DMESG-WARN][15] ([i915#118] / 
[i915#95])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10337/shard-glk2/igt@gem_exec_whis...@basic-contexts-forked.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20583/shard-glk8/igt@gem_exec_whis...@basic-contexts-forked.html

  * igt@gem_userptr_blits@coherency-unsync:
- shard-iclb: NOTRUN -> [SKIP][16] ([i915#3297])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20583/shard-iclb7/igt@gem_userptr_bl...@coherency-unsync.html

  * igt@gen9_exec_parse@cmd-crossing-page:
- shard-iclb: NOTRUN -> [SKIP][17] ([fdo#112306])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20583/shard-iclb7/igt@gen9_exec_pa...@cmd-crossing-page.html

  * igt@i915_suspend@fence-restore-untiled:
- shard-skl:  [PASS][18] -> [INCOMPLETE][19] ([i915#146] / 
[i915#198])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10337/shard-skl7/igt@i915_susp...@fence-restore-untiled.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20583/shard-skl8/igt@i915_susp...@fence-restore-untiled.html

  * igt@kms_big_fb@x-tiled-16bpp-rotate-0:
- shard-skl:  [PASS][20] -> [DMESG-WARN][21] ([i915#1982])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10337/shard-skl1/igt@kms_big...@x-tiled-16bpp-rotate-0.html
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20583/shard-skl5/igt@kms_big...@x-tiled-16bpp-rotate-0.html

  * igt@kms_big_fb@x-tiled-16bpp-rotate-90:
- shard-iclb: NOTRUN -> [SKIP][22] ([fdo#110725] / [fdo#111614])
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20583/shard-iclb7/igt@kms_big...@x-tiled-16bpp-rotate-90.html

  * igt@kms_ccs@pipe-d-bad-pixel-format-y_tiled_gen12_rc_ccs_c

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/5] drm/i915: document caching related bits

2021-07-13 Thread Patchwork
== Series Details ==

Series: series starting with [1/5] drm/i915: document caching related bits
URL   : https://patchwork.freedesktop.org/series/92469/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
2d424bc0390d drm/i915: document caching related bits
-:58: WARNING:TYPO_SPELLING: 'specifc' may be misspelled - perhaps 'specific'?
#58: FILE: drivers/gpu/drm/i915/gem/i915_gem_object_types.h:133:
+* gen7+, L3 sits between the domain specifc caches, eg sampler/render
 ^^^

-:119: WARNING:REPEATED_WORD: Possible repeated word: 'the'
#119: FILE: drivers/gpu/drm/i915/gem/i915_gem_object_types.h:319:
+* that same page. Now if the @cache_level is I915_CACHE_NONE and the
+* the platform doesn't have the shared-LLC, then the GPU will

total: 0 errors, 2 warnings, 0 checks, 166 lines checked
33fab8144f1c drm/i915/uapi: convert drm_i915_gem_madvise to kernel-doc
-:55: WARNING:TYPO_SPELLING: 'wont' may be misspelled - perhaps 'won't'?
#55: FILE: include/uapi/drm/i915_drm.h:1523:
+* The buffer wont be needed. The pages and their contents can be
  

total: 0 errors, 1 warnings, 0 checks, 62 lines checked
5121ee416eb6 drm/i915: convert drm_i915_gem_object to kernel-doc
-:37: WARNING:REPEATED_WORD: Possible repeated word: 'of'
#37: FILE: drivers/gpu/drm/i915/gem/i915_gem_object_types.h:187:
+* mask of of the lengths for each sg entry.

total: 0 errors, 1 warnings, 0 checks, 577 lines checked
c30b089eaa9f drm/i915: pull in some more kernel-doc
d8d89d8aeb49 drm/i915/ehl: unconditionally flush the pages on acquire
-:21: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#21: 
References: 046091758b50 ("Revert "drm/i915/ehl: Update MOCS table for EHL"")

-:21: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit 046091758b50 ("Revert 
"drm/i915/ehl: Update MOCS table for EHL"")'
#21: 
References: 046091758b50 ("Revert "drm/i915/ehl: Update MOCS table for EHL"")

total: 1 errors, 1 warnings, 0 checks, 36 lines checked




Re: [Intel-gfx] [PATCH v4 02/18] drm/sched: Barriers are needed for entity->last_scheduled

2021-07-13 Thread Christian König

On 13.07.21 at 08:50, Daniel Vetter wrote:

On Tue, Jul 13, 2021 at 8:35 AM Christian König
 wrote:

On 12.07.21 at 19:53, Daniel Vetter wrote:

It might be good enough on x86 with just READ_ONCE, but the write side
should then at least be WRITE_ONCE because x86 has total store order.

It's definitely not enough on arm.

Fix this properly, which means
- explain the need for the barrier in both places
- point at the other side in each comment

Also pull out the !sched_list case as the first check, so that the
code flow is clearer.

While at it sprinkle some comments around because it was very
non-obvious to me what's actually going on here and why.

Note that we really need full barriers here, at first I thought
store-release and load-acquire on ->last_scheduled would be enough,
but we actually require ordering between that and the queue state.

v2: Put smp_rmb() in the right place and fix up comment (Andrey)

Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Steven Price 
Cc: Daniel Vetter 
Cc: Andrey Grodzovsky 
Cc: Lee Jones 
Cc: Boris Brezillon 
---
   drivers/gpu/drm/scheduler/sched_entity.c | 27 ++--
   1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index f7347c284886..89e3f6eaf519 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -439,8 +439,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
drm_sched_entity *entity)
   dma_fence_set_error(&sched_job->s_fence->finished, -ECANCELED);

   dma_fence_put(entity->last_scheduled);
+
   entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished);

+ /*
+  * If the queue is empty we allow drm_sched_entity_select_rq() to
+  * locklessly access ->last_scheduled. This only works if we set the
+  * pointer before we dequeue and if we a write barrier here.
+  */
+ smp_wmb();
+

Again, conceptually those barriers should be part of the spsc_queue
container and not be added externally.

That would be an extremely unusual API. Let's assume that your queue is
very dumb, and protected by a simple lock. That's about the maximum
any user could expect.

But then you still need barriers here, because linux locks (spinlock,
mutex) are defined to be one-way barriers: Stuff that's inside is
guaranteed to be done inside, but stuff outside of the locked region
can leak in. They're load-acquire/store-release barriers. So not good
enough.

You really need to have barriers here, and they really all need to be
documented properly. And yes that's a shit-ton of work in drm/sched,
because it's full of yolo lockless stuff.

The other case you could make is that this works like a wakeup queue,
or similar. The rules there are:
- wake_up (i.e. pushing something into the queue) is a store-release barrier
- being woken up (i.e. popping an entry) is a load-acquire barrier
Which is obviously needed because otherwise you don't have coherency
for the data queued up. And again not the barriers you're looking for
here.


Exactly that was the idea, yes.


Either way, we'd still need the comments, because it's still lockless
trickery, and every single one of that needs to have a comment on both
sides to explain what's going on.

Essentially replace spsc_queue with an llist underneath, and that's
the amount of barriers a data structure should provide. Anything else
is asking your datastructure to paper over bugs in your users.

This is similar to how atomic_t is by default completely unordered,
and users need to add barriers as needed, with comments.


My main problem is as always that kernel atomics work differently than
userspace atomics.



I think this is all to make sure people don't just write lockless algorithms
because it's a cool idea, but are forced to think this all through.
Which seems to not have happened very consistently for drm/sched, so I
guess needs to be fixed.


Well at least initially that was all perfectly thought through. The 
problem is nobody is really maintaining that stuff.



I'm definitely not going to hide all that by making the spsc_queue
stuff provide random unjustified barriers just because that would
paper over drm/sched bugs. We need to fix the actual bugs, and
preferably all of them. I've found a few, but I wasn't involved in
drm/sched thus far, so best I can do is discover them as we go.


I don't think that those are random unjustified barriers at all and it 
sounds like you didn't grasp what I said here.


See the spsc queue must have the following semantics:

1. When you pop a job all changes made before you push the job must be 
visible.


2. When the queue becomes empty all the changes made before you pop the 
last job must be visible.


Otherwise I completely agree with you that the whole scheduler doesn't 
work at all and we need to add tons of external barriers.
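
To make the two rules above concrete, here is a minimal userspace sketch
using C11 atomics. It is not the kernel's spsc_queue: the ring layout and
the names spsc_ring, ring_push, ring_pop and ring_empty are made up for
illustration. The release store on tail paired with the acquire load in
ring_pop is meant to cover rule 1, and the release store on head paired
with the acquire load in ring_empty is meant to cover rule 2. Whether
acquire/release pairing alone is enough for drm/sched is exactly what is
being debated in this thread, so treat this only as a picture of the
semantics, not as a proposed fix.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 8u /* must be a power of two so index wrap-around stays consistent */

struct job {
	int payload;
};

/* Hypothetical single-producer/single-consumer ring, for illustration only. */
struct spsc_ring {
	struct job *slots[RING_SIZE];
	atomic_uint head; /* advanced only by the consumer */
	atomic_uint tail; /* advanced only by the producer */
};

/* Producer side. Returns 1 on success, 0 if the ring is full. */
static int ring_push(struct spsc_ring *r, struct job *j)
{
	unsigned int tail, head;

	assert(j != NULL);
	tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	head = atomic_load_explicit(&r->head, memory_order_acquire);
	if (tail - head == RING_SIZE)
		return 0;

	r->slots[tail % RING_SIZE] = j;
	/*
	 * Rule 1: release. Everything written before the push (including
	 * the job itself) is visible to a consumer that acquires tail.
	 */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return 1;
}

/* Consumer side. Returns NULL if the ring is empty. */
static struct job *ring_pop(struct spsc_ring *r)
{
	unsigned int head = atomic_load_explicit(&r->head, memory_order_relaxed);
	/* Pairs with the release store in ring_push(). */
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_acquire);
	struct job *j;

	if (head == tail)
		return NULL;

	j = r->slots[head % RING_SIZE];
	/*
	 * Rule 2: release. Everything the consumer wrote before this pop is
	 * visible to whoever later observes the ring as empty via
	 * ring_empty().
	 */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return j;
}

/* Observer side: the acquire load of head pairs with the release in ring_pop(). */
static int ring_empty(struct spsc_ring *r)
{
	unsigned int head = atomic_load_explicit(&r->head, memory_order_acquire);
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	return head == tail;
}
```

With this shape the ordering lives inside the container, which is the
point I'm arguing for; callers that need ordering against state outside
the ring would still have to add and document their own barriers.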


Regards,
Christian.


-Daniel



Regards,
Christian.


  

[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/5] drm/i915: document caching related bits

2021-07-13 Thread Patchwork
== Series Details ==

Series: series starting with [1/5] drm/i915: document caching related bits
URL   : https://patchwork.freedesktop.org/series/92469/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10337 -> Patchwork_20585


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20585/index.html


Changes
---

  No changes found


Participating hosts (37 -> 36)
--

  Missing(1): fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_10337 -> Patchwork_20585

  CI-20190529: 20190529
  CI_DRM_10337: 52d04d593394807e36200b0875a6e91c8d6af770 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6135: 3bf28f9dffd41b85c262d4e6664ffbdf5b7d9a93 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20585: d8d89d8aeb4931dac25c7d1caff287fdef764b9d @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

d8d89d8aeb49 drm/i915/ehl: unconditionally flush the pages on acquire
c30b089eaa9f drm/i915: pull in some more kernel-doc
5121ee416eb6 drm/i915: convert drm_i915_gem_object to kernel-doc
33fab8144f1c drm/i915/uapi: convert drm_i915_gem_madvise to kernel-doc
2d424bc0390d drm/i915: document caching related bits

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20585/index.html


Re: [Intel-gfx] [PATCH 1/5] drm/i915: document caching related bits

2021-07-13 Thread Mika Kuoppala
Matthew Auld  writes:

> Try to document the object caching related bits, like cache_coherent and
> cache_dirty.
>
> Suggested-by: Daniel Vetter 
> Signed-off-by: Matthew Auld 
> ---
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  | 135 +-
>  drivers/gpu/drm/i915/i915_drv.h   |   9 --
>  2 files changed, 131 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
> b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index ef3de2ae9723..02c3529b774c 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -92,6 +92,57 @@ struct drm_i915_gem_object_ops {
>   const char *name; /* friendly name for debug, e.g. lockdep classes */
>  };
>  
> +/**
> + * enum i915_cache_level - The supported GTT caching values for system memory
> + * pages.
> + *
> + * These translate to some special GTT PTE bits when binding pages into some
> + * address space. It also determines whether an object, or rather its pages 
> are
> + * coherent with the GPU, when also reading or writing through the CPU cache
> + * with those pages.
> + *
> + * Userspace can also control this through struct drm_i915_gem_caching.
> + */
> +enum i915_cache_level {
> + /**
> +  * @I915_CACHE_NONE:
> +  *
> +  * Not coherent with the CPU cache. If the cache is dirty and we need
> +  * the underlying pages to be coherent with some later GPU access then
> +  * we need to manually flush the pages.
> +  *
> +  * Note that on shared-LLC platforms reads through the CPU cache are
> +  * still coherent even with this setting. See also
> +  * I915_BO_CACHE_COHERENT_FOR_READ for more details.
> +  */
> + I915_CACHE_NONE = 0,
> + /**
> +  * @I915_CACHE_LLC:
> +  *
> +  * Coherent with the CPU cache. If the cache is dirty, then the GPU will
> +  * ensure that access remains coherent, when both reading and writing
> +  * through the CPU cache.
> +  *
> +  * Applies to both platforms with shared-LLC(HAS_LLC), and snooping
> +  * based platforms(HAS_SNOOP).
> +  */
> + I915_CACHE_LLC,
> + /**
> +  * @I915_CACHE_L3_LLC:
> +  *
> +  * gen7+, L3 sits between the domain specifc caches, eg sampler/render

typo: specifc

> +  * caches, and the large Last-Level-Cache. LLC is coherent with the CPU,
> +  * but L3 is only visible to the GPU.
> +  */

I don't get the difference between this and I915_CACHE_LLC.
Could the diff between LLC and L3_LLC be described here with an example?

Thanks,
-Mika

> + I915_CACHE_L3_LLC,
> + /**
> +  * @I915_CACHE_WT:
> +  *
> +  * hsw:gt3e Write-through for scanout buffers.
> +  */
> + I915_CACHE_WT,
> +};
> +
>  enum i915_map_type {
>   I915_MAP_WB = 0,
>   I915_MAP_WC,
> @@ -228,14 +279,90 @@ struct drm_i915_gem_object {
>   unsigned int mem_flags;
>  #define I915_BO_FLAG_STRUCT_PAGE BIT(0) /* Object backed by struct pages */
>  #define I915_BO_FLAG_IOMEM   BIT(1) /* Object backed by IO memory */
> - /*
> -  * Is the object to be mapped as read-only to the GPU
> -  * Only honoured if hardware has relevant pte bit
> + /**
> +  * @cache_level: The desired GTT caching level.
> +  *
> +  * See enum i915_cache_level for possible values, along with what
> +  * each does.
>*/
>   unsigned int cache_level:3;
> - unsigned int cache_coherent:2;
> + /**
> +  * @cache_coherent:
> +  *
> +  * Track whether the pages are coherent with the GPU if reading or
> +  * writing through the CPU cache.
> +  *
> +  * This largely depends on the @cache_level, for example if the object
> +  * is marked as I915_CACHE_LLC, then GPU access is coherent for both
> +  * reads and writes through the CPU cache.
> +  *
> +  * Note that on platforms with shared-LLC support(HAS_LLC) reads through
> +  * the CPU cache are always coherent, regardless of the @cache_level. On
> +  * snooping based platforms this is not the case, unless the full
> +  * I915_CACHE_LLC or similar setting is used.
> +  *
> +  * As a result of this we need to track coherency separately for reads
> +  * and writes, in order to avoid superfluous flushing on shared-LLC
> +  * platforms, for reads.
> +  *
> +  * I915_BO_CACHE_COHERENT_FOR_READ:
> +  *
> +  * When reading through the CPU cache, the GPU is still coherent. Note
> +  * that no data has actually been modified here, so it might seem
> +  * strange that we care about this.
> +  *
> +  * As an example, if some object is mapped on the CPU with write-back
> +  * caching, and we read some page, then the cache likely now contains
> +  * the data from that read. At this point the cache and main memory
> +  * match up, so all good. But next the GPU needs to write some data to
> +  * that same p
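
The rule described in the @cache_coherent comment quoted above can be
sketched as a small standalone model. This is only an illustration of
the documented behaviour, not the i915 code: the enum, the flag names
and both helper functions are stand-ins invented here. It also assumes
that every level other than "none" behaves like the LLC case, which the
quoted text only hints at with "or similar setting".

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins, not the i915 definitions. */
enum cache_level {
	CACHE_NONE = 0,
	CACHE_LLC,
	CACHE_L3_LLC,
	CACHE_WT,
};

#define COHERENT_FOR_READ  (1u << 0)
#define COHERENT_FOR_WRITE (1u << 1)

/*
 * Derive the coherency bits from the caching level and from whether the
 * platform has a shared LLC (the HAS_LLC case in the quoted comment).
 */
static unsigned int coherency_bits(enum cache_level level, bool has_llc)
{
	assert(level <= CACHE_WT);

	/* Assumption: any cached level is coherent for reads and writes. */
	if (level != CACHE_NONE)
		return COHERENT_FOR_READ | COHERENT_FOR_WRITE;

	/* Shared-LLC platforms keep reads through the CPU cache coherent. */
	if (has_llc)
		return COHERENT_FOR_READ;

	/* Snooping platforms with an uncached object: nothing is coherent. */
	return 0;
}

/* A dirty CPU cache must be flushed by hand unless writes are coherent. */
static bool needs_flush_after_cpu_write(unsigned int bits)
{
	return !(bits & COHERENT_FOR_WRITE);
}
```

Keeping the read and write bits separate is what lets a shared-LLC
platform skip flushing for reads while still flushing after CPU writes
to an uncached object.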

[Intel-gfx] ✓ Fi.CI.IGT: success for series starting with [1/5] drm/i915: document caching related bits

2021-07-13 Thread Patchwork
== Series Details ==

Series: series starting with [1/5] drm/i915: document caching related bits
URL   : https://patchwork.freedesktop.org/series/92469/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10337_full -> Patchwork_20585_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20585_full:

### IGT changes ###

#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@gem_eio@unwedge-stress:
- {shard-rkl}:[TIMEOUT][1] ([i915#3063]) -> [FAIL][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10337/shard-rkl-6/igt@gem_...@unwedge-stress.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20585/shard-rkl-2/igt@gem_...@unwedge-stress.html

  * igt@kms_ccs@pipe-b-bad-rotation-90-y_tiled_gen12_mc_ccs:
- {shard-rkl}:[FAIL][3] ([i915#3678]) -> [SKIP][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10337/shard-rkl-5/igt@kms_ccs@pipe-b-bad-rotation-90-y_tiled_gen12_mc_ccs.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20585/shard-rkl-6/igt@kms_ccs@pipe-b-bad-rotation-90-y_tiled_gen12_mc_ccs.html

  * igt@sysfs_preempt_timeout@timeout@vecs0:
- {shard-rkl}:[PASS][5] -> [FAIL][6]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10337/shard-rkl-2/igt@sysfs_preempt_timeout@time...@vecs0.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20585/shard-rkl-2/igt@sysfs_preempt_timeout@time...@vecs0.html

  
Known issues


  Here are the changes found in Patchwork_20585_full that come from known 
issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_create@create-massive:
- shard-apl:  NOTRUN -> [DMESG-WARN][7] ([i915#3002])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20585/shard-apl7/igt@gem_cre...@create-massive.html

  * igt@gem_eio@unwedge-stress:
- shard-tglb: [PASS][8] -> [TIMEOUT][9] ([i915#2369] / [i915#3063] 
/ [i915#3648])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10337/shard-tglb6/igt@gem_...@unwedge-stress.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20585/shard-tglb7/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_fair@basic-deadline:
- shard-glk:  [PASS][10] -> [FAIL][11] ([i915#2846])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10337/shard-glk8/igt@gem_exec_f...@basic-deadline.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20585/shard-glk6/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-none-share@rcs0:
- shard-tglb: [PASS][12] -> [FAIL][13] ([i915#2842])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10337/shard-tglb6/igt@gem_exec_fair@basic-none-sh...@rcs0.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20585/shard-tglb1/igt@gem_exec_fair@basic-none-sh...@rcs0.html

  * igt@gem_exec_fair@basic-none@vecs0:
- shard-glk:  [PASS][14] -> [FAIL][15] ([i915#2842] / [i915#3468])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10337/shard-glk9/igt@gem_exec_fair@basic-n...@vecs0.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20585/shard-glk9/igt@gem_exec_fair@basic-n...@vecs0.html

  * igt@gem_exec_suspend@basic-s3:
- shard-kbl:  NOTRUN -> [DMESG-WARN][16] ([i915#180])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20585/shard-kbl1/igt@gem_exec_susp...@basic-s3.html

  * igt@gem_exec_whisper@basic-queues-forked-all:
- shard-glk:  [PASS][17] -> [DMESG-WARN][18] ([i915#118] / 
[i915#95]) +1 similar issue
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10337/shard-glk8/igt@gem_exec_whis...@basic-queues-forked-all.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20585/shard-glk4/igt@gem_exec_whis...@basic-queues-forked-all.html

  * igt@gem_userptr_blits@coherency-unsync:
- shard-iclb: NOTRUN -> [SKIP][19] ([i915#3297])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20585/shard-iclb5/igt@gem_userptr_bl...@coherency-unsync.html

  * igt@gen9_exec_parse@cmd-crossing-page:
- shard-iclb: NOTRUN -> [SKIP][20] ([fdo#112306])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20585/shard-iclb5/igt@gen9_exec_pa...@cmd-crossing-page.html

  * igt@kms_big_fb@x-tiled-16bpp-rotate-90:
- shard-iclb: NOTRUN -> [SKIP][21] ([fdo#110725] / [fdo#111614])
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20585/shard-iclb5/igt@kms_big...@x-tiled-16bpp-rotate-90.html

  * igt@kms_big_fb@x-tiled-max-hw-stride-32bpp-rotate-180-async-flip:
- shard-skl:  NOTRUN -> [FAIL][22] ([i915#3722])
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20585/shard-skl5/i

[Intel-gfx] [PATCH] drm/i915/gtt: drop the page table optimisation

2021-07-13 Thread Matthew Auld
We skip filling out the pt with scratch entries if the va range covers
the entire pt, since we later have to fill it with the PTEs for the
object pages anyway. However this might leave open a small window where
the PTEs don't point to anything valid for the HW to consume.

When for example using 2M GTT pages this fill_px() showed up as being
quite significant in perf measurements, and ends up being completely
wasted since we ignore the pt and just use the pde directly.

Anyway, currently we have our PTE construction split between alloc and
insert, which is probably slightly iffy nowadays, since the alloc
doesn't actually allocate anything anymore, instead it just sets up the
page directories and points the PTEs at the scratch page. Later when we
do the insert step we re-program the PTEs again. Better might be to
squash the alloc and insert into a single step, then bringing back this
optimisation(along with some others) should be possible.

Fixes: 14826673247e ("drm/i915: Only initialize partially filled pagetables")
Signed-off-by: Matthew Auld 
Cc: Jon Bloomfield 
Cc: Chris Wilson 
Cc: Daniel Vetter 
Cc:  # v4.15+
---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 3d02c726c746..6e0e52eeb87a 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -303,10 +303,7 @@ static void __gen8_ppgtt_alloc(struct i915_address_space * 
const vm,
__i915_gem_object_pin_pages(pt->base);
i915_gem_object_make_unshrinkable(pt->base);
 
-   if (lvl ||
-   gen8_pt_count(*start, end) < I915_PDES ||
-   intel_vgpu_active(vm->i915))
-   fill_px(pt, vm->scratch[lvl]->encode);
+   fill_px(pt, vm->scratch[lvl]->encode);
 
spin_lock(&pd->lock);
if (likely(!pd->entry[idx])) {
-- 
2.26.3
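
For readers who want to see the window described in the commit message,
here is a toy userspace model. It is a sketch under simplifying
assumptions, not the driver code: the struct, the constants and the
function names are invented, an all-zero entry stands in for "not valid
for the HW", and the real condition's lvl and intel_vgpu_active() checks
are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PT_ENTRIES  512u                      /* entries per table in this model */
#define PTE_PRESENT 1ull                      /* hypothetical "valid" bit */
#define SCRATCH_PTE (0x1000ull | PTE_PRESENT) /* made-up scratch encoding */

struct page_table {
	uint64_t entry[PT_ENTRIES];
};

/* Point every entry of the table at the same encoded value. */
static void fill_table(struct page_table *pt, uint64_t encode)
{
	for (size_t i = 0; i < PT_ENTRIES; i++)
		pt->entry[i] = encode;
}

/* Count entries the hardware could not safely consume. */
static unsigned int count_invalid(const struct page_table *pt)
{
	unsigned int n = 0;

	for (size_t i = 0; i < PT_ENTRIES; i++)
		if (!(pt->entry[i] & PTE_PRESENT))
			n++;
	return n;
}

/*
 * Model of the alloc step. 'count' is how many entries the later insert
 * step will program. With skip_full_fill set, a table that will be fully
 * covered is left untouched, as the old optimisation did.
 */
static void alloc_table(struct page_table *pt, unsigned int count,
			bool skip_full_fill)
{
	assert(count <= PT_ENTRIES);

	fill_table(pt, 0); /* fresh page: nothing valid yet */

	if (!skip_full_fill || count < PT_ENTRIES)
		fill_table(pt, SCRATCH_PTE);
}
```

In this model, a fully covered table allocated with skip_full_fill set
has every entry invalid until the insert step runs, while the
unconditional fill never exposes an invalid entry.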



Re: [Intel-gfx] [PATCH] drm/i915/gtt: drop the page table optimisation

2021-07-13 Thread Daniel Vetter
On Tue, Jul 13, 2021 at 02:04:31PM +0100, Matthew Auld wrote:
> We skip filling out the pt with scratch entries if the va range covers
> the entire pt, since we later have to fill it with the PTEs for the
> object pages anyway. However this might leave open a small window where
> the PTEs don't point to anything valid for the HW to consume.
> 
> When for example using 2M GTT pages this fill_px() showed up as being
> quite significant in perf measurements, and ends up being completely
> wasted since we ignore the pt and just use the pde directly.
> 
> Anyway, currently we have our PTE construction split between alloc and
> insert, which is probably slightly iffy nowadays, since the alloc
> doesn't actually allocate anything anymore, instead it just sets up the
> page directories and points the PTEs at the scratch page. Later when we
> do the insert step we re-program the PTEs again. Better might be to
> squash the alloc and insert into a single step, then bringing back this
> optimisation(along with some others) should be possible.
> 
> Fixes: 14826673247e ("drm/i915: Only initialize partially filled pagetables")
> Signed-off-by: Matthew Auld 
> Cc: Jon Bloomfield 
> Cc: Chris Wilson 
> Cc: Daniel Vetter 
> Cc:  # v4.15+

This is some impressively convoluted code, and I'm scared.

But as far as I managed to convince myself, your story here checks out.
The one problem is that this code has moved around a _lot_, so we'll need
a lot of dedicated backports :-(

Reviewed-by: Daniel Vetter 

> ---
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index 3d02c726c746..6e0e52eeb87a 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -303,10 +303,7 @@ static void __gen8_ppgtt_alloc(struct i915_address_space 
> * const vm,
>   __i915_gem_object_pin_pages(pt->base);
>   i915_gem_object_make_unshrinkable(pt->base);
>  
> - if (lvl ||
> - gen8_pt_count(*start, end) < I915_PDES ||
> - intel_vgpu_active(vm->i915))
> - fill_px(pt, vm->scratch[lvl]->encode);
> + fill_px(pt, vm->scratch[lvl]->encode);
>  
>   spin_lock(&pd->lock);
>   if (likely(!pd->entry[idx])) {
> -- 
> 2.26.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH] drm/shmem-helper: Switch to vmf_insert_pfn

2021-07-13 Thread Daniel Vetter
On Thu, Jun 03, 2021 at 11:08:31PM +0200, Daniel Vetter wrote:
> We want to stop gup, which isn't the case if we use vmf_insert_page
> and VM_MIXEDMAP, because that does not set pte_special.
> 
> v2: With this shmem gem helpers now definitely need CONFIG_MMU (0day)
> 
> v3: add more depends on MMU. For usb drivers this is a bit awkward,
> but really it's correct: To be able to provide a contig mapping of
> buffers to userspace on !MMU platforms we'd need to use the cma
> helpers for these drivers on those platforms. As-is this won't work.
> 
> Also not exactly sure why vm_insert_page doesn't go boom, because that
> definitely won't fly in practice since the pages are non-contig to
> begin with.
> 
> Signed-off-by: Daniel Vetter 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Thomas Zimmermann 
> Cc: David Airlie 
> Cc: Daniel Vetter 

Merged to drm-misc-next.

Aside, anyone feel like a review on the previous patch? Still not ready to
switch vgem over, but I think I've found the next bug that needs fixing in
shmem helpers.
-Daniel

> ---
>  drivers/gpu/drm/Kconfig| 2 +-
>  drivers/gpu/drm/drm_gem_shmem_helper.c | 4 ++--
>  drivers/gpu/drm/gud/Kconfig| 2 +-
>  drivers/gpu/drm/tiny/Kconfig   | 4 ++--
>  drivers/gpu/drm/udl/Kconfig| 1 +
>  5 files changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 56a55a6e6239..9c21527b791f 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -206,7 +206,7 @@ config DRM_KMS_CMA_HELPER
>  
>  config DRM_GEM_SHMEM_HELPER
>   bool
> - depends on DRM
> + depends on DRM && MMU
>   help
> Choose this if you need the GEM shmem helper functions
>  
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
> b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 6d625cee7a6a..11edd54f0580 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -542,7 +542,7 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault 
> *vmf)
>   } else {
>   page = shmem->pages[page_offset];
>  
> - ret = vmf_insert_page(vma, vmf->address, page);
> + ret = vmf_insert_pfn(vma, vmf->address, page_to_pfn(page));
>   }
>  
>   mutex_unlock(&shmem->pages_lock);
> @@ -612,7 +612,7 @@ int drm_gem_shmem_mmap(struct drm_gem_object *obj, struct 
> vm_area_struct *vma)
>   return ret;
>   }
>  
> - vma->vm_flags |= VM_MIXEDMAP | VM_DONTEXPAND;
> + vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND;
>   vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
>   if (shmem->map_wc)
>   vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
> diff --git a/drivers/gpu/drm/gud/Kconfig b/drivers/gpu/drm/gud/Kconfig
> index 1c8601bf4d91..9c1e61f9eec3 100644
> --- a/drivers/gpu/drm/gud/Kconfig
> +++ b/drivers/gpu/drm/gud/Kconfig
> @@ -2,7 +2,7 @@
>  
>  config DRM_GUD
>   tristate "GUD USB Display"
> - depends on DRM && USB
> + depends on DRM && USB && MMU
>   select LZ4_COMPRESS
>   select DRM_KMS_HELPER
>   select DRM_GEM_SHMEM_HELPER
> diff --git a/drivers/gpu/drm/tiny/Kconfig b/drivers/gpu/drm/tiny/Kconfig
> index d46f95d9196d..a15f57ace9e7 100644
> --- a/drivers/gpu/drm/tiny/Kconfig
> +++ b/drivers/gpu/drm/tiny/Kconfig
> @@ -31,7 +31,7 @@ config DRM_CIRRUS_QEMU
>  
>  config DRM_GM12U320
>   tristate "GM12U320 driver for USB projectors"
> - depends on DRM && USB
> + depends on DRM && USB && MMU
>   select DRM_KMS_HELPER
>   select DRM_GEM_SHMEM_HELPER
>   help
> @@ -40,7 +40,7 @@ config DRM_GM12U320
>  
>  config DRM_SIMPLEDRM
>   tristate "Simple framebuffer driver"
> - depends on DRM
> + depends on DRM && MMU
>   select DRM_GEM_SHMEM_HELPER
>   select DRM_KMS_HELPER
>   help
> diff --git a/drivers/gpu/drm/udl/Kconfig b/drivers/gpu/drm/udl/Kconfig
> index 1f497d8f1ae5..c744175c6992 100644
> --- a/drivers/gpu/drm/udl/Kconfig
> +++ b/drivers/gpu/drm/udl/Kconfig
> @@ -4,6 +4,7 @@ config DRM_UDL
>   depends on DRM
>   depends on USB
>   depends on USB_ARCH_HAS_HCD
> + depends on MMU
>   select DRM_GEM_SHMEM_HELPER
>   select DRM_KMS_HELPER
>   help
> -- 
> 2.31.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PULL] drm-misc-fixes

2021-07-13 Thread Daniel Vetter
On Tue, Jul 13, 2021 at 10:44:05AM +0200, Thomas Zimmermann wrote:
> Hi Dave and Daniel,
> 
> these two fixes in drm-misc-fixes got lost during last cycle. Sending them
> now.

Applied to drm-fixes, thanks.
-Daniel

> 
> Best regards
> Thomas
> 
> drm-misc-fixes-2021-07-13:
> Short summary of fixes pull:
> 
>  * dma-buf: Fix fence leak in sync_file_merge() error code
>  * drm/panel: nt35510: Don't fail on DSI reads
> The following changes since commit d330099115597bbc238d6758a4930e72b49ea9ba:
> 
>   drm/nouveau: fix dma_address check for CPU/GPU sync (2021-06-24 15:40:44 
> +0200)
> 
> are available in the Git repository at:
> 
>   git://anongit.freedesktop.org/drm/drm-misc tags/drm-misc-fixes-2021-07-13
> 
> for you to fetch changes up to ffe000217c5068c5da07ccb1c0f8cce7ad767435:
> 
>   dma-buf/sync_file: Don't leak fences on merge failure (2021-07-12 13:34:49 
> +0200)
> 
> 
> Short summary of fixes pull:
> 
>  * dma-buf: Fix fence leak in sync_file_merge() error code
>  * drm/panel: nt35510: Don't fail on DSI reads
> 
> 
> Jason Ekstrand (1):
>   dma-buf/sync_file: Don't leak fences on merge failure
> 
> Linus Walleij (1):
>   drm/panel: nt35510: Do not fail if DSI read fails
> 
>  drivers/dma-buf/sync_file.c   | 13 +++--
>  drivers/gpu/drm/panel/panel-novatek-nt35510.c |  4 +---
>  2 files changed, 8 insertions(+), 9 deletions(-)
> 
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/gtt: drop the page table optimisation

2021-07-13 Thread Patchwork
== Series Details ==

Series: drm/i915/gtt: drop the page table optimisation
URL   : https://patchwork.freedesktop.org/series/92474/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10339 -> Patchwork_20586


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/index.html

Known issues


  Here are the changes found in Patchwork_20586 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_basic@semaphore:
- fi-bdw-5557u:   NOTRUN -> [SKIP][1] ([fdo#109271]) +27 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/fi-bdw-5557u/igt@amdgpu/amd_ba...@semaphore.html

  * igt@core_hotunplug@unbind-rebind:
- fi-bdw-5557u:   NOTRUN -> [WARN][2] ([i915#3718])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/fi-bdw-5557u/igt@core_hotunp...@unbind-rebind.html

  * igt@kms_chamelium@dp-crc-fast:
- fi-bdw-5557u:   NOTRUN -> [SKIP][3] ([fdo#109271] / [fdo#111827]) +8 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/fi-bdw-5557u/igt@kms_chamel...@dp-crc-fast.html

  * igt@prime_vgem@basic-userptr:
- fi-pnv-d510:   NOTRUN -> [SKIP][4] ([fdo#109271]) +48 similar issues
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/fi-pnv-d510/igt@prime_v...@basic-userptr.html

  
#### Possible fixes ####

  * igt@gem_exec_parallel@engines@userptr:
- fi-pnv-d510:   [INCOMPLETE][5] ([i915#299]) -> [PASS][6]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10339/fi-pnv-d510/igt@gem_exec_parallel@engi...@userptr.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/fi-pnv-d510/igt@gem_exec_parallel@engi...@userptr.html

  * igt@i915_selftest@live@gt_pm:
- fi-icl-y:   [DMESG-FAIL][7] ([i915#2291]) -> [PASS][8]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10339/fi-icl-y/igt@i915_selftest@live@gt_pm.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/fi-icl-y/igt@i915_selftest@live@gt_pm.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#2291]: https://gitlab.freedesktop.org/drm/intel/issues/2291
  [i915#299]: https://gitlab.freedesktop.org/drm/intel/issues/299
  [i915#3718]: https://gitlab.freedesktop.org/drm/intel/issues/3718


Participating hosts (39 -> 35)
--

  Missing (4): fi-ilk-m540 fi-bdw-samus fi-tgl-dsi fi-hsw-4200u


Build changes
-

  * Linux: CI_DRM_10339 -> Patchwork_20586

  CI-20190529: 20190529
  CI_DRM_10339: 5ff1081389a25152162a30fac19ae9c7e8248df7 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6135: 3bf28f9dffd41b85c262d4e6664ffbdf5b7d9a93 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20586: 23bf950f2be07aeaaa77419144b8ee662335d398 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

23bf950f2be0 drm/i915/gtt: drop the page table optimisation

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/index.html


[Intel-gfx] [PATCH] drm/fb-helper: Try to protect cleanup against delayed setup

2021-07-13 Thread Daniel Vetter
Some vague evidence suggests this can go wrong. Try to prevent it by
holding the right mutex and clearing ->deferred_setup to make sure we
later on don't accidentally try to re-register the fbdev when the
driver thought it had it all cleaned up already.

v2: I realized that this is fundamentally butchered, and CI complained
about lockdep splats. So limit the critical section again and just add
a few notes on what the proper fix is.

References: 
https://intel-gfx-ci.01.org/tree/linux-next/next-20201215/fi-byt-j1900/igt@i915_pm_...@module-reload.html
Signed-off-by: Daniel Vetter 
Cc: Ville Syrjälä 
Cc: Chris Wilson 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/drm_fb_helper.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 9d82fda274eb..8f11e5abb222 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -598,6 +598,9 @@ EXPORT_SYMBOL(drm_fb_helper_alloc_fbi);
  * A wrapper around unregister_framebuffer, to release the fb_info
  * framebuffer device. This must be called before releasing all resources for
  * @fb_helper by calling drm_fb_helper_fini().
+ *
+ * Note that this is fundamentally racy on hotunload because it doesn't handle
+ * open fbdev file descriptors at all. Use drm_fbdev_generic_setup() instead.
  */
 void drm_fb_helper_unregister_fbi(struct drm_fb_helper *fb_helper)
 {
@@ -611,6 +614,9 @@ EXPORT_SYMBOL(drm_fb_helper_unregister_fbi);
  * @fb_helper: driver-allocated fbdev helper, can be NULL
  *
  * This cleans up all remaining resources associated with @fb_helper.
+ *
+ * Note that this is fundamentally racy on hotunload because it doesn't handle
+ * open fbdev file descriptors at all. Use drm_fbdev_generic_setup() instead.
  */
 void drm_fb_helper_fini(struct drm_fb_helper *fb_helper)
 {
@@ -2382,6 +2388,10 @@ static void drm_fbdev_client_unregister(struct 
drm_client_dev *client)
 {
struct drm_fb_helper *fb_helper = drm_fb_helper_from_client(client);
 
+   mutex_lock(&fb_helper->lock);
+   fb_helper->deferred_setup = false;
+   mutex_unlock(&fb_helper->lock);
+
if (fb_helper->fbdev)
/* drm_fbdev_fb_destroy() takes care of cleanup */
drm_fb_helper_unregister_fbi(fb_helper);
-- 
2.32.0



Re: [Intel-gfx] [PATCH 1/2] drm/i915/gem: Correct the locking and pin pattern for dma-buf (v5)

2021-07-13 Thread Daniel Vetter
On Mon, Jul 12, 2021 at 06:12:33PM -0500, Jason Ekstrand wrote:
> From: Thomas Hellström 
> 
> If our exported dma-bufs are imported by another instance of our driver,
> that instance will typically have the imported dma-bufs locked during
> dma_buf_map_attachment(). But the exporter also locks the same reservation
> object in the map_dma_buf() callback, which leads to recursive locking.
> 
> So taking the lock inside _pin_pages_unlocked() is incorrect.
> 
> Additionally, the current pinning code path is contrary to the defined
> way that pinning should occur.
> 
> Remove the explicit pin/unpin from the map/unmap functions and move them
> to the attach/detach allowing correct locking to occur, and to match
> the static dma-buf drm_prime pattern.
> 
> Add a live selftest to exercise both dynamic and non-dynamic
> exports.
> 
> v2:
> - Extend the selftest with a fake dynamic importer.
> - Provide real pin and unpin callbacks to not abuse the interface.
> v3: (ruhl)
> - Remove the dynamic export support and move the pinning into the
>   attach/detach path.
> v4: (ruhl)
> - Put pages does not need to assert on the dma-resv
> v5: (jason)
> - Lock around dma_buf_unmap_attachment() when emulating a dynamic
>   importer in the subtests.
> - Use pin_pages_unlocked
> 
> Reported-by: Michael J. Ruhl 
> Signed-off-by: Thomas Hellström 
> Signed-off-by: Michael J. Ruhl 
> Signed-off-by: Jason Ekstrand 
> Reviewed-by: Jason Ekstrand 
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|  43 +--
>  .../drm/i915/gem/selftests/i915_gem_dmabuf.c  | 118 +-
>  2 files changed, 147 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> index 616c3a2f1baf0..9a655f69a0671 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> @@ -12,6 +12,8 @@
>  #include "i915_gem_object.h"
>  #include "i915_scatterlist.h"
>  
> +I915_SELFTEST_DECLARE(static bool force_different_devices;)
> +
>  static struct drm_i915_gem_object *dma_buf_to_obj(struct dma_buf *buf)
>  {
>   return to_intel_bo(buf->priv);
> @@ -25,15 +27,11 @@ static struct sg_table *i915_gem_map_dma_buf(struct 
> dma_buf_attachment *attachme
>   struct scatterlist *src, *dst;
>   int ret, i;
>  
> - ret = i915_gem_object_pin_pages_unlocked(obj);
> - if (ret)
> - goto err;
> -
>   /* Copy sg so that we make an independent mapping */
>   st = kmalloc(sizeof(struct sg_table), GFP_KERNEL);
>   if (st == NULL) {
>   ret = -ENOMEM;
> - goto err_unpin_pages;
> + goto err;
>   }
>  
>   ret = sg_alloc_table(st, obj->mm.pages->nents, GFP_KERNEL);
> @@ -58,8 +56,6 @@ static struct sg_table *i915_gem_map_dma_buf(struct 
> dma_buf_attachment *attachme
>   sg_free_table(st);
>  err_free:
>   kfree(st);
> -err_unpin_pages:
> - i915_gem_object_unpin_pages(obj);
>  err:
>   return ERR_PTR(ret);
>  }
> @@ -68,13 +64,9 @@ static void i915_gem_unmap_dma_buf(struct 
> dma_buf_attachment *attachment,
>  struct sg_table *sg,
>  enum dma_data_direction dir)
>  {
> - struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment->dmabuf);
> -
>   dma_unmap_sgtable(attachment->dev, sg, dir, DMA_ATTR_SKIP_CPU_SYNC);
>   sg_free_table(sg);
>   kfree(sg);
> -
> - i915_gem_object_unpin_pages(obj);
>  }
>  
>  static int i915_gem_dmabuf_vmap(struct dma_buf *dma_buf, struct dma_buf_map 
> *map)
> @@ -168,7 +160,31 @@ static int i915_gem_end_cpu_access(struct dma_buf 
> *dma_buf, enum dma_data_direct
>   return err;
>  }
>  
> +/**
> + * i915_gem_dmabuf_attach - Do any extra attach work necessary
> + * @dmabuf: imported dma-buf
> + * @attach: new attach to do work on
> + *
> + */
> +static int i915_gem_dmabuf_attach(struct dma_buf *dmabuf,
> +   struct dma_buf_attachment *attach)
> +{
> + struct drm_i915_gem_object *obj = dma_buf_to_obj(dmabuf);
> +
> + return i915_gem_object_pin_pages_unlocked(obj);
> +}
> +
> +static void i915_gem_dmabuf_detach(struct dma_buf *dmabuf,
> +struct dma_buf_attachment *attach)
> +{
> + struct drm_i915_gem_object *obj = dma_buf_to_obj(dmabuf);
> +
> + i915_gem_object_unpin_pages(obj);
> +}
> +
>  static const struct dma_buf_ops i915_dmabuf_ops =  {
> + .attach = i915_gem_dmabuf_attach,
> + .detach = i915_gem_dmabuf_detach,
>   .map_dma_buf = i915_gem_map_dma_buf,
>   .unmap_dma_buf = i915_gem_unmap_dma_buf,
>   .release = drm_gem_dmabuf_release,
> @@ -204,6 +220,8 @@ static int i915_gem_object_get_pages_dmabuf(struct 
> drm_i915_gem_object *obj)
>   struct sg_table *pages;
>   unsigned int sg_page_sizes;
>  
> + assert_object_held(obj);
> +
>   pages = dma_buf_map_attachment(obj->base.import_attach,
>  

Re: [Intel-gfx] [PATCH 1/2] drm/i915/gem: Correct the locking and pin pattern for dma-buf (v5)

2021-07-13 Thread Jason Ekstrand
On Tue, Jul 13, 2021 at 9:40 AM Daniel Vetter  wrote:
>
> On Mon, Jul 12, 2021 at 06:12:33PM -0500, Jason Ekstrand wrote:
> > From: Thomas Hellström 
> >
> > If our exported dma-bufs are imported by another instance of our driver,
> > that instance will typically have the imported dma-bufs locked during
> > dma_buf_map_attachment(). But the exporter also locks the same reservation
> > object in the map_dma_buf() callback, which leads to recursive locking.
> >
> > So taking the lock inside _pin_pages_unlocked() is incorrect.
> >
> > Additionally, the current pinning code path is contrary to the defined
> > way that pinning should occur.
> >
> > Remove the explicit pin/unpin from the map/unmap functions and move them
> > to the attach/detach allowing correct locking to occur, and to match
> > the static dma-buf drm_prime pattern.
> >
> > Add a live selftest to exercise both dynamic and non-dynamic
> > exports.
> >
> > v2:
> > - Extend the selftest with a fake dynamic importer.
> > - Provide real pin and unpin callbacks to not abuse the interface.
> > v3: (ruhl)
> > - Remove the dynamic export support and move the pinning into the
> >   attach/detach path.
> > v4: (ruhl)
> > - Put pages does not need to assert on the dma-resv
> > v5: (jason)
> > - Lock around dma_buf_unmap_attachment() when emulating a dynamic
> >   importer in the subtests.
> > - Use pin_pages_unlocked
> >
> > Reported-by: Michael J. Ruhl 
> > Signed-off-by: Thomas Hellström 
> > Signed-off-by: Michael J. Ruhl 
> > Signed-off-by: Jason Ekstrand 
> > Reviewed-by: Jason Ekstrand 
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|  43 +--
> >  .../drm/i915/gem/selftests/i915_gem_dmabuf.c  | 118 +-
> >  2 files changed, 147 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
> > b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> > index 616c3a2f1baf0..9a655f69a0671 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> > @@ -12,6 +12,8 @@
> >  #include "i915_gem_object.h"
> >  #include "i915_scatterlist.h"
> >
> > +I915_SELFTEST_DECLARE(static bool force_different_devices;)
> > +
> >  static struct drm_i915_gem_object *dma_buf_to_obj(struct dma_buf *buf)
> >  {
> >   return to_intel_bo(buf->priv);
> > @@ -25,15 +27,11 @@ static struct sg_table *i915_gem_map_dma_buf(struct 
> > dma_buf_attachment *attachme
> >   struct scatterlist *src, *dst;
> >   int ret, i;
> >
> > - ret = i915_gem_object_pin_pages_unlocked(obj);
> > - if (ret)
> > - goto err;
> > -
> >   /* Copy sg so that we make an independent mapping */
> >   st = kmalloc(sizeof(struct sg_table), GFP_KERNEL);
> >   if (st == NULL) {
> >   ret = -ENOMEM;
> > - goto err_unpin_pages;
> > + goto err;
> >   }
> >
> >   ret = sg_alloc_table(st, obj->mm.pages->nents, GFP_KERNEL);
> > @@ -58,8 +56,6 @@ static struct sg_table *i915_gem_map_dma_buf(struct 
> > dma_buf_attachment *attachme
> >   sg_free_table(st);
> >  err_free:
> >   kfree(st);
> > -err_unpin_pages:
> > - i915_gem_object_unpin_pages(obj);
> >  err:
> >   return ERR_PTR(ret);
> >  }
> > @@ -68,13 +64,9 @@ static void i915_gem_unmap_dma_buf(struct 
> > dma_buf_attachment *attachment,
> >  struct sg_table *sg,
> >  enum dma_data_direction dir)
> >  {
> > - struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment->dmabuf);
> > -
> >   dma_unmap_sgtable(attachment->dev, sg, dir, DMA_ATTR_SKIP_CPU_SYNC);
> >   sg_free_table(sg);
> >   kfree(sg);
> > -
> > - i915_gem_object_unpin_pages(obj);
> >  }
> >
> >  static int i915_gem_dmabuf_vmap(struct dma_buf *dma_buf, struct 
> > dma_buf_map *map)
> > @@ -168,7 +160,31 @@ static int i915_gem_end_cpu_access(struct dma_buf 
> > *dma_buf, enum dma_data_direct
> >   return err;
> >  }
> >
> > +/**
> > + * i915_gem_dmabuf_attach - Do any extra attach work necessary
> > + * @dmabuf: imported dma-buf
> > + * @attach: new attach to do work on
> > + *
> > + */
> > +static int i915_gem_dmabuf_attach(struct dma_buf *dmabuf,
> > +   struct dma_buf_attachment *attach)
> > +{
> > + struct drm_i915_gem_object *obj = dma_buf_to_obj(dmabuf);
> > +
> > + return i915_gem_object_pin_pages_unlocked(obj);
> > +}
> > +
> > +static void i915_gem_dmabuf_detach(struct dma_buf *dmabuf,
> > +struct dma_buf_attachment *attach)
> > +{
> > + struct drm_i915_gem_object *obj = dma_buf_to_obj(dmabuf);
> > +
> > + i915_gem_object_unpin_pages(obj);
> > +}
> > +
> >  static const struct dma_buf_ops i915_dmabuf_ops =  {
> > + .attach = i915_gem_dmabuf_attach,
> > + .detach = i915_gem_dmabuf_detach,
> >   .map_dma_buf = i915_gem_map_dma_buf,
> >   .unmap_dma_buf = i915_gem_unmap_dma_buf,
> >   .release = drm_gem_dmabuf_r

Re: [Intel-gfx] [PATCH 2/2] drm/i915/gem: Migrate to system at dma-buf attach time (v5)

2021-07-13 Thread Daniel Vetter
On Mon, Jul 12, 2021 at 06:12:34PM -0500, Jason Ekstrand wrote:
> From: Thomas Hellström 
> 
> Until we support p2p dma or as a complement to that, migrate data
> to system memory at dma-buf attach time if possible.
> 
> v2:
> - Rebase on dynamic exporter. Update the igt_dmabuf_import_same_driver
>   selftest to migrate if we are LMEM capable.
> v3:
> - Migrate also in the pin() callback.
> v4:
> - Migrate in attach
> v5: (jason)
> - Lock around the migration
> 
> Signed-off-by: Thomas Hellström 
> Signed-off-by: Michael J. Ruhl 
> Reported-by: kernel test robot 
> Signed-off-by: Jason Ekstrand 
> Reviewed-by: Jason Ekstrand 
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c| 25 ++-
>  .../drm/i915/gem/selftests/i915_gem_dmabuf.c  |  4 ++-
>  2 files changed, 27 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> index 9a655f69a0671..3163f00554476 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> @@ -170,8 +170,31 @@ static int i915_gem_dmabuf_attach(struct dma_buf *dmabuf,
> struct dma_buf_attachment *attach)
>  {
>   struct drm_i915_gem_object *obj = dma_buf_to_obj(dmabuf);
> + struct i915_gem_ww_ctx ww;
> + int err;
> +
> + for_i915_gem_ww(&ww, err, true) {
> + err = i915_gem_object_lock(obj, &ww);
> + if (err)
> + continue;
> +
> + if (!i915_gem_object_can_migrate(obj, INTEL_REGION_SMEM)) {
> + err = -EOPNOTSUPP;
> + continue;
> + }
> +
> + err = i915_gem_object_migrate(obj, &ww, INTEL_REGION_SMEM);
> + if (err)
> + continue;
>  
> - return i915_gem_object_pin_pages_unlocked(obj);
> + err = i915_gem_object_wait_migration(obj, 0);
> + if (err)
> + continue;
> +
> + err = i915_gem_object_pin_pages(obj);
> + }
> +
> + return err;
>  }
>  
>  static void i915_gem_dmabuf_detach(struct dma_buf *dmabuf,
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c 
> b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
> index 3dc0f8b3cdab0..4f7e77b1c0152 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
> @@ -106,7 +106,9 @@ static int igt_dmabuf_import_same_driver(void *arg)
>   int err;
>  
>   force_different_devices = true;
> - obj = i915_gem_object_create_shmem(i915, PAGE_SIZE);
> + obj = i915_gem_object_create_lmem(i915, PAGE_SIZE, 0);

I'm wondering (and couldn't answer) whether this creates an lmem+smem
buffer, since if we create an lmem-only buffer then the migration above
should fail.

I'm also not sure we have a testcase for that case either ...

I tried to read some code here, but got a bit lost. Ideas?
-Daniel

> + if (IS_ERR(obj))
> + obj = i915_gem_object_create_shmem(i915, PAGE_SIZE);
>   if (IS_ERR(obj))
>   goto out_ret;
>  
> -- 
> 2.31.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 2/2] drm/i915/gem: Migrate to system at dma-buf attach time (v5)

2021-07-13 Thread Matthew Auld
On Tue, 13 Jul 2021 at 15:44, Daniel Vetter  wrote:
>
> On Mon, Jul 12, 2021 at 06:12:34PM -0500, Jason Ekstrand wrote:
> > From: Thomas Hellström 
> >
> > Until we support p2p dma or as a complement to that, migrate data
> > to system memory at dma-buf attach time if possible.
> >
> > v2:
> > - Rebase on dynamic exporter. Update the igt_dmabuf_import_same_driver
> >   selftest to migrate if we are LMEM capable.
> > v3:
> > - Migrate also in the pin() callback.
> > v4:
> > - Migrate in attach
> > v5: (jason)
> > - Lock around the migration
> >
> > Signed-off-by: Thomas Hellström 
> > Signed-off-by: Michael J. Ruhl 
> > Reported-by: kernel test robot 
> > Signed-off-by: Jason Ekstrand 
> > Reviewed-by: Jason Ekstrand 
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c| 25 ++-
> >  .../drm/i915/gem/selftests/i915_gem_dmabuf.c  |  4 ++-
> >  2 files changed, 27 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
> > b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> > index 9a655f69a0671..3163f00554476 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> > @@ -170,8 +170,31 @@ static int i915_gem_dmabuf_attach(struct dma_buf 
> > *dmabuf,
> > struct dma_buf_attachment *attach)
> >  {
> >   struct drm_i915_gem_object *obj = dma_buf_to_obj(dmabuf);
> > + struct i915_gem_ww_ctx ww;
> > + int err;
> > +
> > + for_i915_gem_ww(&ww, err, true) {
> > + err = i915_gem_object_lock(obj, &ww);
> > + if (err)
> > + continue;
> > +
> > + if (!i915_gem_object_can_migrate(obj, INTEL_REGION_SMEM)) {
> > + err = -EOPNOTSUPP;
> > + continue;
> > + }
> > +
> > + err = i915_gem_object_migrate(obj, &ww, INTEL_REGION_SMEM);
> > + if (err)
> > + continue;
> >
> > - return i915_gem_object_pin_pages_unlocked(obj);
> > + err = i915_gem_object_wait_migration(obj, 0);
> > + if (err)
> > + continue;
> > +
> > + err = i915_gem_object_pin_pages(obj);
> > + }
> > +
> > + return err;
> >  }
> >
> >  static void i915_gem_dmabuf_detach(struct dma_buf *dmabuf,
> > diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c 
> > b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
> > index 3dc0f8b3cdab0..4f7e77b1c0152 100644
> > --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
> > +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
> > @@ -106,7 +106,9 @@ static int igt_dmabuf_import_same_driver(void *arg)
> >   int err;
> >
> >   force_different_devices = true;
> > - obj = i915_gem_object_create_shmem(i915, PAGE_SIZE);
> > + obj = i915_gem_object_create_lmem(i915, PAGE_SIZE, 0);
>
> I'm wondering (and couldn't answer) whether this creates an lmem+smem
> buffer, since if we create an lmem-only buffer then the migration above
> should fail.

It's lmem-only, but it's also a kernel internal object, so the
migration path will still happily migrate it if asked. On the other
hand if it's a userspace object then we always have to respect the
placements.

I think for now the only usecase for that is in the selftests.
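
To make the distinction above concrete, here is a minimal userspace sketch
of the policy as described in this reply: kernel-internal objects may be
migrated anywhere, userspace objects only to a region listed in their
placements. Every name in it (toy_object, toy_can_migrate, toy_migrate, the
TOY_REGION_* values) is hypothetical and only models the behaviour
discussed here; it is not the i915 implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy memory regions. The names echo the thread; this is not i915 code. */
enum toy_region { TOY_REGION_SMEM = 0, TOY_REGION_LMEM = 1, TOY_REGION_COUNT };

struct toy_object {
	unsigned int placements;  /* bitmask of regions the creator allowed */
	bool kernel_internal;     /* true if the driver itself created the object */
	enum toy_region region;   /* where the backing store currently lives */
};

/* Placement check: may this object be moved to @dst at all? */
static bool toy_can_migrate(const struct toy_object *obj, enum toy_region dst)
{
	assert(dst < TOY_REGION_COUNT);

	/* Assumption from the thread: internal objects ignore placements. */
	if (obj->kernel_internal)
		return true;

	/* Userspace objects must always respect their placement list. */
	return (obj->placements & (1u << dst)) != 0;
}

/* Check first, then move, in the same order as the attach path in the patch. */
static int toy_migrate(struct toy_object *obj, enum toy_region dst)
{
	if (!toy_can_migrate(obj, dst))
		return -EOPNOTSUPP;

	obj->region = dst;
	return 0;
}
```

With this model an lmem-only object that is kernel internal migrates to
smem successfully, while the same object marked as a userspace object gets
-EOPNOTSUPP and stays where it is.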

>
> I'm also not sure we have a testcase for that case either ...
>
> I tried to read some code here, but got a bit lost. Ideas?
> -Daniel
>
> > + if (IS_ERR(obj))
> > + obj = i915_gem_object_create_shmem(i915, PAGE_SIZE);
> >   if (IS_ERR(obj))
> >   goto out_ret;
> >
> > --
> > 2.31.1
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH v4 02/18] drm/sched: Barriers are needed for entity->last_scheduled

2021-07-13 Thread Christian König

On 13.07.21 at 11:10, Daniel Vetter wrote:

On Tue, Jul 13, 2021 at 9:25 AM Christian König
 wrote:

On 13.07.21 at 08:50, Daniel Vetter wrote:

On Tue, Jul 13, 2021 at 8:35 AM Christian König
 wrote:

On 12.07.21 at 19:53, Daniel Vetter wrote:

It might be good enough on x86 with just READ_ONCE, but the write side
should then at least be WRITE_ONCE because x86 has total store order.

It's definitely not enough on arm.

Fix this properly, which means
- explain the need for the barrier in both places
- point at the other side in each comment

Also pull out the !sched_list case as the first check, so that the
code flow is clearer.

While at it sprinkle some comments around because it was very
non-obvious to me what's actually going on here and why.

Note that we really need full barriers here, at first I thought
store-release and load-acquire on ->last_scheduled would be enough,
but we actually require ordering between that and the queue state.

v2: Put smp_rmb() in the right place and fix up comment (Andrey)

Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Steven Price 
Cc: Daniel Vetter 
Cc: Andrey Grodzovsky 
Cc: Lee Jones 
Cc: Boris Brezillon 
---
drivers/gpu/drm/scheduler/sched_entity.c | 27 ++--
1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index f7347c284886..89e3f6eaf519 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -439,8 +439,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
drm_sched_entity *entity)
dma_fence_set_error(&sched_job->s_fence->finished, -ECANCELED);

dma_fence_put(entity->last_scheduled);
+
entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished);

+ /*
+  * If the queue is empty we allow drm_sched_entity_select_rq() to
+  * locklessly access ->last_scheduled. This only works if we set the
+  * pointer before we dequeue and if we have a write barrier here.
+  */
+ smp_wmb();
+

Again, conceptually those barriers should be part of the spsc_queue
container and not externally.

That would be an extremely unusual api. Let's assume that your queue is
very dumb, and protected by a simple lock. That's about the maximum
any user could expect.

But then you still need barriers here, because linux locks (spinlock,
mutex) are defined to be one-way barriers: Stuff that's inside is
guaranteed to be done inside, but stuff outside of the locked region
can leak in. They're load-acquire/store-release barriers. So not good
enough.

You really need to have barriers here, and they really all need to be
documented properly. And yes that's a shit-ton of work in drm/sched,
because it's full of yolo lockless stuff.

The other case you could make is that this works like a wakeup queue,
or similar. The rules there are:
- wake_up (i.e. pushing something into the queue) is a store-release barrier
- being woken up (i.e. popping an entry) is a load-acquire barrier
Which is obviously needed because otherwise you don't have coherency
for the data queued up. And again not the barriers you're looking for
here.

Exactly that was the idea, yes.


Either way, we'd still need the comments, because it's still lockless
trickery, and every single one of those needs to have a comment on both
sides to explain what's going on.

Essentially replace spsc_queue with an llist underneath, and that's
the amount of barriers a data structure should provide. Anything else
is asking your datastructure to paper over bugs in your users.

This is similar to how atomic_t is by default completely unordered,
and users need to add barriers as needed, with comments.

My main problem is as always that kernel atomics work differently than
userspace atomics.


I think this is all to make sure people don't just write lockless algorithms
because it's a cool idea, but are forced to think this all through.
Which seems to not have happened very consistently for drm/sched, so I
guess needs to be fixed.

Well at least initially that was all perfectly thought through. The
problem is nobody is really maintaining that stuff.


I'm definitely not going to hide all that by making the spsc_queue
stuff provide random unjustified barriers just because that would
paper over drm/sched bugs. We need to fix the actual bugs, and
preferably all of them. I've found a few, but I wasn't involved in
drm/sched thus far, so best I can do is discover them as we go.

I don't think that those are random unjustified barriers at all and it
sounds like you didn't grasp what I said here.

See the spsc queue must have the following semantics:

1. When you pop a job all changes made before you push the job must be
visible.

These are the standard barriers that wake-up queues also have, it's just
store-release+load-acquire.


2. When the queue becomes empty all the changes made before you pop the
last job must be visible.
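
Point 1 above (store-release on push paired with load-acquire on pop) can
be sketched in userspace with C11 atomics. This is only an illustration of
that pairing on a hypothetical single-slot mailbox; the names job, slot,
slot_push and slot_pop are made up, it is not the kernel's spsc_queue, and
it says nothing about the ordering against the queue-empty state from
point 2, which is what the full barriers in the patch are about.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct job {
	int payload;               /* data the producer fills in before pushing */
};

struct slot {
	_Atomic(struct job *) item; /* NULL means the slot is empty */
};

/*
 * Producer side. The release store orders every earlier write to *j
 * before the pointer becomes visible to the consumer.
 */
static void slot_push(struct slot *s, struct job *j)
{
	assert(j != NULL); /* NULL is reserved for "empty" */
	atomic_store_explicit(&s->item, j, memory_order_release);
}

/*
 * Consumer side. The acquire exchange pairs with the release store, so a
 * non-NULL result guarantees the producer's writes to the job are visible.
 * Returns NULL if nothing was queued.
 */
static struct job *slot_pop(struct slot *s)
{
	return atomic_exchange_explicit(&s->item, NULL, memory_order_acquire);
}
```

A consumer that gets a non-NULL pointer back from slot_pop() may read
payload without further synchronisation. Any state kept outside the slot,
such as a separate last-scheduled pointer, is not ordered by this pairing
alone.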

Re: [Intel-gfx] [PATCH 2/2] drm/i915/gem: Migrate to system at dma-buf attach time (v5)

2021-07-13 Thread Daniel Vetter
On Tue, Jul 13, 2021 at 04:06:13PM +0100, Matthew Auld wrote:
> On Tue, 13 Jul 2021 at 15:44, Daniel Vetter  wrote:
> >
> > On Mon, Jul 12, 2021 at 06:12:34PM -0500, Jason Ekstrand wrote:
> > > From: Thomas Hellström 
> > >
> > > Until we support p2p dma or as a complement to that, migrate data
> > > to system memory at dma-buf attach time if possible.
> > >
> > > v2:
> > > - Rebase on dynamic exporter. Update the igt_dmabuf_import_same_driver
> > >   selftest to migrate if we are LMEM capable.
> > > v3:
> > > - Migrate also in the pin() callback.
> > > v4:
> > > - Migrate in attach
> > > v5: (jason)
> > > - Lock around the migration
> > >
> > > Signed-off-by: Thomas Hellström 
> > > Signed-off-by: Michael J. Ruhl 
> > > Reported-by: kernel test robot 
> > > Signed-off-by: Jason Ekstrand 
> > > Reviewed-by: Jason Ekstrand 
> > > ---
> > >  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c| 25 ++-
> > >  .../drm/i915/gem/selftests/i915_gem_dmabuf.c  |  4 ++-
> > >  2 files changed, 27 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
> > > b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> > > index 9a655f69a0671..3163f00554476 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> > > @@ -170,8 +170,31 @@ static int i915_gem_dmabuf_attach(struct dma_buf 
> > > *dmabuf,
> > > struct dma_buf_attachment *attach)
> > >  {
> > >   struct drm_i915_gem_object *obj = dma_buf_to_obj(dmabuf);
> > > + struct i915_gem_ww_ctx ww;
> > > + int err;
> > > +
> > > + for_i915_gem_ww(&ww, err, true) {
> > > + err = i915_gem_object_lock(obj, &ww);
> > > + if (err)
> > > + continue;
> > > +
> > > + if (!i915_gem_object_can_migrate(obj, INTEL_REGION_SMEM)) {
> > > + err = -EOPNOTSUPP;
> > > + continue;
> > > + }
> > > +
> > > + err = i915_gem_object_migrate(obj, &ww, INTEL_REGION_SMEM);
> > > + if (err)
> > > + continue;
> > >
> > > - return i915_gem_object_pin_pages_unlocked(obj);
> > > + err = i915_gem_object_wait_migration(obj, 0);
> > > + if (err)
> > > + continue;
> > > +
> > > + err = i915_gem_object_pin_pages(obj);
> > > + }
> > > +
> > > + return err;
> > >  }
> > >
> > >  static void i915_gem_dmabuf_detach(struct dma_buf *dmabuf,
> > > diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c 
> > > b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
> > > index 3dc0f8b3cdab0..4f7e77b1c0152 100644
> > > --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
> > > +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
> > > @@ -106,7 +106,9 @@ static int igt_dmabuf_import_same_driver(void *arg)
> > >   int err;
> > >
> > >   force_different_devices = true;
> > > - obj = i915_gem_object_create_shmem(i915, PAGE_SIZE);
> > > + obj = i915_gem_object_create_lmem(i915, PAGE_SIZE, 0);
> >
> > I'm wondering (and couldn't answer) whether this creates an lmem+smem
> > buffer, since if we create an lmem-only buffer then the migration above
> > should fail.
> 
> It's lmem-only, but it's also a kernel internal object, so the
> migration path will still happily migrate it if asked. On the other
> hand if it's a userspace object then we always have to respect the
> placements.
> 
> I think for now the only usecase for that is in the selftests.

Yeah I've read the kerneldoc, it's all nicely documented but feels a bit
dangerous. What I proposed on irc:
- i915_gem_object_migrate does the placement check, i.e. as strict as
  can_migrate.
- A new __i915_gem_object_migrate is for selftests that do special stuff.
- In the import selftest we check that lmem-only fails (because we can't
  pin it into smem) for a non-dynamic importer, but lmem+smem works and
  gets migrated.
- Once we have dynamic dma-buf for p2p pci, then we'll have another
  selftest which checks that things work for lmem only if and only if the
  importer is dynamic and has set the allow_p2p flag.

We could also add the can_migrate check everywhere (including
dma_buf->attach), but that feels like the less safe api.
-Daniel
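
The first two bullets of the proposal above could look roughly like the
following userspace sketch: a strict entry point that always performs the
placement check, wrapping an unchecked helper reserved for selftests. All
names are hypothetical (sketch_migrate_unchecked stands in for the proposed
__i915_gem_object_migrate), and the struct is a toy model rather than the
real object type.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum sketch_region { SKETCH_SMEM = 0, SKETCH_LMEM = 1 };

struct sketch_object {
	unsigned int placements;   /* bitmask of regions the object may occupy */
	enum sketch_region region; /* current location of the backing store */
};

static bool sketch_can_migrate(const struct sketch_object *obj,
			       enum sketch_region dst)
{
	return (obj->placements & (1u << dst)) != 0;
}

/* Unchecked variant: moves the object no matter what. Selftests only. */
static int sketch_migrate_unchecked(struct sketch_object *obj,
				    enum sketch_region dst)
{
	assert(obj);
	obj->region = dst;
	return 0;
}

/* Strict variant: as strict as can_migrate, so callers cannot forget it. */
static int sketch_migrate(struct sketch_object *obj, enum sketch_region dst)
{
	if (!sketch_can_migrate(obj, dst))
		return -EOPNOTSUPP;

	return sketch_migrate_unchecked(obj, dst);
}
```

Under this split an lmem-only object fails sketch_migrate() to smem with
-EOPNOTSUPP, an lmem+smem object succeeds, and only code that explicitly
calls the unchecked helper can bypass the placement list.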


> 
> >
> > I'm also not sure we have a testcase for that case either ...
> >
> > I tried to read some code here, but got a bit lost. Ideas?
> > -Daniel
> >
> > > + if (IS_ERR(obj))
> > > + obj = i915_gem_object_create_shmem(i915, PAGE_SIZE);
> > >   if (IS_ERR(obj))
> > >   goto out_ret;
> > >
> > > --
> > > 2.31.1
> > >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Intel-gfx] [PATCH 1/5] drm/i915: document caching related bits

2021-07-13 Thread Ville Syrjälä
On Tue, Jul 13, 2021 at 11:45:50AM +0100, Matthew Auld wrote:
> + /**
> +  * @cache_coherent:
> +  *
> +  * Track whether the pages are coherent with the GPU if reading or
> +  * writing through the CPU cache.
> +  *
> +  * This largely depends on the @cache_level, for example if the object
> +  * is marked as I915_CACHE_LLC, then GPU access is coherent for both
> +  * reads and writes through the CPU cache.
> +  *
> +  * Note that on platforms with shared-LLC support(HAS_LLC) reads through
> +  * the CPU cache are always coherent, regardless of the @cache_level. On
> +  * snooping based platforms this is not the case, unless the full
> +  * I915_CACHE_LLC or similar setting is used.
> +  *
> +  * As a result of this we need to track coherency separately for reads
> +  * and writes, in order to avoid superfluous flushing on shared-LLC
> +  * platforms, for reads.
> +  *
> +  * I915_BO_CACHE_COHERENT_FOR_READ:
> +  *
> +  * When reading through the CPU cache, the GPU is still coherent. Note
> +  * that no data has actually been modified here, so it might seem
> +  * strange that we care about this.
> +  *
> +  * As an example, if some object is mapped on the CPU with write-back
> +  * caching, and we read some page, then the cache likely now contains
> +  * the data from that read. At this point the cache and main memory
> +  * match up, so all good. But next the GPU needs to write some data to
> +  * that same page. Now if the @cache_level is I915_CACHE_NONE and the
> +  * the platform doesn't have the shared-LLC, then the GPU will
> +  * effectively skip invalidating the cache(or however that works
> +  * internally) when writing the new value.  This is really bad since the
> +  * GPU has just written some new data to main memory, but the CPU cache
> +  * is still valid and now contains stale data. As a result the next time
> +  * we do a cached read with the CPU, we are rewarded with stale data.
> +  * Likewise if the cache is later flushed, we might be rewarded with
> +  * overwriting main memory with stale data.
> +  *
> +  * I915_BO_CACHE_COHERENT_FOR_WRITE:
> +  *
> +  * When writing through the CPU cache, the GPU is still coherent. Note
> +  * that this also implies I915_BO_CACHE_COHERENT_FOR_READ.
> +  *
> +  * This is never set when I915_CACHE_NONE is used for @cache_level,
> +  * where instead we have to manually flush the caches after writing
> +  * through the CPU cache. For other cache levels this should be set and
> +  * the object is therefore considered coherent for both reads and writes
> +  * through the CPU cache.

I don't remember why we have this read vs. write split and this new
documentation doesn't seem to really explain it either.

Is it for optimizing some display related case where we can omit the
invalidates but still have to do the writeback to keep the display
engine happy?

-- 
Ville Syrjälä
Intel


Re: [Intel-gfx] [PATCH 1/5] drm/i915: document caching related bits

2021-07-13 Thread Matthew Auld
On Tue, 13 Jul 2021 at 16:55, Ville Syrjälä
 wrote:
>
> On Tue, Jul 13, 2021 at 11:45:50AM +0100, Matthew Auld wrote:
> > + /**
> > +  * @cache_coherent:
> > +  *
> > +  * Track whether the pages are coherent with the GPU if reading or
> > +  * writing through the CPU cache.
> > +  *
> > +  * This largely depends on the @cache_level, for example if the object
> > +  * is marked as I915_CACHE_LLC, then GPU access is coherent for both
> > +  * reads and writes through the CPU cache.
> > +  *
> > +  * Note that on platforms with shared-LLC support(HAS_LLC) reads 
> > through
> > +  * the CPU cache are always coherent, regardless of the @cache_level. 
> > On
> > +  * snooping based platforms this is not the case, unless the full
> > +  * I915_CACHE_LLC or similar setting is used.
> > +  *
> > +  * As a result of this we need to track coherency separately for reads
> > +  * and writes, in order to avoid superfluous flushing on shared-LLC
> > +  * platforms, for reads.
> > +  *
> > +  * I915_BO_CACHE_COHERENT_FOR_READ:
> > +  *
> > +  * When reading through the CPU cache, the GPU is still coherent. Note
> > +  * that no data has actually been modified here, so it might seem
> > +  * strange that we care about this.
> > +  *
> > +  * As an example, if some object is mapped on the CPU with write-back
> > +  * caching, and we read some page, then the cache likely now contains
> > +  * the data from that read. At this point the cache and main memory
> > +  * match up, so all good. But next the GPU needs to write some data to
> > +  * that same page. Now if the @cache_level is I915_CACHE_NONE and the
> > +  * the platform doesn't have the shared-LLC, then the GPU will
> > +  * effectively skip invalidating the cache(or however that works
> > +  * internally) when writing the new value.  This is really bad since 
> > the
> > +  * GPU has just written some new data to main memory, but the CPU 
> > cache
> > +  * is still valid and now contains stale data. As a result the next 
> > time
> > +  * we do a cached read with the CPU, we are rewarded with stale data.
> > +  * Likewise if the cache is later flushed, we might be rewarded with
> > +  * overwriting main memory with stale data.
> > +  *
> > +  * I915_BO_CACHE_COHERENT_FOR_WRITE:
> > +  *
> > +  * When writing through the CPU cache, the GPU is still coherent. Note
> > +  * that this also implies I915_BO_CACHE_COHERENT_FOR_READ.
> > +  *
> > +  * This is never set when I915_CACHE_NONE is used for @cache_level,
> > +  * where instead we have to manually flush the caches after writing
> > +  * through the CPU cache. For other cache levels this should be set 
> > and
> > +  * the object is therefore considered coherent for both reads and 
> > writes
> > +  * through the CPU cache.
>
> I don't remember why we have this read vs. write split and this new
> documentation doesn't seem to really explain it either.

Hmm, I attempted to explain that earlier:

* Note that on platforms with shared-LLC support(HAS_LLC) reads through
* the CPU cache are always coherent, regardless of the @cache_level. On
* snooping based platforms this is not the case, unless the full
* I915_CACHE_LLC or similar setting is used.
*
* As a result of this we need to track coherency separately for reads
* and writes, in order to avoid superfluous flushing on shared-LLC
* platforms, for reads.

So AFAIK it's just because shared-LLC can be coherent for reads while
not being coherent for writes (CACHE_NONE). Being able to track each
separately is needed to avoid unnecessary flushing for the read case,
i.e. a simple boolean for coherent vs. non-coherent is not enough.

I can try to reword things to make that clearer.
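
To make the read vs. write split concrete, here is a rough sketch under
stated assumptions. The names (sketch_cache_level, SKETCH_COHERENT_*,
sketch_cache_coherent and the two helpers) are hypothetical and only
mirror the rules in the kerneldoc quoted above; this is not the driver's
actual code.

```c
#include <stdbool.h>

/* Hypothetical cache levels; only the two cases discussed in the thread. */
enum sketch_cache_level {
	SKETCH_CACHE_NONE,
	SKETCH_CACHE_LLC,
};

#define SKETCH_COHERENT_FOR_READ  (1u << 0)
#define SKETCH_COHERENT_FOR_WRITE (1u << 1)

/*
 * Derive the coherency bits from the cache level and from whether the
 * platform has a shared LLC, following the documented rules:
 *  - any cached level: coherent for reads and writes
 *  - CACHE_NONE on shared-LLC: coherent for reads only
 *  - CACHE_NONE on a snooping platform: not coherent at all
 */
static unsigned int sketch_cache_coherent(enum sketch_cache_level level,
					  bool has_llc)
{
	if (level != SKETCH_CACHE_NONE)
		return SKETCH_COHERENT_FOR_READ | SKETCH_COHERENT_FOR_WRITE;
	if (has_llc)
		return SKETCH_COHERENT_FOR_READ;
	return 0;
}

/* The CPU cache must be invalidated before a cached CPU read. */
static bool sketch_flush_before_cpu_read(unsigned int coherent)
{
	return !(coherent & SKETCH_COHERENT_FOR_READ);
}

/* The CPU cache must be written back after a cached CPU write. */
static bool sketch_flush_after_cpu_write(unsigned int coherent)
{
	return !(coherent & SKETCH_COHERENT_FOR_WRITE);
}
```

The CACHE_NONE plus shared-LLC case is the one a single boolean cannot
express: the read-side flush can be skipped while the write-side flush
is still required.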

>
> Is it for optimizing some display related case where we can omit the
> invalidates but still have to do the writeback to keep the display
> engine happy?
>
> --
> Ville Syrjälä
> Intel


[Intel-gfx] [PATCH v3 08/12] drm/i915/jsl_ehl: Use revid->stepping tables

2021-07-13 Thread Matt Roper
Switch JSL/EHL to use a revid->stepping table as we're trying to do on
all platforms going forward.

v2:
 - Use COMMON_STEPPING().  (Anusha)

Bspec: 29153
Cc: Anusha Srivatsa 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/display/intel_dpll_mgr.c | 2 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 2 +-
 drivers/gpu/drm/i915/i915_drv.h   | 9 -
 drivers/gpu/drm/i915/intel_step.c | 8 
 4 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c 
b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
index 882bfd499e55..dfc31b682848 100644
--- a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
+++ b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
@@ -2674,7 +2674,7 @@ static bool
 ehl_combo_pll_div_frac_wa_needed(struct drm_i915_private *i915)
 {
return ((IS_PLATFORM(i915, INTEL_ELKHARTLAKE) &&
-IS_JSL_EHL_REVID(i915, EHL_REVID_B0, REVID_FOREVER)) ||
+IS_JSL_EHL_DISPLAY_STEP(i915, STEP_B0, STEP_FOREVER)) ||
 IS_TIGERLAKE(i915) || IS_ALDERLAKE_P(i915)) &&
 i915->dpll.ref_clks.nssc == 38400;
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index e2d8acb8c1c9..4c0c15bbdac2 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1043,7 +1043,7 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, 
struct i915_wa_list *wal)
 
/* Wa_1607087056:icl,ehl,jsl */
if (IS_ICELAKE(i915) ||
-   IS_JSL_EHL_REVID(i915, EHL_REVID_A0, EHL_REVID_A0))
+   IS_JSL_EHL_GT_STEP(i915, STEP_A0, STEP_A0))
wa_write_or(wal,
SLICE_UNIT_LEVEL_CLKGATE,
L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d4f705f06c73..b3ce2b73a143 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1532,11 +1532,10 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define IS_ICL_GT_STEP(p, since, until) \
(IS_ICELAKE(p) && IS_GT_STEP(p, since, until))
 
-#define EHL_REVID_A0           0x0
-#define EHL_REVID_B0           0x1
-
-#define IS_JSL_EHL_REVID(p, since, until) \
-   (IS_JSL_EHL(p) && IS_REVID(p, since, until))
+#define IS_JSL_EHL_GT_STEP(p, since, until) \
+   (IS_JSL_EHL(p) && IS_GT_STEP(p, since, until))
+#define IS_JSL_EHL_DISPLAY_STEP(p, since, until) \
+   (IS_JSL_EHL(p) && IS_DISPLAY_STEP(p, since, until))
 
 #define IS_TGL_DISPLAY_STEP(__i915, since, until) \
(IS_TIGERLAKE(__i915) && \
diff --git a/drivers/gpu/drm/i915/intel_step.c 
b/drivers/gpu/drm/i915/intel_step.c
index f8be464d1179..868606f8139f 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -57,6 +57,11 @@ static const struct intel_step_info icl_revids[] = {
[7] = { COMMON_STEPPING(D0) },
 };
 
+static const struct intel_step_info jsl_ehl_revids[] = {
+   [0] = { COMMON_STEPPING(A0) },
+   [1] = { COMMON_STEPPING(B0) },
+};
+
 static const struct intel_step_info tgl_uy_revids[] = {
[0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
[1] = { .gt_step = STEP_B0, .display_step = STEP_C0 },
@@ -104,6 +109,9 @@ void intel_step_init(struct drm_i915_private *i915)
} else if (IS_TIGERLAKE(i915)) {
revids = tgl_revids;
size = ARRAY_SIZE(tgl_revids);
+   } else if (IS_JSL_EHL(i915)) {
+   revids = jsl_ehl_revids;
+   size = ARRAY_SIZE(jsl_ehl_revids);
} else if (IS_ICELAKE(i915)) {
revids = icl_revids;
size = ARRAY_SIZE(icl_revids);
-- 
2.25.4
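
For readers unfamiliar with the revid->stepping approach, a simplified
sketch follows. Everything prefixed with sketch_ or SKETCH_ is
hypothetical; in particular, mapping an unknown revision ID to a
"future" stepping is an assumption made for this example and not
necessarily what intel_step_init() does. The range check is inclusive
on both ends, matching how the patch uses STEP_A0, STEP_A0 to mean
"A0 only".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stepping values, ordered so that comparisons work. */
enum sketch_step {
	SKETCH_STEP_NONE = 0,
	SKETCH_STEP_A0,
	SKETCH_STEP_B0,
	SKETCH_STEP_C0,
	SKETCH_STEP_FUTURE,
	SKETCH_STEP_FOREVER,
};

struct sketch_step_info {
	enum sketch_step gt_step;
	enum sketch_step display_step;
};

/* Same stepping for GT and display, in the spirit of COMMON_STEPPING(). */
#define SKETCH_COMMON_STEPPING(x) \
	.gt_step = SKETCH_STEP_##x, .display_step = SKETCH_STEP_##x

/* Table indexed by PCI revision ID, like jsl_ehl_revids[] in the patch. */
static const struct sketch_step_info sketch_jsl_ehl_revids[] = {
	[0] = { SKETCH_COMMON_STEPPING(A0) },
	[1] = { SKETCH_COMMON_STEPPING(B0) },
};

/* Look up a revid; unknown IDs are treated as a future stepping here. */
static struct sketch_step_info
sketch_step_lookup(const struct sketch_step_info *tbl, size_t size,
		   unsigned int revid)
{
	if (revid < size && tbl[revid].gt_step != SKETCH_STEP_NONE)
		return tbl[revid];

	return (struct sketch_step_info){
		.gt_step = SKETCH_STEP_FUTURE,
		.display_step = SKETCH_STEP_FUTURE,
	};
}

/* Inclusive range check, analogous to IS_GT_STEP(p, since, until). */
static bool sketch_step_in_range(enum sketch_step step,
				 enum sketch_step since,
				 enum sketch_step until)
{
	return step >= since && step <= until;
}
```

The point of the table is that workaround checks compare symbolic
steppings instead of raw revision IDs, so a platform whose GT and
display steppings diverge only needs a different table entry.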



[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/fb-helper: Try to protect cleanup against delayed setup

2021-07-13 Thread Patchwork
== Series Details ==

Series: drm/fb-helper: Try to protect cleanup against delayed setup
URL   : https://patchwork.freedesktop.org/series/92478/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
b0ba74e0d5e3 drm/fb-helper: Try to protect cleanup against delayed setup
-:18: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#18: 
References: 
https://intel-gfx-ci.01.org/tree/linux-next/next-20201215/fi-byt-j1900/igt@i915_pm_...@module-reload.html

-:62: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 2 warnings, 0 checks, 28 lines checked




Re: [Intel-gfx] [PATCH v4 02/18] drm/sched: Barriers are needed for entity->last_scheduled

2021-07-13 Thread Daniel Vetter
On Tue, Jul 13, 2021 at 6:11 PM Andrey Grodzovsky
 wrote:
> On 2021-07-13 5:10 a.m., Daniel Vetter wrote:
> > On Tue, Jul 13, 2021 at 9:25 AM Christian König
> >  wrote:
> >> Am 13.07.21 um 08:50 schrieb Daniel Vetter:
> >>> On Tue, Jul 13, 2021 at 8:35 AM Christian König
> >>>  wrote:
>  Am 12.07.21 um 19:53 schrieb Daniel Vetter:
> > It might be good enough on x86 with just READ_ONCE, but the write side
> > should then at least be WRITE_ONCE because x86 has total store order.
> >
> > It's definitely not enough on arm.
> >
> > Fix this properly, which means
> > - explain the need for the barrier in both places
> > - point at the other side in each comment
> >
> > Also pull out the !sched_list case as the first check, so that the
> > code flow is clearer.
> >
> > While at it sprinkle some comments around because it was very
> > non-obvious to me what's actually going on here and why.
> >
> > Note that we really need full barriers here, at first I thought
> > store-release and load-acquire on ->last_scheduled would be enough,
> > but we actually require ordering between that and the queue state.
> >
> > v2: Put smp_rmb() in the right place and fix up comment (Andrey)
> >
> > Signed-off-by: Daniel Vetter 
> > Cc: "Christian König" 
> > Cc: Steven Price 
> > Cc: Daniel Vetter 
> > Cc: Andrey Grodzovsky 
> > Cc: Lee Jones 
> > Cc: Boris Brezillon 
> > ---
> > drivers/gpu/drm/scheduler/sched_entity.c | 27 
> > ++--
> > 1 file changed, 25 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
> > b/drivers/gpu/drm/scheduler/sched_entity.c
> > index f7347c284886..89e3f6eaf519 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -439,8 +439,16 @@ struct drm_sched_job 
> > *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
> > dma_fence_set_error(&sched_job->s_fence->finished, 
> > -ECANCELED);
> >
> > dma_fence_put(entity->last_scheduled);
> > +
> > entity->last_scheduled = 
> > dma_fence_get(&sched_job->s_fence->finished);
> >
> > + /*
> > +  * If the queue is empty we allow drm_sched_entity_select_rq() to
> > +  * locklessly access ->last_scheduled. This only works if we set 
> > the
> > +  * pointer before we dequeue and if we a write barrier here.
> > +  */
> > + smp_wmb();
> > +
>  Again, conceptual those barriers should be part of the spsc_queue
>  container and not externally.
> >>> That would be extremely unusual api. Let's assume that your queue is
> >>> very dumb, and protected by a simple lock. That's about the maximum
> >>> any user could expect.
> >>>
> >>> But then you still need barriers here, because linux locks (spinlock,
> >>> mutex) are defined to be one-way barriers: Stuff that's inside is
> >>> guaranteed to be done inside, but stuff outside of the locked region
> >>> can leak in. They're load-acquire/store-release barriers. So not good
> >>> enough.
> >>>
> >>> You really need to have barriers here, and they really all need to be
> >>> documented properly. And yes that's a shit-ton of work in drm/sched,
> >>> because it's full of yolo lockless stuff.
> >>>
> >>> The other case you could make is that this works like a wakeup queue,
> >>> or similar. The rules there are:
> >>> - wake_up (i.e. pushing something into the queue) is a store-release
> >>>   barrier
> >>> - the woken-up side (i.e. popping an entry) is a load-acquire barrier
> >>> Which is obviously needed because otherwise you don't have coherency
> >>> for the data queued up. And again not the barriers you're looking for
> >>> here.
> >> Exactly that was the idea, yes.
> >>
> >>> Either way, we'd still need the comments, because it's still lockless
> >>> trickery, and every single one of that needs to have a comment on both
> >>> sides to explain what's going on.
> >>>
> >>> Essentially replace spsc_queue with an llist underneath, and that's
> >>> the amount of barriers a data structure should provide. Anything else
> >>> is asking your datastructure to paper over bugs in your users.
> >>>
> >>> This is similar to how atomic_t is by default completely unordered,
> >>> and users need to add barriers as needed, with comments.
> >> My main problem is as always that kernel atomics work different than
> >> userspace atomics.
> >>
> >>> I think this is all to make sure people don't just write lockless 
> >>> algorithms
> >>> because it's a cool idea, but are forced to think this all through.
> >>> Which seems to not have happened very consistently for drm/sched, so I
> >>> guess needs to be fixed.
> >> Well at least initially that was all perfectly thought through. The
> >> problem is nobody is really maintaining that st

Re: [Intel-gfx] [PATCH 1/5] drm/i915: document caching related bits

2021-07-13 Thread Daniel Vetter
On Tue, Jul 13, 2021 at 6:14 PM Matthew Auld
 wrote:
> On Tue, 13 Jul 2021 at 16:55, Ville Syrjälä
>  wrote:
> >
> > On Tue, Jul 13, 2021 at 11:45:50AM +0100, Matthew Auld wrote:
> > > + /**
> > > +  * @cache_coherent:
> > > +  *
> > > +  * Track whether the pages are coherent with the GPU if reading or
> > > +  * writing through the CPU cache.
> > > +  *
> > > +  * This largely depends on the @cache_level, for example if the 
> > > object
> > > +  * is marked as I915_CACHE_LLC, then GPU access is coherent for both
> > > +  * reads and writes through the CPU cache.
> > > +  *
> > > +  * Note that on platforms with shared-LLC support(HAS_LLC) reads 
> > > through
> > > +  * the CPU cache are always coherent, regardless of the 
> > > @cache_level. On
> > > +  * snooping based platforms this is not the case, unless the full
> > > +  * I915_CACHE_LLC or similar setting is used.
> > > +  *
> > > +  * As a result of this we need to track coherency separately for 
> > > reads
> > > +  * and writes, in order to avoid superfluous flushing on shared-LLC
> > > +  * platforms, for reads.
> > > +  *
> > > +  * I915_BO_CACHE_COHERENT_FOR_READ:
> > > +  *
> > > +  * When reading through the CPU cache, the GPU is still coherent. 
> > > Note
> > > +  * that no data has actually been modified here, so it might seem
> > > +  * strange that we care about this.
> > > +  *
> > > +  * As an example, if some object is mapped on the CPU with 
> > > write-back
> > > +  * caching, and we read some page, then the cache likely now 
> > > contains
> > > +  * the data from that read. At this point the cache and main memory
> > > +  * match up, so all good. But next the GPU needs to write some data 
> > > to
> > > +  * that same page. Now if the @cache_level is I915_CACHE_NONE and 
> > > the
> > > +  * the platform doesn't have the shared-LLC, then the GPU will
> > > +  * effectively skip invalidating the cache(or however that works
> > > +  * internally) when writing the new value.  This is really bad 
> > > since the
> > > +  * GPU has just written some new data to main memory, but the CPU 
> > > cache
> > > +  * is still valid and now contains stale data. As a result the next 
> > > time
> > > +  * we do a cached read with the CPU, we are rewarded with stale 
> > > data.
> > > +  * Likewise if the cache is later flushed, we might be rewarded with
> > > +  * overwriting main memory with stale data.
> > > +  *
> > > +  * I915_BO_CACHE_COHERENT_FOR_WRITE:
> > > +  *
> > > +  * When writing through the CPU cache, the GPU is still coherent. 
> > > Note
> > > +  * that this also implies I915_BO_CACHE_COHERENT_FOR_READ.
> > > +  *
> > > +  * This is never set when I915_CACHE_NONE is used for @cache_level,
> > > +  * where instead we have to manually flush the caches after writing
> > > +  * through the CPU cache. For other cache levels this should be set 
> > > and
> > > +  * the object is therefore considered coherent for both reads and 
> > > writes
> > > +  * through the CPU cache.
> >
> > I don't remember why we have this read vs. write split and this new
> > documentation doesn't seem to really explain it either.
>
> Hmm, I attempted to explain that earlier:
>
> * Note that on platforms with shared-LLC support(HAS_LLC) reads through
> * the CPU cache are always coherent, regardless of the @cache_level. On
> * snooping based platforms this is not the case, unless the full
> * I915_CACHE_LLC or similar setting is used.
> *
> * As a result of this we need to track coherency separately for reads
> * and writes, in order to avoid superfluous flushing on shared-LLC
> * platforms, for reads.
>
> So AFAIK it's just because shared-LLC can be coherent for reads, while
> also not being coherent for writes(CACHE_NONE), so being able to track
> each separately is kind of needed to avoid unnecessary flushing for
> the read cases i.e simple boolean for coherent vs non-coherent is not
> enough.
>
> I can try to reword things to make that more clear.

Maybe highlight the security aspect a bit more: When reads are always
coherent, we don't have to force the clflush. If reads are not
coherent we must ensure that the clflush has finished before userspace
can get at the backing storage, like writing ptes and similar things.
Writes otoh can only result in userspace eating cacheline corruption
if it races against the kernel (by e.g. trying to predict where we'll
bind a buffer and issuing gpu access to that location before the
buffer is actually bound from some other engine in parallel with an
execbuf that binds the buffer).

Atm we don't do a great job with that, but that's something that I
think is getting looked into.
-Daniel

> > Is it for optimizing some display related case where we can omit the
> > invalidates but still have to do the writeback to keep the disp

Re: [Intel-gfx] [PATCH v4 14/18] drm/msm: Don't break exclusive fence ordering

2021-07-13 Thread Rob Clark
On Mon, Jul 12, 2021 at 1:02 PM Daniel Vetter  wrote:
>
> There's only one exclusive slot, and we must not break the ordering.
>
> Adding a new exclusive fence drops all previous fences from the
> dma_resv. To avoid violating the signalling order we err on the side of
> over-synchronizing by waiting for the existing fences, even if
> userspace asked us to ignore them.
>
> A better fix would be to us a dma_fence_chain or _array like e.g.
> amdgpu now uses, but
> - msm has a synchronous dma_fence_wait for anything from another
>   context, so doesn't seem to care much,
> - and it probably makes sense to lift this into dma-resv.c code as a
>   proper concept, so that drivers don't have to hack up their own
>   solution each on their own.
>
> v2: Improve commit message per Lucas' suggestion.
>
> Cc: Lucas Stach 
> Signed-off-by: Daniel Vetter 
> Cc: Rob Clark 
> Cc: Sean Paul 
> Cc: linux-arm-...@vger.kernel.org
> Cc: freedr...@lists.freedesktop.org
> ---
>  drivers/gpu/drm/msm/msm_gem_submit.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
> b/drivers/gpu/drm/msm/msm_gem_submit.c
> index b71da71a3dd8..edd0051d849f 100644
> --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> @@ -306,7 +306,8 @@ static int submit_fence_sync(struct msm_gem_submit 
> *submit, bool no_implicit)
> return ret;
> }
>
> -   if (no_implicit)
> +   /* exclusive fences must be ordered */
> +   if (no_implicit && !write)
> continue;

In practice, modern userspace (the kind that is more likely to set the
no-implicit flag on every submit) also sets MSM_SUBMIT_BO_WRITE on
every bo to shave some CPU overhead, so I suppose this would not
really hurt anything.

Do you know if this is covered in any piglit/etc test?

BR,
-R
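
As a small illustration of the check in the hunk above, the decision can
be written as a standalone predicate. sketch_must_sync is a hypothetical
helper name used only for this example; the two parameters correspond to
the no_implicit submit flag and the per-bo write flag from the patch.

```c
#include <stdbool.h>

/*
 * Returns true when the submit still has to synchronize against the
 * fences already attached to the buffer.
 *
 * Only a reader that asked for no implicit sync may skip the wait. A
 * writer installs a new exclusive fence, which replaces the previous
 * fences, so it has to wait for them to keep the ordering intact.
 */
static bool sketch_must_sync(bool no_implicit, bool write)
{
	if (no_implicit && !write)
		return false;

	return true;
}
```

With the pre-patch condition (no_implicit alone) a writer that set the
flag would also have skipped the wait, which is the case the patch
closes.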

>
> ret = msm_gem_sync_object(&msm_obj->base, submit->ring->fctx,
> --
> 2.32.0
>


Re: [Intel-gfx] [PATCH v3 08/12] drm/i915/jsl_ehl: Use revid->stepping tables

2021-07-13 Thread Srivatsa, Anusha



> -Original Message-
> From: Roper, Matthew D 
> Sent: Tuesday, July 13, 2021 9:15 AM
> To: intel-gfx@lists.freedesktop.org
> Cc: Srivatsa, Anusha ; Roper, Matthew D
> 
> Subject: [PATCH v3 08/12] drm/i915/jsl_ehl: Use revid->stepping tables
> 
> Switch JSL/EHL to use a revid->stepping table as we're trying to do on all
> platforms going forward.
> 
> v2:
>  - Use COMMON_STEPPING().  (Anusha)
> 
> Bspec: 29153
> Cc: Anusha Srivatsa 
> Signed-off-by: Matt Roper 
Reviewed-by: Anusha Srivatsa 

> ---
>  drivers/gpu/drm/i915/display/intel_dpll_mgr.c | 2 +-
>  drivers/gpu/drm/i915/gt/intel_workarounds.c   | 2 +-
>  drivers/gpu/drm/i915/i915_drv.h   | 9 -
>  drivers/gpu/drm/i915/intel_step.c | 8 
>  4 files changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
> b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
> index 882bfd499e55..dfc31b682848 100644
> --- a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
> +++ b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
> @@ -2674,7 +2674,7 @@ static bool
>  ehl_combo_pll_div_frac_wa_needed(struct drm_i915_private *i915)
>  {
>   return ((IS_PLATFORM(i915, INTEL_ELKHARTLAKE) &&
> -  IS_JSL_EHL_REVID(i915, EHL_REVID_B0, REVID_FOREVER)) ||
> +  IS_JSL_EHL_DISPLAY_STEP(i915, STEP_B0, STEP_FOREVER)) ||
>IS_TIGERLAKE(i915) || IS_ALDERLAKE_P(i915)) &&
>i915->dpll.ref_clks.nssc == 38400;
>  }
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index e2d8acb8c1c9..4c0c15bbdac2 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -1043,7 +1043,7 @@ icl_gt_workarounds_init(struct drm_i915_private
> *i915, struct i915_wa_list *wal)
> 
>   /* Wa_1607087056:icl,ehl,jsl */
>   if (IS_ICELAKE(i915) ||
> - IS_JSL_EHL_REVID(i915, EHL_REVID_A0, EHL_REVID_A0))
> + IS_JSL_EHL_GT_STEP(i915, STEP_A0, STEP_A0))
>   wa_write_or(wal,
>   SLICE_UNIT_LEVEL_CLKGATE,
>   L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index d4f705f06c73..b3ce2b73a143 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1532,11 +1532,10 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
>  #define IS_ICL_GT_STEP(p, since, until) \
>   (IS_ICELAKE(p) && IS_GT_STEP(p, since, until))
> 
> -#define EHL_REVID_A0           0x0
> -#define EHL_REVID_B0           0x1
> -
> -#define IS_JSL_EHL_REVID(p, since, until) \
> - (IS_JSL_EHL(p) && IS_REVID(p, since, until))
> +#define IS_JSL_EHL_GT_STEP(p, since, until) \
> + (IS_JSL_EHL(p) && IS_GT_STEP(p, since, until))
> +#define IS_JSL_EHL_DISPLAY_STEP(p, since, until) \
> + (IS_JSL_EHL(p) && IS_DISPLAY_STEP(p, since, until))
> 
>  #define IS_TGL_DISPLAY_STEP(__i915, since, until) \
>   (IS_TIGERLAKE(__i915) && \
> diff --git a/drivers/gpu/drm/i915/intel_step.c
> b/drivers/gpu/drm/i915/intel_step.c
> index f8be464d1179..868606f8139f 100644
> --- a/drivers/gpu/drm/i915/intel_step.c
> +++ b/drivers/gpu/drm/i915/intel_step.c
> @@ -57,6 +57,11 @@ static const struct intel_step_info icl_revids[] = {
>   [7] = { COMMON_STEPPING(D0) },
>  };
> 
> +static const struct intel_step_info jsl_ehl_revids[] = {
> + [0] = { COMMON_STEPPING(A0) },
> + [1] = { COMMON_STEPPING(B0) },
> +};
> +
>  static const struct intel_step_info tgl_uy_revids[] = {
>   [0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
>   [1] = { .gt_step = STEP_B0, .display_step = STEP_C0 },
> @@ -104,6 +109,9 @@ void intel_step_init(struct drm_i915_private *i915)
>   } else if (IS_TIGERLAKE(i915)) {
>   revids = tgl_revids;
>   size = ARRAY_SIZE(tgl_revids);
> + } else if (IS_JSL_EHL(i915)) {
> + revids = jsl_ehl_revids;
> + size = ARRAY_SIZE(jsl_ehl_revids);
>   } else if (IS_ICELAKE(i915)) {
>   revids = icl_revids;
>   size = ARRAY_SIZE(icl_revids);
> --
> 2.25.4



Re: [Intel-gfx] [PATCH v4 14/18] drm/msm: Don't break exclusive fence ordering

2021-07-13 Thread Daniel Vetter
On Tue, Jul 13, 2021 at 6:51 PM Rob Clark  wrote:
>
> On Mon, Jul 12, 2021 at 1:02 PM Daniel Vetter  wrote:
> >
> > There's only one exclusive slot, and we must not break the ordering.
> >
> > Adding a new exclusive fence drops all previous fences from the
> > dma_resv. To avoid violating the signalling order we err on the side of
> > over-synchronizing by waiting for the existing fences, even if
> > userspace asked us to ignore them.
> >
> > A better fix would be to us a dma_fence_chain or _array like e.g.
> > amdgpu now uses, but
> > - msm has a synchronous dma_fence_wait for anything from another
> >   context, so doesn't seem to care much,
> > - and it probably makes sense to lift this into dma-resv.c code as a
> >   proper concept, so that drivers don't have to hack up their own
> >   solution each on their own.
> >
> > v2: Improve commit message per Lucas' suggestion.
> >
> > Cc: Lucas Stach 
> > Signed-off-by: Daniel Vetter 
> > Cc: Rob Clark 
> > Cc: Sean Paul 
> > Cc: linux-arm-...@vger.kernel.org
> > Cc: freedr...@lists.freedesktop.org
> > ---
> >  drivers/gpu/drm/msm/msm_gem_submit.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
> > b/drivers/gpu/drm/msm/msm_gem_submit.c
> > index b71da71a3dd8..edd0051d849f 100644
> > --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> > +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> > @@ -306,7 +306,8 @@ static int submit_fence_sync(struct msm_gem_submit 
> > *submit, bool no_implicit)
> > return ret;
> > }
> >
> > -   if (no_implicit)
> > +   /* exclusive fences must be ordered */
> > +   if (no_implicit && !write)
> > continue;
>
> In practice, modern userspace (the kind that is more likely to set the
> no-implicit flag on every submit) also sets MSM_SUBMIT_BO_WRITE on
> every bo, to shave some cpu overhead so I suppose this would not
> really hurt anything
>
> Do you know if this is covered in any piglit/etc test?

You need some command submission, plus buffer sharing with vgem
setting its own exclusive fences, plus checking with dma_buf poll()
whether it signals all in the right order. That's pretty low-level, so
maybe something in igt, but I haven't typed that. Maybe I need to do
that for i915 at least.
-Daniel

> BR,
> -R
>
> >
> > ret = msm_gem_sync_object(&msm_obj->base, 
> > submit->ring->fctx,
> > --
> > 2.32.0
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] ✓ Fi.CI.IGT: success for series starting with [CI,1/6] drm/i915/display: Settle on "adl-x" in WA comments

2021-07-13 Thread Souza, Jose
On Tue, 2021-07-13 at 03:31 +, Patchwork wrote:
Patch Details
Series: series starting with [CI,1/6] drm/i915/display: Settle on "adl-x" in WA 
comments
URL:https://patchwork.freedesktop.org/series/92457/
State:  success
Details:
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20581/index.html
CI Bug Log - changes from CI_DRM_10335_full -> Patchwork_20581_full
Summary

SUCCESS

No regressions found.

Patches pushed, thanks for the reviews Matt.

Known issues

Here are the changes found in Patchwork_20581_full that come from known issues:

IGT changes
Issues hit

  *   igt@feature_discovery@psr2:
     *   shard-iclb: PASS -> SKIP ([i915#658])

  *   igt@gem_ctx_isolation@preservation-s3@vcs0:
     *   shard-skl: PASS -> INCOMPLETE ([i915#146] / [i915#198])

  *   igt@gem_ctx_persistence@legacy-engines-mixed:
     *   shard-snb: NOTRUN -> SKIP ([fdo#109271] / [i915#1099]) +3 similar issues

  *   igt@gem_ctx_persistence@many-contexts:
     *   shard-tglb: PASS -> FAIL ([i915#2410])

  *   igt@gem_exec_fair@basic-deadline:
     *   shard-glk: PASS -> FAIL ([i915#2846])
     *   shard-apl: NOTRUN -> FAIL ([i915#2846])

  *   igt@gem_exec_fair@basic-pace-share@rcs0:
     *   shard-glk: PASS -> FAIL ([i915#2842])

  *   igt@gem_exec_fair@basic-pace@rcs0:
     *   shard-kbl: PASS -> FAIL ([i915#2842])

  *   igt@gem_exec_fair@basic-pace@vcs0:
     *   shard-iclb: PASS -> FAIL ([i915#2842]) +2 similar issues

  *   igt@gem_exec_fair@basic-pace@vcs1:
     *   shard-iclb: NOTRUN -> FAIL ([i915#2842]) +2 similar issues

  *   igt@gem_exec_fair@basic-pace@vecs0:
     *   shard-tglb: PASS -> FAIL ([i915#2842]) +1 similar issue

  *   igt@gem_exec_reloc@basic-wide-active@vcs1:
     *   shard-iclb: NOTRUN -> FAIL ([i915#3633])

  *   igt@gem_mmap_gtt@big-copy-odd:
     *   shard-glk: PASS -> FAIL ([i915#307])

  *   igt@gem_mmap_gtt@cpuset-big-copy:
     *   shard-iclb: PASS -> FAIL ([i915#307])

  *   igt@gem_mmap_offset@clear:
     *   shard-skl: PASS -> FAIL ([i915#3160])
     *   shard-iclb: PASS -

Re: [Intel-gfx] [PATCH v2 09/12] drm/i915/rkl: Use revid->stepping tables

2021-07-13 Thread Srivatsa, Anusha



> -Original Message-
> From: Roper, Matthew D 
> Sent: Monday, July 12, 2021 3:56 PM
> To: Srivatsa, Anusha 
> Cc: intel-gfx@lists.freedesktop.org
> Subject: Re: [PATCH v2 09/12] drm/i915/rkl: Use revid->stepping tables
> 
> On Mon, Jul 12, 2021 at 03:51:15PM -0700, Srivatsa, Anusha wrote:
> >
> >
> > > -Original Message-
> > > From: Roper, Matthew D 
> > > Sent: Friday, July 9, 2021 8:37 PM
> > > To: intel-gfx@lists.freedesktop.org
> > > Cc: Srivatsa, Anusha ; Roper, Matthew D
> > > 
> > > Subject: [PATCH v2 09/12] drm/i915/rkl: Use revid->stepping tables
> > >
> > > Switch RKL to use a revid->stepping table as we're trying to do on
> > > all platforms going forward.
> > >
> > > Bspec: 44501
> > > Signed-off-by: Matt Roper 
Reviewed-by: Anusha Srivatsa 

> > > ---
> > >  drivers/gpu/drm/i915/display/intel_psr.c | 4 ++--
> > >  drivers/gpu/drm/i915/i915_drv.h  | 8 ++--
> > >  drivers/gpu/drm/i915/intel_step.c| 9 +
> > >  3 files changed, 13 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/display/intel_psr.c
> > > b/drivers/gpu/drm/i915/display/intel_psr.c
> > > index 9643624fe160..74b2aa3c2946 100644
> > > --- a/drivers/gpu/drm/i915/display/intel_psr.c
> > > +++ b/drivers/gpu/drm/i915/display/intel_psr.c
> > > @@ -594,7 +594,7 @@ static void hsw_activate_psr2(struct intel_dp
> > > *intel_dp)
> > >   if (intel_dp->psr.psr2_sel_fetch_enabled) {
> > >   /* WA 1408330847 */
> > >   if (IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) ||
> > > - IS_RKL_REVID(dev_priv, RKL_REVID_A0, RKL_REVID_A0))
> > > + IS_RKL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0))
> > >   intel_de_rmw(dev_priv, CHICKEN_PAR1_1,
> > >DIS_RAM_BYPASS_PSR2_MAN_TRACK,
> > >DIS_RAM_BYPASS_PSR2_MAN_TRACK);
> @@ -1342,7 +1342,7 @@
> > > static void intel_psr_disable_locked(struct intel_dp
> > > *intel_dp)
> > >   /* WA 1408330847 */
> > >   if (intel_dp->psr.psr2_sel_fetch_enabled &&
> > >   (IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) ||
> > > -  IS_RKL_REVID(dev_priv, RKL_REVID_A0, RKL_REVID_A0)))
> > > +  IS_RKL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0)))
> > >   intel_de_rmw(dev_priv, CHICKEN_PAR1_1,
> > >DIS_RAM_BYPASS_PSR2_MAN_TRACK, 0);
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > b/drivers/gpu/drm/i915/i915_drv.h index b3ce2b73a143..9195131cf90f
> > > 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -1549,12 +1549,8 @@ IS_SUBPLATFORM(const struct
> drm_i915_private
> > > *i915,
> > >   (IS_TIGERLAKE(__i915) && !(IS_TGL_U(__i915) || IS_TGL_Y(__i915))
> > > && \
> > >IS_GT_STEP(__i915, since, until))
> > >
> > > -#define RKL_REVID_A0 0x0
> > > -#define RKL_REVID_B0 0x1
> > > -#define RKL_REVID_C0 0x4
> > > -
> > > -#define IS_RKL_REVID(p, since, until) \
> > > - (IS_ROCKETLAKE(p) && IS_REVID(p, since, until))
> > > +#define IS_RKL_DISPLAY_STEP(p, since, until) \
> > > + (IS_ROCKETLAKE(p) && IS_DISPLAY_STEP(p, since, until))
> > >
> >
> > If a platform has the same gt and display stepping, I wonder if we
> > should stick to using IS__GT_STEP while replacing
> > IS_REVID instances. The previous patches have
> > IS__GT_STEP.
> > Just a thought.
> 
> No, we want to be very explicit about which IP block the stepping belongs to
> to avoid mistakes.  Just because the steppings are equivalent right now
> doesn't mean a new revision won't show up in the future that has different
> GT vs display steppings.  In that case it's easy to update the table, but we
> don't want to have to dig through the rest of the code looking for places
> where we used the wrong macro.  Plus, intentionally using the wrong macro
> on a platform where it doesn't matter is going to lead to copy/paste errors
> when people add additional platforms to a workaround.
> 
> 
> Matt
> 
> >
> > Anusha
> >
> > >  #define DG1_REVID_A0 0x0
> > >  #define DG1_REVID_B0 0x1
> > > diff --git a/drivers/gpu/drm/i915/intel_step.c
> > > b/drivers/gpu/drm/i915/intel_step.c
> > > index 6e1b132ecf38..21211649e6bb 100644
> > > --- a/drivers/gpu/drm/i915/intel_step.c
> > > +++ b/drivers/gpu/drm/i915/intel_step.c
> > > @@ -75,6 +75,12 @@ static const struct intel_step_info tgl_revids[] = {
> > >   [1] = { .gt_step = STEP_B0, .display_step = STEP_D0 },  };
> > >
> > > +static const struct intel_step_info rkl_revids[] = {
> > > + [0] = { COMMON_STEPPING(A0) },
> > > + [1] = { COMMON_STEPPING(B0) },
> > > + [4] = { COMMON_STEPPING(C0) },
> > > +};
> > > +
> > >  static const struct intel_step_info adls_revids[] = {
> > >   [0x0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
> > >   [0x1] = { .gt_step = STEP_A0, .display_step = STEP_A2 }, @@ -103,6
> > > +109,9 @@ void intel_step_init(struct drm_i915_private *i915)
> > >   } else 
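
To illustrate the point made in the reply above (keep GT and display steppings separate even when they currently match), here is a minimal C sketch of a revid->stepping table lookup. It is an assumption-laden illustration, not the driver's code: `step_info`, `lookup_step`, `step_in_range` and the `demo_` tables are hypothetical names, the shape of COMMON_STEPPING() is assumed, and the range check is inclusive to match the `STEP_A0, STEP_A0` usage in the quoted patch. The table entries mirror the rkl_revids and adls_revids lines quoted in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the i915 stepping types. */
enum step { STEP_NONE = 0, STEP_A0, STEP_A2, STEP_B0, STEP_C0, STEP_D0 };

struct step_info {
	enum step gt_step;
	enum step display_step;
};

/* Assumed shape of COMMON_STEPPING(): one stepping for both IP blocks. */
#define COMMON_STEPPING(x) .gt_step = STEP_##x, .display_step = STEP_##x
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Indexed by PCI revision ID; unlisted slots stay zeroed, i.e. STEP_NONE. */
static const struct step_info demo_rkl_revids[] = {
	[0] = { COMMON_STEPPING(A0) },
	[1] = { COMMON_STEPPING(B0) },
	[4] = { COMMON_STEPPING(C0) },
};

/* A table where GT and display steppings diverge for the same revid. */
static const struct step_info demo_adls_revids[] = {
	[0x0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
	[0x1] = { .gt_step = STEP_A0, .display_step = STEP_A2 },
};

/* Return the steppings for a revid, or STEP_NONE if it is out of range. */
static struct step_info lookup_step(const struct step_info *table,
				    size_t size, uint8_t revid)
{
	const struct step_info none = { STEP_NONE, STEP_NONE };

	assert(table != NULL);
	return revid < size ? table[revid] : none;
}

/* Inclusive range check; STEP_NONE never matches a range starting at A0. */
static bool step_in_range(enum step s, enum step since, enum step until)
{
	return s >= since && s <= until;
}
```

With revid 0x1 in the second table, a GT check for A0..A0 matches while a display check for A0..A0 does not. That is the situation in which a workaround guarded by the wrong macro would be silently misapplied.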

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/fb-helper: Try to protect cleanup against delayed setup

2021-07-13 Thread Patchwork
== Series Details ==

Series: drm/fb-helper: Try to protect cleanup against delayed setup
URL   : https://patchwork.freedesktop.org/series/92478/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10341 -> Patchwork_20587


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20587/index.html

Known issues


  Here are the changes found in Patchwork_20587 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_prime@amd-to-i915:
- fi-tgl-y:   NOTRUN -> [SKIP][1] ([fdo#109315] / [i915#2575])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20587/fi-tgl-y/igt@amdgpu/amd_pr...@amd-to-i915.html

  * igt@i915_selftest@live@execlists:
- fi-cfl-8109u:   [PASS][2] -> [DMESG-WARN][3] ([i915#203]) +3 similar 
issues
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10341/fi-cfl-8109u/igt@i915_selftest@l...@execlists.html
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20587/fi-cfl-8109u/igt@i915_selftest@l...@execlists.html

  * igt@runner@aborted:
- fi-bdw-5557u:   NOTRUN -> [FAIL][4] ([i915#1602] / [i915#2029])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20587/fi-bdw-5557u/igt@run...@aborted.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [i915#1602]: https://gitlab.freedesktop.org/drm/intel/issues/1602
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#2029]: https://gitlab.freedesktop.org/drm/intel/issues/2029
  [i915#203]: https://gitlab.freedesktop.org/drm/intel/issues/203
  [i915#2575]: https://gitlab.freedesktop.org/drm/intel/issues/2575
  [i915#3303]: https://gitlab.freedesktop.org/drm/intel/issues/3303
  [i915#541]: https://gitlab.freedesktop.org/drm/intel/issues/541


Participating hosts (39 -> 35)
--

  Missing(4): fi-kbl-soraka fi-ilk-m540 fi-bdw-samus fi-hsw-4200u 


Build changes
-

  * Linux: CI_DRM_10341 -> Patchwork_20587

  CI-20190529: 20190529
  CI_DRM_10341: 72a4e94d9585ff89a8c85bd1436fb05b60dad2f8 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6137: 2fee489255f7a8cd6a584373c30e3d44a07a78ea @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20587: b0ba74e0d5e3af9a36ca12804924a802b41617de @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

b0ba74e0d5e3 drm/fb-helper: Try to protect cleanup against delayed setup

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20587/index.html
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v2 10/12] drm/i915/dg1: Use revid->stepping tables

2021-07-13 Thread Srivatsa, Anusha



> -Original Message-
> From: Roper, Matthew D 
> Sent: Friday, July 9, 2021 8:37 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: Srivatsa, Anusha ; Roper, Matthew D
> 
> Subject: [PATCH v2 10/12] drm/i915/dg1: Use revid->stepping tables
> 
> Switch DG1 to use a revid->stepping table as we're trying to do on all
> platforms going forward.
> 
> This removes the last use of IS_REVID() and REVID_FOREVER, so remove
> those now-unused macros as well to prevent their accidental use on future
> platforms.
> 
> Bspec: 44463
> Signed-off-by: Matt Roper 
> ---
>  .../gpu/drm/i915/display/intel_display_power.c |  2 +-
>  drivers/gpu/drm/i915/gt/intel_region_lmem.c|  2 +-
>  drivers/gpu/drm/i915/gt/intel_workarounds.c| 10 +-
>  drivers/gpu/drm/i915/i915_drv.h| 18 --
>  drivers/gpu/drm/i915/intel_pm.c|  2 +-
>  drivers/gpu/drm/i915/intel_step.c  |  8 
>  6 files changed, 20 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c
> b/drivers/gpu/drm/i915/display/intel_display_power.c
> index 285380079aab..975a7e25cea5 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> @@ -5799,7 +5799,7 @@ static void tgl_bw_buddy_init(struct
> drm_i915_private *dev_priv)
>   int config, i;
> 
>   if (IS_ALDERLAKE_S(dev_priv) ||
> - IS_DG1_REVID(dev_priv, DG1_REVID_A0, DG1_REVID_A0) ||
> + IS_DG1_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) ||
>   IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_B0))
>   /* Wa_1409767108:tgl,dg1,adl-s */
>   table = wa_1409767108_buddy_page_masks; diff --git
> a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> index 1f43aba2e9e2..50d11a84e7a9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> +++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> @@ -157,7 +157,7 @@ intel_gt_setup_fake_lmem(struct intel_gt *gt)  static
> bool get_legacy_lowmem_region(struct intel_uncore *uncore,
>u64 *start, u32 *size)
>  {
> - if (!IS_DG1_REVID(uncore->i915, DG1_REVID_A0, DG1_REVID_B0))
> + if (!IS_DG1_GT_STEP(uncore->i915, STEP_A0, STEP_B0))
>   return false;
> 
>   *start = 0;
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 4c0c15bbdac2..62321e9149db 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -,7 +,7 @@ dg1_gt_workarounds_init(struct drm_i915_private
> *i915, struct i915_wa_list *wal)
>   gen12_gt_workarounds_init(i915, wal);
> 
>   /* Wa_1607087056:dg1 */
> - if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0))
> + if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0))
>   wa_write_or(wal,
>   SLICE_UNIT_LEVEL_CLKGATE,
>   L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS); @@ -
> 1522,7 +1522,7 @@ static void dg1_whitelist_build(struct intel_engine_cs
> *engine)
>   tgl_whitelist_build(engine);
> 
>   /* GEN:BUG:1409280441:dg1 */
> - if (IS_DG1_REVID(engine->i915, DG1_REVID_A0, DG1_REVID_A0) &&
> + if (IS_DG1_GT_STEP(engine->i915, STEP_A0, STEP_A0) &&
>   (engine->class == RENDER_CLASS ||
>engine->class == COPY_ENGINE_CLASS))
>   whitelist_reg_ext(w, RING_ID(engine->mmio_base), @@ -
> 1592,7 +1592,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine,
> struct i915_wa_list *wal)  {
>   struct drm_i915_private *i915 = engine->i915;
> 
> - if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
> + if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0) ||
>   IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_A0)) {
>   /*
>* Wa_1607138336:tgl[a0],dg1[a0]
> @@ -1638,7 +1638,7 @@ rcs_engine_wa_init(struct intel_engine_cs
> *engine, struct i915_wa_list *wal)
>   }
> 
>   if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) ||
> - IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
> + IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0) ||
>   IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
>   /* Wa_1409804808:tgl,rkl,dg1[a0],adl-s,adl-p */
>   wa_masked_en(wal, GEN7_ROW_CHICKEN2,
> @@ -1652,7 +1652,7 @@ rcs_engine_wa_init(struct intel_engine_cs
> *engine, struct i915_wa_list *wal)
>   }
> 
> 
> - if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
> + if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0) ||
>   IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
>   /*
>* Wa_1607030317:tgl
> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> b/drivers/gpu/drm/i915/i915_drv.h index 9195131cf90f..d462b9434541
> 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1323,19 +1323,10 @@ static i

Re: [Intel-gfx] [PATCH v2 12/12] drm/i915/icl: Drop workarounds that only apply to pre-production steppings

2021-07-13 Thread Srivatsa, Anusha



> -Original Message-
> From: Roper, Matthew D 
> Sent: Friday, July 9, 2021 8:37 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: Srivatsa, Anusha ; Roper, Matthew D
> 
> Subject: [PATCH v2 12/12] drm/i915/icl: Drop workarounds that only apply to
> pre-production steppings
> 
> We're past the point at which we usually drop workarounds that were never
> needed on production hardware.  The driver will already print an error and
> apply taint if loaded on pre-production hardware.
> 
> Signed-off-by: Matt Roper 
Definitely cleans up the code. 

Reviewed-by: Anusha Srivatsa 

> ---
>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 39 -
>  drivers/gpu/drm/i915/i915_drv.h |  3 --
>  2 files changed, 42 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 9b257a394305..5ace14cdfa85 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -517,21 +517,12 @@ static void cfl_ctx_workarounds_init(struct
> intel_engine_cs *engine,  static void icl_ctx_workarounds_init(struct
> intel_engine_cs *engine,
>struct i915_wa_list *wal)
>  {
> - struct drm_i915_private *i915 = engine->i915;
> -
>   /* WaDisableBankHangMode:icl */
>   wa_write(wal,
>GEN8_L3CNTLREG,
>intel_uncore_read(engine->uncore, GEN8_L3CNTLREG) |
>GEN8_ERRDETBCTRL);
> 
> - /* Wa_1604370585:icl (pre-prod)
> -  * Formerly known as WaPushConstantDereferenceHoldDisable
> -  */
> - if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_B0))
> - wa_masked_en(wal, GEN7_ROW_CHICKEN2,
> -  PUSH_CONSTANT_DEREF_DISABLE);
> -
>   /* WaForceEnableNonCoherent:icl
>* This is not the same workaround as in early Gen9 platforms, where
>* lacking this could cause system hangs, but coherency performance
> @@ -541,18 +532,6 @@ static void icl_ctx_workarounds_init(struct
> intel_engine_cs *engine,
>*/
>   wa_masked_en(wal, ICL_HDC_MODE,
> HDC_FORCE_NON_COHERENT);
> 
> - /* Wa_2006611047:icl (pre-prod)
> -  * Formerly known as WaDisableImprovedTdlClkGating
> -  */
> - if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_A0))
> - wa_masked_en(wal, GEN7_ROW_CHICKEN2,
> -  GEN11_TDL_CLOCK_GATING_FIX_DISABLE);
> -
> - /* Wa_2006665173:icl (pre-prod) */
> - if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_A0))
> - wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3,
> -  GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC);
> -
>   /* WaEnableFloatBlendOptimization:icl */
>   wa_write_clr_set(wal,
>GEN10_CACHE_MODE_SS,
> @@ -982,18 +961,6 @@ icl_gt_workarounds_init(struct drm_i915_private
> *i915, struct i915_wa_list *wal)
>   GEN8_GAMW_ECO_DEV_RW_IA,
>   GAMW_ECO_DEV_CTX_RELOAD_DISABLE);
> 
> - /* Wa_1405779004:icl (pre-prod) */
> - if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_A0))
> - wa_write_or(wal,
> - SLICE_UNIT_LEVEL_CLKGATE,
> - MSCUNIT_CLKGATE_DIS);
> -
> - /* Wa_1406838659:icl (pre-prod) */
> - if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_B0))
> - wa_write_or(wal,
> - INF_UNIT_LEVEL_CLKGATE,
> - CGPSF_CLKGATE_DIS);
> -
>   /* Wa_1406463099:icl
>* Formerly known as WaGamTlbPendError
>*/
> @@ -1669,12 +1636,6 @@ rcs_engine_wa_init(struct intel_engine_cs
> *engine, struct i915_wa_list *wal)
>   PMFLUSH_GAPL3UNBLOCK |
>   PMFLUSHDONE_LNEBLK);
> 
> - /* Wa_1406609255:icl (pre-prod) */
> - if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_B0))
> - wa_write_or(wal,
> - GEN7_SARCHKMD,
> - GEN7_DISABLE_DEMAND_PREFETCH);
> -
>   /* Wa_1606682166:icl */
>   wa_write_or(wal,
>   GEN7_SARCHKMD,
> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> b/drivers/gpu/drm/i915/i915_drv.h index 8682a5f557c5..da5f230e2d4b
> 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1513,9 +1513,6 @@ IS_SUBPLATFORM(const struct drm_i915_private
> *i915,  #define IS_KBL_DISPLAY_STEP(dev_priv, since, until) \
>   (IS_KABYLAKE(dev_priv) && IS_DISPLAY_STEP(dev_priv, since,
> until))
> 
> -#define IS_ICL_GT_STEP(p, since, until) \
> - (IS_ICELAKE(p) && IS_GT_STEP(p, since, until))
> -
>  #define IS_JSL_EHL_GT_STEP(p, since, until) \
>   (IS_JSL_EHL(p) && IS_GT_STEP(p, since, until))  #define
> IS_JSL_EHL_DISPLAY_STEP(p, since, until) \
> --
> 2.25.4


Re: [Intel-gfx] [igt-dev] [PATCH i-g-t] [RFC] tests/kms_prime: Aligned pitch to 64 byte for Intel platforms

2021-07-13 Thread Srinivas, Vidya
Hello All,

Sorry to bother you all again. Could you please help us close on this?
We have submitted both IGT and kernel patches. If neither is acceptable, should
we skip the subtest on Intel platforms for panels whose width is not 64-byte
aligned?

https://patchwork.freedesktop.org/patch/435794/  - IGT patch (we align the
width itself as a workaround, because VGEM does not align the pitch before the
buffer reaches i915)
https://patchwork.freedesktop.org/patch/436199/  - Kernel patch

Tejas has submitted another solution,
https://patchwork.freedesktop.org/patch/441967/ (256B alignment), which also
works.

Regards
Vidya

-Original Message-
From: Surendrakumar Upadhyay, TejaskumarX 
 
Sent: Monday, June 28, 2021 4:39 PM
To: Srinivas, Vidya ; Ville Syrjälä 

Cc: igt-...@lists.freedesktop.org; intel-gfx@lists.freedesktop.org; Lin, 
Charlton 
Subject: RE: [igt-dev] [PATCH i-g-t] [RFC] tests/kms_prime: Aligned pitch to 64 
byte for Intel platforms



> -Original Message-
> From: Intel-gfx  On Behalf Of 
> Srinivas, Vidya
> Sent: 31 May 2021 20:18
> To: Ville Syrjälä 
> Cc: igt-...@lists.freedesktop.org; intel-gfx@lists.freedesktop.org; 
> Lin, Charlton 
> Subject: Re: [Intel-gfx] [igt-dev] [PATCH i-g-t] [RFC] 
> tests/kms_prime: Aligned pitch to 64 byte for Intel platforms
> 
> Hello Ville,
> 
> Thank you very much.
> Before reaching our i915's i915_gem_dumb_create, it goes to 
> vgem_gem_dumb_create for kms_prime.
> 
> The pitch gets calculated there and it is not 64 byte aligned. Due to 
> this, intel_framebuffer_init reports "pitch must be 64 byte aligned"
> and framebuffer creation fails. I tried submitting vgem patch where 64 
> byte alignment can be done in vgem_gem_dumb_create and that also 
> passes. But we did not get approval yet as few of them felt, vgem is 
> generic and other platforms might fail if we do 64 byte alignment there.
> 
> Kindly suggest. Thanks a lot.
> 
> Regards
> Vidya
> 
> -Original Message-
> From: Ville Syrjälä 
> Sent: Monday, May 31, 2021 7:48 PM
> To: Srinivas, Vidya 
> Cc: intel-gfx@lists.freedesktop.org; igt-...@lists.freedesktop.org; 
> Lin, Charlton 
> Subject: Re: [igt-dev] [PATCH i-g-t] [RFC] tests/kms_prime: Aligned 
> pitch to 64 byte for Intel platforms
> 
> On Fri, May 28, 2021 at 10:04:03AM +0530, Vidya Srinivas wrote:
> > For Intel platforms, pitch needs to be 64 byte aligned.
> > Kernel code vgem_gem_dumb_create which is platform generic code
> doesnt
> > do the alignment. This causes frame buffer creation to fail on Intel 
> > platforms where the pitch is not 64 byte aligned.
> >
> > tests: test run on Intel platforms with panel resolution 1366x768
> >
> > Signed-off-by: Vidya Srinivas 
> > ---
> >  tests/kms_prime.c | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/tests/kms_prime.c b/tests/kms_prime.c index
> > 8cb2ca2a9dc3..fdc941fe8100 100644
> > --- a/tests/kms_prime.c
> > +++ b/tests/kms_prime.c
> > @@ -51,6 +51,8 @@ static struct {
> > { .r = 1.0, .g = 0.0, .b = 0.0, .color = 0x },  };
> >
> > +bool check_platform;
> > +
> >  IGT_TEST_DESCRIPTION("Prime tests, focusing on KMS side");
> >
> >  static bool has_prime_import(int fd) @@ -101,7 +103,7 @@ static 
> > void prepare_scratch(int exporter_fd, struct
> dumb_bo *scratch,
> > scratch->bpp = 32;
> >
> > scratch->handle = kmstest_dumb_create(exporter_fd,
> > -   scratch->width,
> > +   check_platform? ALIGN(scratch->width, 64): scratch-
> >width,
> 
> The dumb_create ioctl already does this for us.

i915's dumb_create does it for us, but vgem_gem_dumb_create does not align the
pitch to 64 bytes. kms_prime goes through vgem_gem_dumb_create and never calls
i915's dumb_create(), because the IGT creates the buffer through the VGEM
driver. See the IGT snippet below:

/* ANY = anything that is not VGEM */
first_fd = __drm_open_driver_another(0, DRIVER_ANY | DRIVER_VGEM);
igt_require(first_fd >= 0);

second_fd = __drm_open_driver_another(1, DRIVER_ANY | DRIVER_VGEM);

Thanks,
Tejas
> 
> > scratch->height,
> > scratch->bpp,
> > &scratch->pitch,
> > @@ -262,6 +264,7 @@ igt_main
> >
> > /* ANY = anything that is not VGEM */
> > first_fd = __drm_open_driver_another(0, DRIVER_ANY |
> DRIVER_VGEM);
> > +   check_platform = is_i915_device(first_fd);
> > igt_require(first_fd >= 0);
> >
> > second_fd = __drm_open_driver_another(1, DRIVER_ANY |
> DRIVER_VGEM);
> > --
> > 2.7.4
> >
> > ___
> > igt-dev mailing list
> > igt-...@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/igt-dev
> 
> --
> Ville Syrjälä
> Intel
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
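
The arithmetic behind the failure discussed in this thread can be checked with a small C sketch. It assumes the dumb-buffer pitch is computed as width times bytes per pixel with no extra alignment, which is how the thread describes vgem_gem_dumb_create; `demo_align` and `demo_dumb_pitch` are hypothetical helpers, not IGT or kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to a power-of-two alignment (hypothetical helper). */
static uint32_t demo_align(uint32_t value, uint32_t alignment)
{
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
	return (value + alignment - 1) & ~(alignment - 1);
}

/* Assumed unaligned pitch: width times bytes per pixel, nothing more. */
static uint32_t demo_dumb_pitch(uint32_t width, uint32_t bpp)
{
	return width * ((bpp + 7) / 8);
}
```

For the 1366x768 panel mentioned in the patch, a 32 bpp buffer gets a pitch of 1366 * 4 = 5464 bytes, which leaves a remainder of 24 when divided by 64. Rounding the width up to 1408 pixels, as the IGT patch does, yields a pitch of 5632 bytes, which is a multiple of 64. Aligning the pitch directly would give 5504 bytes.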

Re: [Intel-gfx] [PATCH v2 02/12] drm/i915: Make pre-production detection use direct revid comparison

2021-07-13 Thread Srivatsa, Anusha



> -Original Message-
> From: Roper, Matthew D 
> Sent: Friday, July 9, 2021 8:37 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: Srivatsa, Anusha ; Roper, Matthew D
> 
> Subject: [PATCH v2 02/12] drm/i915: Make pre-production detection use
> direct revid comparison
> 
> Although we're converting our workarounds to use a revid->stepping lookup
> table, the function that detects pre-production hardware should continue to
> compare against PCI revision ID values directly.  These are listed in the 
> bspec
> as integers, so it's easier to confirm their correctness if we just use an 
> integer
> literal rather than a symbolic name anyway.
> 
> Bspec: 13620, 19131, 13626, 18329
> Signed-off-by: Matt Roper 
Reviewed-by: Anusha Srivatsa 

> ---
>  drivers/gpu/drm/i915/i915_drv.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c
> b/drivers/gpu/drm/i915/i915_drv.c index 30d8cd8c69b1..90136995f5eb
> 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -271,10 +271,10 @@ static void intel_detect_preproduction_hw(struct
> drm_i915_private *dev_priv)
>   bool pre = false;
> 
>   pre |= IS_HSW_EARLY_SDV(dev_priv);
> - pre |= IS_SKL_REVID(dev_priv, 0, SKL_REVID_F0);
> - pre |= IS_BXT_REVID(dev_priv, 0, BXT_REVID_B_LAST);
> - pre |= IS_KBL_GT_STEP(dev_priv, 0, STEP_A0);
> - pre |= IS_GLK_REVID(dev_priv, 0, GLK_REVID_A2);
> + pre |= IS_SKYLAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x6;
> + pre |= IS_BROXTON(dev_priv) && INTEL_REVID(dev_priv) < 0xA;
> + pre |= IS_KABYLAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x1;
> + pre |= IS_GEMINILAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x3;
> 
>   if (pre) {
>   drm_err(&dev_priv->drm, "This is a pre-production stepping.
> "
> --
> 2.25.4
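
The direct revision-ID comparison introduced by this patch can be sketched in isolation as follows. This is a simplified illustration under stated assumptions: `demo_platform`, `demo_preprod_below` and `demo_is_preproduction` are hypothetical names, the thresholds are copied from the four added lines in the quoted diff, and the separate IS_HSW_EARLY_SDV() check is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical platform identifiers for this sketch only. */
enum demo_platform {
	DEMO_SKYLAKE,
	DEMO_BROXTON,
	DEMO_KABYLAKE,
	DEMO_GEMINILAKE,
	DEMO_OTHER,
	DEMO_PLATFORM_COUNT,
};

/*
 * PCI revision IDs strictly below these values are treated as
 * pre-production, using the integer literals from the quoted diff.
 * A threshold of 0 means no revision is flagged.
 */
static const uint8_t demo_preprod_below[DEMO_PLATFORM_COUNT] = {
	[DEMO_SKYLAKE]    = 0x6,
	[DEMO_BROXTON]    = 0xA,
	[DEMO_KABYLAKE]   = 0x1,
	[DEMO_GEMINILAKE] = 0x3,
	[DEMO_OTHER]      = 0x0,
};

/* True if the revision ID falls in the platform's pre-production range. */
static bool demo_is_preproduction(enum demo_platform p, uint8_t revid)
{
	assert(p < DEMO_PLATFORM_COUNT);
	return revid < demo_preprod_below[p];
}
```

Keeping the thresholds as plain integers makes them easy to compare against the revision IDs listed in the bspec, which is the rationale given in the commit message.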



Re: [Intel-gfx] [PATCH v2 04/12] drm/i915/kbl: Drop pre-production revision from stepping table

2021-07-13 Thread Srivatsa, Anusha



> -Original Message-
> From: Roper, Matthew D 
> Sent: Friday, July 9, 2021 8:37 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: Srivatsa, Anusha ; Roper, Matthew D
> 
> Subject: [PATCH v2 04/12] drm/i915/kbl: Drop pre-production revision from
> stepping table
> 
> We're long past the point where we need to care about pre-production
> hardware, and we already warn the user and taint the kernel if we detect the
> driver is being loaded on pre-production hardware.
> 
> Bspec: 18329
> Signed-off-by: Matt Roper 
Reviewed-by: Anusha Srivatsa 

> ---
>  drivers/gpu/drm/i915/intel_step.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_step.c
> b/drivers/gpu/drm/i915/intel_step.c
> index 69c928b046e8..8987453aa172 100644
> --- a/drivers/gpu/drm/i915/intel_step.c
> +++ b/drivers/gpu/drm/i915/intel_step.c
> @@ -33,7 +33,6 @@ static const struct intel_step_info skl_revids[] = {  };
> 
>  static const struct intel_step_info kbl_revids[] = {
> - [0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
>   [1] = { .gt_step = STEP_B0, .display_step = STEP_B0 },
>   [2] = { .gt_step = STEP_C0, .display_step = STEP_B0 },
>   [3] = { .gt_step = STEP_D0, .display_step = STEP_B0 },
> --
> 2.25.4



[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/gtt: drop the page table optimisation

2021-07-13 Thread Patchwork
== Series Details ==

Series: drm/i915/gtt: drop the page table optimisation
URL   : https://patchwork.freedesktop.org/series/92474/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10339_full -> Patchwork_20586_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20586_full:

### IGT changes ###

 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@gem_exec_schedule@preempt-hang:
- {shard-rkl}:NOTRUN -> [TIMEOUT][1]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/shard-rkl-2/igt@gem_exec_sched...@preempt-hang.html

  * igt@kms_ccs@pipe-b-bad-rotation-90-y_tiled_gen12_mc_ccs:
- {shard-rkl}:[FAIL][2] ([i915#3678]) -> [SKIP][3]
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10339/shard-rkl-5/igt@kms_ccs@pipe-b-bad-rotation-90-y_tiled_gen12_mc_ccs.html
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/shard-rkl-6/igt@kms_ccs@pipe-b-bad-rotation-90-y_tiled_gen12_mc_ccs.html

  * igt@kms_ccs@pipe-c-random-ccs-data-y_tiled_gen12_mc_ccs:
- {shard-rkl}:NOTRUN -> [SKIP][4] +1 similar issue
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/shard-rkl-6/igt@kms_ccs@pipe-c-random-ccs-data-y_tiled_gen12_mc_ccs.html

  * igt@kms_frontbuffer_tracking@fbc-rgb565-draw-pwrite:
- {shard-rkl}:[PASS][5] -> [TIMEOUT][6] +1 similar issue
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10339/shard-rkl-6/igt@kms_frontbuffer_track...@fbc-rgb565-draw-pwrite.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/shard-rkl-2/igt@kms_frontbuffer_track...@fbc-rgb565-draw-pwrite.html

  * igt@kms_pipe_b_c_ivb@pipe-b-double-modeset-then-modeset-pipe-c:
- {shard-rkl}:[SKIP][7] ([fdo#109289]) -> [TIMEOUT][8]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10339/shard-rkl-6/igt@kms_pipe_b_c_...@pipe-b-double-modeset-then-modeset-pipe-c.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/shard-rkl-2/igt@kms_pipe_b_c_...@pipe-b-double-modeset-then-modeset-pipe-c.html

  * igt@sysfs_heartbeat_interval@precise@vcs0:
- {shard-rkl}:[PASS][9] -> [FAIL][10]
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10339/shard-rkl-6/igt@sysfs_heartbeat_interval@prec...@vcs0.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/shard-rkl-1/igt@sysfs_heartbeat_interval@prec...@vcs0.html

  
Known issues


  Here are the changes found in Patchwork_20586_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_ctx_persistence@smoketest:
- shard-snb:  NOTRUN -> [SKIP][11] ([fdo#109271] / [i915#1099]) +5 
similar issues
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/shard-snb6/igt@gem_ctx_persiste...@smoketest.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-kbl:  NOTRUN -> [FAIL][12] ([i915#2842]) +1 similar issue
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/shard-kbl3/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-none@vcs1:
- shard-iclb: NOTRUN -> [FAIL][13] ([i915#2842])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/shard-iclb1/igt@gem_exec_fair@basic-n...@vcs1.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
- shard-glk:  [PASS][14] -> [FAIL][15] ([i915#2842])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10339/shard-glk4/igt@gem_exec_fair@basic-pace-sh...@rcs0.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/shard-glk9/igt@gem_exec_fair@basic-pace-sh...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs0:
- shard-tglb: [PASS][16] -> [FAIL][17] ([i915#2842])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10339/shard-tglb3/igt@gem_exec_fair@basic-p...@vcs0.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/shard-tglb3/igt@gem_exec_fair@basic-p...@vcs0.html

  * igt@gem_exec_whisper@basic-forked:
- shard-glk:  [PASS][18] -> [DMESG-WARN][19] ([i915#118] / 
[i915#95]) +1 similar issue
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10339/shard-glk1/igt@gem_exec_whis...@basic-forked.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/shard-glk3/igt@gem_exec_whis...@basic-forked.html

  * igt@gem_huc_copy@huc-copy:
- shard-apl:  NOTRUN -> [SKIP][20] ([fdo#109271] / [i915#2190])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20586/shard-apl7/igt@gem_huc_c...@huc-copy.html

  * igt@gem_mmap_gtt@cpuset-medium-copy-xy:
- shard-glk:  [PASS][21] -> [FAIL][22] ([i915#307])
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10339/s

[Intel-gfx] [PATCH v3 10/12] drm/i915/dg1: Use revid->stepping tables

2021-07-13 Thread Matt Roper
Switch DG1 to use a revid->stepping table as we're trying to do on all
platforms going forward.

This removes the last use of IS_REVID() and REVID_FOREVER, so remove
those now-unused macros as well to prevent their accidental use on
future platforms.

v2:
 - Use COMMON_STEPPING() macro in table.  (Anusha)

Bspec: 44463
Cc: Anusha Srivatsa 
Signed-off-by: Matt Roper 
---
 .../gpu/drm/i915/display/intel_display_power.c |  2 +-
 drivers/gpu/drm/i915/gt/intel_region_lmem.c|  2 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c| 10 +-
 drivers/gpu/drm/i915/i915_drv.h| 18 --
 drivers/gpu/drm/i915/intel_pm.c|  2 +-
 drivers/gpu/drm/i915/intel_step.c  |  8 
 6 files changed, 20 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c 
b/drivers/gpu/drm/i915/display/intel_display_power.c
index 285380079aab..975a7e25cea5 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -5799,7 +5799,7 @@ static void tgl_bw_buddy_init(struct drm_i915_private 
*dev_priv)
int config, i;
 
if (IS_ALDERLAKE_S(dev_priv) ||
-   IS_DG1_REVID(dev_priv, DG1_REVID_A0, DG1_REVID_A0) ||
+   IS_DG1_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) ||
IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_B0))
/* Wa_1409767108:tgl,dg1,adl-s */
table = wa_1409767108_buddy_page_masks;
diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c 
b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
index 1f43aba2e9e2..50d11a84e7a9 100644
--- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
@@ -157,7 +157,7 @@ intel_gt_setup_fake_lmem(struct intel_gt *gt)
 static bool get_legacy_lowmem_region(struct intel_uncore *uncore,
 u64 *start, u32 *size)
 {
-   if (!IS_DG1_REVID(uncore->i915, DG1_REVID_A0, DG1_REVID_B0))
+   if (!IS_DG1_GT_STEP(uncore->i915, STEP_A0, STEP_B0))
return false;
 
*start = 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 4c0c15bbdac2..62321e9149db 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -,7 +,7 @@ dg1_gt_workarounds_init(struct drm_i915_private *i915, 
struct i915_wa_list *wal)
gen12_gt_workarounds_init(i915, wal);
 
/* Wa_1607087056:dg1 */
-   if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0))
+   if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0))
wa_write_or(wal,
SLICE_UNIT_LEVEL_CLKGATE,
L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
@@ -1522,7 +1522,7 @@ static void dg1_whitelist_build(struct intel_engine_cs 
*engine)
tgl_whitelist_build(engine);
 
/* GEN:BUG:1409280441:dg1 */
-   if (IS_DG1_REVID(engine->i915, DG1_REVID_A0, DG1_REVID_A0) &&
+   if (IS_DG1_GT_STEP(engine->i915, STEP_A0, STEP_A0) &&
(engine->class == RENDER_CLASS ||
 engine->class == COPY_ENGINE_CLASS))
whitelist_reg_ext(w, RING_ID(engine->mmio_base),
@@ -1592,7 +1592,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct 
i915_wa_list *wal)
 {
struct drm_i915_private *i915 = engine->i915;
 
-   if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
+   if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0) ||
IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_A0)) {
/*
 * Wa_1607138336:tgl[a0],dg1[a0]
@@ -1638,7 +1638,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct 
i915_wa_list *wal)
}
 
if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) ||
-   IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
+   IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0) ||
IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
/* Wa_1409804808:tgl,rkl,dg1[a0],adl-s,adl-p */
wa_masked_en(wal, GEN7_ROW_CHICKEN2,
@@ -1652,7 +1652,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct 
i915_wa_list *wal)
}
 
 
-   if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
+   if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0) ||
IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
/*
 * Wa_1607030317:tgl
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9195131cf90f..d462b9434541 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1323,19 +1323,10 @@ static inline struct drm_i915_private 
*pdev_to_i915(struct pci_dev *pdev)
 #define IS_DISPLAY_VER(i915, from, until) \
(DISPLAY_VER(i915) >= (from) && DISPLAY_VER(i915) <= (until))
 
-#define REVID_FOREVER  0xff
 #define INTEL_REVID(dev_priv)  (to_pci_dev((dev_priv)->drm.d

Re: [Intel-gfx] [PATCH v3 10/12] drm/i915/dg1: Use revid->stepping tables

2021-07-13 Thread Srivatsa, Anusha



> -Original Message-
> From: Roper, Matthew D 
> Sent: Tuesday, July 13, 2021 10:29 AM
> To: intel-gfx@lists.freedesktop.org
> Cc: Srivatsa, Anusha ; Roper, Matthew D
> 
> Subject: [PATCH v3 10/12] drm/i915/dg1: Use revid->stepping tables
> 
> Switch DG1 to use a revid->stepping table as we're trying to do on all
> platforms going forward.
> 
> This removes the last use of IS_REVID() and REVID_FOREVER, so remove
> those now-unused macros as well to prevent their accidental use on future
> platforms.
> 
> v2:
>  - Use COMMON_STEPPING() macro in table.  (Anusha)
> 
> Bspec: 44463
> Cc: Anusha Srivatsa 
> Signed-off-by: Matt Roper 
Reviewed-by: Anusha Srivatsa 

> ---
>  .../gpu/drm/i915/display/intel_display_power.c |  2 +-
>  drivers/gpu/drm/i915/gt/intel_region_lmem.c|  2 +-
>  drivers/gpu/drm/i915/gt/intel_workarounds.c| 10 +-
>  drivers/gpu/drm/i915/i915_drv.h| 18 --
>  drivers/gpu/drm/i915/intel_pm.c|  2 +-
>  drivers/gpu/drm/i915/intel_step.c  |  8 
>  6 files changed, 20 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c
> b/drivers/gpu/drm/i915/display/intel_display_power.c
> index 285380079aab..975a7e25cea5 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> @@ -5799,7 +5799,7 @@ static void tgl_bw_buddy_init(struct
> drm_i915_private *dev_priv)
>   int config, i;
> 
>   if (IS_ALDERLAKE_S(dev_priv) ||
> - IS_DG1_REVID(dev_priv, DG1_REVID_A0, DG1_REVID_A0) ||
> + IS_DG1_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) ||
>   IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_B0))
>   /* Wa_1409767108:tgl,dg1,adl-s */
>   table = wa_1409767108_buddy_page_masks; diff --git
> a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> index 1f43aba2e9e2..50d11a84e7a9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> +++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> @@ -157,7 +157,7 @@ intel_gt_setup_fake_lmem(struct intel_gt *gt)  static
> bool get_legacy_lowmem_region(struct intel_uncore *uncore,
>u64 *start, u32 *size)
>  {
> - if (!IS_DG1_REVID(uncore->i915, DG1_REVID_A0, DG1_REVID_B0))
> + if (!IS_DG1_GT_STEP(uncore->i915, STEP_A0, STEP_B0))
>   return false;
> 
>   *start = 0;
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 4c0c15bbdac2..62321e9149db 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -,7 +,7 @@ dg1_gt_workarounds_init(struct drm_i915_private
> *i915, struct i915_wa_list *wal)
>   gen12_gt_workarounds_init(i915, wal);
> 
>   /* Wa_1607087056:dg1 */
> - if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0))
> + if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0))
>   wa_write_or(wal,
>   SLICE_UNIT_LEVEL_CLKGATE,
>   L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS); @@ -
> 1522,7 +1522,7 @@ static void dg1_whitelist_build(struct intel_engine_cs
> *engine)
>   tgl_whitelist_build(engine);
> 
>   /* GEN:BUG:1409280441:dg1 */
> - if (IS_DG1_REVID(engine->i915, DG1_REVID_A0, DG1_REVID_A0) &&
> + if (IS_DG1_GT_STEP(engine->i915, STEP_A0, STEP_A0) &&
>   (engine->class == RENDER_CLASS ||
>engine->class == COPY_ENGINE_CLASS))
>   whitelist_reg_ext(w, RING_ID(engine->mmio_base), @@ -
> 1592,7 +1592,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine,
> struct i915_wa_list *wal)  {
>   struct drm_i915_private *i915 = engine->i915;
> 
> - if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
> + if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0) ||
>   IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_A0)) {
>   /*
>* Wa_1607138336:tgl[a0],dg1[a0]
> @@ -1638,7 +1638,7 @@ rcs_engine_wa_init(struct intel_engine_cs
> *engine, struct i915_wa_list *wal)
>   }
> 
>   if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) ||
> - IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
> + IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0) ||
>   IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
>   /* Wa_1409804808:tgl,rkl,dg1[a0],adl-s,adl-p */
>   wa_masked_en(wal, GEN7_ROW_CHICKEN2,
> @@ -1652,7 +1652,7 @@ rcs_engine_wa_init(struct intel_engine_cs
> *engine, struct i915_wa_list *wal)
>   }
> 
> 
> - if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
> + if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0) ||
>   IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
>   /*
>* Wa_1607030317:tgl
> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> b/drivers/gpu/drm/i915/i915_drv.h index 9195131cf90f..d462b9434541
> 10

Re: [Intel-gfx] [PATCH v4 14/18] drm/msm: Don't break exclusive fence ordering

2021-07-13 Thread Rob Clark
On Tue, Jul 13, 2021 at 9:58 AM Daniel Vetter  wrote:
>
> On Tue, Jul 13, 2021 at 6:51 PM Rob Clark  wrote:
> >
> > On Mon, Jul 12, 2021 at 1:02 PM Daniel Vetter  
> > wrote:
> > >
> > > There's only one exclusive slot, and we must not break the ordering.
> > >
> > > Adding a new exclusive fence drops all previous fences from the
> > > dma_resv. To avoid violating the signalling order we err on the side of
> > > over-synchronizing by waiting for the existing fences, even if
> > > userspace asked us to ignore them.
> > >
> > > > A better fix would be to use a dma_fence_chain or _array like e.g.
> > > amdgpu now uses, but
> > > - msm has a synchronous dma_fence_wait for anything from another
> > >   context, so doesn't seem to care much,
> > > - and it probably makes sense to lift this into dma-resv.c code as a
> > >   proper concept, so that drivers don't have to hack up their own
> > >   solution each on their own.
> > >
> > > v2: Improve commit message per Lucas' suggestion.
> > >
> > > Cc: Lucas Stach 
> > > Signed-off-by: Daniel Vetter 
> > > Cc: Rob Clark 
> > > Cc: Sean Paul 
> > > Cc: linux-arm-...@vger.kernel.org
> > > Cc: freedr...@lists.freedesktop.org
> > > ---
> > >  drivers/gpu/drm/msm/msm_gem_submit.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
> > > b/drivers/gpu/drm/msm/msm_gem_submit.c
> > > index b71da71a3dd8..edd0051d849f 100644
> > > --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> > > +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> > > @@ -306,7 +306,8 @@ static int submit_fence_sync(struct msm_gem_submit 
> > > *submit, bool no_implicit)
> > > return ret;
> > > }
> > >
> > > -   if (no_implicit)
> > > +   /* exclusive fences must be ordered */
> > > +   if (no_implicit && !write)
> > > continue;
> >
> > In practice, modern userspace (the kind that is more likely to set the
> > no-implicit flag on every submit) also sets MSM_SUBMIT_BO_WRITE on
> > every bo, to shave some cpu overhead so I suppose this would not
> > really hurt anything
> >
> > Do you know if this is covered in any piglit/etc test?
>
> You need some command submission, plus buffer sharing with vgem
> setting its own exclusive fences, plus checking with dma_buf poll()
> whether it signals all in the right order. That's pretty low-level, so
> maybe something in igt, but I haven't typed that. Maybe I need to do
> that for i915 at least.

ok, you lost me at vgem ;-)

(the vgem vs cache situation on arm is kinda hopeless)

BR,
-R

> -Daniel
>
> > BR,
> > -R
> >
> > >
> > > ret = msm_gem_sync_object(&msm_obj->base, 
> > > submit->ring->fctx,
> > > --
> > > 2.32.0
> > >
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH v4 14/18] drm/msm: Don't break exclusive fence ordering

2021-07-13 Thread Daniel Vetter
On Tue, Jul 13, 2021 at 7:42 PM Rob Clark  wrote:
> On Tue, Jul 13, 2021 at 9:58 AM Daniel Vetter  wrote:
> >
> > On Tue, Jul 13, 2021 at 6:51 PM Rob Clark  wrote:
> > >
> > > On Mon, Jul 12, 2021 at 1:02 PM Daniel Vetter  
> > > wrote:
> > > >
> > > > There's only one exclusive slot, and we must not break the ordering.
> > > >
> > > > Adding a new exclusive fence drops all previous fences from the
> > > > dma_resv. To avoid violating the signalling order we err on the side of
> > > > over-synchronizing by waiting for the existing fences, even if
> > > > userspace asked us to ignore them.
> > > >
> > > > > A better fix would be to use a dma_fence_chain or _array like e.g.
> > > > amdgpu now uses, but
> > > > - msm has a synchronous dma_fence_wait for anything from another
> > > >   context, so doesn't seem to care much,
> > > > - and it probably makes sense to lift this into dma-resv.c code as a
> > > >   proper concept, so that drivers don't have to hack up their own
> > > >   solution each on their own.
> > > >
> > > > v2: Improve commit message per Lucas' suggestion.
> > > >
> > > > Cc: Lucas Stach 
> > > > Signed-off-by: Daniel Vetter 
> > > > Cc: Rob Clark 
> > > > Cc: Sean Paul 
> > > > Cc: linux-arm-...@vger.kernel.org
> > > > Cc: freedr...@lists.freedesktop.org
> > > > ---
> > > >  drivers/gpu/drm/msm/msm_gem_submit.c | 3 ++-
> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
> > > > b/drivers/gpu/drm/msm/msm_gem_submit.c
> > > > index b71da71a3dd8..edd0051d849f 100644
> > > > --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> > > > +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> > > > @@ -306,7 +306,8 @@ static int submit_fence_sync(struct msm_gem_submit 
> > > > *submit, bool no_implicit)
> > > > return ret;
> > > > }
> > > >
> > > > -   if (no_implicit)
> > > > +   /* exclusive fences must be ordered */
> > > > +   if (no_implicit && !write)
> > > > continue;
> > >
> > > In practice, modern userspace (the kind that is more likely to set the
> > > no-implicit flag on every submit) also sets MSM_SUBMIT_BO_WRITE on
> > > every bo, to shave some cpu overhead so I suppose this would not
> > > really hurt anything
> > >
> > > Do you know if this is covered in any piglit/etc test?
> >
> > You need some command submission, plus buffer sharing with vgem
> > > setting its own exclusive fences, plus checking with dma_buf poll()
> > whether it signals all in the right order. That's pretty low-level, so
> > maybe something in igt, but I haven't typed that. Maybe I need to do
> > that for i915 at least.
>
> ok, you lost me at vgem ;-)
>
> (the vgem vs cache situation on arm is kinda hopeless)

Oh that explains a few things ... I just found out why vgem is failing
for wc buffers on x86 (on some of our less-coherent igpu at least),
and wondered how the heck this works on arm. Sounds like it just
doesn't :-/

On the testcase: You'd never actually check buffer contents, only
fences, so the test would still work.
-Daniel
>
> BR,
> -R
>
> > -Daniel
> >
> > > BR,
> > > -R
> > >
> > > >
> > > > ret = msm_gem_sync_object(&msm_obj->base, 
> > > > submit->ring->fctx,
> > > > --
> > > > 2.32.0
> > > >
> >
> >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
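
For readers following the thread, the condition the patch changes can be modelled on its own. This is a minimal sketch, not the msm code: the helper name must_sync_implicit() is hypothetical, and only the no_implicit and write flags come from the quoted hunk.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not part of msm): returns true when a submit
 * must wait on the fences already attached to a buffer's dma_resv.
 *
 * no_implicit: userspace asked to skip implicit sync for this submit.
 * write:       the submit will install a new exclusive fence on the bo.
 */
static bool must_sync_implicit(bool no_implicit, bool write)
{
	/*
	 * Before the patch the check was "if (no_implicit) continue;",
	 * so every bo was skipped.  After the patch only the case where
	 * no exclusive fence gets replaced is skipped.
	 */
	return !(no_implicit && !write);
}
```

With this model, a no-implicit submit that writes to a bo still synchronizes, which is the over-synchronization the commit message accepts in exchange for keeping the exclusive fence ordered.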


Re: [Intel-gfx] [PATCH 1/5] drm/i915: document caching related bits

2021-07-13 Thread Ville Syrjälä
On Tue, Jul 13, 2021 at 05:13:37PM +0100, Matthew Auld wrote:
> On Tue, 13 Jul 2021 at 16:55, Ville Syrjälä
>  wrote:
> >
> > On Tue, Jul 13, 2021 at 11:45:50AM +0100, Matthew Auld wrote:
> > > + /**
> > > +  * @cache_coherent:
> > > +  *
> > > +  * Track whether the pages are coherent with the GPU if reading or
> > > +  * writing through the CPU cache.
> > > +  *
> > > +  * This largely depends on the @cache_level, for example if the 
> > > object
> > > +  * is marked as I915_CACHE_LLC, then GPU access is coherent for both
> > > +  * reads and writes through the CPU cache.
> > > +  *
> > > +  * Note that on platforms with shared-LLC support(HAS_LLC) reads 
> > > through
> > > +  * the CPU cache are always coherent, regardless of the 
> > > @cache_level. On
> > > +  * snooping based platforms this is not the case, unless the full
> > > +  * I915_CACHE_LLC or similar setting is used.
> > > +  *
> > > +  * As a result of this we need to track coherency separately for 
> > > reads
> > > +  * and writes, in order to avoid superfluous flushing on shared-LLC
> > > +  * platforms, for reads.
> > > +  *
> > > +  * I915_BO_CACHE_COHERENT_FOR_READ:
> > > +  *
> > > +  * When reading through the CPU cache, the GPU is still coherent. 
> > > Note
> > > +  * that no data has actually been modified here, so it might seem
> > > +  * strange that we care about this.
> > > +  *
> > > +  * As an example, if some object is mapped on the CPU with 
> > > write-back
> > > +  * caching, and we read some page, then the cache likely now 
> > > contains
> > > +  * the data from that read. At this point the cache and main memory
> > > +  * match up, so all good. But next the GPU needs to write some data 
> > > to
> > > +  * that same page. Now if the @cache_level is I915_CACHE_NONE and 
> > > the
> > > > +  * platform doesn't have the shared-LLC, then the GPU will
> > > +  * effectively skip invalidating the cache(or however that works
> > > +  * internally) when writing the new value.  This is really bad 
> > > since the
> > > +  * GPU has just written some new data to main memory, but the CPU 
> > > cache
> > > +  * is still valid and now contains stale data. As a result the next 
> > > time
> > > +  * we do a cached read with the CPU, we are rewarded with stale 
> > > data.
> > > +  * Likewise if the cache is later flushed, we might be rewarded with
> > > +  * overwriting main memory with stale data.
> > > +  *
> > > +  * I915_BO_CACHE_COHERENT_FOR_WRITE:
> > > +  *
> > > +  * When writing through the CPU cache, the GPU is still coherent. 
> > > Note
> > > +  * that this also implies I915_BO_CACHE_COHERENT_FOR_READ.
> > > +  *
> > > +  * This is never set when I915_CACHE_NONE is used for @cache_level,
> > > +  * where instead we have to manually flush the caches after writing
> > > +  * through the CPU cache. For other cache levels this should be set 
> > > and
> > > +  * the object is therefore considered coherent for both reads and 
> > > writes
> > > +  * through the CPU cache.
> >
> > I don't remember why we have this read vs. write split and this new
> > documentation doesn't seem to really explain it either.
> 
> Hmm, I attempted to explain that earlier:
> 
> * Note that on platforms with shared-LLC support(HAS_LLC) reads through
> * the CPU cache are always coherent, regardless of the @cache_level. On
> * snooping based platforms this is not the case, unless the full
> * I915_CACHE_LLC or similar setting is used.
> *
> * As a result of this we need to track coherency separately for reads
> * and writes, in order to avoid superfluous flushing on shared-LLC
> * platforms, for reads.
> 
> So AFAIK it's just because shared-LLC can be coherent for reads, while
> also not being coherent for writes(CACHE_NONE),

CPU vs. GPU is fully coherent when it comes to LLC. Or at least I've
never heard of any mechanism that would make it only partially coherent.

-- 
Ville Syrjälä
Intel


Re: [Intel-gfx] [PATCH v2 03/12] drm/i915/skl: Use revid->stepping tables

2021-07-13 Thread Lucas De Marchi

On Fri, Jul 09, 2021 at 08:37:15PM -0700, Matt Roper wrote:

Switch SKL to use a revid->stepping table as we're trying to do on all
platforms going forward.  Also drop the preproduction revisions and add
the newer steppings we hadn't already handled.

Note that SKL has a case where a newer revision ID corresponds to an
older GT/disp stepping (0x9 -> STEP_J0, 0xA -> STEP_I1).  Also, the lack
of a revision ID 0x8 in the table is intentional and not an oversight.
We'll re-write the KBL-specific comment to make it clear that these kinds
of quirks are expected.

v2:
- Since GT and display steppings are always identical on SKL use a
  macro to set both values at once in a more readable manner.  (Anusha)
- Drop preproduction steppings.

Bspec: 13626
Cc: Anusha Srivatsa 
Signed-off-by: Matt Roper 
---
drivers/gpu/drm/i915/gt/intel_workarounds.c |  2 +-
drivers/gpu/drm/i915/i915_drv.h | 11 +---
drivers/gpu/drm/i915/intel_step.c   | 30 +
drivers/gpu/drm/i915/intel_step.h   |  4 +++
4 files changed, 31 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index d9a5a445ceec..6dfd564e078f 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -883,7 +883,7 @@ skl_gt_workarounds_init(struct drm_i915_private *i915, 
struct i915_wa_list *wal)
GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE);

/* WaInPlaceDecompressionHang:skl */
-   if (IS_SKL_REVID(i915, SKL_REVID_H0, REVID_FOREVER))
+   if (IS_SKL_GT_STEP(i915, STEP_H0, STEP_FOREVER))
wa_write_or(wal,
GEN9_GAMT_ECO_REG_RW_IA,
GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c4747f4407ef..f30499ed6787 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1515,16 +1515,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
#define IS_TGL_Y(dev_priv) \
IS_SUBPLATFORM(dev_priv, INTEL_TIGERLAKE, INTEL_SUBPLATFORM_ULX)

-#define SKL_REVID_A0   0x0
-#define SKL_REVID_B0   0x1
-#define SKL_REVID_C0   0x2
-#define SKL_REVID_D0   0x3
-#define SKL_REVID_E0   0x4
-#define SKL_REVID_F0   0x5
-#define SKL_REVID_G0   0x6
-#define SKL_REVID_H0   0x7
-
-#define IS_SKL_REVID(p, since, until) (IS_SKYLAKE(p) && IS_REVID(p, since, 
until))
+#define IS_SKL_GT_STEP(p, since, until) (IS_SKYLAKE(p) && IS_GT_STEP(p, since, 
until))

#define BXT_REVID_A0   0x0
#define BXT_REVID_A1   0x1
diff --git a/drivers/gpu/drm/i915/intel_step.c 
b/drivers/gpu/drm/i915/intel_step.c
index 93ccd42f2514..69c928b046e8 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -7,14 +7,31 @@
#include "intel_step.h"

/*
- * KBL revision ID ordering is bizarre; higher revision ID's map to lower
- * steppings in some cases.  So rather than test against the revision ID
- * directly, let's map that into our own range of increasing ID's that we
- * can test against in a regular manner.
+ * Some platforms have unusual ways of mapping PCI revision ID to GT/display
+ * steppings.  E.g., in some cases a higher PCI revision may translate to a
+ * lower stepping of the GT and/or display IP.  This file provides lookup
+ * tables to map the PCI revision into a standard set of stepping values that
+ * can be compared numerically.
+ *
+ * Also note that some revisions/steppings may have been set aside as
+ * placeholders but never materialized in real hardware; in those cases there
+ * may be jumps in the revision IDs or stepping values in the tables below.
 */

+/*
+ * Some platforms always have the same stepping value for GT and display;
+ * use a macro to define these to make it easier to identify the platforms
+ * where the two steppings can deviate.
+ */
+#define COMMON_STEPPING(x)  .gt_step = STEP_##x, .display_step = STEP_##x


nitpick:

"stepping" is the proper word, but we settled on "step"
everywhere: functions, macros, tables, filename etc. Can we
continue doing that?  For the comments I think it's ok to
continue using the proper word, but for real code I think it
would be better to keep it consistent

thanks
Lucas De Marchi
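
As an illustration of the table approach under discussion, here is a minimal standalone sketch. It is not the i915 code: the toy_ names are hypothetical, the macro is spelled COMMON_STEP only to show the naming suggested above, and the table holds just the SKL entries mentioned in this thread (0x6 and 0x7 from the removed SKL_REVID_* defines, 0x9 and 0xA from the commit message, with 0x8 left out on purpose).

```c
#include <stdbool.h>
#include <stddef.h>

/* Stepping values ordered so they can be compared numerically. */
enum toy_step {
	STEP_NONE = 0,	/* revid not present in the table */
	STEP_G0,
	STEP_H0,
	STEP_I1,
	STEP_J0,
	STEP_FOREVER,
};

struct toy_step_info {
	unsigned char gt_step;
	unsigned char display_step;
};

/* GT and display steppings are identical on this platform. */
#define COMMON_STEP(x) .gt_step = STEP_##x, .display_step = STEP_##x

/* Indexed by PCI revision ID; gaps default to STEP_NONE. */
static const struct toy_step_info toy_skl_revids[] = {
	[0x6] = { COMMON_STEP(G0) },
	[0x7] = { COMMON_STEP(H0) },
	[0x9] = { COMMON_STEP(J0) },	/* higher revid ... */
	[0xA] = { COMMON_STEP(I1) },	/* ... maps to a lower stepping */
};

static enum toy_step toy_gt_step(unsigned int revid)
{
	size_t n = sizeof(toy_skl_revids) / sizeof(toy_skl_revids[0]);

	if (revid >= n)
		return STEP_NONE;
	return (enum toy_step)toy_skl_revids[revid].gt_step;
}

/* Inclusive range check on the looked-up stepping (in this sketch). */
static bool toy_is_gt_step(unsigned int revid, enum toy_step since,
			   enum toy_step until)
{
	enum toy_step s = toy_gt_step(revid);

	return s != STEP_NONE && s >= since && s <= until;
}
```

Because the range check runs on the looked-up stepping rather than on the raw revision ID, revid 0xA compares lower than revid 0x9, which a direct revid comparison would get wrong.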


Re: [Intel-gfx] [PATCH v3 3/5] drm/i915/uapi: reject caching ioctls for discrete

2021-07-13 Thread Kenneth Graunke
On Monday, July 5, 2021 6:53:08 AM PDT Matthew Auld wrote:
> It's a noop on DG1, and in the future when we need to support other devices
> which let us control the coherency, then it should be an immutable
> creation time property for the BO. This will likely be controlled
> through a new gem_create_ext extension.
> 
> v2: add some kernel doc for the discrete changes, and document the
> implicit rules
> 
> Suggested-by: Daniel Vetter 
> Signed-off-by: Matthew Auld 
> Cc: Thomas Hellström 
> Cc: Maarten Lankhorst 
> Cc: Tvrtko Ursulin 
> Cc: Jordan Justen 
> Cc: Kenneth Graunke 
> Cc: Jason Ekstrand 
> Cc: Daniel Vetter 
> Cc: Ramalingam C 
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_domain.c |  6 +
>  include/uapi/drm/i915_drm.h| 29 +-
>  2 files changed, 34 insertions(+), 1 deletion(-)

This caching ioctl patch is:

Reviewed-by: Kenneth Graunke 




Re: [Intel-gfx] [PATCH 1/5] drm/i915: document caching related bits

2021-07-13 Thread Matthew Auld
On Tue, 13 Jul 2021 at 18:47, Ville Syrjälä
 wrote:
>
> On Tue, Jul 13, 2021 at 05:13:37PM +0100, Matthew Auld wrote:
> > On Tue, 13 Jul 2021 at 16:55, Ville Syrjälä
> >  wrote:
> > >
> > > On Tue, Jul 13, 2021 at 11:45:50AM +0100, Matthew Auld wrote:
> > > > + /**
> > > > +  * @cache_coherent:
> > > > +  *
> > > > +  * Track whether the pages are coherent with the GPU if reading or
> > > > +  * writing through the CPU cache.
> > > > +  *
> > > > +  * This largely depends on the @cache_level, for example if the 
> > > > object
> > > > +  * is marked as I915_CACHE_LLC, then GPU access is coherent for 
> > > > both
> > > > +  * reads and writes through the CPU cache.
> > > > +  *
> > > > +  * Note that on platforms with shared-LLC support(HAS_LLC) reads 
> > > > through
> > > > +  * the CPU cache are always coherent, regardless of the 
> > > > @cache_level. On
> > > > +  * snooping based platforms this is not the case, unless the full
> > > > +  * I915_CACHE_LLC or similar setting is used.
> > > > +  *
> > > > +  * As a result of this we need to track coherency separately for 
> > > > reads
> > > > +  * and writes, in order to avoid superfluous flushing on 
> > > > shared-LLC
> > > > +  * platforms, for reads.
> > > > +  *
> > > > +  * I915_BO_CACHE_COHERENT_FOR_READ:
> > > > +  *
> > > > +  * When reading through the CPU cache, the GPU is still coherent. 
> > > > Note
> > > > +  * that no data has actually been modified here, so it might seem
> > > > +  * strange that we care about this.
> > > > +  *
> > > > +  * As an example, if some object is mapped on the CPU with 
> > > > write-back
> > > > +  * caching, and we read some page, then the cache likely now 
> > > > contains
> > > > +  * the data from that read. At this point the cache and main 
> > > > memory
> > > > +  * match up, so all good. But next the GPU needs to write some 
> > > > data to
> > > > +  * that same page. Now if the @cache_level is I915_CACHE_NONE and 
> > > > the
> > > > > +  * platform doesn't have the shared-LLC, then the GPU will
> > > > +  * effectively skip invalidating the cache(or however that works
> > > > +  * internally) when writing the new value.  This is really bad 
> > > > since the
> > > > +  * GPU has just written some new data to main memory, but the CPU 
> > > > cache
> > > > +  * is still valid and now contains stale data. As a result the 
> > > > next time
> > > > +  * we do a cached read with the CPU, we are rewarded with stale 
> > > > data.
> > > > +  * Likewise if the cache is later flushed, we might be rewarded 
> > > > with
> > > > +  * overwriting main memory with stale data.
> > > > +  *
> > > > +  * I915_BO_CACHE_COHERENT_FOR_WRITE:
> > > > +  *
> > > > +  * When writing through the CPU cache, the GPU is still coherent. 
> > > > Note
> > > > +  * that this also implies I915_BO_CACHE_COHERENT_FOR_READ.
> > > > +  *
> > > > +  * This is never set when I915_CACHE_NONE is used for 
> > > > @cache_level,
> > > > +  * where instead we have to manually flush the caches after 
> > > > writing
> > > > +  * through the CPU cache. For other cache levels this should be 
> > > > set and
> > > > +  * the object is therefore considered coherent for both reads and 
> > > > writes
> > > > +  * through the CPU cache.
> > >
> > > I don't remember why we have this read vs. write split and this new
> > > documentation doesn't seem to really explain it either.
> >
> > Hmm, I attempted to explain that earlier:
> >
> > * Note that on platforms with shared-LLC support(HAS_LLC) reads through
> > * the CPU cache are always coherent, regardless of the @cache_level. On
> > * snooping based platforms this is not the case, unless the full
> > * I915_CACHE_LLC or similar setting is used.
> > *
> > * As a result of this we need to track coherency separately for reads
> > * and writes, in order to avoid superfluous flushing on shared-LLC
> > * platforms, for reads.
> >
> > So AFAIK it's just because shared-LLC can be coherent for reads, while
> > also not being coherent for writes(CACHE_NONE),
>
> CPU vs. GPU is fully coherent when it comes to LLC. Or at least I've
> never heard of any mechanism that would make it only partially coherent.

What do you mean by "comes to LLC", are you talking about HAS_LLC() or
I915_CACHE_LLC?

If you set I915_CACHE_LLC, then yes it is fully coherent for both
HAS_LLC() and HAS_SNOOP().

If you set I915_CACHE_NONE, then reads are still coherent on
HAS_LLC(), for HAS_SNOOP() they are not. Or at least that is the
existing behaviour in the driver AFAIK.
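
To make the three cases concrete, here is a small standalone model of the behaviour described above. It is only a sketch of what the driver is said to do, not a statement about the hardware, and every toy_/TOY_ name is made up for the example.

```c
#include <stdbool.h>

/* Stand-ins for I915_CACHE_NONE and "I915_CACHE_LLC or similar". */
enum toy_cache_level { TOY_CACHE_NONE, TOY_CACHE_LLC };

/* Stand-ins for I915_BO_CACHE_COHERENT_FOR_READ / _FOR_WRITE. */
#define TOY_COHERENT_FOR_READ	(1u << 0)
#define TOY_COHERENT_FOR_WRITE	(1u << 1)

/*
 * has_llc models HAS_LLC(); a platform without it is treated as the
 * snooping case.
 */
static unsigned int toy_cache_coherent(enum toy_cache_level level,
				       bool has_llc)
{
	/* Any cached level: coherent for reads and writes everywhere. */
	if (level != TOY_CACHE_NONE)
		return TOY_COHERENT_FOR_READ | TOY_COHERENT_FOR_WRITE;

	/* CACHE_NONE on shared-LLC: reads stay coherent, writes do not. */
	if (has_llc)
		return TOY_COHERENT_FOR_READ;

	/* CACHE_NONE on a snooping platform: neither. */
	return 0;
}
```

The middle case is the only one where the two flags differ, which is the reason given for tracking reads and writes separately.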

>
> --
> Ville Syrjälä
> Intel


Re: [Intel-gfx] [PATCH 21/47] drm/i915/guc: Ensure G2H response has space in buffer

2021-07-13 Thread John Harrison

On 6/24/2021 00:04, Matthew Brost wrote:

Ensure G2H response has space in the buffer before sending H2G CTB as
the GuC can't handle any backpressure on the G2H interface.

Signed-off-by: John Harrison 
Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/gt/uc/intel_guc.h| 13 +++-
  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 76 +++
  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  4 +-
  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  4 +
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 13 ++--
  5 files changed, 87 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index b43ec56986b5..24e7a924134e 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -95,11 +95,17 @@ inline int intel_guc_send(struct intel_guc *guc, const u32 
*action, u32 len)
  }
  
  #define INTEL_GUC_SEND_NB		BIT(31)

+#define INTEL_GUC_SEND_G2H_DW_SHIFT0
+#define INTEL_GUC_SEND_G2H_DW_MASK (0xff << INTEL_GUC_SEND_G2H_DW_SHIFT)
+#define MAKE_SEND_FLAGS(len) \
+   ({GEM_BUG_ON(!FIELD_FIT(INTEL_GUC_SEND_G2H_DW_MASK, len)); \
+   (FIELD_PREP(INTEL_GUC_SEND_G2H_DW_MASK, len) | INTEL_GUC_SEND_NB);})
  static
-inline int intel_guc_send_nb(struct intel_guc *guc, const u32 *action, u32 len)
+inline int intel_guc_send_nb(struct intel_guc *guc, const u32 *action, u32 len,
+u32 g2h_len_dw)
  {
return intel_guc_ct_send(&guc->ct, action, len, NULL, 0,
-INTEL_GUC_SEND_NB);
+MAKE_SEND_FLAGS(g2h_len_dw));
  }
  
  static inline int

@@ -113,6 +119,7 @@ intel_guc_send_and_receive(struct intel_guc *guc, const u32 
*action, u32 len,
  static inline int intel_guc_send_busy_loop(struct intel_guc* guc,
   const u32 *action,
   u32 len,
+  u32 g2h_len_dw,
   bool loop)
  {
int err;
@@ -121,7 +128,7 @@ static inline int intel_guc_send_busy_loop(struct 
intel_guc* guc,
might_sleep_if(loop && (!in_atomic() && !irqs_disabled()));
  
  retry:

-   err = intel_guc_send_nb(guc, action, len);
+   err = intel_guc_send_nb(guc, action, len, g2h_len_dw);
if (unlikely(err == -EBUSY && loop)) {
if (likely(!in_atomic() && !irqs_disabled()))
cond_resched();
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 7491f041859e..a60970e85635 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -73,6 +73,7 @@ static inline struct drm_device *ct_to_drm(struct 
intel_guc_ct *ct)
  #define CTB_DESC_SIZE ALIGN(sizeof(struct guc_ct_buffer_desc), SZ_2K)
  #define CTB_H2G_BUFFER_SIZE   (SZ_4K)
  #define CTB_G2H_BUFFER_SIZE   (4 * CTB_H2G_BUFFER_SIZE)
+#define G2H_ROOM_BUFFER_SIZE   (PAGE_SIZE)
Any particular reason why PAGE_SIZE instead of SZ_4K? I'm not seeing 
anything in the code that is actually related to page sizes. Seems like 
'(CTB_G2H_BUFFER_SIZE / 4)' would be a more correct way to express it. 
Unless I'm missing something about how it's used?


John.
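
For context on the sizes involved, here is a minimal sketch of the space calculation that guc_ct_buffer_reset() performs in the hunk below, assuming the same arithmetic as CIRC_CNT()/CIRC_SPACE() from include/linux/circ_buf.h. The toy_ names are hypothetical, and which buffer actually receives the reserve is not visible in the quoted text, so the 16K/4K pairing in the usage note is an assumption.

```c
#include <stdint.h>

/* Same arithmetic as the kernel macros; size must be a power of two. */
#define TOY_CIRC_CNT(head, tail, size) \
	(((head) - (tail)) & ((size) - 1))
#define TOY_CIRC_SPACE(head, tail, size) \
	TOY_CIRC_CNT((tail), ((head) + 1), (size))

/*
 * Free space, in dwords, of a freshly reset buffer after holding back
 * a reserved region.  Both arguments are byte counts, converted to
 * dwords the same way the patch does (divide by 4).
 */
static uint32_t toy_ctb_space_after_reset(uint32_t size_in_bytes,
					  uint32_t resv_in_bytes)
{
	uint32_t size = size_in_bytes / 4;
	uint32_t resv = resv_in_bytes / 4;
	uint32_t tail = 0, head = 0;

	return TOY_CIRC_SPACE(tail, head, size) - resv;
}
```

A 16K buffer with a 4K reserve leaves 4095 - 1024 = 3071 dwords, and the result is the same whether the reserve is written as PAGE_SIZE on a 4K-page system, SZ_4K, or a quarter of the buffer size; only the first of those would change on a kernel built with larger pages.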


  
  struct ct_request {

struct list_head link;
@@ -129,23 +130,27 @@ static void guc_ct_buffer_desc_init(struct 
guc_ct_buffer_desc *desc)
  
  static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb)

  {
+   u32 space;
+
ctb->broken = false;
ctb->tail = 0;
ctb->head = 0;
-   ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size);
+   space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size) - ctb->resv_space;
+   atomic_set(&ctb->space, space);
  
  	guc_ct_buffer_desc_init(ctb->desc);

  }
  
  static void guc_ct_buffer_init(struct intel_guc_ct_buffer *ctb,

   struct guc_ct_buffer_desc *desc,
-  u32 *cmds, u32 size_in_bytes)
+  u32 *cmds, u32 size_in_bytes, u32 resv_space)
  {
GEM_BUG_ON(size_in_bytes % 4);
  
  	ctb->desc = desc;

ctb->cmds = cmds;
ctb->size = size_in_bytes / 4;
+   ctb->resv_space = resv_space / 4;
  
  	guc_ct_buffer_reset(ctb);

  }
@@ -226,6 +231,7 @@ int intel_guc_ct_init(struct intel_guc_ct *ct)
struct guc_ct_buffer_desc *desc;
u32 blob_size;
u32 cmds_size;
+   u32 resv_space;
void *blob;
u32 *cmds;
int err;
@@ -250,19 +256,23 @@ int intel_guc_ct_init(struct intel_guc_ct *ct)
desc = blob;
cmds = blob + 2 * CTB_DESC_SIZE;
cmds_size = CTB_H2G_BUFFER_SIZE;
-   CT_DEBUG(ct, "%s desc %#tx cmds %#tx size %u\n", "send",
-ptrdiff(desc, blob), ptrdiff(cmds, blob), cmds_size);
+   resv_space = 0;
+   CT_DEBUG(ct, "%s desc %#tx cmds %#tx 

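The reserved-space arithmetic in guc_ct_buffer_reset() above, and the review question about PAGE_SIZE versus SZ_4K, can be illustrated with a small standalone sketch. It only mirrors the CIRC_CNT()/CIRC_SPACE() arithmetic from <linux/circ_buf.h>. The helper name ctb_initial_space() is hypothetical, 4 KiB pages are assumed, G2H_ROOM_BUFFER_SIZE is written the way the review suggests (CTB_G2H_BUFFER_SIZE / 4), and the truncated diff does not show which reservation the G2H buffer actually receives.

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as the macros in <linux/circ_buf.h>; size must be 2^n. */
#define CIRC_CNT(head, tail, size)   (((head) - (tail)) & ((size) - 1))
#define CIRC_SPACE(head, tail, size) CIRC_CNT((tail), ((head) + 1), (size))

/* Stand-in constants; the real ones live in the driver headers. */
#define SZ_4K                4096u
#define CTB_H2G_BUFFER_SIZE  (SZ_4K)
#define CTB_G2H_BUFFER_SIZE  (4 * CTB_H2G_BUFFER_SIZE)
/* Assumption: expressed as suggested in the review, not as in the patch. */
#define G2H_ROOM_BUFFER_SIZE (CTB_G2H_BUFFER_SIZE / 4)

/*
 * Hypothetical helper: free dwords in a freshly reset buffer after
 * holding back resv_bytes, following guc_ct_buffer_reset() in the patch.
 */
static uint32_t ctb_initial_space(uint32_t size_in_bytes, uint32_t resv_bytes)
{
	uint32_t size = size_in_bytes / 4;	/* dwords */
	uint32_t resv = resv_bytes / 4;		/* dwords */
	uint32_t tail = 0, head = 0;

	assert(size_in_bytes % 4 == 0);
	assert(size != 0 && (size & (size - 1)) == 0);

	/* One slot always stays empty, so an empty ring has size - 1 free. */
	return CIRC_SPACE(tail, head, size) - resv;
}
```

With these assumptions a 4K H2G buffer with no reservation starts with 1023 free dwords, and a 16K G2H buffer that holds back a quarter of itself starts with 3071. On a 4 KiB-page system PAGE_SIZE, SZ_4K and CTB_G2H_BUFFER_SIZE / 4 all give the same number, so the choice between them is about intent, not behaviour.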
[Intel-gfx] [PATCH] drm/i915: Fix wm params for ccs

2021-07-13 Thread Juha-Pekka Heikkila
skl_compute_plane_wm_params() didn't take CCS modifiers into account
on graphics ver >= 12.

Signed-off-by: Juha-Pekka Heikkila 
---
 drivers/gpu/drm/i915/intel_pm.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 0cbb79452fcf..540a7ecbf004 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -5249,11 +5249,9 @@ skl_compute_wm_params(const struct intel_crtc_state *crtc_state,
 
wp->y_tiled = modifier == I915_FORMAT_MOD_Y_TILED ||
  modifier == I915_FORMAT_MOD_Yf_TILED ||
- modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
- modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
+ is_ccs_modifier(modifier);
wp->x_tiled = modifier == I915_FORMAT_MOD_X_TILED;
-   wp->rc_surface = modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
-modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
+   wp->rc_surface = is_ccs_modifier(modifier);
wp->is_planar = intel_format_info_is_yuv_semiplanar(format, modifier);
 
wp->width = width;
-- 
2.28.0



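The effect of the CCS watermark patch above can be sketched in isolation. The enum values below are hypothetical stand-ins for the I915_FORMAT_MOD_* modifiers, and the body of is_ccs_modifier() is an assumption about which modifiers it covers (the patch only shows its call sites). The point of the sketch is just that a predicate keeps working when newer CCS modifiers are added, while the open-coded comparison silently misses them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the I915_FORMAT_MOD_* values. */
enum fb_modifier {
	MOD_LINEAR,
	MOD_X_TILED,
	MOD_Y_TILED,
	MOD_Yf_TILED,
	MOD_Y_TILED_CCS,
	MOD_Yf_TILED_CCS,
	MOD_Y_TILED_GEN12_RC_CCS,	/* ver >= 12 */
	MOD_Y_TILED_GEN12_MC_CCS,	/* ver >= 12 */
};

struct wm_params {
	bool x_tiled, y_tiled, rc_surface;
};

/* Assumed shape of the predicate: true for every CCS modifier. */
static bool is_ccs_modifier(enum fb_modifier m)
{
	return m == MOD_Y_TILED_CCS || m == MOD_Yf_TILED_CCS ||
	       m == MOD_Y_TILED_GEN12_RC_CCS || m == MOD_Y_TILED_GEN12_MC_CCS;
}

/* Before the patch: only the two older CCS modifiers are listed. */
static struct wm_params wm_params_old(enum fb_modifier m)
{
	struct wm_params wp = {
		.y_tiled = m == MOD_Y_TILED || m == MOD_Yf_TILED ||
			   m == MOD_Y_TILED_CCS || m == MOD_Yf_TILED_CCS,
		.x_tiled = m == MOD_X_TILED,
		.rc_surface = m == MOD_Y_TILED_CCS || m == MOD_Yf_TILED_CCS,
	};

	return wp;
}

/* After the patch: both fields defer to is_ccs_modifier(). */
static struct wm_params wm_params_new(enum fb_modifier m)
{
	struct wm_params wp = {
		.y_tiled = m == MOD_Y_TILED || m == MOD_Yf_TILED ||
			   is_ccs_modifier(m),
		.x_tiled = m == MOD_X_TILED,
		.rc_surface = is_ccs_modifier(m),
	};

	/* A surface cannot be X-tiled and Y-tiled at once. */
	assert(!(wp.x_tiled && wp.y_tiled));
	return wp;
}
```

In this sketch a gen12 render-compressed surface is reported as neither Y-tiled nor render-compressed by the old logic, and as both by the new one. The older modifiers give identical results either way.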
Re: [Intel-gfx] [PATCH 1/5] drm/i915: document caching related bits

2021-07-13 Thread Ville Syrjälä
On Tue, Jul 13, 2021 at 07:24:23PM +0100, Matthew Auld wrote:
> On Tue, 13 Jul 2021 at 18:47, Ville Syrjälä
>  wrote:
> >
> > On Tue, Jul 13, 2021 at 05:13:37PM +0100, Matthew Auld wrote:
> > > On Tue, 13 Jul 2021 at 16:55, Ville Syrjälä
> > >  wrote:
> > > >
> > > > On Tue, Jul 13, 2021 at 11:45:50AM +0100, Matthew Auld wrote:
> > > > > + /**
> > > > > +  * @cache_coherent:
> > > > > +  *
> > > > > +  * Track whether the pages are coherent with the GPU if reading 
> > > > > or
> > > > > +  * writing through the CPU cache.
> > > > > +  *
> > > > > +  * This largely depends on the @cache_level, for example if the 
> > > > > object
> > > > > +  * is marked as I915_CACHE_LLC, then GPU access is coherent for 
> > > > > both
> > > > > +  * reads and writes through the CPU cache.
> > > > > +  *
> > > > > +  * Note that on platforms with shared-LLC support(HAS_LLC) 
> > > > > reads through
> > > > > +  * the CPU cache are always coherent, regardless of the 
> > > > > @cache_level. On
> > > > > +  * snooping based platforms this is not the case, unless the 
> > > > > full
> > > > > +  * I915_CACHE_LLC or similar setting is used.
> > > > > +  *
> > > > > +  * As a result of this we need to track coherency separately 
> > > > > for reads
> > > > > +  * and writes, in order to avoid superfluous flushing on 
> > > > > shared-LLC
> > > > > +  * platforms, for reads.
> > > > > +  *
> > > > > +  * I915_BO_CACHE_COHERENT_FOR_READ:
> > > > > +  *
> > > > > +  * When reading through the CPU cache, the GPU is still 
> > > > > coherent. Note
> > > > > +  * that no data has actually been modified here, so it might 
> > > > > seem
> > > > > +  * strange that we care about this.
> > > > > +  *
> > > > > +  * As an example, if some object is mapped on the CPU with 
> > > > > write-back
> > > > > +  * caching, and we read some page, then the cache likely now 
> > > > > contains
> > > > > +  * the data from that read. At this point the cache and main 
> > > > > memory
> > > > > +  * match up, so all good. But next the GPU needs to write some 
> > > > > data to
> > > > > +  * that same page. Now if the @cache_level is I915_CACHE_NONE 
> > > > > and the
> > > > > +  * the platform doesn't have the shared-LLC, then the GPU will
> > > > > +  * effectively skip invalidating the cache(or however that works
> > > > > +  * internally) when writing the new value.  This is really bad 
> > > > > since the
> > > > > +  * GPU has just written some new data to main memory, but the 
> > > > > CPU cache
> > > > > +  * is still valid and now contains stale data. As a result the 
> > > > > next time
> > > > > +  * we do a cached read with the CPU, we are rewarded with stale 
> > > > > data.
> > > > > +  * Likewise if the cache is later flushed, we might be rewarded 
> > > > > with
> > > > > +  * overwriting main memory with stale data.
> > > > > +  *
> > > > > +  * I915_BO_CACHE_COHERENT_FOR_WRITE:
> > > > > +  *
> > > > > +  * When writing through the CPU cache, the GPU is still 
> > > > > coherent. Note
> > > > > +  * that this also implies I915_BO_CACHE_COHERENT_FOR_READ.
> > > > > +  *
> > > > > +  * This is never set when I915_CACHE_NONE is used for 
> > > > > @cache_level,
> > > > > +  * where instead we have to manually flush the caches after 
> > > > > writing
> > > > > +  * through the CPU cache. For other cache levels this should be 
> > > > > set and
> > > > > +  * the object is therefore considered coherent for both reads 
> > > > > and writes
> > > > > +  * through the CPU cache.
> > > >
> > > > I don't remember why we have this read vs. write split and this new
> > > > documentation doesn't seem to really explain it either.
> > >
> > > Hmm, I attempted to explain that earlier:
> > >
> > > * Note that on platforms with shared-LLC support(HAS_LLC) reads through
> > > * the CPU cache are always coherent, regardless of the @cache_level. On
> > > * snooping based platforms this is not the case, unless the full
> > > * I915_CACHE_LLC or similar setting is used.
> > > *
> > > * As a result of this we need to track coherency separately for reads
> > > * and writes, in order to avoid superfluous flushing on shared-LLC
> > > * platforms, for reads.
> > >
> > > So AFAIK it's just because shared-LLC can be coherent for reads, while
> > > also not being coherent for writes(CACHE_NONE),
> >
> > CPU vs. GPU is fully coherent when it comes to LLC. Or at least I've
> > never heard of any mechanism that would make it only partially coherent.
> 
> What do you mean by "comes to LLC", are you talking about HAS_LLC() or
> I915_CACHE_LLC?

I'm talking about the actual cache.

> 
> If you set I915_CACHE_LLC, then yes it is fully coherent for both
> HAS_LLC() and HAS_SNOOP().
> 
> If you set I915_CACHE_NONE, then reads are still coherent on
> HAS_LLC(),

Reads and wri

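The read/write split being debated in this thread can be written down as a small decision function. This is a sketch of the rules exactly as the quoted kerneldoc states them, not a statement about what the hardware does (which is the point under discussion). The enum, the flag macros and all function names are stand-ins invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the I915_CACHE_* levels named in the kerneldoc. */
enum cache_level { CACHE_NONE, CACHE_LLC, CACHE_L3_LLC, CACHE_WT };

/* Stand-ins for I915_BO_CACHE_COHERENT_FOR_READ / _FOR_WRITE. */
#define COHERENT_FOR_READ	(1u << 0)
#define COHERENT_FOR_WRITE	(1u << 1)

/*
 * Rules as documented:
 *  - any level other than CACHE_NONE: coherent for reads and writes
 *  - CACHE_NONE on a shared-LLC platform: coherent for reads only
 *  - CACHE_NONE on a snooping platform: not coherent at all
 */
static unsigned int cache_coherent_flags(enum cache_level level, bool has_llc)
{
	unsigned int flags = 0;

	if (level != CACHE_NONE)
		flags = COHERENT_FOR_READ | COHERENT_FOR_WRITE;
	else if (has_llc)
		flags = COHERENT_FOR_READ;

	/* The kerneldoc says FOR_WRITE implies FOR_READ. */
	assert(!(flags & COHERENT_FOR_WRITE) || (flags & COHERENT_FOR_READ));
	return flags;
}

/* A CPU read through the cache may see stale data unless FOR_READ is set. */
static bool must_invalidate_before_cpu_read(unsigned int flags)
{
	return !(flags & COHERENT_FOR_READ);
}

/* A CPU write through the cache must be flushed unless FOR_WRITE is set. */
static bool must_flush_after_cpu_write(unsigned int flags)
{
	return !(flags & COHERENT_FOR_WRITE);
}
```

Under these rules an uncached object on a shared-LLC platform skips the invalidate before CPU reads but still needs a flush after CPU writes, which is the "superfluous flushing" case the kerneldoc wants to avoid.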
[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Minor revid/stepping and workaround cleanup (rev4)

2021-07-13 Thread Patchwork
== Series Details ==

Series: Minor revid/stepping and workaround cleanup (rev4)
URL   : https://patchwork.freedesktop.org/series/92299/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
62d758aadfe3 drm/i915/step: s/_revid_tbl/_revids
76bd03989163 drm/i915: Make pre-production detection use direct revid comparison
5c84247d6ac9 drm/i915/skl: Use revid->stepping tables
-:57: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#57: FILE: drivers/gpu/drm/i915/i915_drv.h:1518:
+#define IS_SKL_GT_STEP(p, since, until) (IS_SKYLAKE(p) && IS_GT_STEP(p, since, until))

total: 0 errors, 0 warnings, 1 checks, 80 lines checked
1c898561ba39 drm/i915/kbl: Drop pre-production revision from stepping table
27320b6e947b drm/i915/bxt: Use revid->stepping tables
bfc4af69ce3b drm/i915/glk: Use revid->stepping tables
ea04a99db33e drm/i915/icl: Use revid->stepping tables
-:116: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#116: FILE: drivers/gpu/drm/i915/i915_drv.h:1532:
+#define IS_ICL_GT_STEP(p, since, until) \
+   (IS_ICELAKE(p) && IS_GT_STEP(p, since, until))

total: 0 errors, 0 warnings, 1 checks, 87 lines checked
95aec3b31d70 drm/i915/jsl_ehl: Use revid->stepping tables
-:56: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#56: FILE: drivers/gpu/drm/i915/i915_drv.h:1535:
+#define IS_JSL_EHL_GT_STEP(p, since, until) \
+   (IS_JSL_EHL(p) && IS_GT_STEP(p, since, until))

-:58: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#58: FILE: drivers/gpu/drm/i915/i915_drv.h:1537:
+#define IS_JSL_EHL_DISPLAY_STEP(p, since, until) \
+   (IS_JSL_EHL(p) && IS_DISPLAY_STEP(p, since, until))

total: 0 errors, 0 warnings, 2 checks, 51 lines checked
d515112313b0 drm/i915/rkl: Use revid->stepping tables
-:49: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#49: FILE: drivers/gpu/drm/i915/i915_drv.h:1552:
+#define IS_RKL_DISPLAY_STEP(p, since, until) \
+   (IS_ROCKETLAKE(p) && IS_DISPLAY_STEP(p, since, until))

total: 0 errors, 0 warnings, 1 checks, 51 lines checked
28dea784c5d3 drm/i915/dg1: Use revid->stepping tables
-:129: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#129: FILE: drivers/gpu/drm/i915/i915_drv.h:1546:
+#define IS_DG1_GT_STEP(p, since, until) \
+   (IS_DG1(p) && IS_GT_STEP(p, since, until))

-:131: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#131: FILE: drivers/gpu/drm/i915/i915_drv.h:1548:
+#define IS_DG1_DISPLAY_STEP(p, since, until) \
+   (IS_DG1(p) && IS_DISPLAY_STEP(p, since, until))

total: 0 errors, 0 warnings, 2 checks, 118 lines checked
a66bc53c20b3 drm/i915/cnl: Drop all workarounds
d62d0ec1b3aa drm/i915/icl: Drop workarounds that only apply to pre-production steppings




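The MACRO_ARG_REUSE checks reported above are about the macro argument 'p' being expanded more than once. A minimal sketch of what that means follows, assuming simplified stand-ins for IS_SKYLAKE() and IS_GT_STEP() (the real definitions are not part of this report) and a hypothetical get_dev() accessor that counts how often it is called.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins; not the driver's definitions. */
enum { PLATFORM_SKL = 1 };
enum { STEP_A0 = 1, STEP_B0, STEP_C0 };

struct dev {
	int platform;
	int gt_step;
};

#define IS_SKYLAKE(p) ((p)->platform == PLATFORM_SKL)
#define IS_GT_STEP(p, since, until) \
	((p)->gt_step >= (since) && (p)->gt_step <= (until))
/* 'p' is expanded in both sub-macros: this is what checkpatch flags. */
#define IS_SKL_GT_STEP(p, since, until) \
	(IS_SKYLAKE(p) && IS_GT_STEP(p, since, until))

static int lookups;

/* Hypothetical accessor with a visible side effect. */
static struct dev *get_dev(struct dev *d)
{
	assert(d);
	lookups++;
	return d;
}

/* Function form: the argument is evaluated exactly once. */
static inline bool is_skl_gt_step(const struct dev *p, int since, int until)
{
	return IS_SKYLAKE(p) && IS_GT_STEP(p, since, until);
}

static int count_macro_evaluations(void)
{
	struct dev d = { PLATFORM_SKL, STEP_B0 };

	lookups = 0;
	(void)IS_SKL_GT_STEP(get_dev(&d), STEP_A0, STEP_C0);
	return lookups;
}

static int count_inline_evaluations(void)
{
	struct dev d = { PLATFORM_SKL, STEP_B0 };

	lookups = 0;
	(void)is_skl_gt_step(get_dev(&d), STEP_A0, STEP_C0);
	return lookups;
}
```

In this sketch the macro form calls get_dev() three times for a matching device and the inline form calls it once. When every caller passes a plain pointer variable, as in the patches listed, the repeated expansion is harmless, which is why checkpatch reports it as a CHECK and not as an error.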
Re: [Intel-gfx] [PATCH v2 07/12] drm/i915/icl: Use revid->stepping tables

2021-07-13 Thread Souza, Jose
On Fri, 2021-07-09 at 20:37 -0700, Matt Roper wrote:
> Switch ICL to use a revid->stepping table as we're trying to do on all
> platforms going forward.  While we're at it, let's include some
> additional steppings that have popped up, even if we don't yet have any
> workarounds tied to those steppings (we probably need to audit our
> workaround list soon to see if any of the bounds have moved or if new
> workarounds have appeared).
> 
> Note that the current bspec table is missing information about how to
> map PCI revision ID to GT/display steppings; it only provides an SoC
> stepping.  The mapping to GT/display steppings (which aren't always the
> same as the SoC stepping) used to be in the bspec, but was apparently
> dropped during an update in Nov 2019; I've made my changes here based on
> an older bspec snapshot that still had the necessary information.  We've
> requested that the missing information be restored.
> 
> I'm only including the production revids in the table here since we're
> past the point at which we usually stop trying to support pre-production
> hardware.  An appropriate check is added to
> intel_detect_preproduction_hw() to print an error and taint the kernel
> just in case someone still tries to load the driver on old
> pre-production hardware.
> 
> v2:
>  - Drop pre-production steppings and add error/taint at startup when
>loading on pre-production hardware.
> 

Reviewed-by: José Roberto de Souza 

> Bspec: 21141  # pre-Nov 2019 snapshot
> Signed-off-by: Matt Roper 
> ---
>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 12 ++--
>  drivers/gpu/drm/i915/i915_drv.c |  1 +
>  drivers/gpu/drm/i915/i915_drv.h | 10 ++
>  drivers/gpu/drm/i915/intel_step.c   |  7 +++
>  4 files changed, 16 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 6dfd564e078f..e2d8acb8c1c9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -557,7 +557,7 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
>   /* Wa_1604370585:icl (pre-prod)
>* Formerly known as WaPushConstantDereferenceHoldDisable
>*/
> - if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0))
> + if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_B0))
>   wa_masked_en(wal, GEN7_ROW_CHICKEN2,
>PUSH_CONSTANT_DEREF_DISABLE);
>  
> @@ -573,12 +573,12 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
>   /* Wa_2006611047:icl (pre-prod)
>* Formerly known as WaDisableImprovedTdlClkGating
>*/
> - if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0))
> + if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_A0))
>   wa_masked_en(wal, GEN7_ROW_CHICKEN2,
>GEN11_TDL_CLOCK_GATING_FIX_DISABLE);
>  
>   /* Wa_2006665173:icl (pre-prod) */
> - if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0))
> + if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_A0))
>   wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3,
>GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC);
>  
> @@ -1023,13 +1023,13 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
>   GAMW_ECO_DEV_CTX_RELOAD_DISABLE);
>  
>   /* Wa_1405779004:icl (pre-prod) */
> - if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0))
> + if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_A0))
>   wa_write_or(wal,
>   SLICE_UNIT_LEVEL_CLKGATE,
>   MSCUNIT_CLKGATE_DIS);
>  
>   /* Wa_1406838659:icl (pre-prod) */
> - if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0))
> + if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_B0))
>   wa_write_or(wal,
>   INF_UNIT_LEVEL_CLKGATE,
>   CGPSF_CLKGATE_DIS);
> @@ -1725,7 +1725,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
>   PMFLUSHDONE_LNEBLK);
>  
>   /* Wa_1406609255:icl (pre-prod) */
> - if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0))
> + if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_B0))
>   wa_write_or(wal,
>   GEN7_SARCHKMD,
>   GEN7_DISABLE_DEMAND_PREFETCH);
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 90136995f5eb..c43b698bf0b9 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -275,6 +275,7 @@ static void intel_detect_preproduction_hw(struct drm_i915_private *dev_priv)
>   pre |= IS_BROXTON(dev_priv) && INTEL_REVID(dev_priv) < 0xA;
>   pre |= IS_KABYLAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x1;
>   pre |= IS_GEMINILAKE(dev_priv) && INTEL_REVID(dev_p

[Intel-gfx] [CI v4 06/12] drm/i915/glk: Use revid->stepping tables

2021-07-13 Thread Matt Roper
Switch GLK to use a revid->stepping table as we're trying to do on all
platforms going forward.  Pre-production and placeholder revisions are
omitted.

Although nothing in the code is using the data from this table at the
moment, we expect some upcoming DMC patches to start utilizing it.

Bspec: 19131
Cc: Anusha Srivatsa 
Signed-off-by: Matt Roper 
Reviewed-by: Anusha Srivatsa 
---
 drivers/gpu/drm/i915/i915_drv.h   | 8 
 drivers/gpu/drm/i915/intel_step.c | 7 +++
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index afb159f2a658..dac9ed2dfca5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1522,14 +1522,6 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define IS_KBL_DISPLAY_STEP(dev_priv, since, until) \
(IS_KABYLAKE(dev_priv) && IS_DISPLAY_STEP(dev_priv, since, until))
 
-#define GLK_REVID_A0   0x0
-#define GLK_REVID_A1   0x1
-#define GLK_REVID_A2   0x2
-#define GLK_REVID_B0   0x3
-
-#define IS_GLK_REVID(dev_priv, since, until) \
-   (IS_GEMINILAKE(dev_priv) && IS_REVID(dev_priv, since, until))
-
 #define CNL_REVID_A0   0x0
 #define CNL_REVID_B0   0x1
 #define CNL_REVID_C0   0x2
diff --git a/drivers/gpu/drm/i915/intel_step.c b/drivers/gpu/drm/i915/intel_step.c
index 57c33a25b760..1bc0701092ab 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -49,6 +49,10 @@ static const struct intel_step_info bxt_revids[] = {
[0xD] = { COMMON_STEP(E0) },
 };
 
+static const struct intel_step_info glk_revids[] = {
+   [3] = { COMMON_STEP(B0) },
+};
+
 static const struct intel_step_info tgl_uy_revids[] = {
[0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
[1] = { .gt_step = STEP_B0, .display_step = STEP_C0 },
@@ -96,6 +100,9 @@ void intel_step_init(struct drm_i915_private *i915)
} else if (IS_TIGERLAKE(i915)) {
revids = tgl_revids;
size = ARRAY_SIZE(tgl_revids);
+   } else if (IS_GEMINILAKE(i915)) {
+   revids = glk_revids;
+   size = ARRAY_SIZE(glk_revids);
} else if (IS_BROXTON(i915)) {
revids = bxt_revids;
size = ARRAY_SIZE(bxt_revids);
-- 
2.25.4



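The glk_revids[] table above relies on C designated initializers: listing only index 3 still produces a four-entry array whose other slots are zero-filled. A standalone sketch of that idea follows. The enum values, struct layout, COMMON_STEP() definition and lookup_step() helper are all assumptions made for illustration, and the driver's own handling of unknown revision IDs is not shown in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins; STEP_NONE must be 0 so that unlisted slots read as "none". */
enum step { STEP_NONE = 0, STEP_A0, STEP_B0, STEP_C0 };

struct step_info {
	uint8_t gt_step;
	uint8_t display_step;
};

/* Assumed definition: the same stepping for GT and display. */
#define COMMON_STEP(x) .gt_step = STEP_##x, .display_step = STEP_##x
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Sparse table indexed by PCI revision ID, as in the patch. */
static const struct step_info glk_revids[] = {
	[3] = { COMMON_STEP(B0) },
};

/*
 * Hypothetical lookup: returns STEP_NONE for revision IDs that are
 * out of range or fall in a hole of the table.
 */
static struct step_info lookup_step(const struct step_info *revids,
				    size_t size, unsigned int revid)
{
	struct step_info none = { STEP_NONE, STEP_NONE };

	assert(revids != NULL);
	if (revid < size && revids[revid].gt_step != STEP_NONE)
		return revids[revid];
	return none;
}
```

Dropping the pre-production entries, as this series does, therefore needs no special casing in the lookup: revision IDs 0 to 2 simply land in zero-filled holes.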
[Intel-gfx] [CI v4 04/12] drm/i915/kbl: Drop pre-production revision from stepping table

2021-07-13 Thread Matt Roper
We're long past the point where we need to care about pre-production
hardware, and we already warn the user and taint the kernel if we detect
the driver is being loaded on pre-production hardware.

Bspec: 18329
Signed-off-by: Matt Roper 
Reviewed-by: Anusha Srivatsa 
---
 drivers/gpu/drm/i915/intel_step.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_step.c b/drivers/gpu/drm/i915/intel_step.c
index 4e6a2b3b4f8a..1dd6944e7aca 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -33,7 +33,6 @@ static const struct intel_step_info skl_revids[] = {
 };
 
 static const struct intel_step_info kbl_revids[] = {
-   [0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
[1] = { .gt_step = STEP_B0, .display_step = STEP_B0 },
[2] = { .gt_step = STEP_C0, .display_step = STEP_B0 },
[3] = { .gt_step = STEP_D0, .display_step = STEP_B0 },
-- 
2.25.4



[Intel-gfx] [CI v4 05/12] drm/i915/bxt: Use revid->stepping tables

2021-07-13 Thread Matt Roper
Switch BXT to use a revid->stepping table as we're trying to do on all
platforms going forward.  Note that the REVID macros we had before
weren't being used anywhere in the code and weren't even correct; the
table values come from the bspec (and omit all the placeholder and
pre-production revisions).

Although nothing in the code is using the data from this table at the
moment, we expect some upcoming DMC patches to start utilizing it.

Bspec: 13620
Cc: Anusha Srivatsa 
Signed-off-by: Matt Roper 
Reviewed-by: Anusha Srivatsa 
---
 drivers/gpu/drm/i915/i915_drv.h   |  9 -
 drivers/gpu/drm/i915/intel_step.c | 10 ++
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f30499ed6787..afb159f2a658 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1517,15 +1517,6 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 
 #define IS_SKL_GT_STEP(p, since, until) (IS_SKYLAKE(p) && IS_GT_STEP(p, since, until))
 
-#define BXT_REVID_A0   0x0
-#define BXT_REVID_A1   0x1
-#define BXT_REVID_B0   0x3
-#define BXT_REVID_B_LAST   0x8
-#define BXT_REVID_C0   0x9
-
-#define IS_BXT_REVID(dev_priv, since, until) \
-   (IS_BROXTON(dev_priv) && IS_REVID(dev_priv, since, until))
-
 #define IS_KBL_GT_STEP(dev_priv, since, until) \
(IS_KABYLAKE(dev_priv) && IS_GT_STEP(dev_priv, since, until))
 #define IS_KBL_DISPLAY_STEP(dev_priv, since, until) \
diff --git a/drivers/gpu/drm/i915/intel_step.c b/drivers/gpu/drm/i915/intel_step.c
index 1dd6944e7aca..57c33a25b760 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -42,6 +42,13 @@ static const struct intel_step_info kbl_revids[] = {
[7] = { .gt_step = STEP_G0, .display_step = STEP_C0 },
 };
 
+static const struct intel_step_info bxt_revids[] = {
+   [0xA] = { COMMON_STEP(C0) },
+   [0xB] = { COMMON_STEP(C0) },
+   [0xC] = { COMMON_STEP(D0) },
+   [0xD] = { COMMON_STEP(E0) },
+};
+
 static const struct intel_step_info tgl_uy_revids[] = {
[0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
[1] = { .gt_step = STEP_B0, .display_step = STEP_C0 },
@@ -89,6 +96,9 @@ void intel_step_init(struct drm_i915_private *i915)
} else if (IS_TIGERLAKE(i915)) {
revids = tgl_revids;
size = ARRAY_SIZE(tgl_revids);
+   } else if (IS_BROXTON(i915)) {
+   revids = bxt_revids;
+   size = ARRAY_SIZE(bxt_revids);
} else if (IS_KABYLAKE(i915)) {
revids = kbl_revids;
size = ARRAY_SIZE(kbl_revids);
-- 
2.25.4



[Intel-gfx] [CI v4 02/12] drm/i915: Make pre-production detection use direct revid comparison

2021-07-13 Thread Matt Roper
Although we're converting our workarounds to use a revid->stepping
lookup table, the function that detects pre-production hardware should
continue to compare against PCI revision ID values directly.  These are
listed in the bspec as integers, so it's easier to confirm their
correctness if we just use an integer literal rather than a symbolic
name anyway.

Bspec: 13620, 19131, 13626, 18329
Signed-off-by: Matt Roper 
Reviewed-by: Anusha Srivatsa 
---
 drivers/gpu/drm/i915/i915_drv.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 30d8cd8c69b1..90136995f5eb 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -271,10 +271,10 @@ static void intel_detect_preproduction_hw(struct drm_i915_private *dev_priv)
bool pre = false;
 
pre |= IS_HSW_EARLY_SDV(dev_priv);
-   pre |= IS_SKL_REVID(dev_priv, 0, SKL_REVID_F0);
-   pre |= IS_BXT_REVID(dev_priv, 0, BXT_REVID_B_LAST);
-   pre |= IS_KBL_GT_STEP(dev_priv, 0, STEP_A0);
-   pre |= IS_GLK_REVID(dev_priv, 0, GLK_REVID_A2);
+   pre |= IS_SKYLAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x6;
+   pre |= IS_BROXTON(dev_priv) && INTEL_REVID(dev_priv) < 0xA;
+   pre |= IS_KABYLAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x1;
+   pre |= IS_GEMINILAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x3;
 
if (pre) {
drm_err(&dev_priv->drm, "This is a pre-production stepping. "
-- 
2.25.4



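The direct revision-ID comparisons introduced in intel_detect_preproduction_hw() above amount to a per-platform threshold: anything below the first production revision ID counts as pre-production. A table-driven sketch of the same check is shown here. The enum, struct and function names are hypothetical, only the four thresholds visible in this patch are included, and the IS_HSW_EARLY_SDV() case is left out because its definition is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical platform identifiers for the sketch. */
enum platform { SKYLAKE, BROXTON, KABYLAKE, GEMINILAKE, OTHER_PLATFORM };

struct preprod_rule {
	enum platform platform;
	unsigned int first_production_revid;
};

/* Thresholds copied from the patch; integers, as listed in the bspec. */
static const struct preprod_rule preprod_rules[] = {
	{ SKYLAKE,    0x6 },
	{ BROXTON,    0xA },
	{ KABYLAKE,   0x1 },
	{ GEMINILAKE, 0x3 },
};

static bool is_preproduction(enum platform platform, unsigned int revid)
{
	size_t i;

	/* PCI revision IDs are 8 bits wide. */
	assert(revid <= 0xFF);

	for (i = 0; i < sizeof(preprod_rules) / sizeof(preprod_rules[0]); i++) {
		if (preprod_rules[i].platform == platform)
			return revid < preprod_rules[i].first_production_revid;
	}

	/* Platforms without a rule are never flagged. */
	return false;
}
```

Keeping the thresholds as plain integers, as the commit message argues, lets each row be compared against the bspec without translating through symbolic stepping names.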
[Intel-gfx] [CI v4 01/12] drm/i915/step: s/_revid_tbl/_revids

2021-07-13 Thread Matt Roper
From: Anusha Srivatsa 

Simplify the stepping info array name.

Cc: Jani Nikula 
Signed-off-by: Anusha Srivatsa 
Reviewed-by: Jani Nikula 
---
 drivers/gpu/drm/i915/intel_step.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_step.c b/drivers/gpu/drm/i915/intel_step.c
index ba9479a67521..93ccd42f2514 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -26,7 +26,7 @@ static const struct intel_step_info kbl_revids[] = {
[7] = { .gt_step = STEP_G0, .display_step = STEP_C0 },
 };
 
-static const struct intel_step_info tgl_uy_revid_step_tbl[] = {
+static const struct intel_step_info tgl_uy_revids[] = {
[0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
[1] = { .gt_step = STEP_B0, .display_step = STEP_C0 },
[2] = { .gt_step = STEP_B1, .display_step = STEP_C0 },
@@ -34,12 +34,12 @@ static const struct intel_step_info tgl_uy_revid_step_tbl[] = {
 };
 
 /* Same GT stepping between tgl_uy_revids and tgl_revids don't mean the same HW */
-static const struct intel_step_info tgl_revid_step_tbl[] = {
+static const struct intel_step_info tgl_revids[] = {
[0] = { .gt_step = STEP_A0, .display_step = STEP_B0 },
[1] = { .gt_step = STEP_B0, .display_step = STEP_D0 },
 };
 
-static const struct intel_step_info adls_revid_step_tbl[] = {
+static const struct intel_step_info adls_revids[] = {
[0x0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
[0x1] = { .gt_step = STEP_A0, .display_step = STEP_A2 },
[0x4] = { .gt_step = STEP_B0, .display_step = STEP_B0 },
@@ -47,7 +47,7 @@ static const struct intel_step_info adls_revid_step_tbl[] = {
[0xC] = { .gt_step = STEP_D0, .display_step = STEP_C0 },
 };
 
-static const struct intel_step_info adlp_revid_step_tbl[] = {
+static const struct intel_step_info adlp_revids[] = {
[0x0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
[0x4] = { .gt_step = STEP_B0, .display_step = STEP_B0 },
[0x8] = { .gt_step = STEP_C0, .display_step = STEP_C0 },
@@ -62,17 +62,17 @@ void intel_step_init(struct drm_i915_private *i915)
struct intel_step_info step = {};
 
if (IS_ALDERLAKE_P(i915)) {
-   revids = adlp_revid_step_tbl;
-   size = ARRAY_SIZE(adlp_revid_step_tbl);
+   revids = adlp_revids;
+   size = ARRAY_SIZE(adlp_revids);
} else if (IS_ALDERLAKE_S(i915)) {
-   revids = adls_revid_step_tbl;
-   size = ARRAY_SIZE(adls_revid_step_tbl);
+   revids = adls_revids;
+   size = ARRAY_SIZE(adls_revids);
} else if (IS_TGL_U(i915) || IS_TGL_Y(i915)) {
-   revids = tgl_uy_revid_step_tbl;
-   size = ARRAY_SIZE(tgl_uy_revid_step_tbl);
+   revids = tgl_uy_revids;
+   size = ARRAY_SIZE(tgl_uy_revids);
} else if (IS_TIGERLAKE(i915)) {
-   revids = tgl_revid_step_tbl;
-   size = ARRAY_SIZE(tgl_revid_step_tbl);
+   revids = tgl_revids;
+   size = ARRAY_SIZE(tgl_revids);
} else if (IS_KABYLAKE(i915)) {
revids = kbl_revids;
size = ARRAY_SIZE(kbl_revids);
-- 
2.25.4



[Intel-gfx] [CI v4 09/12] drm/i915/rkl: Use revid->stepping tables

2021-07-13 Thread Matt Roper
Switch RKL to use a revid->stepping table as we're trying to do on all
platforms going forward.

Bspec: 44501
Signed-off-by: Matt Roper 
Reviewed-by: Anusha Srivatsa 
---
 drivers/gpu/drm/i915/display/intel_psr.c | 4 ++--
 drivers/gpu/drm/i915/i915_drv.h  | 8 ++--
 drivers/gpu/drm/i915/intel_step.c| 9 +
 3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_psr.c b/drivers/gpu/drm/i915/display/intel_psr.c
index 4dfe1dceb863..d436490ab28c 100644
--- a/drivers/gpu/drm/i915/display/intel_psr.c
+++ b/drivers/gpu/drm/i915/display/intel_psr.c
@@ -594,7 +594,7 @@ static void hsw_activate_psr2(struct intel_dp *intel_dp)
if (intel_dp->psr.psr2_sel_fetch_enabled) {
/* WA 1408330847 */
if (IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) ||
-   IS_RKL_REVID(dev_priv, RKL_REVID_A0, RKL_REVID_A0))
+   IS_RKL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0))
intel_de_rmw(dev_priv, CHICKEN_PAR1_1,
 DIS_RAM_BYPASS_PSR2_MAN_TRACK,
 DIS_RAM_BYPASS_PSR2_MAN_TRACK);
@@ -1342,7 +1342,7 @@ static void intel_psr_disable_locked(struct intel_dp *intel_dp)
/* WA 1408330847 */
if (intel_dp->psr.psr2_sel_fetch_enabled &&
(IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) ||
-IS_RKL_REVID(dev_priv, RKL_REVID_A0, RKL_REVID_A0)))
+IS_RKL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0)))
intel_de_rmw(dev_priv, CHICKEN_PAR1_1,
 DIS_RAM_BYPASS_PSR2_MAN_TRACK, 0);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b3ce2b73a143..9195131cf90f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1549,12 +1549,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
(IS_TIGERLAKE(__i915) && !(IS_TGL_U(__i915) || IS_TGL_Y(__i915)) && \
 IS_GT_STEP(__i915, since, until))
 
-#define RKL_REVID_A0   0x0
-#define RKL_REVID_B0   0x1
-#define RKL_REVID_C0   0x4
-
-#define IS_RKL_REVID(p, since, until) \
-   (IS_ROCKETLAKE(p) && IS_REVID(p, since, until))
+#define IS_RKL_DISPLAY_STEP(p, since, until) \
+   (IS_ROCKETLAKE(p) && IS_DISPLAY_STEP(p, since, until))
 
 #define DG1_REVID_A0   0x0
 #define DG1_REVID_B0   0x1
diff --git a/drivers/gpu/drm/i915/intel_step.c b/drivers/gpu/drm/i915/intel_step.c
index 9de17bdfe62f..93edfbef2903 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -75,6 +75,12 @@ static const struct intel_step_info tgl_revids[] = {
[1] = { .gt_step = STEP_B0, .display_step = STEP_D0 },
 };
 
+static const struct intel_step_info rkl_revids[] = {
+   [0] = { COMMON_STEP(A0) },
+   [1] = { COMMON_STEP(B0) },
+   [4] = { COMMON_STEP(C0) },
+};
+
 static const struct intel_step_info adls_revids[] = {
[0x0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
[0x1] = { .gt_step = STEP_A0, .display_step = STEP_A2 },
@@ -103,6 +109,9 @@ void intel_step_init(struct drm_i915_private *i915)
} else if (IS_ALDERLAKE_S(i915)) {
revids = adls_revids;
size = ARRAY_SIZE(adls_revids);
+   } else if (IS_ROCKETLAKE(i915)) {
+   revids = rkl_revids;
+   size = ARRAY_SIZE(rkl_revids);
} else if (IS_TGL_U(i915) || IS_TGL_Y(i915)) {
revids = tgl_uy_revids;
size = ARRAY_SIZE(tgl_uy_revids);
-- 
2.25.4



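The RKL conversion above replaces a revision-ID range check with a display-stepping range check. The sketch below strings the two pieces together: the revid to stepping mapping from rkl_revids[] and a range test gating WA 1408330847. It assumes the range is inclusive at both ends, which is inferred from the (STEP_A0, STEP_A0) call sites in the patch; the definition of IS_DISPLAY_STEP() is not shown. All type and function names are stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in types; names are illustrative, not the driver's. */
enum step { STEP_NONE = 0, STEP_A0, STEP_B0, STEP_C0 };

struct gpu {
	bool is_rocketlake;
	enum step display_step;
};

/* RKL revid -> stepping, values copied from rkl_revids[] in the patch. */
static enum step rkl_step_from_revid(unsigned int revid)
{
	switch (revid) {
	case 0:
		return STEP_A0;
	case 1:
		return STEP_B0;
	case 4:
		return STEP_C0;
	default:
		return STEP_NONE;
	}
}

/* Assumed inclusive range check: since <= display_step <= until. */
static bool is_rkl_display_step(const struct gpu *g,
				enum step since, enum step until)
{
	assert(g != NULL);
	return g->is_rocketlake &&
	       g->display_step >= since && g->display_step <= until;
}

/* RKL half of the WA 1408330847 condition: display stepping A0 only. */
static bool rkl_needs_wa_1408330847(unsigned int revid)
{
	struct gpu g = {
		.is_rocketlake = true,
		.display_step = rkl_step_from_revid(revid),
	};

	return is_rkl_display_step(&g, STEP_A0, STEP_A0);
}
```

For RKL the GT and display steppings coincide (COMMON_STEP), so the workaround fires for the same revision ID as before; the check is simply expressed in stepping terms now.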
[Intel-gfx] [CI v4 07/12] drm/i915/icl: Use revid->stepping tables

2021-07-13 Thread Matt Roper
Switch ICL to use a revid->stepping table as we're trying to do on all
platforms going forward.  While we're at it, let's include some
additional steppings that have popped up, even if we don't yet have any
workarounds tied to those steppings (we probably need to audit our
workaround list soon to see if any of the bounds have moved or if new
workarounds have appeared).

Note that the current bspec table is missing information about how to
map PCI revision ID to GT/display steppings; it only provides an SoC
stepping.  The mapping to GT/display steppings (which aren't always the
same as the SoC stepping) used to be in the bspec, but was apparently
dropped during an update in Nov 2019; I've made my changes here based on
an older bspec snapshot that still had the necessary information.  We've
requested that the missing information be restored.

I'm only including the production revids in the table here since we're
past the point at which we usually stop trying to support pre-production
hardware.  An appropriate check is added to
intel_detect_preproduction_hw() to print an error and taint the kernel
just in case someone still tries to load the driver on old
pre-production hardware.

v2:
 - Drop pre-production steppings and add error/taint at startup when
   loading on pre-production hardware.

Bspec: 21141  # pre-Nov 2019 snapshot
Signed-off-by: Matt Roper 
Reviewed-by: José Roberto de Souza 
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 12 ++--
 drivers/gpu/drm/i915/i915_drv.c |  1 +
 drivers/gpu/drm/i915/i915_drv.h | 10 ++
 drivers/gpu/drm/i915/intel_step.c   |  7 +++
 4 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 9f7cd2e54894..478c3c8602c1 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -557,7 +557,7 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
/* Wa_1604370585:icl (pre-prod)
 * Formerly known as WaPushConstantDereferenceHoldDisable
 */
-   if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0))
+   if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_B0))
wa_masked_en(wal, GEN7_ROW_CHICKEN2,
 PUSH_CONSTANT_DEREF_DISABLE);
 
@@ -573,12 +573,12 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
/* Wa_2006611047:icl (pre-prod)
 * Formerly known as WaDisableImprovedTdlClkGating
 */
-   if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0))
+   if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_A0))
wa_masked_en(wal, GEN7_ROW_CHICKEN2,
 GEN11_TDL_CLOCK_GATING_FIX_DISABLE);
 
/* Wa_2006665173:icl (pre-prod) */
-   if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0))
+   if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_A0))
wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3,
 GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC);
 
@@ -1030,13 +1030,13 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
GAMW_ECO_DEV_CTX_RELOAD_DISABLE);
 
/* Wa_1405779004:icl (pre-prod) */
-   if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0))
+   if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_A0))
wa_write_or(wal,
SLICE_UNIT_LEVEL_CLKGATE,
MSCUNIT_CLKGATE_DIS);
 
/* Wa_1406838659:icl (pre-prod) */
-   if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0))
+   if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_B0))
wa_write_or(wal,
INF_UNIT_LEVEL_CLKGATE,
CGPSF_CLKGATE_DIS);
@@ -1733,7 +1733,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct 
i915_wa_list *wal)
PMFLUSHDONE_LNEBLK);
 
/* Wa_1406609255:icl (pre-prod) */
-   if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0))
+   if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_B0))
wa_write_or(wal,
GEN7_SARCHKMD,
GEN7_DISABLE_DEMAND_PREFETCH);
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 90136995f5eb..c43b698bf0b9 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -275,6 +275,7 @@ static void intel_detect_preproduction_hw(struct 
drm_i915_private *dev_priv)
pre |= IS_BROXTON(dev_priv) && INTEL_REVID(dev_priv) < 0xA;
pre |= IS_KABYLAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x1;
pre |= IS_GEMINILAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x3;
+   pre |= IS_ICELAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x7;
 
if (pre) {
drm_err(&dev_priv->drm, "This is a pre-production stepping

[Intel-gfx] [CI v4 12/12] drm/i915/icl: Drop workarounds that only apply to pre-production steppings

2021-07-13 Thread Matt Roper
We're past the point at which we usually drop workarounds that were
never needed on production hardware.  The driver will already print an
error and apply taint if loaded on pre-production hardware.
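
For illustration only, the pre-production check referred to above can be
sketched as a direct comparison of the PCI revision ID against a per-platform
threshold.  The thresholds are copied from the intel_detect_preproduction_hw()
hunk quoted earlier in this thread; the enum, the function name and the
standalone form are assumptions made for this sketch and are not the driver's
code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical platform identifiers, used only by this sketch. */
enum platform {
	PLATFORM_BROXTON,
	PLATFORM_KABYLAKE,
	PLATFORM_GEMINILAKE,
	PLATFORM_ICELAKE,
	PLATFORM_OTHER,
};

/*
 * Return true when the revision ID is below the first production revision
 * of the platform.  Platforms without a known threshold report false.
 */
static bool is_preproduction(enum platform p, uint8_t revid)
{
	switch (p) {
	case PLATFORM_BROXTON:
		return revid < 0xA;
	case PLATFORM_KABYLAKE:
		return revid < 0x1;
	case PLATFORM_GEMINILAKE:
		return revid < 0x3;
	case PLATFORM_ICELAKE:
		return revid < 0x7;
	default:
		return false;
	}
}
```

A caller would presumably log an error and taint the kernel when this returns
true, which is what makes it safe to drop the pre-production workarounds.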

Signed-off-by: Matt Roper 
Reviewed-by: Anusha Srivatsa 
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 39 -
 drivers/gpu/drm/i915/i915_drv.h |  3 --
 2 files changed, 42 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 1398f35affcb..7731db33c46a 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -517,21 +517,12 @@ static void cfl_ctx_workarounds_init(struct 
intel_engine_cs *engine,
 static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
 struct i915_wa_list *wal)
 {
-   struct drm_i915_private *i915 = engine->i915;
-
/* WaDisableBankHangMode:icl */
wa_write(wal,
 GEN8_L3CNTLREG,
 intel_uncore_read(engine->uncore, GEN8_L3CNTLREG) |
 GEN8_ERRDETBCTRL);
 
-   /* Wa_1604370585:icl (pre-prod)
-* Formerly known as WaPushConstantDereferenceHoldDisable
-*/
-   if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_B0))
-   wa_masked_en(wal, GEN7_ROW_CHICKEN2,
-PUSH_CONSTANT_DEREF_DISABLE);
-
/* WaForceEnableNonCoherent:icl
 * This is not the same workaround as in early Gen9 platforms, where
 * lacking this could cause system hangs, but coherency performance
@@ -541,18 +532,6 @@ static void icl_ctx_workarounds_init(struct 
intel_engine_cs *engine,
 */
wa_masked_en(wal, ICL_HDC_MODE, HDC_FORCE_NON_COHERENT);
 
-   /* Wa_2006611047:icl (pre-prod)
-* Formerly known as WaDisableImprovedTdlClkGating
-*/
-   if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_A0))
-   wa_masked_en(wal, GEN7_ROW_CHICKEN2,
-GEN11_TDL_CLOCK_GATING_FIX_DISABLE);
-
-   /* Wa_2006665173:icl (pre-prod) */
-   if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_A0))
-   wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3,
-GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC);
-
/* WaEnableFloatBlendOptimization:icl */
wa_write_clr_set(wal,
 GEN10_CACHE_MODE_SS,
@@ -989,18 +968,6 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, 
struct i915_wa_list *wal)
GEN8_GAMW_ECO_DEV_RW_IA,
GAMW_ECO_DEV_CTX_RELOAD_DISABLE);
 
-   /* Wa_1405779004:icl (pre-prod) */
-   if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_A0))
-   wa_write_or(wal,
-   SLICE_UNIT_LEVEL_CLKGATE,
-   MSCUNIT_CLKGATE_DIS);
-
-   /* Wa_1406838659:icl (pre-prod) */
-   if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_B0))
-   wa_write_or(wal,
-   INF_UNIT_LEVEL_CLKGATE,
-   CGPSF_CLKGATE_DIS);
-
/* Wa_1406463099:icl
 * Formerly known as WaGamTlbPendError
 */
@@ -1677,12 +1644,6 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, 
struct i915_wa_list *wal)
PMFLUSH_GAPL3UNBLOCK |
PMFLUSHDONE_LNEBLK);
 
-   /* Wa_1406609255:icl (pre-prod) */
-   if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_B0))
-   wa_write_or(wal,
-   GEN7_SARCHKMD,
-   GEN7_DISABLE_DEMAND_PREFETCH);
-
/* Wa_1606682166:icl */
wa_write_or(wal,
GEN7_SARCHKMD,
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8682a5f557c5..da5f230e2d4b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1513,9 +1513,6 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define IS_KBL_DISPLAY_STEP(dev_priv, since, until) \
(IS_KABYLAKE(dev_priv) && IS_DISPLAY_STEP(dev_priv, since, until))
 
-#define IS_ICL_GT_STEP(p, since, until) \
-   (IS_ICELAKE(p) && IS_GT_STEP(p, since, until))
-
 #define IS_JSL_EHL_GT_STEP(p, since, until) \
(IS_JSL_EHL(p) && IS_GT_STEP(p, since, until))
 #define IS_JSL_EHL_DISPLAY_STEP(p, since, until) \
-- 
2.25.4



[Intel-gfx] [CI v4 08/12] drm/i915/jsl_ehl: Use revid->stepping tables

2021-07-13 Thread Matt Roper
Switch JSL/EHL to use a revid->stepping table as we're trying to do on
all platforms going forward.

v2:
 - Use COMMON_STEP().  (Anusha)

Bspec: 29153
Cc: Anusha Srivatsa 
Signed-off-by: Matt Roper 
Reviewed-by: Anusha Srivatsa 
---
 drivers/gpu/drm/i915/display/intel_dpll_mgr.c | 2 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 2 +-
 drivers/gpu/drm/i915/i915_drv.h   | 9 -
 drivers/gpu/drm/i915/intel_step.c | 8 
 4 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
index 882bfd499e55..dfc31b682848 100644
--- a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
+++ b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
@@ -2674,7 +2674,7 @@ static bool
 ehl_combo_pll_div_frac_wa_needed(struct drm_i915_private *i915)
 {
return ((IS_PLATFORM(i915, INTEL_ELKHARTLAKE) &&
-IS_JSL_EHL_REVID(i915, EHL_REVID_B0, REVID_FOREVER)) ||
+IS_JSL_EHL_DISPLAY_STEP(i915, STEP_B0, STEP_FOREVER)) ||
 IS_TIGERLAKE(i915) || IS_ALDERLAKE_P(i915)) &&
 i915->dpll.ref_clks.nssc == 38400;
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 478c3c8602c1..0cab641de40f 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1050,7 +1050,7 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, 
struct i915_wa_list *wal)
 
/* Wa_1607087056:icl,ehl,jsl */
if (IS_ICELAKE(i915) ||
-   IS_JSL_EHL_REVID(i915, EHL_REVID_A0, EHL_REVID_A0))
+   IS_JSL_EHL_GT_STEP(i915, STEP_A0, STEP_A0))
wa_write_or(wal,
SLICE_UNIT_LEVEL_CLKGATE,
L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d4f705f06c73..b3ce2b73a143 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1532,11 +1532,10 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define IS_ICL_GT_STEP(p, since, until) \
(IS_ICELAKE(p) && IS_GT_STEP(p, since, until))
 
-#define EHL_REVID_A0            0x0
-#define EHL_REVID_B0            0x1
-
-#define IS_JSL_EHL_REVID(p, since, until) \
-   (IS_JSL_EHL(p) && IS_REVID(p, since, until))
+#define IS_JSL_EHL_GT_STEP(p, since, until) \
+   (IS_JSL_EHL(p) && IS_GT_STEP(p, since, until))
+#define IS_JSL_EHL_DISPLAY_STEP(p, since, until) \
+   (IS_JSL_EHL(p) && IS_DISPLAY_STEP(p, since, until))
 
 #define IS_TGL_DISPLAY_STEP(__i915, since, until) \
(IS_TIGERLAKE(__i915) && \
diff --git a/drivers/gpu/drm/i915/intel_step.c b/drivers/gpu/drm/i915/intel_step.c
index 9ce032993a99..9de17bdfe62f 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -57,6 +57,11 @@ static const struct intel_step_info icl_revids[] = {
[7] = { COMMON_STEP(D0) },
 };
 
+static const struct intel_step_info jsl_ehl_revids[] = {
+   [0] = { COMMON_STEP(A0) },
+   [1] = { COMMON_STEP(B0) },
+};
+
 static const struct intel_step_info tgl_uy_revids[] = {
[0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
[1] = { .gt_step = STEP_B0, .display_step = STEP_C0 },
@@ -104,6 +109,9 @@ void intel_step_init(struct drm_i915_private *i915)
} else if (IS_TIGERLAKE(i915)) {
revids = tgl_revids;
size = ARRAY_SIZE(tgl_revids);
+   } else if (IS_JSL_EHL(i915)) {
+   revids = jsl_ehl_revids;
+   size = ARRAY_SIZE(jsl_ehl_revids);
} else if (IS_ICELAKE(i915)) {
revids = icl_revids;
size = ARRAY_SIZE(icl_revids);
-- 
2.25.4



[Intel-gfx] [CI v4 00/12] Minor revid/stepping and workaround cleanup

2021-07-13 Thread Matt Roper
PCI revision IDs don't always map to GT and display IP steppings in an
intuitive/sensible way.  On many of our recent platforms we've switched
to using revid->stepping lookup tables with the infrastructure in
intel_step.c to handle stepping lookups and comparisons.  This series
converts several of our older platforms over to the same table-based
scheme; this is good not only for consistency, but also because some
upcoming DMC work will rely on table-based lookups.  Going forward the
only place that revision IDs should really get used directly is when
checking to see if we're running on pre-production hardware.
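
As a rough sketch of the table-based scheme described above (not the actual
intel_step.c code), a revid->stepping lookup can be written as an array indexed
by PCI revision ID.  The two table entries mirror the tgl_uy_revids lines
visible in the JSL/EHL patch of this series; the type names, ARRAY_SIZE_SKETCH
and lookup_step() are hypothetical, and returning STEP_NONE for an unknown
revision is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stepping values, ordered so they can be compared numerically. */
enum step {
	STEP_NONE = 0,
	STEP_A0,
	STEP_B0,
	STEP_C0,
};

struct step_info {
	enum step gt_step;
	enum step display_step;
};

#define ARRAY_SIZE_SKETCH(a) (sizeof(a) / sizeof((a)[0]))

/* Revision 1 maps to GT stepping B0 but display stepping C0. */
static const struct step_info example_revids[] = {
	[0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
	[1] = { .gt_step = STEP_B0, .display_step = STEP_C0 },
};

/* Look up both steppings; out-of-range revisions yield STEP_NONE. */
static struct step_info lookup_step(const struct step_info *table,
				    size_t size, uint8_t revid)
{
	struct step_info none = { STEP_NONE, STEP_NONE };

	if (revid < size)
		return table[revid];

	return none;
}
```

The point of the indirection is that workaround checks compare steppings rather
than raw revision IDs, so GT and display can diverge for the same revision.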

Note:  I haven't added the stepping tables for CFL and its derivatives
(WHL, AML) yet since there are so many different variants and the
steppings work a bit strangely on some of them.  We don't have any
stepping-specific workarounds on these platforms, so the tables aren't
necessary until Anusha's DMC work arrives; I'll let her determine the
best way to handle the tables for those.  Ditto for CML.

Let's also take the opportunity to drop a bit of effectively dead code
in the workarounds file too.

v2:
 - Include an already-reviewed patch from Anusha's DMC series as the
   first patch here that changes the naming of the revision ID tables,
   and then adjust the naming of the new tables I add here to follow the
   same convention.
 - Drop the pre-production revisions for all gen11 and earlier
   platforms; we're past the point where we usually drop the
   pre-production support.  intel_detect_preproduction_hw() is updated
   with the proper revids for ICL to ensure we print an error and taint
   the kernel if the kernel is loaded on a pre-production platform.
 - ICL workarounds that only apply to pre-production steppings are
   dropped.
 - For platforms where GT stepping is always the same as display
   stepping, we use a macro to assign them both at once to make it more
   obvious how the platform works.
 - Stepping tables for BXT and GLK are added.  They're completely unused
   in our current code (we have no stepping-specific workarounds), but
   some DMC patches from Anusha will arrive shortly that require these.
   Note that the BXT revision macros we had previously were completely
   wrong; it's a good thing they weren't actually being used for
   anything.

v3:
 - Use COMMON_STEP() macro on a few more platforms.  (Anusha)

v4:
 - s/COMMON_STEPPING/COMMON_STEP/ across whole series.  (Lucas)

Cc: Jani Nikula 
Cc: Anusha Srivatsa 
Cc: Lucas De Marchi 

Anusha Srivatsa (1):
  drm/i915/step: s/_revid_tbl/_revids

Matt Roper (11):
  drm/i915: Make pre-production detection use direct revid comparison
  drm/i915/skl: Use revid->stepping tables
  drm/i915/kbl: Drop pre-production revision from stepping table
  drm/i915/bxt: Use revid->stepping tables
  drm/i915/glk: Use revid->stepping tables
  drm/i915/icl: Use revid->stepping tables
  drm/i915/jsl_ehl: Use revid->stepping tables
  drm/i915/rkl: Use revid->stepping tables
  drm/i915/dg1: Use revid->stepping tables
  drm/i915/cnl: Drop all workarounds
  drm/i915/icl: Drop workarounds that only apply to pre-production
steppings

 .../drm/i915/display/intel_display_power.c|   2 +-
 drivers/gpu/drm/i915/display/intel_dpll_mgr.c |   2 +-
 drivers/gpu/drm/i915/display/intel_psr.c  |   4 +-
 drivers/gpu/drm/i915/gt/intel_region_lmem.c   |   2 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 108 ++
 drivers/gpu/drm/i915/i915_drv.c   |   9 +-
 drivers/gpu/drm/i915/i915_drv.h   |  79 ++---
 drivers/gpu/drm/i915/intel_pm.c   |   2 +-
 drivers/gpu/drm/i915/intel_step.c | 104 ++---
 drivers/gpu/drm/i915/intel_step.h |   4 +
 10 files changed, 119 insertions(+), 197 deletions(-)

-- 
2.25.4



[Intel-gfx] [CI v4 11/12] drm/i915/cnl: Drop all workarounds

2021-07-13 Thread Matt Roper
All of the Cannon Lake hardware that came out had graphics fused off,
and our userspace drivers have already dropped their support for the
platform; CNL-specific code in i915 that isn't inherited by subsequent
platforms is effectively dead code.  Let's remove all of the
CNL-specific workarounds as a quick and easy first step.

References: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6899
Signed-off-by: Matt Roper 
Reviewed-by: Anusha Srivatsa 
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 55 -
 drivers/gpu/drm/i915/i915_drv.h |  7 ---
 2 files changed, 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 045bb794a3ad..1398f35affcb 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -514,35 +514,6 @@ static void cfl_ctx_workarounds_init(struct 
intel_engine_cs *engine,
 GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 }
 
-static void cnl_ctx_workarounds_init(struct intel_engine_cs *engine,
-struct i915_wa_list *wal)
-{
-   /* WaForceContextSaveRestoreNonCoherent:cnl */
-   wa_masked_en(wal, CNL_HDC_CHICKEN0,
-HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
-
-   /* WaDisableReplayBufferBankArbitrationOptimization:cnl */
-   wa_masked_en(wal, COMMON_SLICE_CHICKEN2,
-GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
-   /* WaPushConstantDereferenceHoldDisable:cnl */
-   wa_masked_en(wal, GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
-
-   /* FtrEnableFastAnisoL1BankingFix:cnl */
-   wa_masked_en(wal, HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
-
-   /* WaDisable3DMidCmdPreemption:cnl */
-   wa_masked_dis(wal, GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
-
-   /* WaDisableGPGPUMidCmdPreemption:cnl */
-   wa_masked_field_set(wal, GEN8_CS_CHICKEN1,
-   GEN9_PREEMPT_GPGPU_LEVEL_MASK,
-   GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
-
-   /* WaDisableEarlyEOT:cnl */
-   wa_masked_en(wal, GEN8_ROW_CHICKEN, DISABLE_EARLY_EOT);
-}
-
 static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
 struct i915_wa_list *wal)
 {
@@ -711,8 +682,6 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
gen12_ctx_workarounds_init(engine, wal);
else if (GRAPHICS_VER(i915) == 11)
icl_ctx_workarounds_init(engine, wal);
-   else if (IS_CANNONLAKE(i915))
-   cnl_ctx_workarounds_init(engine, wal);
else if (IS_COFFEELAKE(i915) || IS_COMETLAKE(i915))
cfl_ctx_workarounds_init(engine, wal);
else if (IS_GEMINILAKE(i915))
@@ -989,15 +958,6 @@ icl_wa_init_mcr(struct drm_i915_private *i915, struct 
i915_wa_list *wal)
wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
 }
 
-static void
-cnl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list 
*wal)
-{
-   /* WaInPlaceDecompressionHang:cnl */
-   wa_write_or(wal,
-   GEN9_GAMT_ECO_REG_RW_IA,
-   GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
-}
-
 static void
 icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list 
*wal)
 {
@@ -1147,8 +1107,6 @@ gt_init_workarounds(struct drm_i915_private *i915, struct 
i915_wa_list *wal)
gen12_gt_workarounds_init(i915, wal);
else if (GRAPHICS_VER(i915) == 11)
icl_gt_workarounds_init(i915, wal);
-   else if (IS_CANNONLAKE(i915))
-   cnl_gt_workarounds_init(i915, wal);
else if (IS_COFFEELAKE(i915) || IS_COMETLAKE(i915))
cfl_gt_workarounds_init(i915, wal);
else if (IS_GEMINILAKE(i915))
@@ -1425,17 +1383,6 @@ static void cml_whitelist_build(struct intel_engine_cs 
*engine)
cfl_whitelist_build(engine);
 }
 
-static void cnl_whitelist_build(struct intel_engine_cs *engine)
-{
-   struct i915_wa_list *w = &engine->whitelist;
-
-   if (engine->class != RENDER_CLASS)
-   return;
-
-   /* WaEnablePreemptionGranularityControlByUMD:cnl */
-   whitelist_reg(w, GEN8_CS_CHICKEN1);
-}
-
 static void icl_whitelist_build(struct intel_engine_cs *engine)
 {
struct i915_wa_list *w = &engine->whitelist;
@@ -1549,8 +1496,6 @@ void intel_engine_init_whitelist(struct intel_engine_cs 
*engine)
tgl_whitelist_build(engine);
else if (GRAPHICS_VER(i915) == 11)
icl_whitelist_build(engine);
-   else if (IS_CANNONLAKE(i915))
-   cnl_whitelist_build(engine);
else if (IS_COMETLAKE(i915))
cml_whitelist_build(engine);
else if (IS_COFFEELAKE(i915))
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d462b9434541..8682a5f557c5 100644
--- a/drivers/gpu/dr

[Intel-gfx] [CI v4 03/12] drm/i915/skl: Use revid->stepping tables

2021-07-13 Thread Matt Roper
Switch SKL to use a revid->stepping table as we're trying to do on all
platforms going forward.  Also drop the preproduction revisions and add
the newer steppings we hadn't already handled.

Note that SKL has a case where a newer revision ID corresponds to an
older GT/disp stepping (0x9 -> STEP_J0, 0xA -> STEP_I1).  Also, the lack
of a revision ID 0x8 in the table is intentional and not an oversight.
We'll re-write the KBL-specific comment to make it clear that these kinds
of quirks are expected.
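
To make the two quirks concrete, here is a small sketch under stated
assumptions: the revision IDs and stepping names come from the skl_revids table
in this patch, while the relative order of the enum values (I1 below J0) and
the helper skl_like_step() are assumptions of the sketch rather than something
taken from intel_step.h.

```c
#include <stddef.h>

/* Assumed ordering: later steppings compare greater than earlier ones. */
enum step {
	STEP_NONE = 0,
	STEP_G0,
	STEP_H0,
	STEP_I0,
	STEP_I1,
	STEP_J0,
};

/*
 * Sparse table using designated initializers.  Indices that are not listed
 * (0x0-0x5 and 0x8) are zero-initialized, i.e. they read as STEP_NONE.
 */
static const enum step skl_like_revids[] = {
	[0x6] = STEP_G0,
	[0x7] = STEP_H0,
	[0x9] = STEP_J0,
	[0xA] = STEP_I1,
};

static enum step skl_like_step(unsigned int revid)
{
	size_t size = sizeof(skl_like_revids) / sizeof(skl_like_revids[0]);

	return revid < size ? skl_like_revids[revid] : STEP_NONE;
}
```

With this layout the higher revision ID 0xA yields a lower stepping than 0x9,
and the intentional hole at 0x8 simply reads back as STEP_NONE.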

v2:
 - Since GT and display steppings are always identical on SKL use a
   macro to set both values at once in a more readable manner.  (Anusha)
 - Drop preproduction steppings.

Bspec: 13626
Cc: Anusha Srivatsa 
Signed-off-by: Matt Roper 
Reviewed-by: Anusha Srivatsa 
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c |  2 +-
 drivers/gpu/drm/i915/i915_drv.h | 11 +---
 drivers/gpu/drm/i915/intel_step.c   | 30 +
 drivers/gpu/drm/i915/intel_step.h   |  4 +++
 4 files changed, 31 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 72562c233ad2..9f7cd2e54894 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -890,7 +890,7 @@ skl_gt_workarounds_init(struct drm_i915_private *i915, 
struct i915_wa_list *wal)
GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE);
 
/* WaInPlaceDecompressionHang:skl */
-   if (IS_SKL_REVID(i915, SKL_REVID_H0, REVID_FOREVER))
+   if (IS_SKL_GT_STEP(i915, STEP_H0, STEP_FOREVER))
wa_write_or(wal,
GEN9_GAMT_ECO_REG_RW_IA,
GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c4747f4407ef..f30499ed6787 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1515,16 +1515,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define IS_TGL_Y(dev_priv) \
IS_SUBPLATFORM(dev_priv, INTEL_TIGERLAKE, INTEL_SUBPLATFORM_ULX)
 
-#define SKL_REVID_A0   0x0
-#define SKL_REVID_B0   0x1
-#define SKL_REVID_C0   0x2
-#define SKL_REVID_D0   0x3
-#define SKL_REVID_E0   0x4
-#define SKL_REVID_F0   0x5
-#define SKL_REVID_G0   0x6
-#define SKL_REVID_H0   0x7
-
-#define IS_SKL_REVID(p, since, until) (IS_SKYLAKE(p) && IS_REVID(p, since, 
until))
+#define IS_SKL_GT_STEP(p, since, until) (IS_SKYLAKE(p) && IS_GT_STEP(p, since, 
until))
 
 #define BXT_REVID_A0   0x0
 #define BXT_REVID_A1   0x1
diff --git a/drivers/gpu/drm/i915/intel_step.c b/drivers/gpu/drm/i915/intel_step.c
index 93ccd42f2514..4e6a2b3b4f8a 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -7,14 +7,31 @@
 #include "intel_step.h"
 
 /*
- * KBL revision ID ordering is bizarre; higher revision ID's map to lower
- * steppings in some cases.  So rather than test against the revision ID
- * directly, let's map that into our own range of increasing ID's that we
- * can test against in a regular manner.
+ * Some platforms have unusual ways of mapping PCI revision ID to GT/display
+ * steppings.  E.g., in some cases a higher PCI revision may translate to a
+ * lower stepping of the GT and/or display IP.  This file provides lookup
+ * tables to map the PCI revision into a standard set of stepping values that
+ * can be compared numerically.
+ *
+ * Also note that some revisions/steppings may have been set aside as
+ * placeholders but never materialized in real hardware; in those cases there
+ * may be jumps in the revision IDs or stepping values in the tables below.
  */
 
+/*
+ * Some platforms always have the same stepping value for GT and display;
+ * use a macro to define these to make it easier to identify the platforms
+ * where the two steppings can deviate.
+ */
+#define COMMON_STEP(x)  .gt_step = STEP_##x, .display_step = STEP_##x
+
+static const struct intel_step_info skl_revids[] = {
+   [0x6] = { COMMON_STEP(G0) },
+   [0x7] = { COMMON_STEP(H0) },
+   [0x9] = { COMMON_STEP(J0) },
+   [0xA] = { COMMON_STEP(I1) },
+};
 
-/* FIXME: what about REVID_E0 */
 static const struct intel_step_info kbl_revids[] = {
[0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
[1] = { .gt_step = STEP_B0, .display_step = STEP_B0 },
@@ -76,6 +93,9 @@ void intel_step_init(struct drm_i915_private *i915)
} else if (IS_KABYLAKE(i915)) {
revids = kbl_revids;
size = ARRAY_SIZE(kbl_revids);
+   } else if (IS_SKYLAKE(i915)) {
+   revids = skl_revids;
+   size = ARRAY_SIZE(skl_revids);
}
 
/* Not using the stepping scheme for the platform yet. */
diff --git a/drivers/gpu/drm/i915/intel_step.h b/drivers/gpu/drm/i915/intel_step.h
index 958a8bb5d677..88a771

[Intel-gfx] [CI v4 10/12] drm/i915/dg1: Use revid->stepping tables

2021-07-13 Thread Matt Roper
Switch DG1 to use a revid->stepping table as we're trying to do on all
platforms going forward.

This removes the last use of IS_REVID() and REVID_FOREVER, so remove
those now-unused macros as well to prevent their accidental use on
future platforms.
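
The stepping-based replacement for IS_REVID()/REVID_FOREVER can be pictured
roughly as below.  This is a sketch, not the driver's macro: it assumes an
inclusive [since, until] range (consistent with checks such as
"STEP_A0, STEP_A0" in the diff), an open-ended STEP_FOREVER sentinel, and that
an unknown stepping (STEP_NONE) never matches.  The enum values and the
function name step_in_range() are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stepping values; STEP_FOREVER is an upper-bound sentinel. */
enum step {
	STEP_NONE = 0,
	STEP_A0,
	STEP_B0,
	STEP_C0,
	STEP_FOREVER,
};

/* Inclusive range check on a stepping that was looked up from a table. */
static bool step_in_range(enum step cur, enum step since, enum step until)
{
	if (cur == STEP_NONE)
		return false;

	return cur >= since && cur <= until;
}
```

A check like step_in_range(step, STEP_B0, STEP_FOREVER) then reads as "B0 and
everything newer", without any reference to the raw PCI revision ID.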

v2:
 - Use COMMON_STEP() macro in table.  (Anusha)

Bspec: 44463
Cc: Anusha Srivatsa 
Signed-off-by: Matt Roper 
Reviewed-by: Anusha Srivatsa 
---
 .../gpu/drm/i915/display/intel_display_power.c |  2 +-
 drivers/gpu/drm/i915/gt/intel_region_lmem.c|  2 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c| 10 +-
 drivers/gpu/drm/i915/i915_drv.h| 18 --
 drivers/gpu/drm/i915/intel_pm.c|  2 +-
 drivers/gpu/drm/i915/intel_step.c  |  8 
 6 files changed, 20 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c
index d92db471411e..64be896bcd8b 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -5799,7 +5799,7 @@ static void tgl_bw_buddy_init(struct drm_i915_private 
*dev_priv)
int config, i;
 
if (IS_ALDERLAKE_S(dev_priv) ||
-   IS_DG1_REVID(dev_priv, DG1_REVID_A0, DG1_REVID_A0) ||
+   IS_DG1_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) ||
IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_B0))
/* Wa_1409767108:tgl,dg1,adl-s */
table = wa_1409767108_buddy_page_masks;
diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
index 1f43aba2e9e2..50d11a84e7a9 100644
--- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
@@ -157,7 +157,7 @@ intel_gt_setup_fake_lmem(struct intel_gt *gt)
 static bool get_legacy_lowmem_region(struct intel_uncore *uncore,
 u64 *start, u32 *size)
 {
-   if (!IS_DG1_REVID(uncore->i915, DG1_REVID_A0, DG1_REVID_B0))
+   if (!IS_DG1_GT_STEP(uncore->i915, STEP_A0, STEP_B0))
return false;
 
*start = 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 0cab641de40f..045bb794a3ad 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1118,7 +1118,7 @@ dg1_gt_workarounds_init(struct drm_i915_private *i915, 
struct i915_wa_list *wal)
gen12_gt_workarounds_init(i915, wal);
 
/* Wa_1607087056:dg1 */
-   if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0))
+   if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0))
wa_write_or(wal,
SLICE_UNIT_LEVEL_CLKGATE,
L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
@@ -1529,7 +1529,7 @@ static void dg1_whitelist_build(struct intel_engine_cs 
*engine)
tgl_whitelist_build(engine);
 
/* GEN:BUG:1409280441:dg1 */
-   if (IS_DG1_REVID(engine->i915, DG1_REVID_A0, DG1_REVID_A0) &&
+   if (IS_DG1_GT_STEP(engine->i915, STEP_A0, STEP_A0) &&
(engine->class == RENDER_CLASS ||
 engine->class == COPY_ENGINE_CLASS))
whitelist_reg_ext(w, RING_ID(engine->mmio_base),
@@ -1599,7 +1599,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct 
i915_wa_list *wal)
 {
struct drm_i915_private *i915 = engine->i915;
 
-   if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
+   if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0) ||
IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_A0)) {
/*
 * Wa_1607138336:tgl[a0],dg1[a0]
@@ -1645,7 +1645,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct 
i915_wa_list *wal)
}
 
if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) ||
-   IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
+   IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0) ||
IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
/* Wa_1409804808:tgl,rkl,dg1[a0],adl-s,adl-p */
wa_masked_en(wal, GEN7_ROW_CHICKEN2,
@@ -1659,7 +1659,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct 
i915_wa_list *wal)
}
 
 
-   if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
+   if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0) ||
IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
/*
 * Wa_1607030317:tgl
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9195131cf90f..d462b9434541 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1323,19 +1323,10 @@ static inline struct drm_i915_private 
*pdev_to_i915(struct pci_dev *pdev)
 #define IS_DISPLAY_VER(i915, from, until) \
(DISPLAY_VER(i915) >= (from) && DISPLAY_VER(i915) <= (until))
 
-#define REVID_FOREVER  0xff
 #define INTEL_REVID(dev_priv)  (to

[Intel-gfx] ✓ Fi.CI.BAT: success for Minor revid/stepping and workaround cleanup (rev4)

2021-07-13 Thread Patchwork
== Series Details ==

Series: Minor revid/stepping and workaround cleanup (rev4)
URL   : https://patchwork.freedesktop.org/series/92299/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10342 -> Patchwork_20588


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20588/index.html

Known issues


  Here are the changes found in Patchwork_20588 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_basic@query-info:
    - fi-bsw-kefka:   NOTRUN -> [SKIP][1] ([fdo#109271]) +17 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20588/fi-bsw-kefka/igt@amdgpu/amd_ba...@query-info.html

  * igt@gem_exec_gttfill@basic:
    - fi-bsw-n3050:   NOTRUN -> [SKIP][2] ([fdo#109271])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20588/fi-bsw-n3050/igt@gem_exec_gttf...@basic.html

  * igt@gem_exec_suspend@basic-s3:
    - fi-bsw-n3050:   NOTRUN -> [INCOMPLETE][3] ([i915#3159])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20588/fi-bsw-n3050/igt@gem_exec_susp...@basic-s3.html

  
#### Possible fixes ####

  * igt@i915_pm_rpm@module-reload:
    - fi-kbl-soraka:  [DMESG-WARN][4] ([i915#1982]) -> [PASS][5]
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10342/fi-kbl-soraka/igt@i915_pm_...@module-reload.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20588/fi-kbl-soraka/igt@i915_pm_...@module-reload.html

  * igt@i915_selftest@live@execlists:
    - fi-bsw-kefka:   [INCOMPLETE][6] ([i915#2782] / [i915#2940]) -> [PASS][7]
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10342/fi-bsw-kefka/igt@i915_selftest@l...@execlists.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20588/fi-bsw-kefka/igt@i915_selftest@l...@execlists.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2782]: https://gitlab.freedesktop.org/drm/intel/issues/2782
  [i915#2867]: https://gitlab.freedesktop.org/drm/intel/issues/2867
  [i915#2940]: https://gitlab.freedesktop.org/drm/intel/issues/2940
  [i915#3159]: https://gitlab.freedesktop.org/drm/intel/issues/3159
  [i915#3717]: https://gitlab.freedesktop.org/drm/intel/issues/3717


Participating hosts (38 -> 36)
--

  Additional (1): fi-bsw-n3050 
  Missing    (3): fi-ilk-m540 fi-bdw-samus fi-hsw-4200u 


Build changes
-

  * Linux: CI_DRM_10342 -> Patchwork_20588

  CI-20190529: 20190529
  CI_DRM_10342: 308b278ffbef846356ca6b220ef1aa908c22c5fd @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6137: 2fee489255f7a8cd6a584373c30e3d44a07a78ea @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20588: d62d0ec1b3aa504119130e0c61f1b16ad15f06a2 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

d62d0ec1b3aa drm/i915/icl: Drop workarounds that only apply to pre-production steppings
a66bc53c20b3 drm/i915/cnl: Drop all workarounds
28dea784c5d3 drm/i915/dg1: Use revid->stepping tables
d515112313b0 drm/i915/rkl: Use revid->stepping tables
95aec3b31d70 drm/i915/jsl_ehl: Use revid->stepping tables
ea04a99db33e drm/i915/icl: Use revid->stepping tables
bfc4af69ce3b drm/i915/glk: Use revid->stepping tables
27320b6e947b drm/i915/bxt: Use revid->stepping tables
1c898561ba39 drm/i915/kbl: Drop pre-production revision from stepping table
5c84247d6ac9 drm/i915/skl: Use revid->stepping tables
76bd03989163 drm/i915: Make pre-production detection use direct revid comparison
62d758aadfe3 drm/i915/step: s/_revid_tbl/_revids

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20588/index.html


Re: [Intel-gfx] [PATCH v2 07/12] drm/i915/icl: Use revid->stepping tables

2021-07-13 Thread Lucas De Marchi

On Fri, Jul 09, 2021 at 08:37:19PM -0700, Matt Roper wrote:

Switch ICL to use a revid->stepping table as we're trying to do on all
platforms going forward.  While we're at it, let's include some
additional steppings that have popped up, even if we don't yet have any
workarounds tied to those steppings (we probably need to audit our
workaround list soon to see if any of the bounds have moved or if new
workarounds have appeared).

Note that the current bspec table is missing information about how to
map PCI revision ID to GT/display steppings; it only provides an SoC
stepping.  The mapping to GT/display steppings (which aren't always the
same as the SoC stepping) used to be in the bspec, but was apparently
dropped during an update in Nov 2019; I've made my changes here based on
an older bspec snapshot that still had the necessary information.  We've
requested that the missing information be restored.

I'm only including the production revids in the table here since we're
past the point at which we usually stop trying to support pre-production
hardware.  An appropriate check is added to
intel_detect_preproduction_hw() to print an error and taint the kernel
just in case someone still tries to load the driver on old
pre-production hardware.

v2:
- Drop pre-production steppings and add error/taint at startup when
  loading on pre-production hardware.


oh... I forgot to send my review. Here is the comment I had:

It seems we are not actually dropping the WAs. We have several applying
only to A0 or A0/B0. From your first paragraph, is the intention to do
an audit of the WA ranges later?  Because we are currently running
without applying those WAs, so those are effectively dead code.
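
The dead-code concern can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `toy_` names, the enum values, the table contents and the inclusive range check are hypothetical and are not the driver's definitions. Once the table maps only production revision IDs, a workaround bounded to A0..B0 can never match:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stepping values; TOY_STEP_NONE means "revid not in the table". */
enum toy_step {
	TOY_STEP_NONE = 0,
	TOY_STEP_A0,
	TOY_STEP_B0,
	TOY_STEP_C0,
	TOY_STEP_D0,
	TOY_STEP_FOREVER,
};

/* Hypothetical table that lists production revids only. */
static const enum toy_step toy_revids[] = {
	[0x3] = TOY_STEP_C0,
	[0x4] = TOY_STEP_D0,
};

static enum toy_step toy_lookup(unsigned int revid)
{
	size_t n = sizeof(toy_revids) / sizeof(toy_revids[0]);

	return revid < n ? toy_revids[revid] : TOY_STEP_NONE;
}

/* Inclusive range check on the looked-up stepping (an assumption of this sketch). */
static bool toy_step_in_range(unsigned int revid, enum toy_step since,
			      enum toy_step until)
{
	enum toy_step step = toy_lookup(revid);

	return step != TOY_STEP_NONE && step >= since && step <= until;
}

/* Walks every possible 8-bit revid: the A0..B0 workaround never applies. */
static void toy_check_a0_b0_is_dead(void)
{
	for (unsigned int revid = 0; revid <= 0xff; revid++)
		assert(!toy_step_in_range(revid, TOY_STEP_A0, TOY_STEP_B0));
}
```

Calling `toy_check_a0_b0_is_dead()` passes for this table, while a check such as `toy_step_in_range(0x3, TOY_STEP_C0, TOY_STEP_FOREVER)` still matches.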

Lucas De Marchi


Re: [Intel-gfx] [PATCH] drm/i915/gt: Fix -EDEADLK handling regression

2021-07-13 Thread Daniel Vetter
On Thu, Jul 1, 2021 at 9:07 AM Maarten Lankhorst
 wrote:
> On 30-06-2021 at 18:44, Ville Syrjala wrote:
> > From: Ville Syrjälä 
> >
> > The conversion to ww mutexes failed to address the fence code which
> > already returns -EDEADLK when we run out of fences. Ww mutexes on
> > the other hand treat -EDEADLK as an internal errno value indicating
> > a need to restart the operation due to a deadlock. So now when the
> > fence code returns -EDEADLK the higher level code erroneously
> > restarts everything instead of returning the error to userspace
> > as is expected.
> >
> > To remedy this let's switch the fence code to use a different errno
> > value for this. -ENOBUFS seems like a semi-reasonable unique choice.
> > Apart from igt the only user of this I could find is sna, and even
> > there all we do is dump the current fence registers from debugfs
> > into the X server log. So no user visible functionality is affected.
> > If we really cared about preserving this we could of course convert
> > back to -EDEADLK higher up, but doesn't seem like that's worth
> > the hassle here.
> >
> > Not quite sure which commit specifically broke this, but I'll
> > just attribute it to the general gem ww mutex work.
> >
> > Cc: sta...@vger.kernel.org
> > Cc: Maarten Lankhorst 
> > Cc: Thomas Hellström 
> > Testcase: igt/gem_pread/exhaustion
> > Testcase: igt/gem_pwrite/basic-exhaustion
> > Testcase: igt/gem_fenced_exec_thrash/too-many-fences
> > Fixes: 80f0b679d6f0 ("drm/i915: Add an implementation for i915_gem_ww_ctx 
> > locking, v2.")
> > Signed-off-by: Ville Syrjälä 
> > ---
> >  drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c 
> > b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> > index cac7f3f44642..f8948de72036 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> > @@ -348,7 +348,7 @@ static struct i915_fence_reg *fence_find(struct 
> > i915_ggtt *ggtt)
> >   if (intel_has_pending_fb_unpin(ggtt->vm.i915))
> >   return ERR_PTR(-EAGAIN);
> >
> > - return ERR_PTR(-EDEADLK);
> > + return ERR_PTR(-ENOBUFS);
> >  }
> >
> >  int __i915_vma_pin_fence(struct i915_vma *vma)
>
> Makes sense..
>
> Reviewed-by: Maarten Lankhorst 
>
> Is it a slightly more recent commit? Might probably be the part that converts
> execbuffer to use ww locks.

- please cc: dri-devel on anything gem/gt related.
- this should probably be ENOSPC or something like that for at least a
seeming retention of errno consistency:

https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#recommended-ioctl-return-values
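
For context, the failure mode described in the quoted commit message can be sketched in plain C. This is a minimal illustration and not driver code: `toy_fence_find()` and `toy_pin_with_retry()` are hypothetical names, and the retry cap exists only so that the sketch terminates.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the fence allocator: fails when no fence is free. */
static int toy_fence_find(int free_fences, int exhausted_errno)
{
	return free_fences > 0 ? 0 : -exhausted_errno;
}

/*
 * Hypothetical caller following the ww-mutex convention that -EDEADLK
 * means "back off and restart the whole operation".
 * Returns the error that would reach userspace; *retries counts restarts.
 */
static int toy_pin_with_retry(int free_fences, int exhausted_errno,
			      int max_retries, int *retries)
{
	int err;

	assert(max_retries >= 0);
	*retries = 0;
	for (;;) {
		err = toy_fence_find(free_fences, exhausted_errno);
		if (err != -EDEADLK || *retries >= max_retries)
			return err;
		(*retries)++;	/* misread as a lock-ordering deadlock: restart */
	}
}
```

With `EDEADLK` as the exhaustion code the caller uses up every retry and still fails. With a distinct code such as `ENOBUFS` the error is returned on the first attempt.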

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH] drm/i915/gt: Fix -EDEADLK handling regression

2021-07-13 Thread Daniel Vetter
On Tue, Jul 13, 2021 at 9:58 PM Daniel Vetter  wrote:
>
> On Thu, Jul 1, 2021 at 9:07 AM Maarten Lankhorst
>  wrote:
> > On 30-06-2021 at 18:44, Ville Syrjala wrote:
> > > From: Ville Syrjälä 
> > >
> > > The conversion to ww mutexes failed to address the fence code which
> > > already returns -EDEADLK when we run out of fences. Ww mutexes on
> > > the other hand treat -EDEADLK as an internal errno value indicating
> > > a need to restart the operation due to a deadlock. So now when the
> > > fence code returns -EDEADLK the higher level code erroneously
> > > restarts everything instead of returning the error to userspace
> > > as is expected.
> > >
> > > To remedy this let's switch the fence code to use a different errno
> > > value for this. -ENOBUFS seems like a semi-reasonable unique choice.
> > > Apart from igt the only user of this I could find is sna, and even
> > > there all we do is dump the current fence registers from debugfs
> > > into the X server log. So no user visible functionality is affected.
> > > If we really cared about preserving this we could of course convert
> > > back to -EDEADLK higher up, but doesn't seem like that's worth
> > > the hassle here.
> > >
> > > Not quite sure which commit specifically broke this, but I'll
> > > just attribute it to the general gem ww mutex work.
> > >
> > > Cc: sta...@vger.kernel.org
> > > Cc: Maarten Lankhorst 
> > > Cc: Thomas Hellström 
> > > Testcase: igt/gem_pread/exhaustion
> > > Testcase: igt/gem_pwrite/basic-exhaustion
> > > Testcase: igt/gem_fenced_exec_thrash/too-many-fences
> > > Fixes: 80f0b679d6f0 ("drm/i915: Add an implementation for i915_gem_ww_ctx 
> > > locking, v2.")
> > > Signed-off-by: Ville Syrjälä 
> > > ---
> > >  drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c 
> > > b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> > > index cac7f3f44642..f8948de72036 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> > > @@ -348,7 +348,7 @@ static struct i915_fence_reg *fence_find(struct 
> > > i915_ggtt *ggtt)
> > >   if (intel_has_pending_fb_unpin(ggtt->vm.i915))
> > >   return ERR_PTR(-EAGAIN);
> > >
> > > - return ERR_PTR(-EDEADLK);
> > > + return ERR_PTR(-ENOBUFS);
> > >  }
> > >
> > >  int __i915_vma_pin_fence(struct i915_vma *vma)
> >
> > Makes sense..
> >
> > Reviewed-by: Maarten Lankhorst 
> >
> > Is it a slightly more recent commit? Might probably be the part that
> > converts execbuffer to use ww locks.
>
> - please cc: dri-devel on anything gem/gt related.
> - this should probably be ENOSPC or something like that for at least a
> seeming retention of errno consistency:
>
> https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#recommended-ioctl-return-values

Other option would be to map that back to EDEADLK in the execbuf ioctl
somewhere, so we retain a distinct errno code.
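
A minimal sketch of that option, assuming a hypothetical helper (`toy_ioctl_errno()` is not an existing i915 function) that runs only after all ww-mutex retry handling has finished, so the translated value can no longer be mistaken for a restart request:

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical translation at the ioctl boundary: the internal
 * "out of fences" code (-ENOBUFS) is turned back into the -EDEADLK
 * that userspace historically saw. Other values pass through unchanged.
 */
static int toy_ioctl_errno(int internal_err)
{
	assert(internal_err <= 0);	/* kernel-style: 0 or a negative errno */
	return internal_err == -ENOBUFS ? -EDEADLK : internal_err;
}
```

The internal code stays distinct for the locking logic, and only the value handed to userspace changes.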
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH v2 07/12] drm/i915/icl: Use revid->stepping tables

2021-07-13 Thread Matt Roper
On Tue, Jul 13, 2021 at 12:57:07PM -0700, Lucas De Marchi wrote:
> On Fri, Jul 09, 2021 at 08:37:19PM -0700, Matt Roper wrote:
> > Switch ICL to use a revid->stepping table as we're trying to do on all
> > platforms going forward.  While we're at it, let's include some
> > additional steppings that have popped up, even if we don't yet have any
> > workarounds tied to those steppings (we probably need to audit our
> > workaround list soon to see if any of the bounds have moved or if new
> > workarounds have appeared).
> > 
> > Note that the current bspec table is missing information about how to
> > map PCI revision ID to GT/display steppings; it only provides an SoC
> > stepping.  The mapping to GT/display steppings (which aren't always the
> > same as the SoC stepping) used to be in the bspec, but was apparently
> > dropped during an update in Nov 2019; I've made my changes here based on
> > an older bspec snapshot that still had the necessary information.  We've
> > requested that the missing information be restored.
> > 
> > I'm only including the production revids in the table here since we're
> > past the point at which we usually stop trying to support pre-production
> > hardware.  An appropriate check is added to
> > intel_detect_preproduction_hw() to print an error and taint the kernel
> > just in case someone still tries to load the driver on old
> > pre-production hardware.
> > 
> > v2:
> > - Drop pre-production steppings and add error/taint at startup when
> >   loading on pre-production hardware.
> 
> oh... I forgot to send my review. Here is the comment I had:
> 
> It seems we are not actually dropping the WAs. We have several applying
> only to A0 or A0/B0. From your first paragraph, is the intention to do
> an audit of the WA ranges later?  Because we are currently running
> without applying those WAs, so those are effectively dead code.

The actual dropping of workarounds for pre-production steppings happens
in patch #12.  But a more in-depth audit will be done in the future.


Matt

> 
> Lucas De Marchi

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795


Re: [Intel-gfx] [PATCH v2 07/12] drm/i915/icl: Use revid->stepping tables

2021-07-13 Thread Lucas De Marchi

On Tue, Jul 13, 2021 at 12:59:53PM -0700, Matt Roper wrote:
> On Tue, Jul 13, 2021 at 12:57:07PM -0700, Lucas De Marchi wrote:
> > On Fri, Jul 09, 2021 at 08:37:19PM -0700, Matt Roper wrote:
> > > Switch ICL to use a revid->stepping table as we're trying to do on all
> > > platforms going forward.  While we're at it, let's include some
> > > additional steppings that have popped up, even if we don't yet have any
> > > workarounds tied to those steppings (we probably need to audit our
> > > workaround list soon to see if any of the bounds have moved or if new
> > > workarounds have appeared).
> > >
> > > Note that the current bspec table is missing information about how to
> > > map PCI revision ID to GT/display steppings; it only provides an SoC
> > > stepping.  The mapping to GT/display steppings (which aren't always the
> > > same as the SoC stepping) used to be in the bspec, but was apparently
> > > dropped during an update in Nov 2019; I've made my changes here based on
> > > an older bspec snapshot that still had the necessary information.  We've
> > > requested that the missing information be restored.
> > >
> > > I'm only including the production revids in the table here since we're
> > > past the point at which we usually stop trying to support pre-production
> > > hardware.  An appropriate check is added to
> > > intel_detect_preproduction_hw() to print an error and taint the kernel
> > > just in case someone still tries to load the driver on old
> > > pre-production hardware.
> > >
> > > v2:
> > > - Drop pre-production steppings and add error/taint at startup when
> > >   loading on pre-production hardware.
> >
> > oh... I forgot to send my review. Here is the comment I had:
> >
> > It seems we are not actually dropping the WAs. We have several applying
> > only to A0 or A0/B0. From your first paragraph, is the intention to do
> > an audit of the WA ranges later?  Because we are currently running
> > without applying those WAs, so those are effectively dead code.
>
> The actual dropping of workarounds for pre-production steppings happens
> in patch #12.  But a more in-depth audit will be done in the future.

ahh, ok. Makes sense then.

Thanks
Lucas De Marchi

> Matt
>
> > Lucas De Marchi
>
> --
> Matt Roper
> Graphics Software Engineer
> VTT-OSGC Platform Enablement
> Intel Corporation
> (916) 356-2795


Re: [Intel-gfx] [CI v4 03/12] drm/i915/skl: Use revid->stepping tables

2021-07-13 Thread Lucas De Marchi

On Tue, Jul 13, 2021 at 12:36:26PM -0700, Matt Roper wrote:

Switch SKL to use a revid->stepping table as we're trying to do on all
platforms going forward.  Also drop the preproduction revisions and add
the newer steppings we hadn't already handled.

Note that SKL has a case where a newer revision ID corresponds to an
older GT/disp stepping (0x9 -> STEP_J0, 0xA -> STEP_I1).  Also, the lack
of a revision ID 0x8 in the table is intentional and not an oversight.
We'll re-write the KBL-specific comment to make it clear that these kinds
of quirks are expected.
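
The two quirks above (a hole at 0x8 and a higher revision ID mapping to a lower stepping) can be sketched with a sparse table built from designated initializers. This is a standalone illustration: the `toy_` names and the enum ordering are assumptions, and only the four revid-to-stepping pairs are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ordered stepping values; only the relative order matters. */
enum toy_skl_step { STEP_NONE = 0, STEP_G0, STEP_H0, STEP_I0, STEP_I1, STEP_J0 };

struct toy_step_info {
	enum toy_skl_step gt_step;
	enum toy_skl_step display_step;
};

/* Unlisted array slots are zero-initialized, so they must read as STEP_NONE. */
static_assert(STEP_NONE == 0, "holes in the table must read back as STEP_NONE");

#define TOY_COMMON_STEP(x) .gt_step = STEP_##x, .display_step = STEP_##x

static const struct toy_step_info toy_skl_revids[] = {
	[0x6] = { TOY_COMMON_STEP(G0) },
	[0x7] = { TOY_COMMON_STEP(H0) },
	/* 0x8 is intentionally absent */
	[0x9] = { TOY_COMMON_STEP(J0) },
	[0xA] = { TOY_COMMON_STEP(I1) },
};

/* Bounds-checked lookup; unknown or missing revids yield STEP_NONE. */
static struct toy_step_info toy_skl_step(unsigned int revid)
{
	static const struct toy_step_info none = { STEP_NONE, STEP_NONE };
	size_t n = sizeof(toy_skl_revids) / sizeof(toy_skl_revids[0]);

	return revid < n ? toy_skl_revids[revid] : none;
}
```

Because comparisons are made on the looked-up stepping and not on the raw revision ID, revid 0xA (I1) correctly sorts below revid 0x9 (J0).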

v2:
- Since GT and display steppings are always identical on SKL use a
  macro to set both values at once in a more readable manner.  (Anusha)
- Drop preproduction steppings.

Bspec: 13626
Cc: Anusha Srivatsa 
Signed-off-by: Matt Roper 
Reviewed-by: Anusha Srivatsa 



Reviewed-by: Lucas De Marchi 

Lucas De Marchi


---
drivers/gpu/drm/i915/gt/intel_workarounds.c |  2 +-
drivers/gpu/drm/i915/i915_drv.h | 11 +---
drivers/gpu/drm/i915/intel_step.c   | 30 +
drivers/gpu/drm/i915/intel_step.h   |  4 +++
4 files changed, 31 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 72562c233ad2..9f7cd2e54894 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -890,7 +890,7 @@ skl_gt_workarounds_init(struct drm_i915_private *i915, 
struct i915_wa_list *wal)
GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE);

/* WaInPlaceDecompressionHang:skl */
-   if (IS_SKL_REVID(i915, SKL_REVID_H0, REVID_FOREVER))
+   if (IS_SKL_GT_STEP(i915, STEP_H0, STEP_FOREVER))
wa_write_or(wal,
GEN9_GAMT_ECO_REG_RW_IA,
GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c4747f4407ef..f30499ed6787 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1515,16 +1515,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
#define IS_TGL_Y(dev_priv) \
IS_SUBPLATFORM(dev_priv, INTEL_TIGERLAKE, INTEL_SUBPLATFORM_ULX)

-#define SKL_REVID_A0   0x0
-#define SKL_REVID_B0   0x1
-#define SKL_REVID_C0   0x2
-#define SKL_REVID_D0   0x3
-#define SKL_REVID_E0   0x4
-#define SKL_REVID_F0   0x5
-#define SKL_REVID_G0   0x6
-#define SKL_REVID_H0   0x7
-
-#define IS_SKL_REVID(p, since, until) (IS_SKYLAKE(p) && IS_REVID(p, since, 
until))
+#define IS_SKL_GT_STEP(p, since, until) (IS_SKYLAKE(p) && IS_GT_STEP(p, since, 
until))

#define BXT_REVID_A0   0x0
#define BXT_REVID_A1   0x1
diff --git a/drivers/gpu/drm/i915/intel_step.c 
b/drivers/gpu/drm/i915/intel_step.c
index 93ccd42f2514..4e6a2b3b4f8a 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -7,14 +7,31 @@
#include "intel_step.h"

/*
- * KBL revision ID ordering is bizarre; higher revision ID's map to lower
- * steppings in some cases.  So rather than test against the revision ID
- * directly, let's map that into our own range of increasing ID's that we
- * can test against in a regular manner.
+ * Some platforms have unusual ways of mapping PCI revision ID to GT/display
+ * steppings.  E.g., in some cases a higher PCI revision may translate to a
+ * lower stepping of the GT and/or display IP.  This file provides lookup
+ * tables to map the PCI revision into a standard set of stepping values that
+ * can be compared numerically.
+ *
+ * Also note that some revisions/steppings may have been set aside as
+ * placeholders but never materialized in real hardware; in those cases there
+ * may be jumps in the revision IDs or stepping values in the tables below.
 */

+/*
+ * Some platforms always have the same stepping value for GT and display;
+ * use a macro to define these to make it easier to identify the platforms
+ * where the two steppings can deviate.
+ */
+#define COMMON_STEP(x)  .gt_step = STEP_##x, .display_step = STEP_##x
+
+static const struct intel_step_info skl_revids[] = {
+   [0x6] = { COMMON_STEP(G0) },
+   [0x7] = { COMMON_STEP(H0) },
+   [0x9] = { COMMON_STEP(J0) },
+   [0xA] = { COMMON_STEP(I1) },
+};

-/* FIXME: what about REVID_E0 */
static const struct intel_step_info kbl_revids[] = {
[0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
[1] = { .gt_step = STEP_B0, .display_step = STEP_B0 },
@@ -76,6 +93,9 @@ void intel_step_init(struct drm_i915_private *i915)
} else if (IS_KABYLAKE(i915)) {
revids = kbl_revids;
size = ARRAY_SIZE(kbl_revids);
+   } else if (IS_SKYLAKE(i915)) {
+   revids = skl_revids;
+   size = ARRAY_SIZE(skl_revids);
}

/* Not using the stepping scheme for the platform yet. */
diff --git a/dri

Re: [Intel-gfx] [PATCH v4 02/18] drm/sched: Barriers are needed for entity->last_scheduled

2021-07-13 Thread Andrey Grodzovsky


On 2021-07-13 5:10 a.m., Daniel Vetter wrote:

On Tue, Jul 13, 2021 at 9:25 AM Christian König
 wrote:

On 13.07.21 at 08:50, Daniel Vetter wrote:

On Tue, Jul 13, 2021 at 8:35 AM Christian König
 wrote:

On 12.07.21 at 19:53, Daniel Vetter wrote:

It might be good enough on x86 with just READ_ONCE, but the write side
should then at least be WRITE_ONCE because x86 has total store order.

It's definitely not enough on arm.

Fix this properly, which means
- explain the need for the barrier in both places
- point at the other side in each comment

Also pull out the !sched_list case as the first check, so that the
code flow is clearer.

While at it sprinkle some comments around because it was very
non-obvious to me what's actually going on here and why.

Note that we really need full barriers here, at first I thought
store-release and load-acquire on ->last_scheduled would be enough,
but we actually require ordering between that and the queue state.

v2: Put smp_rmb() in the right place and fix up comment (Andrey)

Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Steven Price 
Cc: Daniel Vetter 
Cc: Andrey Grodzovsky 
Cc: Lee Jones 
Cc: Boris Brezillon 
---
drivers/gpu/drm/scheduler/sched_entity.c | 27 ++--
1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index f7347c284886..89e3f6eaf519 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -439,8 +439,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
drm_sched_entity *entity)
dma_fence_set_error(&sched_job->s_fence->finished, -ECANCELED);

dma_fence_put(entity->last_scheduled);
+
entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished);

+ /*
+  * If the queue is empty we allow drm_sched_entity_select_rq() to
+  * locklessly access ->last_scheduled. This only works if we set the
+  * pointer before we dequeue and if we have a write barrier here.
+  */
+ smp_wmb();
+

Again, conceptual those barriers should be part of the spsc_queue
container and not externally.

That would be extremely unusual api. Let's assume that your queue is
very dumb, and protected by a simple lock. That's about the maximum
any user could expect.

But then you still need barriers here, because linux locks (spinlock,
mutex) are defined to be one-way barriers: Stuff that's inside is
guaranteed to be done inside, but stuff outside of the locked region
can leak in. They're load-acquire/store-release barriers. So not good
enough.

You really need to have barriers here, and they really all need to be
documented properly. And yes that's a shit-ton of work in drm/sched,
because it's full of yolo lockless stuff.

The other case you could make is that this works like a wakeup queue,
or similar. The rules there are:
- wake_up (i.e. pushing something into the queue) is a store-release barrier
- being woken up (i.e. popping an entry) is a load-acquire barrier
Which is obviously needed because otherwise you don't have coherency
for the data queued up. And again not the barriers you're looking for
here.
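
The wake-up-queue rule above can be sketched with C11 userspace atomics. This is an illustration only: the single-slot `toy_slot` mailbox and the `toy_` functions are hypothetical, and kernel code would use primitives such as smp_store_release() and smp_load_acquire() instead of `<stdatomic.h>`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct toy_job {
	int payload;
};

/* Hypothetical single-slot mailbox standing in for a queue. */
struct toy_slot {
	_Atomic(struct toy_job *) job;
};

/* Producer side ("wake_up"): the store-release publishes the payload write. */
static void toy_push(struct toy_slot *slot, struct toy_job *job, int payload)
{
	assert(job != NULL);
	job->payload = payload;		/* plain write, ordered before the publish */
	atomic_store_explicit(&slot->job, job, memory_order_release);
}

/* Consumer side ("woken up"): the acquire pairs with the producer's release. */
static struct toy_job *toy_pop(struct toy_slot *slot)
{
	return atomic_exchange_explicit(&slot->job, NULL, memory_order_acquire);
}
```

A consumer that gets a non-NULL job from `toy_pop()` is guaranteed to see the payload written before the matching `toy_push()`. This pairing only orders the queued data against the publish. It does not order the queue state against a separate field such as ->last_scheduled, which is why the thread argues that additional, explicitly commented barriers are needed.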

Exactly that was the idea, yes.


Either way, we'd still need the comments, because it's still lockless
trickery, and every single one of that needs to have a comment on both
sides to explain what's going on.

Essentially replace spsc_queue with an llist underneath, and that's
the amount of barriers a data structure should provide. Anything else
is asking your datastructure to paper over bugs in your users.

This is similar to how atomic_t is by default completely unordered,
and users need to add barriers as needed, with comments.

My main problem is as always that kernel atomics work different than
userspace atomics.


I think this is all to make sure people don't just write lockless algorithms
because it's a cool idea, but are forced to think this all through.
Which seems to not have happened very consistently for drm/sched, so I
guess needs to be fixed.

Well at least initially that was all perfectly thought through. The
problem is nobody is really maintaining that stuff.


I'm definitely not going to hide all that by making the spsc_queue
stuff provide random unjustified barriers just because that would
paper over drm/sched bugs. We need to fix the actual bugs, and
preferably all of them. I've found a few, but I wasn't involved in
drm/sched thus far, so best I can do is discover them as we go.

I don't think that those are random unjustified barriers at all and it
sounds like you didn't grasp what I said here.

See the spsc queue must have the following semantics:

1. When you pop a job all changes made before you push the job must be
visible.

This is the standard barriers that also wake-up queues have, it's just
store-release+load-acquire.


2. When the queue becomes empty all the changes made before you pop the
last job must be visibl

Re: [Intel-gfx] [PATCH] drm/i915: Fix wm params for ccs

2021-07-13 Thread Lucas De Marchi

On Tue, Jul 13, 2021 at 09:44:21PM +0300, Juha-Pekka Heikkila wrote:

skl_compute_plane_wm_params() didn't take into account ccs
modifiers on graphics ver >= 12
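
The shape of the fix can be sketched outside the driver. The enum values and `toy_` names below are hypothetical stand-ins; the driver's real helper is is_ccs_modifier(), and its exact list of modifiers is not reproduced here. The point is that a single predicate replaces the hand-written list, so a newer CCS modifier is picked up in both places at once:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical modifier IDs standing in for the I915_FORMAT_MOD_* values. */
enum toy_modifier {
	TOY_MOD_LINEAR,
	TOY_MOD_X_TILED,
	TOY_MOD_Y_TILED,
	TOY_MOD_Yf_TILED,
	TOY_MOD_Y_TILED_CCS,
	TOY_MOD_Yf_TILED_CCS,
	TOY_MOD_Y_TILED_GEN12_CCS,	/* stands in for a ver >= 12 CCS modifier */
};

/* One place that knows which modifiers carry CCS. */
static bool toy_is_ccs_modifier(enum toy_modifier m)
{
	switch (m) {
	case TOY_MOD_Y_TILED_CCS:
	case TOY_MOD_Yf_TILED_CCS:
	case TOY_MOD_Y_TILED_GEN12_CCS:
		return true;
	default:
		return false;
	}
}

struct toy_wm_params {
	bool y_tiled;
	bool x_tiled;
	bool rc_surface;
};

/* Mirrors the structure of the patched assignments, using the predicate. */
static struct toy_wm_params toy_compute_wm_params(enum toy_modifier m)
{
	struct toy_wm_params wp = {
		.y_tiled = m == TOY_MOD_Y_TILED || m == TOY_MOD_Yf_TILED ||
			   toy_is_ccs_modifier(m),
		.x_tiled = m == TOY_MOD_X_TILED,
		.rc_surface = toy_is_ccs_modifier(m),
	};

	assert(!(wp.x_tiled && wp.y_tiled));	/* a surface cannot be both */
	return wp;
}
```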

Signed-off-by: Juha-Pekka Heikkila 



Reviewed-by: Lucas De Marchi 

Lucas De Marchi


---
drivers/gpu/drm/i915/intel_pm.c | 6 ++
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 0cbb79452fcf..540a7ecbf004 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -5249,11 +5249,9 @@ skl_compute_wm_params(const struct intel_crtc_state 
*crtc_state,

wp->y_tiled = modifier == I915_FORMAT_MOD_Y_TILED ||
  modifier == I915_FORMAT_MOD_Yf_TILED ||
- modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
- modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
+ is_ccs_modifier(modifier);
wp->x_tiled = modifier == I915_FORMAT_MOD_X_TILED;
-   wp->rc_surface = modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
-modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
+   wp->rc_surface = is_ccs_modifier(modifier);
wp->is_planar = intel_format_info_is_yuv_semiplanar(format, modifier);

wp->width = width;
--
2.28.0



Re: [Intel-gfx] [PATCH] drm/i915/gt: Fix -EDEADLK handling regression

2021-07-13 Thread Rodrigo Vivi
On Tue, Jul 13, 2021 at 09:59:18PM +0200, Daniel Vetter wrote:
> On Tue, Jul 13, 2021 at 9:58 PM Daniel Vetter  wrote:
> >
> > On Thu, Jul 1, 2021 at 9:07 AM Maarten Lankhorst
> >  wrote:
> > > On 30-06-2021 at 18:44, Ville Syrjala wrote:
> > > > From: Ville Syrjälä 
> > > >
> > > > The conversion to ww mutexes failed to address the fence code which
> > > > already returns -EDEADLK when we run out of fences. Ww mutexes on
> > > > the other hand treat -EDEADLK as an internal errno value indicating
> > > > a need to restart the operation due to a deadlock. So now when the
> > > > fence code returns -EDEADLK the higher level code erroneously
> > > > restarts everything instead of returning the error to userspace
> > > > as is expected.
> > > >
> > > > To remedy this let's switch the fence code to use a different errno
> > > > value for this. -ENOBUFS seems like a semi-reasonable unique choice.
> > > > Apart from igt the only user of this I could find is sna, and even
> > > > there all we do is dump the current fence registers from debugfs
> > > > into the X server log. So no user visible functionality is affected.
> > > > If we really cared about preserving this we could of course convert
> > > > back to -EDEADLK higher up, but doesn't seem like that's worth
> > > > the hassle here.
> > > >
> > > > Not quite sure which commit specifically broke this, but I'll
> > > > just attribute it to the general gem ww mutex work.
> > > >
> > > > Cc: sta...@vger.kernel.org
> > > > Cc: Maarten Lankhorst 
> > > > Cc: Thomas Hellström 
> > > > Testcase: igt/gem_pread/exhaustion
> > > > Testcase: igt/gem_pwrite/basic-exhaustion
> > > > Testcase: igt/gem_fenced_exec_thrash/too-many-fences
> > > > Fixes: 80f0b679d6f0 ("drm/i915: Add an implementation for 
> > > > i915_gem_ww_ctx locking, v2.")
> > > > Signed-off-by: Ville Syrjälä 
> > > > ---
> > > >  drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c 
> > > > b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> > > > index cac7f3f44642..f8948de72036 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> > > > @@ -348,7 +348,7 @@ static struct i915_fence_reg *fence_find(struct 
> > > > i915_ggtt *ggtt)
> > > >   if (intel_has_pending_fb_unpin(ggtt->vm.i915))
> > > >   return ERR_PTR(-EAGAIN);
> > > >
> > > > - return ERR_PTR(-EDEADLK);
> > > > + return ERR_PTR(-ENOBUFS);
> > > >  }
> > > >
> > > >  int __i915_vma_pin_fence(struct i915_vma *vma)
> > >
> > > Makes sense..
> > >
> > > Reviewed-by: Maarten Lankhorst 
> > >
> > > Is it a slightly more recent commit? Might probably be the part that
> > > converts execbuffer to use ww locks.
> >
> > - please cc: dri-devel on anything gem/gt related.
> > - this should probably be ENOSPC or something like that for at least a
> > seeming retention of errno consistency:
> >
> > https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#recommended-ioctl-return-values
> 
> Other option would be to map that back to EDEADLK in the execbuf ioctl
> somewhere, so we retain a distinct errno code.

I'm about to push this patch to drm-intel-fixes... I'm assuming if there's any
fix it will be a follow-up patch and not a revert or force push, right?!

> -Daniel
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/fb-helper: Try to protect cleanup against delayed setup

2021-07-13 Thread Sam Ravnborg
Hi Daniel,

On Tue, Jul 13, 2021 at 03:59:22PM +0200, Daniel Vetter wrote:
> Some vague evidence suggests this can go wrong. Try to prevent it by
> holding the right mutex and clearing ->deferred_setup to make sure we
> later on don't accidentally try to re-register the fbdev when the
> driver thought it had it all cleaned up already.
> 
> v2: I realized that this is fundamentally butchered, and CI complained
> about lockdep splats. So limit the critical section again and just add
> a few notes what the proper fix is.
> 
> References: 
> https://intel-gfx-ci.01.org/tree/linux-next/next-20201215/fi-byt-j1900/igt@i915_pm_...@module-reload.html
> Signed-off-by: Daniel Vetter 
> Cc: Ville Syrjälä 
> Cc: Chris Wilson 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Thomas Zimmermann 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> ---
>  drivers/gpu/drm/drm_fb_helper.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> index 9d82fda274eb..8f11e5abb222 100644
> --- a/drivers/gpu/drm/drm_fb_helper.c
> +++ b/drivers/gpu/drm/drm_fb_helper.c
> @@ -598,6 +598,9 @@ EXPORT_SYMBOL(drm_fb_helper_alloc_fbi);
>   * A wrapper around unregister_framebuffer, to release the fb_info
>   * framebuffer device. This must be called before releasing all resources for
>   * @fb_helper by calling drm_fb_helper_fini().
> + *
> + * Note that this is fundamentally racy on hotunload because it doen't handle
s/doen't/doesn't/
> + * open fbdev file descriptors at all. Use drm_fbdev_generic_setup() instead.
>   */
>  void drm_fb_helper_unregister_fbi(struct drm_fb_helper *fb_helper)
>  {
> @@ -611,6 +614,9 @@ EXPORT_SYMBOL(drm_fb_helper_unregister_fbi);
>   * @fb_helper: driver-allocated fbdev helper, can be NULL
>   *
>   * This cleans up all remaining resources associated with @fb_helper.
> + *
> + * Note that this is fundamentally racy on hotunload because it doen't handle
s/doen't/doesn't/
> + * open fbdev file descriptors at all. Use drm_fbdev_generic_setup() instead.
>   */
>  void drm_fb_helper_fini(struct drm_fb_helper *fb_helper)
>  {
> @@ -2382,6 +2388,10 @@ static void drm_fbdev_client_unregister(struct 
> drm_client_dev *client)
>  {
>   struct drm_fb_helper *fb_helper = drm_fb_helper_from_client(client);
>  
> + mutex_lock(&fb_helper->lock);
> + fb_helper->deferred_setup = false;
> + mutex_unlock(&fb_helper->lock);
> +
>   if (fb_helper->fbdev)
>   /* drm_fbdev_fb_destroy() takes care of cleanup */
>   drm_fb_helper_unregister_fbi(fb_helper);

I could not find any better spot to clear deferred_setup - so I think
this is OK.
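
The intent of the hunk can be sketched in userspace with a pthread mutex. This is a simplified illustration under stated assumptions: `toy_fb_helper`, `toy_hotplug()` and `toy_unregister()` are hypothetical, and the sketch covers only the deferred-setup flag, not the open-file-descriptor race that the added comments warn about.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct toy_fb_helper {
	pthread_mutex_t lock;
	bool deferred_setup;	/* setup postponed until a display shows up */
	bool registered;
};

/* Delayed path (e.g. a hotplug event): registers only if setup is still pending. */
static void toy_hotplug(struct toy_fb_helper *h)
{
	assert(h);
	pthread_mutex_lock(&h->lock);
	if (h->deferred_setup) {
		h->deferred_setup = false;
		h->registered = true;
	}
	pthread_mutex_unlock(&h->lock);
}

/* Teardown: cancel any pending setup under the same lock, then clean up. */
static void toy_unregister(struct toy_fb_helper *h)
{
	assert(h);
	pthread_mutex_lock(&h->lock);
	h->deferred_setup = false;
	pthread_mutex_unlock(&h->lock);

	h->registered = false;
}
```

After `toy_unregister()` has run, a late `toy_hotplug()` finds the flag cleared and does not re-register.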

With the two spelling issues fixed:
Acked-by: Sam Ravnborg 

No r-b as I am not too fluent in these code paths and all the locking.

Sam


Re: [Intel-gfx] [PATCH] drm/i915/gt: Fix -EDEADLK handling regression

2021-07-13 Thread Ville Syrjälä
On Tue, Jul 13, 2021 at 09:59:18PM +0200, Daniel Vetter wrote:
> On Tue, Jul 13, 2021 at 9:58 PM Daniel Vetter  wrote:
> >
> > On Thu, Jul 1, 2021 at 9:07 AM Maarten Lankhorst
> >  wrote:
> > > On 30-06-2021 at 18:44, Ville Syrjala wrote:
> > > > From: Ville Syrjälä 
> > > >
> > > > The conversion to ww mutexes failed to address the fence code which
> > > > already returns -EDEADLK when we run out of fences. Ww mutexes on
> > > > the other hand treat -EDEADLK as an internal errno value indicating
> > > > a need to restart the operation due to a deadlock. So now when the
> > > > fence code returns -EDEADLK the higher level code erroneously
> > > > restarts everything instead of returning the error to userspace
> > > > as is expected.
> > > >
> > > > To remedy this let's switch the fence code to use a different errno
> > > > value for this. -ENOBUFS seems like a semi-reasonable unique choice.
> > > > Apart from igt the only user of this I could find is sna, and even
> > > > there all we do is dump the current fence registers from debugfs
> > > > into the X server log. So no user visible functionality is affected.
> > > > If we really cared about preserving this we could of course convert
> > > > back to -EDEADLK higher up, but doesn't seem like that's worth
> > > > the hassle here.
> > > >
> > > > Not quite sure which commit specifically broke this, but I'll
> > > > just attribute it to the general gem ww mutex work.
> > > >
> > > > Cc: sta...@vger.kernel.org
> > > > Cc: Maarten Lankhorst 
> > > > Cc: Thomas Hellström 
> > > > Testcase: igt/gem_pread/exhaustion
> > > > Testcase: igt/gem_pwrite/basic-exhaustion
> > > > Testcase: igt/gem_fenced_exec_thrash/too-many-fences
> > > > Fixes: 80f0b679d6f0 ("drm/i915: Add an implementation for 
> > > > i915_gem_ww_ctx locking, v2.")
> > > > Signed-off-by: Ville Syrjälä 
> > > > ---
> > > >  drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c 
> > > > b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> > > > index cac7f3f44642..f8948de72036 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> > > > @@ -348,7 +348,7 @@ static struct i915_fence_reg *fence_find(struct 
> > > > i915_ggtt *ggtt)
> > > >   if (intel_has_pending_fb_unpin(ggtt->vm.i915))
> > > >   return ERR_PTR(-EAGAIN);
> > > >
> > > > - return ERR_PTR(-EDEADLK);
> > > > + return ERR_PTR(-ENOBUFS);
> > > >  }
> > > >
> > > >  int __i915_vma_pin_fence(struct i915_vma *vma)
> > >
> > > Makes sense..
> > >
> > > Reviewed-by: Maarten Lankhorst 
> > >
> > > > Is it a slightly more recent commit? Might probably be the part that
> > > converts execbuffer to use ww locks.
> >
> > - please cc: dri-devel on anything gem/gt related.

Thought I did. Apparently got lost somewhere.

> > - this should probably be ENOSPC or something like that for at least a
> > seeming retention of errno consistency:

ENOSPC is already used for other things.

> >
> > https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#recommended-ioctl-return-values
> 
> Other option would be to map that back to EDEADLK in the execbuf ioctl
> somewhere, so we retain a distinct errno code.

Already mentioned in the commit msg.

-- 
Ville Syrjälä
Intel


[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Fix wm params for ccs

2021-07-13 Thread Patchwork
== Series Details ==

Series: drm/i915: Fix wm params for ccs
URL   : https://patchwork.freedesktop.org/series/92491/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10342 -> Patchwork_20589


Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20589/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_20589:

### IGT changes ###

#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@gem_ctx_create@basic-files:
- {fi-tgl-1115g4}: [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10342/fi-tgl-1115g4/igt@gem_ctx_cre...@basic-files.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20589/fi-tgl-1115g4/igt@gem_ctx_cre...@basic-files.html

  
Known issues
------------

  Here are the changes found in Patchwork_20589 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_basic@query-info:
- fi-bsw-kefka:   NOTRUN -> [SKIP][3] ([fdo#109271]) +17 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20589/fi-bsw-kefka/igt@amdgpu/amd_ba...@query-info.html

  * igt@gem_exec_gttfill@basic:
- fi-bsw-n3050:   NOTRUN -> [SKIP][4] ([fdo#109271])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20589/fi-bsw-n3050/igt@gem_exec_gttf...@basic.html

  * igt@gem_exec_suspend@basic-s3:
- fi-bsw-n3050:   NOTRUN -> [INCOMPLETE][5] ([i915#3159])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20589/fi-bsw-n3050/igt@gem_exec_susp...@basic-s3.html

  * igt@i915_selftest@live@late_gt_pm:
- fi-bsw-nick:    [PASS][6] -> [DMESG-FAIL][7] ([i915#2927])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10342/fi-bsw-nick/igt@i915_selftest@live@late_gt_pm.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20589/fi-bsw-nick/igt@i915_selftest@live@late_gt_pm.html

  * igt@kms_chamelium@dp-crc-fast:
- fi-kbl-7500u:   [PASS][8] -> [FAIL][9] ([i915#1372])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10342/fi-kbl-7500u/igt@kms_chamel...@dp-crc-fast.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20589/fi-kbl-7500u/igt@kms_chamel...@dp-crc-fast.html

  * igt@runner@aborted:
- fi-bsw-nick:    NOTRUN -> [FAIL][10] ([fdo#109271] / [i915#1436])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20589/fi-bsw-nick/igt@run...@aborted.html

  
#### Possible fixes ####

  * igt@i915_pm_rpm@module-reload:
- fi-kbl-soraka:  [DMESG-WARN][11] ([i915#1982]) -> [PASS][12]
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10342/fi-kbl-soraka/igt@i915_pm_...@module-reload.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20589/fi-kbl-soraka/igt@i915_pm_...@module-reload.html

  * igt@i915_selftest@live@execlists:
- fi-bsw-kefka:   [INCOMPLETE][13] ([i915#2782] / [i915#2940]) -> 
[PASS][14]
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10342/fi-bsw-kefka/igt@i915_selftest@l...@execlists.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20589/fi-bsw-kefka/igt@i915_selftest@l...@execlists.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#1372]: https://gitlab.freedesktop.org/drm/intel/issues/1372
  [i915#1436]: https://gitlab.freedesktop.org/drm/intel/issues/1436
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2722]: https://gitlab.freedesktop.org/drm/intel/issues/2722
  [i915#2782]: https://gitlab.freedesktop.org/drm/intel/issues/2782
  [i915#2927]: https://gitlab.freedesktop.org/drm/intel/issues/2927
  [i915#2940]: https://gitlab.freedesktop.org/drm/intel/issues/2940
  [i915#3159]: https://gitlab.freedesktop.org/drm/intel/issues/3159
  [i915#3717]: https://gitlab.freedesktop.org/drm/intel/issues/3717


Participating hosts (38 -> 36)
------------------------------

  Additional (1): fi-bsw-n3050 
  Missing    (3): fi-ilk-m540 fi-bdw-samus fi-hsw-4200u 


Build changes
-------------

  * Linux: CI_DRM_10342 -> Patchwork_20589

  CI-20190529: 20190529
  CI_DRM_10342: 308b278ffbef846356ca6b220ef1aa908c22c5fd @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6137: 2fee489255f7a8cd6a584373c30e3d44a07a78ea @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20589: 285d35119aba4b515bf40966ec00b3993f00e866 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

285d35119aba drm/i915: Fix wm params for ccs

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20589/index.html

[Intel-gfx] [PATCH] drm/i915/ehl: Resolve insufficient header credits in MIPI DSI

2021-07-13 Thread Aria Kraft
MIPI DSI initialization on EHL can fail because too few header credits are
available.

To resolve this failure, this patch adds a header count to the existing 100us 
wait function.

It then adds a call to this modified function to request a single header credit 
during initialization.

Reviewed-by: Bob Paauwe 
Signed-off-by: Aria Kraft 
---
 drivers/gpu/drm/i915/display/icl_dsi.c | 19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/icl_dsi.c 
b/drivers/gpu/drm/i915/display/icl_dsi.c
index 43ec7fcd3f5d..fd836fdc6ec7 100644
--- a/drivers/gpu/drm/i915/display/icl_dsi.c
+++ b/drivers/gpu/drm/i915/display/icl_dsi.c
@@ -54,12 +54,15 @@ static int payload_credits_available(struct 
drm_i915_private *dev_priv,
>> FREE_PLOAD_CREDIT_SHIFT;
 }
 
-static void wait_for_header_credits(struct drm_i915_private *dev_priv,
-   enum transcoder dsi_trans)
+static bool wait_for_header_credits(struct drm_i915_private *dev_priv,
+   enum transcoder dsi_trans, unsigned int 
credits)
 {
if (wait_for_us(header_credits_available(dev_priv, dsi_trans) >=
-   MAX_HEADER_CREDIT, 100))
+   credits, 100)) {
drm_err(&dev_priv->drm, "DSI header credits not released\n");
+   return false;
+   }
+   return true;
 }
 
 static void wait_for_payload_credits(struct drm_i915_private *dev_priv,
@@ -90,7 +93,7 @@ static void wait_for_cmds_dispatched_to_panel(struct 
intel_encoder *encoder)
/* wait for header/payload credits to be released */
for_each_dsi_port(port, intel_dsi->ports) {
dsi_trans = dsi_port_to_transcoder(port);
-   wait_for_header_credits(dev_priv, dsi_trans);
+   wait_for_header_credits(dev_priv, dsi_trans, MAX_HEADER_CREDIT);
wait_for_payload_credits(dev_priv, dsi_trans);
}
 
@@ -108,7 +111,7 @@ static void wait_for_cmds_dispatched_to_panel(struct 
intel_encoder *encoder)
/* wait for header credits to be released */
for_each_dsi_port(port, intel_dsi->ports) {
dsi_trans = dsi_port_to_transcoder(port);
-   wait_for_header_credits(dev_priv, dsi_trans);
+   wait_for_header_credits(dev_priv, dsi_trans, MAX_HEADER_CREDIT);
}
 
/* wait for LP TX in progress bit to be cleared */
@@ -155,13 +158,9 @@ static int dsi_send_pkt_hdr(struct intel_dsi_host *host,
struct drm_i915_private *dev_priv = to_i915(intel_dsi->base.base.dev);
enum transcoder dsi_trans = dsi_port_to_transcoder(host->port);
u32 tmp;
-   int free_credits;
 
/* check if header credit available */
-   free_credits = header_credits_available(dev_priv, dsi_trans);
-   if (free_credits < 1) {
-   drm_err(&dev_priv->drm,
-   "send pkt header failed, not enough hdr credits\n");
+   if (!wait_for_header_credits(dev_priv, dsi_trans, 1)) {
return -1;
}
 
-- 
2.31.1



Re: [Intel-gfx] [PATCH 1/4] iommu/vt-d: Disable superpage for Geminilake igfx

2021-07-13 Thread Ville Syrjälä
On Tue, Jul 13, 2021 at 09:34:09AM +0800, Lu Baolu wrote:
> On 7/12/21 11:47 PM, Ville Syrjälä wrote:
> > On Mon, Jul 12, 2021 at 07:23:07AM +0800, Lu Baolu wrote:
> >> On 7/10/21 12:47 AM, Ville Syrjala wrote:
> >>> From: Ville Syrjälä
> >>>
> >>> While running "gem_exec_big --r single" from igt-gpu-tools on
> >>> Geminilake as soon as a 2M mapping is made I tend to get a DMAR
> >>> write fault. Strangely the faulting address is always a 4K page
> >>> and usually very far away from the 2M page that got mapped.
> >>> But if no 2M mappings get used I can't reproduce the fault.
> >>>
> >>> I also tried to dump the PTE for the faulting address but it actually
> >>> looks correct to me (ie. definitely seems to have the write bit set):
> >>>DMAR: DRHD: handling fault status reg 2
> >>>DMAR: [DMA Write] Request device [00:02.0] PASID  fault addr 
> >>> 7fa8a78000 [fault reason 05] PTE Write access is not set
> >>>DMAR: fault 7fa8a78000 (level=1) PTE = 149efc003
> >>>
> >>> So not really sure what's going on and this might just be full on duct
> >>> tape, but it seems to work here. The machine has now survived a whole day
> >>> running that test whereas with superpage enabled it fails in less than
> >>> a minute usually.
> >>>
> >>> TODO: might be nice to disable superpage only for the igfx iommu
> >>> instead of both iommus
> >> If all these quirks are about igfx dedicated iommus, I would suggest
> >> disabling superpage only for the igfx ones.
> > Sure. Unfortunately there's no convenient mechanism to do that in
> > the iommu driver that I can immediately see. So not something I
> > can just whip up easily. Since you're actually familiar with the
> > driver maybe you can come up with a decent solution for that?
> > 
> 
> How about something like below? [no compile, no test...]
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 1131b8efb050..2d51ef288a9e 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -338,6 +338,7 @@ static int intel_iommu_strict;
>   static int intel_iommu_superpage = 1;
>   static int iommu_identity_mapping;
>   static int iommu_skip_te_disable;
> +static int iommu_skip_igfx_superpage;
> 
>   #define IDENTMAP_GFX 2
>   #define IDENTMAP_AZALIA 4
> @@ -652,6 +653,27 @@ static bool domain_update_iommu_snooping(struct 
> intel_iommu *skip)
>   return ret;
>   }
> 
> +static bool domain_use_super_page(struct dmar_domain *domain)
> +{
> + struct dmar_drhd_unit *drhd;
> + struct intel_iommu *iommu;
> + bool ret = true;
> +
> + if (!intel_iommu_superpage)
> + return false;
> +
> + rcu_read_lock();
> + for_each_active_iommu(iommu, drhd) {
> + if (drhd->gfx_dedicated && iommu_skip_igfx_superpage) {
> + ret = false;
> + break
 ^
Missing semicolon. Otherwise seems to work great here. Thanks.

Are you going to turn this into a proper patch, or do you
want me to just squash this into my patches and repost?

> + }
> + }
> + rcu_read_unlock();
> +
> + return ret;
> +}
> +
>   static int domain_update_iommu_superpage(struct dmar_domain *domain,
>struct intel_iommu *skip)
>   {
> @@ -659,7 +681,7 @@ static int domain_update_iommu_superpage(struct 
> dmar_domain *domain,
>   struct intel_iommu *iommu;
>   int mask = 0x3;
> 
> - if (!intel_iommu_superpage)
> + if (!domain_use_super_page(domain))
>   return 0;
> 
>   /* set iommu_superpage to the smallest common denominator */
> @@ -5656,6 +5678,14 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 
> 0x1632, quirk_iommu_igfx);
>   DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x163A, quirk_iommu_igfx);
>   DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x163D, quirk_iommu_igfx);
> 
> +static void quirk_skip_igfx_superpage(struct pci_dev *dev)
> +{
> + pci_info(dev, "Disabling IOMMU superpage for graphics on this 
> chipset\n");
> + iommu_skip_igfx_superpage = 1;
> +}
> +
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x3184, 
> quirk_skip_igfx_superpage);
> +
>   static void quirk_iommu_rwbf(struct pci_dev *dev)
>   {
>   if (risky_device(dev))
> 
> Best regards,
> baolu

-- 
Ville Syrjälä
Intel
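
The decision made by the quoted domain_use_super_page() draft can be
sketched outside the kernel as below. This is a simplified model under
assumptions: struct fake_drhd and the plain array walk replace the real
dmar_drhd_unit list and for_each_active_iommu(), and RCU locking is left
out entirely. As in the draft, the result depends only on the global
flags and the set of units, not on a particular domain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of a DMAR hardware unit. */
struct fake_drhd {
	bool gfx_dedicated;	/* unit serves only the integrated GPU */
};

/*
 * Superpages are allowed only if they are enabled globally and no
 * graphics-dedicated unit is present while the igfx quirk is set.
 */
static bool domain_use_super_page(const struct fake_drhd *units, size_t n,
				  bool superpage_enabled,
				  bool skip_igfx_superpage)
{
	size_t i;

	if (!superpage_enabled)
		return false;

	assert(units != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (units[i].gfx_dedicated && skip_igfx_superpage)
			return false;
	}

	return true;
}
```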


Re: [Intel-gfx] [REBASED v2] drm/i915: Tweaked Wa_14010685332 for all PCHs

2021-07-13 Thread Box, David E
Tested and confirmed working on TGL-H Dell platforms.

David Box
Linux Power Management
IAGS/SSE

From: Gupta, Anshuman 
Sent: Monday, July 12, 2021 12:09 AM
To: intel-gfx@lists.freedesktop.org 
Cc: Box, David E ; Gupta, Anshuman 
; Roper, Matthew D ; Vivi, 
Rodrigo ; Deak, Imre 
Subject: [REBASED v2] drm/i915: Tweaked Wa_14010685332 for all PCHs

The dispcnlunit1_cp_xosc_clkreq clock was observed to be active on the TGL-H
platform despite the original Wa_14010685332 sequence, which blocks entry to
deeper s0ix states.

The tweaked Wa_14010685332 sequence fixes this issue, so use the tweaked
sequence for every PCH since PCH_CNP.
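
As a rough illustration of the tweaked sequence (set the bit on late
suspend, clear it on early resume, for every PCH from CNP up to but not
including DG1), here is a small userspace model. Everything prefixed
fake_/FAKE_ is an assumption for the sketch: the enum is a shortened,
hypothetical ordering, the bit position is a placeholder, and the register
is just a uint32_t rather than a real MMIO access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, shortened PCH ordering; only the relative order matters. */
enum fake_pch_type {
	FAKE_PCH_NONE,
	FAKE_PCH_SPT,
	FAKE_PCH_CNP,
	FAKE_PCH_ICP,
	FAKE_PCH_TGP,
	FAKE_PCH_ADP,
	FAKE_PCH_DG1 = 1024,
};

/* Placeholder bit standing in for SBCLK_RUN_REFCLK_DIS. */
#define FAKE_SBCLK_RUN_REFCLK_DIS (UINT32_C(1) << 7)

/* Read-modify-write on a register value: clear some bits, then set some. */
static uint32_t fake_rmw(uint32_t reg, uint32_t clear, uint32_t set)
{
	return (reg & ~clear) | set;
}

static bool fake_wa_applies(enum fake_pch_type pch)
{
	return pch >= FAKE_PCH_CNP && pch < FAKE_PCH_DG1;
}

/* Late suspend: set the bit when the workaround applies. */
static uint32_t fake_suspend_late(enum fake_pch_type pch, uint32_t reg)
{
	if (fake_wa_applies(pch))
		reg = fake_rmw(reg, FAKE_SBCLK_RUN_REFCLK_DIS,
			       FAKE_SBCLK_RUN_REFCLK_DIS);
	return reg;
}

/* Early resume: clear the bit again. */
static uint32_t fake_resume_early(enum fake_pch_type pch, uint32_t reg)
{
	if (fake_wa_applies(pch))
		reg = fake_rmw(reg, FAKE_SBCLK_RUN_REFCLK_DIS, 0);
	return reg;
}
```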

v2:
- removed RKL from comment and simplified condition. [Rodrigo]

Fixes: b896898c7369 ("drm/i915: Tweaked Wa_14010685332 for PCHs used on gen11 
platforms")
Cc: Matt Roper 
Cc: Rodrigo Vivi 
Cc: Imre Deak 
Signed-off-by: Anshuman Gupta 
Reviewed-by: Rodrigo Vivi 
---
 .../drm/i915/display/intel_display_power.c| 16 +++---
 drivers/gpu/drm/i915/i915_irq.c   | 21 ---
 2 files changed, 8 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c 
b/drivers/gpu/drm/i915/display/intel_display_power.c
index 285380079aab..28a363119560 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -6388,13 +6388,13 @@ void intel_display_power_suspend_late(struct 
drm_i915_private *i915)
 if (DISPLAY_VER(i915) >= 11 || IS_GEMINILAKE(i915) ||
 IS_BROXTON(i915)) {
 bxt_enable_dc9(i915);
-   /* Tweaked Wa_14010685332:icp,jsp,mcc */
-   if (INTEL_PCH_TYPE(i915) >= PCH_ICP && INTEL_PCH_TYPE(i915) <= 
PCH_MCC)
-   intel_de_rmw(i915, SOUTH_CHICKEN1,
-SBCLK_RUN_REFCLK_DIS, 
SBCLK_RUN_REFCLK_DIS);
 } else if (IS_HASWELL(i915) || IS_BROADWELL(i915)) {
 hsw_enable_pc8(i915);
 }
+
+   /* Tweaked Wa_14010685332:cnp,icp,jsp,mcc,tgp,adp */
+   if (INTEL_PCH_TYPE(i915) >= PCH_CNP && INTEL_PCH_TYPE(i915) < PCH_DG1)
+   intel_de_rmw(i915, SOUTH_CHICKEN1, SBCLK_RUN_REFCLK_DIS, 
SBCLK_RUN_REFCLK_DIS);
 }

 void intel_display_power_resume_early(struct drm_i915_private *i915)
@@ -6403,13 +6403,13 @@ void intel_display_power_resume_early(struct 
drm_i915_private *i915)
 IS_BROXTON(i915)) {
 gen9_sanitize_dc_state(i915);
 bxt_disable_dc9(i915);
-   /* Tweaked Wa_14010685332:icp,jsp,mcc */
-   if (INTEL_PCH_TYPE(i915) >= PCH_ICP && INTEL_PCH_TYPE(i915) <= 
PCH_MCC)
-   intel_de_rmw(i915, SOUTH_CHICKEN1, 
SBCLK_RUN_REFCLK_DIS, 0);
-
 } else if (IS_HASWELL(i915) || IS_BROADWELL(i915)) {
 hsw_disable_pc8(i915);
 }
+
+   /* Tweaked Wa_14010685332:cnp,icp,jsp,mcc,tgp,adp */
+   if (INTEL_PCH_TYPE(i915) >= PCH_CNP && INTEL_PCH_TYPE(i915) < PCH_DG1)
+   intel_de_rmw(i915, SOUTH_CHICKEN1, SBCLK_RUN_REFCLK_DIS, 0);
 }

 void intel_display_power_suspend(struct drm_i915_private *i915)
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 1d4c683c9de9..99c75a9d7ffa 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -3064,24 +3064,6 @@ static void valleyview_irq_reset(struct drm_i915_private 
*dev_priv)
 spin_unlock_irq(&dev_priv->irq_lock);
 }

-static void cnp_display_clock_wa(struct drm_i915_private *dev_priv)
-{
-   struct intel_uncore *uncore = &dev_priv->uncore;
-
-   /*
-* Wa_14010685332:cnp/cmp,tgp,adp
-* TODO: Clarify which platforms this applies to
-* TODO: Figure out if this workaround can be applied in the s0ix 
suspend/resume handlers as
-* on earlier platforms and whether the workaround is also needed for 
runtime suspend/resume
-*/
-   if (INTEL_PCH_TYPE(dev_priv) == PCH_CNP ||
-   (INTEL_PCH_TYPE(dev_priv) >= PCH_TGP && INTEL_PCH_TYPE(dev_priv) < 
PCH_DG1)) {
-   intel_uncore_rmw(uncore, SOUTH_CHICKEN1, SBCLK_RUN_REFCLK_DIS,
-SBCLK_RUN_REFCLK_DIS);
-   intel_uncore_rmw(uncore, SOUTH_CHICKEN1, SBCLK_RUN_REFCLK_DIS, 
0);
-   }
-}
-
 static void gen8_display_irq_reset(struct drm_i915_private *dev_priv)
 {
 struct intel_uncore *uncore = &dev_priv->uncore;
@@ -3115,7 +3097,6 @@ static void gen8_irq_reset(struct drm_i915_private 
*dev_priv)
 if (HAS_PCH_SPLIT(dev_priv))
 ibx_irq_reset(dev_priv);

-   cnp_display_clock_wa(dev_priv);
 }

 static void gen11_display_irq_reset(struct drm_i915_private *dev_priv)
@@ -3159,8 +3140,6 @@ static void gen11_display_irq_reset(struct 
drm_i915_private *dev_priv)

 if (INTEL_PCH_TYPE(dev_priv) >= PCH_ICP)
 GEN3_IRQ_RESET(uncore, SDE);
-
-   cnp_display_clock_

[Intel-gfx] [PATCH v4 0/4] shmem helpers for vgem

2021-07-13 Thread Daniel Vetter
Hi all

I've found another potential issue, so let's try this again and see what
intel-gfx-ci says. Also Thomas tried to unify vgem more, which motivated
me to dig this all out again.

Test-with: 20210527140732.5762-1-daniel.vet...@ffwll.ch

Review very much welcome, as always!

Cheers, Daniel

Daniel Vetter (4):
  dma-buf: Require VM_PFNMAP vma for mmap
  drm/shmem-helper: Switch to vmf_insert_pfn
  drm/shmem-helpers: Allocate wc pages on x86
  drm/vgem: use shmem helpers

 drivers/dma-buf/dma-buf.c  |  15 +-
 drivers/gpu/drm/Kconfig|   7 +-
 drivers/gpu/drm/drm_gem_shmem_helper.c |  18 +-
 drivers/gpu/drm/gud/Kconfig|   2 +-
 drivers/gpu/drm/tiny/Kconfig   |   4 +-
 drivers/gpu/drm/udl/Kconfig|   1 +
 drivers/gpu/drm/vgem/vgem_drv.c| 315 +
 7 files changed, 49 insertions(+), 313 deletions(-)

-- 
2.32.0



[Intel-gfx] [PATCH v4 1/4] dma-buf: Require VM_PFNMAP vma for mmap

2021-07-13 Thread Daniel Vetter
tldr; DMA buffers aren't normal memory. Expecting that you can use
them like that (e.g. that calling get_user_pages works, or that they're
accounted like any other normal memory) cannot be guaranteed.

Since some userspace only runs on integrated devices, where all
buffers are actually all resident system memory, there's a huge
temptation to assume that a struct page is always present and useable
like for any other pagecache-backed mmap. This has the potential to
result in a uapi nightmare.

To close this gap, require that DMA buffer mmaps are VM_PFNMAP, which
blocks get_user_pages and all the other struct page based
infrastructure for everyone. In spirit this is the uapi counterpart to
the kernel-internal CONFIG_DMABUF_DEBUG.

Motivated by a recent patch which wanted to switch the system dma-buf
heap to vm_insert_page instead of vm_insert_pfn.

v2:

Jason brought up that we also want to guarantee that all ptes have the
pte_special flag set, to catch fast get_user_pages (on architectures
that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.

From auditing the various functions to insert pfn pte entries
(vm_insert_pfn_prot, remap_pfn_range and all its callers like
dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
this should be the correct flag to check for.

References: 
https://lore.kernel.org/lkml/cakmk7uhi+mg0z0humnt13qccvuturvjpcr0njrl12k-wbwz...@mail.gmail.com/
Acked-by: Christian König 
Cc: Jason Gunthorpe 
Cc: Suren Baghdasaryan 
Cc: Matthew Wilcox 
Cc: John Stultz 
Signed-off-by: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
--
Resending this so I can test the next two patches for vgem/shmem in
intel-gfx-ci. Last round failed somehow, but I can't repro that at all
locally here.

No immediate plans to merge this patch here since ttm isn't addressed
yet (and there we have the hugepte issue, for which I don't think we
have a clear consensus yet).
-Daniel
---
 drivers/dma-buf/dma-buf.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 510b42771974..65cbd7f0f16a 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -130,6 +130,7 @@ static struct file_system_type dma_buf_fs_type = {
 static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma)
 {
struct dma_buf *dmabuf;
+   int ret;
 
if (!is_dma_buf_file(file))
return -EINVAL;
@@ -145,7 +146,11 @@ static int dma_buf_mmap_internal(struct file *file, struct 
vm_area_struct *vma)
dmabuf->size >> PAGE_SHIFT)
return -EINVAL;
 
-   return dmabuf->ops->mmap(dmabuf, vma);
+   ret = dmabuf->ops->mmap(dmabuf, vma);
+
+   WARN_ON(!(vma->vm_flags & VM_PFNMAP));
+
+   return ret;
 }
 
 static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence)
@@ -1276,6 +1281,8 @@ EXPORT_SYMBOL_GPL(dma_buf_end_cpu_access);
 int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma,
 unsigned long pgoff)
 {
+   int ret;
+
if (WARN_ON(!dmabuf || !vma))
return -EINVAL;
 
@@ -1296,7 +1303,11 @@ int dma_buf_mmap(struct dma_buf *dmabuf, struct 
vm_area_struct *vma,
vma_set_file(vma, dmabuf->file);
vma->vm_pgoff = pgoff;
 
-   return dmabuf->ops->mmap(dmabuf, vma);
+   ret = dmabuf->ops->mmap(dmabuf, vma);
+
+   WARN_ON(!(vma->vm_flags & VM_PFNMAP));
+
+   return ret;
 }
 EXPORT_SYMBOL_GPL(dma_buf_mmap);
 
-- 
2.32.0



[Intel-gfx] [PATCH v4 3/4] drm/shmem-helpers: Allocate wc pages on x86

2021-07-13 Thread Daniel Vetter
intel-gfx-ci realized that something is not quite coherent anymore on
some platforms for our i915+vgem tests, when I tried to switch vgem
over to shmem helpers.

After lots of head-scratching I realized that I've removed calls to
drm_clflush. And we need those. To make this a bit cleaner use the
same page allocation tooling as ttm, which does internally clflush
(and more, as needed on any platform instead of just the intel x86
cpus i915 can be combined with).

Unfortunately this doesn't exist on arm, or as a generic feature. For
that I think only the dma-api can get at wc memory reliably, so maybe
we'd need some kind of GFP_WC flag to do this properly.

Signed-off-by: Daniel Vetter 
Cc: Christian König 
Cc: "Thomas Hellström" 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/drm_gem_shmem_helper.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 296ab1b7c07f..657d2490aaa5 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -10,6 +10,10 @@
 #include 
 #include 
 
+#ifdef CONFIG_X86
+#include 
+#endif
+
 #include 
 #include 
 #include 
@@ -162,6 +166,11 @@ static int drm_gem_shmem_get_pages_locked(struct 
drm_gem_shmem_object *shmem)
return PTR_ERR(pages);
}
 
+#ifdef CONFIG_X86
+   if (shmem->map_wc)
+   set_pages_array_wc(pages, obj->size >> PAGE_SHIFT);
+#endif
+
shmem->pages = pages;
 
return 0;
@@ -203,6 +212,11 @@ static void drm_gem_shmem_put_pages_locked(struct 
drm_gem_shmem_object *shmem)
if (--shmem->pages_use_count > 0)
return;
 
+#ifdef CONFIG_X86
+   if (shmem->map_wc)
+   set_pages_array_wb(shmem->pages, obj->size >> PAGE_SHIFT);
+#endif
+
drm_gem_put_pages(obj, shmem->pages,
  shmem->pages_mark_dirty_on_put,
  shmem->pages_mark_accessed_on_put);
-- 
2.32.0



[Intel-gfx] [PATCH v4 2/4] drm/shmem-helper: Switch to vmf_insert_pfn

2021-07-13 Thread Daniel Vetter
We want to stop gup, which isn't the case if we use vmf_insert_page
and VM_MIXEDMAP, because that does not set pte_special.

v2: With this shmem gem helpers now definitely need CONFIG_MMU (0day)

v3: add more depends on MMU. For usb drivers this is a bit awkward,
but really it's correct: To be able to provide a contig mapping of
buffers to userspace on !MMU platforms we'd need to use the cma
helpers for these drivers on those platforms. As-is this won't work.

Also not exactly sure why vm_insert_page doesn't go boom, because that
definitely won't fly in practice since the pages are non-contig to
begin with.

Signed-off-by: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/Kconfig| 2 +-
 drivers/gpu/drm/drm_gem_shmem_helper.c | 4 ++--
 drivers/gpu/drm/gud/Kconfig| 2 +-
 drivers/gpu/drm/tiny/Kconfig   | 4 ++--
 drivers/gpu/drm/udl/Kconfig| 1 +
 5 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 0d372354c2d0..314eefa39892 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -211,7 +211,7 @@ config DRM_KMS_CMA_HELPER
 
 config DRM_GEM_SHMEM_HELPER
bool
-   depends on DRM
+   depends on DRM && MMU
help
  Choose this if you need the GEM shmem helper functions
 
diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
b/drivers/gpu/drm/drm_gem_shmem_helper.c
index d5e6d4568f99..296ab1b7c07f 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -542,7 +542,7 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf)
} else {
page = shmem->pages[page_offset];
 
-   ret = vmf_insert_page(vma, vmf->address, page);
+   ret = vmf_insert_pfn(vma, vmf->address, page_to_pfn(page));
}
 
mutex_unlock(&shmem->pages_lock);
@@ -612,7 +612,7 @@ int drm_gem_shmem_mmap(struct drm_gem_object *obj, struct 
vm_area_struct *vma)
return ret;
}
 
-   vma->vm_flags |= VM_MIXEDMAP | VM_DONTEXPAND;
+   vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND;
vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
if (shmem->map_wc)
vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
diff --git a/drivers/gpu/drm/gud/Kconfig b/drivers/gpu/drm/gud/Kconfig
index 1c8601bf4d91..9c1e61f9eec3 100644
--- a/drivers/gpu/drm/gud/Kconfig
+++ b/drivers/gpu/drm/gud/Kconfig
@@ -2,7 +2,7 @@
 
 config DRM_GUD
tristate "GUD USB Display"
-   depends on DRM && USB
+   depends on DRM && USB && MMU
select LZ4_COMPRESS
select DRM_KMS_HELPER
select DRM_GEM_SHMEM_HELPER
diff --git a/drivers/gpu/drm/tiny/Kconfig b/drivers/gpu/drm/tiny/Kconfig
index 5593128eeff9..c11fb5be7d09 100644
--- a/drivers/gpu/drm/tiny/Kconfig
+++ b/drivers/gpu/drm/tiny/Kconfig
@@ -44,7 +44,7 @@ config DRM_CIRRUS_QEMU
 
 config DRM_GM12U320
tristate "GM12U320 driver for USB projectors"
-   depends on DRM && USB
+   depends on DRM && USB && MMU
select DRM_KMS_HELPER
select DRM_GEM_SHMEM_HELPER
help
@@ -53,7 +53,7 @@ config DRM_GM12U320
 
 config DRM_SIMPLEDRM
tristate "Simple framebuffer driver"
-   depends on DRM
+   depends on DRM && MMU
select DRM_GEM_SHMEM_HELPER
select DRM_KMS_HELPER
help
diff --git a/drivers/gpu/drm/udl/Kconfig b/drivers/gpu/drm/udl/Kconfig
index 1f497d8f1ae5..c744175c6992 100644
--- a/drivers/gpu/drm/udl/Kconfig
+++ b/drivers/gpu/drm/udl/Kconfig
@@ -4,6 +4,7 @@ config DRM_UDL
depends on DRM
depends on USB
depends on USB_ARCH_HAS_HCD
+   depends on MMU
select DRM_GEM_SHMEM_HELPER
select DRM_KMS_HELPER
help
-- 
2.32.0


