Re: [Intel-gfx] [drm-tip:drm-tip 5/8] drivers/gpu/drm/i915/i915_request.c:827:1: error: redefinition of 'i915_request_await_start'

2019-05-08 Thread Joonas Lahtinen
This was caused by a silent merge conflict, and is now fixed.

Regards, Joonas

Quoting kbuild test robot (2019-05-07 13:53:48)
> tree:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
> head:   73db4ec12f05160528884c0b2c845b1c6b7c6718
> commit: b9a2acf7709f52c77dc752ec99e3873e392d8df6 [5/8] Merge remote-tracking branch 'drm-intel/drm-intel-next-queued' into drm-tip
> config: x86_64-rhel (attached as .config)
> compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
> reproduce:
> git checkout b9a2acf7709f52c77dc752ec99e3873e392d8df6
> # save the attached .config to linux build tree
> make ARCH=x86_64 
> 
> If you fix the issue, kindly add following tag
> Reported-by: kbuild test robot 
> 
> All errors (new ones prefixed by >>):
> 
> >> drivers/gpu/drm/i915/i915_request.c:827:1: error: redefinition of 'i915_request_await_start'
> i915_request_await_start(struct i915_request *rq, struct i915_request *signal)
> ^~~~
>drivers/gpu/drm/i915/i915_request.c:794:1: note: previous definition of 'i915_request_await_start' was here
> i915_request_await_start(struct i915_request *rq, struct i915_request *signal)
> ^~~~
>drivers/gpu/drm/i915/i915_request.c:794:1: warning: 'i915_request_await_start' defined but not used [-Wunused-function]
> 
> vim +/i915_request_await_start +827 drivers/gpu/drm/i915/i915_request.c
> 
> ca6e56f65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-04  825  
> a2bc4695b drivers/gpu/drm/i915/i915_gem_request.c Chris Wilson 2016-09-09  826  static int
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01 @827  i915_request_await_start(struct i915_request *rq, struct i915_request *signal)
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  828  {
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  829  if (list_is_first(&signal->ring_link, &signal->ring->request_list))
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  830  return 0;
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  831  
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  832  signal = list_prev_entry(signal, ring_link);
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  833  if (i915_timeline_sync_is_later(rq->timeline, &signal->fence))
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  834  return 0;
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  835  
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  836  return i915_sw_fence_await_dma_fence(&rq->submit,
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  837   &signal->fence, 0,
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  838   I915_FENCE_GFP);
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  839  }
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  840  
> 
> :: The code at line 827 was first introduced by commit
> :: e766fde6511e2be83acbca1d603035e08de23f3b drm/i915: Delay semaphore submission until the start of the signaler
> 
> :: TO: Chris Wilson 
> :: CC: Joonas Lahtinen 
> 
> ---
> 0-DAY kernel test infrastructure            Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all   Intel Corporation
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [drm-tip:drm-tip /8] drivers/gpu/drm/i915/i915_request.c:842:1: error: redefinition of 'already_busywaiting'

2019-05-08 Thread Joonas Lahtinen
This too was caused by a merge conflict, plus a missing Fixes: tag.

Regards, Joonas

Quoting kbuild test robot (2019-05-07 14:08:25)
> tree:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
> head:   ae28cc6cf80a2e8cbb58f255ef7cac6b2923c98a
> commit: 47f4a14297839cb4cedd725fb916a5da5eb9b5ba [/8] Merge remote-tracking branch 'drm-intel/drm-intel-next-queued' into drm-tip
> config: x86_64-rhel (attached as .config)
> compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
> reproduce:
> git checkout 47f4a14297839cb4cedd725fb916a5da5eb9b5ba
> # save the attached .config to linux build tree
> make ARCH=x86_64 
> 
> If you fix the issue, kindly add following tag
> Reported-by: kbuild test robot 
> 
> Note: the drm-tip/drm-tip HEAD ae28cc6cf80a2e8cbb58f255ef7cac6b2923c98a builds fine.
>   It only hurts bisectibility.
> 
> All errors (new ones prefixed by >>):
> 
>drivers/gpu/drm/i915/i915_request.c:827:1: error: redefinition of 'i915_request_await_start'
> i915_request_await_start(struct i915_request *rq, struct i915_request *signal)
> ^~~~
>drivers/gpu/drm/i915/i915_request.c:794:1: note: previous definition of 'i915_request_await_start' was here
> i915_request_await_start(struct i915_request *rq, struct i915_request *signal)
> ^~~~
> >> drivers/gpu/drm/i915/i915_request.c:842:1: error: redefinition of 'already_busywaiting'
> already_busywaiting(struct i915_request *rq)
> ^~~
>drivers/gpu/drm/i915/i915_request.c:809:1: note: previous definition of 'already_busywaiting' was here
> already_busywaiting(struct i915_request *rq)
> ^~~
>drivers/gpu/drm/i915/i915_request.c:809:1: warning: 'already_busywaiting' defined but not used [-Wunused-function]
>drivers/gpu/drm/i915/i915_request.c:794:1: warning: 'i915_request_await_start' defined but not used [-Wunused-function]
> i915_request_await_start(struct i915_request *rq, struct i915_request *signal)
> ^~~~
> 
> vim +/already_busywaiting +842 drivers/gpu/drm/i915/i915_request.c
> 
> 47f4a1429 drivers/gpu/drm/i915/i915_request.c Joonas Lahtinen 2019-05-07  825  
> a2bc4695b drivers/gpu/drm/i915/i915_gem_request.c Chris Wilson 2016-09-09  826  static int
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01 @827  i915_request_await_start(struct i915_request *rq, struct i915_request *signal)
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  828  {
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  829   if (list_is_first(&signal->ring_link, &signal->ring->request_list))
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  830   return 0;
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  831  
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  832   signal = list_prev_entry(signal, ring_link);
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  833   if (i915_timeline_sync_is_later(rq->timeline, &signal->fence))
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  834   return 0;
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  835  
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  836   return i915_sw_fence_await_dma_fence(&rq->submit,
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  837   &signal->fence, 0,
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  838   I915_FENCE_GFP);
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  839  }
> e766fde65 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-01  840  
> 2564fe708 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-04  841  static intel_engine_mask_t
> 2564fe708 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-04 @842  already_busywaiting(struct i915_request *rq)
> 2564fe708 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-04  843  {
> 2564fe708 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-04  844   /*
> 2564fe708 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-04  845    * Polling a semaphore causes bus traffic, delaying other users of
> 2564fe708 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-04  846    * both the GPU and CPU. We want to limit the impact on others,
> 2564fe708 drivers/gpu/drm/i915/i915_request.c Chris Wilson 2019-05-04  847    * while taking advantage of early submission to reduce GPU
> 2564fe708 drivers/gpu/

Re: [Intel-gfx] [PATCH] RFC: x86/smp: use printk_deferred in native_smp_send_reschedule

2019-05-08 Thread Daniel Vetter
On Wed, May 8, 2019 at 9:53 AM Sergey Senozhatsky wrote:
>
> On (05/08/19 16:44), Sergey Senozhatsky wrote:
> > [..]
> > >  static void native_smp_send_reschedule(int cpu)
> > >  {
> > > if (unlikely(cpu_is_offline(cpu))) {
> > > -   WARN(1, "sched: Unexpected reschedule of offline CPU#%d!\n", cpu);
> > > +   printk_deferred(KERN_WARNING
> > > +   "sched: Unexpected reschedule of offline CPU#%d!\n", cpu);
> > > return;
> > > }
> > > apic->send_IPI(cpu, RESCHEDULE_VECTOR);
> >
> > Hmm,
> > One thing to notice here is that the CPU in question is offline-ed,
> > and printk_deferred() is a per-CPU type of deferred printk(). So the
> > following thing
> >
> >   __this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
> >   irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
> >
> > might not print anything at all. In this particular case we always
> > need another CPU to do console_unlock(), since this_cpu() is not
> > really expected to do wake_up_klogd_work_func()->console_unlock().
>
> D'oh... It's remote CPU which is offline, not this_cpu().
> Sorry, my bad!

Well I started reading, then freaked out about the WARN_ON in
irq_work_queue_on until I realized that's not the one we're calling
either :-)

> Any printk-related patch in this area will make PeterZ really-really
> angry :)

Hm any more context for someone with no clue about this? Just that the
dependencies are already terribly complex and it's not going to get
better, or something more specific?

> printk_deferred(), just like printk_safe(), depends on IRQ work;
> printk_safe(), however, can redirect multiple lines, unlike
> printk_deferred(). So if you want to keep the backtrace, you may
> do something like
>
> if (unlikely(cpu_is_offline(cpu))) {
> printk_safe_enter(...);
> WARN(1, "sched: Unexpected reschedule of offline CPU#%d!\n", cpu);
> printk_safe_exit(...);
> return;
> }
>
> I think, in this case John's reworked-printk can do better than
> printk_safe/printk_deferred.

Hm I think this is what Petr was suggesting, but somehow I didn't find
the printk_safe_* functions and didn't connect the dots. Needs the
_irqsave variants I guess, I'll respin a v2 of this.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

[Intel-gfx] [PATCH 21/40] drm/i915: Move object->pages API to i915_gem_object.[ch]

2019-05-08 Thread Chris Wilson
Currently the code for manipulating the pages on an object still resides
in i915_gem.c; move it to i915_gem_object.c.

Signed-off-by: Chris Wilson 
Cc: Joonas Lahtinen 
Reviewed-by: Matthew Auld 
---
 drivers/gpu/drm/i915/Makefile |   4 +-
 .../gpu/drm/i915/{ => gem}/i915_gem_object.c  |   5 +-
 .../gpu/drm/i915/{ => gem}/i915_gem_object.h  | 127 +++-
 drivers/gpu/drm/i915/i915_drv.h   | 137 +-
 drivers/gpu/drm/i915/i915_globals.c   |   2 +-
 drivers/gpu/drm/i915/i915_vma.h   |   2 +-
 drivers/gpu/drm/i915/intel_frontbuffer.h  |   2 +-
 7 files changed, 141 insertions(+), 138 deletions(-)
 rename drivers/gpu/drm/i915/{ => gem}/i915_gem_object.c (97%)
 rename drivers/gpu/drm/i915/{ => gem}/i915_gem_object.h (59%)

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 96344c9a0726..1c8bd0a5212d 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -86,7 +86,10 @@ i915-y += $(gt-y)
 
 # GEM (Graphics Execution Management) code
 obj-y += gem/
+gem-y += \
+   gem/i915_gem_object.o
 i915-y += \
+ $(gem-y) \
  i915_active.o \
  i915_cmd_parser.o \
  i915_gem_batch_pool.o \
@@ -99,7 +102,6 @@ i915-y += \
  i915_gem_gtt.o \
  i915_gem_internal.o \
  i915_gem.o \
- i915_gem_object.o \
  i915_gem_pm.o \
  i915_gem_render_state.o \
  i915_gem_shrinker.o \
diff --git a/drivers/gpu/drm/i915/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
similarity index 97%
rename from drivers/gpu/drm/i915/i915_gem_object.c
rename to drivers/gpu/drm/i915/gem/i915_gem_object.c
index ac6a5ab84586..8179252bb39b 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -22,9 +22,10 @@
  *
  */
 
-#include "i915_drv.h"
 #include "i915_gem_object.h"
-#include "i915_globals.h"
+
+#include "../i915_drv.h"
+#include "../i915_globals.h"
 
 static struct i915_global_object {
struct i915_global base;
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
similarity index 59%
rename from drivers/gpu/drm/i915/i915_gem_object.h
rename to drivers/gpu/drm/i915/gem/i915_gem_object.h
index 3666b0c5f6ca..061b20c3da5b 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -13,7 +13,7 @@
 
 #include 
 
-#include "gem/i915_gem_object_types.h"
+#include "i915_gem_object_types.h"
 
 struct drm_i915_gem_object *i915_gem_object_alloc(void);
 void i915_gem_object_free(struct drm_i915_gem_object *obj);
@@ -192,6 +192,131 @@ i915_gem_object_get_tile_row_size(const struct 
drm_i915_gem_object *obj)
 int i915_gem_object_set_tiling(struct drm_i915_gem_object *obj,
   unsigned int tiling, unsigned int stride);
 
+struct scatterlist *
+i915_gem_object_get_sg(struct drm_i915_gem_object *obj,
+  unsigned int n, unsigned int *offset);
+
+struct page *
+i915_gem_object_get_page(struct drm_i915_gem_object *obj,
+unsigned int n);
+
+struct page *
+i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj,
+  unsigned int n);
+
+dma_addr_t
+i915_gem_object_get_dma_address(struct drm_i915_gem_object *obj,
+   unsigned long n);
+
+void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
+struct sg_table *pages,
+unsigned int sg_page_sizes);
+int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
+
+static inline int __must_check
+i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
+{
+   might_lock(&obj->mm.lock);
+
+   if (atomic_inc_not_zero(&obj->mm.pages_pin_count))
+   return 0;
+
+   return __i915_gem_object_get_pages(obj);
+}
+
+static inline bool
+i915_gem_object_has_pages(struct drm_i915_gem_object *obj)
+{
+   return !IS_ERR_OR_NULL(READ_ONCE(obj->mm.pages));
+}
+
+static inline void
+__i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
+{
+   GEM_BUG_ON(!i915_gem_object_has_pages(obj));
+
+   atomic_inc(&obj->mm.pages_pin_count);
+}
+
+static inline bool
+i915_gem_object_has_pinned_pages(struct drm_i915_gem_object *obj)
+{
+   return atomic_read(&obj->mm.pages_pin_count);
+}
+
+static inline void
+__i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
+{
+   GEM_BUG_ON(!i915_gem_object_has_pages(obj));
+   GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
+
+   atomic_dec(&obj->mm.pages_pin_count);
+}
+
+static inline void
+i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
+{
+   __i915_gem_object_unpin_pages(obj);
+}
+
+enum i915_mm_subclass { /* lockdep subclass for obj->mm.lock/struct_mutex */
+   I915_MM_NORMAL = 0,
+   I915_MM_SHRINKER /* called "recursively" from dir

[Intel-gfx] [PATCH 16/40] drm/i915: Extend execution fence to support a callback

2019-05-08 Thread Chris Wilson
In the next patch, we will want to configure the slave request
depending on which physical engine the master request is executed on.
For this, we introduce a callback from the execute fence to convey this
information.

Signed-off-by: Chris Wilson 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_request.c | 84 +++--
 drivers/gpu/drm/i915/i915_request.h |  4 ++
 2 files changed, 83 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index 3beced7daa15..1188a450804f 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -38,6 +38,8 @@ struct execute_cb {
struct list_head link;
struct irq_work work;
struct i915_sw_fence *fence;
+   void (*hook)(struct i915_request *rq, struct dma_fence *signal);
+   struct i915_request *signal;
 };
 
 static struct i915_global_request {
@@ -329,6 +331,17 @@ static void irq_execute_cb(struct irq_work *wrk)
kmem_cache_free(global.slab_execute_cbs, cb);
 }
 
+static void irq_execute_cb_hook(struct irq_work *wrk)
+{
+   struct execute_cb *cb = container_of(wrk, typeof(*cb), work);
+
+   cb->hook(container_of(cb->fence, struct i915_request, submit),
+&cb->signal->fence);
+   i915_request_put(cb->signal);
+
+   irq_execute_cb(wrk);
+}
+
 static void __notify_execute_cb(struct i915_request *rq)
 {
struct execute_cb *cb;
@@ -355,14 +368,19 @@ static void __notify_execute_cb(struct i915_request *rq)
 }
 
 static int
-i915_request_await_execution(struct i915_request *rq,
-struct i915_request *signal,
-gfp_t gfp)
+__i915_request_await_execution(struct i915_request *rq,
+  struct i915_request *signal,
+  void (*hook)(struct i915_request *rq,
+   struct dma_fence *signal),
+  gfp_t gfp)
 {
struct execute_cb *cb;
 
-   if (i915_request_is_active(signal))
+   if (i915_request_is_active(signal)) {
+   if (hook)
+   hook(rq, &signal->fence);
return 0;
+   }
 
cb = kmem_cache_alloc(global.slab_execute_cbs, gfp);
if (!cb)
@@ -372,8 +390,18 @@ i915_request_await_execution(struct i915_request *rq,
i915_sw_fence_await(cb->fence);
init_irq_work(&cb->work, irq_execute_cb);
 
+   if (hook) {
+   cb->hook = hook;
+   cb->signal = i915_request_get(signal);
+   cb->work.func = irq_execute_cb_hook;
+   }
+
spin_lock_irq(&signal->lock);
if (i915_request_is_active(signal)) {
+   if (hook) {
+   hook(rq, &signal->fence);
+   i915_request_put(signal);
+   }
i915_sw_fence_complete(cb->fence);
kmem_cache_free(global.slab_execute_cbs, cb);
} else {
@@ -845,7 +873,7 @@ emit_semaphore_wait(struct i915_request *to,
return err;
 
/* Only submit our spinner after the signaler is running! */
-   err = i915_request_await_execution(to, from, gfp);
+   err = __i915_request_await_execution(to, from, NULL, gfp);
if (err)
return err;
 
@@ -971,6 +999,52 @@ i915_request_await_dma_fence(struct i915_request *rq, 
struct dma_fence *fence)
return 0;
 }
 
+int
+i915_request_await_execution(struct i915_request *rq,
+struct dma_fence *fence,
+void (*hook)(struct i915_request *rq,
+ struct dma_fence *signal))
+{
+   struct dma_fence **child = &fence;
+   unsigned int nchild = 1;
+   int ret;
+
+   if (dma_fence_is_array(fence)) {
+   struct dma_fence_array *array = to_dma_fence_array(fence);
+
+   /* XXX Error for signal-on-any fence arrays */
+
+   child = array->fences;
+   nchild = array->num_fences;
+   GEM_BUG_ON(!nchild);
+   }
+
+   do {
+   fence = *child++;
+   if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
+   continue;
+
+   /*
+* We don't squash repeated fence dependencies here as we
+* want to run our callback in all cases.
+*/
+
+   if (dma_fence_is_i915(fence))
+   ret = __i915_request_await_execution(rq,
+to_request(fence),
+hook,
+I915_FENCE_GFP);
+   else
+   ret = i915_sw_fence_await_dma_fence(&rq->submit, fence,
+

[Intel-gfx] [PATCH 38/40] drm/i915: Flush the execution-callbacks on retiring

2019-05-08 Thread Chris Wilson
In the unlikely case the request completes while we regard it as not even
executing on the GPU (see the next patch!), we have to flush any pending
execution callbacks at retirement and ensure that we do not add any
more.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_request.c | 93 +++--
 1 file changed, 49 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index 03f15ac63ec9..b1491ce01a77 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -119,6 +119,50 @@ const struct dma_fence_ops i915_fence_ops = {
.release = i915_fence_release,
 };
 
+static void irq_execute_cb(struct irq_work *wrk)
+{
+   struct execute_cb *cb = container_of(wrk, typeof(*cb), work);
+
+   i915_sw_fence_complete(cb->fence);
+   kmem_cache_free(global.slab_execute_cbs, cb);
+}
+
+static void irq_execute_cb_hook(struct irq_work *wrk)
+{
+   struct execute_cb *cb = container_of(wrk, typeof(*cb), work);
+
+   cb->hook(container_of(cb->fence, struct i915_request, submit),
+&cb->signal->fence);
+   i915_request_put(cb->signal);
+
+   irq_execute_cb(wrk);
+}
+
+static void __notify_execute_cb(struct i915_request *rq)
+{
+   struct execute_cb *cb;
+
+   lockdep_assert_held(&rq->lock);
+
+   if (list_empty(&rq->execute_cb))
+   return;
+
+   list_for_each_entry(cb, &rq->execute_cb, link)
+   irq_work_queue(&cb->work);
+
+   /*
+* XXX Rollback on __i915_request_unsubmit()
+*
+* In the future, perhaps when we have an active time-slicing scheduler,
+* it will be interesting to unsubmit parallel execution and remove
+* busywaits from the GPU until their master is restarted. This is
+* quite hairy, we have to carefully rollback the fence and do a
+* preempt-to-idle cycle on the target engine, all the while the
+* master execute_cb may refire.
+*/
+   INIT_LIST_HEAD(&rq->execute_cb);
+}
+
 static inline void
 i915_request_remove_from_client(struct i915_request *request)
 {
@@ -246,6 +290,11 @@ static bool i915_request_retire(struct i915_request *rq)
GEM_BUG_ON(!atomic_read(&rq->i915->gt_pm.rps.num_waiters));
atomic_dec(&rq->i915->gt_pm.rps.num_waiters);
}
+   if (!test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags)) {
+   set_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags);
+   __notify_execute_cb(rq);
+   }
+   GEM_BUG_ON(!list_empty(&rq->execute_cb));
spin_unlock(&rq->lock);
 
local_irq_enable();
@@ -285,50 +334,6 @@ void i915_request_retire_upto(struct i915_request *rq)
} while (i915_request_retire(tmp) && tmp != rq);
 }
 
-static void irq_execute_cb(struct irq_work *wrk)
-{
-   struct execute_cb *cb = container_of(wrk, typeof(*cb), work);
-
-   i915_sw_fence_complete(cb->fence);
-   kmem_cache_free(global.slab_execute_cbs, cb);
-}
-
-static void irq_execute_cb_hook(struct irq_work *wrk)
-{
-   struct execute_cb *cb = container_of(wrk, typeof(*cb), work);
-
-   cb->hook(container_of(cb->fence, struct i915_request, submit),
-&cb->signal->fence);
-   i915_request_put(cb->signal);
-
-   irq_execute_cb(wrk);
-}
-
-static void __notify_execute_cb(struct i915_request *rq)
-{
-   struct execute_cb *cb;
-
-   lockdep_assert_held(&rq->lock);
-
-   if (list_empty(&rq->execute_cb))
-   return;
-
-   list_for_each_entry(cb, &rq->execute_cb, link)
-   irq_work_queue(&cb->work);
-
-   /*
-* XXX Rollback on __i915_request_unsubmit()
-*
-* In the future, perhaps when we have an active time-slicing scheduler,
-* it will be interesting to unsubmit parallel execution and remove
-* busywaits from the GPU until their master is restarted. This is
-* quite hairy, we have to carefully rollback the fence and do a
-* preempt-to-idle cycle on the target engine, all the while the
-* master execute_cb may refire.
-*/
-   INIT_LIST_HEAD(&rq->execute_cb);
-}
-
 static int
 __i915_request_await_execution(struct i915_request *rq,
   struct i915_request *signal,
-- 
2.20.1


[Intel-gfx] [PATCH 15/40] drm/i915: Apply an execution_mask to the virtual_engine

2019-05-08 Thread Chris Wilson
Allow the user to direct which physical engines of the virtual engine
they wish to execute on, as sometimes it is necessary to override the
load balancing algorithm.

v2: Only kick the virtual engines on context-out if required

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c|  67 +
 drivers/gpu/drm/i915/gt/selftest_lrc.c | 131 +
 drivers/gpu/drm/i915/i915_request.c|   1 +
 drivers/gpu/drm/i915/i915_request.h|   3 +
 4 files changed, 202 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index bc388df39802..69849ffb9c82 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -550,6 +550,15 @@ execlists_context_schedule_in(struct i915_request *rq)
rq->hw_context->active = rq->engine;
 }
 
+static void kick_siblings(struct i915_request *rq)
+{
+   struct virtual_engine *ve = to_virtual_engine(rq->hw_context->engine);
+   struct i915_request *next = READ_ONCE(ve->request);
+
+   if (next && next->execution_mask & ~rq->execution_mask)
+   tasklet_schedule(&ve->base.execlists.tasklet);
+}
+
 static inline void
 execlists_context_schedule_out(struct i915_request *rq, unsigned long status)
 {
@@ -557,6 +566,18 @@ execlists_context_schedule_out(struct i915_request *rq, 
unsigned long status)
intel_engine_context_out(rq->engine);
execlists_context_status_change(rq, status);
trace_i915_request_out(rq);
+
+   /*
+* If this is part of a virtual engine, its next request may have
+* been blocked waiting for access to the active context. We have
+* to kick all the siblings again in case we need to switch (e.g.
+* the next request is not runnable on this engine). Hopefully,
+* we will already have submitted the next request before the
+* tasklet runs and do not need to rebuild each virtual tree
+* and kick everyone again.
+*/
+   if (rq->engine != rq->hw_context->engine)
+   kick_siblings(rq);
 }
 
 static u64 execlists_update_context(struct i915_request *rq)
@@ -787,6 +808,9 @@ static bool virtual_matches(const struct virtual_engine *ve,
 {
const struct intel_engine_cs *active;
 
+   if (!(rq->execution_mask & engine->mask)) /* We peeked too soon! */
+   return false;
+
/*
 * We track when the HW has completed saving the context image
 * (i.e. when we have seen the final CS event switching out of
@@ -3159,12 +3183,44 @@ static const struct intel_context_ops 
virtual_context_ops = {
.destroy = virtual_context_destroy,
 };
 
+static intel_engine_mask_t virtual_submission_mask(struct virtual_engine *ve)
+{
+   struct i915_request *rq;
+   intel_engine_mask_t mask;
+
+   rq = READ_ONCE(ve->request);
+   if (!rq)
+   return 0;
+
+   /* The rq is ready for submission; rq->execution_mask is now stable. */
+   mask = rq->execution_mask;
+   if (unlikely(!mask)) {
+   /* Invalid selection, submit to a random engine in error */
+   i915_request_skip(rq, -ENODEV);
+   mask = ve->siblings[0]->mask;
+   }
+
+   GEM_TRACE("%s: rq=%llx:%lld, mask=%x, prio=%d\n",
+ ve->base.name,
+ rq->fence.context, rq->fence.seqno,
+ mask, ve->base.execlists.queue_priority_hint);
+
+   return mask;
+}
+
 static void virtual_submission_tasklet(unsigned long data)
 {
struct virtual_engine * const ve = (struct virtual_engine *)data;
const int prio = ve->base.execlists.queue_priority_hint;
+   intel_engine_mask_t mask;
unsigned int n;
 
+   rcu_read_lock();
+   mask = virtual_submission_mask(ve);
+   rcu_read_unlock();
+   if (unlikely(!mask))
+   return;
+
local_irq_disable();
for (n = 0; READ_ONCE(ve->request) && n < ve->num_siblings; n++) {
struct intel_engine_cs *sibling = ve->siblings[n];
@@ -3172,6 +3228,17 @@ static void virtual_submission_tasklet(unsigned long 
data)
struct rb_node **parent, *rb;
bool first;
 
+   if (unlikely(!(mask & sibling->mask))) {
+   if (!RB_EMPTY_NODE(&node->rb)) {
+   spin_lock(&sibling->timeline.lock);
+   rb_erase_cached(&node->rb,
+   &sibling->execlists.virtual);
+   RB_CLEAR_NODE(&node->rb);
+   spin_unlock(&sibling->timeline.lock);
+   }
+   continue;
+   }
+
spin_lock(&sibling->timeline.lock);
 
if (!RB_EMPTY_NODE(&node->rb)) {
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c 
b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 

[Intel-gfx] [PATCH 37/40] drm/i915: Replace engine->timeline with a plain list

2019-05-08 Thread Chris Wilson
To continue the onslaught of removing the assumption of a global
execution ordering, another casualty is the engine->timeline. Without an
actual timeline to track, it is overkill and we can replace it with a
much less grand plain list. We still need a list of requests inflight,
for the simple purpose of finding inflight requests (for retiring,
resetting, preemption etc).

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_engine.h|  6 ++
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 62 ++--
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  6 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c   | 95 ++-
 drivers/gpu/drm/i915/gt/intel_reset.c | 10 +-
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c| 15 ++-
 drivers/gpu/drm/i915/gt/mock_engine.c | 18 ++--
 drivers/gpu/drm/i915/i915_gpu_error.c |  5 +-
 drivers/gpu/drm/i915/i915_request.c   | 43 +++--
 drivers/gpu/drm/i915/i915_request.h   |  2 +-
 drivers/gpu/drm/i915/i915_scheduler.c | 36 +++
 drivers/gpu/drm/i915/i915_timeline.c  |  1 -
 drivers/gpu/drm/i915/i915_timeline.h  | 19 
 drivers/gpu/drm/i915/i915_timeline_types.h|  4 -
 drivers/gpu/drm/i915/intel_guc_submission.c   | 16 ++--
 .../gpu/drm/i915/selftests/mock_timeline.c|  1 -
 16 files changed, 152 insertions(+), 187 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h 
b/drivers/gpu/drm/i915/gt/intel_engine.h
index 1ae2934b7ecc..b8f5c4e1d39e 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -544,4 +544,10 @@ static inline bool inject_preempt_hang(struct 
intel_engine_execlists *execlists)
 
 #endif
 
+void intel_engine_init_active(struct intel_engine_cs *engine,
+ unsigned int subclass);
+#define ENGINE_PHYSICAL0
+#define ENGINE_MOCK1
+#define ENGINE_VIRTUAL 2
+
 #endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 794aa8b11742..f2e28510433b 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -567,14 +567,7 @@ static int intel_engine_setup_common(struct 
intel_engine_cs *engine)
if (err)
return err;
 
-   err = i915_timeline_init(engine->i915,
-&engine->timeline,
-engine->status_page.vma);
-   if (err)
-   goto err_hwsp;
-
-   i915_timeline_set_subclass(&engine->timeline, TIMELINE_ENGINE);
-
+   intel_engine_init_active(engine, ENGINE_PHYSICAL);
intel_engine_init_breadcrumbs(engine);
intel_engine_init_execlists(engine);
intel_engine_init_hangcheck(engine);
@@ -587,10 +580,6 @@ static int intel_engine_setup_common(struct 
intel_engine_cs *engine)
intel_sseu_from_device_info(&RUNTIME_INFO(engine->i915)->sseu);
 
return 0;
-
-err_hwsp:
-   cleanup_status_page(engine);
-   return err;
 }
 
 /**
@@ -747,6 +736,27 @@ static int pin_context(struct i915_gem_context *ctx,
return 0;
 }
 
+void
+intel_engine_init_active(struct intel_engine_cs *engine, unsigned int subclass)
+{
+   INIT_LIST_HEAD(&engine->active.requests);
+
+   spin_lock_init(&engine->active.lock);
+   lockdep_set_subclass(&engine->active.lock, subclass);
+
+   /*
+* Due to an interesting quirk in lockdep's internal debug tracking,
+* after setting a subclass we must ensure the lock is used. Otherwise,
+* nr_unused_locks is incremented once too often.
+*/
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+   local_irq_disable();
+   lock_map_acquire(&engine->active.lock.dep_map);
+   lock_map_release(&engine->active.lock.dep_map);
+   local_irq_enable();
+#endif
+}
+
 /**
 * intel_engine_init_common - initialize engine state which might require hw access
  * @engine: Engine to initialize.
@@ -810,6 +820,8 @@ int intel_engine_init_common(struct intel_engine_cs *engine)
  */
 void intel_engine_cleanup_common(struct intel_engine_cs *engine)
 {
+   GEM_BUG_ON(!list_empty(&engine->active.requests));
+
cleanup_status_page(engine);
 
intel_engine_fini_breadcrumbs(engine);
@@ -824,8 +836,6 @@ void intel_engine_cleanup_common(struct intel_engine_cs 
*engine)
intel_context_unpin(engine->kernel_context);
GEM_BUG_ON(!llist_empty(&engine->barrier_tasks));
 
-   i915_timeline_fini(&engine->timeline);
-
intel_wa_list_free(&engine->ctx_wa_list);
intel_wa_list_free(&engine->wa_list);
intel_wa_list_free(&engine->whitelist);
@@ -1431,16 +1441,6 @@ void intel_engine_dump(struct intel_engine_cs *engine,
 
drm_printf(m, "\tRequests:\n");
 
-   rq = list_first_entry(&engine->timeline.requests,
- struct i915_request, link);
-   if (&rq->link != &engine->timeline.reques

[Intel-gfx] [PATCH 22/40] drm/i915: Move shmem object setup to its own file

2019-05-08 Thread Chris Wilson
Split the plain old shmem object into its own file to start decluttering
i915_gem.c.

v2: Lose the confusing, hysterical raisins, suffix of _gtt.

Signed-off-by: Chris Wilson 
Reviewed-by: Matthew Auld 
---
 drivers/gpu/drm/i915/Makefile |   3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c| 298 +++
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  41 +-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 512 +++
 drivers/gpu/drm/i915/gt/intel_lrc.c   |   4 +-
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c|   2 +-
 drivers/gpu/drm/i915/gvt/cmd_parser.c |  21 +-
 drivers/gpu/drm/i915/i915_drv.h   |  10 -
 drivers/gpu/drm/i915/i915_gem.c   | 826 +-
 drivers/gpu/drm/i915/i915_perf.c  |   2 +-
 drivers/gpu/drm/i915/intel_fbdev.c|   2 +-
 drivers/gpu/drm/i915/intel_guc.c  |   2 +-
 drivers/gpu/drm/i915/intel_uc_fw.c|   3 +-
 drivers/gpu/drm/i915/selftests/huge_pages.c   |   6 +-
 .../gpu/drm/i915/selftests/i915_gem_dmabuf.c  |   8 +-
 .../gpu/drm/i915/selftests/i915_gem_object.c  |   4 +-
 16 files changed, 883 insertions(+), 861 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_shmem.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 1c8bd0a5212d..625f9749355b 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -87,7 +87,8 @@ i915-y += $(gt-y)
 # GEM (Graphics Execution Management) code
 obj-y += gem/
 gem-y += \
-   gem/i915_gem_object.o
+   gem/i915_gem_object.o \
+   gem/i915_gem_shmem.o
 i915-y += \
  $(gem-y) \
  i915_active.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 8179252bb39b..86e7e88817af 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -26,6 +26,7 @@
 
 #include "../i915_drv.h"
 #include "../i915_globals.h"
+#include "../intel_frontbuffer.h"
 
 static struct i915_global_object {
struct i915_global base;
@@ -42,6 +43,64 @@ void i915_gem_object_free(struct drm_i915_gem_object *obj)
return kmem_cache_free(global.slab_objects, obj);
 }
 
+/* some bookkeeping */
+static void i915_gem_info_add_obj(struct drm_i915_private *i915,
+ u64 size)
+{
+   spin_lock(&i915->mm.object_stat_lock);
+   i915->mm.object_count++;
+   i915->mm.object_memory += size;
+   spin_unlock(&i915->mm.object_stat_lock);
+}
+
+static void i915_gem_info_remove_obj(struct drm_i915_private *i915,
+u64 size)
+{
+   spin_lock(&i915->mm.object_stat_lock);
+   i915->mm.object_count--;
+   i915->mm.object_memory -= size;
+   spin_unlock(&i915->mm.object_stat_lock);
+}
+
+static void
+frontbuffer_retire(struct i915_active_request *active,
+  struct i915_request *request)
+{
+   struct drm_i915_gem_object *obj =
+   container_of(active, typeof(*obj), frontbuffer_write);
+
+   intel_fb_obj_flush(obj, ORIGIN_CS);
+}
+
+void i915_gem_object_init(struct drm_i915_gem_object *obj,
+ const struct drm_i915_gem_object_ops *ops)
+{
+   mutex_init(&obj->mm.lock);
+
+   spin_lock_init(&obj->vma.lock);
+   INIT_LIST_HEAD(&obj->vma.list);
+
+   INIT_LIST_HEAD(&obj->lut_list);
+   INIT_LIST_HEAD(&obj->batch_pool_link);
+
+   init_rcu_head(&obj->rcu);
+
+   obj->ops = ops;
+
+   reservation_object_init(&obj->__builtin_resv);
+   obj->resv = &obj->__builtin_resv;
+
+   obj->frontbuffer_ggtt_origin = ORIGIN_GTT;
+   i915_active_request_init(&obj->frontbuffer_write,
+NULL, frontbuffer_retire);
+
+   obj->mm.madv = I915_MADV_WILLNEED;
+   INIT_RADIX_TREE(&obj->mm.get_page.radix, GFP_KERNEL | __GFP_NOWARN);
+   mutex_init(&obj->mm.get_page.lock);
+
+   i915_gem_info_add_obj(to_i915(obj->base.dev), obj->base.size);
+}
+
 /**
  * Mark up the object's coherency levels for a given cache_level
  * @obj: #drm_i915_gem_object
@@ -64,6 +123,245 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE);
 }
 
+void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file)
+{
+   struct drm_i915_private *i915 = to_i915(gem->dev);
+   struct drm_i915_gem_object *obj = to_intel_bo(gem);
+   struct drm_i915_file_private *fpriv = file->driver_priv;
+   struct i915_lut_handle *lut, *ln;
+
+   mutex_lock(&i915->drm.struct_mutex);
+
+   list_for_each_entry_safe(lut, ln, &obj->lut_list, obj_link) {
+   struct i915_gem_context *ctx = lut->ctx;
+   struct i915_vma *vma;
+
+   GEM_BUG_ON(ctx->file_priv == ERR_PTR(-EBADF));
+   if (ctx->file_priv != fpriv)
+   conti

[Intel-gfx] [PATCH 26/40] drm/i915: Move more GEM objects under gem/

2019-05-08 Thread Chris Wilson
Continuing the theme of separating out the GEM clutter.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/Makefile | 26 +--
 drivers/gpu/drm/i915/Makefile.header-test |  2 -
 .../gpu/drm/i915/{ => gem}/i915_gem_clflush.c | 27 +++
 drivers/gpu/drm/i915/gem/i915_gem_clflush.h   | 20 +
 .../gpu/drm/i915/{ => gem}/i915_gem_context.c | 27 ++-
 .../gpu/drm/i915/{ => gem}/i915_gem_context.h | 22 +
 .../i915/{ => gem}/i915_gem_context_types.h   |  0
 .../gpu/drm/i915/{ => gem}/i915_gem_dmabuf.c  | 28 +++-
 drivers/gpu/drm/i915/gem/i915_gem_domain.c|  2 +-
 .../drm/i915/{ => gem}/i915_gem_execbuffer.c  | 37 ---
 .../drm/i915/{ => gem}/i915_gem_internal.c| 33 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c| 10 -
 drivers/gpu/drm/i915/{ => gem}/i915_gem_pm.c  |  2 +-
 drivers/gpu/drm/i915/{ => gem}/i915_gem_pm.h  |  0
 .../drm/i915/{ => gem}/i915_gem_shrinker.c| 25 ++-
 .../gpu/drm/i915/{ => gem}/i915_gem_stolen.c  | 33 --
 .../gpu/drm/i915/{ => gem}/i915_gem_tiling.c  | 31 +++--
 .../gpu/drm/i915/{ => gem}/i915_gem_userptr.c | 30 +++--
 drivers/gpu/drm/i915/{ => gem}/i915_gemfs.c   | 25 ++-
 drivers/gpu/drm/i915/gem/i915_gemfs.h | 16 +++
 .../{ => gem}/selftests/huge_gem_object.c | 22 +
 .../drm/i915/gem/selftests/huge_gem_object.h  | 27 +++
 .../drm/i915/{ => gem}/selftests/huge_pages.c | 34 +-
 .../{ => gem}/selftests/i915_gem_coherency.c  | 26 ++-
 .../{ => gem}/selftests/i915_gem_context.c| 40 +
 .../{ => gem}/selftests/i915_gem_dmabuf.c | 26 ++-
 .../drm/i915/gem/selftests/i915_gem_mman.c|  2 +-
 .../{ => gem}/selftests/i915_gem_object.c | 28 +++-
 .../i915/{ => gem}/selftests/igt_gem_utils.c  |  2 +-
 .../i915/{ => gem}/selftests/igt_gem_utils.h  |  0
 .../i915/{ => gem}/selftests/mock_context.c   | 24 ++
 .../gpu/drm/i915/gem/selftests/mock_context.h | 24 ++
 .../i915/{ => gem}/selftests/mock_dmabuf.c| 22 +
 .../gpu/drm/i915/gem/selftests/mock_dmabuf.h  | 22 +
 .../{ => gem}/selftests/mock_gem_object.h |  7 ++-
 drivers/gpu/drm/i915/gt/intel_context.c   |  4 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  3 ++
 drivers/gpu/drm/i915/gt/intel_lrc.c   |  2 +
 drivers/gpu/drm/i915/gt/intel_lrc.h   | 14 +++---
 drivers/gpu/drm/i915/gt/intel_reset.c |  2 +
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c|  3 ++
 drivers/gpu/drm/i915/gt/mock_engine.c |  3 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |  7 +--
 drivers/gpu/drm/i915/gt/selftest_lrc.c|  7 ++-
 .../gpu/drm/i915/gt/selftest_workarounds.c|  6 ++-
 drivers/gpu/drm/i915/gvt/mmio_context.c   |  1 +
 drivers/gpu/drm/i915/gvt/scheduler.c  |  5 ++-
 drivers/gpu/drm/i915/i915_debugfs.c   |  2 +-
 drivers/gpu/drm/i915/i915_drv.c   |  1 +
 drivers/gpu/drm/i915/i915_drv.h   |  2 +-
 drivers/gpu/drm/i915/i915_gem.c   | 11 ++---
 drivers/gpu/drm/i915/i915_gem_clflush.h   | 36 ---
 drivers/gpu/drm/i915/i915_gem_evict.c |  2 +
 drivers/gpu/drm/i915/i915_gemfs.h | 34 --
 drivers/gpu/drm/i915/i915_globals.c   |  2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c |  2 +
 drivers/gpu/drm/i915/i915_perf.c  |  2 +
 drivers/gpu/drm/i915/i915_request.c   |  3 ++
 drivers/gpu/drm/i915/intel_display.c  |  1 -
 drivers/gpu/drm/i915/intel_guc_submission.c   |  2 +
 drivers/gpu/drm/i915/intel_overlay.c  |  2 +
 .../gpu/drm/i915/selftests/huge_gem_object.h  | 45 ---
 drivers/gpu/drm/i915/selftests/i915_active.c  |  2 +
 drivers/gpu/drm/i915/selftests/i915_gem.c |  6 ++-
 .../gpu/drm/i915/selftests/i915_gem_evict.c   |  6 ++-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  3 +-
 drivers/gpu/drm/i915/selftests/i915_request.c |  5 ++-
 .../gpu/drm/i915/selftests/i915_timeline.c|  1 +
 drivers/gpu/drm/i915/selftests/i915_vma.c |  3 +-
 .../gpu/drm/i915/selftests/igt_flush_test.c   |  2 +
 drivers/gpu/drm/i915/selftests/igt_spinner.c  |  3 +-
 drivers/gpu/drm/i915/selftests/igt_spinner.h  |  2 +-
 drivers/gpu/drm/i915/selftests/intel_guc.c|  1 +
 drivers/gpu/drm/i915/selftests/mock_context.h | 42 -
 drivers/gpu/drm/i915/selftests/mock_dmabuf.h  | 41 -
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  5 ++-
 drivers/gpu/drm/i915/selftests/mock_request.c |  2 +-
 77 files changed, 338 insertions(+), 692 deletions(-)
 rename drivers/gpu/drm/i915/{ => gem}/i915_gem_clflush.c (77%)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_clflush.h
 rename drivers/gpu/drm/i915/{ => gem}/i915_gem_context.c (98%)
 rename drivers/gpu/drm/i915/{ => gem}/i915_gem_context.h (85%)
 rename drivers/gpu/drm/i915/{ => gem}/i915_gem_context_types.

[Intel-gfx] [PATCH 14/40] drm/i915: Load balancing across a virtual engine

2019-05-08 Thread Chris Wilson
Having allowed the user to define the set of engines they want to use, we
go one step further and allow them to bind those engines into a single
virtual instance. Submitting a batch to the virtual engine will then
forward it to any one of the set, in a manner that best distributes load.
The virtual engine has a single timeline across all engines (it operates
as a single queue), so it cannot concurrently run batches across multiple
engines by itself; that is left to the user, who may submit multiple
concurrent batches to multiple queues. Multiple users will be load
balanced across the system.

The mechanism used for load balancing in this patch is a late greedy
balancer. When a request is ready for execution, it is added to each
engine's queue, and when an engine is ready for its next request it
claims it from the virtual engine. The first engine to do so wins, i.e.
the request is executed at the earliest opportunity (idle moment) in the
system.

As not all HW is created equal, the user is still able to skip the
virtual engine and execute the batch on a specific engine, all within the
same queue. It will then be executed in order on the correct engine,
with execution on other virtual engines being moved away due to the load
detection.

A couple of areas for potential improvement left!

- The virtual engine always takes priority over equal-priority tasks.
Mostly broken up by applying FQ_CODEL rules for prioritising new clients,
and hopefully the virtual and real engines are not then congested (i.e.
all work is via virtual engines, or all work is to the real engine).

- We require the breadcrumb irq around every virtual engine request. For
normal engines, we eliminate the need for the slow round trip via
interrupt by using the submit fence and queueing in order. For virtual
engines, we have to allow any job to transfer to a new ring, and cannot
coalesce the submissions, so require the completion fence instead,
forcing the persistent use of interrupts.

- We only drip feed single requests through each virtual engine and onto
the physical engines, even if there was enough work to fill all ELSP,
leaving small stalls with an idle CS event at the end of every request.
Could we be greedy and fill both slots? Being lazy is virtuous for load
distribution on less-than-full workloads though.

Other areas of improvement are more general, such as reducing lock
contention, reducing dispatch overhead, looking at direct submission
rather than bouncing around tasklets etc.

sseu: Lift the restriction to allow sseu to be reconfigured on virtual
engines composed of RENDER_CLASS (rcs).

v2: macroize check_user_mbz()
v3: Cancel virtual engines on wedging
v4: Commence commenting
v5: Replace 64b sibling_mask with a list of class:instance
v6: Drop the one-element array in the uabi
v7: Assert it is a virtual engine in to_virtual_engine()

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gt/intel_engine_types.h |   8 +
 drivers/gpu/drm/i915/gt/intel_lrc.c  | 683 ++-
 drivers/gpu/drm/i915/gt/intel_lrc.h  |   9 +
 drivers/gpu/drm/i915/gt/selftest_lrc.c   | 180 +
 drivers/gpu/drm/i915/i915_gem.h  |   5 +
 drivers/gpu/drm/i915/i915_gem_context.c  | 116 +++-
 drivers/gpu/drm/i915/i915_scheduler.c|  19 +-
 drivers/gpu/drm/i915/i915_timeline_types.h   |   1 +
 include/uapi/drm/i915_drm.h  |  39 ++
 9 files changed, 1032 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index e381c1c73902..7b47e00fa082 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -227,6 +227,7 @@ struct intel_engine_execlists {
 * @queue: queue of requests, in priority lists
 */
struct rb_root_cached queue;
+   struct rb_root_cached virtual;
 
/**
 * @csb_write: control register for Context Switch buffer
@@ -445,6 +446,7 @@ struct intel_engine_cs {
 #define I915_ENGINE_HAS_PREEMPTION   BIT(2)
 #define I915_ENGINE_HAS_SEMAPHORES   BIT(3)
 #define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(4)
+#define I915_ENGINE_IS_VIRTUAL   BIT(5)
unsigned int flags;
 
/*
@@ -534,6 +536,12 @@ intel_engine_needs_breadcrumb_tasklet(const struct intel_engine_cs *engine)
return engine->flags & I915_ENGINE_NEEDS_BREADCRUMB_TASKLET;
 }
 
+static inline bool
+intel_engine_is_virtual(const struct intel_engine_cs *engine)
+{
+   return engine->flags & I915_ENGINE_IS_VIRTUAL;
+}
+
 #define instdone_slice_mask(dev_priv__) \
(IS_GEN(dev_priv__, 7) ? \
 1 : RUNTIME_INFO(dev_priv__)->sseu.slice_mask)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index f1d62746e066..bc388df39802 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -166,6 +166,42 @@
 
 #define A

[Intel-gfx] [PATCH 05/40] drm/i915: Bump signaler priority on adding a waiter

2019-05-08 Thread Chris Wilson
The handling of the no-preemption priority level imposes the restriction
that we need to maintain the implied ordering even though preemption is
disabled. Otherwise we may end up with an AB-BA deadlock across multiple
engines due to a real preemption event reordering the no-preemption
WAITs. To resolve this issue we currently promote all requests to WAIT
on unsubmission, however this interferes with the timeslicing
requirement that we do not apply any implicit promotion that will defeat
the round-robin timeslice list. (If we automatically promote the active
request it will go back to the head of the queue and not the tail!)

So we need implicit promotion to prevent reordering around semaphores
where we are not allowed to preempt, and we must avoid implicit
promotion on unsubmission. So instead of promoting at unsubmit, we apply
the implicit promotion when adding the dependency; this avoids the
semaphore deadlock and also reduces the gains made by the promotion for
userspace waiting. Furthermore, by keeping the earlier dependencies at a
higher level, we reduce the search space for timeslicing without
altering runtime scheduling too badly (no dependencies at all will be
assigned a higher priority for rrul).

v2: Limit the bump to external edges (as originally intended) i.e.
between contexts and out to the user.

Testcase: igt/gem_concurrent_blit
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/selftest_lrc.c  | 12 
 drivers/gpu/drm/i915/i915_request.c |  9 -
 drivers/gpu/drm/i915/i915_scheduler.c   | 11 +++
 drivers/gpu/drm/i915/i915_scheduler_types.h |  3 ++-
 4 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 4b042893dc0e..5b3d8e33f1cf 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -98,12 +98,14 @@ static int live_busywait_preempt(void *arg)
ctx_hi = kernel_context(i915);
if (!ctx_hi)
goto err_unlock;
-   ctx_hi->sched.priority = INT_MAX;
+   ctx_hi->sched.priority =
+   I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
 
ctx_lo = kernel_context(i915);
if (!ctx_lo)
goto err_ctx_hi;
-   ctx_lo->sched.priority = INT_MIN;
+   ctx_lo->sched.priority =
+   I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
 
obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
if (IS_ERR(obj)) {
@@ -958,12 +960,14 @@ static int live_preempt_hang(void *arg)
ctx_hi = kernel_context(i915);
if (!ctx_hi)
goto err_spin_lo;
-   ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
+   ctx_hi->sched.priority =
+   I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
 
ctx_lo = kernel_context(i915);
if (!ctx_lo)
goto err_ctx_hi;
-   ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY;
+   ctx_lo->sched.priority =
+   I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
 
for_each_engine(engine, i915, id) {
struct i915_request *rq;
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index fa955b7b6def..9510db566a58 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -488,15 +488,6 @@ void __i915_request_unsubmit(struct i915_request *request)
/* We may be recursing from the signal callback of another i915 fence */
spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
 
-   /*
-* As we do not allow WAIT to preempt inflight requests,
-* once we have executed a request, along with triggering
-* any execution callbacks, we must preserve its ordering
-* within the non-preemptible FIFO.
-*/
-   BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK); /* only internal */
-   request->sched.attr.priority |= __NO_PREEMPTION;
-
if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags))
i915_request_cancel_breadcrumb(request);
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 5581c5004ff0..d215dcdf0b1e 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -387,6 +387,16 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
!node_started(signal))
node->flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
 
+   /*
+* As we do not allow WAIT to preempt inflight requests,
+* once we have executed a request, along with triggering
+* any execution callbacks, we must preserve its ordering
+* within the non-preemptible FIFO.
+*/
+   BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK);
+   

[Intel-gfx] [PATCH 09/40] drm/i915: Restore control over ppgtt for context creation ABI

2019-05-08 Thread Chris Wilson
Having hidden the partially exposed new ABI from the PR, put it back again
for completion of context recovery. A significant part of context
recovery is the ability to reuse as much of the old context as is
feasible (to avoid expensive reconstruction). The biggest chunk kept
hidden at the moment is fine-control over the ctx->ppgtt (the GPU page
tables and associated translation tables and kernel maps), so make
control over the ctx->ppgtt explicit.

This allows userspace to create and share virtual memory address spaces
(within the limits of a single fd) between contexts they own, along with
the ability to query the contexts for the vm state.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.c |  2 ++
 drivers/gpu/drm/i915/i915_gem_context.c |  5 -
 include/uapi/drm/i915_drm.h | 15 +++
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 2c7a4318d13c..5061cb32856b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -3164,6 +3164,8 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
	DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
	DRM_IOCTL_DEF_DRV(I915_PERF_REMOVE_CONFIG, i915_perf_remove_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
	DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_VM_CREATE, i915_gem_vm_create_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_VM_DESTROY, i915_gem_vm_destroy_ioctl, DRM_RENDER_ALLOW),
 };
 
 static struct drm_driver driver = {
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 65cefc520e79..413c4529191d 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -98,7 +98,6 @@
 #include "i915_user_extensions.h"
 
 #define I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE (1 << 1)
-#define I915_CONTEXT_PARAM_VM 0x9
 
 #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
 
@@ -966,8 +965,6 @@ static int get_ppgtt(struct drm_i915_file_private *file_priv,
struct i915_hw_ppgtt *ppgtt;
int ret;
 
-   return -EINVAL; /* nothing to see here; please move along */
-
if (!ctx->ppgtt)
return -ENODEV;
 
@@ -1066,8 +1063,6 @@ static int set_ppgtt(struct drm_i915_file_private *file_priv,
struct i915_hw_ppgtt *ppgtt, *old;
int err;
 
-   return -EINVAL; /* nothing to see here; please move along */
-
if (args->size)
return -EINVAL;
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 3a73f5316766..d6ad4a15b2b9 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -355,6 +355,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_PERF_ADD_CONFIG   0x37
 #define DRM_I915_PERF_REMOVE_CONFIG0x38
 #define DRM_I915_QUERY 0x39
+#define DRM_I915_GEM_VM_CREATE 0x3a
+#define DRM_I915_GEM_VM_DESTROY0x3b
 /* Must be kept compact -- no holes */
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
@@ -415,6 +417,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_PERF_ADD_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
 #define DRM_IOCTL_I915_PERF_REMOVE_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)
 #define DRM_IOCTL_I915_QUERY			DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
+#define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
+#define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -1507,6 +1511,17 @@ struct drm_i915_gem_context_param {
  * On creation, all new contexts are marked as recoverable.
  */
 #define I915_CONTEXT_PARAM_RECOVERABLE 0x8
+
+   /*
+* The id of the associated virtual memory address space (ppGTT) of
+* this context. Can be retrieved and passed to another context
+* (on the same fd) for both to use the same ppGTT and so share
+* address layouts, and avoid reloading the page tables on context
+* switches between themselves.
+*
+* See DRM_I915_GEM_VM_CREATE and DRM_I915_GEM_VM_DESTROY.
+*/
+#define I915_CONTEXT_PARAM_VM  0x9
 /* Must be kept compact -- no holes and well documented */
 
__u64 value;
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH 11/40] drm/i915: Extend I915_CONTEXT_PARAM_SSEU to support local ctx->engine[]

2019-05-08 Thread Chris Wilson
Allow the user to specify a local engine index (as opposed to
class:index) that they can use to refer to a preset engine inside the
ctx->engine[] array defined by an earlier I915_CONTEXT_PARAM_ENGINES.
This will be useful for setting SSEU parameters on virtual engines that
are local to the context and do not have a valid global class:instance
lookup.

Note that, due to the ambiguity of using class:instance alongside
ctx->engines[], if a user-supplied engine map is active the user must
specify the engine to alter by its index into ctx->engines[].

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_gem_context.c | 24 
 include/uapi/drm/i915_drm.h |  3 ++-
 2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 21bfcd529097..5fdb44714a5c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -1363,6 +1363,7 @@ static int set_sseu(struct i915_gem_context *ctx,
struct drm_i915_gem_context_param_sseu user_sseu;
struct intel_context *ce;
struct intel_sseu sseu;
+   unsigned long lookup;
int ret;
 
if (args->size < sizeof(user_sseu))
@@ -1375,10 +1376,17 @@ static int set_sseu(struct i915_gem_context *ctx,
   sizeof(user_sseu)))
return -EFAULT;
 
-   if (user_sseu.flags || user_sseu.rsvd)
+   if (user_sseu.rsvd)
return -EINVAL;
 
-   ce = lookup_user_engine(ctx, 0, &user_sseu.engine);
+   if (user_sseu.flags & ~(I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX))
+   return -EINVAL;
+
+   lookup = 0;
+   if (user_sseu.flags & I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX)
+   lookup |= LOOKUP_USER_INDEX;
+
+   ce = lookup_user_engine(ctx, lookup, &user_sseu.engine);
if (IS_ERR(ce))
return PTR_ERR(ce);
 
@@ -1795,6 +1803,7 @@ static int get_sseu(struct i915_gem_context *ctx,
 {
struct drm_i915_gem_context_param_sseu user_sseu;
struct intel_context *ce;
+   unsigned long lookup;
int err;
 
if (args->size == 0)
@@ -1806,10 +1815,17 @@ static int get_sseu(struct i915_gem_context *ctx,
   sizeof(user_sseu)))
return -EFAULT;
 
-   if (user_sseu.flags || user_sseu.rsvd)
+   if (user_sseu.rsvd)
return -EINVAL;
 
-   ce = lookup_user_engine(ctx, 0, &user_sseu.engine);
+   if (user_sseu.flags & ~(I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX))
+   return -EINVAL;
+
+   lookup = 0;
+   if (user_sseu.flags & I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX)
+   lookup |= LOOKUP_USER_INDEX;
+
+   ce = lookup_user_engine(ctx, lookup, &user_sseu.engine);
if (IS_ERR(ce))
return PTR_ERR(ce);
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 8e1bb22926e4..82bd488ed0d1 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1576,9 +1576,10 @@ struct drm_i915_gem_context_param_sseu {
struct i915_engine_class_instance engine;
 
/*
-* Unused for now. Must be cleared to zero.
+* Unknown flags must be cleared to zero.
 */
__u32 flags;
+#define I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX (1u << 0)
 
/*
 * Mask of slices to enable for the context. Valid values are a subset
-- 
2.20.1


[Intel-gfx] [PATCH 02/40] drm/i915: Rearrange i915_scheduler.c

2019-05-08 Thread Chris Wilson
To avoid pulling in a forward declaration in the next patch, move the
i915_sched_node handling to after the main dfs of the scheduler.

Signed-off-by: Chris Wilson 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_scheduler.c | 210 +-
 1 file changed, 105 insertions(+), 105 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index ec22c3fe7360..b7488c31e3e9 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -35,109 +35,6 @@ static inline bool node_signaled(const struct i915_sched_node *node)
return i915_request_completed(node_to_request(node));
 }
 
-void i915_sched_node_init(struct i915_sched_node *node)
-{
-   INIT_LIST_HEAD(&node->signalers_list);
-   INIT_LIST_HEAD(&node->waiters_list);
-   INIT_LIST_HEAD(&node->link);
-   node->attr.priority = I915_PRIORITY_INVALID;
-   node->semaphores = 0;
-   node->flags = 0;
-}
-
-static struct i915_dependency *
-i915_dependency_alloc(void)
-{
-   return kmem_cache_alloc(global.slab_dependencies, GFP_KERNEL);
-}
-
-static void
-i915_dependency_free(struct i915_dependency *dep)
-{
-   kmem_cache_free(global.slab_dependencies, dep);
-}
-
-bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
- struct i915_sched_node *signal,
- struct i915_dependency *dep,
- unsigned long flags)
-{
-   bool ret = false;
-
-   spin_lock_irq(&schedule_lock);
-
-   if (!node_signaled(signal)) {
-   INIT_LIST_HEAD(&dep->dfs_link);
-   list_add(&dep->wait_link, &signal->waiters_list);
-   list_add(&dep->signal_link, &node->signalers_list);
-   dep->signaler = signal;
-   dep->flags = flags;
-
-   /* Keep track of whether anyone on this chain has a semaphore */
-   if (signal->flags & I915_SCHED_HAS_SEMAPHORE_CHAIN &&
-   !node_started(signal))
-   node->flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
-
-   ret = true;
-   }
-
-   spin_unlock_irq(&schedule_lock);
-
-   return ret;
-}
-
-int i915_sched_node_add_dependency(struct i915_sched_node *node,
-  struct i915_sched_node *signal)
-{
-   struct i915_dependency *dep;
-
-   dep = i915_dependency_alloc();
-   if (!dep)
-   return -ENOMEM;
-
-   if (!__i915_sched_node_add_dependency(node, signal, dep,
- I915_DEPENDENCY_ALLOC))
-   i915_dependency_free(dep);
-
-   return 0;
-}
-
-void i915_sched_node_fini(struct i915_sched_node *node)
-{
-   struct i915_dependency *dep, *tmp;
-
-   GEM_BUG_ON(!list_empty(&node->link));
-
-   spin_lock_irq(&schedule_lock);
-
-   /*
-* Everyone we depended upon (the fences we wait to be signaled)
-* should retire before us and remove themselves from our list.
-* However, retirement is run independently on each timeline and
-* so we may be called out-of-order.
-*/
-   list_for_each_entry_safe(dep, tmp, &node->signalers_list, signal_link) {
-   GEM_BUG_ON(!node_signaled(dep->signaler));
-   GEM_BUG_ON(!list_empty(&dep->dfs_link));
-
-   list_del(&dep->wait_link);
-   if (dep->flags & I915_DEPENDENCY_ALLOC)
-   i915_dependency_free(dep);
-   }
-
-   /* Remove ourselves from everyone who depends upon us */
-   list_for_each_entry_safe(dep, tmp, &node->waiters_list, wait_link) {
-   GEM_BUG_ON(dep->signaler != node);
-   GEM_BUG_ON(!list_empty(&dep->dfs_link));
-
-   list_del(&dep->signal_link);
-   if (dep->flags & I915_DEPENDENCY_ALLOC)
-   i915_dependency_free(dep);
-   }
-
-   spin_unlock_irq(&schedule_lock);
-}
-
 static inline struct i915_priolist *to_priolist(struct rb_node *rb)
 {
return rb_entry(rb, struct i915_priolist, node);
@@ -239,6 +136,11 @@ i915_sched_lookup_priolist(struct intel_engine_cs *engine, int prio)
return &p->requests[idx];
 }
 
+void __i915_priolist_free(struct i915_priolist *p)
+{
+   kmem_cache_free(global.slab_priorities, p);
+}
+
 struct sched_cache {
struct list_head *priolist;
 };
@@ -440,9 +342,107 @@ void i915_schedule_bump_priority(struct i915_request *rq, unsigned int bump)
spin_unlock_irqrestore(&schedule_lock, flags);
 }
 
-void __i915_priolist_free(struct i915_priolist *p)
+void i915_sched_node_init(struct i915_sched_node *node)
 {
-   kmem_cache_free(global.slab_priorities, p);
+   INIT_LIST_HEAD(&node->signalers_list);
+   INIT_LIST_HEAD(&node->waiters_list);
+   INIT_LIST_HEAD(&node->link);
+   node->attr.priority = I915_PRIORITY_INVALID;
+ 

[Intel-gfx] [PATCH 10/40] drm/i915: Allow a context to define its set of engines

2019-05-08 Thread Chris Wilson
Over the last few years, we have debated how to extend the user API to
support an increase in the number of engines that may be sparse and
even be heterogeneous within a class (not all video decoders created
equal). We settled on using (class, instance) tuples to identify a
specific engine, with an API for the user to construct a map of engines
to capabilities. Into this picture, we then add a challenge of virtual
engines; one user engine that maps behind the scenes to any number of
physical engines. To keep it general, we want the user to have full
control over that mapping. To that end, we allow the user to constrain a
context to define the set of engines that it can access, order fully
controlled by the user via (class, instance). With such precise control
in context setup, we can continue to use the existing execbuf uABI of
specifying a single index; only now it doesn't automagically map onto
the engines, it uses the user defined engine map from the context.
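
The shape of that lookup can be sketched in a few lines of C. Everything below is an illustrative stand-in, not the actual i915 symbols: the context holds a user-ordered array of (class, instance) pairs, and the execbuf index simply indexes into it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a per-context engine map: the user supplies
 * an ordered array of (class, instance) pairs at context setup, and the
 * existing execbuf index then selects from that array instead of the
 * driver's fixed mapping. */
struct engine_id {
	unsigned int engine_class;
	unsigned int engine_instance;
};

struct ctx_engines {
	const struct engine_id *map;	/* user-defined order */
	size_t num_engines;		/* may be zero (empty engines[]) */
};

/* Resolve an execbuf index through the context's map; NULL here mirrors
 * an -EINVAL for an out-of-range index in the real uAPI. */
static const struct engine_id *
ctx_lookup_engine(const struct ctx_engines *e, size_t idx)
{
	if (idx >= e->num_engines)
		return NULL;
	return &e->map[idx];
}
```

Because the map is fixed at context setup (under engines_mutex in the real patch), the hot execbuf path stays a plain bounds-checked array index.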

v2: Fixup freeing of local on success of get_engines()
v3: Allow empty engines[]
v4: s/nengine/num_engines/
v5: Replace 64 limit on num_engines with a note that execbuf is
currently limited to only using the first 64 engines.
v6: Actually use the engines_mutex to guard the ctx->engines.

Testcase: igt/gem_ctx_engines
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_gem_context.c   | 239 +-
 drivers/gpu/drm/i915/i915_gem_context.h   |  18 ++
 drivers/gpu/drm/i915/i915_gem_context_types.h |   1 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c|   5 +-
 drivers/gpu/drm/i915/i915_utils.h |  34 +++
 include/uapi/drm/i915_drm.h   |  31 +++
 6 files changed, 315 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
index 413c4529191d..21bfcd529097 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -90,7 +90,6 @@
 #include 
 
 #include "gt/intel_lrc_reg.h"
-#include "gt/intel_workarounds.h"
 
 #include "i915_drv.h"
 #include "i915_globals.h"
@@ -141,15 +140,31 @@ static void lut_close(struct i915_gem_context *ctx)
 }
 
 static struct intel_context *
-lookup_user_engine(struct i915_gem_context *ctx, u16 class, u16 instance)
+lookup_user_engine(struct i915_gem_context *ctx,
+  unsigned long flags,
+  const struct i915_engine_class_instance *ci)
+#define LOOKUP_USER_INDEX BIT(0)
 {
-   struct intel_engine_cs *engine;
+   int idx;
 
-   engine = intel_engine_lookup_user(ctx->i915, class, instance);
-   if (!engine)
+   if (!!(flags & LOOKUP_USER_INDEX) != i915_gem_context_user_engines(ctx))
return ERR_PTR(-EINVAL);
 
-   return i915_gem_context_get_engine(ctx, engine->id);
+   if (!i915_gem_context_user_engines(ctx)) {
+   struct intel_engine_cs *engine;
+
+   engine = intel_engine_lookup_user(ctx->i915,
+ ci->engine_class,
+ ci->engine_instance);
+   if (!engine)
+   return ERR_PTR(-EINVAL);
+
+   idx = engine->id;
+   } else {
+   idx = ci->engine_instance;
+   }
+
+   return i915_gem_context_get_engine(ctx, idx);
 }
 
 static inline int new_hw_id(struct drm_i915_private *i915, gfp_t gfp)
@@ -257,6 +272,17 @@ static void free_engines(struct i915_gem_engines *e)
__free_engines(e, e->num_engines);
 }
 
+static void free_engines_rcu(struct work_struct *wrk)
+{
+   struct i915_gem_engines *e =
+   container_of(wrk, struct i915_gem_engines, rcu.work);
+   struct drm_i915_private *i915 = e->i915;
+
+   mutex_lock(&i915->drm.struct_mutex);
+   free_engines(e);
+   mutex_unlock(&i915->drm.struct_mutex);
+}
+
 static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
 {
struct intel_engine_cs *engine;
@@ -1352,9 +1378,7 @@ static int set_sseu(struct i915_gem_context *ctx,
if (user_sseu.flags || user_sseu.rsvd)
return -EINVAL;
 
-   ce = lookup_user_engine(ctx,
-   user_sseu.engine.engine_class,
-   user_sseu.engine.engine_instance);
+   ce = lookup_user_engine(ctx, 0, &user_sseu.engine);
if (IS_ERR(ce))
return PTR_ERR(ce);
 
@@ -1379,6 +1403,191 @@ static int set_sseu(struct i915_gem_context *ctx,
return ret;
 }
 
+struct set_engines {
+   struct i915_gem_context *ctx;
+   struct i915_gem_engines *engines;
+};
+
+static const i915_user_extension_fn set_engines__extensions[] = {
+};
+
+static int
+set_engines(struct i915_gem_context *ctx,
+   const struct drm_i915_gem_context_param *args)
+{
+   struct i915_context_param_engines __user *user =
+   u64_to_user

[Intel-gfx] [PATCH 39/40] drm/i915/execlists: Preempt-to-busy

2019-05-08 Thread Chris Wilson
When using a global seqno, we required a precise stop-the-world event to
handle preemption and unwind the global seqno counter. To accomplish
this, we would preempt to a special out-of-band context and wait for the
machine to report that it was idle. Given an idle machine, we could very
precisely see which requests had completed and which we needed to feed
back into the run queue.

However, now that we have scrapped the global seqno, we no longer need
to precisely unwind the global counter and only track requests by their
per-context seqno. This allows us to loosely unwind inflight requests
while scheduling a preemption, with the enormous caveat that the
requests we put back on the run queue are still _inflight_ (until the
preemption request is complete). This makes request tracking much more
messy, as at any point we may then see a completed request that we
believe is not currently scheduled for execution. We also have to be
careful not to rewind RING_TAIL past RING_HEAD on preempting to the
running context, and for this we use a semaphore to prevent completion
of the request before continuing.

To accomplish this feat, we change how we track requests scheduled to
the HW. Instead of appending our requests onto a single list as we
submit, we track each submission to ELSP as its own block. Then upon
receiving the CS preemption event, we promote the pending block to the
inflight block (discarding what was previously being tracked). As normal
CS completion events arrive, we then remove stale entries from the
inflight tracker.
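
The pending-to-inflight promotion described above can be modelled in miniature. The two-port array size and every name below are illustrative assumptions, not the real execlists structures:

```c
#include <assert.h>
#include <string.h>

/* Toy model of the described tracking: submissions to ELSP are staged
 * in a "pending" block; on the CS preemption event the pending block is
 * promoted wholesale to "inflight", discarding whatever was previously
 * being tracked there. */
#define EXECLIST_PORTS 2

struct execlists_model {
	int inflight[EXECLIST_PORTS];	/* request ids acked on HW */
	int pending[EXECLIST_PORTS];	/* staged submission, awaiting ack */
};

/* Stage a new submission; the HW has not acknowledged it yet. */
static void submit(struct execlists_model *el, const int *rq)
{
	memcpy(el->pending, rq, sizeof(el->pending));
}

/* CS preemption event: promote pending -> inflight in one step. */
static void promote(struct execlists_model *el)
{
	memcpy(el->inflight, el->pending, sizeof(el->inflight));
	memset(el->pending, 0, sizeof(el->pending));
}
```

The point of the two-block scheme is that nothing is appended to a shared list at submit time; the inflight view only ever changes at a well-defined CS event.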

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |   2 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |   5 +
 drivers/gpu/drm/i915/gt/intel_engine.h|  61 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  61 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  52 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c   | 684 --
 drivers/gpu/drm/i915/i915_gpu_error.c |  19 +-
 drivers/gpu/drm/i915/i915_request.c   |   6 +
 drivers/gpu/drm/i915/i915_request.h   |   1 +
 drivers/gpu/drm/i915/i915_scheduler.c |   3 +-
 drivers/gpu/drm/i915/i915_utils.h |  12 +
 drivers/gpu/drm/i915/intel_guc_submission.c   | 175 ++---
 drivers/gpu/drm/i915/selftests/i915_request.c |   8 +-
 13 files changed, 475 insertions(+), 614 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index aa2bd1d6ceb6..0f11c571bb04 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -642,7 +642,7 @@ static void init_contexts(struct drm_i915_private *i915)
 
 static bool needs_preempt_context(struct drm_i915_private *i915)
 {
-   return HAS_EXECLISTS(i915);
+   return USES_GUC_SUBMISSION(i915);
 }
 
 int i915_gem_contexts_init(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index e95be4be9612..b565c3ff4378 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -13,6 +13,7 @@
 #include 
 
 #include "i915_active_types.h"
+#include "i915_utils.h"
 #include "intel_engine_types.h"
 #include "intel_sseu.h"
 
@@ -38,6 +39,10 @@ struct intel_context {
struct i915_gem_context *gem_context;
struct intel_engine_cs *engine;
struct intel_engine_cs *inflight;
+#define intel_context_inflight(ce) ptr_mask_bits((ce)->inflight, 2)
+#define intel_context_inflight_count(ce)  ptr_unmask_bits((ce)->inflight, 2)
+#define intel_context_inflight_inc(ce) ptr_count_inc(&(ce)->inflight)
+#define intel_context_inflight_dec(ce) ptr_count_dec(&(ce)->inflight)
 
struct list_head signal_link;
struct list_head signals;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h 
b/drivers/gpu/drm/i915/gt/intel_engine.h
index b8f5c4e1d39e..12b19b17e9b0 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -106,71 +106,26 @@ hangcheck_action_to_str(const enum 
intel_engine_hangcheck_action a)
 
 void intel_engines_set_scheduler_caps(struct drm_i915_private *i915);
 
-static inline void
-execlists_set_active(struct intel_engine_execlists *execlists,
-unsigned int bit)
-{
-   __set_bit(bit, (unsigned long *)&execlists->active);
-}
-
-static inline bool
-execlists_set_active_once(struct intel_engine_execlists *execlists,
- unsigned int bit)
-{
-   return !__test_and_set_bit(bit, (unsigned long *)&execlists->active);
-}
-
-static inline void
-execlists_clear_active(struct intel_engine_execlists *execlists,
-  unsigned int bit)
-{
-   __clear_bit(bit, (unsigned long *)&execlists->active);
-}
-
-static inline void
-execlists_clear_all_active(struct intel_engine_execlists *execlists)
+static inline unsigned int
+execlists_num_ports(const st

[Intel-gfx] [PATCH 01/40] drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD

2019-05-08 Thread Chris Wilson
After realising we need to sample RING_START to detect context switches
from preemption events that do not allow for the seqno to advance, we
can also realise that the seqno itself is just a distance along the ring
and so can be replaced by sampling RING_HEAD.
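
The progress test this enables can be sketched as follows; the struct and function names are illustrative, not the patch's actual symbols:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The commit's observation in miniature: a per-context seqno is just a
 * distance along the ring, so hangcheck can instead compare successive
 * samples of RING_HEAD, with RING_START sampled to spot context
 * switches that do not advance the seqno. */
struct hc_sample {
	uint32_t ring_start;
	uint32_t ring_head;
};

static bool engine_made_progress(const struct hc_sample *prev,
				 const struct hc_sample *cur)
{
	/* A context switch moves us onto a different ring; that counts
	 * as progress even if the head value happens to match. */
	if (cur->ring_start != prev->ring_start)
		return true;
	return cur->ring_head != prev->ring_head;
}
```

Two identical consecutive samples indicate a stuck engine; any movement of either register resets the hang countdown.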

Signed-off-by: Chris Wilson 
Cc: Mika Kuoppala 
---
 drivers/gpu/drm/i915/gt/intel_engine.h   | 15 -
 drivers/gpu/drm/i915/gt/intel_engine_cs.c|  5 ++-
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  3 +-
 drivers/gpu/drm/i915/gt/intel_hangcheck.c|  8 ++---
 drivers/gpu/drm/i915/gt/intel_lrc.c  | 19 +++-
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c   | 32 ++--
 drivers/gpu/drm/i915/i915_debugfs.c  | 12 ++--
 7 files changed, 17 insertions(+), 77 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h 
b/drivers/gpu/drm/i915/gt/intel_engine.h
index 06d785533502..9359b3a7ad9c 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -215,8 +215,6 @@ intel_write_status_page(struct intel_engine_cs *engine, int 
reg, u32 value)
  */
 #define I915_GEM_HWS_PREEMPT   0x32
 #define I915_GEM_HWS_PREEMPT_ADDR  (I915_GEM_HWS_PREEMPT * sizeof(u32))
-#define I915_GEM_HWS_HANGCHECK 0x34
-#define I915_GEM_HWS_HANGCHECK_ADDR(I915_GEM_HWS_HANGCHECK * sizeof(u32))
 #define I915_GEM_HWS_SEQNO 0x40
 #define I915_GEM_HWS_SEQNO_ADDR(I915_GEM_HWS_SEQNO * 
sizeof(u32))
 #define I915_GEM_HWS_SCRATCH   0x80
@@ -548,17 +546,4 @@ static inline bool inject_preempt_hang(struct 
intel_engine_execlists *execlists)
 
 #endif
 
-static inline u32
-intel_engine_next_hangcheck_seqno(struct intel_engine_cs *engine)
-{
-   return engine->hangcheck.next_seqno =
-   next_pseudo_random32(engine->hangcheck.next_seqno);
-}
-
-static inline u32
-intel_engine_get_hangcheck_seqno(struct intel_engine_cs *engine)
-{
-   return intel_read_status_page(engine, I915_GEM_HWS_HANGCHECK);
-}
-
 #endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 416d7e2e6f8c..4c3753c1b573 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -721,6 +721,7 @@ static int measure_breadcrumb_dw(struct intel_engine_cs 
*engine)
goto out_timeline;
 
dw = engine->emit_fini_breadcrumb(&frame->rq, frame->cs) - frame->cs;
+   GEM_BUG_ON(dw & 1); /* RING_TAIL must be qword aligned */
 
i915_timeline_unpin(&frame->timeline);
 
@@ -1444,9 +1445,7 @@ void intel_engine_dump(struct intel_engine_cs *engine,
drm_printf(m, "*** WEDGED ***\n");
 
drm_printf(m, "\tAwake? %d\n", atomic_read(&engine->wakeref.count));
-   drm_printf(m, "\tHangcheck %x:%x [%d ms]\n",
-  engine->hangcheck.last_seqno,
-  engine->hangcheck.next_seqno,
+   drm_printf(m, "\tHangcheck: %d ms ago\n",
   jiffies_to_msecs(jiffies - 
engine->hangcheck.action_timestamp));
drm_printf(m, "\tReset count: %d (global %d)\n",
   i915_reset_engine_count(error, engine),
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index c0ab11b12e14..e381c1c73902 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -54,8 +54,7 @@ struct intel_instdone {
 struct intel_engine_hangcheck {
u64 acthd;
u32 last_ring;
-   u32 last_seqno;
-   u32 next_seqno;
+   u32 last_head;
unsigned long action_timestamp;
struct intel_instdone instdone;
 };
diff --git a/drivers/gpu/drm/i915/gt/intel_hangcheck.c 
b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
index 721ab74a382f..3a4d09b80fa0 100644
--- a/drivers/gpu/drm/i915/gt/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/intel_hangcheck.c
@@ -28,7 +28,7 @@
 struct hangcheck {
u64 acthd;
u32 ring;
-   u32 seqno;
+   u32 head;
enum intel_engine_hangcheck_action action;
unsigned long action_timestamp;
int deadlock;
@@ -134,16 +134,16 @@ static void hangcheck_load_sample(struct intel_engine_cs 
*engine,
  struct hangcheck *hc)
 {
hc->acthd = intel_engine_get_active_head(engine);
-   hc->seqno = intel_engine_get_hangcheck_seqno(engine);
hc->ring = ENGINE_READ(engine, RING_START);
+   hc->head = ENGINE_READ(engine, RING_HEAD);
 }
 
 static void hangcheck_store_sample(struct intel_engine_cs *engine,
   const struct hangcheck *hc)
 {
engine->hangcheck.acthd = hc->acthd;
-   engine->hangcheck.last_seqno = hc->seqno;
engine->hangcheck.last_ring = hc->ring;
+   engine->hangcheck.last_head = hc->head;
 }
 
 static enum intel_engine_hangcheck_action
@@ -156,7 +156,7 @@ hangcheck_get_action(struct 

[Intel-gfx] [PATCH 03/40] drm/i915: Pass i915_sched_node around internally

2019-05-08 Thread Chris Wilson
To simplify the next patch, update bump_priority and schedule to accept
the internal i915_sched_node directly and not expect a request pointer.

add/remove: 0/0 grow/shrink: 2/1 up/down: 8/-15 (-7)
Function old new   delta
i915_schedule_bump_priority  109 113  +4
i915_schedule 50  54  +4
__i915_schedule  922 907 -15

v2: Adopt node for the old rq local, since it no longer is a request but
the origin node.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_scheduler.c | 36 ++-
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c 
b/drivers/gpu/drm/i915/i915_scheduler.c
index b7488c31e3e9..f32d0ee6d58c 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -186,7 +186,7 @@ static void kick_submission(struct intel_engine_cs *engine, 
int prio)
tasklet_hi_schedule(&engine->execlists.tasklet);
 }
 
-static void __i915_schedule(struct i915_request *rq,
+static void __i915_schedule(struct i915_sched_node *node,
const struct i915_sched_attr *attr)
 {
struct intel_engine_cs *engine;
@@ -200,13 +200,13 @@ static void __i915_schedule(struct i915_request *rq,
lockdep_assert_held(&schedule_lock);
GEM_BUG_ON(prio == I915_PRIORITY_INVALID);
 
-   if (i915_request_completed(rq))
+   if (node_signaled(node))
return;
 
-   if (prio <= READ_ONCE(rq->sched.attr.priority))
+   if (prio <= READ_ONCE(node->attr.priority))
return;
 
-   stack.signaler = &rq->sched;
+   stack.signaler = node;
list_add(&stack.dfs_link, &dfs);
 
/*
@@ -257,9 +257,9 @@ static void __i915_schedule(struct i915_request *rq,
 * execlists_submit_request()), we can set our own priority and skip
 * acquiring the engine locks.
 */
-   if (rq->sched.attr.priority == I915_PRIORITY_INVALID) {
-   GEM_BUG_ON(!list_empty(&rq->sched.link));
-   rq->sched.attr = *attr;
+   if (node->attr.priority == I915_PRIORITY_INVALID) {
+   GEM_BUG_ON(!list_empty(&node->link));
+   node->attr = *attr;
 
if (stack.dfs_link.next == stack.dfs_link.prev)
return;
@@ -268,15 +268,14 @@ static void __i915_schedule(struct i915_request *rq,
}
 
memset(&cache, 0, sizeof(cache));
-   engine = rq->engine;
+   engine = node_to_request(node)->engine;
spin_lock(&engine->timeline.lock);
 
/* Fifo and depth-first replacement ensure our deps execute before us */
list_for_each_entry_safe_reverse(dep, p, &dfs, dfs_link) {
-   struct i915_sched_node *node = dep->signaler;
-
INIT_LIST_HEAD(&dep->dfs_link);
 
+   node = dep->signaler;
engine = sched_lock_engine(node, engine, &cache);
lockdep_assert_held(&engine->timeline.lock);
 
@@ -319,13 +318,20 @@ static void __i915_schedule(struct i915_request *rq,
 void i915_schedule(struct i915_request *rq, const struct i915_sched_attr *attr)
 {
spin_lock_irq(&schedule_lock);
-   __i915_schedule(rq, attr);
+   __i915_schedule(&rq->sched, attr);
spin_unlock_irq(&schedule_lock);
 }
 
+static void __bump_priority(struct i915_sched_node *node, unsigned int bump)
+{
+   struct i915_sched_attr attr = node->attr;
+
+   attr.priority |= bump;
+   __i915_schedule(node, &attr);
+}
+
 void i915_schedule_bump_priority(struct i915_request *rq, unsigned int bump)
 {
-   struct i915_sched_attr attr;
unsigned long flags;
 
GEM_BUG_ON(bump & ~I915_PRIORITY_MASK);
@@ -334,11 +340,7 @@ void i915_schedule_bump_priority(struct i915_request *rq, 
unsigned int bump)
return;
 
spin_lock_irqsave(&schedule_lock, flags);
-
-   attr = rq->sched.attr;
-   attr.priority |= bump;
-   __i915_schedule(rq, &attr);
-
+   __bump_priority(&rq->sched, bump);
spin_unlock_irqrestore(&schedule_lock, flags);
 }
 
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] RFC: x86/smp: use printk_deferred in native_smp_send_reschedule

2019-05-08 Thread Daniel Vetter
On Wed, May 8, 2019 at 9:53 AM Sergey Senozhatsky
 wrote:
>
> On (05/08/19 16:44), Sergey Senozhatsky wrote:
> > [..]
> > >  static void native_smp_send_reschedule(int cpu)
> > >  {
> > > if (unlikely(cpu_is_offline(cpu))) {
> > > -   WARN(1, "sched: Unexpected reschedule of offline CPU#%d!\n", 
> > > cpu);
> > > +   printk_deferred(KERN_WARNING
> > > +   "sched: Unexpected reschedule of offline 
> > > CPU#%d!\n", cpu);
> > > return;
> > > }
> > > apic->send_IPI(cpu, RESCHEDULE_VECTOR);
> >
> > Hmm,
> > One thing to notice here is that the CPU in question is offline-ed,
> > and printk_deferred() is a per-CPU type of deferred printk(). So the
> > following thing
> >
> >   __this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
> >   irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
> >
> > might not print anything at all. In this particular case we always
> > need another CPU to do console_unlock(), since this_cpu() is not
> > really expected to do wake_up_klogd_work_func()->console_unlock().
>
> D'oh... It's remote CPU which is offline, not this_cpu().
> Sorry, my bad!
>
> Any printk-related patch in this area will make PeterZ really-really
> angry :)
>
> printk_deferred(), just like prinkt_safe(), depends on IRQ work;
> printk_safe(), however, can redirect multiple lines, unlike
> printk_deferred(). So if you want to keep the backtrace, you may
> do something like
>
> if (unlikely(cpu_is_offline(cpu))) {
> printk_safe_enter(...);
> WARN(1, "sched: Unexpected reschedule of offline CPU#%d!\n",
>  cpu);
> printk_safe_exit(...);
> return;
> }
>
> I think, in this case John's reworked-printk can do better than
> printk_safe/printk_deferred.

[coffee slowly kicking in it seems]

Looking at __up_console_sem in printk.c, we already do this. I get a
bit a feeling that the 2nd attempt in this saga (pulling the
wake_up_process out from under semaphore.lock spinlock of the
console_lock) is all we really need, since the more direct recursion
that Petr pointed out is already handled with printk_safe_enter/exit
around the up().

https://patchwork.kernel.org/patch/10930673/ for reference that
approach, in case it's lost in your inbox.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

[Intel-gfx] [PATCH 24/40] drm/i915: Move mmap and friends to its own file

2019-05-08 Thread Chris Wilson
Continuing the decluttering of i915_gem.c, now the turn of do_mmap and
the fault handlers.

Signed-off-by: Chris Wilson 
Reviewed-by: Matthew Auld 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_mman.c  | 505 
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  56 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.h|   7 +
 .../drm/i915/gem/selftests/i915_gem_mman.c| 503 
 drivers/gpu/drm/i915/i915_drv.h   |   1 -
 drivers/gpu/drm/i915/i915_gem.c   | 561 +-
 drivers/gpu/drm/i915/i915_gem_tiling.c|   2 +-
 .../gpu/drm/i915/selftests/i915_gem_object.c  | 487 ---
 .../drm/i915/selftests/i915_live_selftests.h  |   1 +
 10 files changed, 1088 insertions(+), 1036 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_mman.c
 create mode 100644 drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index ba3b82f3cd49..d05757c52492 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -88,6 +88,7 @@ i915-y += $(gt-y)
 obj-y += gem/
 gem-y += \
gem/i915_gem_object.o \
+   gem/i915_gem_mman.o \
gem/i915_gem_pages.o \
gem/i915_gem_phys.o \
gem/i915_gem_shmem.o
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c 
b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
new file mode 100644
index ..1bcc6e1091e9
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -0,0 +1,505 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2014-2016 Intel Corporation
+ */
+
+#include <linux/mman.h>
+#include <linux/sizes.h>
+
+#include "i915_gem_ioctls.h"
+#include "i915_gem_object.h"
+
+#include "../i915_gem_gtt.h"
+#include "../i915_vma.h"
+#include "../i915_drv.h"
+#include "../intel_drv.h"
+
+static inline bool
+__vma_matches(struct vm_area_struct *vma, struct file *filp,
+ unsigned long addr, unsigned long size)
+{
+   if (vma->vm_file != filp)
+   return false;
+
+   return vma->vm_start == addr &&
+  (vma->vm_end - vma->vm_start) == PAGE_ALIGN(size);
+}
+
+/**
+ * i915_gem_mmap_ioctl - Maps the contents of an object, returning the address
+ *  it is mapped to.
+ * @dev: drm device
+ * @data: ioctl data blob
+ * @file: drm file
+ *
+ * While the mapping holds a reference on the contents of the object, it 
doesn't
+ * imply a ref on the object itself.
+ *
+ * IMPORTANT:
+ *
+ * DRM driver writers who look at this function as an example for how to do GEM
+ * mmap support, please don't implement mmap support like here. The modern way
+ * to implement DRM mmap support is with an mmap offset ioctl (like
+ * i915_gem_mmap_gtt) and then using the mmap syscall on the DRM fd directly.
+ * That way debug tooling like valgrind will understand what's going on, hiding
+ * the mmap call in a driver private ioctl will break that. The i915 driver 
only
+ * does cpu mmaps this way because we didn't know better.
+ */
+int
+i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *file)
+{
+   struct drm_i915_gem_mmap *args = data;
+   struct drm_i915_gem_object *obj;
+   unsigned long addr;
+
+   if (args->flags & ~(I915_MMAP_WC))
+   return -EINVAL;
+
+   if (args->flags & I915_MMAP_WC && !boot_cpu_has(X86_FEATURE_PAT))
+   return -ENODEV;
+
+   obj = i915_gem_object_lookup(file, args->handle);
+   if (!obj)
+   return -ENOENT;
+
+   /* prime objects have no backing filp to GEM mmap
+* pages from.
+*/
+   if (!obj->base.filp) {
+   addr = -ENXIO;
+   goto err;
+   }
+
+   if (range_overflows(args->offset, args->size, (u64)obj->base.size)) {
+   addr = -EINVAL;
+   goto err;
+   }
+
+   addr = vm_mmap(obj->base.filp, 0, args->size,
+  PROT_READ | PROT_WRITE, MAP_SHARED,
+  args->offset);
+   if (IS_ERR_VALUE(addr))
+   goto err;
+
+   if (args->flags & I915_MMAP_WC) {
+   struct mm_struct *mm = current->mm;
+   struct vm_area_struct *vma;
+
+   if (down_write_killable(&mm->mmap_sem)) {
+   addr = -EINTR;
+   goto err;
+   }
+   vma = find_vma(mm, addr);
+   if (vma && __vma_matches(vma, obj->base.filp, addr, args->size))
+   vma->vm_page_prot =
+   
pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
+   else
+   addr = -ENOMEM;
+   up_write(&mm->mmap_sem);
+   if (IS_ERR_VALUE(addr))
+   goto err;
+
+   /* This may race, but that's ok, it only gets set */
+   WRITE_ONCE(obj->frontbuffer_ggtt_origin, ORIGIN_CP

[Intel-gfx] [PATCH 19/40] drm/i915: Split GEM object type definition to its own header

2019-05-08 Thread Chris Wilson
For convenience in avoiding inline spaghetti, keep the type definition
as a separate header.

Signed-off-by: Chris Wilson 
Reviewed-by: Matthew Auld 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/gem/Makefile |   1 +
 drivers/gpu/drm/i915/gem/Makefile.header-test |  16 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  | 285 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |   1 +
 drivers/gpu/drm/i915/i915_drv.h   |   3 +-
 drivers/gpu/drm/i915/i915_gem_batch_pool.h|   3 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h   |   1 +
 drivers/gpu/drm/i915/i915_gem_object.h| 295 +-
 9 files changed, 312 insertions(+), 294 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/Makefile
 create mode 100644 drivers/gpu/drm/i915/gem/Makefile.header-test
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_object_types.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 68106fe35a04..96344c9a0726 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -85,6 +85,7 @@ gt-$(CONFIG_DRM_I915_SELFTEST) += \
 i915-y += $(gt-y)
 
 # GEM (Graphics Execution Management) code
+obj-y += gem/
 i915-y += \
  i915_active.o \
  i915_cmd_parser.o \
diff --git a/drivers/gpu/drm/i915/gem/Makefile 
b/drivers/gpu/drm/i915/gem/Makefile
new file mode 100644
index ..07e7b8b840ea
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/Makefile
@@ -0,0 +1 @@
+include $(src)/Makefile.header-test # Extra header tests
diff --git a/drivers/gpu/drm/i915/gem/Makefile.header-test 
b/drivers/gpu/drm/i915/gem/Makefile.header-test
new file mode 100644
index ..61e06cbb4b32
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/Makefile.header-test
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: MIT
+# Copyright © 2019 Intel Corporation
+
+# Test the headers are compilable as standalone units
+header_test := $(notdir $(wildcard $(src)/*.h))
+
+quiet_cmd_header_test = HDRTEST $@
  cmd_header_test = echo "\#include \"$(<F)\"" > $@
+
+header_test_%.c: %.h
+   $(call cmd,header_test)
+
+extra-$(CONFIG_DRM_I915_WERROR) += \
+   $(foreach h,$(header_test),$(patsubst %.h,header_test_%.o,$(h)))
+
+clean-files += $(foreach h,$(header_test),$(patsubst %.h,header_test_%.c,$(h)))
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
new file mode 100644
index ..e4b50944f553
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -0,0 +1,285 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2016 Intel Corporation
+ */
+
+#ifndef __I915_GEM_OBJECT_TYPES_H__
+#define __I915_GEM_OBJECT_TYPES_H__
+
+#include 
+
+#include 
+
+#include "../i915_active.h"
+#include "../i915_selftest.h"
+
+struct drm_i915_gem_object;
+
+/*
+ * struct i915_lut_handle tracks the fast lookups from handle to vma used
+ * for execbuf. Although we use a radixtree for that mapping, in order to
+ * remove them as the object or context is closed, we need a secondary list
+ * and a translation entry (i915_lut_handle).
+ */
+struct i915_lut_handle {
+   struct list_head obj_link;
+   struct list_head ctx_link;
+   struct i915_gem_context *ctx;
+   u32 handle;
+};
+
+struct drm_i915_gem_object_ops {
+   unsigned int flags;
+#define I915_GEM_OBJECT_HAS_STRUCT_PAGEBIT(0)
+#define I915_GEM_OBJECT_IS_SHRINKABLE  BIT(1)
+#define I915_GEM_OBJECT_IS_PROXY   BIT(2)
+#define I915_GEM_OBJECT_ASYNC_CANCEL   BIT(3)
+
+   /* Interface between the GEM object and its backing storage.
+* get_pages() is called once prior to the use of the associated set
+* of pages before to binding them into the GTT, and put_pages() is
+* called after we no longer need them. As we expect there to be
+* associated cost with migrating pages between the backing storage
+* and making them available for the GPU (e.g. clflush), we may hold
+* onto the pages after they are no longer referenced by the GPU
+* in case they may be used again shortly (for example migrating the
+* pages to a different memory domain within the GTT). put_pages()
+* will therefore most likely be called when the object itself is
+* being released or under memory pressure (where we attempt to
+* reap pages for the shrinker).
+*/
+   int (*get_pages)(struct drm_i915_gem_object *obj);
+   void (*put_pages)(struct drm_i915_gem_object *obj,
+ struct sg_table *pages);
+
+   int (*pwrite)(struct drm_i915_gem_object *obj,
+ const struct drm_i915_gem_pwrite *arg);
+
+   int (*dmabuf_export)(struct drm_i915_gem_object *obj);
+   void (*release)(struct drm_i915_gem_object *obj);
+};
+
+struct drm_i915_gem_object {
+   struct drm_gem_object base;
+
+   const struct drm_i915_gem_object_ops *ops;
+
+  

[Intel-gfx] [PATCH 25/40] drm/i915: Move GEM domain management to its own file

2019-05-08 Thread Chris Wilson
Continuing the decluttering of i915_gem.c, that of the read/write
domains, perhaps the biggest of GEM's follies?

Signed-off-by: Chris Wilson 
Reviewed-by: Matthew Auld 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_domain.c| 784 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  29 +
 drivers/gpu/drm/i915/gvt/cmd_parser.c |   4 +-
 drivers/gpu/drm/i915/gvt/scheduler.c  |   6 +-
 drivers/gpu/drm/i915/i915_cmd_parser.c|   8 +-
 drivers/gpu/drm/i915/i915_drv.h   |  26 -
 drivers/gpu/drm/i915/i915_gem.c   | 777 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c|   4 +-
 drivers/gpu/drm/i915/i915_gem_render_state.c  |   4 +-
 drivers/gpu/drm/i915/selftests/huge_pages.c   |   4 +-
 .../drm/i915/selftests/i915_gem_coherency.c   |   8 +-
 .../gpu/drm/i915/selftests/i915_gem_context.c |   8 +-
 13 files changed, 841 insertions(+), 822 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_domain.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index d05757c52492..29fc924ba819 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -87,6 +87,7 @@ i915-y += $(gt-y)
 # GEM (Graphics Execution Management) code
 obj-y += gem/
 gem-y += \
+   gem/i915_gem_domain.o \
gem/i915_gem_object.o \
gem/i915_gem_mman.o \
gem/i915_gem_pages.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c 
b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
new file mode 100644
index ..eee421e3021c
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -0,0 +1,784 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2014-2016 Intel Corporation
+ */
+
+#include "i915_gem_ioctls.h"
+#include "i915_gem_object.h"
+
+#include "../i915_drv.h"
+#include "../i915_gem_clflush.h"
+#include "../i915_gem_gtt.h"
+#include "../i915_vma.h"
+
+#include "../intel_frontbuffer.h"
+
+static void __i915_gem_object_flush_for_display(struct drm_i915_gem_object 
*obj)
+{
+   /*
+* We manually flush the CPU domain so that we can override and
+* force the flush for the display, and perform it asynchronously.
+*/
+   i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+   if (obj->cache_dirty)
+   i915_gem_clflush_object(obj, I915_CLFLUSH_FORCE);
+   obj->write_domain = 0;
+}
+
+void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj)
+{
+   if (!READ_ONCE(obj->pin_global))
+   return;
+
+   mutex_lock(&obj->base.dev->struct_mutex);
+   __i915_gem_object_flush_for_display(obj);
+   mutex_unlock(&obj->base.dev->struct_mutex);
+}
+
+/**
+ * Moves a single object to the WC read, and possibly write domain.
+ * @obj: object to act on
+ * @write: ask for write access or read only
+ *
+ * This function returns when the move is complete, including waiting on
+ * flushes to occur.
+ */
+int
+i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write)
+{
+   int ret;
+
+   lockdep_assert_held(&obj->base.dev->struct_mutex);
+
+   ret = i915_gem_object_wait(obj,
+  I915_WAIT_INTERRUPTIBLE |
+  I915_WAIT_LOCKED |
+  (write ? I915_WAIT_ALL : 0),
+  MAX_SCHEDULE_TIMEOUT);
+   if (ret)
+   return ret;
+
+   if (obj->write_domain == I915_GEM_DOMAIN_WC)
+   return 0;
+
+   /* Flush and acquire obj->pages so that we are coherent through
+* direct access in memory with previous cached writes through
+* shmemfs and that our cache domain tracking remains valid.
+* For example, if the obj->filp was moved to swap without us
+* being notified and releasing the pages, we would mistakenly
+* continue to assume that the obj remained out of the CPU cached
+* domain.
+*/
+   ret = i915_gem_object_pin_pages(obj);
+   if (ret)
+   return ret;
+
+   i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_WC);
+
+   /* Serialise direct access to this object with the barriers for
+* coherent writes from the GPU, by effectively invalidating the
+* WC domain upon first access.
+*/
+   if ((obj->read_domains & I915_GEM_DOMAIN_WC) == 0)
+   mb();
+
+   /* It should now be out of any other write domains, and we can update
+* the domain values for our changes.
+*/
+   GEM_BUG_ON((obj->write_domain & ~I915_GEM_DOMAIN_WC) != 0);
+   obj->read_domains |= I915_GEM_DOMAIN_WC;
+   if (write) {
+   obj->read_domains = I915_GEM_DOMAIN_WC;
+   obj->write_domain = I915_GEM_DOMAIN_WC;
+   obj->mm.dirty = true;
+   }
+
+   i915_gem_object_unpin_pages(obj);
+   return 0;
+}
+

[Intel-gfx] [PATCH 07/40] drm/i915: Seal races between async GPU cancellation, retirement and signaling

2019-05-08 Thread Chris Wilson
Currently there is an underlying assumption that i915_request_unsubmit()
is synchronous wrt the GPU -- that is the request is no longer in flight
as we remove it. In the near future that may change, and this may upset
our signaling as we can process an interrupt for that request while it
is no longer in flight.

	CPU0				CPU1
	intel_engine_breadcrumbs_irq
	(queue request completion)
					i915_request_cancel_signaling
	...				...
					i915_request_enable_signaling
	dma_fence_signal

Hence in the time it took us to drop the lock to signal the request, a
preemption event may have occurred and re-queued the request. In the
process, that request would have seen I915_FENCE_FLAG_SIGNAL clear and
so reused the rq->signal_link that was in use on CPU0, leading to bad
pointer chasing in intel_engine_breadcrumbs_irq.

A related issue was that if someone started listening for a signal on a
completed but no longer in-flight request, we missed the opportunity to
immediately signal that request.

Furthermore, as intel_contexts may be immediately released during
request retirement, in order to be entirely sure that
intel_engine_breadcrumbs_irq may no longer dereference the intel_context
(ce->signals and ce->signal_link), we must wait for the irq spinlock.

In order to prevent the race, we use a bit in the fence.flags to signal
the transfer onto the signal list inside intel_engine_breadcrumbs_irq.
For simplicity, we use the DMA_FENCE_FLAG_SIGNALED_BIT as it then
quickly signals to any outside observer that the fence is indeed signaled.
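The bit-claiming trick described above can be sketched in isolation. This is a minimal, hypothetical illustration (the names `claim_signal`, `FLAG_SIGNALED` and the `atomic_ulong` flags word are stand-ins, not the i915 code): whichever path wins the atomic test-and-set owns the transfer onto the signal list, and every racing path backs off.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define FLAG_SIGNALED 0

/* Atomically set bit 'nr' in *flags; return the bit's previous value. */
static bool test_and_set_bit(int nr, atomic_ulong *flags)
{
	unsigned long mask = 1UL << nr;

	return atomic_fetch_or(flags, mask) & mask;
}

/*
 * Only the first caller to flip FLAG_SIGNALED proceeds to notify;
 * a concurrent caller (irq handler vs. retirement) sees the bit
 * already set and bails, so the signal list is consumed exactly once.
 */
static bool claim_signal(atomic_ulong *flags)
{
	return !test_and_set_bit(FLAG_SIGNALED, flags);
}
```

The first call on a fresh flags word returns true and every later call returns false, which is the property the patch relies on to serialise the irq path against direct invocation of dma_fence_signal().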

v2: Sketch out potential dma-fence API for manual signaling

Fixes: 52c0fdb25c7c ("drm/i915: Replace global breadcrumbs with per-context 
interrupt tracking")
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/dma-buf/dma-fence.c |  1 +
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 70 ++---
 drivers/gpu/drm/i915/i915_request.c |  1 +
 drivers/gpu/drm/i915/intel_guc_submission.c |  1 -
 4 files changed, 51 insertions(+), 22 deletions(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 3aa8733f832a..9bf06042619a 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -29,6 +29,7 @@
 
 EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit);
 EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
+EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
 
 static DEFINE_SPINLOCK(dma_fence_stub_lock);
 static struct dma_fence dma_fence_stub;
diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c 
b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index fe455f01aa65..7053a90e5cb5 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -23,6 +23,7 @@
  */
 
 #include 
+#include 
 #include 
 
 #include "i915_drv.h"
@@ -96,9 +97,30 @@ check_signal_order(struct intel_context *ce, struct 
i915_request *rq)
return true;
 }
 
+static void
+__dma_fence_signal_timestamp(struct dma_fence *fence, ktime_t timestamp)
+{
+   fence->timestamp = timestamp;
+   set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &fence->flags);
+   trace_dma_fence_signaled(fence);
+}
+
+static void
+__dma_fence_signal_notify(struct dma_fence *fence)
+{
+   struct dma_fence_cb *cur, *tmp;
+
+   list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) {
+   INIT_LIST_HEAD(&cur->node);
+   cur->func(fence, cur);
+   }
+   INIT_LIST_HEAD(&fence->cb_list);
+}
+
 void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
 {
struct intel_breadcrumbs *b = &engine->breadcrumbs;
+   const ktime_t timestamp = ktime_get();
struct intel_context *ce, *cn;
struct list_head *pos, *next;
LIST_HEAD(signal);
@@ -122,6 +144,11 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs 
*engine)
 
GEM_BUG_ON(!test_bit(I915_FENCE_FLAG_SIGNAL,
 &rq->fence.flags));
+   clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags);
+
+   if (test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
+&rq->fence.flags))
+   continue;
 
/*
 * Queue for execution after dropping the signaling
@@ -129,14 +156,6 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs 
*engine)
 * more signalers to the same context or engine.
 */
i915_request_get(rq);
-
-   /*
-* We may race with direct invocation of
-* dma_fence_signal(), e.g. i915_request_retire(),
-* so we need to acquire our reference to the request
-* before we cancel the breadcrumb

[Intel-gfx] [PATCH 04/40] drm/i915: Check for no-op priority changes first

2019-05-08 Thread Chris Wilson
In all likelihood, the priority and node are already in the CPU cache
and by checking them first, we can avoid having to chase the
*request->hwsp for the current breadcrumb.
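The reordering can be shown with a toy model (hypothetical names; `node_signaled` here stands in for the expensive `*request->hwsp` chase): test the likely-cached priority first, and only fall back to the cache-missing read when the cheap check does not already prove the update is a no-op.

```c
#include <assert.h>
#include <stdbool.h>

struct sched_node {
	int priority;
	bool signaled;	/* stand-in for the breadcrumb in *request->hwsp */
};

static int hwsp_reads;	/* counts the expensive lookups, for illustration */

static bool node_signaled(const struct sched_node *node)
{
	hwsp_reads++;	/* models the cache-missing pointer chase */
	return node->signaled;
}

/* Cheap, likely-cached check first; only then touch the hwsp. */
static bool schedule_is_noop(const struct sched_node *node, int prio)
{
	if (prio <= node->priority)
		return true;

	return node_signaled(node);
}
```

With the no-op priority check first, a request that already holds an equal or higher priority never pays for the breadcrumb read at all.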

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_scheduler.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c 
b/drivers/gpu/drm/i915/i915_scheduler.c
index f32d0ee6d58c..5581c5004ff0 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -200,10 +200,10 @@ static void __i915_schedule(struct i915_sched_node *node,
lockdep_assert_held(&schedule_lock);
GEM_BUG_ON(prio == I915_PRIORITY_INVALID);
 
-   if (node_signaled(node))
+   if (prio <= READ_ONCE(node->attr.priority))
return;
 
-   if (prio <= READ_ONCE(node->attr.priority))
+   if (node_signaled(node))
return;
 
stack.signaler = node;
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH 28/40] drm/i915: Move GEM object domain management from struct_mutex to local

2019-05-08 Thread Chris Wilson
Use the per-object local lock to control the cache domain of the
individual GEM objects, not struct_mutex. This is a huge leap forward
for us in terms of object-level synchronisation; execbuffers are
coordinated using the ww_mutex and pread/pwrite is finally fully
serialised again.
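The shift from one global mutex to a lock embedded in each object can be sketched as follows. This is an illustrative model only (the spinning `atomic_flag` stands in for the sleeping reservation ww_mutex, and `set_to_cpu_domain` is a simplified caricature of the domain update): because each object carries its own lock, threads operating on different objects never contend.

```c
#include <assert.h>
#include <stdatomic.h>

#define DOMAIN_CPU (1u << 0)

struct gem_object {
	atomic_flag resv;	/* stand-in for the per-object ww_mutex */
	unsigned int read_domains;
	unsigned int write_domain;
};

static void obj_lock(struct gem_object *obj)
{
	while (atomic_flag_test_and_set(&obj->resv))
		;		/* spin; the real lock sleeps instead */
}

static void obj_unlock(struct gem_object *obj)
{
	atomic_flag_clear(&obj->resv);
}

/* Domain change guarded by the object's own lock, not a global one. */
static void set_to_cpu_domain(struct gem_object *obj, int write)
{
	obj_lock(obj);
	obj->read_domains |= DOMAIN_CPU;
	if (write) {
		obj->read_domains = DOMAIN_CPU;
		obj->write_domain = DOMAIN_CPU;
	}
	obj_unlock(obj);
}
```

This mirrors the shape of the conversion in the diff below: `i915_mutex_lock_interruptible(dev)` becomes `i915_gem_object_lock_interruptible(obj)`, with the critical section otherwise unchanged.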

Signed-off-by: Chris Wilson 
Reviewed-by: Matthew Auld 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c   |   4 +-
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|  10 +-
 drivers/gpu/drm/i915/gem/i915_gem_domain.c|  70 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 123 --
 drivers/gpu/drm/i915/gem/i915_gem_fence.c |  97 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.c|   2 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  14 ++
 drivers/gpu/drm/i915/gem/i915_gem_pm.c|   7 +-
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  12 +-
 .../i915/gem/selftests/i915_gem_coherency.c   |  12 ++
 .../drm/i915/gem/selftests/i915_gem_context.c |  20 +++
 .../drm/i915/gem/selftests/i915_gem_mman.c|   6 +
 .../drm/i915/gem/selftests/i915_gem_phys.c|   4 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |   4 +
 drivers/gpu/drm/i915/gt/selftest_lrc.c|   2 +
 .../gpu/drm/i915/gt/selftest_workarounds.c|   6 +
 drivers/gpu/drm/i915/gvt/cmd_parser.c |   2 +
 drivers/gpu/drm/i915/gvt/scheduler.c  |   8 +-
 drivers/gpu/drm/i915/i915_cmd_parser.c|  23 ++--
 drivers/gpu/drm/i915/i915_gem.c   | 122 +
 drivers/gpu/drm/i915/i915_gem_gtt.c   |   5 +-
 drivers/gpu/drm/i915/i915_gem_render_state.c  |   2 +
 drivers/gpu/drm/i915/i915_vma.c   |   8 +-
 drivers/gpu/drm/i915/i915_vma.h   |  12 ++
 drivers/gpu/drm/i915/intel_display.c  |   5 +
 drivers/gpu/drm/i915/intel_guc_log.c  |   6 +-
 drivers/gpu/drm/i915/intel_overlay.c  |  25 ++--
 drivers/gpu/drm/i915/intel_uc_fw.c|   6 +-
 drivers/gpu/drm/i915/selftests/i915_request.c |   4 +
 drivers/gpu/drm/i915/selftests/igt_spinner.c  |   2 +
 31 files changed, 444 insertions(+), 180 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_fence.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 9f5f4acacae5..e5348c355987 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -93,6 +93,7 @@ gem-y += \
gem/i915_gem_dmabuf.o \
gem/i915_gem_domain.o \
gem/i915_gem_execbuffer.o \
+   gem/i915_gem_fence.o \
gem/i915_gem_internal.o \
gem/i915_gem_object.o \
gem/i915_gem_mman.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c 
b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
index 093bfff55a96..efab47250588 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
@@ -96,6 +96,8 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
 {
struct clflush *clflush;
 
+   assert_object_held(obj);
+
/*
 * Stolen memory is always coherent with the GPU as it is explicitly
 * marked as wc by the system, or the system is cache-coherent.
@@ -145,9 +147,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object 
*obj,
true, I915_FENCE_TIMEOUT,
I915_FENCE_GFP);
 
-   reservation_object_lock(obj->resv, NULL);
reservation_object_add_excl_fence(obj->resv, &clflush->dma);
-   reservation_object_unlock(obj->resv);
 
i915_sw_fence_commit(&clflush->wait);
} else if (obj->mm.pages) {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index 4e7efe159531..50981ea513f0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -152,7 +152,6 @@ static int i915_gem_dmabuf_mmap(struct dma_buf *dma_buf, 
struct vm_area_struct *
 static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum 
dma_data_direction direction)
 {
struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
-   struct drm_device *dev = obj->base.dev;
bool write = (direction == DMA_BIDIRECTIONAL || direction == 
DMA_TO_DEVICE);
int err;
 
@@ -160,12 +159,12 @@ static int i915_gem_begin_cpu_access(struct dma_buf 
*dma_buf, enum dma_data_dire
if (err)
return err;
 
-   err = i915_mutex_lock_interruptible(dev);
+   err = i915_gem_object_lock_interruptible(obj);
if (err)
goto out;
 
err = i915_gem_object_set_to_cpu_domain(obj, write);
-   mutex_unlock(&dev->struct_mutex);
+   i915_gem_object_unlock(obj);
 
 out:
i915_gem_object_unpin_pages(obj);
@@ -175,19 +174,18 @@ static int i915_gem_begin_cpu_access(struct dma_buf 
*dma_buf, enum dma_da

[Intel-gfx] [PATCH 08/40] dma-fence: Refactor signaling for manual invocation

2019-05-08 Thread Chris Wilson
Move the duplicated code within dma-fence.c into the header for wider
reuse.
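The duplication being removed is the callback-notify loop that appeared in both the locked and unlocked signal paths. A minimal sketch of the same refactor (hypothetical `struct fence`/`struct cb` types, loosely mirroring `__dma_fence_signal_notify()`; the real code uses `list_head` and holds the fence lock):

```c
#include <assert.h>
#include <stddef.h>

struct cb {
	struct cb *next;
	void (*func)(struct cb *cb);
};

struct fence {
	struct cb *cb_list;
};

static int fired;	/* demo callback counter, for illustration */

static void count_cb(struct cb *cb)
{
	(void)cb;
	fired++;
}

/*
 * Shared helper: detach and invoke every queued callback exactly once.
 * Both signal paths call this instead of open-coding the same loop.
 */
static void fence_notify(struct fence *f)
{
	struct cb *cur = f->cb_list;

	f->cb_list = NULL;
	while (cur) {
		struct cb *next = cur->next;

		cur->next = NULL;
		cur->func(cur);
		cur = next;
	}
}
```

Emptying the list before walking it also makes a second call a harmless no-op, which is the behaviour the locked and unlocked callers both want.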

Signed-off-by: Chris Wilson 
---
 drivers/dma-buf/Makefile|  10 +-
 drivers/dma-buf/dma-fence-trace.c   |  28 +++
 drivers/dma-buf/dma-fence.c |  28 +--
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c |  21 --
 include/linux/dma-fence-types.h | 248 
 include/linux/dma-fence.h   | 244 ++-
 6 files changed, 312 insertions(+), 267 deletions(-)
 create mode 100644 drivers/dma-buf/dma-fence-trace.c
 create mode 100644 include/linux/dma-fence-types.h

diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile
index 1f006e083eb9..56e579878f26 100644
--- a/drivers/dma-buf/Makefile
+++ b/drivers/dma-buf/Makefile
@@ -1,5 +1,11 @@
-obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \
-reservation.o seqno-fence.o
+obj-y := \
+   dma-buf.o \
+   dma-fence.o \
+   dma-fence-array.o \
+   dma-fence-chain.o \
+   dma-fence-trace.o \
+   reservation.o \
+   seqno-fence.o
 obj-$(CONFIG_SYNC_FILE)+= sync_file.o
 obj-$(CONFIG_SW_SYNC)  += sw_sync.o sync_debug.o
 obj-$(CONFIG_UDMABUF)  += udmabuf.o
diff --git a/drivers/dma-buf/dma-fence-trace.c 
b/drivers/dma-buf/dma-fence-trace.c
new file mode 100644
index ..eb6f282be4c0
--- /dev/null
+++ b/drivers/dma-buf/dma-fence-trace.c
@@ -0,0 +1,28 @@
+/*
+ * Fence mechanism for dma-buf and to allow for asynchronous dma access
+ *
+ * Copyright (C) 2012 Canonical Ltd
+ * Copyright (C) 2012 Texas Instruments
+ *
+ * Authors:
+ * Rob Clark 
+ * Maarten Lankhorst 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include 
+
+#define CREATE_TRACE_POINTS
+#include 
+
+EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit);
+EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
+EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 9bf06042619a..64582c4e0b6f 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -24,13 +24,6 @@
 #include 
 #include 
 
-#define CREATE_TRACE_POINTS
-#include 
-
-EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit);
-EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
-EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
-
 static DEFINE_SPINLOCK(dma_fence_stub_lock);
 static struct dma_fence dma_fence_stub;
 
@@ -136,7 +129,6 @@ EXPORT_SYMBOL(dma_fence_context_alloc);
  */
 int dma_fence_signal_locked(struct dma_fence *fence)
 {
-   struct dma_fence_cb *cur, *tmp;
int ret = 0;
 
lockdep_assert_held(fence->lock);
@@ -152,15 +144,10 @@ int dma_fence_signal_locked(struct dma_fence *fence)
 * still run through all callbacks
 */
} else {
-   fence->timestamp = ktime_get();
-   set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &fence->flags);
-   trace_dma_fence_signaled(fence);
+   __dma_fence_signal_timestamp(fence, ktime_get());
}
 
-   list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) {
-   list_del_init(&cur->node);
-   cur->func(fence, cur);
-   }
+   __dma_fence_signal_notify(fence);
return ret;
 }
 EXPORT_SYMBOL(dma_fence_signal_locked);
@@ -188,18 +175,11 @@ int dma_fence_signal(struct dma_fence *fence)
if (test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
return -EINVAL;
 
-   fence->timestamp = ktime_get();
-   set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &fence->flags);
-   trace_dma_fence_signaled(fence);
+   __dma_fence_signal_timestamp(fence, ktime_get());
 
if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags)) {
-   struct dma_fence_cb *cur, *tmp;
-
spin_lock_irqsave(fence->lock, flags);
-   list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) {
-   list_del_init(&cur->node);
-   cur->func(fence, cur);
-   }
+   __dma_fence_signal_notify(fence);
spin_unlock_irqrestore(fence->lock, flags);
}
return 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c 
b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index 7053a90e5cb5..7dd154f75086 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -23,7 +23,6 @@
  */
 
 #include 
-#include 
 #include 
 
 #include "i915_drv.h"
@@ -97,26 +96,6 @@ check_signal_order(struct i

[Intel-gfx] [PATCH 17/40] drm/i915/execlists: Virtual engine bonding

2019-05-08 Thread Chris Wilson
Some users require that when a master batch is executed on one particular
engine, a companion batch is run simultaneously on a specific slave
engine. For this purpose, we introduce virtual engine bonding, allowing
maps of master:slaves to be constructed to constrain which physical
engines a virtual engine may select given a fence on a master engine.

For the moment, we continue to ignore the issue of preemption deferring
the master request for later. Ideally, we would like to then also remove
the slave and run something else rather than have it stall the pipeline.
With load balancing, we should be able to move workload around it, but
there is a similar stall on the master pipeline while it may wait for
the slave to be executed. At the cost of more latency for the bonded
request, it may be interesting to launch both on their engines in
lockstep. (Bubbles abound.)

Opens: Also what about bonding an engine as its own master? It doesn't
break anything internally, so allow the silliness.
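The master:sibling-mask lookup can be sketched independently of the driver. This is a simplified, hypothetical model of the `ve_bond` table (void pointers stand in for `struct intel_engine_cs *`; the real `virtual_bond_execute` additionally updates `rq->execution_mask` with a cmpxchg loop): on a submit-fence from a master engine, the virtual engine restricts itself to the bonded siblings.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int engine_mask_t;

struct ve_bond {
	const void *master;		/* signaling master engine */
	engine_mask_t sibling_mask;	/* allowed physical engines */
};

/* Linear scan of the small bond table for the signaling master. */
static const struct ve_bond *
find_bond(const struct ve_bond *bonds, unsigned int num_bonds,
	  const void *master)
{
	for (unsigned int i = 0; i < num_bonds; i++)
		if (bonds[i].master == master)
			return &bonds[i];

	return NULL;
}

/* With a bond, intersect; with no bond, any sibling remains legal. */
static engine_mask_t
constrain(engine_mask_t exec_mask, const struct ve_bond *bond)
{
	return bond ? (exec_mask & bond->sibling_mask) : exec_mask;
}
```

An unknown master leaves the execution mask untouched, so only explicitly bonded pairs constrain engine selection.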

v2: Emancipate the bonds
v3: Couple in delayed scheduling for the selftests
v4: Handle invalid mutually exclusive bonding
v5: Mention what the uapi does
v6: s/nbond/num_bonds/

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 +
 drivers/gpu/drm/i915/gt/intel_lrc.c   |  98 +
 drivers/gpu/drm/i915/gt/intel_lrc.h   |   4 +
 drivers/gpu/drm/i915/gt/selftest_lrc.c| 191 ++
 drivers/gpu/drm/i915/i915_gem_context.c   |  84 
 drivers/gpu/drm/i915/selftests/lib_sw_fence.c |   3 +
 include/uapi/drm/i915_drm.h   |  44 
 7 files changed, 431 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 7b47e00fa082..f3fc2e8acc90 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -405,6 +405,13 @@ struct intel_engine_cs {
 */
void(*submit_request)(struct i915_request *rq);
 
+   /*
+* Called on signaling of a SUBMIT_FENCE, passing along the signaling
+* request down to the bonded pairs.
+*/
+   void(*bond_execute)(struct i915_request *rq,
+   struct dma_fence *signal);
+
/*
 * Call when the priority on a request has changed and it and its
 * dependencies may need rescheduling. Note the request itself may
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 69849ffb9c82..5d80f99661bc 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -191,6 +191,18 @@ struct virtual_engine {
int prio;
} nodes[I915_NUM_ENGINES];
 
+   /*
+* Keep track of bonded pairs -- restrictions upon our selection
+* of physical engines any particular request may be submitted to.
+* If we receive a submit-fence from a master engine, we will only
+* use one of sibling_mask physical engines.
+*/
+   struct ve_bond {
+   const struct intel_engine_cs *master;
+   intel_engine_mask_t sibling_mask;
+   } *bonds;
+   unsigned int num_bonds;
+
/* And finally, which physical engines this virtual engine maps onto. */
unsigned int num_siblings;
struct intel_engine_cs *siblings[0];
@@ -1002,6 +1014,7 @@ static void execlists_dequeue(struct intel_engine_cs 
*engine)
rb_erase_cached(rb, &execlists->virtual);
RB_CLEAR_NODE(rb);
 
+   GEM_BUG_ON(!(rq->execution_mask & engine->mask));
rq->engine = engine;
 
if (engine != ve->siblings[0]) {
@@ -3110,6 +3123,8 @@ static void virtual_context_destroy(struct kref *kref)
if (ve->context.state)
__execlists_context_fini(&ve->context);
 
+   kfree(ve->bonds);
+
i915_timeline_fini(&ve->base.timeline);
kfree(ve);
 }
@@ -3306,6 +3321,38 @@ static void virtual_submit_request(struct i915_request 
*rq)
tasklet_schedule(&ve->base.execlists.tasklet);
 }
 
+static struct ve_bond *
+virtual_find_bond(struct virtual_engine *ve,
+ const struct intel_engine_cs *master)
+{
+   int i;
+
+   for (i = 0; i < ve->num_bonds; i++) {
+   if (ve->bonds[i].master == master)
+   return &ve->bonds[i];
+   }
+
+   return NULL;
+}
+
+static void
+virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
+{
+   struct virtual_engine *ve = to_virtual_engine(rq->engine);
+   struct ve_bond *bond;
+
+   bond = virtual_find_bond(ve, to_request(signal)->engine);
+   if (bond) {
+   intel_engine_mask_t old, new, cmp;
+
+   cmp = READ_ONCE(rq->execution_mask);
+ 

[Intel-gfx] [PATCH 20/40] drm/i915: Pull GEM ioctls interface to its own file

2019-05-08 Thread Chris Wilson
Declutter i915_drv/gem.h by moving the ioctl API into its own header.

Signed-off-by: Chris Wilson 
Reviewed-by: Matthew Auld 
---
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h | 52 ++
 drivers/gpu/drm/i915/i915_drv.c|  1 +
 drivers/gpu/drm/i915/i915_drv.h| 38 
 drivers/gpu/drm/i915/i915_gem.c|  1 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  1 +
 drivers/gpu/drm/i915/i915_gem_tiling.c |  3 ++
 drivers/gpu/drm/i915/i915_gem_userptr.c| 12 +++--
 7 files changed, 66 insertions(+), 42 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h 
b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
new file mode 100644
index ..ddc7f2a52b3e
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
@@ -0,0 +1,52 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef I915_GEM_IOCTLS_H
+#define I915_GEM_IOCTLS_H
+
+struct drm_device;
+struct drm_file;
+
+int i915_gem_busy_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *file);
+int i915_gem_create_ioctl(struct drm_device *dev, void *data,
+ struct drm_file *file);
+int i915_gem_execbuffer_ioctl(struct drm_device *dev, void *data,
+ struct drm_file *file);
+int i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file);
+int i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *file);
+int i915_gem_get_caching_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file);
+int i915_gem_get_tiling_ioctl(struct drm_device *dev, void *data,
+ struct drm_file *file);
+int i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file);
+int i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *file);
+int i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *file);
+int i915_gem_pread_ioctl(struct drm_device *dev, void *data,
+struct drm_file *file);
+int i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
+ struct drm_file *file);
+int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file);
+int i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
+ struct drm_file *file);
+int i915_gem_set_tiling_ioctl(struct drm_device *dev, void *data,
+ struct drm_file *file);
+int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data,
+struct drm_file *file);
+int i915_gem_throttle_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *file);
+int i915_gem_userptr_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file);
+int i915_gem_wait_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *file);
+
+#endif
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 83d2eb9e74cb..30afd8aef737 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -47,6 +47,7 @@
 #include 
 #include 
 
+#include "gem/i915_gem_ioctls.h"
 #include "gt/intel_gt_pm.h"
 #include "gt/intel_reset.h"
 #include "gt/intel_workarounds.h"
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0b9ec8b3e5fa..95e9b2c69f17 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2736,46 +2736,8 @@ static inline bool intel_vgpu_active(struct 
drm_i915_private *dev_priv)
 }
 
 /* i915_gem.c */
-int i915_gem_create_ioctl(struct drm_device *dev, void *data,
- struct drm_file *file_priv);
-int i915_gem_pread_ioctl(struct drm_device *dev, void *data,
-struct drm_file *file_priv);
-int i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
- struct drm_file *file_priv);
-int i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
-   struct drm_file *file_priv);
-int i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data,
-   struct drm_file *file_priv);
-int i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
- struct drm_file *file_priv);
-int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data,
-struct drm_file *file_priv);
-int i915_gem_execbuffer_ioctl(struct drm_device *dev, void *data,
- struct drm_file *file_priv);
-int i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *

[Intel-gfx] [PATCH 27/40] drm/i915: Pull scatterlist utils out of i915_gem.h

2019-05-08 Thread Chris Wilson
Our scatterlist utility routines can be pulled out of i915_gem.h for a
bit more decluttering.

v2: Push I915_GTT_PAGE_SIZE out of i915_scatterlist itself and into the
caller.

Signed-off-by: Chris Wilson 
Reviewed-by: Matthew Auld 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|   1 +
 drivers/gpu/drm/i915/gem/i915_gem_internal.c  |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_pages.c |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_phys.c  |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |   1 +
 .../drm/i915/gem/selftests/huge_gem_object.c  |   2 +
 .../drm/i915/gem/selftests/i915_gem_dmabuf.c  |   2 +
 drivers/gpu/drm/i915/i915_drv.h   | 110 ---
 drivers/gpu/drm/i915/i915_gem.c   |  30 +
 drivers/gpu/drm/i915/i915_gem_fence_reg.c |   2 +
 drivers/gpu/drm/i915/i915_gem_gtt.c   |   3 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h   |   4 +-
 drivers/gpu/drm/i915/i915_gpu_error.c |   1 +
 drivers/gpu/drm/i915/i915_scatterlist.c   |  39 ++
 drivers/gpu/drm/i915/i915_scatterlist.h   | 127 ++
 drivers/gpu/drm/i915/selftests/i915_vma.c |   1 +
 drivers/gpu/drm/i915/selftests/scatterlist.c  |   1 +
 19 files changed, 188 insertions(+), 141 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_scatterlist.c
 create mode 100644 drivers/gpu/drm/i915/i915_scatterlist.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 0c39ef967cf0..9f5f4acacae5 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -44,6 +44,7 @@ i915-y += i915_drv.o \
  i915_irq.o \
  i915_params.o \
  i915_pci.o \
+ i915_scatterlist.o \
  i915_suspend.o \
  i915_sysfs.o \
  intel_csr.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index 9b75dd8c267d..4e7efe159531 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -11,6 +11,7 @@
 #include "i915_gem_object.h"
 
 #include "../i915_drv.h"
+#include "../i915_scatterlist.h"
 
 static struct drm_i915_gem_object *dma_buf_to_obj(struct dma_buf *buf)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_internal.c 
b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
index d40adb3bbe29..d072d0cbce06 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_internal.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
@@ -14,6 +14,7 @@
 
 #include "../i915_drv.h"
 #include "../i915_gem.h"
+#include "../i915_scatterlist.h"
 #include "../i915_utils.h"
 
 #define QUIET (__GFP_NORETRY | __GFP_NOWARN)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c 
b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index 452aba0a3359..21635d0ac95e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -7,6 +7,7 @@
 #include "i915_gem_object.h"
 
 #include "../i915_drv.h"
+#include "../i915_scatterlist.h"
 
 void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
 struct sg_table *pages,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c 
b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
index 1bf3e0afcba2..22d185301578 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
@@ -16,6 +16,7 @@
 #include "i915_gem_object.h"
 
 #include "../i915_drv.h"
+#include "../i915_scatterlist.h"
 
 static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 78df86d1db7a..2eee1af88bcb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -10,6 +10,7 @@
 #include "i915_gem_object.h"
 
 #include "../i915_drv.h"
+#include "../i915_scatterlist.h"
 
 /*
  * Move pages to appropriate lru and release the pagevec, decrementing the
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c 
b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 007a2b8cac8b..0137baa409e1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -15,6 +15,7 @@
 #include "i915_gem_ioctls.h"
 #include "i915_gem_object.h"
 
+#include "../i915_scatterlist.h"
 #include "../i915_trace.h"
 #include "../intel_drv.h"
 
diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c 
b/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c
index 824f3761314c..c03781f8b435 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c
@@ -4,6 +4,8 @@
  * Copyright © 2016 Intel Corporation
  */
 
+#include "../../i915_scatterlist.h"
+
 #include "huge_gem_object.h"
 
 static void huge_free_pages(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/

[Intel-gfx] [PATCH 13/40] drm/i915: Allow userspace to clone contexts on creation

2019-05-08 Thread Chris Wilson
A usecase arose out of handling context recovery in mesa, whereby they
wish to recreate a context with fresh logical state but preserving all
other details of the original. Currently, they create a new context and
iterate over which bits they want to copy across, but it would be much
more convenient if they were able to just pass in a target context to clone
during creation. This essentially extends the setparam during creation
to pull the details from a target context instead of the user supplied
parameters.

The ideal here is that we don't expose control over anything more than
can be obtained via CONTEXT_PARAM. That is userspace retains explicit
control over all features, and this api is just convenience.

For example, you could replace

struct context_param p = { .param = CONTEXT_PARAM_VM };

param.ctx_id = old_id;
gem_context_get_param(&p.param);

new_id = gem_context_create();

param.ctx_id = new_id;
gem_context_set_param(&p.param);

gem_vm_destroy(param.value); /* drop the ref to VM_ID handle */

with

struct create_ext_param p = {
  { .name = CONTEXT_CREATE_CLONE },
  .clone_id = old_id,
  .flags = CLONE_FLAGS_VM
}
new_id = gem_context_create_ext(&p);

and not have to worry about stray namespace pollution etc.

Signed-off-by: Chris Wilson 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_gem_context.c | 206 
 include/uapi/drm/i915_drm.h |  15 ++
 2 files changed, 221 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
index 9cd671298daf..040c0b770c21 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -1697,8 +1697,214 @@ static int create_setparam(struct i915_user_extension 
__user *ext, void *data)
return ctx_setparam(arg->fpriv, arg->ctx, &local.param);
 }
 
+static int clone_engines(struct i915_gem_context *dst,
+struct i915_gem_context *src)
+{
+   struct i915_gem_engines *e = i915_gem_context_lock_engines(src);
+   struct i915_gem_engines *clone;
+   bool user_engines;
+   unsigned long n;
+
+   clone = kmalloc(struct_size(e, engines, e->num_engines), GFP_KERNEL);
+   if (!clone)
+   goto err_unlock;
+
+   clone->i915 = dst->i915;
+   for (n = 0; n < e->num_engines; n++) {
+   if (!e->engines[n]) {
+   clone->engines[n] = NULL;
+   continue;
+   }
+
+   clone->engines[n] =
+   intel_context_create(dst, e->engines[n]->engine);
+   if (!clone->engines[n]) {
+   __free_engines(clone, n);
+   goto err_unlock;
+   }
+   }
+   clone->num_engines = n;
+
+   user_engines = i915_gem_context_user_engines(src);
+   i915_gem_context_unlock_engines(src);
+
+   free_engines(dst->engines);
+   RCU_INIT_POINTER(dst->engines, clone);
+   if (user_engines)
+   i915_gem_context_set_user_engines(dst);
+   else
+   i915_gem_context_clear_user_engines(dst);
+   return 0;
+
+err_unlock:
+   i915_gem_context_unlock_engines(src);
+   return -ENOMEM;
+}
+
+static int clone_flags(struct i915_gem_context *dst,
+  struct i915_gem_context *src)
+{
+   dst->user_flags = src->user_flags;
+   return 0;
+}
+
+static int clone_schedattr(struct i915_gem_context *dst,
+  struct i915_gem_context *src)
+{
+   dst->sched = src->sched;
+   return 0;
+}
+
+static int clone_sseu(struct i915_gem_context *dst,
+ struct i915_gem_context *src)
+{
+   struct i915_gem_engines *e = i915_gem_context_lock_engines(src);
+   struct i915_gem_engines *clone;
+   unsigned long n;
+   int err;
+
+   clone = dst->engines; /* no locking required; sole access */
+   if (e->num_engines != clone->num_engines) {
+   err = -EINVAL;
+   goto unlock;
+   }
+
+   for (n = 0; n < e->num_engines; n++) {
+   struct intel_context *ce = e->engines[n];
+
+   if (clone->engines[n]->engine->class != ce->engine->class) {
+   /* Must have compatible engine maps! */
+   err = -EINVAL;
+   goto unlock;
+   }
+
+   /* serialises with set_sseu */
+   err = intel_context_lock_pinned(ce);
+   if (err)
+   goto unlock;
+
+   clone->engines[n]->sseu = ce->sseu;
+   intel_context_unlock_pinned(ce);
+   }
+
+   err = 0;
+unlock:
+   i915_gem_context_unlock_engines(src);
+   return err;
+}
+
+static int clone_timeline(struct i915_gem_context *dst,
+ struct i915_gem_context *src)
+{
+ 

[Intel-gfx] [PATCH 23/40] drm/i915: Move phys objects to its own file

2019-05-08 Thread Chris Wilson
Continuing the decluttering of i915_gem.c, this time the legacy physical
object.

Signed-off-by: Chris Wilson 
Reviewed-by: Matthew Auld 
---
 drivers/gpu/drm/i915/Makefile |   2 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  11 +-
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
 drivers/gpu/drm/i915/gem/i915_gem_pages.c | 510 +
 drivers/gpu/drm/i915/gem/i915_gem_phys.c  | 212 ++
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  61 ++
 .../drm/i915/gem/selftests/i915_gem_phys.c|  80 ++
 drivers/gpu/drm/i915/i915_drv.h   |   2 -
 drivers/gpu/drm/i915/i915_gem.c   | 699 +-
 drivers/gpu/drm/i915/i915_gem_shrinker.c  |  59 +-
 .../gpu/drm/i915/selftests/i915_gem_object.c  |  54 --
 .../drm/i915/selftests/i915_mock_selftests.h  |   1 +
 12 files changed, 885 insertions(+), 808 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_pages.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_phys.c
 create mode 100644 drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 625f9749355b..ba3b82f3cd49 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -88,6 +88,8 @@ i915-y += $(gt-y)
 obj-y += gem/
 gem-y += \
gem/i915_gem_object.o \
+   gem/i915_gem_pages.o \
+   gem/i915_gem_phys.o \
gem/i915_gem_shmem.o
 i915-y += \
  $(gem-y) \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index ad82303f741a..2e963a593245 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -33,11 +33,17 @@ void __i915_gem_object_release_shmem(struct 
drm_i915_gem_object *obj,
 struct sg_table *pages,
 bool needs_clflush);
 
+int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align);
+
 void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file);
 void i915_gem_free_object(struct drm_gem_object *obj);
 
 void i915_gem_flush_free_objects(struct drm_i915_private *i915);
 
+struct sg_table *
+__i915_gem_object_unset_pages(struct drm_i915_gem_object *obj);
+void i915_gem_object_truncate(struct drm_i915_gem_object *obj);
+
 /**
  * i915_gem_object_lookup_rcu - look up a temporary GEM object from its handle
  * @filp: DRM file private date
@@ -231,6 +237,8 @@ i915_gem_object_get_dma_address(struct drm_i915_gem_object 
*obj,
 void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
 struct sg_table *pages,
 unsigned int sg_page_sizes);
+
+int i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
 int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
 
 static inline int __must_check
@@ -286,7 +294,8 @@ enum i915_mm_subclass { /* lockdep subclass for 
obj->mm.lock/struct_mutex */
 
 int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
enum i915_mm_subclass subclass);
-void __i915_gem_object_truncate(struct drm_i915_gem_object *obj);
+void i915_gem_object_truncate(struct drm_i915_gem_object *obj);
+void i915_gem_object_writeback(struct drm_i915_gem_object *obj);
 
 enum i915_map_type {
I915_MAP_WB = 0,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index e4b50944f553..24abdd6999b5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -52,6 +52,8 @@ struct drm_i915_gem_object_ops {
int (*get_pages)(struct drm_i915_gem_object *obj);
void (*put_pages)(struct drm_i915_gem_object *obj,
  struct sg_table *pages);
+   void (*truncate)(struct drm_i915_gem_object *obj);
+   void (*writeback)(struct drm_i915_gem_object *obj);
 
int (*pwrite)(struct drm_i915_gem_object *obj,
  const struct drm_i915_gem_pwrite *arg);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c 
b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
new file mode 100644
index ..452aba0a3359
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -0,0 +1,510 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2014-2016 Intel Corporation
+ */
+
+#include "i915_gem_object.h"
+
+#include "../i915_drv.h"
+
+void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
+struct sg_table *pages,
+unsigned int sg_page_sizes)
+{
+   struct drm_i915_private *i915 = to_i915(obj->base.dev);
+   unsigned long supported = INTEL_INFO(i915)->page_sizes;
+   int i;
+
+   lockdep_assert_held(&obj->mm.lock);
+
+   /* Make the pages coherent with the GPU (flushing any swapin). */
+   if (

[Intel-gfx] [PATCH 33/40] drm/i915: Move object close under its own lock

2019-05-08 Thread Chris Wilson
Use i915_gem_object_lock() to guard the LUT and active reference to
allow us to break free of struct_mutex for handling GEM_CLOSE.

Testcase: igt/gem_close_race
Testcase: igt/gem_exec_parallel
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 76 ++-
 .../gpu/drm/i915/gem/i915_gem_context_types.h | 12 +--
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 23 --
 drivers/gpu/drm/i915/gem/i915_gem_object.c| 38 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  1 -
 .../gpu/drm/i915/gem/selftests/mock_context.c |  1 -
 drivers/gpu/drm/i915/i915_drv.h   |  4 +-
 drivers/gpu/drm/i915/i915_gem.c   |  1 +
 drivers/gpu/drm/i915/i915_gem_gtt.c   |  1 +
 drivers/gpu/drm/i915/i915_timeline.c  | 13 ++--
 drivers/gpu/drm/i915/i915_vma.c   | 42 ++
 drivers/gpu/drm/i915/i915_vma.h   | 17 ++---
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  1 +
 13 files changed, 132 insertions(+), 98 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 5016a3e1f863..a9608d9ced6a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -95,24 +95,42 @@ void i915_lut_handle_free(struct i915_lut_handle *lut)
 
 static void lut_close(struct i915_gem_context *ctx)
 {
-   struct i915_lut_handle *lut, *ln;
struct radix_tree_iter iter;
void __rcu **slot;
 
-   list_for_each_entry_safe(lut, ln, &ctx->handles_list, ctx_link) {
-   list_del(&lut->obj_link);
-   i915_lut_handle_free(lut);
-   }
-   INIT_LIST_HEAD(&ctx->handles_list);
+   lockdep_assert_held(&ctx->mutex);
 
rcu_read_lock();
radix_tree_for_each_slot(slot, &ctx->handles_vma, &iter, 0) {
struct i915_vma *vma = rcu_dereference_raw(*slot);
-
-   radix_tree_iter_delete(&ctx->handles_vma, &iter, slot);
-
-   vma->open_count--;
-   i915_vma_put(vma);
+   struct drm_i915_gem_object *obj = vma->obj;
+   struct i915_lut_handle *lut;
+   bool found = false;
+
+   rcu_read_unlock();
+   i915_gem_object_lock(obj);
+   list_for_each_entry(lut, &obj->lut_list, obj_link) {
+   if (lut->ctx != ctx)
+   continue;
+
+   if (lut->handle != iter.index)
+   continue;
+
+   list_del(&lut->obj_link);
+   i915_lut_handle_free(lut);
+   found = true;
+   break;
+   }
+   i915_gem_object_unlock(obj);
+   rcu_read_lock();
+
+   if (found) {
+   radix_tree_iter_delete(&ctx->handles_vma, &iter, slot);
+   if (atomic_dec_and_test(&vma->open_count) &&
+   !i915_vma_is_ggtt(vma))
+   i915_vma_close(vma);
+   i915_gem_object_put(obj);
+   }
}
rcu_read_unlock();
 }
@@ -250,15 +268,9 @@ static void free_engines(struct i915_gem_engines *e)
__free_engines(e, e->num_engines);
 }
 
-static void free_engines_rcu(struct work_struct *wrk)
+static void free_engines_rcu(struct rcu_head *rcu)
 {
-   struct i915_gem_engines *e =
-   container_of(wrk, struct i915_gem_engines, rcu.work);
-   struct drm_i915_private *i915 = e->i915;
-
-   mutex_lock(&i915->drm.struct_mutex);
-   free_engines(e);
-   mutex_unlock(&i915->drm.struct_mutex);
+   free_engines(container_of(rcu, struct i915_gem_engines, rcu));
 }
 
 static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
@@ -271,7 +283,7 @@ static struct i915_gem_engines *default_engines(struct 
i915_gem_context *ctx)
if (!e)
return ERR_PTR(-ENOMEM);
 
-   e->i915 = ctx->i915;
+   init_rcu_head(&e->rcu);
for_each_engine(engine, ctx->i915, id) {
struct intel_context *ce;
 
@@ -359,7 +371,10 @@ void i915_gem_context_release(struct kref *ref)
 
 static void context_close(struct i915_gem_context *ctx)
 {
+   mutex_lock(&ctx->mutex);
+
i915_gem_context_set_closed(ctx);
+   ctx->file_priv = ERR_PTR(-EBADF);
 
/*
 * This context will never again be assinged to HW, so we can
@@ -374,7 +389,7 @@ static void context_close(struct i915_gem_context *ctx)
 */
lut_close(ctx);
 
-   ctx->file_priv = ERR_PTR(-EBADF);
+   mutex_unlock(&ctx->mutex);
i915_gem_context_put(ctx);
 }
 
@@ -429,7 +444,6 @@ __create_context(struct drm_i915_private *dev_priv)
RCU_INIT_POINTER(ctx->engines, e);
 
INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
-   INIT_LIST_HEAD(&ctx->handles_list);
INIT_LIST_HEAD(

[Intel-gfx] [PATCH 06/40] drm/i915: Convert inconsistent static engine tables into an init error

2019-05-08 Thread Chris Wilson
Remove the modification of the "constant" device info by promoting the
inconsistent intel_engine static table into an initialisation error.
Now, if we add a new engine into the device_info, we must first add that
engine information into the intel_engines.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 28 ---
 1 file changed, 9 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 4c3753c1b573..6434170ea4b6 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -372,15 +372,14 @@ void intel_engines_cleanup(struct drm_i915_private *i915)
  */
 int intel_engines_init_mmio(struct drm_i915_private *i915)
 {
-   struct intel_device_info *device_info = mkwrite_device_info(i915);
-   const unsigned int engine_mask = INTEL_INFO(i915)->engine_mask;
-   unsigned int mask = 0;
unsigned int i;
int err;
 
-   WARN_ON(engine_mask == 0);
-   WARN_ON(engine_mask &
-   GENMASK(BITS_PER_TYPE(mask) - 1, I915_NUM_ENGINES));
+   /* We always presume we have at least RCS available for later probing */
+   if (GEM_WARN_ON(!HAS_ENGINE(i915, RCS0))) {
+   err = -ENODEV;
+   goto cleanup;
+   }
 
if (i915_inject_load_failure())
return -ENODEV;
@@ -392,25 +391,16 @@ int intel_engines_init_mmio(struct drm_i915_private *i915)
err = intel_engine_setup(i915, i);
if (err)
goto cleanup;
-
-   mask |= BIT(i);
}
 
-   /*
-* Catch failures to update intel_engines table when the new engines
-* are added to the driver by a warning and disabling the forgotten
-* engines.
-*/
-   if (WARN_ON(mask != engine_mask))
-   device_info->engine_mask = mask;
-
-   /* We always presume we have at least RCS available for later probing */
-   if (WARN_ON(!HAS_ENGINE(i915, RCS0))) {
+   /* Catch failures to update intel_engines table for new engines. */
+   if (GEM_WARN_ON(INTEL_INFO(i915)->engine_mask >> i)) {
err = -ENODEV;
goto cleanup;
}
 
-   RUNTIME_INFO(i915)->num_engines = hweight32(mask);
+   RUNTIME_INFO(i915)->num_engines =
+   hweight32(INTEL_INFO(i915)->engine_mask);
 
i915_check_and_clear_faults(i915);
 
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH 32/40] drm/i915: Drop the deferred active reference

2019-05-08 Thread Chris Wilson
An old optimisation to reduce the number of atomics per batch sadly
relies on struct_mutex for coordination. In order to remove struct_mutex
from serialising object/context closing, always take and release an active
reference on first use / last use; this greatly simplifies the locking.

Signed-off-by: Chris Wilson 
Reviewed-by: Matthew Auld 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c| 13 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h| 24 +--
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  8 ---
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  3 +--
 .../i915/gem/selftests/i915_gem_coherency.c   |  4 ++--
 .../drm/i915/gem/selftests/i915_gem_context.c | 11 +
 .../drm/i915/gem/selftests/i915_gem_mman.c|  2 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  2 +-
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c|  3 +--
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |  9 +--
 .../gpu/drm/i915/gt/selftest_workarounds.c|  3 ---
 drivers/gpu/drm/i915/gvt/scheduler.c  |  2 +-
 drivers/gpu/drm/i915/i915_gem_batch_pool.c|  2 +-
 drivers/gpu/drm/i915/i915_gem_render_state.c  |  2 +-
 drivers/gpu/drm/i915/i915_vma.c   | 15 +---
 drivers/gpu/drm/i915/selftests/i915_request.c |  8 ---
 drivers/gpu/drm/i915/selftests/igt_spinner.c  |  9 +--
 18 files changed, 26 insertions(+), 96 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 217d87c13cd8..5016a3e1f863 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -112,7 +112,7 @@ static void lut_close(struct i915_gem_context *ctx)
radix_tree_iter_delete(&ctx->handles_vma, &iter, slot);
 
vma->open_count--;
-   __i915_gem_object_release_unless_active(vma->obj);
+   i915_vma_put(vma);
}
rcu_read_unlock();
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index df79e2eead62..c178cf7614c6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -156,7 +156,7 @@ void i915_gem_close_object(struct drm_gem_object *gem, 
struct drm_file *file)
list_del(&lut->ctx_link);
 
i915_lut_handle_free(lut);
-   __i915_gem_object_release_unless_active(obj);
+   i915_gem_object_put(obj);
}
 
mutex_unlock(&i915->drm.struct_mutex);
@@ -348,17 +348,6 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
call_rcu(&obj->rcu, __i915_gem_free_object_rcu);
 }
 
-void __i915_gem_object_release_unless_active(struct drm_i915_gem_object *obj)
-{
-   lockdep_assert_held(&obj->base.dev->struct_mutex);
-
-   if (!i915_gem_object_has_active_reference(obj) &&
-   i915_gem_object_is_active(obj))
-   i915_gem_object_set_active_reference(obj);
-   else
-   i915_gem_object_put(obj);
-}
-
 static inline enum fb_op_origin
 fb_write_origin(struct drm_i915_gem_object *obj, unsigned int domain)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 23bca003fbfb..dcb765cd97a3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -161,31 +161,9 @@ i915_gem_object_needs_async_cancel(const struct 
drm_i915_gem_object *obj)
 static inline bool
 i915_gem_object_is_active(const struct drm_i915_gem_object *obj)
 {
-   return obj->active_count;
+   return READ_ONCE(obj->active_count);
 }
 
-static inline bool
-i915_gem_object_has_active_reference(const struct drm_i915_gem_object *obj)
-{
-   return test_bit(I915_BO_ACTIVE_REF, &obj->flags);
-}
-
-static inline void
-i915_gem_object_set_active_reference(struct drm_i915_gem_object *obj)
-{
-   lockdep_assert_held(&obj->base.dev->struct_mutex);
-   __set_bit(I915_BO_ACTIVE_REF, &obj->flags);
-}
-
-static inline void
-i915_gem_object_clear_active_reference(struct drm_i915_gem_object *obj)
-{
-   lockdep_assert_held(&obj->base.dev->struct_mutex);
-   __clear_bit(I915_BO_ACTIVE_REF, &obj->flags);
-}
-
-void __i915_gem_object_release_unless_active(struct drm_i915_gem_object *obj);
-
 static inline bool
 i915_gem_object_is_framebuffer(const struct drm_i915_gem_object *obj)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 24abdd6999b5..133581339d81 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -120,14 +120,6 @@ struct drm_i915_gem_object {
struct list_head batch_pool_link;
I915_SELFTEST_DECLARE(struct list_head st_link);
 
-   unsigned long flags;
-
-   /**
-* Have we taken a reference for th

[Intel-gfx] [PATCH 18/40] drm/i915: Allow specification of parallel execbuf

2019-05-08 Thread Chris Wilson
There is a desire to split a task onto two engines and have them run at
the same time, e.g. scanline interleaving to spread the workload evenly.
Through the use of the out-fence from the first execbuf, we can
coordinate secondary execbuf to only become ready simultaneously with
the first, so that with all things idle the second execbufs are executed
in parallel with the first. The key difference here between the new
EXEC_FENCE_SUBMIT and the existing EXEC_FENCE_IN is that the in-fence
waits for the completion of the first request (so that all of its
rendering results are visible to the second execbuf, the more common
userspace fence requirement).

Since we only have a single input fence slot, userspace cannot mix an
in-fence and a submit-fence. It has to use one or the other! This is not
such a harsh requirement, since by virtue of the submit-fence, the
secondary execbuf inherits all of the dependencies from the first
request, and for the application the dependencies should be common
between the primary and secondary execbuf.

Suggested-by: Tvrtko Ursulin 
Testcase: igt/gem_exec_fence/parallel
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
Reviewed-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_drv.c|  1 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 25 +-
 include/uapi/drm/i915_drm.h| 17 ++-
 3 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 5061cb32856b..83d2eb9e74cb 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -443,6 +443,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void 
*data,
case I915_PARAM_HAS_EXEC_CAPTURE:
case I915_PARAM_HAS_EXEC_BATCH_FIRST:
case I915_PARAM_HAS_EXEC_FENCE_ARRAY:
+   case I915_PARAM_HAS_EXEC_SUBMIT_FENCE:
/* For the time being all of these are always true;
 * if some supported hardware does not have one of these
 * features this value needs to be provided from
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index d6c5220addd0..7ce25b54c57b 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2318,6 +2318,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 {
struct i915_execbuffer eb;
struct dma_fence *in_fence = NULL;
+   struct dma_fence *exec_fence = NULL;
struct sync_file *out_fence = NULL;
int out_fence_fd = -1;
int err;
@@ -2360,11 +2361,24 @@ i915_gem_do_execbuffer(struct drm_device *dev,
return -EINVAL;
}
 
+   if (args->flags & I915_EXEC_FENCE_SUBMIT) {
+   if (in_fence) {
+   err = -EINVAL;
+   goto err_in_fence;
+   }
+
+   exec_fence = sync_file_get_fence(lower_32_bits(args->rsvd2));
+   if (!exec_fence) {
+   err = -EINVAL;
+   goto err_in_fence;
+   }
+   }
+
if (args->flags & I915_EXEC_FENCE_OUT) {
out_fence_fd = get_unused_fd_flags(O_CLOEXEC);
if (out_fence_fd < 0) {
err = out_fence_fd;
-   goto err_in_fence;
+   goto err_exec_fence;
}
}
 
@@ -2494,6 +2508,13 @@ i915_gem_do_execbuffer(struct drm_device *dev,
goto err_request;
}
 
+   if (exec_fence) {
+   err = i915_request_await_execution(eb.request, exec_fence,
+  eb.engine->bond_execute);
+   if (err < 0)
+   goto err_request;
+   }
+
if (fences) {
err = await_fence_array(&eb, fences);
if (err)
@@ -2555,6 +2576,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 err_out_fence:
if (out_fence_fd != -1)
put_unused_fd(out_fence_fd);
+err_exec_fence:
+   dma_fence_put(exec_fence);
 err_in_fence:
dma_fence_put(in_fence);
return err;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index e2da9027bcdf..bdb00ec1f8be 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -604,6 +604,12 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_MMAP_GTT_COHERENT   52
 
+/*
+ * Query whether DRM_I915_GEM_EXECBUFFER2 supports coordination of parallel
+ * execution through use of explicit fence support.
+ * See I915_EXEC_FENCE_OUT and I915_EXEC_FENCE_SUBMIT.
+ */
+#define I915_PARAM_HAS_EXEC_SUBMIT_FENCE 53
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -1126,7 +1132,16 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_FENCE_ARRAY   (1<<19)
 
-#define __I915_EXEC_UNKNO

Re: [Intel-gfx] [PATCH] RFC: console: hack up console_lock more v2

2019-05-08 Thread Daniel Vetter
On Mon, May 06, 2019 at 01:24:48PM +0200, Petr Mladek wrote:
> On Mon 2019-05-06 11:38:13, Daniel Vetter wrote:
> > On Mon, May 06, 2019 at 10:26:28AM +0200, Petr Mladek wrote:
> > > On Mon 2019-05-06 10:16:14, Petr Mladek wrote:
> > > > On Mon 2019-05-06 09:45:53, Daniel Vetter wrote:
> > > > > console_trylock, called from within printk, can be called from pretty
> > > > > much anywhere. Including try_to_wake_up. Note that this isn't common,
> > > > > usually the box is in pretty bad shape at that point already. But it
> > > > > really doesn't help when then lockdep jumps in and spams the logs,
> > > > > potentially obscuring the real backtrace we're really interested in.
> > > > > One case I've seen (slightly simplified backtrace):
> > > > > 
> > > > >  Call Trace:
> > > > >   
> > > > >   console_trylock+0xe/0x60
> > > > >   vprintk_emit+0xf1/0x320
> > > > >   printk+0x4d/0x69
> > > > >   __warn_printk+0x46/0x90
> > > > >   native_smp_send_reschedule+0x2f/0x40
> > > > >   check_preempt_curr+0x81/0xa0
> > > > >   ttwu_do_wakeup+0x14/0x220
> > > > >   try_to_wake_up+0x218/0x5f0
> > > > 
> > > > try_to_wake_up() takes p->pi_lock. It could deadlock because it
> > > > can get called recursively from printk_safe_up().
> > > > 
> > > > And there are more locks taken from try_to_wake_up(), for example,
> > > > __task_rq_lock() taken from ttwu_remote().
> > > > 
> > > > IMHO, the most reliable solution would be do call the entire
> > > > up_console_sem() from printk deferred context. We could assign
> > > > few bytes for this context in the per-CPU printk_deferred
> > > > variable.
> > > 
> > > Ah, I was too fast and did the same mistake. This won't help because
> > > it would still call try_to_wake_up() recursively.
> > 
> > Uh :-/
> > 
> > > We need to call all printk's that can be called under locks
> > > taken in try_to_wake_up() path in printk deferred context.
> > > Unfortunately it is whack a mole approach.
> > 
> > Hm since it's whack-a-mole anyway, what about converting the WARN_ON into
> > a prinkt_deferred, like all the other scheduler related code? Feels a
> > notch more consistent to me than leaking the printk_context into areas it
> > wasn't really meant built for. Scheduler code already fully subscribed to
> > the whack-a-mole approach after all.
> 
> I am not sure how exactly you mean the conversion.
> 
> Anyway, we do not want to use printk_deferred() treewide. It reduces
> the chance that the messages reach consoles. Scheduler is an
> exception because of the possible deadlocks.
> 
> A solution would be to define WARN_ON_DEFERRED() that would
> call normal WARN_ON() in printk deferred context and
> use in scheduler.

Sent it out, and then Sergey pointed out printk_safe_enter/exit (which I
guess is what you meant, and which I missed), but we're doing this already
around the up() call in __up_console_sem.

So I think these further recursions you've pointed out are already handled
correctly, and all we need to do is to break the loop involving
semaphore.lock of the console_lock semaphore only. Which I think this
patch here achieves.

Thoughts? Or are we again missing something here?

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

[Intel-gfx] [PATCH v6 4/6] drm/i915/dp: Add a support of YCBCR 4:2:0 to DP MSA

2019-05-08 Thread Gwan-gyeong Mun
When YCBCR 4:2:0 output is used for DP, we should program YCBCR 4:2:0 to
MSA and VSC SDP.

As per DP 1.4a spec section 2.2.4.3 [MSA Field for Indication of Color
Encoding Format and Content Color Gamut] while sending YCBCR 420 signals
we should program MSA MISC1 fields which indicate VSC SDP for the Pixel
Encoding/Colorimetry Format.

v2: Block comment style fix.
v6:
  Fix a wrong setting of MSA MISC1 fields for Pixel Encoding/Colorimetry
  Format indication. As per DP 1.4a spec Table 2-96 [MSA MISC1 and MISC0
  Fields for Pixel Encoding/Colorimetry Format Indication],
  when MISC1, bit 6, is set to 1, a Source device uses a VSC SDP to
  indicate the Pixel Encoding/Colorimetry Format. The wrong version
  set bit 5 of MISC1; now bit 6 of MISC1 is set.

Signed-off-by: Gwan-gyeong Mun 
Reviewed-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/i915_reg.h  | 1 +
 drivers/gpu/drm/i915/intel_ddi.c | 8 
 2 files changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index e97c47fca645..2ad98e62034f 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -9524,6 +9524,7 @@ enum skl_power_gate {
 #define  TRANS_MSA_12_BPC  (3 << 5)
 #define  TRANS_MSA_16_BPC  (4 << 5)
 #define  TRANS_MSA_CEA_RANGE   (1 << 3)
+#define  TRANS_MSA_USE_VSC_SDP (1 << 14)
 
 /* LCPLL Control */
 #define LCPLL_CTL  _MMIO(0x130040)
diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index 2f1688ea5a2c..4441c5ba71fb 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -1717,6 +1717,14 @@ void intel_ddi_set_pipe_settings(const struct 
intel_crtc_state *crtc_state)
 */
if (crtc_state->output_format == INTEL_OUTPUT_FORMAT_YCBCR444)
temp |= TRANS_MSA_SAMPLING_444 | TRANS_MSA_CLRSP_YCBCR;
+   /*
+* As per DP 1.4a spec section 2.2.4.3 [MSA Field for Indication
+* of Color Encoding Format and Content Color Gamut] while sending
+* YCBCR 420 signals we should program MSA MISC1 fields which
+* indicate VSC SDP for the Pixel Encoding/Colorimetry Format.
+*/
+   if (crtc_state->output_format == INTEL_OUTPUT_FORMAT_YCBCR420)
+   temp |= TRANS_MSA_USE_VSC_SDP;
I915_WRITE(TRANS_MSA_MISC(cpu_transcoder), temp);
 }
 
-- 
2.21.0


[Intel-gfx] [PATCH v6 0/6] drm/i915/dp: Support for DP YCbCr4:2:0 outputs

2019-05-08 Thread Gwan-gyeong Mun
On Gen 11 platform, to enable resolutions like 5K@120 (or higher) we need
to use DSC (DP 1.4) or YCbCr4:2:0 (DP 1.3 or 1.4) on DP.
In order to support YCbCr4:2:0 on DP we need to program YCBCR 4:2:0
to MSA and VSC SDP.
And Link M/N values are calculated and applied based on the Full Clock
for YCbCr420.
The Bit per Pixel needs to be adjusted for YUV420 mode as it requires only
half of the RGB case.
 - Link M/N values are calculated and applied based on the Full Clock
 - Data M/N values needs to be calculated considering the data is half
   due to subsampling

These patches add a VSC structure for handling Pixel Encoding/Colorimetry
Formats and program YCBCR 4:2:0 to MSA and VSC SDP. And it changes a link
bandwidth computation for DP.

These patches tested on below test environment.
Test Environment: 
 - Tested System: Gen11 platform
 - Monitor: Wasabi Mango UHD430 REAL4K HDMI 2.0 Slim HDR DUAL DP i20
(This monitor supports HDMI YCbCr 4:2:0)
 - DP to HDMI Adaptor (Dongle) : Club3D CAC-1080 (This dongle supports
 DP YCbCr 4:2:0 pass through feature.)
 - To force-enable DP YCbCr 4:2:0, the test environment uses workaround
   patches. You can find these on
   https://gitlab.freedesktop.org/elongbug/drm-tip/tree/dp_ycbcr420_work

The idea of a scaling (RGB -> YCbCr4:4:4 -> YCbCr 4:2:0) is to follow the
same approach used in YCbCr 4:2:0 on HDMI.

v2: Addressed Maarten's review comments, fixed minor coding and block comment
style. And reordered a first patch  ("drm/i915/dp: Support DP ports
YUV 4:2:0 output to GEN11") as a last patch.

v3: Addressed Ville's review comments.
Fixed style along with some naming.
If lspcon is active, do not call intel_dp_ycbcr420_config(), to avoid
clobbering the lspcon_ycbcr420_config() routine.
Move the 420_only check into intel_dp_ycbcr420_config().
Remove the changing of pipe_bpp in intel_ddi_set_pipe_settings().
Because the pipe is running at the full bpp, keep pipe_bpp as RGB even though
YCbCr 4:2:0 output format is used.
Add a link bandwidth computation for YCbCr4:2:0 output format.

v4: Fix uninitialized return value which is reported by Dan Carpenter.

v5: Addressed review comments from Ville.
To keep the code simple, add and use an intel_dp_output_bpp() function.
To avoid extra indentation, invert the if-clause in
intel_dp_ycbcr420_config().
Remove the error print where no error prints are allowed.

v6:
Link M/N values are calculated and applied based on the Full Clock for
YCbCr420. The Bit per Pixel needs to be adjusted for YUV420 mode as it
requires only half of the RGB case.
 - Link M/N values are calculated and applied based on the Full Clock
 - Data M/N values needs to be calculated considering the data is half due
   to subsampling
Remove a doubling of pixel clock on a dot clock calculator for DP YCbCr 4:2:0.
Rebase and remove a duplicate setting of vsc_sdp.DB17.
Add a setting of dynamic range bit to  vsc_sdp.DB17.
Change Content Type bit to "Graphics" from "Not defined".
Change the division of pipe_bpp to multiplication by constant values in a
switch-case statement.
Fix a wrong setting of MSA MISC1 fields for Pixel Encoding/Colorimetry
Format indication. As per DP 1.4a spec Table 2-96 [MSA MISC1 and MISC0
Fields for Pixel Encoding/Colorimetry Format Indication], when MISC1, bit 6,
is set to 1, a Source device uses a VSC SDP to indicate the Pixel
Encoding/Colorimetry Format. The wrong version set bit 5 of MISC1; now
bit 6 of MISC1 is set.

References: https://patchwork.freedesktop.org/series/56059/

Gwan-gyeong Mun (6):
  drm/i915/dp: Add a config function for YCBCR420 outputs
  drm: Add a VSC structure for handling Pixel Encoding/Colorimetry
Formats
  drm/i915/dp: Program VSC Header and DB for Pixel Encoding/Colorimetry
Format
  drm/i915/dp: Add a support of YCBCR 4:2:0 to DP MSA
  drm/i915/dp: Change a link bandwidth computation for DP
  drm/i915/dp: Support DP ports YUV 4:2:0 output to GEN11

 drivers/gpu/drm/i915/i915_reg.h  |   1 +
 drivers/gpu/drm/i915/intel_ddi.c |  12 ++-
 drivers/gpu/drm/i915/intel_dp.c  | 149 ++-
 drivers/gpu/drm/i915/intel_drv.h |   2 +
 include/drm/drm_dp_helper.h  |  17 
 5 files changed, 178 insertions(+), 3 deletions(-)

-- 
2.21.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH v6 3/6] drm/i915/dp: Program VSC Header and DB for Pixel Encoding/Colorimetry Format

2019-05-08 Thread Gwan-gyeong Mun
The function intel_pixel_encoding_setup_vsc handles the VSC header and data
block setup for the pixel encoding / colorimetry format.

It sets up the VSC header and data block per the DP 1.4a spec,
section 2.2.5.7.1, Table 2-119 (VSC SDP Header Bytes), and section 2.2.5.7.5,
Table 2-120 (VSC SDP Payload for DB16 through DB18).

v2:
  Minor style fix. [Maarten]
  Refer to commit ids instead of patchwork. [Maarten]

v6: Rebase

Cc: Maarten Lankhorst 
Signed-off-by: Gwan-gyeong Mun 
---
 drivers/gpu/drm/i915/intel_ddi.c |  1 +
 drivers/gpu/drm/i915/intel_dp.c  | 73 
 drivers/gpu/drm/i915/intel_drv.h |  2 +
 3 files changed, 76 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index cd5277d98b03..2f1688ea5a2c 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -3391,6 +3391,7 @@ static void intel_enable_ddi_dp(struct intel_encoder *encoder,
 
intel_edp_backlight_on(crtc_state, conn_state);
intel_psr_enable(intel_dp, crtc_state);
+   intel_dp_ycbcr_420_enable(intel_dp, crtc_state);
intel_edp_drrs_enable(intel_dp, crtc_state);
 
if (crtc_state->has_audio)
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 06a3417a88d1..74aad8830a80 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -4394,6 +4394,79 @@ u8 intel_dp_dsc_get_slice_count(struct intel_dp *intel_dp,
return 0;
 }
 
+static void
+intel_pixel_encoding_setup_vsc(struct intel_dp *intel_dp,
+  const struct intel_crtc_state *crtc_state)
+{
+   struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
+   struct dp_vsc_sdp vsc_sdp;
+
+   if (!intel_dp->attached_connector->base.ycbcr_420_allowed)
+   return;
+
+   /* Prepare VSC Header for SU as per DP 1.4a spec, Table 2-119 */
+   memset(&vsc_sdp, 0, sizeof(vsc_sdp));
+   vsc_sdp.sdp_header.HB0 = 0;
+   vsc_sdp.sdp_header.HB1 = 0x7;
+
+   /* VSC SDP supporting 3D stereo, PSR2, and Pixel Encoding/
+* Colorimetry Format indication. A DP Source device is allowed
+* to indicate the pixel encoding/colorimetry format to the DP Sink
+* device with VSC SDP only when the DP Sink device supports it
+* (i.e., VSC_SDP_EXTENSION_FOR_COLORIMETRY_SUPPORTED bit in the register
+* DPRX_FEATURE_ENUMERATION_LIST (DPCD Address 02210h, bit 3) is set to 1)
+*/
+   vsc_sdp.sdp_header.HB2 = 0x5;
+
+   /* VSC SDP supporting 3D stereo, + PSR2, + Pixel Encoding/
+* Colorimetry Format indication (HB2 = 05h).
+*/
+   vsc_sdp.sdp_header.HB3 = 0x13;
+   /* YCbCr 420 = 3h DB16[7:4] ITU-R BT.601 = 0h, ITU-R BT.709 = 1h
+* DB16[3:0] DP 1.4a spec, Table 2-120
+*/
+
+   /* Commit id (25edf91501b8 "drm/i915: prepare csc unit for YCBCR420 output")
+* uses the BT.709 color space to perform RGB->YCBCR conversion.
+*/
+   vsc_sdp.DB16 = 0x3 << 4; /* 0x3 << 4 , YCbCr 420*/
+   vsc_sdp.DB16 |= 0x1; /* 0x1, ITU-R BT.709 */
+
+   /* For pixel encoding formats YCbCr444, YCbCr422, YCbCr420, and Y Only,
+* the following Component Bit Depth values are defined:
+* 001b = 8bpc.
+* 010b = 10bpc.
+* 011b = 12bpc.
+* 100b = 16bpc.
+*/
+   vsc_sdp.DB17 = 0x1;
+
+   /*
+* Content Type (Bits 2:0)
+* 000b = Not defined.
+* 001b = Graphics.
+* 010b = Photo.
+* 011b = Video.
+* 100b = Game
+* All other values are RESERVED.
+* Note: See CTA-861-G for the definition and expected
+* processing by a stream sink for the above content types.
+*/
+   vsc_sdp.DB18 = 0;
+
+   intel_dig_port->write_infoframe(&intel_dig_port->base,
+   crtc_state, DP_SDP_VSC, &vsc_sdp, sizeof(vsc_sdp));
+}
+
+void intel_dp_ycbcr_420_enable(struct intel_dp *intel_dp,
+  const struct intel_crtc_state *crtc_state)
+{
+   if (crtc_state->output_format != INTEL_OUTPUT_FORMAT_YCBCR420)
+   return;
+
+   intel_pixel_encoding_setup_vsc(intel_dp, crtc_state);
+}
+
 static u8 intel_dp_autotest_link_training(struct intel_dp *intel_dp)
 {
int status = 0;
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 247893ed1543..5d1845526cf8 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1576,6 +1576,8 @@ void intel_dp_get_m_n(struct intel_crtc *crtc,
  struct intel_crtc_state *pipe_config);
 void intel_dp_set_m_n(const struct intel_crtc_state *crtc_state,
  enum link_m_n_set m_n);
+void intel_dp_ycbcr_420_enable(struct intel_dp *intel_dp,
+  const struct intel_crtc_state 

[Intel-gfx] [PATCH v6 1/6] drm/i915/dp: Add a config function for YCBCR420 outputs

2019-05-08 Thread Gwan-gyeong Mun
This patch checks support for YCBCR420 outputs at the encoder level.
If the input mode is a YCBCR420-only mode, it prepares DP for YCbCr 4:2:0
output; otherwise it continues with the RGB output mode.
It sets output_format to INTEL_OUTPUT_FORMAT_YCBCR420 in order to use a
pipe scaler for the RGB to YCbCr 4:4:4 conversion.

v2:
  Addressed review comments from Ville.
  Fixed style issues, including some naming:
  %s/config/crtc_state/
  %s/intel_crtc/crtc/
  When lspcon is active, do not call intel_dp_ycbcr420_config(), so as
  not to clobber the lspcon_ycbcr420_config() routine.
  Moved the 420_only check into intel_dp_ycbcr420_config().

v3: Fixed an uninitialized return value reported by Dan Carpenter.

v4:
  Addressed review comments from Ville.
  To avoid extra indentation, inverted the if-clause in
  intel_dp_ycbcr420_config().
  Removed the error print where no error prints are allowed.

v6: Rebase

Cc: Ville Syrjälä 
Signed-off-by: Gwan-gyeong Mun 
Reviewed-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/intel_dp.c | 35 -
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 53cc4afea256..06a3417a88d1 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -2085,6 +2085,34 @@ intel_dp_compute_link_config(struct intel_encoder *encoder,
return 0;
 }
 
+static int
+intel_dp_ycbcr420_config(struct drm_connector *connector,
+struct intel_crtc_state *crtc_state)
+{
+   const struct drm_display_info *info = &connector->display_info;
+   const struct drm_display_mode *adjusted_mode =
+   &crtc_state->base.adjusted_mode;
+   struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+   int ret;
+
+   if (!drm_mode_is_420_only(info, adjusted_mode) ||
+   !connector->ycbcr_420_allowed)
+   return 0;
+
+   crtc_state->output_format = INTEL_OUTPUT_FORMAT_YCBCR420;
+
+   /* YCBCR 420 output conversion needs a scaler */
+   ret = skl_update_scaler_crtc(crtc_state);
+   if (ret) {
+   DRM_DEBUG_KMS("Scaler allocation for output failed\n");
+   return ret;
+   }
+
+   intel_pch_panel_fitting(crtc, crtc_state, DRM_MODE_SCALE_FULLSCREEN);
+
+   return 0;
+}
+
 bool intel_dp_limited_color_range(const struct intel_crtc_state *crtc_state,
  const struct drm_connector_state *conn_state)
 {
@@ -2124,7 +2152,7 @@ intel_dp_compute_config(struct intel_encoder *encoder,
to_intel_digital_connector_state(conn_state);
bool constant_n = drm_dp_has_quirk(&intel_dp->desc,
   DP_DPCD_QUIRK_CONSTANT_N);
-   int ret, output_bpp;
+   int ret = 0, output_bpp;
 
if (HAS_PCH_SPLIT(dev_priv) && !HAS_DDI(dev_priv) && port != PORT_A)
pipe_config->has_pch_encoder = true;
@@ -2132,6 +2160,11 @@ intel_dp_compute_config(struct intel_encoder *encoder,
pipe_config->output_format = INTEL_OUTPUT_FORMAT_RGB;
if (lspcon->active)
lspcon_ycbcr420_config(&intel_connector->base, pipe_config);
+   else
+   ret = intel_dp_ycbcr420_config(&intel_connector->base, pipe_config);
+
+   if (ret)
+   return ret;
 
pipe_config->has_drrs = false;
if (IS_G4X(dev_priv) || port == PORT_A)
-- 
2.21.0


[Intel-gfx] [PATCH v6 5/6] drm/i915/dp: Change a link bandwidth computation for DP

2019-05-08 Thread Gwan-gyeong Mun
Data M/N calculations assumed a bpp for the RGB format. But when we are
using the YCbCr 4:2:0 output format on DP, we should change the bpp
calculation to the YCbCr 4:2:0 format. The pipe_bpp value assumed the RGB
format and was therefore multiplied by 3, but YCbCr 4:2:0 requires a
multiplier of 1.5. Therefore we need to divide pipe_bpp by 2 while the DP
output uses the YCbCr 4:2:0 format.
 - RGB format bpp = bpc x 3
 - YCbCr 4:2:0 format bpp = bpc x 1.5

But Link M/N values are calculated and applied based on the full clock for
YCbCr 4:2:0. And DP YCbCr 4:2:0 does not need the pixel clock doubled for
the dot clock calculation; only HDMI YCbCr 4:2:0 needs the pixel clock
doubled for the dot clock calculation.

It also adds missing bpc values for the programming of the VSC header.
This only affects DP and eDP ports that use the YCbCr 4:2:0 output format.
For now, it does not consider the use case of DSC + YCbCr 4:2:0.

v2:
  Addressed review comments from Ville.
  Removed the change of pipe_bpp in intel_ddi_set_pipe_settings():
  because the pipe runs at the full bpp, keep pipe_bpp as for RGB
  even though the YCbCr 4:2:0 output format is used.
  Added a link bandwidth computation for the YCbCr 4:2:0 output format.

v3:
  Addressed review comments from Ville.
  To keep the code simple, added and used the intel_dp_output_bpp()
  function.

v6:
  Link M/N values are calculated and applied based on the full clock for
  YCbCr 4:2:0. The bits per pixel need to be adjusted for YUV420 mode, as
  it requires only half of the RGB case:
- Link M/N values are calculated and applied based on the full clock.
- Data M/N values need to be calculated considering that the data is
  halved due to subsampling.
  Removed the doubling of the pixel clock in the dot clock calculation
  for DP YCbCr 4:2:0.
  Rebased and removed a duplicate setting of vsc_sdp.DB17.
  Added the dynamic range bit to vsc_sdp.DB17.
  Changed the Content Type bit from "Not defined" to "Graphics".
  Changed the division of pipe_bpp to a multiplication by constant values
  in the switch-case statement.

Cc: Ville Syrjälä 
Signed-off-by: Gwan-gyeong Mun 
---
 drivers/gpu/drm/i915/intel_ddi.c |  3 ++-
 drivers/gpu/drm/i915/intel_dp.c  | 42 +---
 2 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index 4441c5ba71fb..e22a0898b957 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -1457,7 +1457,8 @@ static void ddi_dotclock_get(struct intel_crtc_state *pipe_config)
else
dotclock = pipe_config->port_clock;
 
-   if (pipe_config->output_format == INTEL_OUTPUT_FORMAT_YCBCR420)
+   if (pipe_config->output_format == INTEL_OUTPUT_FORMAT_YCBCR420 &&
+   !intel_crtc_has_dp_encoder(pipe_config))
dotclock *= 2;
 
if (pipe_config->pixel_multiplier)
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 74aad8830a80..c75e2bbe612a 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -1842,6 +1842,19 @@ intel_dp_adjust_compliance_config(struct intel_dp *intel_dp,
}
 }
 
+static int intel_dp_output_bpp(const struct intel_crtc_state *crtc_state, int bpp)
+{
+   /*
+* The bpp value was assumed for the RGB format. For the YCbCr 4:2:0
+* output format, the number of bits per pixel is half that of RGB.
+*/
+   if (crtc_state->output_format == INTEL_OUTPUT_FORMAT_YCBCR420)
+   bpp /= 2;
+
+   return bpp;
+}
+
 /* Optimize link config in order: max bpp, min clock, min lanes */
 static int
 intel_dp_compute_link_config_wide(struct intel_dp *intel_dp,
@@ -2212,7 +2225,7 @@ intel_dp_compute_config(struct intel_encoder *encoder,
if (pipe_config->dsc_params.compression_enable)
output_bpp = pipe_config->dsc_params.compressed_bpp;
else
-   output_bpp = pipe_config->pipe_bpp;
+   output_bpp = intel_dp_output_bpp(pipe_config, pipe_config->pipe_bpp);
 
intel_link_compute_m_n(output_bpp,
   pipe_config->lane_count,
@@ -4439,7 +4452,30 @@ intel_pixel_encoding_setup_vsc(struct intel_dp *intel_dp,
 * 011b = 12bpc.
 * 100b = 16bpc.
 */
-   vsc_sdp.DB17 = 0x1;
+   switch (crtc_state->pipe_bpp) {
+   case 24: /* 8bpc */
+   vsc_sdp.DB17 = 0x1;
+   break;
+   case 30: /* 10bpc */
+   vsc_sdp.DB17 = 0x2;
+   break;
+   case 36: /* 12bpc */
+   vsc_sdp.DB17 = 0x3;
+   break;
+   case 48: /* 16bpc */
+   vsc_sdp.DB17 = 0x4;
+   break;
+   default:
+   DRM_DEBUG_KMS("Invalid bpp value '%d'\n", crtc_state->pipe_bpp);
+   break;
+   }
+
+   /*
+* Dynamic Range (Bit 7)
+* 0 = VESA range, 1 = CTA range.
+* all YCbCr ar

[Intel-gfx] [PATCH v6 2/6] drm: Add a VSC structure for handling Pixel Encoding/Colorimetry Formats

2019-05-08 Thread Gwan-gyeong Mun
SDP VSC Header and Data Block follow DP 1.4a spec, section 2.2.5.7.5,
chapter "VSC SDP Payload for Pixel Encoding/Colorimetry Format".

Signed-off-by: Gwan-gyeong Mun 
Reviewed-by: Maarten Lankhorst 
---
 include/drm/drm_dp_helper.h | 17 +
 1 file changed, 17 insertions(+)

diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
index 97ce790a5b5a..3793bea7b7fe 100644
--- a/include/drm/drm_dp_helper.h
+++ b/include/drm/drm_dp_helper.h
@@ -1096,6 +1096,23 @@ struct edp_vsc_psr {
u8 DB8_31[24]; /* Reserved */
 } __packed;
 
+struct dp_vsc_sdp {
+   struct dp_sdp_header sdp_header;
+   u8 DB0; /* Stereo Interface */
+   u8 DB1; /* 0 - PSR State; 1 - Update RFB; 2 - CRC Valid */
+   u8 DB2; /* CRC value bits 7:0 of the R or Cr component */
+   u8 DB3; /* CRC value bits 15:8 of the R or Cr component */
+   u8 DB4; /* CRC value bits 7:0 of the G or Y component */
+   u8 DB5; /* CRC value bits 15:8 of the G or Y component */
+   u8 DB6; /* CRC value bits 7:0 of the B or Cb component */
+   u8 DB7; /* CRC value bits 15:8 of the B or Cb component */
+   u8 DB8_15[8];  /* Reserved */
+   u8 DB16; /* Pixel Encoding and Colorimetry Formats */
+   u8 DB17; /* Dynamic Range and Component Bit Depth */
+   u8 DB18; /* Content Type */
+   u8 DB19_31[13]; /* Reserved */
+} __packed;
+
 #define EDP_VSC_PSR_STATE_ACTIVE   (1<<0)
 #define EDP_VSC_PSR_UPDATE_RFB (1<<1)
 #define EDP_VSC_PSR_CRC_VALUES_VALID   (1<<2)
-- 
2.21.0


[Intel-gfx] [PATCH v6 6/6] drm/i915/dp: Support DP ports YUV 4:2:0 output to GEN11

2019-05-08 Thread Gwan-gyeong Mun
Bspec describes that GEN10 only supports YUV 4:2:0 output to the HDMI port,
while GEN11 supports YUV 4:2:0 output to both DP and HDMI ports.

v2: Minor style fix.

Signed-off-by: Gwan-gyeong Mun 
Reviewed-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/intel_dp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index c75e2bbe612a..9b3724cd37cd 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -7378,6 +7378,9 @@ intel_dp_init_connector(struct intel_digital_port *intel_dig_port,
connector->interlace_allowed = true;
connector->doublescan_allowed = 0;
 
+   if (INTEL_GEN(dev_priv) >= 11)
+   connector->ycbcr_420_allowed = true;
+
intel_encoder->hpd_pin = intel_hpd_pin_default(dev_priv, port);
 
intel_dp_aux_init(intel_dp);
-- 
2.21.0


Re: [Intel-gfx] [PATCH] RFC: x86/smp: use printk_deferred in native_smp_send_reschedule

2019-05-08 Thread Daniel Vetter
On Wed, May 8, 2019 at 9:44 AM Sergey Senozhatsky
 wrote:
>
> On (05/07/19 19:33), Daniel Vetter wrote:
> [..]
> > - make the console_trylock trylock also the spinlock. This works in
> >   the limited case of the console_lock use-case, but doesn't fix the
> >   same semaphore.lock acquisition in the up() path in console_unlock,
> >   which we can't avoid with a trylock.
> >
> > - move the wake_up_process in up() out from under the semaphore.lock
> >   spinlock critical section. Again this works for the limited case of
> >   the console_lock, and does fully break the cycle for this lock.
> >   Unfortunately there's still plenty of scheduler related locks that
> >   wake_up_process needs, so the loop is still there, just with a few
> >   less locks involved.
> >
> > Hence now third attempt, trying to fix this by using printk_deferred()
> > instead of the normal printk that WARN() uses.
> > native_smp_send_reschedule is only called from scheduler related code,
> > which has to use printk_deferred due to this locking recursion, so
> > this seems consistent.
> >
> > It has the unfortunate downside that we're losing the backtrace though
> > (I didn't find a printk_deferred version of WARN, and I'm not sure
> > it's a bright idea to dump that much using printk_deferred.)
>
> I'm catching up with the emails now (was offline for almost 2 weeks),
> so I haven't seen [yet] all of the previous patches/discussions.
>
> [..]
> >  static void native_smp_send_reschedule(int cpu)
> >  {
> >   if (unlikely(cpu_is_offline(cpu))) {
> > - WARN(1, "sched: Unexpected reschedule of offline CPU#%d!\n", cpu);
> > + printk_deferred(KERN_WARNING
> > + "sched: Unexpected reschedule of offline CPU#%d!\n", cpu);
> >   return;
> >   }
> >   apic->send_IPI(cpu, RESCHEDULE_VECTOR);
>
> Hmm,
> One thing to notice here is that the CPU in question is offline-ed,
> and printk_deferred() is a per-CPU type of deferred printk(). So the
> following thing
>
> __this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
> irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
>
> might not print anything at all. In this particular case we always
> need another CPU to do console_unlock(), since this_cpu() is not
> really expected to do wake_up_klogd_work_func()->console_unlock().

Hm right, I was happy enough when Petr pointed out the printk_deferred
infrastructure that I didn't look too deeply into how it works. From a
quick loo




--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

[Intel-gfx] [PATCH 12/40] drm/i915: Re-expose SINGLE_TIMELINE flags for context creation

2019-05-08 Thread Chris Wilson
The SINGLE_TIMELINE flag can be used to create a context such that all
engine instances within that context share a common timeline. This can
be useful for mixing operations between real and virtual engines, or
when using a composite context for a single client API context.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem_context.c | 4 
 include/uapi/drm/i915_drm.h | 3 ++-
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 5fdb44714a5c..9cd671298daf 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -96,8 +96,6 @@
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 
-#define I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE (1 << 1)
-
 #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
 
 static struct i915_global_gem_context {
@@ -505,8 +503,6 @@ i915_gem_create_context(struct drm_i915_private *dev_priv, unsigned int flags)
 
lockdep_assert_held(&dev_priv->drm.struct_mutex);
 
-   BUILD_BUG_ON(I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &
-~I915_CONTEXT_CREATE_FLAGS_UNKNOWN);
if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
!HAS_EXECLISTS(dev_priv))
return ERR_PTR(-EINVAL);
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 82bd488ed0d1..957ba8e60e02 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1469,8 +1469,9 @@ struct drm_i915_gem_context_create_ext {
__u32 ctx_id; /* output: id of new context*/
__u32 flags;
 #define I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS   (1u << 0)
+#define I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE  (1u << 1)
 #define I915_CONTEXT_CREATE_FLAGS_UNKNOWN \
-   (-(I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS << 1))
+   (-(I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE << 1))
__u64 extensions;
 };
 
-- 
2.20.1


[Intel-gfx] [PATCH 29/40] drm/i915: Move GEM object waiting to its own file

2019-05-08 Thread Chris Wilson
Continuing the decluttering of i915_gem.c by moving the object wait
decomposition into its own file.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/Makefile  |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h |   8 +
 drivers/gpu/drm/i915/gem/i915_gem_wait.c   | 277 +
 drivers/gpu/drm/i915/i915_drv.h|   7 -
 drivers/gpu/drm/i915/i915_gem.c| 254 ---
 drivers/gpu/drm/i915/i915_utils.h  |  10 -
 6 files changed, 286 insertions(+), 271 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_wait.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index e5348c355987..a4cc2f7f9bc6 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -105,6 +105,7 @@ gem-y += \
gem/i915_gem_stolen.o \
gem/i915_gem_tiling.o \
gem/i915_gem_userptr.o \
+   gem/i915_gem_wait.o \
gem/i915_gemfs.o
 i915-y += \
  $(gem-y) \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 509d145d808a..23bca003fbfb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -436,4 +436,12 @@ static inline void __start_cpu_write(struct drm_i915_gem_object *obj)
obj->cache_dirty = true;
 }
 
+int i915_gem_object_wait(struct drm_i915_gem_object *obj,
+unsigned int flags,
+long timeout);
+int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
+ unsigned int flags,
+ const struct i915_sched_attr *attr);
+#define I915_PRIORITY_DISPLAY I915_USER_PRIORITY(I915_PRIORITY_MAX)
+
 #endif
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
new file mode 100644
index ..fed5c751ef37
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
@@ -0,0 +1,277 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2016 Intel Corporation
+ */
+
+#include 
+#include 
+
+#include "gt/intel_engine.h"
+
+#include "i915_gem_ioctls.h"
+#include "i915_gem_object.h"
+
+static long
+i915_gem_object_wait_fence(struct dma_fence *fence,
+  unsigned int flags,
+  long timeout)
+{
+   BUILD_BUG_ON(I915_WAIT_INTERRUPTIBLE != 0x1);
+
+   if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
+   return timeout;
+
+   if (dma_fence_is_i915(fence))
+   return i915_request_wait(to_request(fence), flags, timeout);
+
+   return dma_fence_wait_timeout(fence,
+ flags & I915_WAIT_INTERRUPTIBLE,
+ timeout);
+}
+
+static long
+i915_gem_object_wait_reservation(struct reservation_object *resv,
+unsigned int flags,
+long timeout)
+{
+   unsigned int seq = __read_seqcount_begin(&resv->seq);
+   struct dma_fence *excl;
+   bool prune_fences = false;
+
+   if (flags & I915_WAIT_ALL) {
+   struct dma_fence **shared;
+   unsigned int count, i;
+   int ret;
+
+   ret = reservation_object_get_fences_rcu(resv,
+   &excl, &count, &shared);
+   if (ret)
+   return ret;
+
+   for (i = 0; i < count; i++) {
+   timeout = i915_gem_object_wait_fence(shared[i],
+flags, timeout);
+   if (timeout < 0)
+   break;
+
+   dma_fence_put(shared[i]);
+   }
+
+   for (; i < count; i++)
+   dma_fence_put(shared[i]);
+   kfree(shared);
+
+   /*
+* If both shared fences and an exclusive fence exist,
+* then by construction the shared fences must be later
+* than the exclusive fence. If we successfully wait for
+* all the shared fences, we know that the exclusive fence
+* must all be signaled. If all the shared fences are
+* signaled, we can prune the array and recover the
+* floating references on the fences/requests.
+*/
+   prune_fences = count && timeout >= 0;
+   } else {
+   excl = reservation_object_get_excl_rcu(resv);
+   }
+
+   if (excl && timeout >= 0)
+   timeout = i915_gem_object_wait_fence(excl, flags, timeout);
+
+   dma_fence_put(excl);
+
+   /*
+* Opportunistically prune the fences iff we know they have *all* been
+* signaled and that the reservation object has not been changed (i.e.
+* no new fences have been added)

[Intel-gfx] [PATCH 30/40] drm/i915: Move GEM object busy checking to its own file

2019-05-08 Thread Chris Wilson
Continuing the decluttering of i915_gem.c by moving the object busy
checking into its own file.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/Makefile|   1 +
 drivers/gpu/drm/i915/gem/i915_gem_busy.c | 138 +++
 drivers/gpu/drm/i915/i915_gem.c  | 128 -
 3 files changed, 139 insertions(+), 128 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_busy.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index a4cc2f7f9bc6..865d7b51c297 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -88,6 +88,7 @@ i915-y += $(gt-y)
 # GEM (Graphics Execution Management) code
 obj-y += gem/
 gem-y += \
+   gem/i915_gem_busy.o \
gem/i915_gem_clflush.o \
gem/i915_gem_context.o \
gem/i915_gem_dmabuf.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
new file mode 100644
index ..5a5eda3003e9
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
@@ -0,0 +1,138 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2014-2016 Intel Corporation
+ */
+
+#include "gt/intel_engine.h"
+
+#include "i915_gem_ioctls.h"
+#include "i915_gem_object.h"
+
+static __always_inline u32 __busy_read_flag(u8 id)
+{
+   if (id == (u8)I915_ENGINE_CLASS_INVALID)
+   return 0xffffffffu;
+
+   GEM_BUG_ON(id >= 16);
+   return 0x1u << id;
+}
+
+static __always_inline u32 __busy_write_id(u8 id)
+{
+   /*
+* The uABI guarantees an active writer is also amongst the read
+* engines. This would be true if we accessed the activity tracking
+* under the lock, but as we perform the lookup of the object and
+* its activity locklessly we can not guarantee that the last_write
+* being active implies that we have set the same engine flag from
+* last_read - hence we always set both read and write busy for
+* last_write.
+*/
+   if (id == (u8)I915_ENGINE_CLASS_INVALID)
+   return 0xffffffffu;
+
+   return (id + 1) | __busy_read_flag(id);
+}
+
+static __always_inline unsigned int
+__busy_set_if_active(const struct dma_fence *fence, u32 (*flag)(u8 id))
+{
+   const struct i915_request *rq;
+
+   /*
+* We have to check the current hw status of the fence as the uABI
+* guarantees forward progress. We could rely on the idle worker
+* to eventually flush us, but to minimise latency just ask the
+* hardware.
+*
+* Note we only report on the status of native fences.
+*/
+   if (!dma_fence_is_i915(fence))
+   return 0;
+
+   /* opencode to_request() in order to avoid const warnings */
+   rq = container_of(fence, const struct i915_request, fence);
+   if (i915_request_completed(rq))
+   return 0;
+
+   /* Beware type-expansion follies! */
+   BUILD_BUG_ON(!typecheck(u8, rq->engine->uabi_class));
+   return flag(rq->engine->uabi_class);
+}
+
+static __always_inline unsigned int
+busy_check_reader(const struct dma_fence *fence)
+{
+   return __busy_set_if_active(fence, __busy_read_flag);
+}
+
+static __always_inline unsigned int
+busy_check_writer(const struct dma_fence *fence)
+{
+   if (!fence)
+   return 0;
+
+   return __busy_set_if_active(fence, __busy_write_id);
+}
+
+int
+i915_gem_busy_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *file)
+{
+   struct drm_i915_gem_busy *args = data;
+   struct drm_i915_gem_object *obj;
+   struct reservation_object_list *list;
+   unsigned int seq;
+   int err;
+
+   err = -ENOENT;
+   rcu_read_lock();
+   obj = i915_gem_object_lookup_rcu(file, args->handle);
+   if (!obj)
+   goto out;
+
+   /*
+* A discrepancy here is that we do not report the status of
+* non-i915 fences, i.e. even though we may report the object as idle,
+* a call to set-domain may still stall waiting for foreign rendering.
+* This also means that wait-ioctl may report an object as busy,
+* where busy-ioctl considers it idle.
+*
+* We trade the ability to warn of foreign fences to report on which
+* i915 engines are active for the object.
+*
+* Alternatively, we can trade that extra information on read/write
+* activity with
+*  args->busy =
+*  !reservation_object_test_signaled_rcu(obj->resv, true);
+* to report the overall busyness. This is what the wait-ioctl does.
+*
+*/
+retry:
+   seq = raw_read_seqcount(&obj->resv->seq);
+
+   /* Translate the exclusive fence to the READ *and* WRITE engine */
+   args->busy = busy_check_writer(rcu_dereference(obj->resv->fence_excl));
+
+   /* Translate shared fences to READ set of engines */
+  

[Intel-gfx] [PATCH 31/40] drm/i915: Move GEM client throttling to its own file

2019-05-08 Thread Chris Wilson
Continuing the decluttering of i915_gem.c by moving the client self
throttling into its own file.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/Makefile|  1 +
 drivers/gpu/drm/i915/gem/i915_gem_throttle.c | 74 
 drivers/gpu/drm/i915/i915_drv.h  |  6 --
 drivers/gpu/drm/i915/i915_gem.c  | 58 ---
 4 files changed, 75 insertions(+), 64 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_throttle.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 865d7b51c297..78d36eaff070 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -104,6 +104,7 @@ gem-y += \
gem/i915_gem_shmem.o \
gem/i915_gem_shrinker.o \
gem/i915_gem_stolen.o \
+   gem/i915_gem_throttle.o \
gem/i915_gem_tiling.o \
gem/i915_gem_userptr.o \
gem/i915_gem_wait.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_throttle.c b/drivers/gpu/drm/i915/gem/i915_gem_throttle.c
new file mode 100644
index ..491bc28c175d
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_throttle.c
@@ -0,0 +1,74 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2014-2016 Intel Corporation
+ */
+
+#include 
+
+#include 
+
+#include "i915_gem_ioctls.h"
+#include "i915_gem_object.h"
+
+#include "../i915_drv.h"
+
+/*
+ * 20ms is a fairly arbitrary limit (greater than the average frame time)
+ * chosen to prevent the CPU getting more than a frame ahead of the GPU
+ * (when using lax throttling for the frontbuffer). We also use it to
+ * offer free GPU waitboosts for severely congested workloads.
+ */
+#define DRM_I915_THROTTLE_JIFFIES msecs_to_jiffies(20)
+
+/*
+ * Throttle our rendering by waiting until the ring has completed our requests
+ * emitted over 20 msec ago.
+ *
+ * Note that if we were to use the current jiffies each time around the loop,
+ * we wouldn't escape the function with any frames outstanding if the time to
+ * render a frame was over 20ms.
+ *
+ * This should get us reasonable parallelism between CPU and GPU but also
+ * relatively low latency when blocking on a particular request to finish.
+ */
+int
+i915_gem_throttle_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *file)
+{
+   struct drm_i915_file_private *file_priv = file->driver_priv;
+   unsigned long recent_enough = jiffies - DRM_I915_THROTTLE_JIFFIES;
+   struct i915_request *request, *target = NULL;
+   long ret;
+
+   /* ABI: return -EIO if already wedged */
+   ret = i915_terminally_wedged(to_i915(dev));
+   if (ret)
+   return ret;
+
+   spin_lock(&file_priv->mm.lock);
+   list_for_each_entry(request, &file_priv->mm.request_list, client_link) {
+   if (time_after_eq(request->emitted_jiffies, recent_enough))
+   break;
+
+   if (target) {
+   list_del(&target->client_link);
+   target->file_priv = NULL;
+   }
+
+   target = request;
+   }
+   if (target)
+   i915_request_get(target);
+   spin_unlock(&file_priv->mm.lock);
+
+   if (!target)
+   return 0;
+
+   ret = i915_request_wait(target,
+   I915_WAIT_INTERRUPTIBLE,
+   MAX_SCHEDULE_TIMEOUT);
+   i915_request_put(target);
+
+   return ret < 0 ? ret : 0;
+}
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8eb01b1b3e0e..b372ac47aa1c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -212,12 +212,6 @@ struct drm_i915_file_private {
struct {
spinlock_t lock;
struct list_head request_list;
-/* 20ms is a fairly arbitrary limit (greater than the average frame time)
- * chosen to prevent the CPU getting more than a frame ahead of the GPU
- * (when using lax throttling for the frontbuffer). We also use it to
- * offer free GPU waitboosts for severely congested workloads.
- */
-#define DRM_I915_THROTTLE_JIFFIES msecs_to_jiffies(20)
} mm;
 
struct idr context_idr;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2f1e6dd78dc1..2a9e8ecf2926 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -988,57 +988,6 @@ int i915_gem_wait_for_idle(struct drm_i915_private *i915,
return 0;
 }
 
-/* Throttle our rendering by waiting until the ring has completed our requests
- * emitted over 20 msec ago.
- *
- * Note that if we were to use the current jiffies each time around the loop,
- * we wouldn't escape the function with any frames outstanding if the time to
- * render a frame was over 20ms.
- *
- * This should get us reasonable parallelism between CPU and GPU but also
- * relatively low latency when blocking on a particular request to finish.
- *

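[Editorial note] The throttle ioctl moved above walks the per-file request list
(oldest first) to find the newest request emitted more than 20 ms ago, then
waits on it. A minimal userspace sketch of just that selection loop — the
`struct request` and `throttle_target()` names are illustrative, not the
kernel API:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for the kernel's per-file request list entry;
 * not the real struct i915_request. */
struct request {
	unsigned long emitted_jiffies;
};

/*
 * Walk requests in emission order (oldest first) and return the newest
 * request emitted before the cutoff -- the one the ioctl waits on.
 * Returns NULL when every request is more recent than the cutoff.
 */
static const struct request *
throttle_target(const struct request *reqs, size_t n,
		unsigned long recent_enough)
{
	const struct request *target = NULL;

	for (size_t i = 0; i < n; i++) {
		/* time_after_eq(emitted_jiffies, recent_enough) */
		if ((long)(reqs[i].emitted_jiffies - recent_enough) >= 0)
			break;
		target = &reqs[i];
	}
	return target;
}
```

Because the cutoff is computed once (not refreshed each iteration), the loop
always terminates with at most 20 ms of work outstanding, as the removed
comment explains.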
[Intel-gfx] [PATCH 36/40] drm/i915: Stop retiring along engine

2019-05-08 Thread Chris Wilson
We no longer track the execution order along the engine and so no longer
need to enforce ordering of retire along the engine.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_request.c | 128 +++-
 1 file changed, 52 insertions(+), 76 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index ae4c96ba02a9..c6fc0e8c3876 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -183,72 +183,23 @@ static void free_capture_list(struct i915_request 
*request)
}
 }
 
-static void __retire_engine_request(struct intel_engine_cs *engine,
-   struct i915_request *rq)
-{
-   GEM_TRACE("%s(%s) fence %llx:%lld, current %d\n",
- __func__, engine->name,
- rq->fence.context, rq->fence.seqno,
- hwsp_seqno(rq));
-
-   GEM_BUG_ON(!i915_request_completed(rq));
-
-   local_irq_disable();
-
-   spin_lock(&engine->timeline.lock);
-   GEM_BUG_ON(!list_is_first(&rq->link, &engine->timeline.requests));
-   list_del_init(&rq->link);
-   spin_unlock(&engine->timeline.lock);
-
-   spin_lock(&rq->lock);
-   i915_request_mark_complete(rq);
-   if (!i915_request_signaled(rq))
-   dma_fence_signal_locked(&rq->fence);
-   if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &rq->fence.flags))
-   i915_request_cancel_breadcrumb(rq);
-   if (rq->waitboost) {
-   GEM_BUG_ON(!atomic_read(&rq->i915->gt_pm.rps.num_waiters));
-   atomic_dec(&rq->i915->gt_pm.rps.num_waiters);
-   }
-   spin_unlock(&rq->lock);
-
-   local_irq_enable();
-}
-
-static void __retire_engine_upto(struct intel_engine_cs *engine,
-struct i915_request *rq)
-{
-   struct i915_request *tmp;
-
-   if (list_empty(&rq->link))
-   return;
-
-   do {
-   tmp = list_first_entry(&engine->timeline.requests,
-  typeof(*tmp), link);
-
-   GEM_BUG_ON(tmp->engine != engine);
-   __retire_engine_request(engine, tmp);
-   } while (tmp != rq);
-}
-
-static void i915_request_retire(struct i915_request *request)
+static bool i915_request_retire(struct i915_request *rq)
 {
struct i915_active_request *active, *next;
 
-   GEM_TRACE("%s fence %llx:%lld, current %d\n",
- request->engine->name,
- request->fence.context, request->fence.seqno,
- hwsp_seqno(request));
+   lockdep_assert_held(&rq->i915->drm.struct_mutex);
+   if (!i915_request_completed(rq))
+   return false;
 
-   lockdep_assert_held(&request->i915->drm.struct_mutex);
-   GEM_BUG_ON(!i915_sw_fence_signaled(&request->submit));
-   GEM_BUG_ON(!i915_request_completed(request));
+   GEM_TRACE("%s fence %llx:%lld, current %d\n",
+ rq->engine->name,
+ rq->fence.context, rq->fence.seqno,
+ hwsp_seqno(rq));
 
-   trace_i915_request_retire(request);
+   GEM_BUG_ON(!i915_sw_fence_signaled(&rq->submit));
+   trace_i915_request_retire(rq);
 
-   advance_ring(request);
-   free_capture_list(request);
+   advance_ring(rq);
 
/*
 * Walk through the active list, calling retire on each. This allows
@@ -260,7 +211,7 @@ static void i915_request_retire(struct i915_request 
*request)
 * pass along the auxiliary information (to avoid dereferencing
 * the node after the callback).
 */
-   list_for_each_entry_safe(active, next, &request->active_list, link) {
+   list_for_each_entry_safe(active, next, &rq->active_list, link) {
/*
 * In microbenchmarks or focusing upon time inside the kernel,
 * we may spend an inordinate amount of time simply handling
@@ -276,18 +227,39 @@ static void i915_request_retire(struct i915_request 
*request)
INIT_LIST_HEAD(&active->link);
RCU_INIT_POINTER(active->request, NULL);
 
-   active->retire(active, request);
+   active->retire(active, rq);
+   }
+
+   local_irq_disable();
+
+   spin_lock(&rq->engine->timeline.lock);
+   list_del(&rq->link);
+   spin_unlock(&rq->engine->timeline.lock);
+
+   spin_lock(&rq->lock);
+   i915_request_mark_complete(rq);
+   if (!i915_request_signaled(rq))
+   dma_fence_signal_locked(&rq->fence);
+   if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &rq->fence.flags))
+   i915_request_cancel_breadcrumb(rq);
+   if (rq->waitboost) {
+   GEM_BUG_ON(!atomic_read(&rq->i915->gt_pm.rps.num_waiters));
+   atomic_dec(&rq->i915->gt_pm.rps.num_waiters);
}
+   spin_unlock(&rq->lock);
+
+   local_irq_enable();
 
-   i915_request_remove_from_client(request);

[Intel-gfx] [PATCH 40/40] drm/i915/execlists: Minimalistic timeslicing

2019-05-08 Thread Chris Wilson
If we have multiple contexts of equal priority pending execution,
activate a timer to demote the currently executing context in favour of
the next in the queue when that timeslice expires. This enforces
fairness between contexts (so long as they allow preemption -- forced
preemption, in the future, will kick those who do not obey) and allows
us to avoid userspace blocking forward progress with e.g. unbounded
MI_SEMAPHORE_WAIT.

For the starting point here, we use one jiffy as our timeslice so that
we should be reasonably efficient wrt frequent CPU wakeups.

Testcase: igt/gem_exec_scheduler/semaphore-resolve
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_engine_types.h |   6 +
 drivers/gpu/drm/i915/gt/intel_lrc.c  | 111 +
 drivers/gpu/drm/i915/gt/selftest_lrc.c   | 223 +++
 drivers/gpu/drm/i915/i915_scheduler.c|   1 +
 drivers/gpu/drm/i915/i915_scheduler_types.h  |   1 +
 5 files changed, 342 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index de43bb62c24a..7f093f4b0af0 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "i915_gem.h"
@@ -137,6 +138,11 @@ struct intel_engine_execlists {
 */
struct tasklet_struct tasklet;
 
+   /**
+* @timer: kick the current context if its timeslice expires
+*/
+   struct timer_list timer;
+
/**
 * @default_priolist: priority list for I915_PRIORITY_NORMAL
 */
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index f08c2e223879..ea869f77a3ee 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -268,6 +268,7 @@ static int effective_prio(const struct i915_request *rq)
}
 
/* Restrict mere WAIT boosts from triggering preemption */
+   BUILD_BUG_ON(__NO_PREEMPTION & ~I915_PRIORITY_MASK); /* only internal */
return prio | __NO_PREEMPTION;
 }
 
@@ -852,6 +853,81 @@ last_active(const struct intel_engine_execlists *execlists)
return *last;
 }
 
+static void
+defer_request(struct i915_request * const rq, struct list_head * const pl)
+{
+   struct i915_dependency *p;
+
+   /*
+* We want to move the interrupted request to the back of
+* the round-robin list (i.e. its priority level), but
+* in doing so, we must then move all requests that were in
+* flight and were waiting for the interrupted request to
+* be run after it again.
+*/
+   list_move_tail(&rq->sched.link, pl);
+
+   list_for_each_entry(p, &rq->sched.waiters_list, wait_link) {
+   struct i915_request *w =
+   container_of(p->waiter, typeof(*w), sched);
+
+   if (!i915_sw_fence_done(&w->submit))
+   continue;
+
+   /* Leave semaphores spinning on the other engines */
+   if (w->engine != rq->engine)
+   continue;
+
+   /* No waiter should start before the active request completed */
+   GEM_BUG_ON(i915_request_started(w));
+
+   GEM_BUG_ON(rq_prio(w) > rq_prio(rq));
+   if (rq_prio(w) < rq_prio(rq))
+   continue;
+
+   /*
+* This should be very shallow as it is limited by the
+* number of requests that can fit in a ring (<64) and
+* the number of contexts that can be in flight on this
+* engine.
+*/
+   defer_request(w, pl);
+   }
+}
+
+static void defer_active(struct intel_engine_cs *engine)
+{
+   struct i915_request *rq;
+
+   rq = __unwind_incomplete_requests(engine, 0);
+   if (!rq)
+   return;
+
+   defer_request(rq, i915_sched_lookup_priolist(engine, rq_prio(rq)));
+}
+
+static bool
+need_timeslice(struct intel_engine_cs *engine, const struct i915_request *rq)
+{
+   int hint;
+
+   if (list_is_last(&rq->sched.link, &engine->active.requests))
+   return false;
+
+   hint = max(rq_prio(list_next_entry(rq, sched.link)),
+  engine->execlists.queue_priority_hint);
+
+   return hint >= rq_prio(rq);
+}
+
+static bool
+enable_timeslice(struct intel_engine_cs *engine)
+{
+   struct i915_request *last = last_active(&engine->execlists);
+
+   return last && need_timeslice(engine, last);
+}
+
 static bool execlists_dequeue(struct intel_engine_cs *engine)
 {
struct intel_engine_execlists * const execlists = &engine->execlists;
@@ -945,6 +1021,27 @@ static bool execlists_dequeue(struct intel_engine_cs 
*engine)
 */
last->hw_context->lrc_desc |= CTX_DESC_FORCE_RESTORE;
last = NULL;
+  

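[Editorial note] The `need_timeslice()` check in the patch above demotes the
running request only when something of equal or higher priority is waiting
behind it. A standalone sketch of that decision — the scalar parameters stand
in for the engine's active list and `execlists.queue_priority_hint`:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether the running request should be demoted when its slice
 * expires: only if an equal-or-higher-priority request is waiting.
 * next_prio is the priority of the next request on the engine's list;
 * queue_hint mirrors execlists.queue_priority_hint.
 */
static bool need_timeslice(int running_prio, int next_prio,
			   int queue_hint, bool is_last)
{
	if (is_last)		/* nothing queued behind us */
		return false;

	int hint = next_prio > queue_hint ? next_prio : queue_hint;
	return hint >= running_prio;
}
```

Note the `>=`: equal-priority contexts round-robin, which is exactly the
fairness the commit message promises.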
[Intel-gfx] [PATCH 34/40] drm/i915: Rename intel_context.active to .inflight

2019-05-08 Thread Chris Wilson
Rename the engine this HW context is currently active upon (that we are
flying upon) to disambiguate between the mixture of different active
terms (and prevent conflict in future patches).

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_context_types.h |  2 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c   | 22 +--
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 963a312430e6..825fcf0ac9c4 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -37,7 +37,7 @@ struct intel_context {
 
struct i915_gem_context *gem_context;
struct intel_engine_cs *engine;
-   struct intel_engine_cs *active;
+   struct intel_engine_cs *inflight;
 
struct list_head signal_link;
struct list_head signals;
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 64bd25a9e6f5..5e418bf46c46 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -460,7 +460,7 @@ __unwind_incomplete_requests(struct intel_engine_cs 
*engine, int boost)
__i915_request_unsubmit(rq);
unwind_wa_tail(rq);
 
-   GEM_BUG_ON(rq->hw_context->active);
+   GEM_BUG_ON(rq->hw_context->inflight);
 
/*
 * Push the request back into the queue for later resubmission.
@@ -557,11 +557,11 @@ execlists_user_end(struct intel_engine_execlists 
*execlists)
 static inline void
 execlists_context_schedule_in(struct i915_request *rq)
 {
-   GEM_BUG_ON(rq->hw_context->active);
+   GEM_BUG_ON(rq->hw_context->inflight);
 
execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
intel_engine_context_in(rq->engine);
-   rq->hw_context->active = rq->engine;
+   rq->hw_context->inflight = rq->engine;
 }
 
 static void kick_siblings(struct i915_request *rq)
@@ -576,7 +576,7 @@ static void kick_siblings(struct i915_request *rq)
 static inline void
 execlists_context_schedule_out(struct i915_request *rq, unsigned long status)
 {
-   rq->hw_context->active = NULL;
+   rq->hw_context->inflight = NULL;
intel_engine_context_out(rq->engine);
execlists_context_status_change(rq, status);
trace_i915_request_out(rq);
@@ -820,7 +820,7 @@ static bool virtual_matches(const struct virtual_engine *ve,
const struct i915_request *rq,
const struct intel_engine_cs *engine)
 {
-   const struct intel_engine_cs *active;
+   const struct intel_engine_cs *inflight;
 
if (!(rq->execution_mask & engine->mask)) /* We peeked too soon! */
return false;
@@ -834,8 +834,8 @@ static bool virtual_matches(const struct virtual_engine *ve,
 * we reuse the register offsets). This is a very small
 * hysteresis on the greedy selection algorithm.
 */
-   active = READ_ONCE(ve->context.active);
-   if (active && active != engine)
+   inflight = READ_ONCE(ve->context.inflight);
+   if (inflight && inflight != engine)
return false;
 
return true;
@@ -1023,7 +1023,7 @@ static void execlists_dequeue(struct intel_engine_cs 
*engine)
u32 *regs = ve->context.lrc_reg_state;
unsigned int n;
 
-   GEM_BUG_ON(READ_ONCE(ve->context.active));
+   GEM_BUG_ON(READ_ONCE(ve->context.inflight));
virtual_update_register_offsets(regs, engine);
 
if (!list_empty(&ve->context.signals))
@@ -1501,7 +1501,7 @@ static void execlists_context_unpin(struct intel_context 
*ce)
 * had the chance to run yet; let it run before we teardown the
 * reference it may use.
 */
-   engine = READ_ONCE(ce->active);
+   engine = READ_ONCE(ce->inflight);
if (unlikely(engine)) {
unsigned long flags;
 
@@ -1509,7 +1509,7 @@ static void execlists_context_unpin(struct intel_context 
*ce)
process_csb(engine);
spin_unlock_irqrestore(&engine->timeline.lock, flags);
 
-   GEM_BUG_ON(READ_ONCE(ce->active));
+   GEM_BUG_ON(READ_ONCE(ce->inflight));
}
 
i915_gem_context_unpin_hw_id(ce->gem_context);
@@ -3103,7 +3103,7 @@ static void virtual_context_destroy(struct kref *kref)
unsigned int n;
 
GEM_BUG_ON(ve->request);
-   GEM_BUG_ON(ve->context.active);
+   GEM_BUG_ON(ve->context.inflight);
 
for (n = 0; n < ve->num_siblings; n++) {
struct intel_engine_cs *sibling = ve->siblings[n];
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://list

[Intel-gfx] [PATCH 35/40] drm/i915: Keep contexts pinned until after the next kernel context switch

2019-05-08 Thread Chris Wilson
We need to keep the context image pinned in memory until after the GPU
has finished writing into it. Since it continues to write as we signal
the final breadcrumb, we need to keep it pinned until the request after
it is complete. Currently we know the order in which requests execute on
each engine, and so to remove that presumption we need to identify a
request/context-switch we know must occur after our completion. Any
request queued after the signal must imply a context switch; for
simplicity we use a fresh request from the kernel context.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 24 ++
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |  1 -
 drivers/gpu/drm/i915/gem/i915_gem_pm.c| 20 -
 drivers/gpu/drm/i915/gt/intel_context.c   | 80 +++---
 drivers/gpu/drm/i915/gt/intel_context.h   |  3 +
 drivers/gpu/drm/i915/gt/intel_context_types.h |  6 +-
 drivers/gpu/drm/i915/gt/intel_engine.h|  2 -
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 23 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c |  2 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  | 13 +--
 drivers/gpu/drm/i915/gt/intel_lrc.c   | 62 ++
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c| 44 +-
 drivers/gpu/drm/i915/gt/mock_engine.c | 11 +--
 drivers/gpu/drm/i915/i915_active.c| 81 ++-
 drivers/gpu/drm/i915/i915_active.h|  5 ++
 drivers/gpu/drm/i915/i915_active_types.h  |  3 +
 drivers/gpu/drm/i915/i915_gem.c   |  4 -
 drivers/gpu/drm/i915/i915_request.c   | 15 
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  1 -
 19 files changed, 215 insertions(+), 185 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index a9608d9ced6a..aa2bd1d6ceb6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -688,17 +688,6 @@ int i915_gem_contexts_init(struct drm_i915_private 
*dev_priv)
return 0;
 }
 
-void i915_gem_contexts_lost(struct drm_i915_private *dev_priv)
-{
-   struct intel_engine_cs *engine;
-   enum intel_engine_id id;
-
-   lockdep_assert_held(&dev_priv->drm.struct_mutex);
-
-   for_each_engine(engine, dev_priv, id)
-   intel_engine_lost_context(engine);
-}
-
 void i915_gem_contexts_fini(struct drm_i915_private *i915)
 {
lockdep_assert_held(&i915->drm.struct_mutex);
@@ -1183,10 +1172,6 @@ gen8_modify_rpcs(struct intel_context *ce, struct 
intel_sseu sseu)
if (ret)
goto out_add;
 
-   ret = gen8_emit_rpcs_config(rq, ce, sseu);
-   if (ret)
-   goto out_add;
-
/*
 * Guarantee context image and the timeline remains pinned until the
 * modifying request is retired by setting the ce activity tracker.
@@ -1194,9 +1179,12 @@ gen8_modify_rpcs(struct intel_context *ce, struct 
intel_sseu sseu)
 * But we only need to take one pin on the account of it. Or in other
 * words transfer the pinned ce object to tracked active request.
 */
-   if (!i915_active_request_isset(&ce->active_tracker))
-   __intel_context_pin(ce);
-   __i915_active_request_set(&ce->active_tracker, rq);
+   GEM_BUG_ON(i915_active_is_idle(&ce->active));
+   ret = i915_active_ref(&ce->active, rq->fence.context, rq);
+   if (ret)
+   goto out_add;
+
+   ret = gen8_emit_rpcs_config(rq, ce, sseu);
 
 out_add:
i915_request_add(rq);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index 630392c77e48..9691dd062f72 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -134,7 +134,6 @@ static inline bool i915_gem_context_is_kernel(struct 
i915_gem_context *ctx)
 
 /* i915_gem_context.c */
 int __must_check i915_gem_contexts_init(struct drm_i915_private *dev_priv);
-void i915_gem_contexts_lost(struct drm_i915_private *dev_priv);
 void i915_gem_contexts_fini(struct drm_i915_private *dev_priv);
 
 int i915_gem_context_open(struct drm_i915_private *i915,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c 
b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
index 111b687f14b5..1f2feddd9e20 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
@@ -10,6 +10,22 @@
 #include "i915_drv.h"
 #include "i915_globals.h"
 
+static void call_idle_barriers(struct intel_engine_cs *engine)
+{
+   struct llist_node *node, *next;
+
+   llist_for_each_safe(node, next, llist_del_all(&engine->barrier_tasks)) {
+   struct i915_active_request *active =
+   container_of((struct list_head *)node,
+typeof(*active), link);
+
+   INIT_LIST_HEAD(&active->link);
+   RCU_INIT_POINTER(active->request, NU

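[Editorial note] The `call_idle_barriers()` helper above uses the kernel's
lock-free llist idiom: atomically detach the whole list with
`llist_del_all()`, then walk the detached chain with the "safe" pattern
(read `next` before the callback may free the node). A hedged userspace
analogue with C11 atomics — `push`/`del_all`/`drain_sum` are illustrative
names, not the kernel API:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
	struct node *next;
	int value;
};

/* llist_add(): lock-free push via a CAS loop. */
static void push(struct node *_Atomic *head, struct node *n)
{
	n->next = atomic_load(head);
	while (!atomic_compare_exchange_weak(head, &n->next, n))
		;
}

/* llist_del_all(): atomically detach the whole chain. */
static struct node *del_all(struct node *_Atomic *head)
{
	return atomic_exchange(head, NULL);
}

/* The "safe" walk: read n->next before anything may free n. */
static int drain_sum(struct node *_Atomic *head)
{
	int sum = 0;

	for (struct node *n = del_all(head), *next; n; n = next) {
		next = n->next;
		sum += n->value;
	}
	return sum;
}
```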
Re: [Intel-gfx] [PATCH 20/45] drm/i915: Apply an execution_mask to the virtual_engine

2019-05-08 Thread Tvrtko Ursulin


On 07/05/2019 17:59, Chris Wilson wrote:

Quoting Tvrtko Ursulin (2019-04-29 15:12:23)


On 25/04/2019 10:19, Chris Wilson wrote:

   static void virtual_submission_tasklet(unsigned long data)
   {
   struct virtual_engine * const ve = (struct virtual_engine *)data;
   const int prio = ve->base.execlists.queue_priority_hint;
+ intel_engine_mask_t mask;
   unsigned int n;
   
+ rcu_read_lock();

+ mask = virtual_submission_mask(ve);
+ rcu_read_unlock();
+ if (unlikely(!mask))


Is the rcu_lock think solely for the same protection against wedging in
submit_notify?


No. We may still be in the rbtree of the physical engines and
ve->request may be plucked out from underneath us as we read it. And in
the time it takes to track it, that request may have been executed,
retired and freed. To prevent the dangling stale dereference, we use
rcu_read_lock() here as we peek into the request, and spinlocks around
the actual transfer to the execution backend.


So it's not actually about ve->request as a member pointer, but the 
request object itself. That could make sense, but then wouldn't you need 
to hold the rcu_read_lock over the whole tasklet? There is another 
ve->request read in the for loop just below, although not an actual 
dereference. I guess I just answered it to myself. Okay, looks good then.


Regards,

Tvrtko
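[Editorial note] The pattern under discussion — peek at `ve->request` under
`rcu_read_lock()` so the object cannot be freed mid-peek, with spinlocks only
around the actual transfer — can be sketched in userspace with no-op RCU
stubs (single-threaded, for shape only; in the kernel it is the grace period
before freeing that keeps the snapshot valid):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct request { int prio; };

/* Single-threaded stand-ins for the shape of the kernel pattern; in
 * the kernel, rcu_read_lock() plus a grace period before freeing is
 * what keeps the snapshot below from dangling. */
static void rcu_read_lock(void) {}
static void rcu_read_unlock(void) {}

/* Peek at a slot a concurrent retire may clear: take one snapshot
 * under the read lock and use only that snapshot. */
static int peek_prio(struct request *_Atomic *slot, int fallback)
{
	int prio;

	rcu_read_lock();
	struct request *rq = atomic_load(slot);	/* rcu_dereference() */
	prio = rq ? rq->prio : fallback;
	rcu_read_unlock();

	return prio;
}
```

This also illustrates Tvrtko's point: a later re-read of the slot outside
the critical section is a fresh snapshot, not a continued use of the old one.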

Re: [Intel-gfx] [PATCH v2 4/7] lib/hexdump.c: Replace ascii bool in hex_dump_to_buffer with flags

2019-05-08 Thread Jani Nikula
On Wed, 08 May 2019, Alastair D'Silva  wrote:
> From: Alastair D'Silva 
>
> In order to support additional features in hex_dump_to_buffer, replace
> the ascii bool parameter with flags.
>
> Signed-off-by: Alastair D'Silva 
> ---
>  drivers/gpu/drm/i915/intel_engine_cs.c|  2 +-

For i915,

Acked-by: Jani Nikula 

>  drivers/isdn/hardware/mISDN/mISDNisar.c   |  6 --
>  drivers/mailbox/mailbox-test.c|  2 +-
>  drivers/net/ethernet/amd/xgbe/xgbe-drv.c  |  2 +-
>  drivers/net/ethernet/synopsys/dwc-xlgmac-common.c |  2 +-
>  drivers/net/wireless/ath/ath10k/debug.c   |  3 ++-
>  drivers/net/wireless/intel/iwlegacy/3945-mac.c|  2 +-
>  drivers/platform/chrome/wilco_ec/debugfs.c|  2 +-
>  drivers/scsi/scsi_logging.c   |  8 +++-
>  drivers/staging/fbtft/fbtft-core.c|  2 +-
>  fs/seq_file.c |  3 ++-
>  include/linux/printk.h|  8 
>  lib/hexdump.c | 15 ---
>  lib/test_hexdump.c|  5 +++--
>  14 files changed, 33 insertions(+), 29 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
> b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 49fa43ff02ba..fb133e729f9a 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -1318,7 +1318,7 @@ static void hexdump(struct drm_printer *m, const void 
> *buf, size_t len)
>   WARN_ON_ONCE(hex_dump_to_buffer(buf + pos, len - pos,
>   rowsize, sizeof(u32),
>   line, sizeof(line),
> - false) >= sizeof(line));
> + 0) >= sizeof(line));
>   drm_printf(m, "[%04zx] %s\n", pos, line);
>  
>   prev = buf + pos;
> diff --git a/drivers/isdn/hardware/mISDN/mISDNisar.c 
> b/drivers/isdn/hardware/mISDN/mISDNisar.c
> index 386731ec2489..f13f34db6c17 100644
> --- a/drivers/isdn/hardware/mISDN/mISDNisar.c
> +++ b/drivers/isdn/hardware/mISDN/mISDNisar.c
> @@ -84,7 +84,8 @@ send_mbox(struct isar_hw *isar, u8 his, u8 creg, u8 len, u8 
> *msg)
>  
>   while (l < (int)len) {
>   hex_dump_to_buffer(msg + l, len - l, 32, 1,
> -isar->log, 256, 1);
> +isar->log, 256,
> +HEXDUMP_ASCII);
>   pr_debug("%s: %s %02x: %s\n", isar->name,
>__func__, l, isar->log);
>   l += 32;
> @@ -113,7 +114,8 @@ rcv_mbox(struct isar_hw *isar, u8 *msg)
>  
>   while (l < (int)isar->clsb) {
>   hex_dump_to_buffer(msg + l, isar->clsb - l, 32,
> -1, isar->log, 256, 1);
> +1, isar->log, 256,
> +HEXDUMP_ASCII);
>   pr_debug("%s: %s %02x: %s\n", isar->name,
>__func__, l, isar->log);
>   l += 32;
> diff --git a/drivers/mailbox/mailbox-test.c b/drivers/mailbox/mailbox-test.c
> index 4e4ac4be6423..2f9a094d0259 100644
> --- a/drivers/mailbox/mailbox-test.c
> +++ b/drivers/mailbox/mailbox-test.c
> @@ -213,7 +213,7 @@ static ssize_t mbox_test_message_read(struct file *filp, 
> char __user *userbuf,
>   hex_dump_to_buffer(ptr,
>  MBOX_BYTES_PER_LINE,
>  MBOX_BYTES_PER_LINE, 1, touser + l,
> -MBOX_HEXDUMP_LINE_LEN, true);
> +MBOX_HEXDUMP_LINE_LEN, HEXDUMP_ASCII);
>  
>   ptr += MBOX_BYTES_PER_LINE;
>   l += MBOX_HEXDUMP_LINE_LEN;
> diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c 
> b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> index 0cc911f928b1..e954a31cee0c 100644
> --- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> +++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> @@ -2992,7 +2992,7 @@ void xgbe_print_pkt(struct net_device *netdev, struct 
> sk_buff *skb, bool tx_rx)
>   unsigned int len = min(skb->len - i, 32U);
>  
>   hex_dump_to_buffer(&skb->data[i], len, 32, 1,
> -buffer, sizeof(buffer), false);
> +buffer, sizeof(buffer), 0);
>   netdev_dbg(netdev, "  %#06x: %s\n", i, buffer);
>   }
>  
> diff --git a/drivers/net/ethernet/synopsys/dwc-xlgmac-common.c 
> b/drivers/net/ethernet/synopsys/dwc-xlgmac-common.c
> index eb1c6b03c329..b80adfa1f890 100644
> --- a/drivers/net/ethernet/syno

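[Editorial note] The conversion in the patch above is mechanical: `false`
becomes `0`, `true` becomes `HEXDUMP_ASCII`, leaving room for new bits such
as `HEXDUMP_SUPPRESS_REPEATED`. A tiny standalone model of the bool-to-flags
change — `dump_line()` is a toy, not the kernel's `hex_dump_to_buffer()`:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define HEXDUMP_ASCII             (1U << 0)
#define HEXDUMP_SUPPRESS_REPEATED (1U << 1)	/* room to grow */

/* Toy single-line dump: hex bytes, plus an ASCII column when the
 * HEXDUMP_ASCII flag is set -- the flag plumbing, not the real
 * hex_dump_to_buffer(). */
static int dump_line(const unsigned char *buf, size_t len,
		     char *out, size_t outlen, unsigned int flags)
{
	size_t pos = 0;

	for (size_t i = 0; i < len; i++)
		pos += (size_t)snprintf(out + pos, outlen - pos,
					"%02x ", buf[i]);

	if (flags & HEXDUMP_ASCII) {
		out[pos++] = ' ';
		for (size_t i = 0; i < len; i++)
			out[pos++] = (buf[i] >= 0x20 && buf[i] < 0x7f)
				   ? (char)buf[i] : '.';
		out[pos] = '\0';
	}
	return (int)pos;
}
```

Callers that passed `true` keep identical output via `HEXDUMP_ASCII`; those
that passed `false` pass `0`, exactly as in the tree-wide changes above.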
Re: [Intel-gfx] [v8 01/10] drm: Add HDR source metadata property

2019-05-08 Thread Shankar, Uma


>-Original Message-
>From: Ville Syrjälä [mailto:ville.syrj...@linux.intel.com]
>Sent: Tuesday, May 7, 2019 5:40 PM
>To: Shankar, Uma 
>Cc: Jonas Karlman ; intel-gfx@lists.freedesktop.org; dri-
>de...@lists.freedesktop.org; seanp...@chromium.org; emil.l.veli...@gmail.com;
>dcasta...@chromium.org; Lankhorst, Maarten ;
>Syrjala, Ville 
>Subject: Re: [v8 01/10] drm: Add HDR source metadata property
>
>On Tue, May 07, 2019 at 01:25:42PM +0300, Ville Syrjälä wrote:
>> On Tue, May 07, 2019 at 09:03:45AM +, Shankar, Uma wrote:
>> >
>> >
>> > >-Original Message-
>> > >From: Jonas Karlman [mailto:jo...@kwiboo.se]
>> > >Sent: Saturday, May 4, 2019 3:48 PM
>> > >To: Shankar, Uma ;
>> > >intel-gfx@lists.freedesktop.org; dri- de...@lists.freedesktop.org
>> > >Cc: dcasta...@chromium.org; emil.l.veli...@gmail.com;
>> > >seanp...@chromium.org; Syrjala, Ville ;
>> > >Lankhorst, Maarten 
>> > >Subject: Re: [v8 01/10] drm: Add HDR source metadata property
>> > >
>> > >On 2019-04-09 18:44, Uma Shankar wrote:
>> > >> This patch adds a blob property to get HDR metadata information
>> > >> from userspace. This will be send as part of AVI Infoframe to panel.
>> > >>
>> > >> It also implements get() and set() functions for HDR output
>> > >> metadata property.The blob data is received from userspace and
>> > >> saved in connector state, the same is returned as blob in get
>> > >> property call to userspace.
>> > >>
>> > >> v2: Rebase and modified the metadata structure elements as per
>> > >> Ville's POC changes.
>> > >>
>> > >> v3: No Change
>> > >>
>> > >> v4: Addressed Shashank's review comments
>> > >>
>> > >> v5: Rebase.
>> > >>
>> > >> v6: Addressed Brian Starkey's review comments, defined new
>> > >> structure with header for dynamic metadata scalability.
>> > >> Merge get/set property functions for metadata in this patch.
>> > >>
>> > >> v7: Addressed Jonas Karlman review comments and defined separate
>> > >> structure for infoframe to better align with CTA 861.G spec.
>> > >> Added Shashank's RB.
>> > >>
>> > >> Signed-off-by: Uma Shankar 
>> > >> Reviewed-by: Shashank Sharma 
>> > >> ---
>> > >>  drivers/gpu/drm/drm_atomic.c  |  2 ++
>> > >>  drivers/gpu/drm/drm_atomic_uapi.c | 13 +
>> > >>  drivers/gpu/drm/drm_connector.c   |  6 ++
>> > >>  include/drm/drm_connector.h   | 11 +++
>> > >>  include/drm/drm_mode_config.h |  6 ++
>> > >>  include/linux/hdmi.h  | 10 ++
>> > >>  include/uapi/drm/drm_mode.h   | 39
>> > >+++
>> > >>  7 files changed, 87 insertions(+)
>> > >>
>> > >> diff --git a/drivers/gpu/drm/drm_atomic.c
>> > >> b/drivers/gpu/drm/drm_atomic.c index 5eb4013..8b9c126 100644
>> > >> --- a/drivers/gpu/drm/drm_atomic.c
>> > >> +++ b/drivers/gpu/drm/drm_atomic.c
>> > >> @@ -881,6 +881,8 @@ static void
>> > >> drm_atomic_connector_print_state(struct drm_printer *p,
>> > >>
>> > >> drm_printf(p, "connector[%u]: %s\n", connector->base.id,
>> > >>connector- name);
>> > >> drm_printf(p, "\tcrtc=%s\n", state->crtc ? state->crtc->name :
>> > >> "(null)");
>> > >> +   drm_printf(p, "\thdr_metadata_changed=%d\n",
>> > >> +  state->hdr_metadata_changed);
>> > >>
>> > >> if (connector->connector_type == DRM_MODE_CONNECTOR_WRITEBACK)
>> > >> if (state->writeback_job && state->writeback_job->fb) 
>> > >> diff
>> > >> --git a/drivers/gpu/drm/drm_atomic_uapi.c
>> > >> b/drivers/gpu/drm/drm_atomic_uapi.c
>> > >> index ea797d4..6d0d161 100644
>> > >> --- a/drivers/gpu/drm/drm_atomic_uapi.c
>> > >> +++ b/drivers/gpu/drm/drm_atomic_uapi.c
>> > >> @@ -673,6 +673,8 @@ static int
>> > >> drm_atomic_connector_set_property(struct drm_connector *connector,  {
>> > >> struct drm_device *dev = connector->dev;
>> > >> struct drm_mode_config *config = &dev->mode_config;
>> > >> +   bool replaced = false;
>> > >> +   int ret;
>> > >>
>> > >> if (property == config->prop_crtc_id) {
>> > >> struct drm_crtc *crtc = drm_crtc_find(dev, NULL, val); 
>> > >> @@
>> > >> -721,6
>> > >> +723,14 @@ static int drm_atomic_connector_set_property(struct
>> > >> +drm_connector
>> > >*connector,
>> > >>  */
>> > >> if (state->link_status != DRM_LINK_STATUS_GOOD)
>> > >> state->link_status = val;
>> > >> +   } else if (property == config->hdr_output_metadata_property) {
>> > >> +   ret = drm_atomic_replace_property_blob_from_id(dev,
>> > >> +   &state->hdr_output_metadata_blob_ptr,
>> > >> +   val,
>> > >> +   -1, sizeof(struct hdr_output_metadata),
>> > >> +   &replaced);
>> > >> +   state->hdr_metadata_changed |= replaced;
>> > >> +   return ret;
>> > >> } else if (property == config->aspect_ratio_property) {
>> > >>  

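[Editorial note] The set-property path quoted above follows a common
atomic-state pattern: swap the blob pointer, report whether anything actually
changed, and accumulate that into a `*_changed` flag for later checks. A
minimal model with plain pointers — `replace_blob()` is illustrative; the
real `drm_atomic_replace_property_blob_from_id()` also resolves the blob id
and manages references:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct metadata { int max_cll; };	/* stand-in for the blob payload */

/* Swap the state's blob pointer and report whether it changed. */
static bool replace_blob(const struct metadata **slot,
			 const struct metadata *new_blob)
{
	bool replaced = (*slot != new_blob);

	*slot = new_blob;
	return replaced;
}
```

The `|=` accumulation in the patch (`state->hdr_metadata_changed |= replaced`)
means several property writes in one commit still yield a single flag.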
Re: [Intel-gfx] [PATCH v2 4/7] lib/hexdump.c: Replace ascii bool in hex_dump_to_buffer with flags

2019-05-08 Thread Greg Kroah-Hartman
On Wed, May 08, 2019 at 05:01:44PM +1000, Alastair D'Silva wrote:
> From: Alastair D'Silva 
> 
> In order to support additional features in hex_dump_to_buffer, replace
> the ascii bool parameter with flags.
> 
> Signed-off-by: Alastair D'Silva 
> ---
>  drivers/gpu/drm/i915/intel_engine_cs.c|  2 +-
>  drivers/isdn/hardware/mISDN/mISDNisar.c   |  6 --
>  drivers/mailbox/mailbox-test.c|  2 +-
>  drivers/net/ethernet/amd/xgbe/xgbe-drv.c  |  2 +-
>  drivers/net/ethernet/synopsys/dwc-xlgmac-common.c |  2 +-
>  drivers/net/wireless/ath/ath10k/debug.c   |  3 ++-
>  drivers/net/wireless/intel/iwlegacy/3945-mac.c|  2 +-
>  drivers/platform/chrome/wilco_ec/debugfs.c|  2 +-
>  drivers/scsi/scsi_logging.c   |  8 +++-
>  drivers/staging/fbtft/fbtft-core.c|  2 +-
>  fs/seq_file.c |  3 ++-
>  include/linux/printk.h|  8 
>  lib/hexdump.c | 15 ---
>  lib/test_hexdump.c|  5 +++--
>  14 files changed, 33 insertions(+), 29 deletions(-)

For staging stuff:

Reviewed-by: Greg Kroah-Hartman 

Re: [Intel-gfx] [PATCH v2 4/7] lib/hexdump.c: Replace ascii bool in hex_dump_to_buffer with flags

2019-05-08 Thread David Laight
From: Alastair D'Silva
> Sent: 08 May 2019 08:02
> To: alast...@d-silva.org
...
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -480,13 +480,13 @@ enum {
>   DUMP_PREFIX_OFFSET
>  };
> 
> -extern int hex_dump_to_buffer(const void *buf, size_t len, int rowsize,
> -   int groupsize, char *linebuf, size_t linebuflen,
> -   bool ascii);
> -
>  #define HEXDUMP_ASCII(1 << 0)
>  #define HEXDUMP_SUPPRESS_REPEATED(1 << 1)

These ought to be BIT(0) and BIT(1)

> +extern int hex_dump_to_buffer(const void *buf, size_t len, int rowsize,
> +   int groupsize, char *linebuf, size_t linebuflen,
> +   u64 flags);

Why 'u64 flags' ?
How many flags do you envisage ??
Your HEXDUMP_ASCII (etc) flags are currently signed values and might
get sign extended causing grief.
'unsigned int flags' is probably sufficient.

I've not really looked at the code; it seems OTT in places, though.

If someone copies it somewhere where the performance matters
(I've user space code which is dominated by its tracing!)
then you don't want all the function calls and conditionals
even if you want some of the functionality.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: [Intel-gfx] [v8 08/10] drm/i915:Enabled Modeset when HDR Infoframe changes

2019-05-08 Thread Shankar, Uma


>-Original Message-
>From: Ville Syrjälä [mailto:ville.syrj...@linux.intel.com]
>Sent: Tuesday, May 7, 2019 5:32 PM
>To: Shankar, Uma 
>Cc: intel-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; 
>Lankhorst,
>Maarten ; Syrjala, Ville 
>;
>Sharma, Shashank ; emil.l.veli...@gmail.com;
>brian.star...@arm.com; dcasta...@chromium.org; seanp...@chromium.org;
>Roper, Matthew D ; jo...@kwiboo.se
>Subject: Re: [v8 08/10] drm/i915:Enabled Modeset when HDR Infoframe changes
>
>On Tue, Apr 09, 2019 at 10:14:42PM +0530, Uma Shankar wrote:
>> This patch enables modeset whenever HDR metadata needs to be updated
>> to sink.
>>
>> v2: Addressed Shashank's review comments.
>>
>> v3: Added Shashank's RB.
>>
>> Signed-off-by: Ville Syrjälä 
>> Signed-off-by: Uma Shankar 
>> Reviewed-by: Shashank Sharma 
>> ---
>>  drivers/gpu/drm/i915/intel_atomic.c | 14 +-
>>  drivers/gpu/drm/i915/intel_hdmi.c   | 26 ++
>>  2 files changed, 39 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_atomic.c
>> b/drivers/gpu/drm/i915/intel_atomic.c
>> index 8c8fae3..e8b5f84 100644
>> --- a/drivers/gpu/drm/i915/intel_atomic.c
>> +++ b/drivers/gpu/drm/i915/intel_atomic.c
>> @@ -104,6 +104,16 @@ int intel_digital_connector_atomic_set_property(struct
>drm_connector *connector,
>>  return -EINVAL;
>>  }
>>
>> +static bool blob_equal(const struct drm_property_blob *a,
>> +   const struct drm_property_blob *b) {
>> +if (a && b)
>> +return a->length == b->length &&
>> +!memcmp(a->data, b->data, a->length);
>> +
>> +return !a == !b;
>> +}
>
>I have a feeling the memcmp() is overkill. We could just check for whether the 
>blob is
>the same or not. If userspace is an idiot and creates a new blob with 
>identical content
>so be it.

I feel it's good to have such a check in the kernel and avoid extra
infoframe programming, much like how we suppress a modeset request when
the same mode is requested as part of a modeset. Hope that's ok.

>> +
>>  int intel_digital_connector_atomic_check(struct drm_connector *conn,
>>   struct drm_connector_state *new_state) 
>>  {
>@@ -131,7 +141,9 @@
>> int intel_digital_connector_atomic_check(struct drm_connector *conn,
>>  new_conn_state->base.colorspace != old_conn_state->base.colorspace 
>> ||
>>  new_conn_state->base.picture_aspect_ratio != old_conn_state-
>>base.picture_aspect_ratio ||
>>  new_conn_state->base.content_type != old_conn_state-
>>base.content_type ||
>> -new_conn_state->base.scaling_mode != old_conn_state-
>>base.scaling_mode)
>> +new_conn_state->base.scaling_mode != old_conn_state-
>>base.scaling_mode ||
>> +!blob_equal(new_conn_state->base.hdr_output_metadata_blob_ptr,
>> +old_conn_state->base.hdr_output_metadata_blob_ptr))
>>  crtc_state->mode_changed = true;
>>
>>  return 0;
>> diff --git a/drivers/gpu/drm/i915/intel_hdmi.c
>> b/drivers/gpu/drm/i915/intel_hdmi.c
>> index 0ecfda0..85333a7 100644
>> --- a/drivers/gpu/drm/i915/intel_hdmi.c
>> +++ b/drivers/gpu/drm/i915/intel_hdmi.c
>> @@ -799,6 +799,20 @@ void intel_read_infoframe(struct intel_encoder *encoder,
>>  return true;
>>  }
>>
>> +static bool is_eotf_supported(u8 output_eotf, u8 sink_eotf) {
>> +if (output_eotf == 0)
>> +return (sink_eotf & (1 << 0));
>> +if (output_eotf == 1)
>> +return (sink_eotf & (1 << 1));
>> +if (output_eotf == 2)
>> +return (sink_eotf & (1 << 2));
>> +if (output_eotf == 3)
>> +return (sink_eotf & (1 << 3));
>> +
>> +return false;
>
>return sink_eotf & BIT(output_eotf);

Nice suggestion. Will update this.

Regards,
Uma Shankar

>> +}
>> +
>>  static bool
>>  intel_hdmi_compute_drm_infoframe(struct intel_encoder *encoder,
>>   struct intel_crtc_state *crtc_state, @@ -806,11
>+820,23 @@ void
>> intel_read_infoframe(struct intel_encoder *encoder,  {
>>  struct hdmi_drm_infoframe *frame = &crtc_state->infoframes.drm.drm;
>>  struct hdr_output_metadata *hdr_metadata;
>> +struct drm_connector *connector = conn_state->connector;
>>  int ret;
>>
>> +if (!conn_state->hdr_output_metadata_blob_ptr ||
>> +conn_state->hdr_output_metadata_blob_ptr->length == 0)
>> +return true;
>> +
>>  hdr_metadata = (struct hdr_output_metadata *)
>>  conn_state->hdr_output_metadata_blob_ptr->data;
>>
>> +/* Sink EOTF is Bit map while infoframe is absolute values */
>> +if (!is_eotf_supported(hdr_metadata->hdmi_metadata_type1.eotf,
>> +   connector->hdr_sink_metadata.hdmi_type1.eotf)) {
>> +DRM_ERROR("EOTF Not Supported\n");
>> +return true;
>> +}
>> +
>>  ret = drm_hdmi_infoframe_set_hdr_metadata(frame, hdr_metadata);
>>  if (ret < 0) {
>>  DRM_ERROR("couldn't set HDR metada

[Intel-gfx] [PATCH i-g-t 10/16] i915/gem_exec_whisper: Fork all-engine tests one-per-engine

2019-05-08 Thread Chris Wilson
Add a new mode for some more stress: submit the all-engines tests
simultaneously, with a stream per engine.

Signed-off-by: Chris Wilson 
---
 tests/i915/gem_exec_whisper.c | 27 ++-
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/tests/i915/gem_exec_whisper.c b/tests/i915/gem_exec_whisper.c
index d3e0b0ba2..d5afc8119 100644
--- a/tests/i915/gem_exec_whisper.c
+++ b/tests/i915/gem_exec_whisper.c
@@ -88,6 +88,7 @@ static void verify_reloc(int fd, uint32_t handle,
 #define SYNC 0x40
 #define PRIORITY 0x80
 #define QUEUES 0x100
+#define ALL 0x200
 
 struct hang {
struct drm_i915_gem_exec_object2 obj;
@@ -199,6 +200,7 @@ static void whisper(int fd, unsigned engine, unsigned flags)
uint64_t old_offset;
int i, n, loc;
int debugfs;
+   int nchild;
 
if (flags & PRIORITY) {
igt_require(gem_scheduler_enabled(fd));
@@ -215,6 +217,7 @@ static void whisper(int fd, unsigned engine, unsigned flags)
engines[nengine++] = engine;
}
} else {
+   igt_assert(!(flags & ALL));
igt_require(gem_has_ring(fd, engine));
igt_require(gem_can_store_dword(fd, engine));
engines[nengine++] = engine;
@@ -233,11 +236,22 @@ static void whisper(int fd, unsigned engine, unsigned 
flags)
if (flags & HANG)
init_hang(&hang);
 
+   nchild = 1;
+   if (flags & FORKED)
+   nchild *= sysconf(_SC_NPROCESSORS_ONLN);
+   if (flags & ALL)
+   nchild *= nengine;
+
intel_detect_and_clear_missed_interrupts(fd);
gpu_power_read(&power, &sample[0]);
-   igt_fork(child, flags & FORKED ? sysconf(_SC_NPROCESSORS_ONLN) : 1)  {
+   igt_fork(child, nchild) {
unsigned int pass;
 
+   if (flags & ALL) {
+   engines[0] = engines[child % nengine];
+   nengine = 1;
+   }
+
memset(&scratch, 0, sizeof(scratch));
scratch.handle = gem_create(fd, 4096);
scratch.flags = EXEC_OBJECT_WRITE;
@@ -341,7 +355,7 @@ static void whisper(int fd, unsigned engine, unsigned flags)
igt_until_timeout(150) {
uint64_t offset;
 
-   if (!(flags & FORKED))
+   if (nchild == 1)
write_seqno(debugfs, pass);
 
if (flags & HANG)
@@ -382,8 +396,8 @@ static void whisper(int fd, unsigned engine, unsigned flags)
 
gem_write(fd, batches[1023].handle, loc, &pass, 
sizeof(pass));
for (n = 1024; --n >= 1; ) {
+   uint32_t handle[2] = {};
int this_fd = fd;
-   uint32_t handle[2];
 
execbuf.buffers_ptr = 
to_user_pointer(&batches[n-1]);
reloc_migrations += batches[n-1].offset 
!= inter[n].presumed_offset;
@@ -550,7 +564,7 @@ igt_main
{ "queues-sync", QUEUES | SYNC },
{ NULL }
};
-   int fd;
+   int fd = -1;
 
igt_fixture {
fd = drm_open_driver_master(DRIVER_INTEL);
@@ -561,9 +575,12 @@ igt_main
igt_fork_hang_detector(fd);
}
 
-   for (const struct mode *m = modes; m->name; m++)
+   for (const struct mode *m = modes; m->name; m++) {
igt_subtest_f("%s", m->name)
whisper(fd, ALL_ENGINES, m->flags);
+   igt_subtest_f("%s-all", m->name)
+   whisper(fd, ALL_ENGINES, m->flags | ALL);
+   }
 
for (const struct intel_execution_engine *e = intel_execution_engines;
 e->name; e++) {
-- 
2.20.1


[Intel-gfx] [PATCH i-g-t 14/16] i915/gem_exec_balancer: Exercise bonded pairs

2019-05-08 Thread Chris Wilson
The submit-fence + load-balancing APIs allow us to execute a named
pair of engines in parallel; we do this by submitting a request to one
engine, then using the generated submit-fence to submit a second
request to another engine and have it execute at the same time.
Furthermore, by specifying bonded pairs, we can direct the virtual
engine to use a particular engine in parallel to the first request.

Signed-off-by: Chris Wilson 
---
 tests/i915/gem_exec_balancer.c | 234 +++--
 1 file changed, 224 insertions(+), 10 deletions(-)

diff --git a/tests/i915/gem_exec_balancer.c b/tests/i915/gem_exec_balancer.c
index 25195d478..20ad66727 100644
--- a/tests/i915/gem_exec_balancer.c
+++ b/tests/i915/gem_exec_balancer.c
@@ -98,9 +98,35 @@ list_engines(int i915, uint32_t class_mask, unsigned int 
*out)
return engines;
 }
 
+static int __set_engines(int i915, uint32_t ctx,
+const struct i915_engine_class_instance *ci,
+unsigned int count)
+{
+   I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, count);
+   struct drm_i915_gem_context_param p = {
+   .ctx_id = ctx,
+   .param = I915_CONTEXT_PARAM_ENGINES,
+   .size = sizeof(engines),
+   .value = to_user_pointer(&engines)
+   };
+
+   engines.extensions = 0;
+   memcpy(engines.engines, ci, sizeof(engines.engines));
+
+   return __gem_context_set_param(i915, &p);
+}
+
+static void set_engines(int i915, uint32_t ctx,
+   const struct i915_engine_class_instance *ci,
+   unsigned int count)
+{
+   igt_assert_eq(__set_engines(i915, ctx, ci, count), 0);
+}
+
 static int __set_load_balancer(int i915, uint32_t ctx,
   const struct i915_engine_class_instance *ci,
-  unsigned int count)
+  unsigned int count,
+  void *ext)
 {
I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(balancer, count);
I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 1 + count);
@@ -113,6 +139,7 @@ static int __set_load_balancer(int i915, uint32_t ctx,
 
memset(&balancer, 0, sizeof(balancer));
balancer.base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE;
+   balancer.base.next_extension = to_user_pointer(ext);
 
igt_assert(count);
balancer.num_siblings = count;
@@ -131,9 +158,10 @@ static int __set_load_balancer(int i915, uint32_t ctx,
 
 static void set_load_balancer(int i915, uint32_t ctx,
  const struct i915_engine_class_instance *ci,
- unsigned int count)
+ unsigned int count,
+ void *ext)
 {
-   igt_assert_eq(__set_load_balancer(i915, ctx, ci, count), 0);
+   igt_assert_eq(__set_load_balancer(i915, ctx, ci, count, ext), 0);
 }
 
 static uint32_t load_balancer_create(int i915,
@@ -143,7 +171,7 @@ static uint32_t load_balancer_create(int i915,
uint32_t ctx;
 
ctx = gem_context_create(i915);
-   set_load_balancer(i915, ctx, ci, count);
+   set_load_balancer(i915, ctx, ci, count, NULL);
 
return ctx;
 }
@@ -288,6 +316,74 @@ static void invalid_balancer(int i915)
}
 }
 
+static void invalid_bonds(int i915)
+{
+   I915_DEFINE_CONTEXT_ENGINES_BOND(bonds[16], 1);
+   I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 1);
+   struct drm_i915_gem_context_param p = {
+   .ctx_id = gem_context_create(i915),
+   .param = I915_CONTEXT_PARAM_ENGINES,
+   .value = to_user_pointer(&engines),
+   .size = sizeof(engines),
+   };
+   uint32_t handle;
+   void *ptr;
+
+   memset(&engines, 0, sizeof(engines));
+   gem_context_set_param(i915, &p);
+
+   memset(bonds, 0, sizeof(bonds));
+   for (int n = 0; n < ARRAY_SIZE(bonds); n++) {
+   bonds[n].base.name = I915_CONTEXT_ENGINES_EXT_BOND;
+   bonds[n].base.next_extension =
+   n ? to_user_pointer(&bonds[n - 1]) : 0;
+   bonds[n].num_bonds = 1;
+   }
+   engines.extensions = to_user_pointer(&bonds);
+   gem_context_set_param(i915, &p);
+
+   bonds[0].base.next_extension = -1ull;
+   igt_assert_eq(__gem_context_set_param(i915, &p), -EFAULT);
+
+   bonds[0].base.next_extension = to_user_pointer(&bonds[0]);
+   igt_assert_eq(__gem_context_set_param(i915, &p), -E2BIG);
+
+   engines.extensions = to_user_pointer(&bonds[1]);
+   igt_assert_eq(__gem_context_set_param(i915, &p), -E2BIG);
+   bonds[0].base.next_extension = 0;
+   gem_context_set_param(i915, &p);
+
+   handle = gem_create(i915, 4096 * 3);
+   ptr = gem_mmap__gtt(i915, handle, 4096 * 3, PROT_WRITE);
+   gem_close(i915, handle);
+
+   memcpy(ptr + 4096, &bonds[0], sizeof(bonds[0]));
+   engines.extensions = to_user_pointer(pt

[Intel-gfx] [PATCH i-g-t 15/16] i915/gem_exec_latency: Measure the latency of context switching

2019-05-08 Thread Chris Wilson
Measure the baseline latency between contexts in order to directly
compare that with the additional cost of preemption.

Signed-off-by: Chris Wilson 
---
 tests/i915/gem_exec_latency.c | 230 ++
 1 file changed, 230 insertions(+)

diff --git a/tests/i915/gem_exec_latency.c b/tests/i915/gem_exec_latency.c
index e56d62780..e88fbbc6a 100644
--- a/tests/i915/gem_exec_latency.c
+++ b/tests/i915/gem_exec_latency.c
@@ -410,6 +410,86 @@ static void latency_from_ring(int fd,
}
 }
 
+static void execution_latency(int i915, unsigned int ring, const char *name)
+{
+   struct drm_i915_gem_exec_object2 obj = {
+   .handle = gem_create(i915, 4095),
+   };
+   struct drm_i915_gem_execbuffer2 execbuf = {
+   .buffers_ptr = to_user_pointer(&obj),
+   .buffer_count = 1,
+   .flags = ring | LOCAL_I915_EXEC_NO_RELOC | 
LOCAL_I915_EXEC_HANDLE_LUT,
+   };
+   const unsigned int mmio_base = 0x2000;
+   const unsigned int cs_timestamp = mmio_base + 0x358;
+   volatile uint32_t *timestamp;
+   uint32_t *cs, *result;
+
+   timestamp =
+   (volatile uint32_t *)((volatile char *)igt_global_mmio + 
cs_timestamp);
+
+   obj.handle = gem_create(i915, 4096);
+   obj.flags = EXEC_OBJECT_PINNED;
+   result = gem_mmap__wc(i915, obj.handle, 0, 4096, PROT_WRITE);
+
+   for (int i = 0; i < 16; i++) {
+   cs = result + 16 * i;
+   *cs++ = 0x24 << 23 | 2; /* SRM */
+   *cs++ = cs_timestamp;
+   *cs++ = 4096 - 16 * 4 + i * 4;
+   *cs++ = 0;
+   *cs++ = 0xa << 23;
+   }
+
+   cs = result + 1024 - 16;
+
+   for (int length = 2; length <= 16; length <<= 1) {
+   struct igt_mean submit, batch, total;
+   int last = length - 1;
+
+   igt_mean_init(&submit);
+   igt_mean_init(&batch);
+   igt_mean_init(&total);
+
+   igt_until_timeout(2) {
+   uint32_t now, end;
+
+   cs[last] = 0;
+
+   now = *timestamp;
+   for (int i = 0; i < length; i++) {
+   execbuf.batch_start_offset = 64 * i;
+   gem_execbuf(i915, &execbuf);
+   }
+   while (!((volatile uint32_t *)cs)[last])
+   ;
+   end = *timestamp;
+
+   igt_mean_add(&submit, (cs[0] - now) * rcs_clock);
+   igt_mean_add(&batch, (cs[last] - cs[0]) * rcs_clock / 
last);
+   igt_mean_add(&total, (end - now) * rcs_clock);
+   }
+
+   igt_info("%sx%d Submission latency: %.2f±%.2fus\n",
+name, length,
+1e-3 * igt_mean_get(&submit),
+1e-3 * sqrt(igt_mean_get_variance(&submit)));
+
+   igt_info("%sx%d Inter-batch latency: %.2f±%.2fus\n",
+name, length,
+1e-3 * igt_mean_get(&batch),
+1e-3 * sqrt(igt_mean_get_variance(&batch)));
+
+   igt_info("%sx%d End-to-end latency: %.2f±%.2fus\n",
+name, length,
+1e-3 * igt_mean_get(&total),
+1e-3 * sqrt(igt_mean_get_variance(&total)));
+   }
+
+   munmap(result, 4096);
+   gem_close(i915, obj.handle);
+}
+
 static void
 __submit_spin(int fd, igt_spin_t *spin, unsigned int flags)
 {
@@ -616,6 +696,142 @@ rthog_latency_on_ring(int fd, unsigned int engine, const 
char *name, unsigned in
munmap(results, MMAP_SZ);
 }
 
+static void context_switch(int i915,
+  unsigned int engine, const char *name,
+  unsigned int flags)
+{
+   struct drm_i915_gem_exec_object2 obj[2];
+   struct drm_i915_gem_relocation_entry reloc[5];
+   struct drm_i915_gem_execbuffer2 eb;
+   uint32_t *cs, *bbe, *results, v;
+   unsigned int mmio_base;
+   struct igt_mean mean;
+   uint32_t ctx[2];
+
+   /* XXX i915_query()! */
+   switch (engine) {
+   case I915_EXEC_DEFAULT:
+   case I915_EXEC_RENDER:
+   mmio_base = 0x2000;
+   break;
+#if 0
+   case I915_EXEC_BSD:
+   mmio_base = 0x12000;
+   break;
+#endif
+   case I915_EXEC_BLT:
+   mmio_base = 0x22000;
+   break;
+
+   case I915_EXEC_VEBOX:
+   if (intel_gen(intel_get_drm_devid(i915)) >= 11)
+   mmio_base = 0x1d8000;
+   else
+   mmio_base = 0x1a000;
+   break;
+
+   default:
+   igt_skip("mmio base not known\n");
+   }
+
+   for (int i = 0; i < ARRAY_SIZE(ctx); i++)
+   ctx[i] = gem_context_create(i915);
+
+   if 

[Intel-gfx] [PATCH i-g-t 02/16] drm-uapi: Import i915_drm.h upto 53073249452d

2019-05-08 Thread Chris Wilson
commit 53073249452d307b66c2ab9a4b5ebf94db534ad6
Author: Chris Wilson 
Date:   Thu Jan 25 17:55:58 2018 +

drm/i915: Allow contexts to share a single timeline across all engines

Signed-off-by: Chris Wilson 
---
 include/drm-uapi/i915_drm.h | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/include/drm-uapi/i915_drm.h b/include/drm-uapi/i915_drm.h
index e01b3e1fd..1b0488a81 100644
--- a/include/drm-uapi/i915_drm.h
+++ b/include/drm-uapi/i915_drm.h
@@ -355,6 +355,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_PERF_ADD_CONFIG   0x37
 #define DRM_I915_PERF_REMOVE_CONFIG0x38
 #define DRM_I915_QUERY 0x39
+#define DRM_I915_GEM_VM_CREATE 0x3a
+#define DRM_I915_GEM_VM_DESTROY0x3b
 /* Must be kept compact -- no holes */
 
 #define DRM_IOCTL_I915_INITDRM_IOW( DRM_COMMAND_BASE + 
DRM_I915_INIT, drm_i915_init_t)
@@ -415,6 +417,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_PERF_ADD_CONFIG DRM_IOW(DRM_COMMAND_BASE + 
DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
 #define DRM_IOCTL_I915_PERF_REMOVE_CONFIG  DRM_IOW(DRM_COMMAND_BASE + 
DRM_I915_PERF_REMOVE_CONFIG, __u64)
 #define DRM_IOCTL_I915_QUERY   DRM_IOWR(DRM_COMMAND_BASE + 
DRM_I915_QUERY, struct drm_i915_query)
+#define DRM_IOCTL_I915_GEM_VM_CREATE   DRM_IOWR(DRM_COMMAND_BASE + 
DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
+#define DRM_IOCTL_I915_GEM_VM_DESTROY  DRM_IOW (DRM_COMMAND_BASE + 
DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -1464,8 +1468,9 @@ struct drm_i915_gem_context_create_ext {
__u32 ctx_id; /* output: id of new context*/
__u32 flags;
 #define I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS   (1u << 0)
+#define I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE  (1u << 1)
 #define I915_CONTEXT_CREATE_FLAGS_UNKNOWN \
-   (-(I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS << 1))
+   (-(I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE << 1))
__u64 extensions;
 };
 
@@ -1507,6 +1512,17 @@ struct drm_i915_gem_context_param {
  * On creation, all new contexts are marked as recoverable.
  */
 #define I915_CONTEXT_PARAM_RECOVERABLE 0x8
+
+   /*
+* The id of the associated virtual memory address space (ppGTT) of
+* this context. Can be retrieved and passed to another context
+* (on the same fd) for both to use the same ppGTT and so share
+* address layouts, and avoid reloading the page tables on context
+* switches between themselves.
+*
+* See DRM_I915_GEM_VM_CREATE and DRM_I915_GEM_VM_DESTROY.
+*/
+#define I915_CONTEXT_PARAM_VM  0x9
 /* Must be kept compact -- no holes and well documented */
 
__u64 value;
-- 
2.20.1


[Intel-gfx] [PATCH i-g-t 09/16] i915/gem_ctx_switch: Exercise queues

2019-05-08 Thread Chris Wilson
Queues are a form of context that shares a vm and enforces a single timeline
across all engines. Test switching between them, just like ordinary
contexts.

Signed-off-by: Chris Wilson 
---
 tests/i915/gem_ctx_switch.c | 75 +++--
 1 file changed, 55 insertions(+), 20 deletions(-)

diff --git a/tests/i915/gem_ctx_switch.c b/tests/i915/gem_ctx_switch.c
index 87e13b915..647911d4c 100644
--- a/tests/i915/gem_ctx_switch.c
+++ b/tests/i915/gem_ctx_switch.c
@@ -44,7 +44,8 @@
 #define LOCAL_I915_EXEC_NO_RELOC (1<<11)
 #define LOCAL_I915_EXEC_HANDLE_LUT (1<<12)
 
-#define INTERRUPTIBLE 1
+#define INTERRUPTIBLE 0x1
+#define QUEUE 0x2
 
 static double elapsed(const struct timespec *start, const struct timespec *end)
 {
@@ -126,8 +127,12 @@ static void single(int fd, uint32_t handle,
 
gem_require_ring(fd, e->exec_id | e->flags);
 
-   for (n = 0; n < 64; n++)
-   contexts[n] = gem_context_create(fd);
+   for (n = 0; n < 64; n++) {
+   if (flags & QUEUE)
+   contexts[n] = gem_queue_create(fd);
+   else
+   contexts[n] = gem_context_create(fd);
+   }
 
memset(&obj, 0, sizeof(obj));
obj.handle = handle;
@@ -232,8 +237,12 @@ static void all(int fd, uint32_t handle, unsigned flags, 
int timeout)
}
igt_require(nengine);
 
-   for (n = 0; n < ARRAY_SIZE(contexts); n++)
-   contexts[n] = gem_context_create(fd);
+   for (n = 0; n < ARRAY_SIZE(contexts); n++) {
+   if (flags & QUEUE)
+   contexts[n] = gem_queue_create(fd);
+   else
+   contexts[n] = gem_context_create(fd);
+   }
 
memset(obj, 0, sizeof(obj));
obj[1].handle = handle;
@@ -298,6 +307,17 @@ igt_main
 {
const int ncpus = sysconf(_SC_NPROCESSORS_ONLN);
const struct intel_execution_engine *e;
+   static const struct {
+   const char *name;
+   unsigned int flags;
+   bool (*require)(int fd);
+   } phases[] = {
+   { "", 0, NULL },
+   { "-interruptible", INTERRUPTIBLE, NULL },
+   { "-queue", QUEUE, gem_has_queues },
+   { "-queue-interruptible", QUEUE | INTERRUPTIBLE, gem_has_queues 
},
+   { }
+   };
uint32_t light = 0, heavy;
int fd = -1;
 
@@ -319,21 +339,26 @@ igt_main
}
 
for (e = intel_execution_engines; e->name; e++) {
-   igt_subtest_f("%s%s", e->exec_id == 0 ? "basic-" : "", e->name)
-   single(fd, light, e, 0, 1, 5);
-
-   igt_skip_on_simulation();
-
-   igt_subtest_f("%s%s-heavy", e->exec_id == 0 ? "basic-" : "", 
e->name)
-   single(fd, heavy, e, 0, 1, 5);
-   igt_subtest_f("%s-interruptible", e->name)
-   single(fd, light, e, INTERRUPTIBLE, 1, 150);
-   igt_subtest_f("forked-%s", e->name)
-   single(fd, light, e, 0, ncpus, 150);
-   igt_subtest_f("forked-%s-heavy", e->name)
-   single(fd, heavy, e, 0, ncpus, 150);
-   igt_subtest_f("forked-%s-interruptible", e->name)
-   single(fd, light, e, INTERRUPTIBLE, ncpus, 150);
+   for (typeof(*phases) *p = phases; p->name; p++) {
+   igt_subtest_group {
+   igt_fixture {
+   if (p->require)
+   igt_require(p->require(fd));
+   }
+
+   igt_subtest_f("%s%s%s", e->exec_id == 0 ? 
"basic-" : "", e->name, p->name)
+   single(fd, light, e, p->flags, 1, 5);
+
+   igt_skip_on_simulation();
+
+   igt_subtest_f("%s%s-heavy%s", e->exec_id == 0 ? 
"basic-" : "", e->name, p->name)
+   single(fd, heavy, e, p->flags, 1, 5);
+   igt_subtest_f("forked-%s%s", e->name, p->name)
+   single(fd, light, e, p->flags, ncpus, 
150);
+   igt_subtest_f("forked-%s-heavy%s", e->name, 
p->name)
+   single(fd, heavy, e, p->flags, ncpus, 
150);
+   }
+   }
}
 
igt_subtest("basic-all-light")
@@ -341,6 +366,16 @@ igt_main
igt_subtest("basic-all-heavy")
all(fd, heavy, 0, 5);
 
+   igt_subtest_group {
+   igt_fixture {
+   igt_require(gem_has_queues(fd));
+   }
+   igt_subtest("basic-queue-light")
+   all(fd, light, QUEUE, 5);
+   igt_subtest("basic-queue-heavy")
+   all(fd, heavy, QUEUE, 5);
+   }
+
igt_

[Intel-gfx] [PATCH i-g-t 04/16] i915/gem_ctx_param: Test set/get (copy) VM

2019-05-08 Thread Chris Wilson
Exercise reusing the GTT of one ctx in another.

v2: Test setting back to the same VM
v3: Check the VM still exists after the parent contexts are dead.

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 tests/i915/gem_ctx_param.c | 107 -
 1 file changed, 95 insertions(+), 12 deletions(-)

diff --git a/tests/i915/gem_ctx_param.c b/tests/i915/gem_ctx_param.c
index b6f57236c..d949cef32 100644
--- a/tests/i915/gem_ctx_param.c
+++ b/tests/i915/gem_ctx_param.c
@@ -28,6 +28,7 @@
 #include 
 
 #include "igt.h"
+#include "i915/gem_vm.h"
 
 IGT_TEST_DESCRIPTION("Basic test for context set/get param input validation.");
 
@@ -36,17 +37,6 @@ IGT_TEST_DESCRIPTION("Basic test for context set/get param 
input validation.");
 #define NEW_CTXBIT(0)
 #define USER BIT(1)
 
-static int reopen_driver(int fd)
-{
-   char path[256];
-
-   snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
-   fd = open(path, O_RDWR);
-   igt_assert_lte(0, fd);
-
-   return fd;
-}
-
 static void set_priority(int i915)
 {
static const int64_t test_values[] = {
@@ -91,7 +81,7 @@ static void set_priority(int i915)
igt_permute_array(values, size, igt_exchange_int64);
 
igt_fork(flags, NEW_CTX | USER) {
-   int fd = reopen_driver(i915);
+   int fd = gem_reopen_driver(i915);
struct drm_i915_gem_context_param arg = {
.param = I915_CONTEXT_PARAM_PRIORITY,
.ctx_id = flags & NEW_CTX ? gem_context_create(fd) : 0,
@@ -143,6 +133,96 @@ static void set_priority(int i915)
free(values);
 }
 
+static uint32_t __batch_create(int i915, uint32_t offset)
+{
+   const uint32_t bbe = MI_BATCH_BUFFER_END;
+   uint32_t handle;
+
+   handle = gem_create(i915, ALIGN(offset + 4, 4096));
+   gem_write(i915, handle, offset, &bbe, sizeof(bbe));
+
+   return handle;
+}
+
+static uint32_t batch_create(int i915)
+{
+   return __batch_create(i915, 0);
+}
+
+static void test_vm(int i915)
+{
+   const uint64_t nonzero_offset = 48 << 20;
+   struct drm_i915_gem_exec_object2 batch = {
+   .handle = batch_create(i915),
+   };
+   struct drm_i915_gem_execbuffer2 eb = {
+   .buffers_ptr = to_user_pointer(&batch),
+   .buffer_count = 1,
+   };
+   struct drm_i915_gem_context_param arg = {
+   .param = I915_CONTEXT_PARAM_VM,
+   };
+   uint32_t parent, child;
+
+   arg.value = -1ull;
+   igt_require(__gem_context_set_param(i915, &arg) == -ENOENT);
+
+   parent = gem_context_create(i915);
+   child = gem_context_create(i915);
+
+   /* Using implicit soft-pinning */
+   eb.rsvd1 = parent;
+   batch.offset = nonzero_offset;
+   gem_execbuf(i915, &eb);
+   igt_assert_eq_u64(batch.offset, nonzero_offset);
+
+   eb.rsvd1 = child;
+   batch.offset = 0;
+   gem_execbuf(i915, &eb);
+   igt_assert_eq_u64(batch.offset, 0);
+
+   eb.rsvd1 = parent;
+   gem_execbuf(i915, &eb);
+   igt_assert_eq_u64(batch.offset, nonzero_offset);
+
+   arg.ctx_id = parent;
+   gem_context_get_param(i915, &arg);
+   gem_context_set_param(i915, &arg);
+
+   /* Still the same VM, so expect the old VMA again */
+   batch.offset = 0;
+   gem_execbuf(i915, &eb);
+   igt_assert_eq_u64(batch.offset, nonzero_offset);
+
+   arg.ctx_id = child;
+   gem_context_set_param(i915, &arg);
+
+   eb.rsvd1 = child;
+   batch.offset = 0;
+   gem_execbuf(i915, &eb);
+   igt_assert_eq_u64(batch.offset, nonzero_offset);
+
+   gem_context_destroy(i915, child);
+   gem_context_destroy(i915, parent);
+
+   /* both contexts destroyed, but we still keep hold of the vm */
+   child = gem_context_create(i915);
+
+   arg.ctx_id = child;
+   gem_context_set_param(i915, &arg);
+
+   eb.rsvd1 = child;
+   batch.offset = 0;
+   gem_execbuf(i915, &eb);
+   igt_assert_eq_u64(batch.offset, nonzero_offset);
+
+   gem_context_destroy(i915, child);
+   gem_vm_destroy(i915, arg.value);
+
+   gem_sync(i915, batch.handle);
+   gem_close(i915, batch.handle);
+}
+
 igt_main
 {
struct drm_i915_gem_context_param arg;
@@ -253,6 +333,9 @@ igt_main
gem_context_set_param(fd, &arg);
}
 
+   igt_subtest("vm")
+   test_vm(fd);
+
arg.param = I915_CONTEXT_PARAM_PRIORITY;
 
igt_subtest("set-priority-not-supported") {
-- 
2.20.1


[Intel-gfx] [PATCH i-g-t 06/16] drm-uapi: Import i915_drm.h upto 364df3d04d51

2019-05-08 Thread Chris Wilson
commit 364df3d04d51f0aad13b898f3dffca8c2d03d2b3 (HEAD)
Author: Chris Wilson 
Date:   Fri Jun 30 13:40:53 2017 +0100

drm/i915: Allow specification of parallel execbuf

Signed-off-by: Chris Wilson 
---
 include/drm-uapi/i915_drm.h | 146 +++-
 1 file changed, 145 insertions(+), 1 deletion(-)

diff --git a/include/drm-uapi/i915_drm.h b/include/drm-uapi/i915_drm.h
index 1b0488a81..72be2705e 100644
--- a/include/drm-uapi/i915_drm.h
+++ b/include/drm-uapi/i915_drm.h
@@ -136,6 +136,8 @@ enum drm_i915_gem_engine_class {
 struct i915_engine_class_instance {
__u16 engine_class; /* see enum drm_i915_gem_engine_class */
__u16 engine_instance;
+#define I915_ENGINE_CLASS_INVALID_NONE -1
+#define I915_ENGINE_CLASS_INVALID_VIRTUAL -2
 };
 
 /**
@@ -602,6 +604,12 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_MMAP_GTT_COHERENT   52
 
+/*
+ * Query whether DRM_I915_GEM_EXECBUFFER2 supports coordination of parallel
+ * execution through use of explicit fence support.
+ * See I915_EXEC_FENCE_OUT and I915_EXEC_FENCE_SUBMIT.
+ */
+#define I915_PARAM_HAS_EXEC_SUBMIT_FENCE 53
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -1124,7 +1132,16 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_FENCE_ARRAY   (1<<19)
 
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_ARRAY<<1))
+/*
+ * Setting I915_EXEC_FENCE_SUBMIT implies that lower_32_bits(rsvd2) represent
+ * a sync_file fd to wait upon (in a nonblocking manner) prior to executing
+ * the batch.
+ *
+ * Returns -EINVAL if the sync_file fd cannot be found.
+ */
+#define I915_EXEC_FENCE_SUBMIT (1 << 20)
+
+#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SUBMIT << 1))
 
 #define I915_EXEC_CONTEXT_ID_MASK  (0x)
 #define i915_execbuffer2_set_context_id(eb2, context) \
@@ -1523,6 +1540,30 @@ struct drm_i915_gem_context_param {
 * See DRM_I915_GEM_VM_CREATE and DRM_I915_GEM_VM_DESTROY.
 */
 #define I915_CONTEXT_PARAM_VM  0x9
+
+/*
+ * I915_CONTEXT_PARAM_ENGINES:
+ *
+ * Bind this context to operate on this subset of available engines. 
Henceforth,
+ * the I915_EXEC_RING selector for DRM_IOCTL_I915_GEM_EXECBUFFER2 operates as
+ * an index into this array of engines; I915_EXEC_DEFAULT selecting engine[0]
+ * and upwards. Slots 0...N are filled in using the specified (class, 
instance).
+ * Use
+ * engine_class: I915_ENGINE_CLASS_INVALID,
+ * engine_instance: I915_ENGINE_CLASS_INVALID_NONE
+ * to specify a gap in the array that can be filled in later, e.g. by a
+ * virtual engine used for load balancing.
+ *
+ * Setting the number of engines bound to the context to 0, by passing a zero
+ * sized argument, will revert back to default settings.
+ *
+ * See struct i915_context_param_engines.
+ *
+ * Extensions:
+ *   i915_context_engines_load_balance (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE)
+ *   i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND)
+ */
+#define I915_CONTEXT_PARAM_ENGINES 0xa
 /* Must be kept compact -- no holes and well documented */
 
__u64 value;
@@ -1586,12 +1627,115 @@ struct drm_i915_gem_context_param_sseu {
__u32 rsvd;
 };
 
+/*
+ * i915_context_engines_load_balance:
+ *
+ * Enable load balancing across this set of engines.
+ *
+ * Into the I915_EXEC_DEFAULT slot [0], a virtual engine is created that when
+ * used will proxy the execbuffer request onto one of the set of engines
+ * in such a way as to distribute the load evenly across the set.
+ *
+ * The set of engines must be compatible (e.g. the same HW class) as they
+ * will share the same logical GPU context and ring.
+ *
+ * To intermix rendering with the virtual engine and direct rendering onto
+ * the backing engines (bypassing the load balancing proxy), the context must
+ * be defined to use a single timeline for all engines.
+ */
+struct i915_context_engines_load_balance {
+   struct i915_user_extension base;
+
+   __u16 engine_index;
+   __u16 num_siblings;
+   __u32 flags; /* all undefined flags must be zero */
+
+   __u64 mbz64; /* reserved for future use; must be zero */
+
+   struct i915_engine_class_instance engines[0];
+} __attribute__((packed));
+
+#define I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(name__, N__) struct { \
+   struct i915_user_extension base; \
+   __u16 engine_index; \
+   __u16 num_siblings; \
+   __u32 flags; \
+   __u64 mbz64; \
+   struct i915_engine_class_instance engines[N__]; \
+} __attribute__((packed)) name__
+
+/*
+ * i915_context_engines_bond:
+ *
+ * Constructed bonded pairs for execution within a virtual engine.
+ *
+ * All engines are equal, but some are more equal than others. Given
+ * the distribution of resources in the HW, it may be preferable to run
+ * a request on a given subset of engines in parallel to a request on a
+ * specific engine. We enable this selection of engines within a virtual
+ * engine by 

[Intel-gfx] [PATCH i-g-t 11/16] i915/gem_exec_whisper: debugfs/next_seqno is defunct

2019-05-08 Thread Chris Wilson
We removed next_seqno in 5.1, so time to wave goodbye.

Signed-off-by: Chris Wilson 
---
 tests/i915/gem_exec_whisper.c | 12 
 1 file changed, 12 deletions(-)

diff --git a/tests/i915/gem_exec_whisper.c b/tests/i915/gem_exec_whisper.c
index d5afc8119..61b8d6dac 100644
--- a/tests/i915/gem_exec_whisper.c
+++ b/tests/i915/gem_exec_whisper.c
@@ -44,15 +44,6 @@
 
 #define VERIFY 0
 
-static void write_seqno(int dir, unsigned offset)
-{
-   uint32_t seqno = UINT32_MAX - offset;
-
-   igt_sysfs_printf(dir, "i915_next_seqno", "0x%x", seqno);
-
-   igt_debug("next seqno set to: 0x%x\n", seqno);
-}
-
 static void check_bo(int fd, uint32_t handle, int pass)
 {
uint32_t *map;
@@ -355,9 +346,6 @@ static void whisper(int fd, unsigned engine, unsigned flags)
igt_until_timeout(150) {
uint64_t offset;
 
-   if (nchild == 1)
-   write_seqno(debugfs, pass);
-
if (flags & HANG)
submit_hang(&hang, engines, nengine, flags);
 
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH i-g-t 16/16] i915/gem_exec_latency: Add another variant of waiter latency

2019-05-08 Thread Chris Wilson
Signed-off-by: Chris Wilson 
---
 tests/i915/gem_exec_latency.c | 81 +++
 1 file changed, 81 insertions(+)

diff --git a/tests/i915/gem_exec_latency.c b/tests/i915/gem_exec_latency.c
index e88fbbc6a..fd4ceb4d6 100644
--- a/tests/i915/gem_exec_latency.c
+++ b/tests/i915/gem_exec_latency.c
@@ -490,6 +490,83 @@ static void execution_latency(int i915, unsigned int ring, const char *name)
gem_close(i915, obj.handle);
 }
 
+static void wakeup_latency(int i915, unsigned int ring, const char *name)
+{
+   struct drm_i915_gem_exec_object2 obj = {
+   .handle = gem_create(i915, 4096),
+   };
+   struct drm_i915_gem_execbuffer2 execbuf = {
+   .buffers_ptr = to_user_pointer(&obj),
+   .buffer_count = 1,
+   .flags = ring | LOCAL_I915_EXEC_NO_RELOC | LOCAL_I915_EXEC_HANDLE_LUT,
+   };
+   const unsigned int mmio_base = 0x2000;
+   const unsigned int cs_timestamp = mmio_base + 0x358;
+   volatile uint32_t *timestamp;
+   struct igt_mean wakeup;
+   uint32_t *cs, *result;
+
+   timestamp = (volatile uint32_t *)((volatile char *)igt_global_mmio + cs_timestamp);
+
+   obj.flags = EXEC_OBJECT_PINNED;
+   result = gem_mmap__wc(i915, obj.handle, 0, 4096, PROT_WRITE);
+
+   cs = result;
+
+   *cs++ = 0x24 << 23 | 2; /* SRM */
+   *cs++ = cs_timestamp;
+   *cs++ = 4096 - 16 * 4;
+   *cs++ = 0;
+
+   *cs++ = MI_BATCH_BUFFER_START | 1;
+   *cs++ = 0;
+   *cs++ = 0;
+
+   *cs++ = 0x24 << 23 | 2; /* SRM */
+   *cs++ = cs_timestamp;
+   *cs++ = 4096 - 16 * 4 + 4;
+   *cs++ = 0;
+   *cs++ = 0xa << 23;
+
+   cs = result + 1024 - 16;
+
+   {
+   struct sched_param p = { .sched_priority = 99 };
+   sched_setscheduler(0, SCHED_FIFO | SCHED_RESET_ON_FORK, &p);
+   }
+
+   igt_mean_init(&wakeup);
+   igt_until_timeout(2) {
+   uint32_t end;
+
+   igt_fork(child, 1) {
+   result[4] = MI_BATCH_BUFFER_START | 1;
+   cs[0] = 0;
+
+   gem_execbuf(i915, &execbuf);
+
+   while (!cs[0])
+   ;
+   result[4] = 0;
+   __sync_synchronize();
+   }
+   gem_sync(i915, obj.handle);
+   end = *timestamp;
+
+   igt_mean_add(&wakeup, (end - cs[1]) * rcs_clock);
+   igt_waitchildren();
+   }
+   igt_info("%s Wakeup latency: %.2f±%.2fms [%.2f, %.2f]\n", name,
+1e-6 * igt_mean_get(&wakeup),
+1e-6 * sqrt(igt_mean_get_variance(&wakeup)),
+1e-6 * wakeup.min, 1e-6 * wakeup.max);
+
+   munmap(result, 4096);
+   gem_close(i915, obj.handle);
+}
+
 static void
 __submit_spin(int fd, igt_spin_t *spin, unsigned int flags)
 {
@@ -942,6 +1019,10 @@ igt_main
execution_latency(device,
  e->exec_id | e->flags,
  e->name);
+   igt_subtest_f("%s-wakeup-latency", e->name)
+   wakeup_latency(device,
+   e->exec_id | e->flags,
+   e->name);
 
igt_subtest_f("%s-live-dispatch-queued", e->name)
latency_on_ring(device,
-- 
2.20.1


[Intel-gfx] [PATCH i-g-t 07/16] i915: Add gem_ctx_clone

2019-05-08 Thread Chris Wilson
Exercise cloning contexts, an extension of merely creating one.

Signed-off-by: Chris Wilson 
---
 tests/Makefile.sources |   1 +
 tests/i915/gem_ctx_clone.c | 460 +
 tests/meson.build  |   1 +
 3 files changed, 462 insertions(+)
 create mode 100644 tests/i915/gem_ctx_clone.c

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 1a541d206..e1b7feeb2 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -21,6 +21,7 @@ TESTS_progs = \
drm_import_export \
drm_mm \
drm_read \
+   i915/gem_ctx_clone \
i915/gem_vm_create \
kms_3d \
kms_addfb_basic \
diff --git a/tests/i915/gem_ctx_clone.c b/tests/i915/gem_ctx_clone.c
new file mode 100644
index 0..cdc5bf413
--- /dev/null
+++ b/tests/i915/gem_ctx_clone.c
@@ -0,0 +1,460 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "igt.h"
+#include "igt_gt.h"
+#include "i915/gem_vm.h"
+#include "i915_drm.h"
+
+static int ctx_create_ioctl(int i915, struct drm_i915_gem_context_create_ext *arg)
+{
+   int err;
+
+   err = 0;
+   if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, arg)) {
+   err = -errno;
+   igt_assume(err);
+   }
+
+   errno = 0;
+   return err;
+}
+
+static bool has_ctx_clone(int i915)
+{
+   struct drm_i915_gem_context_create_ext_clone ext = {
+   { .name = I915_CONTEXT_CREATE_EXT_CLONE },
+   .clone_id = -1,
+   };
+   struct drm_i915_gem_context_create_ext create = {
+   .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
+   .extensions = to_user_pointer(&ext),
+   };
+   return ctx_create_ioctl(i915, &create) == -ENOENT;
+}
+
+static void invalid_clone(int i915)
+{
+   struct drm_i915_gem_context_create_ext_clone ext = {
+   { .name = I915_CONTEXT_CREATE_EXT_CLONE },
+   };
+   struct drm_i915_gem_context_create_ext create = {
+   .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
+   .extensions = to_user_pointer(&ext),
+   };
+
+   igt_assert_eq(ctx_create_ioctl(i915, &create), 0);
+   gem_context_destroy(i915, create.ctx_id);
+
+   ext.flags = -1; /* Hopefully we won't run out of flags */
+   igt_assert_eq(ctx_create_ioctl(i915, &create), -EINVAL);
+   ext.flags = 0;
+
+   ext.base.next_extension = -1;
+   igt_assert_eq(ctx_create_ioctl(i915, &create), -EFAULT);
+   ext.base.next_extension = to_user_pointer(&ext);
+   igt_assert_eq(ctx_create_ioctl(i915, &create), -E2BIG);
+   ext.base.next_extension = 0;
+
+   ext.clone_id = -1;
+   igt_assert_eq(ctx_create_ioctl(i915, &create), -ENOENT);
+   ext.clone_id = 0;
+}
+
+static void clone_flags(int i915)
+{
+   struct drm_i915_gem_context_create_ext_setparam set = {
+   { .name = I915_CONTEXT_CREATE_EXT_SETPARAM },
+   { .param = I915_CONTEXT_PARAM_RECOVERABLE },
+   };
+   struct drm_i915_gem_context_create_ext_clone ext = {
+   { .name = I915_CONTEXT_CREATE_EXT_CLONE },
+   .flags = I915_CONTEXT_CLONE_FLAGS,
+   };
+   struct drm_i915_gem_context_create_ext create = {
+   .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
+   .extensions = to_user_pointer(&ext),
+   };
+   int expected;
+
+   set.param.value = 1; /* default is recoverable */
+   igt_require(__gem_context_set_param(i915, &set.param) == 0);
+
+   for (int pass = 0; pass < 2; pass++) { /* cloning default, then child */
+   igt_debug("Cloning %d\n", ext.clone_id);
+   igt_assert_eq(ctx_create_ioctl(i915, &create), 0);
+
+   set.param.ctx_id = ext.clone_id;
+   gem_context_get_param(i915, &set.param);
+   

[Intel-gfx] [PATCH i-g-t 08/16] i915: Exercise creating context with shared GTT

2019-05-08 Thread Chris Wilson
v2: Test each shared context is its own timeline and allows request
reordering between shared contexts.

Signed-off-by: Chris Wilson 
Cc: Joonas Lahtinen 
Cc: Tvrtko Ursulin 
Cc: Mika Kuoppala 
Cc: Michal Wajdeczko 
---
 lib/i915/gem_context.c|  68 +++
 lib/i915/gem_context.h|  13 +
 tests/Makefile.sources|   1 +
 tests/i915/gem_ctx_shared.c   | 856 ++
 tests/i915/gem_exec_whisper.c |  32 +-
 tests/meson.build |   1 +
 6 files changed, 962 insertions(+), 9 deletions(-)
 create mode 100644 tests/i915/gem_ctx_shared.c

diff --git a/lib/i915/gem_context.c b/lib/i915/gem_context.c
index f94d89cb4..8fb8984d1 100644
--- a/lib/i915/gem_context.c
+++ b/lib/i915/gem_context.c
@@ -272,6 +272,74 @@ void gem_context_set_priority(int fd, uint32_t ctx_id, int prio)
igt_assert_eq(__gem_context_set_priority(fd, ctx_id, prio), 0);
 }
 
+int
+__gem_context_clone(int i915,
+   uint32_t src, unsigned int share,
+   unsigned int flags,
+   uint32_t *out)
+{
+   struct drm_i915_gem_context_create_ext_clone clone = {
+   { .name = I915_CONTEXT_CREATE_EXT_CLONE },
+   .clone_id = src,
+   .flags = share,
+   };
+   struct drm_i915_gem_context_create_ext arg = {
+   .flags = flags | I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
+   .extensions = to_user_pointer(&clone),
+   };
+   int err = 0;
+
+   if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, &arg))
+   err = -errno;
+
+   *out = arg.ctx_id;
+
+   errno = 0;
+   return err;
+}
+
+static bool __gem_context_has(int i915, uint32_t share, unsigned int flags)
+{
+   uint32_t ctx;
+
+   __gem_context_clone(i915, 0, share, flags, &ctx);
+   if (ctx)
+   gem_context_destroy(i915, ctx);
+
+   errno = 0;
+   return ctx;
+}
+
+bool gem_contexts_has_shared_gtt(int i915)
+{
+   return __gem_context_has(i915, I915_CONTEXT_CLONE_VM, 0);
+}
+
+bool gem_has_queues(int i915)
+{
+   return __gem_context_has(i915,
+I915_CONTEXT_CLONE_VM,
+I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE);
+}
+
+uint32_t gem_context_clone(int i915,
+  uint32_t src, unsigned int share,
+  unsigned int flags)
+{
+   uint32_t ctx;
+
+   igt_assert_eq(__gem_context_clone(i915, src, share, flags, &ctx), 0);
+
+   return ctx;
+}
+
+uint32_t gem_queue_create(int i915)
+{
+   return gem_context_clone(i915, 0,
+I915_CONTEXT_CLONE_VM,
+I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE);
+}
+
 bool gem_context_has_engine(int fd, uint32_t ctx, uint64_t engine)
 {
struct drm_i915_gem_exec_object2 exec = {};
diff --git a/lib/i915/gem_context.h b/lib/i915/gem_context.h
index a052714d4..8043c3401 100644
--- a/lib/i915/gem_context.h
+++ b/lib/i915/gem_context.h
@@ -29,6 +29,19 @@ int __gem_context_create(int fd, uint32_t *ctx_id);
 void gem_context_destroy(int fd, uint32_t ctx_id);
 int __gem_context_destroy(int fd, uint32_t ctx_id);
 
+int __gem_context_clone(int i915,
+   uint32_t src, unsigned int share,
+   unsigned int flags,
+   uint32_t *out);
+uint32_t gem_context_clone(int i915,
+  uint32_t src, unsigned int share,
+  unsigned int flags);
+
+uint32_t gem_queue_create(int i915);
+
+bool gem_contexts_has_shared_gtt(int i915);
+bool gem_has_queues(int i915);
+
 bool gem_has_contexts(int fd);
 void gem_require_contexts(int fd);
 void gem_context_require_bannable(int fd);
diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index e1b7feeb2..3552e895b 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -22,6 +22,7 @@ TESTS_progs = \
drm_mm \
drm_read \
i915/gem_ctx_clone \
+   i915/gem_ctx_shared \
i915/gem_vm_create \
kms_3d \
kms_addfb_basic \
diff --git a/tests/i915/gem_ctx_shared.c b/tests/i915/gem_ctx_shared.c
new file mode 100644
index 0..0076f5e9d
--- /dev/null
+++ b/tests/i915/gem_ctx_shared.c
@@ -0,0 +1,856 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.

[Intel-gfx] [PATCH i-g-t 05/16] i915/gem_ctx_create: Basic checks for constructor properties

2019-05-08 Thread Chris Wilson
Check that the extended create interface accepts setparam.

Signed-off-by: Chris Wilson 
---
 tests/i915/gem_ctx_create.c | 225 ++--
 1 file changed, 213 insertions(+), 12 deletions(-)

diff --git a/tests/i915/gem_ctx_create.c b/tests/i915/gem_ctx_create.c
index a664070db..9b4fddbe7 100644
--- a/tests/i915/gem_ctx_create.c
+++ b/tests/i915/gem_ctx_create.c
@@ -33,6 +33,7 @@
 #include 
 
 #include "igt_rand.h"
+#include "sw_sync.h"
 
 #define LOCAL_I915_EXEC_BSD_SHIFT  (13)
 #define LOCAL_I915_EXEC_BSD_MASK   (3 << LOCAL_I915_EXEC_BSD_SHIFT)
@@ -45,12 +46,33 @@ static unsigned all_nengine;
 static unsigned ppgtt_engines[16];
 static unsigned ppgtt_nengine;
 
-static int __gem_context_create_local(int fd, struct drm_i915_gem_context_create *arg)
+static int create_ioctl(int fd, struct drm_i915_gem_context_create *arg)
 {
-   int ret = 0;
-   if (drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, arg))
-   ret = -errno;
-   return ret;
+   int err;
+
+   err = 0;
+   if (igt_ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, arg)) {
+   err = -errno;
+   igt_assert(err);
+   }
+
+   errno = 0;
+   return err;
+}
+
+static int create_ext_ioctl(int i915,
+   struct drm_i915_gem_context_create_ext *arg)
+{
+   int err;
+
+   err = 0;
+   if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, arg)) {
+   err = -errno;
+   igt_assume(err);
+   }
+
+   errno = 0;
+   return err;
 }
 
 static double elapsed(const struct timespec *start,
@@ -308,6 +330,187 @@ static void maximum(int fd, int ncpus, unsigned mode)
free(contexts);
 }
 
+static void basic_ext_param(int i915)
+{
+   struct drm_i915_gem_context_create_ext_setparam ext = {
+   { .name = I915_CONTEXT_CREATE_EXT_SETPARAM },
+   };
+   struct drm_i915_gem_context_create_ext create = {
+   .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS
+   };
+   struct drm_i915_gem_context_param get;
+
+   igt_require(create_ext_ioctl(i915, &create) == 0);
+   gem_context_destroy(i915, create.ctx_id);
+
+   create.extensions = -1ull;
+   igt_assert_eq(create_ext_ioctl(i915, &create), -EFAULT);
+
+   create.extensions = to_user_pointer(&ext);
+   igt_assert_eq(create_ext_ioctl(i915, &create), -EINVAL);
+
+   ext.param.param = I915_CONTEXT_PARAM_PRIORITY;
+   if (create_ext_ioctl(i915, &create) != -ENODEV) {
+   gem_context_destroy(i915, create.ctx_id);
+
+   ext.base.next_extension = -1ull;
+   igt_assert_eq(create_ext_ioctl(i915, &create), -EFAULT);
+   ext.base.next_extension = to_user_pointer(&ext);
+   igt_assert_eq(create_ext_ioctl(i915, &create), -E2BIG);
+   ext.base.next_extension = 0;
+
+   ext.param.value = 32;
+   igt_assert_eq(create_ext_ioctl(i915, &create), 0);
+
+   memset(&get, 0, sizeof(get));
+   get.ctx_id = create.ctx_id;
+   get.param = I915_CONTEXT_PARAM_PRIORITY;
+   gem_context_get_param(i915, &get);
+   igt_assert_eq(get.value, ext.param.value);
+
+   gem_context_destroy(i915, create.ctx_id);
+   }
+}
+
+static void check_single_timeline(int i915, uint32_t ctx, int num_engines)
+{
+#define RCS_TIMESTAMP (0x2000 + 0x358)
+   const int gen = intel_gen(intel_get_drm_devid(i915));
+   const int has_64bit_reloc = gen >= 8;
+   struct drm_i915_gem_exec_object2 results = { .handle = gem_create(i915, 4096) };
+   const uint32_t bbe = MI_BATCH_BUFFER_END;
+   int timeline = sw_sync_timeline_create();
+   uint32_t last, *map;
+
+   {
+   struct drm_i915_gem_execbuffer2 execbuf = {
+   .buffers_ptr = to_user_pointer(&results),
+   .buffer_count = 1,
+   .rsvd1 = ctx,
+   };
+   gem_write(i915, results.handle, 0, &bbe, sizeof(bbe));
+   gem_execbuf(i915, &execbuf);
+   results.flags = EXEC_OBJECT_PINNED;
+   }
+
+   for (int i = 0; i < num_engines; i++) {
+   struct drm_i915_gem_exec_object2 obj[2] = {
+   results, /* write hazard lies! */
+   { .handle = gem_create(i915, 4096) },
+   };
+   struct drm_i915_gem_execbuffer2 execbuf = {
+   .buffers_ptr = to_user_pointer(obj),
+   .buffer_count = 2,
+   .rsvd1 = ctx,
+   .rsvd2 = sw_sync_timeline_create_fence(timeline, num_engines - i),
+   .flags = i | I915_EXEC_FENCE_IN,
+   };
+   uint64_t offset = results.offset + 4 * i;
+   uint32_t *cs;
+   int j = 0;
+
+   cs = gem_mmap__cpu(i915, obj[1].handl

[Intel-gfx] [PATCH i-g-t 01/16] i915/gem_exec_schedule: Semaphore priority fixups

2019-05-08 Thread Chris Wilson
A stray git add from my test boxen -- we were being careful enough to
preserve priority and ordering to match the implicit policies.

Signed-off-by: Chris Wilson 
---
 tests/i915/gem_exec_schedule.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/i915/gem_exec_schedule.c b/tests/i915/gem_exec_schedule.c
index 330e8a54e..77a264a6a 100644
--- a/tests/i915/gem_exec_schedule.c
+++ b/tests/i915/gem_exec_schedule.c
@@ -507,6 +507,7 @@ static void semaphore_resolve(int i915)
uint32_t handle, cancel;
uint32_t *cs, *map;
igt_spin_t *spin;
+   int64_t poke = 1;
 
if (!gem_can_store_dword(i915, engine))
continue;
@@ -587,6 +588,7 @@ static void semaphore_resolve(int i915)
eb.buffer_count = 2;
eb.rsvd1 = inner;
gem_execbuf(i915, &eb);
+   gem_wait(i915, cancel, &poke);
gem_close(i915, cancel);
 
gem_sync(i915, handle); /* To hang unless cancel runs! */
-- 
2.20.1


[Intel-gfx] [PATCH i-g-t 13/16] i915: Add gem_exec_balancer

2019-05-08 Thread Chris Wilson
Exercise the in-kernel load balancer checking that we can distribute
batches across the set of ctx->engines to avoid load.

Signed-off-by: Chris Wilson 
---
 tests/Makefile.am  |1 +
 tests/Makefile.sources |1 +
 tests/i915/gem_exec_balancer.c | 1050 
 tests/meson.build  |7 +
 4 files changed, 1059 insertions(+)
 create mode 100644 tests/i915/gem_exec_balancer.c

diff --git a/tests/Makefile.am b/tests/Makefile.am
index 5097debf6..c6af0aeaf 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -96,6 +96,7 @@ gem_close_race_LDADD = $(LDADD) -lpthread
 gem_ctx_thrash_CFLAGS = $(AM_CFLAGS) $(THREAD_CFLAGS)
 gem_ctx_thrash_LDADD = $(LDADD) -lpthread
 gem_ctx_sseu_LDADD = $(LDADD) $(top_builddir)/lib/libigt_perf.la
+i915_gem_exec_balancer_LDADD = $(LDADD) $(top_builddir)/lib/libigt_perf.la
 gem_exec_capture_LDADD = $(LDADD) -lz
 gem_exec_parallel_CFLAGS = $(AM_CFLAGS) $(THREAD_CFLAGS)
 gem_exec_parallel_LDADD = $(LDADD) -lpthread
diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index e7ee27e81..323b625aa 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -24,6 +24,7 @@ TESTS_progs = \
i915/gem_ctx_clone \
i915/gem_ctx_engines \
i915/gem_ctx_shared \
+   i915/gem_exec_balancer \
i915/gem_vm_create \
kms_3d \
kms_addfb_basic \
diff --git a/tests/i915/gem_exec_balancer.c b/tests/i915/gem_exec_balancer.c
new file mode 100644
index 0..25195d478
--- /dev/null
+++ b/tests/i915/gem_exec_balancer.c
@@ -0,0 +1,1050 @@
+/*
+ * Copyright © 2018 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include 
+
+#include "igt.h"
+#include "igt_perf.h"
+#include "i915/gem_ring.h"
+#include "sw_sync.h"
+
+IGT_TEST_DESCRIPTION("Exercise in-kernel load-balancing");
+
+#define INSTANCE_COUNT (1 << I915_PMU_SAMPLE_INSTANCE_BITS)
+
+static bool has_class_instance(int i915, uint16_t class, uint16_t instance)
+{
+   int fd;
+
+   fd = perf_i915_open(I915_PMU_ENGINE_BUSY(class, instance));
+   if (fd != -1) {
+   close(fd);
+   return true;
+   }
+
+   return false;
+}
+
+static struct i915_engine_class_instance *
+list_engines(int i915, uint32_t class_mask, unsigned int *out)
+{
+   unsigned int count = 0, size = 64;
+   struct i915_engine_class_instance *engines;
+
+   engines = malloc(size * sizeof(*engines));
+   if (!engines) {
+   *out = 0;
+   return NULL;
+   }
+
+   for (enum drm_i915_gem_engine_class class = I915_ENGINE_CLASS_RENDER;
+class_mask;
+class++, class_mask >>= 1) {
+   if (!(class_mask & 1))
+   continue;
+
+   for (unsigned int instance = 0;
+instance < INSTANCE_COUNT;
+instance++) {
+   if (!has_class_instance(i915, class, instance))
+   continue;
+
+   if (count == size) {
+   struct i915_engine_class_instance *e;
+
+   size *= 2;
e = realloc(engines, size * sizeof(*engines));
+   if (!e) {
+   *out = count;
+   return engines;
+   }
+
+   engines = e;
+   }
+
+   engines[count++] = (struct i915_engine_class_instance){
+   .engine_class = class,
+   .engine_instance = instance,
+   };
+   }
+   }
+
+   if (!count) {
+   free(engines);
+   engines = NULL;
+   }
+
+   *out = count;
+   return engines;
+}
+

[Intel-gfx] [PATCH i-g-t 12/16] i915: Add gem_ctx_engines

2019-05-08 Thread Chris Wilson
To exercise the new I915_CONTEXT_PARAM_ENGINES and interactions with
gem_execbuf().

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
Cc: Andi Shyti 
Reviewed-by: Andi Shyti 
---
 tests/Makefile.sources   |   1 +
 tests/i915/gem_ctx_engines.c | 517 +++
 tests/meson.build|   1 +
 3 files changed, 519 insertions(+)
 create mode 100644 tests/i915/gem_ctx_engines.c

diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index 3552e895b..e7ee27e81 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -22,6 +22,7 @@ TESTS_progs = \
drm_mm \
drm_read \
i915/gem_ctx_clone \
+   i915/gem_ctx_engines \
i915/gem_ctx_shared \
i915/gem_vm_create \
kms_3d \
diff --git a/tests/i915/gem_ctx_engines.c b/tests/i915/gem_ctx_engines.c
new file mode 100644
index 0..f83aa4772
--- /dev/null
+++ b/tests/i915/gem_ctx_engines.c
@@ -0,0 +1,517 @@
+/*
+ * Copyright © 2018 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "igt.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "i915/gem_context.h"
+#include "sw_sync.h"
+
+#define engine_class(e, n) ((e)->engines[(n)].engine_class)
+#define engine_instance(e, n) ((e)->engines[(n)].engine_instance)
+
+static bool has_context_engines(int i915)
+{
+   struct drm_i915_gem_context_param param = {
+   .ctx_id = 0,
+   .param = I915_CONTEXT_PARAM_ENGINES,
+   };
+   return __gem_context_set_param(i915, &param) == 0;
+}
+
+static void invalid_engines(int i915)
+{
+   struct i915_context_param_engines stack = {}, *engines;
+   struct drm_i915_gem_context_param param = {
+   .ctx_id = gem_context_create(i915),
+   .param = I915_CONTEXT_PARAM_ENGINES,
+   .value = to_user_pointer(&stack),
+   };
+   uint32_t handle;
+   void *ptr;
+
+   param.size = 0;
+   igt_assert_eq(__gem_context_set_param(i915, &param), 0);
+
+   param.size = 1;
+   igt_assert_eq(__gem_context_set_param(i915, &param), -EINVAL);
+
+   param.size = sizeof(stack) - 1;
+   igt_assert_eq(__gem_context_set_param(i915, &param), -EINVAL);
+
+   param.size = sizeof(stack) + 1;
+   igt_assert_eq(__gem_context_set_param(i915, &param), -EINVAL);
+
+   param.size = 0;
+   igt_assert_eq(__gem_context_set_param(i915, &param), 0);
+
+   /* Create a single page surrounded by inaccessible nothingness */
+   ptr = mmap(NULL, 3 * 4096, PROT_WRITE, MAP_ANON | MAP_PRIVATE, -1, 0);
+   igt_assert(ptr != MAP_FAILED);
+
+   munmap(ptr, 4096);
+   engines = ptr + 4096;
+   munmap(ptr + 2 * 4096, 4096);
+
+   param.size = sizeof(*engines) + sizeof(*engines->engines);
+   param.value = to_user_pointer(engines);
+
+   engines->engines[0].engine_class = -1;
+   igt_assert_eq(__gem_context_set_param(i915, &param), -ENOENT);
+
+   mprotect(engines, 4096, PROT_READ);
+   igt_assert_eq(__gem_context_set_param(i915, &param), -ENOENT);
+
+   mprotect(engines, 4096, PROT_WRITE);
+   engines->engines[0].engine_class = 0;
+   if (__gem_context_set_param(i915, &param)) /* XXX needs RCS */
+   goto out;
+
+   engines->extensions = to_user_pointer(ptr);
+   igt_assert_eq(__gem_context_set_param(i915, &param), -EFAULT);
+
+   engines->extensions = 0;
+   igt_assert_eq(__gem_context_set_param(i915, &param), 0);
+
+   param.value = to_user_pointer(engines - 1);
+   igt_assert_eq(__gem_context_set_param(i915, &param), -EFAULT);
+
+   param.value = to_user_pointer(engines) - 1;
+   igt_assert_eq(__gem_context_set_param(i915, &param), -EFAULT);
+
+   param.value = to_user_pointer(engines) - param.size + 1;
+   igt_assert_eq(__gem_context_set_param(i915, &param), -EFAULT);

[Intel-gfx] [PATCH i-g-t 03/16] i915: Add gem_vm_create

2019-05-08 Thread Chris Wilson
Exercise basic creation and swapping between new address spaces.

v2: Check isolation that the same vm_id on different fd are indeed
different VM.
v3: Cross-over check with CREATE_EXT_SETPARAM

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
Reviewed-by: Tvrtko Ursulin 
---
 lib/Makefile.sources   |   2 +
 lib/i915/gem_vm.c  | 130 
 lib/i915/gem_vm.h  |  38 
 lib/meson.build|   1 +
 tests/Makefile.sources |   1 +
 tests/i915/gem_vm_create.c | 412 +
 tests/meson.build  |   1 +
 7 files changed, 585 insertions(+)
 create mode 100644 lib/i915/gem_vm.c
 create mode 100644 lib/i915/gem_vm.h
 create mode 100644 tests/i915/gem_vm_create.c

diff --git a/lib/Makefile.sources b/lib/Makefile.sources
index 976858238..891f65b96 100644
--- a/lib/Makefile.sources
+++ b/lib/Makefile.sources
@@ -13,6 +13,8 @@ lib_source_list = \
i915/gem_ring.c \
i915/gem_mman.c \
i915/gem_mman.h \
+   i915/gem_vm.c   \
+   i915/gem_vm.h   \
i915_3d.h   \
i915_reg.h  \
i915_pciids.h   \
diff --git a/lib/i915/gem_vm.c b/lib/i915/gem_vm.c
new file mode 100644
index 0..9a022a56c
--- /dev/null
+++ b/lib/i915/gem_vm.c
@@ -0,0 +1,130 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include 
+#include 
+
+#include "ioctl_wrappers.h"
+#include "drmtest.h"
+
+#include "i915/gem_vm.h"
+
+/**
+ * SECTION:gem_vm
+ * @short_description: Helpers for dealing with address spaces (vm/GTT)
+ * @title: GEM Virtual Memory
+ *
+ * This helper library contains functions used for handling gem address
+ * spaces.
+ */
+
+/**
+ * gem_has_vm:
+ * @i915: open i915 drm file descriptor
+ *
+ * Returns: whether VM creation is supported or not.
+ */
+bool gem_has_vm(int i915)
+{
+   uint32_t vm_id = 0;
+
+   __gem_vm_create(i915, &vm_id);
+   if (vm_id)
+   gem_vm_destroy(i915, vm_id);
+
+   return vm_id;
+}
+
+/**
+ * gem_require_vm:
+ * @i915: open i915 drm file descriptor
+ *
+ * This helper will automatically skip the test on platforms where address
+ * space creation is not available.
+ */
+void gem_require_vm(int i915)
+{
+   igt_require(gem_has_vm(i915));
+}
+
+int __gem_vm_create(int i915, uint32_t *vm_id)
+{
+   struct drm_i915_gem_vm_control ctl = {};
+   int err = 0;
+
+   if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_VM_CREATE, &ctl) == 0) {
+   *vm_id = ctl.vm_id;
+   } else {
+   err = -errno;
+   igt_assume(err != 0);
+   }
+
+   errno = 0;
+   return err;
+}
+
+/**
+ * gem_vm_create:
+ * @i915: open i915 drm file descriptor
+ *
+ * This wraps the VM_CREATE ioctl, which is used to allocate a new
+ * address space for use with GEM contexts.
+ *
+ * Returns: The id of the allocated address space.
+ */
+uint32_t gem_vm_create(int i915)
+{
+   uint32_t vm_id;
+
+   igt_assert_eq(__gem_vm_create(i915, &vm_id), 0);
+   igt_assert(vm_id != 0);
+
+   return vm_id;
+}
+
+int __gem_vm_destroy(int i915, uint32_t vm_id)
+{
+   struct drm_i915_gem_vm_control ctl = { .vm_id = vm_id };
+   int err = 0;
+
+   if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_VM_DESTROY, &ctl)) {
+   err = -errno;
+   igt_assume(err);
+   }
+
+   errno = 0;
+   return err;
+}
+
+/**
+ * gem_vm_destroy:
+ * @i915: open i915 drm file descriptor
+ * @vm_id: i915 VM id
+ *
+ * This wraps the VM_DESTROY ioctl, which is used to free an address space
+ * handle.
+ */
+void gem_vm_destroy(int i915, uint32_t vm_id)
+{
+   igt_assert_eq(__gem_vm_destroy(i915, vm_id), 0);
+}
diff --git a/lib/i915/gem_vm.h b/lib/i915/gem_vm.h
new file mode 100644
index 0..27af899d4
--- /dev/null
+++ b/lib/i915/gem_vm.h
@@ -0,0 +1,38 @@
+/*
+ * Copyright 

Re: [Intel-gfx] [PATCH 15/40] drm/i915: Apply an execution_mask to the virtual_engine

2019-05-08 Thread Tvrtko Ursulin


On 08/05/2019 09:06, Chris Wilson wrote:

Allow the user to direct which physical engines of the virtual engine
they wish to execute on, as sometimes it is necessary to override the
load balancing algorithm.

v2: Only kick the virtual engines on context-out if required

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
  drivers/gpu/drm/i915/gt/intel_lrc.c|  67 +
  drivers/gpu/drm/i915/gt/selftest_lrc.c | 131 +
  drivers/gpu/drm/i915/i915_request.c|   1 +
  drivers/gpu/drm/i915/i915_request.h|   3 +
  4 files changed, 202 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index bc388df39802..69849ffb9c82 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -550,6 +550,15 @@ execlists_context_schedule_in(struct i915_request *rq)
rq->hw_context->active = rq->engine;
  }
  
+static void kick_siblings(struct i915_request *rq)

+{
+   struct virtual_engine *ve = to_virtual_engine(rq->hw_context->engine);
+   struct i915_request *next = READ_ONCE(ve->request);
+
+   if (next && next->execution_mask & ~rq->execution_mask)
+   tasklet_schedule(&ve->base.execlists.tasklet);
+}
+
  static inline void
  execlists_context_schedule_out(struct i915_request *rq, unsigned long status)
  {
@@ -557,6 +566,18 @@ execlists_context_schedule_out(struct i915_request *rq, 
unsigned long status)
intel_engine_context_out(rq->engine);
execlists_context_status_change(rq, status);
trace_i915_request_out(rq);
+
+   /*
+* If this is part of a virtual engine, its next request may have
+* been blocked waiting for access to the active context. We have
+* to kick all the siblings again in case we need to switch (e.g.
+* the next request is not runnable on this engine). Hopefully,
+* we will already have submitted the next request before the
+* tasklet runs and do not need to rebuild each virtual tree
+* and kick everyone again.
+*/
+   if (rq->engine != rq->hw_context->engine)
+   kick_siblings(rq);
  }
  
  static u64 execlists_update_context(struct i915_request *rq)

@@ -787,6 +808,9 @@ static bool virtual_matches(const struct virtual_engine *ve,
  {
const struct intel_engine_cs *active;
  
+	if (!(rq->execution_mask & engine->mask)) /* We peeked too soon! */

+   return false;
+
/*
 * We track when the HW has completed saving the context image
 * (i.e. when we have seen the final CS event switching out of
@@ -3159,12 +3183,44 @@ static const struct intel_context_ops 
virtual_context_ops = {
.destroy = virtual_context_destroy,
  };
  
+static intel_engine_mask_t virtual_submission_mask(struct virtual_engine *ve)

+{
+   struct i915_request *rq;
+   intel_engine_mask_t mask;
+
+   rq = READ_ONCE(ve->request);
+   if (!rq)
+   return 0;
+
+   /* The rq is ready for submission; rq->execution_mask is now stable. */
+   mask = rq->execution_mask;
+   if (unlikely(!mask)) {
+   /* Invalid selection, submit to a random engine in error */
+   i915_request_skip(rq, -ENODEV);
+   mask = ve->siblings[0]->mask;
+   }
+
+   GEM_TRACE("%s: rq=%llx:%lld, mask=%x, prio=%d\n",
+ ve->base.name,
+ rq->fence.context, rq->fence.seqno,
+ mask, ve->base.execlists.queue_priority_hint);
+
+   return mask;
+}
+
  static void virtual_submission_tasklet(unsigned long data)
  {
struct virtual_engine * const ve = (struct virtual_engine *)data;
const int prio = ve->base.execlists.queue_priority_hint;
+   intel_engine_mask_t mask;
unsigned int n;
  
+	rcu_read_lock();

+   mask = virtual_submission_mask(ve);
+   rcu_read_unlock();
+   if (unlikely(!mask))
+   return;
+
local_irq_disable();
for (n = 0; READ_ONCE(ve->request) && n < ve->num_siblings; n++) {
struct intel_engine_cs *sibling = ve->siblings[n];
@@ -3172,6 +3228,17 @@ static void virtual_submission_tasklet(unsigned long 
data)
struct rb_node **parent, *rb;
bool first;
  
+		if (unlikely(!(mask & sibling->mask))) {

+   if (!RB_EMPTY_NODE(&node->rb)) {
+   spin_lock(&sibling->timeline.lock);
+   rb_erase_cached(&node->rb,
+   &sibling->execlists.virtual);
+   RB_CLEAR_NODE(&node->rb);
+   spin_unlock(&sibling->timeline.lock);
+   }
+   continue;
+   }
+
spin_lock(&sibling->timeline.lock);
  
  		if (!RB_EMPTY_NODE(&node->rb)) {

diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c 
b/drivers/g

Re: [Intel-gfx] [PATCH 03/40] drm/i915: Pass i915_sched_node around internally

2019-05-08 Thread Tvrtko Ursulin


On 08/05/2019 09:06, Chris Wilson wrote:

To simplify the next patch, update bump_priority and schedule to accept
the internal i915_sched_node directly and not expect a request pointer.

add/remove: 0/0 grow/shrink: 2/1 up/down: 8/-15 (-7)
Function                     old     new   delta
i915_schedule_bump_priority  109     113      +4
i915_schedule                 50      54      +4
__i915_schedule              922     907     -15

v2: Adopt node for the old rq local, since it no longer is a request but
the origin node.

Signed-off-by: Chris Wilson 
---
  drivers/gpu/drm/i915/i915_scheduler.c | 36 ++-
  1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c 
b/drivers/gpu/drm/i915/i915_scheduler.c
index b7488c31e3e9..f32d0ee6d58c 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -186,7 +186,7 @@ static void kick_submission(struct intel_engine_cs *engine, 
int prio)
tasklet_hi_schedule(&engine->execlists.tasklet);
  }
  
-static void __i915_schedule(struct i915_request *rq,

+static void __i915_schedule(struct i915_sched_node *node,
const struct i915_sched_attr *attr)
  {
struct intel_engine_cs *engine;
@@ -200,13 +200,13 @@ static void __i915_schedule(struct i915_request *rq,
lockdep_assert_held(&schedule_lock);
GEM_BUG_ON(prio == I915_PRIORITY_INVALID);
  
-	if (i915_request_completed(rq))

+   if (node_signaled(node))
return;
  
-	if (prio <= READ_ONCE(rq->sched.attr.priority))

+   if (prio <= READ_ONCE(node->attr.priority))
return;
  
-	stack.signaler = &rq->sched;

+   stack.signaler = node;
list_add(&stack.dfs_link, &dfs);
  
  	/*

@@ -257,9 +257,9 @@ static void __i915_schedule(struct i915_request *rq,
 * execlists_submit_request()), we can set our own priority and skip
 * acquiring the engine locks.
 */
-   if (rq->sched.attr.priority == I915_PRIORITY_INVALID) {
-   GEM_BUG_ON(!list_empty(&rq->sched.link));
-   rq->sched.attr = *attr;
+   if (node->attr.priority == I915_PRIORITY_INVALID) {
+   GEM_BUG_ON(!list_empty(&node->link));
+   node->attr = *attr;
  
  		if (stack.dfs_link.next == stack.dfs_link.prev)

return;
@@ -268,15 +268,14 @@ static void __i915_schedule(struct i915_request *rq,
}
  
  	memset(&cache, 0, sizeof(cache));

-   engine = rq->engine;
+   engine = node_to_request(node)->engine;
spin_lock(&engine->timeline.lock);
  
  	/* Fifo and depth-first replacement ensure our deps execute before us */

list_for_each_entry_safe_reverse(dep, p, &dfs, dfs_link) {
-   struct i915_sched_node *node = dep->signaler;
-
INIT_LIST_HEAD(&dep->dfs_link);
  
+		node = dep->signaler;

engine = sched_lock_engine(node, engine, &cache);
lockdep_assert_held(&engine->timeline.lock);
  
@@ -319,13 +318,20 @@ static void __i915_schedule(struct i915_request *rq,

  void i915_schedule(struct i915_request *rq, const struct i915_sched_attr 
*attr)
  {
spin_lock_irq(&schedule_lock);
-   __i915_schedule(rq, attr);
+   __i915_schedule(&rq->sched, attr);
spin_unlock_irq(&schedule_lock);
  }
  
+static void __bump_priority(struct i915_sched_node *node, unsigned int bump)

+{
+   struct i915_sched_attr attr = node->attr;
+
+   attr.priority |= bump;
+   __i915_schedule(node, &attr);
+}
+
  void i915_schedule_bump_priority(struct i915_request *rq, unsigned int bump)
  {
-   struct i915_sched_attr attr;
unsigned long flags;
  
  	GEM_BUG_ON(bump & ~I915_PRIORITY_MASK);

@@ -334,11 +340,7 @@ void i915_schedule_bump_priority(struct i915_request *rq, 
unsigned int bump)
return;
  
  	spin_lock_irqsave(&schedule_lock, flags);

-
-   attr = rq->sched.attr;
-   attr.priority |= bump;
-   __i915_schedule(rq, &attr);
-
+   __bump_priority(&rq->sched, bump);
spin_unlock_irqrestore(&schedule_lock, flags);
  }
  



Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 04/40] drm/i915: Check for no-op priority changes first

2019-05-08 Thread Tvrtko Ursulin


On 08/05/2019 09:06, Chris Wilson wrote:

In all likelihood, the priority and node are already in the CPU cache
and by checking them first, we can avoid having to chase the
*request->hwsp for the current breadcrumb.

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
  drivers/gpu/drm/i915/i915_scheduler.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c 
b/drivers/gpu/drm/i915/i915_scheduler.c
index f32d0ee6d58c..5581c5004ff0 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -200,10 +200,10 @@ static void __i915_schedule(struct i915_sched_node *node,
lockdep_assert_held(&schedule_lock);
GEM_BUG_ON(prio == I915_PRIORITY_INVALID);
  
-	if (node_signaled(node))

+   if (prio <= READ_ONCE(node->attr.priority))
return;
  
-	if (prio <= READ_ONCE(node->attr.priority))

+   if (node_signaled(node))
return;
  
  	stack.signaler = node;




Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko

Re: [Intel-gfx] [PATCH 07/40] drm/i915: Seal races between async GPU cancellation, retirement and signaling

2019-05-08 Thread Tvrtko Ursulin


On 08/05/2019 09:06, Chris Wilson wrote:

Currently there is an underlying assumption that i915_request_unsubmit()
is synchronous wrt the GPU -- that is the request is no longer in flight
as we remove it. In the near future that may change, and this may upset
our signaling as we can process an interrupt for that request while it
is no longer in flight.

CPU0CPU1
intel_engine_breadcrumbs_irq
(queue request completion)
i915_request_cancel_signaling
... ...
i915_request_enable_signaling
dma_fence_signal

Hence in the time it took us to drop the lock to signal the request, a
preemption event may have occurred and re-queued the request. In the
process, that request would have seen I915_FENCE_FLAG_SIGNAL clear and
so reused the rq->signal_link that was in use on CPU0, leading to bad
pointer chasing in intel_engine_breadcrumbs_irq.

A related issue was that if someone started listening for a signal on a
completed but no longer in-flight request, we missed the opportunity to
immediately signal that request.

Furthermore, as intel_contexts may be immediately released during
request retirement, in order to be entirely sure that
intel_engine_breadcrumbs_irq may no longer dereference the intel_context
(ce->signals and ce->signal_link), we must wait for the irq spinlock.

In order to prevent the race, we use a bit in the fence.flags to signal
the transfer onto the signal list inside intel_engine_breadcrumbs_irq.
For simplicity, we use the DMA_FENCE_FLAG_SIGNALED_BIT as it then
quickly signals to any outside observer that the fence is indeed signaled.

v2: Sketch out potential dma-fence API for manual signaling

Fixes: 52c0fdb25c7c ("drm/i915: Replace global breadcrumbs with per-context 
interrupt tracking")
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
  drivers/dma-buf/dma-fence.c |  1 +
  drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 70 ++---
  drivers/gpu/drm/i915/i915_request.c |  1 +
  drivers/gpu/drm/i915/intel_guc_submission.c |  1 -
  4 files changed, 51 insertions(+), 22 deletions(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 3aa8733f832a..9bf06042619a 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -29,6 +29,7 @@
  
  EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit);

  EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
+EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
  
  static DEFINE_SPINLOCK(dma_fence_stub_lock);

  static struct dma_fence dma_fence_stub;
diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c 
b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index fe455f01aa65..7053a90e5cb5 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -23,6 +23,7 @@
   */
  
  #include 

+#include 
  #include 
  
  #include "i915_drv.h"

@@ -96,9 +97,30 @@ check_signal_order(struct intel_context *ce, struct 
i915_request *rq)
return true;
  }
  
+static void

+__dma_fence_signal_timestamp(struct dma_fence *fence, ktime_t timestamp)
+{
+   fence->timestamp = timestamp;
+   set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &fence->flags);
+   trace_dma_fence_signaled(fence);
+}
+
+static void
+__dma_fence_signal_notify(struct dma_fence *fence)
+{
+   struct dma_fence_cb *cur, *tmp;
+
+   list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) {
+   INIT_LIST_HEAD(&cur->node);
+   cur->func(fence, cur);
+   }
+   INIT_LIST_HEAD(&fence->cb_list);
+}
+
  void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
  {
struct intel_breadcrumbs *b = &engine->breadcrumbs;
+   const ktime_t timestamp = ktime_get();
struct intel_context *ce, *cn;
struct list_head *pos, *next;
LIST_HEAD(signal);
@@ -122,6 +144,11 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs 
*engine)
  
  			GEM_BUG_ON(!test_bit(I915_FENCE_FLAG_SIGNAL,

 &rq->fence.flags));
+   clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags);
+
+   if (test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
+&rq->fence.flags))


I wanted this direct/open-coded manipulation to also be clearly called out.


To follow the convention in other helpers here, if you could wrap it in 
something like __dma_fence_signal_return I could live with it.


Regards,

Tvrtko


+   continue;
  
  			/*

 * Queue for execution after dropping the signaling
@@ -129,14 +156,6 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs 
*engine)
 * more signalers to the same context or engine.
 */
i915_request_get(rq);
-
-   /

Re: [Intel-gfx] [PATCH 09/40] drm/i915: Restore control over ppgtt for context creation ABI

2019-05-08 Thread Tvrtko Ursulin


On 08/05/2019 09:06, Chris Wilson wrote:

Having hidden the partially exposed new ABI from the PR, put it back again
for completion of context recovery. A significant part of context
recovery is the ability to reuse as much of the old context as is
feasible (to avoid expensive reconstruction). The biggest chunk kept
hidden at the moment is fine-control over the ctx->ppgtt (the GPU page
tables and associated translation tables and kernel maps), so make
control over the ctx->ppgtt explicit.

This allows userspace to create and share virtual memory address spaces
(within the limits of a single fd) between contexts they own, along with
the ability to query the contexts for the vm state.

Signed-off-by: Chris Wilson 
---
  drivers/gpu/drm/i915/i915_drv.c |  2 ++
  drivers/gpu/drm/i915/i915_gem_context.c |  5 -
  include/uapi/drm/i915_drm.h | 15 +++
  3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 2c7a4318d13c..5061cb32856b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -3164,6 +3164,8 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, 
DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(I915_PERF_REMOVE_CONFIG, 
i915_perf_remove_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, 
DRM_UNLOCKED|DRM_RENDER_ALLOW),
+   DRM_IOCTL_DEF_DRV(I915_GEM_VM_CREATE, i915_gem_vm_create_ioctl, 
DRM_RENDER_ALLOW),
+   DRM_IOCTL_DEF_DRV(I915_GEM_VM_DESTROY, i915_gem_vm_destroy_ioctl, 
DRM_RENDER_ALLOW),
  };
  
  static struct drm_driver driver = {

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
index 65cefc520e79..413c4529191d 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -98,7 +98,6 @@
  #include "i915_user_extensions.h"
  
  #define I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE (1 << 1)

-#define I915_CONTEXT_PARAM_VM 0x9
  
  #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
  
@@ -966,8 +965,6 @@ static int get_ppgtt(struct drm_i915_file_private *file_priv,

struct i915_hw_ppgtt *ppgtt;
int ret;
  
-	return -EINVAL; /* nothing to see here; please move along */

-
if (!ctx->ppgtt)
return -ENODEV;
  
@@ -1066,8 +1063,6 @@ static int set_ppgtt(struct drm_i915_file_private *file_priv,

struct i915_hw_ppgtt *ppgtt, *old;
int err;
  
-	return -EINVAL; /* nothing to see here; please move along */

-
if (args->size)
return -EINVAL;
  
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h

index 3a73f5316766..d6ad4a15b2b9 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -355,6 +355,8 @@ typedef struct _drm_i915_sarea {
  #define DRM_I915_PERF_ADD_CONFIG  0x37
  #define DRM_I915_PERF_REMOVE_CONFIG   0x38
  #define DRM_I915_QUERY0x39
+#define DRM_I915_GEM_VM_CREATE 0x3a
+#define DRM_I915_GEM_VM_DESTROY0x3b
  /* Must be kept compact -- no holes */
  
  #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)

@@ -415,6 +417,8 @@ typedef struct _drm_i915_sarea {
  #define DRM_IOCTL_I915_PERF_ADD_CONFIGDRM_IOW(DRM_COMMAND_BASE + 
DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
  #define DRM_IOCTL_I915_PERF_REMOVE_CONFIG DRM_IOW(DRM_COMMAND_BASE + 
DRM_I915_PERF_REMOVE_CONFIG, __u64)
  #define DRM_IOCTL_I915_QUERY  DRM_IOWR(DRM_COMMAND_BASE + 
DRM_I915_QUERY, struct drm_i915_query)
+#define DRM_IOCTL_I915_GEM_VM_CREATE   DRM_IOWR(DRM_COMMAND_BASE + 
DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
+#define DRM_IOCTL_I915_GEM_VM_DESTROY  DRM_IOW (DRM_COMMAND_BASE + 
DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
  
  /* Allow drivers to submit batchbuffers directly to hardware, relying

   * on the security mechanisms provided by hardware.
@@ -1507,6 +1511,17 @@ struct drm_i915_gem_context_param {
   * On creation, all new contexts are marked as recoverable.
   */
  #define I915_CONTEXT_PARAM_RECOVERABLE0x8
+
+   /*
+* The id of the associated virtual memory address space (ppGTT) of
+* this context. Can be retrieved and passed to another context
+* (on the same fd) for both to use the same ppGTT and so share
+* address layouts, and avoid reloading the page tables on context
+* switches between themselves.
+*
+* See DRM_I915_GEM_VM_CREATE and DRM_I915_GEM_VM_DESTROY.
+*/
+#define I915_CONTEXT_PARAM_VM  0x9
  /* Must be kept compact -- no holes and well documented */
  
  	__u64 value;




Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko

Re: [Intel-gfx] [PATCH 12/40] drm/i915: Re-expose SINGLE_TIMELINE flags for context creation

2019-05-08 Thread Tvrtko Ursulin


On 08/05/2019 09:06, Chris Wilson wrote:

The SINGLE_TIMELINE flag can be used to create a context such that all
engine instances within that context share a common timeline. This can
be useful for mixing operations between real and virtual engines, or
when using a composite context for a single client API context.

Signed-off-by: Chris Wilson 
---
  drivers/gpu/drm/i915/i915_gem_context.c | 4 
  include/uapi/drm/i915_drm.h | 3 ++-
  2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
index 5fdb44714a5c..9cd671298daf 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -96,8 +96,6 @@
  #include "i915_trace.h"
  #include "i915_user_extensions.h"
  
-#define I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE (1 << 1)

-
  #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
  
  static struct i915_global_gem_context {

@@ -505,8 +503,6 @@ i915_gem_create_context(struct drm_i915_private *dev_priv, 
unsigned int flags)
  
  	lockdep_assert_held(&dev_priv->drm.struct_mutex);
  
-	BUILD_BUG_ON(I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &

-~I915_CONTEXT_CREATE_FLAGS_UNKNOWN);
if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
!HAS_EXECLISTS(dev_priv))
return ERR_PTR(-EINVAL);
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 82bd488ed0d1..957ba8e60e02 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1469,8 +1469,9 @@ struct drm_i915_gem_context_create_ext {
__u32 ctx_id; /* output: id of new context*/
__u32 flags;
  #define I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS  (1u << 0)
+#define I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE  (1u << 1)
  #define I915_CONTEXT_CREATE_FLAGS_UNKNOWN \
-   (-(I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS << 1))
+   (-(I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE << 1))
__u64 extensions;
  };
  



Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/dp: Support for DP YCbCr4:2:0 outputs

2019-05-08 Thread Patchwork
== Series Details ==

Series: drm/i915/dp: Support for DP YCbCr4:2:0 outputs
URL   : https://patchwork.freedesktop.org/series/60404/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6063 -> Patchwork_12983


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12983/

Known issues


  Here are the changes found in Patchwork_12983 that come from known issues:

### IGT changes ###

 Possible fixes 

  * igt@gem_exec_basic@readonly-render:
- {fi-icl-y}: [INCOMPLETE][1] ([fdo#107713]) -> [PASS][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6063/fi-icl-y/igt@gem_exec_ba...@readonly-render.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12983/fi-icl-y/igt@gem_exec_ba...@readonly-render.html

  
 Warnings 

  * igt@i915_selftest@live_hangcheck:
- fi-apl-guc: [INCOMPLETE][3] ([fdo#103927]) -> [INCOMPLETE][4] 
([fdo#103927] / [fdo#110624])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6063/fi-apl-guc/igt@i915_selftest@live_hangcheck.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12983/fi-apl-guc/igt@i915_selftest@live_hangcheck.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#110624]: https://bugs.freedesktop.org/show_bug.cgi?id=110624


Participating hosts (53 -> 45)
--

  Missing(8): fi-kbl-soraka fi-ilk-m540 fi-hsw-4200u fi-byt-squawks 
fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_6063 -> Patchwork_12983

  CI_DRM_6063: 44ae4003d35743cbc7883825c5fe777d136b5247 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4972: f052e49a43cc9704ea5f240df15dd9d3dfed68ab @ 
git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_12983: a902ce4da59ae0f2543950e07ed2678808966a5c @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

a902ce4da59a drm/i915/dp: Support DP ports YUV 4:2:0 output to GEN11
1472073c7d9d drm/i915/dp: Change a link bandwidth computation for DP
8bf6ac92cc19 drm/i915/dp: Add a support of YCBCR 4:2:0 to DP MSA
c0865d08f8ad drm/i915/dp: Program VSC Header and DB for Pixel 
Encoding/Colorimetry Format
3189eb1879bc drm: Add a VSC structure for handling Pixel Encoding/Colorimetry 
Formats
e1fd5f42e42e drm/i915/dp: Add a config function for YCBCR420 outputs

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12983/

Re: [Intel-gfx] [PATCH 14/40] drm/i915: Load balancing across a virtual engine

2019-05-08 Thread Tvrtko Ursulin


On 08/05/2019 09:06, Chris Wilson wrote:

Having allowed the user to define a set of engines that they will want
to only use, we go one step further and allow them to bind those engines
into a single virtual instance. Submitting a batch to the virtual engine
will then forward it to any one of the set in a manner that best
distributes load.  The virtual engine has a single timeline across all
engines (it operates as a single queue), so it is not able to concurrently
run batches across multiple engines by itself; that is left up to the user
to submit multiple concurrent batches to multiple queues. Multiple users
will be load balanced across the system.

The mechanism used for load balancing in this patch is a late greedy
balancer. When a request is ready for execution, it is added to each
engine's queue, and when an engine is ready for its next request it
claims it from the virtual engine. The first engine to do so, wins, i.e.
the request is executed at the earliest opportunity (idle moment) in the
system.

As not all HW is created equal, the user is still able to skip the
virtual engine and execute the batch on a specific engine, all within the
same queue. It will then be executed in order on the correct engine,
with execution on other virtual engines being moved away due to the load
detection.

A couple of areas for potential improvement left!

- The virtual engine always takes priority over equal-priority tasks.
Mostly broken up by applying FQ_CODEL rules for prioritising new clients,
and hopefully the virtual and real engines are not then congested (i.e.
all work is via virtual engines, or all work is to the real engine).

- We require the breadcrumb irq around every virtual engine request. For
normal engines, we eliminate the need for the slow round trip via
interrupt by using the submit fence and queueing in order. For virtual
engines, we have to allow any job to transfer to a new ring, and cannot
coalesce the submissions, so require the completion fence instead,
forcing the persistent use of interrupts.

- We only drip feed single requests through each virtual engine and onto
the physical engines, even if there was enough work to fill all ELSP,
leaving small stalls with an idle CS event at the end of every request.
Could we be greedy and fill both slots? Being lazy is virtuous for load
distribution on less-than-full workloads though.

Other areas of improvement are more general, such as reducing lock
contention, reducing dispatch overhead, looking at direct submission
rather than bouncing around tasklets etc.

sseu: Lift the restriction to allow sseu to be reconfigured on virtual
engines composed of RENDER_CLASS (rcs).

v2: macroize check_user_mbz()
v3: Cancel virtual engines on wedging
v4: Commence commenting
v5: Replace 64b sibling_mask with a list of class:instance
v6: Drop the one-element array in the uabi
v7: Assert it is an virtual engine in to_virtual_engine()

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
  drivers/gpu/drm/i915/gt/intel_engine_types.h |   8 +
  drivers/gpu/drm/i915/gt/intel_lrc.c  | 683 ++-
  drivers/gpu/drm/i915/gt/intel_lrc.h  |   9 +
  drivers/gpu/drm/i915/gt/selftest_lrc.c   | 180 +
  drivers/gpu/drm/i915/i915_gem.h  |   5 +
  drivers/gpu/drm/i915/i915_gem_context.c  | 116 +++-
  drivers/gpu/drm/i915/i915_scheduler.c|  19 +-
  drivers/gpu/drm/i915/i915_timeline_types.h   |   1 +
  include/uapi/drm/i915_drm.h  |  39 ++
  9 files changed, 1032 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index e381c1c73902..7b47e00fa082 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -227,6 +227,7 @@ struct intel_engine_execlists {
 * @queue: queue of requests, in priority lists
 */
struct rb_root_cached queue;
+   struct rb_root_cached virtual;
  
  	/**

 * @csb_write: control register for Context Switch buffer
@@ -445,6 +446,7 @@ struct intel_engine_cs {
  #define I915_ENGINE_HAS_PREEMPTION   BIT(2)
  #define I915_ENGINE_HAS_SEMAPHORES   BIT(3)
  #define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(4)
+#define I915_ENGINE_IS_VIRTUAL   BIT(5)
unsigned int flags;
  
  	/*

@@ -534,6 +536,12 @@ intel_engine_needs_breadcrumb_tasklet(const struct 
intel_engine_cs *engine)
return engine->flags & I915_ENGINE_NEEDS_BREADCRUMB_TASKLET;
  }
  
+static inline bool

+intel_engine_is_virtual(const struct intel_engine_cs *engine)
+{
+   return engine->flags & I915_ENGINE_IS_VIRTUAL;
+}
+
  #define instdone_slice_mask(dev_priv__) \
(IS_GEN(dev_priv__, 7) ? \
 1 : RUNTIME_INFO(dev_priv__)->sseu.slice_mask)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index f1d62746e066..bc388df39802 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/dr

Re: [Intel-gfx] [PATCH 11/40] drm/i915: Extend I915_CONTEXT_PARAM_SSEU to support local ctx->engine[]

2019-05-08 Thread Tvrtko Ursulin


On 08/05/2019 09:06, Chris Wilson wrote:

Allow the user to specify a local engine index (as opposed to
class:index) that they can use to refer to a preset engine inside the
ctx->engine[] array defined by an earlier I915_CONTEXT_PARAM_ENGINES.
This will be useful for setting SSEU parameters on virtual engines that
are local to the context and do not have a valid global class:instance
lookup.

Note that due to the ambiguity in using class:instance with
ctx->engines[], if a user supplied engine map is active the user must
specify the engine to alter by its index into the ctx->engines[].

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
  drivers/gpu/drm/i915/i915_gem_context.c | 24 
  include/uapi/drm/i915_drm.h |  3 ++-
  2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
index 21bfcd529097..5fdb44714a5c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -1363,6 +1363,7 @@ static int set_sseu(struct i915_gem_context *ctx,
struct drm_i915_gem_context_param_sseu user_sseu;
struct intel_context *ce;
struct intel_sseu sseu;
+   unsigned long lookup;
int ret;
  
  	if (args->size < sizeof(user_sseu))

@@ -1375,10 +1376,17 @@ static int set_sseu(struct i915_gem_context *ctx,
   sizeof(user_sseu)))
return -EFAULT;
  
-	if (user_sseu.flags || user_sseu.rsvd)

+   if (user_sseu.rsvd)
return -EINVAL;
  
-	ce = lookup_user_engine(ctx, 0, &user_sseu.engine);

+   if (user_sseu.flags & ~(I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX))
+   return -EINVAL;
+
+   lookup = 0;
+   if (user_sseu.flags & I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX)
+   lookup |= LOOKUP_USER_INDEX;
+
+   ce = lookup_user_engine(ctx, lookup, &user_sseu.engine);
if (IS_ERR(ce))
return PTR_ERR(ce);
  
@@ -1795,6 +1803,7 @@ static int get_sseu(struct i915_gem_context *ctx,

  {
struct drm_i915_gem_context_param_sseu user_sseu;
struct intel_context *ce;
+   unsigned long lookup;
int err;
  
  	if (args->size == 0)

@@ -1806,10 +1815,17 @@ static int get_sseu(struct i915_gem_context *ctx,
   sizeof(user_sseu)))
return -EFAULT;
  
-	if (user_sseu.flags || user_sseu.rsvd)

+   if (user_sseu.rsvd)
return -EINVAL;
  
-	ce = lookup_user_engine(ctx, 0, &user_sseu.engine);

+   if (user_sseu.flags & ~(I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX))
+   return -EINVAL;
+
+   lookup = 0;
+   if (user_sseu.flags & I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX)
+   lookup |= LOOKUP_USER_INDEX;
+
+   ce = lookup_user_engine(ctx, lookup, &user_sseu.engine);
if (IS_ERR(ce))
return PTR_ERR(ce);
  
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h

index 8e1bb22926e4..82bd488ed0d1 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1576,9 +1576,10 @@ struct drm_i915_gem_context_param_sseu {
struct i915_engine_class_instance engine;
  
  	/*

-* Unused for now. Must be cleared to zero.
+* Unknown flags must be cleared to zero.
 */
__u32 flags;
+#define I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX (1u << 0)
  
  	/*

 * Mask of slices to enable for the context. Valid values are a subset



Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 1/2] drm/i915: Fix fastset vs. pfit on/off on HSW EDP transcoder

2019-05-08 Thread Maarten Lankhorst
On 25-04-2019 at 18:29, Ville Syrjala wrote:
> From: Ville Syrjälä 
>
> On HSW the pipe A panel fitter lives inside the display power well,
> and the input MUX for the EDP transcoder needs to be configured
> appropriately to route the data through the power well as needed.
> Changing the MUX setting is not allowed while the pipe is active,
> so we need to force a full modeset whenever we need to change it.
>
> Currently we may end up doing a fastset which won't change the
> MUX settings, but it will drop the power well reference, and that
> kills the pipe.
>
> Cc: sta...@vger.kernel.org
> Cc: Hans de Goede 
> Cc: Maarten Lankhorst 
> Fixes: d19f958db23c ("drm/i915: Enable fastset for non-boot modesets.")
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/intel_display.c  |  9 +
>  drivers/gpu/drm/i915/intel_pipe_crc.c | 13 ++---
>  2 files changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index c67f165b466c..691c9a929164 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -12133,6 +12133,7 @@ intel_pipe_config_compare(struct drm_i915_private 
> *dev_priv,
> struct intel_crtc_state *pipe_config,
> bool adjust)
>  {
> + struct intel_crtc *crtc = to_intel_crtc(current_config->base.crtc);
>   bool ret = true;
>   bool fixup_inherited = adjust &&
>   (current_config->base.mode.private_flags & 
> I915_MODE_FLAG_INHERITED) &&
> @@ -12354,6 +12355,14 @@ intel_pipe_config_compare(struct drm_i915_private 
> *dev_priv,
>   PIPE_CONF_CHECK_X(gmch_pfit.pgm_ratios);
>   PIPE_CONF_CHECK_X(gmch_pfit.lvds_border_bits);
>  
> + /*
> +  * Changing the EDP transcoder input mux
> +  * (A_ONOFF vs. A_ON) requires a full modeset.
> +  */
> + if (IS_HASWELL(dev_priv) && crtc->pipe == PIPE_A &&
> + current_config->cpu_transcoder == TRANSCODER_EDP)
> + PIPE_CONF_CHECK_BOOL(pch_pfit.enabled);

I guess it depends on whether we want to make it a blocker or not.

> +
>   if (!adjust) {
>   PIPE_CONF_CHECK_I(pipe_src_w);
>   PIPE_CONF_CHECK_I(pipe_src_h);
> diff --git a/drivers/gpu/drm/i915/intel_pipe_crc.c 
> b/drivers/gpu/drm/i915/intel_pipe_crc.c
> index e94b5b1bc1b7..e7c7be4911c1 100644
> --- a/drivers/gpu/drm/i915/intel_pipe_crc.c
> +++ b/drivers/gpu/drm/i915/intel_pipe_crc.c
> @@ -311,10 +311,17 @@ intel_crtc_crc_setup_workarounds(struct intel_crtc 
> *crtc, bool enable)
>   pipe_config->base.mode_changed = pipe_config->has_psr;
>   pipe_config->crc_enabled = enable;
>  
> - if (IS_HASWELL(dev_priv) && crtc->pipe == PIPE_A) {
> + if (IS_HASWELL(dev_priv) &&
> + pipe_config->base.active && crtc->pipe == PIPE_A &&
> + pipe_config->cpu_transcoder == TRANSCODER_EDP) {
> + bool old_need_power_well = pipe_config->pch_pfit.enabled ||
> + pipe_config->pch_pfit.force_thru;
> + bool new_need_power_well = pipe_config->pch_pfit.enabled ||
> + enable;
> +
>   pipe_config->pch_pfit.force_thru = enable;
> - if (pipe_config->cpu_transcoder == TRANSCODER_EDP &&
> - pipe_config->pch_pfit.enabled != enable)
> +
> + if (old_need_power_well != new_need_power_well)
>   pipe_config->base.connectors_changed = true;

Could we get rid of this logic and set mode_changed instead?

Ah, I see that is done in 2/2, much less surprises then. :)

In that case, for both:

Reviewed-by: Maarten Lankhorst 


[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/40] drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD

2019-05-08 Thread Patchwork
== Series Details ==

Series: series starting with [01/40] drm/i915/hangcheck: Replace 
hangcheck.seqno with RING_HEAD
URL   : https://patchwork.freedesktop.org/series/60403/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
67b421614964 drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD
5f78625e4a23 drm/i915: Rearrange i915_scheduler.c
fea1b40e0ff4 drm/i915: Pass i915_sched_node around internally
fad4daf02033 drm/i915: Check for no-op priority changes first
4c36ecd8e737 drm/i915: Bump signaler priority on adding a waiter
483a8a8f05cf drm/i915: Convert inconsistent static engine tables into an init 
error
0f169a3e02f6 drm/i915: Seal races between async GPU cancellation, retirement 
and signaling
eb869674ac86 dma-fence: Refactor signaling for manual invocation
-:30: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does 
MAINTAINERS need updating?
#30: 
new file mode 100644

-:35: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier 
tag in line 1
#35: FILE: drivers/dma-buf/dma-fence-trace.c:1:
+/*

-:173: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier 
tag in line 1
#173: FILE: include/linux/dma-fence-types.h:1:
+/*

-:244: CHECK:UNCOMMENTED_DEFINITION: spinlock_t definition without comment
#244: FILE: include/linux/dma-fence-types.h:72:
+   spinlock_t *lock;

total: 0 errors, 3 warnings, 1 checks, 641 lines checked
1415c169395e drm/i915: Restore control over ppgtt for context creation ABI
-:81: WARNING:LONG_LINE: line over 100 characters
#81: FILE: include/uapi/drm/i915_drm.h:420:
+#define DRM_IOCTL_I915_GEM_VM_CREATE   DRM_IOWR(DRM_COMMAND_BASE + 
DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)

-:82: WARNING:LONG_LINE: line over 100 characters
#82: FILE: include/uapi/drm/i915_drm.h:421:
+#define DRM_IOCTL_I915_GEM_VM_DESTROY  DRM_IOW (DRM_COMMAND_BASE + 
DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)

-:82: WARNING:SPACING: space prohibited between function name and open 
parenthesis '('
#82: FILE: include/uapi/drm/i915_drm.h:421:
+#define DRM_IOCTL_I915_GEM_VM_DESTROY  DRM_IOW (DRM_COMMAND_BASE + 
DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)

-:82: ERROR:COMPLEX_MACRO: Macros with complex values should be enclosed in 
parentheses
#82: FILE: include/uapi/drm/i915_drm.h:421:
+#define DRM_IOCTL_I915_GEM_VM_DESTROY  DRM_IOW (DRM_COMMAND_BASE + 
DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)

total: 1 errors, 3 warnings, 0 checks, 64 lines checked
d257fe3e40a0 drm/i915: Allow a context to define its set of engines
-:437: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#437: FILE: drivers/gpu/drm/i915/i915_utils.h:110:
+#define check_struct_size(p, member, n, sz) \
+   likely(__check_struct_size(sizeof(*(p)), \
+  sizeof(*(p)->member) + 
__must_be_array((p)->member), \
+  n, sz))

-:437: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'member' - possible 
side-effects?
#437: FILE: drivers/gpu/drm/i915/i915_utils.h:110:
+#define check_struct_size(p, member, n, sz) \
+   likely(__check_struct_size(sizeof(*(p)), \
+  sizeof(*(p)->member) + 
__must_be_array((p)->member), \
+  n, sz))

-:437: CHECK:MACRO_ARG_PRECEDENCE: Macro argument 'member' may be better as 
'(member)' to avoid precedence issues
#437: FILE: drivers/gpu/drm/i915/i915_utils.h:110:
+#define check_struct_size(p, member, n, sz) \
+   likely(__check_struct_size(sizeof(*(p)), \
+  sizeof(*(p)->member) + 
__must_be_array((p)->member), \
+  n, sz))

total: 0 errors, 0 warnings, 3 checks, 428 lines checked
7af4301e09f7 drm/i915: Extend I915_CONTEXT_PARAM_SSEU to support local 
ctx->engine[]
5bbcdc2dae1c drm/i915: Re-expose SINGLE_TIMELINE flags for context creation
fc7ed4a4ef81 drm/i915: Allow userspace to clone contexts on creation
-:213: ERROR:BRACKET_SPACE: space prohibited before open square bracket '['
#213: FILE: drivers/gpu/drm/i915/i915_gem_context.c:1858:
+#define MAP(x, y) [ilog2(I915_CONTEXT_CLONE_##x)] = y

total: 1 errors, 0 warnings, 0 checks, 235 lines checked
69947c56c90e drm/i915: Load balancing across a virtual engine
266984fb2212 drm/i915: Apply an execution_mask to the virtual_engine
f0417a49d538 drm/i915: Extend execution fence to support a callback
7afded600522 drm/i915/execlists: Virtual engine bonding
00c6c2e19f78 drm/i915: Allow specification of parallel execbuf
b688a0332163 drm/i915: Split GEM object type definition to its own header
-:25: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does 
MAINTAINERS need updating?
#25: 
new file mode 100644

-:59: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier 
tag in line 1
#59: FILE: drivers/gpu/drm/i915/gem/i915_gem_object_types.h:1:
+/*

-:60: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use 

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [01/40] drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD

2019-05-08 Thread Patchwork
== Series Details ==

Series: series starting with [01/40] drm/i915/hangcheck: Replace 
hangcheck.seqno with RING_HEAD
URL   : https://patchwork.freedesktop.org/series/60403/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.5.2
Commit: drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD
Okay!

Commit: drm/i915: Rearrange i915_scheduler.c
Okay!

Commit: drm/i915: Pass i915_sched_node around internally
Okay!

Commit: drm/i915: Check for no-op priority changes first
Okay!

Commit: drm/i915: Bump signaler priority on adding a waiter
Okay!

Commit: drm/i915: Convert inconsistent static engine tables into an init error
Okay!

Commit: drm/i915: Seal races between async GPU cancellation, retirement and 
signaling
Okay!

Commit: dma-fence: Refactor signaling for manual invocation
Okay!

Commit: drm/i915: Restore control over ppgtt for context creation ABI
Okay!

Commit: drm/i915: Allow a context to define its set of engines
+drivers/gpu/drm/i915/i915_utils.h:87:13: error: incorrect type in conditional
+drivers/gpu/drm/i915/i915_utils.h:87:13: error: undefined identifier 
'__builtin_mul_overflow'
+drivers/gpu/drm/i915/i915_utils.h:87:13:got void
+drivers/gpu/drm/i915/i915_utils.h:87:13: warning: call with no type!
+drivers/gpu/drm/i915/i915_utils.h:90:13: error: incorrect type in conditional
+drivers/gpu/drm/i915/i915_utils.h:90:13: error: undefined identifier 
'__builtin_add_overflow'
+drivers/gpu/drm/i915/i915_utils.h:90:13:got void
+drivers/gpu/drm/i915/i915_utils.h:90:13: warning: call with no type!
-drivers/gpu/drm/i915/selftests/../i915_utils.h:184:16: warning: expression 
using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_utils.h:218:16: warning: expression 
using sizeof(void)
+./include/linux/overflow.h:285:13: error: incorrect type in conditional
+./include/linux/overflow.h:285:13: error: not a function 
+./include/linux/overflow.h:285:13:got void
+./include/linux/overflow.h:287:13: error: incorrect type in conditional
+./include/linux/overflow.h:287:13: error: not a function 
+./include/linux/overflow.h:287:13:got void

Commit: drm/i915: Extend I915_CONTEXT_PARAM_SSEU to support local ctx->engine[]
Okay!

Commit: drm/i915: Re-expose SINGLE_TIMELINE flags for context creation
Okay!

Commit: drm/i915: Allow userspace to clone contexts on creation
+drivers/gpu/drm/i915/i915_gem_context.c:1859:17: error: bad integer constant 
expression
+drivers/gpu/drm/i915/i915_gem_context.c:1860:17: error: bad integer constant 
expression
+drivers/gpu/drm/i915/i915_gem_context.c:1861:17: error: bad integer constant 
expression
+drivers/gpu/drm/i915/i915_gem_context.c:1862:17: error: bad integer constant 
expression
+drivers/gpu/drm/i915/i915_gem_context.c:1863:17: error: bad integer constant 
expression
+drivers/gpu/drm/i915/i915_gem_context.c:1864:17: error: bad integer constant 
expression
-drivers/gpu/drm/i915/i915_utils.h:87:13: warning: call with no type!
-drivers/gpu/drm/i915/i915_utils.h:90:13: warning: call with no type!
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:1266:25: warning: expression 
using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:1266:25: warning: expression 
using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:454:16: warning: expression 
using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:571:33: warning: expression 
using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:571:33: warning: expression 
using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:693:33: warning: expression 
using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:693:33: warning: expression 
using sizeof(void)
+./include/linux/overflow.h:285:13: error: incorrect type in conditional
+./include/linux/overflow.h:285:13: error: not a function 
-./include/linux/overflow.h:285:13: warning: call with no type!
+./include/linux/overflow.h:285:13:got void
+./include/linux/overflow.h:287:13: error: incorrect type in conditional
+./include/linux/overflow.h:287:13: error: not a function 
-./include/linux/overflow.h:287:13: warning: call with no type!
+./include/linux/overflow.h:287:13:got void
-./include/linux/slab.h:666:13: warning: call with no type!

Commit: drm/i915: Load balancing across a virtual engine
+./include/linux/overflow.h:285:13: error: incorrect type in conditional
+./include/linux/overflow.h:285:13: error: undefined identifier 
'__builtin_mul_overflow'
+./include/linux/overflow.h:285:13:got void
+./include/linux/overflow.h:285:13: warning: call with no type!
+./include/linux/overflow.h:287:13: error: incorrect type in conditional
+./include/linux/overflow.h:287:13: error: undefined identifier 
'__builtin_add_overflow'
+./include/linux/overflow.h:287:13:got void
+./include/linux/overflow.h:287:13: warning: call with no type!
+./include/linux/slab.h:666:13: error: not a function 

Commit: drm/i915: Apply an execution_mask to the virtual_engine
Okay!

Re: [Intel-gfx] [PATCH 14/40] drm/i915: Load balancing across a virtual engine

2019-05-08 Thread Chris Wilson
Quoting Tvrtko Ursulin (2019-05-08 11:29:34)
> 
> On 08/05/2019 09:06, Chris Wilson wrote:
> > +static int live_virtual_engine(void *arg)
> > +{
> > + struct drm_i915_private *i915 = arg;
> > + struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
> > + struct intel_engine_cs *engine;
> > + enum intel_engine_id id;
> > + unsigned int class, inst;
> > + int err = -ENODEV;
> > +
> > + if (USES_GUC_SUBMISSION(i915))
> > + return 0;
> > +
> > + mutex_lock(&i915->drm.struct_mutex);
> > +
> > + for_each_engine(engine, i915, id) {
> > + err = nop_virtual_engine(i915, &engine, 1, 1, 0);
> > + if (err) {
> > + pr_err("Failed to wrap engine %s: err=%d\n",
> > +engine->name, err);
> > + goto out_unlock;
> > + }
> > + }
> > +
> > + for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
> > + int nsibling, n;
> > +
> > + nsibling = 0;
> > + for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) {
> > + if (!i915->engine_class[class][inst])
> > + break;
> 
> In a previous review I said I think this should be continue instead of 
> break so vcs0 + vcs2 SKUs can also be tested.

Completely missed that, sorry.

> > +
> > + siblings[nsibling++] = 
> > i915->engine_class[class][inst];
> > + }
> > + if (nsibling < 2)
> > + continue;
> 
> And also that single engine VE could be tested just as well, unless I am 
> missing something.

There's no such thing as single engine VE. The current design requires
that this type of struct virtual_engine encompasses more than one engine
(failing that we break the single-request scheduling, although we might be
able to lift that with timeslicing, but the early results were not
favourable); the single engine being a regular intel_context instance.
-Chris

[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [01/40] drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD

2019-05-08 Thread Patchwork
== Series Details ==

Series: series starting with [01/40] drm/i915/hangcheck: Replace 
hangcheck.seqno with RING_HEAD
URL   : https://patchwork.freedesktop.org/series/60403/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6063 -> Patchwork_12984


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12984/

New tests
-

  New tests have been introduced between CI_DRM_6063 and Patchwork_12984:

### New IGT tests (1) ###

  * igt@i915_selftest@live_mman:
- Statuses : 42 pass(s)
- Exec time: [4.50, 57.25] s

  

Known issues


  Here are the changes found in Patchwork_12984 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@i915_pm_rpm@module-reload:
- fi-skl-6770hq:  [PASS][1] -> [FAIL][2] ([fdo#108511])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6063/fi-skl-6770hq/igt@i915_pm_...@module-reload.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12984/fi-skl-6770hq/igt@i915_pm_...@module-reload.html

  
 Possible fixes 

  * igt@gem_exec_basic@readonly-render:
- {fi-icl-y}: [INCOMPLETE][3] ([fdo#107713]) -> [PASS][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6063/fi-icl-y/igt@gem_exec_ba...@readonly-render.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12984/fi-icl-y/igt@gem_exec_ba...@readonly-render.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#108511]: https://bugs.freedesktop.org/show_bug.cgi?id=108511


Participating hosts (53 -> 43)
--

  Missing(10): fi-kbl-soraka fi-ilk-m540 fi-hsw-4200u fi-byt-squawks 
fi-bsw-cyan fi-apl-guc fi-ctg-p8600 fi-pnv-d510 fi-byt-clapper fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_6063 -> Patchwork_12984

  CI_DRM_6063: 44ae4003d35743cbc7883825c5fe777d136b5247 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4972: f052e49a43cc9704ea5f240df15dd9d3dfed68ab @ 
git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_12984: 9bf3a26c75f2ad0800b50d5eaf9a6f40d90254bd @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

9bf3a26c75f2 drm/i915/execlists: Minimalistic timeslicing
c3f5d7c6f6cc drm/i915/execlists: Preempt-to-busy
3ab6fd267bc8 drm/i915: Flush the execution-callbacks on retiring
f29e29f63664 drm/i915: Replace engine->timeline with a plain list
4bcdd4c21eeb drm/i915: Stop retiring along engine
16ad5ab86e98 drm/i915: Keep contexts pinned until after the next kernel context 
switch
ff7e06f46eef drm/i915: Rename intel_context.active to .inflight
06e6fa4897ee drm/i915: Move object close under its own lock
5ec5ae3c8224 drm/i915: Drop the deferred active reference
a0342667999c drm/i915: Move GEM client throttling to its own file
3853ef7d5efd drm/i915: Move GEM object busy checking to its own file
37093e2cfcb7 drm/i915: Move GEM object waiting to its own file
82fa4ffba0ac drm/i915: Move GEM object domain management from struct_mutex to 
local
e95a11a1ae00 drm/i915: Pull scatterlist utils out of i915_gem.h
8be66d73a04c drm/i915: Move more GEM objects under gem/
1786d7cb9bc6 drm/i915: Move GEM domain management to its own file
4734074d136c drm/i915: Move mmap and friends to its own file
35b7895848a6 drm/i915: Move phys objects to its own file
fb33bd9448ad drm/i915: Move shmem object setup to its own file
3483581bdd02 drm/i915: Move object->pages API to i915_gem_object.[ch]
39f76472714f drm/i915: Pull GEM ioctls interface to its own file
b688a0332163 drm/i915: Split GEM object type definition to its own header
00c6c2e19f78 drm/i915: Allow specification of parallel execbuf
7afded600522 drm/i915/execlists: Virtual engine bonding
f0417a49d538 drm/i915: Extend execution fence to support a callback
266984fb2212 drm/i915: Apply an execution_mask to the virtual_engine
69947c56c90e drm/i915: Load balancing across a virtual engine
fc7ed4a4ef81 drm/i915: Allow userspace to clone contexts on creation
5bbcdc2dae1c drm/i915: Re-expose SINGLE_TIMELINE flags for context creation
7af4301e09f7 drm/i915: Extend I915_CONTEXT_PARAM_SSEU to support local 
ctx->engine[]
d257fe3e40a0 drm/i915: Allow a context to define its set of engines
1415c169395e drm/i915: Restore control over ppgtt for context creation ABI
eb869674ac86 dma-fence: Refactor signaling for manual invocation
0f169a3e02f6 drm/i915: Seal races between async GPU cancellation, retirement 
and signaling
483a8a8f05cf drm/i915: Convert inconsistent static engine tables into an init 
error
4c36ecd8e737 drm/i915: Bump signaler priority on adding a waiter
fad4daf02033 drm/i915: Check for no-op priority changes first
fea1b40e0ff4 drm/i915: Pass i915_sched_node around internally
5f78625e4a23 drm/i915: Rea

[Intel-gfx] [PATCH] drm/i915: Load balancing across a virtual engine

2019-05-08 Thread Chris Wilson
Having allowed the user to define a set of engines that they will want
to only use, we go one step further and allow them to bind those engines
into a single virtual instance. Submitting a batch to the virtual engine
will then forward it to any one of the set so as to best distribute
load. The virtual engine has a single timeline across all
engines (it operates as a single queue), so it is not able to concurrently
run batches across multiple engines by itself; that is left up to the user
to submit multiple concurrent batches to multiple queues. Multiple users
will be load balanced across the system.

The mechanism used for load balancing in this patch is a late greedy
balancer. When a request is ready for execution, it is added to each
engine's queue, and when an engine is ready for its next request it
claims it from the virtual engine. The first engine to do so, wins, i.e.
the request is executed at the earliest opportunity (idle moment) in the
system.

As not all HW is created equal, the user is still able to skip the
virtual engine and execute the batch on a specific engine, all within the
same queue. It will then be executed in order on the correct engine,
with execution on other virtual engines being moved away due to the load
detection.

A couple of areas for potential improvement left!

- The virtual engine always takes priority over equal-priority tasks.
Mostly broken up by applying FQ_CODEL rules for prioritising new clients,
and hopefully the virtual and real engines are not then congested (i.e.
all work is via virtual engines, or all work is to the real engine).

- We require the breadcrumb irq around every virtual engine request. For
normal engines, we eliminate the need for the slow round trip via
interrupt by using the submit fence and queueing in order. For virtual
engines, we have to allow any job to transfer to a new ring, and cannot
coalesce the submissions, so require the completion fence instead,
forcing the persistent use of interrupts.

- We only drip feed single requests through each virtual engine and onto
the physical engines, even if there was enough work to fill all ELSP,
leaving small stalls with an idle CS event at the end of every request.
Could we be greedy and fill both slots? Being lazy is virtuous for load
distribution on less-than-full workloads though.

Other areas of improvement are more general, such as reducing lock
contention, reducing dispatch overhead, looking at direct submission
rather than bouncing around tasklets etc.

sseu: Lift the restriction to allow sseu to be reconfigured on virtual
engines composed of RENDER_CLASS (rcs).

v2: macroize check_user_mbz()
v3: Cancel virtual engines on wedging
v4: Commence commenting
v5: Replace 64b sibling_mask with a list of class:instance
v6: Drop the one-element array in the uabi
v7: Assert it is a virtual engine in to_virtual_engine()
v8: Skip over holes in [class][inst] so we can selftest with (vcs0, vcs2)

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gt/intel_engine_types.h |   8 +
 drivers/gpu/drm/i915/gt/intel_lrc.c  | 683 ++-
 drivers/gpu/drm/i915/gt/intel_lrc.h  |   9 +
 drivers/gpu/drm/i915/gt/selftest_lrc.c   | 180 +
 drivers/gpu/drm/i915/i915_gem.h  |   5 +
 drivers/gpu/drm/i915/i915_gem_context.c  | 116 +++-
 drivers/gpu/drm/i915/i915_scheduler.c|  19 +-
 drivers/gpu/drm/i915/i915_timeline_types.h   |   1 +
 include/uapi/drm/i915_drm.h  |  39 ++
 9 files changed, 1032 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index e381c1c73902..7b47e00fa082 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -227,6 +227,7 @@ struct intel_engine_execlists {
 * @queue: queue of requests, in priority lists
 */
struct rb_root_cached queue;
+   struct rb_root_cached virtual;
 
/**
 * @csb_write: control register for Context Switch buffer
@@ -445,6 +446,7 @@ struct intel_engine_cs {
 #define I915_ENGINE_HAS_PREEMPTION   BIT(2)
 #define I915_ENGINE_HAS_SEMAPHORES   BIT(3)
 #define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(4)
+#define I915_ENGINE_IS_VIRTUAL   BIT(5)
unsigned int flags;
 
/*
@@ -534,6 +536,12 @@ intel_engine_needs_breadcrumb_tasklet(const struct 
intel_engine_cs *engine)
return engine->flags & I915_ENGINE_NEEDS_BREADCRUMB_TASKLET;
 }
 
+static inline bool
+intel_engine_is_virtual(const struct intel_engine_cs *engine)
+{
+   return engine->flags & I915_ENGINE_IS_VIRTUAL;
+}
+
 #define instdone_slice_mask(dev_priv__) \
(IS_GEN(dev_priv__, 7) ? \
 1 : RUNTIME_INFO(dev_priv__)->sseu.slice_mask)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index f1d62746e066..bc388df39802 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+

[Intel-gfx] [PATCH] drm/i915: Seal races between async GPU cancellation, retirement and signaling

2019-05-08 Thread Chris Wilson
Currently there is an underlying assumption that i915_request_unsubmit()
is synchronous wrt the GPU -- that is the request is no longer in flight
as we remove it. In the near future that may change, and this may upset
our signaling as we can process an interrupt for that request while it
is no longer in flight.

	CPU0				CPU1
	intel_engine_breadcrumbs_irq
	(queue request completion)
					i915_request_cancel_signaling
	...				...
					i915_request_enable_signaling
	dma_fence_signal

Hence in the time it took us to drop the lock to signal the request, a
preemption event may have occurred and re-queued the request. In the
process, that request would have seen I915_FENCE_FLAG_SIGNAL clear and
so reused the rq->signal_link that was in use on CPU0, leading to bad
pointer chasing in intel_engine_breadcrumbs_irq.

A related issue was that if someone started listening for a signal on a
completed but no longer in-flight request, we missed the opportunity to
immediately signal that request.

Furthermore, as intel_contexts may be immediately released during
request retirement, in order to be entirely sure that
intel_engine_breadcrumbs_irq may no longer dereference the intel_context
(ce->signals and ce->signal_link), we must wait for the irq spinlock.

In order to prevent the race, we use a bit in the fence.flags to signal
the transfer onto the signal list inside intel_engine_breadcrumbs_irq.
For simplicity, we use the DMA_FENCE_FLAG_SIGNALED_BIT as it then
quickly signals to any outside observer that the fence is indeed signaled.

v2: Sketch out potential dma-fence API for manual signaling
v3: And the test_and_set_bit()

Fixes: 52c0fdb25c7c ("drm/i915: Replace global breadcrumbs with per-context 
interrupt tracking")
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/dma-buf/dma-fence.c |  1 +
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 78 +++--
 drivers/gpu/drm/i915/i915_request.c |  1 +
 drivers/gpu/drm/i915/intel_guc_submission.c |  1 -
 4 files changed, 59 insertions(+), 22 deletions(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 3aa8733f832a..9bf06042619a 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -29,6 +29,7 @@
 
 EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit);
 EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
+EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
 
 static DEFINE_SPINLOCK(dma_fence_stub_lock);
 static struct dma_fence dma_fence_stub;
diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c 
b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index fe455f01aa65..c092bdf5f0bf 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -23,6 +23,7 @@
  */
 
 #include 
+#include 
 #include 
 
 #include "i915_drv.h"
@@ -96,9 +97,39 @@ check_signal_order(struct intel_context *ce, struct 
i915_request *rq)
return true;
 }
 
+static bool
+__dma_fence_signal(struct dma_fence *fence)
+{
+   return !test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags);
+}
+
+static void
+__dma_fence_signal__timestamp(struct dma_fence *fence, ktime_t timestamp)
+{
+   fence->timestamp = timestamp;
+   set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &fence->flags);
+   trace_dma_fence_signaled(fence);
+}
+
+static void
+__dma_fence_signal__notify(struct dma_fence *fence)
+{
+   struct dma_fence_cb *cur, *tmp;
+
+   lockdep_assert_held(fence->lock);
+   lockdep_assert_irqs_disabled();
+
+   list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) {
+   INIT_LIST_HEAD(&cur->node);
+   cur->func(fence, cur);
+   }
+   INIT_LIST_HEAD(&fence->cb_list);
+}
+
 void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
 {
struct intel_breadcrumbs *b = &engine->breadcrumbs;
+   const ktime_t timestamp = ktime_get();
struct intel_context *ce, *cn;
struct list_head *pos, *next;
LIST_HEAD(signal);
@@ -122,6 +153,10 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs 
*engine)
 
GEM_BUG_ON(!test_bit(I915_FENCE_FLAG_SIGNAL,
 &rq->fence.flags));
+   clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags);
+
+   if (!__dma_fence_signal(&rq->fence))
+   continue;
 
/*
 * Queue for execution after dropping the signaling
@@ -129,14 +164,6 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs 
*engine)
 * more signalers to the same context or engine.
 */
i915_request_get(rq);
-
-   /*
-* We may race with direct invocation of
-* 

[Intel-gfx] [PATCH] dma-fence: Refactor signaling for manual invocation

2019-05-08 Thread Chris Wilson
Move the duplicated code within dma-fence.c into the header for wider
reuse.

Signed-off-by: Chris Wilson 
---
 drivers/dma-buf/Makefile|  10 +-
 drivers/dma-buf/dma-fence-trace.c   |  28 +++
 drivers/dma-buf/dma-fence.c |  32 +--
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c |  30 ---
 include/linux/dma-fence-types.h | 248 +++
 include/linux/dma-fence.h   | 251 +++-
 6 files changed, 321 insertions(+), 278 deletions(-)
 create mode 100644 drivers/dma-buf/dma-fence-trace.c
 create mode 100644 include/linux/dma-fence-types.h

diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile
index 1f006e083eb9..56e579878f26 100644
--- a/drivers/dma-buf/Makefile
+++ b/drivers/dma-buf/Makefile
@@ -1,5 +1,11 @@
-obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \
-reservation.o seqno-fence.o
+obj-y := \
+   dma-buf.o \
+   dma-fence.o \
+   dma-fence-array.o \
+   dma-fence-chain.o \
+   dma-fence-trace.o \
+   reservation.o \
+   seqno-fence.o
 obj-$(CONFIG_SYNC_FILE)+= sync_file.o
 obj-$(CONFIG_SW_SYNC)  += sw_sync.o sync_debug.o
 obj-$(CONFIG_UDMABUF)  += udmabuf.o
diff --git a/drivers/dma-buf/dma-fence-trace.c 
b/drivers/dma-buf/dma-fence-trace.c
new file mode 100644
index ..eb6f282be4c0
--- /dev/null
+++ b/drivers/dma-buf/dma-fence-trace.c
@@ -0,0 +1,28 @@
+/*
+ * Fence mechanism for dma-buf and to allow for asynchronous dma access
+ *
+ * Copyright (C) 2012 Canonical Ltd
+ * Copyright (C) 2012 Texas Instruments
+ *
+ * Authors:
+ * Rob Clark 
+ * Maarten Lankhorst 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include 
+
+#define CREATE_TRACE_POINTS
+#include 
+
+EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit);
+EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
+EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 9bf06042619a..8196a179fdc2 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -24,13 +24,6 @@
 #include 
 #include 
 
-#define CREATE_TRACE_POINTS
-#include 
-
-EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit);
-EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
-EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
-
 static DEFINE_SPINLOCK(dma_fence_stub_lock);
 static struct dma_fence dma_fence_stub;
 
@@ -136,7 +129,6 @@ EXPORT_SYMBOL(dma_fence_context_alloc);
  */
 int dma_fence_signal_locked(struct dma_fence *fence)
 {
-   struct dma_fence_cb *cur, *tmp;
int ret = 0;
 
lockdep_assert_held(fence->lock);
@@ -144,7 +136,7 @@ int dma_fence_signal_locked(struct dma_fence *fence)
if (WARN_ON(!fence))
return -EINVAL;
 
-   if (test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
+   if (!__dma_fence_signal(fence)) {
ret = -EINVAL;
 
/*
@@ -152,15 +144,10 @@ int dma_fence_signal_locked(struct dma_fence *fence)
 * still run through all callbacks
 */
} else {
-   fence->timestamp = ktime_get();
-   set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &fence->flags);
-   trace_dma_fence_signaled(fence);
+   __dma_fence_signal__timestamp(fence, ktime_get());
}
 
-   list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) {
-   list_del_init(&cur->node);
-   cur->func(fence, cur);
-   }
+   __dma_fence_signal__notify(fence);
return ret;
 }
 EXPORT_SYMBOL(dma_fence_signal_locked);
@@ -185,21 +172,14 @@ int dma_fence_signal(struct dma_fence *fence)
if (!fence)
return -EINVAL;
 
-   if (test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
+   if (!__dma_fence_signal(fence))
return -EINVAL;
 
-   fence->timestamp = ktime_get();
-   set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &fence->flags);
-   trace_dma_fence_signaled(fence);
+   __dma_fence_signal__timestamp(fence, ktime_get());
 
if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags)) {
-   struct dma_fence_cb *cur, *tmp;
-
spin_lock_irqsave(fence->lock, flags);
-   list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) {
-   list_del_init(&cur->node);
-   cur->func(fence, cur);
-   }
+   __dma_fence_signal__notify(fence);
spin_unlock_irqrestore(fence->lock, flag

Re: [Intel-gfx] [PATCH] drm/i915: Load balancing across a virtual engine

2019-05-08 Thread Tvrtko Ursulin


On 08/05/2019 12:23, Chris Wilson wrote:

Having allowed the user to define a set of engines that they will want
to only use, we go one step further and allow them to bind those engines
into a single virtual instance. Submitting a batch to the virtual engine
will then forward it to any one of the set in a manner as best to
distribute load.  The virtual engine has a single timeline across all
engines (it operates as a single queue), so it is not able to concurrently
run batches across multiple engines by itself; that is left up to the user
to submit multiple concurrent batches to multiple queues. Multiple users
will be load balanced across the system.

The mechanism used for load balancing in this patch is a late greedy
balancer. When a request is ready for execution, it is added to each
engine's queue, and when an engine is ready for its next request it
claims it from the virtual engine. The first engine to do so wins, i.e.
the request is executed at the earliest opportunity (idle moment) in the
system.
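
The late greedy balancer described above can be modelled in userspace with a
single atomic claim flag. This is only an illustrative sketch under assumed
names (`engine_try_claim()` and `struct balanced_request` are not i915
symbols): a ready request is visible to every sibling engine, and the first
engine to flip the flag wins.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Illustrative userspace model of the late greedy balancer (assumed
 * names, not the i915 implementation). A ready request is visible to
 * every sibling engine; the first engine to become idle claims it,
 * and the remaining siblings see it as taken and skip it.
 */
struct balanced_request {
	atomic_bool claimed;	/* flipped exactly once, by the winner */
	int executed_on;	/* sibling that ran the request */
};

static bool engine_try_claim(struct balanced_request *rq, int engine_id)
{
	bool expected = false;

	/* Only one compare-and-swap can succeed on false -> true. */
	if (atomic_compare_exchange_strong(&rq->claimed, &expected, true)) {
		rq->executed_on = engine_id;
		return true;
	}
	return false;	/* another sibling won the race */
}
```

Whichever sibling calls `engine_try_claim()` first executes the request at
the earliest idle moment; later callers back off, which is the "late greedy"
property.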

As not all HW is created equal, the user is still able to skip the
virtual engine and execute the batch on a specific engine, all within the
same queue. It will then be executed in order on the correct engine,
with execution on other virtual engines being moved away due to the load
detection.

A couple of areas for potential improvement left!

- The virtual engine always takes priority over equal-priority tasks.
Mostly broken up by applying FQ_CODEL rules for prioritising new clients,
and hopefully the virtual and real engines are not then congested (i.e.
all work is via virtual engines, or all work is to the real engine).

- We require the breadcrumb irq around every virtual engine request. For
normal engines, we eliminate the need for the slow round trip via
interrupt by using the submit fence and queueing in order. For virtual
engines, we have to allow any job to transfer to a new ring, and cannot
coalesce the submissions, so require the completion fence instead,
forcing the persistent use of interrupts.

- We only drip feed single requests through each virtual engine and onto
the physical engines, even if there was enough work to fill all ELSP,
leaving small stalls with an idle CS event at the end of every request.
Could we be greedy and fill both slots? Being lazy is virtuous for load
distribution on less-than-full workloads though.

Other areas of improvement are more general, such as reducing lock
contention, reducing dispatch overhead, looking at direct submission
rather than bouncing around tasklets etc.

sseu: Lift the restriction to allow sseu to be reconfigured on virtual
engines composed of RENDER_CLASS (rcs).

v2: macroize check_user_mbz()
v3: Cancel virtual engines on wedging
v4: Commence commenting
v5: Replace 64b sibling_mask with a list of class:instance
v6: Drop the one-element array in the uabi
v7: Assert it is a virtual engine in to_virtual_engine()
v8: Skip over holes in [class][inst] so we can selftest with (vcs0, vcs2)

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
  drivers/gpu/drm/i915/gt/intel_engine_types.h |   8 +
  drivers/gpu/drm/i915/gt/intel_lrc.c  | 683 ++-
  drivers/gpu/drm/i915/gt/intel_lrc.h  |   9 +
  drivers/gpu/drm/i915/gt/selftest_lrc.c   | 180 +
  drivers/gpu/drm/i915/i915_gem.h  |   5 +
  drivers/gpu/drm/i915/i915_gem_context.c  | 116 +++-
  drivers/gpu/drm/i915/i915_scheduler.c|  19 +-
  drivers/gpu/drm/i915/i915_timeline_types.h   |   1 +
  include/uapi/drm/i915_drm.h  |  39 ++
  9 files changed, 1032 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index e381c1c73902..7b47e00fa082 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -227,6 +227,7 @@ struct intel_engine_execlists {
 * @queue: queue of requests, in priority lists
 */
struct rb_root_cached queue;
+   struct rb_root_cached virtual;
  
  	/**

 * @csb_write: control register for Context Switch buffer
@@ -445,6 +446,7 @@ struct intel_engine_cs {
  #define I915_ENGINE_HAS_PREEMPTION   BIT(2)
  #define I915_ENGINE_HAS_SEMAPHORES   BIT(3)
  #define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(4)
+#define I915_ENGINE_IS_VIRTUAL   BIT(5)
unsigned int flags;
  
  	/*

@@ -534,6 +536,12 @@ intel_engine_needs_breadcrumb_tasklet(const struct 
intel_engine_cs *engine)
return engine->flags & I915_ENGINE_NEEDS_BREADCRUMB_TASKLET;
  }
  
+static inline bool

+intel_engine_is_virtual(const struct intel_engine_cs *engine)
+{
+   return engine->flags & I915_ENGINE_IS_VIRTUAL;
+}
+
  #define instdone_slice_mask(dev_priv__) \
(IS_GEN(dev_priv__, 7) ? \
 1 : RUNTIME_INFO(dev_priv__)->sseu.slice_mask)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index f1d62746e066..bc388df3

Re: [Intel-gfx] [PATCH 14/40] drm/i915: Load balancing across a virtual engine

2019-05-08 Thread Tvrtko Ursulin


On 08/05/2019 12:17, Chris Wilson wrote:

Quoting Tvrtko Ursulin (2019-05-08 11:29:34)


On 08/05/2019 09:06, Chris Wilson wrote:

+static int live_virtual_engine(void *arg)
+{
+ struct drm_i915_private *i915 = arg;
+ struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
+ struct intel_engine_cs *engine;
+ enum intel_engine_id id;
+ unsigned int class, inst;
+ int err = -ENODEV;
+
+ if (USES_GUC_SUBMISSION(i915))
+ return 0;
+
+ mutex_lock(&i915->drm.struct_mutex);
+
+ for_each_engine(engine, i915, id) {
+ err = nop_virtual_engine(i915, &engine, 1, 1, 0);
+ if (err) {
+ pr_err("Failed to wrap engine %s: err=%d\n",
+engine->name, err);
+ goto out_unlock;
+ }
+ }
+
+ for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
+ int nsibling, n;
+
+ nsibling = 0;
+ for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) {
+ if (!i915->engine_class[class][inst])
+ break;


In a previous review I said I think this should be continue instead of
break so vcs0 + vcs2 skus can also be tested.


Completely missed that, sorry.


+
+ siblings[nsibling++] = i915->engine_class[class][inst];
+ }
+ if (nsibling < 2)
+ continue;


And also that single engine VE could be tested just as well, unless I am
missing something.


There's no such thing as single engine VE. The current design requires
that this type of struct virtual_engine encompasses more than one engine
(failing that we break the single request scheduling, although might be
able to lift that with timeslicing but the early results were not
favourable); the single engine being a regular intel_context instance.


Yeah my bad, the auto-magic replacement with physical engine happens one 
level higher than what this selftest is operating on.


Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/40] drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD (rev4)

2019-05-08 Thread Patchwork
== Series Details ==

Series: series starting with [01/40] drm/i915/hangcheck: Replace 
hangcheck.seqno with RING_HEAD (rev4)
URL   : https://patchwork.freedesktop.org/series/60403/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
e7ad7ff7e483 drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD
91a757861669 drm/i915: Rearrange i915_scheduler.c
388db2b0c7a6 drm/i915: Pass i915_sched_node around internally
f4522862f19a drm/i915: Check for no-op priority changes first
007c8b9dcc4b drm/i915: Bump signaler priority on adding a waiter
bfcd51b8596a drm/i915: Convert inconsistent static engine tables into an init 
error
753e000c7a33 drm/i915: Seal races between async GPU cancellation, retirement 
and signaling
559d9dee1ecb dma-fence: Refactor signaling for manual invocation
-:30: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does 
MAINTAINERS need updating?
#30: 
new file mode 100644

-:35: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier 
tag in line 1
#35: FILE: drivers/dma-buf/dma-fence-trace.c:1:
+/*

-:195: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier 
tag in line 1
#195: FILE: include/linux/dma-fence-types.h:1:
+/*

-:266: CHECK:UNCOMMENTED_DEFINITION: spinlock_t definition without comment
#266: FILE: include/linux/dma-fence-types.h:72:
+   spinlock_t *lock;

total: 0 errors, 3 warnings, 1 checks, 669 lines checked
692fdcbeb38c drm/i915: Restore control over ppgtt for context creation ABI
-:81: WARNING:LONG_LINE: line over 100 characters
#81: FILE: include/uapi/drm/i915_drm.h:420:
+#define DRM_IOCTL_I915_GEM_VM_CREATE   DRM_IOWR(DRM_COMMAND_BASE + 
DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)

-:82: WARNING:LONG_LINE: line over 100 characters
#82: FILE: include/uapi/drm/i915_drm.h:421:
+#define DRM_IOCTL_I915_GEM_VM_DESTROY  DRM_IOW (DRM_COMMAND_BASE + 
DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)

-:82: WARNING:SPACING: space prohibited between function name and open 
parenthesis '('
#82: FILE: include/uapi/drm/i915_drm.h:421:
+#define DRM_IOCTL_I915_GEM_VM_DESTROY  DRM_IOW (DRM_COMMAND_BASE + 
DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)

-:82: ERROR:COMPLEX_MACRO: Macros with complex values should be enclosed in 
parentheses
#82: FILE: include/uapi/drm/i915_drm.h:421:
+#define DRM_IOCTL_I915_GEM_VM_DESTROY  DRM_IOW (DRM_COMMAND_BASE + 
DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)

total: 1 errors, 3 warnings, 0 checks, 64 lines checked
82ba6f07ab8b drm/i915: Allow a context to define its set of engines
-:437: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#437: FILE: drivers/gpu/drm/i915/i915_utils.h:110:
+#define check_struct_size(p, member, n, sz) \
+   likely(__check_struct_size(sizeof(*(p)), \
+  sizeof(*(p)->member) + 
__must_be_array((p)->member), \
+  n, sz))

-:437: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'member' - possible 
side-effects?
#437: FILE: drivers/gpu/drm/i915/i915_utils.h:110:
+#define check_struct_size(p, member, n, sz) \
+   likely(__check_struct_size(sizeof(*(p)), \
+  sizeof(*(p)->member) + 
__must_be_array((p)->member), \
+  n, sz))

-:437: CHECK:MACRO_ARG_PRECEDENCE: Macro argument 'member' may be better as 
'(member)' to avoid precedence issues
#437: FILE: drivers/gpu/drm/i915/i915_utils.h:110:
+#define check_struct_size(p, member, n, sz) \
+   likely(__check_struct_size(sizeof(*(p)), \
+  sizeof(*(p)->member) + 
__must_be_array((p)->member), \
+  n, sz))

total: 0 errors, 0 warnings, 3 checks, 428 lines checked
94d7eb79a37a drm/i915: Extend I915_CONTEXT_PARAM_SSEU to support local 
ctx->engine[]
a699c113a8e2 drm/i915: Re-expose SINGLE_TIMELINE flags for context creation
b60419b1ad85 drm/i915: Allow userspace to clone contexts on creation
-:213: ERROR:BRACKET_SPACE: space prohibited before open square bracket '['
#213: FILE: drivers/gpu/drm/i915/i915_gem_context.c:1858:
+#define MAP(x, y) [ilog2(I915_CONTEXT_CLONE_##x)] = y

total: 1 errors, 0 warnings, 0 checks, 235 lines checked
263014777cd1 drm/i915: Load balancing across a virtual engine
ff8dbcf5aaa5 drm/i915: Apply an execution_mask to the virtual_engine
b4069da09690 drm/i915: Extend execution fence to support a callback
75381837ffb3 drm/i915/execlists: Virtual engine bonding
d1ad066b91b5 drm/i915: Allow specification of parallel execbuf
d5d4ec7726a0 drm/i915: Split GEM object type definition to its own header
-:25: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does 
MAINTAINERS need updating?
#25: 
new file mode 100644

-:59: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier 
tag in line 1
#59: FILE: drivers/gpu/drm/i915/gem/i915_gem_object_types.h:1:
+/*

-:60: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag 

[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/dp: Support for DP YCbCr4:2:0 outputs

2019-05-08 Thread Patchwork
== Series Details ==

Series: drm/i915/dp: Support for DP YCbCr4:2:0 outputs
URL   : https://patchwork.freedesktop.org/series/60404/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6063_full -> Patchwork_12983_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Known issues


  Here are the changes found in Patchwork_12983_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_workarounds@suspend-resume-context:
- shard-apl:  [PASS][1] -> [DMESG-WARN][2] ([fdo#108566]) +6 
similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6063/shard-apl6/igt@gem_workarou...@suspend-resume-context.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12983/shard-apl2/igt@gem_workarou...@suspend-resume-context.html

  * igt@kms_cursor_crc@cursor-128x128-suspend:
- shard-skl:  [PASS][3] -> [INCOMPLETE][4] ([fdo#104108])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6063/shard-skl6/igt@kms_cursor_...@cursor-128x128-suspend.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12983/shard-skl8/igt@kms_cursor_...@cursor-128x128-suspend.html

  * igt@kms_draw_crc@draw-method-xrgb-render-xtiled:
- shard-skl:  [PASS][5] -> [FAIL][6] ([fdo#103184] / [fdo#103232])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6063/shard-skl4/igt@kms_draw_...@draw-method-xrgb-render-xtiled.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12983/shard-skl1/igt@kms_draw_...@draw-method-xrgb-render-xtiled.html

  * igt@kms_fbcon_fbt@fbc-suspend:
- shard-skl:  [PASS][7] -> [INCOMPLETE][8] ([fdo#104108] / 
[fdo#107773])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6063/shard-skl9/igt@kms_fbcon_...@fbc-suspend.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12983/shard-skl10/igt@kms_fbcon_...@fbc-suspend.html

  * igt@kms_flip@2x-modeset-vs-vblank-race-interruptible:
- shard-glk:  [PASS][9] -> [FAIL][10] ([fdo#103060])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6063/shard-glk1/igt@kms_f...@2x-modeset-vs-vblank-race-interruptible.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12983/shard-glk4/igt@kms_f...@2x-modeset-vs-vblank-race-interruptible.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible:
- shard-skl:  [PASS][11] -> [FAIL][12] ([fdo#105363])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6063/shard-skl7/igt@kms_f...@flip-vs-expired-vblank-interruptible.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12983/shard-skl10/igt@kms_f...@flip-vs-expired-vblank-interruptible.html

  * igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-move:
- shard-hsw:  [PASS][13] -> [SKIP][14] ([fdo#109271]) +1 similar 
issue
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6063/shard-hsw8/igt@kms_frontbuffer_track...@fbc-2p-scndscrn-spr-indfb-move.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12983/shard-hsw6/igt@kms_frontbuffer_track...@fbc-2p-scndscrn-spr-indfb-move.html

  * igt@kms_frontbuffer_tracking@fbc-tilingchange:
- shard-iclb: [PASS][15] -> [FAIL][16] ([fdo#103167]) +3 similar 
issues
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6063/shard-iclb6/igt@kms_frontbuffer_track...@fbc-tilingchange.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12983/shard-iclb4/igt@kms_frontbuffer_track...@fbc-tilingchange.html

  * igt@kms_plane_alpha_blend@pipe-b-coverage-7efc:
- shard-skl:  [PASS][17] -> [FAIL][18] ([fdo#108145] / [fdo#110403])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6063/shard-skl4/igt@kms_plane_alpha_bl...@pipe-b-coverage-7efc.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12983/shard-skl1/igt@kms_plane_alpha_bl...@pipe-b-coverage-7efc.html

  * igt@kms_plane_lowres@pipe-a-tiling-x:
- shard-iclb: [PASS][19] -> [FAIL][20] ([fdo#103166])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6063/shard-iclb1/igt@kms_plane_low...@pipe-a-tiling-x.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12983/shard-iclb7/igt@kms_plane_low...@pipe-a-tiling-x.html

  * igt@kms_plane_scaling@pipe-a-scaler-with-pixel-format:
- shard-glk:  [PASS][21] -> [SKIP][22] ([fdo#109271] / 
[fdo#109278]) +1 similar issue
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6063/shard-glk9/igt@kms_plane_scal...@pipe-a-scaler-with-pixel-format.html
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12983/shard-glk5/igt@kms_plane_scal...@pipe-a-scaler-with-pixel-format.html

  * igt@kms_psr@psr2_sprite_blt:
- shard-iclb: [PASS][23] -> [SKIP][24] ([fdo#109441])
   [23]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6063/shard-iclb2/igt@kms_psr@psr2_sprite_blt.html
   [24]: 
https://intel-gfx-ci.01.

Re: [Intel-gfx] [PATCH] drm/i915: Seal races between async GPU cancellation, retirement and signaling

2019-05-08 Thread Tvrtko Ursulin


On 08/05/2019 12:24, Chris Wilson wrote:

Currently there is an underlying assumption that i915_request_unsubmit()
is synchronous wrt the GPU -- that is the request is no longer in flight
as we remove it. In the near future that may change, and this may upset
our signaling as we can process an interrupt for that request while it
is no longer in flight.

CPU0                            CPU1
intel_engine_breadcrumbs_irq
(queue request completion)
                                i915_request_cancel_signaling
...                             ...
                                i915_request_enable_signaling
dma_fence_signal

Hence in the time it took us to drop the lock to signal the request, a
preemption event may have occurred and re-queued the request. In the
process, that request would have seen I915_FENCE_FLAG_SIGNAL clear and
so reused the rq->signal_link that was in use on CPU0, leading to bad
pointer chasing in intel_engine_breadcrumbs_irq.

A related issue was that if someone started listening for a signal on a
completed but no longer in-flight request, we missed the opportunity to
immediately signal that request.

Furthermore, as intel_contexts may be immediately released during
request retirement, in order to be entirely sure that
intel_engine_breadcrumbs_irq may no longer dereference the intel_context
(ce->signals and ce->signal_link), we must wait for the irq spinlock.

In order to prevent the race, we use a bit in the fence.flags to signal
the transfer onto the signal list inside intel_engine_breadcrumbs_irq.
For simplicity, we use the DMA_FENCE_FLAG_SIGNALED_BIT as it then
quickly signals to any outside observer that the fence is indeed signaled.
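
As a rough userspace sketch of that handoff (an assumed model with mock
names, not the kernel code): claiming the signaled bit with an atomic
test-and-set guarantees that, of all the paths racing to signal a fence,
exactly one runs the notification.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Userspace sketch of the bit-based handoff (assumed model, not the
 * kernel implementation). The SIGNALED bit is claimed atomically, so
 * exactly one of the racing paths wins and delivers the callbacks.
 */
#define SIGNALED_BIT 0u

struct mock_fence {
	atomic_uint flags;
	int notify_count;	/* how many times callbacks were delivered */
};

/* Analogue of a test_and_set_bit()-style claim: true for first caller. */
static bool fence_claim_signal(struct mock_fence *f)
{
	unsigned int old = atomic_fetch_or(&f->flags, 1u << SIGNALED_BIT);

	return !(old & (1u << SIGNALED_BIT));
}

static void fence_signal(struct mock_fence *f)
{
	if (!fence_claim_signal(f))
		return;		/* lost the race; the winner notifies */
	f->notify_count++;	/* stand-in for running the callback list */
}
```

However many contexts race to signal the same fence, the callbacks are
delivered once, and any outside observer testing the bit sees the fence as
signaled.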

v2: Sketch out potential dma-fence API for manual signaling
v3: And the test_and_set_bit()

Fixes: 52c0fdb25c7c ("drm/i915: Replace global breadcrumbs with per-context 
interrupt tracking")
Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
  drivers/dma-buf/dma-fence.c |  1 +
  drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 78 +++--
  drivers/gpu/drm/i915/i915_request.c |  1 +
  drivers/gpu/drm/i915/intel_guc_submission.c |  1 -
  4 files changed, 59 insertions(+), 22 deletions(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 3aa8733f832a..9bf06042619a 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -29,6 +29,7 @@
  
  EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit);

  EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
+EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
  
  static DEFINE_SPINLOCK(dma_fence_stub_lock);

  static struct dma_fence dma_fence_stub;
diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c 
b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index fe455f01aa65..c092bdf5f0bf 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -23,6 +23,7 @@
   */
  
  #include 

+#include 
  #include 
  
  #include "i915_drv.h"

@@ -96,9 +97,39 @@ check_signal_order(struct intel_context *ce, struct 
i915_request *rq)
return true;
  }
  
+static bool

+__dma_fence_signal(struct dma_fence *fence)
+{
+   return !test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags);
+}
+
+static void
+__dma_fence_signal__timestamp(struct dma_fence *fence, ktime_t timestamp)
+{
+   fence->timestamp = timestamp;
+   set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &fence->flags);
+   trace_dma_fence_signaled(fence);
+}
+
+static void
+__dma_fence_signal__notify(struct dma_fence *fence)
+{
+   struct dma_fence_cb *cur, *tmp;
+
+   lockdep_assert_held(fence->lock);
+   lockdep_assert_irqs_disabled();
+
+   list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) {
+   INIT_LIST_HEAD(&cur->node);
+   cur->func(fence, cur);
+   }
+   INIT_LIST_HEAD(&fence->cb_list);
+}
+
  void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine)
  {
struct intel_breadcrumbs *b = &engine->breadcrumbs;
+   const ktime_t timestamp = ktime_get();
struct intel_context *ce, *cn;
struct list_head *pos, *next;
LIST_HEAD(signal);
@@ -122,6 +153,10 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs 
*engine)
  
  			GEM_BUG_ON(!test_bit(I915_FENCE_FLAG_SIGNAL,

 &rq->fence.flags));
+   clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags);
+
+   if (!__dma_fence_signal(&rq->fence))
+   continue;
  
  			/*

 * Queue for execution after dropping the signaling
@@ -129,14 +164,6 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs 
*engine)
 * more signalers to the same context or engine.
 */
i915_request_get(rq);
-
-   /*
-* We may race with direct invoc

[Intel-gfx] [PATCH] drm/i915: Reboot CI if forcewake fails

2019-05-08 Thread Chris Wilson
If the HW fails to ack a change in forcewake status, the machine is as
good as dead -- it may recover, but in reality it missed the mmio
updates and is now in a very inconsistent state. If it happens, we can't
trust the CI results (or at least the fails may be genuine but due to
the HW being dead and not the actual test!) so reboot the machine (CI
checks for a kernel taint in between each test and reboots if the
machine is tainted).

Signed-off-by: Chris Wilson 
Cc: Mika Kuoppala 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gt/intel_reset.c |  2 +-
 drivers/gpu/drm/i915/i915_drv.h   | 11 +++
 drivers/gpu/drm/i915/intel_uncore.c   |  8 ++--
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c 
b/drivers/gpu/drm/i915/gt/intel_reset.c
index 419b3415370b..464369bc55ad 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -1042,7 +1042,7 @@ void i915_reset(struct drm_i915_private *i915,
 * rather than continue on into oblivion. For everyone else,
 * the system should still plod along, but they have been warned!
 */
-   add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
+   add_taint_for_CI(TAINT_WARN);
 error:
__i915_gem_set_wedged(i915);
goto finish;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0a6ec61496f1..d0257808734c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3375,4 +3375,15 @@ static inline u32 i915_scratch_offset(const struct 
drm_i915_private *i915)
return i915_ggtt_offset(i915->gt.scratch);
 }
 
+static inline void add_taint_for_CI(unsigned int taint)
+{
+   /*
+* The system is "ok", just about surviving for the user, but
+* CI results are now unreliable as the HW is very suspect.
+* CI checks the taint state after every test and will reboot
+* the machine if the kernel is tainted.
+*/
+   add_taint(taint, LOCKDEP_STILL_OK);
+}
+
 #endif
diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index d1d51e1121e2..6ec1bc97b665 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -111,9 +111,11 @@ wait_ack_set(const struct intel_uncore_forcewake_domain *d,
 static inline void
 fw_domain_wait_ack_clear(const struct intel_uncore_forcewake_domain *d)
 {
-   if (wait_ack_clear(d, FORCEWAKE_KERNEL))
+   if (wait_ack_clear(d, FORCEWAKE_KERNEL)) {
DRM_ERROR("%s: timed out waiting for forcewake ack to clear.\n",
  intel_uncore_forcewake_domain_to_str(d->id));
+   add_taint_for_CI(TAINT_WARN); /* CI unreliable */
+   }
 }
 
 enum ack_type {
@@ -186,9 +188,11 @@ fw_domain_get(const struct intel_uncore_forcewake_domain 
*d)
 static inline void
 fw_domain_wait_ack_set(const struct intel_uncore_forcewake_domain *d)
 {
-   if (wait_ack_set(d, FORCEWAKE_KERNEL))
+   if (wait_ack_set(d, FORCEWAKE_KERNEL)) {
DRM_ERROR("%s: timed out waiting for forcewake ack request.\n",
  intel_uncore_forcewake_domain_to_str(d->id));
+   add_taint_for_CI(TAINT_WARN); /* CI unreliable */
+   }
 }
 
 static inline void
-- 
2.20.1


Re: [Intel-gfx] [PATCH] dma-fence: Refactor signaling for manual invocation

2019-05-08 Thread Tvrtko Ursulin


On 08/05/2019 12:25, Chris Wilson wrote:

Move the duplicated code within dma-fence.c into the header for wider
reuse.


For this one I am not sure whether to go with static inlines or 
EXPORT_SYMBOL the helpers.


Also you'll need to mention that in the same patch you have optimized the
whole-list unlink. Presumably, when this goes to dri-devel one day.
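
For reference, that unlink optimisation can be modelled in userspace
(illustrative only, with a hand-rolled list rather than the kernel list API):
since every callback node is being consumed, each node is simply
re-initialised to point at itself, and the head is re-initialised once at
the end, instead of paying for a full per-node unlink that also rewrites
the neighbours.

```c
#include <stddef.h>

/*
 * Userspace model of the whole-list unlink (illustrative, not the
 * kernel implementation). Nodes are re-initialised in place as they
 * are visited; one final write to the head empties the list.
 */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *n)
{
	n->prev = n->next = n;
}

static void list_add_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Consume the whole list: visit every node, then drop them all at once. */
static int notify_all(struct list_node *head)
{
	struct list_node *cur = head->next, *tmp;
	int visited = 0;

	while (cur != head) {
		tmp = cur->next;
		list_init(cur);	/* node no longer claims list membership */
		visited++;	/* stand-in for invoking the callback */
		cur = tmp;
	}
	list_init(head);	/* one write empties the list */
	return visited;
}
```

Each node still ends up self-pointing (so a later re-add is safe), but the
neighbour pointers of already-visited nodes are never touched.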


But overall it makes sense to me to allow fine-grained control of the
fence signaling from the dma-fence core.


Regards,

Tvrtko


Signed-off-by: Chris Wilson 
---
  drivers/dma-buf/Makefile|  10 +-
  drivers/dma-buf/dma-fence-trace.c   |  28 +++
  drivers/dma-buf/dma-fence.c |  32 +--
  drivers/gpu/drm/i915/gt/intel_breadcrumbs.c |  30 ---
  include/linux/dma-fence-types.h | 248 +++
  include/linux/dma-fence.h   | 251 +++-
  6 files changed, 321 insertions(+), 278 deletions(-)
  create mode 100644 drivers/dma-buf/dma-fence-trace.c
  create mode 100644 include/linux/dma-fence-types.h

diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile
index 1f006e083eb9..56e579878f26 100644
--- a/drivers/dma-buf/Makefile
+++ b/drivers/dma-buf/Makefile
@@ -1,5 +1,11 @@
-obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \
-reservation.o seqno-fence.o
+obj-y := \
+   dma-buf.o \
+   dma-fence.o \
+   dma-fence-array.o \
+   dma-fence-chain.o \
+   dma-fence-trace.o \
+   reservation.o \
+   seqno-fence.o
  obj-$(CONFIG_SYNC_FILE)   += sync_file.o
  obj-$(CONFIG_SW_SYNC) += sw_sync.o sync_debug.o
  obj-$(CONFIG_UDMABUF) += udmabuf.o
diff --git a/drivers/dma-buf/dma-fence-trace.c 
b/drivers/dma-buf/dma-fence-trace.c
new file mode 100644
index ..eb6f282be4c0
--- /dev/null
+++ b/drivers/dma-buf/dma-fence-trace.c
@@ -0,0 +1,28 @@
+/*
+ * Fence mechanism for dma-buf and to allow for asynchronous dma access
+ *
+ * Copyright (C) 2012 Canonical Ltd
+ * Copyright (C) 2012 Texas Instruments
+ *
+ * Authors:
+ * Rob Clark 
+ * Maarten Lankhorst 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include 
+
+#define CREATE_TRACE_POINTS
+#include 
+
+EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit);
+EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
+EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 9bf06042619a..8196a179fdc2 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -24,13 +24,6 @@
  #include 
  #include 
  
-#define CREATE_TRACE_POINTS

-#include 
-
-EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit);
-EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
-EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
-
  static DEFINE_SPINLOCK(dma_fence_stub_lock);
  static struct dma_fence dma_fence_stub;
  
@@ -136,7 +129,6 @@ EXPORT_SYMBOL(dma_fence_context_alloc);
   */
  int dma_fence_signal_locked(struct dma_fence *fence)
  {
-   struct dma_fence_cb *cur, *tmp;
int ret = 0;
  
  	lockdep_assert_held(fence->lock);
@@ -144,7 +136,7 @@ int dma_fence_signal_locked(struct dma_fence *fence)
if (WARN_ON(!fence))
return -EINVAL;
  
-	if (test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
+   if (!__dma_fence_signal(fence)) {
ret = -EINVAL;
  
  		/*
@@ -152,15 +144,10 @@ int dma_fence_signal_locked(struct dma_fence *fence)
 * still run through all callbacks
 */
} else {
-   fence->timestamp = ktime_get();
-   set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &fence->flags);
-   trace_dma_fence_signaled(fence);
+   __dma_fence_signal__timestamp(fence, ktime_get());
}
  
-	list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) {
-   list_del_init(&cur->node);
-   cur->func(fence, cur);
-   }
+   __dma_fence_signal__notify(fence);
return ret;
  }
  EXPORT_SYMBOL(dma_fence_signal_locked);
@@ -185,21 +172,14 @@ int dma_fence_signal(struct dma_fence *fence)
if (!fence)
return -EINVAL;
  
-	if (test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
+   if (!__dma_fence_signal(fence))
return -EINVAL;
  
-	fence->timestamp = ktime_get();
-   set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &fence->flags);
-   trace_dma_fence_signaled(fence);
+   __dma_fence_signal__timestamp(fence, ktime_get());
  
  	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags)) {

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [01/40] drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD (rev4)

2019-05-08 Thread Patchwork
== Series Details ==

Series: series starting with [01/40] drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD (rev4)
URL   : https://patchwork.freedesktop.org/series/60403/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.5.2
Commit: drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD
Okay!

Commit: drm/i915: Rearrange i915_scheduler.c
Okay!

Commit: drm/i915: Pass i915_sched_node around internally
Okay!

Commit: drm/i915: Check for no-op priority changes first
Okay!

Commit: drm/i915: Bump signaler priority on adding a waiter
Okay!

Commit: drm/i915: Convert inconsistent static engine tables into an init error
Okay!

Commit: drm/i915: Seal races between async GPU cancellation, retirement and 
signaling
Okay!

Commit: dma-fence: Refactor signaling for manual invocation
Okay!

Commit: drm/i915: Restore control over ppgtt for context creation ABI
Okay!

Commit: drm/i915: Allow a context to define its set of engines
+drivers/gpu/drm/i915/i915_utils.h:87:13: error: incorrect type in conditional
+drivers/gpu/drm/i915/i915_utils.h:87:13: error: undefined identifier '__builtin_mul_overflow'
+drivers/gpu/drm/i915/i915_utils.h:87:13:got void
+drivers/gpu/drm/i915/i915_utils.h:87:13: warning: call with no type!
+drivers/gpu/drm/i915/i915_utils.h:90:13: error: incorrect type in conditional
+drivers/gpu/drm/i915/i915_utils.h:90:13: error: undefined identifier '__builtin_add_overflow'
+drivers/gpu/drm/i915/i915_utils.h:90:13:got void
+drivers/gpu/drm/i915/i915_utils.h:90:13: warning: call with no type!
-drivers/gpu/drm/i915/selftests/../i915_utils.h:184:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_utils.h:218:16: warning: expression using sizeof(void)
+./include/linux/overflow.h:285:13: error: incorrect type in conditional
+./include/linux/overflow.h:285:13: error: not a function 
+./include/linux/overflow.h:285:13:got void
+./include/linux/overflow.h:287:13: error: incorrect type in conditional
+./include/linux/overflow.h:287:13: error: not a function 
+./include/linux/overflow.h:287:13:got void

Commit: drm/i915: Extend I915_CONTEXT_PARAM_SSEU to support local ctx->engine[]
Okay!

Commit: drm/i915: Re-expose SINGLE_TIMELINE flags for context creation
Okay!

Commit: drm/i915: Allow userspace to clone contexts on creation
+drivers/gpu/drm/i915/i915_gem_context.c:1859:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1860:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1861:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1862:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1863:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1864:17: error: bad integer constant expression
-drivers/gpu/drm/i915/i915_utils.h:87:13: warning: call with no type!
-drivers/gpu/drm/i915/i915_utils.h:90:13: warning: call with no type!
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:1266:25: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:1266:25: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:454:16: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:571:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:571:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:693:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:693:33: warning: expression using sizeof(void)
+./include/linux/overflow.h:285:13: error: incorrect type in conditional
+./include/linux/overflow.h:285:13: error: not a function 
-./include/linux/overflow.h:285:13: warning: call with no type!
+./include/linux/overflow.h:285:13:got void
+./include/linux/overflow.h:287:13: error: incorrect type in conditional
+./include/linux/overflow.h:287:13: error: not a function 
-./include/linux/overflow.h:287:13: warning: call with no type!
+./include/linux/overflow.h:287:13:got void
-./include/linux/slab.h:666:13: warning: call with no type!

Commit: drm/i915: Load balancing across a virtual engine
+./include/linux/overflow.h:285:13: error: incorrect type in conditional
+./include/linux/overflow.h:285:13: error: undefined identifier '__builtin_mul_overflow'
+./include/linux/overflow.h:285:13:got void
+./include/linux/overflow.h:285:13: warning: call with no type!
+./include/linux/overflow.h:287:13: error: incorrect type in conditional
+./include/linux/overflow.h:287:13: error: undefined identifier '__builtin_add_overflow'
+./include/linux/overflow.h:287:13:got void
+./include/linux/overflow.h:287:13: warning: call with no type!
+./include/linux/slab.h:666:13: error: not a function 

Commit: drm/i915: Apply an execution_mask to the virtual_en
