Re: [Intel-gfx] [PATCH 01/10] drm/i915: Move map-and-fenceable tracking to the VMA

2016-08-15 Thread Joonas Lahtinen
On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote:
> @@ -2843,8 +2843,7 @@ int i915_vma_unbind(struct i915_vma *vma)
>   GEM_BUG_ON(obj->bind_count == 0);
>   GEM_BUG_ON(!obj->pages);
>  
> - if (i915_vma_is_ggtt(vma) &&
> - vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) {

Maybe add a comment here, as the test feels out of place at a quick
glance, especially wrt. what it replaces. Although you mentioned on IRC
that this will soon be eliminated?

> + if (i915_vma_is_map_and_fenceable(vma)) {
>   i915_gem_object_finish_gtt(obj);
>  
>   /* release the fence reg _after_ flushing */



> @@ -2864,13 +2864,9 @@ int i915_vma_unbind(struct i915_vma *vma)
>   drm_mm_remove_node(&vma->node);
>   list_move_tail(&vma->vm_link, &vma->vm->unbound_list);
>  
> - if (i915_vma_is_ggtt(vma)) {
> - if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) {
> - obj->map_and_fenceable = false;
> - } else if (vma->pages) {
> - sg_free_table(vma->pages);
> - kfree(vma->pages);
> - }

Not sure if there should be a comment that for 1:1 mappings vma->pages
is just obj->pages so it should not be freed. Or maybe you could even
make the test if vma->pages != vma->obj->pages? More self-documenting.
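
A rough sketch of that more self-documenting form (illustration only,
not the actual patch; it assumes vma->pages simply aliases obj->pages
for the normal GGTT view, as described above):

	/* Only free vma->pages when the VMA owns its own sg_table,
	 * i.e. it is not just borrowing obj->pages for the 1:1 view.
	 */
	if (vma->pages != vma->obj->pages) {
		sg_free_table(vma->pages);
		kfree(vma->pages);
	}
	vma->pages = NULL;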

> + if (vma->ggtt_view.type != I915_GGTT_VIEW_NORMAL) {
> + sg_free_table(vma->pages);
> + kfree(vma->pages);
>   }
>   vma->pages = NULL;



> @@ -3693,7 +3687,10 @@ void __i915_vma_set_map_and_fenceable(struct i915_vma 
> *vma)

This might also clear the flag, so the function should really be named
update_map_and_fenceable.

> @@ -2262,11 +2262,11 @@ void intel_unpin_fb_obj(struct drm_framebuffer *fb, 
> unsigned int rotation)
>   WARN_ON(!mutex_is_locked(&obj->base.dev->struct_mutex));
>  
>   intel_fill_fb_ggtt_view(&view, fb, rotation);
> + vma = i915_gem_object_to_ggtt(obj, &view);
>  
> - if (view.type == I915_GGTT_VIEW_NORMAL)
> + if (i915_vma_is_map_and_fenceable(vma))
>   i915_gem_object_unpin_fence(obj);
>  
> - vma = i915_gem_object_to_ggtt(obj, &view);
>   i915_gem_object_unpin_from_display_plane(vma);

This did not have NULL protection previously either, so should be OK.

Reviewed-by: Joonas Lahtinen 

Regards, Joonas 
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 01/10] drm/i915: Move map-and-fenceable tracking to the VMA

2016-08-15 Thread Chris Wilson
On Mon, Aug 15, 2016 at 11:03:32AM +0300, Joonas Lahtinen wrote:
> On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote:
> Not sure if there should be a comment that for 1:1 mappings vma->pages
> is just obj->pages so it should not be freed. Or maybe you could even
> make the test if vma->pages != vma->obj->pages? More self-documenting.

I contemplated making this vma->pages != vma->obj->pages as well in
light of the recent changes, will do.
> 
> > +   if (vma->ggtt_view.type != I915_GGTT_VIEW_NORMAL) {
> > +   sg_free_table(vma->pages);
> > +   kfree(vma->pages);
> >     }
> >     vma->pages = NULL;
> 
> 
> 
> > @@ -3693,7 +3687,10 @@ void __i915_vma_set_map_and_fenceable(struct 
> > i915_vma *vma)
> 
> This might also clear, so function name should be
> update_map_and_fenceable, really.

update/compute either is a fine TODO ;)
 
> > @@ -2262,11 +2262,11 @@ void intel_unpin_fb_obj(struct drm_framebuffer *fb, 
> > unsigned int rotation)
> >     WARN_ON(!mutex_is_locked(&obj->base.dev->struct_mutex));
> >  
> >     intel_fill_fb_ggtt_view(&view, fb, rotation);
> > +   vma = i915_gem_object_to_ggtt(obj, &view);
> >  
> > -   if (view.type == I915_GGTT_VIEW_NORMAL)
> > +   if (i915_vma_is_map_and_fenceable(vma))
> >     i915_gem_object_unpin_fence(obj);
> >  
> > -   vma = i915_gem_object_to_ggtt(obj, &view);
> >     i915_gem_object_unpin_from_display_plane(vma);
> 
> This did not have NULL protection previously either, so should be OK.

Yup, the long-term goal here is to pass in the vma.
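
Purely as a sketch of where that could end up; the name
intel_unpin_fb_vma and its signature are hypothetical, only the two
calls quoted above come from the patch:

	static void intel_unpin_fb_vma(struct i915_vma *vma)
	{
		if (i915_vma_is_map_and_fenceable(vma))
			i915_gem_object_unpin_fence(vma->obj);

		i915_gem_object_unpin_from_display_plane(vma);
	}
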
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v2] drm/i915: Show RPS autotuning thresholds along with waitboost

2016-08-15 Thread David Weinehall
On Sun, Aug 14, 2016 at 02:28:56PM +0100, Chris Wilson wrote:
> For convenience when debugging user issues show the autotuning
> RPS parameters in debugfs/i915_rps_boost_info.
> 
> v2: Refine the presentation
> 
> Signed-off-by: Chris Wilson 
> Cc: frit...@kodi.tv

Looks good to me (well, it doesn't, I hate having things on the same
line as the case statement, but that's a personal opinion), compiles
and works as it should.
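
(Purely for illustration, the conventional multi-line layout I mean is
something like the following, using the same enum values and strings as
the patch:)

	switch (power) {
	case LOW_POWER:
		return "low power";
	case BETWEEN:
		return "mixed";
	case HIGH_POWER:
		return "high power";
	default:
		return "unknown";
	}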

Reviewed-by: David Weinehall 

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 43 
> +++--
>  1 file changed, 41 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
> b/drivers/gpu/drm/i915/i915_debugfs.c
> index c461072da142..8d302906d768 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2441,6 +2441,16 @@ static int count_irq_waiters(struct drm_i915_private 
> *i915)
>   return count;
>  }
>  
> +static const char *rps_power_to_str(int power)
> +{
> + switch (power) {
> + default: return "unknown";
> + case LOW_POWER: return "low power";
> + case BETWEEN: return "mixed";
> + case HIGH_POWER: return "high power";
> + }
> +}
> +
>  static int i915_rps_boost_info(struct seq_file *m, void *data)
>  {
>   struct drm_info_node *node = m->private;
> @@ -2452,12 +2462,17 @@ static int i915_rps_boost_info(struct seq_file *m, 
> void *data)
>   seq_printf(m, "GPU busy? %s [%x]\n",
>  yesno(dev_priv->gt.awake), dev_priv->gt.active_engines);
>   seq_printf(m, "CPU waiting? %d\n", count_irq_waiters(dev_priv));
> - seq_printf(m, "Frequency requested %d; min hard:%d, soft:%d; max 
> soft:%d, hard:%d\n",
> -intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq),
> + seq_printf(m, "Frequency requested %d\n",
> +intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq));
> + seq_printf(m, "  min hard:%d, soft:%d; max soft:%d, hard:%d\n",
>  intel_gpu_freq(dev_priv, dev_priv->rps.min_freq),
>  intel_gpu_freq(dev_priv, dev_priv->rps.min_freq_softlimit),
>  intel_gpu_freq(dev_priv, dev_priv->rps.max_freq_softlimit),
>  intel_gpu_freq(dev_priv, dev_priv->rps.max_freq));
> + seq_printf(m, "  idle:%d, efficient:%d, boost:%d\n",
> +intel_gpu_freq(dev_priv, dev_priv->rps.idle_freq),
> +intel_gpu_freq(dev_priv, dev_priv->rps.efficient_freq),
> +intel_gpu_freq(dev_priv, dev_priv->rps.boost_freq));
>  
>   mutex_lock(&dev->filelist_mutex);
>   spin_lock(&dev_priv->rps.client_lock);
> @@ -2478,6 +2493,30 @@ static int i915_rps_boost_info(struct seq_file *m, 
> void *data)
>   spin_unlock(&dev_priv->rps.client_lock);
>   mutex_unlock(&dev->filelist_mutex);
>  
> + if (INTEL_GEN(dev_priv) >= 6 &&
> + dev_priv->rps.enabled &&
> + dev_priv->gt.active_engines) {
> + u32 rpupei, rpcurup;
> + u32 rpdownei, rpcurdown;
> +
> + intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> + rpupei = I915_READ_FW(GEN6_RP_CUR_UP_EI) & GEN6_CURICONT_MASK;
> + rpcurup = I915_READ_FW(GEN6_RP_CUR_UP) & GEN6_CURBSYTAVG_MASK;
> + rpdownei = I915_READ_FW(GEN6_RP_CUR_DOWN_EI) & 
> GEN6_CURIAVG_MASK;
> + rpcurdown = I915_READ_FW(GEN6_RP_CUR_DOWN) & 
> GEN6_CURBSYTAVG_MASK;
> + intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
> +
> + seq_printf(m, "\nRPS Autotuning (current \"%s\" window):\n",
> +rps_power_to_str(dev_priv->rps.power));
> + seq_printf(m, "  Avg. up: %d%% [above threshold? %d%%]\n",
> +100*rpcurup/rpupei,
> +dev_priv->rps.up_threshold);
> + seq_printf(m, "  Avg. down: %d%% [below threshold? %d%%]\n",
> +100*rpcurdown/rpdownei,
> +dev_priv->rps.down_threshold);
> + } else
> + seq_printf(m, "\nRPS Autotuning inactive\n");
> +
>   return 0;
>  }
>  
> -- 
> 2.8.1
> 
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [CI] drm/i915: Show RPS autotuning thresholds along with waitboost

2016-08-15 Thread Chris Wilson
For convenience when debugging user issues show the autotuning
RPS parameters in debugfs/i915_rps_boost_info.

v2: Refine the presentation
v3: Style

Signed-off-by: Chris Wilson 
Cc: frit...@kodi.tv
Link: 
http://patchwork.freedesktop.org/patch/msgid/1471181336-27523-1-git-send-email-ch...@chris-wilson.co.uk
Reviewed-by: David Weinehall 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 48 +++--
 drivers/gpu/drm/i915/i915_reg.h |  7 +++---
 2 files changed, 50 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 2a3d6d2..1949588 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2441,6 +2441,20 @@ static int count_irq_waiters(struct drm_i915_private 
*i915)
return count;
 }
 
+static const char *rps_power_to_str(unsigned power)
+{
+   const char *strings[] = {
+   [LOW_POWER] = "low power",
+   [BETWEEN] = "mixed",
+   [HIGH_POWER] = "high power",
+   };
+
+   if (power >= ARRAY_SIZE(strings) || !strings[power])
+   return "unknown";
+
+   return strings[power];
+}
+
 static int i915_rps_boost_info(struct seq_file *m, void *data)
 {
struct drm_info_node *node = m->private;
@@ -2452,12 +2466,17 @@ static int i915_rps_boost_info(struct seq_file *m, void 
*data)
seq_printf(m, "GPU busy? %s [%x]\n",
   yesno(dev_priv->gt.awake), dev_priv->gt.active_engines);
seq_printf(m, "CPU waiting? %d\n", count_irq_waiters(dev_priv));
-   seq_printf(m, "Frequency requested %d; min hard:%d, soft:%d; max 
soft:%d, hard:%d\n",
-  intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq),
+   seq_printf(m, "Frequency requested %d\n",
+  intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq));
+   seq_printf(m, "  min hard:%d, soft:%d; max soft:%d, hard:%d\n",
   intel_gpu_freq(dev_priv, dev_priv->rps.min_freq),
   intel_gpu_freq(dev_priv, dev_priv->rps.min_freq_softlimit),
   intel_gpu_freq(dev_priv, dev_priv->rps.max_freq_softlimit),
   intel_gpu_freq(dev_priv, dev_priv->rps.max_freq));
+   seq_printf(m, "  idle:%d, efficient:%d, boost:%d\n",
+  intel_gpu_freq(dev_priv, dev_priv->rps.idle_freq),
+  intel_gpu_freq(dev_priv, dev_priv->rps.efficient_freq),
+  intel_gpu_freq(dev_priv, dev_priv->rps.boost_freq));
 
mutex_lock(&dev->filelist_mutex);
spin_lock(&dev_priv->rps.client_lock);
@@ -2478,6 +2497,31 @@ static int i915_rps_boost_info(struct seq_file *m, void 
*data)
spin_unlock(&dev_priv->rps.client_lock);
mutex_unlock(&dev->filelist_mutex);
 
+   if (INTEL_GEN(dev_priv) >= 6 &&
+   dev_priv->rps.enabled &&
+   dev_priv->gt.active_engines) {
+   u32 rpup, rpupei;
+   u32 rpdown, rpdownei;
+
+   intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+   rpup = I915_READ_FW(GEN6_RP_CUR_UP) & GEN6_RP_EI_MASK;
+   rpupei = I915_READ_FW(GEN6_RP_CUR_UP_EI) & GEN6_RP_EI_MASK;
+   rpdown = I915_READ_FW(GEN6_RP_CUR_DOWN) & GEN6_RP_EI_MASK;
+   rpdownei = I915_READ_FW(GEN6_RP_CUR_DOWN_EI) & GEN6_RP_EI_MASK;
+   intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+
+   seq_printf(m, "\nRPS Autotuning (current \"%s\" window):\n",
+  rps_power_to_str(dev_priv->rps.power));
+   seq_printf(m, "  Avg. up: %d%% [above threshold? %d%%]\n",
+  100 * rpup / rpupei,
+  dev_priv->rps.up_threshold);
+   seq_printf(m, "  Avg. down: %d%% [below threshold? %d%%]\n",
+  100 * rpdown / rpdownei,
+  dev_priv->rps.down_threshold);
+   } else {
+   seq_puts(m, "\nRPS Autotuning inactive\n");
+   }
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index da82744..d4adf28 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7036,12 +7036,13 @@ enum {
 #define GEN6_RP_UP_THRESHOLD   _MMIO(0xA02C)
 #define GEN6_RP_DOWN_THRESHOLD _MMIO(0xA030)
 #define GEN6_RP_CUR_UP_EI  _MMIO(0xA050)
-#define   GEN6_CURICONT_MASK   0xff
+#define   GEN6_RP_EI_MASK  0xff
+#define   GEN6_CURICONT_MASK   GEN6_RP_EI_MASK
 #define GEN6_RP_CUR_UP _MMIO(0xA054)
-#define   GEN6_CURBSYTAVG_MASK 0xff
+#define   GEN6_CURBSYTAVG_MASK GEN6_RP_EI_MASK
 #define GEN6_RP_PREV_UP_MMIO(0xA058)
 #define GEN6_RP_CUR_DOWN_EI_MMIO(0xA05C)
-#define   GEN6_CURIAVG_MASK 

[Intel-gfx] [PATCH 5/5 v4] drm/i915: debugfs spring cleaning

2016-08-15 Thread David Weinehall
drm/i915: debugfs spring cleaning

Just like with sysfs, we do some major overhaul.

Pass dev_priv instead of dev to all feature macros (IS_, HAS_,
INTEL_, etc.). This has the side effect that a bunch of functions
now get dev_priv passed instead of dev.

All calls to INTEL_INFO()->gen have been replaced with
INTEL_GEN().

We want access to to_i915(node->minor->dev) in a lot of places,
so add the node_to_i915() helper to accommodate this.

Finally, we have quite a few cases where we get a void * pointer,
and need to cast it to drm_device *, only to run to_i915() on it.
Add cast_to_i915() to do this.

v2: Don't introduce extra dev (Chris)

v3: Make pipe_crc_info have a pointer to drm_i915_private instead of
drm_device. This saves a bit of space, since we never use
drm_device anywhere in these functions.

Also some minor fixup that I missed in the previous version.

v4: Fixed a nasty bug in the earlier version (that could trigger
an oops).

Changed the code a bit so that dev_priv is passed directly
to various functions, thus removing the need for the
cast_to_i915() helper. Also did some additional cleanup.

Signed-off-by: David Weinehall 

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index bba47cfd5d61..4b1884a238da 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -46,6 +46,11 @@ enum {
PINNED_LIST,
 };
 
+static inline struct drm_i915_private *node_to_i915(struct drm_info_node *node)
+{
+   return to_i915(node->minor->dev);
+}
+
 /* As the drm_debugfs_init() routines are called before dev->dev_private is
  * allocated we need to hook into the minor for release. */
 static int
@@ -63,7 +68,7 @@ drm_add_fake_info_node(struct drm_minor *minor,
 
node->minor = minor;
node->dent = ent;
-   node->info_ent = (void *) key;
+   node->info_ent = (void *)key;
 
mutex_lock(&minor->debugfs_lock);
list_add(&node->list, &minor->debugfs_list);
@@ -74,12 +79,11 @@ drm_add_fake_info_node(struct drm_minor *minor,
 
 static int i915_capabilities(struct seq_file *m, void *data)
 {
-   struct drm_info_node *node = m->private;
-   struct drm_device *dev = node->minor->dev;
-   const struct intel_device_info *info = INTEL_INFO(dev);
+   struct drm_i915_private *dev_priv = node_to_i915(m->private);
+   const struct intel_device_info *info = INTEL_INFO(dev_priv);
 
-   seq_printf(m, "gen: %d\n", info->gen);
-   seq_printf(m, "pch: %d\n", INTEL_PCH_TYPE(dev));
+   seq_printf(m, "gen: %d\n", INTEL_GEN(dev_priv));
+   seq_printf(m, "pch: %d\n", INTEL_PCH_TYPE(dev_priv));
 #define PRINT_FLAG(x)  seq_printf(m, #x ": %s\n", yesno(info->x))
 #define SEP_SEMICOLON ;
DEV_INFO_FOR_EACH_FLAG(PRINT_FLAG, SEP_SEMICOLON);
@@ -136,13 +140,14 @@ static void
 describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 {
struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
+   struct drm_device *dev = &dev_priv->drm;
struct intel_engine_cs *engine;
struct i915_vma *vma;
unsigned int frontbuffer_bits;
int pin_count = 0;
enum intel_engine_id id;
 
-   lockdep_assert_held(&obj->base.dev->struct_mutex);
+   lockdep_assert_held(&dev->struct_mutex);
 
seq_printf(m, "%pK: %c%c%c%c%c %8zdKiB %02x %02x [ ",
   &obj->base,
@@ -157,13 +162,13 @@ describe_obj(struct seq_file *m, struct 
drm_i915_gem_object *obj)
for_each_engine_id(engine, dev_priv, id)
seq_printf(m, "%x ",
   i915_gem_active_get_seqno(&obj->last_read[id],
-
&obj->base.dev->struct_mutex));
+&dev->struct_mutex));
seq_printf(m, "] %x %x%s%s%s",
   i915_gem_active_get_seqno(&obj->last_write,
-&obj->base.dev->struct_mutex),
+&dev->struct_mutex),
   i915_gem_active_get_seqno(&obj->last_fence,
-&obj->base.dev->struct_mutex),
-  i915_cache_level_str(to_i915(obj->base.dev), 
obj->cache_level),
+&dev->struct_mutex),
+  i915_cache_level_str(dev_priv, obj->cache_level),
   obj->dirty ? " dirty" : "",
   obj->madv == I915_MADV_DONTNEED ? " purgeable" : "");
if (obj->base.name)
@@ -201,7 +206,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object 
*obj)
}
 
engine = i915_gem_active_get_engine(&obj->last_write,
-   &obj->base.dev->struct_mutex);
+   &dev->struct_mutex);
if (engine)
seq_printf(m, " (%s)", engine->name);
 
@@ -213,10 +218,10 @@ de

Re: [Intel-gfx] [PATCH 5/5 v3] drm/i915: debugfs spring cleaning

2016-08-15 Thread David Weinehall
On Fri, Aug 12, 2016 at 01:43:52PM +0100, Dave Gordon wrote:
> Alternatively (noting that almost the only use we make of this drm_info_node
> is to indirect multiple times to get dev_priv), we could change what is
> stored in (struct seq_file).private to make it more convenient and/or
> efficient. For example,
> 
> struct i915_debugfs_node {
>   struct drm_i915_private *dev_priv;
>   struct drm_info_node drm_info;  // if still required
> };
> 
> thus eliminating several memory cycles per use for a cost of one word extra
> data per debugfs node.

v4 of the patch doesn't eliminate the need for the node_to_i915() macro
and its users, but all functions that don't use the
drm_debugfs_create_files() helper now receive drm_i915_private *dev_priv
instead of drm_device *dev. This at least kills off the cast_to_i915()
macro.


Regards, David
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ Ro.CI.BAT: failure for drm/i915: Show RPS autotuning thresholds along with waitboost (rev3)

2016-08-15 Thread Patchwork
== Series Details ==

Series: drm/i915: Show RPS autotuning thresholds along with waitboost (rev3)
URL   : https://patchwork.freedesktop.org/series/11063/
State : failure

== Summary ==

Series 11063v3 drm/i915: Show RPS autotuning thresholds along with waitboost
http://patchwork.freedesktop.org/api/1.0/series/11063/revisions/3/mbox

Test drv_module_reload_basic:
pass   -> SKIP   (ro-ivb-i7-3770)
Test kms_cursor_legacy:
Subgroup basic-cursor-vs-flip-varying-size:
pass   -> FAIL   (ro-ilk1-i5-650)
Subgroup basic-flip-vs-cursor-legacy:
pass   -> FAIL   (ro-bdw-i5-5250u)
Subgroup basic-flip-vs-cursor-varying-size:
pass   -> FAIL   (ro-skl3-i5-6260u)
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-a:
pass   -> DMESG-WARN (ro-bdw-i7-5600u)
skip   -> DMESG-WARN (ro-bdw-i5-5250u)
Subgroup suspend-read-crc-pipe-c:
skip   -> DMESG-WARN (ro-bdw-i5-5250u)

fi-hsw-i7-4770k  total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-kbl-qkkr  total:244  pass:185  dwarn:28  dfail:0   fail:3   skip:28 
fi-skl-i7-6700k  total:244  pass:208  dwarn:4   dfail:2   fail:2   skip:28 
fi-snb-i7-2600   total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42 
ro-bdw-i5-5250u  total:240  pass:219  dwarn:3   dfail:0   fail:1   skip:17 
ro-bdw-i7-5600u  total:240  pass:206  dwarn:1   dfail:0   fail:1   skip:32 
ro-bsw-n3050 total:240  pass:193  dwarn:0   dfail:0   fail:5   skip:42 
ro-byt-n2820 total:240  pass:197  dwarn:0   dfail:0   fail:3   skip:40 
ro-hsw-i3-4010u  total:240  pass:214  dwarn:0   dfail:0   fail:0   skip:26 
ro-hsw-i7-4770r  total:240  pass:185  dwarn:0   dfail:0   fail:0   skip:55 
ro-ilk1-i5-650   total:235  pass:173  dwarn:0   dfail:0   fail:2   skip:60 
ro-ivb-i7-3770   total:240  pass:204  dwarn:0   dfail:0   fail:0   skip:36 
ro-ivb2-i7-3770  total:240  pass:209  dwarn:0   dfail:0   fail:0   skip:31 
ro-skl3-i5-6260u total:240  pass:222  dwarn:0   dfail:0   fail:4   skip:14 

Results at /archive/results/CI_IGT_test/RO_Patchwork_1866/

441299a drm-intel-nightly: 2016y-08m-15d-07h-32m-02s UTC integration manifest
3653f3b drm/i915: Show RPS autotuning thresholds along with waitboost

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 03/10] drm/i915: Move fence tracking from object to vma

2016-08-15 Thread Joonas Lahtinen
On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote:
> @@ -455,15 +455,21 @@ struct intel_opregion {
>  struct intel_overlay;
>  struct intel_overlay_error_state;
>  
> -#define I915_FENCE_REG_NONE -1
> -#define I915_MAX_NUM_FENCES 32
> -/* 32 fences + sign bit for FENCE_REG_NONE */
> -#define I915_MAX_NUM_FENCE_BITS 6
> -
>  struct drm_i915_fence_reg {
>   struct list_head lru_list;

Could be converted to lru_link while at it.

> @@ -1131,15 +1131,11 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private 
> *i915,
>   } else {
>   node.start = i915_ggtt_offset(vma);
>   node.allocated = false;
> - ret = i915_gem_object_put_fence(obj);
> + ret = i915_vma_put_fence(vma);
>   if (ret)
>   goto out_unpin;
>   }
>  
> - ret = i915_gem_object_set_to_gtt_domain(obj, true);
> - if (ret)
> - goto out_unpin;
> -

This is a somewhat unexpected change here. Care to explain?

> +static void i965_write_fence_reg(struct drm_i915_fence_reg *fence,
> +  struct i915_vma *vma)
>  {
> - struct drm_i915_private *dev_priv = to_i915(dev);
>   i915_reg_t fence_reg_lo, fence_reg_hi;
>   int fence_pitch_shift;
> + u64 val;
>  
> - if (INTEL_INFO(dev)->gen >= 6) {
> - fence_reg_lo = FENCE_REG_GEN6_LO(reg);
> - fence_reg_hi = FENCE_REG_GEN6_HI(reg);
> + if (INTEL_INFO(fence->i915)->gen >= 6) {
> + fence_reg_lo = FENCE_REG_GEN6_LO(fence->id);
> + fence_reg_hi = FENCE_REG_GEN6_HI(fence->id);
>   fence_pitch_shift = GEN6_FENCE_PITCH_SHIFT;
> +
>   } else {
> - fence_reg_lo = FENCE_REG_965_LO(reg);
> - fence_reg_hi = FENCE_REG_965_HI(reg);
> + fence_reg_lo = FENCE_REG_965_LO(fence->id);
> + fence_reg_hi = FENCE_REG_965_HI(fence->id);
>   fence_pitch_shift = I965_FENCE_PITCH_SHIFT;
>   }
>  
> - /* To w/a incoherency with non-atomic 64-bit register updates,
> -  * we split the 64-bit update into two 32-bit writes. In order
> -  * for a partial fence not to be evaluated between writes, we
> -  * precede the update with write to turn off the fence register,
> -  * and only enable the fence as the last step.
> -  *
> -  * For extra levels of paranoia, we make sure each step lands
> -  * before applying the next step.
> -  */
> - I915_WRITE(fence_reg_lo, 0);
> - POSTING_READ(fence_reg_lo);
> -
> - if (obj) {
> - struct i915_vma *vma = i915_gem_object_to_ggtt(obj, NULL);
> - unsigned int tiling = i915_gem_object_get_tiling(obj);
> - unsigned int stride = i915_gem_object_get_stride(obj);
> - u64 size = vma->node.size;
> - u32 row_size = stride * (tiling == I915_TILING_Y ? 32 : 8);
> - u64 val;
> -
> - /* Adjust fence size to match tiled area */
> - size = rounddown(size, row_size);
> + if (vma) {
> + unsigned int tiling = i915_gem_object_get_tiling(vma->obj);
> + unsigned int tiling_y = tiling == I915_TILING_Y;

bool and maybe 'y_tiled'?

> + unsigned int stride = i915_gem_object_get_stride(vma->obj);
> + u32 row_size = stride * (tiling_y ? 32 : 8);
> + u32 size = rounddown(vma->node.size, row_size);
>  
>   val = ((vma->node.start + size - 4096) & 0xf000) << 32;
>   val |= vma->node.start & 0xf000;
>   val |= (u64)((stride / 128) - 1) << fence_pitch_shift;
> - if (tiling == I915_TILING_Y)
> + if (tiling_y)
>   val |= 1 << I965_FENCE_TILING_Y_SHIFT;

While around, BIT()

>   val |= I965_FENCE_REG_VALID;
> + } else
> + val = 0;
> +
> + if (1) {

Umm? At the very least it ought to have a TODO: / FIXME: or some explanation. And

if (!1)
return;

would make the code more readable too, as you do not have any else
branch.

> @@ -152,20 +148,23 @@ static void i915_write_fence_reg(struct drm_device 
> *dev, int reg,
>   } else
>   val = 0;
>  
> - I915_WRITE(FENCE_REG(reg), val);
> - POSTING_READ(FENCE_REG(reg));
> + if (1) {

Ditto.

> @@ -186,96 +185,95 @@ static void i830_write_fence_reg(struct drm_device 
> *dev, int reg,
>   } else
>   val = 0;
>  
> - I915_WRITE(FENCE_REG(reg), val);
> - POSTING_READ(FENCE_REG(reg));
> -}
> + if (1) {

Ditto.

> -static struct drm_i915_fence_reg *
> -i915_find_fence_reg(struct drm_device *dev)
> +static struct drm_i915_fence_reg *fence_find(struct drm_i915_private 
> *dev_priv)
>  {
> - struct drm_i915_private *dev_priv = to_i915(dev);
> - struct drm_i915_fence_reg *reg, *avail;
> - int i;
> -
> - /* First try to find a free reg */
> - avail = NULL;
> - for (i = 0; i < dev_priv->num_fence_regs; i++) {
> - reg = &dev

Re: [Intel-gfx] [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs

2016-08-15 Thread Tvrtko Ursulin


On 12/08/16 17:31, Goel, Akash wrote:

On 8/12/2016 9:52 PM, Tvrtko Ursulin wrote:

On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

As per the current i915 Driver load sequence, debugfs registration is done
at the end and so the relay channel debugfs file is also created after that
but the GuC firmware is loaded much earlier in the sequence.
As a result Driver could miss capturing the boot-time logs of GuC firmware
if there are flush interrupts from the GuC side.
Relay has a provision to support early logging where initially only relay
channel can be created, to have buffers for storing logs, and later on
channel can be associated with a debugfs file at appropriate time.
Have availed that, which allows Driver to capture boot time logs also,
which can be collected once Userspace comes up.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 61
+-
  1 file changed, 44 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index af48f62..1c287d7 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1099,25 +1099,12 @@ static void guc_remove_log_relay_file(struct
intel_guc *guc)
  relay_close(guc->log.relay_chan);
  }

-static int guc_create_log_relay_file(struct intel_guc *guc)
+static int guc_create_relay_channel(struct intel_guc *guc)
  {
  struct drm_i915_private *dev_priv = guc_to_i915(guc);
  struct rchan *guc_log_relay_chan;
-struct dentry *log_dir;
  size_t n_subbufs, subbuf_size;

-/* For now create the log file in /sys/kernel/debug/dri/0 dir */
-log_dir = dev_priv->drm.primary->debugfs_root;
-
-/* If /sys/kernel/debug/dri/0 location do not exist, then
debugfs is
- * not mounted and so can't create the relay file.
- * The relay API seems to fit well with debugfs only.


It only needs a dentry, I don't see that it has to be a debugfs one.


Besides a dentry, there are other requirements for using relay, which can
be met only by a debugfs file.
debugfs wasn't the preferred choice for placing the log file, but there was
no other option, as the relay API is compatible with debugfs only.
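
(For reference, a minimal sketch of the early-logging pattern being
discussed, using the stock kernel relay API; the callbacks structure and
the "guc_log" file name below are illustrative, not taken from the patch:)

	/* Create the relay channel early, with no debugfs parent yet, so
	 * the buffers already exist to capture GuC boot-time logs ...
	 */
	guc_log_relay_chan = relay_open(NULL, NULL, subbuf_size, n_subbufs,
					&relay_callbacks, dev_priv);

	/* ... then attach it to a debugfs file later, once debugfs has
	 * been registered.
	 */
	ret = relay_late_setup_files(guc_log_relay_chan, "guc_log",
				     dev_priv->drm.primary->debugfs_root);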


What are those and should they be mentioned in the comment above?

Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 03/10] drm/i915: Move fence tracking from object to vma

2016-08-15 Thread Chris Wilson
On Mon, Aug 15, 2016 at 12:18:20PM +0300, Joonas Lahtinen wrote:
> On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote:
> > +   if (1) {
> 
> Umm? At least ought to have TODO: / FIXME: or some explanation. And

You're not aware of the pipelined fencing?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 05/10] drm/i915: Fix partial GGTT faulting

2016-08-15 Thread Joonas Lahtinen
On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote:
> @@ -1717,26 +1716,30 @@ int i915_gem_fault(struct vm_area_struct *area, 
> struct vm_fault *vmf)
>   }
>  
>   /* Use a partial view if the object is bigger than the aperture. */

Move this comment down to where partial view is actually created.

> - if (obj->base.size >= ggtt->mappable_end &&
> - !i915_gem_object_is_tiled(obj)) {
> + /* Now pin it into the GTT if needed */
> + vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
> +    PIN_MAPPABLE | PIN_NONBLOCK);
> + if (IS_ERR(vma)) {
> + struct i915_ggtt_view partial;

'view' still makes more sense; less repetition of the word "partial" further down.

> @@ -1754,26 +1757,7 @@ int i915_gem_fault(struct vm_area_struct *area, struct 
> vm_fault *vmf)
>   pfn = ggtt->mappable_base + i915_ggtt_offset(vma);
>   pfn >>= PAGE_SHIFT;
>  
> - if (unlikely(view.type == I915_GGTT_VIEW_PARTIAL)) {
> - /* Overriding existing pages in partial view does not cause
> -  * us any trouble as TLBs are still valid because the fault
> -  * is due to userspace losing part of the mapping or never
> -  * having accessed it before (at this partials' range).
> -  */
> - unsigned long base = area->vm_start +
> -  (view.params.partial.offset << PAGE_SHIFT);
> - unsigned int i;
> -
> - for (i = 0; i < view.params.partial.size; i++) {
> - ret = vm_insert_pfn(area,
> - base + i * PAGE_SIZE,
> - pfn + i);
> - if (ret)
> - break;
> - }
> -
> - obj->fault_mappable = true;
> - } else {
> + if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) {

likely() ?

>   if (!obj->fault_mappable) {
>   unsigned long size =
>   min_t(unsigned long,
> @@ -1789,13 +1773,31 @@ int i915_gem_fault(struct vm_area_struct *area, 
> struct vm_fault *vmf)
>   if (ret)
>   break;
>   }
> -
> - obj->fault_mappable = true;
>   } else
>   ret = vm_insert_pfn(area,
>   (unsigned long)vmf->virtual_address,
>   pfn + page_offset);
> + } else {
> + /* Overriding existing pages in partial view does not cause
> +  * us any trouble as TLBs are still valid because the fault
> +  * is due to userspace losing part of the mapping or never
> +  * having accessed it before (at this partials' range).
> +  */
> + const struct i915_ggtt_view *view = &vma->ggtt_view;

I now see why you did the rename. Do not have a better idea really, so;

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 07/10] drm/i915: Fallback to using unmappable memory for scanout

2016-08-15 Thread Joonas Lahtinen
On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote:
> The existing ABI says that scanouts are pinned into the mappable region
> so that legacy clients (e.g. old Xorg or plymouthd) can write directly
> into the scanout through a GTT mapping. However if the surface does not
> fit into the mappable region, we are better off just trying to fit it
> anywhere and hoping for the best. (Any userspace that is cappable of

s/cappable/capable/

> using ginormous scanouts is also likely not to rely on pure GTT
> updates.) With the partial vma fault support, we are no longer
> restricted to only using scanouts that we can pin (though it is still
> preferred for performance reasons and for powersaving features like
> FBC).
> 
> v2: Skip fence pinning when not mappable.
> v3: Add a comment to explain the possible rammifactions of not being
> able to use fences for unmappable scanouts.
> v4: Rebase to skip over some local patches
> v5: Rebase to defer until after we have unmappable GTT fault support
> 
> Signed-off-by: Chris Wilson 

Reviewed-by: Joonas Lahtinen 

Could use some Acked-by tags.

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 08/10] drm/i915: Track display alignment on VMA

2016-08-15 Thread Joonas Lahtinen
On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote:
> When using the aliasing ppgtt and pagefliping with the shrinker/eviction

s/fliping/flipping/

> active, we note that we often have to rebind the backbuffer before
> flipping onto the scanout because it has an invalid alignment. If we
> store the worst-case alignment required for a VMA, we can avoid having
> to rebind at critical junctures.
> 
> Signed-off-by: Chris Wilson 



> @@ -2984,17 +2983,10 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 
> alignment, u64 flags)
>   size = i915_gem_get_ggtt_size(dev_priv, size,
>     i915_gem_object_get_tiling(obj));
>  
> - min_alignment =
> - i915_gem_get_ggtt_alignment(dev_priv, size,
> - i915_gem_object_get_tiling(obj),
> - flags & PIN_MAPPABLE);
> - if (alignment == 0)
> - alignment = min_alignment;
> - if (alignment & (min_alignment - 1)) {
> - DRM_DEBUG("Invalid object alignment requested %llu, minimum 
> %llu\n",
> -   alignment, min_alignment);
> - return -EINVAL;
> - }
> + alignment = max(max(alignment, vma->display_alignment),
> + i915_gem_get_ggtt_alignment(dev_priv, size,
> + 
> i915_gem_object_get_tiling(obj),
> + flags & PIN_MAPPABLE));

No DRM_DEBUG no more?

> @@ -183,7 +183,7 @@ struct i915_vma {
>   struct drm_i915_fence_reg *fence;
>   struct sg_table *pages;
>   void __iomem *iomap;
> - u64 size;
> + u64 size, display_alignment;

Unrelated variables, better off on their own lines.

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [CI 03/32] drm/i915: Store the active context object on all engines upon error

2016-08-15 Thread Chris Wilson
With execlists, we have context objects everywhere, not just on RCS. So
store them for post-mortem debugging. This also has the secondary effect
of removing one more unsafe list iteration, by using the preserved state
from the hanging request instead. And now we can cross-reference the
request's context state with that loaded by the GPU.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 28 
 1 file changed, 4 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 1c098fa65fbe..d11630bac188 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1043,28 +1043,6 @@ static void error_record_engine_registers(struct 
drm_i915_error_state *error,
}
 }
 
-static void i915_gem_record_active_context(struct intel_engine_cs *engine,
-  struct drm_i915_error_state *error,
-  struct drm_i915_error_engine *ee)
-{
-   struct drm_i915_private *dev_priv = engine->i915;
-   struct drm_i915_gem_object *obj;
-
-   /* Currently render ring is the only HW context user */
-   if (engine->id != RCS || !error->ccid)
-   return;
-
-   list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-   if (!i915_gem_obj_ggtt_bound(obj))
-   continue;
-
-   if ((error->ccid & PAGE_MASK) == i915_gem_obj_ggtt_offset(obj)) 
{
-   ee->ctx = i915_error_ggtt_object_create(dev_priv, obj);
-   break;
-   }
-   }
-}
-
 static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
  struct drm_i915_error_state *error)
 {
@@ -1114,6 +1092,10 @@ static void i915_gem_record_rings(struct 
drm_i915_private *dev_priv,
i915_error_ggtt_object_create(dev_priv,
  
engine->scratch.obj);
 
+   ee->ctx =
+   i915_error_ggtt_object_create(dev_priv,
+ 
request->ctx->engine[i].state);
+
if (request->pid) {
struct task_struct *task;
 
@@ -1144,8 +1126,6 @@ static void i915_gem_record_rings(struct drm_i915_private 
*dev_priv,
ee->wa_ctx = i915_error_ggtt_object_create(dev_priv,
   engine->wa_ctx.obj);
 
-   i915_gem_record_active_context(engine, error, ee);
-
count = 0;
list_for_each_entry(request, &engine->request_list, link)
count++;
-- 
2.8.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [CI 05/32] drm/i915: Focus debugfs/i915_gem_pinned to show only display pins

2016-08-15 Thread Chris Wilson
Only those objects pinned to the display have semi-permanent pins of a
global nature (other pins are transient within their local vm). Simplify
i915_gem_pinned to only show the pertinent information about the pinned
objects within the GGTT.

v2: i915_gem_gtt_info is still shared with debugfs/i915_gem_gtt,
rename i915_gem_pinned to i915_gem_pin_display to better reflect its
contents

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index cf35ce0b8518..c3bc5db1124f 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -40,12 +40,6 @@
 #include 
 #include "i915_drv.h"
 
-enum {
-   ACTIVE_LIST,
-   INACTIVE_LIST,
-   PINNED_LIST,
-};
-
 /* As the drm_debugfs_init() routines are called before dev->dev_private is
  * allocated we need to hook into the minor for release. */
 static int
@@ -537,8 +531,8 @@ static int i915_gem_gtt_info(struct seq_file *m, void *data)
 {
struct drm_info_node *node = m->private;
struct drm_device *dev = node->minor->dev;
-   uintptr_t list = (uintptr_t) node->info_ent->data;
struct drm_i915_private *dev_priv = to_i915(dev);
+   bool show_pin_display_only = !!data;
struct drm_i915_gem_object *obj;
u64 total_obj_size, total_gtt_size;
int count, ret;
@@ -549,7 +543,7 @@ static int i915_gem_gtt_info(struct seq_file *m, void *data)
 
total_obj_size = total_gtt_size = count = 0;
list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-   if (list == PINNED_LIST && !i915_gem_obj_is_pinned(obj))
+   if (show_pin_display_only && !obj->pin_display)
continue;
 
seq_puts(m, "   ");
@@ -5381,7 +5375,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
{"i915_capabilities", i915_capabilities, 0},
{"i915_gem_objects", i915_gem_object_info, 0},
{"i915_gem_gtt", i915_gem_gtt_info, 0},
-   {"i915_gem_pinned", i915_gem_gtt_info, 0, (void *) PINNED_LIST},
+   {"i915_gem_pin_display", i915_gem_gtt_info, 0, (void *)1},
{"i915_gem_stolen", i915_gem_stolen_list_info },
{"i915_gem_pageflip", i915_gem_pageflip_info, 0},
{"i915_gem_request", i915_gem_request_info, 0},
-- 
2.8.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [CI 14/32] drm/i915: Use VMA directly for checking tiling parameters

2016-08-15 Thread Chris Wilson
v2: Rename functions to suit their more active role

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_tiling.c | 51 --
 1 file changed, 30 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c 
b/drivers/gpu/drm/i915/i915_gem_tiling.c
index f4b984de83b5..b2b0cb7199ac 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -116,35 +116,46 @@ i915_tiling_ok(struct drm_device *dev, int stride, int 
size, int tiling_mode)
return true;
 }
 
-/* Is the current GTT allocation valid for the change in tiling? */
-static bool
-i915_gem_object_fence_ok(struct drm_i915_gem_object *obj, int tiling_mode)
+/* Make the current GTT allocation valid for the change in tiling. */
+static int
+i915_gem_object_fence_prepare(struct drm_i915_gem_object *obj, int tiling_mode)
 {
struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
+   struct i915_vma *vma;
u32 size;
 
if (tiling_mode == I915_TILING_NONE)
-   return true;
+   return 0;
 
if (INTEL_GEN(dev_priv) >= 4)
-   return true;
+   return 0;
+
+   vma = i915_gem_obj_to_ggtt(obj);
+   if (!vma)
+   return 0;
+
+   if (!obj->map_and_fenceable)
+   return 0;
 
if (IS_GEN3(dev_priv)) {
-   if (i915_gem_obj_ggtt_offset(obj) & ~I915_FENCE_START_MASK)
-   return false;
+   if (vma->node.start & ~I915_FENCE_START_MASK)
+   goto bad;
} else {
-   if (i915_gem_obj_ggtt_offset(obj) & ~I830_FENCE_START_MASK)
-   return false;
+   if (vma->node.start & ~I830_FENCE_START_MASK)
+   goto bad;
}
 
size = i915_gem_get_ggtt_size(dev_priv, obj->base.size, tiling_mode);
-   if (i915_gem_obj_ggtt_size(obj) != size)
-   return false;
+   if (vma->node.size < size)
+   goto bad;
 
-   if (i915_gem_obj_ggtt_offset(obj) & (size - 1))
-   return false;
+   if (vma->node.start & (size - 1))
+   goto bad;
 
-   return true;
+   return 0;
+
+bad:
+   return i915_vma_unbind(vma);
 }
 
 /**
@@ -168,7 +179,7 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
struct drm_i915_gem_set_tiling *args = data;
struct drm_i915_private *dev_priv = to_i915(dev);
struct drm_i915_gem_object *obj;
-   int ret = 0;
+   int err = 0;
 
/* Make sure we don't cross-contaminate obj->tiling_and_stride */
BUILD_BUG_ON(I915_TILING_LAST & STRIDE_MASK);
@@ -187,7 +198,7 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 
mutex_lock(&dev->struct_mutex);
if (obj->pin_display || obj->framebuffer_references) {
-   ret = -EBUSY;
+   err = -EBUSY;
goto err;
}
 
@@ -234,11 +245,9 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 * has to also include the unfenced register the GPU uses
 * whilst executing a fenced command for an untiled object.
 */
-   if (obj->map_and_fenceable &&
-   !i915_gem_object_fence_ok(obj, args->tiling_mode))
-   ret = i915_vma_unbind(i915_gem_obj_to_ggtt(obj));
 
-   if (ret == 0) {
+   err = i915_gem_object_fence_prepare(obj, args->tiling_mode);
+   if (!err) {
if (obj->pages &&
obj->madv == I915_MADV_WILLNEED &&
dev_priv->quirks & QUIRK_PIN_SWIZZLED_PAGES) {
@@ -281,7 +290,7 @@ err:
 
intel_runtime_pm_put(dev_priv);
 
-   return ret;
+   return err;
 }
 
 /**
-- 
2.8.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [CI 07/32] drm/i915: Remove redundant WARN_ON from __i915_add_request()

2016-08-15 Thread Chris Wilson
It's an outright programming error, so explode if it is ever hit.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_request.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c 
b/drivers/gpu/drm/i915/i915_gem_request.c
index 8a9e9bfeea09..4c5b7e104f2f 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -470,18 +470,12 @@ static void i915_gem_mark_busy(const struct 
intel_engine_cs *engine)
  */
 void __i915_add_request(struct drm_i915_gem_request *request, bool 
flush_caches)
 {
-   struct intel_engine_cs *engine;
-   struct intel_ring *ring;
+   struct intel_engine_cs *engine = request->engine;
+   struct intel_ring *ring = request->ring;
u32 request_start;
u32 reserved_tail;
int ret;
 
-   if (WARN_ON(!request))
-   return;
-
-   engine = request->engine;
-   ring = request->ring;
-
/*
 * To ensure that this call will not fail, space for its emissions
 * should already have been reserved in the ring buffer. Let the ring
-- 
2.8.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [CI 02/32] drm/i915: Reduce amount of duplicate buffer information captured on error

2016-08-15 Thread Chris Wilson
When capturing the error state, we do not need to know about every
address space - just those that are related to the error. We know which
context is active at the time, and therefore which VMs are implicated
in the error. We can then restrict the VMs we report to the relevant
subset.

v2: s/i/count_active/ (and similar)
Rewrite label generation for "Buffers"

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_drv.h   |   9 +-
 drivers/gpu/drm/i915/i915_gpu_error.c | 224 +++---
 2 files changed, 105 insertions(+), 128 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b1017950087b..7eb911e47904 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -517,6 +517,7 @@ struct drm_i915_error_state {
int num_waiters;
int hangcheck_score;
enum intel_engine_hangcheck_action hangcheck_action;
+   struct i915_address_space *vm;
int num_requests;
 
/* our own tracking of ring head and tail */
@@ -587,17 +588,15 @@ struct drm_i915_error_state {
u32 read_domains;
u32 write_domain;
s32 fence_reg:I915_MAX_NUM_FENCE_BITS;
-   s32 pinned:2;
u32 tiling:2;
u32 dirty:1;
u32 purgeable:1;
u32 userptr:1;
s32 engine:4;
u32 cache_level:3;
-   } **active_bo, **pinned_bo;
-
-   u32 *active_bo_count, *pinned_bo_count;
-   u32 vm_count;
+   } *active_bo[I915_NUM_ENGINES], *pinned_bo;
+   u32 active_bo_count[I915_NUM_ENGINES], pinned_bo_count;
+   struct i915_address_space *active_vm[I915_NUM_ENGINES];
 };
 
 struct intel_connector;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index d54848f5f246..1c098fa65fbe 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -42,16 +42,6 @@ static const char *engine_str(int engine)
}
 }
 
-static const char *pin_flag(int pinned)
-{
-   if (pinned > 0)
-   return " P";
-   else if (pinned < 0)
-   return " p";
-   else
-   return "";
-}
-
 static const char *tiling_flag(int tiling)
 {
switch (tiling) {
@@ -189,7 +179,7 @@ static void print_error_buffers(struct 
drm_i915_error_state_buf *m,
 {
int i;
 
-   err_printf(m, "  %s [%d]:\n", name, count);
+   err_printf(m, "%s [%d]:\n", name, count);
 
while (count--) {
err_printf(m, "%08x_%08x %8u %02x %02x [ ",
@@ -202,7 +192,6 @@ static void print_error_buffers(struct 
drm_i915_error_state_buf *m,
err_printf(m, "%02x ", err->rseqno[i]);
 
err_printf(m, "] %02x", err->wseqno);
-   err_puts(m, pin_flag(err->pinned));
err_puts(m, tiling_flag(err->tiling));
err_puts(m, dirty_flag(err->dirty));
err_puts(m, purgeable_flag(err->purgeable));
@@ -414,18 +403,33 @@ int i915_error_state_to_str(struct 
drm_i915_error_state_buf *m,
error_print_engine(m, &error->engine[i]);
}
 
-   for (i = 0; i < error->vm_count; i++) {
-   err_printf(m, "vm[%d]\n", i);
+   for (i = 0; i < ARRAY_SIZE(error->active_vm); i++) {
+   char buf[128];
+   int len, first = 1;
 
-   print_error_buffers(m, "Active",
+   if (!error->active_vm[i])
+   break;
+
+   len = scnprintf(buf, sizeof(buf), "Active (");
+   for (j = 0; j < ARRAY_SIZE(error->engine); j++) {
+   if (error->engine[j].vm != error->active_vm[i])
+   continue;
+
+   len += scnprintf(buf + len, sizeof(buf), "%s%s",
+first ? "" : ", ",
+dev_priv->engine[j].name);
+   first = 0;
+   }
+   scnprintf(buf + len, sizeof(buf), ")");
+   print_error_buffers(m, buf,
error->active_bo[i],
error->active_bo_count[i]);
-
-   print_error_buffers(m, "Pinned",
-   error->pinned_bo[i],
-   error->pinned_bo_count[i]);
}
 
+   print_error_buffers(m, "Pinned (global)",
+   error->pinned_bo,
+   error->pinned_bo_count);
+
for (i = 0; i < ARRAY_SIZE(error->engine); i++) {
struct drm_i915_error_engine *ee = &error->engine[i];
 
@@ -627,13 +631,10 @@ static void i915_error_state_free(struct kref *error_ref)
 
i915_error_object_free(error->semaphore_obj);
 

[Intel-gfx] [CI 01/32] drm/i915: Record the position of the start of the request

2016-08-15 Thread Chris Wilson
Not only does it make for good documentation and a debugging aid, but it is
also vital for when we want to unwind requests - such as when throwing away
an incomplete request.

Signed-off-by: Chris Wilson 
Link: 
http://patchwork.freedesktop.org/patch/msgid/1470414607-32453-2-git-send-email-arun.siluv...@linux.intel.com
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_drv.h |  1 +
 drivers/gpu/drm/i915/i915_gem_request.c | 13 +
 drivers/gpu/drm/i915/i915_gpu_error.c   |  6 --
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bf193ba1574e..b1017950087b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -557,6 +557,7 @@ struct drm_i915_error_state {
struct drm_i915_error_request {
long jiffies;
u32 seqno;
+   u32 head;
u32 tail;
} *requests;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c 
b/drivers/gpu/drm/i915/i915_gem_request.c
index b764c1d440c8..8a9e9bfeea09 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -426,6 +426,13 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
if (ret)
goto err_ctx;
 
+   /* Record the position of the start of the request so that
+* should we detect the updated seqno part-way through the
+* GPU processing the request, we never over-estimate the
+* position of the head.
+*/
+   req->head = req->ring->tail;
+
return req;
 
 err_ctx:
@@ -500,8 +507,6 @@ void __i915_add_request(struct drm_i915_gem_request 
*request, bool flush_caches)
 
trace_i915_gem_request_add(request);
 
-   request->head = request_start;
-
/* Seal the request and mark it as pending execution. Note that
 * we may inspect this state, without holding any locks, during
 * hangcheck. Hence we apply the barrier to ensure that we do not
@@ -514,10 +519,10 @@ void __i915_add_request(struct drm_i915_gem_request 
*request, bool flush_caches)
list_add_tail(&request->link, &engine->request_list);
list_add_tail(&request->ring_link, &ring->request_list);
 
-   /* Record the position of the start of the request so that
+   /* Record the position of the start of the breadcrumb so that
 * should we detect the updated seqno part-way through the
 * GPU processing the request, we never over-estimate the
-* position of the head.
+* position of the ring's HEAD.
 */
request->postfix = ring->tail;
 
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index eecb87063c88..d54848f5f246 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -455,9 +455,10 @@ int i915_error_state_to_str(struct 
drm_i915_error_state_buf *m,
   dev_priv->engine[i].name,
   ee->num_requests);
for (j = 0; j < ee->num_requests; j++) {
-   err_printf(m, "  seqno 0x%08x, emitted %ld, 
tail 0x%08x\n",
+   err_printf(m, "  seqno 0x%08x, emitted %ld, 
head 0x%08x, tail 0x%08x\n",
   ee->requests[j].seqno,
   ee->requests[j].jiffies,
+  ee->requests[j].head,
   ee->requests[j].tail);
}
}
@@ -1205,7 +1206,8 @@ static void i915_gem_record_rings(struct drm_i915_private 
*dev_priv,
erq = &ee->requests[count++];
erq->seqno = request->fence.seqno;
erq->jiffies = request->emitted_jiffies;
-   erq->tail = request->postfix;
+   erq->head = request->head;
+   erq->tail = request->tail;
}
}
 }
-- 
2.8.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [CI 13/32] drm/i915: Convert fence computations to use vma directly

2016-08-15 Thread Chris Wilson
Lookup the GGTT vma once for the object assigned to the fence, and then
derive everything from that vma.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_fence.c | 55 +--
 1 file changed, 26 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_fence.c 
b/drivers/gpu/drm/i915/i915_gem_fence.c
index 9e8173fe2a09..d99fc5734cf1 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence.c
@@ -85,22 +85,19 @@ static void i965_write_fence_reg(struct drm_device *dev, 
int reg,
POSTING_READ(fence_reg_lo);
 
if (obj) {
-   u32 size = i915_gem_obj_ggtt_size(obj);
+   struct i915_vma *vma = i915_gem_obj_to_ggtt(obj);
unsigned int tiling = i915_gem_object_get_tiling(obj);
unsigned int stride = i915_gem_object_get_stride(obj);
-   uint64_t val;
+   u32 size = vma->node.size;
+   u32 row_size = stride * (tiling == I915_TILING_Y ? 32 : 8);
+   u64 val;
 
/* Adjust fence size to match tiled area */
-   if (tiling != I915_TILING_NONE) {
-   uint32_t row_size = stride *
-   (tiling == I915_TILING_Y ? 32 : 8);
-   size = (size / row_size) * row_size;
-   }
+   size = rounddown(size, row_size);
 
-   val = (uint64_t)((i915_gem_obj_ggtt_offset(obj) + size - 4096) &
-0xf000) << 32;
-   val |= i915_gem_obj_ggtt_offset(obj) & 0xf000;
-   val |= (uint64_t)((stride / 128) - 1) << fence_pitch_shift;
+   val = ((vma->node.start + size - 4096) & 0xf000) << 32;
+   val |= vma->node.start & 0xf000;
+   val |= (u64)((stride / 128) - 1) << fence_pitch_shift;
if (tiling == I915_TILING_Y)
val |= 1 << I965_FENCE_TILING_Y_SHIFT;
val |= I965_FENCE_REG_VALID;
@@ -123,17 +120,17 @@ static void i915_write_fence_reg(struct drm_device *dev, 
int reg,
u32 val;
 
if (obj) {
-   u32 size = i915_gem_obj_ggtt_size(obj);
+   struct i915_vma *vma = i915_gem_obj_to_ggtt(obj);
unsigned int tiling = i915_gem_object_get_tiling(obj);
unsigned int stride = i915_gem_object_get_stride(obj);
int pitch_val;
int tile_width;
 
-   WARN((i915_gem_obj_ggtt_offset(obj) & ~I915_FENCE_START_MASK) ||
-(size & -size) != size ||
-(i915_gem_obj_ggtt_offset(obj) & (size - 1)),
-"object 0x%08llx [fenceable? %d] not 1M or pot-size 
(0x%08x) aligned\n",
-i915_gem_obj_ggtt_offset(obj), obj->map_and_fenceable, 
size);
+   WARN((vma->node.start & ~I915_FENCE_START_MASK) ||
+!is_power_of_2(vma->node.size) ||
+(vma->node.start & (vma->node.size - 1)),
+"object 0x%08llx [fenceable? %d] not 1M or pot-size 
(0x%08llx) aligned\n",
+vma->node.start, obj->map_and_fenceable, vma->node.size);
 
if (tiling == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev))
tile_width = 128;
@@ -144,10 +141,10 @@ static void i915_write_fence_reg(struct drm_device *dev, 
int reg,
pitch_val = stride / tile_width;
pitch_val = ffs(pitch_val) - 1;
 
-   val = i915_gem_obj_ggtt_offset(obj);
+   val = vma->node.start;
if (tiling == I915_TILING_Y)
val |= 1 << I830_FENCE_TILING_Y_SHIFT;
-   val |= I915_FENCE_SIZE_BITS(size);
+   val |= I915_FENCE_SIZE_BITS(vma->node.size);
val |= pitch_val << I830_FENCE_PITCH_SHIFT;
val |= I830_FENCE_REG_VALID;
} else
@@ -161,27 +158,27 @@ static void i830_write_fence_reg(struct drm_device *dev, 
int reg,
struct drm_i915_gem_object *obj)
 {
struct drm_i915_private *dev_priv = to_i915(dev);
-   uint32_t val;
+   u32 val;
 
if (obj) {
-   u32 size = i915_gem_obj_ggtt_size(obj);
+   struct i915_vma *vma = i915_gem_obj_to_ggtt(obj);
unsigned int tiling = i915_gem_object_get_tiling(obj);
unsigned int stride = i915_gem_object_get_stride(obj);
-   uint32_t pitch_val;
+   u32 pitch_val;
 
-   WARN((i915_gem_obj_ggtt_offset(obj) & ~I830_FENCE_START_MASK) ||
-(size & -size) != size ||
-(i915_gem_obj_ggtt_offset(obj) & (size - 1)),
-"object 0x%08llx not 512K or pot-size 0x%08x aligned\n",
-i915_gem_obj_ggtt_offset(obj), size);
+   WARN((vma->node.s

[Intel-gfx] [CI 09/32] drm/i915: Create a VMA for an object

2016-08-15 Thread Chris Wilson
In many places, we wish to store the VMA in preference to the object
itself and so being able to create the persistent VMA is useful.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 11 +++
 drivers/gpu/drm/i915/i915_gem_gtt.h |  5 +
 2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 9c178b0c40b5..1bec50bd651b 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3387,6 +3387,17 @@ __i915_gem_vma_create(struct drm_i915_gem_object *obj,
 }
 
 struct i915_vma *
+i915_vma_create(struct drm_i915_gem_object *obj,
+   struct i915_address_space *vm,
+   const struct i915_ggtt_view *view)
+{
+   GEM_BUG_ON(view && !i915_is_ggtt(vm));
+   GEM_BUG_ON(view ? i915_gem_obj_to_ggtt_view(obj, view) : 
i915_gem_obj_to_vma(obj, vm));
+
+   return __i915_gem_vma_create(obj, vm, view ?: &i915_ggtt_view_normal);
+}
+
+struct i915_vma *
 i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
  struct i915_address_space *vm)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h 
b/drivers/gpu/drm/i915/i915_gem_gtt.h
index b580e8a013ce..f2769e01cc8c 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -228,6 +228,11 @@ struct i915_vma {
struct drm_i915_gem_exec_object2 *exec_entry;
 };
 
+struct i915_vma *
+i915_vma_create(struct drm_i915_gem_object *obj,
+   struct i915_address_space *vm,
+   const struct i915_ggtt_view *view);
+
 static inline bool i915_vma_is_ggtt(const struct i915_vma *vma)
 {
return vma->flags & I915_VMA_GGTT;
-- 
2.8.1
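
For completeness, a sketch of how the new helper might be called; the
ggtt pointer, the error handling and the ERR_PTR convention below are
illustrative assumptions, only the signature comes from the patch:

	struct i915_vma *vma;

	/* Create (but do not bind) a persistent VMA for obj in the GGTT;
	 * passing a NULL view selects the normal view.
	 */
	vma = i915_vma_create(obj, &ggtt->base, NULL);
	if (IS_ERR(vma))
		return PTR_ERR(vma);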

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [CI 04/32] drm/i915: Remove inactive/active list from debugfs

2016-08-15 Thread Chris Wilson
These two files (i915_gem_active, i915_gem_inactive) no longer give
pertinent information since active/inactive tracking is per-vm and so we
need the information per-vm. They are obsolete so remove them.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 49 -
 1 file changed, 49 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index b8ed8db9f7ec..cf35ce0b8518 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -210,53 +210,6 @@ describe_obj(struct seq_file *m, struct 
drm_i915_gem_object *obj)
seq_printf(m, " (frontbuffer: 0x%03x)", frontbuffer_bits);
 }
 
-static int i915_gem_object_list_info(struct seq_file *m, void *data)
-{
-   struct drm_info_node *node = m->private;
-   uintptr_t list = (uintptr_t) node->info_ent->data;
-   struct list_head *head;
-   struct drm_device *dev = node->minor->dev;
-   struct drm_i915_private *dev_priv = to_i915(dev);
-   struct i915_ggtt *ggtt = &dev_priv->ggtt;
-   struct i915_vma *vma;
-   u64 total_obj_size, total_gtt_size;
-   int count, ret;
-
-   ret = mutex_lock_interruptible(&dev->struct_mutex);
-   if (ret)
-   return ret;
-
-   /* FIXME: the user of this interface might want more than just GGTT */
-   switch (list) {
-   case ACTIVE_LIST:
-   seq_puts(m, "Active:\n");
-   head = &ggtt->base.active_list;
-   break;
-   case INACTIVE_LIST:
-   seq_puts(m, "Inactive:\n");
-   head = &ggtt->base.inactive_list;
-   break;
-   default:
-   mutex_unlock(&dev->struct_mutex);
-   return -EINVAL;
-   }
-
-   total_obj_size = total_gtt_size = count = 0;
-   list_for_each_entry(vma, head, vm_link) {
-   seq_printf(m, "   ");
-   describe_obj(m, vma->obj);
-   seq_printf(m, "\n");
-   total_obj_size += vma->obj->base.size;
-   total_gtt_size += vma->node.size;
-   count++;
-   }
-   mutex_unlock(&dev->struct_mutex);
-
-   seq_printf(m, "Total %d objects, %llu bytes, %llu GTT size\n",
-  count, total_obj_size, total_gtt_size);
-   return 0;
-}
-
 static int obj_rank_by_stolen(void *priv,
  struct list_head *A, struct list_head *B)
 {
@@ -5429,8 +5382,6 @@ static const struct drm_info_list i915_debugfs_list[] = {
{"i915_gem_objects", i915_gem_object_info, 0},
{"i915_gem_gtt", i915_gem_gtt_info, 0},
{"i915_gem_pinned", i915_gem_gtt_info, 0, (void *) PINNED_LIST},
-   {"i915_gem_active", i915_gem_object_list_info, 0, (void *) ACTIVE_LIST},
-   {"i915_gem_inactive", i915_gem_object_list_info, 0, (void *) 
INACTIVE_LIST},
{"i915_gem_stolen", i915_gem_stolen_list_info },
{"i915_gem_pageflip", i915_gem_pageflip_info, 0},
{"i915_gem_request", i915_gem_request_info, 0},
-- 
2.8.1



[Intel-gfx] [CI 06/32] drm/i915: Reduce i915_gem_objects to only show object information

2016-08-15 Thread Chris Wilson
Knowing how much of the GTT is used (both the mappable aperture and
beyond) is no longer relevant, and that output clutters the real
information - namely how many objects are allocated and bound (and by
whom) - so that we can quickly grasp if there is a leak.

v2: Relent, and rename pinned to indicate display only. Since the
display objects are semi-static and are of variable size, they are the
interesting objects to watch over time for aperture leaking. The other
pins are either static (such as the scratch page) or very short lived
(such as execbuf) and not part of the precious GGTT.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_debugfs.c   | 100 --
 drivers/gpu/drm/i915/i915_drv.h   | 249 +-
 drivers/gpu/drm/i915/i915_gpu_error.c |  15 ++
 3 files changed, 168 insertions(+), 196 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index c3bc5db1124f..77a9c56ad25f 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -269,17 +269,6 @@ static int i915_gem_stolen_list_info(struct seq_file *m, 
void *data)
return 0;
 }
 
-#define count_objects(list, member) do { \
-   list_for_each_entry(obj, list, member) { \
-   size += i915_gem_obj_total_ggtt_size(obj); \
-   ++count; \
-   if (obj->map_and_fenceable) { \
-   mappable_size += i915_gem_obj_ggtt_size(obj); \
-   ++mappable_count; \
-   } \
-   } \
-} while (0)
-
 struct file_stats {
struct drm_i915_file_private *file_priv;
unsigned long count;
@@ -394,30 +383,16 @@ static void print_context_stats(struct seq_file *m,
print_file_stats(m, "[k]contexts", stats);
 }
 
-#define count_vmas(list, member) do { \
-   list_for_each_entry(vma, list, member) { \
-   size += i915_gem_obj_total_ggtt_size(vma->obj); \
-   ++count; \
-   if (vma->obj->map_and_fenceable) { \
-   mappable_size += i915_gem_obj_ggtt_size(vma->obj); \
-   ++mappable_count; \
-   } \
-   } \
-} while (0)
-
 static int i915_gem_object_info(struct seq_file *m, void* data)
 {
struct drm_info_node *node = m->private;
struct drm_device *dev = node->minor->dev;
struct drm_i915_private *dev_priv = to_i915(dev);
struct i915_ggtt *ggtt = &dev_priv->ggtt;
-   u32 count, mappable_count, purgeable_count;
-   u64 size, mappable_size, purgeable_size;
-   unsigned long pin_mapped_count = 0, pin_mapped_purgeable_count = 0;
-   u64 pin_mapped_size = 0, pin_mapped_purgeable_size = 0;
+   u32 count, mapped_count, purgeable_count, dpy_count;
+   u64 size, mapped_size, purgeable_size, dpy_size;
struct drm_i915_gem_object *obj;
struct drm_file *file;
-   struct i915_vma *vma;
int ret;
 
ret = mutex_lock_interruptible(&dev->struct_mutex);
@@ -428,70 +403,51 @@ static int i915_gem_object_info(struct seq_file *m, void* 
data)
   dev_priv->mm.object_count,
   dev_priv->mm.object_memory);
 
-   size = count = mappable_size = mappable_count = 0;
-   count_objects(&dev_priv->mm.bound_list, global_list);
-   seq_printf(m, "%u [%u] objects, %llu [%llu] bytes in gtt\n",
-  count, mappable_count, size, mappable_size);
-
-   size = count = mappable_size = mappable_count = 0;
-   count_vmas(&ggtt->base.active_list, vm_link);
-   seq_printf(m, "  %u [%u] active objects, %llu [%llu] bytes\n",
-  count, mappable_count, size, mappable_size);
-
-   size = count = mappable_size = mappable_count = 0;
-   count_vmas(&ggtt->base.inactive_list, vm_link);
-   seq_printf(m, "  %u [%u] inactive objects, %llu [%llu] bytes\n",
-  count, mappable_count, size, mappable_size);
-
size = count = purgeable_size = purgeable_count = 0;
list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) {
-   size += obj->base.size, ++count;
-   if (obj->madv == I915_MADV_DONTNEED)
-   purgeable_size += obj->base.size, ++purgeable_count;
+   size += obj->base.size;
+   ++count;
+
+   if (obj->madv == I915_MADV_DONTNEED) {
+   purgeable_size += obj->base.size;
+   ++purgeable_count;
+   }
+
if (obj->mapping) {
-   pin_mapped_count++;
-   pin_mapped_size += obj->base.size;
-   if (obj->pages_pin_count == 0) {
-   pin_mapped_purgeable_count++;
-   pin_mapped_purgeable_size += obj->base.size;
-   }
+   mapped_count++;
+   mapped_size += obj->base.size;

[Intel-gfx] [CI 08/32] drm/i915: Always set the vma->pages

2016-08-15 Thread Chris Wilson
Previously, we would only set the vma->pages pointer for GGTT entries.
However, if we always set it, we can use it to prettify some code that
wants to access the backing store assigned to the VMA.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem.c |  8 
 drivers/gpu/drm/i915/i915_gem_gtt.c | 30 ++
 drivers/gpu/drm/i915/i915_gem_gtt.h |  3 +--
 3 files changed, 19 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f48c45080a65..45c45d3a6e31 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2868,12 +2868,12 @@ int i915_vma_unbind(struct i915_vma *vma)
if (i915_vma_is_ggtt(vma)) {
if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) {
obj->map_and_fenceable = false;
-   } else if (vma->ggtt_view.pages) {
-   sg_free_table(vma->ggtt_view.pages);
-   kfree(vma->ggtt_view.pages);
+   } else if (vma->pages) {
+   sg_free_table(vma->pages);
+   kfree(vma->pages);
}
-   vma->ggtt_view.pages = NULL;
}
+   vma->pages = NULL;
 
/* Since the unbound list is global, only move to that list if
 * no more VMAs exist. */
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index d876501694c6..9c178b0c40b5 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -170,11 +170,13 @@ static int ppgtt_bind_vma(struct i915_vma *vma,
 {
u32 pte_flags = 0;
 
+   vma->pages = vma->obj->pages;
+
/* Currently applicable only to VLV */
if (vma->obj->gt_ro)
pte_flags |= PTE_READ_ONLY;
 
-   vma->vm->insert_entries(vma->vm, vma->obj->pages, vma->node.start,
+   vma->vm->insert_entries(vma->vm, vma->pages, vma->node.start,
cache_level, pte_flags);
 
return 0;
@@ -2618,8 +2620,7 @@ static int ggtt_bind_vma(struct i915_vma *vma,
if (obj->gt_ro)
pte_flags |= PTE_READ_ONLY;
 
-   vma->vm->insert_entries(vma->vm, vma->ggtt_view.pages,
-   vma->node.start,
+   vma->vm->insert_entries(vma->vm, vma->pages, vma->node.start,
cache_level, pte_flags);
 
/*
@@ -2651,8 +2652,7 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
 
if (flags & I915_VMA_GLOBAL_BIND) {
vma->vm->insert_entries(vma->vm,
-   vma->ggtt_view.pages,
-   vma->node.start,
+   vma->pages, vma->node.start,
cache_level, pte_flags);
}
 
@@ -2660,8 +2660,7 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
struct i915_hw_ppgtt *appgtt =
to_i915(vma->vm->dev)->mm.aliasing_ppgtt;
appgtt->base.insert_entries(&appgtt->base,
-   vma->ggtt_view.pages,
-   vma->node.start,
+   vma->pages, vma->node.start,
cache_level, pte_flags);
}
 
@@ -3557,28 +3556,27 @@ i915_get_ggtt_vma_pages(struct i915_vma *vma)
 {
int ret = 0;
 
-   if (vma->ggtt_view.pages)
+   if (vma->pages)
return 0;
 
if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL)
-   vma->ggtt_view.pages = vma->obj->pages;
+   vma->pages = vma->obj->pages;
else if (vma->ggtt_view.type == I915_GGTT_VIEW_ROTATED)
-   vma->ggtt_view.pages =
+   vma->pages =

intel_rotate_fb_obj_pages(&vma->ggtt_view.params.rotated, vma->obj);
else if (vma->ggtt_view.type == I915_GGTT_VIEW_PARTIAL)
-   vma->ggtt_view.pages =
-   intel_partial_pages(&vma->ggtt_view, vma->obj);
+   vma->pages = intel_partial_pages(&vma->ggtt_view, vma->obj);
else
WARN_ONCE(1, "GGTT view %u not implemented!\n",
  vma->ggtt_view.type);
 
-   if (!vma->ggtt_view.pages) {
+   if (!vma->pages) {
DRM_ERROR("Failed to get pages for GGTT view type %u!\n",
  vma->ggtt_view.type);
ret = -EINVAL;
-   } else if (IS_ERR(vma->ggtt_view.pages)) {
-   ret = PTR_ERR(vma->ggtt_view.pages);
-   vma->ggtt_view.pages = NULL;
+   } else if (IS_ERR(vma->pages)) {
+   ret = PTR_ERR(vma->pages);
+   vma->pages = NULL;
DRM_ERROR("Failed to get pages for VMA view type %

[Intel-gfx] [CI 16/32] drm/i915: Only change the context object's domain when binding

2016-08-15 Thread Chris Wilson
We know that the only access to the context object is via the GPU, and
the only time when it can be out of the GPU domain is when it is swapped
out and unbound. Therefore we only need to clflush the object when
binding, thus avoiding any potential stall from touching the domain of an
active context.
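
As a sketch of the resulting ordering (condensed from the hunks below), the
flush is keyed off the VMA's global-bind flag and happens before the pin:

	/* Clear the page out of CPU caches only on its first bind. */
	if (!(vma->flags & I915_VMA_GLOBAL_BIND)) {
		ret = i915_gem_object_set_to_gtt_domain(vma->obj, false);
		if (ret)
			return ret;
	}

	ret = i915_vma_pin(vma, 0, to->ggtt_alignment, PIN_GLOBAL);
	if (ret)
		return ret;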

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_context.c | 19 +++
 drivers/gpu/drm/i915/intel_ringbuffer.c |  4 
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
index 3857ce097c84..824dfe14bcd0 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -772,6 +772,13 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
if (skip_rcs_switch(ppgtt, engine, to))
return 0;
 
+   /* Clear this page out of any CPU caches for coherent swap-in/out. */
+   if (!(vma->flags & I915_VMA_GLOBAL_BIND)) {
+   ret = i915_gem_object_set_to_gtt_domain(vma->obj, false);
+   if (ret)
+   return ret;
+   }
+
/* Trying to pin first makes error handling easier. */
ret = i915_vma_pin(vma, 0, to->ggtt_alignment, PIN_GLOBAL);
if (ret)
@@ -786,18 +793,6 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
 */
from = engine->last_context;
 
-   /*
-* Clear this page out of any CPU caches for coherent swap-in/out. Note
-* that thanks to write = false in this call and us not setting any gpu
-* write domains when putting a context object onto the active list
-* (when switching away from it), this won't block.
-*
-* XXX: We need a real interface to do this instead of trickery.
-*/
-   ret = i915_gem_object_set_to_gtt_domain(vma->obj, false);
-   if (ret)
-   goto err;
-
if (needs_pd_load_pre(ppgtt, engine, to)) {
/* Older GENs and non render rings still want the load first,
 * "PP_DCLV followed by PP_DIR_BASE register through Load
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 2318a27341c8..81dc69d1ff05 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2092,6 +2092,10 @@ static int intel_ring_context_pin(struct 
i915_gem_context *ctx,
return 0;
 
if (ce->state) {
+   ret = i915_gem_object_set_to_gtt_domain(ce->state->obj, false);
+   if (ret)
+   goto error;
+
ret = i915_vma_pin(ce->state, 0, ctx->ggtt_alignment,
   PIN_GLOBAL | PIN_HIGH);
if (ret)
-- 
2.8.1



[Intel-gfx] [CI 21/32] drm/i915: Move common seqno reset to intel_engine_cs.c

2016-08-15 Thread Chris Wilson
Since intel_engine_init_seqno() is shared by all engine submission
backends, move it out of the legacy intel_ringbuffer.c and into the new
home for common routines, intel_engine_cs.c.

Signed-off-by: Chris Wilson 
Reviewed-by: Matthew Auld 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/intel_engine_cs.c  | 42 +
 drivers/gpu/drm/i915/intel_ringbuffer.c | 42 -
 2 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
b/drivers/gpu/drm/i915/intel_engine_cs.c
index 7104dec5e893..829624571ca4 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -161,6 +161,48 @@ cleanup:
return ret;
 }
 
+void intel_engine_init_seqno(struct intel_engine_cs *engine, u32 seqno)
+{
+   struct drm_i915_private *dev_priv = engine->i915;
+
+   /* Our semaphore implementation is strictly monotonic (i.e. we proceed
+* so long as the semaphore value in the register/page is greater
+* than the sync value), so whenever we reset the seqno,
+* so long as we reset the tracking semaphore value to 0, it will
+* always be before the next request's seqno. If we don't reset
+* the semaphore value, then when the seqno moves backwards all
+* future waits will complete instantly (causing rendering corruption).
+*/
+   if (IS_GEN6(dev_priv) || IS_GEN7(dev_priv)) {
+   I915_WRITE(RING_SYNC_0(engine->mmio_base), 0);
+   I915_WRITE(RING_SYNC_1(engine->mmio_base), 0);
+   if (HAS_VEBOX(dev_priv))
+   I915_WRITE(RING_SYNC_2(engine->mmio_base), 0);
+   }
+   if (dev_priv->semaphore_obj) {
+   struct drm_i915_gem_object *obj = dev_priv->semaphore_obj;
+   struct page *page = i915_gem_object_get_dirty_page(obj, 0);
+   void *semaphores = kmap(page);
+   memset(semaphores + GEN8_SEMAPHORE_OFFSET(engine->id, 0),
+  0, I915_NUM_ENGINES * gen8_semaphore_seqno_size);
+   kunmap(page);
+   }
+   memset(engine->semaphore.sync_seqno, 0,
+  sizeof(engine->semaphore.sync_seqno));
+
+   intel_write_status_page(engine, I915_GEM_HWS_INDEX, seqno);
+   if (engine->irq_seqno_barrier)
+   engine->irq_seqno_barrier(engine);
+   engine->last_submitted_seqno = seqno;
+
+   engine->hangcheck.seqno = seqno;
+
+   /* After manually advancing the seqno, fake the interrupt in case
+* there are any waiters for that seqno.
+*/
+   intel_engine_wakeup(engine);
+}
+
 void intel_engine_init_hangcheck(struct intel_engine_cs *engine)
 {
memset(&engine->hangcheck, 0, sizeof(engine->hangcheck));
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index c89aea55bc10..6008d54b9152 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2314,48 +2314,6 @@ int intel_ring_cacheline_align(struct 
drm_i915_gem_request *req)
return 0;
 }
 
-void intel_engine_init_seqno(struct intel_engine_cs *engine, u32 seqno)
-{
-   struct drm_i915_private *dev_priv = engine->i915;
-
-   /* Our semaphore implementation is strictly monotonic (i.e. we proceed
-* so long as the semaphore value in the register/page is greater
-* than the sync value), so whenever we reset the seqno,
-* so long as we reset the tracking semaphore value to 0, it will
-* always be before the next request's seqno. If we don't reset
-* the semaphore value, then when the seqno moves backwards all
-* future waits will complete instantly (causing rendering corruption).
-*/
-   if (IS_GEN6(dev_priv) || IS_GEN7(dev_priv)) {
-   I915_WRITE(RING_SYNC_0(engine->mmio_base), 0);
-   I915_WRITE(RING_SYNC_1(engine->mmio_base), 0);
-   if (HAS_VEBOX(dev_priv))
-   I915_WRITE(RING_SYNC_2(engine->mmio_base), 0);
-   }
-   if (dev_priv->semaphore_obj) {
-   struct drm_i915_gem_object *obj = dev_priv->semaphore_obj;
-   struct page *page = i915_gem_object_get_dirty_page(obj, 0);
-   void *semaphores = kmap(page);
-   memset(semaphores + GEN8_SEMAPHORE_OFFSET(engine->id, 0),
-  0, I915_NUM_ENGINES * gen8_semaphore_seqno_size);
-   kunmap(page);
-   }
-   memset(engine->semaphore.sync_seqno, 0,
-  sizeof(engine->semaphore.sync_seqno));
-
-   intel_write_status_page(engine, I915_GEM_HWS_INDEX, seqno);
-   if (engine->irq_seqno_barrier)
-   engine->irq_seqno_barrier(engine);
-   engine->last_submitted_seqno = seqno;
-
-   engine->hangcheck.seqno = seqno;
-
-   /* After manually advancing the seqno, fake the interrupt in case
-* there 

[Intel-gfx] [CI 17/32] drm/i915: Move assertion for iomap access to i915_vma_pin_iomap

2016-08-15 Thread Chris Wilson
Access through the GTT requires the device to be awake. Ideally
i915_vma_pin_iomap() is short-lived and the pinning demarcates the
access through the iomap. This is not entirely true: we have a mixture
of long-lived pins that exceed the wakelock (such as legacy ringbuffers)
and short-lived pins that do live within the wakelock (such as execlist
ringbuffers).
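
For illustration, a sketch of the short-lived pattern the assertion expects;
intel_runtime_pm_get/put and i915_vma_unpin_iomap are pre-existing driver
helpers, not something added by this patch:

	void *addr;

	intel_runtime_pm_get(dev_priv);	/* GTT access needs the device awake */

	addr = (void __force *)i915_vma_pin_iomap(vma);
	if (!IS_ERR(addr)) {
		/* ... short-lived access through the iomap ... */
		i915_vma_unpin_iomap(vma);
	}

	intel_runtime_pm_put(dev_priv);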

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 3 ---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 1bec50bd651b..738a474c5afa 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3650,6 +3650,9 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
 {
void __iomem *ptr;
 
+   /* Access through the GTT requires the device to be awake. */
+   assert_rpm_wakelock_held(to_i915(vma->vm->dev));
+
lockdep_assert_held(&vma->vm->dev->struct_mutex);
if (WARN_ON(!vma->obj->map_and_fenceable))
return IO_ERR_PTR(-ENODEV);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 81dc69d1ff05..4a614e567353 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1966,9 +1966,6 @@ int intel_ring_pin(struct intel_ring *ring)
if (ret)
goto err_unpin;
 
-   /* Access through the GTT requires the device to be awake. */
-   assert_rpm_wakelock_held(dev_priv);
-
addr = (void __force *)
i915_vma_pin_iomap(i915_gem_obj_to_ggtt(obj));
if (IS_ERR(addr)) {
-- 
2.8.1



[Intel-gfx] [CI 30/32] drm/i915: Print the batchbuffer offset next to BBADDR in error state

2016-08-15 Thread Chris Wilson
It is useful when looking at captured error states to check the recorded
BBADDR register (the address of the last batchbuffer instruction loaded)
against the expected offset of the batch buffer, and so do a quick check
that (a) the capture is true or (b) HEAD hasn't wandered off into the
badlands.
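
As a sketch of the manual check this enables (using the fields added below),
one can test whether the recorded BBADDR lies inside the captured batch:

	u64 start = ee->batchbuffer->gtt_offset;
	u64 end = start + ee->batchbuffer->gtt_size;

	/* true: capture consistent and HEAD still within the batch */
	bool bbaddr_in_batch = ee->bbaddr >= start && ee->bbaddr < end;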

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_drv.h   |  1 +
 drivers/gpu/drm/i915/i915_gpu_error.c | 15 +--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d9f29244bafb..bb7d8130dbfd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -775,6 +775,7 @@ struct drm_i915_error_state {
struct drm_i915_error_object {
int page_count;
u64 gtt_offset;
+   u64 gtt_size;
u32 *pages[0];
} *ringbuffer, *batchbuffer, *wa_batchbuffer, *ctx, *hws_page;
 
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 638664f78dd5..0f0b65214ef1 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -242,8 +242,16 @@ static void error_print_engine(struct 
drm_i915_error_state_buf *m,
err_printf(m, "  IPEIR: 0x%08x\n", ee->ipeir);
err_printf(m, "  IPEHR: 0x%08x\n", ee->ipehr);
err_printf(m, "  INSTDONE: 0x%08x\n", ee->instdone);
+   if (ee->batchbuffer) {
+   u64 start = ee->batchbuffer->gtt_offset;
+   u64 end = start + ee->batchbuffer->gtt_size;
+
+   err_printf(m, "  batch: [0x%08x_%08x, 0x%08x_%08x]\n",
+  upper_32_bits(start), lower_32_bits(start),
+  upper_32_bits(end), lower_32_bits(end));
+   }
if (INTEL_GEN(m->i915) >= 4) {
-   err_printf(m, "  BBADDR: 0x%08x %08x\n",
+   err_printf(m, "  BBADDR: 0x%08x_%08x\n",
   (u32)(ee->bbaddr>>32), (u32)ee->bbaddr);
err_printf(m, "  BB_STATE: 0x%08x\n", ee->bbstate);
err_printf(m, "  INSTPS: 0x%08x\n", ee->instps);
@@ -677,7 +685,10 @@ i915_error_object_create(struct drm_i915_private *dev_priv,
if (!dst)
return NULL;
 
-   reloc_offset = dst->gtt_offset = vma->node.start;
+   dst->gtt_offset = vma->node.start;
+   dst->gtt_size = vma->node.size;
+
+   reloc_offset = dst->gtt_offset;
use_ggtt = (src->cache_level == I915_CACHE_NONE &&
   (vma->flags & I915_VMA_GLOBAL_BIND) &&
   reloc_offset + num_pages * PAGE_SIZE <= ggtt->mappable_end);
-- 
2.8.1



[Intel-gfx] [CI 18/32] drm/i915: Use VMA for ringbuffer tracking

2016-08-15 Thread Chris Wilson
Use the GGTT VMA as the primary cookie for handling ring objects, as
the most common actions upon the ring (mapping and unmapping) act
upon the VMA itself. By restructuring the code to work with the ring
VMA, we can shrink the code and remove a few cycles from context pinning.

v2: Move the flush of the object back to before the first pin. We use
the am-I-bound? query to only have to check the flush on the first
bind and so avoid stalling on active rings.
Lots of little renames and small hoops.
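
A rough sketch (not a hunk from this patch) of what pinning and mapping a
ring looks like once the VMA is the cookie; the flag choice and error
unwinding here are illustrative only:

	ret = i915_vma_pin(ring->vma, 0, PAGE_SIZE, PIN_GLOBAL | PIN_MAPPABLE);
	if (ret)
		return ret;

	addr = (void __force *)i915_vma_pin_iomap(ring->vma);
	if (IS_ERR(addr)) {
		i915_vma_unpin(ring->vma);
		return PTR_ERR(addr);
	}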

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c|   2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c  |   4 +-
 drivers/gpu/drm/i915/i915_guc_submission.c |  16 +-
 drivers/gpu/drm/i915/intel_lrc.c   |  17 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c| 243 ++---
 drivers/gpu/drm/i915/intel_ringbuffer.h|  14 +-
 6 files changed, 139 insertions(+), 157 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index f05f8504a4fa..9e44d9eb8e76 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -356,7 +356,7 @@ static int per_file_ctx_stats(int id, void *ptr, void *data)
if (ctx->engine[n].state)
per_file_stats(0, ctx->engine[n].state->obj, data);
if (ctx->engine[n].ring)
-   per_file_stats(0, ctx->engine[n].ring->obj, data);
+   per_file_stats(0, ctx->engine[n].ring->vma->obj, data);
}
 
return 0;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 61708faebf79..27f973fbe80f 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1128,12 +1128,12 @@ static void i915_gem_record_rings(struct 
drm_i915_private *dev_priv,
ee->cpu_ring_tail = ring->tail;
ee->ringbuffer =
i915_error_ggtt_object_create(dev_priv,
- ring->obj);
+ ring->vma->obj);
}
 
ee->hws_page =
i915_error_ggtt_object_create(dev_priv,
- engine->status_page.obj);
+ 
engine->status_page.vma->obj);
 
ee->wa_ctx = i915_error_ggtt_object_create(dev_priv,
   engine->wa_ctx.obj);
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 4f0f173f9754..c40b92e212fa 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -343,7 +343,6 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
struct intel_context *ce = &ctx->engine[engine->id];
uint32_t guc_engine_id = engine->guc_id;
struct guc_execlist_context *lrc = &desc.lrc[guc_engine_id];
-   struct drm_i915_gem_object *obj;
 
/* TODO: We have a design issue to be solved here. Only when we
 * receive the first batch, we know which engine is used by the
@@ -358,17 +357,14 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
lrc->context_desc = lower_32_bits(ce->lrc_desc);
 
/* The state page is after PPHWSP */
-   gfx_addr = ce->state->node.start;
-   lrc->ring_lcra = gfx_addr + LRC_STATE_PN * PAGE_SIZE;
+   lrc->ring_lcra =
+   ce->state->node.start + LRC_STATE_PN * PAGE_SIZE;
lrc->context_id = (client->ctx_index << GUC_ELC_CTXID_OFFSET) |
(guc_engine_id << GUC_ELC_ENGINE_OFFSET);
 
-   obj = ce->ring->obj;
-   gfx_addr = i915_gem_obj_ggtt_offset(obj);
-
-   lrc->ring_begin = gfx_addr;
-   lrc->ring_end = gfx_addr + obj->base.size - 1;
-   lrc->ring_next_free_location = gfx_addr;
+   lrc->ring_begin = ce->ring->vma->node.start;
+   lrc->ring_end = lrc->ring_begin + ce->ring->size - 1;
+   lrc->ring_next_free_location = lrc->ring_begin;
lrc->ring_current_tail_pointer_value = 0;
 
desc.engines_used |= (1 << guc_engine_id);
@@ -943,7 +939,7 @@ static void guc_create_ads(struct intel_guc *guc)
 * to find it.
 */
engine = &dev_priv->engine[RCS];
-   ads->golden_context_lrca = engine->status_page.gfx_addr;
+   ads->golden_context_lrca = engine->status_page.ggtt_offset;
 
for_each_engine(engine, dev_priv)
ads->eng_state_size[engine->guc_id] = 
intel_lr_context_size(engine);
diff --git a/drivers/gpu/drm/i915/int

[Intel-gfx] [CI 15/32] drm/i915: Use VMA as the primary object for context state

2016-08-15 Thread Chris Wilson
When working with contexts, what we want first and foremost is the GGTT
VMA for the context state. Since the object is available via the VMA, we
then need only store the VMA.

v2: Formatting tweaks to debugfs output, restored some comments removed
in the next patch
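
For illustration (the pin call below appears verbatim in a later patch of
this series; ggtt_offset is an illustrative local), the context image is
then pinned and addressed directly through ce->state:

	ret = i915_vma_pin(ce->state, 0, ctx->ggtt_alignment,
			   PIN_GLOBAL | PIN_HIGH);
	if (ret)
		return ret;

	/* GGTT address of the image, e.g. for the LRC descriptor */
	ggtt_offset = ce->state->node.start;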

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c| 34 
 drivers/gpu/drm/i915/i915_drv.h|  3 +-
 drivers/gpu/drm/i915/i915_gem_context.c| 51 +---
 drivers/gpu/drm/i915/i915_gpu_error.c  |  7 ++--
 drivers/gpu/drm/i915/i915_guc_submission.c |  6 +--
 drivers/gpu/drm/i915/intel_lrc.c   | 64 +++---
 drivers/gpu/drm/i915/intel_ringbuffer.c|  6 +--
 7 files changed, 86 insertions(+), 85 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 0ae61e94ce04..f05f8504a4fa 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -354,7 +354,7 @@ static int per_file_ctx_stats(int id, void *ptr, void *data)
 
for (n = 0; n < ARRAY_SIZE(ctx->engine); n++) {
if (ctx->engine[n].state)
-   per_file_stats(0, ctx->engine[n].state, data);
+   per_file_stats(0, ctx->engine[n].state->obj, data);
if (ctx->engine[n].ring)
per_file_stats(0, ctx->engine[n].ring->obj, data);
}
@@ -1977,7 +1977,7 @@ static int i915_context_status(struct seq_file *m, void 
*unused)
seq_printf(m, "%s: ", engine->name);
seq_putc(m, ce->initialised ? 'I' : 'i');
if (ce->state)
-   describe_obj(m, ce->state);
+   describe_obj(m, ce->state->obj);
if (ce->ring)
describe_ctx_ring(m, ce->ring);
seq_putc(m, '\n');
@@ -1995,36 +1995,34 @@ static void i915_dump_lrc_obj(struct seq_file *m,
  struct i915_gem_context *ctx,
  struct intel_engine_cs *engine)
 {
-   struct drm_i915_gem_object *ctx_obj = ctx->engine[engine->id].state;
+   struct i915_vma *vma = ctx->engine[engine->id].state;
struct page *page;
-   uint32_t *reg_state;
int j;
-   unsigned long ggtt_offset = 0;
 
seq_printf(m, "CONTEXT: %s %u\n", engine->name, ctx->hw_id);
 
-   if (ctx_obj == NULL) {
-   seq_puts(m, "\tNot allocated\n");
+   if (!vma) {
+   seq_puts(m, "\tFake context\n");
return;
}
 
-   if (!i915_gem_obj_ggtt_bound(ctx_obj))
-   seq_puts(m, "\tNot bound in GGTT\n");
-   else
-   ggtt_offset = i915_gem_obj_ggtt_offset(ctx_obj);
+   if (vma->flags & I915_VMA_GLOBAL_BIND)
+   seq_printf(m, "\tBound in GGTT at 0x%08x\n",
+  lower_32_bits(vma->node.start));
 
-   if (i915_gem_object_get_pages(ctx_obj)) {
-   seq_puts(m, "\tFailed to get pages for context object\n");
+   if (i915_gem_object_get_pages(vma->obj)) {
+   seq_puts(m, "\tFailed to get pages for context object\n\n");
return;
}
 
-   page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
-   if (!WARN_ON(page == NULL)) {
-   reg_state = kmap_atomic(page);
+   page = i915_gem_object_get_page(vma->obj, LRC_STATE_PN);
+   if (page) {
+   u32 *reg_state = kmap_atomic(page);
 
for (j = 0; j < 0x600 / sizeof(u32) / 4; j += 4) {
-   seq_printf(m, "\t[0x%08lx] 0x%08x 0x%08x 0x%08x 
0x%08x\n",
-  ggtt_offset + 4096 + (j * 4),
+   seq_printf(m,
+  "\t[0x%04x] 0x%08x 0x%08x 0x%08x 0x%08x\n",
+  j * 4,
   reg_state[j], reg_state[j + 1],
   reg_state[j + 2], reg_state[j + 3]);
}
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3285c8e2c87a..259425d99e17 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -893,9 +893,8 @@ struct i915_gem_context {
u32 ggtt_alignment;
 
struct intel_context {
-   struct drm_i915_gem_object *state;
+   struct i915_vma *state;
struct intel_ring *ring;
-   struct i915_vma *lrc_vma;
uint32_t *lrc_reg_state;
u64 lrc_desc;
int pin_count;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
index 547caf26a6b9..3857ce097c84 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -155,7 +155

[Intel-gfx] [CI 25/32] drm/i915: Use VMA for wa_ctx tracking

2016-08-15 Thread Chris Wilson
Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gpu_error.c   |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c| 58 ++---
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 +--
 3 files changed, 35 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 4068630bfc68..5e7734ca4579 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1134,7 +1134,7 @@ static void i915_gem_record_rings(struct drm_i915_private 
*dev_priv,
  
engine->status_page.vma->obj);
 
ee->wa_ctx = i915_error_ggtt_object_create(dev_priv,
-  engine->wa_ctx.obj);
+  
engine->wa_ctx.vma->obj);
 
count = 0;
list_for_each_entry(request, &engine->request_list, link)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 56c904e2dc98..64cb04e63512 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1165,45 +1165,51 @@ static int gen9_init_perctx_bb(struct intel_engine_cs 
*engine,
 
 static int lrc_setup_wa_ctx_obj(struct intel_engine_cs *engine, u32 size)
 {
-   int ret;
+   struct drm_i915_gem_object *obj;
+   struct i915_vma *vma;
+   int err;
 
-   engine->wa_ctx.obj = i915_gem_object_create(&engine->i915->drm,
-   PAGE_ALIGN(size));
-   if (IS_ERR(engine->wa_ctx.obj)) {
-   DRM_DEBUG_DRIVER("alloc LRC WA ctx backing obj failed.\n");
-   ret = PTR_ERR(engine->wa_ctx.obj);
-   engine->wa_ctx.obj = NULL;
-   return ret;
-   }
+   obj = i915_gem_object_create(&engine->i915->drm, PAGE_ALIGN(size));
+   if (IS_ERR(obj))
+   return PTR_ERR(obj);
 
-   ret = i915_gem_object_ggtt_pin(engine->wa_ctx.obj, NULL,
-  0, PAGE_SIZE, PIN_HIGH);
-   if (ret) {
-   DRM_DEBUG_DRIVER("pin LRC WA ctx backing obj failed: %d\n",
-ret);
-   i915_gem_object_put(engine->wa_ctx.obj);
-   return ret;
+   vma = i915_vma_create(obj, &engine->i915->ggtt.base, NULL);
+   if (IS_ERR(vma)) {
+   err = PTR_ERR(vma);
+   goto err;
}
 
+   err = i915_vma_pin(vma, 0, PAGE_SIZE, PIN_GLOBAL | PIN_HIGH);
+   if (err)
+   goto err;
+
+   engine->wa_ctx.vma = vma;
return 0;
+
+err:
+   i915_gem_object_put(obj);
+   return err;
 }
 
 static void lrc_destroy_wa_ctx_obj(struct intel_engine_cs *engine)
 {
-   if (engine->wa_ctx.obj) {
-   i915_gem_object_ggtt_unpin(engine->wa_ctx.obj);
-   i915_gem_object_put(engine->wa_ctx.obj);
-   engine->wa_ctx.obj = NULL;
-   }
+   struct i915_vma *vma;
+
+   vma = fetch_and_zero(&engine->wa_ctx.vma);
+   if (!vma)
+   return;
+
+   i915_vma_unpin(vma);
+   i915_vma_put(vma);
 }
 
 static int intel_init_workaround_bb(struct intel_engine_cs *engine)
 {
-   int ret;
+   struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx;
uint32_t *batch;
uint32_t offset;
struct page *page;
-   struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx;
+   int ret;
 
WARN_ON(engine->id != RCS);
 
@@ -1226,7 +1232,7 @@ static int intel_init_workaround_bb(struct 
intel_engine_cs *engine)
return ret;
}
 
-   page = i915_gem_object_get_dirty_page(wa_ctx->obj, 0);
+   page = i915_gem_object_get_dirty_page(wa_ctx->vma->obj, 0);
batch = kmap_atomic(page);
offset = 0;
 
@@ -2019,9 +2025,9 @@ populate_lr_context(struct i915_gem_context *ctx,
   RING_INDIRECT_CTX(engine->mmio_base), 0);
ASSIGN_CTX_REG(reg_state, CTX_RCS_INDIRECT_CTX_OFFSET,
   RING_INDIRECT_CTX_OFFSET(engine->mmio_base), 0);
-   if (engine->wa_ctx.obj) {
+   if (engine->wa_ctx.vma) {
struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx;
-   uint32_t ggtt_offset = 
i915_gem_obj_ggtt_offset(wa_ctx->obj);
+   u32 ggtt_offset = wa_ctx->vma->node.start;
 
reg_state[CTX_RCS_INDIRECT_CTX+1] =
(ggtt_offset + wa_ctx->indirect_ctx.offset * 
sizeof(uint32_t)) |
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h 
b/drivers/gpu/drm/i915/intel_ringbuffer.h
index cb40785e7677..e3777572c70e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -123,12 +123,12 @@ struct drm_i915_reg_table;
  *an option for fu

[Intel-gfx] [CI 24/32] drm/i915: Use VMA for render state page tracking

2016-08-15 Thread Chris Wilson
Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_render_state.c | 40 +++-
 drivers/gpu/drm/i915/i915_gem_render_state.h |  2 +-
 2 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c 
b/drivers/gpu/drm/i915/i915_gem_render_state.c
index 57fd767a2d79..95b7e9afd5f8 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -30,8 +30,7 @@
 
 struct render_state {
const struct intel_renderstate_rodata *rodata;
-   struct drm_i915_gem_object *obj;
-   u64 ggtt_offset;
+   struct i915_vma *vma;
u32 aux_batch_size;
u32 aux_batch_offset;
 };
@@ -73,7 +72,7 @@ render_state_get_rodata(const struct drm_i915_gem_request 
*req)
 
 static int render_state_setup(struct render_state *so)
 {
-   struct drm_device *dev = so->obj->base.dev;
+   struct drm_device *dev = so->vma->vm->dev;
const struct intel_renderstate_rodata *rodata = so->rodata;
const bool has_64bit_reloc = INTEL_GEN(dev) >= 8;
unsigned int i = 0, reloc_index = 0;
@@ -81,18 +80,18 @@ static int render_state_setup(struct render_state *so)
u32 *d;
int ret;
 
-   ret = i915_gem_object_set_to_cpu_domain(so->obj, true);
+   ret = i915_gem_object_set_to_cpu_domain(so->vma->obj, true);
if (ret)
return ret;
 
-   page = i915_gem_object_get_dirty_page(so->obj, 0);
+   page = i915_gem_object_get_dirty_page(so->vma->obj, 0);
d = kmap(page);
 
while (i < rodata->batch_items) {
u32 s = rodata->batch[i];
 
if (i * 4  == rodata->reloc[reloc_index]) {
-   u64 r = s + so->ggtt_offset;
+   u64 r = s + so->vma->node.start;
s = lower_32_bits(r);
if (has_64bit_reloc) {
if (i + 1 >= rodata->batch_items ||
@@ -154,7 +153,7 @@ static int render_state_setup(struct render_state *so)
 
kunmap(page);
 
-   ret = i915_gem_object_set_to_gtt_domain(so->obj, false);
+   ret = i915_gem_object_set_to_gtt_domain(so->vma->obj, false);
if (ret)
return ret;
 
@@ -175,6 +174,7 @@ err_out:
 int i915_gem_render_state_init(struct drm_i915_gem_request *req)
 {
struct render_state so;
+   struct drm_i915_gem_object *obj;
int ret;
 
if (WARN_ON(req->engine->id != RCS))
@@ -187,21 +187,25 @@ int i915_gem_render_state_init(struct 
drm_i915_gem_request *req)
if (so.rodata->batch_items * 4 > 4096)
return -EINVAL;
 
-   so.obj = i915_gem_object_create(&req->i915->drm, 4096);
-   if (IS_ERR(so.obj))
-   return PTR_ERR(so.obj);
+   obj = i915_gem_object_create(&req->i915->drm, 4096);
+   if (IS_ERR(obj))
+   return PTR_ERR(obj);
 
-   ret = i915_gem_object_ggtt_pin(so.obj, NULL, 0, 0, 0);
-   if (ret)
+   so.vma = i915_vma_create(obj, &req->i915->ggtt.base, NULL);
+   if (IS_ERR(so.vma)) {
+   ret = PTR_ERR(so.vma);
goto err_obj;
+   }
 
-   so.ggtt_offset = i915_gem_obj_ggtt_offset(so.obj);
+   ret = i915_vma_pin(so.vma, 0, 0, PIN_GLOBAL);
+   if (ret)
+   goto err_obj;
 
ret = render_state_setup(&so);
if (ret)
goto err_unpin;
 
-   ret = req->engine->emit_bb_start(req, so.ggtt_offset,
+   ret = req->engine->emit_bb_start(req, so.vma->node.start,
 so.rodata->batch_items * 4,
 I915_DISPATCH_SECURE);
if (ret)
@@ -209,7 +213,7 @@ int i915_gem_render_state_init(struct drm_i915_gem_request 
*req)
 
if (so.aux_batch_size > 8) {
ret = req->engine->emit_bb_start(req,
-(so.ggtt_offset +
+(so.vma->node.start +
  so.aux_batch_offset),
 so.aux_batch_size,
 I915_DISPATCH_SECURE);
@@ -217,10 +221,10 @@ int i915_gem_render_state_init(struct 
drm_i915_gem_request *req)
goto err_unpin;
}
 
-   i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), req, 0);
+   i915_vma_move_to_active(so.vma, req, 0);
 err_unpin:
-   i915_gem_object_ggtt_unpin(so.obj);
+   i915_vma_unpin(so.vma);
 err_obj:
-   i915_gem_object_put(so.obj);
+   i915_gem_object_put(obj);
return ret;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.h 
b/drivers/gpu/drm/i915/i915_gem_render_state.h
index c44fca8599bb..18cce3f06e9c 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.h
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.h
@@ -24,7 

[Intel-gfx] [CI 22/32] drm/i915/overlay: Use VMA as the primary tracker for images

2016-08-15 Thread Chris Wilson
Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/intel_overlay.c | 39 
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_overlay.c 
b/drivers/gpu/drm/i915/intel_overlay.c
index 90f3ab424e01..d930e3a4a9cd 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -171,8 +171,8 @@ struct overlay_registers {
 struct intel_overlay {
struct drm_i915_private *i915;
struct intel_crtc *crtc;
-   struct drm_i915_gem_object *vid_bo;
-   struct drm_i915_gem_object *old_vid_bo;
+   struct i915_vma *vma;
+   struct i915_vma *old_vma;
bool active;
bool pfit_active;
u32 pfit_vscale_ratio; /* shifted-point number, (1<<12) == 1.0 */
@@ -317,15 +317,17 @@ static void intel_overlay_release_old_vid_tail(struct 
i915_gem_active *active,
 {
struct intel_overlay *overlay =
container_of(active, typeof(*overlay), last_flip);
-   struct drm_i915_gem_object *obj = overlay->old_vid_bo;
+   struct i915_vma *vma;
 
-   i915_gem_track_fb(obj, NULL,
- INTEL_FRONTBUFFER_OVERLAY(overlay->crtc->pipe));
+   vma = fetch_and_zero(&overlay->old_vma);
+   if (WARN_ON(!vma))
+   return;
 
-   i915_gem_object_ggtt_unpin(obj);
-   i915_gem_object_put(obj);
+   i915_gem_track_fb(vma->obj, NULL,
+ INTEL_FRONTBUFFER_OVERLAY(overlay->crtc->pipe));
 
-   overlay->old_vid_bo = NULL;
+   i915_gem_object_unpin_from_display_plane(vma->obj, 
&i915_ggtt_view_normal);
+   i915_vma_put(vma);
 }
 
 static void intel_overlay_off_tail(struct i915_gem_active *active,
@@ -333,15 +335,15 @@ static void intel_overlay_off_tail(struct i915_gem_active 
*active,
 {
struct intel_overlay *overlay =
container_of(active, typeof(*overlay), last_flip);
-   struct drm_i915_gem_object *obj = overlay->vid_bo;
+   struct i915_vma *vma;
 
/* never have the overlay hw on without showing a frame */
-   if (WARN_ON(!obj))
+   vma = fetch_and_zero(&overlay->vma);
+   if (WARN_ON(!vma))
return;
 
-   i915_gem_object_ggtt_unpin(obj);
-   i915_gem_object_put(obj);
-   overlay->vid_bo = NULL;
+   i915_gem_object_unpin_from_display_plane(vma->obj, 
&i915_ggtt_view_normal);
+   i915_vma_put(vma);
 
overlay->crtc->overlay = NULL;
overlay->crtc = NULL;
@@ -421,7 +423,7 @@ static int intel_overlay_release_old_vid(struct 
intel_overlay *overlay)
/* Only wait if there is actually an old frame to release to
 * guarantee forward progress.
 */
-   if (!overlay->old_vid_bo)
+   if (!overlay->old_vma)
return 0;
 
if (I915_READ(ISR) & I915_OVERLAY_PLANE_FLIP_PENDING_INTERRUPT) {
@@ -744,6 +746,7 @@ static int intel_overlay_do_put_image(struct intel_overlay 
*overlay,
struct drm_i915_private *dev_priv = overlay->i915;
u32 swidth, swidthsw, sheight, ostride;
enum pipe pipe = overlay->crtc->pipe;
+   struct i915_vma *vma;
 
lockdep_assert_held(&dev_priv->drm.struct_mutex);

WARN_ON(!drm_modeset_is_locked(&dev_priv->drm.mode_config.connection_mutex));
@@ -757,6 +760,8 @@ static int intel_overlay_do_put_image(struct intel_overlay 
*overlay,
if (ret != 0)
return ret;
 
+   vma = i915_gem_obj_to_ggtt_view(new_bo, &i915_ggtt_view_normal);
+
ret = i915_gem_object_put_fence(new_bo);
if (ret)
goto out_unpin;
@@ -834,11 +839,11 @@ static int intel_overlay_do_put_image(struct 
intel_overlay *overlay,
if (ret)
goto out_unpin;
 
-   i915_gem_track_fb(overlay->vid_bo, new_bo,
+   i915_gem_track_fb(overlay->vma->obj, new_bo,
  INTEL_FRONTBUFFER_OVERLAY(pipe));
 
-   overlay->old_vid_bo = overlay->vid_bo;
-   overlay->vid_bo = new_bo;
+   overlay->old_vma = overlay->vma;
+   overlay->vma = vma;
 
intel_frontbuffer_flip(dev_priv, INTEL_FRONTBUFFER_OVERLAY(pipe));
 
-- 
2.8.1



[Intel-gfx] [CI 20/32] drm/i915: Move common scratch allocation/destroy to intel_engine_cs.c

2016-08-15 Thread Chris Wilson
Since the scratch allocation and cleanup are shared by all engine
submission backends, move them out of the legacy intel_ringbuffer.c and
into the new home for common routines, intel_engine_cs.c.
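
As a usage sketch (the 4 KiB size and the local name are illustrative), a
backend now asks the common code for its scratch page and reads the GGTT
offset straight from the VMA:

	ret = intel_engine_create_scratch(engine, 4096);
	if (ret)
		return ret;

	/* e.g. the address programmed into PIPE_CONTROL writes */
	scratch_offset = engine->scratch->node.start;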

Signed-off-by: Chris Wilson 
Reviewed-by: Matthew Auld 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/intel_engine_cs.c  | 50 +
 drivers/gpu/drm/i915/intel_lrc.c|  1 -
 drivers/gpu/drm/i915/intel_ringbuffer.c | 50 -
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 +--
 4 files changed, 51 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
b/drivers/gpu/drm/i915/intel_engine_cs.c
index 186c12d07f99..7104dec5e893 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -195,6 +195,54 @@ void intel_engine_setup_common(struct intel_engine_cs 
*engine)
i915_gem_batch_pool_init(engine, &engine->batch_pool);
 }
 
+int intel_engine_create_scratch(struct intel_engine_cs *engine, int size)
+{
+   struct drm_i915_gem_object *obj;
+   struct i915_vma *vma;
+   int ret;
+
+   WARN_ON(engine->scratch);
+
+   obj = i915_gem_object_create_stolen(&engine->i915->drm, size);
+   if (!obj)
+   obj = i915_gem_object_create(&engine->i915->drm, size);
+   if (IS_ERR(obj)) {
+   DRM_ERROR("Failed to allocate scratch page\n");
+   return PTR_ERR(obj);
+   }
+
+   vma = i915_vma_create(obj, &engine->i915->ggtt.base, NULL);
+   if (IS_ERR(vma)) {
+   ret = PTR_ERR(vma);
+   goto err_unref;
+   }
+
+   ret = i915_vma_pin(vma, 0, 4096, PIN_GLOBAL | PIN_HIGH);
+   if (ret)
+   goto err_unref;
+
+   engine->scratch = vma;
+   DRM_DEBUG_DRIVER("%s pipe control offset: 0x%08llx\n",
+engine->name, vma->node.start);
+   return 0;
+
+err_unref:
+   i915_gem_object_put(obj);
+   return ret;
+}
+
+static void intel_engine_cleanup_scratch(struct intel_engine_cs *engine)
+{
+   struct i915_vma *vma;
+
+   vma = fetch_and_zero(&engine->scratch);
+   if (!vma)
+   return;
+
+   i915_vma_unpin(vma);
+   i915_vma_put(vma);
+}
+
 /**
  * intel_engines_init_common - initialize cengine state which might require hw 
access
  * @engine: Engine to initialize.
@@ -226,6 +274,8 @@ int intel_engine_init_common(struct intel_engine_cs *engine)
  */
 void intel_engine_cleanup_common(struct intel_engine_cs *engine)
 {
+   intel_engine_cleanup_scratch(engine);
+
intel_engine_cleanup_cmd_parser(engine);
intel_engine_fini_breadcrumbs(engine);
i915_gem_batch_pool_fini(&engine->batch_pool);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 42999ba02152..56c904e2dc98 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1844,7 +1844,6 @@ int logical_render_ring_init(struct intel_engine_cs 
*engine)
else
engine->init_hw = gen8_init_render_ring;
engine->init_context = gen8_init_rcs_context;
-   engine->cleanup = intel_engine_cleanup_scratch;
engine->emit_flush = gen8_emit_flush_render;
engine->emit_request = gen8_emit_request_render;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 7ce912f8d96c..c89aea55bc10 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -613,54 +613,6 @@ out:
return ret;
 }
 
-void intel_engine_cleanup_scratch(struct intel_engine_cs *engine)
-{
-   struct i915_vma *vma;
-
-   vma = fetch_and_zero(&engine->scratch);
-   if (!vma)
-   return;
-
-   i915_vma_unpin(vma);
-   i915_vma_put(vma);
-}
-
-int intel_engine_create_scratch(struct intel_engine_cs *engine, int size)
-{
-   struct drm_i915_gem_object *obj;
-   struct i915_vma *vma;
-   int ret;
-
-   WARN_ON(engine->scratch);
-
-   obj = i915_gem_object_create_stolen(&engine->i915->drm, size);
-   if (!obj)
-   obj = i915_gem_object_create(&engine->i915->drm, size);
-   if (IS_ERR(obj)) {
-   DRM_ERROR("Failed to allocate scratch page\n");
-   return PTR_ERR(obj);
-   }
-
-   vma = i915_vma_create(obj, &engine->i915->ggtt.base, NULL);
-   if (IS_ERR(vma)) {
-   ret = PTR_ERR(vma);
-   goto err_unref;
-   }
-
-   ret = i915_vma_pin(vma, 0, 4096, PIN_GLOBAL | PIN_HIGH);
-   if (ret)
-   goto err_unref;
-
-   engine->scratch = vma;
-   DRM_DEBUG_DRIVER("%s pipe control offset: 0x%08llx\n",
-engine->name, vma->node.start);
-   return 0;
-
-err_unref:
-   i915_gem_object_put(obj);
-   return ret;
-}
-
 static int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
 {
struct intel_r

[Intel-gfx] [CI 11/32] drm/i915: Add convenience wrappers for vma's object get/put

2016-08-15 Thread Chris Wilson
VMAs are unreferenced: they belong to the object and live until they
are closed. However, if we want to use a VMA as a cookie that keeps the
object alive, we need to hold a reference to the object for the lifetime
of the VMA cookie. To facilitate this, add a couple of simple wrappers
for managing the reference count on the object owning the
VMA.
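
A minimal sketch of the intended usage (stash stands in for whatever
structure holds the cookie; struct_mutex is assumed held, as the lockdep
assertion in i915_vma_put() demands):

	/* Keep the object alive for as long as we hold the vma cookie. */
	stash->vma = i915_vma_get(vma);

	/* ... use stash->vma ... */

	/* Done with the cookie: drop the object reference. */
	i915_vma_put(stash->vma);
	stash->vma = NULL;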

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_drv.h| 12 
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  4 ++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 855833a6306a..3285c8e2c87a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2373,6 +2373,18 @@ i915_gem_object_get_stride(struct drm_i915_gem_object 
*obj)
return obj->tiling_and_stride & STRIDE_MASK;
 }
 
+static inline struct i915_vma *i915_vma_get(struct i915_vma *vma)
+{
+   i915_gem_object_get(vma->obj);
+   return vma;
+}
+
+static inline void i915_vma_put(struct i915_vma *vma)
+{
+   lockdep_assert_held(&vma->vm->dev->struct_mutex);
+   i915_gem_object_put(vma->obj);
+}
+
 /*
  * Optimised SGL iterator for GEM objects
  */
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index c8d13fea4b25..ced05878b405 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -271,7 +271,7 @@ static void eb_destroy(struct eb_vmas *eb)
   exec_list);
list_del_init(&vma->exec_list);
i915_gem_execbuffer_unreserve_vma(vma);
-   i915_gem_object_put(vma->obj);
+   i915_vma_put(vma);
}
kfree(eb);
 }
@@ -900,7 +900,7 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
vma = list_first_entry(&eb->vmas, struct i915_vma, exec_list);
list_del_init(&vma->exec_list);
i915_gem_execbuffer_unreserve_vma(vma);
-   i915_gem_object_put(vma->obj);
+   i915_vma_put(vma);
}
 
mutex_unlock(&dev->struct_mutex);
-- 
2.8.1



[Intel-gfx] [CI 27/32] drm/i915: Track pinned VMA

2016-08-15 Thread Chris Wilson
Treat the VMA as the primary struct responsible for tracking bindings
into the GPU's VM. That is, we want to treat the VMA returned after we
pin an object into the VM as the cookie we hold and eventually release
when unpinning. Doing so eliminates the ambiguity of pinning the object
and then searching for the relevant pin later.

v2: Joonas' stylistic nitpicks, a fun rebase.
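
To illustrate the new calling convention (a sketch built from the prototypes
in this patch, not one of its hunks), pinning for display hands back the VMA,
which is all the caller needs to keep and later release:

	struct i915_vma *vma;

	vma = i915_gem_object_pin_to_display_plane(obj, alignment, &view);
	if (IS_ERR(vma))
		return PTR_ERR(vma);

	/* ... scanout is programmed from vma->node.start ... */

	i915_gem_object_unpin_from_display_plane(vma);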

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c|   2 +-
 drivers/gpu/drm/i915/i915_drv.h|  60 ++--
 drivers/gpu/drm/i915/i915_gem.c| 233 -
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  65 
 drivers/gpu/drm/i915/i915_gem_fence.c  |  14 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c|  73 +
 drivers/gpu/drm/i915/i915_gem_gtt.h|  14 --
 drivers/gpu/drm/i915/i915_gem_request.c|   2 +-
 drivers/gpu/drm/i915/i915_gem_request.h|   2 +-
 drivers/gpu/drm/i915/i915_gem_stolen.c |   2 +-
 drivers/gpu/drm/i915/i915_gem_tiling.c |   2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c  |  58 +++
 drivers/gpu/drm/i915/intel_display.c   |  57 ---
 drivers/gpu/drm/i915/intel_drv.h   |   5 +-
 drivers/gpu/drm/i915/intel_fbc.c   |   2 +-
 drivers/gpu/drm/i915/intel_fbdev.c |  19 +--
 drivers/gpu/drm/i915/intel_guc_loader.c|  21 +--
 drivers/gpu/drm/i915/intel_overlay.c   |  32 ++--
 18 files changed, 266 insertions(+), 397 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index cee15b3db6ed..6d73bdf069f0 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -105,7 +105,7 @@ static char get_tiling_flag(struct drm_i915_gem_object *obj)
 
 static char get_global_flag(struct drm_i915_gem_object *obj)
 {
-   return i915_gem_obj_to_ggtt(obj) ? 'g' : ' ';
+   return i915_gem_object_to_ggtt(obj, NULL) ?  'g' : ' ';
 }
 
 static char get_pin_mapped_flag(struct drm_i915_gem_object *obj)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 50dc3613c61c..bbee45acedeb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3075,7 +3075,7 @@ struct drm_i915_gem_object 
*i915_gem_object_create_from_data(
 void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file);
 void i915_gem_free_object(struct drm_gem_object *obj);
 
-int __must_check
+struct i915_vma * __must_check
 i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,
 const struct i915_ggtt_view *view,
 u64 size,
@@ -3279,12 +3279,11 @@ i915_gem_object_set_to_gtt_domain(struct 
drm_i915_gem_object *obj,
  bool write);
 int __must_check
 i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write);
-int __must_check
+struct i915_vma * __must_check
 i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 u32 alignment,
 const struct i915_ggtt_view *view);
-void i915_gem_object_unpin_from_display_plane(struct drm_i915_gem_object *obj,
- const struct i915_ggtt_view 
*view);
+void i915_gem_object_unpin_from_display_plane(struct i915_vma *vma);
 int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj,
int align);
 int i915_gem_open(struct drm_device *dev, struct drm_file *file);
@@ -3304,63 +3303,34 @@ struct drm_gem_object *i915_gem_prime_import(struct 
drm_device *dev,
 struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
struct drm_gem_object *gem_obj, int flags);
 
-u64 i915_gem_obj_ggtt_offset_view(struct drm_i915_gem_object *o,
- const struct i915_ggtt_view *view);
-u64 i915_gem_obj_offset(struct drm_i915_gem_object *o,
-   struct i915_address_space *vm);
-static inline u64
-i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o)
-{
-   return i915_gem_obj_ggtt_offset_view(o, &i915_ggtt_view_normal);
-}
-
-bool i915_gem_obj_ggtt_bound_view(struct drm_i915_gem_object *o,
- const struct i915_ggtt_view *view);
-bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
-   struct i915_address_space *vm);
-
 struct i915_vma *
 i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
-   struct i915_address_space *vm);
-struct i915_vma *
-i915_gem_obj_to_ggtt_view(struct drm_i915_gem_object *obj,
- const struct i915_ggtt_view *view);
+struct i915_address_space *vm,
+const struct i915_ggtt_view *view);
 
 struct i915_vma *
 i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
- struct i915_address_space *vm);
-st

[Intel-gfx] [CI 26/32] drm/i915: Consolidate i915_vma_unpin_and_release()

2016-08-15 Thread Chris Wilson
In a few places, we repeat a call to clear a pointer to a vma whilst
unpinning and releasing a reference to its owner. Refactor those into a
common function.
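
A sketch of a converted holder (the function name is illustrative): any
struct member that caches a pinned VMA cookie is now torn down in one call:

	static void engine_release_vmas(struct intel_engine_cs *engine)
	{
		/* Unpin, drop the object reference and clear the pointer. */
		i915_vma_unpin_and_release(&engine->scratch);
		i915_vma_unpin_and_release(&engine->wa_ctx.vma);
	}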

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c| 12 
 drivers/gpu/drm/i915/i915_gem_gtt.h|  1 +
 drivers/gpu/drm/i915/i915_guc_submission.c | 21 -
 drivers/gpu/drm/i915/intel_engine_cs.c |  9 +
 drivers/gpu/drm/i915/intel_lrc.c   |  9 +
 drivers/gpu/drm/i915/intel_ringbuffer.c|  8 +---
 6 files changed, 20 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 738a474c5afa..d15eb1d71341 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3674,3 +3674,15 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
__i915_vma_pin(vma);
return ptr;
 }
+
+void i915_vma_unpin_and_release(struct i915_vma **p_vma)
+{
+   struct i915_vma *vma;
+
+   vma = fetch_and_zero(p_vma);
+   if (!vma)
+   return;
+
+   i915_vma_unpin(vma);
+   i915_vma_put(vma);
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h 
b/drivers/gpu/drm/i915/i915_gem_gtt.h
index a2691943a404..ec538fcc9c20 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -232,6 +232,7 @@ struct i915_vma *
 i915_vma_create(struct drm_i915_gem_object *obj,
struct i915_address_space *vm,
const struct i915_ggtt_view *view);
+void i915_vma_unpin_and_release(struct i915_vma **p_vma);
 
 static inline bool i915_vma_is_ggtt(const struct i915_vma *vma)
 {
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index c40b92e212fa..e7dbc64ec1da 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -653,19 +653,6 @@ err:
return vma;
 }
 
-/**
- * guc_release_vma() - Release gem object allocated for GuC usage
- * @vma:   gem obj to be released
- */
-static void guc_release_vma(struct i915_vma *vma)
-{
-   if (!vma)
-   return;
-
-   i915_vma_unpin(vma);
-   i915_vma_put(vma);
-}
-
 static void
 guc_client_free(struct drm_i915_private *dev_priv,
struct i915_guc_client *client)
@@ -690,7 +677,7 @@ guc_client_free(struct drm_i915_private *dev_priv,
kunmap(kmap_to_page(client->client_base));
}
 
-   guc_release_vma(client->vma);
+   i915_vma_unpin_and_release(&client->vma);
 
if (client->ctx_index != GUC_INVALID_CTX_ID) {
guc_fini_ctx_desc(guc, client);
@@ -1048,12 +1035,12 @@ void i915_guc_submission_fini(struct drm_i915_private 
*dev_priv)
 {
struct intel_guc *guc = &dev_priv->guc;
 
-   guc_release_vma(fetch_and_zero(&guc->ads_vma));
-   guc_release_vma(fetch_and_zero(&guc->log_vma));
+   i915_vma_unpin_and_release(&guc->ads_vma);
+   i915_vma_unpin_and_release(&guc->log_vma);
 
if (guc->ctx_pool_vma)
ida_destroy(&guc->ctx_ids);
-   guc_release_vma(fetch_and_zero(&guc->ctx_pool_vma));
+   i915_vma_unpin_and_release(&guc->ctx_pool_vma);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
b/drivers/gpu/drm/i915/intel_engine_cs.c
index 573f642a74f8..f02d66bbec4b 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -279,14 +279,7 @@ err_unref:
 
 static void intel_engine_cleanup_scratch(struct intel_engine_cs *engine)
 {
-   struct i915_vma *vma;
-
-   vma = fetch_and_zero(&engine->scratch);
-   if (!vma)
-   return;
-
-   i915_vma_unpin(vma);
-   i915_vma_put(vma);
+   i915_vma_unpin_and_release(&engine->scratch);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 64cb04e63512..2673fb4f817b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1193,14 +1193,7 @@ err:
 
 static void lrc_destroy_wa_ctx_obj(struct intel_engine_cs *engine)
 {
-   struct i915_vma *vma;
-
-   vma = fetch_and_zero(&engine->wa_ctx.vma);
-   if (!vma)
-   return;
-
-   i915_vma_unpin(vma);
-   i915_vma_put(vma);
+   i915_vma_unpin_and_release(&engine->wa_ctx.vma);
 }
 
 static int intel_init_workaround_bb(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 30b066140b0c..65ef172e8761 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1257,14 +1257,8 @@ static int init_render_ring(struct intel_engine_cs 
*engine)
 static void render_ring_cleanup(struct intel_engine_cs *engine)
 {
struct drm_i915_private *dev_priv = engine->i915;
-   struct i915_vma *vma;
-
-   vma = fetch_a

[Intel-gfx] [CI 32/32] drm/i915: Record the RING_MODE register for post-mortem debugging

2016-08-15 Thread Chris Wilson
Just another useful register to inspect following a GPU hang.

v2: Remove partial decoding of RING_MODE to userspace, be consistent and
use GEN > 2 guards around RING_MODE everywhere.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_drv.h | 1 +
 drivers/gpu/drm/i915/i915_gpu_error.c   | 3 +++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 7 ---
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bb7d8130dbfd..35caa9b2f36a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -757,6 +757,7 @@ struct drm_i915_error_state {
u32 tail;
u32 head;
u32 ctl;
+   u32 mode;
u32 hws;
u32 ipeir;
u32 ipehr;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 776818b86c0c..0c3f30ce85c3 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -236,6 +236,7 @@ static void error_print_engine(struct 
drm_i915_error_state_buf *m,
err_printf(m, "  HEAD:  0x%08x\n", ee->head);
err_printf(m, "  TAIL:  0x%08x\n", ee->tail);
err_printf(m, "  CTL:   0x%08x\n", ee->ctl);
+   err_printf(m, "  MODE:  0x%08x\n", ee->mode);
err_printf(m, "  HWS:   0x%08x\n", ee->hws);
err_printf(m, "  ACTHD: 0x%08x %08x\n",
   (u32)(ee->acthd>>32), (u32)ee->acthd);
@@ -1005,6 +1006,8 @@ static void error_record_engine_registers(struct 
drm_i915_error_state *error,
ee->head = I915_READ_HEAD(engine);
ee->tail = I915_READ_TAIL(engine);
ee->ctl = I915_READ_CTL(engine);
+   if (INTEL_GEN(dev_priv) > 2)
+   ee->mode = I915_READ_MODE(engine);
 
if (I915_NEED_GFX_HWS(dev_priv)) {
i915_reg_t mmio;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e3327a2ac6e1..fa22bd87bab0 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -498,7 +498,7 @@ static bool stop_ring(struct intel_engine_cs *engine)
 {
struct drm_i915_private *dev_priv = engine->i915;
 
-   if (!IS_GEN2(dev_priv)) {
+   if (INTEL_GEN(dev_priv) > 2) {
I915_WRITE_MODE(engine, _MASKED_BIT_ENABLE(STOP_RING));
if (intel_wait_for_register(dev_priv,
RING_MI_MODE(engine->mmio_base),
@@ -520,7 +520,7 @@ static bool stop_ring(struct intel_engine_cs *engine)
I915_WRITE_HEAD(engine, 0);
I915_WRITE_TAIL(engine, 0);
 
-   if (!IS_GEN2(dev_priv)) {
+   if (INTEL_GEN(dev_priv) > 2) {
(void)I915_READ_CTL(engine);
I915_WRITE_MODE(engine, _MASKED_BIT_DISABLE(STOP_RING));
}
@@ -2142,7 +2142,8 @@ void intel_engine_cleanup(struct intel_engine_cs *engine)
dev_priv = engine->i915;
 
if (engine->buffer) {
-   WARN_ON(!IS_GEN2(dev_priv) && (I915_READ_MODE(engine) & 
MODE_IDLE) == 0);
+   WARN_ON(INTEL_GEN(dev_priv) > 2 &&
+   (I915_READ_MODE(engine) & MODE_IDLE) == 0);
 
intel_ring_unpin(engine->buffer);
intel_ring_free(engine->buffer);
-- 
2.8.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [CI 28/32] drm/i915: Introduce i915_ggtt_offset()

2016-08-15 Thread Chris Wilson
This little helper only exists to safely discard the upper unused 32 bits
of the general 64-bit VMA address - as we know that all Global GTTs are
currently less than 4GiB in size, the upper bits must be zero. In many
places, we use a u32 for the global GTT offset and we want to document
where we are discarding the full VMA offset.
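
(The helper itself lives in i915_gem_gtt.h and its definition is not
visible in the hunks below; this is only an illustrative sketch of the
shape it takes:)

	static inline u32 i915_ggtt_offset(const struct i915_vma *vma)
	{
		GEM_BUG_ON(upper_32_bits(vma->node.start));
		return lower_32_bits(vma->node.start);
	}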

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c|  2 +-
 drivers/gpu/drm/i915/i915_drv.h|  2 +-
 drivers/gpu/drm/i915/i915_gem.c| 11 +--
 drivers/gpu/drm/i915/i915_gem_context.c|  6 --
 drivers/gpu/drm/i915/i915_gem_gtt.h|  9 +
 drivers/gpu/drm/i915/i915_guc_submission.c | 15 ---
 drivers/gpu/drm/i915/intel_display.c   | 10 +++---
 drivers/gpu/drm/i915/intel_engine_cs.c |  4 ++--
 drivers/gpu/drm/i915/intel_fbdev.c |  6 +++---
 drivers/gpu/drm/i915/intel_guc_loader.c|  6 +++---
 drivers/gpu/drm/i915/intel_lrc.c   | 20 +++-
 drivers/gpu/drm/i915/intel_overlay.c   | 10 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c| 28 ++--
 13 files changed, 70 insertions(+), 59 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 6d73bdf069f0..f9bedcb1d9d0 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2008,7 +2008,7 @@ static void i915_dump_lrc_obj(struct seq_file *m,
 
if (vma->flags & I915_VMA_GLOBAL_BIND)
seq_printf(m, "\tBound in GGTT at 0x%08x\n",
-  lower_32_bits(vma->node.start));
+  i915_ggtt_offset(vma));
 
if (i915_gem_object_get_pages(vma->obj)) {
seq_puts(m, "\tFailed to get pages for context object\n\n");
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bbee45acedeb..bd58878de77b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3330,7 +3330,7 @@ static inline unsigned long
 i915_gem_object_ggtt_offset(struct drm_i915_gem_object *o,
const struct i915_ggtt_view *view)
 {
-   return i915_gem_object_to_ggtt(o, view)->node.start;
+   return i915_ggtt_offset(i915_gem_object_to_ggtt(o, view));
 }
 
 /* i915_gem_fence.c */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 685253a1323b..7e08c774a1aa 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -767,7 +767,7 @@ i915_gem_gtt_pread(struct drm_device *dev,
 
i915_gem_object_pin_pages(obj);
} else {
-   node.start = vma->node.start;
+   node.start = i915_ggtt_offset(vma);
node.allocated = false;
ret = i915_gem_object_put_fence(obj);
if (ret)
@@ -1071,7 +1071,7 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915,
 
i915_gem_object_pin_pages(obj);
} else {
-   node.start = vma->node.start;
+   node.start = i915_ggtt_offset(vma);
node.allocated = false;
ret = i915_gem_object_put_fence(obj);
if (ret)
@@ -1712,7 +1712,7 @@ int i915_gem_fault(struct vm_area_struct *area, struct 
vm_fault *vmf)
goto err_unpin;
 
/* Finally, remap it using the new GTT offset */
-   pfn = ggtt->mappable_base + vma->node.start;
+   pfn = ggtt->mappable_base + i915_ggtt_offset(vma);
pfn >>= PAGE_SHIFT;
 
if (unlikely(view.type == I915_GGTT_VIEW_PARTIAL)) {
@@ -3759,10 +3759,9 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,
 
WARN(i915_vma_is_pinned(vma),
 "bo is already pinned in ggtt with incorrect alignment:"
-" offset=%08x %08x, req.alignment=%llx, 
req.map_and_fenceable=%d,"
+" offset=%08x, req.alignment=%llx, 
req.map_and_fenceable=%d,"
 " obj->map_and_fenceable=%d\n",
-upper_32_bits(vma->node.start),
-lower_32_bits(vma->node.start),
+i915_ggtt_offset(vma),
 alignment,
 !!(flags & PIN_MAPPABLE),
 obj->map_and_fenceable);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
index e566167d9441..98d2956f91f4 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -631,7 +631,8 @@ mi_set_context(struct drm_i915_gem_request *req, u32 
hw_flags)
 
intel_ring_emit(ring, MI_NOOP);
intel_ring_emit(ring, MI_SET_CONTEXT);
-   intel_ring_emit(ring, req->ctx->engine[RCS].state->node.start | flags);
+   intel_ring_emit(ring,
+   i915_ggtt_offset(req->ctx->engine[RCS].state) | flags);
   

[Intel-gfx] [CI 23/32] drm/i915: Use VMA as the primary tracker for semaphore page

2016-08-15 Thread Chris Wilson
Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c |  2 +-
 drivers/gpu/drm/i915/i915_drv.h |  4 +--
 drivers/gpu/drm/i915/i915_gpu_error.c   | 16 -
 drivers/gpu/drm/i915/intel_engine_cs.c  | 12 ---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 60 +++--
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 +--
 6 files changed, 55 insertions(+), 43 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 9e44d9eb8e76..cee15b3db6ed 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3198,7 +3198,7 @@ static int i915_semaphore_status(struct seq_file *m, void 
*unused)
struct page *page;
uint64_t *seqno;
 
-   page = i915_gem_object_get_page(dev_priv->semaphore_obj, 0);
+   page = i915_gem_object_get_page(dev_priv->semaphore->obj, 0);
 
seqno = (uint64_t *)kmap_atomic(page);
for_each_engine_id(engine, dev_priv, id) {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 259425d99e17..50dc3613c61c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -733,7 +733,7 @@ struct drm_i915_error_state {
u64 fence[I915_MAX_NUM_FENCES];
struct intel_overlay_error_state *overlay;
struct intel_display_error_state *display;
-   struct drm_i915_error_object *semaphore_obj;
+   struct drm_i915_error_object *semaphore;
 
struct drm_i915_error_engine {
int engine_id;
@@ -1750,7 +1750,7 @@ struct drm_i915_private {
struct pci_dev *bridge_dev;
struct i915_gem_context *kernel_context;
struct intel_engine_cs engine[I915_NUM_ENGINES];
-   struct drm_i915_gem_object *semaphore_obj;
+   struct i915_vma *semaphore;
u32 next_seqno;
 
struct drm_dma_handle *status_page_dmah;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index c327733e6735..4068630bfc68 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -549,7 +549,7 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf 
*m,
}
}
 
-   if ((obj = error->semaphore_obj)) {
+   if ((obj = error->semaphore)) {
err_printf(m, "Semaphore page = 0x%08x\n",
   lower_32_bits(obj->gtt_offset));
for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
@@ -640,7 +640,7 @@ static void i915_error_state_free(struct kref *error_ref)
kfree(ee->waiters);
}
 
-   i915_error_object_free(error->semaphore_obj);
+   i915_error_object_free(error->semaphore);
 
for (i = 0; i < ARRAY_SIZE(error->active_bo); i++)
kfree(error->active_bo[i]);
@@ -876,7 +876,7 @@ static void gen8_record_semaphore_state(struct 
drm_i915_error_state *error,
struct intel_engine_cs *to;
enum intel_engine_id id;
 
-   if (!error->semaphore_obj)
+   if (!error->semaphore)
return;
 
for_each_engine_id(to, dev_priv, id) {
@@ -889,7 +889,7 @@ static void gen8_record_semaphore_state(struct 
drm_i915_error_state *error,
 
signal_offset =
(GEN8_SIGNAL_OFFSET(engine, id) & (PAGE_SIZE - 1)) / 4;
-   tmp = error->semaphore_obj->pages[0];
+   tmp = error->semaphore->pages[0];
idx = intel_engine_sync_index(engine, to);
 
ee->semaphore_mboxes[idx] = tmp[signal_offset];
@@ -1061,11 +1061,9 @@ static void i915_gem_record_rings(struct 
drm_i915_private *dev_priv,
struct drm_i915_gem_request *request;
int i, count;
 
-   if (dev_priv->semaphore_obj) {
-   error->semaphore_obj =
-   i915_error_ggtt_object_create(dev_priv,
- dev_priv->semaphore_obj);
-   }
+   error->semaphore =
+   i915_error_ggtt_object_create(dev_priv,
+ dev_priv->semaphore->obj);
 
for (i = 0; i < I915_NUM_ENGINES; i++) {
struct intel_engine_cs *engine = &dev_priv->engine[i];
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c 
b/drivers/gpu/drm/i915/intel_engine_cs.c
index 829624571ca4..573f642a74f8 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -179,12 +179,16 @@ void intel_engine_init_seqno(struct intel_engine_cs 
*engine, u32 seqno)
if (HAS_VEBOX(dev_priv))
I915_WRITE(RING_SYNC_2(engine->mmio_base), 0);
}
-   if (dev_priv->semaphore_obj) {
-   struct drm_i915_gem_object *obj = dev_priv->semaphore_obj;
-   struct page *page = i915_gem_object_get_dirty_page(obj, 0);
- 

[Intel-gfx] [CI 31/32] drm/i915: Only record active and pending requests upon a GPU hang

2016-08-15 Thread Chris Wilson
There is no other state pertaining to the completed requests in the
hang, other than what can be gleaned from the ringbuffer, so including
the expired requests in the list of outstanding requests simply adds
noise.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
Reviewed-by: Matthew Auld 
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 109 +++---
 1 file changed, 61 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 0f0b65214ef1..776818b86c0c 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1060,12 +1060,68 @@ static void error_record_engine_registers(struct 
drm_i915_error_state *error,
}
 }
 
+static void engine_record_requests(struct intel_engine_cs *engine,
+  struct drm_i915_gem_request *first,
+  struct drm_i915_error_engine *ee)
+{
+   struct drm_i915_gem_request *request;
+   int count;
+
+   count = 0;
+   request = first;
+   list_for_each_entry_from(request, &engine->request_list, link)
+   count++;
+   if (!count)
+   return;
+
+   ee->requests = kcalloc(count, sizeof(*ee->requests), GFP_ATOMIC);
+   if (!ee->requests)
+   return;
+
+   ee->num_requests = count;
+
+   count = 0;
+   request = first;
+   list_for_each_entry_from(request, &engine->request_list, link) {
+   struct drm_i915_error_request *erq;
+
+   if (count >= ee->num_requests) {
+   /*
+* If the ring request list was changed in
+* between the point where the error request
+* list was created and dimensioned and this
+* point then just exit early to avoid crashes.
+*
+* We don't need to communicate that the
+* request list changed state during error
+* state capture and that the error state is
+* slightly incorrect as a consequence since we
+* are typically only interested in the request
+* list state at the point of error state
+* capture, not in any changes happening during
+* the capture.
+*/
+   break;
+   }
+
+   erq = &ee->requests[count++];
+   erq->seqno = request->fence.seqno;
+   erq->jiffies = request->emitted_jiffies;
+   erq->head = request->head;
+   erq->tail = request->tail;
+
+   rcu_read_lock();
+   erq->pid = request->ctx->pid ? pid_nr(request->ctx->pid) : 0;
+   rcu_read_unlock();
+   }
+   ee->num_requests = count;
+}
+
 static void i915_gem_record_rings(struct drm_i915_private *dev_priv,
  struct drm_i915_error_state *error)
 {
struct i915_ggtt *ggtt = &dev_priv->ggtt;
-   struct drm_i915_gem_request *request;
-   int i, count;
+   int i;
 
error->semaphore =
i915_error_object_create(dev_priv, dev_priv->semaphore);
@@ -1073,6 +1129,7 @@ static void i915_gem_record_rings(struct drm_i915_private 
*dev_priv,
for (i = 0; i < I915_NUM_ENGINES; i++) {
struct intel_engine_cs *engine = &dev_priv->engine[i];
struct drm_i915_error_engine *ee = &error->engine[i];
+   struct drm_i915_gem_request *request;
 
ee->pid = -1;
ee->engine_id = -1;
@@ -1131,6 +1188,8 @@ static void i915_gem_record_rings(struct drm_i915_private 
*dev_priv,
ee->cpu_ring_tail = ring->tail;
ee->ringbuffer =
i915_error_object_create(dev_priv, ring->vma);
+
+   engine_record_requests(engine, request, ee);
}
 
ee->hws_page =
@@ -1139,52 +1198,6 @@ static void i915_gem_record_rings(struct 
drm_i915_private *dev_priv,
 
ee->wa_ctx =
i915_error_object_create(dev_priv, engine->wa_ctx.vma);
-
-   count = 0;
-   list_for_each_entry(request, &engine->request_list, link)
-   count++;
-
-   ee->num_requests = count;
-   ee->requests =
-   kcalloc(count, sizeof(*ee->requests), GFP_ATOMIC);
-   if (!ee->requests) {
-   ee->num_requests = 0;
-   continue;
-   }
-
-   count = 0;
-   list_for_each_entry(request, &engine->request_list, link) {
-   struct drm_i915_error_request *erq;
-
-   if (count >= ee->num_reques

[Intel-gfx] [CI 10/32] drm/i915: Add fetch_and_zero() macro

2016-08-15 Thread Chris Wilson
A simple little macro to clear a pointer and return the old value. This
is useful for writing

value = *ptr;
if (!value)
return;

*ptr = 0;
...
free(value);

in a slightly more concise form:

value = fetch_and_zero(ptr);
if (!value)
return;

...
free(value);

with the idea that this establishes a pattern that may be extended for
atomic use (using xchg or cmpxchg), i.e. an atomic_fetch_and_zero(),
similar to llist.
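
(A minimal sketch of the atomic variant alluded to above, for
illustration only; it is not part of this patch and assumes xchg(),
which only works for pointer-sized scalar types:)

	#define atomic_fetch_and_zero(ptr) ({				\
		typeof(*ptr) __T = xchg((ptr), (typeof(*ptr))0);	\
		__T;							\
	})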

Signed-off-by: Chris Wilson 
Cc: Joonas Lahtinen 
Cc: Daniel Vetter 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_drv.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 25b1e6c010d5..855833a6306a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3920,4 +3920,10 @@ bool i915_memcpy_from_wc(void *dst, const void *src, 
unsigned long len);
 #define ptr_pack_bits(ptr, bits)   \
((typeof(ptr))((unsigned long)(ptr) | (bits)))
 
+#define fetch_and_zero(ptr) ({ \
+   typeof(*ptr) __T = *(ptr);  \
+   *(ptr) = (typeof(*ptr))0;   \
+   __T;\
+})
+
 #endif
-- 
2.8.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [CI 29/32] drm/i915: Move debug only per-request pid tracking from request to ctx

2016-08-15 Thread Chris Wilson
Since contexts are not currently shared between userspace processes, we
have an exact correspondence between context creator and guilty batch
submitter. Therefore we can save some per-batch work by inspecting the
context->pid upon error instead. Note that we take the context's
creator's pid rather than the file's pid in order to better track fds
passed over sockets.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 25 -
 drivers/gpu/drm/i915/i915_drv.h |  2 ++
 drivers/gpu/drm/i915/i915_gem_context.c |  4 
 drivers/gpu/drm/i915/i915_gem_request.c |  6 --
 drivers/gpu/drm/i915/i915_gem_request.h |  3 ---
 drivers/gpu/drm/i915/i915_gpu_error.c   | 13 ++---
 6 files changed, 32 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index f9bedcb1d9d0..b89478a8d19a 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -460,6 +460,8 @@ static int i915_gem_object_info(struct seq_file *m, void* 
data)
print_context_stats(m, dev_priv);
list_for_each_entry_reverse(file, &dev->filelist, lhead) {
struct file_stats stats;
+   struct drm_i915_file_private *file_priv = file->driver_priv;
+   struct drm_i915_gem_request *request;
struct task_struct *task;
 
memset(&stats, 0, sizeof(stats));
@@ -473,10 +475,17 @@ static int i915_gem_object_info(struct seq_file *m, void* 
data)
 * still alive (e.g. get_pid(current) => fork() => exit()).
 * Therefore, we need to protect this ->comm access using RCU.
 */
+   mutex_lock(&dev->struct_mutex);
+   request = list_first_entry_or_null(&file_priv->mm.request_list,
+  struct drm_i915_gem_request,
+  client_list);
rcu_read_lock();
-   task = pid_task(file->pid, PIDTYPE_PID);
+   task = pid_task(request && request->ctx->pid ?
+   request->ctx->pid : file->pid,
+   PIDTYPE_PID);
print_file_stats(m, task ? task->comm : "", stats);
rcu_read_unlock();
+   mutex_unlock(&dev->struct_mutex);
}
mutex_unlock(&dev->filelist_mutex);
 
@@ -658,12 +667,11 @@ static int i915_gem_request_info(struct seq_file *m, void 
*data)
 
seq_printf(m, "%s requests: %d\n", engine->name, count);
list_for_each_entry(req, &engine->request_list, link) {
+   struct pid *pid = req->ctx->pid;
struct task_struct *task;
 
rcu_read_lock();
-   task = NULL;
-   if (req->pid)
-   task = pid_task(req->pid, PIDTYPE_PID);
+   task = pid ? pid_task(pid, PIDTYPE_PID) : NULL;
seq_printf(m, "%x @ %d: %s [%d]\n",
   req->fence.seqno,
   (int) (jiffies - req->emitted_jiffies),
@@ -1952,18 +1960,17 @@ static int i915_context_status(struct seq_file *m, void 
*unused)
 
list_for_each_entry(ctx, &dev_priv->context_list, link) {
seq_printf(m, "HW context %u ", ctx->hw_id);
-   if (IS_ERR(ctx->file_priv)) {
-   seq_puts(m, "(deleted) ");
-   } else if (ctx->file_priv) {
-   struct pid *pid = ctx->file_priv->file->pid;
+   if (ctx->pid) {
struct task_struct *task;
 
-   task = get_pid_task(pid, PIDTYPE_PID);
+   task = get_pid_task(ctx->pid, PIDTYPE_PID);
if (task) {
seq_printf(m, "(%s [%d]) ",
   task->comm, task->pid);
put_task_struct(task);
}
+   } else if (IS_ERR(ctx->file_priv)) {
+   seq_puts(m, "(deleted) ");
} else {
seq_puts(m, "(kernel) ");
}
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bd58878de77b..d9f29244bafb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -782,6 +782,7 @@ struct drm_i915_error_state {
 
struct drm_i915_error_request {
long jiffies;
+   pid_t pid;
u32 seqno;
u32 head;
u32 tail;
@@ -880,6 +881,7 @@ struct i915_gem_context {
struct drm_i915_private *i915;
struct drm_i915_file_private *file_priv;
struct

[Intel-gfx] [CI 12/32] drm/i915: Track pinned vma inside guc

2016-08-15 Thread Chris Wilson
Since the GuC allocates and pins an object into the GGTT for its usage,
it is more natural to use that pinned VMA as our resource cookie.

v2: Embrace naming tautology
v3: Rewrite comments for guc_allocate_vma()

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_debugfs.c|  10 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h|   6 ++
 drivers/gpu/drm/i915/i915_guc_submission.c | 144 ++---
 drivers/gpu/drm/i915/intel_guc.h   |   9 +-
 drivers/gpu/drm/i915/intel_guc_loader.c|   7 +-
 5 files changed, 90 insertions(+), 86 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 77a9c56ad25f..0ae61e94ce04 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2570,15 +2570,15 @@ static int i915_guc_log_dump(struct seq_file *m, void 
*data)
struct drm_info_node *node = m->private;
struct drm_device *dev = node->minor->dev;
struct drm_i915_private *dev_priv = to_i915(dev);
-   struct drm_i915_gem_object *log_obj = dev_priv->guc.log_obj;
-   u32 *log;
+   struct drm_i915_gem_object *obj;
int i = 0, pg;
 
-   if (!log_obj)
+   if (!dev_priv->guc.log_vma)
return 0;
 
-   for (pg = 0; pg < log_obj->base.size / PAGE_SIZE; pg++) {
-   log = kmap_atomic(i915_gem_object_get_page(log_obj, pg));
+   obj = dev_priv->guc.log_vma->obj;
+   for (pg = 0; pg < obj->base.size / PAGE_SIZE; pg++) {
+   u32 *log = kmap_atomic(i915_gem_object_get_page(obj, pg));
 
for (i = 0; i < PAGE_SIZE / sizeof(u32); i += 4)
seq_printf(m, "0x%08x 0x%08x 0x%08x 0x%08x\n",
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h 
b/drivers/gpu/drm/i915/i915_gem_gtt.h
index f2769e01cc8c..a2691943a404 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -716,4 +716,10 @@ static inline void i915_vma_unpin_iomap(struct i915_vma 
*vma)
i915_vma_unpin(vma);
 }
 
+static inline struct page *i915_vma_first_page(struct i915_vma *vma)
+{
+   GEM_BUG_ON(!vma->pages);
+   return sg_page(vma->pages->sgl);
+}
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 6831321a9c8c..29de8cec1b58 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -183,7 +183,7 @@ static int guc_update_doorbell_id(struct intel_guc *guc,
  struct i915_guc_client *client,
  u16 new_id)
 {
-   struct sg_table *sg = guc->ctx_pool_obj->pages;
+   struct sg_table *sg = guc->ctx_pool_vma->pages;
void *doorbell_bitmap = guc->doorbell_bitmap;
struct guc_doorbell_info *doorbell;
struct guc_context_desc desc;
@@ -325,7 +325,6 @@ static void guc_init_proc_desc(struct intel_guc *guc,
 static void guc_init_ctx_desc(struct intel_guc *guc,
  struct i915_guc_client *client)
 {
-   struct drm_i915_gem_object *client_obj = client->client_obj;
struct drm_i915_private *dev_priv = guc_to_i915(guc);
struct intel_engine_cs *engine;
struct i915_gem_context *ctx = client->owner;
@@ -383,8 +382,8 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
 * The doorbell, process descriptor, and workqueue are all parts
 * of the client object, which the GuC will reference via the GGTT
 */
-   gfx_addr = i915_gem_obj_ggtt_offset(client_obj);
-   desc.db_trigger_phy = sg_dma_address(client_obj->pages->sgl) +
+   gfx_addr = client->vma->node.start;
+   desc.db_trigger_phy = sg_dma_address(client->vma->pages->sgl) +
client->doorbell_offset;
desc.db_trigger_cpu = (uintptr_t)client->client_base +
client->doorbell_offset;
@@ -400,7 +399,7 @@ static void guc_init_ctx_desc(struct intel_guc *guc,
desc.desc_private = (uintptr_t)client;
 
/* Pool context is pinned already */
-   sg = guc->ctx_pool_obj->pages;
+   sg = guc->ctx_pool_vma->pages;
sg_pcopy_from_buffer(sg->sgl, sg->nents, &desc, sizeof(desc),
 sizeof(desc) * client->ctx_index);
 }
@@ -413,7 +412,7 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
 
memset(&desc, 0, sizeof(desc));
 
-   sg = guc->ctx_pool_obj->pages;
+   sg = guc->ctx_pool_vma->pages;
sg_pcopy_from_buffer(sg->sgl, sg->nents, &desc, sizeof(desc),
 sizeof(desc) * client->ctx_index);
 }
@@ -496,7 +495,7 @@ static void guc_add_workqueue_item(struct i915_guc_client 
*gc,
/* WQ starts from the page after doorbell / process_desc */
wq_page = (wq_off + GUC_DB_SIZE) >> PAGE_SHIFT;
wq_off &= PAGE_SIZE - 1;
-   base = kmap_atomic(i

[Intel-gfx] [CI 19/32] drm/i915: Use VMA for scratch page tracking

2016-08-15 Thread Chris Wilson
Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_context.c |  2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c   |  2 +-
 drivers/gpu/drm/i915/intel_display.c|  2 +-
 drivers/gpu/drm/i915/intel_lrc.c| 18 +--
 drivers/gpu/drm/i915/intel_ringbuffer.c | 55 +++--
 drivers/gpu/drm/i915/intel_ringbuffer.h | 10 ++
 6 files changed, 46 insertions(+), 43 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
index 824dfe14bcd0..e566167d9441 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -660,7 +660,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 
hw_flags)
MI_STORE_REGISTER_MEM |
MI_SRM_LRM_GLOBAL_GTT);
intel_ring_emit_reg(ring, last_reg);
-   intel_ring_emit(ring, engine->scratch.gtt_offset);
+   intel_ring_emit(ring, engine->scratch->node.start);
intel_ring_emit(ring, MI_NOOP);
}
intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 27f973fbe80f..c327733e6735 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1101,7 +1101,7 @@ static void i915_gem_record_rings(struct drm_i915_private 
*dev_priv,
if (HAS_BROKEN_CS_TLB(dev_priv))
ee->wa_batchbuffer =
i915_error_ggtt_object_create(dev_priv,
- 
engine->scratch.obj);
+ 
engine->scratch->obj);
 
if (request->ctx->engine[i].state) {
ee->ctx = 
i915_error_ggtt_object_create(dev_priv,
diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 90309a9858b2..9d18f34f7ce5 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11795,7 +11795,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
intel_ring_emit(ring, MI_STORE_REGISTER_MEM |
  MI_SRM_LRM_GLOBAL_GTT);
intel_ring_emit_reg(ring, DERRMR);
-   intel_ring_emit(ring, req->engine->scratch.gtt_offset + 256);
+   intel_ring_emit(ring, req->engine->scratch->node.start + 256);
if (IS_GEN8(dev)) {
intel_ring_emit(ring, 0);
intel_ring_emit(ring, MI_NOOP);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 73dd2f9e0547..42999ba02152 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -914,7 +914,7 @@ static inline int gen8_emit_flush_coherentl3_wa(struct 
intel_engine_cs *engine,
wa_ctx_emit(batch, index, (MI_STORE_REGISTER_MEM_GEN8 |
   MI_SRM_LRM_GLOBAL_GTT));
wa_ctx_emit_reg(batch, index, GEN8_L3SQCREG4);
-   wa_ctx_emit(batch, index, engine->scratch.gtt_offset + 256);
+   wa_ctx_emit(batch, index, engine->scratch->node.start + 256);
wa_ctx_emit(batch, index, 0);
 
wa_ctx_emit(batch, index, MI_LOAD_REGISTER_IMM(1));
@@ -932,7 +932,7 @@ static inline int gen8_emit_flush_coherentl3_wa(struct 
intel_engine_cs *engine,
wa_ctx_emit(batch, index, (MI_LOAD_REGISTER_MEM_GEN8 |
   MI_SRM_LRM_GLOBAL_GTT));
wa_ctx_emit_reg(batch, index, GEN8_L3SQCREG4);
-   wa_ctx_emit(batch, index, engine->scratch.gtt_offset + 256);
+   wa_ctx_emit(batch, index, engine->scratch->node.start + 256);
wa_ctx_emit(batch, index, 0);
 
return index;
@@ -993,7 +993,7 @@ static int gen8_init_indirectctx_bb(struct intel_engine_cs 
*engine,
 
/* WaClearSlmSpaceAtContextSwitch:bdw,chv */
/* Actual scratch location is at 128 bytes offset */
-   scratch_addr = engine->scratch.gtt_offset + 2*CACHELINE_BYTES;
+   scratch_addr = engine->scratch->node.start + 2 * CACHELINE_BYTES;
 
wa_ctx_emit(batch, index, GFX_OP_PIPE_CONTROL(6));
wa_ctx_emit(batch, index, (PIPE_CONTROL_FLUSH_L3 |
@@ -1072,8 +1072,8 @@ static int gen9_init_indirectctx_bb(struct 
intel_engine_cs *engine,
/* WaClearSlmSpaceAtContextSwitch:kbl */
/* Actual scratch location is at 128 bytes offset */
if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_A0)) {
-   uint32_t scratch_addr
-   = engine->scratch.gtt_offset + 2*CACHELINE_BYTES;
+   u32 scratch_addr =
+   engine->scratch->node.start + 2 

Re: [Intel-gfx] [PATCH 03/10] drm/i915: Move fence tracking from object to vma

2016-08-15 Thread Chris Wilson
On Mon, Aug 15, 2016 at 12:18:20PM +0300, Joonas Lahtinen wrote:
> On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote:
> > @@ -1131,15 +1131,11 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private 
> > *i915,
> >     } else {
> >     node.start = i915_ggtt_offset(vma);
> >     node.allocated = false;
> > -   ret = i915_gem_object_put_fence(obj);
> > +   ret = i915_vma_put_fence(vma);
> >     if (ret)
> >     goto out_unpin;
> >     }
> >  
> > -   ret = i915_gem_object_set_to_gtt_domain(obj, true);
> > -   if (ret)
> > -   goto out_unpin;
> > -
> 
> This is a somewhat an unexpected change in here. Care to explain?

Spontaneous disappearance due to rebasing. Pops back into existence
again later!
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm: make drm_get_format_name thread-safe

2016-08-15 Thread Jani Nikula
On Mon, 15 Aug 2016, Eric Engestrom  wrote:
> Signed-off-by: Eric Engestrom 
> ---
>
> I moved the main bits to be the first diffs, shouldn't affect anything
> when applying the patch, but I wanted to ask:
> I don't like the hard-coded `32` that appears in both kmalloc() and
> snprintf(), what do you think? If you don't like it either, what would
> you suggest? Should I #define it?
>
> Second question is about the patch mail itself: should I send this kind
> of patch separated by module, with a note requesting them to be squashed
> when applying? It has to land as a single patch, but for review it might
> be easier if people only see the bits they each care about, as well as
> to collect ack's/r-b's.
>
> Cheers,
>   Eric
>
> ---
>  drivers/gpu/drm/amd/amdgpu/dce_v10_0.c  |  6 ++--
>  drivers/gpu/drm/amd/amdgpu/dce_v11_0.c  |  6 ++--
>  drivers/gpu/drm/amd/amdgpu/dce_v8_0.c   |  6 ++--
>  drivers/gpu/drm/drm_atomic.c|  5 ++--
>  drivers/gpu/drm/drm_crtc.c  | 21 -
>  drivers/gpu/drm/drm_fourcc.c| 17 ++-
>  drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c |  6 ++--
>  drivers/gpu/drm/i915/i915_debugfs.c | 11 ++-
>  drivers/gpu/drm/i915/intel_atomic_plane.c   |  6 ++--
>  drivers/gpu/drm/i915/intel_display.c| 39 
> -
>  drivers/gpu/drm/radeon/atombios_crtc.c  | 12 +---
>  include/drm/drm_fourcc.h|  2 +-
>  12 files changed, 89 insertions(+), 48 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c
> index 0645c85..38216a1 100644
> --- a/drivers/gpu/drm/drm_fourcc.c
> +++ b/drivers/gpu/drm/drm_fourcc.c
> @@ -39,16 +39,14 @@ static char printable_char(int c)
>   * drm_get_format_name - return a string for drm fourcc format
>   * @format: format to compute name of
>   *
> - * Note that the buffer used by this function is globally shared and owned by
> - * the function itself.
> - *
> - * FIXME: This isn't really multithreading safe.
> + * Note that the buffer returned by this function is owned by the caller
> + * and will need to be freed.
>   */
>  const char *drm_get_format_name(uint32_t format)

I find it surprising that a function that allocates a buffer returns a
const pointer. Some userspace libraries have conventions about the
ownership based on constness.

(I also find it surprising that kfree() takes a const pointer; arguably
that call changes the memory.)

Is there precedent for this?

BR,
Jani.


>  {
> - static char buf[32];
> + char *buf = kmalloc(32, GFP_KERNEL);
>  
> - snprintf(buf, sizeof(buf),
> + snprintf(buf, 32,
>"%c%c%c%c %s-endian (0x%08x)",
>printable_char(format & 0xff),
>printable_char((format >> 8) & 0xff),
> @@ -73,6 +71,8 @@ EXPORT_SYMBOL(drm_get_format_name);
>  void drm_fb_get_bpp_depth(uint32_t format, unsigned int *depth,
> int *bpp)
>  {
> + const char *format_name;
> +
>   switch (format) {
>   case DRM_FORMAT_C8:
>   case DRM_FORMAT_RGB332:
> @@ -127,8 +127,9 @@ void drm_fb_get_bpp_depth(uint32_t format, unsigned int 
> *depth,
>   *bpp = 32;
>   break;
>   default:
> - DRM_DEBUG_KMS("unsupported pixel format %s\n",
> -   drm_get_format_name(format));
> + format_name = drm_get_format_name(format);
> + DRM_DEBUG_KMS("unsupported pixel format %s\n", format_name);
> + kfree(format_name);
>   *depth = 0;
>   *bpp = 0;
>   break;
> diff --git a/include/drm/drm_fourcc.h b/include/drm/drm_fourcc.h
> index 7f90a39..030d22d 100644
> --- a/include/drm/drm_fourcc.h
> +++ b/include/drm/drm_fourcc.h
> @@ -32,6 +32,6 @@ int drm_format_horz_chroma_subsampling(uint32_t format);
>  int drm_format_vert_chroma_subsampling(uint32_t format);
>  int drm_format_plane_width(int width, uint32_t format, int plane);
>  int drm_format_plane_height(int height, uint32_t format, int plane);
> -const char *drm_get_format_name(uint32_t format);
> +const char *drm_get_format_name(uint32_t format) __malloc;
>  
>  #endif /* __DRM_FOURCC_H__ */
> diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c 
> b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
> index c1b04e9..0bf8959 100644
> --- a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
> @@ -2071,6 +2071,7 @@ static int dce_v10_0_crtc_do_set_base(struct drm_crtc 
> *crtc,
>   u32 tmp, viewport_w, viewport_h;
>   int r;
>   bool bypass_lut = false;
> + const char *format_name;
>  
>   /* no fb bound */
>   if (!atomic && !crtc->primary->fb) {
> @@ -2182,8 +2183,9 @@ static int dce_v10_0_crtc_do_set_base(struct drm_crtc 
> *crtc,
>   bypass_lut = true;
>   break;
>   default:
> - DRM_ERROR("Unsupported screen form

Re: [Intel-gfx] [PATCH 09/10] drm/i915: Bump the inactive MRU tracking for all VMA accessed

2016-08-15 Thread Joonas Lahtinen
> When we bump the MRU access tracking on set-to-gtt, we need to not only
> bump the primary GGTT VMA but all partials as well. Similarly we want to
> bump the MRU access for when unpinning an object from the scanout.

Refer to the list as LRU in the commit title and message to avoid confusion.

On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote:
> +static void i915_gem_object_bump_inactive_ggtt(struct drm_i915_gem_object 
> *obj)
> +{
> + struct i915_vma *vma;
> +
> + list_for_each_entry(vma, &obj->vma_list, obj_link) {
> + if (!i915_vma_is_ggtt(vma))
> + continue;
> +
> + if (i915_vma_is_active(vma))
> + continue;

Could combine these two to one if.
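
Something along these lines, i.e. (purely illustrative):

	if (!i915_vma_is_ggtt(vma) || i915_vma_is_active(vma))
		continue;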

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 10/10] drm/i915: Stop discarding GTT cache-domain on unbind vma

2016-08-15 Thread Joonas Lahtinen
On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote:
> Since commit 43566dedde54 ("drm/i915: Broaden application of
> set-domain(GTT)") we allowed objects to be in the GTT domain, but unbound.
> Therefore removing the GTT cache domain when removing the GGTT vma is no
> longer semantically correct.
> 
> An unfortunate side-effect is we lose the wondrously named
> i915_gem_object_finish_gtt(), not to be confused with
> i915_gem_gtt_finish_object()!
> 

Does what it promises.

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915/skl: Do not error out when total_data_rate is 0

2016-08-15 Thread Maarten Lankhorst
This can happen when doing a modeset with only the cursor plane active.

Testcase: kms_atomic_transition
Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/intel_pm.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 651277b0c917..550d9f0688ae 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3115,8 +3115,6 @@ skl_get_total_relative_data_rate(struct intel_crtc_state 
*intel_cstate)
total_data_rate += intel_cstate->wm.skl.plane_y_data_rate[id];
}
 
-   WARN_ON(cstate->plane_mask && total_data_rate == 0);
-
return total_data_rate;
 }
 
-- 
2.7.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 09/10] drm/i915: Bump the inactive MRU tracking for all VMA accessed

2016-08-15 Thread Chris Wilson
On Mon, Aug 15, 2016 at 12:59:09PM +0300, Joonas Lahtinen wrote:
> > When we bump the MRU access tracking on set-to-gtt, we need to not only
> > bump the primary GGTT VMA but all partials as well. Similarly we want to
> > bump the MRU access for when unpinning an object from the scanout.
> 
> Refer to the list as LRU in the commit title and message to avoid confusion.

Still disagree. We are adjusting the MRU entity, the code always has and
then evicting from the LRU.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ Ro.CI.BAT: failure for series starting with [CI,01/32] drm/i915: Record the position of the start of the request

2016-08-15 Thread Patchwork
== Series Details ==

Series: series starting with [CI,01/32] drm/i915: Record the position of the 
start of the request
URL   : https://patchwork.freedesktop.org/series/11093/
State : failure

== Summary ==

Series 11093v1 Series without cover letter
http://patchwork.freedesktop.org/api/1.0/series/11093/revisions/1/mbox

Test kms_cursor_legacy:
Subgroup basic-cursor-vs-flip-varying-size:
fail   -> PASS   (ro-ilk1-i5-650)
Subgroup basic-flip-vs-cursor-varying-size:
pass   -> FAIL   (ro-byt-n2820)
fail   -> PASS   (ro-bdw-i5-5250u)
pass   -> FAIL   (ro-skl3-i5-6260u)
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-a:
skip   -> DMESG-WARN (ro-bdw-i5-5250u)
Subgroup suspend-read-crc-pipe-c:
pass   -> DMESG-WARN (ro-bdw-i7-5600u)

fi-hsw-i7-4770k  total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-kbl-qkkr  total:244  pass:186  dwarn:29  dfail:0   fail:3   skip:26 
fi-skl-i7-6700k  total:244  pass:208  dwarn:4   dfail:2   fail:2   skip:28 
fi-snb-i7-2600   total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42 
ro-bdw-i5-5250u  total:240  pass:219  dwarn:2   dfail:0   fail:1   skip:18 
ro-bdw-i7-5600u  total:240  pass:206  dwarn:1   dfail:0   fail:1   skip:32 
ro-bsw-n3050 total:240  pass:194  dwarn:0   dfail:0   fail:4   skip:42 
ro-byt-n2820 total:240  pass:197  dwarn:0   dfail:0   fail:3   skip:40 
ro-hsw-i3-4010u  total:240  pass:214  dwarn:0   dfail:0   fail:0   skip:26 
ro-hsw-i7-4770r  total:240  pass:185  dwarn:0   dfail:0   fail:0   skip:55 
ro-ilk1-i5-650   total:235  pass:174  dwarn:0   dfail:0   fail:1   skip:60 
ro-ivb-i7-3770   total:240  pass:205  dwarn:0   dfail:0   fail:0   skip:35 
ro-ivb2-i7-3770  total:240  pass:209  dwarn:0   dfail:0   fail:0   skip:31 
ro-skl3-i5-6260u total:240  pass:222  dwarn:0   dfail:0   fail:4   skip:14 

Results at /archive/results/CI_IGT_test/RO_Patchwork_1867/

1b2e958 drm-intel-nightly: 2016y-08m-15d-09h-09m-06s UTC integration manifest
7d6041c drm/i915: Record the RING_MODE register for post-mortem debugging
d8ff181 drm/i915: Only record active and pending requests upon a GPU hang
d0a310c drm/i915: Print the batchbuffer offset next to BBADDR in error state
da7a99b drm/i915: Move debug only per-request pid tracking from request to ctx
fa4b7d2 drm/i915: Introduce i915_ggtt_offset()
b1fd3c1 drm/i915: Track pinned VMA
c2dd68d drm/i915: Consolidate i915_vma_unpin_and_release()
4b74e1f drm/i915: Use VMA for wa_ctx tracking
a2e786f drm/i915: Use VMA for render state page tracking
3083b2c drm/i915: Use VMA as the primary tracker for semaphore page
ed77da3 drm/i915/overlay: Use VMA as the primary tracker for images
1be7e98 drm/i915: Move common seqno reset to intel_engine_cs.c
fe5680d drm/i915: Move common scratch allocation/destroy to intel_engine_cs.c
b1dc161 drm/i915: Use VMA for scratch page tracking
bc1fc79 drm/i915: Use VMA for ringbuffer tracking
1ced8c8 drm/i915: Move assertion for iomap access to i915_vma_pin_iomap
71987d7 drm/i915: Only change the context object's domain when binding
fd8cbd3 drm/i915: Use VMA as the primary object for context state
79f2877 drm/i915: Use VMA directly for checking tiling parameters
f1a7f7f drm/i915: Convert fence computations to use vma directly
bf12887 drm/i915: Track pinned vma inside guc
762650b drm/i915: Add convenience wrappers for vma's object get/put
eca1534 drm/i915: Add fetch_and_zero() macro
4f57f1d drm/i915: Create a VMA for an object
0fc7594 drm/i915: Always set the vma->pages
61a24bc drm/i915: Remove redundant WARN_ON from __i915_add_request()
4a925da drm/i915: Reduce i915_gem_objects to only show object information
4c3d11b drm/i915: Focus debugfs/i915_gem_pinned to show only display pins
07af7c5 drm/i915: Remove inactive/active list from debugfs
e5e12b1 drm/i915: Store the active context object on all engines upon error
0a3d2c5 drm/i915: Reduce amount of duplicate buffer information captured on 
error
a8cdde2 drm/i915: Record the position of the start of the request

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 03/10] drm/i915: Move fence tracking from object to vma

2016-08-15 Thread Joonas Lahtinen
On ma, 2016-08-15 at 10:25 +0100, Chris Wilson wrote:
> On Mon, Aug 15, 2016 at 12:18:20PM +0300, Joonas Lahtinen wrote:
> > 
> > On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote:
> > > 
> > > + if (1) {
> > Umm? At least ought to have TODO: / FIXME: or some explanation. And
> You're not aware of the pipelined fencing?

I was most definitely not, now I am somewhat. Still need to add dem
TODOs.

Regards, Joonas

> -Chris
> 
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 06/10] drm/i915: Choose not to evict faultable objects from the GGTT

2016-08-15 Thread Joonas Lahtinen
On pe, 2016-08-12 at 12:13 +0100, Chris Wilson wrote:
> On Fri, Aug 12, 2016 at 01:50:56PM +0300, Joonas Lahtinen wrote:
> > 
> > On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote:
> > > 
> > > @@ -1715,10 +1716,10 @@ int i915_gem_fault(struct vm_area_struct *area, 
> > > struct vm_fault *vmf)
> > >   goto err_unlock;
> > >   }
> > >  
> > > - /* Use a partial view if the object is bigger than the aperture. */
> > > - /* Now pin it into the GTT if needed */
> > > - vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
> > > -    PIN_MAPPABLE | PIN_NONBLOCK);
> > > + flags = PIN_MAPPABLE;
> > > + if (obj->base.size > 2 << 20)
> > Magic number.
> One day there may be a MiB() macro. It is a magic number, just a rule of
> thumb based on minimum chunksize for a partial.

#define the minimum chunk size and use it here too? With a warning of
the number being derived from the wildest approximations.
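
(For illustration only - the name is made up here, not taken from the
patch:)

	/* Rule-of-thumb minimum chunk size for a partial GGTT view;
	 * derived from the wildest approximations, see above. */
	#define GGTT_PARTIAL_CHUNK_SIZE (2 << 20)

	flags = PIN_MAPPABLE;
	if (obj->base.size > GGTT_PARTIAL_CHUNK_SIZE)
		/* ... as in the hunk quoted above ... */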

>  
> > 
> > > 
> > > @@ -55,6 +55,9 @@ mark_free(struct i915_vma *vma, struct list_head 
> > > *unwind)
> > >   if (WARN_ON(!list_empty(&vma->exec_list)))
> > >   return false;
> > >  
> > > + if (flags & PIN_NOFAULT && vma->obj->fault_mappable)
> > > + return false;
> > The flag name is rather counter-intuitive for it describes other VMAs
> > rather than our new VMA...
> As does NONBLOCKING. We could lose this flag in favour of NOEVICT, but
> I haven't run anything to confirm if that's a good tradeoff.

Maybe the flag should be something like __PIN_NOFAULTING, to keep it
distinct, in addition to __PIN_NONBLOCKING? And then make sure they're
never set on the vma itself.

Regards, Joonas

> -Chris
> 
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs

2016-08-15 Thread Goel, Akash



On 8/15/2016 2:50 PM, Tvrtko Ursulin wrote:


On 12/08/16 17:31, Goel, Akash wrote:

On 8/12/2016 9:52 PM, Tvrtko Ursulin wrote:

On 12/08/16 07:25, akash.g...@intel.com wrote:

From: Akash Goel 

As per the current i915 driver load sequence, debugfs registration is
done at the end, and so the relay channel debugfs file is also created
after that, but the GuC firmware is loaded much earlier in the sequence.
As a result the driver could miss capturing the boot-time logs of the
GuC firmware if there are flush interrupts from the GuC side.
Relay has a provision to support early logging whereby initially only
the relay channel is created, to have buffers for storing logs, and
later on the channel can be associated with a debugfs file at the
appropriate time.
This patch avails itself of that, which allows the driver to also
capture boot-time logs, which can then be collected once userspace
comes up.

Suggested-by: Chris Wilson 
Signed-off-by: Akash Goel 
---
  drivers/gpu/drm/i915/i915_guc_submission.c | 61
+-
  1 file changed, 44 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
b/drivers/gpu/drm/i915/i915_guc_submission.c
index af48f62..1c287d7 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -1099,25 +1099,12 @@ static void guc_remove_log_relay_file(struct
intel_guc *guc)
  relay_close(guc->log.relay_chan);
  }

-static int guc_create_log_relay_file(struct intel_guc *guc)
+static int guc_create_relay_channel(struct intel_guc *guc)
  {
  struct drm_i915_private *dev_priv = guc_to_i915(guc);
  struct rchan *guc_log_relay_chan;
-struct dentry *log_dir;
  size_t n_subbufs, subbuf_size;

-/* For now create the log file in /sys/kernel/debug/dri/0 dir */
-log_dir = dev_priv->drm.primary->debugfs_root;
-
-/* If /sys/kernel/debug/dri/0 location do not exist, then
debugfs is
- * not mounted and so can't create the relay file.
- * The relay API seems to fit well with debugfs only.


It only needs a dentry, I don't see that it has to be a debugfs one.


Besides dentry, there are other requirements for using relay, which can
be met only for a debugfs file.
debugfs wasn't the preferred choice for placing the log file, but there
was no other option, as the relay API is compatible with debugfs only.


What are those and

To use relay there are 3 requirements:
a) Need the associated 'dentry' pointer of the file when opening the
   relay channel.
b) Should be able to use the 'relay_file_operations' fops for the file.
c) Set the 'i_private' field of the file's inode to the pointer of the
   relay channel buffer.

All of the above 3 requirements can be met for a debugfs file in a
straightforward manner, but not all of them can be met for a file
created inside sysfs, or if the file is created inside /dev as a
character device file.
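
Schematically, the three requirements map onto the debugfs + relay APIs
roughly like this (an illustrative sketch only; 'relay_callbacks' and
'create_buf_file_callback' are made-up names here, the others are taken
from the patch above):

	/* (b) + (c): back the relay buffer with a debugfs file using
	 * relay_file_operations; debugfs stores the rchan_buf pointer in
	 * the inode's i_private, which the relay read side picks up. */
	static struct dentry *
	create_buf_file_callback(const char *filename, struct dentry *parent,
				 umode_t mode, struct rchan_buf *buf,
				 int *is_global)
	{
		return debugfs_create_file(filename, mode, parent, buf,
					   &relay_file_operations);
	}

	/* (a): relay_open() needs a dentry to parent the channel under,
	 * here a debugfs directory. */
	guc_log_relay_chan = relay_open("guc_log", log_dir, subbuf_size,
					n_subbufs, &relay_callbacks,
					dev_priv);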



should they be mentioned in the comment above?


Or should I mention them in the cover letter or commit message.

Best regards
Akash


Regards,

Tvrtko

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ Ro.CI.BAT: failure for drm/i915/skl: Do not error out when total_data_rate is 0

2016-08-15 Thread Patchwork
== Series Details ==

Series: drm/i915/skl: Do not error out when total_data_rate is 0
URL   : https://patchwork.freedesktop.org/series/11094/
State : failure

== Summary ==

Series 11094v1 drm/i915/skl: Do not error out when total_data_rate is 0
http://patchwork.freedesktop.org/api/1.0/series/11094/revisions/1/mbox

Test kms_cursor_legacy:
Subgroup basic-flip-vs-cursor-legacy:
fail   -> PASS   (ro-byt-n2820)
Subgroup basic-flip-vs-cursor-varying-size:
pass   -> FAIL   (ro-byt-n2820)
pass   -> FAIL   (fi-hsw-i7-4770k)
fail   -> PASS   (ro-bdw-i5-5250u)
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-a:
skip   -> DMESG-WARN (ro-bdw-i5-5250u)
Subgroup suspend-read-crc-pipe-b:
pass   -> INCOMPLETE (fi-hsw-i7-4770k)
skip   -> DMESG-WARN (ro-bdw-i5-5250u)

fi-hsw-i7-4770k  total:207  pass:185  dwarn:0   dfail:0   fail:1   skip:20 
fi-kbl-qkkr  total:244  pass:185  dwarn:29  dfail:0   fail:3   skip:27 
fi-skl-i7-6700k  total:244  pass:208  dwarn:4   dfail:2   fail:2   skip:28 
fi-snb-i7-2600   total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42 
ro-bdw-i5-5250u  total:240  pass:219  dwarn:3   dfail:0   fail:1   skip:17 
ro-bdw-i7-5600u  total:240  pass:207  dwarn:0   dfail:0   fail:1   skip:32 
ro-bsw-n3050 total:240  pass:194  dwarn:0   dfail:0   fail:4   skip:42 
ro-byt-n2820 total:240  pass:198  dwarn:0   dfail:0   fail:2   skip:40 
ro-hsw-i3-4010u  total:240  pass:214  dwarn:0   dfail:0   fail:0   skip:26 
ro-hsw-i7-4770r  total:240  pass:185  dwarn:0   dfail:0   fail:0   skip:55 
ro-ilk1-i5-650   total:235  pass:173  dwarn:0   dfail:0   fail:2   skip:60 
ro-ivb-i7-3770   total:240  pass:205  dwarn:0   dfail:0   fail:0   skip:35 
ro-ivb2-i7-3770  total:240  pass:209  dwarn:0   dfail:0   fail:0   skip:31 
ro-skl3-i5-6260u total:240  pass:223  dwarn:0   dfail:0   fail:3   skip:14 

Results at /archive/results/CI_IGT_test/RO_Patchwork_1868/

1b2e958 drm-intel-nightly: 2016y-08m-15d-09h-09m-06s UTC integration manifest
2f8ea64 drm/i915/skl: Do not error out when total_data_rate is 0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 6/9] drm/i915/cmdparser: Compare against the previous command descriptor

2016-08-15 Thread Matthew Auld
On 12 August 2016 at 16:07, Chris Wilson  wrote:
> On the blitter (and in test code), we see long sequences of repeated
> commands, e.g. XY_PIXEL_BLT, XY_SCANLINE_BLT, or XY_SRC_COPY. For these,
> we can skip the hashtable lookup by remembering the previous command
> descriptor and doing a straightforward compare of the command header.
> The corollary is that we need to do one extra comparison before looking
> up new commands.
>
> Signed-off-by: Chris Wilson 
> ---
>  drivers/gpu/drm/i915/i915_cmd_parser.c | 20 +---
>  1 file changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
> b/drivers/gpu/drm/i915/i915_cmd_parser.c
> index 274f2136a846..3b1100a0e0cb 100644
> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> @@ -350,6 +350,9 @@ static const struct drm_i915_cmd_descriptor 
> hsw_blt_cmds[] = {
> CMD(  MI_LOAD_SCAN_LINES_EXCL,  SMI,   !F,  0x3F,   R  ),
>  };
>
> +static const struct drm_i915_cmd_descriptor noop_desc =
> +   CMD(MI_NOOP, SMI, F, 1, S);
> +
>  #undef CMD
>  #undef SMI
>  #undef S3D
> @@ -898,11 +901,14 @@ find_cmd_in_table(struct intel_engine_cs *engine,
>  static const struct drm_i915_cmd_descriptor*
>  find_cmd(struct intel_engine_cs *engine,
>  u32 cmd_header,
> +const struct drm_i915_cmd_descriptor *desc,
>  struct drm_i915_cmd_descriptor *default_desc)
>  {
> -   const struct drm_i915_cmd_descriptor *desc;
> u32 mask;
>
> +   if (((cmd_header ^ desc->cmd.value) & desc->cmd.mask) == 0)
> +   return desc;
> +
> desc = find_cmd_in_table(engine, cmd_header);
> if (desc)
> return desc;
> @@ -911,10 +917,10 @@ find_cmd(struct intel_engine_cs *engine,
> if (!mask)
> return NULL;
>
> -   BUG_ON(!default_desc);
Why remove this, was it overkill?

> -   default_desc->flags = CMD_DESC_SKIP;
> +   default_desc->cmd.value = cmd_header;
> +   default_desc->cmd.mask = 0x;
Where did you pluck this mask from?
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 09/10] drm/i915: Bump the inactive MRU tracking for all VMA accessed

2016-08-15 Thread Joonas Lahtinen
On ma, 2016-08-15 at 11:12 +0100, Chris Wilson wrote:
> On Mon, Aug 15, 2016 at 12:59:09PM +0300, Joonas Lahtinen wrote:
> > 
> > > 
> > > When we bump the MRU access tracking on set-to-gtt, we need to not only
> > > bump the primary GGTT VMA but all partials as well. Similarly we want to
> > > bump the MRU access for when unpinning an object from the scanout.
> > Refer to the list as LRU in the commit title and message to avoid confusion.
> Still disagree. We are adjusting the MRU entity, the code always has and
> then evicting from the LRU.

I would not use the abbreviation MRU when discussing LRU scheme, but
it's only the commit message so I can live with it.

Regards, Joonas

> -Chris
> 
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [STABLE 4.4 BACKPORT REQUEST] drm/i915: Don't complain about lack of ACPI video bios

2016-08-15 Thread Jani Nikula

Stable team, please backport

commit 78c3d5fa7354774b7c8638033d46c042ebae41fb
Author: Daniel Vetter 
Date:   Fri Oct 23 11:00:06 2015 +0200

drm/i915: Don't complain about lack of ACPI video bios

to v4.4.

Tested-by: Rainer Fiebig  # v4.4


BR,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 02/10] drm/i915/userptr: Make gup errors stickier

2016-08-15 Thread Joonas Lahtinen
On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote:
> Keep any error reported by the gup_worker until we are notified that the
> arena has changed (via the mmu-notifier). This has the important effect
> that two consecutive calls to i915_gem_object_get_pages() report the
> same error, curtailing a loop of detecting a fault and requeueing
> a gup_worker.
> 

I think this is for Mika to review.

Regards, Joonas

> Signed-off-by: Chris Wilson 
> ---
>  drivers/gpu/drm/i915/i915_gem_userptr.c | 17 +++--
>  1 file changed, 7 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c 
> b/drivers/gpu/drm/i915/i915_gem_userptr.c
> index 57218cca7e05..be54825ef3e8 100644
> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> @@ -542,8 +542,6 @@ __i915_gem_userptr_get_pages_worker(struct work_struct 
> *_work)
>   }
>   }
>   obj->userptr.work = ERR_PTR(ret);
> - if (ret)
> - __i915_gem_userptr_set_active(obj, false);
>   }
>  
>   obj->userptr.workers--;
> @@ -628,15 +626,14 @@ i915_gem_userptr_get_pages(struct drm_i915_gem_object 
> *obj)
>    * to the vma (discard or cloning) which should prevent the more
>    * egregious cases from causing harm.
>    */
> - if (IS_ERR(obj->userptr.work)) {
> - /* active flag will have been dropped already by the worker */
> - ret = PTR_ERR(obj->userptr.work);
> - obj->userptr.work = NULL;
> - return ret;
> - }
> - if (obj->userptr.work)
> +
> + if (obj->userptr.work) {
>   /* active flag should still be held for the pending work */
> - return -EAGAIN;
> + if (IS_ERR(obj->userptr.work))
> + return PTR_ERR(obj->userptr.work);
> + else
> + return -EAGAIN;
> + }
>  
>   /* Let the mmu-notifier know that we have begun and need cancellation */
>   ret = __i915_gem_userptr_set_active(obj, true);
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation


Re: [Intel-gfx] [PATCH v4 0/2] drm/i915/opregion: proper handling of DIDL and CADL

2016-08-15 Thread Marcos Paulo de Souza
Hi list,

I have an Asus laptop, and these two patches solved my problem with the
brightness hotkeys not working[1].

I applied both patches on 4.8-rc1, and the only fix needed was
changing dev_priv->dev to dev_priv->drm in all the for_each_*
macro call sites touched by these patches.

Is there any chance of getting this merged before 4.8 is released? And, if
there are other problems still in need of fixes in this patch, please
let me know.

Thanks!

[1] https://bugzilla.kernel.org/show_bug.cgi?id=152091
From ae6d2f8916abe9573b91b3ecb565c9585dda579a Mon Sep 17 00:00:00 2001
From: Jani Nikula 
Date: Wed, 29 Jun 2016 18:36:41 +0300
Subject: [PATCH 1/2] drm/i915: make i915 the source of acpi device ids for
 _DOD

The graphics driver is supposed to define the DIDL, which are used for
_DOD, not the BIOS. Restore that behaviour.

This is basically a revert of

commit 3143751ff51a163b77f7efd389043e038f3e008e
Author: Zhang Rui 
Date:   Mon Mar 29 15:12:16 2010 +0800

drm/i915: set DIDL using the ACPI video output device _ADR method return.

which went out of its way to cater to a specific BIOS, setting up DIDL
based on _ADR method. Perhaps that approach worked on that specific
machine, but on the machines I checked the _ADR method invents the
device identifiers out of thin air if DIDL has not been set. The source
for _ADR is also supposed to be the DIDL set by the driver, not the
other way around.

With this, we'll also limit the number of outputs to what the driver
actually has.

v2: do not set ACPI_DEVICE_ID_SCHEME in the device id (Peter Wu)

Reviewed-and-tested-by: Peter Wu 
Signed-off-by: Jani Nikula 
---
 drivers/gpu/drm/i915/intel_drv.h  |  3 ++
 drivers/gpu/drm/i915/intel_opregion.c | 89 ++-
 2 files changed, 28 insertions(+), 64 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index cc937a1..8656b4c 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -263,6 +263,9 @@ struct intel_connector {
 */
struct intel_encoder *encoder;
 
+   /* ACPI device id for ACPI and driver cooperation */
+   u32 acpi_device_id;
+
/* Reads out the current hw, returning true if the connector is enabled
 * and active (i.e. dpms ON state). */
bool (*get_hw_state)(struct intel_connector *);
diff --git a/drivers/gpu/drm/i915/intel_opregion.c 
b/drivers/gpu/drm/i915/intel_opregion.c
index adca262..494559a 100644
--- a/drivers/gpu/drm/i915/intel_opregion.c
+++ b/drivers/gpu/drm/i915/intel_opregion.c
@@ -674,11 +674,11 @@ static void set_did(struct intel_opregion *opregion, int 
i, u32 val)
}
 }
 
-static u32 acpi_display_type(struct drm_connector *connector)
+static u32 acpi_display_type(struct intel_connector *connector)
 {
u32 display_type;
 
-   switch (connector->connector_type) {
+   switch (connector->base.connector_type) {
case DRM_MODE_CONNECTOR_VGA:
case DRM_MODE_CONNECTOR_DVIA:
display_type = ACPI_DISPLAY_TYPE_VGA;
@@ -707,7 +707,7 @@ static u32 acpi_display_type(struct drm_connector 
*connector)
display_type = ACPI_DISPLAY_TYPE_OTHER;
break;
default:
-   MISSING_CASE(connector->connector_type);
+   MISSING_CASE(connector->base.connector_type);
display_type = ACPI_DISPLAY_TYPE_OTHER;
break;
}
@@ -718,34 +718,10 @@ static u32 acpi_display_type(struct drm_connector 
*connector)
 static void intel_didl_outputs(struct drm_i915_private *dev_priv)
 {
struct intel_opregion *opregion = &dev_priv->opregion;
-   struct pci_dev *pdev = dev_priv->drm.pdev;
-   struct drm_connector *connector;
-   acpi_handle handle;
-   struct acpi_device *acpi_dev, *acpi_cdev, *acpi_video_bus = NULL;
-   unsigned long long device_id;
-   acpi_status status;
-   u32 temp, max_outputs;
-   int i = 0;
-
-   handle = ACPI_HANDLE(&pdev->dev);
-   if (!handle || acpi_bus_get_device(handle, &acpi_dev))
-   return;
-
-   if (acpi_is_video_device(handle))
-   acpi_video_bus = acpi_dev;
-   else {
-   list_for_each_entry(acpi_cdev, &acpi_dev->children, node) {
-   if (acpi_is_video_device(acpi_cdev->handle)) {
-   acpi_video_bus = acpi_cdev;
-   break;
-   }
-   }
-   }
-
-   if (!acpi_video_bus) {
-   DRM_DEBUG_KMS("No ACPI video bus found\n");
-   return;
-   }
+   struct intel_connector *connector;
+   struct drm_device *dev = &dev_priv->drm;
+   int i = 0, max_outputs;
+   int display_index[16] = {};
 
/*
 * In theory, did2, the extended didl, gets added at opregion version
@@ -757,46 +733,31 @@ static void intel_didl_outputs(struct drm_i915_private 
*dev_priv)
max_outputs 

[Intel-gfx] [PATCH 2/5] drm/i915: Stop the machine whilst capturing the GPU crash dump

2016-08-15 Thread Chris Wilson
The error state is purposefully racy as we expect it to be called at any
time and so have avoided any locking whilst capturing the crash dump.
However, with multi-engine GPUs and multiple CPUs, those races can
manifest into OOPSes as we attempt to chase dangling pointers freed on
other CPUs. Under discussion are lots of ways to slow down normal
operation in order to protect the post-mortem error capture, but what if
we take the opposite approach and freeze the machine whilst the error
capture runs (note the GPU may still be running, but as long as we don't
process any of the results the driver's bookkeeping will be static)?

Note that by itself, this is not a complete fix. It also depends on
the compiler barriers in list_add/list_del to prevent traversing the
lists into the void. We also depend on only requiring state from
carefully controlled sources - i.e. all the state we require for
post-mortem debugging should be reachable from the request itself so
that we only have to worry about retrieving the request carefully. Once
we have the request, we know that all pointers from it are intact.

v2: Avoid drm_clflush_pages() inside stop_machine() as it may use
stop_machine() itself for its wbinvd fallback.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/Kconfig  |  1 +
 drivers/gpu/drm/i915/i915_drv.h   |  2 ++
 drivers/gpu/drm/i915/i915_gpu_error.c | 46 +--
 3 files changed, 31 insertions(+), 18 deletions(-)
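
The hunk that finally invokes the capture callback is truncated below, so
for context here is a rough sketch of the pattern this patch relies on
(illustrative only; the helper name is made up, only stop_machine() and
the capture() callback defined in this patch are real):

	#include <linux/stop_machine.h>

	static void capture_stopped(struct drm_i915_error_state *error)
	{
		/* Park every other CPU (NULL cpumask) with interrupts off
		 * while capture() walks the driver's bookkeeping, so nothing
		 * it dereferences can be freed or rewritten concurrently.
		 */
		stop_machine(capture, error, NULL);
	}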

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 10a6ac11b6a9..0f46a9c04c0e 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -4,6 +4,7 @@ config DRM_I915
depends on X86 && PCI
select INTEL_GTT
select INTERVAL_TREE
+   select STOP_MACHINE
# we need shmfs for the swappable backing store, and in particular
# the shmem_readpage() which depends upon tmpfs
select SHMEM
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 20caac1796ef..52facd4a7179 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -705,6 +705,8 @@ struct drm_i915_error_state {
struct kref ref;
struct timeval time;
 
+   struct drm_i915_private *i915;
+
char error_msg[128];
bool simulated;
int iommu;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 0bbc22f9a705..0815e5c47431 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -28,6 +28,7 @@
  */
 
 #include 
+#include 
 #include "i915_drv.h"
 
 #ifdef CONFIG_DRM_I915_CAPTURE_ERROR
@@ -715,14 +716,12 @@ i915_error_object_create(struct drm_i915_private 
*dev_priv,
 
dst->page_count = num_pages;
while (num_pages--) {
-   unsigned long flags;
void *d;
 
d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
if (d == NULL)
goto unwind;
 
-   local_irq_save(flags);
if (use_ggtt) {
void __iomem *s;
 
@@ -741,15 +740,10 @@ i915_error_object_create(struct drm_i915_private 
*dev_priv,
 
page = i915_gem_object_get_page(src, i);
 
-   drm_clflush_pages(&page, 1);
-
s = kmap_atomic(page);
memcpy(d, s, PAGE_SIZE);
kunmap_atomic(s);
-
-   drm_clflush_pages(&page, 1);
}
-   local_irq_restore(flags);
 
dst->pages[i++] = d;
reloc_offset += PAGE_SIZE;
@@ -1404,6 +1398,31 @@ static void i915_capture_gen_state(struct 
drm_i915_private *dev_priv,
   sizeof(error->device_info));
 }
 
+static int capture(void *data)
+{
+   struct drm_i915_error_state *error = data;
+
+   /* Ensure that what we readback from memory matches what the GPU sees */
+   wbinvd();
+
+   i915_capture_gen_state(error->i915, error);
+   i915_capture_reg_state(error->i915, error);
+   i915_gem_record_fences(error->i915, error);
+   i915_gem_record_rings(error->i915, error);
+   i915_capture_active_buffers(error->i915, error);
+   i915_capture_pinned_buffers(error->i915, error);
+
+   do_gettimeofday(&error->time);
+
+   error->overlay = intel_overlay_capture_error_state(error->i915);
+   error->display = intel_display_capture_error_state(error->i915);
+
+   /* And make sure we don't leave trash in the CPU cache */
+   wbinvd();
+
+   return 0;
+}
+
 /**
  * i915_capture_error_state - capture an error record for later analysis
  * @dev: drm device
@@ -1435,18 +1454,9 @@ void i915_capture_error_state(struct drm_i915_private 
*dev_priv,
}
 
kref_init(&error->ref);
+   error->i915 = dev_priv;
 
-   i915_capture_gen_state(dev_priv, error);
-   i915_capture_reg_sta

[Intel-gfx] [PATCH 1/5] drm/i915: Allow disabling error capture

2016-08-15 Thread Chris Wilson
We currently capture the GPU state after we detect a hang. This is vital
for us to both triage and debug hangs in the wild (post-mortem
debugging). However, it comes at the cost of running some potentially
dangerous code (since it has to make very few assumptions about the state
of the driver) that is quite resource intensive.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/Kconfig  | 10 ++
 drivers/gpu/drm/i915/i915_debugfs.c   |  6 ++
 drivers/gpu/drm/i915/i915_drv.h   | 11 +++
 drivers/gpu/drm/i915/i915_gpu_error.c |  7 +++
 drivers/gpu/drm/i915/i915_params.c|  9 +
 drivers/gpu/drm/i915/i915_params.h|  1 +
 drivers/gpu/drm/i915/i915_sysfs.c |  8 
 drivers/gpu/drm/i915/intel_display.c  |  4 
 drivers/gpu/drm/i915/intel_overlay.c  |  4 
 9 files changed, 60 insertions(+)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 7769e469118f..10a6ac11b6a9 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -46,6 +46,16 @@ config DRM_I915_PRELIMINARY_HW_SUPPORT
 
  If in doubt, say "N".
 
+config DRM_I915_CAPTURE_ERROR
+   bool "Enable capturing GPU state following a hang"
+   depends on DRM_I915
+   default y
+   help
+ This option enables capturing the GPU state when a hang is detected.
+ This information is vital for triaging hangs and assists in debugging.
+
+ If in doubt, say "Y".
+
 config DRM_I915_USERPTR
bool "Always enable userptr support"
depends on DRM_I915
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index b89478a8d19a..f41ebf25655c 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -973,6 +973,8 @@ static int i915_hws_info(struct seq_file *m, void *data)
return 0;
 }
 
+#ifdef CONFIG_DRM_I915_CAPTURE_ERROR
+
 static ssize_t
 i915_error_state_write(struct file *filp,
   const char __user *ubuf,
@@ -1062,6 +1064,8 @@ static const struct file_operations i915_error_state_fops 
= {
.release = i915_error_state_release,
 };
 
+#endif
+
 static int
 i915_next_seqno_get(void *data, u64 *val)
 {
@@ -5399,7 +5403,9 @@ static const struct i915_debugfs_files {
{"i915_ring_missed_irq", &i915_ring_missed_irq_fops},
{"i915_ring_test_irq", &i915_ring_test_irq_fops},
{"i915_gem_drop_caches", &i915_drop_caches_fops},
+#ifdef CONFIG_DRM_I915_CAPTURE_ERROR
{"i915_error_state", &i915_error_state_fops},
+#endif
{"i915_next_seqno", &i915_next_seqno_fops},
{"i915_display_crc_ctl", &i915_display_crc_ctl_fops},
{"i915_pri_wm_latency", &i915_pri_wm_latency_fops},
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 35caa9b2f36a..20caac1796ef 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3482,6 +3482,7 @@ static inline void intel_display_crc_init(struct 
drm_device *dev) {}
 #endif
 
 /* i915_gpu_error.c */
+#ifdef CONFIG_DRM_I915_CAPTURE_ERROR
 __printf(2, 3)
 void i915_error_printf(struct drm_i915_error_state_buf *e, const char *f, ...);
 int i915_error_state_to_str(struct drm_i915_error_state_buf *estr,
@@ -3501,6 +3502,16 @@ void i915_error_state_get(struct drm_device *dev,
  struct i915_error_state_file_priv *error_priv);
 void i915_error_state_put(struct i915_error_state_file_priv *error_priv);
 void i915_destroy_error_state(struct drm_device *dev);
+#else
+static inline void i915_capture_error_state(struct drm_i915_private *dev_priv,
+   u32 engine_mask,
+   const char *error_msg)
+{
+}
+static inline void i915_destroy_error_state(struct drm_device *dev)
+{
+}
+#endif
 
 void i915_get_extra_instdone(struct drm_i915_private *dev_priv, uint32_t 
*instdone);
 const char *i915_cache_level_str(struct drm_i915_private *i915, int type);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 0c3f30ce85c3..0bbc22f9a705 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -30,6 +30,8 @@
 #include 
 #include "i915_drv.h"
 
+#ifdef CONFIG_DRM_I915_CAPTURE_ERROR
+
 static const char *engine_str(int engine)
 {
switch (engine) {
@@ -1419,6 +1421,9 @@ void i915_capture_error_state(struct drm_i915_private 
*dev_priv,
struct drm_i915_error_state *error;
unsigned long flags;
 
+   if (!i915.error_capture)
+   return;
+
if (READ_ONCE(dev_priv->gpu_error.first_error))
return;
 
@@ -1504,6 +1509,8 @@ void i915_destroy_error_state(struct drm_device *dev)
kref_put(&error->ref, i915_error_state_free);
 }
 
+#endif
+
 const char *i915_cache_level_str(struct drm_i915_private *i915, int type)
 {
switch (type) {
diff --git a/drivers/g

[Intel-gfx] Compress the GPU error state

2016-08-15 Thread Chris Wilson
After adjusting how we track the data to capture exactly what we use
via VMA, then adjusting the way we inspect the VMA we can finally
compress it.
-Chris



[Intel-gfx] [PATCH 4/5] drm/i915: Consolidate error object printing

2016-08-15 Thread Chris Wilson
Leave all the pretty printing to userspace and simplify the error
capture to only have a single common object printer. It makes the kernel
code more compact, and the refactoring allows us to apply more complex
transformations like compressing the output.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 100 +-
 1 file changed, 25 insertions(+), 75 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 185adcff0f2d..ae0b98eee9ec 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -311,10 +311,22 @@ void i915_error_printf(struct drm_i915_error_state_buf 
*e, const char *f, ...)
 }
 
 static void print_error_obj(struct drm_i915_error_state_buf *m,
+   struct intel_engine_cs *engine,
+   const char *name,
struct drm_i915_error_object *obj)
 {
int page, offset, elt;
 
+   if (!obj)
+   return;
+
+   if (name) {
+   err_printf(m, "%s --- %s = 0x%08x %08x\n",
+  engine ? engine->name : "global", name,
+  upper_32_bits(obj->gtt_offset),
+  lower_32_bits(obj->gtt_offset));
+   }
+
for (page = offset = 0; page < obj->page_count; page++) {
for (elt = 0; elt < PAGE_SIZE/4; elt++) {
err_printf(m, "%08x :  %08x\n", offset,
@@ -341,8 +353,8 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf 
*m,
struct drm_i915_private *dev_priv = to_i915(dev);
struct drm_i915_error_state *error = error_priv->error;
struct drm_i915_error_object *obj;
-   int i, j, offset, elt;
int max_hangcheck_score;
+   int i, j;
 
if (!error) {
err_printf(m, "no error state collected\n");
@@ -466,15 +478,7 @@ int i915_error_state_to_str(struct 
drm_i915_error_state_buf *m,
err_printf(m, " --- gtt_offset = 0x%08x %08x\n",
   upper_32_bits(obj->gtt_offset),
   lower_32_bits(obj->gtt_offset));
-   print_error_obj(m, obj);
-   }
-
-   obj = ee->wa_batchbuffer;
-   if (obj) {
-   err_printf(m, "%s (w/a) --- gtt_offset = 0x%08x\n",
-  dev_priv->engine[i].name,
-  lower_32_bits(obj->gtt_offset));
-   print_error_obj(m, obj);
+   print_error_obj(m, &dev_priv->engine[i], NULL, obj);
}
 
if (ee->num_requests) {
@@ -503,77 +507,23 @@ int i915_error_state_to_str(struct 
drm_i915_error_state_buf *m,
}
}
 
-   if ((obj = ee->ringbuffer)) {
-   err_printf(m, "%s --- ringbuffer = 0x%08x\n",
-  dev_priv->engine[i].name,
-  lower_32_bits(obj->gtt_offset));
-   print_error_obj(m, obj);
-   }
+   print_error_obj(m, &dev_priv->engine[i],
+   "ringbuffer", ee->ringbuffer);
 
-   if ((obj = ee->hws_page)) {
-   u64 hws_offset = obj->gtt_offset;
-   u32 *hws_page = &obj->pages[0][0];
+   print_error_obj(m, &dev_priv->engine[i],
+   "HW Status", ee->hws_page);
 
-   if (i915.enable_execlists) {
-   hws_offset += LRC_PPHWSP_PN * PAGE_SIZE;
-   hws_page = &obj->pages[LRC_PPHWSP_PN][0];
-   }
-   err_printf(m, "%s --- HW Status = 0x%08llx\n",
-  dev_priv->engine[i].name, hws_offset);
-   offset = 0;
-   for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
-   err_printf(m, "[%04x] %08x %08x %08x %08x\n",
-  offset,
-  hws_page[elt],
-  hws_page[elt+1],
-  hws_page[elt+2],
-  hws_page[elt+3]);
-   offset += 16;
-   }
-   }
+   print_error_obj(m, &dev_priv->engine[i],
+   "HW context", ee->ctx);
 
-   obj = ee->wa_ctx;
-   if (obj) {
-   u64 wa_ctx_offset = obj->gtt_offset;
-   u32 *wa_ctx_page = &obj->pages[0][0];
-   struct intel_engine_cs *engine = &dev_priv->engine[RCS];
-   u32 wa_ctx_size 

[Intel-gfx] [PATCH 3/5] drm/i915: Always use the GTT for error capture

2016-08-15 Thread Chris Wilson
Since the GTT provides universal access to any GPU page, we can use it
to reduce our plethora of read methods to just one. It also has the
important characteristic of being exactly what the GPU sees - if there
are incoherency problems, seeing the batch as executed (rather than as
trapped inside the cpu cache) is important.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c   |  43 
 drivers/gpu/drm/i915/i915_gem_gtt.h   |   2 +
 drivers/gpu/drm/i915/i915_gpu_error.c | 122 --
 3 files changed, 75 insertions(+), 92 deletions(-)
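
The replacement read path itself is truncated further down, but its shape,
using the ggtt->gpu_error slot this patch reserves and the compress_page()
helper it adds, is roughly the following (a sketch, not the exact hunk;
argument details may differ):

	void __iomem *s;

	/* Point the reserved GGTT PTE at the page we want to read back. */
	ggtt->base.insert_page(&ggtt->base, dma_addr,
			       ggtt->gpu_error.start, I915_CACHE_NONE, 0);

	/* Read it through the mappable aperture, i.e. exactly the view the
	 * GPU has of that page, then stash a copy in the error object.
	 */
	s = io_mapping_map_atomic_wc(ggtt->mappable, ggtt->gpu_error.start);
	ret = compress_page((void __force *)s, dst);
	io_mapping_unmap_atomic(s);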

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 3631944ac2d9..cbeec4cfe8a4 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2729,6 +2729,7 @@ int i915_gem_init_ggtt(struct drm_i915_private *dev_priv)
 */
struct i915_ggtt *ggtt = &dev_priv->ggtt;
unsigned long hole_start, hole_end;
+   struct i915_hw_ppgtt *ppgtt;
struct drm_mm_node *entry;
int ret;
 
@@ -2736,6 +2737,15 @@ int i915_gem_init_ggtt(struct drm_i915_private *dev_priv)
if (ret)
return ret;
 
+   /* Reserve a mappable slot for our lockless error capture */
+   ret = drm_mm_insert_node_in_range_generic(&ggtt->base.mm,
+ &ggtt->gpu_error,
+ 4096, 0, -1,
+ 0, ggtt->mappable_end,
+ 0, 0);
+   if (ret)
+   return ret;
+
/* Clear any non-preallocated blocks */
drm_mm_for_each_hole(entry, &ggtt->base.mm, hole_start, hole_end) {
DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
@@ -2750,25 +2760,21 @@ int i915_gem_init_ggtt(struct drm_i915_private 
*dev_priv)
   true);
 
if (USES_PPGTT(dev_priv) && !USES_FULL_PPGTT(dev_priv)) {
-   struct i915_hw_ppgtt *ppgtt;
-
ppgtt = kzalloc(sizeof(*ppgtt), GFP_KERNEL);
-   if (!ppgtt)
-   return -ENOMEM;
+   if (!ppgtt) {
+   ret = -ENOMEM;
+   goto err;
+   }
 
ret = __hw_ppgtt_init(ppgtt, dev_priv);
-   if (ret) {
-   kfree(ppgtt);
-   return ret;
-   }
+   if (ret)
+   goto err_ppgtt;
 
-   if (ppgtt->base.allocate_va_range)
+   if (ppgtt->base.allocate_va_range) {
ret = ppgtt->base.allocate_va_range(&ppgtt->base, 0,
ppgtt->base.total);
-   if (ret) {
-   ppgtt->base.cleanup(&ppgtt->base);
-   kfree(ppgtt);
-   return ret;
+   if (ret)
+   goto err_ppgtt_cleanup;
}
 
ppgtt->base.clear_range(&ppgtt->base,
@@ -2782,6 +2788,14 @@ int i915_gem_init_ggtt(struct drm_i915_private *dev_priv)
}
 
return 0;
+
+err_ppgtt_cleanup:
+   ppgtt->base.cleanup(&ppgtt->base);
+err_ppgtt:
+   kfree(ppgtt);
+err:
+   drm_mm_remove_node(&ggtt->gpu_error);
+   return ret;
 }
 
 /**
@@ -2801,6 +2815,9 @@ void i915_ggtt_cleanup_hw(struct drm_i915_private 
*dev_priv)
 
i915_gem_cleanup_stolen(&dev_priv->drm);
 
+   if (drm_mm_node_allocated(&ggtt->gpu_error))
+   drm_mm_remove_node(&ggtt->gpu_error);
+
if (drm_mm_initialized(&ggtt->base.mm)) {
intel_vgt_deballoon(dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h 
b/drivers/gpu/drm/i915/i915_gem_gtt.h
index d6e4b6529196..79a08a050487 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -439,6 +439,8 @@ struct i915_ggtt {
bool do_idle_maps;
 
int mtrr;
+
+   struct drm_mm_node gpu_error;
 };
 
 struct i915_hw_ppgtt {
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 0815e5c47431..185adcff0f2d 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -628,7 +628,7 @@ static void i915_error_object_free(struct 
drm_i915_error_object *obj)
return;
 
for (page = 0; page < obj->page_count; page++)
-   kfree(obj->pages[page]);
+   free_page((unsigned long)obj->pages[page]);
 
kfree(obj);
 }
@@ -664,98 +664,69 @@ static void i915_error_state_free(struct kref *error_ref)
kfree(error);
 }
 
+static int compress_page(void *src, struct drm_i915_error_object *dst)
+{
+   unsigned long page;
+
+   page = __get_free_page(GFP_ATOMIC | __GFP_NOWARN);
+   if (!page)
+   re

[Intel-gfx] [PATCH 5/5] drm/i915: Compress GPU objects in error state

2016-08-15 Thread Chris Wilson
Our error states are quickly growing, pinning kernel memory with them.
The majority of the space is taken up by the error objects. These
compress well using zlib and, without decoding, are mostly meaningless, so
encoding them does not hinder quickly scanning the error state for
familiar patterns.

v2: Make the zlib dependency optional

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/Kconfig  |  12 +++
 drivers/gpu/drm/i915/i915_drv.h   |   3 +-
 drivers/gpu/drm/i915/i915_gpu_error.c | 170 +-
 3 files changed, 163 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 0f46a9c04c0e..69657629d750 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -57,6 +57,18 @@ config DRM_I915_CAPTURE_ERROR
 
  If in doubt, say "Y".
 
+config DRM_I915_COMPRESS_ERROR
+   bool "Compress GPU error state"
+   depends on DRM_I915
+   select ZLIB_DEFLATE
+   default y
+   help
+ This option selects ZLIB_DEFLATE if it isn't already
+ selected and causes any error state captured upon a GPU hang
+ to be compressed using zlib.
+
+ If in doubt, say "Y".
+
 config DRM_I915_USERPTR
bool "Always enable userptr support"
depends on DRM_I915
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 52facd4a7179..6bb39301999e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -776,9 +776,10 @@ struct drm_i915_error_state {
u32 semaphore_mboxes[I915_NUM_ENGINES - 1];
 
struct drm_i915_error_object {
-   int page_count;
u64 gtt_offset;
u64 gtt_size;
+   int page_count;
+   int unused;
u32 *pages[0];
} *ringbuffer, *batchbuffer, *wa_batchbuffer, *ctx, *hws_page;
 
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index ae0b98eee9ec..404ae3356beb 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -29,6 +29,7 @@
 
 #include 
 #include 
+#include 
 #include "i915_drv.h"
 
 #ifdef CONFIG_DRM_I915_CAPTURE_ERROR
@@ -175,6 +176,110 @@ static void i915_error_puts(struct 
drm_i915_error_state_buf *e,
 #define err_printf(e, ...) i915_error_printf(e, __VA_ARGS__)
 #define err_puts(e, s) i915_error_puts(e, s)
 
+#ifdef CONFIG_DRM_I915_COMPRESS_ERROR
+
+static bool compress_init(struct z_stream_s *zstream)
+{
+   memset(zstream, 0, sizeof(*zstream));
+
+   zstream->workspace =
+   kmalloc(zlib_deflate_workspacesize(MAX_WBITS, MAX_MEM_LEVEL),
+   GFP_ATOMIC | __GFP_NOWARN);
+   if (!zstream->workspace)
+   return NULL;
+
+   if (zlib_deflateInit(zstream, Z_DEFAULT_COMPRESSION) != Z_OK) {
+   kfree(zstream->workspace);
+   return false;
+   }
+
+   return true;
+}
+
+static int compress_page(struct z_stream_s *zstream,
+void *src,
+struct drm_i915_error_object *dst)
+{
+   zstream->next_in = src;
+   zstream->avail_in = PAGE_SIZE;
+
+   do {
+   if (zstream->avail_out == 0) {
+   unsigned long page;
+
+   page = __get_free_page(GFP_ATOMIC | __GFP_NOWARN);
+   if (!page)
+   return -ENOMEM;
+
+   dst->pages[dst->page_count++] = (void *)page;
+
+   zstream->next_out = (void *)page;
+   zstream->avail_out = PAGE_SIZE;
+   }
+
+   if (zlib_deflate(zstream, Z_SYNC_FLUSH) != Z_OK)
+   return -EIO;
+
+   /* Fallback to uncompressed if we increase size? */
+   if (0 && zstream->total_out > zstream->total_in)
+   return -E2BIG;
+   } while (zstream->avail_in);
+
+   return 0;
+}
+
+static void compress_fini(struct z_stream_s *zstream,
+ struct drm_i915_error_object *dst)
+{
+   if (dst) {
+   zlib_deflate(zstream, Z_FINISH);
+   dst->unused = zstream->avail_out;
+   }
+
+   zlib_deflateEnd(zstream);
+   kfree(zstream->workspace);
+}
+
+static void err_compression_marker(struct drm_i915_error_state_buf *m)
+{
+   err_puts(m, ":");
+}
+
+#else
+
+static bool compress_init(struct z_stream_s *zstream)
+{
+   return true;
+}
+
+static int compress_page(struct z_stream_s *zstream,
+void *src,
+struct drm_i915_error_object *dst)
+{
+   unsigned long page;
+
+   page = __get_free_page(GFP_ATOMIC | __GFP_NOWARN);
+   if (!page)
+   return -ENOMEM;
+
+   dst->pages[dst->page_count++] =
+   memcpy((vo

[Intel-gfx] [PATCH RFC 2/4] drm/i915: IOMMU based SVM implementation v13

2016-08-15 Thread Mika Kuoppala
From: Jesse Barnes 

Use David's new IOMMU layer functions for supporting SVM in i915.

TODO:
  error record collection for failing SVM contexts
  callback handling for fatal faults
  scheduling

v2: integrate David's core IOMMU support
make sure we don't clobber the PASID in the context reg state
v3: fixup for intel-svm.h changes (David)
v4: use fault & halt for now (Jesse)
fix ring free in error path on context alloc (Julia)
v5: update with new callback struct (Jesse)
v6: fix init svm check per new IOMMU code (Jesse)
v7: drop debug code and obsolete i915_svm.c file (Jesse)
v8: fix !CONFIG_INTEL_IOMMU_SVM init stub (Jesse)
v9: update to new execlist and reg handling bits (Jesse)
context teardown fix (lrc deferred alloc vs teardown race?) (Jesse)
check for SVM availability at context create (Jesse)
v10: intel_context_svm_init/fini & rebase
v11: move context specific stuff to i915_gem_context
v12: move addressing to context descriptor
v13: strip out workqueue and mm notifiers

Cc: Daniel Vetter 
Cc: Chris Wilson 
Cc: Joonas Lahtinen 
Cc: David Woodhouse 
Signed-off-by: David Woodhouse  (v3)
Signed-off-by: Jesse Barnes  (v9)
Signed-off-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/i915_drv.h |  32 ++
 drivers/gpu/drm/i915/i915_gem.c |   7 +++
 drivers/gpu/drm/i915/i915_gem_context.c | 104 +---
 drivers/gpu/drm/i915/i915_reg.h |  18 ++
 drivers/gpu/drm/i915/intel_lrc.c|  39 +---
 5 files changed, 167 insertions(+), 33 deletions(-)
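
For readers not familiar with the IOMMU side, the entry point this builds
on is intel_svm_bind_mm() from <linux/intel-svm.h>. A heavily simplified
sketch of how a context could acquire its PASID (not the code from this
patch; error handling and the fault callback are elided, and the helper
name is invented):

	#include <linux/intel-svm.h>

	static int ctx_bind_current_mm(struct drm_i915_private *dev_priv,
				       struct i915_gem_context *ctx)
	{
		int pasid, ret;

		/* Bind the calling task's mm to the GPU; the IOMMU returns a
		 * PASID which is then written into the context descriptor so
		 * the GPU walks the CPU page tables directly.
		 */
		ret = intel_svm_bind_mm(&dev_priv->drm.pdev->dev, &pasid,
					0, NULL);
		if (ret == 0)
			ctx->pasid = pasid;
		return ret;
	}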

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 598e078418e3..64f3f0f18509 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -39,6 +39,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -866,6 +867,8 @@ struct i915_ctx_hang_stats {
  * @remap_slice: l3 row remapping information.
  * @flags: context specific flags:
  * CONTEXT_NO_ZEROMAP: do not allow mapping things to page 0.
+ * CONTEXT_NO_ERROR_CAPTURE: do not capture gpu state on hang.
+ * CONTEXT_SVM: context with 1:1 gpu vs cpu mapping of vm.
  * @file_priv: filp associated with this context (NULL for global default
  *context).
  * @hang_stats: information about the role of this context in possible GPU
@@ -891,6 +894,8 @@ struct i915_gem_context {
unsigned long flags;
 #define CONTEXT_NO_ZEROMAP BIT(0)
 #define CONTEXT_NO_ERROR_CAPTURE   BIT(1)
+#define CONTEXT_SVMBIT(2)
+
unsigned hw_id;
u32 user_handle;
 
@@ -909,6 +914,9 @@ struct i915_gem_context {
struct atomic_notifier_head status_notifier;
bool execlists_force_single_submission;
 
+   u32 pasid; /* svm, 20 bits */
+   struct task_struct *task;
+
struct list_head link;
 
u8 remap_slice;
@@ -2001,6 +2009,8 @@ struct drm_i915_private {
 
struct i915_runtime_pm pm;
 
+   bool svm_available;
+
/* Abstract the submission mechanism (legacy ringbuffer or execlists) 
away */
struct {
void (*cleanup_engine)(struct intel_engine_cs *engine);
@@ -3628,6 +3638,28 @@ extern void intel_set_memory_cxsr(struct 
drm_i915_private *dev_priv,
 int i915_reg_read_ioctl(struct drm_device *dev, void *data,
struct drm_file *file);
 
+/* svm */
+#ifdef CONFIG_INTEL_IOMMU_SVM
+static inline bool intel_init_svm(struct drm_device *dev)
+{
+   struct drm_i915_private *dev_priv = to_i915(dev);
+
+   dev_priv->svm_available = USES_FULL_48BIT_PPGTT(dev_priv) &&
+   intel_svm_available(&dev->pdev->dev);
+
+   return dev_priv->svm_available;
+}
+#else
+static inline bool intel_init_svm(struct drm_device *dev)
+{
+   struct drm_i915_private *dev_priv = to_i915(dev);
+
+   dev_priv->svm_available = false;
+
+   return dev_priv->svm_available;
+}
+#endif
+
 /* overlay */
 extern struct intel_overlay_error_state *
 intel_overlay_capture_error_state(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7e08c774a1aa..45d67b54c018 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4304,6 +4304,13 @@ i915_gem_init_hw(struct drm_device *dev)
}
}
 
+   if (INTEL_GEN(dev) >= 8) {
+   if (intel_init_svm(dev))
+   DRM_DEBUG_DRIVER("Initialized Intel SVM support\n");
+   else
+   DRM_ERROR("Failed to enable Intel SVM support\n");
+   }
+
i915_gem_init_swizzling(dev);
 
/*
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
index 189a6c018b72..9ab6332f296b 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -134,6 +134,47 @@ static int get_context_size(struct drm_i915_private 
*dev_p

[Intel-gfx] [PATCH RFC 0/4] svm support

2016-08-15 Thread Mika Kuoppala
Hi,

Now that fences have been merged I have reworked the series. It is now
much smaller in size. Some items are still missing, such as error state
recording, fault handling, documentation and in-fences.

You can also find the most recent version in here:
https://cgit.freedesktop.org/~miku/drm-intel/log/?h=svm

-Mika

Jesse Barnes (4):
  drm/i915: add create_context2 ioctl
  drm/i915: IOMMU based SVM implementation v13
  drm/i915: add SVM execbuf ioctl v10
  drm/i915: Add param for SVM

 drivers/gpu/drm/i915/Kconfig   |   1 +
 drivers/gpu/drm/i915/i915_drv.c|   5 +
 drivers/gpu/drm/i915/i915_drv.h|  37 +++
 drivers/gpu/drm/i915/i915_gem.c|   7 ++
 drivers/gpu/drm/i915/i915_gem_context.c| 172 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 157 ++
 drivers/gpu/drm/i915/i915_reg.h|  18 +++
 drivers/gpu/drm/i915/intel_lrc.c   |  39 +++
 include/uapi/drm/i915_drm.h|  55 +
 9 files changed, 446 insertions(+), 45 deletions(-)

-- 
2.7.4



[Intel-gfx] [PATCH RFC 1/4] drm/i915: add create_context2 ioctl

2016-08-15 Thread Mika Kuoppala
From: Jesse Barnes 

Add i915_gem_context_create2_ioctl for passing flags
(e.g. SVM) when creating a context.

v2: check the pad on create_context
v3: rebase
v4: i915_dma is no more. create_gvt needs flags

Cc: Daniel Vetter 
Cc: Chris Wilson 
Cc: Joonas Lahtinen 
Signed-off-by: Jesse Barnes  (v1)
Signed-off-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/i915_drv.c |  1 +
 drivers/gpu/drm/i915/i915_drv.h |  2 +
 drivers/gpu/drm/i915/i915_gem_context.c | 70 +++--
 include/uapi/drm/i915_drm.h | 18 +
 4 files changed, 78 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 13ae340ef1f3..9fb6de90eac0 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -2566,6 +2566,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, 
DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, 
i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, 
i915_gem_context_setparam_ioctl, DRM_RENDER_ALLOW),
+   DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_CREATE2, 
i915_gem_context_create2_ioctl, DRM_UNLOCKED),
 };
 
 static struct drm_driver driver = {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 35caa9b2f36a..598e078418e3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3399,6 +3399,8 @@ static inline bool i915_gem_context_is_default(const 
struct i915_gem_context *c)
 
 int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
  struct drm_file *file);
+int i915_gem_context_create2_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file);
 int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file);
 int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c 
b/drivers/gpu/drm/i915/i915_gem_context.c
index 35950ee46a1d..189a6c018b72 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -341,17 +341,21 @@ err_out:
  */
 static struct i915_gem_context *
 i915_gem_create_context(struct drm_device *dev,
-   struct drm_i915_file_private *file_priv)
+   struct drm_i915_file_private *file_priv, u32 flags)
 {
struct i915_gem_context *ctx;
+   bool create_vm = false;
 
lockdep_assert_held(&dev->struct_mutex);
 
+   if (flags & (I915_GEM_CONTEXT_FULL_PPGTT | I915_GEM_CONTEXT_ENABLE_SVM))
+   create_vm = true;
+
ctx = __create_hw_context(dev, file_priv);
if (IS_ERR(ctx))
return ctx;
 
-   if (USES_FULL_PPGTT(dev)) {
+   if (create_vm) {
struct i915_hw_ppgtt *ppgtt =
i915_ppgtt_create(to_i915(dev), file_priv);
 
@@ -394,7 +398,8 @@ i915_gem_context_create_gvt(struct drm_device *dev)
if (ret)
return ERR_PTR(ret);
 
-   ctx = i915_gem_create_context(dev, NULL);
+   ctx = i915_gem_create_context(dev, NULL, USES_FULL_PPGTT(dev) ?
+ I915_GEM_CONTEXT_FULL_PPGTT : 0);
if (IS_ERR(ctx))
goto out;
 
@@ -440,6 +445,7 @@ int i915_gem_context_init(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = to_i915(dev);
struct i915_gem_context *ctx;
+   u32 flags = 0;
 
/* Init should only be called once per module load. Eventually the
 * restriction on the context_disabled check can be loosened. */
@@ -472,7 +478,10 @@ int i915_gem_context_init(struct drm_device *dev)
}
}
 
-   ctx = i915_gem_create_context(dev, NULL);
+   if (USES_FULL_PPGTT(dev))
+   flags |= I915_GEM_CONTEXT_FULL_PPGTT;
+
+   ctx = i915_gem_create_context(dev, NULL, flags);
if (IS_ERR(ctx)) {
DRM_ERROR("Failed to create default global context (error 
%ld)\n",
  PTR_ERR(ctx));
@@ -552,7 +561,8 @@ int i915_gem_context_open(struct drm_device *dev, struct 
drm_file *file)
idr_init(&file_priv->context_idr);
 
mutex_lock(&dev->struct_mutex);
-   ctx = i915_gem_create_context(dev, file_priv);
+   ctx = i915_gem_create_context(dev, file_priv, USES_FULL_PPGTT(dev) ?
+ I915_GEM_CONTEXT_FULL_PPGTT : 0);
mutex_unlock(&dev->struct_mutex);
 
if (IS_ERR(ctx)) {
@@ -974,32 +984,66 @@ static bool contexts_enabled(struct drm_device *dev)
return i915.enable_execlists || to_i915(dev)->hw_context_size;
 }
 
-int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
- struct drm_f

[Intel-gfx] [PATCH RFC 4/4] drm/i915: Add param for SVM

2016-08-15 Thread Mika Kuoppala
From: Jesse Barnes 

Add the possibility to query whether SVM is available.

v2: moved into i915_drv.c

Cc: Daniel Vetter 
Cc: Chris Wilson 
Cc: Joonas Lahtinen 
Signed-off-by: Jesse Barnes  (v1)
Signed-off-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/i915_drv.c | 3 +++
 include/uapi/drm/i915_drm.h | 1 +
 2 files changed, 4 insertions(+)
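
On the userspace side this is consumed through the standard getparam
path; a small sketch (only I915_PARAM_HAS_SVM comes from this patch, the
helper itself is hypothetical):

	#include <stdbool.h>
	#include <xf86drm.h>
	#include <i915_drm.h>

	static bool svm_available(int fd)
	{
		int value = 0;
		struct drm_i915_getparam gp = {
			.param = I915_PARAM_HAS_SVM,
			.value = &value,
		};

		/* Older kernels return an error for unknown params. */
		if (drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
			return false;

		return value != 0;
	}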

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index a07918d821e4..6d9c84253412 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -354,6 +354,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
case I915_PARAM_MIN_EU_IN_POOL:
value = INTEL_INFO(dev)->min_eu_in_pool;
break;
+   case I915_PARAM_HAS_SVM:
+   value = dev_priv->svm_available;
+   break;
default:
DRM_DEBUG("Unknown parameter %d\n", param->param);
return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 8d567744f221..c21ba4b769c4 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -391,6 +391,7 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_EXEC_SOFTPIN 37
 #define I915_PARAM_HAS_POOLED_EU38
 #define I915_PARAM_MIN_EU_IN_POOL   39
+#define I915_PARAM_HAS_SVM  40
 
 typedef struct drm_i915_getparam {
__s32 param;
-- 
2.7.4



[Intel-gfx] [PATCH RFC 3/4] drm/i915: add SVM execbuf ioctl v10

2016-08-15 Thread Mika Kuoppala
From: Jesse Barnes 

We just need to pass in an address to execute and some flags, since we
don't have to worry about buffer relocation or any of the other usual
stuff.  Returns a fence to be used for synchronization.

v2: add a request after batch submission (Jesse)
v3: add a flag for fence creation (Chris)
v4: add CLOEXEC flag (Kristian)
add non-RCS ring support (Jesse)
v5: update for request alloc change (Jesse)
v6: new sync file interface, error paths, request breadcrumbs
v7: always CLOEXEC for sync_file_install
v8: rebase on new sync file api
v9: rework on top of fence requests and sync_file
v10: take fence ref for sync_file (Chris)
 use correct flush (Chris)
 limit exec on rcs

Cc: Daniel Vetter 
Cc: Chris Wilson 
Cc: Joonas Lahtinen 
Signed-off-by: Jesse Barnes  (v5)
Signed-off-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/Kconfig   |   1 +
 drivers/gpu/drm/i915/i915_drv.c|   1 +
 drivers/gpu/drm/i915/i915_drv.h|   3 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 157 +
 include/uapi/drm/i915_drm.h|  36 +++
 5 files changed, 198 insertions(+)
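
To make the uapi shape concrete (the i915_drm.h hunk is truncated below),
a hypothetical userspace invocation could look like this; the ioctl macro
name and exact struct layout are assumptions inferred from the kernel
code in this patch, not confirmed by it:

	struct drm_i915_exec_mm exec = {
		.batch_ptr = (__u64)(uintptr_t)batch, /* plain CPU pointer, no BO */
		.ring_id   = I915_EXEC_RENDER,
		.flags     = I915_EXEC_MM_FENCE,      /* request a sync_file out-fence */
	};

	if (drmIoctl(fd, DRM_IOCTL_I915_EXEC_MM, &exec) == 0) {
		/* exec.fence now holds a sync_file fd; poll it, then close. */
		close(exec.fence);
	}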

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 7769e469118f..6503133c3f85 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -8,6 +8,7 @@ config DRM_I915
# the shmem_readpage() which depends upon tmpfs
select SHMEM
select TMPFS
+   select SYNC_FILE
select DRM_KMS_HELPER
select DRM_PANEL
select DRM_MIPI_DSI
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 9fb6de90eac0..a07918d821e4 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -2567,6 +2567,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, 
i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, 
i915_gem_context_setparam_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_CREATE2, 
i915_gem_context_create2_ioctl, DRM_UNLOCKED),
+   DRM_IOCTL_DEF_DRV(I915_EXEC_MM, intel_exec_mm_ioctl, DRM_UNLOCKED),
 };
 
 static struct drm_driver driver = {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 64f3f0f18509..884d9844863c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3660,6 +3660,9 @@ static inline bool intel_init_svm(struct drm_device *dev)
 }
 #endif
 
+extern int intel_exec_mm_ioctl(struct drm_device *dev, void *data,
+  struct drm_file *file);
+
 /* overlay */
 extern struct intel_overlay_error_state *
 intel_overlay_capture_error_state(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 699315304748..c1ba6da1fd33 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1911,3 +1912,159 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
drm_free_large(exec2_list);
return ret;
 }
+
+static struct intel_engine_cs *
+exec_mm_select_engine(struct drm_i915_private *dev_priv,
+ struct drm_i915_exec_mm *exec_mm)
+{
+   unsigned int user_ring_id = exec_mm->ring_id & I915_EXEC_RING_MASK;
+   struct intel_engine_cs *e;
+
+   if (user_ring_id > I915_USER_RINGS) {
+   DRM_DEBUG("exec_mm with unknown ring: %u\n", user_ring_id);
+   return NULL;
+   }
+
+   e = &dev_priv->engine[user_ring_map[user_ring_id]];
+
+   if (!intel_engine_initialized(e)) {
+   DRM_DEBUG("exec_mm with invalid ring: %u\n", user_ring_id);
+   return NULL;
+   }
+
+   return e;
+}
+
+static int do_exec_mm(struct drm_i915_exec_mm *exec_mm,
+ struct drm_i915_gem_request *req,
+ const u32 flags)
+{
+   const bool create_fence = flags & I915_EXEC_MM_FENCE;
+   struct sync_file *out_fence;
+   int ret;
+
+   if (create_fence) {
+   out_fence = sync_file_create(fence_get(&req->fence));
+   if (!out_fence) {
+   DRM_DEBUG("sync file creation failed\n");
+   return ret;
+   }
+
+   exec_mm->fence = get_unused_fd_flags(O_CLOEXEC);
+   fd_install(exec_mm->fence, out_fence->file);
+   }
+
+   ret = req->engine->emit_flush(req, EMIT_INVALIDATE);
+   if (ret) {
+   DRM_DEBUG_DRIVER("engine flush failed: %d\n", ret);
+   goto fput;
+   }
+
+   ret = req->engine->emit_bb_start(req, exec_mm->batch_ptr, 0, 0);
+   if (ret) {
+   DRM_DEBUG_DRIVER("engine dispatch execbuf failed: %d\n", ret);
+ 

[Intel-gfx] [drm-intel:drm-intel-next-queued 7/33] drivers/gpu/drm/i915/i915_debugfs.c:392: error: 'mapped_count' may be used uninitialized in this function

2016-08-15 Thread kbuild test robot
tree:   git://anongit.freedesktop.org/drm-intel drm-intel-next-queued
head:   21a2c58a9c122151080ecbdddc115257cd7c30d8
commit: 2bd160a131ac617fc2441bfb4a02964c964a5da6 [7/33] drm/i915: Reduce 
i915_gem_objects to only show object information
config: x86_64-randconfig-s2-08151903 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
git checkout 2bd160a131ac617fc2441bfb4a02964c964a5da6
# save the attached .config to linux build tree
make ARCH=x86_64 

Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings

All errors (new ones prefixed by >>):

   cc1: warnings being treated as errors
   drivers/gpu/drm/i915/i915_debugfs.c: In function 'i915_gem_object_info':
>> drivers/gpu/drm/i915/i915_debugfs.c:392: error: 'mapped_count' may be used 
>> uninitialized in this function
>> drivers/gpu/drm/i915/i915_debugfs.c:393: error: 'mapped_size' may be used 
>> uninitialized in this function

vim +/mapped_count +392 drivers/gpu/drm/i915/i915_debugfs.c

   386  static int i915_gem_object_info(struct seq_file *m, void* data)
   387  {
   388  struct drm_info_node *node = m->private;
   389  struct drm_device *dev = node->minor->dev;
   390  struct drm_i915_private *dev_priv = to_i915(dev);
   391  struct i915_ggtt *ggtt = &dev_priv->ggtt;
 > 392  u32 count, mapped_count, purgeable_count, dpy_count;
 > 393  u64 size, mapped_size, purgeable_size, dpy_size;
   394  struct drm_i915_gem_object *obj;
   395  struct drm_file *file;
   396  int ret;

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation




[Intel-gfx] [PATCH 2/2] drm/i915: Use remap_io_mapping() to prefault all PTE in a single pass

2016-08-15 Thread Chris Wilson
Very old numbers indicate this is a 66% improvement when remapping the
entire object for fence contention (due to the elimination of
track_pfn_insert and its strcmp).

Signed-off-by: Chris Wilson 
Testcase: igt/gem_fence_upload/performance
Testcase: igt/gem_mmap_gtt
---
 drivers/gpu/drm/Makefile|  2 +-
 drivers/gpu/drm/i915/Makefile   |  3 +-
 drivers/gpu/drm/i915/i915_drv.h |  5 +++
 drivers/gpu/drm/i915/i915_gem.c | 50 
 drivers/gpu/drm/i915/i915_mm.c  | 85 +
 5 files changed, 100 insertions(+), 45 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_mm.c
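
For reference, the new fault path collapses the per-page vm_insert_pfn()
loop into a single call, roughly of this shape (a sketch; the truncated
hunk below has the exact arguments):

	/* Fill every PTE covering the user's GTT mmap in one pass. */
	ret = remap_io_mapping(area, area->vm_start,
			       (ggtt->mappable_base + vma->node.start) >> PAGE_SHIFT,
			       area->vm_end - area->vm_start,
			       &ggtt->mappable);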

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 0238bf8bc8c3..3ff094171ee5 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -46,7 +46,7 @@ obj-$(CONFIG_DRM_RADEON)+= radeon/
 obj-$(CONFIG_DRM_AMDGPU)+= amd/amdgpu/
 obj-$(CONFIG_DRM_MGA)  += mga/
 obj-$(CONFIG_DRM_I810) += i810/
-obj-$(CONFIG_DRM_I915)  += i915/
+obj-$(CONFIG_DRM_I915) += i915/
 obj-$(CONFIG_DRM_MGAG200) += mgag200/
 obj-$(CONFIG_DRM_VC4)  += vc4/
 obj-$(CONFIG_DRM_CIRRUS_QEMU) += cirrus/
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 3412413408c0..a7da24640e88 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -12,6 +12,7 @@ subdir-ccflags-y += \
 i915-y := i915_drv.o \
  i915_irq.o \
  i915_memcpy.o \
+ i915_mm.o \
  i915_params.o \
  i915_pci.o \
   i915_suspend.o \
@@ -113,6 +114,6 @@ i915-y += intel_gvt.o
 include $(src)/gvt/Makefile
 endif
 
-obj-$(CONFIG_DRM_I915)  += i915.o
+obj-$(CONFIG_DRM_I915) += i915.o
 
 CFLAGS_i915_trace_points.o := -I$(src)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 05efc0501f3c..0f25302fb517 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3936,6 +3936,11 @@ static inline bool __i915_request_irq_complete(struct 
drm_i915_gem_request *req)
 void i915_memcpy_init_early(struct drm_i915_private *dev_priv);
 bool i915_memcpy_from_wc(void *dst, const void *src, unsigned long len);
 
+/* i915_mm.c */
+int remap_io_mapping(struct vm_area_struct *vma,
+unsigned long addr, unsigned long pfn, unsigned long size,
+struct io_mapping *iomap);
+
 #define ptr_unpack_bits(ptr, bits) ({  \
unsigned long __v = (unsigned long)(ptr);   \
(bits) = __v & ~PAGE_MASK;  \
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f12114a35ae3..584144d5d8ea 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1698,7 +1698,6 @@ int i915_gem_fault(struct vm_area_struct *area, struct 
vm_fault *vmf)
bool write = !!(vmf->flags & FAULT_FLAG_WRITE);
struct i915_vma *vma;
pgoff_t page_offset;
-   unsigned long pfn;
unsigned int flags;
int ret;
 
@@ -1768,48 +1767,13 @@ int i915_gem_fault(struct vm_area_struct *area, struct 
vm_fault *vmf)
goto err_unpin;
 
/* Finally, remap it using the new GTT offset */
-   pfn = ggtt->mappable_base + i915_ggtt_offset(vma);
-   pfn >>= PAGE_SHIFT;
-
-   if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) {
-   if (!obj->fault_mappable) {
-   unsigned long size =
-   min_t(unsigned long,
- area->vm_end - area->vm_start,
- obj->base.size) >> PAGE_SHIFT;
-   unsigned long base = area->vm_start;
-   int i;
-
-   for (i = 0; i < size; i++) {
-   ret = vm_insert_pfn(area,
-   base + i * PAGE_SIZE,
-   pfn + i);
-   if (ret)
-   break;
-   }
-   } else
-   ret = vm_insert_pfn(area,
-   (unsigned long)vmf->virtual_address,
-   pfn + page_offset);
-   } else {
-   /* Overriding existing pages in partial view does not cause
-* us any trouble as TLBs are still valid because the fault
-* is due to userspace losing part of the mapping or never
-* having accessed it before (at this partials' range).
-*/
-   const struct i915_ggtt_view *view = &vma->ggtt_view;
-   unsigned long base = area->vm_start +
-   (view->params.partial.offset << PAGE_SHIFT);
-   unsigned int i;
-
-   for (i = 0; i < view->params.partial.size; i++) {
-  

[Intel-gfx] [PATCH 1/2] io-mapping: Always create a struct to hold metadata about the io-mapping

2016-08-15 Thread Chris Wilson
Currently, we only allocate a structure to hold metadata if we need to
allocate an ioremap for every access, such as on x86-32. However, it
would be useful to store basic information about the io-mapping, such as
its page protection, on all platforms.

Signed-off-by: Chris Wilson 
Cc: linux...@kvack.org
---
 drivers/gpu/drm/i915/i915_gem.c|  6 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c| 11 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.h|  2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c  |  2 +-
 drivers/gpu/drm/i915/intel_overlay.c   |  4 +-
 include/linux/io-mapping.h | 92 ++
 7 files changed, 70 insertions(+), 49 deletions(-)
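
In short, the io-mapping API gains an init/fini pair for caller-owned
storage; a condensed sketch of the usage pattern the i915 hunks below
switch to (illustrative, combining calls from several hunks):

	struct io_mapping iomap;
	void __iomem *vaddr;

	/* Initialise in place instead of allocating via io_mapping_create_wc(). */
	if (!io_mapping_init_wc(&iomap, base, size))
		return -EIO;

	vaddr = io_mapping_map_wc(&iomap, offset, length);
	/* ... use the mapping ... */
	io_mapping_unmap(vaddr);

	io_mapping_fini(&iomap);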

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f5a7c7ffb1a5..f12114a35ae3 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -888,7 +888,7 @@ i915_gem_gtt_pread(struct drm_device *dev,
 * and write to user memory which may result into page
 * faults, and so we cannot perform this under struct_mutex.
 */
-   if (slow_user_access(ggtt->mappable, page_base,
+   if (slow_user_access(&ggtt->mappable, page_base,
 page_offset, user_data,
 page_length, false)) {
ret = -EFAULT;
@@ -1181,11 +1181,11 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915,
 * If the object is non-shmem backed, we retry again with the
 * path that handles page fault.
 */
-   if (fast_user_write(ggtt->mappable, page_base,
+   if (fast_user_write(&ggtt->mappable, page_base,
page_offset, user_data, page_length)) {
hit_slow_path = true;
mutex_unlock(&dev->struct_mutex);
-   if (slow_user_access(ggtt->mappable,
+   if (slow_user_access(&ggtt->mappable,
 page_base,
 page_offset, user_data,
 page_length, true)) {
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index c012a0d94878..e6f88f3194d6 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -470,7 +470,7 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj,
offset += page << PAGE_SHIFT;
}
 
-   vaddr = io_mapping_map_atomic_wc(cache->i915->ggtt.mappable, offset);
+   vaddr = io_mapping_map_atomic_wc(&cache->i915->ggtt.mappable, offset);
cache->page = page;
cache->vaddr = (unsigned long)vaddr;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index d03f9180ce76..3a82c97d5d53 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2808,7 +2808,6 @@ void i915_ggtt_cleanup_hw(struct drm_i915_private 
*dev_priv)
 
if (dev_priv->mm.aliasing_ppgtt) {
struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
-
ppgtt->base.cleanup(&ppgtt->base);
kfree(ppgtt);
}
@@ -2828,7 +2827,7 @@ void i915_ggtt_cleanup_hw(struct drm_i915_private 
*dev_priv)
ggtt->base.cleanup(&ggtt->base);
 
arch_phys_wc_del(ggtt->mtrr);
-   io_mapping_free(ggtt->mappable);
+   io_mapping_fini(&ggtt->mappable);
 }
 
 static unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
@@ -3226,9 +3225,9 @@ int i915_ggtt_init_hw(struct drm_i915_private *dev_priv)
if (!HAS_LLC(dev_priv))
ggtt->base.mm.color_adjust = i915_gtt_color_adjust;
 
-   ggtt->mappable =
-   io_mapping_create_wc(ggtt->mappable_base, ggtt->mappable_end);
-   if (!ggtt->mappable) {
+   if (!io_mapping_init_wc(&dev_priv->ggtt.mappable,
+   dev_priv->ggtt.mappable_base,
+   dev_priv->ggtt.mappable_end)) {
ret = -EIO;
goto out_gtt_cleanup;
}
@@ -3698,7 +3697,7 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
 
ptr = vma->iomap;
if (ptr == NULL) {
-   ptr = io_mapping_map_wc(i915_vm_to_ggtt(vma->vm)->mappable,
+   ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->mappable,
vma->node.start,
vma->node.size);
if (ptr == NULL)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h 
b/drivers/gpu/drm/i915/i915_gem_gtt.h
index d2f79a1fb75f..f8d68d775896 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -438,13 +438,13 @@ stru

Re: [Intel-gfx] [PATCH RFC 1/4] drm/i915: add create_context2 ioctl

2016-08-15 Thread Chris Wilson
On Mon, Aug 15, 2016 at 02:48:04PM +0300, Mika Kuoppala wrote:
> From: Jesse Barnes 
> 
> Add i915_gem_context_create2_ioctl for passing flags
> (e.g. SVM) when creating a context.
> 
> v2: check the pad on create_context
> v3: rebase
> v4: i915_dma is no more. create_gvt needs flags
> 
> Cc: Daniel Vetter 
> Cc: Chris Wilson 
> Cc: Joonas Lahtinen 
> Signed-off-by: Jesse Barnes  (v1)
> Signed-off-by: Mika Kuoppala 

Considering we can use deferred ppgtt creation and have setparam, do we
need a new create ioctl just to set a flag?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH 1/9] drm/i915/cmdparser: Make initialisation failure non-fatal

2016-08-15 Thread Matthew Auld
On 12 August 2016 at 16:07, Chris Wilson  wrote:
> If the developer adds a register in the wrong order, we BUG during boot.
> That makes development and testing very difficult. Let's be a bit more
> friendly and disable the command parser with a big warning if the tables
> are invalid.
>
> Signed-off-by: Chris Wilson 
> ---
>  drivers/gpu/drm/i915/i915_cmd_parser.c | 30 ++
>  drivers/gpu/drm/i915/i915_drv.h|  2 +-
>  drivers/gpu/drm/i915/intel_engine_cs.c |  6 --
>  3 files changed, 23 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
> b/drivers/gpu/drm/i915/i915_cmd_parser.c
> index a1f4683f5c35..1882dc28c750 100644
> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> @@ -746,17 +746,15 @@ static void fini_hash_table(struct intel_engine_cs 
> *engine)
>   * Optionally initializes fields related to batch buffer command parsing in 
> the
>   * struct intel_engine_cs based on whether the platform requires software
>   * command parsing.
> - *
> - * Return: non-zero if initialization fails
>   */
> -int intel_engine_init_cmd_parser(struct intel_engine_cs *engine)
> +void intel_engine_init_cmd_parser(struct intel_engine_cs *engine)
>  {
> const struct drm_i915_cmd_table *cmd_tables;
> int cmd_table_count;
> int ret;
>
> if (!IS_GEN7(engine->i915))
> -   return 0;
> +   return;
>
> switch (engine->id) {
> case RCS:
> @@ -811,24 +809,32 @@ int intel_engine_init_cmd_parser(struct intel_engine_cs 
> *engine)
> break;
> default:
> MISSING_CASE(engine->id);
> -   BUG();
> +   return;
> }
>
> -   BUG_ON(!validate_cmds_sorted(engine, cmd_tables, cmd_table_count));
> -   BUG_ON(!validate_regs_sorted(engine));
> +   if (!hash_empty(engine->cmd_hash)) {
> +   DRM_DEBUG_DRIVER("%s: no commands?\n", engine->name);
> +   return;
> +   }
"no commands?", !hash_empty should mean we already have commands, not
that we don't, right?

With that explained or fixed:
Reviewed-by: Matthew Auld 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ Ro.CI.BAT: failure for series starting with [1/5] drm/i915: Allow disabling error capture

2016-08-15 Thread Patchwork
== Series Details ==

Series: series starting with [1/5] drm/i915: Allow disabling error capture
URL   : https://patchwork.freedesktop.org/series/11096/
State : failure

== Summary ==

Series 11096v1 Series without cover letter
http://patchwork.freedesktop.org/api/1.0/series/11096/revisions/1/mbox

Test drv_module_reload_basic:
pass   -> SKIP   (ro-ivb-i7-3770)
Test kms_cursor_legacy:
Subgroup basic-cursor-vs-flip-varying-size:
fail   -> PASS   (ro-ilk1-i5-650)
Subgroup basic-flip-vs-cursor-legacy:
fail   -> PASS   (ro-ivb2-i7-3770)
fail   -> PASS   (ro-byt-n2820)
pass   -> FAIL   (ro-bdw-i5-5250u)
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-a:
pass   -> INCOMPLETE (fi-hsw-i7-4770k)

fi-hsw-i7-4770k  total:201  pass:180  dwarn:0   dfail:0   fail:0   skip:20 
fi-kbl-qkkr  total:244  pass:185  dwarn:29  dfail:0   fail:3   skip:27 
fi-skl-i7-6700k  total:244  pass:208  dwarn:4   dfail:2   fail:2   skip:28 
fi-snb-i7-2600   total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42 
ro-bdw-i5-5250u  total:240  pass:218  dwarn:1   dfail:0   fail:2   skip:19 
ro-bdw-i7-5600u  total:240  pass:207  dwarn:0   dfail:0   fail:1   skip:32 
ro-bsw-n3050 total:240  pass:194  dwarn:0   dfail:0   fail:4   skip:42 
ro-byt-n2820 total:240  pass:198  dwarn:0   dfail:0   fail:2   skip:40 
ro-hsw-i3-4010u  total:240  pass:214  dwarn:0   dfail:0   fail:0   skip:26 
ro-hsw-i7-4770r  total:240  pass:185  dwarn:0   dfail:0   fail:0   skip:55 
ro-ilk1-i5-650   total:235  pass:174  dwarn:0   dfail:0   fail:1   skip:60 
ro-ivb-i7-3770   total:240  pass:204  dwarn:0   dfail:0   fail:0   skip:36 
ro-ivb2-i7-3770  total:240  pass:209  dwarn:0   dfail:0   fail:0   skip:31 
ro-skl3-i5-6260u total:240  pass:223  dwarn:0   dfail:0   fail:3   skip:14 

Results at /archive/results/CI_IGT_test/RO_Patchwork_1869/

e56d79f drm-intel-nightly: 2016y-08m-15d-10h-16m-44s UTC integration manifest
b8c5ad5 drm/i915: Compress GPU objects in error state
7d5601b drm/i915: Consolidate error object printing
716ad20 drm/i915: Always use the GTT for error capture
5f76742 drm/i915: Stop the machine whilst capturing the GPU crash dump
5476ea87 drm/i915: Allow disabling error capture

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH RFC 1/4] drm/i915: add create_context2 ioctl

2016-08-15 Thread Joonas Lahtinen
On ma, 2016-08-15 at 14:48 +0300, Mika Kuoppala wrote:
> @@ -2566,6 +2566,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
>   DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, 
> DRM_RENDER_ALLOW),
>   DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, 
> i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW),
>   DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, 
> i915_gem_context_setparam_ioctl, DRM_RENDER_ALLOW),
> + DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_CREATE2, 
> i915_gem_context_create2_ioctl, DRM_UNLOCKED),

Why DRM_UNLOCKED?

> @@ -394,7 +398,8 @@ i915_gem_context_create_gvt(struct drm_device *dev)
>   if (ret)
>   return ERR_PTR(ret);
>  
> - ctx = i915_gem_create_context(dev, NULL);
> + ctx = i915_gem_create_context(dev, NULL, USES_FULL_PPGTT(dev) ?
> +   I915_GEM_CONTEXT_FULL_PPGTT : 0);

Could use a flags variable here, just like below this point in the code.

> @@ -552,7 +561,8 @@ int i915_gem_context_open(struct drm_device *dev, struct 
> drm_file *file)
>   idr_init(&file_priv->context_idr);
>  
>   mutex_lock(&dev->struct_mutex);
> - ctx = i915_gem_create_context(dev, file_priv);
> + ctx = i915_gem_create_context(dev, file_priv, USES_FULL_PPGTT(dev) ?
> +   I915_GEM_CONTEXT_FULL_PPGTT : 0);

Ditto.
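Something along these lines, purely illustrative:

        unsigned int flags = 0;

        if (USES_FULL_PPGTT(dev))
                flags |= I915_GEM_CONTEXT_FULL_PPGTT;

        ctx = i915_gem_create_context(dev, file_priv, flags);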

> +int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
> +   struct drm_file *file)
> +{
> + struct drm_i915_gem_context_create *args = data;
> + struct drm_i915_gem_context_create2 tmp;

'args2' just as we have create2?

> @@ -1142,6 +1144,22 @@ struct drm_i915_gem_context_create {
>   __u32 pad;
>  };
>  
> +/*
> + * SVM handling
> + *
> + * A context can opt in to SVM support (thereby using its CPU page tables
> + * when accessing data from the GPU) by using the %I915_ENABLE_SVM flag

s/I915_ENABLE_SVM/I915_GEM_CONTEXT_ENABLE_SVM/ ?

> + * and passing an existing context id.  This is a one way transition; SVM
> + * contexts can not be downgraded into PPGTT contexts once converted.
> + */
> +#define I915_GEM_CONTEXT_ENABLE_SVM  (1<<0)
> +#define I915_GEM_CONTEXT_FULL_PPGTT  (1<<1)

BIT()

With the above addressed;

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH RFC 2/4] drm/i915: IOMMU based SVM implementation v13

2016-08-15 Thread Chris Wilson
On Mon, Aug 15, 2016 at 02:48:05PM +0300, Mika Kuoppala wrote:
> @@ -891,6 +894,8 @@ struct i915_gem_context {
>   unsigned long flags;
>  #define CONTEXT_NO_ZEROMAP   BIT(0)
>  #define CONTEXT_NO_ERROR_CAPTURE BIT(1)
> +#define CONTEXT_SVM  BIT(2)
> +
>   unsigned hw_id;
>   u32 user_handle;
>  
> @@ -909,6 +914,9 @@ struct i915_gem_context {
>   struct atomic_notifier_head status_notifier;
>   bool execlists_force_single_submission;
>  
> + u32 pasid; /* svm, 20 bits */

Doesn't this conflict with hw_id for execlists?

> + struct task_struct *task;

We don't need the task, we need the mm.

Holding the task is not sufficient.

>   struct list_head link;
>  
>   u8 remap_slice;
> @@ -2001,6 +2009,8 @@ struct drm_i915_private {
>  
>   struct i915_runtime_pm pm;
>  
> + bool svm_available;

No better home / community?

> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 7e08c774a1aa..45d67b54c018 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4304,6 +4304,13 @@ i915_gem_init_hw(struct drm_device *dev)
>   }
>   }
>  
> + if (INTEL_GEN(dev) >= 8) {
> + if (intel_init_svm(dev))

init_hw?

This looks more like one-off early driver init.

> + DRM_DEBUG_DRIVER("Initialized Intel SVM support\n");
> + else
> + DRM_ERROR("Failed to enable Intel SVM support\n");
> + }
> +

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH RFC 2/4] drm/i915: IOMMU based SVM implementation v13

2016-08-15 Thread David Woodhouse
On Mon, 2016-08-15 at 14:48 +0300, Mika Kuoppala wrote:
> 
>  
> +static void i915_svm_fault_cb(struct device *dev, int pasid, u64 addr,
> + u32 private, int rwxp, int response)
> +{
> +}
> +
> +static struct svm_dev_ops i915_svm_ops = {
> +   .fault_cb = i915_svm_fault_cb,
> +};
> +

I'd prefer that you don't hook this up unless you need it.

I'd also prefer that you don't need it.

If you need it, nail a hardware designer to a tree before you hook it
up.

-- 
dwmw2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH RFC 3/4] drm/i915: add SVM execbuf ioctl v10

2016-08-15 Thread Chris Wilson
On Mon, Aug 15, 2016 at 02:48:06PM +0300, Mika Kuoppala wrote:
> From: Jesse Barnes 
> 
> We just need to pass in an address to execute and some flags, since we
> don't have to worry about buffer relocation or any of the other usual
> stuff.  Returns a fence to be used for synchronization.
> 
> v2: add a request after batch submission (Jesse)
> v3: add a flag for fence creation (Chris)
> v4: add CLOEXEC flag (Kristian)
> add non-RCS ring support (Jesse)
> v5: update for request alloc change (Jesse)
> v6: new sync file interface, error paths, request breadcrumbs
> v7: always CLOEXEC for sync_file_install
> v8: rebase on new sync file api
> v9: rework on top of fence requests and sync_file
> v10: take fence ref for sync_file (Chris)
>  use correct flush (Chris)
>  limit exec on rcs

This is incomplete, so just a proof of principle?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH RFC 4/4] drm/i915: Add param for SVM

2016-08-15 Thread Chris Wilson
On Mon, Aug 15, 2016 at 02:48:07PM +0300, Mika Kuoppala wrote:
> From: Jesse Barnes 
> 
> Add possibility to query if svm is available.

When we try to enable SVM on the context we get an error. We have to do
that first anyway, so that seems like a good spot for userspace to catch
all issues.
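Roughly, from userspace (I915_CONTEXT_PARAM_SVM below is a made-up name,
only there to illustrate the probe):

#include <stdbool.h>
#include <stdint.h>
#include <xf86drm.h>
#include <i915_drm.h>

/* Probe by trying to flip a fresh context to SVM; failure means "not
 * supported" (or the context has already been used). */
static bool probe_svm(int fd, uint32_t ctx_id)
{
        struct drm_i915_gem_context_param arg = {
                .ctx_id = ctx_id,
                .param = I915_CONTEXT_PARAM_SVM, /* hypothetical */
                .value = 1,
        };

        return drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &arg) == 0;
}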

What usecase do you have that doesn't involve creating an SVM context?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH RFC 2/4] drm/i915: IOMMU based SVM implementation v13

2016-08-15 Thread David Woodhouse
On Mon, 2016-08-15 at 13:05 +0100, Chris Wilson wrote:
> On Mon, Aug 15, 2016 at 02:48:05PM +0300, Mika Kuoppala wrote:
> > 
> > +   struct task_struct *task;
> 
> We don't need the task, we need the mm.
> 
> Holding the task is not sufficient.

From the pure DMA point of view, you don't need the MM at all. I handle
all that from the IOMMU side so it's none of your business, darling.

However, if you want to relate a given context to the specific thread
which started it, perhaps to deliver signals or whatever else, then
perhaps you do want the task not the MM.

> 
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -4304,6 +4304,13 @@ i915_gem_init_hw(struct drm_device *dev)
> > }
> > }
> >  
> > +   if (INTEL_GEN(dev) >= 8) {
> > +   if (intel_init_svm(dev))
> 
> init_hw ?
> 
> This looks more like one off early driver init.

It's a per-device thing. You might support SVM on one device but not
another, depending on how the IOMMU is configured. Note the 'dev'
argument in the call to intel_init_svm().


-- 
dwmw2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH] drm/i915: Initialise mmaped_count for i915_gem_object_info

2016-08-15 Thread Chris Wilson
Reported-by: 0day kbuild test robot
Fixes: 2bd160a131ac ("drm/i915: Reduce i915_gem_objects to only show...")
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index f612d3f18c69..81fabc36ce5a 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -403,7 +403,9 @@ static int i915_gem_object_info(struct seq_file *m, void* 
data)
   dev_priv->mm.object_count,
   dev_priv->mm.object_memory);
 
-   size = count = purgeable_size = purgeable_count = 0;
+   size = count = 0;
+   mapped_size = mapped_count = 0;
+   purgeable_size = purgeable_count = 0;
list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) {
size += obj->base.size;
++count;
-- 
2.8.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH RFC 4/4] drm/i915: Add param for SVM

2016-08-15 Thread Mika Kuoppala
Chris Wilson  writes:

> On Mon, Aug 15, 2016 at 02:48:07PM +0300, Mika Kuoppala wrote:
>> From: Jesse Barnes 
>> 
>> Add possibility to query if svm is available.
>
> When we try to enable SVM on the context we get an error. We have to do
> that first anyway, so that seems like a good spot for userspace to catch
> all issues.
>
> What usecase do you have that doesn't involve creating an SVM context?

I can't think of any, so this patch seems superfluous.

-Mika


> -Chris
>
> -- 
> Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH RFC 2/4] drm/i915: IOMMU based SVM implementation v13

2016-08-15 Thread Chris Wilson
On Mon, Aug 15, 2016 at 01:13:25PM +0100, David Woodhouse wrote:
> On Mon, 2016-08-15 at 13:05 +0100, Chris Wilson wrote:
> > On Mon, Aug 15, 2016 at 02:48:05PM +0300, Mika Kuoppala wrote:
> > > 
> > > + struct task_struct *task;
> > 
> > We don't need the task, we need the mm.
> > 
> > Holding the task is not sufficient.
> 
> From the pure DMA point of view, you don't need the MM at all. I handle
> all that from the IOMMU side so it's none of your business, darling.

But you don't keep the mm alive for the duration of device activity,
right? And you don't wait for the device to finish before releasing the
mmu? (iiuc intel-svm.c)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ Ro.CI.BAT: failure for svm support

2016-08-15 Thread Patchwork
== Series Details ==

Series: svm support
URL   : https://patchwork.freedesktop.org/series/11097/
State : failure

== Summary ==

Series 11097v1 svm support
http://patchwork.freedesktop.org/api/1.0/series/11097/revisions/1/mbox

Test drv_hangman:
Subgroup error-state-basic:
pass   -> DMESG-WARN (ro-bdw-i7-5600u)
pass   -> DMESG-WARN (ro-bdw-i5-5250u)
pass   -> DMESG-WARN (ro-skl3-i5-6260u)
pass   -> DMESG-WARN (fi-skl-i7-6700k)
Test drv_module_reload_basic:
pass   -> DMESG-WARN (ro-bdw-i7-5600u)
pass   -> DMESG-WARN (ro-bdw-i5-5250u)
pass   -> DMESG-WARN (ro-skl3-i5-6260u)
pass   -> DMESG-WARN (fi-skl-i7-6700k)
Test gem_exec_suspend:
Subgroup basic-s3:
pass   -> DMESG-WARN (ro-bdw-i7-5600u)
pass   -> DMESG-WARN (ro-skl3-i5-6260u)
Test gem_ringfill:
Subgroup basic-default-hang:
pass   -> DMESG-WARN (ro-bdw-i7-5600u)
pass   -> DMESG-WARN (ro-bdw-i5-5250u)
pass   -> DMESG-WARN (ro-skl3-i5-6260u)
pass   -> DMESG-WARN (fi-skl-i7-6700k)
Test kms_cursor_legacy:
Subgroup basic-cursor-vs-flip-varying-size:
fail   -> PASS   (ro-ilk1-i5-650)
Subgroup basic-flip-vs-cursor-legacy:
fail   -> PASS   (ro-ivb2-i7-3770)
pass   -> FAIL   (ro-bdw-i5-5250u)
Subgroup basic-flip-vs-cursor-varying-size:
pass   -> FAIL   (ro-skl3-i5-6260u)
fail   -> PASS   (ro-bdw-i5-5250u)
Test kms_pipe_crc_basic:
Subgroup hang-read-crc-pipe-a:
pass   -> DMESG-WARN (ro-bdw-i7-5600u)
pass   -> DMESG-WARN (ro-bdw-i5-5250u)
pass   -> DMESG-WARN (ro-skl3-i5-6260u)
pass   -> DMESG-WARN (fi-skl-i7-6700k)
Subgroup hang-read-crc-pipe-b:
pass   -> DMESG-WARN (ro-bdw-i7-5600u)
pass   -> DMESG-WARN (ro-bdw-i5-5250u)
pass   -> DMESG-WARN (ro-skl3-i5-6260u)
pass   -> DMESG-WARN (fi-skl-i7-6700k)
Subgroup hang-read-crc-pipe-c:
pass   -> DMESG-WARN (ro-bdw-i7-5600u)
pass   -> DMESG-WARN (ro-bdw-i5-5250u)
pass   -> DMESG-WARN (ro-skl3-i5-6260u)
pass   -> DMESG-WARN (fi-skl-i7-6700k)
Subgroup suspend-read-crc-pipe-a:
pass   -> DMESG-WARN (ro-bdw-i7-5600u)
skip   -> DMESG-WARN (ro-bdw-i5-5250u)
pass   -> DMESG-WARN (ro-skl3-i5-6260u)
Subgroup suspend-read-crc-pipe-b:
pass   -> DMESG-WARN (ro-bdw-i7-5600u)
skip   -> DMESG-WARN (ro-bdw-i5-5250u)
pass   -> DMESG-WARN (ro-skl3-i5-6260u)
Subgroup suspend-read-crc-pipe-c:
pass   -> DMESG-WARN (ro-bdw-i7-5600u)
pass   -> DMESG-WARN (ro-skl3-i5-6260u)

fi-hsw-i7-4770k  total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-kbl-qkkr  total:244  pass:184  dwarn:31  dfail:0   fail:3   skip:26 
fi-skl-i7-6700k  total:244  pass:202  dwarn:10  dfail:2   fail:2   skip:28 
fi-snb-i7-2600   total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42 
ro-bdw-i5-5250u  total:240  pass:213  dwarn:9   dfail:0   fail:1   skip:17 
ro-bdw-i7-5600u  total:240  pass:197  dwarn:10  dfail:0   fail:1   skip:32 
ro-bsw-n3050 total:240  pass:189  dwarn:6   dfail:0   fail:3   skip:42 
ro-byt-n2820 total:240  pass:197  dwarn:0   dfail:0   fail:3   skip:40 
ro-hsw-i3-4010u  total:240  pass:214  dwarn:0   dfail:0   fail:0   skip:26 
ro-hsw-i7-4770r  total:240  pass:185  dwarn:0   dfail:0   fail:0   skip:55 
ro-ilk1-i5-650   total:235  pass:174  dwarn:0   dfail:0   fail:1   skip:60 
ro-ivb-i7-3770   total:240  pass:205  dwarn:0   dfail:0   fail:0   skip:35 
ro-ivb2-i7-3770  total:240  pass:209  dwarn:0   dfail:0   fail:0   skip:31 
ro-skl3-i5-6260u total:240  pass:212  dwarn:10  dfail:0   fail:4   skip:14 

Results at /archive/results/CI_IGT_test/RO_Patchwork_1870/

e56d79f drm-intel-nightly: 2016y-08m-15d-10h-16m-44s UTC integration manifest
6f1de3d drm/i915: Add param for SVM
ce6e32e drm/i915: add SVM execbuf ioctl v10
fd45b62 drm/i915: IOMMU based SVM implementation v13
7fa4650 drm/i915: add create_context2 ioctl

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH RFC 1/4] drm/i915: add create_context2 ioctl

2016-08-15 Thread Mika Kuoppala
Chris Wilson  writes:

> On Mon, Aug 15, 2016 at 02:48:04PM +0300, Mika Kuoppala wrote:
>> From: Jesse Barnes 
>> 
>> Add i915_gem_context_create2_ioctl for passing flags
>> (e.g. SVM) when creating a context.
>> 
>> v2: check the pad on create_context
>> v3: rebase
>> v4: i915_dma is no more. create_gvt needs flags
>> 
>> Cc: Daniel Vetter 
>> Cc: Chris Wilson 
>> Cc: Joonas Lahtinen 
>> Signed-off-by: Jesse Barnes  (v1)
>> Signed-off-by: Mika Kuoppala 
>
> Considering we can use deferred ppgtt creation and have setparam do we
> need a new create ioctl just to set a flag?

So like this:

- create the ctx with the default create ioctl
- set a ctx param to mark it SVM capable
- the first submit then does the deferred creation

And we use the setparam point to return an error if SVM contexts are not
available (roughly as sketched below).

?
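In userspace terms, roughly (the SVM param name is made up and this is
only a sketch of the flow, not tested code):

        struct drm_i915_gem_context_create create = {};
        struct drm_i915_gem_context_param p = {};
        int ret;

        ret = drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &create);
        if (ret)
                return ret;

        p.ctx_id = create.ctx_id;
        p.param = I915_CONTEXT_PARAM_SVM; /* hypothetical */
        p.value = 1;
        ret = drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p);
        if (ret)
                return ret; /* SVM not available, or ctx already in use */

        /* the first execbuf on create.ctx_id then triggers the deferred
         * ppgtt/SVM setup */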
-Mika

> -Chris
>
> -- 
> Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ Ro.CI.BAT: failure for series starting with [1/2] io-mapping: Always create a struct to hold metadata about the io-mapping

2016-08-15 Thread Patchwork
== Series Details ==

Series: series starting with [1/2] io-mapping: Always create a struct to hold 
metadata about the io-mapping
URL   : https://patchwork.freedesktop.org/series/11099/
State : failure

== Summary ==

Applying: io-mapping: Always create a struct to hold metadata about the 
io-mapping
fatal: sha1 information is lacking or useless (drivers/gpu/drm/i915/i915_gem.c).
error: could not build fake ancestor
Patch failed at 0001 io-mapping: Always create a struct to hold metadata about 
the io-mapping
The copy of the patch that failed is found in: .git/rebase-apply/patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH RFC 2/4] drm/i915: IOMMU based SVM implementation v13

2016-08-15 Thread David Woodhouse
On Mon, 2016-08-15 at 13:23 +0100, Chris Wilson wrote:
> On Mon, Aug 15, 2016 at 01:13:25PM +0100, David Woodhouse wrote:
> > On Mon, 2016-08-15 at 13:05 +0100, Chris Wilson wrote:
> > > On Mon, Aug 15, 2016 at 02:48:05PM +0300, Mika Kuoppala wrote:
> > > > 
> > > > + struct task_struct *task;
> > > 
> > > We don't need the task, we need the mm.
> > > 
> > > Holding the task is not sufficient.
> > 
> > From the pure DMA point of view, you don't need the MM at all. I handle
> > all that from the IOMMU side so it's none of your business, darling.
> 
> But you don't keep the mm alive for the duration of device activity,
> right? And you don't wait for the device to finish before releasing the
> mmu? (iiuc intel-svm.c)

We don't "keep it alive" (i.e. bump mm->mm_users), no.
We *did*, but it caused problems. See commit e57e58bd390a68 for the
gory details.

Now we only bump mm->mm_count so if the process exits, the MM can still
be torn down.

Since exit_mmap() happens before exit_files(), what happens on an
unclean shutdown is that the GPU may start to take faults on the PASID
which is in the process of exiting, before the corresponding file
descriptor gets closed.

So no, we don't wait for the device to finish before releasing the MM.
That would involve calling back into device-driver code from the
mmu_notifier callback, with "interesting" locking constraints. We don't
trust device drivers that much :)
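For reference, the difference between the two reference flavours looks
roughly like this (illustrative only, not the actual intel-svm.c code):

        /* mm_users keeps the address space itself (VMAs, page tables)
         * alive; exit_mmap() cannot tear it down while this is held.
         * This is the reference e57e58bd390a68 stopped taking. */
        atomic_inc(&mm->mm_users);
        /* ... touch the mappings ... */
        mmput(mm);

        /* mm_count only keeps the struct mm_struct itself from being
         * freed; the process may still exit and exit_mmap() may run,
         * so the device can start faulting on a dying address space. */
        atomic_inc(&mm->mm_count);
        /* ... */
        mmdrop(mm);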

-- 
dwmw2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH RFC 3/4] drm/i915: add SVM execbuf ioctl v10

2016-08-15 Thread Mika Kuoppala
Chris Wilson  writes:

> On Mon, Aug 15, 2016 at 02:48:06PM +0300, Mika Kuoppala wrote:
>> From: Jesse Barnes 
>> 
>> We just need to pass in an address to execute and some flags, since we
>> don't have to worry about buffer relocation or any of the other usual
>> stuff.  Returns a fence to be used for synchronization.
>> 
>> v2: add a request after batch submission (Jesse)
>> v3: add a flag for fence creation (Chris)
>> v4: add CLOEXEC flag (Kristian)
>> add non-RCS ring support (Jesse)
>> v5: update for request alloc change (Jesse)
>> v6: new sync file interface, error paths, request breadcrumbs
>> v7: always CLOEXEC for sync_file_install
>> v8: rebase on new sync file api
>> v9: rework on top of fence requests and sync_file
>> v10: take fence ref for sync_file (Chris)
>>  use correct flush (Chris)
>>  limit exec on rcs
>
> This is incomplete, so just a proof of principle?

At some point during rebasing I noticed that Jesse had limited everything
to the RCS, so I just put it back.

No idea yet why we would need to limit it to the RCS only.

-Mika

> -Chris
>
> -- 
> Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Embrace the race in busy-ioctl

2016-08-15 Thread Joonas Lahtinen
On pe, 2016-08-12 at 18:52 +0100, Chris Wilson wrote:
> Daniel Vetter proposed a new challenge to the serialisation inside the
> busy-ioctl that exposed a flaw that could result in us reporting the
> wrong engine as being busy. If the request is reallocated as we test
> its busyness and then reassigned to this object by another thread, we
> would not notice that the test itself was incorrect.
> 
> We are faced with a choice of using __i915_gem_active_get_request_rcu()
> to first acquire a reference to the request preventing the race, or to
> acknowledge the race and accept the limitations upon the accuracy of the
> busy flags. Note that we guarantee that we never falsely report the
> object as idle (providing userspace itself doesn't race), and so the
> most important use of the busy-ioctl and its guarantees are fulfilled.
> 

If Daniel acks the userspace change,

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Initialise mmaped_count for i915_gem_object_info

2016-08-15 Thread Mika Kuoppala
Chris Wilson  writes:

> Reported-by: 0day kbuild test robot
> Fixes: 2bd160a131ac ("drm/i915: Reduce i915_gem_objects to only show...")
> Signed-off-by: Chris Wilson 

Reviewed-by: Mika Kuoppala 

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
> b/drivers/gpu/drm/i915/i915_debugfs.c
> index f612d3f18c69..81fabc36ce5a 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -403,7 +403,9 @@ static int i915_gem_object_info(struct seq_file *m, void* 
> data)
>  dev_priv->mm.object_count,
>  dev_priv->mm.object_memory);
>  
> - size = count = purgeable_size = purgeable_count = 0;
> + size = count = 0;
> + mapped_size = mapped_count = 0;
> + purgeable_size = purgeable_count = 0;
>   list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) {
>   size += obj->base.size;
>   ++count;
> -- 
> 2.8.1
>
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH RFC 2/4] drm/i915: IOMMU based SVM implementation v13

2016-08-15 Thread Chris Wilson
On Mon, Aug 15, 2016 at 01:30:11PM +0100, David Woodhouse wrote:
> On Mon, 2016-08-15 at 13:23 +0100, Chris Wilson wrote:
> > On Mon, Aug 15, 2016 at 01:13:25PM +0100, David Woodhouse wrote:
> > > On Mon, 2016-08-15 at 13:05 +0100, Chris Wilson wrote:
> > > > On Mon, Aug 15, 2016 at 02:48:05PM +0300, Mika Kuoppala wrote:
> > > > > 
> > > > > + struct task_struct *task;
> > > > 
> > > > We don't need the task, we need the mm.
> > > > 
> > > > Holding the task is not sufficient.
> > > 
> > > From the pure DMA point of view, you don't need the MM at all. I handle
> > > all that from the IOMMU side so it's none of your business, darling.
> > 
> > But you don't keep the mm alive for the duration of device activity,
> > right? And you don't wait for the device to finish before releasing the
> > mmu? (iiuc intel-svm.c)
> 
> We don't "keep it alive" (i.e. bump mm->mm_users), no.
> We *did*, but it caused problems. See commit e57e58bd390a68 for the
> gory details.
> 
> Now we only bump mm->mm_count so if the process exits, the MM can still
> be torn down.
> 
> Since exit_mmap() happens before exit_files(), what happens on an
> unclean shutdown is that the GPU may start to take faults on the PASID
> which is in the process of exiting, before the corresponding file
> descriptor gets closed.
> 
> So no, we don't wait for the device to finish before releasing the MM.
> That would involve calling back into device-driver code from the
> mmu_notifier callback, with "interesting" locking constraints. We don't
> trust device drivers that much :)

With the device allocating the memory, we can keep the object alive for
as long as required for it to complete the commands and for other users.

Other uses get access to the svm pages via shared memory (mmap, memfd)
and so another process copying from the buffer should be unaffected by
termination of the original process.

So it is really just what happens to commands for this client when it
dies/exits.  The kneejerk reaction is to say the pages should be kept
alive as they are now for !svm. We could be faced with a situation where
the client copies onto a shared buffer (obtaining a fence), passes that
fence over to the server scheduling an update, and die abruptly. Given
that the fence and request arrive on the server safely (the fence will
be completed even if the command is skipped or its faults filled with
zero), the server will itself proceed to present the incomplete result
from the dead client. (Presently for !svm the output will be intact.)

The question is: do we accept the change in behaviour? Or am I
completely misunderstanding how the SVM faulting/mmu-notifiers will
work?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH RFC 1/4] drm/i915: add create_context2 ioctl

2016-08-15 Thread Chris Wilson
On Mon, Aug 15, 2016 at 03:25:43PM +0300, Mika Kuoppala wrote:
> Chris Wilson  writes:
> 
> > On Mon, Aug 15, 2016 at 02:48:04PM +0300, Mika Kuoppala wrote:
> >> From: Jesse Barnes 
> >> 
> >> Add i915_gem_context_create2_ioctl for passing flags
> >> (e.g. SVM) when creating a context.
> >> 
> >> v2: check the pad on create_context
> >> v3: rebase
> >> v4: i915_dma is no more. create_gvt needs flags
> >> 
> >> Cc: Daniel Vetter 
> >> Cc: Chris Wilson 
> >> Cc: Joonas Lahtinen 
> >> Signed-off-by: Jesse Barnes  (v1)
> >> Signed-off-by: Mika Kuoppala 
> >
> > Considering we can use deferred ppgtt creation and have setparam do we
> > need a new create ioctl just to set a flag?
> 
> So like this:
> 
> - create the ctx with the default create ioctl
> - set a ctx param to mark it SVM capable
> - the first submit then does the deferred creation
> 
> And we use the setparam point to return an error if SVM contexts are not
> available.

(and a call to set svm on a context after first use is illegal)

That's the outline I had in my head. I am not sure if the result is
cleaner - I just hope it is ;)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ Ro.CI.BAT: failure for drm/i915: Initialise mmaped_count for i915_gem_object_info

2016-08-15 Thread Patchwork
== Series Details ==

Series: drm/i915: Initialise mmaped_count for i915_gem_object_info
URL   : https://patchwork.freedesktop.org/series/11100/
State : failure

== Summary ==

Series 11100v1 drm/i915: Initialise mmaped_count for i915_gem_object_info
http://patchwork.freedesktop.org/api/1.0/series/11100/revisions/1/mbox

Test kms_cursor_legacy:
Subgroup basic-flip-vs-cursor-legacy:
fail   -> PASS   (ro-ivb2-i7-3770)
pass   -> FAIL   (ro-bdw-i5-5250u)
Subgroup basic-flip-vs-cursor-varying-size:
fail   -> PASS   (ro-byt-n2820)
pass   -> FAIL   (ro-skl3-i5-6260u)
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-c:
pass   -> DMESG-WARN (ro-bdw-i7-5600u)
skip   -> DMESG-WARN (ro-bdw-i5-5250u)

fi-hsw-i7-4770k  total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-kbl-qkkr  total:244  pass:185  dwarn:30  dfail:0   fail:2   skip:27 
fi-skl-i7-6700k  total:244  pass:208  dwarn:4   dfail:2   fail:2   skip:28 
fi-snb-i7-2600   total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42 
ro-bdw-i5-5250u  total:240  pass:218  dwarn:2   dfail:0   fail:2   skip:18 
ro-bdw-i7-5600u  total:240  pass:206  dwarn:1   dfail:0   fail:1   skip:32 
ro-bsw-n3050 total:87   pass:67   dwarn:0   dfail:0   fail:0   skip:19 
ro-byt-n2820 total:240  pass:198  dwarn:0   dfail:0   fail:2   skip:40 
ro-hsw-i3-4010u  total:240  pass:214  dwarn:0   dfail:0   fail:0   skip:26 
ro-hsw-i7-4770r  total:240  pass:185  dwarn:0   dfail:0   fail:0   skip:55 
ro-ilk1-i5-650   total:235  pass:173  dwarn:0   dfail:0   fail:2   skip:60 
ro-ivb-i7-3770   total:240  pass:205  dwarn:0   dfail:0   fail:0   skip:35 
ro-ivb2-i7-3770  total:240  pass:209  dwarn:0   dfail:0   fail:0   skip:31 
ro-skl3-i5-6260u total:240  pass:222  dwarn:0   dfail:0   fail:4   skip:14 

Results at /archive/results/CI_IGT_test/RO_Patchwork_1872/

e56d79f drm-intel-nightly: 2016y-08m-15d-10h-16m-44s UTC integration manifest
0640a5b drm/i915: Initialise mmaped_count for i915_gem_object_info

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm: make drm_get_format_name thread-safe

2016-08-15 Thread Eric Engestrom
On Mon, Aug 15, 2016 at 12:54:01PM +0300, Jani Nikula wrote:
> On Mon, 15 Aug 2016, Eric Engestrom  wrote:
> > Signed-off-by: Eric Engestrom 
> > ---
> >
> > I moved the main bits to be the first diffs, shouldn't affect anything
> > when applying the patch, but I wanted to ask:
> > I don't like the hard-coded `32` the appears in both kmalloc() and
> > snprintf(), what do you think? If you don't like it either, what would
> > you suggest? Should I #define it?
> >
> > Second question is about the patch mail itself: should I send this kind
> > of patch separated by module, with a note requesting them to be squashed
> > when applying? It has to land as a single patch, but for review it might
> > be easier if people only see the bits they each care about, as well as
> > to collect ack's/r-b's.
> >
> > Cheers,
> >   Eric
> >
> > ---
> >  drivers/gpu/drm/amd/amdgpu/dce_v10_0.c  |  6 ++--
> >  drivers/gpu/drm/amd/amdgpu/dce_v11_0.c  |  6 ++--
> >  drivers/gpu/drm/amd/amdgpu/dce_v8_0.c   |  6 ++--
> >  drivers/gpu/drm/drm_atomic.c|  5 ++--
> >  drivers/gpu/drm/drm_crtc.c  | 21 -
> >  drivers/gpu/drm/drm_fourcc.c| 17 ++-
> >  drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c |  6 ++--
> >  drivers/gpu/drm/i915/i915_debugfs.c | 11 ++-
> >  drivers/gpu/drm/i915/intel_atomic_plane.c   |  6 ++--
> >  drivers/gpu/drm/i915/intel_display.c| 39 
> > -
> >  drivers/gpu/drm/radeon/atombios_crtc.c  | 12 +---
> >  include/drm/drm_fourcc.h|  2 +-
> >  12 files changed, 89 insertions(+), 48 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c
> > index 0645c85..38216a1 100644
> > --- a/drivers/gpu/drm/drm_fourcc.c
> > +++ b/drivers/gpu/drm/drm_fourcc.c
> > @@ -39,16 +39,14 @@ static char printable_char(int c)
> >   * drm_get_format_name - return a string for drm fourcc format
> >   * @format: format to compute name of
> >   *
> > - * Note that the buffer used by this function is globally shared and owned 
> > by
> > - * the function itself.
> > - *
> > - * FIXME: This isn't really multithreading safe.
> > + * Note that the buffer returned by this function is owned by the caller
> > + * and will need to be freed.
> >   */
> >  const char *drm_get_format_name(uint32_t format)
> 
> I find it surprising that a function that allocates a buffer returns a
> const pointer. Some userspace libraries have conventions about the
> ownership based on constness.
> 
> (I also find it suprising that kfree() takes a const pointer; arguably
> that call changes the memory.)
> 
> Is there precedent for this?
> 
> BR,
> Jani.

It's not a const pointer, it's a normal pointer to a const char, i.e.
you can do as you want with the pointer but you shouldn't change the
chars it points to.
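For the hard-coded 32, something along these lines would avoid the magic
number (sketch only; the define name is invented and the snprintf body is
abbreviated from the real one):

#define DRM_FORMAT_NAME_LEN 32

char *drm_get_format_name(uint32_t format)
{
        char *buf = kmalloc(DRM_FORMAT_NAME_LEN, GFP_KERNEL);

        if (!buf)
                return NULL;

        snprintf(buf, DRM_FORMAT_NAME_LEN, "%c%c%c%c",
                 printable_char(format & 0xff),
                 printable_char((format >> 8) & 0xff),
                 printable_char((format >> 16) & 0xff),
                 printable_char((format >> 24) & 0xff));

        return buf; /* caller kfree()s it */
}

Returning a plain char * would also sidestep the constness question.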

Cheers,
  Eric
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH RFC 2/4] drm/i915: IOMMU based SVM implementation v13

2016-08-15 Thread David Woodhouse
On Mon, 2016-08-15 at 13:53 +0100, Chris Wilson wrote:
> 
> So it is really just what happens to commands for this client when it
> dies/exits.  The kneejerk reaction is to say the pages should be kept
> alive as they are now for !svm. We could be faced with a situation where
> the client copies onto a shared buffer (obtaining a fence), passes that
> fence over to the server scheduling an update, and die abruptly.

Which pages?

Until the moment you actually do the DMA, you don't have "pages". They
might not even exist in RAM. All you have is (a PASID and) a userspace
linear address.

When you actually do the DMA, *then* we might fault in the appropriate
pages from disk. Or might not, depending on whether the address is
valid or not.

Between the time when it hands you the linear address, and the time
that you use it, the process could have done anything. We are currently
talking about the case where it exits uncleanly. But it could also
munmap() the linear address in question. Or mmap() something else over
it. Obviously those would be bugs... but so is an unclean exit.

So it doesn't seem to make much sense to ask if you accept the change
in behaviour. You don't really have much choice; it's implicit in the
SVM model of doing DMA directly to userspace addresses. You just
*don't* get to lock things down and trust that the buffers will still
be there when you finally get round to using them.

-- 
dwmw2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Embrace the race in busy-ioctl

2016-08-15 Thread Mika Kuoppala
Chris Wilson  writes:

> Daniel Vetter proposed a new challenge to the serialisation inside the
> busy-ioctl that exposed a flaw that could result in us reporting the
> wrong engine as being busy. If the request is reallocated as we test
> its busyness and then reassigned to this object by another thread, we
> would not notice that the test itself was incorrect.
>
> We are faced with a choice of using __i915_gem_active_get_request_rcu()
> to first acquire a reference to the request preventing the race, or to
> acknowledge the race and accept the limitations upon the accuracy of the
> busy flags. Note that we guarantee that we never falsely report the
> object as idle (providing userspace itself doesn't race), and so the
> most important use of the busy-ioctl and its guarantees are fulfilled.
>
> Signed-off-by: Chris Wilson 
> Cc: Daniel Vetter 
> Cc: Joonas Lahtinen 
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 87 
> ++---
>  include/uapi/drm/i915_drm.h | 15 ++-
>  2 files changed, 60 insertions(+), 42 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 5566916870eb..c77915378768 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3791,49 +3791,54 @@ static __always_inline unsigned int
>  __busy_set_if_active(const struct i915_gem_active *active,
>unsigned int (*flag)(unsigned int id))
>  {
> - /* For more discussion about the barriers and locking concerns,
> -  * see __i915_gem_active_get_rcu().
> -  */
> - do {
> - struct drm_i915_gem_request *request;
> - unsigned int id;
> -
> - request = rcu_dereference(active->request);
> - if (!request || i915_gem_request_completed(request))
> - return 0;
> + struct drm_i915_gem_request *request;
>  
> - id = request->engine->exec_id;
> + request = rcu_dereference(active->request);
> + if (!request || i915_gem_request_completed(request))
> + return 0;
>  
> - /* Check that the pointer wasn't reassigned and overwritten.
> -  *
> -  * In __i915_gem_active_get_rcu(), we enforce ordering between
> -  * the first rcu pointer dereference (imposing a
> -  * read-dependency only on access through the pointer) and
> -  * the second lockless access through the memory barrier
> -  * following a successful atomic_inc_not_zero(). Here there
> -  * is no such barrier, and so we must manually insert an
> -  * explicit read barrier to ensure that the following
> -  * access occurs after all the loads through the first
> -  * pointer.
> -  *
> -  * It is worth comparing this sequence with
> -  * raw_write_seqcount_latch() which operates very similarly.
> -  * The challenge here is the visibility of the other CPU
> -  * writes to the reallocated request vs the local CPU ordering.
> -  * Before the other CPU can overwrite the request, it will
> -  * have updated our active->request and gone through a wmb.
> -  * During the read here, we want to make sure that the values
> -  * we see have not been overwritten as we do so - and we do
> -  * that by serialising the second pointer check with the writes
> -  * on other other CPUs.
> -  *
> -  * The corresponding write barrier is part of
> -  * rcu_assign_pointer().
> -  */
> - smp_rmb();
> - if (request == rcu_access_pointer(active->request))
> - return flag(id);
> - } while (1);
> + /* This is racy. See __i915_gem_active_get_rcu() for a in detail
> +  * discussion of how to handle the race correctly, but for reporting
> +  * the busy state we err on the side of potentially reporting the
> +  * wrong engine as being busy (but we guarantee that the result
> +  * is at least self-consistent).
> +  *
> +  * As we use SLAB_DESTROY_BY_RCU, the request may be reallocated
> +  * whilst we are inspecting it, even under the RCU read lock as we are.
> +  * This means that there is a small window for the engine and/or the
> +  * seqno to have been overwritten. The seqno will always be in the
> +  * future compared to the intended, and so we know that if that
> +  * seqno is idle (on whatever engine) our request is idle and the
> +  * return 0 above is correct.
> +  *
> +  * The issue is that if the engine is switched, it is just as likely
> +  * to report that it is busy (but since the switch happened, we know
> +  * the request should be idle). So there is a small chance that a busy
> +  * result is actually the wrong engine.
> +  *
> +  * So why 
