Re: [Intel-gfx] [PATCH 01/10] drm/i915: Move map-and-fenceable tracking to the VMA
On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote: > @@ -2843,8 +2843,7 @@ int i915_vma_unbind(struct i915_vma *vma) > GEM_BUG_ON(obj->bind_count == 0); > GEM_BUG_ON(!obj->pages); > > - if (i915_vma_is_ggtt(vma) && > - vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) { Maybe make a comment here, as the test feel out-of-place quickly glancing. Especially wrt. what it replaces. Although you mentioned in IRC this will soon be eliminated? > + if (i915_vma_is_map_and_fenceable(vma)) { > i915_gem_object_finish_gtt(obj); > > /* release the fence reg _after_ flushing */ > @@ -2864,13 +2864,9 @@ int i915_vma_unbind(struct i915_vma *vma) > drm_mm_remove_node(&vma->node); > list_move_tail(&vma->vm_link, &vma->vm->unbound_list); > > - if (i915_vma_is_ggtt(vma)) { > - if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) { > - obj->map_and_fenceable = false; > - } else if (vma->pages) { > - sg_free_table(vma->pages); > - kfree(vma->pages); > - } Not sure if there should be a comment that for 1:1 mappings vma->pages is just obj->pages so it should not be freed. Or maybe you could even make the test if vma->pages != vma->obj->pages? More self-documenting. > + if (vma->ggtt_view.type != I915_GGTT_VIEW_NORMAL) { > + sg_free_table(vma->pages); > + kfree(vma->pages); > } > vma->pages = NULL; > @@ -3693,7 +3687,10 @@ void __i915_vma_set_map_and_fenceable(struct i915_vma > *vma) This might also clear, so function name should be update_map_and_fenceable, really. > @@ -2262,11 +2262,11 @@ void intel_unpin_fb_obj(struct drm_framebuffer *fb, > unsigned int rotation) > WARN_ON(!mutex_is_locked(&obj->base.dev->struct_mutex)); > > intel_fill_fb_ggtt_view(&view, fb, rotation); > + vma = i915_gem_object_to_ggtt(obj, &view); > > - if (view.type == I915_GGTT_VIEW_NORMAL) > + if (i915_vma_is_map_and_fenceable(vma)) > i915_gem_object_unpin_fence(obj); > > - vma = i915_gem_object_to_ggtt(obj, &view); > i915_gem_object_unpin_from_display_plane(vma); This did not have NULL protection previously either, so should be OK. Reviewed-by: Joonas Lahtinen Regards, Joonas -- Joonas Lahtinen Open Source Technology Center Intel Corporation ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 01/10] drm/i915: Move map-and-fenceable tracking to the VMA
On Mon, Aug 15, 2016 at 11:03:32AM +0300, Joonas Lahtinen wrote: > On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote: > Not sure if there should be a comment that for 1:1 mappings vma->pages > is just obj->pages so it should not be freed. Or maybe you could even > make the test if vma->pages != vma->obj->pages? More self-documenting. I contemplated making this vma->pages != vma->obj->pages as well in light of the recent changes, will do. > > > + if (vma->ggtt_view.type != I915_GGTT_VIEW_NORMAL) { > > + sg_free_table(vma->pages); > > + kfree(vma->pages); > > } > > vma->pages = NULL; > > > > > @@ -3693,7 +3687,10 @@ void __i915_vma_set_map_and_fenceable(struct > > i915_vma *vma) > > This might also clear, so function name should be > update_map_and_fenceable, really. update/compute either is a fine TODO ;) > > @@ -2262,11 +2262,11 @@ void intel_unpin_fb_obj(struct drm_framebuffer *fb, > > unsigned int rotation) > > WARN_ON(!mutex_is_locked(&obj->base.dev->struct_mutex)); > > > > intel_fill_fb_ggtt_view(&view, fb, rotation); > > + vma = i915_gem_object_to_ggtt(obj, &view); > > > > - if (view.type == I915_GGTT_VIEW_NORMAL) > > + if (i915_vma_is_map_and_fenceable(vma)) > > i915_gem_object_unpin_fence(obj); > > > > - vma = i915_gem_object_to_ggtt(obj, &view); > > i915_gem_object_unpin_from_display_plane(vma); > > This did not have NULL protection previously either, so should be OK. Yup, the long goal here is to pass in the vma. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
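A minimal sketch of the self-documenting test agreed above, assuming the i915_vma fields quoted in the review (vma->pages, vma->obj->pages); the helper name and standalone form are illustrative, not the code as committed:

static void vma_release_pages(struct i915_vma *vma)
{
	/* Only free the sg_table if the VMA built its own page list;
	 * for the 1:1 (NORMAL view) case vma->pages simply aliases
	 * obj->pages and must not be freed here.
	 */
	if (vma->pages && vma->pages != vma->obj->pages) {
		sg_free_table(vma->pages);
		kfree(vma->pages);
	}

	vma->pages = NULL;
}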
Re: [Intel-gfx] [PATCH v2] drm/i915: Show RPS autotuning thresholds along with waitboost
On Sun, Aug 14, 2016 at 02:28:56PM +0100, Chris Wilson wrote: > For convenience when debugging user issues show the autotuning > RPS parameters in debugfs/i915_rps_boost_info. > > v2: Refine the presentation > > Signed-off-by: Chris Wilson > Cc: frit...@kodi.tv Looks good to me (well, it doesn't, I hate having things on the same line as the case statement, but that's a personal opinion), compiles and works as it should. Reviewed-by: David Weinehall > --- > drivers/gpu/drm/i915/i915_debugfs.c | 43 > +++-- > 1 file changed, 41 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c > b/drivers/gpu/drm/i915/i915_debugfs.c > index c461072da142..8d302906d768 100644 > --- a/drivers/gpu/drm/i915/i915_debugfs.c > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > @@ -2441,6 +2441,16 @@ static int count_irq_waiters(struct drm_i915_private > *i915) > return count; > } > > +static const char *rps_power_to_str(int power) > +{ > + switch (power) { > + default: return "unknown"; > + case LOW_POWER: return "low power"; > + case BETWEEN: return "mixed"; > + case HIGH_POWER: return "high power"; > + } > +} > + > static int i915_rps_boost_info(struct seq_file *m, void *data) > { > struct drm_info_node *node = m->private; > @@ -2452,12 +2462,17 @@ static int i915_rps_boost_info(struct seq_file *m, > void *data) > seq_printf(m, "GPU busy? %s [%x]\n", > yesno(dev_priv->gt.awake), dev_priv->gt.active_engines); > seq_printf(m, "CPU waiting? %d\n", count_irq_waiters(dev_priv)); > - seq_printf(m, "Frequency requested %d; min hard:%d, soft:%d; max > soft:%d, hard:%d\n", > -intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq), > + seq_printf(m, "Frequency requested %d\n", > +intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq)); > + seq_printf(m, " min hard:%d, soft:%d; max soft:%d, hard:%d\n", > intel_gpu_freq(dev_priv, dev_priv->rps.min_freq), > intel_gpu_freq(dev_priv, dev_priv->rps.min_freq_softlimit), > intel_gpu_freq(dev_priv, dev_priv->rps.max_freq_softlimit), > intel_gpu_freq(dev_priv, dev_priv->rps.max_freq)); > + seq_printf(m, " idle:%d, efficient:%d, boost:%d\n", > +intel_gpu_freq(dev_priv, dev_priv->rps.idle_freq), > +intel_gpu_freq(dev_priv, dev_priv->rps.efficient_freq), > +intel_gpu_freq(dev_priv, dev_priv->rps.boost_freq)); > > mutex_lock(&dev->filelist_mutex); > spin_lock(&dev_priv->rps.client_lock); > @@ -2478,6 +2493,30 @@ static int i915_rps_boost_info(struct seq_file *m, > void *data) > spin_unlock(&dev_priv->rps.client_lock); > mutex_unlock(&dev->filelist_mutex); > > + if (INTEL_GEN(dev_priv) >= 6 && > + dev_priv->rps.enabled && > + dev_priv->gt.active_engines) { > + u32 rpupei, rpcurup; > + u32 rpdownei, rpcurdown; > + > + intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL); > + rpupei = I915_READ_FW(GEN6_RP_CUR_UP_EI) & GEN6_CURICONT_MASK; > + rpcurup = I915_READ_FW(GEN6_RP_CUR_UP) & GEN6_CURBSYTAVG_MASK; > + rpdownei = I915_READ_FW(GEN6_RP_CUR_DOWN_EI) & > GEN6_CURIAVG_MASK; > + rpcurdown = I915_READ_FW(GEN6_RP_CUR_DOWN) & > GEN6_CURBSYTAVG_MASK; > + intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL); > + > + seq_printf(m, "\nRPS Autotuning (current \"%s\" window):\n", > +rps_power_to_str(dev_priv->rps.power)); > + seq_printf(m, " Avg. up: %d%% [above threshold? %d%%]\n", > +100*rpcurup/rpupei, > +dev_priv->rps.up_threshold); > + seq_printf(m, " Avg. down: %d%% [below threshold? 
%d%%]\n", > +100*rpcurdown/rpdownei, > +dev_priv->rps.down_threshold); > + } else > + seq_printf(m, "\nRPS Autotuning inactive\n"); > + > return 0; > } > > -- > 2.8.1 > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [CI] drm/i915: Show RPS autotuning thresholds along with waitboost
For convenience when debugging user issues show the autotuning RPS parameters in debugfs/i915_rps_boost_info. v2: Refine the presentation v3: Style Signed-off-by: Chris Wilson Cc: frit...@kodi.tv Link: http://patchwork.freedesktop.org/patch/msgid/1471181336-27523-1-git-send-email-ch...@chris-wilson.co.uk Reviewed-by: David Weinehall --- drivers/gpu/drm/i915/i915_debugfs.c | 48 +++-- drivers/gpu/drm/i915/i915_reg.h | 7 +++--- 2 files changed, 50 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 2a3d6d2..1949588 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2441,6 +2441,20 @@ static int count_irq_waiters(struct drm_i915_private *i915) return count; } +static const char *rps_power_to_str(unsigned power) +{ + const char *strings[] = { + [LOW_POWER] = "low power", + [BETWEEN] = "mixed", + [HIGH_POWER] = "high power", + }; + + if (power >= ARRAY_SIZE(strings) || !strings[power]) + return "unknown"; + + return strings[power]; +} + static int i915_rps_boost_info(struct seq_file *m, void *data) { struct drm_info_node *node = m->private; @@ -2452,12 +2466,17 @@ static int i915_rps_boost_info(struct seq_file *m, void *data) seq_printf(m, "GPU busy? %s [%x]\n", yesno(dev_priv->gt.awake), dev_priv->gt.active_engines); seq_printf(m, "CPU waiting? %d\n", count_irq_waiters(dev_priv)); - seq_printf(m, "Frequency requested %d; min hard:%d, soft:%d; max soft:%d, hard:%d\n", - intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq), + seq_printf(m, "Frequency requested %d\n", + intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq)); + seq_printf(m, " min hard:%d, soft:%d; max soft:%d, hard:%d\n", intel_gpu_freq(dev_priv, dev_priv->rps.min_freq), intel_gpu_freq(dev_priv, dev_priv->rps.min_freq_softlimit), intel_gpu_freq(dev_priv, dev_priv->rps.max_freq_softlimit), intel_gpu_freq(dev_priv, dev_priv->rps.max_freq)); + seq_printf(m, " idle:%d, efficient:%d, boost:%d\n", + intel_gpu_freq(dev_priv, dev_priv->rps.idle_freq), + intel_gpu_freq(dev_priv, dev_priv->rps.efficient_freq), + intel_gpu_freq(dev_priv, dev_priv->rps.boost_freq)); mutex_lock(&dev->filelist_mutex); spin_lock(&dev_priv->rps.client_lock); @@ -2478,6 +2497,31 @@ static int i915_rps_boost_info(struct seq_file *m, void *data) spin_unlock(&dev_priv->rps.client_lock); mutex_unlock(&dev->filelist_mutex); + if (INTEL_GEN(dev_priv) >= 6 && + dev_priv->rps.enabled && + dev_priv->gt.active_engines) { + u32 rpup, rpupei; + u32 rpdown, rpdownei; + + intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL); + rpup = I915_READ_FW(GEN6_RP_CUR_UP) & GEN6_RP_EI_MASK; + rpupei = I915_READ_FW(GEN6_RP_CUR_UP_EI) & GEN6_RP_EI_MASK; + rpdown = I915_READ_FW(GEN6_RP_CUR_DOWN) & GEN6_RP_EI_MASK; + rpdownei = I915_READ_FW(GEN6_RP_CUR_DOWN_EI) & GEN6_RP_EI_MASK; + intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL); + + seq_printf(m, "\nRPS Autotuning (current \"%s\" window):\n", + rps_power_to_str(dev_priv->rps.power)); + seq_printf(m, " Avg. up: %d%% [above threshold? %d%%]\n", + 100 * rpup / rpupei, + dev_priv->rps.up_threshold); + seq_printf(m, " Avg. down: %d%% [below threshold? 
%d%%]\n", + 100 * rpdown / rpdownei, + dev_priv->rps.down_threshold); + } else { + seq_puts(m, "\nRPS Autotuning inactive\n"); + } + return 0; } diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index da82744..d4adf28 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -7036,12 +7036,13 @@ enum { #define GEN6_RP_UP_THRESHOLD _MMIO(0xA02C) #define GEN6_RP_DOWN_THRESHOLD _MMIO(0xA030) #define GEN6_RP_CUR_UP_EI _MMIO(0xA050) -#define GEN6_CURICONT_MASK 0xff +#define GEN6_RP_EI_MASK 0xff +#define GEN6_CURICONT_MASK GEN6_RP_EI_MASK #define GEN6_RP_CUR_UP _MMIO(0xA054) -#define GEN6_CURBSYTAVG_MASK 0xff +#define GEN6_CURBSYTAVG_MASK GEN6_RP_EI_MASK #define GEN6_RP_PREV_UP_MMIO(0xA058) #define GEN6_RP_CUR_DOWN_EI_MMIO(0xA05C) -#define GEN6_CURIAVG_MASK
[Intel-gfx] [PATCH 5/5 v4] drm/i915: debugfs spring cleaning
drm/i915: debugfs spring cleaning Just like with sysfs, we do some major overhaul. Pass dev_priv instead of dev to all feature macros (IS_, HAS_, INTEL_, etc.). This has the side effect that a bunch of functions now get dev_priv passed instead of dev. All calls to INTEL_INFO()->gen have been replaced with INTEL_GEN(). We want access to to_i915(node->minor->dev) in a lot of places, so add the node_to_i915() helper to accomodate for this. Finally, we have quite a few cases where we get a void * pointer, and need to cast it to drm_device *, only to run to_i915() on it. Add cast_to_i915() to do this. v2: Don't introduce extra dev (Chris) v3: Make pipe_crc_info have a pointer to drm_i915_private instead of drm_device. This saves a bit of space, since we never use drm_device anywhere in these functions. Also some minor fixup that I missed in the previous version. v4: Fixed a nasty bug in the earlier version (that could trigger an oops). Changed the code a bit so that dev_priv is passed directly to various functions, thus removing the need for the cast_to_i915() helper. Also did some additional cleanup. Signed-off-by: David Weinehall diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index bba47cfd5d61..4b1884a238da 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -46,6 +46,11 @@ enum { PINNED_LIST, }; +static inline struct drm_i915_private *node_to_i915(struct drm_info_node *node) +{ + return to_i915(node->minor->dev); +} + /* As the drm_debugfs_init() routines are called before dev->dev_private is * allocated we need to hook into the minor for release. */ static int @@ -63,7 +68,7 @@ drm_add_fake_info_node(struct drm_minor *minor, node->minor = minor; node->dent = ent; - node->info_ent = (void *) key; + node->info_ent = (void *)key; mutex_lock(&minor->debugfs_lock); list_add(&node->list, &minor->debugfs_list); @@ -74,12 +79,11 @@ drm_add_fake_info_node(struct drm_minor *minor, static int i915_capabilities(struct seq_file *m, void *data) { - struct drm_info_node *node = m->private; - struct drm_device *dev = node->minor->dev; - const struct intel_device_info *info = INTEL_INFO(dev); + struct drm_i915_private *dev_priv = node_to_i915(m->private); + const struct intel_device_info *info = INTEL_INFO(dev_priv); - seq_printf(m, "gen: %d\n", info->gen); - seq_printf(m, "pch: %d\n", INTEL_PCH_TYPE(dev)); + seq_printf(m, "gen: %d\n", INTEL_GEN(dev_priv)); + seq_printf(m, "pch: %d\n", INTEL_PCH_TYPE(dev_priv)); #define PRINT_FLAG(x) seq_printf(m, #x ": %s\n", yesno(info->x)) #define SEP_SEMICOLON ; DEV_INFO_FOR_EACH_FLAG(PRINT_FLAG, SEP_SEMICOLON); @@ -136,13 +140,14 @@ static void describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) { struct drm_i915_private *dev_priv = to_i915(obj->base.dev); + struct drm_device *dev = &dev_priv->drm; struct intel_engine_cs *engine; struct i915_vma *vma; unsigned int frontbuffer_bits; int pin_count = 0; enum intel_engine_id id; - lockdep_assert_held(&obj->base.dev->struct_mutex); + lockdep_assert_held(&dev->struct_mutex); seq_printf(m, "%pK: %c%c%c%c%c %8zdKiB %02x %02x [ ", &obj->base, @@ -157,13 +162,13 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) for_each_engine_id(engine, dev_priv, id) seq_printf(m, "%x ", i915_gem_active_get_seqno(&obj->last_read[id], - &obj->base.dev->struct_mutex)); +&dev->struct_mutex)); seq_printf(m, "] %x %x%s%s%s", i915_gem_active_get_seqno(&obj->last_write, -&obj->base.dev->struct_mutex), +&dev->struct_mutex), 
i915_gem_active_get_seqno(&obj->last_fence, -&obj->base.dev->struct_mutex), - i915_cache_level_str(to_i915(obj->base.dev), obj->cache_level), +&dev->struct_mutex), + i915_cache_level_str(dev_priv, obj->cache_level), obj->dirty ? " dirty" : "", obj->madv == I915_MADV_DONTNEED ? " purgeable" : ""); if (obj->base.name) @@ -201,7 +206,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) } engine = i915_gem_active_get_engine(&obj->last_write, - &obj->base.dev->struct_mutex); + &dev->struct_mutex); if (engine) seq_printf(m, " (%s)", engine->name); @@ -213,10 +218,10 @@ de
Re: [Intel-gfx] [PATCH 5/5 v3] drm/i915: debugfs spring cleaning
On Fri, Aug 12, 2016 at 01:43:52PM +0100, Dave Gordon wrote:
> Alternatively (noting that almost the only use we make of this drm_info_node
> is to indirect multiple times to get dev_priv), we could change what is
> stored in (struct seq_file).private to make it more convenient and/or
> efficient. For example,
>
>     struct i915_debugfs_node {
>         struct drm_i915_private *dev_priv;
>         struct drm_info_node drm_info; // if still required
>     };
>
> thus eliminating several memory cycles per use for a cost of one word extra
> data per debugfs node.

v4 of the patch doesn't eliminate the need for the node_to_i915() helper and
its users, but all functions that don't use the drm_debugfs_create_files()
helper now receive drm_i915_private *dev_priv instead of drm_device *dev.
This at least kills off the cast_to_i915() macro.

Regards,
David
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
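A minimal sketch of the wrapper Dave describes above; the struct layout is taken from the quoted mail, while the seq_to_i915() accessor and its name are purely illustrative:

struct i915_debugfs_node {
	struct drm_i915_private *dev_priv;
	struct drm_info_node drm_info;	/* if still required */
};

/* With (struct seq_file)->private pointing at the wrapper, the
 * node->minor->dev->dev_private chain collapses to a single load.
 */
static inline struct drm_i915_private *seq_to_i915(struct seq_file *m)
{
	const struct i915_debugfs_node *node = m->private;

	return node->dev_priv;
}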
[Intel-gfx] ✗ Ro.CI.BAT: failure for drm/i915: Show RPS autotuning thresholds along with waitboost (rev3)
== Series Details == Series: drm/i915: Show RPS autotuning thresholds along with waitboost (rev3) URL : https://patchwork.freedesktop.org/series/11063/ State : failure == Summary == Series 11063v3 drm/i915: Show RPS autotuning thresholds along with waitboost http://patchwork.freedesktop.org/api/1.0/series/11063/revisions/3/mbox Test drv_module_reload_basic: pass -> SKIP (ro-ivb-i7-3770) Test kms_cursor_legacy: Subgroup basic-cursor-vs-flip-varying-size: pass -> FAIL (ro-ilk1-i5-650) Subgroup basic-flip-vs-cursor-legacy: pass -> FAIL (ro-bdw-i5-5250u) Subgroup basic-flip-vs-cursor-varying-size: pass -> FAIL (ro-skl3-i5-6260u) Test kms_pipe_crc_basic: Subgroup suspend-read-crc-pipe-a: pass -> DMESG-WARN (ro-bdw-i7-5600u) skip -> DMESG-WARN (ro-bdw-i5-5250u) Subgroup suspend-read-crc-pipe-c: skip -> DMESG-WARN (ro-bdw-i5-5250u) fi-hsw-i7-4770k total:244 pass:222 dwarn:0 dfail:0 fail:0 skip:22 fi-kbl-qkkr total:244 pass:185 dwarn:28 dfail:0 fail:3 skip:28 fi-skl-i7-6700k total:244 pass:208 dwarn:4 dfail:2 fail:2 skip:28 fi-snb-i7-2600 total:244 pass:202 dwarn:0 dfail:0 fail:0 skip:42 ro-bdw-i5-5250u total:240 pass:219 dwarn:3 dfail:0 fail:1 skip:17 ro-bdw-i7-5600u total:240 pass:206 dwarn:1 dfail:0 fail:1 skip:32 ro-bsw-n3050 total:240 pass:193 dwarn:0 dfail:0 fail:5 skip:42 ro-byt-n2820 total:240 pass:197 dwarn:0 dfail:0 fail:3 skip:40 ro-hsw-i3-4010u total:240 pass:214 dwarn:0 dfail:0 fail:0 skip:26 ro-hsw-i7-4770r total:240 pass:185 dwarn:0 dfail:0 fail:0 skip:55 ro-ilk1-i5-650 total:235 pass:173 dwarn:0 dfail:0 fail:2 skip:60 ro-ivb-i7-3770 total:240 pass:204 dwarn:0 dfail:0 fail:0 skip:36 ro-ivb2-i7-3770 total:240 pass:209 dwarn:0 dfail:0 fail:0 skip:31 ro-skl3-i5-6260u total:240 pass:222 dwarn:0 dfail:0 fail:4 skip:14 Results at /archive/results/CI_IGT_test/RO_Patchwork_1866/ 441299a drm-intel-nightly: 2016y-08m-15d-07h-32m-02s UTC integration manifest 3653f3b drm/i915: Show RPS autotuning thresholds along with waitboost ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 03/10] drm/i915: Move fence tracking from object to vma
On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote: > @@ -455,15 +455,21 @@ struct intel_opregion { > struct intel_overlay; > struct intel_overlay_error_state; > > -#define I915_FENCE_REG_NONE -1 > -#define I915_MAX_NUM_FENCES 32 > -/* 32 fences + sign bit for FENCE_REG_NONE */ > -#define I915_MAX_NUM_FENCE_BITS 6 > - > struct drm_i915_fence_reg { > struct list_head lru_list; Could be converted to lru_link while at it. > @@ -1131,15 +1131,11 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private > *i915, > } else { > node.start = i915_ggtt_offset(vma); > node.allocated = false; > - ret = i915_gem_object_put_fence(obj); > + ret = i915_vma_put_fence(vma); > if (ret) > goto out_unpin; > } > > - ret = i915_gem_object_set_to_gtt_domain(obj, true); > - if (ret) > - goto out_unpin; > - This is a somewhat an unexpected change in here. Care to explain? > +static void i965_write_fence_reg(struct drm_i915_fence_reg *fence, > + struct i915_vma *vma) > { > - struct drm_i915_private *dev_priv = to_i915(dev); > i915_reg_t fence_reg_lo, fence_reg_hi; > int fence_pitch_shift; > + u64 val; > > - if (INTEL_INFO(dev)->gen >= 6) { > - fence_reg_lo = FENCE_REG_GEN6_LO(reg); > - fence_reg_hi = FENCE_REG_GEN6_HI(reg); > + if (INTEL_INFO(fence->i915)->gen >= 6) { > + fence_reg_lo = FENCE_REG_GEN6_LO(fence->id); > + fence_reg_hi = FENCE_REG_GEN6_HI(fence->id); > fence_pitch_shift = GEN6_FENCE_PITCH_SHIFT; > + > } else { > - fence_reg_lo = FENCE_REG_965_LO(reg); > - fence_reg_hi = FENCE_REG_965_HI(reg); > + fence_reg_lo = FENCE_REG_965_LO(fence->id); > + fence_reg_hi = FENCE_REG_965_HI(fence->id); > fence_pitch_shift = I965_FENCE_PITCH_SHIFT; > } > > - /* To w/a incoherency with non-atomic 64-bit register updates, > - * we split the 64-bit update into two 32-bit writes. In order > - * for a partial fence not to be evaluated between writes, we > - * precede the update with write to turn off the fence register, > - * and only enable the fence as the last step. > - * > - * For extra levels of paranoia, we make sure each step lands > - * before applying the next step. > - */ > - I915_WRITE(fence_reg_lo, 0); > - POSTING_READ(fence_reg_lo); > - > - if (obj) { > - struct i915_vma *vma = i915_gem_object_to_ggtt(obj, NULL); > - unsigned int tiling = i915_gem_object_get_tiling(obj); > - unsigned int stride = i915_gem_object_get_stride(obj); > - u64 size = vma->node.size; > - u32 row_size = stride * (tiling == I915_TILING_Y ? 32 : 8); > - u64 val; > - > - /* Adjust fence size to match tiled area */ > - size = rounddown(size, row_size); > + if (vma) { > + unsigned int tiling = i915_gem_object_get_tiling(vma->obj); > + unsigned int tiling_y = tiling == I915_TILING_Y; bool and maybe 'y_tiled'? > + unsigned int stride = i915_gem_object_get_stride(vma->obj); > + u32 row_size = stride * (tiling_y ? 32 : 8); > + u32 size = rounddown(vma->node.size, row_size); > > val = ((vma->node.start + size - 4096) & 0xf000) << 32; > val |= vma->node.start & 0xf000; > val |= (u64)((stride / 128) - 1) << fence_pitch_shift; > - if (tiling == I915_TILING_Y) > + if (tiling_y) > val |= 1 << I965_FENCE_TILING_Y_SHIFT; While around, BIT() > val |= I965_FENCE_REG_VALID; > + } else > + val = 0; > + > + if (1) { Umm? At least ought to have TODO: / FIXME: or some explanation. And if (!1) return; Would make the code more readable too, as you do not have any else branch. 
> @@ -152,20 +148,23 @@ static void i915_write_fence_reg(struct drm_device > *dev, int reg, > } else > val = 0; > > - I915_WRITE(FENCE_REG(reg), val); > - POSTING_READ(FENCE_REG(reg)); > + if (1) { Ditto. > @@ -186,96 +185,95 @@ static void i830_write_fence_reg(struct drm_device > *dev, int reg, > } else > val = 0; > > - I915_WRITE(FENCE_REG(reg), val); > - POSTING_READ(FENCE_REG(reg)); > -} > + if (1) { Ditto. > -static struct drm_i915_fence_reg * > -i915_find_fence_reg(struct drm_device *dev) > +static struct drm_i915_fence_reg *fence_find(struct drm_i915_private > *dev_priv) > { > - struct drm_i915_private *dev_priv = to_i915(dev); > - struct drm_i915_fence_reg *reg, *avail; > - int i; > - > - /* First try to find a free reg */ > - avail = NULL; > - for (i = 0; i < dev_priv->num_fence_regs; i++) { > - reg = &dev
Re: [Intel-gfx] [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs
On 12/08/16 17:31, Goel, Akash wrote: On 8/12/2016 9:52 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash Goel As per the current i915 Driver load sequence, debugfs registration is done at the end and so the relay channel debugfs file is also created after that but the GuC firmware is loaded much earlier in the sequence. As a result Driver could miss capturing the boot-time logs of GuC firmware if there are flush interrupts from the GuC side. Relay has a provision to support early logging where initially only relay channel can be created, to have buffers for storing logs, and later on channel can be associated with a debugfs file at appropriate time. Have availed that, which allows Driver to capture boot time logs also, which can be collected once Userspace comes up. Suggested-by: Chris Wilson Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_guc_submission.c | 61 +- 1 file changed, 44 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index af48f62..1c287d7 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -1099,25 +1099,12 @@ static void guc_remove_log_relay_file(struct intel_guc *guc) relay_close(guc->log.relay_chan); } -static int guc_create_log_relay_file(struct intel_guc *guc) +static int guc_create_relay_channel(struct intel_guc *guc) { struct drm_i915_private *dev_priv = guc_to_i915(guc); struct rchan *guc_log_relay_chan; -struct dentry *log_dir; size_t n_subbufs, subbuf_size; -/* For now create the log file in /sys/kernel/debug/dri/0 dir */ -log_dir = dev_priv->drm.primary->debugfs_root; - -/* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is - * not mounted and so can't create the relay file. - * The relay API seems to fit well with debugfs only. It only needs a dentry, I don't see that it has to be a debugfs one. Besides dentry, there are other requirements for using relay, which can be met only for a debugfs file. debugfs wasn't the preferred choice to place the log file, but had no other option, as relay API is compatible with debugfs only. What are those and should they be mentioned in the comment above? Regards, Tvrtko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
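A minimal sketch of the two-stage relay setup the quoted commit message relies on: relay_open() with a NULL base filename allocates the sub-buffers without creating any file, and relay_late_setup_files() binds the channel to debugfs once dri/0 exists. The buffer sizes, callback struct and function names here are placeholders, not the values used by the actual patch:

static struct rchan *guc_log_chan;

static int guc_log_relay_create_early(void)
{
	/* No base filename and no parent dentry: buffers are allocated
	 * so GuC flush interrupts can be captured from firmware load
	 * onwards, but no debugfs file is created yet.
	 */
	guc_log_chan = relay_open(NULL, NULL, SZ_4K, 8,
				  &guc_relay_callbacks, NULL);
	return guc_log_chan ? 0 : -ENOMEM;
}

static int guc_log_relay_attach_file(struct dentry *dri_root)
{
	/* Called once debugfs registration has happened, associating
	 * the already-filled buffers with a readable file.
	 */
	return relay_late_setup_files(guc_log_chan, "guc_log", dri_root);
}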
Re: [Intel-gfx] [PATCH 03/10] drm/i915: Move fence tracking from object to vma
On Mon, Aug 15, 2016 at 12:18:20PM +0300, Joonas Lahtinen wrote:
> On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote:
> > +	if (1) {
>
> Umm? At least ought to have TODO: / FIXME: or some explanation. And

You're not aware of the pipelined fencing?
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
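A minimal sketch of the i965 fence value computation with the style tweaks Joonas suggests earlier in this thread (a bool for the Y-tiling test and BIT() for the tiling flag). The field and macro names come from the quoted patch; the standalone helper form and its name are purely illustrative:

static u64 i965_fence_val(const struct i915_vma *vma, int fence_pitch_shift)
{
	unsigned int stride = i915_gem_object_get_stride(vma->obj);
	bool y_tiled = i915_gem_object_get_tiling(vma->obj) == I915_TILING_Y;
	u32 row_size = stride * (y_tiled ? 32 : 8);
	u64 size = rounddown(vma->node.size, row_size);
	u64 val;

	/* Fence covers [start, start + size), trimmed to whole tile rows */
	val = ((vma->node.start + size - 4096) & 0xfffff000) << 32;
	val |= vma->node.start & 0xfffff000;
	val |= (u64)((stride / 128) - 1) << fence_pitch_shift;
	if (y_tiled)
		val |= BIT(I965_FENCE_TILING_Y_SHIFT);
	val |= I965_FENCE_REG_VALID;

	return val;
}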
Re: [Intel-gfx] [PATCH 05/10] drm/i915: Fix partial GGTT faulting
On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote: > @@ -1717,26 +1716,30 @@ int i915_gem_fault(struct vm_area_struct *area, > struct vm_fault *vmf) > } > > /* Use a partial view if the object is bigger than the aperture. */ Move this comment down to where partial view is actually created. > - if (obj->base.size >= ggtt->mappable_end && > - !i915_gem_object_is_tiled(obj)) { > + /* Now pin it into the GTT if needed */ > + vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, > + PIN_MAPPABLE | PIN_NONBLOCK); > + if (IS_ERR(vma)) { > + struct i915_ggtt_view partial; 'view' still makes more sense, less repeating of the word partial down. > @@ -1754,26 +1757,7 @@ int i915_gem_fault(struct vm_area_struct *area, struct > vm_fault *vmf) > pfn = ggtt->mappable_base + i915_ggtt_offset(vma); > pfn >>= PAGE_SHIFT; > > - if (unlikely(view.type == I915_GGTT_VIEW_PARTIAL)) { > - /* Overriding existing pages in partial view does not cause > - * us any trouble as TLBs are still valid because the fault > - * is due to userspace losing part of the mapping or never > - * having accessed it before (at this partials' range). > - */ > - unsigned long base = area->vm_start + > - (view.params.partial.offset << PAGE_SHIFT); > - unsigned int i; > - > - for (i = 0; i < view.params.partial.size; i++) { > - ret = vm_insert_pfn(area, > - base + i * PAGE_SIZE, > - pfn + i); > - if (ret) > - break; > - } > - > - obj->fault_mappable = true; > - } else { > + if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) { likely() ? > if (!obj->fault_mappable) { > unsigned long size = > min_t(unsigned long, > @@ -1789,13 +1773,31 @@ int i915_gem_fault(struct vm_area_struct *area, > struct vm_fault *vmf) > if (ret) > break; > } > - > - obj->fault_mappable = true; > } else > ret = vm_insert_pfn(area, > (unsigned long)vmf->virtual_address, > pfn + page_offset); > + } else { > + /* Overriding existing pages in partial view does not cause > + * us any trouble as TLBs are still valid because the fault > + * is due to userspace losing part of the mapping or never > + * having accessed it before (at this partials' range). > + */ > + const struct i915_ggtt_view *view = &vma->ggtt_view; I now see why you did the rename. Do not have a better idea really, so; Reviewed-by: Joonas Lahtinen Regards, Joonas -- Joonas Lahtinen Open Source Technology Center Intel Corporation ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 07/10] drm/i915: Fallback to using unmappable memory for scanout
On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote:
> The existing ABI says that scanouts are pinned into the mappable region
> so that legacy clients (e.g. old Xorg or plymouthd) can write directly
> into the scanout through a GTT mapping. However if the surface does not
> fit into the mappable region, we are better off just trying to fit it
> anywhere and hoping for the best. (Any userspace that is cappable of

s/cappable/capable/

> using ginormous scanouts is also likely not to rely on pure GTT
> updates.) With the partial vma fault support, we are no longer
> restricted to only using scanouts that we can pin (though it is still
> preferred for performance reasons and for powersaving features like
> FBC).
>
> v2: Skip fence pinning when not mappable.
> v3: Add a comment to explain the possible rammifactions of not being
> able to use fences for unmappable scanouts.
> v4: Rebase to skip over some local patches
> v5: Rebase to defer until after we have unmappable GTT fault support
>
> Signed-off-by: Chris Wilson

Reviewed-by: Joonas Lahtinen

Could use some Acked-by tags.

Regards, Joonas
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 08/10] drm/i915: Track display alignment on VMA
On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote: > When using the aliasing ppgtt and pagefliping with the shrinker/eviction s/fliping/flipping/ > active, we note that we often have to rebind the backbuffer before > flipping onto the scanout because it has an invalid alignment. If we > store the worst-case alignment required for a VMA, we can avoid having > to rebind at critical junctures. > > Signed-off-by: Chris Wilson > @@ -2984,17 +2983,10 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 > alignment, u64 flags) > size = i915_gem_get_ggtt_size(dev_priv, size, > i915_gem_object_get_tiling(obj)); > > - min_alignment = > - i915_gem_get_ggtt_alignment(dev_priv, size, > - i915_gem_object_get_tiling(obj), > - flags & PIN_MAPPABLE); > - if (alignment == 0) > - alignment = min_alignment; > - if (alignment & (min_alignment - 1)) { > - DRM_DEBUG("Invalid object alignment requested %llu, minimum > %llu\n", > - alignment, min_alignment); > - return -EINVAL; > - } > + alignment = max(max(alignment, vma->display_alignment), > + i915_gem_get_ggtt_alignment(dev_priv, size, > + > i915_gem_object_get_tiling(obj), > + flags & PIN_MAPPABLE)); No DRM_DEBUG no more? > @@ -183,7 +183,7 @@ struct i915_vma { > struct drm_i915_fence_reg *fence; > struct sg_table *pages; > void __iomem *iomap; > - u64 size; > + u64 size, display_alignment; Unrelated variables, better off their own lines. Reviewed-by: Joonas Lahtinen Regards, Joonas -- Joonas Lahtinen Open Source Technology Center Intel Corporation ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [CI 03/32] drm/i915: Store the active context object on all engines upon error
With execlists, we have context objects everywhere, not just RCS. So store them for post-mortem debugging. This also has a secondary effect of removing one more unsafe list iteration with using preserved state from the hanging request. And now we can cross-reference the request's context state with that loaded by the GPU. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gpu_error.c | 28 1 file changed, 4 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 1c098fa65fbe..d11630bac188 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1043,28 +1043,6 @@ static void error_record_engine_registers(struct drm_i915_error_state *error, } } -static void i915_gem_record_active_context(struct intel_engine_cs *engine, - struct drm_i915_error_state *error, - struct drm_i915_error_engine *ee) -{ - struct drm_i915_private *dev_priv = engine->i915; - struct drm_i915_gem_object *obj; - - /* Currently render ring is the only HW context user */ - if (engine->id != RCS || !error->ccid) - return; - - list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { - if (!i915_gem_obj_ggtt_bound(obj)) - continue; - - if ((error->ccid & PAGE_MASK) == i915_gem_obj_ggtt_offset(obj)) { - ee->ctx = i915_error_ggtt_object_create(dev_priv, obj); - break; - } - } -} - static void i915_gem_record_rings(struct drm_i915_private *dev_priv, struct drm_i915_error_state *error) { @@ -1114,6 +1092,10 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, i915_error_ggtt_object_create(dev_priv, engine->scratch.obj); + ee->ctx = + i915_error_ggtt_object_create(dev_priv, + request->ctx->engine[i].state); + if (request->pid) { struct task_struct *task; @@ -1144,8 +1126,6 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, ee->wa_ctx = i915_error_ggtt_object_create(dev_priv, engine->wa_ctx.obj); - i915_gem_record_active_context(engine, error, ee); - count = 0; list_for_each_entry(request, &engine->request_list, link) count++; -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [CI 05/32] drm/i915: Focus debugfs/i915_gem_pinned to show only display pins
Only those objects pinned to the display have semi-permanent pins of a global nature (other pins are transient within their local vm). Simplify i915_gem_pinned to only show the pertinent information about the pinned objects within the GGTT. v2: i915_gem_gtt_info is still shared with debugfs/i915_gem_gtt, rename i915_gem_pinned to i915_gem_pin_display to better reflect its contents Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c | 12 +++- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index cf35ce0b8518..c3bc5db1124f 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -40,12 +40,6 @@ #include #include "i915_drv.h" -enum { - ACTIVE_LIST, - INACTIVE_LIST, - PINNED_LIST, -}; - /* As the drm_debugfs_init() routines are called before dev->dev_private is * allocated we need to hook into the minor for release. */ static int @@ -537,8 +531,8 @@ static int i915_gem_gtt_info(struct seq_file *m, void *data) { struct drm_info_node *node = m->private; struct drm_device *dev = node->minor->dev; - uintptr_t list = (uintptr_t) node->info_ent->data; struct drm_i915_private *dev_priv = to_i915(dev); + bool show_pin_display_only = !!data; struct drm_i915_gem_object *obj; u64 total_obj_size, total_gtt_size; int count, ret; @@ -549,7 +543,7 @@ static int i915_gem_gtt_info(struct seq_file *m, void *data) total_obj_size = total_gtt_size = count = 0; list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) { - if (list == PINNED_LIST && !i915_gem_obj_is_pinned(obj)) + if (show_pin_display_only && !obj->pin_display) continue; seq_puts(m, " "); @@ -5381,7 +5375,7 @@ static const struct drm_info_list i915_debugfs_list[] = { {"i915_capabilities", i915_capabilities, 0}, {"i915_gem_objects", i915_gem_object_info, 0}, {"i915_gem_gtt", i915_gem_gtt_info, 0}, - {"i915_gem_pinned", i915_gem_gtt_info, 0, (void *) PINNED_LIST}, + {"i915_gem_pin_display", i915_gem_gtt_info, 0, (void *)1}, {"i915_gem_stolen", i915_gem_stolen_list_info }, {"i915_gem_pageflip", i915_gem_pageflip_info, 0}, {"i915_gem_request", i915_gem_request_info, 0}, -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [CI 14/32] drm/i915: Use VMA directly for checking tiling parameters
v2: Rename functions to suit their more active role Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_tiling.c | 51 -- 1 file changed, 30 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c index f4b984de83b5..b2b0cb7199ac 100644 --- a/drivers/gpu/drm/i915/i915_gem_tiling.c +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c @@ -116,35 +116,46 @@ i915_tiling_ok(struct drm_device *dev, int stride, int size, int tiling_mode) return true; } -/* Is the current GTT allocation valid for the change in tiling? */ -static bool -i915_gem_object_fence_ok(struct drm_i915_gem_object *obj, int tiling_mode) +/* Make the current GTT allocation valid for the change in tiling. */ +static int +i915_gem_object_fence_prepare(struct drm_i915_gem_object *obj, int tiling_mode) { struct drm_i915_private *dev_priv = to_i915(obj->base.dev); + struct i915_vma *vma; u32 size; if (tiling_mode == I915_TILING_NONE) - return true; + return 0; if (INTEL_GEN(dev_priv) >= 4) - return true; + return 0; + + vma = i915_gem_obj_to_ggtt(obj); + if (!vma) + return 0; + + if (!obj->map_and_fenceable) + return 0; if (IS_GEN3(dev_priv)) { - if (i915_gem_obj_ggtt_offset(obj) & ~I915_FENCE_START_MASK) - return false; + if (vma->node.start & ~I915_FENCE_START_MASK) + goto bad; } else { - if (i915_gem_obj_ggtt_offset(obj) & ~I830_FENCE_START_MASK) - return false; + if (vma->node.start & ~I830_FENCE_START_MASK) + goto bad; } size = i915_gem_get_ggtt_size(dev_priv, obj->base.size, tiling_mode); - if (i915_gem_obj_ggtt_size(obj) != size) - return false; + if (vma->node.size < size) + goto bad; - if (i915_gem_obj_ggtt_offset(obj) & (size - 1)) - return false; + if (vma->node.start & (size - 1)) + goto bad; - return true; + return 0; + +bad: + return i915_vma_unbind(vma); } /** @@ -168,7 +179,7 @@ i915_gem_set_tiling(struct drm_device *dev, void *data, struct drm_i915_gem_set_tiling *args = data; struct drm_i915_private *dev_priv = to_i915(dev); struct drm_i915_gem_object *obj; - int ret = 0; + int err = 0; /* Make sure we don't cross-contaminate obj->tiling_and_stride */ BUILD_BUG_ON(I915_TILING_LAST & STRIDE_MASK); @@ -187,7 +198,7 @@ i915_gem_set_tiling(struct drm_device *dev, void *data, mutex_lock(&dev->struct_mutex); if (obj->pin_display || obj->framebuffer_references) { - ret = -EBUSY; + err = -EBUSY; goto err; } @@ -234,11 +245,9 @@ i915_gem_set_tiling(struct drm_device *dev, void *data, * has to also include the unfenced register the GPU uses * whilst executing a fenced command for an untiled object. */ - if (obj->map_and_fenceable && - !i915_gem_object_fence_ok(obj, args->tiling_mode)) - ret = i915_vma_unbind(i915_gem_obj_to_ggtt(obj)); - if (ret == 0) { + err = i915_gem_object_fence_prepare(obj, args->tiling_mode); + if (!err) { if (obj->pages && obj->madv == I915_MADV_WILLNEED && dev_priv->quirks & QUIRK_PIN_SWIZZLED_PAGES) { @@ -281,7 +290,7 @@ err: intel_runtime_pm_put(dev_priv); - return ret; + return err; } /** -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [CI 07/32] drm/i915: Remove redundant WARN_ON from __i915_add_request()
It's an outright programming error, so explode if it is ever hit. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_request.c | 10 ++ 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c index 8a9e9bfeea09..4c5b7e104f2f 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.c +++ b/drivers/gpu/drm/i915/i915_gem_request.c @@ -470,18 +470,12 @@ static void i915_gem_mark_busy(const struct intel_engine_cs *engine) */ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches) { - struct intel_engine_cs *engine; - struct intel_ring *ring; + struct intel_engine_cs *engine = request->engine; + struct intel_ring *ring = request->ring; u32 request_start; u32 reserved_tail; int ret; - if (WARN_ON(!request)) - return; - - engine = request->engine; - ring = request->ring; - /* * To ensure that this call will not fail, space for its emissions * should already have been reserved in the ring buffer. Let the ring -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [CI 02/32] drm/i915: Reduce amount of duplicate buffer information captured on error
When capturing the error state, we do not need to know about every address space - just those that are related to the error. We know which context is active at the time, therefore we know which VM are implicated in the error. We can then restrict the VM which we report to the relevant subset. v2: s/i/count_active/ (and similar) Rewrite label generation for "Buffers" Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_drv.h | 9 +- drivers/gpu/drm/i915/i915_gpu_error.c | 224 +++--- 2 files changed, 105 insertions(+), 128 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index b1017950087b..7eb911e47904 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -517,6 +517,7 @@ struct drm_i915_error_state { int num_waiters; int hangcheck_score; enum intel_engine_hangcheck_action hangcheck_action; + struct i915_address_space *vm; int num_requests; /* our own tracking of ring head and tail */ @@ -587,17 +588,15 @@ struct drm_i915_error_state { u32 read_domains; u32 write_domain; s32 fence_reg:I915_MAX_NUM_FENCE_BITS; - s32 pinned:2; u32 tiling:2; u32 dirty:1; u32 purgeable:1; u32 userptr:1; s32 engine:4; u32 cache_level:3; - } **active_bo, **pinned_bo; - - u32 *active_bo_count, *pinned_bo_count; - u32 vm_count; + } *active_bo[I915_NUM_ENGINES], *pinned_bo; + u32 active_bo_count[I915_NUM_ENGINES], pinned_bo_count; + struct i915_address_space *active_vm[I915_NUM_ENGINES]; }; struct intel_connector; diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index d54848f5f246..1c098fa65fbe 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -42,16 +42,6 @@ static const char *engine_str(int engine) } } -static const char *pin_flag(int pinned) -{ - if (pinned > 0) - return " P"; - else if (pinned < 0) - return " p"; - else - return ""; -} - static const char *tiling_flag(int tiling) { switch (tiling) { @@ -189,7 +179,7 @@ static void print_error_buffers(struct drm_i915_error_state_buf *m, { int i; - err_printf(m, " %s [%d]:\n", name, count); + err_printf(m, "%s [%d]:\n", name, count); while (count--) { err_printf(m, "%08x_%08x %8u %02x %02x [ ", @@ -202,7 +192,6 @@ static void print_error_buffers(struct drm_i915_error_state_buf *m, err_printf(m, "%02x ", err->rseqno[i]); err_printf(m, "] %02x", err->wseqno); - err_puts(m, pin_flag(err->pinned)); err_puts(m, tiling_flag(err->tiling)); err_puts(m, dirty_flag(err->dirty)); err_puts(m, purgeable_flag(err->purgeable)); @@ -414,18 +403,33 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, error_print_engine(m, &error->engine[i]); } - for (i = 0; i < error->vm_count; i++) { - err_printf(m, "vm[%d]\n", i); + for (i = 0; i < ARRAY_SIZE(error->active_vm); i++) { + char buf[128]; + int len, first = 1; - print_error_buffers(m, "Active", + if (!error->active_vm[i]) + break; + + len = scnprintf(buf, sizeof(buf), "Active ("); + for (j = 0; j < ARRAY_SIZE(error->engine); j++) { + if (error->engine[j].vm != error->active_vm[i]) + continue; + + len += scnprintf(buf + len, sizeof(buf), "%s%s", +first ? 
"" : ", ", +dev_priv->engine[j].name); + first = 0; + } + scnprintf(buf + len, sizeof(buf), ")"); + print_error_buffers(m, buf, error->active_bo[i], error->active_bo_count[i]); - - print_error_buffers(m, "Pinned", - error->pinned_bo[i], - error->pinned_bo_count[i]); } + print_error_buffers(m, "Pinned (global)", + error->pinned_bo, + error->pinned_bo_count); + for (i = 0; i < ARRAY_SIZE(error->engine); i++) { struct drm_i915_error_engine *ee = &error->engine[i]; @@ -627,13 +631,10 @@ static void i915_error_state_free(struct kref *error_ref) i915_error_object_free(error->semaphore_obj);
[Intel-gfx] [CI 01/32] drm/i915: Record the position of the start of the request
Not only does it make for good documentation and debugging aide, but it is also vital for when we want to unwind requests - such as when throwing away an incomplete request. Signed-off-by: Chris Wilson Link: http://patchwork.freedesktop.org/patch/msgid/1470414607-32453-2-git-send-email-arun.siluv...@linux.intel.com Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_gem_request.c | 13 + drivers/gpu/drm/i915/i915_gpu_error.c | 6 -- 3 files changed, 14 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index bf193ba1574e..b1017950087b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -557,6 +557,7 @@ struct drm_i915_error_state { struct drm_i915_error_request { long jiffies; u32 seqno; + u32 head; u32 tail; } *requests; diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c index b764c1d440c8..8a9e9bfeea09 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.c +++ b/drivers/gpu/drm/i915/i915_gem_request.c @@ -426,6 +426,13 @@ i915_gem_request_alloc(struct intel_engine_cs *engine, if (ret) goto err_ctx; + /* Record the position of the start of the request so that +* should we detect the updated seqno part-way through the +* GPU processing the request, we never over-estimate the +* position of the head. +*/ + req->head = req->ring->tail; + return req; err_ctx: @@ -500,8 +507,6 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches) trace_i915_gem_request_add(request); - request->head = request_start; - /* Seal the request and mark it as pending execution. Note that * we may inspect this state, without holding any locks, during * hangcheck. Hence we apply the barrier to ensure that we do not @@ -514,10 +519,10 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches) list_add_tail(&request->link, &engine->request_list); list_add_tail(&request->ring_link, &ring->request_list); - /* Record the position of the start of the request so that + /* Record the position of the start of the breadcrumb so that * should we detect the updated seqno part-way through the * GPU processing the request, we never over-estimate the -* position of the head. +* position of the ring's HEAD. */ request->postfix = ring->tail; diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index eecb87063c88..d54848f5f246 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -455,9 +455,10 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, dev_priv->engine[i].name, ee->num_requests); for (j = 0; j < ee->num_requests; j++) { - err_printf(m, " seqno 0x%08x, emitted %ld, tail 0x%08x\n", + err_printf(m, " seqno 0x%08x, emitted %ld, head 0x%08x, tail 0x%08x\n", ee->requests[j].seqno, ee->requests[j].jiffies, + ee->requests[j].head, ee->requests[j].tail); } } @@ -1205,7 +1206,8 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, erq = &ee->requests[count++]; erq->seqno = request->fence.seqno; erq->jiffies = request->emitted_jiffies; - erq->tail = request->postfix; + erq->head = request->head; + erq->tail = request->tail; } } } -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [CI 13/32] drm/i915: Convert fence computations to use vma directly
Lookup the GGTT vma once for the object assigned to the fence, and then derive everything from that vma. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_fence.c | 55 +-- 1 file changed, 26 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_fence.c b/drivers/gpu/drm/i915/i915_gem_fence.c index 9e8173fe2a09..d99fc5734cf1 100644 --- a/drivers/gpu/drm/i915/i915_gem_fence.c +++ b/drivers/gpu/drm/i915/i915_gem_fence.c @@ -85,22 +85,19 @@ static void i965_write_fence_reg(struct drm_device *dev, int reg, POSTING_READ(fence_reg_lo); if (obj) { - u32 size = i915_gem_obj_ggtt_size(obj); + struct i915_vma *vma = i915_gem_obj_to_ggtt(obj); unsigned int tiling = i915_gem_object_get_tiling(obj); unsigned int stride = i915_gem_object_get_stride(obj); - uint64_t val; + u32 size = vma->node.size; + u32 row_size = stride * (tiling == I915_TILING_Y ? 32 : 8); + u64 val; /* Adjust fence size to match tiled area */ - if (tiling != I915_TILING_NONE) { - uint32_t row_size = stride * - (tiling == I915_TILING_Y ? 32 : 8); - size = (size / row_size) * row_size; - } + size = rounddown(size, row_size); - val = (uint64_t)((i915_gem_obj_ggtt_offset(obj) + size - 4096) & -0xf000) << 32; - val |= i915_gem_obj_ggtt_offset(obj) & 0xf000; - val |= (uint64_t)((stride / 128) - 1) << fence_pitch_shift; + val = ((vma->node.start + size - 4096) & 0xf000) << 32; + val |= vma->node.start & 0xf000; + val |= (u64)((stride / 128) - 1) << fence_pitch_shift; if (tiling == I915_TILING_Y) val |= 1 << I965_FENCE_TILING_Y_SHIFT; val |= I965_FENCE_REG_VALID; @@ -123,17 +120,17 @@ static void i915_write_fence_reg(struct drm_device *dev, int reg, u32 val; if (obj) { - u32 size = i915_gem_obj_ggtt_size(obj); + struct i915_vma *vma = i915_gem_obj_to_ggtt(obj); unsigned int tiling = i915_gem_object_get_tiling(obj); unsigned int stride = i915_gem_object_get_stride(obj); int pitch_val; int tile_width; - WARN((i915_gem_obj_ggtt_offset(obj) & ~I915_FENCE_START_MASK) || -(size & -size) != size || -(i915_gem_obj_ggtt_offset(obj) & (size - 1)), -"object 0x%08llx [fenceable? %d] not 1M or pot-size (0x%08x) aligned\n", -i915_gem_obj_ggtt_offset(obj), obj->map_and_fenceable, size); + WARN((vma->node.start & ~I915_FENCE_START_MASK) || +!is_power_of_2(vma->node.size) || +(vma->node.start & (vma->node.size - 1)), +"object 0x%08llx [fenceable? 
%d] not 1M or pot-size (0x%08llx) aligned\n", +vma->node.start, obj->map_and_fenceable, vma->node.size); if (tiling == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev)) tile_width = 128; @@ -144,10 +141,10 @@ static void i915_write_fence_reg(struct drm_device *dev, int reg, pitch_val = stride / tile_width; pitch_val = ffs(pitch_val) - 1; - val = i915_gem_obj_ggtt_offset(obj); + val = vma->node.start; if (tiling == I915_TILING_Y) val |= 1 << I830_FENCE_TILING_Y_SHIFT; - val |= I915_FENCE_SIZE_BITS(size); + val |= I915_FENCE_SIZE_BITS(vma->node.size); val |= pitch_val << I830_FENCE_PITCH_SHIFT; val |= I830_FENCE_REG_VALID; } else @@ -161,27 +158,27 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg, struct drm_i915_gem_object *obj) { struct drm_i915_private *dev_priv = to_i915(dev); - uint32_t val; + u32 val; if (obj) { - u32 size = i915_gem_obj_ggtt_size(obj); + struct i915_vma *vma = i915_gem_obj_to_ggtt(obj); unsigned int tiling = i915_gem_object_get_tiling(obj); unsigned int stride = i915_gem_object_get_stride(obj); - uint32_t pitch_val; + u32 pitch_val; - WARN((i915_gem_obj_ggtt_offset(obj) & ~I830_FENCE_START_MASK) || -(size & -size) != size || -(i915_gem_obj_ggtt_offset(obj) & (size - 1)), -"object 0x%08llx not 512K or pot-size 0x%08x aligned\n", -i915_gem_obj_ggtt_offset(obj), size); + WARN((vma->node.s
[Intel-gfx] [CI 09/32] drm/i915: Create a VMA for an object
In many places, we wish to store the VMA in preference to the object itself and so being able to create the persistent VMA is useful. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_gtt.c | 11 +++ drivers/gpu/drm/i915/i915_gem_gtt.h | 5 + 2 files changed, 16 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 9c178b0c40b5..1bec50bd651b 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -3387,6 +3387,17 @@ __i915_gem_vma_create(struct drm_i915_gem_object *obj, } struct i915_vma * +i915_vma_create(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, + const struct i915_ggtt_view *view) +{ + GEM_BUG_ON(view && !i915_is_ggtt(vm)); + GEM_BUG_ON(view ? i915_gem_obj_to_ggtt_view(obj, view) : i915_gem_obj_to_vma(obj, vm)); + + return __i915_gem_vma_create(obj, vm, view ?: &i915_ggtt_view_normal); +} + +struct i915_vma * i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj, struct i915_address_space *vm) { diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index b580e8a013ce..f2769e01cc8c 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -228,6 +228,11 @@ struct i915_vma { struct drm_i915_gem_exec_object2 *exec_entry; }; +struct i915_vma * +i915_vma_create(struct drm_i915_gem_object *obj, + struct i915_address_space *vm, + const struct i915_ggtt_view *view); + static inline bool i915_vma_is_ggtt(const struct i915_vma *vma) { return vma->flags & I915_VMA_GGTT; -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [CI 04/32] drm/i915: Remove inactive/active list from debugfs
These two files (i915_gem_active, i915_gem_inactive) no longer give pertinent information since active/inactive tracking is per-vm and so we need the information per-vm. They are obsolete so remove them. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c | 49 - 1 file changed, 49 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index b8ed8db9f7ec..cf35ce0b8518 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -210,53 +210,6 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) seq_printf(m, " (frontbuffer: 0x%03x)", frontbuffer_bits); } -static int i915_gem_object_list_info(struct seq_file *m, void *data) -{ - struct drm_info_node *node = m->private; - uintptr_t list = (uintptr_t) node->info_ent->data; - struct list_head *head; - struct drm_device *dev = node->minor->dev; - struct drm_i915_private *dev_priv = to_i915(dev); - struct i915_ggtt *ggtt = &dev_priv->ggtt; - struct i915_vma *vma; - u64 total_obj_size, total_gtt_size; - int count, ret; - - ret = mutex_lock_interruptible(&dev->struct_mutex); - if (ret) - return ret; - - /* FIXME: the user of this interface might want more than just GGTT */ - switch (list) { - case ACTIVE_LIST: - seq_puts(m, "Active:\n"); - head = &ggtt->base.active_list; - break; - case INACTIVE_LIST: - seq_puts(m, "Inactive:\n"); - head = &ggtt->base.inactive_list; - break; - default: - mutex_unlock(&dev->struct_mutex); - return -EINVAL; - } - - total_obj_size = total_gtt_size = count = 0; - list_for_each_entry(vma, head, vm_link) { - seq_printf(m, " "); - describe_obj(m, vma->obj); - seq_printf(m, "\n"); - total_obj_size += vma->obj->base.size; - total_gtt_size += vma->node.size; - count++; - } - mutex_unlock(&dev->struct_mutex); - - seq_printf(m, "Total %d objects, %llu bytes, %llu GTT size\n", - count, total_obj_size, total_gtt_size); - return 0; -} - static int obj_rank_by_stolen(void *priv, struct list_head *A, struct list_head *B) { @@ -5429,8 +5382,6 @@ static const struct drm_info_list i915_debugfs_list[] = { {"i915_gem_objects", i915_gem_object_info, 0}, {"i915_gem_gtt", i915_gem_gtt_info, 0}, {"i915_gem_pinned", i915_gem_gtt_info, 0, (void *) PINNED_LIST}, - {"i915_gem_active", i915_gem_object_list_info, 0, (void *) ACTIVE_LIST}, - {"i915_gem_inactive", i915_gem_object_list_info, 0, (void *) INACTIVE_LIST}, {"i915_gem_stolen", i915_gem_stolen_list_info }, {"i915_gem_pageflip", i915_gem_pageflip_info, 0}, {"i915_gem_request", i915_gem_request_info, 0}, -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [CI 06/32] drm/i915: Reduce i915_gem_objects to only show object information
No longer is knowing how much of the GTT (both mappable aperture and beyond) relevant, and the output clutters the real information - that is how many objects are allocated and bound (and by who) so that we can quickly grasp if there is a leak. v2: Relent, and rename pinned to indicate display only. Since the display objects are semi-static and are of variable size, they are the interesting objects to watch over time for aperture leaking. The other pins are either static (such as the scratch page) or very short lived (such as execbuf) and not part of the precious GGTT. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c | 100 -- drivers/gpu/drm/i915/i915_drv.h | 249 +- drivers/gpu/drm/i915/i915_gpu_error.c | 15 ++ 3 files changed, 168 insertions(+), 196 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index c3bc5db1124f..77a9c56ad25f 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -269,17 +269,6 @@ static int i915_gem_stolen_list_info(struct seq_file *m, void *data) return 0; } -#define count_objects(list, member) do { \ - list_for_each_entry(obj, list, member) { \ - size += i915_gem_obj_total_ggtt_size(obj); \ - ++count; \ - if (obj->map_and_fenceable) { \ - mappable_size += i915_gem_obj_ggtt_size(obj); \ - ++mappable_count; \ - } \ - } \ -} while (0) - struct file_stats { struct drm_i915_file_private *file_priv; unsigned long count; @@ -394,30 +383,16 @@ static void print_context_stats(struct seq_file *m, print_file_stats(m, "[k]contexts", stats); } -#define count_vmas(list, member) do { \ - list_for_each_entry(vma, list, member) { \ - size += i915_gem_obj_total_ggtt_size(vma->obj); \ - ++count; \ - if (vma->obj->map_and_fenceable) { \ - mappable_size += i915_gem_obj_ggtt_size(vma->obj); \ - ++mappable_count; \ - } \ - } \ -} while (0) - static int i915_gem_object_info(struct seq_file *m, void* data) { struct drm_info_node *node = m->private; struct drm_device *dev = node->minor->dev; struct drm_i915_private *dev_priv = to_i915(dev); struct i915_ggtt *ggtt = &dev_priv->ggtt; - u32 count, mappable_count, purgeable_count; - u64 size, mappable_size, purgeable_size; - unsigned long pin_mapped_count = 0, pin_mapped_purgeable_count = 0; - u64 pin_mapped_size = 0, pin_mapped_purgeable_size = 0; + u32 count, mapped_count, purgeable_count, dpy_count; + u64 size, mapped_size, purgeable_size, dpy_size; struct drm_i915_gem_object *obj; struct drm_file *file; - struct i915_vma *vma; int ret; ret = mutex_lock_interruptible(&dev->struct_mutex); @@ -428,70 +403,51 @@ static int i915_gem_object_info(struct seq_file *m, void* data) dev_priv->mm.object_count, dev_priv->mm.object_memory); - size = count = mappable_size = mappable_count = 0; - count_objects(&dev_priv->mm.bound_list, global_list); - seq_printf(m, "%u [%u] objects, %llu [%llu] bytes in gtt\n", - count, mappable_count, size, mappable_size); - - size = count = mappable_size = mappable_count = 0; - count_vmas(&ggtt->base.active_list, vm_link); - seq_printf(m, " %u [%u] active objects, %llu [%llu] bytes\n", - count, mappable_count, size, mappable_size); - - size = count = mappable_size = mappable_count = 0; - count_vmas(&ggtt->base.inactive_list, vm_link); - seq_printf(m, " %u [%u] inactive objects, %llu [%llu] bytes\n", - count, mappable_count, size, mappable_size); - size = count = purgeable_size = purgeable_count = 0; list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) { - size += obj->base.size, ++count; - if 
(obj->madv == I915_MADV_DONTNEED) - purgeable_size += obj->base.size, ++purgeable_count; + size += obj->base.size; + ++count; + + if (obj->madv == I915_MADV_DONTNEED) { + purgeable_size += obj->base.size; + ++purgeable_count; + } + if (obj->mapping) { - pin_mapped_count++; - pin_mapped_size += obj->base.size; - if (obj->pages_pin_count == 0) { - pin_mapped_purgeable_count++; - pin_mapped_purgeable_size += obj->base.size; - } + mapped_count++; + mapped_size += obj->base.size;
[Intel-gfx] [CI 08/32] drm/i915: Always set the vma->pages
Previously, we would only set the vma->pages pointer for GGTT entries. However, if we always set it, we can use it to prettify some code that may want to access the backing store associated with the VMA (as assigned to the VMA). Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem.c | 8 drivers/gpu/drm/i915/i915_gem_gtt.c | 30 ++ drivers/gpu/drm/i915/i915_gem_gtt.h | 3 +-- 3 files changed, 19 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index f48c45080a65..45c45d3a6e31 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2868,12 +2868,12 @@ int i915_vma_unbind(struct i915_vma *vma) if (i915_vma_is_ggtt(vma)) { if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) { obj->map_and_fenceable = false; - } else if (vma->ggtt_view.pages) { - sg_free_table(vma->ggtt_view.pages); - kfree(vma->ggtt_view.pages); + } else if (vma->pages) { + sg_free_table(vma->pages); + kfree(vma->pages); } - vma->ggtt_view.pages = NULL; } + vma->pages = NULL; /* Since the unbound list is global, only move to that list if * no more VMAs exist. */ diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index d876501694c6..9c178b0c40b5 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -170,11 +170,13 @@ static int ppgtt_bind_vma(struct i915_vma *vma, { u32 pte_flags = 0; + vma->pages = vma->obj->pages; + /* Currently applicable only to VLV */ if (vma->obj->gt_ro) pte_flags |= PTE_READ_ONLY; - vma->vm->insert_entries(vma->vm, vma->obj->pages, vma->node.start, + vma->vm->insert_entries(vma->vm, vma->pages, vma->node.start, cache_level, pte_flags); return 0; @@ -2618,8 +2620,7 @@ static int ggtt_bind_vma(struct i915_vma *vma, if (obj->gt_ro) pte_flags |= PTE_READ_ONLY; - vma->vm->insert_entries(vma->vm, vma->ggtt_view.pages, - vma->node.start, + vma->vm->insert_entries(vma->vm, vma->pages, vma->node.start, cache_level, pte_flags); /* @@ -2651,8 +2652,7 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma, if (flags & I915_VMA_GLOBAL_BIND) { vma->vm->insert_entries(vma->vm, - vma->ggtt_view.pages, - vma->node.start, + vma->pages, vma->node.start, cache_level, pte_flags); } @@ -2660,8 +2660,7 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma, struct i915_hw_ppgtt *appgtt = to_i915(vma->vm->dev)->mm.aliasing_ppgtt; appgtt->base.insert_entries(&appgtt->base, - vma->ggtt_view.pages, - vma->node.start, + vma->pages, vma->node.start, cache_level, pte_flags); } @@ -3557,28 +3556,27 @@ i915_get_ggtt_vma_pages(struct i915_vma *vma) { int ret = 0; - if (vma->ggtt_view.pages) + if (vma->pages) return 0; if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) - vma->ggtt_view.pages = vma->obj->pages; + vma->pages = vma->obj->pages; else if (vma->ggtt_view.type == I915_GGTT_VIEW_ROTATED) - vma->ggtt_view.pages = + vma->pages = intel_rotate_fb_obj_pages(&vma->ggtt_view.params.rotated, vma->obj); else if (vma->ggtt_view.type == I915_GGTT_VIEW_PARTIAL) - vma->ggtt_view.pages = - intel_partial_pages(&vma->ggtt_view, vma->obj); + vma->pages = intel_partial_pages(&vma->ggtt_view, vma->obj); else WARN_ONCE(1, "GGTT view %u not implemented!\n", vma->ggtt_view.type); - if (!vma->ggtt_view.pages) { + if (!vma->pages) { DRM_ERROR("Failed to get pages for GGTT view type %u!\n", vma->ggtt_view.type); ret = -EINVAL; - } else if (IS_ERR(vma->ggtt_view.pages)) { - ret = PTR_ERR(vma->ggtt_view.pages); - vma->ggtt_view.pages = NULL; + } else if 
(IS_ERR(vma->pages)) { + ret = PTR_ERR(vma->pages); + vma->pages = NULL; DRM_ERROR("Failed to get pages for VMA view type %
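A short sketch of the ownership rule this patch establishes (not the actual unbind code): vma->pages now always points at the binding's backing store; for the normal GGTT view and for ppgtt VMAs it simply aliases obj->pages, while rotated/partial GGTT views carry their own remapped sg_table that must be freed when the binding goes away.

/* Sketch, assuming the usual i915 driver context: release vma->pages on
 * unbind.  Only non-normal GGTT views own their sg_table; everything else
 * aliases obj->pages and must not be freed here.
 */
static void example_release_vma_pages(struct i915_vma *vma)
{
	if (i915_vma_is_ggtt(vma) &&
	    vma->ggtt_view.type != I915_GGTT_VIEW_NORMAL &&
	    vma->pages) {
		sg_free_table(vma->pages);
		kfree(vma->pages);
	}

	vma->pages = NULL;
}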
[Intel-gfx] [CI 16/32] drm/i915: Only change the context object's domain when binding
We know that the only access to the context object is via the GPU, and the only time when it can be out of the GPU domain is when it is swapped out and unbound. Therefore we only need to clflush the object when binding, thus avoiding any potential stall on touching the domain on an active context. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_context.c | 19 +++ drivers/gpu/drm/i915/intel_ringbuffer.c | 4 2 files changed, 11 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 3857ce097c84..824dfe14bcd0 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -772,6 +772,13 @@ static int do_rcs_switch(struct drm_i915_gem_request *req) if (skip_rcs_switch(ppgtt, engine, to)) return 0; + /* Clear this page out of any CPU caches for coherent swap-in/out. */ + if (!(vma->flags & I915_VMA_GLOBAL_BIND)) { + ret = i915_gem_object_set_to_gtt_domain(vma->obj, false); + if (ret) + return ret; + } + /* Trying to pin first makes error handling easier. */ ret = i915_vma_pin(vma, 0, to->ggtt_alignment, PIN_GLOBAL); if (ret) @@ -786,18 +793,6 @@ static int do_rcs_switch(struct drm_i915_gem_request *req) */ from = engine->last_context; - /* -* Clear this page out of any CPU caches for coherent swap-in/out. Note -* that thanks to write = false in this call and us not setting any gpu -* write domains when putting a context object onto the active list -* (when switching away from it), this won't block. -* -* XXX: We need a real interface to do this instead of trickery. -*/ - ret = i915_gem_object_set_to_gtt_domain(vma->obj, false); - if (ret) - goto err; - if (needs_pd_load_pre(ppgtt, engine, to)) { /* Older GENs and non render rings still want the load first, * "PP_DCLV followed by PP_DIR_BASE register through Load diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 2318a27341c8..81dc69d1ff05 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -2092,6 +2092,10 @@ static int intel_ring_context_pin(struct i915_gem_context *ctx, return 0; if (ce->state) { + ret = i915_gem_object_set_to_gtt_domain(ce->state->obj, false); + if (ret) + goto error; + ret = i915_vma_pin(ce->state, 0, ctx->ggtt_alignment, PIN_GLOBAL | PIN_HIGH); if (ret) -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
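The pattern introduced here is worth spelling out once: the CPU cache flush is only required when the context image is about to be (re)bound into the GGTT; if it is already globally bound it has been GPU-only since the last flush, so pinning can proceed without stalling. A condensed sketch (illustrative, not the full do_rcs_switch()/context-pin code):

/* Sketch of the "flush only on first bind" pattern, assuming the usual
 * i915 driver context. */
static int example_pin_context_state(struct i915_vma *vma, u32 alignment)
{
	int ret;

	/* Clear the page out of any CPU caches only if it is not yet bound */
	if (!(vma->flags & I915_VMA_GLOBAL_BIND)) {
		ret = i915_gem_object_set_to_gtt_domain(vma->obj, false);
		if (ret)
			return ret;
	}

	return i915_vma_pin(vma, 0, alignment, PIN_GLOBAL);
}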
[Intel-gfx] [CI 21/32] drm/i915: Move common seqno reset to intel_engine_cs.c
Since the intel_engine_init_seqno() is shared by all engine submission backends, move it out of the legacy intel_ringbuffer.c and into the new home for common routines, intel_engine_cs.c Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/intel_engine_cs.c | 42 + drivers/gpu/drm/i915/intel_ringbuffer.c | 42 - 2 files changed, 42 insertions(+), 42 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c index 7104dec5e893..829624571ca4 100644 --- a/drivers/gpu/drm/i915/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/intel_engine_cs.c @@ -161,6 +161,48 @@ cleanup: return ret; } +void intel_engine_init_seqno(struct intel_engine_cs *engine, u32 seqno) +{ + struct drm_i915_private *dev_priv = engine->i915; + + /* Our semaphore implementation is strictly monotonic (i.e. we proceed +* so long as the semaphore value in the register/page is greater +* than the sync value), so whenever we reset the seqno, +* so long as we reset the tracking semaphore value to 0, it will +* always be before the next request's seqno. If we don't reset +* the semaphore value, then when the seqno moves backwards all +* future waits will complete instantly (causing rendering corruption). +*/ + if (IS_GEN6(dev_priv) || IS_GEN7(dev_priv)) { + I915_WRITE(RING_SYNC_0(engine->mmio_base), 0); + I915_WRITE(RING_SYNC_1(engine->mmio_base), 0); + if (HAS_VEBOX(dev_priv)) + I915_WRITE(RING_SYNC_2(engine->mmio_base), 0); + } + if (dev_priv->semaphore_obj) { + struct drm_i915_gem_object *obj = dev_priv->semaphore_obj; + struct page *page = i915_gem_object_get_dirty_page(obj, 0); + void *semaphores = kmap(page); + memset(semaphores + GEN8_SEMAPHORE_OFFSET(engine->id, 0), + 0, I915_NUM_ENGINES * gen8_semaphore_seqno_size); + kunmap(page); + } + memset(engine->semaphore.sync_seqno, 0, + sizeof(engine->semaphore.sync_seqno)); + + intel_write_status_page(engine, I915_GEM_HWS_INDEX, seqno); + if (engine->irq_seqno_barrier) + engine->irq_seqno_barrier(engine); + engine->last_submitted_seqno = seqno; + + engine->hangcheck.seqno = seqno; + + /* After manually advancing the seqno, fake the interrupt in case +* there are any waiters for that seqno. +*/ + intel_engine_wakeup(engine); +} + void intel_engine_init_hangcheck(struct intel_engine_cs *engine) { memset(&engine->hangcheck, 0, sizeof(engine->hangcheck)); diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index c89aea55bc10..6008d54b9152 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -2314,48 +2314,6 @@ int intel_ring_cacheline_align(struct drm_i915_gem_request *req) return 0; } -void intel_engine_init_seqno(struct intel_engine_cs *engine, u32 seqno) -{ - struct drm_i915_private *dev_priv = engine->i915; - - /* Our semaphore implementation is strictly monotonic (i.e. we proceed -* so long as the semaphore value in the register/page is greater -* than the sync value), so whenever we reset the seqno, -* so long as we reset the tracking semaphore value to 0, it will -* always be before the next request's seqno. If we don't reset -* the semaphore value, then when the seqno moves backwards all -* future waits will complete instantly (causing rendering corruption). 
-*/ - if (IS_GEN6(dev_priv) || IS_GEN7(dev_priv)) { - I915_WRITE(RING_SYNC_0(engine->mmio_base), 0); - I915_WRITE(RING_SYNC_1(engine->mmio_base), 0); - if (HAS_VEBOX(dev_priv)) - I915_WRITE(RING_SYNC_2(engine->mmio_base), 0); - } - if (dev_priv->semaphore_obj) { - struct drm_i915_gem_object *obj = dev_priv->semaphore_obj; - struct page *page = i915_gem_object_get_dirty_page(obj, 0); - void *semaphores = kmap(page); - memset(semaphores + GEN8_SEMAPHORE_OFFSET(engine->id, 0), - 0, I915_NUM_ENGINES * gen8_semaphore_seqno_size); - kunmap(page); - } - memset(engine->semaphore.sync_seqno, 0, - sizeof(engine->semaphore.sync_seqno)); - - intel_write_status_page(engine, I915_GEM_HWS_INDEX, seqno); - if (engine->irq_seqno_barrier) - engine->irq_seqno_barrier(engine); - engine->last_submitted_seqno = seqno; - - engine->hangcheck.seqno = seqno; - - /* After manually advancing the seqno, fake the interrupt in case -* there
[Intel-gfx] [CI 17/32] drm/i915: Move assertion for iomap access to i915_vma_pin_iomap
Access through the GTT requires the device to be awake. Ideally i915_vma_pin_iomap() is short-lived and the pinning demarcates the access through the iomap. This is not entirely true, we have a mixture of long lived pins that exceed the wakelock (such as legacy ringbuffers) and short lived pin that do live within the wakelock (such as execlist ringbuffers). Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++ drivers/gpu/drm/i915/intel_ringbuffer.c | 3 --- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 1bec50bd651b..738a474c5afa 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -3650,6 +3650,9 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma) { void __iomem *ptr; + /* Access through the GTT requires the device to be awake. */ + assert_rpm_wakelock_held(to_i915(vma->vm->dev)); + lockdep_assert_held(&vma->vm->dev->struct_mutex); if (WARN_ON(!vma->obj->map_and_fenceable)) return IO_ERR_PTR(-ENODEV); diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 81dc69d1ff05..4a614e567353 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1966,9 +1966,6 @@ int intel_ring_pin(struct intel_ring *ring) if (ret) goto err_unpin; - /* Access through the GTT requires the device to be awake. */ - assert_rpm_wakelock_held(dev_priv); - addr = (void __force *) i915_vma_pin_iomap(i915_gem_obj_to_ggtt(obj)); if (IS_ERR(addr)) { -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
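For readers following the wakelock argument, a sketch of what a well-behaved short-lived caller looks like. The helper names around the pin (intel_runtime_pm_get/put, i915_vma_unpin_iomap) are the usual ones of this era but are assumptions here rather than quoted from the patch; treat this as illustrative, not a real call site:

/* Sketch: GGTT iomap access must happen with struct_mutex held and the
 * device awake, which is exactly what the new assertions check. */
static int example_write_through_ggtt(struct drm_i915_private *dev_priv,
				      struct i915_vma *vma, u32 value)
{
	void __iomem *ptr;

	lockdep_assert_held(&dev_priv->drm.struct_mutex);

	intel_runtime_pm_get(dev_priv);	/* satisfies assert_rpm_wakelock_held() */

	ptr = i915_vma_pin_iomap(vma);
	if (IS_ERR(ptr)) {
		intel_runtime_pm_put(dev_priv);
		return PTR_ERR(ptr);
	}

	writel(value, ptr);

	i915_vma_unpin_iomap(vma);
	intel_runtime_pm_put(dev_priv);
	return 0;
}

Long-lived pins such as the legacy ringbuffers are the exception the commit message calls out: they outlive any single wakelock, so the assertion documents the expectation rather than enforcing it for every user.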
[Intel-gfx] [CI 30/32] drm/i915: Print the batchbuffer offset next to BBADDR in error state
It is useful when looking at captured error states to check the recorded BBADDR register (the address of the last batchbuffer instruction loaded) against the expected offset of the batch buffer, and so do a quick check that (a) the capture is true or (b) HEAD hasn't wandered off into the badlands. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_gpu_error.c | 15 +-- 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index d9f29244bafb..bb7d8130dbfd 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -775,6 +775,7 @@ struct drm_i915_error_state { struct drm_i915_error_object { int page_count; u64 gtt_offset; + u64 gtt_size; u32 *pages[0]; } *ringbuffer, *batchbuffer, *wa_batchbuffer, *ctx, *hws_page; diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 638664f78dd5..0f0b65214ef1 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -242,8 +242,16 @@ static void error_print_engine(struct drm_i915_error_state_buf *m, err_printf(m, " IPEIR: 0x%08x\n", ee->ipeir); err_printf(m, " IPEHR: 0x%08x\n", ee->ipehr); err_printf(m, " INSTDONE: 0x%08x\n", ee->instdone); + if (ee->batchbuffer) { + u64 start = ee->batchbuffer->gtt_offset; + u64 end = start + ee->batchbuffer->gtt_size; + + err_printf(m, " batch: [0x%08x_%08x, 0x%08x_%08x]\n", + upper_32_bits(start), lower_32_bits(start), + upper_32_bits(end), lower_32_bits(end)); + } if (INTEL_GEN(m->i915) >= 4) { - err_printf(m, " BBADDR: 0x%08x %08x\n", + err_printf(m, " BBADDR: 0x%08x_%08x\n", (u32)(ee->bbaddr>>32), (u32)ee->bbaddr); err_printf(m, " BB_STATE: 0x%08x\n", ee->bbstate); err_printf(m, " INSTPS: 0x%08x\n", ee->instps); @@ -677,7 +685,10 @@ i915_error_object_create(struct drm_i915_private *dev_priv, if (!dst) return NULL; - reloc_offset = dst->gtt_offset = vma->node.start; + dst->gtt_offset = vma->node.start; + dst->gtt_size = vma->node.size; + + reloc_offset = dst->gtt_offset; use_ggtt = (src->cache_level == I915_CACHE_NONE && (vma->flags & I915_VMA_GLOBAL_BIND) && reloc_offset + num_pages * PAGE_SIZE <= ggtt->mappable_end); -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
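The "quick check" the commit message describes amounts to a range test on the captured values. A sketch of how one might read the new fields out of an error state (illustrative consumer code, not part of the driver):

/* Sketch: does the recorded BBADDR (address of the last batchbuffer
 * instruction loaded) fall inside the captured batch object? */
static bool example_bbaddr_within_batch(const struct drm_i915_error_object *batch,
					u64 bbaddr)
{
	u64 start = batch->gtt_offset;
	u64 end = start + batch->gtt_size;

	return bbaddr >= start && bbaddr < end;
}

If the test fails, either the capture grabbed the wrong object or HEAD really has wandered off into the badlands.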
[Intel-gfx] [CI 18/32] drm/i915: Use VMA for ringbuffer tracking
Use the GGTT VMA as the primary cookie for handing ring objects as the most common action upon the ring is mapping and unmapping which act upon the VMA itself. By restructuring the code to work with the ring VMA, we can shrink the code and remove a few cycles from context pinning. v2: Move the flush of the object back to before the first pin. We use the am-I-bound? query to only have to check the flush on the first bind and so avoid stalling on active rings. Lots of little renames and small hoops. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c| 2 +- drivers/gpu/drm/i915/i915_gpu_error.c | 4 +- drivers/gpu/drm/i915/i915_guc_submission.c | 16 +- drivers/gpu/drm/i915/intel_lrc.c | 17 +- drivers/gpu/drm/i915/intel_ringbuffer.c| 243 ++--- drivers/gpu/drm/i915/intel_ringbuffer.h| 14 +- 6 files changed, 139 insertions(+), 157 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index f05f8504a4fa..9e44d9eb8e76 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -356,7 +356,7 @@ static int per_file_ctx_stats(int id, void *ptr, void *data) if (ctx->engine[n].state) per_file_stats(0, ctx->engine[n].state->obj, data); if (ctx->engine[n].ring) - per_file_stats(0, ctx->engine[n].ring->obj, data); + per_file_stats(0, ctx->engine[n].ring->vma->obj, data); } return 0; diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 61708faebf79..27f973fbe80f 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1128,12 +1128,12 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, ee->cpu_ring_tail = ring->tail; ee->ringbuffer = i915_error_ggtt_object_create(dev_priv, - ring->obj); + ring->vma->obj); } ee->hws_page = i915_error_ggtt_object_create(dev_priv, - engine->status_page.obj); + engine->status_page.vma->obj); ee->wa_ctx = i915_error_ggtt_object_create(dev_priv, engine->wa_ctx.obj); diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 4f0f173f9754..c40b92e212fa 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -343,7 +343,6 @@ static void guc_init_ctx_desc(struct intel_guc *guc, struct intel_context *ce = &ctx->engine[engine->id]; uint32_t guc_engine_id = engine->guc_id; struct guc_execlist_context *lrc = &desc.lrc[guc_engine_id]; - struct drm_i915_gem_object *obj; /* TODO: We have a design issue to be solved here. 
Only when we * receive the first batch, we know which engine is used by the @@ -358,17 +357,14 @@ static void guc_init_ctx_desc(struct intel_guc *guc, lrc->context_desc = lower_32_bits(ce->lrc_desc); /* The state page is after PPHWSP */ - gfx_addr = ce->state->node.start; - lrc->ring_lcra = gfx_addr + LRC_STATE_PN * PAGE_SIZE; + lrc->ring_lcra = + ce->state->node.start + LRC_STATE_PN * PAGE_SIZE; lrc->context_id = (client->ctx_index << GUC_ELC_CTXID_OFFSET) | (guc_engine_id << GUC_ELC_ENGINE_OFFSET); - obj = ce->ring->obj; - gfx_addr = i915_gem_obj_ggtt_offset(obj); - - lrc->ring_begin = gfx_addr; - lrc->ring_end = gfx_addr + obj->base.size - 1; - lrc->ring_next_free_location = gfx_addr; + lrc->ring_begin = ce->ring->vma->node.start; + lrc->ring_end = lrc->ring_begin + ce->ring->size - 1; + lrc->ring_next_free_location = lrc->ring_begin; lrc->ring_current_tail_pointer_value = 0; desc.engines_used |= (1 << guc_engine_id); @@ -943,7 +939,7 @@ static void guc_create_ads(struct intel_guc *guc) * to find it. */ engine = &dev_priv->engine[RCS]; - ads->golden_context_lrca = engine->status_page.gfx_addr; + ads->golden_context_lrca = engine->status_page.ggtt_offset; for_each_engine(engine, dev_priv) ads->eng_state_size[engine->guc_id] = intel_lr_context_size(engine); diff --git a/drivers/gpu/drm/i915/int
[Intel-gfx] [CI 15/32] drm/i915: Use VMA as the primary object for context state
When working with contexts, we most frequently want the GGTT VMA for the context state, first and foremost. Since the object is available via the VMA, we need only then store the VMA. v2: Formatting tweaks to debugfs output, restored some comments removed in the next patch Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c| 34 drivers/gpu/drm/i915/i915_drv.h| 3 +- drivers/gpu/drm/i915/i915_gem_context.c| 51 +--- drivers/gpu/drm/i915/i915_gpu_error.c | 7 ++-- drivers/gpu/drm/i915/i915_guc_submission.c | 6 +-- drivers/gpu/drm/i915/intel_lrc.c | 64 +++--- drivers/gpu/drm/i915/intel_ringbuffer.c| 6 +-- 7 files changed, 86 insertions(+), 85 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 0ae61e94ce04..f05f8504a4fa 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -354,7 +354,7 @@ static int per_file_ctx_stats(int id, void *ptr, void *data) for (n = 0; n < ARRAY_SIZE(ctx->engine); n++) { if (ctx->engine[n].state) - per_file_stats(0, ctx->engine[n].state, data); + per_file_stats(0, ctx->engine[n].state->obj, data); if (ctx->engine[n].ring) per_file_stats(0, ctx->engine[n].ring->obj, data); } @@ -1977,7 +1977,7 @@ static int i915_context_status(struct seq_file *m, void *unused) seq_printf(m, "%s: ", engine->name); seq_putc(m, ce->initialised ? 'I' : 'i'); if (ce->state) - describe_obj(m, ce->state); + describe_obj(m, ce->state->obj); if (ce->ring) describe_ctx_ring(m, ce->ring); seq_putc(m, '\n'); @@ -1995,36 +1995,34 @@ static void i915_dump_lrc_obj(struct seq_file *m, struct i915_gem_context *ctx, struct intel_engine_cs *engine) { - struct drm_i915_gem_object *ctx_obj = ctx->engine[engine->id].state; + struct i915_vma *vma = ctx->engine[engine->id].state; struct page *page; - uint32_t *reg_state; int j; - unsigned long ggtt_offset = 0; seq_printf(m, "CONTEXT: %s %u\n", engine->name, ctx->hw_id); - if (ctx_obj == NULL) { - seq_puts(m, "\tNot allocated\n"); + if (!vma) { + seq_puts(m, "\tFake context\n"); return; } - if (!i915_gem_obj_ggtt_bound(ctx_obj)) - seq_puts(m, "\tNot bound in GGTT\n"); - else - ggtt_offset = i915_gem_obj_ggtt_offset(ctx_obj); + if (vma->flags & I915_VMA_GLOBAL_BIND) + seq_printf(m, "\tBound in GGTT at 0x%08x\n", + lower_32_bits(vma->node.start)); - if (i915_gem_object_get_pages(ctx_obj)) { - seq_puts(m, "\tFailed to get pages for context object\n"); + if (i915_gem_object_get_pages(vma->obj)) { + seq_puts(m, "\tFailed to get pages for context object\n\n"); return; } - page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN); - if (!WARN_ON(page == NULL)) { - reg_state = kmap_atomic(page); + page = i915_gem_object_get_page(vma->obj, LRC_STATE_PN); + if (page) { + u32 *reg_state = kmap_atomic(page); for (j = 0; j < 0x600 / sizeof(u32) / 4; j += 4) { - seq_printf(m, "\t[0x%08lx] 0x%08x 0x%08x 0x%08x 0x%08x\n", - ggtt_offset + 4096 + (j * 4), + seq_printf(m, + "\t[0x%04x] 0x%08x 0x%08x 0x%08x 0x%08x\n", + j * 4, reg_state[j], reg_state[j + 1], reg_state[j + 2], reg_state[j + 3]); } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 3285c8e2c87a..259425d99e17 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -893,9 +893,8 @@ struct i915_gem_context { u32 ggtt_alignment; struct intel_context { - struct drm_i915_gem_object *state; + struct i915_vma *state; struct intel_ring *ring; - struct i915_vma *lrc_vma; uint32_t *lrc_reg_state; u64 lrc_desc; int pin_count; diff 
--git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 547caf26a6b9..3857ce097c84 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -155,7 +155
[Intel-gfx] [CI 25/32] drm/i915: Use VMA for wa_ctx tracking
Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gpu_error.c | 2 +- drivers/gpu/drm/i915/intel_lrc.c| 58 ++--- drivers/gpu/drm/i915/intel_ringbuffer.h | 4 +-- 3 files changed, 35 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 4068630bfc68..5e7734ca4579 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1134,7 +1134,7 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, engine->status_page.vma->obj); ee->wa_ctx = i915_error_ggtt_object_create(dev_priv, - engine->wa_ctx.obj); + engine->wa_ctx.vma->obj); count = 0; list_for_each_entry(request, &engine->request_list, link) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 56c904e2dc98..64cb04e63512 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1165,45 +1165,51 @@ static int gen9_init_perctx_bb(struct intel_engine_cs *engine, static int lrc_setup_wa_ctx_obj(struct intel_engine_cs *engine, u32 size) { - int ret; + struct drm_i915_gem_object *obj; + struct i915_vma *vma; + int err; - engine->wa_ctx.obj = i915_gem_object_create(&engine->i915->drm, - PAGE_ALIGN(size)); - if (IS_ERR(engine->wa_ctx.obj)) { - DRM_DEBUG_DRIVER("alloc LRC WA ctx backing obj failed.\n"); - ret = PTR_ERR(engine->wa_ctx.obj); - engine->wa_ctx.obj = NULL; - return ret; - } + obj = i915_gem_object_create(&engine->i915->drm, PAGE_ALIGN(size)); + if (IS_ERR(obj)) + return PTR_ERR(obj); - ret = i915_gem_object_ggtt_pin(engine->wa_ctx.obj, NULL, - 0, PAGE_SIZE, PIN_HIGH); - if (ret) { - DRM_DEBUG_DRIVER("pin LRC WA ctx backing obj failed: %d\n", -ret); - i915_gem_object_put(engine->wa_ctx.obj); - return ret; + vma = i915_vma_create(obj, &engine->i915->ggtt.base, NULL); + if (IS_ERR(vma)) { + err = PTR_ERR(vma); + goto err; } + err = i915_vma_pin(vma, 0, PAGE_SIZE, PIN_GLOBAL | PIN_HIGH); + if (err) + goto err; + + engine->wa_ctx.vma = vma; return 0; + +err: + i915_gem_object_put(obj); + return err; } static void lrc_destroy_wa_ctx_obj(struct intel_engine_cs *engine) { - if (engine->wa_ctx.obj) { - i915_gem_object_ggtt_unpin(engine->wa_ctx.obj); - i915_gem_object_put(engine->wa_ctx.obj); - engine->wa_ctx.obj = NULL; - } + struct i915_vma *vma; + + vma = fetch_and_zero(&engine->wa_ctx.vma); + if (!vma) + return; + + i915_vma_unpin(vma); + i915_vma_put(vma); } static int intel_init_workaround_bb(struct intel_engine_cs *engine) { - int ret; + struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx; uint32_t *batch; uint32_t offset; struct page *page; - struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx; + int ret; WARN_ON(engine->id != RCS); @@ -1226,7 +1232,7 @@ static int intel_init_workaround_bb(struct intel_engine_cs *engine) return ret; } - page = i915_gem_object_get_dirty_page(wa_ctx->obj, 0); + page = i915_gem_object_get_dirty_page(wa_ctx->vma->obj, 0); batch = kmap_atomic(page); offset = 0; @@ -2019,9 +2025,9 @@ populate_lr_context(struct i915_gem_context *ctx, RING_INDIRECT_CTX(engine->mmio_base), 0); ASSIGN_CTX_REG(reg_state, CTX_RCS_INDIRECT_CTX_OFFSET, RING_INDIRECT_CTX_OFFSET(engine->mmio_base), 0); - if (engine->wa_ctx.obj) { + if (engine->wa_ctx.vma) { struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx; - uint32_t ggtt_offset = i915_gem_obj_ggtt_offset(wa_ctx->obj); + u32 ggtt_offset = wa_ctx->vma->node.start; reg_state[CTX_RCS_INDIRECT_CTX+1] = (ggtt_offset + wa_ctx->indirect_ctx.offset * 
sizeof(uint32_t)) | diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index cb40785e7677..e3777572c70e 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -123,12 +123,12 @@ struct drm_i915_reg_table; *an option for fu
[Intel-gfx] [CI 24/32] drm/i915: Use VMA for render state page tracking
Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_render_state.c | 40 +++- drivers/gpu/drm/i915/i915_gem_render_state.h | 2 +- 2 files changed, 23 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c index 57fd767a2d79..95b7e9afd5f8 100644 --- a/drivers/gpu/drm/i915/i915_gem_render_state.c +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c @@ -30,8 +30,7 @@ struct render_state { const struct intel_renderstate_rodata *rodata; - struct drm_i915_gem_object *obj; - u64 ggtt_offset; + struct i915_vma *vma; u32 aux_batch_size; u32 aux_batch_offset; }; @@ -73,7 +72,7 @@ render_state_get_rodata(const struct drm_i915_gem_request *req) static int render_state_setup(struct render_state *so) { - struct drm_device *dev = so->obj->base.dev; + struct drm_device *dev = so->vma->vm->dev; const struct intel_renderstate_rodata *rodata = so->rodata; const bool has_64bit_reloc = INTEL_GEN(dev) >= 8; unsigned int i = 0, reloc_index = 0; @@ -81,18 +80,18 @@ static int render_state_setup(struct render_state *so) u32 *d; int ret; - ret = i915_gem_object_set_to_cpu_domain(so->obj, true); + ret = i915_gem_object_set_to_cpu_domain(so->vma->obj, true); if (ret) return ret; - page = i915_gem_object_get_dirty_page(so->obj, 0); + page = i915_gem_object_get_dirty_page(so->vma->obj, 0); d = kmap(page); while (i < rodata->batch_items) { u32 s = rodata->batch[i]; if (i * 4 == rodata->reloc[reloc_index]) { - u64 r = s + so->ggtt_offset; + u64 r = s + so->vma->node.start; s = lower_32_bits(r); if (has_64bit_reloc) { if (i + 1 >= rodata->batch_items || @@ -154,7 +153,7 @@ static int render_state_setup(struct render_state *so) kunmap(page); - ret = i915_gem_object_set_to_gtt_domain(so->obj, false); + ret = i915_gem_object_set_to_gtt_domain(so->vma->obj, false); if (ret) return ret; @@ -175,6 +174,7 @@ err_out: int i915_gem_render_state_init(struct drm_i915_gem_request *req) { struct render_state so; + struct drm_i915_gem_object *obj; int ret; if (WARN_ON(req->engine->id != RCS)) @@ -187,21 +187,25 @@ int i915_gem_render_state_init(struct drm_i915_gem_request *req) if (so.rodata->batch_items * 4 > 4096) return -EINVAL; - so.obj = i915_gem_object_create(&req->i915->drm, 4096); - if (IS_ERR(so.obj)) - return PTR_ERR(so.obj); + obj = i915_gem_object_create(&req->i915->drm, 4096); + if (IS_ERR(obj)) + return PTR_ERR(obj); - ret = i915_gem_object_ggtt_pin(so.obj, NULL, 0, 0, 0); - if (ret) + so.vma = i915_vma_create(obj, &req->i915->ggtt.base, NULL); + if (IS_ERR(so.vma)) { + ret = PTR_ERR(so.vma); goto err_obj; + } - so.ggtt_offset = i915_gem_obj_ggtt_offset(so.obj); + ret = i915_vma_pin(so.vma, 0, 0, PIN_GLOBAL); + if (ret) + goto err_obj; ret = render_state_setup(&so); if (ret) goto err_unpin; - ret = req->engine->emit_bb_start(req, so.ggtt_offset, + ret = req->engine->emit_bb_start(req, so.vma->node.start, so.rodata->batch_items * 4, I915_DISPATCH_SECURE); if (ret) @@ -209,7 +213,7 @@ int i915_gem_render_state_init(struct drm_i915_gem_request *req) if (so.aux_batch_size > 8) { ret = req->engine->emit_bb_start(req, -(so.ggtt_offset + +(so.vma->node.start + so.aux_batch_offset), so.aux_batch_size, I915_DISPATCH_SECURE); @@ -217,10 +221,10 @@ int i915_gem_render_state_init(struct drm_i915_gem_request *req) goto err_unpin; } - i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), req, 0); + i915_vma_move_to_active(so.vma, req, 0); err_unpin: - i915_gem_object_ggtt_unpin(so.obj); + i915_vma_unpin(so.vma); err_obj: 
- i915_gem_object_put(so.obj); + i915_gem_object_put(obj); return ret; } diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.h b/drivers/gpu/drm/i915/i915_gem_render_state.h index c44fca8599bb..18cce3f06e9c 100644 --- a/drivers/gpu/drm/i915/i915_gem_render_state.h +++ b/drivers/gpu/drm/i915/i915_gem_render_state.h @@ -24,7
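A condensed sketch of the flow this patch moves to, mirroring the calls in i915_gem_render_state_init() with the setup and error handling trimmed: the VMA is the handle used for pinning, for the batch address handed to the engine, and for retirement tracking.

/* Illustrative summary of the VMA-centric submission lifecycle, assuming
 * the usual i915 driver context. */
static int example_submit_via_vma(struct drm_i915_gem_request *req,
				  struct drm_i915_gem_object *obj, u32 len)
{
	struct i915_vma *vma;
	int ret;

	vma = i915_vma_create(obj, &req->i915->ggtt.base, NULL);
	if (IS_ERR(vma))
		return PTR_ERR(vma);

	ret = i915_vma_pin(vma, 0, 0, PIN_GLOBAL);
	if (ret)
		return ret;

	ret = req->engine->emit_bb_start(req, vma->node.start, len,
					 I915_DISPATCH_SECURE);
	if (ret == 0)
		i915_vma_move_to_active(vma, req, 0);

	i915_vma_unpin(vma);
	return ret;
}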
[Intel-gfx] [CI 22/32] drm/i915/overlay: Use VMA as the primary tracker for images
Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/intel_overlay.c | 39 1 file changed, 22 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c index 90f3ab424e01..d930e3a4a9cd 100644 --- a/drivers/gpu/drm/i915/intel_overlay.c +++ b/drivers/gpu/drm/i915/intel_overlay.c @@ -171,8 +171,8 @@ struct overlay_registers { struct intel_overlay { struct drm_i915_private *i915; struct intel_crtc *crtc; - struct drm_i915_gem_object *vid_bo; - struct drm_i915_gem_object *old_vid_bo; + struct i915_vma *vma; + struct i915_vma *old_vma; bool active; bool pfit_active; u32 pfit_vscale_ratio; /* shifted-point number, (1<<12) == 1.0 */ @@ -317,15 +317,17 @@ static void intel_overlay_release_old_vid_tail(struct i915_gem_active *active, { struct intel_overlay *overlay = container_of(active, typeof(*overlay), last_flip); - struct drm_i915_gem_object *obj = overlay->old_vid_bo; + struct i915_vma *vma; - i915_gem_track_fb(obj, NULL, - INTEL_FRONTBUFFER_OVERLAY(overlay->crtc->pipe)); + vma = fetch_and_zero(&overlay->old_vma); + if (WARN_ON(!vma)) + return; - i915_gem_object_ggtt_unpin(obj); - i915_gem_object_put(obj); + i915_gem_track_fb(vma->obj, NULL, + INTEL_FRONTBUFFER_OVERLAY(overlay->crtc->pipe)); - overlay->old_vid_bo = NULL; + i915_gem_object_unpin_from_display_plane(vma->obj, &i915_ggtt_view_normal); + i915_vma_put(vma); } static void intel_overlay_off_tail(struct i915_gem_active *active, @@ -333,15 +335,15 @@ static void intel_overlay_off_tail(struct i915_gem_active *active, { struct intel_overlay *overlay = container_of(active, typeof(*overlay), last_flip); - struct drm_i915_gem_object *obj = overlay->vid_bo; + struct i915_vma *vma; /* never have the overlay hw on without showing a frame */ - if (WARN_ON(!obj)) + vma = fetch_and_zero(&overlay->vma); + if (WARN_ON(!vma)) return; - i915_gem_object_ggtt_unpin(obj); - i915_gem_object_put(obj); - overlay->vid_bo = NULL; + i915_gem_object_unpin_from_display_plane(vma->obj, &i915_ggtt_view_normal); + i915_vma_put(vma); overlay->crtc->overlay = NULL; overlay->crtc = NULL; @@ -421,7 +423,7 @@ static int intel_overlay_release_old_vid(struct intel_overlay *overlay) /* Only wait if there is actually an old frame to release to * guarantee forward progress. 
*/ - if (!overlay->old_vid_bo) + if (!overlay->old_vma) return 0; if (I915_READ(ISR) & I915_OVERLAY_PLANE_FLIP_PENDING_INTERRUPT) { @@ -744,6 +746,7 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay, struct drm_i915_private *dev_priv = overlay->i915; u32 swidth, swidthsw, sheight, ostride; enum pipe pipe = overlay->crtc->pipe; + struct i915_vma *vma; lockdep_assert_held(&dev_priv->drm.struct_mutex); WARN_ON(!drm_modeset_is_locked(&dev_priv->drm.mode_config.connection_mutex)); @@ -757,6 +760,8 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay, if (ret != 0) return ret; + vma = i915_gem_obj_to_ggtt_view(new_bo, &i915_ggtt_view_normal); + ret = i915_gem_object_put_fence(new_bo); if (ret) goto out_unpin; @@ -834,11 +839,11 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay, if (ret) goto out_unpin; - i915_gem_track_fb(overlay->vid_bo, new_bo, + i915_gem_track_fb(overlay->vma->obj, new_bo, INTEL_FRONTBUFFER_OVERLAY(pipe)); - overlay->old_vid_bo = overlay->vid_bo; - overlay->vid_bo = new_bo; + overlay->old_vma = overlay->vma; + overlay->vma = vma; intel_frontbuffer_flip(dev_priv, INTEL_FRONTBUFFER_OVERLAY(pipe)); -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
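The overlay keeps two VMAs in flight: the frame currently scanned out and the previous one, which can only be released once the flip has completed. A sketch of that hand-off (illustrative; the real code is intel_overlay_do_put_image() and intel_overlay_release_old_vid_tail(), which additionally update frontbuffer tracking):

/* Sketch of the double-buffered VMA hand-off, assuming the usual i915
 * driver context and the struct intel_overlay fields from this patch. */
static void example_overlay_swap(struct intel_overlay *overlay,
				 struct i915_vma *new_vma)
{
	overlay->old_vma = overlay->vma;	/* retire later, from the flip tail */
	overlay->vma = new_vma;			/* frame now owned by the hardware */
}

static void example_overlay_retire_old(struct intel_overlay *overlay)
{
	struct i915_vma *vma;

	vma = fetch_and_zero(&overlay->old_vma);
	if (WARN_ON(!vma))
		return;

	i915_gem_object_unpin_from_display_plane(vma->obj, &i915_ggtt_view_normal);
	i915_vma_put(vma);
}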
[Intel-gfx] [CI 20/32] drm/i915: Move common scratch allocation/destroy to intel_engine_cs.c
Since the scratch allocation and cleanup is shared by all engine submission backends, move it out of the legacy intel_ringbuffer.c and into the new home for common routines, intel_engine_cs.c Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/intel_engine_cs.c | 50 + drivers/gpu/drm/i915/intel_lrc.c| 1 - drivers/gpu/drm/i915/intel_ringbuffer.c | 50 - drivers/gpu/drm/i915/intel_ringbuffer.h | 4 +-- 4 files changed, 51 insertions(+), 54 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c index 186c12d07f99..7104dec5e893 100644 --- a/drivers/gpu/drm/i915/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/intel_engine_cs.c @@ -195,6 +195,54 @@ void intel_engine_setup_common(struct intel_engine_cs *engine) i915_gem_batch_pool_init(engine, &engine->batch_pool); } +int intel_engine_create_scratch(struct intel_engine_cs *engine, int size) +{ + struct drm_i915_gem_object *obj; + struct i915_vma *vma; + int ret; + + WARN_ON(engine->scratch); + + obj = i915_gem_object_create_stolen(&engine->i915->drm, size); + if (!obj) + obj = i915_gem_object_create(&engine->i915->drm, size); + if (IS_ERR(obj)) { + DRM_ERROR("Failed to allocate scratch page\n"); + return PTR_ERR(obj); + } + + vma = i915_vma_create(obj, &engine->i915->ggtt.base, NULL); + if (IS_ERR(vma)) { + ret = PTR_ERR(vma); + goto err_unref; + } + + ret = i915_vma_pin(vma, 0, 4096, PIN_GLOBAL | PIN_HIGH); + if (ret) + goto err_unref; + + engine->scratch = vma; + DRM_DEBUG_DRIVER("%s pipe control offset: 0x%08llx\n", +engine->name, vma->node.start); + return 0; + +err_unref: + i915_gem_object_put(obj); + return ret; +} + +static void intel_engine_cleanup_scratch(struct intel_engine_cs *engine) +{ + struct i915_vma *vma; + + vma = fetch_and_zero(&engine->scratch); + if (!vma) + return; + + i915_vma_unpin(vma); + i915_vma_put(vma); +} + /** * intel_engines_init_common - initialize cengine state which might require hw access * @engine: Engine to initialize. 
@@ -226,6 +274,8 @@ int intel_engine_init_common(struct intel_engine_cs *engine) */ void intel_engine_cleanup_common(struct intel_engine_cs *engine) { + intel_engine_cleanup_scratch(engine); + intel_engine_cleanup_cmd_parser(engine); intel_engine_fini_breadcrumbs(engine); i915_gem_batch_pool_fini(&engine->batch_pool); diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 42999ba02152..56c904e2dc98 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1844,7 +1844,6 @@ int logical_render_ring_init(struct intel_engine_cs *engine) else engine->init_hw = gen8_init_render_ring; engine->init_context = gen8_init_rcs_context; - engine->cleanup = intel_engine_cleanup_scratch; engine->emit_flush = gen8_emit_flush_render; engine->emit_request = gen8_emit_request_render; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 7ce912f8d96c..c89aea55bc10 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -613,54 +613,6 @@ out: return ret; } -void intel_engine_cleanup_scratch(struct intel_engine_cs *engine) -{ - struct i915_vma *vma; - - vma = fetch_and_zero(&engine->scratch); - if (!vma) - return; - - i915_vma_unpin(vma); - i915_vma_put(vma); -} - -int intel_engine_create_scratch(struct intel_engine_cs *engine, int size) -{ - struct drm_i915_gem_object *obj; - struct i915_vma *vma; - int ret; - - WARN_ON(engine->scratch); - - obj = i915_gem_object_create_stolen(&engine->i915->drm, size); - if (!obj) - obj = i915_gem_object_create(&engine->i915->drm, size); - if (IS_ERR(obj)) { - DRM_ERROR("Failed to allocate scratch page\n"); - return PTR_ERR(obj); - } - - vma = i915_vma_create(obj, &engine->i915->ggtt.base, NULL); - if (IS_ERR(vma)) { - ret = PTR_ERR(vma); - goto err_unref; - } - - ret = i915_vma_pin(vma, 0, 4096, PIN_GLOBAL | PIN_HIGH); - if (ret) - goto err_unref; - - engine->scratch = vma; - DRM_DEBUG_DRIVER("%s pipe control offset: 0x%08llx\n", -engine->name, vma->node.start); - return 0; - -err_unref: - i915_gem_object_put(obj); - return ret; -} - static int intel_ring_workarounds_emit(struct drm_i915_gem_request *req) { struct intel_r
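A brief usage sketch for the relocated helper; the 4096-byte size is illustrative (the actual size depends on the engine and generation), and the debug print simply shows that the scratch page's GGTT address is now reachable through the stored VMA:

/* Sketch of a caller during engine init, assuming the usual i915 driver
 * context.  engine->scratch is the struct i915_vma * set up above. */
static int example_init_render_scratch(struct intel_engine_cs *engine)
{
	int ret;

	ret = intel_engine_create_scratch(engine, 4096);
	if (ret)
		return ret;

	DRM_DEBUG_DRIVER("scratch at GGTT offset 0x%08llx\n",
			 (u64)engine->scratch->node.start);
	return 0;
}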
[Intel-gfx] [CI 11/32] drm/i915: Add convenience wrappers for vma's object get/put
The VMA are unreferenced, they belong to the object and live until they are closed. However, if we want to use the VMA as a cookie and use it to keep the object alive, we want to hold onto a reference to the object for the lifetime of the VMA cookie. To facilitate this, add a couple of simple wrappers for managing the reference count on the object owning the VMA. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_drv.h| 12 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 ++-- 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 855833a6306a..3285c8e2c87a 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2373,6 +2373,18 @@ i915_gem_object_get_stride(struct drm_i915_gem_object *obj) return obj->tiling_and_stride & STRIDE_MASK; } +static inline struct i915_vma *i915_vma_get(struct i915_vma *vma) +{ + i915_gem_object_get(vma->obj); + return vma; +} + +static inline void i915_vma_put(struct i915_vma *vma) +{ + lockdep_assert_held(&vma->vm->dev->struct_mutex); + i915_gem_object_put(vma->obj); +} + /* * Optimised SGL iterator for GEM objects */ diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index c8d13fea4b25..ced05878b405 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -271,7 +271,7 @@ static void eb_destroy(struct eb_vmas *eb) exec_list); list_del_init(&vma->exec_list); i915_gem_execbuffer_unreserve_vma(vma); - i915_gem_object_put(vma->obj); + i915_vma_put(vma); } kfree(eb); } @@ -900,7 +900,7 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev, vma = list_first_entry(&eb->vmas, struct i915_vma, exec_list); list_del_init(&vma->exec_list); i915_gem_execbuffer_unreserve_vma(vma); - i915_gem_object_put(vma->obj); + i915_vma_put(vma); } mutex_unlock(&dev->struct_mutex); -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
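A sketch of the intended use: the helpers simply proxy the reference count of the owning object, so a component that stashes a VMA as a long-lived cookie takes a reference when storing it and drops it when done. The slot abstraction here is illustrative only:

/* Sketch, assuming the usual i915 driver context: keep the object alive
 * for as long as the VMA cookie is stored. */
static void example_stash_vma(struct i915_vma **slot, struct i915_vma *vma)
{
	*slot = i915_vma_get(vma);	/* pins the object's refcount, not the binding */
}

static void example_release_vma(struct i915_vma **slot)
{
	struct i915_vma *vma = *slot;

	*slot = NULL;
	if (vma)
		i915_vma_put(vma);	/* requires struct_mutex, see the lockdep assert */
}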
[Intel-gfx] [CI 27/32] drm/i915: Track pinned VMA
Treat the VMA as the primary struct responsible for tracking bindings into the GPU's VM. That is we want to treat the VMA returned after we pin an object into the VM as the cookie we hold and eventually release when unpinning. Doing so eliminates the ambiguity in pinning the object and then searching for the relevant pin later. v2: Joonas' stylistic nitpicks, a fun rebase. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c| 2 +- drivers/gpu/drm/i915/i915_drv.h| 60 ++-- drivers/gpu/drm/i915/i915_gem.c| 233 - drivers/gpu/drm/i915/i915_gem_execbuffer.c | 65 drivers/gpu/drm/i915/i915_gem_fence.c | 14 +- drivers/gpu/drm/i915/i915_gem_gtt.c| 73 + drivers/gpu/drm/i915/i915_gem_gtt.h| 14 -- drivers/gpu/drm/i915/i915_gem_request.c| 2 +- drivers/gpu/drm/i915/i915_gem_request.h| 2 +- drivers/gpu/drm/i915/i915_gem_stolen.c | 2 +- drivers/gpu/drm/i915/i915_gem_tiling.c | 2 +- drivers/gpu/drm/i915/i915_gpu_error.c | 58 +++ drivers/gpu/drm/i915/intel_display.c | 57 --- drivers/gpu/drm/i915/intel_drv.h | 5 +- drivers/gpu/drm/i915/intel_fbc.c | 2 +- drivers/gpu/drm/i915/intel_fbdev.c | 19 +-- drivers/gpu/drm/i915/intel_guc_loader.c| 21 +-- drivers/gpu/drm/i915/intel_overlay.c | 32 ++-- 18 files changed, 266 insertions(+), 397 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index cee15b3db6ed..6d73bdf069f0 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -105,7 +105,7 @@ static char get_tiling_flag(struct drm_i915_gem_object *obj) static char get_global_flag(struct drm_i915_gem_object *obj) { - return i915_gem_obj_to_ggtt(obj) ? 'g' : ' '; + return i915_gem_object_to_ggtt(obj, NULL) ? 'g' : ' '; } static char get_pin_mapped_flag(struct drm_i915_gem_object *obj) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 50dc3613c61c..bbee45acedeb 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3075,7 +3075,7 @@ struct drm_i915_gem_object *i915_gem_object_create_from_data( void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file); void i915_gem_free_object(struct drm_gem_object *obj); -int __must_check +struct i915_vma * __must_check i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj, const struct i915_ggtt_view *view, u64 size, @@ -3279,12 +3279,11 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write); int __must_check i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write); -int __must_check +struct i915_vma * __must_check i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, u32 alignment, const struct i915_ggtt_view *view); -void i915_gem_object_unpin_from_display_plane(struct drm_i915_gem_object *obj, - const struct i915_ggtt_view *view); +void i915_gem_object_unpin_from_display_plane(struct i915_vma *vma); int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align); int i915_gem_open(struct drm_device *dev, struct drm_file *file); @@ -3304,63 +3303,34 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, struct dma_buf *i915_gem_prime_export(struct drm_device *dev, struct drm_gem_object *gem_obj, int flags); -u64 i915_gem_obj_ggtt_offset_view(struct drm_i915_gem_object *o, - const struct i915_ggtt_view *view); -u64 i915_gem_obj_offset(struct drm_i915_gem_object *o, - struct i915_address_space *vm); -static inline u64 -i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o) 
-{ - return i915_gem_obj_ggtt_offset_view(o, &i915_ggtt_view_normal); -} - -bool i915_gem_obj_ggtt_bound_view(struct drm_i915_gem_object *o, - const struct i915_ggtt_view *view); -bool i915_gem_obj_bound(struct drm_i915_gem_object *o, - struct i915_address_space *vm); - struct i915_vma * i915_gem_obj_to_vma(struct drm_i915_gem_object *obj, - struct i915_address_space *vm); -struct i915_vma * -i915_gem_obj_to_ggtt_view(struct drm_i915_gem_object *obj, - const struct i915_ggtt_view *view); +struct i915_address_space *vm, +const struct i915_ggtt_view *view); struct i915_vma * i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj, - struct i915_address_space *vm); -st
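A sketch of the new calling convention from a user's point of view: pinning hands back the VMA, which is then the only cookie needed both for the offset and for unpinning later, with no separate "which binding did I pin?" lookup. The exact size/alignment/flags arguments below are illustrative; the full parameter list of i915_gem_object_ggtt_pin() is assumed from the header change above:

/* Sketch, assuming the usual i915 driver context. */
static int example_pin_and_use(struct drm_i915_gem_object *obj)
{
	struct i915_vma *vma;

	/* NULL view: normal GGTT view; 0/0: no size or alignment constraint */
	vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, PIN_MAPPABLE);
	if (IS_ERR(vma))
		return PTR_ERR(vma);

	DRM_DEBUG_DRIVER("pinned at 0x%08llx\n", (u64)vma->node.start);

	i915_vma_unpin(vma);
	return 0;
}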
[Intel-gfx] [CI 26/32] drm/i915: Consolidate i915_vma_unpin_and_release()
In a few places, we repeat a call to clear a pointer to a vma whilst unpinning and releasing a reference to its owner. Refactor those into a common function. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_gtt.c| 12 drivers/gpu/drm/i915/i915_gem_gtt.h| 1 + drivers/gpu/drm/i915/i915_guc_submission.c | 21 - drivers/gpu/drm/i915/intel_engine_cs.c | 9 + drivers/gpu/drm/i915/intel_lrc.c | 9 + drivers/gpu/drm/i915/intel_ringbuffer.c| 8 +--- 6 files changed, 20 insertions(+), 40 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 738a474c5afa..d15eb1d71341 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -3674,3 +3674,15 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma) __i915_vma_pin(vma); return ptr; } + +void i915_vma_unpin_and_release(struct i915_vma **p_vma) +{ + struct i915_vma *vma; + + vma = fetch_and_zero(p_vma); + if (!vma) + return; + + i915_vma_unpin(vma); + i915_vma_put(vma); +} diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index a2691943a404..ec538fcc9c20 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -232,6 +232,7 @@ struct i915_vma * i915_vma_create(struct drm_i915_gem_object *obj, struct i915_address_space *vm, const struct i915_ggtt_view *view); +void i915_vma_unpin_and_release(struct i915_vma **p_vma); static inline bool i915_vma_is_ggtt(const struct i915_vma *vma) { diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index c40b92e212fa..e7dbc64ec1da 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -653,19 +653,6 @@ err: return vma; } -/** - * guc_release_vma() - Release gem object allocated for GuC usage - * @vma: gem obj to be released - */ -static void guc_release_vma(struct i915_vma *vma) -{ - if (!vma) - return; - - i915_vma_unpin(vma); - i915_vma_put(vma); -} - static void guc_client_free(struct drm_i915_private *dev_priv, struct i915_guc_client *client) @@ -690,7 +677,7 @@ guc_client_free(struct drm_i915_private *dev_priv, kunmap(kmap_to_page(client->client_base)); } - guc_release_vma(client->vma); + i915_vma_unpin_and_release(&client->vma); if (client->ctx_index != GUC_INVALID_CTX_ID) { guc_fini_ctx_desc(guc, client); @@ -1048,12 +1035,12 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv) { struct intel_guc *guc = &dev_priv->guc; - guc_release_vma(fetch_and_zero(&guc->ads_vma)); - guc_release_vma(fetch_and_zero(&guc->log_vma)); + i915_vma_unpin_and_release(&guc->ads_vma); + i915_vma_unpin_and_release(&guc->log_vma); if (guc->ctx_pool_vma) ida_destroy(&guc->ctx_ids); - guc_release_vma(fetch_and_zero(&guc->ctx_pool_vma)); + i915_vma_unpin_and_release(&guc->ctx_pool_vma); } /** diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c index 573f642a74f8..f02d66bbec4b 100644 --- a/drivers/gpu/drm/i915/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/intel_engine_cs.c @@ -279,14 +279,7 @@ err_unref: static void intel_engine_cleanup_scratch(struct intel_engine_cs *engine) { - struct i915_vma *vma; - - vma = fetch_and_zero(&engine->scratch); - if (!vma) - return; - - i915_vma_unpin(vma); - i915_vma_put(vma); + i915_vma_unpin_and_release(&engine->scratch); } /** diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 64cb04e63512..2673fb4f817b 100644 --- 
a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1193,14 +1193,7 @@ err: static void lrc_destroy_wa_ctx_obj(struct intel_engine_cs *engine) { - struct i915_vma *vma; - - vma = fetch_and_zero(&engine->wa_ctx.vma); - if (!vma) - return; - - i915_vma_unpin(vma); - i915_vma_put(vma); + i915_vma_unpin_and_release(&engine->wa_ctx.vma); } static int intel_init_workaround_bb(struct intel_engine_cs *engine) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 30b066140b0c..65ef172e8761 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1257,14 +1257,8 @@ static int init_render_ring(struct intel_engine_cs *engine) static void render_ring_cleanup(struct intel_engine_cs *engine) { struct drm_i915_private *dev_priv = engine->i915; - struct i915_vma *vma; - - vma = fetch_a
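A sketch of the intended call sites (the owning struct and its member name are illustrative): anything that stored a pinned VMA cookie can now drop it with a single call, with the NULL check and pointer clearing folded in.

/* Sketch, assuming the usual i915 driver context. */
struct example_owner {
	struct i915_vma *vma;	/* pinned cookie, holds a reference to the object */
};

static void example_owner_fini(struct example_owner *owner)
{
	/* unpins, drops the object reference and clears owner->vma */
	i915_vma_unpin_and_release(&owner->vma);
}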
[Intel-gfx] [CI 32/32] drm/i915: Record the RING_MODE register for post-mortem debugging
Just another useful register to inspect following a GPU hang. v2: Remove partial decoding of RING_MODE to userspace, be consistent and use GEN > 2 guards around RING_MODE everywhere. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_gpu_error.c | 3 +++ drivers/gpu/drm/i915/intel_ringbuffer.c | 7 --- 3 files changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index bb7d8130dbfd..35caa9b2f36a 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -757,6 +757,7 @@ struct drm_i915_error_state { u32 tail; u32 head; u32 ctl; + u32 mode; u32 hws; u32 ipeir; u32 ipehr; diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 776818b86c0c..0c3f30ce85c3 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -236,6 +236,7 @@ static void error_print_engine(struct drm_i915_error_state_buf *m, err_printf(m, " HEAD: 0x%08x\n", ee->head); err_printf(m, " TAIL: 0x%08x\n", ee->tail); err_printf(m, " CTL: 0x%08x\n", ee->ctl); + err_printf(m, " MODE: 0x%08x\n", ee->mode); err_printf(m, " HWS: 0x%08x\n", ee->hws); err_printf(m, " ACTHD: 0x%08x %08x\n", (u32)(ee->acthd>>32), (u32)ee->acthd); @@ -1005,6 +1006,8 @@ static void error_record_engine_registers(struct drm_i915_error_state *error, ee->head = I915_READ_HEAD(engine); ee->tail = I915_READ_TAIL(engine); ee->ctl = I915_READ_CTL(engine); + if (INTEL_GEN(dev_priv) > 2) + ee->mode = I915_READ_MODE(engine); if (I915_NEED_GFX_HWS(dev_priv)) { i915_reg_t mmio; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index e3327a2ac6e1..fa22bd87bab0 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -498,7 +498,7 @@ static bool stop_ring(struct intel_engine_cs *engine) { struct drm_i915_private *dev_priv = engine->i915; - if (!IS_GEN2(dev_priv)) { + if (INTEL_GEN(dev_priv) > 2) { I915_WRITE_MODE(engine, _MASKED_BIT_ENABLE(STOP_RING)); if (intel_wait_for_register(dev_priv, RING_MI_MODE(engine->mmio_base), @@ -520,7 +520,7 @@ static bool stop_ring(struct intel_engine_cs *engine) I915_WRITE_HEAD(engine, 0); I915_WRITE_TAIL(engine, 0); - if (!IS_GEN2(dev_priv)) { + if (INTEL_GEN(dev_priv) > 2) { (void)I915_READ_CTL(engine); I915_WRITE_MODE(engine, _MASKED_BIT_DISABLE(STOP_RING)); } @@ -2142,7 +2142,8 @@ void intel_engine_cleanup(struct intel_engine_cs *engine) dev_priv = engine->i915; if (engine->buffer) { - WARN_ON(!IS_GEN2(dev_priv) && (I915_READ_MODE(engine) & MODE_IDLE) == 0); + WARN_ON(INTEL_GEN(dev_priv) > 2 && + (I915_READ_MODE(engine) & MODE_IDLE) == 0); intel_ring_unpin(engine->buffer); intel_ring_free(engine->buffer); -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [CI 28/32] drm/i915: Introduce i915_ggtt_offset()
This little helper only exists to safely discard the upper unused 32bits of the general 64-bit VMA address - as we know that all Global GTT currently are less than 4GiB in size and so that the upper bits must be zero. In many places, we use a u32 for the global GTT offset and we want to document where we are discarding the full VMA offset. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c| 2 +- drivers/gpu/drm/i915/i915_drv.h| 2 +- drivers/gpu/drm/i915/i915_gem.c| 11 +-- drivers/gpu/drm/i915/i915_gem_context.c| 6 -- drivers/gpu/drm/i915/i915_gem_gtt.h| 9 + drivers/gpu/drm/i915/i915_guc_submission.c | 15 --- drivers/gpu/drm/i915/intel_display.c | 10 +++--- drivers/gpu/drm/i915/intel_engine_cs.c | 4 ++-- drivers/gpu/drm/i915/intel_fbdev.c | 6 +++--- drivers/gpu/drm/i915/intel_guc_loader.c| 6 +++--- drivers/gpu/drm/i915/intel_lrc.c | 20 +++- drivers/gpu/drm/i915/intel_overlay.c | 10 ++ drivers/gpu/drm/i915/intel_ringbuffer.c| 28 ++-- 13 files changed, 70 insertions(+), 59 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 6d73bdf069f0..f9bedcb1d9d0 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2008,7 +2008,7 @@ static void i915_dump_lrc_obj(struct seq_file *m, if (vma->flags & I915_VMA_GLOBAL_BIND) seq_printf(m, "\tBound in GGTT at 0x%08x\n", - lower_32_bits(vma->node.start)); + i915_ggtt_offset(vma)); if (i915_gem_object_get_pages(vma->obj)) { seq_puts(m, "\tFailed to get pages for context object\n\n"); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index bbee45acedeb..bd58878de77b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3330,7 +3330,7 @@ static inline unsigned long i915_gem_object_ggtt_offset(struct drm_i915_gem_object *o, const struct i915_ggtt_view *view) { - return i915_gem_object_to_ggtt(o, view)->node.start; + return i915_ggtt_offset(i915_gem_object_to_ggtt(o, view)); } /* i915_gem_fence.c */ diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 685253a1323b..7e08c774a1aa 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -767,7 +767,7 @@ i915_gem_gtt_pread(struct drm_device *dev, i915_gem_object_pin_pages(obj); } else { - node.start = vma->node.start; + node.start = i915_ggtt_offset(vma); node.allocated = false; ret = i915_gem_object_put_fence(obj); if (ret) @@ -1071,7 +1071,7 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915, i915_gem_object_pin_pages(obj); } else { - node.start = vma->node.start; + node.start = i915_ggtt_offset(vma); node.allocated = false; ret = i915_gem_object_put_fence(obj); if (ret) @@ -1712,7 +1712,7 @@ int i915_gem_fault(struct vm_area_struct *area, struct vm_fault *vmf) goto err_unpin; /* Finally, remap it using the new GTT offset */ - pfn = ggtt->mappable_base + vma->node.start; + pfn = ggtt->mappable_base + i915_ggtt_offset(vma); pfn >>= PAGE_SHIFT; if (unlikely(view.type == I915_GGTT_VIEW_PARTIAL)) { @@ -3759,10 +3759,9 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj, WARN(i915_vma_is_pinned(vma), "bo is already pinned in ggtt with incorrect alignment:" -" offset=%08x %08x, req.alignment=%llx, req.map_and_fenceable=%d," +" offset=%08x, req.alignment=%llx, req.map_and_fenceable=%d," " obj->map_and_fenceable=%d\n", -upper_32_bits(vma->node.start), -lower_32_bits(vma->node.start), +i915_ggtt_offset(vma), alignment, !!(flags & PIN_MAPPABLE), 
obj->map_and_fenceable); diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index e566167d9441..98d2956f91f4 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -631,7 +631,8 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags) intel_ring_emit(ring, MI_NOOP); intel_ring_emit(ring, MI_SET_CONTEXT); - intel_ring_emit(ring, req->ctx->engine[RCS].state->node.start | flags); + intel_ring_emit(ring, + i915_ggtt_offset(req->ctx->engine[RCS].state) | flags);
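The body of the new helper is not quoted above, so here is a minimal sketch of what an inline like this might look like, based only on the description in the commit message (an illustration, not the patch's final form):

static inline u32 i915_ggtt_offset(const struct i915_vma *vma)
{
	/* Global GTT VMAs are assumed to sit below 4GiB, so the upper 32 bits
	 * of the node offset must be zero and can be safely discarded.
	 */
	GEM_BUG_ON(upper_32_bits(vma->node.start));
	return lower_32_bits(vma->node.start);
}

This also documents every place where a 64-bit VMA offset is deliberately narrowed to u32, which is the point of the conversions shown in the hunks above.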
[Intel-gfx] [CI 23/32] drm/i915: Use VMA as the primary tracker for semaphore page
Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c | 2 +- drivers/gpu/drm/i915/i915_drv.h | 4 +-- drivers/gpu/drm/i915/i915_gpu_error.c | 16 - drivers/gpu/drm/i915/intel_engine_cs.c | 12 --- drivers/gpu/drm/i915/intel_ringbuffer.c | 60 +++-- drivers/gpu/drm/i915/intel_ringbuffer.h | 4 +-- 6 files changed, 55 insertions(+), 43 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 9e44d9eb8e76..cee15b3db6ed 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -3198,7 +3198,7 @@ static int i915_semaphore_status(struct seq_file *m, void *unused) struct page *page; uint64_t *seqno; - page = i915_gem_object_get_page(dev_priv->semaphore_obj, 0); + page = i915_gem_object_get_page(dev_priv->semaphore->obj, 0); seqno = (uint64_t *)kmap_atomic(page); for_each_engine_id(engine, dev_priv, id) { diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 259425d99e17..50dc3613c61c 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -733,7 +733,7 @@ struct drm_i915_error_state { u64 fence[I915_MAX_NUM_FENCES]; struct intel_overlay_error_state *overlay; struct intel_display_error_state *display; - struct drm_i915_error_object *semaphore_obj; + struct drm_i915_error_object *semaphore; struct drm_i915_error_engine { int engine_id; @@ -1750,7 +1750,7 @@ struct drm_i915_private { struct pci_dev *bridge_dev; struct i915_gem_context *kernel_context; struct intel_engine_cs engine[I915_NUM_ENGINES]; - struct drm_i915_gem_object *semaphore_obj; + struct i915_vma *semaphore; u32 next_seqno; struct drm_dma_handle *status_page_dmah; diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index c327733e6735..4068630bfc68 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -549,7 +549,7 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, } } - if ((obj = error->semaphore_obj)) { + if ((obj = error->semaphore)) { err_printf(m, "Semaphore page = 0x%08x\n", lower_32_bits(obj->gtt_offset)); for (elt = 0; elt < PAGE_SIZE/16; elt += 4) { @@ -640,7 +640,7 @@ static void i915_error_state_free(struct kref *error_ref) kfree(ee->waiters); } - i915_error_object_free(error->semaphore_obj); + i915_error_object_free(error->semaphore); for (i = 0; i < ARRAY_SIZE(error->active_bo); i++) kfree(error->active_bo[i]); @@ -876,7 +876,7 @@ static void gen8_record_semaphore_state(struct drm_i915_error_state *error, struct intel_engine_cs *to; enum intel_engine_id id; - if (!error->semaphore_obj) + if (!error->semaphore) return; for_each_engine_id(to, dev_priv, id) { @@ -889,7 +889,7 @@ static void gen8_record_semaphore_state(struct drm_i915_error_state *error, signal_offset = (GEN8_SIGNAL_OFFSET(engine, id) & (PAGE_SIZE - 1)) / 4; - tmp = error->semaphore_obj->pages[0]; + tmp = error->semaphore->pages[0]; idx = intel_engine_sync_index(engine, to); ee->semaphore_mboxes[idx] = tmp[signal_offset]; @@ -1061,11 +1061,9 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, struct drm_i915_gem_request *request; int i, count; - if (dev_priv->semaphore_obj) { - error->semaphore_obj = - i915_error_ggtt_object_create(dev_priv, - dev_priv->semaphore_obj); - } + error->semaphore = + i915_error_ggtt_object_create(dev_priv, + dev_priv->semaphore->obj); for (i = 0; i < I915_NUM_ENGINES; i++) { struct intel_engine_cs *engine = &dev_priv->engine[i]; diff 
--git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c index 829624571ca4..573f642a74f8 100644 --- a/drivers/gpu/drm/i915/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/intel_engine_cs.c @@ -179,12 +179,16 @@ void intel_engine_init_seqno(struct intel_engine_cs *engine, u32 seqno) if (HAS_VEBOX(dev_priv)) I915_WRITE(RING_SYNC_2(engine->mmio_base), 0); } - if (dev_priv->semaphore_obj) { - struct drm_i915_gem_object *obj = dev_priv->semaphore_obj; - struct page *page = i915_gem_object_get_dirty_page(obj, 0); -
[Intel-gfx] [CI 31/32] drm/i915: Only record active and pending requests upon a GPU hang
There is no other state pertaining to the completed requests in the hang, other than gleamed through the ringbuffer, so including the expired requests in the list of outstanding requests simply adds noise. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/i915_gpu_error.c | 109 +++--- 1 file changed, 61 insertions(+), 48 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 0f0b65214ef1..776818b86c0c 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1060,12 +1060,68 @@ static void error_record_engine_registers(struct drm_i915_error_state *error, } } +static void engine_record_requests(struct intel_engine_cs *engine, + struct drm_i915_gem_request *first, + struct drm_i915_error_engine *ee) +{ + struct drm_i915_gem_request *request; + int count; + + count = 0; + request = first; + list_for_each_entry_from(request, &engine->request_list, link) + count++; + if (!count) + return; + + ee->requests = kcalloc(count, sizeof(*ee->requests), GFP_ATOMIC); + if (!ee->requests) + return; + + ee->num_requests = count; + + count = 0; + request = first; + list_for_each_entry_from(request, &engine->request_list, link) { + struct drm_i915_error_request *erq; + + if (count >= ee->num_requests) { + /* +* If the ring request list was changed in +* between the point where the error request +* list was created and dimensioned and this +* point then just exit early to avoid crashes. +* +* We don't need to communicate that the +* request list changed state during error +* state capture and that the error state is +* slightly incorrect as a consequence since we +* are typically only interested in the request +* list state at the point of error state +* capture, not in any changes happening during +* the capture. +*/ + break; + } + + erq = &ee->requests[count++]; + erq->seqno = request->fence.seqno; + erq->jiffies = request->emitted_jiffies; + erq->head = request->head; + erq->tail = request->tail; + + rcu_read_lock(); + erq->pid = request->ctx->pid ? 
pid_nr(request->ctx->pid) : 0; + rcu_read_unlock(); + } + ee->num_requests = count; +} + static void i915_gem_record_rings(struct drm_i915_private *dev_priv, struct drm_i915_error_state *error) { struct i915_ggtt *ggtt = &dev_priv->ggtt; - struct drm_i915_gem_request *request; - int i, count; + int i; error->semaphore = i915_error_object_create(dev_priv, dev_priv->semaphore); @@ -1073,6 +1129,7 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, for (i = 0; i < I915_NUM_ENGINES; i++) { struct intel_engine_cs *engine = &dev_priv->engine[i]; struct drm_i915_error_engine *ee = &error->engine[i]; + struct drm_i915_gem_request *request; ee->pid = -1; ee->engine_id = -1; @@ -1131,6 +1188,8 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, ee->cpu_ring_tail = ring->tail; ee->ringbuffer = i915_error_object_create(dev_priv, ring->vma); + + engine_record_requests(engine, request, ee); } ee->hws_page = @@ -1139,52 +1198,6 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, ee->wa_ctx = i915_error_object_create(dev_priv, engine->wa_ctx.vma); - - count = 0; - list_for_each_entry(request, &engine->request_list, link) - count++; - - ee->num_requests = count; - ee->requests = - kcalloc(count, sizeof(*ee->requests), GFP_ATOMIC); - if (!ee->requests) { - ee->num_requests = 0; - continue; - } - - count = 0; - list_for_each_entry(request, &engine->request_list, link) { - struct drm_i915_error_request *erq; - - if (count >= ee->num_reques
[Intel-gfx] [CI 10/32] drm/i915: Add fetch_and_zero() macro
A simple little macro to clear a pointer and return the old value. This is useful for writing value = *ptr; if (!value) return; *ptr = 0; ... free(value); in a slightly more concise form: value = fetch_and_zero(ptr); if (!value) return; ... free(value); with the idea that this establishes a pattern that may be extended for atomic use (using xchg or cmpxchg) i.e. atomic_fetch_and_zero() and similar to llist. Signed-off-by: Chris Wilson Cc: Joonas Lahtinen Cc: Daniel Vetter Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_drv.h | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 25b1e6c010d5..855833a6306a 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3920,4 +3920,10 @@ bool i915_memcpy_from_wc(void *dst, const void *src, unsigned long len); #define ptr_pack_bits(ptr, bits) \ ((typeof(ptr))((unsigned long)(ptr) | (bits))) +#define fetch_and_zero(ptr) ({ \ + typeof(*ptr) __T = *(ptr); \ + *(ptr) = (typeof(*ptr))0; \ + __T;\ +}) + #endif -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
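As a usage sketch only (the helper below is illustrative, loosely mirroring the lrc_destroy_wa_ctx_obj() conversion elsewhere in the series, and is not itself part of this patch), the pattern reads:

static void release_vma(struct i915_vma **p)
{
	struct i915_vma *vma;

	vma = fetch_and_zero(p);	/* read *p and clear it in one step */
	if (!vma)
		return;

	i915_vma_unpin(vma);
	i915_vma_put(vma);
}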
[Intel-gfx] [CI 29/32] drm/i915: Move debug only per-request pid tracking from request to ctx
Since contexts are not currently shared between userspace processes, we have an exact correspondence between context creator and guilty batch submitter. Therefore we can save some per-batch work by inspecting the context->pid upon error instead. Note that we take the context's creator's pid rather than the file's pid in order to better track fd passed over sockets. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c | 25 - drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_gem_context.c | 4 drivers/gpu/drm/i915/i915_gem_request.c | 6 -- drivers/gpu/drm/i915/i915_gem_request.h | 3 --- drivers/gpu/drm/i915/i915_gpu_error.c | 13 ++--- 6 files changed, 32 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index f9bedcb1d9d0..b89478a8d19a 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -460,6 +460,8 @@ static int i915_gem_object_info(struct seq_file *m, void* data) print_context_stats(m, dev_priv); list_for_each_entry_reverse(file, &dev->filelist, lhead) { struct file_stats stats; + struct drm_i915_file_private *file_priv = file->driver_priv; + struct drm_i915_gem_request *request; struct task_struct *task; memset(&stats, 0, sizeof(stats)); @@ -473,10 +475,17 @@ static int i915_gem_object_info(struct seq_file *m, void* data) * still alive (e.g. get_pid(current) => fork() => exit()). * Therefore, we need to protect this ->comm access using RCU. */ + mutex_lock(&dev->struct_mutex); + request = list_first_entry_or_null(&file_priv->mm.request_list, + struct drm_i915_gem_request, + client_list); rcu_read_lock(); - task = pid_task(file->pid, PIDTYPE_PID); + task = pid_task(request && request->ctx->pid ? + request->ctx->pid : file->pid, + PIDTYPE_PID); print_file_stats(m, task ? task->comm : "", stats); rcu_read_unlock(); + mutex_unlock(&dev->struct_mutex); } mutex_unlock(&dev->filelist_mutex); @@ -658,12 +667,11 @@ static int i915_gem_request_info(struct seq_file *m, void *data) seq_printf(m, "%s requests: %d\n", engine->name, count); list_for_each_entry(req, &engine->request_list, link) { + struct pid *pid = req->ctx->pid; struct task_struct *task; rcu_read_lock(); - task = NULL; - if (req->pid) - task = pid_task(req->pid, PIDTYPE_PID); + task = pid ? 
pid_task(pid, PIDTYPE_PID) : NULL; seq_printf(m, "%x @ %d: %s [%d]\n", req->fence.seqno, (int) (jiffies - req->emitted_jiffies), @@ -1952,18 +1960,17 @@ static int i915_context_status(struct seq_file *m, void *unused) list_for_each_entry(ctx, &dev_priv->context_list, link) { seq_printf(m, "HW context %u ", ctx->hw_id); - if (IS_ERR(ctx->file_priv)) { - seq_puts(m, "(deleted) "); - } else if (ctx->file_priv) { - struct pid *pid = ctx->file_priv->file->pid; + if (ctx->pid) { struct task_struct *task; - task = get_pid_task(pid, PIDTYPE_PID); + task = get_pid_task(ctx->pid, PIDTYPE_PID); if (task) { seq_printf(m, "(%s [%d]) ", task->comm, task->pid); put_task_struct(task); } + } else if (IS_ERR(ctx->file_priv)) { + seq_puts(m, "(deleted) "); } else { seq_puts(m, "(kernel) "); } diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index bd58878de77b..d9f29244bafb 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -782,6 +782,7 @@ struct drm_i915_error_state { struct drm_i915_error_request { long jiffies; + pid_t pid; u32 seqno; u32 head; u32 tail; @@ -880,6 +881,7 @@ struct i915_gem_context { struct drm_i915_private *i915; struct drm_i915_file_private *file_priv; struct
[Intel-gfx] [CI 12/32] drm/i915: Track pinned vma inside guc
Since the guc allocates and pins and object into the GGTT for its usage, it is more natural to use that pinned VMA as our resource cookie. v2: Embrace naming tautology v3: Rewrite comments for guc_allocate_vma() Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_debugfs.c| 10 +- drivers/gpu/drm/i915/i915_gem_gtt.h| 6 ++ drivers/gpu/drm/i915/i915_guc_submission.c | 144 ++--- drivers/gpu/drm/i915/intel_guc.h | 9 +- drivers/gpu/drm/i915/intel_guc_loader.c| 7 +- 5 files changed, 90 insertions(+), 86 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 77a9c56ad25f..0ae61e94ce04 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2570,15 +2570,15 @@ static int i915_guc_log_dump(struct seq_file *m, void *data) struct drm_info_node *node = m->private; struct drm_device *dev = node->minor->dev; struct drm_i915_private *dev_priv = to_i915(dev); - struct drm_i915_gem_object *log_obj = dev_priv->guc.log_obj; - u32 *log; + struct drm_i915_gem_object *obj; int i = 0, pg; - if (!log_obj) + if (!dev_priv->guc.log_vma) return 0; - for (pg = 0; pg < log_obj->base.size / PAGE_SIZE; pg++) { - log = kmap_atomic(i915_gem_object_get_page(log_obj, pg)); + obj = dev_priv->guc.log_vma->obj; + for (pg = 0; pg < obj->base.size / PAGE_SIZE; pg++) { + u32 *log = kmap_atomic(i915_gem_object_get_page(obj, pg)); for (i = 0; i < PAGE_SIZE / sizeof(u32); i += 4) seq_printf(m, "0x%08x 0x%08x 0x%08x 0x%08x\n", diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index f2769e01cc8c..a2691943a404 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -716,4 +716,10 @@ static inline void i915_vma_unpin_iomap(struct i915_vma *vma) i915_vma_unpin(vma); } +static inline struct page *i915_vma_first_page(struct i915_vma *vma) +{ + GEM_BUG_ON(!vma->pages); + return sg_page(vma->pages->sgl); +} + #endif diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 6831321a9c8c..29de8cec1b58 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -183,7 +183,7 @@ static int guc_update_doorbell_id(struct intel_guc *guc, struct i915_guc_client *client, u16 new_id) { - struct sg_table *sg = guc->ctx_pool_obj->pages; + struct sg_table *sg = guc->ctx_pool_vma->pages; void *doorbell_bitmap = guc->doorbell_bitmap; struct guc_doorbell_info *doorbell; struct guc_context_desc desc; @@ -325,7 +325,6 @@ static void guc_init_proc_desc(struct intel_guc *guc, static void guc_init_ctx_desc(struct intel_guc *guc, struct i915_guc_client *client) { - struct drm_i915_gem_object *client_obj = client->client_obj; struct drm_i915_private *dev_priv = guc_to_i915(guc); struct intel_engine_cs *engine; struct i915_gem_context *ctx = client->owner; @@ -383,8 +382,8 @@ static void guc_init_ctx_desc(struct intel_guc *guc, * The doorbell, process descriptor, and workqueue are all parts * of the client object, which the GuC will reference via the GGTT */ - gfx_addr = i915_gem_obj_ggtt_offset(client_obj); - desc.db_trigger_phy = sg_dma_address(client_obj->pages->sgl) + + gfx_addr = client->vma->node.start; + desc.db_trigger_phy = sg_dma_address(client->vma->pages->sgl) + client->doorbell_offset; desc.db_trigger_cpu = (uintptr_t)client->client_base + client->doorbell_offset; @@ -400,7 +399,7 @@ static void guc_init_ctx_desc(struct intel_guc *guc, desc.desc_private = 
(uintptr_t)client; /* Pool context is pinned already */ - sg = guc->ctx_pool_obj->pages; + sg = guc->ctx_pool_vma->pages; sg_pcopy_from_buffer(sg->sgl, sg->nents, &desc, sizeof(desc), sizeof(desc) * client->ctx_index); } @@ -413,7 +412,7 @@ static void guc_fini_ctx_desc(struct intel_guc *guc, memset(&desc, 0, sizeof(desc)); - sg = guc->ctx_pool_obj->pages; + sg = guc->ctx_pool_vma->pages; sg_pcopy_from_buffer(sg->sgl, sg->nents, &desc, sizeof(desc), sizeof(desc) * client->ctx_index); } @@ -496,7 +495,7 @@ static void guc_add_workqueue_item(struct i915_guc_client *gc, /* WQ starts from the page after doorbell / process_desc */ wq_page = (wq_off + GUC_DB_SIZE) >> PAGE_SHIFT; wq_off &= PAGE_SIZE - 1; - base = kmap_atomic(i
[Intel-gfx] [CI 19/32] drm/i915: Use VMA for scratch page tracking
Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_context.c | 2 +- drivers/gpu/drm/i915/i915_gpu_error.c | 2 +- drivers/gpu/drm/i915/intel_display.c| 2 +- drivers/gpu/drm/i915/intel_lrc.c| 18 +-- drivers/gpu/drm/i915/intel_ringbuffer.c | 55 +++-- drivers/gpu/drm/i915/intel_ringbuffer.h | 10 ++ 6 files changed, 46 insertions(+), 43 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 824dfe14bcd0..e566167d9441 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -660,7 +660,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags) MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT); intel_ring_emit_reg(ring, last_reg); - intel_ring_emit(ring, engine->scratch.gtt_offset); + intel_ring_emit(ring, engine->scratch->node.start); intel_ring_emit(ring, MI_NOOP); } intel_ring_emit(ring, MI_ARB_ON_OFF | MI_ARB_ENABLE); diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 27f973fbe80f..c327733e6735 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1101,7 +1101,7 @@ static void i915_gem_record_rings(struct drm_i915_private *dev_priv, if (HAS_BROKEN_CS_TLB(dev_priv)) ee->wa_batchbuffer = i915_error_ggtt_object_create(dev_priv, - engine->scratch.obj); + engine->scratch->obj); if (request->ctx->engine[i].state) { ee->ctx = i915_error_ggtt_object_create(dev_priv, diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 90309a9858b2..9d18f34f7ce5 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -11795,7 +11795,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev, intel_ring_emit(ring, MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT); intel_ring_emit_reg(ring, DERRMR); - intel_ring_emit(ring, req->engine->scratch.gtt_offset + 256); + intel_ring_emit(ring, req->engine->scratch->node.start + 256); if (IS_GEN8(dev)) { intel_ring_emit(ring, 0); intel_ring_emit(ring, MI_NOOP); diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 73dd2f9e0547..42999ba02152 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -914,7 +914,7 @@ static inline int gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, wa_ctx_emit(batch, index, (MI_STORE_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT)); wa_ctx_emit_reg(batch, index, GEN8_L3SQCREG4); - wa_ctx_emit(batch, index, engine->scratch.gtt_offset + 256); + wa_ctx_emit(batch, index, engine->scratch->node.start + 256); wa_ctx_emit(batch, index, 0); wa_ctx_emit(batch, index, MI_LOAD_REGISTER_IMM(1)); @@ -932,7 +932,7 @@ static inline int gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, wa_ctx_emit(batch, index, (MI_LOAD_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT)); wa_ctx_emit_reg(batch, index, GEN8_L3SQCREG4); - wa_ctx_emit(batch, index, engine->scratch.gtt_offset + 256); + wa_ctx_emit(batch, index, engine->scratch->node.start + 256); wa_ctx_emit(batch, index, 0); return index; @@ -993,7 +993,7 @@ static int gen8_init_indirectctx_bb(struct intel_engine_cs *engine, /* WaClearSlmSpaceAtContextSwitch:bdw,chv */ /* Actual scratch location is at 128 bytes offset */ - scratch_addr = engine->scratch.gtt_offset + 2*CACHELINE_BYTES; + scratch_addr = engine->scratch->node.start + 2 * CACHELINE_BYTES; wa_ctx_emit(batch, index, GFX_OP_PIPE_CONTROL(6)); wa_ctx_emit(batch, index, 
(PIPE_CONTROL_FLUSH_L3 | @@ -1072,8 +1072,8 @@ static int gen9_init_indirectctx_bb(struct intel_engine_cs *engine, /* WaClearSlmSpaceAtContextSwitch:kbl */ /* Actual scratch location is at 128 bytes offset */ if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_A0)) { - uint32_t scratch_addr - = engine->scratch.gtt_offset + 2*CACHELINE_BYTES; + u32 scratch_addr = + engine->scratch->node.start + 2
Re: [Intel-gfx] [PATCH 03/10] drm/i915: Move fence tracking from object to vma
On Mon, Aug 15, 2016 at 12:18:20PM +0300, Joonas Lahtinen wrote: > On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote: > > @@ -1131,15 +1131,11 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private > > *i915, > > } else { > > node.start = i915_ggtt_offset(vma); > > node.allocated = false; > > - ret = i915_gem_object_put_fence(obj); > > + ret = i915_vma_put_fence(vma); > > if (ret) > > goto out_unpin; > > } > > > > - ret = i915_gem_object_set_to_gtt_domain(obj, true); > > - if (ret) > > - goto out_unpin; > > - > > This is a somewhat an unexpected change in here. Care to explain? Spontaneous disappearance due to rebasing. Pops back into existence again later! -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm: make drm_get_format_name thread-safe
On Mon, 15 Aug 2016, Eric Engestrom wrote: > Signed-off-by: Eric Engestrom > --- > > I moved the main bits to be the first diffs, shouldn't affect anything > when applying the patch, but I wanted to ask: > I don't like the hard-coded `32` the appears in both kmalloc() and > snprintf(), what do you think? If you don't like it either, what would > you suggest? Should I #define it? > > Second question is about the patch mail itself: should I send this kind > of patch separated by module, with a note requesting them to be squashed > when applying? It has to land as a single patch, but for review it might > be easier if people only see the bits they each care about, as well as > to collect ack's/r-b's. > > Cheers, > Eric > > --- > drivers/gpu/drm/amd/amdgpu/dce_v10_0.c | 6 ++-- > drivers/gpu/drm/amd/amdgpu/dce_v11_0.c | 6 ++-- > drivers/gpu/drm/amd/amdgpu/dce_v8_0.c | 6 ++-- > drivers/gpu/drm/drm_atomic.c| 5 ++-- > drivers/gpu/drm/drm_crtc.c | 21 - > drivers/gpu/drm/drm_fourcc.c| 17 ++- > drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c | 6 ++-- > drivers/gpu/drm/i915/i915_debugfs.c | 11 ++- > drivers/gpu/drm/i915/intel_atomic_plane.c | 6 ++-- > drivers/gpu/drm/i915/intel_display.c| 39 > - > drivers/gpu/drm/radeon/atombios_crtc.c | 12 +--- > include/drm/drm_fourcc.h| 2 +- > 12 files changed, 89 insertions(+), 48 deletions(-) > > diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c > index 0645c85..38216a1 100644 > --- a/drivers/gpu/drm/drm_fourcc.c > +++ b/drivers/gpu/drm/drm_fourcc.c > @@ -39,16 +39,14 @@ static char printable_char(int c) > * drm_get_format_name - return a string for drm fourcc format > * @format: format to compute name of > * > - * Note that the buffer used by this function is globally shared and owned by > - * the function itself. > - * > - * FIXME: This isn't really multithreading safe. > + * Note that the buffer returned by this function is owned by the caller > + * and will need to be freed. > */ > const char *drm_get_format_name(uint32_t format) I find it surprising that a function that allocates a buffer returns a const pointer. Some userspace libraries have conventions about the ownership based on constness. (I also find it suprising that kfree() takes a const pointer; arguably that call changes the memory.) Is there precedent for this? BR, Jani. 
> { > - static char buf[32]; > + char *buf = kmalloc(32, GFP_KERNEL); > > - snprintf(buf, sizeof(buf), > + snprintf(buf, 32, >"%c%c%c%c %s-endian (0x%08x)", >printable_char(format & 0xff), >printable_char((format >> 8) & 0xff), > @@ -73,6 +71,8 @@ EXPORT_SYMBOL(drm_get_format_name); > void drm_fb_get_bpp_depth(uint32_t format, unsigned int *depth, > int *bpp) > { > + const char *format_name; > + > switch (format) { > case DRM_FORMAT_C8: > case DRM_FORMAT_RGB332: > @@ -127,8 +127,9 @@ void drm_fb_get_bpp_depth(uint32_t format, unsigned int > *depth, > *bpp = 32; > break; > default: > - DRM_DEBUG_KMS("unsupported pixel format %s\n", > - drm_get_format_name(format)); > + format_name = drm_get_format_name(format); > + DRM_DEBUG_KMS("unsupported pixel format %s\n", format_name); > + kfree(format_name); > *depth = 0; > *bpp = 0; > break; > diff --git a/include/drm/drm_fourcc.h b/include/drm/drm_fourcc.h > index 7f90a39..030d22d 100644 > --- a/include/drm/drm_fourcc.h > +++ b/include/drm/drm_fourcc.h > @@ -32,6 +32,6 @@ int drm_format_horz_chroma_subsampling(uint32_t format); > int drm_format_vert_chroma_subsampling(uint32_t format); > int drm_format_plane_width(int width, uint32_t format, int plane); > int drm_format_plane_height(int height, uint32_t format, int plane); > -const char *drm_get_format_name(uint32_t format); > +const char *drm_get_format_name(uint32_t format) __malloc; > > #endif /* __DRM_FOURCC_H__ */ > diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c > b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c > index c1b04e9..0bf8959 100644 > --- a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c > @@ -2071,6 +2071,7 @@ static int dce_v10_0_crtc_do_set_base(struct drm_crtc > *crtc, > u32 tmp, viewport_w, viewport_h; > int r; > bool bypass_lut = false; > + const char *format_name; > > /* no fb bound */ > if (!atomic && !crtc->primary->fb) { > @@ -2182,8 +2183,9 @@ static int dce_v10_0_crtc_do_set_base(struct drm_crtc > *crtc, > bypass_lut = true; > break; > default: > - DRM_ERROR("Unsupported screen form
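On the hard-coded 32: one possible shape of the #define Eric asks about, sketched here only to illustrate the idea (the macro name is invented for this example and is not part of the patch), so the kmalloc() and snprintf() sizes cannot drift apart:

#define DRM_FORMAT_NAME_BUF_SIZE 32	/* hypothetical name, not in the patch */

const char *drm_get_format_name(uint32_t format)
{
	char *buf = kmalloc(DRM_FORMAT_NAME_BUF_SIZE, GFP_KERNEL);

	snprintf(buf, DRM_FORMAT_NAME_BUF_SIZE,
		 "%c%c%c%c %s-endian (0x%08x)",
		 printable_char(format & 0xff),
		 printable_char((format >> 8) & 0xff),
		 printable_char((format >> 16) & 0xff),
		 printable_char((format >> 24) & 0x7f),
		 format & DRM_FORMAT_BIG_ENDIAN ? "big" : "little",
		 format);

	return buf;
}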
Re: [Intel-gfx] [PATCH 09/10] drm/i915: Bump the inactive MRU tracking for all VMA accessed
> When we bump the MRU access tracking on set-to-gtt, we need to not only > bump the primary GGTT VMA but all partials as well. Similarly we want to > bump the MRU access for when unpinning an object from the scanout. Refer to the list as LRU in the commit title and message to avoid confusion. On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote: > +static void i915_gem_object_bump_inactive_ggtt(struct drm_i915_gem_object > *obj) > +{ > + struct i915_vma *vma; > + > + list_for_each_entry(vma, &obj->vma_list, obj_link) { > + if (!i915_vma_is_ggtt(vma)) > + continue; > + > + if (i915_vma_is_active(vma)) > + continue; Could combine these two to one if. Reviewed-by: Joonas Lahtinen Regards, Joonas -- Joonas Lahtinen Open Source Technology Center Intel Corporation ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
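For clarity, the fold being suggested would look roughly like this (a sketch only; the rest of the loop body is left as in the original hunk):

	list_for_each_entry(vma, &obj->vma_list, obj_link) {
		if (!i915_vma_is_ggtt(vma) || i915_vma_is_active(vma))
			continue;

		/* ... bump the vma on the inactive list as in the patch ... */
	}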
Re: [Intel-gfx] [PATCH 10/10] drm/i915: Stop discarding GTT cache-domain on unbind vma
On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote: > Since commit 43566dedde54 ("drm/i915: Broaden application of > set-domain(GTT)") we allowed objects to be in the GTT domain, but unbound. > Therefore removing the GTT cache domain when removing the GGTT vma is no > longer semantically correct. > > An unfortunate side-effect is we lose the wondrously named > i915_gem_object_finish_gtt(), not to be confused with > i915_gem_gtt_finish_object()! > Does what it promises. Reviewed-by: Joonas Lahtinen Regards, Joonas -- Joonas Lahtinen Open Source Technology Center Intel Corporation ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915/skl: Do not error out when total_data_rate is 0
This can happen when doing a modeset with only the cursor plane active. Testcase: kms_atomic_transition Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/intel_pm.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 651277b0c917..550d9f0688ae 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -3115,8 +3115,6 @@ skl_get_total_relative_data_rate(struct intel_crtc_state *intel_cstate) total_data_rate += intel_cstate->wm.skl.plane_y_data_rate[id]; } - WARN_ON(cstate->plane_mask && total_data_rate == 0); - return total_data_rate; } -- 2.7.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 09/10] drm/i915: Bump the inactive MRU tracking for all VMA accessed
On Mon, Aug 15, 2016 at 12:59:09PM +0300, Joonas Lahtinen wrote: > > When we bump the MRU access tracking on set-to-gtt, we need to not only > > bump the primary GGTT VMA but all partials as well. Similarly we want to > > bump the MRU access for when unpinning an object from the scanout. > > Refer to the list as LRU in the commit title and message to avoid confusion. Still disagree. We are adjusting the MRU end of the list, as the code always has, and then evicting from the LRU end. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Ro.CI.BAT: failure for series starting with [CI,01/32] drm/i915: Record the position of the start of the request
== Series Details == Series: series starting with [CI,01/32] drm/i915: Record the position of the start of the request URL : https://patchwork.freedesktop.org/series/11093/ State : failure == Summary == Series 11093v1 Series without cover letter http://patchwork.freedesktop.org/api/1.0/series/11093/revisions/1/mbox Test kms_cursor_legacy: Subgroup basic-cursor-vs-flip-varying-size: fail -> PASS (ro-ilk1-i5-650) Subgroup basic-flip-vs-cursor-varying-size: pass -> FAIL (ro-byt-n2820) fail -> PASS (ro-bdw-i5-5250u) pass -> FAIL (ro-skl3-i5-6260u) Test kms_pipe_crc_basic: Subgroup suspend-read-crc-pipe-a: skip -> DMESG-WARN (ro-bdw-i5-5250u) Subgroup suspend-read-crc-pipe-c: pass -> DMESG-WARN (ro-bdw-i7-5600u) fi-hsw-i7-4770k total:244 pass:222 dwarn:0 dfail:0 fail:0 skip:22 fi-kbl-qkkr total:244 pass:186 dwarn:29 dfail:0 fail:3 skip:26 fi-skl-i7-6700k total:244 pass:208 dwarn:4 dfail:2 fail:2 skip:28 fi-snb-i7-2600 total:244 pass:202 dwarn:0 dfail:0 fail:0 skip:42 ro-bdw-i5-5250u total:240 pass:219 dwarn:2 dfail:0 fail:1 skip:18 ro-bdw-i7-5600u total:240 pass:206 dwarn:1 dfail:0 fail:1 skip:32 ro-bsw-n3050 total:240 pass:194 dwarn:0 dfail:0 fail:4 skip:42 ro-byt-n2820 total:240 pass:197 dwarn:0 dfail:0 fail:3 skip:40 ro-hsw-i3-4010u total:240 pass:214 dwarn:0 dfail:0 fail:0 skip:26 ro-hsw-i7-4770r total:240 pass:185 dwarn:0 dfail:0 fail:0 skip:55 ro-ilk1-i5-650 total:235 pass:174 dwarn:0 dfail:0 fail:1 skip:60 ro-ivb-i7-3770 total:240 pass:205 dwarn:0 dfail:0 fail:0 skip:35 ro-ivb2-i7-3770 total:240 pass:209 dwarn:0 dfail:0 fail:0 skip:31 ro-skl3-i5-6260u total:240 pass:222 dwarn:0 dfail:0 fail:4 skip:14 Results at /archive/results/CI_IGT_test/RO_Patchwork_1867/ 1b2e958 drm-intel-nightly: 2016y-08m-15d-09h-09m-06s UTC integration manifest 7d6041c drm/i915: Record the RING_MODE register for post-mortem debugging d8ff181 drm/i915: Only record active and pending requests upon a GPU hang d0a310c drm/i915: Print the batchbuffer offset next to BBADDR in error state da7a99b drm/i915: Move debug only per-request pid tracking from request to ctx fa4b7d2 drm/i915: Introduce i915_ggtt_offset() b1fd3c1 drm/i915: Track pinned VMA c2dd68d drm/i915: Consolidate i915_vma_unpin_and_release() 4b74e1f drm/i915: Use VMA for wa_ctx tracking a2e786f drm/i915: Use VMA for render state page tracking 3083b2c drm/i915: Use VMA as the primary tracker for semaphore page ed77da3 drm/i915/overlay: Use VMA as the primary tracker for images 1be7e98 drm/i915: Move common seqno reset to intel_engine_cs.c fe5680d drm/i915: Move common scratch allocation/destroy to intel_engine_cs.c b1dc161 drm/i915: Use VMA for scratch page tracking bc1fc79 drm/i915: Use VMA for ringbuffer tracking 1ced8c8 drm/i915: Move assertion for iomap access to i915_vma_pin_iomap 71987d7 drm/i915: Only change the context object's domain when binding fd8cbd3 drm/i915: Use VMA as the primary object for context state 79f2877 drm/i915: Use VMA directly for checking tiling parameters f1a7f7f drm/i915: Convert fence computations to use vma directly bf12887 drm/i915: Track pinned vma inside guc 762650b drm/i915: Add convenience wrappers for vma's object get/put eca1534 drm/i915: Add fetch_and_zero() macro 4f57f1d drm/i915: Create a VMA for an object 0fc7594 drm/i915: Always set the vma->pages 61a24bc drm/i915: Remove redundant WARN_ON from __i915_add_request() 4a925da drm/i915: Reduce i915_gem_objects to only show object information 4c3d11b drm/i915: Focus debugfs/i915_gem_pinned to show only display pins 07af7c5 drm/i915: Remove inactive/active list from 
debugfs e5e12b1 drm/i915: Store the active context object on all engines upon error 0a3d2c5 drm/i915: Reduce amount of duplicate buffer information captured on error a8cdde2 drm/i915: Record the position of the start of the request ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 03/10] drm/i915: Move fence tracking from object to vma
On ma, 2016-08-15 at 10:25 +0100, Chris Wilson wrote: > On Mon, Aug 15, 2016 at 12:18:20PM +0300, Joonas Lahtinen wrote: > > > > On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote: > > > > > > + if (1) { > > Umm? At least ought to have TODO: / FIXME: or some explanation. And > You're not aware of the pipelined fencing? I was most definitely not, now I am somewhat. Still need to add dem TODOs. Regards, Joonas > -Chris > -- Joonas Lahtinen Open Source Technology Center Intel Corporation ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 06/10] drm/i915: Choose not to evict faultable objects from the GGTT
On pe, 2016-08-12 at 12:13 +0100, Chris Wilson wrote: > On Fri, Aug 12, 2016 at 01:50:56PM +0300, Joonas Lahtinen wrote: > > On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote: > > > > > > @@ -1715,10 +1716,10 @@ int i915_gem_fault(struct vm_area_struct *area, > > > struct vm_fault *vmf) > > > goto err_unlock; > > > } > > > > > > - /* Use a partial view if the object is bigger than the aperture. */ > > > - /* Now pin it into the GTT if needed */ > > > - vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, > > > - PIN_MAPPABLE | PIN_NONBLOCK); > > > + flags = PIN_MAPPABLE; > > > + if (obj->base.size > 2 << 20) > > Magic number. > One day there may be a MiB() macro. It is a magic number, just a rule of > thumb based on minimum chunksize for a partial. #define the minimum chunk size and use it here too? With a warning of the number being derived from the wildest approximations. > > > > > > > > > @@ -55,6 +55,9 @@ mark_free(struct i915_vma *vma, struct list_head > > > *unwind) > > > if (WARN_ON(!list_empty(&vma->exec_list))) > > > return false; > > > > > > + if (flags & PIN_NOFAULT && vma->obj->fault_mappable) > > > + return false; > > The flag name is rather counter-intuitive for it describes other VMAs > > rather than our new VMA... > As does NONBLOCKING. We could lose this flag in favour of NOEVICT, but > I haven't run anything to confirm if that's a good tradeoff. Maybe the flag should be named something like __PIN_NOFAULTING to keep it distinct, in addition to __PIN_NONBLOCKING? And then make sure they're never set on vma itself. Regards, Joonas > -Chris > -- Joonas Lahtinen Open Source Technology Center Intel Corporation ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
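A sketch of the #define being asked for, purely illustrative (the macro and helper names are invented here; the 2 MiB value is the rule of thumb quoted in the hunk above, and nothing else about the final patch is implied):

/* Rule-of-thumb minimum chunk for a partial GGTT view. */
#define I915_GTT_PARTIAL_CHUNK_SIZE (2 << 20)	/* hypothetical name */

static inline bool use_partial_view(const struct drm_i915_gem_object *obj)
{
	return obj->base.size > I915_GTT_PARTIAL_CHUNK_SIZE;
}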
Re: [Intel-gfx] [PATCH 20/20] drm/i915: Early creation of relay channel for capturing boot time logs
On 8/15/2016 2:50 PM, Tvrtko Ursulin wrote: On 12/08/16 17:31, Goel, Akash wrote: On 8/12/2016 9:52 PM, Tvrtko Ursulin wrote: On 12/08/16 07:25, akash.g...@intel.com wrote: From: Akash Goel As per the current i915 Driver load sequence, debugfs registration is done at the end and so the relay channel debugfs file is also created after that but the GuC firmware is loaded much earlier in the sequence. As a result Driver could miss capturing the boot-time logs of GuC firmware if there are flush interrupts from the GuC side. Relay has a provision to support early logging where initially only relay channel can be created, to have buffers for storing logs, and later on channel can be associated with a debugfs file at appropriate time. Have availed that, which allows Driver to capture boot time logs also, which can be collected once Userspace comes up. Suggested-by: Chris Wilson Signed-off-by: Akash Goel --- drivers/gpu/drm/i915/i915_guc_submission.c | 61 +- 1 file changed, 44 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index af48f62..1c287d7 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -1099,25 +1099,12 @@ static void guc_remove_log_relay_file(struct intel_guc *guc) relay_close(guc->log.relay_chan); } -static int guc_create_log_relay_file(struct intel_guc *guc) +static int guc_create_relay_channel(struct intel_guc *guc) { struct drm_i915_private *dev_priv = guc_to_i915(guc); struct rchan *guc_log_relay_chan; -struct dentry *log_dir; size_t n_subbufs, subbuf_size; -/* For now create the log file in /sys/kernel/debug/dri/0 dir */ -log_dir = dev_priv->drm.primary->debugfs_root; - -/* If /sys/kernel/debug/dri/0 location do not exist, then debugfs is - * not mounted and so can't create the relay file. - * The relay API seems to fit well with debugfs only. It only needs a dentry, I don't see that it has to be a debugfs one. Besides dentry, there are other requirements for using relay, which can be met only for a debugfs file. debugfs wasn't the preferred choice to place the log file, but had no other option, as relay API is compatible with debugfs only. What are those and For availing relay there are 3 requirements :- a) Need the associated ‘dentry’ pointer of the file, while opening the relay channel. b) Should be able to use 'relay_file_operations' fops for the file. c) Set the 'i_private' field of file’s inode to the pointer of relay channel buffer. All the above 3 requirements can be met for a debugfs file in a straightforward manner. But not all of them can be met for a file created inside sysfs or if the file is created inside /dev as a character device file. should they be mentioned in the comment above? Or should I mention them in the cover letter or commit message. Best regards Akash Regards, Tvrtko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
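To illustrate why requirements (a)-(c) fall out naturally for debugfs, the canonical create_buf_file() callback described in Documentation/filesystems/relay.txt reduces to a single debugfs_create_file() call; a sketch, not taken from this patch:

static struct dentry *create_buf_file_cb(const char *filename,
					 struct dentry *parent,
					 umode_t mode,
					 struct rchan_buf *buf,
					 int *is_global)
{
	/* debugfs hands back a dentry, accepts relay_file_operations as the
	 * fops, and stores 'buf' in the new inode's i_private -- the three
	 * requirements listed above, all met in one call.
	 */
	return debugfs_create_file(filename, mode, parent, buf,
				   &relay_file_operations);
}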
[Intel-gfx] ✗ Ro.CI.BAT: failure for drm/i915/skl: Do not error out when total_data_rate is 0
== Series Details == Series: drm/i915/skl: Do not error out when total_data_rate is 0 URL : https://patchwork.freedesktop.org/series/11094/ State : failure == Summary == Series 11094v1 drm/i915/skl: Do not error out when total_data_rate is 0 http://patchwork.freedesktop.org/api/1.0/series/11094/revisions/1/mbox Test kms_cursor_legacy: Subgroup basic-flip-vs-cursor-legacy: fail -> PASS (ro-byt-n2820) Subgroup basic-flip-vs-cursor-varying-size: pass -> FAIL (ro-byt-n2820) pass -> FAIL (fi-hsw-i7-4770k) fail -> PASS (ro-bdw-i5-5250u) Test kms_pipe_crc_basic: Subgroup suspend-read-crc-pipe-a: skip -> DMESG-WARN (ro-bdw-i5-5250u) Subgroup suspend-read-crc-pipe-b: pass -> INCOMPLETE (fi-hsw-i7-4770k) skip -> DMESG-WARN (ro-bdw-i5-5250u) fi-hsw-i7-4770k total:207 pass:185 dwarn:0 dfail:0 fail:1 skip:20 fi-kbl-qkkr total:244 pass:185 dwarn:29 dfail:0 fail:3 skip:27 fi-skl-i7-6700k total:244 pass:208 dwarn:4 dfail:2 fail:2 skip:28 fi-snb-i7-2600 total:244 pass:202 dwarn:0 dfail:0 fail:0 skip:42 ro-bdw-i5-5250u total:240 pass:219 dwarn:3 dfail:0 fail:1 skip:17 ro-bdw-i7-5600u total:240 pass:207 dwarn:0 dfail:0 fail:1 skip:32 ro-bsw-n3050 total:240 pass:194 dwarn:0 dfail:0 fail:4 skip:42 ro-byt-n2820 total:240 pass:198 dwarn:0 dfail:0 fail:2 skip:40 ro-hsw-i3-4010u total:240 pass:214 dwarn:0 dfail:0 fail:0 skip:26 ro-hsw-i7-4770r total:240 pass:185 dwarn:0 dfail:0 fail:0 skip:55 ro-ilk1-i5-650 total:235 pass:173 dwarn:0 dfail:0 fail:2 skip:60 ro-ivb-i7-3770 total:240 pass:205 dwarn:0 dfail:0 fail:0 skip:35 ro-ivb2-i7-3770 total:240 pass:209 dwarn:0 dfail:0 fail:0 skip:31 ro-skl3-i5-6260u total:240 pass:223 dwarn:0 dfail:0 fail:3 skip:14 Results at /archive/results/CI_IGT_test/RO_Patchwork_1868/ 1b2e958 drm-intel-nightly: 2016y-08m-15d-09h-09m-06s UTC integration manifest 2f8ea64 drm/i915/skl: Do not error out when total_data_rate is 0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 6/9] drm/i915/cmdparser: Compare against the previous command descriptor
On 12 August 2016 at 16:07, Chris Wilson wrote: > On the blitter (and in test code), we see long sequences of repeated > commands, e.g. XY_PIXEL_BLT, XY_SCANLINE_BLT, or XY_SRC_COPY. For these, > we can skip the hashtable lookup by remembering the previous command > descriptor and doing a straightforward compare of the command header. > The corollary is that we need to do one extra comparison before lookup > up new commands. > > Signed-off-by: Chris Wilson > --- > drivers/gpu/drm/i915/i915_cmd_parser.c | 20 +--- > 1 file changed, 13 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c > b/drivers/gpu/drm/i915/i915_cmd_parser.c > index 274f2136a846..3b1100a0e0cb 100644 > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c > @@ -350,6 +350,9 @@ static const struct drm_i915_cmd_descriptor > hsw_blt_cmds[] = { > CMD( MI_LOAD_SCAN_LINES_EXCL, SMI, !F, 0x3F, R ), > }; > > +static const struct drm_i915_cmd_descriptor noop_desc = > + CMD(MI_NOOP, SMI, F, 1, S); > + > #undef CMD > #undef SMI > #undef S3D > @@ -898,11 +901,14 @@ find_cmd_in_table(struct intel_engine_cs *engine, > static const struct drm_i915_cmd_descriptor* > find_cmd(struct intel_engine_cs *engine, > u32 cmd_header, > +const struct drm_i915_cmd_descriptor *desc, > struct drm_i915_cmd_descriptor *default_desc) > { > - const struct drm_i915_cmd_descriptor *desc; > u32 mask; > > + if (((cmd_header ^ desc->cmd.value) & desc->cmd.mask) == 0) > + return desc; > + > desc = find_cmd_in_table(engine, cmd_header); > if (desc) > return desc; > @@ -911,10 +917,10 @@ find_cmd(struct intel_engine_cs *engine, > if (!mask) > return NULL; > > - BUG_ON(!default_desc); Why remove this, was it overkill? > - default_desc->flags = CMD_DESC_SKIP; > + default_desc->cmd.value = cmd_header; > + default_desc->cmd.mask = 0x; Where did you pluck this mask from? ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 09/10] drm/i915: Bump the inactive MRU tracking for all VMA accessed
On ma, 2016-08-15 at 11:12 +0100, Chris Wilson wrote: > On Mon, Aug 15, 2016 at 12:59:09PM +0300, Joonas Lahtinen wrote: > > > > > > > > When we bump the MRU access tracking on set-to-gtt, we need to not only > > > bump the primary GGTT VMA but all partials as well. Similarly we want to > > > bump the MRU access for when unpinning an object from the scanout. > > Refer to the list as LRU in the commit title and message to avoid confusion. > Still disagree. We are adjusting the MRU entity, the code always has and > then evicting from the LRU. I would not use the abbreviation MRU when discussing LRU scheme, but it's only the commit message so I can live with it. Regards, Joonas > -Chris > -- Joonas Lahtinen Open Source Technology Center Intel Corporation ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [STABLE 4.4 BACKPORT REQUEST] drm/i915: Don't complain about lack of ACPI video bios
Stable team, please backport commit 78c3d5fa7354774b7c8638033d46c042ebae41fb Author: Daniel Vetter Date: Fri Oct 23 11:00:06 2015 +0200 drm/i915: Don't complain about lack of ACPI video bios to v4.4. Tested-by: Rainer Fiebig # v4.4 BR, Jani. -- Jani Nikula, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 02/10] drm/i915/userptr: Make gup errors stickier
On pe, 2016-08-12 at 11:28 +0100, Chris Wilson wrote: > Keep any error reported by the gup_worker until we are notified that the > arena has changed (via the mmu-notifier). This has the important effect of > making two consecutive calls to i915_gem_object_get_pages() report > the same error, and curtailing a loop of detecting a fault and requeueing > a gup_worker. > I think this is for Mika to review. Regards, Joonas > Signed-off-by: Chris Wilson > --- > drivers/gpu/drm/i915/i915_gem_userptr.c | 17 +++-- > 1 file changed, 7 insertions(+), 10 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c > b/drivers/gpu/drm/i915/i915_gem_userptr.c > index 57218cca7e05..be54825ef3e8 100644 > --- a/drivers/gpu/drm/i915/i915_gem_userptr.c > +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c > @@ -542,8 +542,6 @@ __i915_gem_userptr_get_pages_worker(struct work_struct > *_work) > } > } > obj->userptr.work = ERR_PTR(ret); > - if (ret) > - __i915_gem_userptr_set_active(obj, false); > } > > obj->userptr.workers--; > @@ -628,15 +626,14 @@ i915_gem_userptr_get_pages(struct drm_i915_gem_object > *obj) > * to the vma (discard or cloning) which should prevent the more > * egregious cases from causing harm. > */ > - if (IS_ERR(obj->userptr.work)) { > - /* active flag will have been dropped already by the worker */ > - ret = PTR_ERR(obj->userptr.work); > - obj->userptr.work = NULL; > - return ret; > - } > - if (obj->userptr.work) > + > + if (obj->userptr.work) { > /* active flag should still be held for the pending work */ > - return -EAGAIN; > + if (IS_ERR(obj->userptr.work)) > + return PTR_ERR(obj->userptr.work); > + else > + return -EAGAIN; > + } > > /* Let the mmu-notifier know that we have begun and need cancellation */ > ret = __i915_gem_userptr_set_active(obj, true); -- Joonas Lahtinen Open Source Technology Center Intel Corporation ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v4 0/2] drm/i915/opregion: proper handling of DIDL and CADL
Hi list, I have an Asus laptop, and these two patches solved my problem with bright hot-keys not working[1]. I applied both patches on 4.8-rc1, and the only necessary fix was changing priv_dev->dev to priv_dev->drm in all places of for_each_* macros touched by these patches. Is there any chance to get this merged before 4.8 is launched? And, if there are other problems still in need of fixes in this patch, please let me know. Thanks! [1] https://bugzilla.kernel.org/show_bug.cgi?id=152091 >From ae6d2f8916abe9573b91b3ecb565c9585dda579a Mon Sep 17 00:00:00 2001 From: Jani Nikula Date: Wed, 29 Jun 2016 18:36:41 +0300 Subject: [PATCH 1/2] drm/i915: make i915 the source of acpi device ids for _DOD The graphics driver is supposed to define the DIDL, which are used for _DOD, not the BIOS. Restore that behaviour. This is basically a revert of commit 3143751ff51a163b77f7efd389043e038f3e008e Author: Zhang Rui Date: Mon Mar 29 15:12:16 2010 +0800 drm/i915: set DIDL using the ACPI video output device _ADR method return. which went out of its way to cater to a specific BIOS, setting up DIDL based on _ADR method. Perhaps that approach worked on that specific machine, but on the machines I checked the _ADR method invents the device identifiers out of thin air if DIDL has not been set. The source for _ADR is also supposed to be the DIDL set by the driver, not the other way around. With this, we'll also limit the number of outputs to what the driver actually has. v2: do not set ACPI_DEVICE_ID_SCHEME in the device id (Peter Wu) Reviewed-and-tested-by: Peter Wu Signed-off-by: Jani Nikula --- drivers/gpu/drm/i915/intel_drv.h | 3 ++ drivers/gpu/drm/i915/intel_opregion.c | 89 ++- 2 files changed, 28 insertions(+), 64 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index cc937a1..8656b4c 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -263,6 +263,9 @@ struct intel_connector { */ struct intel_encoder *encoder; + /* ACPI device id for ACPI and driver cooperation */ + u32 acpi_device_id; + /* Reads out the current hw, returning true if the connector is enabled * and active (i.e. dpms ON state). 
*/ bool (*get_hw_state)(struct intel_connector *); diff --git a/drivers/gpu/drm/i915/intel_opregion.c b/drivers/gpu/drm/i915/intel_opregion.c index adca262..494559a 100644 --- a/drivers/gpu/drm/i915/intel_opregion.c +++ b/drivers/gpu/drm/i915/intel_opregion.c @@ -674,11 +674,11 @@ static void set_did(struct intel_opregion *opregion, int i, u32 val) } } -static u32 acpi_display_type(struct drm_connector *connector) +static u32 acpi_display_type(struct intel_connector *connector) { u32 display_type; - switch (connector->connector_type) { + switch (connector->base.connector_type) { case DRM_MODE_CONNECTOR_VGA: case DRM_MODE_CONNECTOR_DVIA: display_type = ACPI_DISPLAY_TYPE_VGA; @@ -707,7 +707,7 @@ static u32 acpi_display_type(struct drm_connector *connector) display_type = ACPI_DISPLAY_TYPE_OTHER; break; default: - MISSING_CASE(connector->connector_type); + MISSING_CASE(connector->base.connector_type); display_type = ACPI_DISPLAY_TYPE_OTHER; break; } @@ -718,34 +718,10 @@ static u32 acpi_display_type(struct drm_connector *connector) static void intel_didl_outputs(struct drm_i915_private *dev_priv) { struct intel_opregion *opregion = &dev_priv->opregion; - struct pci_dev *pdev = dev_priv->drm.pdev; - struct drm_connector *connector; - acpi_handle handle; - struct acpi_device *acpi_dev, *acpi_cdev, *acpi_video_bus = NULL; - unsigned long long device_id; - acpi_status status; - u32 temp, max_outputs; - int i = 0; - - handle = ACPI_HANDLE(&pdev->dev); - if (!handle || acpi_bus_get_device(handle, &acpi_dev)) - return; - - if (acpi_is_video_device(handle)) - acpi_video_bus = acpi_dev; - else { - list_for_each_entry(acpi_cdev, &acpi_dev->children, node) { - if (acpi_is_video_device(acpi_cdev->handle)) { - acpi_video_bus = acpi_cdev; - break; - } - } - } - - if (!acpi_video_bus) { - DRM_DEBUG_KMS("No ACPI video bus found\n"); - return; - } + struct intel_connector *connector; + struct drm_device *dev = &dev_priv->drm; + int i = 0, max_outputs; + int display_index[16] = {}; /* * In theory, did2, the extended didl, gets added at opregion version @@ -757,46 +733,31 @@ static void intel_didl_outputs(struct drm_i915_private *dev_priv) max_outputs
[Intel-gfx] [PATCH 2/5] drm/i915: Stop the machine whilst capturing the GPU crash dump
The error state is purposefully racy as we expect it to be called at any time and so have avoided any locking whilst capturing the crash dump. However, with multi-engine GPUs and multiple CPUs, those races can manifest into OOPSes as we attempt to chase dangling pointers freed on other CPUs. Under discussion are lots of ways to slow down normal operation in order to protect the post-mortem error capture, but what it we take the opposite approach and freeze the machine whilst the error capture runs (note the GPU may still running, but as long as we don't process any of the results the driver's bookkeeping will be static). Note that by of itself, this is not a complete fix. It also depends on the compiler barriers in list_add/list_del to prevent traversing the lists into the void. We also depend that we only require state from carefully controlled sources - i.e. all the state we require for post-mortem debugging should be reachable from the request itself so that we only have to worry about retrieving the request carefully. Once we have the request, we know that all pointers from it are intact. v2: Avoid drm_clflush_pages() inside stop_machine() as it may use stop_machine() itself for its wbinvd fallback. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/Kconfig | 1 + drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_gpu_error.c | 46 +-- 3 files changed, 31 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 10a6ac11b6a9..0f46a9c04c0e 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -4,6 +4,7 @@ config DRM_I915 depends on X86 && PCI select INTEL_GTT select INTERVAL_TREE + select STOP_MACHINE # we need shmfs for the swappable backing store, and in particular # the shmem_readpage() which depends upon tmpfs select SHMEM diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 20caac1796ef..52facd4a7179 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -705,6 +705,8 @@ struct drm_i915_error_state { struct kref ref; struct timeval time; + struct drm_i915_private *i915; + char error_msg[128]; bool simulated; int iommu; diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 0bbc22f9a705..0815e5c47431 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -28,6 +28,7 @@ */ #include +#include #include "i915_drv.h" #ifdef CONFIG_DRM_I915_CAPTURE_ERROR @@ -715,14 +716,12 @@ i915_error_object_create(struct drm_i915_private *dev_priv, dst->page_count = num_pages; while (num_pages--) { - unsigned long flags; void *d; d = kmalloc(PAGE_SIZE, GFP_ATOMIC); if (d == NULL) goto unwind; - local_irq_save(flags); if (use_ggtt) { void __iomem *s; @@ -741,15 +740,10 @@ i915_error_object_create(struct drm_i915_private *dev_priv, page = i915_gem_object_get_page(src, i); - drm_clflush_pages(&page, 1); - s = kmap_atomic(page); memcpy(d, s, PAGE_SIZE); kunmap_atomic(s); - - drm_clflush_pages(&page, 1); } - local_irq_restore(flags); dst->pages[i++] = d; reloc_offset += PAGE_SIZE; @@ -1404,6 +1398,31 @@ static void i915_capture_gen_state(struct drm_i915_private *dev_priv, sizeof(error->device_info)); } +static int capture(void *data) +{ + struct drm_i915_error_state *error = data; + + /* Ensure that what we readback from memory matches what the GPU sees */ + wbinvd(); + + i915_capture_gen_state(error->i915, error); + i915_capture_reg_state(error->i915, error); + 
i915_gem_record_fences(error->i915, error); + i915_gem_record_rings(error->i915, error); + i915_capture_active_buffers(error->i915, error); + i915_capture_pinned_buffers(error->i915, error); + + do_gettimeofday(&error->time); + + error->overlay = intel_overlay_capture_error_state(error->i915); + error->display = intel_display_capture_error_state(error->i915); + + /* And make sure we don't leave trash in the CPU cache */ + wbinvd(); + + return 0; +} + /** * i915_capture_error_state - capture an error record for later analysis * @dev: drm device @@ -1435,18 +1454,9 @@ void i915_capture_error_state(struct drm_i915_private *dev_priv, } kref_init(&error->ref); + error->i915 = dev_priv; - i915_capture_gen_state(dev_priv, error); - i915_capture_reg_sta
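The i915_capture_error_state() hunk above is cut off before the actual stop_machine() call site. As an illustrative sketch only (not the posted hunk), the pattern the commit message describes amounts to handing the capture callback to stop_machine(), which runs it while every other CPU is parked:

/* Illustrative sketch, not the posted patch: run the racy error-capture
 * walk via stop_machine() so the driver's bookkeeping cannot change under
 * us. capture() is the int (*)(void *) callback added above, and
 * <linux/stop_machine.h> provides stop_machine().
 */
static void i915_error_capture_stopped(struct drm_i915_error_state *error)
{
	/* All other CPUs spin with interrupts disabled while capture() runs */
	stop_machine(capture, error, NULL);
}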
[Intel-gfx] [PATCH 1/5] drm/i915: Allow disabling error capture
We currently capture the GPU state after we detect a hang. This is vital for us to both triage and debug hangs in the wild (post-mortem debugging). However, it comes at the cost of running some potentially dangerous code (since it has to make very few assumptions about the state of the driver) that is quite resource intensive. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/Kconfig | 10 ++ drivers/gpu/drm/i915/i915_debugfs.c | 6 ++ drivers/gpu/drm/i915/i915_drv.h | 11 +++ drivers/gpu/drm/i915/i915_gpu_error.c | 7 +++ drivers/gpu/drm/i915/i915_params.c| 9 + drivers/gpu/drm/i915/i915_params.h| 1 + drivers/gpu/drm/i915/i915_sysfs.c | 8 drivers/gpu/drm/i915/intel_display.c | 4 drivers/gpu/drm/i915/intel_overlay.c | 4 9 files changed, 60 insertions(+) diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 7769e469118f..10a6ac11b6a9 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -46,6 +46,16 @@ config DRM_I915_PRELIMINARY_HW_SUPPORT If in doubt, say "N". +config DRM_I915_CAPTURE_ERROR + bool "Enable capturing GPU state following a hang" + depends on DRM_I915 + default y + help + This option enables capturing the GPU state when a hang is detected. + This information is vital for triaging hangs and assists in debugging. + + If in doubt, say "Y". + config DRM_I915_USERPTR bool "Always enable userptr support" depends on DRM_I915 diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index b89478a8d19a..f41ebf25655c 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -973,6 +973,8 @@ static int i915_hws_info(struct seq_file *m, void *data) return 0; } +#ifdef CONFIG_DRM_I915_CAPTURE_ERROR + static ssize_t i915_error_state_write(struct file *filp, const char __user *ubuf, @@ -1062,6 +1064,8 @@ static const struct file_operations i915_error_state_fops = { .release = i915_error_state_release, }; +#endif + static int i915_next_seqno_get(void *data, u64 *val) { @@ -5399,7 +5403,9 @@ static const struct i915_debugfs_files { {"i915_ring_missed_irq", &i915_ring_missed_irq_fops}, {"i915_ring_test_irq", &i915_ring_test_irq_fops}, {"i915_gem_drop_caches", &i915_drop_caches_fops}, +#ifdef CONFIG_DRM_I915_CAPTURE_ERROR {"i915_error_state", &i915_error_state_fops}, +#endif {"i915_next_seqno", &i915_next_seqno_fops}, {"i915_display_crc_ctl", &i915_display_crc_ctl_fops}, {"i915_pri_wm_latency", &i915_pri_wm_latency_fops}, diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 35caa9b2f36a..20caac1796ef 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3482,6 +3482,7 @@ static inline void intel_display_crc_init(struct drm_device *dev) {} #endif /* i915_gpu_error.c */ +#ifdef CONFIG_DRM_I915_CAPTURE_ERROR __printf(2, 3) void i915_error_printf(struct drm_i915_error_state_buf *e, const char *f, ...); int i915_error_state_to_str(struct drm_i915_error_state_buf *estr, @@ -3501,6 +3502,16 @@ void i915_error_state_get(struct drm_device *dev, struct i915_error_state_file_priv *error_priv); void i915_error_state_put(struct i915_error_state_file_priv *error_priv); void i915_destroy_error_state(struct drm_device *dev); +#else +static inline void i915_capture_error_state(struct drm_i915_private *dev_priv, + u32 engine_mask, + const char *error_msg) +{ +} +static inline void i915_destroy_error_state(struct drm_device *dev) +{ +} +#endif void i915_get_extra_instdone(struct drm_i915_private *dev_priv, uint32_t *instdone); const char 
*i915_cache_level_str(struct drm_i915_private *i915, int type); diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 0c3f30ce85c3..0bbc22f9a705 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -30,6 +30,8 @@ #include #include "i915_drv.h" +#ifdef CONFIG_DRM_I915_CAPTURE_ERROR + static const char *engine_str(int engine) { switch (engine) { @@ -1419,6 +1421,9 @@ void i915_capture_error_state(struct drm_i915_private *dev_priv, struct drm_i915_error_state *error; unsigned long flags; + if (!i915.error_capture) + return; + if (READ_ONCE(dev_priv->gpu_error.first_error)) return; @@ -1504,6 +1509,8 @@ void i915_destroy_error_state(struct drm_device *dev) kref_put(&error->ref, i915_error_state_free); } +#endif + const char *i915_cache_level_str(struct drm_i915_private *i915, int type) { switch (type) { diff --git a/drivers/g
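The i915_params.c and i915_params.h hunks are cut off above; only the `if (!i915.error_capture)` check in the capture path is visible. A hypothetical sketch of the usual i915 module-parameter pattern that check implies — the permission bits and description text here are assumptions, not the posted hunk:

/* Hypothetical sketch: module parameter backing i915.error_capture. The
 * real hunk is truncated here, so treat the permissions and wording as
 * assumptions rather than the posted change.
 */
module_param_named(error_capture, i915.error_capture, bool, 0600);
MODULE_PARM_DESC(error_capture,
		 "Record the GPU state following a GPU hang for later analysis "
		 "(default: true, when built with DRM_I915_CAPTURE_ERROR)");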
[Intel-gfx] Compress the GPU error state
After adjusting how we track the data so that we capture exactly what we use via the VMA, and then adjusting the way we inspect the VMA, we can finally compress it. -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 4/5] drm/i915: Consolidate error object printing
Leave all the pretty printing to userspace and simplify the error capture to only have a single common object printer. It makes the kernel code more compact, and the refactoring allows us to apply more complex transformations like compressing the output. Signed-off-by: Chris Wilson Reviewed-by: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gpu_error.c | 100 +- 1 file changed, 25 insertions(+), 75 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 185adcff0f2d..ae0b98eee9ec 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -311,10 +311,22 @@ void i915_error_printf(struct drm_i915_error_state_buf *e, const char *f, ...) } static void print_error_obj(struct drm_i915_error_state_buf *m, + struct intel_engine_cs *engine, + const char *name, struct drm_i915_error_object *obj) { int page, offset, elt; + if (!obj) + return; + + if (name) { + err_printf(m, "%s --- %s = 0x%08x %08x\n", + engine ? engine->name : "global", name, + upper_32_bits(obj->gtt_offset), + lower_32_bits(obj->gtt_offset)); + } + for (page = offset = 0; page < obj->page_count; page++) { for (elt = 0; elt < PAGE_SIZE/4; elt++) { err_printf(m, "%08x : %08x\n", offset, @@ -341,8 +353,8 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, struct drm_i915_private *dev_priv = to_i915(dev); struct drm_i915_error_state *error = error_priv->error; struct drm_i915_error_object *obj; - int i, j, offset, elt; int max_hangcheck_score; + int i, j; if (!error) { err_printf(m, "no error state collected\n"); @@ -466,15 +478,7 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, err_printf(m, " --- gtt_offset = 0x%08x %08x\n", upper_32_bits(obj->gtt_offset), lower_32_bits(obj->gtt_offset)); - print_error_obj(m, obj); - } - - obj = ee->wa_batchbuffer; - if (obj) { - err_printf(m, "%s (w/a) --- gtt_offset = 0x%08x\n", - dev_priv->engine[i].name, - lower_32_bits(obj->gtt_offset)); - print_error_obj(m, obj); + print_error_obj(m, &dev_priv->engine[i], NULL, obj); } if (ee->num_requests) { @@ -503,77 +507,23 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m, } } - if ((obj = ee->ringbuffer)) { - err_printf(m, "%s --- ringbuffer = 0x%08x\n", - dev_priv->engine[i].name, - lower_32_bits(obj->gtt_offset)); - print_error_obj(m, obj); - } + print_error_obj(m, &dev_priv->engine[i], + "ringbuffer", ee->ringbuffer); - if ((obj = ee->hws_page)) { - u64 hws_offset = obj->gtt_offset; - u32 *hws_page = &obj->pages[0][0]; + print_error_obj(m, &dev_priv->engine[i], + "HW Status", ee->hws_page); - if (i915.enable_execlists) { - hws_offset += LRC_PPHWSP_PN * PAGE_SIZE; - hws_page = &obj->pages[LRC_PPHWSP_PN][0]; - } - err_printf(m, "%s --- HW Status = 0x%08llx\n", - dev_priv->engine[i].name, hws_offset); - offset = 0; - for (elt = 0; elt < PAGE_SIZE/16; elt += 4) { - err_printf(m, "[%04x] %08x %08x %08x %08x\n", - offset, - hws_page[elt], - hws_page[elt+1], - hws_page[elt+2], - hws_page[elt+3]); - offset += 16; - } - } + print_error_obj(m, &dev_priv->engine[i], + "HW context", ee->ctx); - obj = ee->wa_ctx; - if (obj) { - u64 wa_ctx_offset = obj->gtt_offset; - u32 *wa_ctx_page = &obj->pages[0][0]; - struct intel_engine_cs *engine = &dev_priv->engine[RCS]; - u32 wa_ctx_size
[Intel-gfx] [PATCH 3/5] drm/i915: Always use the GTT for error capture
Since the GTT provides universal access to any GPU page, we can use it to reduce our plethora of read methods to just one. It also has the important characteristic of being exactly what the GPU sees - if there are incoherency problems, seeing the batch as executed (rather than as trapped inside the cpu cache) is important. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_gtt.c | 43 drivers/gpu/drm/i915/i915_gem_gtt.h | 2 + drivers/gpu/drm/i915/i915_gpu_error.c | 122 -- 3 files changed, 75 insertions(+), 92 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 3631944ac2d9..cbeec4cfe8a4 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -2729,6 +2729,7 @@ int i915_gem_init_ggtt(struct drm_i915_private *dev_priv) */ struct i915_ggtt *ggtt = &dev_priv->ggtt; unsigned long hole_start, hole_end; + struct i915_hw_ppgtt *ppgtt; struct drm_mm_node *entry; int ret; @@ -2736,6 +2737,15 @@ int i915_gem_init_ggtt(struct drm_i915_private *dev_priv) if (ret) return ret; + /* Reserve a mappable slot for our lockless error capture */ + ret = drm_mm_insert_node_in_range_generic(&ggtt->base.mm, + &ggtt->gpu_error, + 4096, 0, -1, + 0, ggtt->mappable_end, + 0, 0); + if (ret) + return ret; + /* Clear any non-preallocated blocks */ drm_mm_for_each_hole(entry, &ggtt->base.mm, hole_start, hole_end) { DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n", @@ -2750,25 +2760,21 @@ int i915_gem_init_ggtt(struct drm_i915_private *dev_priv) true); if (USES_PPGTT(dev_priv) && !USES_FULL_PPGTT(dev_priv)) { - struct i915_hw_ppgtt *ppgtt; - ppgtt = kzalloc(sizeof(*ppgtt), GFP_KERNEL); - if (!ppgtt) - return -ENOMEM; + if (!ppgtt) { + ret = -ENOMEM; + goto err; + } ret = __hw_ppgtt_init(ppgtt, dev_priv); - if (ret) { - kfree(ppgtt); - return ret; - } + if (ret) + goto err_ppgtt; - if (ppgtt->base.allocate_va_range) + if (ppgtt->base.allocate_va_range) { ret = ppgtt->base.allocate_va_range(&ppgtt->base, 0, ppgtt->base.total); - if (ret) { - ppgtt->base.cleanup(&ppgtt->base); - kfree(ppgtt); - return ret; + if (ret) + goto err_ppgtt_cleanup; } ppgtt->base.clear_range(&ppgtt->base, @@ -2782,6 +2788,14 @@ int i915_gem_init_ggtt(struct drm_i915_private *dev_priv) } return 0; + +err_ppgtt_cleanup: + ppgtt->base.cleanup(&ppgtt->base); +err_ppgtt: + kfree(ppgtt); +err: + drm_mm_remove_node(&ggtt->gpu_error); + return ret; } /** @@ -2801,6 +2815,9 @@ void i915_ggtt_cleanup_hw(struct drm_i915_private *dev_priv) i915_gem_cleanup_stolen(&dev_priv->drm); + if (drm_mm_node_allocated(&ggtt->gpu_error)) + drm_mm_remove_node(&ggtt->gpu_error); + if (drm_mm_initialized(&ggtt->base.mm)) { intel_vgt_deballoon(dev_priv); diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index d6e4b6529196..79a08a050487 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -439,6 +439,8 @@ struct i915_ggtt { bool do_idle_maps; int mtrr; + + struct drm_mm_node gpu_error; }; struct i915_hw_ppgtt { diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 0815e5c47431..185adcff0f2d 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -628,7 +628,7 @@ static void i915_error_object_free(struct drm_i915_error_object *obj) return; for (page = 0; page < obj->page_count; page++) - kfree(obj->pages[page]); + free_page((unsigned long)obj->pages[page]); kfree(obj); } @@ -664,98 +664,69 @@ static void 
i915_error_state_free(struct kref *error_ref) kfree(error); } +static int compress_page(void *src, struct drm_i915_error_object *dst) +{ + unsigned long page; + + page = __get_free_page(GFP_ATOMIC | __GFP_NOWARN); + if (!page) + re
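The i915_gpu_error.c hunk above is truncated, so the new read path itself is cut off. As a rough sketch of the single GTT-based read the commit message describes — bind each page into the reserved ggtt->gpu_error slot and copy it back through the CPU's write-combining aperture mapping — the flow is roughly the following (the helper name and exact arguments are assumptions, not the posted code):

/* Illustrative sketch only: read one GPU page back via the reserved
 * GGTT PTE (ggtt->gpu_error) instead of the per-object read methods.
 */
static void copy_page_via_ggtt(struct i915_ggtt *ggtt,
			       dma_addr_t dma_addr, void *dst)
{
	void __iomem *s;

	/* Point the reserved PTE at the page we want to read */
	ggtt->base.insert_page(&ggtt->base, dma_addr,
			       ggtt->gpu_error.start, I915_CACHE_NONE, 0);

	/* Copy it through the WC mapping of the mappable aperture */
	s = io_mapping_map_atomic_wc(ggtt->mappable, ggtt->gpu_error.start);
	memcpy_fromio(dst, s, PAGE_SIZE);
	io_mapping_unmap_atomic(s);
}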
[Intel-gfx] [PATCH 5/5] drm/i915: Compress GPU objects in error state
Our error states are quickly growing, pinning kernel memory with them. The majority of the space is taken up by the error objects. These compress well using zlib and without decode are mostly meaningless, so encoding them does not hinder quickly parsing the error state for familiarity. v2: Make the zlib dependency optional Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/Kconfig | 12 +++ drivers/gpu/drm/i915/i915_drv.h | 3 +- drivers/gpu/drm/i915/i915_gpu_error.c | 170 +- 3 files changed, 163 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 0f46a9c04c0e..69657629d750 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -57,6 +57,18 @@ config DRM_I915_CAPTURE_ERROR If in doubt, say "Y". +config DRM_I915_COMPRESS_ERROR + bool "Compress GPU error state" + depends on DRM_I915 + select ZLIB_DEFLATE + default y + help + This option selects ZLIB_DEFLATE if it isn't already + selected and causes any error state captured upon a GPU hang + to be compressed using zlib. + + If in doubt, say "Y". + config DRM_I915_USERPTR bool "Always enable userptr support" depends on DRM_I915 diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 52facd4a7179..6bb39301999e 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -776,9 +776,10 @@ struct drm_i915_error_state { u32 semaphore_mboxes[I915_NUM_ENGINES - 1]; struct drm_i915_error_object { - int page_count; u64 gtt_offset; u64 gtt_size; + int page_count; + int unused; u32 *pages[0]; } *ringbuffer, *batchbuffer, *wa_batchbuffer, *ctx, *hws_page; diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index ae0b98eee9ec..404ae3356beb 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -29,6 +29,7 @@ #include #include +#include #include "i915_drv.h" #ifdef CONFIG_DRM_I915_CAPTURE_ERROR @@ -175,6 +176,110 @@ static void i915_error_puts(struct drm_i915_error_state_buf *e, #define err_printf(e, ...) i915_error_printf(e, __VA_ARGS__) #define err_puts(e, s) i915_error_puts(e, s) +#ifdef CONFIG_DRM_I915_COMPRESS_ERROR + +static bool compress_init(struct z_stream_s *zstream) +{ + memset(zstream, 0, sizeof(*zstream)); + + zstream->workspace = + kmalloc(zlib_deflate_workspacesize(MAX_WBITS, MAX_MEM_LEVEL), + GFP_ATOMIC | __GFP_NOWARN); + if (!zstream->workspace) + return NULL; + + if (zlib_deflateInit(zstream, Z_DEFAULT_COMPRESSION) != Z_OK) { + kfree(zstream->workspace); + return false; + } + + return true; +} + +static int compress_page(struct z_stream_s *zstream, +void *src, +struct drm_i915_error_object *dst) +{ + zstream->next_in = src; + zstream->avail_in = PAGE_SIZE; + + do { + if (zstream->avail_out == 0) { + unsigned long page; + + page = __get_free_page(GFP_ATOMIC | __GFP_NOWARN); + if (!page) + return -ENOMEM; + + dst->pages[dst->page_count++] = (void *)page; + + zstream->next_out = (void *)page; + zstream->avail_out = PAGE_SIZE; + } + + if (zlib_deflate(zstream, Z_SYNC_FLUSH) != Z_OK) + return -EIO; + + /* Fallback to uncompressed if we increase size? 
*/ + if (0 && zstream->total_out > zstream->total_in) + return -E2BIG; + } while (zstream->avail_in); + + return 0; +} + +static void compress_fini(struct z_stream_s *zstream, + struct drm_i915_error_object *dst) +{ + if (dst) { + zlib_deflate(zstream, Z_FINISH); + dst->unused = zstream->avail_out; + } + + zlib_deflateEnd(zstream); + kfree(zstream->workspace); +} + +static void err_compression_marker(struct drm_i915_error_state_buf *m) +{ + err_puts(m, ":"); +} + +#else + +static bool compress_init(struct z_stream_s *zstream) +{ + return true; +} + +static int compress_page(struct z_stream_s *zstream, +void *src, +struct drm_i915_error_object *dst) +{ + unsigned long page; + + page = __get_free_page(GFP_ATOMIC | __GFP_NOWARN); + if (!page) + return -ENOMEM; + + dst->pages[dst->page_count++] = + memcpy((vo
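Since each page is pushed through zlib_deflate() with Z_SYNC_FLUSH and the stream is closed with Z_FINISH, decoding on the userspace side is a single inflate loop over the whole object. A minimal illustrative sketch (not part of the patch; it assumes the raw deflate bytes have already been extracted from the error-state text, and keeps error handling minimal for brevity):

/* Userspace sketch: inflate one compressed error object. */
#include <stdlib.h>
#include <zlib.h>

static unsigned char *decode_error_object(const unsigned char *src,
					  size_t len, size_t *out_len)
{
	size_t cap = 4 * len + 4096;
	unsigned char *dst = malloc(cap);
	z_stream zs = {0};
	int ret;

	if (!dst || inflateInit(&zs) != Z_OK) {
		free(dst);
		return NULL;
	}

	zs.next_in = (Bytef *)src;
	zs.avail_in = len;
	do {
		if (zs.total_out == cap) {	/* grow the output buffer */
			cap *= 2;
			dst = realloc(dst, cap);
			if (!dst)
				break;
		}
		zs.next_out = dst + zs.total_out;
		zs.avail_out = cap - zs.total_out;
		ret = inflate(&zs, Z_SYNC_FLUSH);
	} while (ret == Z_OK);		/* Z_STREAM_END ends the loop */

	inflateEnd(&zs);
	*out_len = zs.total_out;
	return dst;
}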
[Intel-gfx] [PATCH RFC 2/4] drm/i915: IOMMU based SVM implementation v13
From: Jesse Barnes Use David's new IOMMU layer functions for supporting SVM in i915. TODO: error record collection for failing SVM contexts callback handling for fatal faults scheduling v2: integrate David's core IOMMU support make sure we don't clobber the PASID in the context reg state v3: fixup for intel-svm.h changes (David) v4: use fault & halt for now (Jesse) fix ring free in error path on context alloc (Julia) v5: update with new callback struct (Jesse) v6: fix init svm check per new IOMMU code (Jesse) v7: drop debug code and obsolete i915_svm.c file (Jesse) v8: fix !CONFIG_INTEL_IOMMU_SVM init stub (Jesse) v9: update to new execlist and reg handling bits (Jesse) context teardown fix (lrc deferred alloc vs teardown race?) (Jesse) check for SVM availability at context create (Jesse) v10: intel_context_svm_init/fini & rebase v11: move context specific stuff to i915_gem_context v12: move addressing to context descriptor v13: strip out workqueue and mm notifiers Cc: Daniel Vetter Cc: Chris Wilson Cc: Joonas Lahtinen Cc: David Woodhouse Signed-off-by: David Woodhouse (v3) Signed-off-by: Jesse Barnes (v9) Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/i915/i915_drv.h | 32 ++ drivers/gpu/drm/i915/i915_gem.c | 7 +++ drivers/gpu/drm/i915/i915_gem_context.c | 104 +--- drivers/gpu/drm/i915/i915_reg.h | 18 ++ drivers/gpu/drm/i915/intel_lrc.c| 39 +--- 5 files changed, 167 insertions(+), 33 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 598e078418e3..64f3f0f18509 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -39,6 +39,7 @@ #include #include #include +#include #include #include #include @@ -866,6 +867,8 @@ struct i915_ctx_hang_stats { * @remap_slice: l3 row remapping information. * @flags: context specific flags: * CONTEXT_NO_ZEROMAP: do not allow mapping things to page 0. + * CONTEXT_NO_ERROR_CAPTURE: do not capture gpu state on hang. + * CONTEXT_SVM: context with 1:1 gpu vs cpu mapping of vm. * @file_priv: filp associated with this context (NULL for global default *context). 
* @hang_stats: information about the role of this context in possible GPU @@ -891,6 +894,8 @@ struct i915_gem_context { unsigned long flags; #define CONTEXT_NO_ZEROMAP BIT(0) #define CONTEXT_NO_ERROR_CAPTURE BIT(1) +#define CONTEXT_SVMBIT(2) + unsigned hw_id; u32 user_handle; @@ -909,6 +914,9 @@ struct i915_gem_context { struct atomic_notifier_head status_notifier; bool execlists_force_single_submission; + u32 pasid; /* svm, 20 bits */ + struct task_struct *task; + struct list_head link; u8 remap_slice; @@ -2001,6 +2009,8 @@ struct drm_i915_private { struct i915_runtime_pm pm; + bool svm_available; + /* Abstract the submission mechanism (legacy ringbuffer or execlists) away */ struct { void (*cleanup_engine)(struct intel_engine_cs *engine); @@ -3628,6 +3638,28 @@ extern void intel_set_memory_cxsr(struct drm_i915_private *dev_priv, int i915_reg_read_ioctl(struct drm_device *dev, void *data, struct drm_file *file); +/* svm */ +#ifdef CONFIG_INTEL_IOMMU_SVM +static inline bool intel_init_svm(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = to_i915(dev); + + dev_priv->svm_available = USES_FULL_48BIT_PPGTT(dev_priv) && + intel_svm_available(&dev->pdev->dev); + + return dev_priv->svm_available; +} +#else +static inline bool intel_init_svm(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = to_i915(dev); + + dev_priv->svm_available = false; + + return dev_priv->svm_available; +} +#endif + /* overlay */ extern struct intel_overlay_error_state * intel_overlay_capture_error_state(struct drm_i915_private *dev_priv); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 7e08c774a1aa..45d67b54c018 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4304,6 +4304,13 @@ i915_gem_init_hw(struct drm_device *dev) } } + if (INTEL_GEN(dev) >= 8) { + if (intel_init_svm(dev)) + DRM_DEBUG_DRIVER("Initialized Intel SVM support\n"); + else + DRM_ERROR("Failed to enable Intel SVM support\n"); + } + i915_gem_init_swizzling(dev); /* diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 189a6c018b72..9ab6332f296b 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -134,6 +134,47 @@ static int get_context_size(struct drm_i915_private *dev_p
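The i915_gem_context.c hunk that actually attaches a context to the CPU page tables is truncated above. A hypothetical sketch of how that binding would look with the intel-svm.h API the patch builds on — only intel_svm_bind_mm()/intel_svm_unbind_mm() and i915_svm_ops come from the series; the helper name and error handling here are assumptions:

/* Hypothetical sketch: bind the calling process's address space to a PASID
 * for this context. The surrounding helper is an assumption, since the real
 * hunk is cut off above.
 */
static int i915_gem_context_enable_svm(struct drm_i915_private *i915,
				       struct i915_gem_context *ctx)
{
	int pasid, ret;

	if (!i915->svm_available)
		return -ENODEV;

	ret = intel_svm_bind_mm(&i915->drm.pdev->dev, &pasid, 0, &i915_svm_ops);
	if (ret)
		return ret;

	ctx->pasid = pasid;	/* 20-bit PASID, programmed into the context descriptor */
	ctx->flags |= CONTEXT_SVM;
	return 0;
}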
[Intel-gfx] [PATCH RFC 0/4] svm support
Hi, Now that fences have been merged, I have reworked the series. It is now much smaller in size. Some items are still missing, such as error state recording, fault handling, documentation and in-fences. You can also find the most recent version here: https://cgit.freedesktop.org/~miku/drm-intel/log/?h=svm -Mika Jesse Barnes (4): drm/i915: add create_context2 ioctl drm/i915: IOMMU based SVM implementation v13 drm/i915: add SVM execbuf ioctl v10 drm/i915: Add param for SVM drivers/gpu/drm/i915/Kconfig | 1 + drivers/gpu/drm/i915/i915_drv.c| 5 + drivers/gpu/drm/i915/i915_drv.h| 37 +++ drivers/gpu/drm/i915/i915_gem.c| 7 ++ drivers/gpu/drm/i915/i915_gem_context.c| 172 + drivers/gpu/drm/i915/i915_gem_execbuffer.c | 157 ++ drivers/gpu/drm/i915/i915_reg.h| 18 +++ drivers/gpu/drm/i915/intel_lrc.c | 39 +++ include/uapi/drm/i915_drm.h| 55 + 9 files changed, 446 insertions(+), 45 deletions(-) -- 2.7.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH RFC 1/4] drm/i915: add create_context2 ioctl
From: Jesse Barnes Add i915_gem_context_create2_ioctl for passing flags (e.g. SVM) when creating a context. v2: check the pad on create_context v3: rebase v4: i915_dma is no more. create_gvt needs flags Cc: Daniel Vetter Cc: Chris Wilson Cc: Joonas Lahtinen Signed-off-by: Jesse Barnes (v1) Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/i915/i915_drv.c | 1 + drivers/gpu/drm/i915/i915_drv.h | 2 + drivers/gpu/drm/i915/i915_gem_context.c | 70 +++-- include/uapi/drm/i915_drm.h | 18 + 4 files changed, 78 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 13ae340ef1f3..9fb6de90eac0 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -2566,6 +2566,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = { DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, i915_gem_context_setparam_ioctl, DRM_RENDER_ALLOW), + DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_CREATE2, i915_gem_context_create2_ioctl, DRM_UNLOCKED), }; static struct drm_driver driver = { diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 35caa9b2f36a..598e078418e3 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3399,6 +3399,8 @@ static inline bool i915_gem_context_is_default(const struct i915_gem_context *c) int i915_gem_context_create_ioctl(struct drm_device *dev, void *data, struct drm_file *file); +int i915_gem_context_create2_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data, struct drm_file *file); int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 35950ee46a1d..189a6c018b72 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -341,17 +341,21 @@ err_out: */ static struct i915_gem_context * i915_gem_create_context(struct drm_device *dev, - struct drm_i915_file_private *file_priv) + struct drm_i915_file_private *file_priv, u32 flags) { struct i915_gem_context *ctx; + bool create_vm = false; lockdep_assert_held(&dev->struct_mutex); + if (flags & (I915_GEM_CONTEXT_FULL_PPGTT | I915_GEM_CONTEXT_ENABLE_SVM)) + create_vm = true; + ctx = __create_hw_context(dev, file_priv); if (IS_ERR(ctx)) return ctx; - if (USES_FULL_PPGTT(dev)) { + if (create_vm) { struct i915_hw_ppgtt *ppgtt = i915_ppgtt_create(to_i915(dev), file_priv); @@ -394,7 +398,8 @@ i915_gem_context_create_gvt(struct drm_device *dev) if (ret) return ERR_PTR(ret); - ctx = i915_gem_create_context(dev, NULL); + ctx = i915_gem_create_context(dev, NULL, USES_FULL_PPGTT(dev) ? + I915_GEM_CONTEXT_FULL_PPGTT : 0); if (IS_ERR(ctx)) goto out; @@ -440,6 +445,7 @@ int i915_gem_context_init(struct drm_device *dev) { struct drm_i915_private *dev_priv = to_i915(dev); struct i915_gem_context *ctx; + u32 flags = 0; /* Init should only be called once per module load. Eventually the * restriction on the context_disabled check can be loosened. 
*/ @@ -472,7 +478,10 @@ int i915_gem_context_init(struct drm_device *dev) } } - ctx = i915_gem_create_context(dev, NULL); + if (USES_FULL_PPGTT(dev)) + flags |= I915_GEM_CONTEXT_FULL_PPGTT; + + ctx = i915_gem_create_context(dev, NULL, flags); if (IS_ERR(ctx)) { DRM_ERROR("Failed to create default global context (error %ld)\n", PTR_ERR(ctx)); @@ -552,7 +561,8 @@ int i915_gem_context_open(struct drm_device *dev, struct drm_file *file) idr_init(&file_priv->context_idr); mutex_lock(&dev->struct_mutex); - ctx = i915_gem_create_context(dev, file_priv); + ctx = i915_gem_create_context(dev, file_priv, USES_FULL_PPGTT(dev) ? + I915_GEM_CONTEXT_FULL_PPGTT : 0); mutex_unlock(&dev->struct_mutex); if (IS_ERR(ctx)) { @@ -974,32 +984,66 @@ static bool contexts_enabled(struct drm_device *dev) return i915.enable_execlists || to_i915(dev)->hw_context_size; } -int i915_gem_context_create_ioctl(struct drm_device *dev, void *data, - struct drm_f
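The uapi hunk above is cut off before the new ioctl definition, so purely as a hypothetical userspace sketch — the DRM_IOCTL_I915_GEM_CONTEXT_CREATE2 macro and the ctx_id field are assumptions inferred from the existing create ioctl; only the flag names and struct name appear in the visible part of the patch:

/* Hypothetical userspace usage of the new ioctl; macro name and the
 * ctx_id field are assumptions, mirroring the original create ioctl. */
static int create_svm_context(int fd, __u32 *ctx_id)
{
	struct drm_i915_gem_context_create2 arg = {
		.flags = I915_GEM_CONTEXT_ENABLE_SVM,
	};

	if (drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE2, &arg))
		return -errno;	/* e.g. SVM not supported on this device/kernel */

	*ctx_id = arg.ctx_id;
	return 0;
}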
[Intel-gfx] [PATCH RFC 4/4] drm/i915: Add param for SVM
From: Jesse Barnes Add possibility to query if svm is available. v2: moved into i915_drv.c Cc: Daniel Vetter Cc: Chris Wilson Cc: Joonas Lahtinen Signed-off-by: Jesse Barnes (v1) Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/i915/i915_drv.c | 3 +++ include/uapi/drm/i915_drm.h | 1 + 2 files changed, 4 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index a07918d821e4..6d9c84253412 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -354,6 +354,9 @@ static int i915_getparam(struct drm_device *dev, void *data, case I915_PARAM_MIN_EU_IN_POOL: value = INTEL_INFO(dev)->min_eu_in_pool; break; + case I915_PARAM_HAS_SVM: + value = dev_priv->svm_available; + break; default: DRM_DEBUG("Unknown parameter %d\n", param->param); return -EINVAL; diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 8d567744f221..c21ba4b769c4 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -391,6 +391,7 @@ typedef struct drm_i915_irq_wait { #define I915_PARAM_HAS_EXEC_SOFTPIN 37 #define I915_PARAM_HAS_POOLED_EU38 #define I915_PARAM_MIN_EU_IN_POOL 39 +#define I915_PARAM_HAS_SVM 40 typedef struct drm_i915_getparam { __s32 param; -- 2.7.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
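A short userspace sketch of the query this adds — DRM_IOCTL_I915_GETPARAM and struct drm_i915_getparam are the existing uapi; only the I915_PARAM_HAS_SVM value comes from this patch:

/* Userspace sketch: probe for SVM support before trying to create an
 * SVM context. */
static bool has_svm(int fd)
{
	int value = 0;
	struct drm_i915_getparam gp = {
		.param = I915_PARAM_HAS_SVM,
		.value = &value,
	};

	if (drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
		return false;	/* older kernel: parameter unknown */

	return value != 0;
}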
[Intel-gfx] [PATCH RFC 3/4] drm/i915: add SVM execbuf ioctl v10
From: Jesse Barnes We just need to pass in an address to execute and some flags, since we don't have to worry about buffer relocation or any of the other usual stuff. Returns a fence to be used for synchronization. v2: add a request after batch submission (Jesse) v3: add a flag for fence creation (Chris) v4: add CLOEXEC flag (Kristian) add non-RCS ring support (Jesse) v5: update for request alloc change (Jesse) v6: new sync file interface, error paths, request breadcrumbs v7: always CLOEXEC for sync_file_install v8: rebase on new sync file api v9: rework on top of fence requests and sync_file v10: take fence ref for sync_file (Chris) use correct flush (Chris) limit exec on rcs Cc: Daniel Vetter Cc: Chris Wilson Cc: Joonas Lahtinen Signed-off-by: Jesse Barnes (v5) Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/i915/Kconfig | 1 + drivers/gpu/drm/i915/i915_drv.c| 1 + drivers/gpu/drm/i915/i915_drv.h| 3 + drivers/gpu/drm/i915/i915_gem_execbuffer.c | 157 + include/uapi/drm/i915_drm.h| 36 +++ 5 files changed, 198 insertions(+) diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 7769e469118f..6503133c3f85 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -8,6 +8,7 @@ config DRM_I915 # the shmem_readpage() which depends upon tmpfs select SHMEM select TMPFS + select SYNC_FILE select DRM_KMS_HELPER select DRM_PANEL select DRM_MIPI_DSI diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 9fb6de90eac0..a07918d821e4 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -2567,6 +2567,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = { DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, i915_gem_context_setparam_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_CREATE2, i915_gem_context_create2_ioctl, DRM_UNLOCKED), + DRM_IOCTL_DEF_DRV(I915_EXEC_MM, intel_exec_mm_ioctl, DRM_UNLOCKED), }; static struct drm_driver driver = { diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 64f3f0f18509..884d9844863c 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3660,6 +3660,9 @@ static inline bool intel_init_svm(struct drm_device *dev) } #endif +extern int intel_exec_mm_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); + /* overlay */ extern struct intel_overlay_error_state * intel_overlay_capture_error_state(struct drm_i915_private *dev_priv); diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 699315304748..c1ba6da1fd33 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include @@ -1911,3 +1912,159 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data, drm_free_large(exec2_list); return ret; } + +static struct intel_engine_cs * +exec_mm_select_engine(struct drm_i915_private *dev_priv, + struct drm_i915_exec_mm *exec_mm) +{ + unsigned int user_ring_id = exec_mm->ring_id & I915_EXEC_RING_MASK; + struct intel_engine_cs *e; + + if (user_ring_id > I915_USER_RINGS) { + DRM_DEBUG("exec_mm with unknown ring: %u\n", user_ring_id); + return NULL; + } + + e = &dev_priv->engine[user_ring_map[user_ring_id]]; + + if (!intel_engine_initialized(e)) { + DRM_DEBUG("exec_mm with invalid ring: %u\n", user_ring_id); + return NULL; + } + + return e; 
+} + +static int do_exec_mm(struct drm_i915_exec_mm *exec_mm, + struct drm_i915_gem_request *req, + const u32 flags) +{ + const bool create_fence = flags & I915_EXEC_MM_FENCE; + struct sync_file *out_fence; + int ret; + + if (create_fence) { + out_fence = sync_file_create(fence_get(&req->fence)); + if (!out_fence) { + DRM_DEBUG("sync file creation failed\n"); + return ret; + } + + exec_mm->fence = get_unused_fd_flags(O_CLOEXEC); + fd_install(exec_mm->fence, out_fence->file); + } + + ret = req->engine->emit_flush(req, EMIT_INVALIDATE); + if (ret) { + DRM_DEBUG_DRIVER("engine flush failed: %d\n", ret); + goto fput; + } + + ret = req->engine->emit_bb_start(req, exec_mm->batch_ptr, 0, 0); + if (ret) { + DRM_DEBUG_DRIVER("engine dispatch execbuf failed: %d\n", ret); +
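The uapi hunk defining struct drm_i915_exec_mm is truncated above; only the batch_ptr, ring_id, flags and fence fields are visible in the kernel code. A hypothetical userspace sketch built on those names — the ioctl macro, the ctx_id field and the exact field types are assumptions:

/* Hypothetical sketch: submit a batch by CPU pointer on an SVM context and
 * wait for the returned sync_file fd. Field layout beyond what is visible
 * in the kernel code above is assumed.
 */
static int exec_svm_batch(int fd, __u32 ctx_id, const void *batch)
{
	struct drm_i915_exec_mm exec = {
		.batch_ptr = (__u64)(uintptr_t)batch,	/* CPU pointer, no relocations */
		.ctx_id = ctx_id,			/* assumed field */
		.ring_id = I915_EXEC_RENDER,		/* v10 limits execution to RCS */
		.flags = I915_EXEC_MM_FENCE,		/* request an out-fence */
	};
	struct pollfd pfd;

	if (drmIoctl(fd, DRM_IOCTL_I915_EXEC_MM, &exec))
		return -errno;

	pfd.fd = exec.fence;	/* sync_file fd; poll it for completion */
	pfd.events = POLLIN;
	poll(&pfd, 1, -1);
	close(exec.fence);
	return 0;
}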
[Intel-gfx] [drm-intel:drm-intel-next-queued 7/33] drivers/gpu/drm/i915/i915_debugfs.c:392: error: 'mapped_count' may be used uninitialized in this function
tree: git://anongit.freedesktop.org/drm-intel drm-intel-next-queued head: 21a2c58a9c122151080ecbdddc115257cd7c30d8 commit: 2bd160a131ac617fc2441bfb4a02964c964a5da6 [7/33] drm/i915: Reduce i915_gem_objects to only show object information config: x86_64-randconfig-s2-08151903 (attached as .config) compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7 reproduce: git checkout 2bd160a131ac617fc2441bfb4a02964c964a5da6 # save the attached .config to linux build tree make ARCH=x86_64 Note: it may well be a FALSE warning. FWIW you are at least aware of it now. http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings All errors (new ones prefixed by >>): cc1: warnings being treated as errors drivers/gpu/drm/i915/i915_debugfs.c: In function 'i915_gem_object_info': >> drivers/gpu/drm/i915/i915_debugfs.c:392: error: 'mapped_count' may be used >> uninitialized in this function >> drivers/gpu/drm/i915/i915_debugfs.c:393: error: 'mapped_size' may be used >> uninitialized in this function vim +/mapped_count +392 drivers/gpu/drm/i915/i915_debugfs.c 386 static int i915_gem_object_info(struct seq_file *m, void* data) 387 { 388 struct drm_info_node *node = m->private; 389 struct drm_device *dev = node->minor->dev; 390 struct drm_i915_private *dev_priv = to_i915(dev); 391 struct i915_ggtt *ggtt = &dev_priv->ggtt; > 392 u32 count, mapped_count, purgeable_count, dpy_count; > 393 u64 size, mapped_size, purgeable_size, dpy_size; 394 struct drm_i915_gem_object *obj; 395 struct drm_file *file; 396 int ret; --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: Binary data ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/2] drm/i915: Use remap_io_mapping() to prefault all PTE in a single pass
Very old numbers indicate this is a 66% improvement when remapping the entire object for fence contention - due to the elimination of track_pfn_insert and its strcmp. Signed-off-by: Chris Wilson Testcase: igt/gem_fence_upload/performance Testcase: igt/gem_mmap_gtt --- drivers/gpu/drm/Makefile| 2 +- drivers/gpu/drm/i915/Makefile | 3 +- drivers/gpu/drm/i915/i915_drv.h | 5 +++ drivers/gpu/drm/i915/i915_gem.c | 50 drivers/gpu/drm/i915/i915_mm.c | 85 + 5 files changed, 100 insertions(+), 45 deletions(-) create mode 100644 drivers/gpu/drm/i915/i915_mm.c diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 0238bf8bc8c3..3ff094171ee5 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -46,7 +46,7 @@ obj-$(CONFIG_DRM_RADEON)+= radeon/ obj-$(CONFIG_DRM_AMDGPU)+= amd/amdgpu/ obj-$(CONFIG_DRM_MGA) += mga/ obj-$(CONFIG_DRM_I810) += i810/ -obj-$(CONFIG_DRM_I915) += i915/ +obj-$(CONFIG_DRM_I915) += i915/ obj-$(CONFIG_DRM_MGAG200) += mgag200/ obj-$(CONFIG_DRM_VC4) += vc4/ obj-$(CONFIG_DRM_CIRRUS_QEMU) += cirrus/ diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 3412413408c0..a7da24640e88 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -12,6 +12,7 @@ subdir-ccflags-y += \ i915-y := i915_drv.o \ i915_irq.o \ i915_memcpy.o \ + i915_mm.o \ i915_params.o \ i915_pci.o \ i915_suspend.o \ @@ -113,6 +114,6 @@ i915-y += intel_gvt.o include $(src)/gvt/Makefile endif -obj-$(CONFIG_DRM_I915) += i915.o +obj-$(CONFIG_DRM_I915) += i915.o CFLAGS_i915_trace_points.o := -I$(src) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 05efc0501f3c..0f25302fb517 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3936,6 +3936,11 @@ static inline bool __i915_request_irq_complete(struct drm_i915_gem_request *req) void i915_memcpy_init_early(struct drm_i915_private *dev_priv); bool i915_memcpy_from_wc(void *dst, const void *src, unsigned long len); +/* i915_mm.c */ +int remap_io_mapping(struct vm_area_struct *vma, +unsigned long addr, unsigned long pfn, unsigned long size, +struct io_mapping *iomap); + #define ptr_unpack_bits(ptr, bits) ({ \ unsigned long __v = (unsigned long)(ptr); \ (bits) = __v & ~PAGE_MASK; \ diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index f12114a35ae3..584144d5d8ea 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1698,7 +1698,6 @@ int i915_gem_fault(struct vm_area_struct *area, struct vm_fault *vmf) bool write = !!(vmf->flags & FAULT_FLAG_WRITE); struct i915_vma *vma; pgoff_t page_offset; - unsigned long pfn; unsigned int flags; int ret; @@ -1768,48 +1767,13 @@ int i915_gem_fault(struct vm_area_struct *area, struct vm_fault *vmf) goto err_unpin; /* Finally, remap it using the new GTT offset */ - pfn = ggtt->mappable_base + i915_ggtt_offset(vma); - pfn >>= PAGE_SHIFT; - - if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) { - if (!obj->fault_mappable) { - unsigned long size = - min_t(unsigned long, - area->vm_end - area->vm_start, - obj->base.size) >> PAGE_SHIFT; - unsigned long base = area->vm_start; - int i; - - for (i = 0; i < size; i++) { - ret = vm_insert_pfn(area, - base + i * PAGE_SIZE, - pfn + i); - if (ret) - break; - } - } else - ret = vm_insert_pfn(area, - (unsigned long)vmf->virtual_address, - pfn + page_offset); - } else { - /* Overriding existing pages in partial view does not cause -* us any trouble as TLBs are still valid because the fault -* is due to 
userspace losing part of the mapping or never -* having accessed it before (at this partials' range). -*/ - const struct i915_ggtt_view *view = &vma->ggtt_view; - unsigned long base = area->vm_start + - (view->params.partial.offset << PAGE_SHIFT); - unsigned int i; - - for (i = 0; i < view->params.partial.size; i++) { -
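The replacement fault path is cut off above; based on the remap_io_mapping() prototype added to i915_drv.h, the per-page vm_insert_pfn() loops collapse into a single call of roughly this shape (a sketch, not the posted hunk — the offset and size arithmetic are assumptions):

/* Sketch of the single-pass remap inside i915_gem_fault(); the real hunk
 * is truncated here, so treat the arithmetic as approximate. */
ret = remap_io_mapping(area, area->vm_start,
		       (ggtt->mappable_base + i915_ggtt_offset(vma)) >> PAGE_SHIFT,
		       min_t(u64, vma->size, area->vm_end - area->vm_start),
		       &ggtt->mappable);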
[Intel-gfx] [PATCH 1/2] io-mapping: Always create a struct to hold metadata about the io-mapping
Currently, we only allocate a structure to hold metadata if we need to allocate an ioremap for every access, such as on x86-32. However, it would be useful to store basic information about the io-mapping, such as its page protection, on all platforms. Signed-off-by: Chris Wilson Cc: linux...@kvack.org --- drivers/gpu/drm/i915/i915_gem.c| 6 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 +- drivers/gpu/drm/i915/i915_gem_gtt.c| 11 ++-- drivers/gpu/drm/i915/i915_gem_gtt.h| 2 +- drivers/gpu/drm/i915/i915_gpu_error.c | 2 +- drivers/gpu/drm/i915/intel_overlay.c | 4 +- include/linux/io-mapping.h | 92 ++ 7 files changed, 70 insertions(+), 49 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index f5a7c7ffb1a5..f12114a35ae3 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -888,7 +888,7 @@ i915_gem_gtt_pread(struct drm_device *dev, * and write to user memory which may result into page * faults, and so we cannot perform this under struct_mutex. */ - if (slow_user_access(ggtt->mappable, page_base, + if (slow_user_access(&ggtt->mappable, page_base, page_offset, user_data, page_length, false)) { ret = -EFAULT; @@ -1181,11 +1181,11 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915, * If the object is non-shmem backed, we retry again with the * path that handles page fault. */ - if (fast_user_write(ggtt->mappable, page_base, + if (fast_user_write(&ggtt->mappable, page_base, page_offset, user_data, page_length)) { hit_slow_path = true; mutex_unlock(&dev->struct_mutex); - if (slow_user_access(ggtt->mappable, + if (slow_user_access(&ggtt->mappable, page_base, page_offset, user_data, page_length, true)) { diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index c012a0d94878..e6f88f3194d6 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -470,7 +470,7 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj, offset += page << PAGE_SHIFT; } - vaddr = io_mapping_map_atomic_wc(cache->i915->ggtt.mappable, offset); + vaddr = io_mapping_map_atomic_wc(&cache->i915->ggtt.mappable, offset); cache->page = page; cache->vaddr = (unsigned long)vaddr; diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index d03f9180ce76..3a82c97d5d53 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -2808,7 +2808,6 @@ void i915_ggtt_cleanup_hw(struct drm_i915_private *dev_priv) if (dev_priv->mm.aliasing_ppgtt) { struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt; - ppgtt->base.cleanup(&ppgtt->base); kfree(ppgtt); } @@ -2828,7 +2827,7 @@ void i915_ggtt_cleanup_hw(struct drm_i915_private *dev_priv) ggtt->base.cleanup(&ggtt->base); arch_phys_wc_del(ggtt->mtrr); - io_mapping_free(ggtt->mappable); + io_mapping_fini(&ggtt->mappable); } static unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl) @@ -3226,9 +3225,9 @@ int i915_ggtt_init_hw(struct drm_i915_private *dev_priv) if (!HAS_LLC(dev_priv)) ggtt->base.mm.color_adjust = i915_gtt_color_adjust; - ggtt->mappable = - io_mapping_create_wc(ggtt->mappable_base, ggtt->mappable_end); - if (!ggtt->mappable) { + if (!io_mapping_init_wc(&dev_priv->ggtt.mappable, + dev_priv->ggtt.mappable_base, + dev_priv->ggtt.mappable_end)) { ret = -EIO; goto out_gtt_cleanup; } @@ -3698,7 +3697,7 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma) ptr = vma->iomap; if (ptr == NULL) { - ptr = 
io_mapping_map_wc(i915_vm_to_ggtt(vma->vm)->mappable, + ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->mappable, vma->node.start, vma->node.size); if (ptr == NULL) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index d2f79a1fb75f..f8d68d775896 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -438,13 +438,13 @@ stru
Re: [Intel-gfx] [PATCH RFC 1/4] drm/i915: add create_context2 ioctl
On Mon, Aug 15, 2016 at 02:48:04PM +0300, Mika Kuoppala wrote: > From: Jesse Barnes > > Add i915_gem_context_create2_ioctl for passing flags > (e.g. SVM) when creating a context. > > v2: check the pad on create_context > v3: rebase > v4: i915_dma is no more. create_gvt needs flags > > Cc: Daniel Vetter > Cc: Chris Wilson > Cc: Joonas Lahtinen > Signed-off-by: Jesse Barnes (v1) > Signed-off-by: Mika Kuoppala Considering we can use deferred ppgtt creation and have setparam, do we need a new create ioctl just to set a flag? -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/9] drm/i915/cmdparser: Make initialisation failure non-fatal
On 12 August 2016 at 16:07, Chris Wilson wrote: > If the developer adds a register in the wrong order, we BUG during boot. > That makes development and testing very difficult. Let's be a bit more > friendly and disable the command parser with a big warning if the tables > are invalid. > > Signed-off-by: Chris Wilson > --- > drivers/gpu/drm/i915/i915_cmd_parser.c | 30 ++ > drivers/gpu/drm/i915/i915_drv.h| 2 +- > drivers/gpu/drm/i915/intel_engine_cs.c | 6 -- > 3 files changed, 23 insertions(+), 15 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c > b/drivers/gpu/drm/i915/i915_cmd_parser.c > index a1f4683f5c35..1882dc28c750 100644 > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c > @@ -746,17 +746,15 @@ static void fini_hash_table(struct intel_engine_cs > *engine) > * Optionally initializes fields related to batch buffer command parsing in > the > * struct intel_engine_cs based on whether the platform requires software > * command parsing. > - * > - * Return: non-zero if initialization fails > */ > -int intel_engine_init_cmd_parser(struct intel_engine_cs *engine) > +void intel_engine_init_cmd_parser(struct intel_engine_cs *engine) > { > const struct drm_i915_cmd_table *cmd_tables; > int cmd_table_count; > int ret; > > if (!IS_GEN7(engine->i915)) > - return 0; > + return; > > switch (engine->id) { > case RCS: > @@ -811,24 +809,32 @@ int intel_engine_init_cmd_parser(struct intel_engine_cs > *engine) > break; > default: > MISSING_CASE(engine->id); > - BUG(); > + return; > } > > - BUG_ON(!validate_cmds_sorted(engine, cmd_tables, cmd_table_count)); > - BUG_ON(!validate_regs_sorted(engine)); > + if (!hash_empty(engine->cmd_hash)) { > + DRM_DEBUG_DRIVER("%s: no commands?\n", engine->name); > + return; > + } "no commands?", !hash_empty should mean we already have commands, not that we don't, right? With that explained or fixed: Reviewed-by: Matthew Auld ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
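For reference, the change the review is asking about amounts to dropping the negation, something along these lines (a sketch of the corrected check, not a posted patch):

/* Sketch of the check the reviewer expects: bail out, leaving the
 * cmdparser disabled, only when the table really ended up empty. */
if (hash_empty(engine->cmd_hash)) {
	DRM_DEBUG_DRIVER("%s: no commands?\n", engine->name);
	return;
}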
[Intel-gfx] ✗ Ro.CI.BAT: failure for series starting with [1/5] drm/i915: Allow disabling error capture
== Series Details == Series: series starting with [1/5] drm/i915: Allow disabling error capture URL : https://patchwork.freedesktop.org/series/11096/ State : failure == Summary == Series 11096v1 Series without cover letter http://patchwork.freedesktop.org/api/1.0/series/11096/revisions/1/mbox Test drv_module_reload_basic: pass -> SKIP (ro-ivb-i7-3770) Test kms_cursor_legacy: Subgroup basic-cursor-vs-flip-varying-size: fail -> PASS (ro-ilk1-i5-650) Subgroup basic-flip-vs-cursor-legacy: fail -> PASS (ro-ivb2-i7-3770) fail -> PASS (ro-byt-n2820) pass -> FAIL (ro-bdw-i5-5250u) Test kms_pipe_crc_basic: Subgroup suspend-read-crc-pipe-a: pass -> INCOMPLETE (fi-hsw-i7-4770k) fi-hsw-i7-4770k total:201 pass:180 dwarn:0 dfail:0 fail:0 skip:20 fi-kbl-qkkr total:244 pass:185 dwarn:29 dfail:0 fail:3 skip:27 fi-skl-i7-6700k total:244 pass:208 dwarn:4 dfail:2 fail:2 skip:28 fi-snb-i7-2600 total:244 pass:202 dwarn:0 dfail:0 fail:0 skip:42 ro-bdw-i5-5250u total:240 pass:218 dwarn:1 dfail:0 fail:2 skip:19 ro-bdw-i7-5600u total:240 pass:207 dwarn:0 dfail:0 fail:1 skip:32 ro-bsw-n3050 total:240 pass:194 dwarn:0 dfail:0 fail:4 skip:42 ro-byt-n2820 total:240 pass:198 dwarn:0 dfail:0 fail:2 skip:40 ro-hsw-i3-4010u total:240 pass:214 dwarn:0 dfail:0 fail:0 skip:26 ro-hsw-i7-4770r total:240 pass:185 dwarn:0 dfail:0 fail:0 skip:55 ro-ilk1-i5-650 total:235 pass:174 dwarn:0 dfail:0 fail:1 skip:60 ro-ivb-i7-3770 total:240 pass:204 dwarn:0 dfail:0 fail:0 skip:36 ro-ivb2-i7-3770 total:240 pass:209 dwarn:0 dfail:0 fail:0 skip:31 ro-skl3-i5-6260u total:240 pass:223 dwarn:0 dfail:0 fail:3 skip:14 Results at /archive/results/CI_IGT_test/RO_Patchwork_1869/ e56d79f drm-intel-nightly: 2016y-08m-15d-10h-16m-44s UTC integration manifest b8c5ad5 drm/i915: Compress GPU objects in error state 7d5601b drm/i915: Consolidate error object printing 716ad20 drm/i915: Always use the GTT for error capture 5f76742 drm/i915: Stop the machine whilst capturing the GPU crash dump 5476ea87 drm/i915: Allow disabling error capture ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH RFC 1/4] drm/i915: add create_context2 ioctl
On ma, 2016-08-15 at 14:48 +0300, Mika Kuoppala wrote: > @@ -2566,6 +2566,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = { > DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, > DRM_RENDER_ALLOW), > DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, > i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW), > DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, > i915_gem_context_setparam_ioctl, DRM_RENDER_ALLOW), > + DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_CREATE2, > i915_gem_context_create2_ioctl, DRM_UNLOCKED), Why DRM_UNLOCKED? > @@ -394,7 +398,8 @@ i915_gem_context_create_gvt(struct drm_device *dev) > if (ret) > return ERR_PTR(ret); > > - ctx = i915_gem_create_context(dev, NULL); > + ctx = i915_gem_create_context(dev, NULL, USES_FULL_PPGTT(dev) ? > + I915_GEM_CONTEXT_FULL_PPGTT : 0); Could use flags variable here just like below this point in code. > @@ -552,7 +561,8 @@ int i915_gem_context_open(struct drm_device *dev, struct > drm_file *file) > idr_init(&file_priv->context_idr); > > mutex_lock(&dev->struct_mutex); > - ctx = i915_gem_create_context(dev, file_priv); > + ctx = i915_gem_create_context(dev, file_priv, USES_FULL_PPGTT(dev) ? > + I915_GEM_CONTEXT_FULL_PPGTT : 0); Ditto. > +int i915_gem_context_create_ioctl(struct drm_device *dev, void *data, > + struct drm_file *file) > +{ > + struct drm_i915_gem_context_create *args = data; > + struct drm_i915_gem_context_create2 tmp; 'args2' just as we have create2? > @@ -1142,6 +1144,22 @@ struct drm_i915_gem_context_create { > __u32 pad; > }; > > +/* > + * SVM handling > + * > + * A context can opt in to SVM support (thereby using its CPU page tables > + * when accessing data from the GPU) by using the %I915_ENABLE_SVM flag s/I915_ENABLE_SVM/I915_GEM_CONTEXT_ENABLE_SVM/ ? > + * and passing an existing context id. This is a one way transition; SVM > + * contexts can not be downgraded into PPGTT contexts once converted. > + */ > +#define I915_GEM_CONTEXT_ENABLE_SVM (1<<0) > +#define I915_GEM_CONTEXT_FULL_PPGTT (1<<1) BIT() With the above addressed; Reviewed-by: Joonas Lahtinen Regards, Joonas -- Joonas Lahtinen Open Source Technology Center Intel Corporation ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH RFC 2/4] drm/i915: IOMMU based SVM implementation v13
On Mon, Aug 15, 2016 at 02:48:05PM +0300, Mika Kuoppala wrote: > @@ -891,6 +894,8 @@ struct i915_gem_context { > unsigned long flags; > #define CONTEXT_NO_ZEROMAP BIT(0) > #define CONTEXT_NO_ERROR_CAPTURE BIT(1) > +#define CONTEXT_SVM BIT(2) > + > unsigned hw_id; > u32 user_handle; > > @@ -909,6 +914,9 @@ struct i915_gem_context { > struct atomic_notifier_head status_notifier; > bool execlists_force_single_submission; > > + u32 pasid; /* svm, 20 bits */ Doesn't this conflict with hw_id for execlists. > + struct task_struct *task; We don't need the task, we need the mm. Holding the task is not sufficient. > struct list_head link; > > u8 remap_slice; > @@ -2001,6 +2009,8 @@ struct drm_i915_private { > > struct i915_runtime_pm pm; > > + bool svm_available; No better home / community? > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 7e08c774a1aa..45d67b54c018 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -4304,6 +4304,13 @@ i915_gem_init_hw(struct drm_device *dev) > } > } > > + if (INTEL_GEN(dev) >= 8) { > + if (intel_init_svm(dev)) init_hw ? This looks more like one off early driver init. > + DRM_DEBUG_DRIVER("Initialized Intel SVM support\n"); > + else > + DRM_ERROR("Failed to enable Intel SVM support\n"); > + } > + -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH RFC 2/4] drm/i915: IOMMU based SVM implementation v13
On Mon, 2016-08-15 at 14:48 +0300, Mika Kuoppala wrote: > > > +static void i915_svm_fault_cb(struct device *dev, int pasid, u64 addr, > + u32 private, int rwxp, int response) > +{ > +} > + > +static struct svm_dev_ops i915_svm_ops = { > + .fault_cb = i915_svm_fault_cb, > +}; > + I'd prefer that you don't hook this up unless you need it. I'd also prefer that you don't need it. If you need it, nail a hardware designer to a tree before you hook it up. -- dwmw2 smime.p7s Description: S/MIME cryptographic signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH RFC 3/4] drm/i915: add SVM execbuf ioctl v10
On Mon, Aug 15, 2016 at 02:48:06PM +0300, Mika Kuoppala wrote: > From: Jesse Barnes > > We just need to pass in an address to execute and some flags, since we > don't have to worry about buffer relocation or any of the other usual > stuff. Returns a fence to be used for synchronization. > > v2: add a request after batch submission (Jesse) > v3: add a flag for fence creation (Chris) > v4: add CLOEXEC flag (Kristian) > add non-RCS ring support (Jesse) > v5: update for request alloc change (Jesse) > v6: new sync file interface, error paths, request breadcrumbs > v7: always CLOEXEC for sync_file_install > v8: rebase on new sync file api > v9: rework on top of fence requests and sync_file > v10: take fence ref for sync_file (Chris) > use correct flush (Chris) > limit exec on rcs This is incomplete, so just proof of principle? -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH RFC 4/4] drm/i915: Add param for SVM
On Mon, Aug 15, 2016 at 02:48:07PM +0300, Mika Kuoppala wrote: > From: Jesse Barnes > > Add possibility to query if svm is available. When we try to enable SVM on the context we get an error. We have to do that first anyway, so that seems like a good spot for userspace to catch all issues. What usecase do you have that doesn't involve creating an SVM context? -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH RFC 2/4] drm/i915: IOMMU based SVM implementation v13
On Mon, 2016-08-15 at 13:05 +0100, Chris Wilson wrote: > On Mon, Aug 15, 2016 at 02:48:05PM +0300, Mika Kuoppala wrote: > > > > + struct task_struct *task; > > We don't need the task, we need the mm. > > Holding the task is not sufficient. From the pure DMA point of view, you don't need the MM at all. I handle all that from the IOMMU side so it's none of your business, darling. However, if you want to relate a given context to the specific thread which started it, perhaps to deliver signals or whatever else, then perhaps you do want the task not the MM. > > > --- a/drivers/gpu/drm/i915/i915_gem.c > > +++ b/drivers/gpu/drm/i915/i915_gem.c > > @@ -4304,6 +4304,13 @@ i915_gem_init_hw(struct drm_device *dev) > > } > > } > > > > + if (INTEL_GEN(dev) >= 8) { > > + if (intel_init_svm(dev)) > > init_hw ? > > This looks more like one off early driver init. It's a per-device thing. You might support SVM on one device but not another, depending on how the IOMMU is configured. Note the 'dev' argument in the call to intel_init_svm(). -- dwmw2 smime.p7s Description: S/MIME cryptographic signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Initialise mmaped_count for i915_gem_object_info
Reported-by: 0day kbuild test robot Fixes: 2bd160a131ac ("drm/i915: Reduce i915_gem_objects to only show...") Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index f612d3f18c69..81fabc36ce5a 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -403,7 +403,9 @@ static int i915_gem_object_info(struct seq_file *m, void* data) dev_priv->mm.object_count, dev_priv->mm.object_memory); - size = count = purgeable_size = purgeable_count = 0; + size = count = 0; + mapped_size = mapped_count = 0; + purgeable_size = purgeable_count = 0; list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) { size += obj->base.size; ++count; -- 2.8.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH RFC 4/4] drm/i915: Add param for SVM
Chris Wilson writes: > On Mon, Aug 15, 2016 at 02:48:07PM +0300, Mika Kuoppala wrote: >> From: Jesse Barnes >> >> Add possibility to query if svm is available. > > When we try to enable SVM on the context we get an error. We have to do > that first anyway, so that seems like a good spot for userspace to catch > all issues. > > What usecase do you have that doesn't involve creating an SVM context? I can't think of any. So this patch is superfluous. -Mika > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH RFC 2/4] drm/i915: IOMMU based SVM implementation v13
On Mon, Aug 15, 2016 at 01:13:25PM +0100, David Woodhouse wrote: > On Mon, 2016-08-15 at 13:05 +0100, Chris Wilson wrote: > > On Mon, Aug 15, 2016 at 02:48:05PM +0300, Mika Kuoppala wrote: > > > > > > + struct task_struct *task; > > > > We don't need the task, we need the mm. > > > > Holding the task is not sufficient. > > From the pure DMA point of view, you don't need the MM at all. I handle > all that from the IOMMU side so it's none of your business, darling. But you don't keep the mm alive for the duration of device activity, right? And you don't wait for the device to finish before releasing the mmu? (iiuc intel-svm.c) -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Ro.CI.BAT: failure for svm support
== Series Details == Series: svm support URL : https://patchwork.freedesktop.org/series/11097/ State : failure == Summary == Series 11097v1 svm support http://patchwork.freedesktop.org/api/1.0/series/11097/revisions/1/mbox Test drv_hangman: Subgroup error-state-basic: pass -> DMESG-WARN (ro-bdw-i7-5600u) pass -> DMESG-WARN (ro-bdw-i5-5250u) pass -> DMESG-WARN (ro-skl3-i5-6260u) pass -> DMESG-WARN (fi-skl-i7-6700k) Test drv_module_reload_basic: pass -> DMESG-WARN (ro-bdw-i7-5600u) pass -> DMESG-WARN (ro-bdw-i5-5250u) pass -> DMESG-WARN (ro-skl3-i5-6260u) pass -> DMESG-WARN (fi-skl-i7-6700k) Test gem_exec_suspend: Subgroup basic-s3: pass -> DMESG-WARN (ro-bdw-i7-5600u) pass -> DMESG-WARN (ro-skl3-i5-6260u) Test gem_ringfill: Subgroup basic-default-hang: pass -> DMESG-WARN (ro-bdw-i7-5600u) pass -> DMESG-WARN (ro-bdw-i5-5250u) pass -> DMESG-WARN (ro-skl3-i5-6260u) pass -> DMESG-WARN (fi-skl-i7-6700k) Test kms_cursor_legacy: Subgroup basic-cursor-vs-flip-varying-size: fail -> PASS (ro-ilk1-i5-650) Subgroup basic-flip-vs-cursor-legacy: fail -> PASS (ro-ivb2-i7-3770) pass -> FAIL (ro-bdw-i5-5250u) Subgroup basic-flip-vs-cursor-varying-size: pass -> FAIL (ro-skl3-i5-6260u) fail -> PASS (ro-bdw-i5-5250u) Test kms_pipe_crc_basic: Subgroup hang-read-crc-pipe-a: pass -> DMESG-WARN (ro-bdw-i7-5600u) pass -> DMESG-WARN (ro-bdw-i5-5250u) pass -> DMESG-WARN (ro-skl3-i5-6260u) pass -> DMESG-WARN (fi-skl-i7-6700k) Subgroup hang-read-crc-pipe-b: pass -> DMESG-WARN (ro-bdw-i7-5600u) pass -> DMESG-WARN (ro-bdw-i5-5250u) pass -> DMESG-WARN (ro-skl3-i5-6260u) pass -> DMESG-WARN (fi-skl-i7-6700k) Subgroup hang-read-crc-pipe-c: pass -> DMESG-WARN (ro-bdw-i7-5600u) pass -> DMESG-WARN (ro-bdw-i5-5250u) pass -> DMESG-WARN (ro-skl3-i5-6260u) pass -> DMESG-WARN (fi-skl-i7-6700k) Subgroup suspend-read-crc-pipe-a: pass -> DMESG-WARN (ro-bdw-i7-5600u) skip -> DMESG-WARN (ro-bdw-i5-5250u) pass -> DMESG-WARN (ro-skl3-i5-6260u) Subgroup suspend-read-crc-pipe-b: pass -> DMESG-WARN (ro-bdw-i7-5600u) skip -> DMESG-WARN (ro-bdw-i5-5250u) pass -> DMESG-WARN (ro-skl3-i5-6260u) Subgroup suspend-read-crc-pipe-c: pass -> DMESG-WARN (ro-bdw-i7-5600u) pass -> DMESG-WARN (ro-skl3-i5-6260u) fi-hsw-i7-4770k total:244 pass:222 dwarn:0 dfail:0 fail:0 skip:22 fi-kbl-qkkr total:244 pass:184 dwarn:31 dfail:0 fail:3 skip:26 fi-skl-i7-6700k total:244 pass:202 dwarn:10 dfail:2 fail:2 skip:28 fi-snb-i7-2600 total:244 pass:202 dwarn:0 dfail:0 fail:0 skip:42 ro-bdw-i5-5250u total:240 pass:213 dwarn:9 dfail:0 fail:1 skip:17 ro-bdw-i7-5600u total:240 pass:197 dwarn:10 dfail:0 fail:1 skip:32 ro-bsw-n3050 total:240 pass:189 dwarn:6 dfail:0 fail:3 skip:42 ro-byt-n2820 total:240 pass:197 dwarn:0 dfail:0 fail:3 skip:40 ro-hsw-i3-4010u total:240 pass:214 dwarn:0 dfail:0 fail:0 skip:26 ro-hsw-i7-4770r total:240 pass:185 dwarn:0 dfail:0 fail:0 skip:55 ro-ilk1-i5-650 total:235 pass:174 dwarn:0 dfail:0 fail:1 skip:60 ro-ivb-i7-3770 total:240 pass:205 dwarn:0 dfail:0 fail:0 skip:35 ro-ivb2-i7-3770 total:240 pass:209 dwarn:0 dfail:0 fail:0 skip:31 ro-skl3-i5-6260u total:240 pass:212 dwarn:10 dfail:0 fail:4 skip:14 Results at /archive/results/CI_IGT_test/RO_Patchwork_1870/ e56d79f drm-intel-nightly: 2016y-08m-15d-10h-16m-44s UTC integration manifest 6f1de3d drm/i915: Add param for SVM ce6e32e drm/i915: add SVM execbuf ioctl v10 fd45b62 drm/i915: IOMMU based SVM implementation v13 7fa4650 drm/i915: add create_context2 ioctl ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH RFC 1/4] drm/i915: add create_context2 ioctl
Chris Wilson writes: > On Mon, Aug 15, 2016 at 02:48:04PM +0300, Mika Kuoppala wrote: >> From: Jesse Barnes >> >> Add i915_gem_context_create2_ioctl for passing flags >> (e.g. SVM) when creating a context. >> >> v2: check the pad on create_context >> v3: rebase >> v4: i915_dma is no more. create_gvt needs flags >> >> Cc: Daniel Vetter >> Cc: Chris Wilson >> Cc: Joonas Lahtinen >> Signed-off-by: Jesse Barnes (v1) >> Signed-off-by: Mika Kuoppala > > Considering we can use deferred ppgtt creation and have setparam do we > need a new create ioctl just to set a flag? So like this: - create ctx with the default create ioctl - set the ctx param to mark it svm capable. - first submit deferred creates And we use the setparam point for returning an error if svm contexts are not there. ? -Mika > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Ro.CI.BAT: failure for series starting with [1/2] io-mapping: Always create a struct to hold metadata about the io-mapping
== Series Details == Series: series starting with [1/2] io-mapping: Always create a struct to hold metadata about the io-mapping URL : https://patchwork.freedesktop.org/series/11099/ State : failure == Summary == Applying: io-mapping: Always create a struct to hold metadata about the io-mapping fatal: sha1 information is lacking or useless (drivers/gpu/drm/i915/i915_gem.c). error: could not build fake ancestor Patch failed at 0001 io-mapping: Always create a struct to hold metadata about the io-mapping The copy of the patch that failed is found in: .git/rebase-apply/patch When you have resolved this problem, run "git am --continue". If you prefer to skip this patch, run "git am --skip" instead. To restore the original branch and stop patching, run "git am --abort". ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH RFC 2/4] drm/i915: IOMMU based SVM implementation v13
On Mon, 2016-08-15 at 13:23 +0100, Chris Wilson wrote: > On Mon, Aug 15, 2016 at 01:13:25PM +0100, David Woodhouse wrote: > > On Mon, 2016-08-15 at 13:05 +0100, Chris Wilson wrote: > > > On Mon, Aug 15, 2016 at 02:48:05PM +0300, Mika Kuoppala wrote: > > > > > > > > + struct task_struct *task; > > > > > > We don't need the task, we need the mm. > > > > > > Holding the task is not sufficient. > > > > From the pure DMA point of view, you don't need the MM at all. I handle > > all that from the IOMMU side so it's none of your business, darling. > > But you don't keep the mm alive for the duration of device activity, > right? And you don't wait for the device to finish before releasing the > mmu? (iiuc intel-svm.c) We don't "keep it alive" (i.e. bump mm->mm_users), no. We *did*, but it caused problems. See commit e57e58bd390a68 for the gory details. Now we only bump mm->mm_count so if the process exits, the MM can still be torn down. Since exit_mmap() happens before exit_files(), what happens on an unclean shutdown is that the GPU may start to take faults on the PASID which is in the process of exiting, before the corresponding file descriptor gets closed. So no, we don't wait for the device to finish before releasing the MM. That would involve calling back into device-driver code from the mmu_notifier callback, with "interesting" locking constraints. We don't trust device drivers that much :) -- dwmw2 smime.p7s Description: S/MIME cryptographic signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
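The distinction David is drawing is between the two reference counts on an mm: mm_users keeps the address space itself alive, while mm_count only keeps the mm_struct allocated so the address space can still be torn down at exit. A minimal sketch, using the helpers as they existed at the time and a hypothetical svm_bind_mm()/svm_unbind_mm() pair (intel-svm.c remains the authoritative code), might look like this:

#include <linux/sched.h>
#include <linux/mm_types.h>

/* Hypothetical illustration of the reference-count choice only. */
static void svm_bind_mm(struct mm_struct *mm)
{
	/* Pin only mm_count: the mm_struct stays allocated, but exit_mmap()
	 * can still tear the address space down when the process exits.
	 */
	atomic_inc(&mm->mm_count);

	/* The alternative, atomic_inc(&mm->mm_users) paired with mmput(),
	 * would pin the whole address space for as long as the device holds
	 * it, which is what commit e57e58bd390a68 moved away from.
	 */
}

static void svm_unbind_mm(struct mm_struct *mm)
{
	mmdrop(mm);	/* pairs with the mm_count reference taken above */
}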
Re: [Intel-gfx] [PATCH RFC 3/4] drm/i915: add SVM execbuf ioctl v10
Chris Wilson writes: > On Mon, Aug 15, 2016 at 02:48:06PM +0300, Mika Kuoppala wrote: >> From: Jesse Barnes >> >> We just need to pass in an address to execute and some flags, since we >> don't have to worry about buffer relocation or any of the other usual >> stuff. Returns a fence to be used for synchronization. >> >> v2: add a request after batch submission (Jesse) >> v3: add a flag for fence creation (Chris) >> v4: add CLOEXEC flag (Kristian) >> add non-RCS ring support (Jesse) >> v5: update for request alloc change (Jesse) >> v6: new sync file interface, error paths, request breadcrumbs >> v7: always CLOEXEC for sync_file_install >> v8: rebase on new sync file api >> v9: rework on top of fence requests and sync_file >> v10: take fence ref for sync_file (Chris) >> use correct flush (Chris) >> limit exec on rcs > > This is incomplete, so just proof of principle? At some point during the rebasing I noticed that Jesse had limited everything to rcs. So I just put it back. No idea yet why we would need to limit this to rcs only. -Mika > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
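Since the changelog above says the execbuf returns a fence for synchronization, a userspace caller would typically treat that fence as a sync_file fd and wait on it with poll(). A rough sketch follows; the fake_i915_exec_mm struct, its fields and the ioctl number are hypothetical stand-ins, not the RFC's actual uAPI, and only the "poll the returned sync_file fd" part is the generic mechanism.

/* Hypothetical userspace sketch: submit a batch by CPU address and wait
 * on the returned sync_file fd.  The struct and ioctl are placeholders.
 */
#include <poll.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

struct fake_i915_exec_mm {		/* placeholder layout */
	uint64_t batch_ptr;		/* CPU address of the batch */
	uint32_t ctx_id;		/* SVM-enabled context */
	uint32_t flags;			/* e.g. request a fence */
	int32_t  fence_fd;		/* sync_file fd returned on success */
};

static int submit_and_wait(int drm_fd, unsigned long fake_ioctl_nr,
			   struct fake_i915_exec_mm *exec)
{
	struct pollfd pfd;

	if (ioctl(drm_fd, fake_ioctl_nr, exec))
		return -1;

	pfd.fd = exec->fence_fd;	/* sync_file fds are pollable */
	pfd.events = POLLIN;
	if (poll(&pfd, 1, -1) < 0)
		return -1;

	close(exec->fence_fd);
	return 0;
}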
Re: [Intel-gfx] [PATCH] drm/i915: Embrace the race in busy-ioctl
On pe, 2016-08-12 at 18:52 +0100, Chris Wilson wrote: > Daniel Vetter proposed a new challenge to the serialisation inside the > busy-ioctl that exposed a flaw that could result in us reporting the > wrong engine as being busy. If the request is reallocated as we test > its busyness and then reassigned to this object by another thread, we > would not notice that the test itself was incorrect. > > We are faced with a choice of using __i915_gem_active_get_request_rcu() > to first acquire a reference to the request preventing the race, or to > acknowledge the race and accept the limitations upon the accuracy of the > busy flags. Note that we guarantee that we never falsely report the > object as idle (providing userspace itself doesn't race), and so the > most important use of the busy-ioctl and its guarantees are fulfilled. > If Daniel acks the userspace change, Reviewed-by: Joonas Lahtinen Regards, Joonas -- Joonas Lahtinen Open Source Technology Center Intel Corporation ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Initialise mmaped_count for i915_gem_object_info
Chris Wilson writes: > Reported-by: 0day kbuild test robot > Fixes: 2bd160a131ac ("drm/i915: Reduce i915_gem_objects to only show...") > Signed-off-by: Chris Wilson Reviewed-by: Mika Kuoppala > --- > drivers/gpu/drm/i915/i915_debugfs.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c > b/drivers/gpu/drm/i915/i915_debugfs.c > index f612d3f18c69..81fabc36ce5a 100644 > --- a/drivers/gpu/drm/i915/i915_debugfs.c > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > @@ -403,7 +403,9 @@ static int i915_gem_object_info(struct seq_file *m, void* > data) > dev_priv->mm.object_count, > dev_priv->mm.object_memory); > > - size = count = purgeable_size = purgeable_count = 0; > + size = count = 0; > + mapped_size = mapped_count = 0; > + purgeable_size = purgeable_count = 0; > list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list) { > size += obj->base.size; > ++count; > -- > 2.8.1 > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH RFC 2/4] drm/i915: IOMMU based SVM implementation v13
On Mon, Aug 15, 2016 at 01:30:11PM +0100, David Woodhouse wrote: > On Mon, 2016-08-15 at 13:23 +0100, Chris Wilson wrote: > > On Mon, Aug 15, 2016 at 01:13:25PM +0100, David Woodhouse wrote: > > > On Mon, 2016-08-15 at 13:05 +0100, Chris Wilson wrote: > > > > On Mon, Aug 15, 2016 at 02:48:05PM +0300, Mika Kuoppala wrote: > > > > > > > > > > + struct task_struct *task; > > > > > > > > We don't need the task, we need the mm. > > > > > > > > Holding the task is not sufficient. > > > > > > From the pure DMA point of view, you don't need the MM at all. I handle > > > all that from the IOMMU side so it's none of your business, darling. > > > > But you don't keep the mm alive for the duration of device activity, > > right? And you don't wait for the device to finish before releasing the > > mmu? (iiuc intel-svm.c) > > We don't "keep it alive" (i.e. bump mm->mm_users), no. > We *did*, but it caused problems. See commit e57e58bd390a68 for the > gory details. > > Now we only bump mm->mm_count so if the process exits, the MM can still > be torn down. > > Since exit_mmap() happens before exit_files(), what happens on an > unclean shutdown is that the GPU may start to take faults on the PASID > which is in the process of exiting, before the corresponding file > descriptor gets closed. > > So no, we don't wait for the device to finish before releasing the MM. > That would involve calling back into device-driver code from the > mmu_notifier callback, with "interesting" locking constraints. We don't > trust device drivers that much :) With the device allocating the memory, we can keep the object alive for as long as required for it to complete the commands and for other users. Other users get access to the svm pages via shared memory (mmap, memfd) and so another process copying from the buffer should be unaffected by termination of the original process. So it is really just what happens to commands for this client when it dies/exits. The kneejerk reaction is to say the pages should be kept alive as they are now for !svm. We could be faced with a situation where the client copies onto a shared buffer (obtaining a fence), passes that fence over to the server scheduling an update, and dies abruptly. Given that the fence and request arrive on the server safely (the fence will be completed even if the command is skipped or its faults filled with zero), the server will itself proceed to present the incomplete result from the dead client. (Presently for !svm the output will be intact.) The question is: do we accept the change in behaviour? Or am I completely misunderstanding how the svm faulting/mmu-notifiers will work? -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH RFC 1/4] drm/i915: add create_context2 ioctl
On Mon, Aug 15, 2016 at 03:25:43PM +0300, Mika Kuoppala wrote: > Chris Wilson writes: > > > On Mon, Aug 15, 2016 at 02:48:04PM +0300, Mika Kuoppala wrote: > >> From: Jesse Barnes > >> > >> Add i915_gem_context_create2_ioctl for passing flags > >> (e.g. SVM) when creating a context. > >> > >> v2: check the pad on create_context > >> v3: rebase > >> v4: i915_dma is no more. create_gvt needs flags > >> > >> Cc: Daniel Vetter > >> Cc: Chris Wilson > >> Cc: Joonas Lahtinen > >> Signed-off-by: Jesse Barnes (v1) > >> Signed-off-by: Mika Kuoppala > > > > Considering we can use deferred ppgtt creation and have setparam do we > > need a new create ioctl just to set a flag? > > So like this: > > - create ctx with the default create ioctl > - set cxt param it for svm capable. > - first submit deferred creates > > And we use the setparam point for returning > error if svm context are not there. (and a call to set svm on a context after first use is illegal) That's the outline I had in my head. I am not sure if the result is cleaner - I just hope it is ;) -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
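From userspace, the outline agreed above would look roughly like the sketch below. DRM_IOCTL_I915_GEM_CONTEXT_CREATE and DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM are the existing ioctls; FAKE_I915_CONTEXT_PARAM_SVM is a made-up parameter name standing in for whatever the reworked series would actually add, so treat this as an illustration of the flow rather than the proposed uAPI.

/* Sketch of the create-then-setparam flow; only the SVM param name is
 * hypothetical, everything else is existing i915 uAPI.
 */
#include <stdint.h>
#include <sys/ioctl.h>
#include <xf86drm.h>
#include <i915_drm.h>

#define FAKE_I915_CONTEXT_PARAM_SVM 0x100	/* hypothetical value */

static int create_svm_context(int fd, uint32_t *ctx_id)
{
	struct drm_i915_gem_context_create create = { 0 };
	struct drm_i915_gem_context_param param = { 0 };

	if (drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &create))
		return -1;

	param.ctx_id = create.ctx_id;
	param.param = FAKE_I915_CONTEXT_PARAM_SVM;
	param.value = 1;

	/* setparam is where "SVM not supported" gets reported; it must also
	 * happen before the context's first execbuf, since the ppgtt is only
	 * created on first use and flipping to SVM after that is illegal.
	 */
	if (drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &param)) {
		struct drm_i915_gem_context_destroy destroy = { .ctx_id = create.ctx_id };

		drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_DESTROY, &destroy);
		return -1;
	}

	*ctx_id = create.ctx_id;
	return 0;
}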
[Intel-gfx] ✗ Ro.CI.BAT: failure for drm/i915: Initialise mmaped_count for i915_gem_object_info
== Series Details == Series: drm/i915: Initialise mmaped_count for i915_gem_object_info URL : https://patchwork.freedesktop.org/series/11100/ State : failure == Summary == Series 11100v1 drm/i915: Initialise mmaped_count for i915_gem_object_info http://patchwork.freedesktop.org/api/1.0/series/11100/revisions/1/mbox Test kms_cursor_legacy: Subgroup basic-flip-vs-cursor-legacy: fail -> PASS (ro-ivb2-i7-3770) pass -> FAIL (ro-bdw-i5-5250u) Subgroup basic-flip-vs-cursor-varying-size: fail -> PASS (ro-byt-n2820) pass -> FAIL (ro-skl3-i5-6260u) Test kms_pipe_crc_basic: Subgroup suspend-read-crc-pipe-c: pass -> DMESG-WARN (ro-bdw-i7-5600u) skip -> DMESG-WARN (ro-bdw-i5-5250u) fi-hsw-i7-4770k total:244 pass:222 dwarn:0 dfail:0 fail:0 skip:22 fi-kbl-qkkr total:244 pass:185 dwarn:30 dfail:0 fail:2 skip:27 fi-skl-i7-6700k total:244 pass:208 dwarn:4 dfail:2 fail:2 skip:28 fi-snb-i7-2600 total:244 pass:202 dwarn:0 dfail:0 fail:0 skip:42 ro-bdw-i5-5250u total:240 pass:218 dwarn:2 dfail:0 fail:2 skip:18 ro-bdw-i7-5600u total:240 pass:206 dwarn:1 dfail:0 fail:1 skip:32 ro-bsw-n3050 total:87 pass:67 dwarn:0 dfail:0 fail:0 skip:19 ro-byt-n2820 total:240 pass:198 dwarn:0 dfail:0 fail:2 skip:40 ro-hsw-i3-4010u total:240 pass:214 dwarn:0 dfail:0 fail:0 skip:26 ro-hsw-i7-4770r total:240 pass:185 dwarn:0 dfail:0 fail:0 skip:55 ro-ilk1-i5-650 total:235 pass:173 dwarn:0 dfail:0 fail:2 skip:60 ro-ivb-i7-3770 total:240 pass:205 dwarn:0 dfail:0 fail:0 skip:35 ro-ivb2-i7-3770 total:240 pass:209 dwarn:0 dfail:0 fail:0 skip:31 ro-skl3-i5-6260u total:240 pass:222 dwarn:0 dfail:0 fail:4 skip:14 Results at /archive/results/CI_IGT_test/RO_Patchwork_1872/ e56d79f drm-intel-nightly: 2016y-08m-15d-10h-16m-44s UTC integration manifest 0640a5b drm/i915: Initialise mmaped_count for i915_gem_object_info ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm: make drm_get_format_name thread-safe
On Mon, Aug 15, 2016 at 12:54:01PM +0300, Jani Nikula wrote: > On Mon, 15 Aug 2016, Eric Engestrom wrote: > > Signed-off-by: Eric Engestrom > > --- > > > > I moved the main bits to be the first diffs, shouldn't affect anything > > when applying the patch, but I wanted to ask: > > I don't like the hard-coded `32` the appears in both kmalloc() and > > snprintf(), what do you think? If you don't like it either, what would > > you suggest? Should I #define it? > > > > Second question is about the patch mail itself: should I send this kind > > of patch separated by module, with a note requesting them to be squashed > > when applying? It has to land as a single patch, but for review it might > > be easier if people only see the bits they each care about, as well as > > to collect ack's/r-b's. > > > > Cheers, > > Eric > > > > --- > > drivers/gpu/drm/amd/amdgpu/dce_v10_0.c | 6 ++-- > > drivers/gpu/drm/amd/amdgpu/dce_v11_0.c | 6 ++-- > > drivers/gpu/drm/amd/amdgpu/dce_v8_0.c | 6 ++-- > > drivers/gpu/drm/drm_atomic.c| 5 ++-- > > drivers/gpu/drm/drm_crtc.c | 21 - > > drivers/gpu/drm/drm_fourcc.c| 17 ++- > > drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c | 6 ++-- > > drivers/gpu/drm/i915/i915_debugfs.c | 11 ++- > > drivers/gpu/drm/i915/intel_atomic_plane.c | 6 ++-- > > drivers/gpu/drm/i915/intel_display.c| 39 > > - > > drivers/gpu/drm/radeon/atombios_crtc.c | 12 +--- > > include/drm/drm_fourcc.h| 2 +- > > 12 files changed, 89 insertions(+), 48 deletions(-) > > > > diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c > > index 0645c85..38216a1 100644 > > --- a/drivers/gpu/drm/drm_fourcc.c > > +++ b/drivers/gpu/drm/drm_fourcc.c > > @@ -39,16 +39,14 @@ static char printable_char(int c) > > * drm_get_format_name - return a string for drm fourcc format > > * @format: format to compute name of > > * > > - * Note that the buffer used by this function is globally shared and owned > > by > > - * the function itself. > > - * > > - * FIXME: This isn't really multithreading safe. > > + * Note that the buffer returned by this function is owned by the caller > > + * and will need to be freed. > > */ > > const char *drm_get_format_name(uint32_t format) > > I find it surprising that a function that allocates a buffer returns a > const pointer. Some userspace libraries have conventions about the > ownership based on constness. > > (I also find it suprising that kfree() takes a const pointer; arguably > that call changes the memory.) > > Is there precedent for this? > > BR, > Jani. It's not a const pointer, it's a normal pointer to a const char, i.e. you can do as you want with the pointer but you shouldn't change the chars it points to. Cheers, Eric ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
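To make the ownership change being debated concrete, the allocating variant is essentially the sketch below, with the format string taken from the existing drm_fourcc.c helper. Note the posted patch keeps the const char * return type, which is exactly what prompts Jani's comment; the sketch returns a plain char * and uses the hard-coded 32 that Eric asks about, and it is illustrative rather than the patch itself.

/* Minimal sketch (as it would sit in drm_fourcc.c) of an allocating
 * drm_get_format_name(); printable_char() is the file's existing helper.
 */
char *drm_get_format_name(uint32_t format)
{
	char *buf = kmalloc(32, GFP_KERNEL);	/* the hard-coded 32 */

	if (!buf)
		return NULL;

	snprintf(buf, 32,
		 "%c%c%c%c %s-endian (0x%08x)",
		 printable_char(format & 0xff),
		 printable_char((format >> 8) & 0xff),
		 printable_char((format >> 16) & 0xff),
		 printable_char((format >> 24) & 0x7f),
		 format & DRM_FORMAT_BIG_ENDIAN ? "big" : "little",
		 format);

	return buf;	/* caller owns the buffer and must kfree() it */
}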
Re: [Intel-gfx] [PATCH RFC 2/4] drm/i915: IOMMU based SVM implementation v13
On Mon, 2016-08-15 at 13:53 +0100, Chris Wilson wrote: > > So it is really just what happens to commands for this client when it > dies/exits. The kneejerk reaction is to say the pages should be kept > alive as they are now for !svm. We could be faced with a situation where > the client copies onto a shared buffer (obtaining a fence), passes that > fence over to the server scheduling an update, and dies abruptly. Which pages? Until the moment you actually do the DMA, you don't have "pages". They might not even exist in RAM. All you have is (a PASID and) a userspace linear address. When you actually do the DMA, *then* we might fault in the appropriate pages from disk. Or might not, depending on whether the address is valid or not. Between the time when it hands you the linear address, and the time that you use it, the process could have done anything. We are currently talking about the case where it exits uncleanly. But it could also munmap() the linear address in question. Or mmap() something else over it. Obviously those would be bugs... but so is an unclean exit. So it doesn't seem to make much sense to ask if you accept the change in behaviour. You don't really have much choice; it's implicit in the SVM model of doing DMA directly to userspace addresses. You just *don't* get to lock things down and trust that the buffers will still be there when you finally get round to using them. -- dwmw2 smime.p7s Description: S/MIME cryptographic signature ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Embrace the race in busy-ioctl
Chris Wilson writes: > Daniel Vetter proposed a new challenge to the serialisation inside the > busy-ioctl that exposed a flaw that could result in us reporting the > wrong engine as being busy. If the request is reallocated as we test > its busyness and then reassigned to this object by another thread, we > would not notice that the test itself was incorrect. > > We are faced with a choice of using __i915_gem_active_get_request_rcu() > to first acquire a reference to the request preventing the race, or to > acknowledge the race and accept the limitations upon the accuracy of the > busy flags. Note that we guarantee that we never falsely report the > object as idle (providing userspace itself doesn't race), and so the > most important use of the busy-ioctl and its guarantees are fulfilled. > > Signed-off-by: Chris Wilson > Cc: Daniel Vetter > Cc: Joonas Lahtinen > --- > drivers/gpu/drm/i915/i915_gem.c | 87 > ++--- > include/uapi/drm/i915_drm.h | 15 ++- > 2 files changed, 60 insertions(+), 42 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 5566916870eb..c77915378768 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -3791,49 +3791,54 @@ static __always_inline unsigned int > __busy_set_if_active(const struct i915_gem_active *active, >unsigned int (*flag)(unsigned int id)) > { > - /* For more discussion about the barriers and locking concerns, > - * see __i915_gem_active_get_rcu(). > - */ > - do { > - struct drm_i915_gem_request *request; > - unsigned int id; > - > - request = rcu_dereference(active->request); > - if (!request || i915_gem_request_completed(request)) > - return 0; > + struct drm_i915_gem_request *request; > > - id = request->engine->exec_id; > + request = rcu_dereference(active->request); > + if (!request || i915_gem_request_completed(request)) > + return 0; > > - /* Check that the pointer wasn't reassigned and overwritten. > - * > - * In __i915_gem_active_get_rcu(), we enforce ordering between > - * the first rcu pointer dereference (imposing a > - * read-dependency only on access through the pointer) and > - * the second lockless access through the memory barrier > - * following a successful atomic_inc_not_zero(). Here there > - * is no such barrier, and so we must manually insert an > - * explicit read barrier to ensure that the following > - * access occurs after all the loads through the first > - * pointer. > - * > - * It is worth comparing this sequence with > - * raw_write_seqcount_latch() which operates very similarly. > - * The challenge here is the visibility of the other CPU > - * writes to the reallocated request vs the local CPU ordering. > - * Before the other CPU can overwrite the request, it will > - * have updated our active->request and gone through a wmb. > - * During the read here, we want to make sure that the values > - * we see have not been overwritten as we do so - and we do > - * that by serialising the second pointer check with the writes > - * on other other CPUs. > - * > - * The corresponding write barrier is part of > - * rcu_assign_pointer(). > - */ > - smp_rmb(); > - if (request == rcu_access_pointer(active->request)) > - return flag(id); > - } while (1); > + /* This is racy. 
See __i915_gem_active_get_rcu() for a in detail > + * discussion of how to handle the race correctly, but for reporting > + * the busy state we err on the side of potentially reporting the > + * wrong engine as being busy (but we guarantee that the result > + * is at least self-consistent). > + * > + * As we use SLAB_DESTROY_BY_RCU, the request may be reallocated > + * whilst we are inspecting it, even under the RCU read lock as we are. > + * This means that there is a small window for the engine and/or the > + * seqno to have been overwritten. The seqno will always be in the > + * future compared to the intended, and so we know that if that > + * seqno is idle (on whatever engine) our request is idle and the > + * return 0 above is correct. > + * > + * The issue is that if the engine is switched, it is just as likely > + * to report that it is busy (but since the switch happened, we know > + * the request should be idle). So there is a small chance that a busy > + * result is actually the wrong engine. > + * > + * So why