Re: [Intel-gfx] [PATCH v2 5/8] drm/i915: Check error return when converting pipe to connector
On Wed, 26 Apr 2017, Imre Deak wrote:
> An error from intel_get_pipe_from_connector() would mean a bug somewhere
> else, but we still should check for it to prevent some other more
> obscure bug later.
>
> v2:
> - Fall back to a reasonable default instead of bailing out in case of
>   error. (Jani)
>
> Cc: Jani Nikula
> Signed-off-by: Imre Deak
> ---
>  drivers/gpu/drm/i915/intel_panel.c | 17 ++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_panel.c b/drivers/gpu/drm/i915/intel_panel.c
> index cb50c52..3508f42 100644
> --- a/drivers/gpu/drm/i915/intel_panel.c
> +++ b/drivers/gpu/drm/i915/intel_panel.c
> @@ -888,10 +888,14 @@ static void pch_enable_backlight(struct intel_connector *connector)
>  	struct drm_i915_private *dev_priv = to_i915(connector->base.dev);
>  	struct intel_panel *panel = &connector->panel;
>  	enum pipe pipe = intel_get_pipe_from_connector(connector);
> -	enum transcoder cpu_transcoder =
> -		intel_pipe_to_cpu_transcoder(dev_priv, pipe);
> +	enum transcoder cpu_transcoder;
>  	u32 cpu_ctl2, pch_ctl1, pch_ctl2;
>
> +	if (!WARN_ON_ONCE(pipe == INVALID_PIPE))
> +		cpu_transcoder = intel_pipe_to_cpu_transcoder(dev_priv, pipe);
> +	else
> +		cpu_transcoder = TRANSCODER_EDP;
> +
>  	cpu_ctl2 = I915_READ(BLC_PWM_CPU_CTL2);
>  	if (cpu_ctl2 & BLM_PWM_ENABLE) {
>  		DRM_DEBUG_KMS("cpu backlight already enabled\n");
> @@ -973,6 +977,9 @@ static void i965_enable_backlight(struct intel_connector *connector)
>  	enum pipe pipe = intel_get_pipe_from_connector(connector);
>  	u32 ctl, ctl2, freq;
>
> +	if (WARN_ON_ONCE(pipe == INVALID_PIPE))
> +		pipe = PIPE_A;
> +
>  	ctl2 = I915_READ(BLC_PWM_CTL2);
>  	if (ctl2 & BLM_PWM_ENABLE) {
>  		DRM_DEBUG_KMS("backlight already enabled\n");
> @@ -1037,6 +1044,9 @@ static void bxt_enable_backlight(struct intel_connector *connector)
>  	enum pipe pipe = intel_get_pipe_from_connector(connector);
>  	u32 pwm_ctl, val;
>
> +	if (WARN_ON_ONCE(pipe) == PIPE_INVALID)

Le pipe invalid? I think you mean INVALID_PIPE here.

BR,
Jani.

> +		pipe = PIPE_A;
> +
>  	/* Controller 1 uses the utility pin. */
>  	if (panel->backlight.controller == 1) {
>  		val = I915_READ(UTIL_PIN_CTL);
> @@ -1093,7 +1103,8 @@ void intel_panel_enable_backlight(struct intel_connector *connector)
>  	if (!panel->backlight.present)
>  		return;
>
> -	DRM_DEBUG_KMS("pipe %c\n", pipe_name(pipe));
> +	if (!WARN_ON_ONCE(pipe == INVALID_PIPE))
> +		DRM_DEBUG_KMS("pipe %c\n", pipe_name(pipe));
>
>  	mutex_lock(&dev_priv->backlight_lock);

--
Jani Nikula, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v8] drm/i915: Squash repeated awaits on the same fence
Track the latest fence waited upon on each context, and only add a new asynchronous wait if the new fence is more recent than the recorded fence for that context. This requires us to filter out unordered timelines, which are noted by DMA_FENCE_NO_CONTEXT. However, in the absence of a universal identifier, we have to use our own i915->mm.unordered_timeline token. v2: Throw around the debug crutches v3: Inline the likely case of the pre-allocation cache being full. v4: Drop the pre-allocation support, we can lose the most recent fence in case of allocation failure -- it just means we may emit more awaits than strictly necessary but will not break. v5: Trim allocation size for leaf nodes, they only need an array of u32 not pointers. v6: Create mock_timeline to tidy selftest writing v7: s/intel_timeline_sync_get/intel_timeline_sync_is_later/ (Tvrtko) v8: Prune the stale sync points when we idle. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Cc: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem.c| 1 + drivers/gpu/drm/i915/i915_gem_request.c| 11 + drivers/gpu/drm/i915/i915_gem_timeline.c | 314 + drivers/gpu/drm/i915/i915_gem_timeline.h | 15 + drivers/gpu/drm/i915/selftests/i915_gem_timeline.c | 125 .../gpu/drm/i915/selftests/i915_mock_selftests.h | 1 + drivers/gpu/drm/i915/selftests/mock_timeline.c | 52 drivers/gpu/drm/i915/selftests/mock_timeline.h | 33 +++ 8 files changed, 552 insertions(+) create mode 100644 drivers/gpu/drm/i915/selftests/i915_gem_timeline.c create mode 100644 drivers/gpu/drm/i915/selftests/mock_timeline.c create mode 100644 drivers/gpu/drm/i915/selftests/mock_timeline.h diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index c1fa3c103f38..f886ef492036 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3214,6 +3214,7 @@ i915_gem_idle_work_handler(struct work_struct *work) intel_engine_disarm_breadcrumbs(engine); i915_gem_batch_pool_fini(&engine->batch_pool); } + 
i915_gem_timelines_mark_idle(dev_priv); GEM_BUG_ON(!dev_priv->gt.awake); dev_priv->gt.awake = false; diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c index 5fa4e52ded06..d9f76665bc6b 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.c +++ b/drivers/gpu/drm/i915/i915_gem_request.c @@ -772,6 +772,12 @@ i915_gem_request_await_dma_fence(struct drm_i915_gem_request *req, if (fence->context == req->fence.context) continue; + /* Squash repeated waits to the same timelines */ + if (fence->context != req->i915->mm.unordered_timeline && + intel_timeline_sync_is_later(req->timeline, +fence->context, fence->seqno)) + continue; + if (dma_fence_is_i915(fence)) ret = i915_gem_request_await_request(req, to_request(fence)); @@ -781,6 +787,11 @@ i915_gem_request_await_dma_fence(struct drm_i915_gem_request *req, GFP_KERNEL); if (ret < 0) return ret; + + /* Record the most latest fence on each timeline */ + if (fence->context != req->i915->mm.unordered_timeline) + intel_timeline_sync_set(req->timeline, + fence->context, fence->seqno); } while (--nchild); return 0; diff --git a/drivers/gpu/drm/i915/i915_gem_timeline.c b/drivers/gpu/drm/i915/i915_gem_timeline.c index b596ca7ee058..967c53a53a92 100644 --- a/drivers/gpu/drm/i915/i915_gem_timeline.c +++ b/drivers/gpu/drm/i915/i915_gem_timeline.c @@ -24,6 +24,276 @@ #include "i915_drv.h" +#define NSYNC 16 +#define SHIFT ilog2(NSYNC) +#define MASK (NSYNC - 1) + +/* struct intel_timeline_sync is a layer of a radixtree that maps a u64 fence + * context id to the last u32 fence seqno waited upon from that context. + * Unlike lib/radixtree it uses a parent pointer that allows traversal back to + * the root. This allows us to access the whole tree via a single pointer + * to the most recently used layer. We expect fence contexts to be dense + * and most reuse to be on the same i915_gem_context but on neighbouring + * engines (i.e. 
on adjacent contexts) and reuse the same leaf, a very + * effective lookup cache. If the new lookup is not on the same leaf, we + * expect it to be on the neighbouring branch. + * + * A leaf holds an array of u32 seqno, and has height 0. The bitmap field + * allows us to store whether a particular seqno is valid (i.e. allows us + * to distinguish unset from 0). + * + * A branch holds an array of layer pointers, and has heig
Re: [Intel-gfx] [PATCH] drm/i915/gvt: fix typo: "supporte" -> "support"
On 2017.04.25 10:05:12 +0100, Colin King wrote:
> From: Colin Ian King
>
> trivial fix to typo in WARN_ONCE message
>
> Signed-off-by: Colin Ian King
> ---
>  drivers/gpu/drm/i915/gvt/handlers.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
> index 0ad1a508e2af..c995e540ff96 100644
> --- a/drivers/gpu/drm/i915/gvt/handlers.c
> +++ b/drivers/gpu/drm/i915/gvt/handlers.c
> @@ -1244,7 +1244,7 @@ static int dma_ctrl_write(struct intel_vgpu *vgpu, unsigned int offset,
>  	mode = vgpu_vreg(vgpu, offset);
>
>  	if (GFX_MODE_BIT_SET_IN_MASK(mode, START_DMA)) {
> -		WARN_ONCE(1, "VM(%d): iGVT-g doesn't supporte GuC\n",
> +		WARN_ONCE(1, "VM(%d): iGVT-g doesn't support GuC\n",
>  			  vgpu->id);
>  		return 0;
>  	}
> --

applied, thanks!

--
Open Source Technology Center, Intel ltd.

$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827
Re: [Intel-gfx] [PATCH v8] drm/i915: Squash repeated awaits on the same fence
On Thu, Apr 27, 2017 at 08:06:36AM +0100, Chris Wilson wrote:
> Track the latest fence waited upon on each context, and only add a new
> asynchronous wait if the new fence is more recent than the recorded
> fence for that context. This requires us to filter out unordered
> timelines, which are noted by DMA_FENCE_NO_CONTEXT. However, in the
> absence of a universal identifier, we have to use our own
> i915->mm.unordered_timeline token.

Fwiw, the conversion to a ht of leaves is
http://paste.debian.net/929577/

I don't like the compromise of the fixed size ht, it is too easy to hit
a badly performing case.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
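For readers following the thread, the squashing rule quoted above can be sketched in plain userspace C. This is only an illustration under stated assumptions: the names (struct sync_map, sync_is_later, sync_set, SYNC_SLOTS) are hypothetical and not from the patch, and the store is a direct-mapped fixed-size table rather than the radix tree the patch implements, so it has exactly the collision weakness mentioned for the fixed-size ht. A lost entry only means an extra await is emitted, never a missed one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical slot count; the real patch uses a radix tree instead. */
#define SYNC_SLOTS 16

/* Last seqno waited upon, per fence context (direct-mapped, lossy). */
struct sync_map {
	uint64_t context[SYNC_SLOTS];
	uint32_t seqno[SYNC_SLOTS];
	bool valid[SYNC_SLOTS];
};

/* Wrap-safe test: true if seqno a is at or after seqno b. */
static bool seqno_passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/* True if we already waited on this context at or beyond seqno,
 * i.e. a new await on (context, seqno) would be redundant. */
static bool sync_is_later(const struct sync_map *map,
			  uint64_t context, uint32_t seqno)
{
	unsigned int idx = context % SYNC_SLOTS;

	return map->valid[idx] && map->context[idx] == context &&
	       seqno_passed(map->seqno[idx], seqno);
}

/* Record the latest seqno waited upon for this context. A colliding
 * context simply evicts the previous occupant of the slot. */
static void sync_set(struct sync_map *map, uint64_t context, uint32_t seqno)
{
	unsigned int idx = context % SYNC_SLOTS;

	map->context[idx] = context;
	map->seqno[idx] = seqno;
	map->valid[idx] = true;
}
```

A caller would check sync_is_later() before emitting an await and call sync_set() after a successful one, skipping both for fences on an unordered timeline.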
[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [01/27] drm/i915/selftests: Allocate inode/file dynamically (rev2)
== Series Details ==

Series: series starting with [01/27] drm/i915/selftests: Allocate inode/file dynamically (rev2)
URL   : https://patchwork.freedesktop.org/series/23227/
State : success

== Summary ==

Series 23227v2 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/23227/revisions/2/mbox/

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:428s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:429s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:580s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:510s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:544s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:481s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:486s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:413s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:409s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:418s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:494s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:467s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:457s
fi-kbl-7560u     total:278  pass:267  dwarn:1   dfail:0   fail:0   skip:10  time:570s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:458s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:570s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:453s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:496s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:432s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:534s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:406s

459f7d04deb6549ed4f27957ec414b727dc763f3 drm-tip: 2017y-04m-26d-16h-05m-26s UTC integration manifest
075ed10 drm/i915: Redefine ptr_pack_bits() and friends
cb02940 drm/i915: Make ptr_unpack_bits() more function-like
70114f1 drm/i915: Lift timeline ordering to await_dma_fence
621074e drm/i915: Mark up clflushes as belonging to an unordered timeline
623963bb drm/i915: Mark CPU cache as dirty on every transition for CPU writes

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4561/
Re: [Intel-gfx] [PATCH v6 02/20] drm/i915: Rename gen8_(un)request_engine_reset to gen8_reset_engine_start/cancel
On Tue, Apr 18, 2017 at 01:23:17PM -0700, Michel Thierry wrote:
> As all other functions related to resetting engines are using
> reset_engine.
>
> v2: remove _request_ and use start/cancel instead (Chris)
>
> Cc: Chris Wilson
> Signed-off-by: Michel Thierry

Picked up the first pair of trivial fixes, thanks.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH v2 5/8] drm/i915: Check error return when converting pipe to connector
On Thu, Apr 27, 2017 at 10:09:35AM +0300, Jani Nikula wrote:
> On Wed, 26 Apr 2017, Imre Deak wrote:
> > An error from intel_get_pipe_from_connector() would mean a bug somewhere
> > else, but we still should check for it to prevent some other more
> > obscure bug later.
> >
> > v2:
> > - Fall back to a reasonable default instead of bailing out in case of
> >   error. (Jani)
> >
> > Cc: Jani Nikula
> > Signed-off-by: Imre Deak
> > ---
> >  drivers/gpu/drm/i915/intel_panel.c | 17 ++---
> >  1 file changed, 14 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_panel.c b/drivers/gpu/drm/i915/intel_panel.c
> > index cb50c52..3508f42 100644
> > --- a/drivers/gpu/drm/i915/intel_panel.c
> > +++ b/drivers/gpu/drm/i915/intel_panel.c
> > @@ -888,10 +888,14 @@ static void pch_enable_backlight(struct intel_connector *connector)
> >  	struct drm_i915_private *dev_priv = to_i915(connector->base.dev);
> >  	struct intel_panel *panel = &connector->panel;
> >  	enum pipe pipe = intel_get_pipe_from_connector(connector);
> > -	enum transcoder cpu_transcoder =
> > -		intel_pipe_to_cpu_transcoder(dev_priv, pipe);
> > +	enum transcoder cpu_transcoder;
> >  	u32 cpu_ctl2, pch_ctl1, pch_ctl2;
> >
> > +	if (!WARN_ON_ONCE(pipe == INVALID_PIPE))
> > +		cpu_transcoder = intel_pipe_to_cpu_transcoder(dev_priv, pipe);
> > +	else
> > +		cpu_transcoder = TRANSCODER_EDP;
> > +
> >  	cpu_ctl2 = I915_READ(BLC_PWM_CPU_CTL2);
> >  	if (cpu_ctl2 & BLM_PWM_ENABLE) {
> >  		DRM_DEBUG_KMS("cpu backlight already enabled\n");
> > @@ -973,6 +977,9 @@ static void i965_enable_backlight(struct intel_connector *connector)
> >  	enum pipe pipe = intel_get_pipe_from_connector(connector);
> >  	u32 ctl, ctl2, freq;
> >
> > +	if (WARN_ON_ONCE(pipe == INVALID_PIPE))
> > +		pipe = PIPE_A;
> > +
> >  	ctl2 = I915_READ(BLC_PWM_CTL2);
> >  	if (ctl2 & BLM_PWM_ENABLE) {
> >  		DRM_DEBUG_KMS("backlight already enabled\n");
> > @@ -1037,6 +1044,9 @@ static void bxt_enable_backlight(struct intel_connector *connector)
> >  	enum pipe pipe = intel_get_pipe_from_connector(connector);
> >  	u32 pwm_ctl, val;
> >
> > +	if (WARN_ON_ONCE(pipe) == PIPE_INVALID)
>
> Le pipe invalid? I think you mean INVALID_PIPE here.

Arg, forgot git add at some point..

--Imre

>
> BR,
> Jani.
>
> > +		pipe = PIPE_A;
> > +
> >  	/* Controller 1 uses the utility pin. */
> >  	if (panel->backlight.controller == 1) {
> >  		val = I915_READ(UTIL_PIN_CTL);
> > @@ -1093,7 +1103,8 @@ void intel_panel_enable_backlight(struct intel_connector *connector)
> >  	if (!panel->backlight.present)
> >  		return;
> >
> > -	DRM_DEBUG_KMS("pipe %c\n", pipe_name(pipe));
> > +	if (!WARN_ON_ONCE(pipe == INVALID_PIPE))
> > +		DRM_DEBUG_KMS("pipe %c\n", pipe_name(pipe));
> >
> >  	mutex_lock(&dev_priv->backlight_lock);
>
> --
> Jani Nikula, Intel Open Source Technology Center
[Intel-gfx] [PATCH v3 5/8] drm/i915: Check error return when converting pipe to connector
An error from intel_get_pipe_from_connector() would mean a bug somewhere
else, but we still should check for it to prevent some other more
obscure bug later.

v2:
- Fall back to a reasonable default instead of bailing out in case of
  error. (Jani)
v3:
- Fix s/PIPE_INVALID/INVALID_PIPE/ typo. (Jani)

Cc: Jani Nikula
Signed-off-by: Imre Deak
---
 drivers/gpu/drm/i915/intel_panel.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_panel.c b/drivers/gpu/drm/i915/intel_panel.c
index cb50c52..d1abbf1 100644
--- a/drivers/gpu/drm/i915/intel_panel.c
+++ b/drivers/gpu/drm/i915/intel_panel.c
@@ -888,10 +888,14 @@ static void pch_enable_backlight(struct intel_connector *connector)
 	struct drm_i915_private *dev_priv = to_i915(connector->base.dev);
 	struct intel_panel *panel = &connector->panel;
 	enum pipe pipe = intel_get_pipe_from_connector(connector);
-	enum transcoder cpu_transcoder =
-		intel_pipe_to_cpu_transcoder(dev_priv, pipe);
+	enum transcoder cpu_transcoder;
 	u32 cpu_ctl2, pch_ctl1, pch_ctl2;

+	if (!WARN_ON_ONCE(pipe == INVALID_PIPE))
+		cpu_transcoder = intel_pipe_to_cpu_transcoder(dev_priv, pipe);
+	else
+		cpu_transcoder = TRANSCODER_EDP;
+
 	cpu_ctl2 = I915_READ(BLC_PWM_CPU_CTL2);
 	if (cpu_ctl2 & BLM_PWM_ENABLE) {
 		DRM_DEBUG_KMS("cpu backlight already enabled\n");
@@ -973,6 +977,9 @@ static void i965_enable_backlight(struct intel_connector *connector)
 	enum pipe pipe = intel_get_pipe_from_connector(connector);
 	u32 ctl, ctl2, freq;

+	if (WARN_ON_ONCE(pipe == INVALID_PIPE))
+		pipe = PIPE_A;
+
 	ctl2 = I915_READ(BLC_PWM_CTL2);
 	if (ctl2 & BLM_PWM_ENABLE) {
 		DRM_DEBUG_KMS("backlight already enabled\n");
@@ -1037,6 +1044,9 @@ static void bxt_enable_backlight(struct intel_connector *connector)
 	enum pipe pipe = intel_get_pipe_from_connector(connector);
 	u32 pwm_ctl, val;

+	if (WARN_ON_ONCE(pipe) == INVALID_PIPE)
+		pipe = PIPE_A;
+
 	/* Controller 1 uses the utility pin. */
 	if (panel->backlight.controller == 1) {
 		val = I915_READ(UTIL_PIN_CTL);
@@ -1093,7 +1103,8 @@ void intel_panel_enable_backlight(struct intel_connector *connector)
 	if (!panel->backlight.present)
 		return;

-	DRM_DEBUG_KMS("pipe %c\n", pipe_name(pipe));
+	if (!WARN_ON_ONCE(pipe == INVALID_PIPE))
+		DRM_DEBUG_KMS("pipe %c\n", pipe_name(pipe));

 	mutex_lock(&dev_priv->backlight_lock);
--
2.5.0
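One thing worth a second look in v3: the bxt_enable_backlight() hunk still reads WARN_ON_ONCE(pipe) == INVALID_PIPE, with the closing parenthesis before the comparison, unlike the other three hunks. The userspace sketch below shows why the placement matters. It assumes that WARN_ON_ONCE() evaluates to 0 or 1 and that INVALID_PIPE is -1; warn_on() is a hypothetical stand-in that counts every hit instead of warning once, and both fallback_*() helpers are made up for the illustration.

```c
#include <stdbool.h>

/* Assumed to mirror the driver's enum: INVALID_PIPE is negative. */
enum pipe { INVALID_PIPE = -1, PIPE_A = 0, PIPE_B, PIPE_C };

static int warn_count;

/* Hypothetical stand-in for WARN_ON_ONCE(): yields 0 or 1 and counts
 * every hit (the real macro only prints on the first one). */
static int warn_on(bool cond)
{
	if (cond)
		warn_count++;
	return cond;
}

/* As posted: the 0/1 result is compared against -1, so the fallback
 * never runs and any non-zero pipe triggers the warning. */
static enum pipe fallback_as_posted(enum pipe pipe)
{
	if (warn_on(pipe) == INVALID_PIPE)
		pipe = PIPE_A;
	return pipe;
}

/* As in the other hunks: warn and fall back only for INVALID_PIPE. */
static enum pipe fallback_intended(enum pipe pipe)
{
	if (warn_on(pipe == INVALID_PIPE))
		pipe = PIPE_A;
	return pipe;
}
```

Under these assumptions the posted form would warn for PIPE_B and PIPE_C and leave INVALID_PIPE untouched, which is the opposite of what the commit message describes.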
Re: [Intel-gfx] [PATCH v7 12/15] drm/i915/perf: Add OA unit support for Gen 8+
On Tue, Apr 25, 2017 at 10:30:07AM -0700, Lionel Landwerlin wrote:
> +static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv)
> +{
> +	struct i915_gem_context *ctx;
> +	int ret;
> +
> +	ret = i915_mutex_lock_interruptible(&dev_priv->drm);
> +	if (ret)
> +		return ret;
> +
> +	/* Switch away from any user context. */
> +	ret = i915_gem_switch_to_kernel_context(dev_priv);
> +	if (ret) {
> +		mutex_unlock(&dev_priv->drm.struct_mutex);
> +		return ret;
> +	}
> +
> +	/* The OA register config is setup through the context image. This image
> +	 * might be written to by the GPU on context switch (in particular on
> +	 * lite-restore). This means we can't safely update a context's image,
> +	 * if this context is scheduled/submitted to run on the GPU.
> +	 *
> +	 * We could emit the OA register config through the batch buffer but
> +	 * this might leave small interval of time where the OA unit is
> +	 * configured at an invalid sampling period.
> +	 *
> +	 * So far the best way to work around this issue seems to be draining
> +	 * the GPU from any submitted work.
> +	 */
> +	ret = i915_gem_wait_for_idle(dev_priv,
> +				     I915_WAIT_INTERRUPTIBLE |
> +				     I915_WAIT_LOCKED);
> +	if (ret) {
> +		mutex_unlock(&dev_priv->drm.struct_mutex);
> +		return ret;
> +	}
> +
> +	/* Update all contexts now that we've stalled the submission. */
> +	list_for_each_entry(ctx, &dev_priv->context_list, link) {
> +		if (!ctx->engine[RCS].initialised)
> +			continue;
> +

You need to pin the context here, or map it directly; otherwise there is
no guarantee that the lrc_reg_state exists.

> +		gen8_update_reg_state_unlocked(ctx,
> +					       ctx->engine[RCS].lrc_reg_state);
> +	}
> +
> +	mutex_unlock(&dev_priv->drm.struct_mutex);
> +
> +	/* Now update the current context.

You don't need to. The current context is the kernel context, it is
scratch and never used by userspace so no oa-reports.

> +	 *
> +	 * Note: Using MMIO to update per-context registers requires
> +	 * some extra care...
> +	 */
> +	ret = gen8_begin_ctx_mmio(dev_priv);
> +	if (ret) {
> +		DRM_ERROR("Failed to bring RCS out of idle to update current ctx OA state\n");
> +		return ret;
> +	}
> +
> +	I915_WRITE(GEN8_OACTXCONTROL, ((dev_priv->perf.oa.period_exponent <<
> +					GEN8_OA_TIMER_PERIOD_SHIFT) |
> +				       (dev_priv->perf.oa.periodic ?
> +					GEN8_OA_TIMER_ENABLE : 0) |
> +				       GEN8_OA_COUNTER_RESUME));
> +
> +	config_oa_regs(dev_priv, dev_priv->perf.oa.flex_regs,
> +		       dev_priv->perf.oa.flex_regs_len);
> +
> +	gen8_end_ctx_mmio(dev_priv);

This entire chunk can go and I don't need to critique
gen8_begin_ctx_mmio() -- it needs a bit of tlc.

This patch is not ready.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH v2] drm/i915: Sanitize engine context sizes
On Wed, 2017-04-26 at 14:16 +0100, Tvrtko Ursulin wrote:
> On 26/04/2017 13:20, Joonas Lahtinen wrote:
> >
> > @@ -443,21 +417,6 @@ int i915_gem_context_init(struct drm_i915_private *dev_priv)
> >  	BUILD_BUG_ON(MAX_CONTEXT_HW_ID > INT_MAX);
> >  	ida_init(&dev_priv->context_hw_ida);
> >
> > -	if (i915.enable_execlists) {
> > -		/* NB: intentionally left blank. We will allocate our own
> > -		 * backing objects as we need them, thank you very much */
> > -		dev_priv->hw_context_size = 0;
> > -	} else if (HAS_HW_CONTEXTS(dev_priv)) {
> > -		dev_priv->hw_context_size =
> > -			round_up(get_context_size(dev_priv),
> > -				 I915_GTT_PAGE_SIZE);
>
> Is this rounding up lost when used from __create_hw_context?

Added it back for Gen 7 and 6 where it could do something. Others have
been rounded up in the heads of engineers already.

> > +static u32
> > +__intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
>
> Very minor, but Chris has been trying to establish i915 instead of
> dev_priv in new code.

It'd shadow the global i915 in this case, so leaving it as is :/

> > @@ -134,6 +208,10 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
> >  	engine->irq_shift = info->irq_shift;
> >  	engine->class = info->class;
> >  	engine->instance = info->instance;
> > +	engine->context_size = __intel_engine_context_size(dev_priv,
> > +							   engine->class);
> > +	if (WARN_ON(engine->context_size > BIT(20)))
> > +		engine->context_size = 0;
>
> Don't know the history to tell whether upgrade of DRM_DEBUG_DRIVER to a
> WARN_ON is ok.

Talked with Chris, if it triggers, it should definitely be WARN_ON :)
Never seen in the past.

Regards, Joonas
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
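The two points discussed above, rounding the raw context size up to a GTT page and rejecting anything over BIT(20), can be sketched together in userspace C. This is only an illustration: GTT_PAGE_SIZE is assumed to be 4 KiB, the helper names are hypothetical, and the sketch rounds unconditionally whereas the reply says the patch only does so on Gen 6 and 7.

```c
#include <stdint.h>

/* Assumptions for illustration: 4 KiB GTT pages, 1 MiB sanity limit
 * (the BIT(20) from the quoted hunk). */
#define GTT_PAGE_SIZE    4096u
#define MAX_CONTEXT_SIZE (1u << 20)

/* Round x up to the next multiple of align (align must be a power of 2). */
static uint32_t round_up_pow2(uint32_t x, uint32_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Hypothetical helper: page-align the raw size, then fall back to 0
 * when the result is implausibly large, as the quoted WARN_ON does. */
static uint32_t sanitize_context_size(uint32_t raw)
{
	uint32_t size = round_up_pow2(raw, GTT_PAGE_SIZE);

	if (size > MAX_CONTEXT_SIZE)
		return 0;
	return size;
}
```

The limit is applied after rounding, so a size just above 1 MiB is rejected rather than silently truncated.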
[Intel-gfx] [PATCH 1/2] A lockless Buffering Utility for Concurrency
The proposed buffering method utilizes atomic operations to manage data buffering. This methodology does not use classic locking approach (mutex, semaphores, blocking calls, etc.), therefore no "hard" serialization takes place. Signed-off-by: Krzysztof E. Olinski --- lib/buc.c | 208 + lib/buc.h | 242 ++ 2 files changed, 450 insertions(+) create mode 100755 lib/buc.c create mode 100755 lib/buc.h diff --git a/lib/buc.c b/lib/buc.c new file mode 100755 index 000..1a5b833 --- /dev/null +++ b/lib/buc.c @@ -0,0 +1,208 @@ +/* + * Copyright © 2017 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Authors: + *Krzysztof E. Olinski + * + */ + +#include "buc.h" +#include +#include +#include +#include +#include + + +#include "igt.h" +#define buc__assert igt_assert + +// Customized alloc definition. +// You can pick your favorite allocator here. 
+static void* buc__alloc(size_t size) +{ +// Add extra 4 bytes to store size of the buffer. +void* addr = malloc(size+sizeof(int)); +buc__assert(addr); + +{ +int ret; +ret = mlock(addr, size+sizeof(int)); +buc__assert(!ret); +} + +// Store 'size' for munlock. +// To be pedantic, call munlock before free. +*(unsigned int*)addr = size; + +return ((char*)addr + sizeof(int)); +} + +static void buc__free(void* addr) +{ +void* raddr = ((char*)addr - sizeof(int)); +unsigned int len = *(unsigned int*)raddr; +munlock(raddr, len); +free(raddr); +} + +static void* collector_thread(void *p) +{ +buc_t *my_buc = (buc_t*)p; +bufdesc_t *process_buffer = my_buc->bufferB; +int ret; + +while(my_buc->out_fd != -1) +{ +// Swap buffers A <-> B. +process_buffer = (bufdesc_t*)__atomic_exchange_n( +&my_buc->active_base, +process_buffer, +__ATOMIC_SEQ_CST); + +if(process_buffer->cursor > 0) +{ +// Wait until nobody is writing to the buffer. +// The only lock in this design is here in the collector. +while(process_buffer->ref_pointer > (void*)process_buffer) +sched_yield(); + +ret = write(my_buc->out_fd, +(char*)process_buffer + sizeof(bufdesc_t), +(process_buffer->cursor_of_last_hope)?\ +process_buffer->cursor_of_last_hope:process_buffer->cursor); +buc__assert(ret); +process_buffer->cursor = 0; +process_buffer->cursor_of_last_hope = 0; +} +sched_yield(); +} + +pthread_exit(NULL); +} + +buc_t* buc__create(int out_fd, unsigned int buffer_size) +{ +int r; + +// Create main structure. +buc_t *new_buflogger = buc__alloc(sizeof(buc_t)); +buc__assert(new_buflogger); + +// Allocate buffers A and B. 
+new_buflogger->bufferA = buc__alloc(2*(sizeof(bufdesc_t) + buffer_size)); +buc__assert(new_buflogger->bufferA); +new_buflogger->bufferA->ref_pointer = new_buflogger->bufferA; +new_buflogger->bufferA->cursor = 0; + +new_buflogger->bufferB = (bufdesc_t*)(((char*)(new_buflogger->bufferA) ++ (sizeof(bufdesc_t) + buffer_size))); +new_buflogger->bufferB->ref_pointer = new_buflogger->bufferB; +new_buflogger->bufferB->cursor = 0; + +new_buflogger->active_base = new_buflogger->bufferA; +new_buflogger->buffer_size = buffer_size; +new_buflogger->out_fd = out_fd; +new_buflogger->overflow_counter = 0; + +// Create collector thread. +r = pthread_create(&new_buflogger->collector_thread, + NULL, + collector_thread, + new_buflogger); +buc__assert(r==0); + +return new_buflogger; +} + +int buc__append(buc_t* this_buc, void* buf, unsigned in
[Intel-gfx] [PATCH 0/2] GuC logger redesign
GuC logger implementation simplified and moved to a library (GuCLAW).
Adds simple buffering utility for logging routine (BUC).

Krzysztof E. Olinski (2):
  A lockless Buffering Utility for Concurrency
  Simplification of guc logger design

 lib/Makefile.sources     |   4 +
 lib/buc.c                | 208 +
 lib/buc.h                | 242
 lib/igt_guclaw.c         | 272 +++
 lib/igt_guclaw.h         |  81 +
 tools/intel_guc_logger.c | 465 +--
 6 files changed, 893 insertions(+), 379 deletions(-)
 mode change 100644 => 100755 lib/Makefile.sources
 create mode 100755 lib/buc.c
 create mode 100755 lib/buc.h
 create mode 100755 lib/igt_guclaw.c
 create mode 100755 lib/igt_guclaw.h
 mode change 100644 => 100755 tools/intel_guc_logger.c

--
2.9.3
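As a rough illustration of the buffer swap at the heart of the BUC utility in patch 1/2, here is a minimal sketch using the same GCC/Clang __atomic_exchange_n builtin that the patch's collector thread uses. Everything else is an assumption made for the example: the struct layout, the names (struct dbuf, dbuf_append, dbuf_collect, BUF_CAP) and the single-producer simplification are hypothetical. In particular, the sketch omits the writer tracking (ref_pointer) that the patch's collector waits on, so it is not safe against a producer that is in the middle of an append.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical capacity for the illustration. */
#define BUF_CAP 64

struct buf {
	size_t cursor;          /* bytes used so far */
	char data[BUF_CAP];
};

struct dbuf {
	struct buf a, b;
	struct buf *active;     /* buffer producers append to */
	struct buf *spare;      /* buffer owned by the collector */
};

static void dbuf_init(struct dbuf *d)
{
	d->a.cursor = 0;
	d->b.cursor = 0;
	d->active = &d->a;
	d->spare = &d->b;
}

/* Single-producer append; returns -1 when the active buffer is full.
 * NOTE: no writer tracking here, unlike the patch. */
static int dbuf_append(struct dbuf *d, const void *src, size_t len)
{
	struct buf *cur = __atomic_load_n(&d->active, __ATOMIC_SEQ_CST);

	if (len > BUF_CAP - cur->cursor)
		return -1;
	memcpy(cur->data + cur->cursor, src, len);
	cur->cursor += len;
	return 0;
}

/* Collector: atomically swap the empty spare in as the new active
 * buffer, then drain the old one into out. Returns bytes drained. */
static size_t dbuf_collect(struct dbuf *d, char *out)
{
	struct buf *full = __atomic_exchange_n(&d->active, d->spare,
					       __ATOMIC_SEQ_CST);
	size_t n = full->cursor;

	memcpy(out, full->data, n);
	full->cursor = 0;
	d->spare = full;
	return n;
}
```

The point of the exchange is that producers never block on the collector: they always find some active buffer, and the collector drains the other one at its own pace.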
[Intel-gfx] [PATCH 2/2] Simplification of guc logger design
There are some compile problems for Android platform. The aim of this patch is to simplify the current design and make it compilable both on Linux and Android. Signed-off-by: Krzysztof E. Olinski --- lib/Makefile.sources | 4 + lib/igt_guclaw.c | 272 +++ lib/igt_guclaw.h | 81 + tools/intel_guc_logger.c | 465 +-- 4 files changed, 443 insertions(+), 379 deletions(-) mode change 100644 => 100755 lib/Makefile.sources create mode 100755 lib/igt_guclaw.c create mode 100755 lib/igt_guclaw.h mode change 100644 => 100755 tools/intel_guc_logger.c diff --git a/lib/Makefile.sources b/lib/Makefile.sources old mode 100644 new mode 100755 index 6348487..89a0fee --- a/lib/Makefile.sources +++ b/lib/Makefile.sources @@ -83,6 +83,10 @@ lib_source_list =\ uwildmat/uwildmat.c \ igt_kmod.c \ igt_kmod.h \ + buc.c \ + buc.h \ + igt_guclaw.c\ + igt_guclaw.h\ $(NULL) if HAVE_CHAMELIUM diff --git a/lib/igt_guclaw.c b/lib/igt_guclaw.c new file mode 100755 index 000..6880f4d --- /dev/null +++ b/lib/igt_guclaw.c @@ -0,0 +1,272 @@ +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "igt.h" +#include "buc.h" +#include "igt_guclaw.h" + +#define MB(x) ((uint64_t)(x) * 1024 * 1024) +#ifndef PAGE_SIZE + #define PAGE_SIZE 4096 +#endif +/* Currently the size of GuC log buffer is 19 pages & so is the size of relay + * subbuffer. If the size changes in future, then this define also needs to be + * updated accordingly. + */ +#define SUBBUF_SIZE (19*PAGE_SIZE) +/* Need large buffering from logger side to hide the DISK IO latency, Driver + * can only store 8 snapshots of GuC log buffer in relay. 
+ */ + +#define NUM_SUBBUFS 100 + +#define RELAY_FILE_NAME "guc_log" +#define DEFAULT_OUTPUT_FILE_NAME "guc_log_dump.dat" +#define CONTROL_FILE_NAME "i915_guc_log_control" + +buc_t* my_buc = NULL; + +static guclaw_options_t guclaw_ops; + +uint64_t total_bytes_written; +int relay_fd, drm_fd, outfile_fd = -1; +bool stop_logging = false, suspend_logging = true; +pthread_t guclaw_thread; + + +static void guc_log_control(bool enable_logging) +{ +int control_fd; +char data[19]; +uint64_t val; +int ret; + +control_fd = igt_debugfs_open(drm_fd, CONTROL_FILE_NAME, O_WRONLY); +igt_assert_f(control_fd >= 0, "couldn't open the guc log control file\n"); + +val = enable_logging ? ((guclaw_ops.verbosity_level << 4) | 0x1) : 0; + +ret = snprintf(data, sizeof(data), "0x%" PRIx64, val); +igt_assert(ret > 2 && ret < sizeof(data)); + +ret = write(control_fd, data, ret); +igt_assert_f(ret > 0, "couldn't write to the log control file\n"); + +close(control_fd); +} + +static void pull_leftover_data(void) +{ +unsigned int bytes_read = 0; +int ret; +char rbuf[SUBBUF_SIZE]; + +do { +/* Read the logs from relay buffer */ +ret = read(relay_fd, rbuf, SUBBUF_SIZE); +if (!ret) +break; + +igt_assert_f(ret > 0, "failed to read from the guc log file\n"); + +bytes_read += ret; + +if(guclaw_ops.discard_oldlogs) +{ +total_bytes_written += ret; +if (outfile_fd >= 0) + buc__append(my_buc, rbuf, ret); +} + +} while(1); + +igt_info("%u bytes flushed\n", bytes_read); +} + +static void pull_data(void) +{ +int ret; +char rbuf[SUBBUF_SIZE]; + +do +{ +/* Read the logs from relay buffer */ +ret = read(relay_fd, rbuf, SUBBUF_SIZE); +if (!ret) +break; + +igt_assert_f(ret >= 0, "failed to read from the guc log file\n"); +total_bytes_written += ret; + +if(!suspend_logging) +buc__append(my_buc, rbuf, ret); + +} while(1); +} + +static void open_relay_file(void) +{ +relay_fd = igt_debugfs_open(drm_fd, RELAY_FILE_NAME, O_RDONLY); +igt_assert_f(relay_fd >= 0, "couldn't open the guc log file\n"); + +/* Purge the 
old/boot-time logs from the relay buffer. + * This is more for Val team's requirement, where they have to first + * purge the existing logs before starting the tests for which the logs + * are actually needed. After this logger will enter into a loop and + * wait for the new data, at that point benchmark can be launched from + * a different shell. + */ + + pull_leftover_data(); +} + +static void open_output_file(void) +{ +outfile_fd = open(guclaw_ops.out_filename ? : DEFAULT_OUTPUT_FILE_NAME, + O_CREAT | O_WRONLY | O_TRUNC, S_IWUSR|S_IROTH); + +igt_assert_f(outfile_fd >= 0, "coul
[Intel-gfx] [PATCH v3] drm/i915: Sanitize engine context sizes
Pre-calculate engine context size based on engine class and device generation and store it in the engine instance. v2: - Squash and get rid of hw_context_size (Chris) v3: - Move after MMIO init for probing on Gen7 and 8 (Chris) - Retained rounding (Tvrtko) Signed-off-by: Joonas Lahtinen Cc: Paulo Zanoni Cc: Rodrigo Vivi Cc: Chris Wilson Cc: Daniele Ceraolo Spurio Cc: Tvrtko Ursulin Cc: Oscar Mateo Cc: Zhenyu Wang Cc: intel-gvt-...@lists.freedesktop.org Acked-by: Tvrtko Ursulin Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/gvt/scheduler.c | 6 +- drivers/gpu/drm/i915/i915_drv.c| 15 +++-- drivers/gpu/drm/i915/i915_drv.h| 3 +- drivers/gpu/drm/i915/i915_gem_context.c| 61 +++- drivers/gpu/drm/i915/i915_guc_submission.c | 3 +- drivers/gpu/drm/i915/i915_reg.h| 10 drivers/gpu/drm/i915/intel_engine_cs.c | 90 +- drivers/gpu/drm/i915/intel_lrc.c | 54 +- drivers/gpu/drm/i915/intel_lrc.h | 2 - drivers/gpu/drm/i915/intel_ringbuffer.h| 7 ++- 10 files changed, 113 insertions(+), 138 deletions(-) diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c index bada32b..1256fe2 100644 --- a/drivers/gpu/drm/i915/gvt/scheduler.c +++ b/drivers/gpu/drm/i915/gvt/scheduler.c @@ -69,8 +69,7 @@ static int populate_shadow_context(struct intel_vgpu_workload *workload) gvt_dbg_sched("ring id %d workload lrca %x", ring_id, workload->ctx_desc.lrca); - context_page_num = intel_lr_context_size( - gvt->dev_priv->engine[ring_id]); + context_page_num = gvt->dev_priv->engine[ring_id]->context_size; context_page_num = context_page_num >> PAGE_SHIFT; @@ -330,8 +329,7 @@ static void update_guest_context(struct intel_vgpu_workload *workload) gvt_dbg_sched("ring id %d workload lrca %x\n", ring_id, workload->ctx_desc.lrca); - context_page_num = intel_lr_context_size( - gvt->dev_priv->engine[ring_id]); + context_page_num = gvt->dev_priv->engine[ring_id]->context_size; context_page_num = context_page_num >> PAGE_SHIFT; diff --git a/drivers/gpu/drm/i915/i915_drv.c 
b/drivers/gpu/drm/i915/i915_drv.c index c7d68e7..2d3c4264 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -835,10 +835,6 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv, intel_uc_init_early(dev_priv); i915_memcpy_init_early(dev_priv); - ret = intel_engines_init_early(dev_priv); - if (ret) - return ret; - ret = i915_workqueues_init(dev_priv); if (ret < 0) goto err_engines; @@ -948,14 +944,21 @@ static int i915_driver_init_mmio(struct drm_i915_private *dev_priv) ret = i915_mmio_setup(dev_priv); if (ret < 0) - goto put_bridge; + goto err_bridge; intel_uncore_init(dev_priv); + + ret = intel_engines_init_mmio(dev_priv); + if (ret) + goto err_uncore; + i915_gem_init_mmio(dev_priv); return 0; -put_bridge: +err_uncore: + intel_uncore_fini(dev_priv); +err_bridge: pci_dev_put(dev_priv->bridge_dev); return ret; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 357b6c6..0494e08c 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2359,7 +2359,6 @@ struct drm_i915_private { */ struct mutex av_mutex; - uint32_t hw_context_size; struct list_head context_list; u32 fdi_rx_config; @@ -3023,7 +3022,7 @@ extern unsigned long i915_gfx_val(struct drm_i915_private *dev_priv); extern void i915_update_gfx_val(struct drm_i915_private *dev_priv); int vlv_force_gfx_clock(struct drm_i915_private *dev_priv, bool on); -int intel_engines_init_early(struct drm_i915_private *dev_priv); +int intel_engines_init_mmio(struct drm_i915_private *dev_priv); int intel_engines_init(struct drm_i915_private *dev_priv); /* intel_hotplug.c */ diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 8bd0c49..3271012 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -92,33 +92,6 @@ #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1 -static int get_context_size(struct drm_i915_private 
*dev_priv) -{ - int ret; - u32 reg; - - switch (INTEL_GEN(dev_priv)) { - case 6: - reg = I915_READ(CXT_SIZE); - ret = GEN6_CXT_TOTAL_SIZE(reg) * 64; - break; - case 7: - reg = I915_READ(GEN7_CXT_SIZE); - if (IS_HASWELL(dev_priv)) - ret = HSW_CXT_TOTAL_SIZE; - else -
Re: [Intel-gfx] [PATCH v3 5/8] drm/i915: Check error return when converting pipe to connector
On Thu, 27 Apr 2017, Imre Deak wrote: > An error from intel_get_pipe_from_connector() would mean a bug somewhere > else, but we still should check for it to prevent some other more > obscure bug later. > > v2: > - Fall back to a reasonable default instead of bailing out in case of > error. (Jani) > v3: > - Fix s/PIPE_INVALID/INVALID_PIPE/ typo. (Jani) > > Cc: Jani Nikula > Signed-off-by: Imre Deak Reviewed-by: Jani Nikula > --- > drivers/gpu/drm/i915/intel_panel.c | 17 ++--- > 1 file changed, 14 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_panel.c > b/drivers/gpu/drm/i915/intel_panel.c > index cb50c52..d1abbf1 100644 > --- a/drivers/gpu/drm/i915/intel_panel.c > +++ b/drivers/gpu/drm/i915/intel_panel.c > @@ -888,10 +888,14 @@ static void pch_enable_backlight(struct intel_connector > *connector) > struct drm_i915_private *dev_priv = to_i915(connector->base.dev); > struct intel_panel *panel = &connector->panel; > enum pipe pipe = intel_get_pipe_from_connector(connector); > - enum transcoder cpu_transcoder = > - intel_pipe_to_cpu_transcoder(dev_priv, pipe); > + enum transcoder cpu_transcoder; > u32 cpu_ctl2, pch_ctl1, pch_ctl2; > > + if (!WARN_ON_ONCE(pipe == INVALID_PIPE)) > + cpu_transcoder = intel_pipe_to_cpu_transcoder(dev_priv, pipe); > + else > + cpu_transcoder = TRANSCODER_EDP; > + > cpu_ctl2 = I915_READ(BLC_PWM_CPU_CTL2); > if (cpu_ctl2 & BLM_PWM_ENABLE) { > DRM_DEBUG_KMS("cpu backlight already enabled\n"); > @@ -973,6 +977,9 @@ static void i965_enable_backlight(struct intel_connector > *connector) > enum pipe pipe = intel_get_pipe_from_connector(connector); > u32 ctl, ctl2, freq; > > + if (WARN_ON_ONCE(pipe == INVALID_PIPE)) > + pipe = PIPE_A; > + > ctl2 = I915_READ(BLC_PWM_CTL2); > if (ctl2 & BLM_PWM_ENABLE) { > DRM_DEBUG_KMS("backlight already enabled\n"); > @@ -1037,6 +1044,9 @@ static void bxt_enable_backlight(struct intel_connector > *connector) > enum pipe pipe = intel_get_pipe_from_connector(connector); > u32 pwm_ctl, val; > 
> + if (WARN_ON_ONCE(pipe) == INVALID_PIPE) > + pipe = PIPE_A; > + > /* Controller 1 uses the utility pin. */ > if (panel->backlight.controller == 1) { > val = I915_READ(UTIL_PIN_CTL); > @@ -1093,7 +1103,8 @@ void intel_panel_enable_backlight(struct > intel_connector *connector) > if (!panel->backlight.present) > return; > > - DRM_DEBUG_KMS("pipe %c\n", pipe_name(pipe)); > + if (!WARN_ON_ONCE(pipe == INVALID_PIPE)) > + DRM_DEBUG_KMS("pipe %c\n", pipe_name(pipe)); > > mutex_lock(&dev_priv->backlight_lock); -- Jani Nikula, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
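A side note on the bxt_enable_backlight() hunk quoted in this message: the closing parenthesis sits right after `pipe`, so the comparison is applied to the result of WARN_ON_ONCE() rather than to `pipe` itself. The userspace sketch below illustrates what that difference means in plain C. It is only an illustration under stated assumptions: `WARN_ON()` here is a simplified stand-in that evaluates to 0 or 1 (not the kernel macro), the enum values are assumed, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration; only the sign of INVALID_PIPE matters. */
enum pipe { INVALID_PIPE = -1, PIPE_A = 0, PIPE_B, PIPE_C };

static int warn_count;

/* Simplified stand-in for a WARN_ON-style macro: counts hits, yields 0 or 1. */
static int warn_on(bool cond)
{
    if (cond)
        warn_count++;
    return cond;
}
#define WARN_ON(x) warn_on(!!(x))

/* Comparison inside the macro: warn and fall back only for INVALID_PIPE. */
static enum pipe sanitize_pipe(enum pipe pipe)
{
    if (WARN_ON(pipe == INVALID_PIPE))
        pipe = PIPE_A;
    return pipe;
}

/*
 * Comparison outside the macro: the macro result (0 or 1) is compared
 * with -1, which is never true, so the fallback never runs. The warning
 * fires for every non-zero pipe instead.
 */
static enum pipe sanitize_pipe_misplaced(enum pipe pipe)
{
    if (WARN_ON(pipe) == INVALID_PIPE)
        pipe = PIPE_A;
    return pipe;
}

/* Hypothetical self-check exercising both variants. */
static void demo_pipe_check(void)
{
    warn_count = 0;
    assert(sanitize_pipe(INVALID_PIPE) == PIPE_A);
    assert(sanitize_pipe(PIPE_B) == PIPE_B);
    assert(warn_count == 1);

    warn_count = 0;
    assert(sanitize_pipe_misplaced(INVALID_PIPE) == INVALID_PIPE);
    assert(sanitize_pipe_misplaced(PIPE_A) == PIPE_A);
    assert(sanitize_pipe_misplaced(PIPE_B) == PIPE_B);
    assert(warn_count == 2);
}
```

With the comparison inside the macro, the fallback to PIPE_A behaves like the other two hunks in the patch.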
Re: [Intel-gfx] [PATCH 0/2] GuC logger redesign
On Thu, Apr 27, 2017 at 10:59:18AM +0200, Krzysztof E. Olinski wrote:
> GuC logger implementation simplified and moved to a library (GuCLAW).
> Adds simple buffering utility for logging routine (BUC).

Bigger question, why? What design goals do you want to achieve?
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
[Intel-gfx] [RFC v2 1/2] drm/i915: Engine discovery uAPI
From: Tvrtko Ursulin Engine discovery uAPI allows userspace to probe for engine configuration and features without needing to maintain the internal PCI id based database. This enables removal of code duplications across userspace components. Probing is done via the new DRM_IOCTL_I915_GEM_ENGINE_INFO ioctl which returns the number and information on the specified engine class. Currently only general engine configuration and HEVC feature of the VCS engine can be probed but the uAPI is designed to be generic and extensible. Code is based almost exactly on the earlier proposal on the topic by Jon Bloomfield. Engine class and instance refactoring made recently by Daniele Ceraolo Spurio enabled this to be implemented in an elegant fashion. To probe configuration userspace sets the engine class it wants to query (struct drm_i915_gem_engine_info) and provides an array of drm_i915_engine_info structs which will be filled in by the driver. Userspace also has to tell i915 how many elements are in the array, and the driver will report back the total number of engine instances in any case. v2: * Add a version field and consolidate to one engine count. (Chris Wilson) * Rename uAPI flags for VCS engines to DRM_I915_ENGINE_CLASS_VIDEO. 
(Gong Zhipeng) Signed-off-by: Tvrtko Ursulin Cc: Ben Widawsky Cc: Chris Wilson Cc: Daniel Vetter Cc: Joonas Lahtinen Cc: Jon Bloomfield Cc: Daniel Charles Cc: "Rogozhkin, Dmitry V" Cc: Oscar Mateo Cc: "Gong, Zhipeng" Cc: intel-vaapi-me...@lists.01.org Cc: mesa-...@lists.freedesktop.org --- drivers/gpu/drm/i915/i915_drv.c| 1 + drivers/gpu/drm/i915/i915_drv.h| 3 ++ drivers/gpu/drm/i915/intel_engine_cs.c | 64 ++ include/uapi/drm/i915_drm.h| 40 + 4 files changed, 108 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index c7d68e789642..1a3f0859227b 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -2609,6 +2609,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = { DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, i915_gem_context_setparam_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW), + DRM_IOCTL_DEF_DRV(I915_GEM_ENGINE_INFO, i915_gem_engine_info_ioctl, DRM_RENDER_ALLOW), }; static struct drm_driver driver = { diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 357b6c6c2f04..6eed0e854561 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3547,6 +3547,9 @@ i915_gem_context_lookup_timeline(struct i915_gem_context *ctx, int i915_perf_open_ioctl(struct drm_device *dev, void *data, struct drm_file *file); +int i915_gem_engine_info_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); + /* i915_gem_evict.c */ int __must_check i915_gem_evict_something(struct i915_address_space *vm, u64 min_size, u64 alignment, diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c index 82a274b336c5..caed32dbd912 100644 --- a/drivers/gpu/drm/i915/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/intel_engine_cs.c @@ -25,6 +25,7 @@ #include "i915_drv.h" 
#include "intel_ringbuffer.h" #include "intel_lrc.h" +#include struct engine_class_info { const char *name; @@ -1187,6 +1188,69 @@ void intel_engines_reset_default_submission(struct drm_i915_private *i915) engine->set_default_submission(engine); } +u8 user_class_map[DRM_I915_ENGINE_CLASS_MAX] = { + [DRM_I915_ENGINE_CLASS_OTHER] = OTHER_CLASS, + [DRM_I915_ENGINE_CLASS_RENDER] = RENDER_CLASS, + [DRM_I915_ENGINE_CLASS_COPY] = COPY_ENGINE_CLASS, + [DRM_I915_ENGINE_CLASS_VIDEO] = VIDEO_DECODE_CLASS, + [DRM_I915_ENGINE_CLASS_VIDEO_ENHANCE] = VIDEO_ENHANCEMENT_CLASS, +}; + +int i915_gem_engine_info_ioctl(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct drm_i915_private *i915 = to_i915(dev); + struct drm_i915_gem_engine_info *args = data; + struct drm_i915_engine_info __user *user_info = + u64_to_user_ptr(args->info_ptr); + unsigned int info_size = args->num_engines; + struct drm_i915_engine_info info; + struct intel_engine_cs *engine; + enum intel_engine_id id; + u8 class; + + if (args->rsvd) + return -EINVAL; + + switch (args->engine_class) { + case DRM_I915_ENGINE_CLASS_OTHER: + case DRM_I915_ENGINE_CLASS_RENDER: + case DRM_I915_ENGINE_CLASS_COPY: + case DRM_I915_ENGINE_CLASS_VIDEO: + case DRM_I915_ENGINE_CLASS_VIDEO_ENHANCE: +
[Intel-gfx] [RFC v2 2/2] drm/i915: Select engines via class and instance in execbuffer2
From: Tvrtko Ursulin Building on top of the previous patch which exported the concept of engine classes and instances, we can also use this instead of the current awkward engine selection uAPI. This is primarily interesting for the VCS engine selection which is a) currently done via disjoint set of flags, and b) the current I915_EXEC_BSD flags has different semantics depending on the underlying hardware which is bad. Proposed idea here is to reserve 16-bits of flags, to pass in the engine class and instance (8 bits each), and a new flag named I915_EXEC_CLASS_INSTACE to tell the kernel this new engine selection API is in use. The new uAPI also removes access to the weak VCS engine balancing as currently existing in the driver. Example usage to send a command to VCS0: eb.flags = i915_execbuffer2_engine(DRM_I915_ENGINE_CLASS_VIDEO_DECODE, 0); Or to send a command to VCS1: eb.flags = i915_execbuffer2_engine(DRM_I915_ENGINE_CLASS_VIDEO_DECODE, 1); v2: * Fix unknown flags mask. * Use I915_EXEC_RING_MASK for class. 
(Chris Wilson) Signed-off-by: Tvrtko Ursulin Cc: Ben Widawsky Cc: Chris Wilson Cc: Daniel Vetter Cc: Joonas Lahtinen Cc: Jon Bloomfield Cc: Daniel Charles Cc: "Rogozhkin, Dmitry V" Cc: Oscar Mateo Cc: "Gong, Zhipeng" Cc: intel-vaapi-me...@lists.01.org Cc: mesa-...@lists.freedesktop.org --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 29 + include/uapi/drm/i915_drm.h| 11 ++- 2 files changed, 39 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index af1965774e7b..ecd1486642a7 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1492,6 +1492,32 @@ gen8_dispatch_bsd_engine(struct drm_i915_private *dev_priv, return file_priv->bsd_engine; } +extern u8 user_class_map[DRM_I915_ENGINE_CLASS_MAX]; + +static struct intel_engine_cs * +eb_select_engine_class_instance(struct drm_i915_private *i915, + struct drm_i915_gem_execbuffer2 *args) +{ + struct intel_engine_cs *engine; + enum intel_engine_id id; + u8 class, instance; + + class = args->flags & I915_EXEC_RING_MASK; + if (class >= DRM_I915_ENGINE_CLASS_MAX) + return NULL; + class = user_class_map[class]; + + instance = (args->flags >> I915_EXEC_INSTANCE_SHIFT) && + I915_EXEC_INSTANCE_MASK; + + for_each_engine(engine, i915, id) { + if (engine->class == class && engine->instance == instance) + return engine; + } + + return NULL; +} + #define I915_USER_RINGS (4) static const enum intel_engine_id user_ring_map[I915_USER_RINGS + 1] = { @@ -1510,6 +1536,9 @@ eb_select_engine(struct drm_i915_private *dev_priv, unsigned int user_ring_id = args->flags & I915_EXEC_RING_MASK; struct intel_engine_cs *engine; + if (args->flags & I915_EXEC_CLASS_INSTANCE) + return eb_select_engine_class_instance(dev_priv, args); + if (user_ring_id > I915_USER_RINGS) { DRM_DEBUG("execbuf with unknown ring: %u\n", user_ring_id); return NULL; diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 
2ac6667e57ea..6a26bdf5e684 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -906,7 +906,12 @@ struct drm_i915_gem_execbuffer2 { */ #define I915_EXEC_FENCE_OUT(1<<17) -#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_OUT<<1)) +#define I915_EXEC_CLASS_INSTANCE (1<<18) + +#define I915_EXEC_INSTANCE_SHIFT (19) +#define I915_EXEC_INSTANCE_MASK(0xff << I915_EXEC_INSTANCE_SHIFT) + +#define __I915_EXEC_UNKNOWN_FLAGS (-((1 << 27) << 1)) #define I915_EXEC_CONTEXT_ID_MASK (0x) #define i915_execbuffer2_set_context_id(eb2, context) \ @@ -914,6 +919,10 @@ struct drm_i915_gem_execbuffer2 { #define i915_execbuffer2_get_context_id(eb2) \ ((eb2).rsvd1 & I915_EXEC_CONTEXT_ID_MASK) +#define i915_execbuffer2_engine(class, instance) \ + (I915_EXEC_CLASS_INSTANCE | (class) | \ + ((instance) << I915_EXEC_INSTANCE_SHIFT)) + struct drm_i915_gem_pin { /** Handle of the buffer to be pinned. */ __u32 handle; -- 2.9.3 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
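For readers following the flag layout proposed in this patch, here is a minimal userspace sketch of how a class and an instance could be packed into, and read back from, the execbuffer2 flags word. The CLASS_INSTANCE bit, the instance shift and the instance mask are copied from the patch; the value of the ring mask (low three bits) is an assumption, and all `demo_`/`DEMO_` names are hypothetical. The last helper reproduces the expression as posted, which uses a logical `&&` against an already shifted mask, to show how its result differs from a bitwise decode.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_EXEC_RING_MASK       (7u << 0)   /* assumed: low three bits */
#define DEMO_EXEC_CLASS_INSTANCE  (1u << 18)  /* from the patch */
#define DEMO_EXEC_INSTANCE_SHIFT  19          /* from the patch */
#define DEMO_EXEC_INSTANCE_MASK   (0xffu << DEMO_EXEC_INSTANCE_SHIFT)

/* Pack class and instance, mirroring the i915_execbuffer2_engine() macro. */
static uint64_t demo_execbuf_engine(uint8_t engine_class, uint8_t instance)
{
    return DEMO_EXEC_CLASS_INSTANCE | engine_class |
           ((uint64_t)instance << DEMO_EXEC_INSTANCE_SHIFT);
}

/* Class lives in the ring-selection bits. */
static uint8_t demo_flags_class(uint64_t flags)
{
    return flags & DEMO_EXEC_RING_MASK;
}

/* Bitwise decode: mask first (the mask is already shifted), then shift. */
static uint8_t demo_flags_instance(uint64_t flags)
{
    return (flags & DEMO_EXEC_INSTANCE_MASK) >> DEMO_EXEC_INSTANCE_SHIFT;
}

/* The expression as posted: logical AND collapses the result to 0 or 1. */
static uint8_t demo_flags_instance_logical(uint64_t flags)
{
    return (flags >> DEMO_EXEC_INSTANCE_SHIFT) && DEMO_EXEC_INSTANCE_MASK;
}

/* Hypothetical self-check. */
static void demo_flags_check(void)
{
    uint64_t flags = demo_execbuf_engine(3, 2);

    assert(flags & DEMO_EXEC_CLASS_INSTANCE);
    assert(demo_flags_class(flags) == 3);
    assert(demo_flags_instance(flags) == 2);
    assert(demo_flags_instance_logical(flags) == 1);
    assert(demo_flags_instance(demo_execbuf_engine(3, 0)) == 0);
}
```

The two decoders agree for instances 0 and 1, which is probably why the difference is easy to miss on hardware with at most two VCS engines.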
Re: [Intel-gfx] [PATCH v3] drm/i915: Sanitize engine context sizes
On Thu, Apr 27, 2017 at 12:01:11PM +0300, Joonas Lahtinen wrote:
> Pre-calculate engine context size based on engine class and device
> generation and store it in the engine instance.
>
> v2:
> - Squash and get rid of hw_context_size (Chris)
>
> v3:
> - Move after MMIO init for probing on Gen7 and 8 (Chris)
> - Retained rounding (Tvrtko)
>
> Signed-off-by: Joonas Lahtinen
> Cc: Paulo Zanoni
> Cc: Rodrigo Vivi
> Cc: Chris Wilson
> Cc: Daniele Ceraolo Spurio
> Cc: Tvrtko Ursulin
> Cc: Oscar Mateo
> Cc: Zhenyu Wang
> Cc: intel-gvt-...@lists.freedesktop.org
> Acked-by: Tvrtko Ursulin
> Cc: Tvrtko Ursulin

Reviewed-by: Chris Wilson

Lgtm, though I did wish you would kill HAS_HW_CONTEXT after all that.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH 07/27] drm/i915: Squash repeated awaits on the same fence
On 26/04/2017 23:22, Chris Wilson wrote:
> On Wed, Apr 26, 2017 at 07:56:14PM +0100, Chris Wilson wrote:
>> On Wed, Apr 26, 2017 at 01:13:41PM +0100, Tvrtko Ursulin wrote:
>>> I was thinking of exactly the same thing as this patch does, u64
>>> context id as key, u32 seqnos (wrapped in a container with
>>> hlist_node).
>>
>> #define NSYNC 32
>> struct intel_timeline_sync { /* kmalloc-256 slab */
>>         struct hlist_node node;
>>         u64 prefix;
>>         u32 bitmap;
>>         u32 seqno[NSYNC];
>> };
>> DECLARE_HASHTABLE(sync, 7);
>>
>> If I squint, the numbers favour the idr. ;)
>
> Hmm, it didn't take much to start running into misery with a static ht.
> I know my testing is completely artificial but I am not going to be
> happy with a static size, it will always be too big or too small and
> never just Goldilocks.

Oh what a pity, implementation is so much smaller. What kind of misery was it? I presume no longer below the noise floor? With more than three buckets?

If no other choice I'll tackle the review. Hopefully won't get lost in all the shifts, leafs, branches and prefixes. :)

Regards,

Tvrtko

P.S. GEM_STATS you mention in the other reply - what are you referring to with that? The idea to expose queue depths and possibly more via some interface? If so prototyping that is almost next on my TODO list.
Re: [Intel-gfx] [PATCH 0/2] GuC logger redesign
On Thu, 2017-04-27 at 10:05 +0100, Chris Wilson wrote:
> On Thu, Apr 27, 2017 at 10:59:18AM +0200, Krzysztof E. Olinski wrote:
> > GuC logger implementation simplified and moved to a library
> > (GuCLAW).
> > Adds simple buffering utility for logging routine (BUC).
>
> Bigger question, why? What design goals do you want to achieve?
> -Chris

Currently, there are problems with compilation for the Android platform due to pthread dependencies. The proposed implementation should work on both Linux and Android. I also thought this would be a good occasion to introduce lockless mechanisms to improve efficiency.

Regards,
Krzysztof
Re: [Intel-gfx] [RFC v2 2/2] drm/i915: Select engines via class and instance in execbuffer2
On Thu, Apr 27, 2017 at 10:10:34AM +0100, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin > > Building on top of the previous patch which exported the concept > of engine classes and instances, we can also use this instead of > the current awkward engine selection uAPI. > > This is primarily interesting for the VCS engine selection which > is a) currently done via disjoint set of flags, and b) the > current I915_EXEC_BSD flags has different semantics depending on > the underlying hardware which is bad. > > Proposed idea here is to reserve 16-bits of flags, to pass in > the engine class and instance (8 bits each), and a new flag > named I915_EXEC_CLASS_INSTACE to tell the kernel this new engine > selection API is in use. > > The new uAPI also removes access to the weak VCS engine > balancing as currently existing in the driver. > > Example usage to send a command to VCS0: > > eb.flags = i915_execbuffer2_engine(DRM_I915_ENGINE_CLASS_VIDEO_DECODE, 0); > > Or to send a command to VCS1: > > eb.flags = i915_execbuffer2_engine(DRM_I915_ENGINE_CLASS_VIDEO_DECODE, 1); > > v2: > * Fix unknown flags mask. > * Use I915_EXEC_RING_MASK for class. 
(Chris Wilson) > > Signed-off-by: Tvrtko Ursulin > Cc: Ben Widawsky > Cc: Chris Wilson > Cc: Daniel Vetter > Cc: Joonas Lahtinen > Cc: Jon Bloomfield > Cc: Daniel Charles > Cc: "Rogozhkin, Dmitry V" > Cc: Oscar Mateo > Cc: "Gong, Zhipeng" > Cc: intel-vaapi-me...@lists.01.org > Cc: mesa-...@lists.freedesktop.org > --- > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 29 + > include/uapi/drm/i915_drm.h| 11 ++- > 2 files changed, 39 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > index af1965774e7b..ecd1486642a7 100644 > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > @@ -1492,6 +1492,32 @@ gen8_dispatch_bsd_engine(struct drm_i915_private > *dev_priv, > return file_priv->bsd_engine; > } > > +extern u8 user_class_map[DRM_I915_ENGINE_CLASS_MAX]; > + > +static struct intel_engine_cs * > +eb_select_engine_class_instance(struct drm_i915_private *i915, > + struct drm_i915_gem_execbuffer2 *args) > +{ > + struct intel_engine_cs *engine; > + enum intel_engine_id id; > + u8 class, instance; > + > + class = args->flags & I915_EXEC_RING_MASK; > + if (class >= DRM_I915_ENGINE_CLASS_MAX) > + return NULL; > + class = user_class_map[class]; > + > + instance = (args->flags >> I915_EXEC_INSTANCE_SHIFT) && > +I915_EXEC_INSTANCE_MASK; > + > + for_each_engine(engine, i915, id) { > + if (engine->class == class && engine->instance == instance) > + return engine; > + } I am underwhelmed. No, i915->class_engine[class][instance] ? Still, at what point do we kill busy-ioctl per-engine reporting? Should we update all tracepoints to use class:instance (I think that's a better abi going forward). -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 6/8] drm/i915: Sanitize stolen memory size calculation
On Wed, 2017-04-26 at 18:27 +0300, Ville Syrjälä wrote:
> On Wed, Apr 26, 2017 at 04:40:11PM +0300, Imre Deak wrote:
> > On GEN8+ (not counting CHV) the calculation can in theory result in an
> > incorrect sign extension with all upper bits set. In practice this is
> > unlikely to happen since it would require 4GB of stolen memory set
> > aside. For consistency still prevent the sign extension explicitly
> > everywhere.
> >
> > Signed-off-by: Imre Deak
>
> > @@ -2577,14 +2577,14 @@ static size_t gen6_get_stolen_size(u16 snb_gmch_ctl)
> >  {
> >          snb_gmch_ctl >>= SNB_GMCH_GMS_SHIFT;
> >          snb_gmch_ctl &= SNB_GMCH_GMS_MASK;
> > -        return snb_gmch_ctl << 25; /* 32 MB units */
> > +        return (size_t)snb_gmch_ctl << 25; /* 32 MB units */
>
> So the u16 gets promoted to int, which gets converted to size_t,
> which may be larger than int, and thus things get sign extended.
>
> Can't happen in the gen6 case actually due to SNB_GMCH_GMS_MASK being
> small enough. But the gen8 case at least looks theoretically possible.
> But having the cast everywhere seems like the best way to avoid
> someone copy-pasting the wrong thing when the next variant gets added.

I was about to comment that early-quirks needs to be fixed too, but it was already fixed when I synchronized the code last time.

That reminded me that I still have the Git branch to eliminate the code duplication, which is probably why I didn't revamp the i915 variants.

The trouble is that in early-quirks, PCI subsystem functions are unavailable and the functions are __init, so we'll have to choose between:

a) code duplication, as we have now
b) move the functions to i915_drm.h as inline with hooks; code size is
   not shrunk, but code duplication is eliminated
c) always have the functions as non-__init, even if i915 is disabled;
   reduces duplication and kernel size (well, increases x86 tiny
   kernel size)

c)

Regards, Joonas

--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
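The promotion issue discussed in this message can be shown with a small userspace sketch. It is an illustration only: `uint64_t` is used instead of `size_t` so that the result does not depend on the build's pointer width, and the function names are hypothetical. The first helper casts before shifting, as the patch does; the second shows what widening an already negative 32-bit intermediate produces, without actually performing the overflowing shift (which would be undefined behaviour in C).

```c
#include <assert.h>
#include <stdint.h>

/* Cast before the shift so the arithmetic is done in 64 bits (32 MB units). */
static uint64_t demo_stolen_size(uint16_t gms)
{
    return (uint64_t)gms << 25;
}

/*
 * Widening a negative 32-bit value to an unsigned 64-bit type sets all
 * upper bits. This is the effect the thread calls sign extension.
 */
static uint64_t demo_widen(int32_t intermediate)
{
    return (uint64_t)intermediate;
}

/* Hypothetical self-check. */
static void demo_stolen_check(void)
{
    /* 0x40 units of 32 MB is 2 GB, the first value that needs bit 31. */
    assert(demo_stolen_size(0x40) == UINT64_C(0x80000000));
    /* 0x80 units is 4 GB, which no longer fits in 32 bits at all. */
    assert(demo_stolen_size(0x80) == UINT64_C(0x100000000));
    /* A 32-bit intermediate with only bit 31 set widens to all-ones on top. */
    assert(demo_widen(INT32_MIN) == UINT64_C(0xffffffff80000000));
}
```

Small unit counts never reach bit 31, which matches the remark that the gen6 mask is too small for the problem to occur there.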
Re: [Intel-gfx] [PATCH 07/27] drm/i915: Squash repeated awaits on the same fence
On Thu, Apr 27, 2017 at 10:20:36AM +0100, Tvrtko Ursulin wrote:
>
> On 26/04/2017 23:22, Chris Wilson wrote:
> >On Wed, Apr 26, 2017 at 07:56:14PM +0100, Chris Wilson wrote:
> >>On Wed, Apr 26, 2017 at 01:13:41PM +0100, Tvrtko Ursulin wrote:
> >>>I was thinking of exactly the same thing as this patch does, u64
> >>>context id as key, u32 seqnos (wrapped in a container with
> >>>hlist_node).
> >>
> >>#define NSYNC 32
> >>struct intel_timeline_sync { /* kmalloc-256 slab */
> >>        struct hlist_node node;
> >>        u64 prefix;
> >>        u32 bitmap;
> >>        u32 seqno[NSYNC];
> >>};
> >>DECLARE_HASHTABLE(sync, 7);
> >>
> >>If I squint, the numbers favour the idr. ;)
> >
> >Hmm, it didn't take much to start running into misery with a static ht.
> >I know my testing is completely artificial but I am not going to be
> >happy with a static size, it will always be too big or too small and
> >never just Goldilocks.
>
> Oh what a pity, implementation is so much smaller. What kind of
> misery was it? I presume no longer below the noise floor? With more
> than three buckets?

Yup, after realising the flaw in my userspace test, I was able to hit intel_timeline_sync_is_later() more often. The difference between idr/ht in that test is still less than the difference in not squashing, but it becomes easier to realise a difference (the moment when it was spending over 90% in that function walking the hash chain was the last straw).

> If no other choice I'll tackle the review. Hopefully won't get lost
> in all the shifts, leafs, branches and prefixes. :)

You may well win the ht argument when it comes to an RCU compatible variant for reservation_object; the relative simplicity in walking the rcu chains is much more reassuring than arguing rcu correctness of parent pointers and manual stacks for iterators. Still, a fixed-size ht is going to have long chains for igt, and reservation_objects are very common so we can't go mad in giving each a large number of buckets.

The biggest complexity for reservation_object is that it offers guaranteed insertion (along with a u64 index that rules out lib/radixtree, rhashtable). And I hope one day refcounting becomes reasonably cheap again, since sadly it's unavoidable in reservation_object (afaict).

> Regards,
>
> Tvrtko
>
> P.S. GEM_STATS you mention in the other reply - what are you
> referring to with that? The idea to expose queue depths and possibly
> more via some interface? If so prototyping that is almost next on my
> TODO list.

I was thinking of intrusive debugging stats that we may want to keep around and conditionally compile in. Most statistics should not be for public consumption :)
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
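To make the static hashtable variant discussed in this thread more concrete, here is a rough userspace sketch built around the quoted `struct intel_timeline_sync`. Several things are assumptions rather than anything stated in the thread: that `prefix` is the context id with its low five bits dropped, that `bitmap` marks which of the 32 `seqno[]` slots are valid, the multiplicative hash constant, and every function name. A plain singly linked chain stands in for the kernel's hlist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define NSYNC 32
#define SYNC_BITS 7                     /* mirrors DECLARE_HASHTABLE(sync, 7) */
#define SYNC_BUCKETS (1u << SYNC_BITS)

struct demo_timeline_sync {
    struct demo_timeline_sync *next;    /* stand-in for struct hlist_node */
    uint64_t prefix;                    /* assumed: context id / NSYNC */
    uint32_t bitmap;                    /* assumed: valid seqno[] slots */
    uint32_t seqno[NSYNC];
};

struct demo_sync_table {
    struct demo_timeline_sync *bucket[SYNC_BUCKETS];
};

/* Multiplicative hash keeping the top SYNC_BITS bits (constant is arbitrary). */
static unsigned int demo_sync_hash(uint64_t prefix)
{
    return (unsigned int)((prefix * UINT64_C(0x9E3779B97F4A7C15)) >>
                          (64 - SYNC_BITS));
}

/* Wrap-safe "a is at or after b"; relies on two's complement conversion. */
static bool demo_seqno_passed(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) >= 0;
}

/* Walk one hash chain; this is the cost that grows with a fixed bucket count. */
static struct demo_timeline_sync *
demo_sync_find(const struct demo_sync_table *t, uint64_t prefix)
{
    struct demo_timeline_sync *p;

    for (p = t->bucket[demo_sync_hash(prefix)]; p; p = p->next)
        if (p->prefix == prefix)
            return p;
    return NULL;
}

/* True if a seqno at or after @seqno was already recorded for context @id. */
static bool demo_sync_is_later(const struct demo_sync_table *t,
                               uint64_t id, uint32_t seqno)
{
    const struct demo_timeline_sync *p = demo_sync_find(t, id / NSYNC);
    unsigned int idx = (unsigned int)(id % NSYNC);

    if (!p || !(p->bitmap & (UINT32_C(1) << idx)))
        return false;
    return demo_seqno_passed(p->seqno[idx], seqno);
}

/* Record @seqno for context @id; returns 0 on success, -1 if out of memory. */
static int demo_sync_set(struct demo_sync_table *t, uint64_t id, uint32_t seqno)
{
    uint64_t prefix = id / NSYNC;
    unsigned int idx = (unsigned int)(id % NSYNC);
    struct demo_timeline_sync *p = demo_sync_find(t, prefix);

    if (!p) {
        unsigned int h = demo_sync_hash(prefix);

        p = calloc(1, sizeof(*p));
        if (!p)
            return -1;
        p->prefix = prefix;
        p->next = t->bucket[h];
        t->bucket[h] = p;
    }
    p->seqno[idx] = seqno;
    p->bitmap |= UINT32_C(1) << idx;
    return 0;
}
```

With 128 buckets fixed at compile time, the chain walked in `demo_sync_find()` grows linearly once many distinct prefixes are live, which is the kind of behaviour the thread attributes to the static table.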
Re: [Intel-gfx] [PATCH v8] drm/i915: Squash repeated awaits on the same fence
On Thu, Apr 27, 2017 at 08:06:36AM +0100, Chris Wilson wrote:
> +int i915_gem_timeline_mock_selftests(void)
> +{
> +        static const struct i915_subtest tests[] = {
> +                SUBTEST(igt_seqmap),

I should add a few benchmarks here as well.

 - random insertion
 - random lookup (uses same random set as insertion)
 - repeated lookups of neighbouring engines

So that we can compare in situ given our simple api.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH 0/2] GuC logger redesign
On Thu, Apr 27, 2017 at 09:22:11AM +, Olinski, Krzysztof E wrote:
> On Thu, 2017-04-27 at 10:05 +0100, Chris Wilson wrote:
> > On Thu, Apr 27, 2017 at 10:59:18AM +0200, Krzysztof E. Olinski wrote:
> > > GuC logger implementation simplified and moved to a library
> > > (GuCLAW).
> > > Adds simple buffering utility for logging routine (BUC).
> >
> > Bigger question, why? What design goals do you want to achieve?
> > -Chris
> >
> Currently, there are problems with compilation for Android platform due
> to pthread dependencies. The proposed implementation should work both
> for Linux and Android. I thought that this will be also a good occasion
> to introduce lockless mechanisms to improve efficiency.

I dispute the improved efficiency -- you add an extra copy in reading the log data ;) If the xfer of the log data is not dominant, something is very wrong in the framework.

Ok, I missed that pthreads are unworkable on Android. If you can kill the copy, be my guest.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
[Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Sanitize engine context sizes (rev2)
== Series Details == Series: drm/i915: Sanitize engine context sizes (rev2) URL : https://patchwork.freedesktop.org/series/23567/ State : failure == Summary == Series 23567v2 drm/i915: Sanitize engine context sizes https://patchwork.freedesktop.org/api/1.0/series/23567/revisions/2/mbox/ Test core_auth: Subgroup basic-auth: pass -> SKIP (fi-bdw-gvtdvm) pass -> SKIP (fi-bxt-t5700) pass -> SKIP (fi-bsw-n3050) pass -> SKIP (fi-kbl-7500u) pass -> SKIP (fi-skl-6700hq) pass -> SKIP (fi-skl-gvtdvm) Test core_prop_blob: Subgroup basic: pass -> SKIP (fi-bdw-gvtdvm) pass -> SKIP (fi-bxt-t5700) pass -> SKIP (fi-bsw-n3050) pass -> SKIP (fi-kbl-7500u) pass -> SKIP (fi-skl-6700hq) pass -> SKIP (fi-skl-gvtdvm) Test drv_getparams_basic: Subgroup basic-eu-total: pass -> SKIP (fi-bdw-gvtdvm) pass -> SKIP (fi-bxt-t5700) pass -> SKIP (fi-bsw-n3050) pass -> SKIP (fi-kbl-7500u) pass -> SKIP (fi-skl-6700hq) pass -> SKIP (fi-skl-gvtdvm) Subgroup basic-subslice-total: pass -> SKIP (fi-bdw-gvtdvm) pass -> SKIP (fi-bxt-t5700) pass -> SKIP (fi-bsw-n3050) pass -> SKIP (fi-kbl-7500u) pass -> SKIP (fi-skl-6700hq) pass -> SKIP (fi-skl-gvtdvm) Test drv_hangman: Subgroup error-state-basic: pass -> SKIP (fi-bdw-gvtdvm) pass -> SKIP (fi-bxt-t5700) pass -> SKIP (fi-bsw-n3050) pass -> SKIP (fi-kbl-7500u) pass -> SKIP (fi-skl-6700hq) pass -> SKIP (fi-skl-gvtdvm) Test drv_module_reload: Subgroup basic-no-display: pass -> FAIL (fi-bxt-t5700) pass -> FAIL (fi-bdw-gvtdvm) pass -> FAIL (fi-skl-gvtdvm) Subgroup basic-reload: dmesg-warn -> FAIL (fi-bdw-gvtdvm) fdo#99938 pass -> FAIL (fi-bxt-t5700) pass -> INCOMPLETE (fi-bsw-n3050) pass -> INCOMPLETE (fi-kbl-7500u) pass -> INCOMPLETE (fi-skl-6700hq) pass -> FAIL (fi-skl-gvtdvm) Subgroup basic-reload-final: pass -> FAIL (fi-bxt-t5700) dmesg-warn -> FAIL (fi-bdw-gvtdvm) fdo#99938 pass -> FAIL (fi-skl-gvtdvm) Subgroup basic-reload-inject: dmesg-warn -> PASS (fi-bdw-gvtdvm) fdo#99938 Test gem_basic: Subgroup bad-close: pass -> SKIP (fi-bdw-gvtdvm) pass -> SKIP 
(fi-bxt-t5700) pass -> SKIP (fi-bsw-n3050) pass -> SKIP (fi-kbl-7500u) pass -> SKIP (fi-skl-6700hq) pass -> SKIP (fi-skl-gvtdvm) Subgroup create-close: pass -> SKIP (fi-bdw-gvtdvm) pass -> SKIP (fi-bxt-t5700) pass -> SKIP (fi-bsw-n3050) pass -> SKIP (fi-kbl-7500u) pass -> SKIP (fi-skl-6700hq) pass -> SKIP (fi-skl-gvtdvm) Subgroup create-fd-close: pass -> SKIP (fi-bdw-gvtdvm) pass -> SKIP (fi-bxt-t5700) pass -> SKIP (fi-bsw-n3050) pass -> SKIP (fi-kbl-7500u) pass -> SKIP (fi-skl-6700hq) pass -> SKIP (fi-skl-gvtdvm) Test gem_busy: Subgroup basic-busy-default: pass -> SKIP (fi-bdw-gvtdvm) pass -> SKIP (fi-bxt-t5700) pass -> SKIP (fi-bsw-n3050) pass -> SKIP (fi-kbl-7500u) pass -> SKIP (fi-skl-6700hq) pass -> SKIP (fi-skl-gvtdvm) Subgroup basic-hang-default: pass -> SKIP (fi-bdw-gvtdvm) pass -> SKIP (fi-bxt-t5700) pass -> SKIP (fi-bsw-n3050) pass -> SKIP (fi-kbl-7500u) pass -> SKIP (fi-skl-6700hq)
Re: [Intel-gfx] [RFC v2 2/2] drm/i915: Select engines via class and instance in execbuffer2
On 27/04/2017 10:25, Chris Wilson wrote: On Thu, Apr 27, 2017 at 10:10:34AM +0100, Tvrtko Ursulin wrote: From: Tvrtko Ursulin Building on top of the previous patch which exported the concept of engine classes and instances, we can also use this instead of the current awkward engine selection uAPI. This is primarily interesting for the VCS engine selection which is a) currently done via disjoint set of flags, and b) the current I915_EXEC_BSD flags has different semantics depending on the underlying hardware which is bad. Proposed idea here is to reserve 16-bits of flags, to pass in the engine class and instance (8 bits each), and a new flag named I915_EXEC_CLASS_INSTACE to tell the kernel this new engine selection API is in use. The new uAPI also removes access to the weak VCS engine balancing as currently existing in the driver. Example usage to send a command to VCS0: eb.flags = i915_execbuffer2_engine(DRM_I915_ENGINE_CLASS_VIDEO_DECODE, 0); Or to send a command to VCS1: eb.flags = i915_execbuffer2_engine(DRM_I915_ENGINE_CLASS_VIDEO_DECODE, 1); v2: * Fix unknown flags mask. * Use I915_EXEC_RING_MASK for class. 
(Chris Wilson) Signed-off-by: Tvrtko Ursulin Cc: Ben Widawsky Cc: Chris Wilson Cc: Daniel Vetter Cc: Joonas Lahtinen Cc: Jon Bloomfield Cc: Daniel Charles Cc: "Rogozhkin, Dmitry V" Cc: Oscar Mateo Cc: "Gong, Zhipeng" Cc: intel-vaapi-me...@lists.01.org Cc: mesa-...@lists.freedesktop.org --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 29 + include/uapi/drm/i915_drm.h| 11 ++- 2 files changed, 39 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index af1965774e7b..ecd1486642a7 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1492,6 +1492,32 @@ gen8_dispatch_bsd_engine(struct drm_i915_private *dev_priv, return file_priv->bsd_engine; } +extern u8 user_class_map[DRM_I915_ENGINE_CLASS_MAX]; + +static struct intel_engine_cs * +eb_select_engine_class_instance(struct drm_i915_private *i915, + struct drm_i915_gem_execbuffer2 *args) +{ + struct intel_engine_cs *engine; + enum intel_engine_id id; + u8 class, instance; + + class = args->flags & I915_EXEC_RING_MASK; + if (class >= DRM_I915_ENGINE_CLASS_MAX) + return NULL; + class = user_class_map[class]; + + instance = (args->flags >> I915_EXEC_INSTANCE_SHIFT) && + I915_EXEC_INSTANCE_MASK; + + for_each_engine(engine, i915, id) { + if (engine->class == class && engine->instance == instance) + return engine; + } I am underwhelmed. No, i915->class_engine[class][instance] ? Hey it's just an RFC for the uAPI proposal! Implementation efficiency only comes later! :) Still, at what point do we kill busy-ioctl per-engine reporting? Should It's the one we already broke before without no one noticing, where it userspace only effectively cares about a boolean value? If so you recommend we make it a real boolean? we update all tracepoints to use class:instance (I think that's a better abi going forward). I can't think of any big problems doing so. Could rename ring= to engine= there as well. engine=. 
for example? Regards, Tvrtko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
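A minimal sketch of the flag packing discussed in this thread, combined with the i915->class_engine[class][instance] style table lookup asked for in the review. Everything prefixed sketch_ or SKETCH_ is hypothetical: the bit layout, the marker bit, the engine table and the helper names are assumptions made for illustration, not the real uAPI values or driver code. One detail it tries to make visible is that a bit field has to be extracted with bitwise &; the RFC hunk as quoted applies the instance mask with &&, which collapses the result to 0 or 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: class in bits 0-7, instance in bits 8-15, marker in bit 16. */
#define SKETCH_CLASS_MASK      0xffu
#define SKETCH_INSTANCE_SHIFT  8
#define SKETCH_INSTANCE_MASK   0xffu
#define SKETCH_CLASS_INSTANCE  (1u << 16)

enum sketch_class {
	SKETCH_CLASS_RENDER,
	SKETCH_CLASS_COPY,
	SKETCH_CLASS_VIDEO,
	SKETCH_CLASS_MAX
};

#define SKETCH_MAX_INSTANCE 2

struct sketch_engine {
	const char *name;
};

static struct sketch_engine sketch_rcs0 = { "rcs0" };
static struct sketch_engine sketch_bcs0 = { "bcs0" };
static struct sketch_engine sketch_vcs0 = { "vcs0" };
static struct sketch_engine sketch_vcs1 = { "vcs1" };

/* Direct lookup table; slots without an engine stay NULL. */
static struct sketch_engine *
sketch_class_engine[SKETCH_CLASS_MAX][SKETCH_MAX_INSTANCE] = {
	[SKETCH_CLASS_RENDER] = { &sketch_rcs0 },
	[SKETCH_CLASS_COPY]   = { &sketch_bcs0 },
	[SKETCH_CLASS_VIDEO]  = { &sketch_vcs0, &sketch_vcs1 },
};

/* Userspace side: pack class and instance into the flags word. */
static uint64_t sketch_execbuffer2_engine(unsigned int class, unsigned int instance)
{
	return SKETCH_CLASS_INSTANCE |
	       (class & SKETCH_CLASS_MASK) |
	       ((uint64_t)(instance & SKETCH_INSTANCE_MASK) << SKETCH_INSTANCE_SHIFT);
}

/* Kernel side: unpack with bitwise AND, bounds-check, then index the table. */
static struct sketch_engine *sketch_select_engine(uint64_t flags)
{
	unsigned int class, instance;

	if (!(flags & SKETCH_CLASS_INSTANCE))
		return NULL;

	class = flags & SKETCH_CLASS_MASK;
	instance = (flags >> SKETCH_INSTANCE_SHIFT) & SKETCH_INSTANCE_MASK;

	if (class >= SKETCH_CLASS_MAX || instance >= SKETCH_MAX_INSTANCE)
		return NULL;

	return sketch_class_engine[class][instance];
}
```

With this layout, selecting the second video engine is sketch_select_engine(sketch_execbuffer2_engine(SKETCH_CLASS_VIDEO, 1)), and an out-of-range or unpopulated slot yields NULL rather than walking every engine.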
[Intel-gfx] ✓ Fi.CI.BAT: success for New engine discovery and execbuffer2 engine selection uAPI (rev3)
== Series Details ==

Series: New engine discovery and execbuffer2 engine selection uAPI (rev3)
URL   : https://patchwork.freedesktop.org/series/23189/
State : success

== Summary ==

Series 23189v3 New engine discovery and execbuffer2 engine selection uAPI
https://patchwork.freedesktop.org/api/1.0/series/23189/revisions/3/mbox/

Test kms_flip:
        Subgroup basic-flip-vs-wf_vblank:
                fail       -> PASS       (fi-skl-6770hq) fdo#99739

fdo#99739 https://bugs.freedesktop.org/show_bug.cgi?id=99739

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:415s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:423s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:522s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:475s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:563s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:481s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:488s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:410s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:406s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:414s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:492s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:487s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:440s
fi-kbl-7560u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:556s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:438s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:552s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:448s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:473s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:432s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:539s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:407s

a0363fea4a56eef81e8cda759a24d8951dd7ac73 drm-tip: 2017y-04m-27d-08h-13m-25s UTC integration manifest
02b92bc drm/i915: Select engines via class and instance in execbuffer2
90aa3b5 drm/i915: Engine discovery uAPI

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4564/
Re: [Intel-gfx] [PATCH v3] drm/i915: Sanitize engine context sizes
On Thu, Apr 27, 2017 at 12:01:11PM +0300, Joonas Lahtinen wrote:
> @@ -266,11 +239,12 @@ __create_hw_context(struct drm_i915_private *dev_priv,
>  	list_add_tail(&ctx->link, &dev_priv->context_list);
>  	ctx->i915 = dev_priv;
>
> -	if (dev_priv->hw_context_size) {
> +	if (dev_priv->engine[RCS]->context_size) {

Totally missed this. This is for legacy only. Long term fix is to do
deferred allocation for legacy context objects (i.e. just move this
chunk to intel_ring_context_pin under an if (!ce->state) guard). Quick
and dirty fix is if(!execlists && RCS->context_size).
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
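The long-term fix described above, allocating the context image on first pin behind an if (!ce->state) guard, can be sketched roughly as follows. This is a userspace illustration under stated assumptions, not driver code: struct sketch_engine, struct sketch_ctx_engine, sketch_context_pin() and the alloc_calls counter are hypothetical names, and calloc() stands in for creating the GEM object and its vma.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-engine description. */
struct sketch_engine {
	size_t context_size;	/* 0: this engine needs no context image */
	int alloc_calls;	/* bookkeeping for the example only */
};

/* Hypothetical per-context, per-engine state. */
struct sketch_ctx_engine {
	void *state;		/* context image, allocated lazily */
	int pin_count;
};

/* Pin the context on an engine, allocating its image on first use. */
static int sketch_context_pin(struct sketch_engine *engine,
			      struct sketch_ctx_engine *ce)
{
	if (ce->pin_count++)
		return 0;	/* already pinned, nothing more to do */

	/* Deferred allocation: only the first pin pays for the image. */
	if (!ce->state && engine->context_size) {
		void *state = calloc(1, engine->context_size);

		if (!state) {
			ce->pin_count = 0;
			return -ENOMEM;
		}

		engine->alloc_calls++;
		ce->state = state;
	}

	return 0;
}

/* Unpin keeps the image around so a later pin can reuse it. */
static void sketch_context_unpin(struct sketch_ctx_engine *ce)
{
	if (ce->pin_count > 0)
		ce->pin_count--;
}
```

The point of the guard is that a context which is created but never submitted to an engine never allocates an image for it, while repeated pin/unpin cycles reuse the same one.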
Re: [Intel-gfx] [RFC v2 2/2] drm/i915: Select engines via class and instance in execbuffer2
On Thu, Apr 27, 2017 at 11:09:35AM +0100, Tvrtko Ursulin wrote: > > On 27/04/2017 10:25, Chris Wilson wrote: > >On Thu, Apr 27, 2017 at 10:10:34AM +0100, Tvrtko Ursulin wrote: > >>From: Tvrtko Ursulin > >> > >>Building on top of the previous patch which exported the concept > >>of engine classes and instances, we can also use this instead of > >>the current awkward engine selection uAPI. > >> > >>This is primarily interesting for the VCS engine selection which > >>is a) currently done via disjoint set of flags, and b) the > >>current I915_EXEC_BSD flags has different semantics depending on > >>the underlying hardware which is bad. > >> > >>Proposed idea here is to reserve 16-bits of flags, to pass in > >>the engine class and instance (8 bits each), and a new flag > >>named I915_EXEC_CLASS_INSTACE to tell the kernel this new engine > >>selection API is in use. > >> > >>The new uAPI also removes access to the weak VCS engine > >>balancing as currently existing in the driver. > >> > >>Example usage to send a command to VCS0: > >> > >> eb.flags = i915_execbuffer2_engine(DRM_I915_ENGINE_CLASS_VIDEO_DECODE, 0); > >> > >>Or to send a command to VCS1: > >> > >> eb.flags = i915_execbuffer2_engine(DRM_I915_ENGINE_CLASS_VIDEO_DECODE, 1); > >> > >>v2: > >> * Fix unknown flags mask. > >> * Use I915_EXEC_RING_MASK for class. 
(Chris Wilson) > >> > >>Signed-off-by: Tvrtko Ursulin > >>Cc: Ben Widawsky > >>Cc: Chris Wilson > >>Cc: Daniel Vetter > >>Cc: Joonas Lahtinen > >>Cc: Jon Bloomfield > >>Cc: Daniel Charles > >>Cc: "Rogozhkin, Dmitry V" > >>Cc: Oscar Mateo > >>Cc: "Gong, Zhipeng" > >>Cc: intel-vaapi-me...@lists.01.org > >>Cc: mesa-...@lists.freedesktop.org > >>--- > >> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 29 > >> + > >> include/uapi/drm/i915_drm.h| 11 ++- > >> 2 files changed, 39 insertions(+), 1 deletion(-) > >> > >>diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > >>b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > >>index af1965774e7b..ecd1486642a7 100644 > >>--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > >>+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > >>@@ -1492,6 +1492,32 @@ gen8_dispatch_bsd_engine(struct drm_i915_private > >>*dev_priv, > >>return file_priv->bsd_engine; > >> } > >> > >>+extern u8 user_class_map[DRM_I915_ENGINE_CLASS_MAX]; > >>+ > >>+static struct intel_engine_cs * > >>+eb_select_engine_class_instance(struct drm_i915_private *i915, > >>+ struct drm_i915_gem_execbuffer2 *args) > >>+{ > >>+ struct intel_engine_cs *engine; > >>+ enum intel_engine_id id; > >>+ u8 class, instance; > >>+ > >>+ class = args->flags & I915_EXEC_RING_MASK; > >>+ if (class >= DRM_I915_ENGINE_CLASS_MAX) > >>+ return NULL; > >>+ class = user_class_map[class]; > >>+ > >>+ instance = (args->flags >> I915_EXEC_INSTANCE_SHIFT) && > >>+ I915_EXEC_INSTANCE_MASK; > >>+ > >>+ for_each_engine(engine, i915, id) { > >>+ if (engine->class == class && engine->instance == instance) > >>+ return engine; > >>+ } > > > >I am underwhelmed. No, i915->class_engine[class][instance] ? > > Hey it's just an RFC for the uAPI proposal! Implementation > efficiency only comes later! :) > > >Still, at what point do we kill busy-ioctl per-engine reporting? Should > > It's the one we already broke before without no one noticing, where > it userspace only effectively cares about a boolean value? 
Userspace does try to distinguish between RCS and !RCS. But it's a rough
heuristic that I'm not going to cry much over since it depends upon so
many other factors outside of its control as to which placement is
better.

> If so you recommend we make it a real boolean?

Once we cross the u16 threshold, yup. Just then a busy read/write pair.

> >we update all tracepoints to use class:instance (I think that's a better
> >abi going forward).
>
> I can't think of any big problems doing so. Could rename ring= to
> engine= there as well. engine=. for example?

Works for me.

There are still a few other places where we want an index into an array,
so keeping a map to engine->uabi_id seems sensible. Or we always include
engine.instance as part of our uABI for extensible structs. E.g.

	struct context_watchdog_param {
		u32 engine;
		u32 instance;
		u64 watchdog_ns;
	};

Then set/get_context_param return an array of those rather than an array
of watchdog_ns.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
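To illustrate the difference between a positional array of watchdog_ns values and an array of self-describing entries, here is a small sketch. It reuses the field layout from the struct suggested above with fixed-width userspace types; the struct name sketch_watchdog_param and the helper sketch_find_watchdog() are hypothetical and only show how a consumer might look up an entry without relying on array position.

```c
#include <stddef.h>
#include <stdint.h>

/* Same fields as the struct proposed in the mail, renamed to mark it as a sketch. */
struct sketch_watchdog_param {
	uint32_t engine;	/* engine class */
	uint32_t instance;	/* instance within that class */
	uint64_t watchdog_ns;
};

/*
 * Find the entry for (engine, instance). Each entry names its own engine,
 * so the array order carries no meaning and new engines can be appended.
 */
static const struct sketch_watchdog_param *
sketch_find_watchdog(const struct sketch_watchdog_param *params, size_t count,
		     uint32_t engine, uint32_t instance)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (params[i].engine == engine &&
		    params[i].instance == instance)
			return &params[i];
	}

	return NULL;
}
```

A linear scan is used here only because such a parameter array would be short; it is not meant as a statement about how the driver would store the values internally.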
[Intel-gfx] [PATCH] drm/i915: Defer context state allocation for legacy ring submission
Almost from the outset for execlists, we used deferred allocation of the logical context and rings. Then we ported the infrastructure for pinning contexts back to legacy, and so now we are able to also implement deferred allocation for context objects prior to first use on the legacy submission. Signed-off-by: Chris Wilson Cc: Joonas Lahtinen Cc: Tvrtko Ursulin Cc: Mika Kuoppala --- drivers/gpu/drm/i915/i915_gem_context.c | 59 - drivers/gpu/drm/i915/intel_ringbuffer.c | 50 2 files changed, 50 insertions(+), 59 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 8bd0c4966913..d46a69d3d390 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -151,45 +151,6 @@ void i915_gem_context_free(struct kref *ctx_ref) kfree(ctx); } -static struct drm_i915_gem_object * -alloc_context_obj(struct drm_i915_private *dev_priv, u64 size) -{ - struct drm_i915_gem_object *obj; - int ret; - - lockdep_assert_held(&dev_priv->drm.struct_mutex); - - obj = i915_gem_object_create(dev_priv, size); - if (IS_ERR(obj)) - return obj; - - /* -* Try to make the context utilize L3 as well as LLC. -* -* On VLV we don't have L3 controls in the PTEs so we -* shouldn't touch the cache level, especially as that -* would make the object snooped which might have a -* negative performance impact. -* -* Snooping is required on non-llc platforms in execlist -* mode, but since all GGTT accesses use PAT entry 0 we -* get snooping anyway regardless of cache_level. -* -* This is only applicable for Ivy Bridge devices since -* later platforms don't have L3 control bits in the PTE. 
-*/ - if (IS_IVYBRIDGE(dev_priv)) { - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_L3_LLC); - /* Failure shouldn't ever happen this early */ - if (WARN_ON(ret)) { - i915_gem_object_put(obj); - return ERR_PTR(ret); - } - } - - return obj; -} - static void context_close(struct i915_gem_context *ctx) { i915_gem_context_set_closed(ctx); @@ -266,26 +227,6 @@ __create_hw_context(struct drm_i915_private *dev_priv, list_add_tail(&ctx->link, &dev_priv->context_list); ctx->i915 = dev_priv; - if (dev_priv->hw_context_size) { - struct drm_i915_gem_object *obj; - struct i915_vma *vma; - - obj = alloc_context_obj(dev_priv, dev_priv->hw_context_size); - if (IS_ERR(obj)) { - ret = PTR_ERR(obj); - goto err_out; - } - - vma = i915_vma_instance(obj, &dev_priv->ggtt.base, NULL); - if (IS_ERR(vma)) { - i915_gem_object_put(obj); - ret = PTR_ERR(vma); - goto err_out; - } - - ctx->engine[RCS].state = vma; - } - /* Default context will never have a file_priv */ ret = DEFAULT_CONTEXT_HANDLE; if (file_priv) { diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 6836efb7e3d2..138cda347488 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1437,6 +1437,44 @@ static int context_pin(struct i915_gem_context *ctx) PIN_GLOBAL | PIN_HIGH); } +static struct i915_vma * +alloc_context_vma(struct intel_engine_cs *engine) +{ + struct drm_i915_private *i915 = engine->i915; + struct drm_i915_gem_object *obj; + struct i915_vma *vma; + + obj = i915_gem_object_create(i915, i915->hw_context_size); + if (IS_ERR(obj)) + return ERR_CAST(obj); + + /* +* Try to make the context utilize L3 as well as LLC. +* +* On VLV we don't have L3 controls in the PTEs so we +* shouldn't touch the cache level, especially as that +* would make the object snooped which might have a +* negative performance impact. 
+* +* Snooping is required on non-llc platforms in execlist +* mode, but since all GGTT accesses use PAT entry 0 we +* get snooping anyway regardless of cache_level. +* +* This is only applicable for Ivy Bridge devices since +* later platforms don't have L3 control bits in the PTE. +*/ + if (IS_IVYBRIDGE(i915)) { + /* Ignore any error, regard it as a simple optimisation */ + i915_gem_object_set_cache_level(obj, I915_CACHE_L3_LLC); + } + + vma = i915_vma_instance(obj, &engine->i915->ggtt.base, NULL); + if (IS_ERR(vma)) +
[Intel-gfx] [PATCH v2] drm/i915: Defer context state allocation for legacy ring submission
Almost from the outset for execlists, we used deferred allocation of the logical context and rings. Then we ported the infrastructure for pinning contexts back to legacy, and so now we are able to also implement deferred allocation for context objects prior to first use on the legacy submission. v2: We still need to differentiate between legacy engines, Joonas is fixing that but I want this first ;) (Joonas) Signed-off-by: Chris Wilson Cc: Joonas Lahtinen Cc: Tvrtko Ursulin Cc: Mika Kuoppala --- drivers/gpu/drm/i915/i915_gem_context.c | 59 - drivers/gpu/drm/i915/intel_ringbuffer.c | 50 2 files changed, 50 insertions(+), 59 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 8bd0c4966913..d46a69d3d390 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -151,45 +151,6 @@ void i915_gem_context_free(struct kref *ctx_ref) kfree(ctx); } -static struct drm_i915_gem_object * -alloc_context_obj(struct drm_i915_private *dev_priv, u64 size) -{ - struct drm_i915_gem_object *obj; - int ret; - - lockdep_assert_held(&dev_priv->drm.struct_mutex); - - obj = i915_gem_object_create(dev_priv, size); - if (IS_ERR(obj)) - return obj; - - /* -* Try to make the context utilize L3 as well as LLC. -* -* On VLV we don't have L3 controls in the PTEs so we -* shouldn't touch the cache level, especially as that -* would make the object snooped which might have a -* negative performance impact. -* -* Snooping is required on non-llc platforms in execlist -* mode, but since all GGTT accesses use PAT entry 0 we -* get snooping anyway regardless of cache_level. -* -* This is only applicable for Ivy Bridge devices since -* later platforms don't have L3 control bits in the PTE. 
-*/ - if (IS_IVYBRIDGE(dev_priv)) { - ret = i915_gem_object_set_cache_level(obj, I915_CACHE_L3_LLC); - /* Failure shouldn't ever happen this early */ - if (WARN_ON(ret)) { - i915_gem_object_put(obj); - return ERR_PTR(ret); - } - } - - return obj; -} - static void context_close(struct i915_gem_context *ctx) { i915_gem_context_set_closed(ctx); @@ -266,26 +227,6 @@ __create_hw_context(struct drm_i915_private *dev_priv, list_add_tail(&ctx->link, &dev_priv->context_list); ctx->i915 = dev_priv; - if (dev_priv->hw_context_size) { - struct drm_i915_gem_object *obj; - struct i915_vma *vma; - - obj = alloc_context_obj(dev_priv, dev_priv->hw_context_size); - if (IS_ERR(obj)) { - ret = PTR_ERR(obj); - goto err_out; - } - - vma = i915_vma_instance(obj, &dev_priv->ggtt.base, NULL); - if (IS_ERR(vma)) { - i915_gem_object_put(obj); - ret = PTR_ERR(vma); - goto err_out; - } - - ctx->engine[RCS].state = vma; - } - /* Default context will never have a file_priv */ ret = DEFAULT_CONTEXT_HANDLE; if (file_priv) { diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 6836efb7e3d2..61f612454ce7 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1437,6 +1437,44 @@ static int context_pin(struct i915_gem_context *ctx) PIN_GLOBAL | PIN_HIGH); } +static struct i915_vma * +alloc_context_vma(struct intel_engine_cs *engine) +{ + struct drm_i915_private *i915 = engine->i915; + struct drm_i915_gem_object *obj; + struct i915_vma *vma; + + obj = i915_gem_object_create(i915, i915->hw_context_size); + if (IS_ERR(obj)) + return ERR_CAST(obj); + + /* +* Try to make the context utilize L3 as well as LLC. +* +* On VLV we don't have L3 controls in the PTEs so we +* shouldn't touch the cache level, especially as that +* would make the object snooped which might have a +* negative performance impact. 
+* +* Snooping is required on non-llc platforms in execlist +* mode, but since all GGTT accesses use PAT entry 0 we +* get snooping anyway regardless of cache_level. +* +* This is only applicable for Ivy Bridge devices since +* later platforms don't have L3 control bits in the PTE. +*/ + if (IS_IVYBRIDGE(i915)) { + /* Ignore any error, regard it as a simple optimisation */ + i915_gem_object_set_cache_level(obj, I915_CACHE_L3_LLC); +
Re: [Intel-gfx] [PATCH v2] drm/i915: Defer context state allocation for legacy ring submission
On Thu, 2017-04-27 at 11:46 +0100, Chris Wilson wrote:
> Almost from the outset for execlists, we used deferred allocation of the
> logical context and rings. Then we ported the infrastructure for pinning
> contexts back to legacy, and so now we are able to also implement
> deferred allocation for context objects prior to first use on the legacy
> submission.
>
> v2: We still need to differentiate between legacy engines, Joonas is
> fixing that but I want this first ;) (Joonas)
>
> Signed-off-by: Chris Wilson
> Cc: Joonas Lahtinen
> Cc: Tvrtko Ursulin
> Cc: Mika Kuoppala

You went the extra mile to reduce the if-clause too, so:

Reviewed-by: Joonas Lahtinen

Regards, Joonas
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Defer context state allocation for legacy ring submission (rev2)
== Series Details ==

Series: drm/i915: Defer context state allocation for legacy ring submission (rev2)
URL   : https://patchwork.freedesktop.org/series/23619/
State : success

== Summary ==

Series 23619v2 drm/i915: Defer context state allocation for legacy ring submission
https://patchwork.freedesktop.org/api/1.0/series/23619/revisions/2/mbox/

Test gem_exec_flush:
        Subgroup basic-batch-kernel-default-uc:
                pass       -> FAIL       (fi-snb-2600) fdo#17
Test kms_flip:
        Subgroup basic-flip-vs-wf_vblank:
                fail       -> PASS       (fi-skl-6770hq) fdo#99739

fdo#17 https://bugs.freedesktop.org/show_bug.cgi?id=17
fdo#99739 https://bugs.freedesktop.org/show_bug.cgi?id=99739

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:430s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:424s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:574s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:511s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:543s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:488s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:483s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:408s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:403s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:419s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:491s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:464s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:465s
fi-kbl-7560u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:566s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:449s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:566s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:463s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:487s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:430s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:526s
fi-snb-2600      total:278  pass:248  dwarn:0   dfail:0   fail:1   skip:29  time:409s

a0363fea4a56eef81e8cda759a24d8951dd7ac73 drm-tip: 2017y-04m-27d-08h-13m-25s UTC integration manifest
33aa8f5 drm/i915: Defer context state allocation for legacy ring submission

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4565/
Re: [Intel-gfx] [PATCH v2 4/8] drm/i915: Check error return when setting DMA mask
On Wed, 26 Apr 2017, Imre Deak wrote: > Even though an error from these functions isn't fatal we still want to > have a diagnostic message about it. > > v2: > - Don't do assignments in if statements. (Jani) > > Cc: Jani Nikula > Signed-off-by: Imre Deak Reviewed-by: Jani Nikula > --- > drivers/gpu/drm/i915/i915_gem_gtt.c | 16 > 1 file changed, 12 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c > b/drivers/gpu/drm/i915/i915_gem_gtt.c > index 8bab4ae..0178c9e 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > @@ -2741,13 +2741,17 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt) > struct pci_dev *pdev = dev_priv->drm.pdev; > unsigned int size; > u16 snb_gmch_ctl; > + int err; > > /* TODO: We're not aware of mappable constraints on gen8 yet */ > ggtt->mappable_base = pci_resource_start(pdev, 2); > ggtt->mappable_end = pci_resource_len(pdev, 2); > > - if (!pci_set_dma_mask(pdev, DMA_BIT_MASK(39))) > - pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(39)); > + err = pci_set_dma_mask(pdev, DMA_BIT_MASK(39)); > + if (!err) > + err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(39)); > + if (err) > + DRM_ERROR("Can't set DMA mask/consistent mask (%d)\n", err); > > pci_read_config_word(pdev, SNB_GMCH_CTRL, &snb_gmch_ctl); > > @@ -2790,6 +2794,7 @@ static int gen6_gmch_probe(struct i915_ggtt *ggtt) > struct pci_dev *pdev = dev_priv->drm.pdev; > unsigned int size; > u16 snb_gmch_ctl; > + int err; > > ggtt->mappable_base = pci_resource_start(pdev, 2); > ggtt->mappable_end = pci_resource_len(pdev, 2); > @@ -2802,8 +2807,11 @@ static int gen6_gmch_probe(struct i915_ggtt *ggtt) > return -ENXIO; > } > > - if (!pci_set_dma_mask(pdev, DMA_BIT_MASK(40))) > - pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(40)); > + err = pci_set_dma_mask(pdev, DMA_BIT_MASK(40)); > + if (!err) > + err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(40)); > + if (err) > + DRM_ERROR("Can't set DMA mask/consistent mask 
(%d)\n", err); > pci_read_config_word(pdev, SNB_GMCH_CTRL, &snb_gmch_ctl); > > ggtt->stolen_size = gen6_get_stolen_size(snb_gmch_ctl); -- Jani Nikula, Intel Open Source Technology Center
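The error handling pattern from the patch above (try the first call, only try the second if the first succeeded, then report once) can be sketched in isolation like this. The two sketch_set_* functions are hypothetical stand-ins for pci_set_dma_mask() and pci_set_consistent_dma_mask() that simply return a preset code, and fprintf() stands in for DRM_ERROR(). Unlike the probe functions in the patch, which only log the failure and carry on, this helper also returns the error so the behaviour is easy to check.

```c
#include <stdio.h>

/* Hypothetical device whose two mask calls return preset results. */
struct sketch_dev {
	int streaming_err;	/* result of the first (streaming) mask call */
	int consistent_err;	/* result of the second (consistent) mask call */
	int consistent_called;	/* how often the second call was reached */
};

static int sketch_set_dma_mask(struct sketch_dev *dev)
{
	return dev->streaming_err;
}

static int sketch_set_consistent_dma_mask(struct sketch_dev *dev)
{
	dev->consistent_called++;
	return dev->consistent_err;
}

/* No assignment inside the if statements, one diagnostic for either failure. */
static int sketch_setup_dma(struct sketch_dev *dev)
{
	int err;

	err = sketch_set_dma_mask(dev);
	if (!err)
		err = sketch_set_consistent_dma_mask(dev);
	if (err)
		fprintf(stderr, "Can't set DMA mask/consistent mask (%d)\n", err);

	return err;
}
```

Keeping a single err variable means whichever call failed is the one whose code ends up in the message, and the second call is skipped when the first one has already failed.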
Re: [Intel-gfx] [PATCH v8] drm/i915: Squash repeated awaits on the same fence
On Thu, Apr 27, 2017 at 10:50:28AM +0100, Chris Wilson wrote:
> On Thu, Apr 27, 2017 at 08:06:36AM +0100, Chris Wilson wrote:
> > +int i915_gem_timeline_mock_selftests(void)
> > +{
> > +	static const struct i915_subtest tests[] = {
> > +		SUBTEST(igt_seqmap),
>
> I should add a few benchmarks here as well.
>
> random insertion
> random lookup (uses same random set as insertion)
> repeated lookups of neighbouring engines
>
> So that we can compare in situ given our simple api.

Hmm, I may be biased, but on Braswell:

idr:
bench_sync: 196699 random insertions, 515ns/insert
bench_sync: 196699 random lookups, 376ns/lookup
bench_sync: 2428021 repeated insert/lookups, 41ns/op

1<<3 ht:
bench_sync: 7857 random insertions, 12766ns/insert
bench_sync: 7857 random lookups, 12855ns/lookup
bench_sync: 2164705 repeated insert/lookups, 47ns/op

1<<7 ht:
bench_sync: 17891 random insertions, 5733ns/insert
bench_sync: 17891 random lookups, 5618ns/lookup
bench_sync: 1983086 repeated insert/lookups, 52ns/op

That is better than my expectations! Once again, take with a pinch of
salt as random insertions are totally unrealistic.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
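For readers who want a feel for the operations being benchmarked, here is a deliberately simplified sketch of a per-timeline map from fence context to the last seqno waited upon. It is not the data structure from the patch and not either of the hash-table variants measured above: it is a tiny direct-mapped table, and all sketch_ names are hypothetical. A collision simply evicts the older entry, which is assumed to be harmless here because, as the patch changelog notes, losing a recorded fence only means emitting an await that was not strictly necessary.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SLOTS 8	/* arbitrary small size, for illustration only */

struct sketch_sync_slot {
	uint64_t context;	/* fence context id */
	uint32_t seqno;		/* last seqno waited upon for that context */
	bool valid;		/* distinguishes "unset" from seqno 0 */
};

struct sketch_sync_map {
	struct sketch_sync_slot slot[SKETCH_SLOTS];
};

/* Wraparound-safe "a is at or after b" for 32-bit seqnos. */
static bool sketch_seqno_passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

static struct sketch_sync_slot *
sketch_sync_slot_for(struct sketch_sync_map *map, uint64_t context)
{
	return &map->slot[context % SKETCH_SLOTS];
}

/* True if we already waited on this context at seqno or later. */
static bool sketch_sync_is_later(struct sketch_sync_map *map,
				 uint64_t context, uint32_t seqno)
{
	struct sketch_sync_slot *s = sketch_sync_slot_for(map, context);

	return s->valid && s->context == context &&
	       sketch_seqno_passed(s->seqno, seqno);
}

/* Record the latest seqno waited upon for this context. */
static void sketch_sync_set(struct sketch_sync_map *map,
			    uint64_t context, uint32_t seqno)
{
	struct sketch_sync_slot *s = sketch_sync_slot_for(map, context);

	/* Never move an existing entry for the same context backwards. */
	if (s->valid && s->context == context &&
	    sketch_seqno_passed(s->seqno, seqno))
		return;

	s->context = context;
	s->seqno = seqno;
	s->valid = true;
}
```

A caller would check sketch_sync_is_later() before adding a wait and call sketch_sync_set() after adding one, so repeated awaits on the same or an older fence of a context are skipped.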
Re: [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Defer context state allocation for legacy ring submission (rev2)
On Thu, Apr 27, 2017 at 11:06:18AM -, Patchwork wrote:
> == Series Details ==
>
> Series: drm/i915: Defer context state allocation for legacy ring submission (rev2)
> URL   : https://patchwork.freedesktop.org/series/23619/
> State : success
>
> == Summary ==
>
> Series 23619v2 drm/i915: Defer context state allocation for legacy ring submission
> https://patchwork.freedesktop.org/api/1.0/series/23619/revisions/2/mbox/
>
> Test gem_exec_flush:
>         Subgroup basic-batch-kernel-default-uc:
>                 pass       -> FAIL       (fi-snb-2600) fdo#17
> Test kms_flip:
>         Subgroup basic-flip-vs-wf_vblank:
>                 fail       -> PASS       (fi-skl-6770hq) fdo#99739
>
> fdo#17 https://bugs.freedesktop.org/show_bug.cgi?id=17
> fdo#99739 https://bugs.freedesktop.org/show_bug.cgi?id=99739

Pushed, thanks for the review. Another step towards grand unification!
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
[Intel-gfx] [PATCH v9] drm/i915: Squash repeated awaits on the same fence
Track the latest fence waited upon on each context, and only add a new asynchronous wait if the new fence is more recent than the recorded fence for that context. This requires us to filter out unordered timelines, which are noted by DMA_FENCE_NO_CONTEXT. However, in the absence of a universal identifier, we have to use our own i915->mm.unordered_timeline token. v2: Throw around the debug crutches v3: Inline the likely case of the pre-allocation cache being full. v4: Drop the pre-allocation support, we can lose the most recent fence in case of allocation failure -- it just means we may emit more awaits than strictly necessary but will not break. v5: Trim allocation size for leaf nodes, they only need an array of u32 not pointers. v6: Create mock_timeline to tidy selftest writing v7: s/intel_timeline_sync_get/intel_timeline_sync_is_later/ (Tvrtko) v8: Prune the stale sync points when we idle. v9: Include a small benchmark in the kselftests Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Cc: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem.c| 1 + drivers/gpu/drm/i915/i915_gem_request.c| 11 + drivers/gpu/drm/i915/i915_gem_timeline.c | 314 + drivers/gpu/drm/i915/i915_gem_timeline.h | 15 + drivers/gpu/drm/i915/selftests/i915_gem_timeline.c | 225 +++ .../gpu/drm/i915/selftests/i915_mock_selftests.h | 1 + drivers/gpu/drm/i915/selftests/mock_timeline.c | 52 drivers/gpu/drm/i915/selftests/mock_timeline.h | 33 +++ 8 files changed, 652 insertions(+) create mode 100644 drivers/gpu/drm/i915/selftests/i915_gem_timeline.c create mode 100644 drivers/gpu/drm/i915/selftests/mock_timeline.c create mode 100644 drivers/gpu/drm/i915/selftests/mock_timeline.h diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index c1fa3c103f38..f886ef492036 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3214,6 +3214,7 @@ i915_gem_idle_work_handler(struct work_struct *work) intel_engine_disarm_breadcrumbs(engine); 
i915_gem_batch_pool_fini(&engine->batch_pool); } + i915_gem_timelines_mark_idle(dev_priv); GEM_BUG_ON(!dev_priv->gt.awake); dev_priv->gt.awake = false; diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c index 5fa4e52ded06..d9f76665bc6b 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.c +++ b/drivers/gpu/drm/i915/i915_gem_request.c @@ -772,6 +772,12 @@ i915_gem_request_await_dma_fence(struct drm_i915_gem_request *req, if (fence->context == req->fence.context) continue; + /* Squash repeated waits to the same timelines */ + if (fence->context != req->i915->mm.unordered_timeline && + intel_timeline_sync_is_later(req->timeline, +fence->context, fence->seqno)) + continue; + if (dma_fence_is_i915(fence)) ret = i915_gem_request_await_request(req, to_request(fence)); @@ -781,6 +787,11 @@ i915_gem_request_await_dma_fence(struct drm_i915_gem_request *req, GFP_KERNEL); if (ret < 0) return ret; + + /* Record the most latest fence on each timeline */ + if (fence->context != req->i915->mm.unordered_timeline) + intel_timeline_sync_set(req->timeline, + fence->context, fence->seqno); } while (--nchild); return 0; diff --git a/drivers/gpu/drm/i915/i915_gem_timeline.c b/drivers/gpu/drm/i915/i915_gem_timeline.c index b596ca7ee058..967c53a53a92 100644 --- a/drivers/gpu/drm/i915/i915_gem_timeline.c +++ b/drivers/gpu/drm/i915/i915_gem_timeline.c @@ -24,6 +24,276 @@ #include "i915_drv.h" +#define NSYNC 16 +#define SHIFT ilog2(NSYNC) +#define MASK (NSYNC - 1) + +/* struct intel_timeline_sync is a layer of a radixtree that maps a u64 fence + * context id to the last u32 fence seqno waited upon from that context. + * Unlike lib/radixtree it uses a parent pointer that allows traversal back to + * the root. This allows us to access the whole tree via a single pointer + * to the most recently used layer. We expect fence contexts to be dense + * and most reuse to be on the same i915_gem_context but on neighbouring + * engines (i.e. 
on adjacent contexts) and reuse the same leaf, a very + * effective lookup cache. If the new lookup is not on the same leaf, we + * expect it to be on the neighbouring branch. + * + * A leaf holds an array of u32 seqno, and has height 0. The bitmap field + * allows us to store whether a particular seqno is valid (i.e. allows us + * to distinguish unset from 0). + * + *
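As a rough illustration of the leaf described in the comment above, here is a minimal userspace sketch, assuming a single 16-slot leaf indexed by the low bits of the context id. The names `sync_leaf`, `seqno_passed`, `leaf_is_later` and `leaf_set` are hypothetical and are not the patch's API. The real patch walks a multi-level tree with parent pointers, which is left out here; the comparison uses the usual signed-difference idiom for wrapping u32 seqnos.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSYNC 16
#define MASK (NSYNC - 1)

/* Hypothetical single leaf: one u32 seqno per slot plus a validity bitmap. */
struct sync_leaf {
	uint64_t prefix;	/* context id with the low bits cleared */
	uint16_t bitmap;	/* bit n set => seqno[n] has been recorded */
	uint32_t seqno[NSYNC];
};

/* Wrap-safe "a is at or after b" for u32 sequence numbers. */
static bool seqno_passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/* True if a wait on (id, seqno) is already covered by a recorded fence. */
static bool leaf_is_later(const struct sync_leaf *leaf, uint64_t id,
			  uint32_t seqno)
{
	unsigned int idx = id & MASK;

	if ((id & ~(uint64_t)MASK) != leaf->prefix)
		return false;	/* belongs to another leaf: nothing known */
	if (!(leaf->bitmap & (1u << idx)))
		return false;	/* slot unset, so a stored 0 is not a seqno */
	return seqno_passed(leaf->seqno[idx], seqno);
}

/* Record the latest seqno waited upon for this context id. */
static void leaf_set(struct sync_leaf *leaf, uint64_t id, uint32_t seqno)
{
	uint64_t prefix = id & ~(uint64_t)MASK;
	unsigned int idx = id & MASK;

	if (prefix != leaf->prefix) {
		/* Sketch only: recycle the leaf instead of growing a tree. */
		leaf->prefix = prefix;
		leaf->bitmap = 0;
	}
	leaf->seqno[idx] = seqno;
	leaf->bitmap |= 1u << idx;
}
```

The bitmap is what lets an unset slot be told apart from a recorded seqno of 0, which is the distinction the comment calls out.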
Re: [Intel-gfx] [PATCH v3 5/8] drm/i915: Check error return when converting pipe to connector
On Thu, Apr 27, 2017 at 11:36:54AM +0300, Imre Deak wrote: > An error from intel_get_pipe_from_connector() would mean a bug somewhere > else, but we still should check for it to prevent some other more > obscure bug later. > > v2: > - Fall back to a reasonable default instead of bailing out in case of > error. (Jani) > v3: > - Fix s/PIPE_INVALID/INVALID_PIPE/ typo. (Jani) > > Cc: Jani Nikula > Signed-off-by: Imre Deak > --- > drivers/gpu/drm/i915/intel_panel.c | 17 ++--- > 1 file changed, 14 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_panel.c > b/drivers/gpu/drm/i915/intel_panel.c > index cb50c52..d1abbf1 100644 > --- a/drivers/gpu/drm/i915/intel_panel.c > +++ b/drivers/gpu/drm/i915/intel_panel.c > @@ -888,10 +888,14 @@ static void pch_enable_backlight(struct intel_connector > *connector) > struct drm_i915_private *dev_priv = to_i915(connector->base.dev); > struct intel_panel *panel = &connector->panel; > enum pipe pipe = intel_get_pipe_from_connector(connector); > - enum transcoder cpu_transcoder = > - intel_pipe_to_cpu_transcoder(dev_priv, pipe); > + enum transcoder cpu_transcoder; > u32 cpu_ctl2, pch_ctl1, pch_ctl2; > > + if (!WARN_ON_ONCE(pipe == INVALID_PIPE)) > + cpu_transcoder = intel_pipe_to_cpu_transcoder(dev_priv, pipe); > + else > + cpu_transcoder = TRANSCODER_EDP; > + > cpu_ctl2 = I915_READ(BLC_PWM_CPU_CTL2); > if (cpu_ctl2 & BLM_PWM_ENABLE) { > DRM_DEBUG_KMS("cpu backlight already enabled\n"); > @@ -973,6 +977,9 @@ static void i965_enable_backlight(struct intel_connector > *connector) > enum pipe pipe = intel_get_pipe_from_connector(connector); > u32 ctl, ctl2, freq; > > + if (WARN_ON_ONCE(pipe == INVALID_PIPE)) > + pipe = PIPE_A; > + > ctl2 = I915_READ(BLC_PWM_CTL2); > if (ctl2 & BLM_PWM_ENABLE) { > DRM_DEBUG_KMS("backlight already enabled\n"); > @@ -1037,6 +1044,9 @@ static void bxt_enable_backlight(struct intel_connector > *connector) > enum pipe pipe = intel_get_pipe_from_connector(connector); > u32 pwm_ctl, val; > > + 
if (WARN_ON_ONCE(pipe) == INVALID_PIPE) ^ Isn't that thing in the wrong place? > + pipe = PIPE_A; > + > /* Controller 1 uses the utility pin. */ > if (panel->backlight.controller == 1) { > val = I915_READ(UTIL_PIN_CTL); > @@ -1093,7 +1103,8 @@ void intel_panel_enable_backlight(struct > intel_connector *connector) > if (!panel->backlight.present) > return; > > - DRM_DEBUG_KMS("pipe %c\n", pipe_name(pipe)); > + if (!WARN_ON_ONCE(pipe == INVALID_PIPE)) > + DRM_DEBUG_KMS("pipe %c\n", pipe_name(pipe)); > > mutex_lock(&dev_priv->backlight_lock); > > -- > 2.5.0 > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Ville Syrjälä Intel OTC ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v3 5/8] drm/i915: Check error return when converting pipe to connector
On Thu, Apr 27, 2017 at 02:49:55PM +0300, Ville Syrjälä wrote: > On Thu, Apr 27, 2017 at 11:36:54AM +0300, Imre Deak wrote: > > An error from intel_get_pipe_from_connector() would mean a bug somewhere > > else, but we still should check for it to prevent some other more > > obscure bug later. > > > > v2: > > - Fall back to a reasonable default instead of bailing out in case of > > error. (Jani) > > v3: > > - Fix s/PIPE_INVALID/INVALID_PIPE/ typo. (Jani) > > > > Cc: Jani Nikula > > Signed-off-by: Imre Deak > > --- > > drivers/gpu/drm/i915/intel_panel.c | 17 ++--- > > 1 file changed, 14 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_panel.c > > b/drivers/gpu/drm/i915/intel_panel.c > > index cb50c52..d1abbf1 100644 > > --- a/drivers/gpu/drm/i915/intel_panel.c > > +++ b/drivers/gpu/drm/i915/intel_panel.c > > @@ -888,10 +888,14 @@ static void pch_enable_backlight(struct > > intel_connector *connector) > > struct drm_i915_private *dev_priv = to_i915(connector->base.dev); > > struct intel_panel *panel = &connector->panel; > > enum pipe pipe = intel_get_pipe_from_connector(connector); > > - enum transcoder cpu_transcoder = > > - intel_pipe_to_cpu_transcoder(dev_priv, pipe); > > + enum transcoder cpu_transcoder; > > u32 cpu_ctl2, pch_ctl1, pch_ctl2; > > > > + if (!WARN_ON_ONCE(pipe == INVALID_PIPE)) > > + cpu_transcoder = intel_pipe_to_cpu_transcoder(dev_priv, pipe); > > + else > > + cpu_transcoder = TRANSCODER_EDP; > > + > > cpu_ctl2 = I915_READ(BLC_PWM_CPU_CTL2); > > if (cpu_ctl2 & BLM_PWM_ENABLE) { > > DRM_DEBUG_KMS("cpu backlight already enabled\n"); > > @@ -973,6 +977,9 @@ static void i965_enable_backlight(struct > > intel_connector *connector) > > enum pipe pipe = intel_get_pipe_from_connector(connector); > > u32 ctl, ctl2, freq; > > > > + if (WARN_ON_ONCE(pipe == INVALID_PIPE)) > > + pipe = PIPE_A; > > + > > ctl2 = I915_READ(BLC_PWM_CTL2); > > if (ctl2 & BLM_PWM_ENABLE) { > > DRM_DEBUG_KMS("backlight already enabled\n"); > > @@ -1037,6 
+1044,9 @@ static void bxt_enable_backlight(struct > > intel_connector *connector) > > enum pipe pipe = intel_get_pipe_from_connector(connector); > > u32 pwm_ctl, val; > > > > + if (WARN_ON_ONCE(pipe) == INVALID_PIPE) > ^ > > Isn't that thing in the wrong place? Yes, thanks for catching it. > > > + pipe = PIPE_A; > > + > > /* Controller 1 uses the utility pin. */ > > if (panel->backlight.controller == 1) { > > val = I915_READ(UTIL_PIN_CTL); > > @@ -1093,7 +1103,8 @@ void intel_panel_enable_backlight(struct > > intel_connector *connector) > > if (!panel->backlight.present) > > return; > > > > - DRM_DEBUG_KMS("pipe %c\n", pipe_name(pipe)); > > + if (!WARN_ON_ONCE(pipe == INVALID_PIPE)) > > + DRM_DEBUG_KMS("pipe %c\n", pipe_name(pipe)); > > > > mutex_lock(&dev_priv->backlight_lock); > > > > -- > > 2.5.0 > > > > ___ > > Intel-gfx mailing list > > Intel-gfx@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx > > -- > Ville Syrjälä > Intel OTC ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v3 5/8] drm/i915: Check error return when converting pipe to connector
On Thu, 27 Apr 2017, Imre Deak wrote: > On Thu, Apr 27, 2017 at 02:49:55PM +0300, Ville Syrjälä wrote: >> On Thu, Apr 27, 2017 at 11:36:54AM +0300, Imre Deak wrote: >> > An error from intel_get_pipe_from_connector() would mean a bug somewhere >> > else, but we still should check for it to prevent some other more >> > obscure bug later. >> > >> > v2: >> > - Fall back to a reasonable default instead of bailing out in case of >> > error. (Jani) >> > v3: >> > - Fix s/PIPE_INVALID/INVALID_PIPE/ typo. (Jani) >> > >> > Cc: Jani Nikula >> > Signed-off-by: Imre Deak >> > --- >> > drivers/gpu/drm/i915/intel_panel.c | 17 ++--- >> > 1 file changed, 14 insertions(+), 3 deletions(-) >> > >> > diff --git a/drivers/gpu/drm/i915/intel_panel.c >> > b/drivers/gpu/drm/i915/intel_panel.c >> > index cb50c52..d1abbf1 100644 >> > --- a/drivers/gpu/drm/i915/intel_panel.c >> > +++ b/drivers/gpu/drm/i915/intel_panel.c >> > @@ -888,10 +888,14 @@ static void pch_enable_backlight(struct >> > intel_connector *connector) >> >struct drm_i915_private *dev_priv = to_i915(connector->base.dev); >> >struct intel_panel *panel = &connector->panel; >> >enum pipe pipe = intel_get_pipe_from_connector(connector); >> > - enum transcoder cpu_transcoder = >> > - intel_pipe_to_cpu_transcoder(dev_priv, pipe); >> > + enum transcoder cpu_transcoder; >> >u32 cpu_ctl2, pch_ctl1, pch_ctl2; >> > >> > + if (!WARN_ON_ONCE(pipe == INVALID_PIPE)) >> > + cpu_transcoder = intel_pipe_to_cpu_transcoder(dev_priv, pipe); >> > + else >> > + cpu_transcoder = TRANSCODER_EDP; >> > + >> >cpu_ctl2 = I915_READ(BLC_PWM_CPU_CTL2); >> >if (cpu_ctl2 & BLM_PWM_ENABLE) { >> >DRM_DEBUG_KMS("cpu backlight already enabled\n"); >> > @@ -973,6 +977,9 @@ static void i965_enable_backlight(struct >> > intel_connector *connector) >> >enum pipe pipe = intel_get_pipe_from_connector(connector); >> >u32 ctl, ctl2, freq; >> > >> > + if (WARN_ON_ONCE(pipe == INVALID_PIPE)) >> > + pipe = PIPE_A; >> > + >> >ctl2 = I915_READ(BLC_PWM_CTL2); >> >if (ctl2 & 
BLM_PWM_ENABLE) { >> >DRM_DEBUG_KMS("backlight already enabled\n"); >> > @@ -1037,6 +1044,9 @@ static void bxt_enable_backlight(struct >> > intel_connector *connector) >> >enum pipe pipe = intel_get_pipe_from_connector(connector); >> >u32 pwm_ctl, val; >> > >> > + if (WARN_ON_ONCE(pipe) == INVALID_PIPE) >> ^ >> >> Isn't that thing in the wrong place? > > Yes, thanks for catching it. *facepalm* I guess my review is not to be trusted. :/ BR, Jani. > >> >> > + pipe = PIPE_A; >> > + >> >/* Controller 1 uses the utility pin. */ >> >if (panel->backlight.controller == 1) { >> >val = I915_READ(UTIL_PIN_CTL); >> > @@ -1093,7 +1103,8 @@ void intel_panel_enable_backlight(struct >> > intel_connector *connector) >> >if (!panel->backlight.present) >> >return; >> > >> > - DRM_DEBUG_KMS("pipe %c\n", pipe_name(pipe)); >> > + if (!WARN_ON_ONCE(pipe == INVALID_PIPE)) >> > + DRM_DEBUG_KMS("pipe %c\n", pipe_name(pipe)); >> > >> >mutex_lock(&dev_priv->backlight_lock); >> > >> > -- >> > 2.5.0 >> > >> > ___ >> > Intel-gfx mailing list >> > Intel-gfx@lists.freedesktop.org >> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx >> >> -- >> Ville Syrjälä >> Intel OTC -- Jani Nikula, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v4 5/8] drm/i915: Check error return when converting pipe to connector
An error from intel_get_pipe_from_connector() would mean a bug somewhere
else, but we still should check for it to prevent some other more
obscure bug later.

v2:
- Fall back to a reasonable default instead of bailing out in case of
  error. (Jani)
v3:
- Fix s/PIPE_INVALID/INVALID_PIPE/ typo. (Jani)
v4:
- Fix bogus bracing around WARN() condition. (Ville)

Cc: Jani Nikula
Cc: Ville Syrjälä
Signed-off-by: Imre Deak
Reviewed-by: Jani Nikula (v3)
---
 drivers/gpu/drm/i915/intel_panel.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_panel.c b/drivers/gpu/drm/i915/intel_panel.c
index cb50c52..c8103f8 100644
--- a/drivers/gpu/drm/i915/intel_panel.c
+++ b/drivers/gpu/drm/i915/intel_panel.c
@@ -888,10 +888,14 @@ static void pch_enable_backlight(struct intel_connector *connector)
 	struct drm_i915_private *dev_priv = to_i915(connector->base.dev);
 	struct intel_panel *panel = &connector->panel;
 	enum pipe pipe = intel_get_pipe_from_connector(connector);
-	enum transcoder cpu_transcoder =
-		intel_pipe_to_cpu_transcoder(dev_priv, pipe);
+	enum transcoder cpu_transcoder;
 	u32 cpu_ctl2, pch_ctl1, pch_ctl2;

+	if (!WARN_ON_ONCE(pipe == INVALID_PIPE))
+		cpu_transcoder = intel_pipe_to_cpu_transcoder(dev_priv, pipe);
+	else
+		cpu_transcoder = TRANSCODER_EDP;
+
 	cpu_ctl2 = I915_READ(BLC_PWM_CPU_CTL2);
 	if (cpu_ctl2 & BLM_PWM_ENABLE) {
 		DRM_DEBUG_KMS("cpu backlight already enabled\n");
@@ -973,6 +977,9 @@ static void i965_enable_backlight(struct intel_connector *connector)
 	enum pipe pipe = intel_get_pipe_from_connector(connector);
 	u32 ctl, ctl2, freq;

+	if (WARN_ON_ONCE(pipe == INVALID_PIPE))
+		pipe = PIPE_A;
+
 	ctl2 = I915_READ(BLC_PWM_CTL2);
 	if (ctl2 & BLM_PWM_ENABLE) {
 		DRM_DEBUG_KMS("backlight already enabled\n");
@@ -1037,6 +1044,9 @@ static void bxt_enable_backlight(struct intel_connector *connector)
 	enum pipe pipe = intel_get_pipe_from_connector(connector);
 	u32 pwm_ctl, val;

+	if (WARN_ON_ONCE(pipe == INVALID_PIPE))
+		pipe = PIPE_A;
+
 	/* Controller 1 uses the utility pin. */
 	if (panel->backlight.controller == 1) {
 		val = I915_READ(UTIL_PIN_CTL);
@@ -1093,7 +1103,8 @@ void intel_panel_enable_backlight(struct intel_connector *connector)
 	if (!panel->backlight.present)
 		return;

-	DRM_DEBUG_KMS("pipe %c\n", pipe_name(pipe));
+	if (!WARN_ON_ONCE(pipe == INVALID_PIPE))
+		DRM_DEBUG_KMS("pipe %c\n", pipe_name(pipe));

 	mutex_lock(&dev_priv->backlight_lock);

--
2.5.0
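To see why the bracing Ville flagged was bogus, the following is a minimal userspace sketch, assuming a simplified stand-in for WARN_ON_ONCE() that, like the kernel macro, evaluates to the truth value of its condition (0 or 1). `MOCK_WARN_ON_ONCE`, `fallback_v3` and `fallback_v4` are hypothetical names used only for illustration; `INVALID_PIPE` is taken to be -1, as in the i915 `enum pipe`.

```c
enum pipe { INVALID_PIPE = -1, PIPE_A = 0, PIPE_B, PIPE_C };

/* Simplified stand-in: yields 0 or 1, without the kernel's one-time splat. */
#define MOCK_WARN_ON_ONCE(cond) (!!(cond))

/* v3 bracing: compares the macro's 0/1 result against INVALID_PIPE (-1). */
static enum pipe fallback_v3(enum pipe pipe)
{
	if (MOCK_WARN_ON_ONCE(pipe) == INVALID_PIPE)
		pipe = PIPE_A;	/* never reached: 0 or 1 is never -1 */
	return pipe;
}

/* v4 bracing: the comparison is the condition being warned about. */
static enum pipe fallback_v4(enum pipe pipe)
{
	if (MOCK_WARN_ON_ONCE(pipe == INVALID_PIPE))
		pipe = PIPE_A;
	return pipe;
}
```

With the v3 form the fallback to PIPE_A can never trigger, and the real macro would also warn for every pipe other than PIPE_A, since any nonzero value counts as a true condition.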
Re: [Intel-gfx] [PATCH 0/7] Add Y-tiling support into IGTs
Hi Paulo,

Thanks for your review.

On Wednesday 26 April 2017 08:21 PM, Paulo Zanoni wrote:
> On Wed, 2017-04-26 at 10:46 -0300, Paulo Zanoni wrote:
>> On Sat, 2017-03-18 at 00:45 +0530, Praveen Paneri wrote:
>>> This series adds Y-tiled buffer creation support into IGT libraries
>>> and goes on to use this capability to add support into FBC tests to
>>> use Y-tiled buffers.
>>
>> I applied this series and the Kernel patch. If I try to run
>> kms_draw_crc it just gets stuck eating 100% of the CPU. I suppose this
>> needs to be debugged, maybe some patch is wrong. Can you reproduce
>> this behavior?
>
> Just as a note, I tested on SKL. Maybe that's relevant.

I had tested this on APL and it was working fine. I will try to run it
on SKL as well. I will address your comments and then resend this series
soon.

thanks,
Praveen

>>> Akash Goel (1):
>>>   lib/igt_draw: Add Y-tiling support for IGT_DRAW_BLT method
>>>
>>> Paulo Zanoni (1):
>>>   tests/kms_draw_crc: add support for Y tiling
>>>
>>> Praveen Paneri (5):
>>>   lib/igt_fb: Let others use igt_get_fb_tile_size
>>>   lib/igt_fb: Add helper function for tile_to_mod
>>>   lib/igt_draw: Add Y-tiling support
>>>   igt/kms_frontbuffer_tracking: Add Y-tiling support
>>>   igt/kms_fbc_crc.c : Add Y-tile tests
>>>
>>>  lib/igt_draw.c                   | 167 - --
>>>  lib/igt_fb.c                     |  29 ++-
>>>  lib/igt_fb.h                     |   4 +-
>>>  tests/kms_draw_crc.c             |  58 ++
>>>  tests/kms_fbc_crc.c              |  71 +
>>>  tests/kms_frontbuffer_tracking.c |  46 ++-
>>>  6 files changed, 262 insertions(+), 113 deletions(-)
[Intel-gfx] [PATCH i-g-t] tests/kms_atomic_abi: Test event ABI corner cases
Atomic has a few special cases around async commits and event generation that we need to test. This patch addresses these two tests - kernel rejects events on a disabled pipe - events on a pipe that is getting enabled/disabled For: VIZ-6954 Signed-off-by: Mika Kahola --- tests/Makefile.sources | 1 + tests/kms_atomic_abi.c | 201 + 2 files changed, 202 insertions(+) create mode 100644 tests/kms_atomic_abi.c diff --git a/tests/Makefile.sources b/tests/Makefile.sources index 2b6e6ee..b4e9897 100644 --- a/tests/Makefile.sources +++ b/tests/Makefile.sources @@ -121,6 +121,7 @@ TESTS_progs_M = \ kms_plane \ kms_plane_multiple \ kms_plane_lowres \ + kms_atomic_abi \ kms_properties \ kms_psr_sink_crc \ kms_render \ diff --git a/tests/kms_atomic_abi.c b/tests/kms_atomic_abi.c new file mode 100644 index 000..843220b --- /dev/null +++ b/tests/kms_atomic_abi.c @@ -0,0 +1,201 @@ +/* + * Copyright © 2017 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + */ + +#include "igt.h" +#include + +IGT_TEST_DESCRIPTION("Test event ABI"); + +static void +do_atomic_commit(igt_display_t *display, int expectation) +{ + char buf[256]; + struct drm_event *e = (void *)buf; + int ret; + + ret = igt_display_try_commit_atomic(display, + DRM_MODE_PAGE_FLIP_EVENT, + NULL); + igt_assert_eq(ret, expectation); + + if (expectation == 0) { + igt_set_timeout(1, "Stuck on page flip"); + + ret = read(display->drm_fd, buf, sizeof(buf)); + igt_assert(ret >= 0); + + igt_reset_timeout(); + igt_assert_eq(e->type, DRM_EVENT_FLIP_COMPLETE); + } +} + +static int +read_pageflip_event(igt_display_t *display) +{ + char buf[256]; + struct drm_event *e = (void *)buf; + int ret; + + for (int i = 0; i < 100; i++) { + ret = read(display->drm_fd, buf, sizeof(buf)); + + if (ret >= 0) + break; + } + + if (ret >= 0) + igt_assert_eq(e->type, DRM_EVENT_FLIP_COMPLETE); + + return ret; +} + +static void +test_init(igt_display_t *display, enum pipe pipe, igt_output_t *output) +{ + struct igt_fb fb; + drmModeModeInfo *mode; + igt_plane_t *primary; + int flags = DRM_MODE_ATOMIC_ALLOW_MODESET | DRM_MODE_ATOMIC_NONBLOCK; + int ret; + + igt_output_set_pipe(output, pipe); + + primary = igt_output_get_plane_type(output, DRM_PLANE_TYPE_PRIMARY); + + mode = igt_output_get_mode(output); + + igt_create_color_fb(display->drm_fd, mode->hdisplay, mode->vdisplay, + DRM_FORMAT_XRGB, + LOCAL_DRM_FORMAT_MOD_NONE, + 0.0f, 0.0f, 1.0f, + &fb); + + igt_plane_set_fb(primary, &fb); + + ret = igt_display_try_commit_atomic(display, flags, NULL); + igt_assert_eq(ret, 0); +} + +static void +test_disable_pipe(igt_display_t *display, enum pipe pipe, igt_output_t *output) +{ + test_init(display, pipe, output); + + 
igt_output_set_pipe(output, pipe); + + do_atomic_commit(display, 0); + + /* disable pipe */ + igt_output_set_pipe(output, PIPE_NONE); + + /* try to do atomic commit */ + do_atomic_commit(display, -EINVAL); + + igt_output_set_pipe(output, PIPE_ANY); +} + +static void +test_dpms(igt_display_t *display, enum pipe pipe, igt_output_t *output) +{ + int ret; + int flags = fcntl(display->drm_fd, F_GETFL, 0); + + test_init(display, pipe, output); + + igt_output_set_pipe(output, pipe); + + do_atomic_commit(display, 0); + + fcntl(display->drm_fd, F_SETFL
[Intel-gfx] [PULL] drm-intel-next-fixes for v4.12
Hi Dave, here's an assortment of drm/i915 and gvt fixes for
drm-next/v4.12.

BR,
Jani.

The following changes since commit ab6eb211b07a42a6346e284056422fd9a8576a99:

  Merge tag 'drm/panel/for-4.12-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next (2017-04-13 06:17:40 +1000)

are available in the git repository at:

  git://anongit.freedesktop.org/git/drm-intel tags/drm-intel-next-fixes-2017-04-27

for you to fetch changes up to 88326ef05b262f681d837ecf65db10a7edb609f1:

  drm/i915: Confirm the request is still active before adding it to the await (2017-04-26 16:28:47 +0300)

drm/i915 and gvt fixes for drm-next/v4.12

Changbin Du (4):
      drm/i915/gvt: Align render mmio list to cacheline
      drm/i915/gvt: remove redundant platform check for mocs load/restore
      drm/i915/gvt: remove redundant ring id check which cause significant CPU misprediction
      drm/i915/gvt: use directly assignment for structure copying

Chris Wilson (7):
      drm/i915: Park the signaler before sleeping
      drm/i915: Apply a cond_resched() to the saturated signaler
      drm/i915: Use the right mapping_gfp_mask for final shmem allocation
      drm/i915: Fix use after free in lpe_audio_platdev_destroy()
      drm/i915/selftests: Allocate inode/file dynamically
      drm/i915: Avoid busy-spinning on VLV_GLTC_PW_STATUS mmio
      drm/i915: Confirm the request is still active before adding it to the await

Dan Carpenter (2):
      drm/i915/gvt: fix a bounds check in ring_id_to_context_switch_event()
      drm/i915: checking for NULL instead of IS_ERR() in mock selftests

Jani Nikula (1):
      Merge tag 'gvt-next-fixes-2017-04-20' of https://github.com/01org/gvt-linux into drm-intel-next-fixes

Mika Kuoppala (1):
      drm/i915: Fix system hang with EI UP masked on Haswell

Pei Zhang (1):
      drm/i915/gvt: add mmio init for virtual display

Ville Syrjälä (2):
      drm/i915: Make legacy cursor updates more unsynced
      drm/i915: Perform link quality check unconditionally during long pulse

Zhenyu Wang (3):
      drm/i915/gvt: cleanup some too chatty scheduler message
      drm/i915/gvt: remove some debug messages in scheduler timer handler
      drm/i915/gvt: Fix PTE write flush for taking runtime pm properly

 drivers/gpu/drm/i915/gvt/cmd_parser.c         |  8 +
 drivers/gpu/drm/i915/gvt/display.c            | 29 -
 drivers/gpu/drm/i915/gvt/execlist.c           |  8 ++---
 drivers/gpu/drm/i915/gvt/gtt.c                |  5 +++
 drivers/gpu/drm/i915/gvt/render.c             | 10 ++
 drivers/gpu/drm/i915/gvt/sched_policy.c       | 17 ++
 drivers/gpu/drm/i915/gvt/scheduler.c          |  5 +--
 drivers/gpu/drm/i915/i915_drv.c               | 46 ++-
 drivers/gpu/drm/i915/i915_gem.c               |  2 +-
 drivers/gpu/drm/i915/i915_gem_request.c       |  3 ++
 drivers/gpu/drm/i915/i915_irq.c               |  4 +--
 drivers/gpu/drm/i915/intel_breadcrumbs.c      | 21 +---
 drivers/gpu/drm/i915/intel_display.c          | 31 +++---
 drivers/gpu/drm/i915/intel_dp.c               | 15 +++--
 drivers/gpu/drm/i915/intel_lpe_audio.c        |  9 +-
 drivers/gpu/drm/i915/selftests/mock_drm.c     | 45 ++
 drivers/gpu/drm/i915/selftests/mock_request.c |  2 +-
 17 files changed, 163 insertions(+), 97 deletions(-)

--
Jani Nikula, Intel Open Source Technology Center
[Intel-gfx] [PATCH 1/2] drm/i915: Sanitize engine context sizes
Pre-calculate engine context size based on engine class and device generation and store it in the engine instance. v2: - Squash and get rid of hw_context_size (Chris) v3: - Move after MMIO init for probing on Gen7 and 8 (Chris) - Retained rounding (Tvrtko) v4: - Rebase for deferred legacy context allocation Signed-off-by: Joonas Lahtinen Cc: Paulo Zanoni Cc: Rodrigo Vivi Cc: Chris Wilson Cc: Daniele Ceraolo Spurio Cc: Tvrtko Ursulin Cc: Oscar Mateo Cc: Zhenyu Wang Cc: intel-gvt-...@lists.freedesktop.org Acked-by: Tvrtko Ursulin Cc: Tvrtko Ursulin Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/gvt/scheduler.c | 6 +- drivers/gpu/drm/i915/i915_drv.c| 15 +++-- drivers/gpu/drm/i915/i915_drv.h| 3 +- drivers/gpu/drm/i915/i915_gem_context.c| 56 ++- drivers/gpu/drm/i915/i915_guc_submission.c | 3 +- drivers/gpu/drm/i915/i915_reg.h| 10 drivers/gpu/drm/i915/intel_engine_cs.c | 90 +- drivers/gpu/drm/i915/intel_lrc.c | 54 +- drivers/gpu/drm/i915/intel_lrc.h | 2 - drivers/gpu/drm/i915/intel_ringbuffer.c| 4 +- drivers/gpu/drm/i915/intel_ringbuffer.h| 7 ++- 11 files changed, 112 insertions(+), 138 deletions(-) diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c index bada32b..1256fe2 100644 --- a/drivers/gpu/drm/i915/gvt/scheduler.c +++ b/drivers/gpu/drm/i915/gvt/scheduler.c @@ -69,8 +69,7 @@ static int populate_shadow_context(struct intel_vgpu_workload *workload) gvt_dbg_sched("ring id %d workload lrca %x", ring_id, workload->ctx_desc.lrca); - context_page_num = intel_lr_context_size( - gvt->dev_priv->engine[ring_id]); + context_page_num = gvt->dev_priv->engine[ring_id]->context_size; context_page_num = context_page_num >> PAGE_SHIFT; @@ -330,8 +329,7 @@ static void update_guest_context(struct intel_vgpu_workload *workload) gvt_dbg_sched("ring id %d workload lrca %x\n", ring_id, workload->ctx_desc.lrca); - context_page_num = intel_lr_context_size( - gvt->dev_priv->engine[ring_id]); + context_page_num = 
gvt->dev_priv->engine[ring_id]->context_size; context_page_num = context_page_num >> PAGE_SHIFT; diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index c7d68e7..2d3c4264 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -835,10 +835,6 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv, intel_uc_init_early(dev_priv); i915_memcpy_init_early(dev_priv); - ret = intel_engines_init_early(dev_priv); - if (ret) - return ret; - ret = i915_workqueues_init(dev_priv); if (ret < 0) goto err_engines; @@ -948,14 +944,21 @@ static int i915_driver_init_mmio(struct drm_i915_private *dev_priv) ret = i915_mmio_setup(dev_priv); if (ret < 0) - goto put_bridge; + goto err_bridge; intel_uncore_init(dev_priv); + + ret = intel_engines_init_mmio(dev_priv); + if (ret) + goto err_uncore; + i915_gem_init_mmio(dev_priv); return 0; -put_bridge: +err_uncore: + intel_uncore_fini(dev_priv); +err_bridge: pci_dev_put(dev_priv->bridge_dev); return ret; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index d1f7c48..e68edf1 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2359,7 +2359,6 @@ struct drm_i915_private { */ struct mutex av_mutex; - uint32_t hw_context_size; struct list_head context_list; u32 fdi_rx_config; @@ -3023,7 +3022,7 @@ extern unsigned long i915_gfx_val(struct drm_i915_private *dev_priv); extern void i915_update_gfx_val(struct drm_i915_private *dev_priv); int vlv_force_gfx_clock(struct drm_i915_private *dev_priv, bool on); -int intel_engines_init_early(struct drm_i915_private *dev_priv); +int intel_engines_init_mmio(struct drm_i915_private *dev_priv); int intel_engines_init(struct drm_i915_private *dev_priv); /* intel_hotplug.c */ diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index d46a69d..31a73c3 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ 
b/drivers/gpu/drm/i915/i915_gem_context.c @@ -92,33 +92,6 @@ #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1 -static int get_context_size(struct drm_i915_private *dev_priv) -{ - int ret; - u32 reg; - - switch (INTEL_GEN(dev_priv)) { - case 6: - reg = I915_READ(CXT_SIZE); - ret = GEN6_CXT_TOTAL_SIZE(reg) * 64; - break; - case 7: - reg = I915_READ(GEN7_CXT_SIZE); -
[Intel-gfx] [PATCH 2/2] drm/i915: Eliminate HAS_HW_CONTEXTS
According to Chris i915_gem_sanitize was meant to reset ILK too. CCID register existed already on ILK according to the PRM (Chris verified the address to match too). HAS_HW_CONTEXTS in i915_l3_write is bogus because each HAS_L3_DPF match also has .has_hw_contexts = 1 set. This leads to us being able to get rid of the property completely. Signed-off-by: Joonas Lahtinen Cc: Chris Wilson Cc: Tvrtko Ursulin Cc: Mika Kuoppala --- drivers/gpu/drm/i915/i915_drv.h | 2 -- drivers/gpu/drm/i915/i915_gem.c | 2 +- drivers/gpu/drm/i915/i915_gpu_error.c | 6 +++--- drivers/gpu/drm/i915/i915_pci.c | 5 - drivers/gpu/drm/i915/i915_sysfs.c | 3 --- 5 files changed, 4 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index e68edf1..cfa5689 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -822,7 +822,6 @@ struct intel_csr { func(has_gmch_display); \ func(has_guc); \ func(has_hotplug); \ - func(has_hw_contexts); \ func(has_l3_dpf); \ func(has_llc); \ func(has_logical_ring_contexts); \ @@ -2866,7 +2865,6 @@ intel_info(const struct drm_i915_private *dev_priv) #define HWS_NEEDS_PHYSICAL(dev_priv) ((dev_priv)->info.hws_needs_physical) -#define HAS_HW_CONTEXTS(dev_priv) ((dev_priv)->info.has_hw_contexts) #define HAS_LOGICAL_RING_CONTEXTS(dev_priv) \ ((dev_priv)->info.has_logical_ring_contexts) #define USES_PPGTT(dev_priv) (i915.enable_ppgtt) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 33fb11c..7c6048a 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4488,7 +4488,7 @@ void i915_gem_sanitize(struct drm_i915_private *i915) * of the reset, so we only reset recent machines with logical * context support (that must be reset to remove any stray contexts). 
*/ - if (HAS_HW_CONTEXTS(i915)) { + if (INTEL_GEN(i915) >= 5) { int reset = intel_gpu_reset(i915, ALL_ENGINES); WARN_ON(reset && reset != -ENODEV); } diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 4b247b0..ec526d9 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1598,6 +1598,9 @@ static void i915_capture_reg_state(struct drm_i915_private *dev_priv, error->done_reg = I915_READ(DONE_REG); } + if (INTEL_GEN(dev_priv) >= 5) + error->ccid = I915_READ(CCID); + /* 3: Feature specific registers */ if (IS_GEN6(dev_priv) || IS_GEN7(dev_priv)) { error->gam_ecochk = I915_READ(GAM_ECOCHK); @@ -1605,9 +1608,6 @@ static void i915_capture_reg_state(struct drm_i915_private *dev_priv, } /* 4: Everything else */ - if (HAS_HW_CONTEXTS(dev_priv)) - error->ccid = I915_READ(CCID); - if (INTEL_GEN(dev_priv) >= 8) { error->ier = I915_READ(GEN8_DE_MISC_IER); for (i = 0; i < 4; i++) diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index f87b0c4..f80db2c 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -220,7 +220,6 @@ static const struct intel_device_info intel_ironlake_m_info = { .has_rc6 = 1, \ .has_rc6p = 1, \ .has_gmbus_irq = 1, \ - .has_hw_contexts = 1, \ .has_aliasing_ppgtt = 1, \ GEN_DEFAULT_PIPEOFFSETS, \ CURSOR_OFFSETS @@ -245,7 +244,6 @@ static const struct intel_device_info intel_sandybridge_m_info = { .has_rc6 = 1, \ .has_rc6p = 1, \ .has_gmbus_irq = 1, \ - .has_hw_contexts = 1, \ .has_aliasing_ppgtt = 1, \ .has_full_ppgtt = 1, \ GEN_DEFAULT_PIPEOFFSETS, \ @@ -280,7 +278,6 @@ static const struct intel_device_info intel_valleyview_info = { .has_runtime_pm = 1, .has_rc6 = 1, .has_gmbus_irq = 1, - .has_hw_contexts = 1, .has_gmch_display = 1, .has_hotplug = 1, .has_aliasing_ppgtt = 1, @@ -340,7 +337,6 @@ static const struct intel_device_info intel_cherryview_info = { .has_resource_streamer = 1, .has_rc6 = 1, 
.has_gmbus_irq = 1, - .has_hw_contexts = 1, .has_logical_ring_contexts = 1, .has_gmch_display = 1, .has_aliasing_ppgtt = 1, @@ -387,7 +383,6 @@ static const struct intel_device_info intel_skylake_gt3_info = { .has_rc6 = 1, \ .has_dp_mst = 1, \ .has_gmbus_irq = 1, \ - .has_hw_contexts = 1, \ .has_logical_ring_contexts = 1, \ .has_guc = 1, \ .has_decoupled_mmio = 1, \ diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c index f3fdfda..a6ad1c2 100644 --- a/drivers/gpu/drm/i915/i915_sysfs.c +++ b/driver
[Intel-gfx] [PATCH] drm/i915/edp: Read link status after exit PSR
From: "Lee, Shawn C"

The display driver reads DPCD registers 0x202, 0x203 and 0x204 to
identify the eDP sink status. If a PSR exit is ongoing at the eDP sink
and the eDP source reads these registers at the same time, the panel
reports that EQ and symbol lock are not done, which causes the panel
display to flicker. So the driver has to make sure PSR has already
exited before reading the link status.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99639
TEST=Reboot DUT and check that there is no flicker on the local display
at the login screen

Cc: Cooper Chiou
Cc: Jani Nikula
Cc: Rodrigo Vivi
Cc: Jim Bride
Cc: Ryan Lin
Signed-off-by: Shawn Lee
---
 drivers/gpu/drm/i915/intel_dp.c | 34 +-
 1 file changed, 29 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 08834f74d396..cc431337b2dc 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -4252,19 +4252,35 @@ static void intel_dp_handle_test_request(struct intel_dp *intel_dp)
 }

 static void
+intel_edp_wait_PSR_exit(struct intel_dp *intel_dp)
+{
+	struct drm_device *dev = intel_dp_to_dev(intel_dp);
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	u32 srd_status, count = 100;
+
+	while (count--) {
+		srd_status = I915_READ(EDP_PSR_STATUS_CTL);
+
+		if ((srd_status & EDP_PSR_STATUS_SENDING_TP1) ||
+		    (srd_status & EDP_PSR_STATUS_SENDING_TP2_TP3) ||
+		    (srd_status & EDP_PSR_STATUS_SENDING_IDLE) ||
+		    (srd_status & EDP_PSR_STATUS_AUX_SENDING)) {
+			usleep_range(100, 150);
+		} else
+			return;
+	}
+}
+
+static void
 intel_dp_check_link_status(struct intel_dp *intel_dp)
 {
 	struct intel_encoder *intel_encoder = &dp_to_dig_port(intel_dp)->base;
 	struct drm_device *dev = intel_dp_to_dev(intel_dp);
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	u8 link_status[DP_LINK_STATUS_SIZE];

 	WARN_ON(!drm_modeset_is_locked(&dev->mode_config.connection_mutex));

-	if (!intel_dp_get_link_status(intel_dp, link_status)) {
-		DRM_ERROR("Failed to get link status\n");
-		return;
-	}
-
 	if (!intel_encoder->base.crtc)
 		return;

@@ -4278,6 +4294,14 @@ static void intel_dp_handle_test_request(struct intel_dp *intel_dp)
 	if (!intel_dp_link_params_valid(intel_dp))
 		return;

+	if (is_edp(intel_dp) && dev_priv->psr.enabled)
+		intel_edp_wait_PSR_exit(intel_dp);
+
+	if (!intel_dp_get_link_status(intel_dp, link_status)) {
+		DRM_ERROR("Failed to get link status\n");
+		return;
+	}
+
 	/* Retrain if Channel EQ or CR not ok */
 	if (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count)) {
 		DRM_DEBUG_KMS("%s: channel EQ not ok, retraining\n",
--
1.7.9.5
Re: [Intel-gfx] [PATCH] drm/i915/edp: Read link status after exit PSR
On Thu, Apr 27, 2017 at 10:35:22PM +0800, Lee, Shawn C wrote: > From: "Lee, Shawn C" > > Display driver read DPCD register 0x202, 0x203 and 0x204 to identify > eDP sink status. If PSR exit is ongoing at eDP sink, and eDP source > read these registers at the same time. Panel will report EQ & symbol > lock not done. It will cause panel display flicking. > So driver have to make sure PSR already exit before read link status. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99639 > TEST=Reboot DUT and no flicking on local display at login screen > > Cc: Cooper Chiou > Cc: Jani Nikula > Cc: Rodrigo Vivi > Cc: Jim Bride > Cc: Ryan Lin > > Signed-off-by: Shawn Lee > --- > drivers/gpu/drm/i915/intel_dp.c | 34 +- > 1 file changed, 29 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c > index 08834f74d396..cc431337b2dc 100644 > --- a/drivers/gpu/drm/i915/intel_dp.c > +++ b/drivers/gpu/drm/i915/intel_dp.c > @@ -4252,19 +4252,35 @@ static void intel_dp_handle_test_request(struct > intel_dp *intel_dp) > } > > static void > +intel_edp_wait_PSR_exit(struct intel_dp *intel_dp) > +{ > + struct drm_device *dev = intel_dp_to_dev(intel_dp); > + struct drm_i915_private *dev_priv = dev->dev_private; > + u32 srd_status, count = 100; > + > + while (count--) { > + srd_status = I915_READ(EDP_PSR_STATUS_CTL); > + > + if ((srd_status & EDP_PSR_STATUS_SENDING_TP1) || > + (srd_status & EDP_PSR_STATUS_SENDING_TP2_TP3) || > + (srd_status & EDP_PSR_STATUS_SENDING_IDLE) || > + (srd_status & EDP_PSR_STATUS_AUX_SENDING)) { > + usleep_range(100, 150); > + } else > + return; > + } See intel_wait_for_register(i915, EDP_PSR_STATUS_CTL, (EDP_PSR_STATUS_SENDING_TP1 | EDP_PSR_STATUS_SENDING_TP2_TP3 | EDP_PSR_STATUS_SENDING_IDLE | EDP_PSR_STATUS_AUX_SENDING), 0, 15); -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org 
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
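The difference between the open-coded retry loop in the patch and the masked wait Chris points to can be sketched in user space as below. This is only an illustration under stated assumptions: `struct fake_reg`, `fake_reg_read()` and `wait_for_masked_value()` are hypothetical names invented here, not i915 functions, and the bound is a poll count rather than a time in milliseconds. The point it shows is that a bounded wait should report whether the bits actually cleared, so the caller can tell a completed PSR exit from a timeout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a hardware status register (not i915 code). */
struct fake_reg {
	uint32_t value;          /* current register contents */
	unsigned int busy_reads; /* reads left before the busy bits clear */
	uint32_t busy_bits;      /* bits the simulated hardware clears when idle */
};

/* Simulated register read: the busy bits drop once busy_reads reaches 0. */
static uint32_t fake_reg_read(struct fake_reg *reg)
{
	if (reg->busy_reads == 0)
		reg->value &= ~reg->busy_bits;
	else
		reg->busy_reads--;
	return reg->value;
}

/*
 * Poll until (register & mask) == expected, giving up after max_polls reads.
 * Returns true if the condition was met, false on timeout, so the caller
 * can decide what to do when the hardware never settles.
 */
static bool wait_for_masked_value(struct fake_reg *reg, uint32_t mask,
				  uint32_t expected, unsigned int max_polls)
{
	while (max_polls--) {
		if ((fake_reg_read(reg) & mask) == expected)
			return true;
		/* A real driver would sleep or back off here between reads. */
	}
	return false;
}
```

Folding the four status flags into one mask and comparing against zero replaces the chain of `||` tests, and the boolean result gives the caller a place to handle the case where PSR never exits.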
Re: [Intel-gfx] [PATCH v2 2/2] drm/i915: Prevent the system suspend complete optimization
Thanks, Imre I have tested this and I confirm that it solves the pm_runtime_get_sync() failed: -13 and the issues that follow after. This is also the root-cause in freedesktop bug 100770, which will be solved by your patch. BR, Marta > -Original Message- > From: Deak, Imre > Sent: Tuesday, April 25, 2017 1:29 PM > To: Lukas Wunner > Cc: intel-gfx@lists.freedesktop.org; Wysocki, Rafael J > ; Lofstedt, Marta ; > David Weinehall ; Sarvela, Tomi P > ; Ville Syrjälä ; > Kuoppala, Mika ; Chris Wilson wilson.co.uk>; Takashi Iwai ; Bjorn Helgaas > ; linux-...@vger.kernel.org > Subject: Re: [PATCH v2 2/2] drm/i915: Prevent the system suspend complete > optimization > > On Mon, Apr 24, 2017 at 10:16:38PM +0200, Lukas Wunner wrote: > > On Mon, Apr 24, 2017 at 05:27:43PM +0300, Imre Deak wrote: > > > Since > > > > > > commit bac2a909a096c9110525c18cbb8ce73c660d5f71 > > > Author: Rafael J. Wysocki > > > Date: Wed Jan 21 02:17:42 2015 +0100 > > > > > > PCI / PM: Avoid resuming PCI devices during system suspend > > > > This is not the commit you are looking for. :-) See below. > > > > > > > PCI devices will default to allowing the system suspend complete > > > optimization where devices are not woken up during system suspend if > > > they were already runtime suspended. This however breaks the > > > i915/HDA drivers for two reasons: > > > > > > - The i915 driver has system suspend specific steps that it needs to > > > run, that bring the device to a different state than its runtime > > > suspended state. > > > > > > - The HDA driver's suspend handler requires power that it will request > > > from the i915 driver's power domain handler. This in turn requires the > > > i915 driver to runtime resume itself, but this won't be possible if the > > > suspend complete optimization is in effect: in this case the i915 > > > runtime PM is disabled and trying to get an RPM reference returns > > > -EACCESS. > > > > Hm, sounds like something that needs to be solved with device_links. 
> > > > > > > > > > Solve this by requiring the PCI/PM core to resume the device during > > > system suspend which in effect disables the suspend complete > optimization. > > > > > > One possibility for this bug staying hidden for so long is that the > > > optimization for a device is disabled if it's disabled for any of > > > its children devices. i915 may have a backlight device as its child > > > which doesn't support runtime PM and so doesn't allow the optimization > either. > > > So if this backlight device got registered the bug stayed hidden. > > > > No, the reason this hasn't popped up earlier is because > > direct_complete has only been enabled for DRM devices for a few months > > now, to be specific since > > > > commit d14d2a8453d650bea32a1c5271af1458cd283a0f > > Author: Lukas Wunner > > Date: Wed Jun 8 12:49:29 2016 +0200 > > > > drm: Remove dev_pm_ops from drm_class > > > > which landed in v4.8. > > Right, this kept the optimization disabled even after bac2a909a096c91. > It did stay disabled on platforms with a backlight driver registered as > described above. > > --Imre > > > > > (Sorry for not raising my voice earlier, this patch appeared on my > > radar just now.) > > > > Kind regards, > > > > Lukas > > > > > > > > Credits to Marta, Tomi and David who enabled pstore logging, that > > > caught one instance of this issue across a suspend/ resume-to-ram > > > and Ville who rememberd that the optimization was enabled for some > > > devices at one point. 
> > > > > > The first WARN triggered by the problem: > > > > > > [ 6250.746445] WARNING: CPU: 2 PID: 17384 at > > > drivers/gpu/drm/i915/intel_runtime_pm.c:2846 > > > intel_runtime_pm_get+0x6b/0xd0 [i915] [ 6250.746448] > > > pm_runtime_get_sync() failed: -13 [ 6250.746451] Modules linked in: > > > snd_hda_intel i915 vgem snd_hda_codec_hdmi x86_pkg_temp_thermal > intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul > snd_hda_codec_realtek snd_hda_codec_generic ghash_clmulni_intel > e1000e snd_hda_codec snd_hwdep snd_hda_core ptp mei_me pps_core > snd_pcm lpc_ich mei prime_ numbers i2c_hid i2c_designware_platform > i2c_designware_core [last unloaded: i915] > > > [ 6250.746512] CPU: 2 PID: 17384 Comm: kworker/u8:0 Tainted: G U W > 4.11.0-rc5-CI-CI_DRM_334+ #1 > > > [ 6250.746515] Hardware name: /NUC5i5RYB, BIOS > RYBDWi35.86A.0362.2017.0118.0940 01/18/2017 > > > [ 6250.746521] Workqueue: events_unbound async_run_entry_fn [ > > > 6250.746525] Call Trace: > > > [ 6250.746530] dump_stack+0x67/0x92 [ 6250.746536] > > > __warn+0xc6/0xe0 [ 6250.746542] ? > > > pci_restore_standard_config+0x40/0x40 > > > [ 6250.746546] warn_slowpath_fmt+0x46/0x50 [ 6250.746553] ? > > > __pm_runtime_resume+0x56/0x80 [ 6250.746584] > > > intel_runtime_pm_get+0x6b/0xd0 [i915] [ 6250.746610] > > > intel_display_power_get+0x1b/0x40 [i915] [ 6250.746646] > > > i915_audio_component_get_power+0x15/0x20 [i915] [ 6250.746654] > > > snd_hdac_d
[Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915: Sanitize engine context sizes
== Series Details ==

Series: series starting with [1/2] drm/i915: Sanitize engine context sizes
URL   : https://patchwork.freedesktop.org/series/23630/
State : failure

== Summary ==

Series 23630v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/23630/revisions/1/mbox/

Test gem_exec_flush:
        Subgroup basic-batch-kernel-default-uc:
                pass       -> FAIL       (fi-snb-2600) fdo#17
Test kms_cursor_legacy:
        Subgroup basic-busy-flip-before-cursor-atomic:
                pass       -> INCOMPLETE (fi-bxt-t5700)
Test kms_flip:
        Subgroup basic-plain-flip:
                pass       -> DMESG-WARN (fi-byt-j1900) fdo#17

fdo#17 https://bugs.freedesktop.org/show_bug.cgi?id=17

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:432s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:425s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:579s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:509s
fi-bxt-t5700     total:199  pass:185  dwarn:0   dfail:0   fail:0   skip:13
fi-byt-j1900     total:278  pass:253  dwarn:1   dfail:0   fail:0   skip:24  time:494s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:486s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:403s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:403s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:421s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:492s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:467s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:453s
fi-kbl-7560u     total:278  pass:267  dwarn:1   dfail:0   fail:0   skip:10  time:569s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:454s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:574s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:457s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:494s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:432s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:527s
fi-snb-2600      total:278  pass:248  dwarn:0   dfail:0   fail:1   skip:29  time:405s

8b5a41bbd270c3a8db6d48bc1d6d6bafb59e6753 drm-tip: 2017y-04m-27d-13h-10m-59s UTC integration manifest
9a0c7d3 drm/i915: Eliminate HAS_HW_CONTEXTS
1125ffa drm/i915: Sanitize engine context sizes

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4567/
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Eliminate HAS_HW_CONTEXTS
On Thu, Apr 27, 2017 at 04:41:33PM +0300, Joonas Lahtinen wrote:
> According to Chris i915_gem_sanitize was meant to reset ILK too.
>
> CCID register existed already on ILK according to the PRM (Chris
> verified the address to match too).
>
> HAS_HW_CONTEXTS in i915_l3_write is bogus because each HAS_L3_DPF
> match also has .has_hw_contexts = 1 set.
>
> This leads to us being able to get rid of the property completely.
>
> Signed-off-by: Joonas Lahtinen
> Cc: Chris Wilson
> Cc: Tvrtko Ursulin
> Cc: Mika Kuoppala

Reviewed-by: Chris Wilson
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Eliminate HAS_HW_CONTEXTS
On Thu, Apr 27, 2017 at 04:41:33PM +0300, Joonas Lahtinen wrote: > According to Chris i915_gem_sanitize was meant to reset ILK too. In that case drawing the line before g4x might make more sense since it already has a GPU reset that doesn't clobber the display. > > CCID register existed already on ILK according to the PRM (Chris > verified the address to match too). I think it has existed since forever actually. Well, not sure about gen0-1. > > HAS_HW_CONTEXTS in i915_l3_write is bogus because each HAS_L3_DPF > match also has .has_hw_contexts = 1 set. > > This leads to us being able to get rid of the property completely. There seem to be several changes in here. Would it not be better to split this up into functional and non-functional patches so that if there's a regression you wouldn't have to revert the entire thing? > > Signed-off-by: Joonas Lahtinen > Cc: Chris Wilson > Cc: Tvrtko Ursulin > Cc: Mika Kuoppala > --- > drivers/gpu/drm/i915/i915_drv.h | 2 -- > drivers/gpu/drm/i915/i915_gem.c | 2 +- > drivers/gpu/drm/i915/i915_gpu_error.c | 6 +++--- > drivers/gpu/drm/i915/i915_pci.c | 5 - > drivers/gpu/drm/i915/i915_sysfs.c | 3 --- > 5 files changed, 4 insertions(+), 14 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index e68edf1..cfa5689 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -822,7 +822,6 @@ struct intel_csr { > func(has_gmch_display); \ > func(has_guc); \ > func(has_hotplug); \ > - func(has_hw_contexts); \ > func(has_l3_dpf); \ > func(has_llc); \ > func(has_logical_ring_contexts); \ > @@ -2866,7 +2865,6 @@ intel_info(const struct drm_i915_private *dev_priv) > > #define HWS_NEEDS_PHYSICAL(dev_priv) ((dev_priv)->info.hws_needs_physical) > > -#define HAS_HW_CONTEXTS(dev_priv)((dev_priv)->info.has_hw_contexts) > #define HAS_LOGICAL_RING_CONTEXTS(dev_priv) \ > ((dev_priv)->info.has_logical_ring_contexts) > #define USES_PPGTT(dev_priv) (i915.enable_ppgtt) > diff 
--git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 33fb11c..7c6048a 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -4488,7 +4488,7 @@ void i915_gem_sanitize(struct drm_i915_private *i915) >* of the reset, so we only reset recent machines with logical >* context support (that must be reset to remove any stray contexts). >*/ > - if (HAS_HW_CONTEXTS(i915)) { > + if (INTEL_GEN(i915) >= 5) { > int reset = intel_gpu_reset(i915, ALL_ENGINES); > WARN_ON(reset && reset != -ENODEV); > } > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c > b/drivers/gpu/drm/i915/i915_gpu_error.c > index 4b247b0..ec526d9 100644 > --- a/drivers/gpu/drm/i915/i915_gpu_error.c > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c > @@ -1598,6 +1598,9 @@ static void i915_capture_reg_state(struct > drm_i915_private *dev_priv, > error->done_reg = I915_READ(DONE_REG); > } > > + if (INTEL_GEN(dev_priv) >= 5) > + error->ccid = I915_READ(CCID); > + > /* 3: Feature specific registers */ > if (IS_GEN6(dev_priv) || IS_GEN7(dev_priv)) { > error->gam_ecochk = I915_READ(GAM_ECOCHK); > @@ -1605,9 +1608,6 @@ static void i915_capture_reg_state(struct > drm_i915_private *dev_priv, > } > > /* 4: Everything else */ > - if (HAS_HW_CONTEXTS(dev_priv)) > - error->ccid = I915_READ(CCID); > - > if (INTEL_GEN(dev_priv) >= 8) { > error->ier = I915_READ(GEN8_DE_MISC_IER); > for (i = 0; i < 4; i++) > diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c > index f87b0c4..f80db2c 100644 > --- a/drivers/gpu/drm/i915/i915_pci.c > +++ b/drivers/gpu/drm/i915/i915_pci.c > @@ -220,7 +220,6 @@ static const struct intel_device_info > intel_ironlake_m_info = { > .has_rc6 = 1, \ > .has_rc6p = 1, \ > .has_gmbus_irq = 1, \ > - .has_hw_contexts = 1, \ > .has_aliasing_ppgtt = 1, \ > GEN_DEFAULT_PIPEOFFSETS, \ > CURSOR_OFFSETS > @@ -245,7 +244,6 @@ static const struct intel_device_info > intel_sandybridge_m_info = { > .has_rc6 = 1, \ > .has_rc6p = 
1, \ > .has_gmbus_irq = 1, \ > - .has_hw_contexts = 1, \ > .has_aliasing_ppgtt = 1, \ > .has_full_ppgtt = 1, \ > GEN_DEFAULT_PIPEOFFSETS, \ > @@ -280,7 +278,6 @@ static const struct intel_device_info > intel_valleyview_info = { > .has_runtime_pm = 1, > .has_rc6 = 1, > .has_gmbus_irq = 1, > - .has_hw_contexts = 1, > .has_gmch_display = 1, > .has_hotplug = 1, > .has_aliasing_ppgtt = 1, > @@ -340,7 +337,6 @@ static const struct intel_device_info > intel_cherryview_info = { > .has_resource_streamer = 1, > .has_rc6 = 1, > .has_gmbus_irq = 1
Re: [Intel-gfx] [PATCH 13/27] drm/i915/execlists: Pack the count into the low bits of the port.request
On Thu, Apr 20, 2017 at 03:58:19PM +0100, Tvrtko Ursulin wrote: > > static void record_context(struct drm_i915_error_context *e, > >diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c > >b/drivers/gpu/drm/i915/i915_guc_submission.c > >index 1642fff9cf13..370373c97b81 100644 > >--- a/drivers/gpu/drm/i915/i915_guc_submission.c > >+++ b/drivers/gpu/drm/i915/i915_guc_submission.c > >@@ -658,7 +658,7 @@ static void nested_enable_signaling(struct > >drm_i915_gem_request *rq) > > static bool i915_guc_dequeue(struct intel_engine_cs *engine) > > { > > struct execlist_port *port = engine->execlist_port; > >-struct drm_i915_gem_request *last = port[0].request; > >+struct drm_i915_gem_request *last = port[0].request_count; > > It's confusing that in this new scheme sometimes we have direct > access to the request and sometimes we have to go through the > port_request macro. > > So maybe we should always use the port_request macro. Hm, could we > invent a new type to help enforce that? Like: > > struct drm_i915_gem_port_request_slot { > struct drm_i915_gem_request *req_count; > }; > > And then execlist port would contain these and helpers would need to > be functions? > > I've also noticed some GVT/GuC patches which sounded like they are > adding the same single submission constraints so maybe now is the > time to unify the dequeue? (Haven't looked at those patches deeper > than the subject line so might be wrong.) > > Not sure 100% of all the above, would need to sketch it. What are > your thoughts? I forsee a use for the count in guc as well, so conversion is ok with me. 
> >diff --git a/drivers/gpu/drm/i915/intel_lrc.c > >b/drivers/gpu/drm/i915/intel_lrc.c > >index 7df278fe492e..69299fbab4f9 100644 > >--- a/drivers/gpu/drm/i915/intel_lrc.c > >+++ b/drivers/gpu/drm/i915/intel_lrc.c > >@@ -342,39 +342,32 @@ static u64 execlists_update_context(struct > >drm_i915_gem_request *rq) > > > > static void execlists_submit_ports(struct intel_engine_cs *engine) > > { > >-struct drm_i915_private *dev_priv = engine->i915; > > struct execlist_port *port = engine->execlist_port; > > u32 __iomem *elsp = > >-dev_priv->regs + i915_mmio_reg_offset(RING_ELSP(engine)); > >-u64 desc[2]; > >- > >-GEM_BUG_ON(port[0].count > 1); > >-if (!port[0].count) > >-execlists_context_status_change(port[0].request, > >-INTEL_CONTEXT_SCHEDULE_IN); > >-desc[0] = execlists_update_context(port[0].request); > >-GEM_DEBUG_EXEC(port[0].context_id = upper_32_bits(desc[0])); > >-port[0].count++; > >- > >-if (port[1].request) { > >-GEM_BUG_ON(port[1].count); > >-execlists_context_status_change(port[1].request, > >-INTEL_CONTEXT_SCHEDULE_IN); > >-desc[1] = execlists_update_context(port[1].request); > >-GEM_DEBUG_EXEC(port[1].context_id = upper_32_bits(desc[1])); > >-port[1].count = 1; > >-} else { > >-desc[1] = 0; > >-} > >-GEM_BUG_ON(desc[0] == desc[1]); > >- > >-/* You must always write both descriptors in the order below. */ > >-writel(upper_32_bits(desc[1]), elsp); > >-writel(lower_32_bits(desc[1]), elsp); > >+engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine)); > >+unsigned int n; > >+ > >+for (n = ARRAY_SIZE(engine->execlist_port); n--; ) { > > We could also add for_each_req_port or something, to iterate and > unpack either req only or the count as well? for_each_port_reverse? We're looking at very special cases here! I'm not sure and I'm playing with different structures. 
> >diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h > >b/drivers/gpu/drm/i915/intel_ringbuffer.h > >index d25b88467e5e..39b733e5cfd3 100644 > >--- a/drivers/gpu/drm/i915/intel_ringbuffer.h > >+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h > >@@ -377,8 +377,12 @@ struct intel_engine_cs { > > /* Execlists */ > > struct tasklet_struct irq_tasklet; > > struct execlist_port { > >-struct drm_i915_gem_request *request; > >-unsigned int count; > >+struct drm_i915_gem_request *request_count; > > Would req(uest)_slot maybe be better? It's definitely a count (of how many times this request has been submitted), and I like long verbose names when I don't want them to be used directly. So expect guc to be tidied. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
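For readers following the request_count discussion, the general trick of keeping a small counter in the unused low bits of an aligned pointer can be sketched as below. This is a user-space illustration, not the code from the patch: `struct request`, `PORT_COUNT_BITS`, `port_pack()`, `port_request()` and `port_count()` are hypothetical names chosen here, and the sketch assumes the pointed-to object is aligned to at least 4 bytes so that two low bits are free.

```c
#include <stdint.h>

/* Two low bits are free when the object is at least 4-byte aligned. */
#define PORT_COUNT_BITS 2u
#define PORT_COUNT_MASK ((uintptr_t)((1u << PORT_COUNT_BITS) - 1))

/* Hypothetical request type; the explicit alignment guarantees free low bits. */
struct request {
	_Alignas(8) uint64_t seqno;
};

/* Combine a request pointer and a small submission count into one word. */
static inline void *port_pack(struct request *rq, unsigned int count)
{
	return (void *)((uintptr_t)rq | ((uintptr_t)count & PORT_COUNT_MASK));
}

/* Recover the request pointer by masking off the count bits. */
static inline struct request *port_request(void *packed)
{
	return (struct request *)((uintptr_t)packed & ~PORT_COUNT_MASK);
}

/* Recover the count stored in the low bits. */
static inline unsigned int port_count(void *packed)
{
	return (unsigned int)((uintptr_t)packed & PORT_COUNT_MASK);
}
```

Because the packed value is no longer a valid pointer, every reader has to go through the unpacking helper. That is the concern raised in the review about mixing direct access with the port_request macro, and a deliberately awkward field name is one way to discourage direct use.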
Re: [Intel-gfx] [PATCH] drm/i915/edp: Read link status after exit PSR
On Thu, Apr 27, 2017 at 10:35:22PM +0800, Lee, Shawn C wrote: > From: "Lee, Shawn C" > > Display driver read DPCD register 0x202, 0x203 and 0x204 to identify > eDP sink status. If PSR exit is ongoing at eDP sink, and eDP source > read these registers at the same time. Panel will report EQ & symbol > lock not done. It will cause panel display flicking. > So driver have to make sure PSR already exit before read link status. And what exactly guarantees that it will exit PSR eventually? > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99639 > TEST=Reboot DUT and no flicking on local display at login screen > > Cc: Cooper Chiou > Cc: Jani Nikula > Cc: Rodrigo Vivi > Cc: Jim Bride > Cc: Ryan Lin > > Signed-off-by: Shawn Lee > --- > drivers/gpu/drm/i915/intel_dp.c | 34 +- > 1 file changed, 29 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c > index 08834f74d396..cc431337b2dc 100644 > --- a/drivers/gpu/drm/i915/intel_dp.c > +++ b/drivers/gpu/drm/i915/intel_dp.c > @@ -4252,19 +4252,35 @@ static void intel_dp_handle_test_request(struct > intel_dp *intel_dp) > } > > static void > +intel_edp_wait_PSR_exit(struct intel_dp *intel_dp) > +{ > + struct drm_device *dev = intel_dp_to_dev(intel_dp); > + struct drm_i915_private *dev_priv = dev->dev_private; > + u32 srd_status, count = 100; > + > + while (count--) { > + srd_status = I915_READ(EDP_PSR_STATUS_CTL); > + > + if ((srd_status & EDP_PSR_STATUS_SENDING_TP1) || > + (srd_status & EDP_PSR_STATUS_SENDING_TP2_TP3) || > + (srd_status & EDP_PSR_STATUS_SENDING_IDLE) || > + (srd_status & EDP_PSR_STATUS_AUX_SENDING)) { > + usleep_range(100, 150); > + } else > + return; > + } > +} > + > +static void > intel_dp_check_link_status(struct intel_dp *intel_dp) > { > struct intel_encoder *intel_encoder = &dp_to_dig_port(intel_dp)->base; > struct drm_device *dev = intel_dp_to_dev(intel_dp); > + struct drm_i915_private *dev_priv = dev->dev_private; > u8 
link_status[DP_LINK_STATUS_SIZE]; > > WARN_ON(!drm_modeset_is_locked(&dev->mode_config.connection_mutex)); > > - if (!intel_dp_get_link_status(intel_dp, link_status)) { > - DRM_ERROR("Failed to get link status\n"); > - return; > - } > - > if (!intel_encoder->base.crtc) > return; > > @@ -4278,6 +4294,14 @@ static void intel_dp_handle_test_request(struct > intel_dp *intel_dp) > if (!intel_dp_link_params_valid(intel_dp)) > return; > > + if (is_edp(intel_dp) && dev_priv->psr.enabled) > + intel_edp_wait_PSR_exit(intel_dp); > + > + if (!intel_dp_get_link_status(intel_dp, link_status)) { > + DRM_ERROR("Failed to get link status\n"); > + return; > + } > + > /* Retrain if Channel EQ or CR not ok */ > if (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count)) { > DRM_DEBUG_KMS("%s: channel EQ not ok, retraining\n", > -- > 1.7.9.5 > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Ville Syrjälä Intel OTC ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/edp: Read link status after exit PSR
== Series Details ==

Series: drm/i915/edp: Read link status after exit PSR
URL   : https://patchwork.freedesktop.org/series/23631/
State : success

== Summary ==

Series 23631v1 drm/i915/edp: Read link status after exit PSR
https://patchwork.freedesktop.org/api/1.0/series/23631/revisions/1/mbox/

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:432s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:424s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:581s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:505s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:550s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:485s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:482s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:405s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:405s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:411s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:482s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:472s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:459s
fi-kbl-7560u     total:278  pass:267  dwarn:1   dfail:0   fail:0   skip:10  time:564s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:456s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:578s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:458s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:490s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:431s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:535s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:395s

8b5a41bbd270c3a8db6d48bc1d6d6bafb59e6753 drm-tip: 2017y-04m-27d-13h-10m-59s UTC integration manifest
578c26d drm/i915/edp: Read link status after exit PSR

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4568/
[Intel-gfx] [PATCH 1/2] drm/i915: Mark CPU cache as dirty on every transition for CPU writes
Currently, we only mark the CPU cache as dirty if we skip a clflush. This leads to some confusion where we have to ask if the object is in the write domain or missed a clflush. If we always mark the cache as dirty, this becomes a much simply question to answer. The goal remains to do as few clflushes as required and to do them as late as possible, in the hope of deferring the work to a kthread and not block the caller (e.g. execbuf, flips). v2: Always call clflush before GPU execution when the cache_dirty flag is set. This may cause some extra work on llc systems that migrate dirty buffers back and forth - but we do try to limit that by only setting cache_dirty at the end of the gpu sequence. Reported-by: Dongwon Kim Fixes: a6a7cc4b7db6 ("drm/i915: Always flush the dirty CPU cache when pinning the scanout") Signed-off-by: Chris Wilson Cc: Dongwon Kim Cc: Matt Roper --- drivers/gpu/drm/i915/i915_gem.c | 78 +++- drivers/gpu/drm/i915/i915_gem_clflush.c | 15 +++-- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 21 +++ drivers/gpu/drm/i915/i915_gem_internal.c | 3 +- drivers/gpu/drm/i915/i915_gem_userptr.c | 5 +- drivers/gpu/drm/i915/selftests/huge_gem_object.c | 3 +- 6 files changed, 70 insertions(+), 55 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 33fb11cc5acc..488ca7733c1e 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -49,7 +49,7 @@ static void i915_gem_flush_free_objects(struct drm_i915_private *i915); static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj) { - if (obj->base.write_domain == I915_GEM_DOMAIN_CPU) + if (obj->cache_dirty) return false; if (!i915_gem_object_is_coherent(obj)) @@ -233,6 +233,14 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj) return st; } +static void __start_cpu_write(struct drm_i915_gem_object *obj) +{ + obj->base.read_domains = I915_GEM_DOMAIN_CPU; + obj->base.write_domain = I915_GEM_DOMAIN_CPU; + if 
(cpu_write_needs_clflush(obj)) + obj->cache_dirty = true; +} + static void __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages, @@ -248,8 +256,7 @@ __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj, !i915_gem_object_is_coherent(obj)) drm_clflush_sg(pages); - obj->base.read_domains = I915_GEM_DOMAIN_CPU; - obj->base.write_domain = I915_GEM_DOMAIN_CPU; + __start_cpu_write(obj); } static void @@ -684,6 +691,12 @@ i915_gem_dumb_create(struct drm_file *file, args->size, &args->handle); } +static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj) +{ + return !(obj->cache_level == I915_CACHE_NONE || +obj->cache_level == I915_CACHE_WT); +} + /** * Creates a new mm object and returns a handle to it. * @dev: drm device pointer @@ -753,6 +766,11 @@ flush_write_domain(struct drm_i915_gem_object *obj, unsigned int flush_domains) case I915_GEM_DOMAIN_CPU: i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC); break; + + case I915_GEM_DOMAIN_RENDER: + if (gpu_write_needs_clflush(obj)) + obj->cache_dirty = true; + break; } obj->base.write_domain = 0; @@ -854,7 +872,8 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj, * optimizes for the case when the gpu will dirty the data * anyway again before the next pread happens. */ - if (!(obj->base.read_domains & I915_GEM_DOMAIN_CPU)) + if (!obj->cache_dirty && + !(obj->base.read_domains & I915_GEM_DOMAIN_CPU)) *needs_clflush = CLFLUSH_BEFORE; out: @@ -906,14 +925,15 @@ int i915_gem_obj_prepare_shmem_write(struct drm_i915_gem_object *obj, * This optimizes for the case when the gpu will use the data * right away and we therefore have to clflush anyway. */ - if (obj->base.write_domain != I915_GEM_DOMAIN_CPU) + if (!obj->cache_dirty) { *needs_clflush |= CLFLUSH_AFTER; - /* Same trick applies to invalidate partially written cachelines read -* before writing. 
-*/ - if (!(obj->base.read_domains & I915_GEM_DOMAIN_CPU)) - *needs_clflush |= CLFLUSH_BEFORE; + /* Same trick applies to invalidate partially written +* cachelines read before writing. +*/ + if (!(obj->base.read_domains & I915_GEM_DOMAIN_CPU)) + *needs_clflush |= CLFLUSH_BEFORE; + } out: intel_fb_obj_invalidate(obj, ORIGIN_CPU); @@ -3374,10 +3394,12 @@ int i915_gem_wait_for_idle(struct drm_i915_private *i915, unsign
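The bookkeeping described in the commit message above (mark the cache dirty on every transition to CPU writes, then ask one question later) can be modelled in user space roughly as follows. This is a simplified sketch, not the driver code: `struct gem_object_model` and `DOMAIN_CPU` are hypothetical, the `coherent` field stands in for whatever i915_gem_object_is_coherent() would report, and no actual clflush is issued.

```c
#include <stdbool.h>

#define DOMAIN_CPU 0x1u /* hypothetical stand-in for I915_GEM_DOMAIN_CPU */

/* Minimal model of the per-object state the patch reasons about. */
struct gem_object_model {
	unsigned int read_domains;
	unsigned int write_domain;
	bool cache_dirty; /* CPU cache may hold data the GPU/display cannot see */
	bool coherent;    /* stand-in for i915_gem_object_is_coherent() */
	bool pin_display; /* object is being scanned out */
};

/* Does a CPU write to this object require a later clflush? */
static bool cpu_write_needs_clflush(const struct gem_object_model *obj)
{
	if (obj->cache_dirty)
		return false; /* already tracked as dirty, nothing new to record */
	if (!obj->coherent)
		return true;  /* non-coherent objects always need flushing */
	return obj->pin_display; /* scanout bypasses the CPU cache */
}

/* Move the object to the CPU write domain and record dirtiness up front. */
static void start_cpu_write(struct gem_object_model *obj)
{
	obj->read_domains = DOMAIN_CPU;
	obj->write_domain = DOMAIN_CPU;
	if (cpu_write_needs_clflush(obj))
		obj->cache_dirty = true;
}
```

With the flag set at the start of the write, later code only has to test `cache_dirty` instead of combining the write domain with a "missed clflush" check.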
[Intel-gfx] [PATCH 2/2] drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirty
For ease of use (i.e. avoiding a few checks and function calls), store the object's cache coherency next to the cache is dirty bit. Signed-off-by: Chris Wilson Cc: Dongwon Kim Cc: Matt Roper --- drivers/gpu/drm/i915/i915_gem.c | 14 +++--- drivers/gpu/drm/i915/i915_gem_clflush.c | 2 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 +- drivers/gpu/drm/i915/i915_gem_internal.c | 3 ++- drivers/gpu/drm/i915/i915_gem_object.h | 1 + drivers/gpu/drm/i915/i915_gem_stolen.c | 1 + drivers/gpu/drm/i915/i915_gem_userptr.c | 3 ++- drivers/gpu/drm/i915/selftests/huge_gem_object.c | 3 ++- 8 files changed, 17 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 488ca7733c1e..56f70fd3c345 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -52,7 +52,7 @@ static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj) if (obj->cache_dirty) return false; - if (!i915_gem_object_is_coherent(obj)) + if (!obj->cache_coherent) return true; return obj->pin_display; @@ -253,7 +253,7 @@ __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj, if (needs_clflush && (obj->base.read_domains & I915_GEM_DOMAIN_CPU) == 0 && - !i915_gem_object_is_coherent(obj)) + !obj->cache_coherent) drm_clflush_sg(pages); __start_cpu_write(obj); @@ -856,8 +856,7 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj, if (ret) return ret; - if (i915_gem_object_is_coherent(obj) || - !static_cpu_has(X86_FEATURE_CLFLUSH)) { + if (obj->cache_coherent || !static_cpu_has(X86_FEATURE_CLFLUSH)) { ret = i915_gem_object_set_to_cpu_domain(obj, false); if (ret) goto err_unpin; @@ -909,8 +908,7 @@ int i915_gem_obj_prepare_shmem_write(struct drm_i915_gem_object *obj, if (ret) return ret; - if (i915_gem_object_is_coherent(obj) || - !static_cpu_has(X86_FEATURE_CLFLUSH)) { + if (obj->cache_coherent || !static_cpu_has(X86_FEATURE_CLFLUSH)) { ret = i915_gem_object_set_to_cpu_domain(obj, true); if (ret) goto 
err_unpin; @@ -3664,6 +3662,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, list_for_each_entry(vma, &obj->vma_list, obj_link) vma->node.color = cache_level; obj->cache_level = cache_level; + obj->cache_coherent = i915_gem_object_is_coherent(obj); if (obj->base.write_domain & I915_GEM_DOMAIN_CPU && cpu_write_needs_clflush(obj)) @@ -4326,7 +4325,8 @@ i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size) } else obj->cache_level = I915_CACHE_NONE; - obj->cache_dirty = !i915_gem_object_is_coherent(obj); + obj->cache_coherent = i915_gem_object_is_coherent(obj); + obj->cache_dirty = !obj->cache_coherent; trace_i915_gem_object_create(obj); diff --git a/drivers/gpu/drm/i915/i915_gem_clflush.c b/drivers/gpu/drm/i915/i915_gem_clflush.c index a895643c4dc4..c4190b04f7f0 100644 --- a/drivers/gpu/drm/i915/i915_gem_clflush.c +++ b/drivers/gpu/drm/i915/i915_gem_clflush.c @@ -140,7 +140,7 @@ void i915_gem_clflush_object(struct drm_i915_gem_object *obj, * snooping behaviour occurs naturally as the result of our domain * tracking. 
*/ - if (!(flags & I915_CLFLUSH_FORCE) && i915_gem_object_is_coherent(obj)) + if (!(flags & I915_CLFLUSH_FORCE) && obj->cache_coherent) return; trace_i915_gem_object_clflush(obj); diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 0b8ae0f56675..6e77003d7f0f 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1129,7 +1129,7 @@ i915_gem_execbuffer_move_to_gpu(struct drm_i915_gem_request *req, if (vma->exec_entry->flags & EXEC_OBJECT_ASYNC) continue; - if (obj->cache_dirty) + if (obj->cache_dirty & !obj->cache_coherent) i915_gem_clflush_object(obj, 0); ret = i915_gem_request_await_object diff --git a/drivers/gpu/drm/i915/i915_gem_internal.c b/drivers/gpu/drm/i915/i915_gem_internal.c index 58e93e87d573..568bf83af1f5 100644 --- a/drivers/gpu/drm/i915/i915_gem_internal.c +++ b/drivers/gpu/drm/i915/i915_gem_internal.c @@ -191,7 +191,8 @@ i915_gem_object_create_internal(struct drm_i915_private *i915, obj->base.read_domains = I915_GEM_DOMAIN_CPU; obj->base.write_domain = I915_GEM_DOMAIN_CPU; obj->cache_level = HAS_LLC(i915) ? I915_CACHE_LLC : I915_
Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9
On Wed, Apr 26, 2017 at 06:00:41PM +0300, David Weinehall wrote:
> Add a bunch of MOCS entries for gen 9 that were missing from intel_mocs.
> Some of these are used by media-sdk; if these entries are missing
> the default will instead be to do everything uncached.
>
> This patch improves media-sdk performance with up to 60%
> with the (admittedly synthetic) benchmarks we use in our nightly
> testing, without regressing any other benchmarks.

Hey David,

I am testing some of the extended MOCS with Mesa and the differences I
see fit within the margin of statistical error.

Odd, I thought, so to make sure I haven't messed up anything in the
process of compiling, setting LD_LIBRARY_PATH and benchmarking I turned
everything to UNCACHED - and I saw a severe performance drop.

So here is the question it raised:

Have you used the "closest neighbour" from the entries available or did
you default to the UNCACHED ones? That could be the culprit.

Note: I have tested MOCS for VB and Render Target only, and only in a
few synthetic cases - it will require much more fine-tuning and
benchmarking before any final conclusions.

-- 
Cheers,
Arek
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Mark CPU cache as dirty on every transition for CPU writes
== Series Details ==

Series: series starting with [1/2] drm/i915: Mark CPU cache as dirty on every transition for CPU writes
URL   : https://patchwork.freedesktop.org/series/23634/
State : success

== Summary ==

Series 23634v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/23634/revisions/1/mbox/

Test drv_module_reload:
        Subgroup basic-reload-inject:
                pass       -> INCOMPLETE (fi-bdw-5557u) fdo#100750

fdo#100750 https://bugs.freedesktop.org/show_bug.cgi?id=100750

fi-bdw-5557u     total:276  pass:265  dwarn:0   dfail:0   fail:0   skip:10
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:425s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:503s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:554s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:483s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:477s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:406s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:406s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:416s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:497s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:472s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:453s
fi-kbl-7560u     total:278  pass:267  dwarn:1   dfail:0   fail:0   skip:10  time:567s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:455s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:572s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:458s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:485s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:437s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:531s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:401s
fi-bsw-n3050 failed to collect. IGT log at Patchwork_4569/fi-bsw-n3050/igt.log

8b5a41bbd270c3a8db6d48bc1d6d6bafb59e6753 drm-tip: 2017y-04m-27d-13h-10m-59s UTC integration manifest
5bb12ef drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirty
1b954e1 drm/i915: Mark CPU cache as dirty on every transition for CPU writes

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4569/
Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9
On Thu, Apr 27, 2017 at 04:55:20PM +0200, Arkadiusz Hiler wrote:
> On Wed, Apr 26, 2017 at 06:00:41PM +0300, David Weinehall wrote:
> > Add a bunch of MOCS entries for gen 9 that were missing from intel_mocs.
> > Some of these are used by media-sdk; if these entries are missing
> > the default will instead be to do everything uncached.
> >
> > This patch improves media-sdk performance with up to 60%
> > with the (admittedly synthetic) benchmarks we use in our nightly
> > testing, without regressing any other benchmarks.
>
> Hey David,
>
> I am testing some of the extended MOCS with Mesa and the differences I
> see fit within the margin of statistical error.
>
> Odd, I thought, so to make sure I haven't messed up anything in the
> process of compiling, setting LD_LIBRARY_PATH and benchmarking I turned
> everything to UNCACHED - and I saw a severe performance drop.
>
> So here is the question it raised:
>
> Have you used the "closest neighbour" from the entries available or did
> you default to the UNCACHED ones? That could be the culprit.
>
> Note: I have tested MOCS for VB and Render Target only, and only in a
> few synthetic cases - it will require much more fine-tuning and
> benchmarking before any final conclusions.

As I mentioned in the commit message, the improvements only manifest
themselves for media-sdk workloads (and presumably other workloads that
use the same hardware); if you see any performance regressions with
these additional entries I'd be interested to know.


Kind regards, David Weinehall
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Eliminate HAS_HW_CONTEXTS
On Thu, Apr 27, 2017 at 05:36:55PM +0300, Ville Syrjälä wrote:
> On Thu, Apr 27, 2017 at 04:41:33PM +0300, Joonas Lahtinen wrote:
> > According to Chris i915_gem_sanitize was meant to reset ILK too.
>
> In that case drawing the line before g4x might make more sense
> since it already has a GPU reset that doesn't clobber the display.

The initial reasoning for the cutoff was anything that used contexts for
real. We do want to extend it to everything that we can reliably reset.
One step at a time.

> >
> > CCID register existed already on ILK according to the PRM (Chris
> > verified the address to match too).
>
> I think it has existed since forever actually. Well, not sure about
> gen0-1.

Hmm, didn't realise that (or completely forgot). Logical contexts exist
for gen2/3 as well. Enabling for g4x+ might be fun.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions
On 27/04/17 12:53 AM, Christoph Hellwig wrote:
> I think you'll need to follow the existing kmap semantics and never
> fail the iomem version either. Otherwise you'll have a special case
> that's almost never used that has a different error path.
>
> Again, wrong way. Suddenly making things fail for your special case
> that normally don't fail is a recipe for bugs.

I don't disagree, but don't these restrictions make the problem
impossible to solve? If there is iomem behind a page in an SGL and
someone tries to map it, we either have to fail or we break iomem
safety, which was your original concern.

Logan
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Eliminate HAS_HW_CONTEXTS
On Thu, Apr 27, 2017 at 04:32:55PM +0100, Chris Wilson wrote:
> On Thu, Apr 27, 2017 at 05:36:55PM +0300, Ville Syrjälä wrote:
> > On Thu, Apr 27, 2017 at 04:41:33PM +0300, Joonas Lahtinen wrote:
> > > According to Chris i915_gem_sanitize was meant to reset ILK too.
> >
> > In that case drawing the line before g4x might make more sense
> > since it already has a GPU reset that doesn't clobber the display.
>
> The initial reasoning for the cutoff was anything that used contexts for
> real. We do want to extend it to everything that we can reliably reset.
> One step at a time.

This patch looked more like three steps to me. The first step could have
been just removing the flag and adjusting the code to check for gen>=6.

> > >
> > > CCID register existed already on ILK according to the PRM (Chris
> > > verified the address to match too).
> >
> > I think it has existed since forever actually. Well, not sure about
> > gen0-1.
>
> Hmm, didn't realise that (or completely forgot). Logical contexts exist
> for gen2/3 as well. Enabling for g4x+ might be fun.

Not sure it's actually functional there. The docs seem to indicate that
there's some linkage between contexts and run lists.

-- 
Ville Syrjälä
Intel OTC
Re: [Intel-gfx] [PATCH v2 07/21] crypto: shash, caam: Make use of the new sg_map helper function
On 26/04/17 09:56 PM, Herbert Xu wrote:
> On Tue, Apr 25, 2017 at 12:20:54PM -0600, Logan Gunthorpe wrote:
>> Very straightforward conversion to the new function in the caam driver
>> and shash library.
>>
>> Signed-off-by: Logan Gunthorpe
>> Cc: Herbert Xu
>> Cc: "David S. Miller"
>> ---
>>  crypto/shash.c                | 9 ++---
>>  drivers/crypto/caam/caamalg.c | 8 +++-
>>  2 files changed, 9 insertions(+), 8 deletions(-)
>>
>> diff --git a/crypto/shash.c b/crypto/shash.c
>> index 5e31c8d..5914881 100644
>> --- a/crypto/shash.c
>> +++ b/crypto/shash.c
>> @@ -283,10 +283,13 @@ int shash_ahash_digest(struct ahash_request *req, struct shash_desc *desc)
>>  	if (nbytes < min(sg->length, ((unsigned int)(PAGE_SIZE)) - offset)) {
>>  		void *data;
>>
>> -		data = kmap_atomic(sg_page(sg));
>> -		err = crypto_shash_digest(desc, data + offset, nbytes,
>> +		data = sg_map(sg, 0, SG_KMAP_ATOMIC);
>> +		if (IS_ERR(data))
>> +			return PTR_ERR(data);
>> +
>> +		err = crypto_shash_digest(desc, data, nbytes,
>>  					  req->result);
>> -		kunmap_atomic(data);
>> +		sg_unmap(sg, data, 0, SG_KMAP_ATOMIC);
>>  		crypto_yield(desc->flags);
>>  	} else
>>  		err = crypto_shash_init(desc) ?:
>
> Nack. This is an optimisation for the special case of a single
> SG list entry. In fact in the common case the kmap_atomic should
> disappear altogether in the no-highmem case. So replacing it
> with sg_map is not acceptable.

What you seem to have missed is that sg_map is just a thin wrapper
around kmap_atomic, perhaps with a future check for a mappable page.
This change should have zero impact on performance.

Logan
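The quoted hunk relies on the kernel's error-pointer convention (IS_ERR() and PTR_ERR()) to let a mapping call fail without a separate status argument. As a rough user-space sketch of that convention only: the macros below imitate the kernel helpers, while fake_sg, fake_sg_map() and the choice of -EINVAL are hypothetical stand-ins and are not taken from the sg_map series.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* User-space imitation of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(intptr_t error)
{
	return (void *)error;
}

static inline intptr_t PTR_ERR(const void *ptr)
{
	return (intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errnos. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for a scatterlist entry. */
struct fake_sg {
	unsigned char *buf;
	int is_iomem;	/* assumption: such pages cannot be mapped */
};

/* Hypothetical stand-in for sg_map(): fails for unmappable memory. */
static void *fake_sg_map(struct fake_sg *sg, size_t offset)
{
	if (sg->is_iomem)
		return ERR_PTR(-EINVAL);
	return sg->buf + offset;
}

/* Caller pattern from the hunk: check IS_ERR() before touching data. */
static long first_byte(struct fake_sg *sg)
{
	unsigned char *data = fake_sg_map(sg, 0);

	if (IS_ERR(data))
		return PTR_ERR(data);
	return data[0];
}
```

With this shape, the only cost the caller pays over a bare kmap_atomic() is one pointer comparison, which is the trade-off being argued about in the thread.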
Re: [Intel-gfx] [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions
On 27/04/17 09:27 AM, Jason Gunthorpe wrote:
> On Thu, Apr 27, 2017 at 08:53:38AM +0200, Christoph Hellwig wrote:
> How about first switching as many call sites as possible to use
> sg_copy_X_buffer instead of kmap?

Yeah, I could look at doing that first. One problem is that we might get
more Nacks like Herbert Xu's from people concerned about the performance
implications. These are definitely more invasive changes than thin
wrappers around kmap calls.

> A random audit of Logan's series suggests this is actually a fairly
> common thing.

It's not _that_ common, but it is a significant fraction. One of my
patches actually did this in two places that seemed to be
reimplementing the sg_copy_X_buffer logic.

Thanks,

Logan
[Intel-gfx] [PATCH 02/11] ALSA: x86: Clear the pdata.notify_lpe_audio pointer before teardown
From: Ville Syrjälä

Clear the notify function pointer in the platform data before we tear
down the driver. Otherwise i915 would end up calling a stale function
pointer and possibly explode.

Cc: Takashi Iwai
Cc: Pierre-Louis Bossart
Signed-off-by: Ville Syrjälä
---
 sound/x86/intel_hdmi_audio.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c
index bfac6f21ae5e..5b89662493c9 100644
--- a/sound/x86/intel_hdmi_audio.c
+++ b/sound/x86/intel_hdmi_audio.c
@@ -1665,6 +1665,11 @@ static int __maybe_unused hdmi_lpe_audio_resume(struct device *dev)
 static void hdmi_lpe_audio_free(struct snd_card *card)
 {
 	struct snd_intelhad *ctx = card->private_data;
+	struct intel_hdmi_lpe_audio_pdata *pdata = ctx->dev->platform_data;
+
+	spin_lock_irq(&pdata->lpe_audio_slock);
+	pdata->notify_audio_lpe = NULL;
+	spin_unlock_irq(&pdata->lpe_audio_slock);
 
 	cancel_work_sync(&ctx->hdmi_audio_wq);
 
-- 
2.10.2
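The pattern in this patch (clear a shared callback pointer under the same lock the caller takes before invoking it) can be sketched outside the kernel. The following is a user-space analogue only, using a pthread mutex where the driver uses a spinlock; struct shared_pdata and the helper names are hypothetical and do not appear in the driver.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical analogue of the platform data shared by both drivers. */
struct shared_pdata {
	pthread_mutex_t lock;
	void (*notify)(void *ctx);	/* NULL when no consumer is bound */
	void *ctx;
};

static int notify_calls;

/* Example consumer callback: just counts invocations. */
static void count_notify(void *ctx)
{
	(void)ctx;
	notify_calls++;
}

/* Producer side (the i915 role): only call the hook if it is still set. */
static void pdata_notify(struct shared_pdata *p)
{
	pthread_mutex_lock(&p->lock);
	if (p->notify)
		p->notify(p->ctx);
	pthread_mutex_unlock(&p->lock);
}

/*
 * Consumer teardown (the audio driver role): clear the hook under the
 * lock first, so the producer can never observe a stale pointer once
 * the rest of the teardown starts freeing things.
 */
static void pdata_unregister(struct shared_pdata *p)
{
	pthread_mutex_lock(&p->lock);
	p->notify = NULL;
	pthread_mutex_unlock(&p->lock);
}
```

Because both sides take the same lock, a notification is either delivered in full before the pointer is cleared or not delivered at all.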
[Intel-gfx] [PATCH v2 00/11] drm/i915: LPE audio runtime PM and multipipe (v2)
From: Ville Syrjälä

Okay, here's the second attempt at getting multiple pipes playing back
audio on the VLV/CHV HDMI LPE audio device. The main change from v1 is
that now the PCM devices are associated with ports instead of pipes,
so the audio from one device always gets output on the same display.

I've also tacked on the alsa-lib conf update. No clue whether it's
really correct or not (the config language isn't a close friend of mine).

BTW I did notice that with LPE audio all the controls say iface=PCM,
whereas on HDA a bunch of them say iface=MIXER. No idea if that's OK
or not, just something I spotted when I was comparing the results with
HDA.

Entire series available here:
git://github.com/vsyrjala/linux.git lpe_audio_multipipe_2

Cc: Takashi Iwai
Cc: Pierre-Louis Bossart

Ville Syrjälä (11):
  drm/i915: Fix runtime PM for LPE audio
  ALSA: x86: Clear the pdata.notify_lpe_audio pointer before teardown
  drm/i915: Stop pretending to mask/unmask LPE audio interrupts
  drm/i915: Remove the unused pending_notify from LPE platform data
  drm/i915: Replace tmds_clock_speed and link_rate with just ls_clock
  drm/i915: Remove hdmi_connected from LPE audio pdata
  drm/i915: Reorganize intel_lpe_audio_notify() arguments
  drm/i915: Clean up the LPE audio platform data
  ALSA: x86: Prepare LPE audio ctls for multiple PCMs
  ALSA: x86: Split snd_intelhad into card and PCM specific structures
  ALSA: x86: Register multiple PCM devices for the LPE audio card

 drivers/gpu/drm/i915/i915_drv.h        |   4 +-
 drivers/gpu/drm/i915/i915_irq.c        |  15 +-
 drivers/gpu/drm/i915/intel_audio.c     |  19 +-
 drivers/gpu/drm/i915/intel_lpe_audio.c |  99 --
 include/drm/intel_lpe_audio.h          |  22 +--
 sound/x86/intel_hdmi_audio.c           | 328 -
 sound/x86/intel_hdmi_audio.h           |  20 +-
 7 files changed, 271 insertions(+), 236 deletions(-)

-- 
2.10.2
[Intel-gfx] [PATCH v2 04/11] drm/i915: Remove the unused pending_notify from LPE platform data
From: Ville Syrjälä

The pending_notify flag in the LPE audio platform data is pointless,
actually unused. So let's kill it off.

v2: Fix typo in patch subject

Cc: Takashi Iwai
Cc: Pierre-Louis Bossart
Signed-off-by: Ville Syrjälä
---
 drivers/gpu/drm/i915/intel_lpe_audio.c | 2 --
 include/drm/intel_lpe_audio.h          | 1 -
 sound/x86/intel_hdmi_audio.c           | 1 -
 3 files changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lpe_audio.c b/drivers/gpu/drm/i915/intel_lpe_audio.c
index 292fedf30b00..79b9dca985ff 100644
--- a/drivers/gpu/drm/i915/intel_lpe_audio.c
+++ b/drivers/gpu/drm/i915/intel_lpe_audio.c
@@ -361,8 +361,6 @@ void intel_lpe_audio_notify(struct drm_i915_private *dev_priv,
 
 	if (pdata->notify_audio_lpe)
 		pdata->notify_audio_lpe(dev_priv->lpe_audio.platdev);
-	else
-		pdata->notify_pending = true;
 
 	spin_unlock_irqrestore(&pdata->lpe_audio_slock, irq_flags);
 
diff --git a/include/drm/intel_lpe_audio.h b/include/drm/intel_lpe_audio.h
index e9892b4c3af1..c201d39cdfea 100644
--- a/include/drm/intel_lpe_audio.h
+++ b/include/drm/intel_lpe_audio.h
@@ -38,7 +38,6 @@ struct intel_hdmi_lpe_audio_eld {
 };
 
 struct intel_hdmi_lpe_audio_pdata {
-	bool notify_pending;
 	int tmds_clock_speed;
 	bool hdmi_connected;
 	bool dp_output;
diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c
index 5b89662493c9..cbba4a78afb5 100644
--- a/sound/x86/intel_hdmi_audio.c
+++ b/sound/x86/intel_hdmi_audio.c
@@ -1811,7 +1811,6 @@ static int hdmi_lpe_audio_probe(struct platform_device *pdev)
 
 	spin_lock_irq(&pdata->lpe_audio_slock);
 	pdata->notify_audio_lpe = notify_audio_lpe;
-	pdata->notify_pending = false;
 	spin_unlock_irq(&pdata->lpe_audio_slock);
 
 	pm_runtime_use_autosuspend(&pdev->dev);
-- 
2.10.2
[Intel-gfx] [PATCH 01/11] drm/i915: Fix runtime PM for LPE audio
From: Ville Syrjälä

Not calling pm_runtime_enable() means that runtime PM can't be
enabled at all via sysfs. So we definitely need to call it
from somewhere.

Calling it from the driver seems like a bad idea because it would
have to be paired with a pm_runtime_disable() at driver unload time,
otherwise the core gets upset. Also if there's no LPE audio driver
loaded then we couldn't runtime suspend i915 either.

So it looks like a better plan is to call it from i915 when we
register the platform device. That seems to match how pci generally
does things. I cargo culted the pm_runtime_forbid() and
pm_runtime_set_active() calls from pci as well.

The exposed runtime PM API is massive and thoroughly misleading, so
I don't actually know if this is how you're supposed to use the API
or not. But it seems to work. I can now runtime suspend i915 again
with or without the LPE audio driver loaded, and reloading the
LPE audio driver also seems to work.

Note that powertop won't auto-tune runtime PM for platform devices,
which is a little annoying. So I'm not sure that leaving runtime
PM in "on" mode by default is the best choice here. But I've left
it like that for now at least.

Also remove the comment about there not being much benefit from
LPE audio runtime PM. Not allowing runtime PM blocks i915 runtime
PM, which will also block s0ix, and that could have a measurable
impact on power consumption.

Cc: Takashi Iwai
Cc: Pierre-Louis Bossart
Fixes: 0b6b524f3915 ("ALSA: x86: Don't enable runtime PM as default")
Signed-off-by: Ville Syrjälä
---
 drivers/gpu/drm/i915/intel_lpe_audio.c | 5 +
 sound/x86/intel_hdmi_audio.c           | 4 
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lpe_audio.c b/drivers/gpu/drm/i915/intel_lpe_audio.c
index 25d8e76489e4..668f00480d97 100644
--- a/drivers/gpu/drm/i915/intel_lpe_audio.c
+++ b/drivers/gpu/drm/i915/intel_lpe_audio.c
@@ -63,6 +63,7 @@
 #include
 #include
 #include
+#include
 
 #include "i915_drv.h"
 #include
@@ -121,6 +122,10 @@ lpe_audio_platdev_create(struct drm_i915_private *dev_priv)
 
 	kfree(rsc);
 
+	pm_runtime_forbid(&platdev->dev);
+	pm_runtime_set_active(&platdev->dev);
+	pm_runtime_enable(&platdev->dev);
+
 	return platdev;
 
 err:
diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c
index c505b019e09c..bfac6f21ae5e 100644
--- a/sound/x86/intel_hdmi_audio.c
+++ b/sound/x86/intel_hdmi_audio.c
@@ -1809,10 +1809,6 @@ static int hdmi_lpe_audio_probe(struct platform_device *pdev)
 	pdata->notify_pending = false;
 	spin_unlock_irq(&pdata->lpe_audio_slock);
 
-	/* runtime PM isn't enabled as default, since it won't save much on
-	 * BYT/CHT devices; user who want the runtime PM should adjust the
-	 * power/ontrol and power/autosuspend_delay_ms sysfs entries instead
-	 */
 	pm_runtime_use_autosuspend(&pdev->dev);
 	pm_runtime_mark_last_busy(&pdev->dev);
 	pm_runtime_set_active(&pdev->dev);
-- 
2.10.2
[Intel-gfx] [PATCH v2 06/11] drm/i915: Remove hdmi_connected from LPE audio pdata
From: Ville Syrjälä

We can determine that the pipe was shut down from pipe<0, so there's
no point in duplicating that information as 'hdmi_connected'.

v2: Use pipe<0 instead of port<0 as we'll want to do per-port
    PCM devices later
    Initialize pipe to -1 to indicate inactive initial state

Cc: Takashi Iwai
Cc: Pierre-Louis Bossart
Signed-off-by: Ville Syrjälä
---
 drivers/gpu/drm/i915/intel_lpe_audio.c | 9 +
 include/drm/intel_lpe_audio.h          | 3 +--
 sound/x86/intel_hdmi_audio.c           | 8 
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lpe_audio.c b/drivers/gpu/drm/i915/intel_lpe_audio.c
index 5a1a37e963f1..7fd95733eff5 100644
--- a/drivers/gpu/drm/i915/intel_lpe_audio.c
+++ b/drivers/gpu/drm/i915/intel_lpe_audio.c
@@ -111,6 +111,7 @@ lpe_audio_platdev_create(struct drm_i915_private *dev_priv)
 	pinfo.size_data = sizeof(*pdata);
 	pinfo.dma_mask = DMA_BIT_MASK(32);
 
+	pdata->pipe = -1;
 	spin_lock_init(&pdata->lpe_audio_slock);
 
 	platdev = platform_device_register_full(&pinfo);
@@ -332,12 +333,12 @@ void intel_lpe_audio_notify(struct drm_i915_private *dev_priv,
 
 	audio_enable = I915_READ(VLV_AUD_PORT_EN_DBG(port));
 
+	pdata->eld.port_id = port;
+
 	if (eld != NULL) {
 		memcpy(pdata->eld.eld_data, eld, HDMI_MAX_ELD_BYTES);
-		pdata->eld.port_id = port;
-		pdata->eld.pipe_id = pipe;
-		pdata->hdmi_connected = true;
+		pdata->pipe = pipe;
 		pdata->ls_clock = ls_clock;
 		pdata->dp_output = dp_output;
 
@@ -348,7 +349,7 @@ void intel_lpe_audio_notify(struct drm_i915_private *dev_priv,
 	} else {
 		memset(pdata->eld.eld_data, 0, HDMI_MAX_ELD_BYTES);
-		pdata->hdmi_connected = false;
+		pdata->pipe = -1;
 		pdata->ls_clock = 0;
 		pdata->dp_output = false;
 
diff --git a/include/drm/intel_lpe_audio.h b/include/drm/intel_lpe_audio.h
index 8bf804ce8905..9a5bdf5ad180 100644
--- a/include/drm/intel_lpe_audio.h
+++ b/include/drm/intel_lpe_audio.h
@@ -33,13 +33,12 @@ struct platform_device;
 
 struct intel_hdmi_lpe_audio_eld {
 	int port_id;
-	int pipe_id;
 	unsigned char eld_data[HDMI_MAX_ELD_BYTES];
 };
 
 struct intel_hdmi_lpe_audio_pdata {
+	int pipe;
 	int ls_clock;
-	bool hdmi_connected;
 	bool dp_output;
 	struct intel_hdmi_lpe_audio_eld eld;
 	void (*notify_audio_lpe)(struct platform_device *pdev);
diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c
index 4eaf5de54f61..1a095189db83 100644
--- a/sound/x86/intel_hdmi_audio.c
+++ b/sound/x86/intel_hdmi_audio.c
@@ -1559,7 +1559,7 @@ static void had_audio_wq(struct work_struct *work)
 
 	pm_runtime_get_sync(ctx->dev);
 	mutex_lock(&ctx->mutex);
-	if (!pdata->hdmi_connected) {
+	if (pdata->pipe < 0) {
 		dev_dbg(ctx->dev, "%s: Event: HAD_NOTIFY_HOT_UNPLUG\n",
 			__func__);
 		memset(ctx->eld, 0, sizeof(ctx->eld)); /* clear the old ELD */
@@ -1568,9 +1568,9 @@ static void had_audio_wq(struct work_struct *work)
 		struct intel_hdmi_lpe_audio_eld *eld = &pdata->eld;
 
 		dev_dbg(ctx->dev, "%s: HAD_NOTIFY_ELD : port = %d, tmds = %d\n",
-		__func__, eld->port_id, pdata->ls_clock);
+			__func__, eld->port_id, pdata->ls_clock);
 
-		switch (eld->pipe_id) {
+		switch (pdata->pipe) {
 		case 0:
 			ctx->had_config_offset = AUDIO_HDMI_CONFIG_A;
 			break;
@@ -1582,7 +1582,7 @@ static void had_audio_wq(struct work_struct *work)
 			break;
 		default:
 			dev_dbg(ctx->dev, "Invalid pipe %d\n",
-				eld->pipe_id);
+				pdata->pipe);
 			break;
 		}
 
-- 
2.10.2
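The idea behind this patch, letting one signed pipe field double as the "connected" flag so the two can never disagree, can be shown in a few lines. This is a minimal sketch under stated assumptions: struct lpe_state, PIPE_NONE and the helper names are hypothetical and are not the driver's actual definitions.

```c
#include <stdbool.h>

/* Hypothetical sentinel: any negative pipe means "no active output". */
#define PIPE_NONE (-1)

/* Hypothetical reduced state: no separate hdmi_connected boolean. */
struct lpe_state {
	int pipe;
	int ls_clock;
};

/* Connection state is derived from the pipe, not stored separately. */
static bool lpe_connected(const struct lpe_state *s)
{
	return s->pipe >= 0;
}

static void lpe_enable(struct lpe_state *s, int pipe, int ls_clock)
{
	s->pipe = pipe;
	s->ls_clock = ls_clock;
}

static void lpe_disable(struct lpe_state *s)
{
	s->pipe = PIPE_NONE;
	s->ls_clock = 0;
}
```

The initial state must also be set to the sentinel (as the patch does with pdata->pipe = -1), because a zero-initialized struct would otherwise look like an active pipe 0.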
[Intel-gfx] [PATCH 05/11] drm/i915: Replace tmds_clock_speed and link_rate with just ls_clock
From: Ville Syrjälä There's no need to distinguish between the DP link rate and HDMI TMDS clock for the purposes of the LPE audio. Both are actually the same thing more or less, which is the link symbol clock. So let's just call the thing ls_clock and simplify the code. Cc: Takashi Iwai Cc: Pierre-Louis Bossart Signed-off-by: Ville Syrjälä --- drivers/gpu/drm/i915/i915_drv.h| 4 ++-- drivers/gpu/drm/i915/intel_audio.c | 19 --- drivers/gpu/drm/i915/intel_lpe_audio.c | 14 ++ include/drm/intel_lpe_audio.h | 3 +-- sound/x86/intel_hdmi_audio.c | 11 --- 5 files changed, 21 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index d1f7c48e4ae3..8bf72220ee07 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3721,8 +3721,8 @@ int intel_lpe_audio_init(struct drm_i915_private *dev_priv); void intel_lpe_audio_teardown(struct drm_i915_private *dev_priv); void intel_lpe_audio_irq_handler(struct drm_i915_private *dev_priv); void intel_lpe_audio_notify(struct drm_i915_private *dev_priv, - void *eld, int port, int pipe, int tmds_clk_speed, - bool dp_output, int link_rate); + void *eld, int port, int pipe, int ls_clock, + bool dp_output); /* intel_i2c.c */ extern int intel_setup_gmbus(struct drm_i915_private *dev_priv); diff --git a/drivers/gpu/drm/i915/intel_audio.c b/drivers/gpu/drm/i915/intel_audio.c index 52c207e81f41..79eeef25321f 100644 --- a/drivers/gpu/drm/i915/intel_audio.c +++ b/drivers/gpu/drm/i915/intel_audio.c @@ -632,20 +632,9 @@ void intel_audio_codec_enable(struct intel_encoder *intel_encoder, (int) port, (int) pipe); } - switch (intel_encoder->type) { - case INTEL_OUTPUT_HDMI: - intel_lpe_audio_notify(dev_priv, connector->eld, port, pipe, - crtc_state->port_clock, - false, 0); - break; - case INTEL_OUTPUT_DP: - intel_lpe_audio_notify(dev_priv, connector->eld, port, pipe, - adjusted_mode->crtc_clock, - true, crtc_state->port_clock); - break; - default: - break; - } + 
intel_lpe_audio_notify(dev_priv, connector->eld, port, pipe, + crtc_state->port_clock, + intel_encoder->type == INTEL_OUTPUT_DP); } /** @@ -680,7 +669,7 @@ void intel_audio_codec_disable(struct intel_encoder *intel_encoder) (int) port, (int) pipe); } - intel_lpe_audio_notify(dev_priv, NULL, port, pipe, 0, false, 0); + intel_lpe_audio_notify(dev_priv, NULL, port, pipe, 0, false); } /** diff --git a/drivers/gpu/drm/i915/intel_lpe_audio.c b/drivers/gpu/drm/i915/intel_lpe_audio.c index 79b9dca985ff..5a1a37e963f1 100644 --- a/drivers/gpu/drm/i915/intel_lpe_audio.c +++ b/drivers/gpu/drm/i915/intel_lpe_audio.c @@ -309,13 +309,14 @@ void intel_lpe_audio_teardown(struct drm_i915_private *dev_priv) * @eld : ELD data * @pipe: pipe id * @port: port id - * @tmds_clk_speed: tmds clock frequency in Hz + * @ls_clock: Link symbol clock in kHz + * @dp_output: Driving a DP output? * * Notify lpe audio driver of eld change. */ void intel_lpe_audio_notify(struct drm_i915_private *dev_priv, - void *eld, int port, int pipe, int tmds_clk_speed, - bool dp_output, int link_rate) + void *eld, int port, int pipe, int ls_clock, + bool dp_output) { unsigned long irq_flags; struct intel_hdmi_lpe_audio_pdata *pdata = NULL; @@ -337,12 +338,8 @@ void intel_lpe_audio_notify(struct drm_i915_private *dev_priv, pdata->eld.port_id = port; pdata->eld.pipe_id = pipe; pdata->hdmi_connected = true; - + pdata->ls_clock = ls_clock; pdata->dp_output = dp_output; - if (tmds_clk_speed) - pdata->tmds_clock_speed = tmds_clk_speed; - if (link_rate) - pdata->link_rate = link_rate; /* Unmute the amp for both DP and HDMI */ I915_WRITE(VLV_AUD_PORT_EN_DBG(port), @@ -352,6 +349,7 @@ void intel_lpe_audio_notify(struct drm_i915_private *dev_priv, memset(pdata->eld.eld_data, 0, HDMI_MAX_ELD_BYTES); pdata->hdmi_connected = false; + pdata->ls_clock = 0; pdata->dp_output = false; /* Mute the amp for both DP and HDMI */ diff --git a/include/drm/intel_lpe_audio.h
[Intel-gfx] [PATCH 09/11] ALSA: x86: Prepare LPE audio ctls for multiple PCMs
From: Ville Syrjälä

In preparation for registering a PCM device for each pipe, link up the
ctl elements with the corresponding PCM device.

Cc: Takashi Iwai
Cc: Pierre-Louis Bossart
Signed-off-by: Ville Syrjälä
---
 sound/x86/intel_hdmi_audio.c | 23 +++
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c
index c2b78621852e..69e10845633a 100644
--- a/sound/x86/intel_hdmi_audio.c
+++ b/sound/x86/intel_hdmi_audio.c
@@ -1609,11 +1609,16 @@ static void had_audio_wq(struct work_struct *work)
 /*
  * Jack interface
  */
-static int had_create_jack(struct snd_intelhad *ctx)
+static int had_create_jack(struct snd_intelhad *ctx,
+			   struct snd_pcm *pcm)
 {
+	char hdmi_str[32];
 	int err;
 
-	err = snd_jack_new(ctx->card, "HDMI/DP", SND_JACK_AVOUT, &ctx->jack,
+	snprintf(hdmi_str, sizeof(hdmi_str),
+		 "HDMI/DP,pcm=%d", pcm->device);
+
+	err = snd_jack_new(ctx->card, hdmi_str, SND_JACK_AVOUT, &ctx->jack,
 			   true, false);
 	if (err < 0)
 		return err;
@@ -1793,7 +1798,17 @@ static int hdmi_lpe_audio_probe(struct platform_device *pdev)
 
 	/* create controls */
 	for (i = 0; i < ARRAY_SIZE(had_controls); i++) {
-		ret = snd_ctl_add(card, snd_ctl_new1(&had_controls[i], ctx));
+		struct snd_kcontrol *kctl;
+
+		kctl = snd_ctl_new1(&had_controls[i], ctx);
+		if (!kctl) {
+			ret = -ENOMEM;
+			goto err;
+		}
+
+		kctl->id.device = pcm->device;
+
+		ret = snd_ctl_add(card, kctl);
 		if (ret < 0)
 			goto err;
 	}
@@ -1805,7 +1820,7 @@ static int hdmi_lpe_audio_probe(struct platform_device *pdev)
 	if (ret < 0)
 		goto err;
 
-	ret = had_create_jack(ctx);
+	ret = had_create_jack(ctx, pcm);
 	if (ret < 0)
 		goto err;
 
-- 
2.10.2
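The jack name built in had_create_jack() uses a plain snprintf() into a 32-byte buffer. A small stand-alone sketch of the same formatting, with an explicit truncation check added for illustration, might look like the following; make_jack_name() is a hypothetical helper and the truncation handling is an assumption, since the patch itself does not check the snprintf() return value.

```c
#include <stdio.h>

/*
 * Build the per-PCM jack name in the same "HDMI/DP,pcm=N" form as the
 * patch. Returns 0 on success, -1 if the name did not fit in buf.
 */
static int make_jack_name(char *buf, size_t len, int pcm_device)
{
	int n = snprintf(buf, len, "HDMI/DP,pcm=%d", pcm_device);

	/* snprintf() reports the length it wanted to write. */
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

With the 32-byte buffer used in the patch there is ample room for any realistic device number, so the check mostly documents the limit.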
[Intel-gfx] [PATCH 07/11] drm/i915: Reorganize intel_lpe_audio_notify() arguments
From: Ville Syrjälä

Shuffle the arguments to intel_lpe_audio_notify() around a bit. Pipe
and port are the most important things, so let's put them first, and
the rest can come in as is. Also constify the eld argument.

Cc: Takashi Iwai
Cc: Pierre-Louis Bossart
Signed-off-by: Ville Syrjälä
---
 drivers/gpu/drm/i915/i915_drv.h        | 4 ++--
 drivers/gpu/drm/i915/intel_audio.c     | 4 ++--
 drivers/gpu/drm/i915/intel_lpe_audio.c | 8 
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8bf72220ee07..9c528209fba7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3721,8 +3721,8 @@ int intel_lpe_audio_init(struct drm_i915_private *dev_priv);
 void intel_lpe_audio_teardown(struct drm_i915_private *dev_priv);
 void intel_lpe_audio_irq_handler(struct drm_i915_private *dev_priv);
 void intel_lpe_audio_notify(struct drm_i915_private *dev_priv,
-			    void *eld, int port, int pipe, int ls_clock,
-			    bool dp_output);
+			    enum pipe pipe, enum port port,
+			    const void *eld, int ls_clock, bool dp_output);
 
 /* intel_i2c.c */
 extern int intel_setup_gmbus(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_audio.c b/drivers/gpu/drm/i915/intel_audio.c
index 79eeef25321f..d805b6e6fe71 100644
--- a/drivers/gpu/drm/i915/intel_audio.c
+++ b/drivers/gpu/drm/i915/intel_audio.c
@@ -632,7 +632,7 @@ void intel_audio_codec_enable(struct intel_encoder *intel_encoder,
 						 (int) port, (int) pipe);
 	}
 
-	intel_lpe_audio_notify(dev_priv, connector->eld, port, pipe,
+	intel_lpe_audio_notify(dev_priv, pipe, port, connector->eld,
 			       crtc_state->port_clock,
 			       intel_encoder->type == INTEL_OUTPUT_DP);
 }
@@ -669,7 +669,7 @@ void intel_audio_codec_disable(struct intel_encoder *intel_encoder)
 						 (int) port, (int) pipe);
 	}
 
-	intel_lpe_audio_notify(dev_priv, NULL, port, pipe, 0, false);
+	intel_lpe_audio_notify(dev_priv, pipe, port, NULL, 0, false);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_lpe_audio.c b/drivers/gpu/drm/i915/intel_lpe_audio.c
index 7fd95733eff5..4c770d037f23 100644
--- a/drivers/gpu/drm/i915/intel_lpe_audio.c
+++ b/drivers/gpu/drm/i915/intel_lpe_audio.c
@@ -307,17 +307,17 @@ void intel_lpe_audio_teardown(struct drm_i915_private *dev_priv)
  * intel_lpe_audio_notify() - notify lpe audio event
  * audio driver and i915
  * @dev_priv: the i915 drm device private data
+ * @pipe: pipe
+ * @port: port
  * @eld : ELD data
- * @pipe: pipe id
- * @port: port id
  * @ls_clock: Link symbol clock in kHz
  * @dp_output: Driving a DP output?
  *
  * Notify lpe audio driver of eld change.
  */
 void intel_lpe_audio_notify(struct drm_i915_private *dev_priv,
-			    void *eld, int port, int pipe, int ls_clock,
-			    bool dp_output)
+			    enum pipe pipe, enum port port,
+			    const void *eld, int ls_clock, bool dp_output)
 {
 	unsigned long irq_flags;
 	struct intel_hdmi_lpe_audio_pdata *pdata = NULL;
-- 
2.10.2
[Intel-gfx] [PATCH alsa-lib] conf: Add multiple hdmi pcm definition for Intel LPE audio
From: Ville Syrjälä Now that the kernel driver exposes several pcm devices, update the hdmi pcm definitions to match. Cc: Takashi Iwai Cc: Pierre-Louis Bossart Signed-off-by: Ville Syrjälä --- src/conf/cards/HdmiLpeAudio.conf | 74 ++-- 1 file changed, 72 insertions(+), 2 deletions(-) diff --git a/src/conf/cards/HdmiLpeAudio.conf b/src/conf/cards/HdmiLpeAudio.conf index 61bdfeae2917..ad174b8ac450 100644 --- a/src/conf/cards/HdmiLpeAudio.conf +++ b/src/conf/cards/HdmiLpeAudio.conf @@ -51,11 +51,14 @@ HdmiLpeAudio.pcm.default { -HdmiLpeAudio.pcm.hdmi.0 { - @args [ CARD AES0 AES1 AES2 AES3 ] +HdmiLpeAudio.pcm.hdmi.common { + @args [ CARD DEVICE AES0 AES1 AES2 AES3 ] @args.CARD { type string } + @args.DEVICE { + type integer + } @args.AES0 { type integer } @@ -72,6 +75,7 @@ HdmiLpeAudio.pcm.hdmi.0 { slave.pcm { type hw card $CARD + device $DEVICE } hooks.0 { type ctl_elems @@ -86,3 +90,69 @@ HdmiLpeAudio.pcm.hdmi.0 { ] } } + +HdmiLpeAudio.pcm.hdmi.0 { + @args [ CARD AES0 AES1 AES2 AES3 ] + @args.CARD { type string } + @args.AES0 { type integer } + @args.AES1 { type integer } + @args.AES2 { type integer } + @args.AES3 { type integer } + @func refer + name { + @func concat + strings [ + "cards.HdmiLpeAudio.pcm.hdmi.common:" + "CARD=" $CARD "," + "DEVICE=0," + "AES0=" $AES0 "," + "AES1=" $AES1 "," + "AES2=" $AES2 "," + "AES3=" $AES3 + ] + } +} + +HdmiLpeAudio.pcm.hdmi.1 { + @args [ CARD AES0 AES1 AES2 AES3 ] + @args.CARD { type string } + @args.AES0 { type integer } + @args.AES1 { type integer } + @args.AES2 { type integer } + @args.AES3 { type integer } + @func refer + name { + @func concat + strings [ + "cards.HdmiLpeAudio.pcm.hdmi.common:" + "CARD=" $CARD "," + "DEVICE=1," + "AES0=" $AES0 "," + "AES1=" $AES1 "," + "AES2=" $AES2 "," + "AES3=" $AES3 + ] + } +} + +HdmiLpeAudio.pcm.hdmi.2 { + @args [ CARD AES0 AES1 AES2 AES3 ] + @args.CARD { type string } + @args.AES0 { type integer } + @args.AES1 { type integer } + @args.AES2 { type integer } + @args.AES3 { type integer } 
+ @func refer + name { + @func concat + strings [ + "cards.HdmiLpeAudio.pcm.hdmi.common:" + "CARD=" $CARD "," + "DEVICE=2," + "AES0=" $AES0 "," + "AES1=" $AES1 "," + "AES2=" $AES2 "," + "AES3=" $AES3 + ] + } +} -- 2.10.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
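For readers not used to alsa-lib's "@func refer" / "@func concat" idiom: each hdmi.N wrapper in the patch above only builds a name string that points back at the shared hdmi.common definition, with DEVICE hard-wired to N. A minimal C sketch of the string being assembled follows. The helper name hdmi_refer_name() is hypothetical and not part of alsa-lib, and rendering the AES values as plain decimal is an assumption.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not an alsa-lib function): formats the name that an
 * HdmiLpeAudio.pcm.hdmi.N wrapper passes to "@func refer". The layout
 * mirrors the strings [ ... ] list in the patch; decimal formatting of the
 * AES values is an assumption made for this illustration.
 * Returns the snprintf() result.
 */
static int hdmi_refer_name(char *buf, size_t len, const char *card, int device,
			   int aes0, int aes1, int aes2, int aes3)
{
	return snprintf(buf, len,
			"cards.HdmiLpeAudio.pcm.hdmi.common:"
			"CARD=%s,DEVICE=%d,AES0=%d,AES1=%d,AES2=%d,AES3=%d",
			card, device, aes0, aes1, aes2, aes3);
}
```

The three wrappers then differ only in the device argument (0, 1 or 2), which is why the patch can keep one common body.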
[Intel-gfx] [PATCH v2 10/11] ALSA: x86: Split snd_intelhad into card and PCM specific structures
From: Ville Syrjälä To allow multiple PCM devices to be registered for the LPE audio card, split the private data into card and PCM specific chunks. For now we'll stick to just one PCM device as before. v2: Rework to do a pcm device per port instead of per pipe Cc: Takashi Iwai Cc: Pierre-Louis Bossart Signed-off-by: Ville Syrjälä --- sound/x86/intel_hdmi_audio.c | 227 +-- sound/x86/intel_hdmi_audio.h | 15 ++- 2 files changed, 142 insertions(+), 100 deletions(-) diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c index 69e10845633a..12fae26e70bb 100644 --- a/sound/x86/intel_hdmi_audio.c +++ b/sound/x86/intel_hdmi_audio.c @@ -42,6 +42,9 @@ #include #include "intel_hdmi_audio.h" +#define for_each_port(card_ctx, port) \ + for ((port) = 0; (port) < (card_ctx)->num_ports; (port)++) + /*standard module options for ALSA. This module supports only one card*/ static int hdmi_card_index = SNDRV_DEFAULT_IDX1; static char *hdmi_card_id = SNDRV_DEFAULT_STR1; @@ -192,12 +195,12 @@ static void had_substream_put(struct snd_intelhad *intelhaddata) /* Register access functions */ static u32 had_read_register_raw(struct snd_intelhad *ctx, u32 reg) { - return ioread32(ctx->mmio_start + ctx->had_config_offset + reg); + return ioread32(ctx->card_ctx->mmio_start + ctx->had_config_offset + reg); } static void had_write_register_raw(struct snd_intelhad *ctx, u32 reg, u32 val) { - iowrite32(val, ctx->mmio_start + ctx->had_config_offset + reg); + iowrite32(val, ctx->card_ctx->mmio_start + ctx->had_config_offset + reg); } static void had_read_register(struct snd_intelhad *ctx, u32 reg, u32 *val) @@ -1519,22 +1522,27 @@ static const struct snd_kcontrol_new had_controls[] = { */ static irqreturn_t display_pipe_interrupt_handler(int irq, void *dev_id) { - struct snd_intelhad *ctx = dev_id; - u32 audio_stat; + struct snd_intelhad_card *card_ctx = dev_id; + int port; - /* use raw register access to ack IRQs even while disconnected */ - audio_stat = had_read_register_raw(ctx, 
AUD_HDMI_STATUS); + for_each_port(card_ctx, port) { + struct snd_intelhad *ctx = &card_ctx->pcm_ctx[port]; + u32 audio_stat; - if (audio_stat & HDMI_AUDIO_UNDERRUN) { - had_write_register_raw(ctx, AUD_HDMI_STATUS, - HDMI_AUDIO_UNDERRUN); - had_process_buffer_underrun(ctx); - } + /* use raw register access to ack IRQs even while disconnected */ + audio_stat = had_read_register_raw(ctx, AUD_HDMI_STATUS); + + if (audio_stat & HDMI_AUDIO_UNDERRUN) { + had_write_register_raw(ctx, AUD_HDMI_STATUS, + HDMI_AUDIO_UNDERRUN); + had_process_buffer_underrun(ctx); + } - if (audio_stat & HDMI_AUDIO_BUFFER_DONE) { - had_write_register_raw(ctx, AUD_HDMI_STATUS, - HDMI_AUDIO_BUFFER_DONE); - had_process_buffer_done(ctx); + if (audio_stat & HDMI_AUDIO_BUFFER_DONE) { + had_write_register_raw(ctx, AUD_HDMI_STATUS, + HDMI_AUDIO_BUFFER_DONE); + had_process_buffer_done(ctx); + } } return IRQ_HANDLED; @@ -1545,9 +1553,14 @@ static irqreturn_t display_pipe_interrupt_handler(int irq, void *dev_id) */ static void notify_audio_lpe(struct platform_device *pdev) { - struct snd_intelhad *ctx = platform_get_drvdata(pdev); + struct snd_intelhad_card *card_ctx = platform_get_drvdata(pdev); + int port; + + for_each_port(card_ctx, port) { + struct snd_intelhad *ctx = &card_ctx->pcm_ctx[port]; - schedule_work(&ctx->hdmi_audio_wq); + schedule_work(&ctx->hdmi_audio_wq); + } } /* the work to handle monitor hot plug/unplug */ @@ -1618,7 +1631,8 @@ static int had_create_jack(struct snd_intelhad *ctx, snprintf(hdmi_str, sizeof(hdmi_str), "HDMI/DP,pcm=%d", pcm->device); - err = snd_jack_new(ctx->card, hdmi_str, SND_JACK_AVOUT, &ctx->jack, + err = snd_jack_new(ctx->card_ctx->card, hdmi_str, + SND_JACK_AVOUT, &ctx->jack, true, false); if (err < 0) return err; @@ -1632,13 +1646,18 @@ static int had_create_jack(struct snd_intelhad *ctx, static int hdmi_lpe_audio_runtime_suspend(struct device *dev) { - struct snd_intelhad *ctx = dev_get_drvdata(dev); - struct snd_pcm_substream *substream; + struct snd_intelhad_card 
*card_ctx = dev_get_drvdata(dev); + int port; - substream = had_substream_get(ctx); - if (substream) { - snd_pcm_suspend(substream); -
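The card/PCM split in the patch above boils down to one pattern: the handler now receives the card context and walks every per-port PCM context with for_each_port(). A minimal userspace sketch of that pattern is below. The mock_card and mock_pcm types and the status bit values are placeholders invented for illustration; only the macro shape is taken from the patch.

```c
#include <stdint.h>

#define MOCK_MAX_PORTS 3

/* Placeholder status bits, not the real AUD_HDMI_STATUS layout. */
#define MOCK_UNDERRUN    (1u << 0)
#define MOCK_BUFFER_DONE (1u << 1)

struct mock_pcm {
	uint32_t status;        /* stands in for the per-port status register */
	unsigned int underruns;
	unsigned int buffers_done;
};

struct mock_card {
	int num_ports;
	struct mock_pcm pcm_ctx[MOCK_MAX_PORTS];
};

/* Same shape as the macro added by the patch. */
#define for_each_port(card_ctx, port) \
	for ((port) = 0; (port) < (card_ctx)->num_ports; (port)++)

/* One card-level interrupt services every registered port in turn. */
static void mock_card_irq(struct mock_card *card)
{
	int port;

	for_each_port(card, port) {
		struct mock_pcm *ctx = &card->pcm_ctx[port];
		uint32_t stat = ctx->status;

		if (stat & MOCK_UNDERRUN) {
			ctx->status &= ~MOCK_UNDERRUN;    /* ack */
			ctx->underruns++;
		}

		if (stat & MOCK_BUFFER_DONE) {
			ctx->status &= ~MOCK_BUFFER_DONE; /* ack */
			ctx->buffers_done++;
		}
	}
}
```

Ports at or beyond num_ports are never touched, which matches the patch keeping a single PCM for now while still sizing the loop from the card.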
[Intel-gfx] [PATCH v2 11/11] ALSA: x86: Register multiple PCM devices for the LPE audio card
From: Ville Syrjälä Now that everything is in place let's register a PCM device for each port of the display engine. This will make it possible to actually output audio to multiple displays at the same time. And it avoids modesets on unrelated displays from clobbering up the ELD and whatnot for the display currently doing the playback. v2: Add a PCM per port instead of per pipe Cc: Takashi Iwai Cc: Pierre-Louis Bossart Signed-off-by: Ville Syrjälä --- drivers/gpu/drm/i915/intel_lpe_audio.c | 19 ++--- include/drm/intel_lpe_audio.h | 6 +- sound/x86/intel_hdmi_audio.c | 126 +++-- sound/x86/intel_hdmi_audio.h | 7 +- 4 files changed, 92 insertions(+), 66 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lpe_audio.c b/drivers/gpu/drm/i915/intel_lpe_audio.c index bdbc235141b5..fa728ed21d1f 100644 --- a/drivers/gpu/drm/i915/intel_lpe_audio.c +++ b/drivers/gpu/drm/i915/intel_lpe_audio.c @@ -111,7 +111,11 @@ lpe_audio_platdev_create(struct drm_i915_private *dev_priv) pinfo.size_data = sizeof(*pdata); pinfo.dma_mask = DMA_BIT_MASK(32); - pdata->port.pipe = -1; + pdata->num_pipes = INTEL_INFO(dev_priv)->num_pipes; + pdata->num_ports = IS_CHERRYVIEW(dev_priv) ? 
3 : 2; /* B,C,D or B,C */ + pdata->port[0].pipe = -1; + pdata->port[1].pipe = -1; + pdata->port[2].pipe = -1; spin_lock_init(&pdata->lpe_audio_slock); platdev = platform_device_register_full(&pinfo); @@ -319,7 +323,7 @@ void intel_lpe_audio_notify(struct drm_i915_private *dev_priv, enum pipe pipe, enum port port, const void *eld, int ls_clock, bool dp_output) { - unsigned long irq_flags; + unsigned long irqflags; struct intel_hdmi_lpe_audio_pdata *pdata; struct intel_hdmi_lpe_audio_port_pdata *ppdata; u32 audio_enable; @@ -328,14 +332,12 @@ void intel_lpe_audio_notify(struct drm_i915_private *dev_priv, return; pdata = dev_get_platdata(&dev_priv->lpe_audio.platdev->dev); - ppdata = &pdata->port; + ppdata = &pdata->port[port]; - spin_lock_irqsave(&pdata->lpe_audio_slock, irq_flags); + spin_lock_irqsave(&pdata->lpe_audio_slock, irqflags); audio_enable = I915_READ(VLV_AUD_PORT_EN_DBG(port)); - ppdata->port = port; - if (eld != NULL) { memcpy(ppdata->eld, eld, HDMI_MAX_ELD_BYTES); ppdata->pipe = pipe; @@ -357,8 +359,7 @@ void intel_lpe_audio_notify(struct drm_i915_private *dev_priv, } if (pdata->notify_audio_lpe) - pdata->notify_audio_lpe(dev_priv->lpe_audio.platdev); + pdata->notify_audio_lpe(dev_priv->lpe_audio.platdev, port); - spin_unlock_irqrestore(&pdata->lpe_audio_slock, - irq_flags); + spin_unlock_irqrestore(&pdata->lpe_audio_slock, irqflags); } diff --git a/include/drm/intel_lpe_audio.h b/include/drm/intel_lpe_audio.h index 211f1cd61153..a911530c012e 100644 --- a/include/drm/intel_lpe_audio.h +++ b/include/drm/intel_lpe_audio.h @@ -40,9 +40,11 @@ struct intel_hdmi_lpe_audio_port_pdata { }; struct intel_hdmi_lpe_audio_pdata { - struct intel_hdmi_lpe_audio_port_pdata port; + struct intel_hdmi_lpe_audio_port_pdata port[3]; /* ports B,C,D */ + int num_ports; + int num_pipes; - void (*notify_audio_lpe)(struct platform_device *pdev); + void (*notify_audio_lpe)(struct platform_device *pdev, int pipe); spinlock_t lpe_audio_slock; }; diff --git 
a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c index 12fae26e70bb..909391d5270c 100644 --- a/sound/x86/intel_hdmi_audio.c +++ b/sound/x86/intel_hdmi_audio.c @@ -42,6 +42,8 @@ #include #include "intel_hdmi_audio.h" +#define for_each_pipe(card_ctx, pipe) \ + for ((pipe) = 0; (pipe) < (card_ctx)->num_pipes; (pipe)++) #define for_each_port(card_ctx, port) \ for ((port) = 0; (port) < (card_ctx)->num_ports; (port)++) @@ -192,15 +194,30 @@ static void had_substream_put(struct snd_intelhad *intelhaddata) spin_unlock_irqrestore(&intelhaddata->had_spinlock, flags); } +static u32 had_config_offset(int pipe) +{ + switch (pipe) { + default: + case 0: + return AUDIO_HDMI_CONFIG_A; + case 1: + return AUDIO_HDMI_CONFIG_B; + case 2: + return AUDIO_HDMI_CONFIG_C; + } +} + /* Register access functions */ -static u32 had_read_register_raw(struct snd_intelhad *ctx, u32 reg) +static u32 had_read_register_raw(struct snd_intelhad_card *card_ctx, +int pipe, u32 reg) { - return ioread32(ctx->card_ctx->mmio_start + ctx->had_config_offset + reg); + return ioread32(card_ctx->mmio_start + had_config_offset(pipe) + reg); } -static void had_write_register_raw(struct snd_intelhad *ctx, u32 reg, u32 val
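The had_config_offset() helper in the patch above puts "default:" ahead of "case 0:", so any unexpected pipe value falls back to the pipe A register block. A small standalone sketch of that lookup and of a raw read built on top of it is shown here. The offset values are placeholders chosen for the illustration, not the real AUDIO_HDMI_CONFIG_* definitions, and the array-backed mmio is a stand-in for ioread32().

```c
#include <stdint.h>

/* Placeholder offsets for illustration only. */
enum {
	MOCK_CONFIG_A = 0x000,
	MOCK_CONFIG_B = 0x800,
	MOCK_CONFIG_C = 0x900,
};

/* Unknown pipes share the pipe A block, as in the patch. */
static uint32_t mock_config_offset(int pipe)
{
	switch (pipe) {
	default:
	case 0:
		return MOCK_CONFIG_A;
	case 1:
		return MOCK_CONFIG_B;
	case 2:
		return MOCK_CONFIG_C;
	}
}

/* Array-backed stand-in for ioread32(mmio_start + offset + reg). */
static uint32_t mock_read_register_raw(const uint32_t *mmio, int pipe,
				       uint32_t reg)
{
	return mmio[(mock_config_offset(pipe) + reg) / sizeof(uint32_t)];
}
```

Passing the pipe per access, rather than caching an offset in the PCM context, is what lets a per-port PCM follow whichever pipe currently drives it.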
[Intel-gfx] [PATCH 03/11] drm/i915: Stop pretending to mask/unmask LPE audio interrupts
From: Ville Syrjälä

vlv_display_irq_postinstall() enables the LPE audio interrupts
regardless of whether the LPE audio irq chip has masked/unmasked
them. Also the irqchip masking/unmasking doesn't consider the state
of the display power well or the device, and hence just leads to
dmesg spew when it tries to access the hardware while it's powered
down.

If the current way works, then we don't need to do anything in the
mask/unmask hooks. If it doesn't work, well, then we'd need to properly
track whether the irqchip has masked/unmasked the interrupts when
we enable display interrupts. And the mask/unmask hooks would need
to check whether display interrupts are even enabled before frobbing
with the registers.

So let's just assume the current way works and neuter the mask/unmask
hooks. Also clean up vlv_display_irq_postinstall() a bit and stop
it from trying to unmask/enable the LPE C interrupt on VLV since it
doesn't exist.

Cc: Takashi Iwai
Cc: Pierre-Louis Bossart
Signed-off-by: Ville Syrjälä
---
 drivers/gpu/drm/i915/i915_irq.c        | 15 ++++++---------
 drivers/gpu/drm/i915/intel_lpe_audio.c | 36 ------------------------------------
 2 files changed, 6 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index fd97fe00cd0d..190f6aa5d15e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2953,7 +2953,6 @@ static void vlv_display_irq_postinstall(struct drm_i915_private *dev_priv)
 	u32 pipestat_mask;
 	u32 enable_mask;
 	enum pipe pipe;
-	u32 val;
 
 	pipestat_mask = PLANE_FLIP_DONE_INT_STATUS_VLV |
 			PIPE_CRC_DONE_INTERRUPT_STATUS;
@@ -2964,18 +2963,16 @@ static void vlv_display_irq_postinstall(struct drm_i915_private *dev_priv)
 
 	enable_mask = I915_DISPLAY_PORT_INTERRUPT |
 		I915_DISPLAY_PIPE_A_EVENT_INTERRUPT |
-		I915_DISPLAY_PIPE_B_EVENT_INTERRUPT;
+		I915_DISPLAY_PIPE_B_EVENT_INTERRUPT |
+		I915_LPE_PIPE_A_INTERRUPT |
+		I915_LPE_PIPE_B_INTERRUPT;
+
 	if (IS_CHERRYVIEW(dev_priv))
-		enable_mask |= I915_DISPLAY_PIPE_C_EVENT_INTERRUPT;
+		enable_mask |= I915_DISPLAY_PIPE_C_EVENT_INTERRUPT |
+			I915_LPE_PIPE_C_INTERRUPT;
 
 	WARN_ON(dev_priv->irq_mask != ~0);
 
-	val = (I915_LPE_PIPE_A_INTERRUPT |
-		I915_LPE_PIPE_B_INTERRUPT |
-		I915_LPE_PIPE_C_INTERRUPT);
-
-	enable_mask |= val;
-
 	dev_priv->irq_mask = ~enable_mask;
 
 	GEN5_IRQ_INIT(VLV_, dev_priv->irq_mask, enable_mask);
diff --git a/drivers/gpu/drm/i915/intel_lpe_audio.c b/drivers/gpu/drm/i915/intel_lpe_audio.c
index 668f00480d97..292fedf30b00 100644
--- a/drivers/gpu/drm/i915/intel_lpe_audio.c
+++ b/drivers/gpu/drm/i915/intel_lpe_audio.c
@@ -149,44 +149,10 @@ static void lpe_audio_platdev_destroy(struct drm_i915_private *dev_priv)
 
 static void lpe_audio_irq_unmask(struct irq_data *d)
 {
-	struct drm_i915_private *dev_priv = d->chip_data;
-	unsigned long irqflags;
-	u32 val = (I915_LPE_PIPE_A_INTERRUPT |
-		I915_LPE_PIPE_B_INTERRUPT);
-
-	if (IS_CHERRYVIEW(dev_priv))
-		val |= I915_LPE_PIPE_C_INTERRUPT;
-
-	spin_lock_irqsave(&dev_priv->irq_lock, irqflags);
-
-	dev_priv->irq_mask &= ~val;
-	I915_WRITE(VLV_IIR, val);
-	I915_WRITE(VLV_IIR, val);
-	I915_WRITE(VLV_IMR, dev_priv->irq_mask);
-	POSTING_READ(VLV_IMR);
-
-	spin_unlock_irqrestore(&dev_priv->irq_lock, irqflags);
 }
 
 static void lpe_audio_irq_mask(struct irq_data *d)
 {
-	struct drm_i915_private *dev_priv = d->chip_data;
-	unsigned long irqflags;
-	u32 val = (I915_LPE_PIPE_A_INTERRUPT |
-		I915_LPE_PIPE_B_INTERRUPT);
-
-	if (IS_CHERRYVIEW(dev_priv))
-		val |= I915_LPE_PIPE_C_INTERRUPT;
-
-	spin_lock_irqsave(&dev_priv->irq_lock, irqflags);
-
-	dev_priv->irq_mask |= val;
-	I915_WRITE(VLV_IMR, dev_priv->irq_mask);
-	I915_WRITE(VLV_IIR, val);
-	I915_WRITE(VLV_IIR, val);
-	POSTING_READ(VLV_IIR);
-
-	spin_unlock_irqrestore(&dev_priv->irq_lock, irqflags);
 }
 
 static struct irq_chip lpe_audio_irqchip = {
@@ -330,8 +296,6 @@ void intel_lpe_audio_teardown(struct drm_i915_private *dev_priv)
 
 	desc = irq_to_desc(dev_priv->lpe_audio.irq);
 
-	lpe_audio_irq_mask(&desc->irq_data);
-
 	lpe_audio_platdev_destroy(dev_priv);
 
 	irq_free_desc(dev_priv->lpe_audio.irq);
-- 
2.10.2
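The postinstall change above can be read as: build one enable mask up front, add the pipe C bits (display and LPE) only on Cherryview, then derive the mask register value as its complement. A minimal sketch of that arithmetic is below. The bit positions are placeholders invented for the example, not the real I915_* interrupt bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions for illustration only. */
#define MOCK_DISPLAY_PORT_INT (1u << 0)
#define MOCK_PIPE_A_EVENT_INT (1u << 1)
#define MOCK_PIPE_B_EVENT_INT (1u << 2)
#define MOCK_PIPE_C_EVENT_INT (1u << 3)
#define MOCK_LPE_PIPE_A_INT   (1u << 4)
#define MOCK_LPE_PIPE_B_INT   (1u << 5)
#define MOCK_LPE_PIPE_C_INT   (1u << 6)

/* Interrupts to enable: pipe C bits only where pipe C exists. */
static uint32_t mock_enable_mask(bool is_cherryview)
{
	uint32_t enable_mask = MOCK_DISPLAY_PORT_INT |
		MOCK_PIPE_A_EVENT_INT |
		MOCK_PIPE_B_EVENT_INT |
		MOCK_LPE_PIPE_A_INT |
		MOCK_LPE_PIPE_B_INT;

	if (is_cherryview)
		enable_mask |= MOCK_PIPE_C_EVENT_INT |
			MOCK_LPE_PIPE_C_INT;

	return enable_mask;
}

/* A set bit in the mask register means "masked", hence the complement. */
static uint32_t mock_irq_mask(bool is_cherryview)
{
	return ~mock_enable_mask(is_cherryview);
}
```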
[Intel-gfx] [PATCH v2 08/11] drm/i915: Clean up the LPE audio platform data
From: Ville Syrjälä Split the LPE audio platform data into a port specific chunk and device specific chunk. Eventually we'll have a port specific chunk for each port, but for now we'll stick to just one. We'll also get rid of the intel_hdmi_lpe_audio_eld structure which doesn't seem to have any real reason to exist. v2: Organize per port instead of per pipe Cc: Takashi Iwai Cc: Pierre-Louis Bossart Signed-off-by: Ville Syrjälä --- drivers/gpu/drm/i915/intel_lpe_audio.c | 30 ++ include/drm/intel_lpe_audio.h | 15 --- sound/x86/intel_hdmi_audio.c | 19 +-- 3 files changed, 31 insertions(+), 33 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lpe_audio.c b/drivers/gpu/drm/i915/intel_lpe_audio.c index 4c770d037f23..bdbc235141b5 100644 --- a/drivers/gpu/drm/i915/intel_lpe_audio.c +++ b/drivers/gpu/drm/i915/intel_lpe_audio.c @@ -111,7 +111,7 @@ lpe_audio_platdev_create(struct drm_i915_private *dev_priv) pinfo.size_data = sizeof(*pdata); pinfo.dma_mask = DMA_BIT_MASK(32); - pdata->pipe = -1; + pdata->port.pipe = -1; spin_lock_init(&pdata->lpe_audio_slock); platdev = platform_device_register_full(&pinfo); @@ -320,38 +320,36 @@ void intel_lpe_audio_notify(struct drm_i915_private *dev_priv, const void *eld, int ls_clock, bool dp_output) { unsigned long irq_flags; - struct intel_hdmi_lpe_audio_pdata *pdata = NULL; + struct intel_hdmi_lpe_audio_pdata *pdata; + struct intel_hdmi_lpe_audio_port_pdata *ppdata; u32 audio_enable; if (!HAS_LPE_AUDIO(dev_priv)) return; - pdata = dev_get_platdata( - &(dev_priv->lpe_audio.platdev->dev)); + pdata = dev_get_platdata(&dev_priv->lpe_audio.platdev->dev); + ppdata = &pdata->port; spin_lock_irqsave(&pdata->lpe_audio_slock, irq_flags); audio_enable = I915_READ(VLV_AUD_PORT_EN_DBG(port)); - pdata->eld.port_id = port; + ppdata->port = port; if (eld != NULL) { - memcpy(pdata->eld.eld_data, eld, - HDMI_MAX_ELD_BYTES); - pdata->pipe = pipe; - pdata->ls_clock = ls_clock; - pdata->dp_output = dp_output; + memcpy(ppdata->eld, eld, 
HDMI_MAX_ELD_BYTES); + ppdata->pipe = pipe; + ppdata->ls_clock = ls_clock; + ppdata->dp_output = dp_output; /* Unmute the amp for both DP and HDMI */ I915_WRITE(VLV_AUD_PORT_EN_DBG(port), audio_enable & ~VLV_AMP_MUTE); - } else { - memset(pdata->eld.eld_data, 0, - HDMI_MAX_ELD_BYTES); - pdata->pipe = -1; - pdata->ls_clock = 0; - pdata->dp_output = false; + memset(ppdata->eld, 0, HDMI_MAX_ELD_BYTES); + ppdata->pipe = -1; + ppdata->ls_clock = 0; + ppdata->dp_output = false; /* Mute the amp for both DP and HDMI */ I915_WRITE(VLV_AUD_PORT_EN_DBG(port), diff --git a/include/drm/intel_lpe_audio.h b/include/drm/intel_lpe_audio.h index 9a5bdf5ad180..211f1cd61153 100644 --- a/include/drm/intel_lpe_audio.h +++ b/include/drm/intel_lpe_audio.h @@ -31,16 +31,17 @@ struct platform_device; #define HDMI_MAX_ELD_BYTES 128 -struct intel_hdmi_lpe_audio_eld { - int port_id; - unsigned char eld_data[HDMI_MAX_ELD_BYTES]; -}; - -struct intel_hdmi_lpe_audio_pdata { +struct intel_hdmi_lpe_audio_port_pdata { + u8 eld[HDMI_MAX_ELD_BYTES]; + int port; int pipe; int ls_clock; bool dp_output; - struct intel_hdmi_lpe_audio_eld eld; +}; + +struct intel_hdmi_lpe_audio_pdata { + struct intel_hdmi_lpe_audio_port_pdata port; + void (*notify_audio_lpe)(struct platform_device *pdev); spinlock_t lpe_audio_slock; }; diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c index 1a095189db83..c2b78621852e 100644 --- a/sound/x86/intel_hdmi_audio.c +++ b/sound/x86/intel_hdmi_audio.c @@ -1556,21 +1556,20 @@ static void had_audio_wq(struct work_struct *work) struct snd_intelhad *ctx = container_of(work, struct snd_intelhad, hdmi_audio_wq); struct intel_hdmi_lpe_audio_pdata *pdata = ctx->dev->platform_data; + struct intel_hdmi_lpe_audio_port_pdata *ppdata = &pdata->port; pm_runtime_get_sync(ctx->dev); mutex_lock(&ctx->mutex); - if (pdata->pipe < 0) { + if (ppdata->pipe < 0) { dev_dbg(ctx->dev, "%s: Event: HAD_NOTIFY_HOT_UNPLUG\n", __func__); memset(ctx->eld, 0, sizeof(ctx->eld)); /* clear the 
old ELD */ had_process_hot_unplug(ctx); } else { - struct intel_hdmi_lpe_audio_eld *eld = &pdata
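The per-port platform data introduced above carries a simple convention: a non-NULL ELD means "plugged, here is the configuration", while a NULL ELD clears everything and sets pipe to -1, which the audio side then treats as a hot unplug. A userspace sketch of that convention follows. The mock_port_pdata type and both function names are hypothetical; the field set mirrors intel_hdmi_lpe_audio_port_pdata from the patch, and the caller is assumed to pass a full-size ELD buffer.

```c
#include <stdbool.h>
#include <string.h>

#define MOCK_ELD_BYTES 128 /* mirrors HDMI_MAX_ELD_BYTES in the patch */

struct mock_port_pdata {
	unsigned char eld[MOCK_ELD_BYTES];
	int port;
	int pipe;
	int ls_clock;
	bool dp_output;
};

/*
 * Hypothetical stand-in for the pdata update inside
 * intel_lpe_audio_notify(); eld, when non-NULL, must point at
 * MOCK_ELD_BYTES bytes.
 */
static void mock_port_notify(struct mock_port_pdata *pp, int pipe, int port,
			     const void *eld, int ls_clock, bool dp_output)
{
	pp->port = port;

	if (eld != NULL) {
		memcpy(pp->eld, eld, MOCK_ELD_BYTES);
		pp->pipe = pipe;
		pp->ls_clock = ls_clock;
		pp->dp_output = dp_output;
	} else {
		memset(pp->eld, 0, MOCK_ELD_BYTES);
		pp->pipe = -1;
		pp->ls_clock = 0;
		pp->dp_output = false;
	}
}

/* The consumer side only needs the sign of pipe to tell plug from unplug. */
static bool mock_port_connected(const struct mock_port_pdata *pp)
{
	return pp->pipe >= 0;
}
```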
[Intel-gfx] ✓ Fi.CI.BAT: success for conf: Add multiple hdmi pcm definition for Intel LPE audio
== Series Details ==

Series: conf: Add multiple hdmi pcm definition for Intel LPE audio
URL   : https://patchwork.freedesktop.org/series/23639/
State : success

== Summary ==

Series 23639v1 conf: Add multiple hdmi pcm definition for Intel LPE audio
https://patchwork.freedesktop.org/api/1.0/series/23639/revisions/1/mbox/

Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                dmesg-warn -> PASS       (fi-kbl-7560u) fdo#100125

fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:435s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:423s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:577s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:512s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:544s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:487s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:483s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:406s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:410s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:415s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:495s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:480s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:459s
fi-kbl-7560u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:573s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:455s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:565s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:459s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:489s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:429s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:532s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:396s

8b5a41bbd270c3a8db6d48bc1d6d6bafb59e6753 drm-tip: 2017y-04m-27d-13h-10m-59s UTC integration manifest
bc96bd1 ALSA: x86: Register multiple PCM devices for the LPE audio card
e09ae3e ALSA: x86: Split snd_intelhad into card and PCM specific structures
a9f1076 ALSA: x86: Prepare LPE audio ctls for multiple PCMs
ca71392 drm/i915: Clean up the LPE audio platform data
13211f9 drm/i915: Reorganize intel_lpe_audio_notify() arguments
023f2ba drm/i915: Remove hdmi_connected from LPE audio pdata
26eda33 drm/i915: Replace tmds_clock_speed and link_rate with just ls_clock
223ac7f drm/i915: Remove the unused pending_notify from LPE platform data
1df8df6 drm/i915: Stop pretending to mask/unmask LPE audio interrupts
ff5766c ALSA: x86: Clear the pdata.notify_lpe_audio pointer before teardown
ddc7b54 drm/i915: Fix runtime PM for LPE audio

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4570/
Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9
On Thu, Apr 27, 2017 at 06:30:42PM +0300, David Weinehall wrote:
> On Thu, Apr 27, 2017 at 04:55:20PM +0200, Arkadiusz Hiler wrote:
> > On Wed, Apr 26, 2017 at 06:00:41PM +0300, David Weinehall wrote:
> > > Add a bunch of MOCS entries for gen 9 that were missing from intel_mocs.
> > > Some of these are used by media-sdk; if these entries are missing
> > > the default will instead be to do everything uncached.
> > >
> > > This patch improves media-sdk performance with up to 60%
> > > with the (admittedly synthetic) benchmarks we use in our nightly
> > > testing, without regressing any other benchmarks.
> >
> > Hey David,
> >
> > I am testing some of the extended MOCS with Mesa and the differences I
> > see fit in the margins of statistical error.
> >
> > Odd, I thought, so to make sure I haven't messed up anything in the
> > process of compiling, setting LD_LIBRARY_PATH and benchmarking I turned
> > everything to UNCACHED - and I saw severe performance drop.
> >
> > So here is the question it induced:
> >
> > Have you used the "closest neighbour" from entries available or did you
> > defaulted to the UNCACHED ones? That could be the culprit.
> >
> > Note: I have tested MOCS for VB and Render Target only, and only in a
> > few synthetic cases - it will require much more fine-tuning and
> > benchmarking before any final conclusions.
>
> As I mentioned in the commit message, the improvements only manifest
> themselves for media-sdk workloads (and presumably other workloads
> that uses the same hardware); if you see any performance regressions
> with these additional entries I'd be interested to know.

But what is being counter suggested is that there is no reason for these
mocs entries. If the sdk is just using mocs registers without first
programming them outside of the kernel abi, then it will be hitting
uncached memory - and then the only benefit is from simply enabling
cached access.

The kernel ABI is minimalist for a reason, and we want to know why we
should be adding tables that we need to maintain forever (bonus points
for making that a consistent interface for hardware for years to come).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
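To make the objection concrete, here is a toy model of the situation being described: userspace picks a table index, and only indices the kernel has programmed carry a caching policy. Everything in the sketch is invented for illustration (the policy names, the three-entry table, the lookup helper), and treating unprogrammed indices as uncached is taken from the thread's description rather than from hardware documentation.

```c
#include <stddef.h>

/* Invented policy names; not the real MOCS control values. */
enum toy_cache_policy {
	TOY_UNCACHED,
	TOY_LLC_WB,
	TOY_LLC_ELLC_WB,
};

/* The entries a minimal, hypothetical kernel table would program. */
static const enum toy_cache_policy toy_mocs_table[] = {
	TOY_UNCACHED,
	TOY_LLC_WB,
	TOY_LLC_ELLC_WB,
};

#define TOY_MOCS_ENTRIES (sizeof(toy_mocs_table) / sizeof(toy_mocs_table[0]))

/* An index outside the programmed range behaves as uncached in this model. */
static enum toy_cache_policy toy_mocs_lookup(size_t index)
{
	if (index < TOY_MOCS_ENTRIES)
		return toy_mocs_table[index];

	return TOY_UNCACHED;
}
```

In this model a client using index 10 sees a speed-up from almost any added entry, which is the point of the question: the gain may come from leaving the uncached default rather than from the specific new settings.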
Re: [Intel-gfx] [PATCH 1/2] drm: Make fbdev inherit the crtc's initial rotation
On Wed, Apr 26, 2017 at 02:28:32PM +0200, Bastien Nocera wrote:
> On Mon, 2017-04-24 at 15:48 +0300, Ville Syrjälä wrote:
> > > I've a patch for iio-sensor-proxy which fixes the rotation under
> > > Xorg / Wayland when using a desktop environment which honors
> > > iio-sensor-proxy's rotation detection:
> > > https://github.com/hadess/iio-sensor-proxy/pull/162
> >
> > Or is it just this thing that clobbers what the DDX inherited from
> > the kernel as the initial rotation?
>
> I think it's mostly got to do with the compositor (or X) not knowing
> what "normal" or "0 degrees rotation" corresponds to.

Well, there are really two cases to consider:

1. BIOS/whatever configures display hardware rotation in a way
   that matches the orientation of the physical display
2. BIOS didn't do that. Either the hardware can't do what
   would be required, or the BIOS just chose not to do it.

Case 1 should work with these patches as long as the DDX will set up the
initial randr rotation to match what it read out from the kms rotation
property of the primary plane.

Case 2 can't work without some mechanism to query the orientation
of the display from the firmware/etc.

-- 
Ville Syrjälä
Intel OTC
Re: [Intel-gfx] [PATCH 1/2] drm: Make fbdev inherit the crtc's initial rotation
On Thu, 2017-04-27 at 19:24 +0300, Ville Syrjälä wrote:
> On Wed, Apr 26, 2017 at 02:28:32PM +0200, Bastien Nocera wrote:
> > On Mon, 2017-04-24 at 15:48 +0300, Ville Syrjälä wrote:
> > > > I've a patch for iio-sensor-proxy which fixes the rotation
> > > > under Xorg / Wayland when using a desktop environment which
> > > > honors iio-sensor-proxy's rotation detection:
> > > > https://github.com/hadess/iio-sensor-proxy/pull/162
> > >
> > > Or is it just this thing that clobbers what the DDX inherited
> > > from the kernel as the initial rotation?
> >
> > I think it's mostly got to do with the compositor (or X) not
> > knowing what "normal" or "0 degrees rotation" corresponds to.
>
> Well, there are really two cases to consider:
>
> 1. BIOS/whatever configures display hardware rotation in a way
>    that matches the orientation of the physical display
> 2. BIOS didn't do that. Either the hardware can't do what
>    would be required, or the BIOS just chose not to do it.
>
> Case 1 should work with these patches as long as the DDX will set up
> the initial randr rotation to match what it read out from the kms
> rotation property of the primary plane.

Yes. My problem was that instead of fixing the DDX to behave properly,
reusing the same orientation as already configured, we were using
iio-sensor-proxy to trigger the initial rotation. This doesn't work if
there's no accelerometer, or orientation is locked, which is
counter-intuitive.

> Case 2 can't work without some mechanism to query the orientation
> of the display from the firmware/etc.

Yes. I'm not sure where we'd be exporting this quirk though, as we need
it available early enough so that it can be used by boot splashes. DMI
matches in the graphics driver?
Re: [Intel-gfx] [PATCH v9] drm/i915: Squash repeated awaits on the same fence
On 27/04/2017 12:48, Chris Wilson wrote: Track the latest fence waited upon on each context, and only add a new asynchronous wait if the new fence is more recent than the recorded fence for that context. This requires us to filter out unordered timelines, which are noted by DMA_FENCE_NO_CONTEXT. However, in the absence of a universal identifier, we have to use our own i915->mm.unordered_timeline token. v2: Throw around the debug crutches v3: Inline the likely case of the pre-allocation cache being full. v4: Drop the pre-allocation support, we can lose the most recent fence in case of allocation failure -- it just means we may emit more awaits than strictly necessary but will not break. v5: Trim allocation size for leaf nodes, they only need an array of u32 not pointers. v6: Create mock_timeline to tidy selftest writing v7: s/intel_timeline_sync_get/intel_timeline_sync_is_later/ (Tvrtko) v8: Prune the stale sync points when we idle. v9: Include a small benchmark in the kselftests Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Cc: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem.c| 1 + drivers/gpu/drm/i915/i915_gem_request.c| 11 + drivers/gpu/drm/i915/i915_gem_timeline.c | 314 + drivers/gpu/drm/i915/i915_gem_timeline.h | 15 + drivers/gpu/drm/i915/selftests/i915_gem_timeline.c | 225 +++ .../gpu/drm/i915/selftests/i915_mock_selftests.h | 1 + drivers/gpu/drm/i915/selftests/mock_timeline.c | 52 drivers/gpu/drm/i915/selftests/mock_timeline.h | 33 +++ 8 files changed, 652 insertions(+) create mode 100644 drivers/gpu/drm/i915/selftests/i915_gem_timeline.c create mode 100644 drivers/gpu/drm/i915/selftests/mock_timeline.c create mode 100644 drivers/gpu/drm/i915/selftests/mock_timeline.h diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index c1fa3c103f38..f886ef492036 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3214,6 +3214,7 @@ i915_gem_idle_work_handler(struct work_struct *work) 
intel_engine_disarm_breadcrumbs(engine); i915_gem_batch_pool_fini(&engine->batch_pool); } + i915_gem_timelines_mark_idle(dev_priv); GEM_BUG_ON(!dev_priv->gt.awake); dev_priv->gt.awake = false; diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c index 5fa4e52ded06..d9f76665bc6b 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.c +++ b/drivers/gpu/drm/i915/i915_gem_request.c @@ -772,6 +772,12 @@ i915_gem_request_await_dma_fence(struct drm_i915_gem_request *req, if (fence->context == req->fence.context) continue; + /* Squash repeated waits to the same timelines */ + if (fence->context != req->i915->mm.unordered_timeline && + intel_timeline_sync_is_later(req->timeline, +fence->context, fence->seqno)) + continue; Wrong base? + if (dma_fence_is_i915(fence)) ret = i915_gem_request_await_request(req, to_request(fence)); @@ -781,6 +787,11 @@ i915_gem_request_await_dma_fence(struct drm_i915_gem_request *req, GFP_KERNEL); if (ret < 0) return ret; + + /* Record the most latest fence on each timeline */ + if (fence->context != req->i915->mm.unordered_timeline) + intel_timeline_sync_set(req->timeline, + fence->context, fence->seqno); } while (--nchild); return 0; diff --git a/drivers/gpu/drm/i915/i915_gem_timeline.c b/drivers/gpu/drm/i915/i915_gem_timeline.c index b596ca7ee058..967c53a53a92 100644 --- a/drivers/gpu/drm/i915/i915_gem_timeline.c +++ b/drivers/gpu/drm/i915/i915_gem_timeline.c @@ -24,6 +24,276 @@ #include "i915_drv.h" +#define NSYNC 16 +#define SHIFT ilog2(NSYNC) +#define MASK (NSYNC - 1) + +/* struct intel_timeline_sync is a layer of a radixtree that maps a u64 fence + * context id to the last u32 fence seqno waited upon from that context. + * Unlike lib/radixtree it uses a parent pointer that allows traversal back to + * the root. This allows us to access the whole tree via a single pointer + * to the most recently used layer. 
We expect fence contexts to be dense + * and most reuse to be on the same i915_gem_context but on neighbouring + * engines (i.e. on adjacent contexts) and reuse the same leaf, a very + * effective lookup cache. If the new lookup is not on the same leaf, we + * expect it to be on the neighbouring branch. + * + * A leaf holds an array of u32 seqno, and has height 0. The bitmap field + * allows us to store whether a particular seqno is valid (i.
Re: [Intel-gfx] [PATCH v9] drm/i915: Squash repeated awaits on the same fence
On Thu, Apr 27, 2017 at 05:47:32PM +0100, Tvrtko Ursulin wrote: > > On 27/04/2017 12:48, Chris Wilson wrote: > >diff --git a/drivers/gpu/drm/i915/i915_gem_request.c > >b/drivers/gpu/drm/i915/i915_gem_request.c > >index 5fa4e52ded06..d9f76665bc6b 100644 > >--- a/drivers/gpu/drm/i915/i915_gem_request.c > >+++ b/drivers/gpu/drm/i915/i915_gem_request.c > >@@ -772,6 +772,12 @@ i915_gem_request_await_dma_fence(struct > >drm_i915_gem_request *req, > > if (fence->context == req->fence.context) > > continue; > > > >+/* Squash repeated waits to the same timelines */ > >+if (fence->context != req->i915->mm.unordered_timeline && > >+intel_timeline_sync_is_later(req->timeline, > >+ fence->context, fence->seqno)) > >+continue; > > Wrong base? I haven't moved this patch relative to the others in the series? There's a few patches to get to here first. > >+struct intel_timeline_sync { > >+u64 prefix; > >+unsigned int height; > >+unsigned int bitmap; > > u16 would be enough for the bitmap since NSYNC == 16? To no benefit > though. Maybe just add a BUILD_BUG_ON(sizeof(p->bitmap) * > BITS_PER_BYTE >= NSYNC) somewhere? Indeed compacting these bits have no impact on allocation size, so I went with natural sizes. But I didn't check if the compiler prefers u16. > >+struct intel_timeline_sync *parent; > >+/* union { > >+ * u32 seqno; > >+ * struct intel_timeline_sync *child; > >+ * } slot[NSYNC]; > >+ */ > > Put a note saying this comment describes what follows after struct > intel_timeline_sync. > > Would "union { ... } slot[0];" work as a maker and have any benefit > to the readability of the code below? > > You could same some bytes (64 I think) for the leaf nodes if you did > something like: Hmm, where's the saving? leaves are sizeof(*p) + NSYNC*sizeof(seqno) -> kmalloc-128 slab branches are sizeof(*p) + NSYNC*sizeof(p) -> kmalloc-256 slab > union { > u32 seqno[NSYNC]; > struct intel_timeline_sync *child[NSYNC]; > }; > > Although I think it conflicts with the slot marker idea. 
Hm, no
> actually it doesn't. You could have both union members as simply
> markers.
>
> union {
> 	u32 seqno[];
> 	struct intel_timeline_sync *child[];
> };
>
> Again, not sure yet if it would make that much better readability.

Tried, gcc doesn't like unions of variable length arrays. Hence
resorting to manually packing the arrays after the struct.

> >+static void __sync_free(struct intel_timeline_sync *p)
> >+{
> >+	if (p->height) {
> >+		unsigned int i;
> >+
> >+		while ((i = ffs(p->bitmap))) {
> >+			p->bitmap &= ~0u << i;
> >+			__sync_free(__sync_child(p)[i - 1]);
>
> Maximum height is 64 for this tree so here there is no danger of
> stack overflow?

Maximum recursion depth is 64 / NSHIFT(4) = 16. Stack usage is small, only
a few registers to push pop, so I didn't feel any danger in allowing
recursion. The while() loop was chosen as that avoided a stack variable.

> >+	/* First climb the tree back to a parent branch */
> >+	do {
> >+		p = p->parent;
> >+		if (!p)
> >+			return false;
> >+
> >+		if ((id >> p->height >> SHIFT) == p->prefix)
>
> Worth having "id >> p->height >> SHIFT" as a macro for better readability?

Yeah, this is the main issue with the code, so many shifts.

> >+			break;
> >+	} while (1);
> >+
> >+	/* And then descend again until we find our leaf */
> >+	do {
> >+		if (!p->height)
> >+			break;
> >+
> >+		p = __sync_child(p)[__sync_idx(p, id)];
> >+		if (!p)
> >+			return false;
> >+
> >+		if ((id >> p->height >> SHIFT) != p->prefix)
> >+			return false;
>
> Is this possible or a GEM_BUG_ON? Maybe I am not understanding it,
> but I thought it would be __sync_child slot had unexpected prefix in
> it?

The tree may skip levels.
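The layout discussed above (a fixed header with the slot array packed manually after it, since gcc rejects a union of flexible arrays) can be sketched in userspace C. This is only an illustration under stated assumptions: the names sync_node, node_seqno(), node_child(), leaf_size() and branch_size() are made up here and are not the patch's helpers, calloc() stands in for kzalloc(), and SHIFT is hard-coded to 4 to match NSYNC == 16. On a typical 64-bit build the header is 24 bytes, so a leaf comes to 88 bytes and a branch to 152 bytes, which is consistent with the kmalloc-128 and kmalloc-256 slabs mentioned in the reply.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NSYNC 16
#define SHIFT 4			/* ilog2(NSYNC), hard-coded for this sketch */

/* A 64-bit id consumed SHIFT bits per level bounds the recursion depth. */
enum { MAX_DEPTH = 64 / SHIFT };

/* Header shared by leaves and branches (hypothetical name). */
struct sync_node {
	uint64_t prefix;
	unsigned int height;
	unsigned int bitmap;
	struct sync_node *parent;
	/*
	 * Followed in the same allocation by either
	 *   uint32_t seqno[NSYNC]             (leaf, height == 0), or
	 *   struct sync_node *child[NSYNC]    (branch, height > 0).
	 */
};

/* The slot array starts immediately after the header. */
static inline uint32_t *node_seqno(struct sync_node *p)
{
	return (uint32_t *)(p + 1);
}

static inline struct sync_node **node_child(struct sync_node *p)
{
	return (struct sync_node **)(p + 1);
}

static size_t leaf_size(void)
{
	return sizeof(struct sync_node) + NSYNC * sizeof(uint32_t);
}

static size_t branch_size(void)
{
	return sizeof(struct sync_node) + NSYNC * sizeof(struct sync_node *);
}

/* Allocate a zeroed leaf covering the NSYNC ids that share id's prefix. */
static struct sync_node *leaf_alloc(uint64_t id)
{
	struct sync_node *p = calloc(1, leaf_size());

	if (p)
		p->prefix = id >> SHIFT;
	return p;
}
```

The accessors take the place of a named slot[] member, which is the readability cost of packing the arrays by hand.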
> >+} while (1); > >+ > >+tl->sync = p; > >+found: > >+idx = id & MASK; > >+if (!(p->bitmap & BIT(idx))) > >+return false; > >+ > >+return i915_seqno_passed(__sync_seqno(p)[idx], seqno); > >+} > >+ > >+static noinline int > >+__intel_timeline_sync_set(struct intel_timeline *tl, u64 id, u32 seqno) > >+{ > >+struct intel_timeline_sync *p = tl->sync; > >+unsigned int idx; > >+ > >+if (!p) { > >+p = kzalloc(sizeof(*p) + NSYNC * sizeof(seqno), GFP_KERNEL); > >+if (unlikely(!p)) > >+return -ENOMEM; > >+ > >+p->prefix = id >> SHIFT; > >+goto found; > >+} > >+ > >+/* Climb back up the tree until we find a common prefix */ > >+do { > >+if (!p->parent) > >+break;
Re: [Intel-gfx] [PATCH 1/2] drm/i915/guc: Fix sleep under spinlock during reset
On 12/04/17 09:22, Michel Thierry wrote:
> On 12/04/17 08:58, Chris Wilson wrote:
>> On Wed, Apr 12, 2017 at 04:48:42PM +0100, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin
>>>
>>> Looks like intel_guc_reset had the ability to sleep under the uncore
>>> spinlock since forever but it wasn't detected until the recent changes
>>> annotated the wait for register with might_sleep.
>>>
>>> I have fixed it by removing holding of the uncore spinlock over the
>>> call to gen6_hw_domain_reset, since I do not see that is really needed.
>>>
>>> But there is always a possibility I am missing some nasty detail so
>>> please double check.
>>
>> Afaik, no we are not using the uncore.lock here to serialise resets so
>> yes we should be safe in dropping it.
>>
>> Will the guc be coming under the same hw semaphore as gen8 per-engine
>> resets?
>
> A bit unrelated, but should intel_guc_reset be intel_reset_guc instead?
> Here we're trying to reset the microcontroller, not asking guc to do a
> reset.

Ping?
Anyone unlucky enough to be using GuC submission should be seeing this
warning when the firmware has to be reloaded (for example after any i-g-t
hang test).

I still think the function should be renamed to _reset_guc though, since
it's the hw resetting the guc, not the other way around.

Acked-by: Michel Thierry

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/2] drm/i915/guc: Fix sleep under spinlock during reset
On 27/04/2017 19:14, Michel Thierry wrote:
> On 12/04/17 09:22, Michel Thierry wrote:
>> On 12/04/17 08:58, Chris Wilson wrote:
>>> On Wed, Apr 12, 2017 at 04:48:42PM +0100, Tvrtko Ursulin wrote:
>>>> From: Tvrtko Ursulin
>>>>
>>>> Looks like intel_guc_reset had the ability to sleep under the uncore
>>>> spinlock since forever but it wasn't detected until the recent changes
>>>> annotated the wait for register with might_sleep.
>>>>
>>>> I have fixed it by removing holding of the uncore spinlock over the
>>>> call to gen6_hw_domain_reset, since I do not see that is really needed.
>>>>
>>>> But there is always a possibility I am missing some nasty detail so
>>>> please double check.
>>>
>>> Afaik, no we are not using the uncore.lock here to serialise resets so
>>> yes we should be safe in dropping it.
>>>
>>> Will the guc be coming under the same hw semaphore as gen8 per-engine
>>> resets?
>>
>> A bit unrelated, but should intel_guc_reset be intel_reset_guc instead?
>> Here we're trying to reset the microcontroller, not asking guc to do a
>> reset.
>
> Ping?
> Anyone unlucky enough to be using GuC submission should be seeing this
> warning when the firmware has to be reloaded (for example after any i-g-t
> hang test).
>
> I still think the function should be renamed to _reset_guc though, since
> it's the hw resetting the guc, not the other way around.
>
> Acked-by: Michel Thierry

Thanks!

Now just exercise restraint in suggesting bikesheds and if someone can
provide an r-b we could merge this. ;)

(To be read as - let's leave the renaming for a follow up work since this
fix is not to blame for the objectionable name.)

Regards,

Tvrtko
Re: [Intel-gfx] [PATCH 1/2] drm/i915/guc: Fix sleep under spinlock during reset
On 27/04/17 11:20, Tvrtko Ursulin wrote:
> On 27/04/2017 19:14, Michel Thierry wrote:
>> On 12/04/17 09:22, Michel Thierry wrote:
>>> On 12/04/17 08:58, Chris Wilson wrote:
>>>> On Wed, Apr 12, 2017 at 04:48:42PM +0100, Tvrtko Ursulin wrote:
>>>>> From: Tvrtko Ursulin
>>>>>
>>>>> Looks like intel_guc_reset had the ability to sleep under the uncore
>>>>> spinlock since forever but it wasn't detected until the recent changes
>>>>> annotated the wait for register with might_sleep.
>>>>>
>>>>> I have fixed it by removing holding of the uncore spinlock over the
>>>>> call to gen6_hw_domain_reset, since I do not see that is really needed.
>>>>>
>>>>> But there is always a possibility I am missing some nasty detail so
>>>>> please double check.
>>>>
>>>> Afaik, no we are not using the uncore.lock here to serialise resets so
>>>> yes we should be safe in dropping it.
>>>>
>>>> Will the guc be coming under the same hw semaphore as gen8 per-engine
>>>> resets?
>>>
>>> A bit unrelated, but should intel_guc_reset be intel_reset_guc instead?
>>> Here we're trying to reset the microcontroller, not asking guc to do a
>>> reset.
>>
>> Ping?
>> Anyone unlucky enough to be using GuC submission should be seeing this
>> warning when the firmware has to be reloaded (for example after any i-g-t
>> hang test).
>>
>> I still think the function should be renamed to _reset_guc though, since
>> it's the hw resetting the guc, not the other way around.
>>
>> Acked-by: Michel Thierry
>
> Thanks!
>
> Now just exercise restraint in suggesting bikesheds and if someone can
> provide an r-b we could merge this. ;)
>
> (To be read as - let's leave the renaming for a follow up work since this
> fix is not to blame for the objectionable name.)
>
> Regards,

_Invoking GuC experts_

Agreed, and since I'm the one that will tell the guc to perform a reset,
I can include the bikeshed in my patches.
Re: [Intel-gfx] [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions
On 26/04/17 01:44 AM, Christoph Hellwig wrote: > I think we'll at least need a draft of those to make sense of these > patches. Otherwise they just look very clumsy. Ok, what follows is a draft patch attempting to show where I'm thinking of going with this. Obviously it will not compile because it assumes the users throughout the kernel are a bit different than they are today. Notably, there is no sg_page anymore. There's also likely a ton of issues and arguments to have over a bunch of the specifics below and I'd expect the concept to evolve more as cleanup occurs. This itself is an evolution of the draft I posted replying to you in my last RFC thread. Also, before any of this is truly useful to us, pfn_t would have to infect a few other places in the kernel. Thanks, Logan diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h index fad170b..85ef928 100644 --- a/include/linux/scatterlist.h +++ b/include/linux/scatterlist.h @@ -6,13 +6,14 @@ #include #include #include +#include #include struct scatterlist { #ifdef CONFIG_DEBUG_SG unsigned long sg_magic; #endif - unsigned long page_link; + pfn_t pfn; unsigned intoffset; unsigned intlength; dma_addr_t dma_address; @@ -60,15 +61,68 @@ struct sg_table { #define SG_MAGIC 0x87654321 -/* - * We overload the LSB of the page pointer to indicate whether it's - * a valid sg entry, or whether it points to the start of a new scatterlist. - * Those low bits are there for everyone! 
(thanks mason :-) - */ -#define sg_is_chain(sg)((sg)->page_link & 0x01) -#define sg_is_last(sg) ((sg)->page_link & 0x02) -#define sg_chain_ptr(sg) \ - ((struct scatterlist *) ((sg)->page_link & ~0x03)) +static inline bool sg_is_chain(struct scatterlist *sg) +{ + return sg->pfn.val & PFN_SG_CHAIN; +} + +static inline bool sg_is_last(struct scatterlist *sg) +{ + return sg->pfn.val & PFN_SG_LAST; +} + +static inline struct scatterlist *sg_chain_ptr(struct scatterlist *sg) +{ + unsigned long sgl = pfn_t_to_pfn(sg->pfn); + return (struct scatterlist *)(sgl << PAGE_SHIFT); +} + +static inline bool sg_is_iomem(struct scatterlist *sg) +{ + return pfn_t_is_iomem(sg->pfn); +} + +/** + * sg_assign_pfn - Assign a given pfn_t to an SG entry + * @sg:SG entry + * @pfn: The pfn + * + * Description: + * Assign a pfn to sg entry. Also see sg_set_pfn(), the most commonly used + * variant.w + * + **/ +static inline void sg_assign_pfn(struct scatterlist *sg, pfn_t pfn) +{ +#ifdef CONFIG_DEBUG_SG + BUG_ON(sg->sg_magic != SG_MAGIC); + BUG_ON(sg_is_chain(sg)); + BUG_ON(pfn.val & (PFN_SG_CHAIN | PFN_SG_LAST)); +#endif + + sg->pfn = pfn; +} + +/** + * sg_set_pfn - Set sg entry to point at given pfn + * @sg: SG entry + * @pfn:The page + * @len:Length of data + * @offset: Offset into page + * + * Description: + * Use this function to set an sg entry pointing at a pfn, never assign + * the page directly. We encode sg table information in the lower bits + * of the page pointer. See sg_pfn_t for looking up the pfn_t belonging + * to an sg entry. 
+ **/ +static inline void sg_set_pfn(struct scatterlist *sg, pfn_t pfn, + unsigned int len, unsigned int offset) +{ + sg_assign_pfn(sg, pfn); + sg->offset = offset; + sg->length = len; +} /** * sg_assign_page - Assign a given page to an SG entry @@ -82,18 +136,13 @@ struct sg_table { **/ static inline void sg_assign_page(struct scatterlist *sg, struct page *page) { - unsigned long page_link = sg->page_link & 0x3; + if (!page) { + pfn_t null_pfn = {0}; + sg_assign_pfn(sg, null_pfn); + return; + } - /* -* In order for the low bit stealing approach to work, pages -* must be aligned at a 32-bit boundary as a minimum. -*/ - BUG_ON((unsigned long) page & 0x03); -#ifdef CONFIG_DEBUG_SG - BUG_ON(sg->sg_magic != SG_MAGIC); - BUG_ON(sg_is_chain(sg)); -#endif - sg->page_link = page_link | (unsigned long) page; + sg_assign_pfn(sg, page_to_pfn_t(page)); } /** @@ -106,8 +155,7 @@ static inline void sg_assign_page(struct scatterlist *sg, struct page *page) * Description: * Use this function to set an sg entry pointing at a page, never assign * the page directly. We encode sg table information in the lower bits - * of the page pointer. See sg_page() for looking up the page belonging - * to an sg entry. + * of the page pointer. * **/ static inline void sg_set_page(struct scatterlist *sg, struct page *page, @@ -118,13 +166,53 @@ static inline void sg_set_page(struct scatterlist *sg, struct page *page, sg->length = len; } -static inline struct page *sg_page(struct scatterlist *sg) +/** + * sg_pfn_t - Return the pfn_
Re: [Intel-gfx] [PATCH v2 15/21] xen-blkfront: Make use of the new sg_map helper function
On 26/04/17 01:37 AM, Roger Pau Monné wrote:
> On Tue, Apr 25, 2017 at 12:21:02PM -0600, Logan Gunthorpe wrote:
>> Straightforward conversion to the new helper, except due to the lack
>> of error path, we have to use SG_MAP_MUST_NOT_FAIL which may BUG_ON in
>> certain cases in the future.
>>
>> Signed-off-by: Logan Gunthorpe
>> Cc: Boris Ostrovsky
>> Cc: Juergen Gross
>> Cc: Konrad Rzeszutek Wilk
>> Cc: "Roger Pau Monné"
>> ---
>>  drivers/block/xen-blkfront.c | 20 +++-
>>  1 file changed, 11 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
>> index 3945963..ed62175 100644
>> --- a/drivers/block/xen-blkfront.c
>> +++ b/drivers/block/xen-blkfront.c
>> @@ -816,8 +816,9 @@ static int blkif_queue_rw_req(struct request *req,
>> struct blkfront_ring_info *ri
>>  		BUG_ON(sg->offset + sg->length > PAGE_SIZE);
>>
>>  		if (setup.need_copy) {
>> -			setup.bvec_off = sg->offset;
>> -			setup.bvec_data = kmap_atomic(sg_page(sg));
>> +			setup.bvec_off = 0;
>> +			setup.bvec_data = sg_map(sg, 0, SG_KMAP_ATOMIC |
>> +						 SG_MAP_MUST_NOT_FAIL);
>
> I assume that sg_map already adds sg->offset to the address?

Correct.

> Also wondering whether we can get rid of bvec_off and just increment
> bvec_data,
> adding Julien who IIRC added this code.

bvec_off is used to keep track of the offset within the current mapping
so it's not a great idea given that you'd want to kunmap_atomic the
original address and not something with an offset. It would be nice if
this could be converted to use the sg_miter interface but that's a much
more invasive change that would require someone who knows this code and
can properly test it. I'd be very grateful if someone actually took that
on.

Logan
Re: [Intel-gfx] [PATCH v9] drm/i915: Squash repeated awaits on the same fence
On Thu, Apr 27, 2017 at 06:25:47PM +0100, Chris Wilson wrote:
> On Thu, Apr 27, 2017 at 05:47:32PM +0100, Tvrtko Ursulin wrote:
> > >+int intel_timeline_sync_set(struct intel_timeline *tl, u64 id, u32 seqno)
> > >+{
> > >+	struct intel_timeline_sync *p = tl->sync;
> > >+
> > >+	/* We expect to be called in sequence following a _get(id), which
> > >+	 * should have preloaded the tl->sync hint for us.
> > >+	 */
> > >+	if (likely(p && (id >> SHIFT) == p->prefix)) {
> > >+		unsigned int idx = id & MASK;
> > >+
> > >+		__sync_seqno(p)[idx] = seqno;
> > >+		p->bitmap |= BIT(idx);
> > >+		return 0;
> > >+	}
> > >+
> > >+	return __intel_timeline_sync_set(tl, id, seqno);
> >
> > Could pass in p and set tl->sync = p at this level. That would
> > decouple the algorithm from the timeline better. With equivalent
> > treatment for the query, and renaming of struct intel_timeline_sync,
> > algorithm would be ready for moving out of drm/i915/ :)
>
> I really did want to keep this as a tail call to keep the fast path neat
> and tidy with minimal stack manipulation.

Happier with

__intel_timeline_sync_set(struct intel_timeline_sync **root,
			  u64 id, u32 seqno)
{
	struct intel_timeline_sync *p = *root;
	...
	*root = p;
	return 0;
}

	return __intel_timeline_sync_set(&tl->sync, id, seqno);

A little step towards abstraction. Works equally well for
intel_timeline_sync_is_later().

Hmm. i915_seqmap.c ? Too cryptic?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH v9] drm/i915: Squash repeated awaits on the same fence
On Thu, Apr 27, 2017 at 09:34:10PM +0100, Chris Wilson wrote:
> On Thu, Apr 27, 2017 at 06:25:47PM +0100, Chris Wilson wrote:
> > On Thu, Apr 27, 2017 at 05:47:32PM +0100, Tvrtko Ursulin wrote:
> > > >+int intel_timeline_sync_set(struct intel_timeline *tl, u64 id, u32 seqno)
> > > >+{
> > > >+	struct intel_timeline_sync *p = tl->sync;
> > > >+
> > > >+	/* We expect to be called in sequence following a _get(id), which
> > > >+	 * should have preloaded the tl->sync hint for us.
> > > >+	 */
> > > >+	if (likely(p && (id >> SHIFT) == p->prefix)) {
> > > >+		unsigned int idx = id & MASK;
> > > >+
> > > >+		__sync_seqno(p)[idx] = seqno;
> > > >+		p->bitmap |= BIT(idx);
> > > >+		return 0;
> > > >+	}
> > > >+
> > > >+	return __intel_timeline_sync_set(tl, id, seqno);
> > >
> > > Could pass in p and set tl->sync = p at this level. That would
> > > decouple the algorithm from the timeline better. With equivalent
> > > treatment for the query, and renaming of struct intel_timeline_sync,
> > > algorithm would be ready for moving out of drm/i915/ :)
> >
> > I really did want to keep this as a tail call to keep the fast path neat
> > and tidy with minimal stack manipulation.
>
> Happier with
>
> __intel_timeline_sync_set(struct intel_timeline_sync **root,
> 			  u64 id, u32 seqno)
> {
> 	struct intel_timeline_sync *p = *root;
> 	...
> 	*root = p;
> 	return 0;
> }
>
> 	return __intel_timeline_sync_set(&tl->sync, id, seqno);
>
> A little step towards abstraction. Works equally well for
> intel_timeline_sync_is_later().
>
> Hmm. i915_seqmap.c ? Too cryptic?

Went with i915_syncmap (struct, .c, .h). There's some knowledge of seqno
built in (i.e. the is_later function).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
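The shape proposed in this thread (an inlined fast path on the cached leaf, with the slow path taking a pointer to the root so it can update the hint itself and still be a tail call) can be modelled in userspace C. This is a rough sketch only, under loud assumptions: it keeps a single leaf instead of a tree, replaces the leaf on a prefix miss rather than linking it under a common parent, and the names syncmap_set(), syncmap_set_slow() and syncmap_is_later() are invented for the example. The wrap-safe comparison mirrors the usual signed-difference idiom for u32 seqnos.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NSYNC 16
#define SHIFT 4			/* ilog2(NSYNC) */
#define MASK (NSYNC - 1)

/* Single-leaf stand-in for the real multi-level structure. */
struct syncmap {
	uint64_t prefix;		/* id >> SHIFT for every slot in this leaf */
	unsigned int bitmap;		/* which slots hold a valid seqno */
	uint32_t seqno[NSYNC];
};

/* Wrap-safe ordering: true if a is at or after b. */
static int seqno_passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/*
 * Slow path. Takes the root by reference so that the caller can simply
 * "return syncmap_set_slow(root, ...)" without touching the hint again.
 * Simplification: the old leaf is discarded instead of being kept in a tree.
 */
static int syncmap_set_slow(struct syncmap **root, uint64_t id, uint32_t seqno)
{
	struct syncmap *p = calloc(1, sizeof(*p));
	unsigned int idx = id & MASK;

	if (!p)
		return -1;

	p->prefix = id >> SHIFT;
	p->seqno[idx] = seqno;
	p->bitmap = 1u << idx;

	free(*root);
	*root = p;
	return 0;
}

/* Fast path: hit on the cached leaf, otherwise tail call into the slow path. */
static int syncmap_set(struct syncmap **root, uint64_t id, uint32_t seqno)
{
	struct syncmap *p = *root;

	if (p && (id >> SHIFT) == p->prefix) {
		unsigned int idx = id & MASK;

		p->seqno[idx] = seqno;
		p->bitmap |= 1u << idx;
		return 0;
	}

	return syncmap_set_slow(root, id, seqno);
}

/* Query: is the recorded seqno for id at or after the one asked about? */
static int syncmap_is_later(struct syncmap **root, uint64_t id, uint32_t seqno)
{
	struct syncmap *p = *root;
	unsigned int idx = id & MASK;

	if (!p || (id >> SHIFT) != p->prefix)
		return 0;
	if (!(p->bitmap & (1u << idx)))
		return 0;

	return seqno_passed(p->seqno[idx], seqno);
}
```

A caller holding a `struct syncmap *hint` would pass `&hint` to all three functions, which is the decoupling from the timeline that the review asked for.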
Re: [Intel-gfx] [PATCH v2 15/21] xen-blkfront: Make use of the new sg_map helper function
On 27/04/17 02:53 PM, Jason Gunthorpe wrote:
> blkfront is one of the drivers I looked at, and it appears to only be
> memcpying with the bvec_data pointer, so I wonder why it does not use
> sg_copy_X_buffer instead..

Yes, sort of...

But you'd potentially end up calling sg_copy_to_buffer multiple times per
page within the sg (given that gnttab_foreach_grant_in_range might call
blkif_copy_from_grant/blkif_setup_rw_req_grant multiple times). Even
calling sg_copy_to_buffer once per page seems rather inefficient as it
uses sg_miter internally.

Switching the for_each_sg to sg_miter is probably the nicer solution as
it takes care of the mapping and the offset/length accounting for you
and will have similar performance.

But, yes, if performance is not an issue, switching it to
sg_copy_to_buffer would be a less invasive change than sg_miter. The same
could be said of a lot of these cases.

Unfortunately, changing from kmap_atomic (which is a null operation in a
lot of cases) to sg_copy_X_buffer is a pretty big performance hit.

Logan
Re: [Intel-gfx] [PATCH v2 15/21] xen-blkfront: Make use of the new sg_map helper function
On 27/04/17 04:11 PM, Jason Gunthorpe wrote:
> On Thu, Apr 27, 2017 at 03:53:37PM -0600, Logan Gunthorpe wrote:
> Well, that is in the current form, with more users it would make sense
> to optimize for the single page case, eg by providing the existing
> call, providing a faster single-page-only variant of the copy, perhaps
> even one that is inlined.

Ok, does it make sense then to have an sg_copy_page_to_buffer (or some
such... I'm having trouble thinking of a sane name that isn't too long).
That just does k(un)map_atomic and memcpy? I could try that if it makes
sense to people.

>> Switching the for_each_sg to sg_miter is probably the nicer solution as
>> it takes care of the mapping and the offset/length accounting for you
>> and will have similar performance.
>
> sg_miter will still fail when the sg contains __iomem, however I would
> expect that the sg_copy will work with iomem, by using the __iomem
> memcpy variant.

Yes, that's true. Any sg_miters that ever see iomem will need to be
converted to support it. This isn't much different than the other
kmap(sg_page()) users I was converting that will also fail if they see
iomem. Though, I suspect an sg_miter user would be easier to convert to
iomem than a random kmap user.

Logan
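As a rough userspace model of the single-page helper floated above ("just does k(un)map_atomic and memcpy"), something like the following could serve as a starting point for discussion. Everything here is hypothetical: struct model_sg, model_map(), model_unmap() and model_sg_copy_page_to_buffer() are invented stand-ins, not kernel API, and the "mapping" is the identity so the code runs outside the kernel. It does not attempt the __iomem case raised in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for one scatterlist entry backed by a single page. */
struct model_sg {
	unsigned char *page;	/* base of the backing page */
	unsigned int offset;	/* start of the data within the page */
	unsigned int length;	/* number of valid bytes */
};

/* Stand-ins for kmap_atomic()/kunmap_atomic(); identity in this model. */
static void *model_map(unsigned char *page)
{
	return page;
}

static void model_unmap(void *addr)
{
	(void)addr;
}

/*
 * Copy up to buflen bytes of a single-page entry into buf.
 * Returns the number of bytes copied, or 0 if the entry spills past the page.
 */
static size_t model_sg_copy_page_to_buffer(const struct model_sg *sg,
					   void *buf, size_t buflen)
{
	size_t n = sg->length < buflen ? sg->length : buflen;
	unsigned char *base;

	if ((size_t)sg->offset + sg->length > MODEL_PAGE_SIZE)
		return 0;

	base = model_map(sg->page);
	memcpy(buf, base + sg->offset, n);
	/* Unmap the address that was mapped, not base + offset. */
	model_unmap(base);

	return n;
}
```

Keeping the mapped base and the offset separate is the same reason bvec_off exists in blkfront: the unmap has to see the original address.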
[Intel-gfx] [PATCH v7 12/20] drm/i915/guc: Provide register list to be saved/restored during engine reset
From: Arun Siluvery

GuC expects a list of registers from the driver which are saved/restored
during engine reset. The type of value to be saved is controlled by
flags. We provide a minimal set of registers that we want GuC to save and
restore. This is not an issue in case of engine reset as driver
initializes most of them following an engine reset, but in case of media
reset (aka watchdog reset) which is completely internal to GuC (including
resubmission of hung workload), it is necessary to provide this list,
otherwise GuC won't be able to schedule further workloads after a reset.
This is the minimal set of registers identified for things to work as
expected but if we see any new issues, this register list can be expanded.

In order to not lose any existing workarounds, we have to let GuC know
the registers and their values. These will be reapplied after the reset.
Note that we can't just read the current value because most of these
registers are masked (so we have a workaround for a workaround for a
workaround).

v2: REGSET_MASKED is too difficult for GuC, use REGSET_SAVE_DEFAULT_VALUE
and current value from RING_MODE reg instead; no need to preserve
head/tail either, be extra paranoid and save whitelisted registers
(Daniele).

v3: Workarounds added only once during _init_workarounds also have to be
restored, or we risk losing them after internal GuC reset (Daniele).

v4: Rename macro used to keep track of the workaround registers we will
have to restore after reset (s/I915_GUC_REG_WRITE/WA_REG_WR_GUC_RESTORE).
Cc: Daniele Ceraolo Spurio Signed-off-by: Arun Siluvery Signed-off-by: Jeff McGee Signed-off-by: Michel Thierry --- drivers/gpu/drm/i915/i915_drv.h| 3 ++ drivers/gpu/drm/i915/i915_guc_submission.c | 68 +- drivers/gpu/drm/i915/intel_engine_cs.c | 65 +++- 3 files changed, 114 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index b00ea523a634..c9ff7f726d47 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1913,7 +1913,10 @@ struct i915_wa_reg { struct i915_workarounds { struct i915_wa_reg reg[I915_MAX_WA_REGS]; + /* list of registers (and their values) that GuC will have to restore */ + struct i915_wa_reg guc_reg[GUC_REGSET_MAX_REGISTERS]; u32 count; + u32 guc_count; u32 hw_whitelist_count[I915_NUM_ENGINES]; }; diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 2cfe5d3b7795..4d1784c84fd4 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -1001,6 +1001,24 @@ static void guc_policies_init(struct guc_policies *policies) policies->is_valid = 1; } +/* + * In this macro it is highly unlikely to exceed max value but even if we did + * it is not an error so just throw a warning and continue. Only side effect + * in continuing further means some registers won't be added to save/restore + * list. 
+ */ +#define GUC_ADD_MMIO_REG_ADS(node, reg_addr, _flags, defvalue) \ + do {\ + u32 __count = node->number_of_registers;\ + if (WARN_ON(__count >= GUC_REGSET_MAX_REGISTERS)) \ + continue; \ + node->registers[__count].offset = reg_addr.reg; \ + node->registers[__count].flags = (_flags); \ + if (defvalue) \ + node->registers[__count].value = (defvalue);\ + node->number_of_registers++;\ + } while (0) + static int guc_ads_create(struct intel_guc *guc) { struct drm_i915_private *dev_priv = guc_to_i915(guc); @@ -1014,6 +1032,7 @@ static int guc_ads_create(struct intel_guc *guc) u8 reg_state_buffer[GUC_S3_SAVE_SPACE_PAGES * PAGE_SIZE]; } __packed *blob; struct intel_engine_cs *engine; + struct i915_workarounds *workarounds = &dev_priv->workarounds; enum intel_engine_id id; u32 base; @@ -1033,6 +1052,47 @@ static int guc_ads_create(struct intel_guc *guc) /* MMIO reg state */ for_each_engine(engine, dev_priv, id) { + u32 i; + struct guc_mmio_regset *eng_reg = + &blob->reg_state.engine_reg[engine->guc_id]; + + /* +* Provide a list of registers to be saved/restored during gpu +* reset. This is mainly required for Media reset (aka watchdog +* timeout) which is completely under the control of GuC +* (resubmission of hung workload is handled inside GuC). +*/ + GUC_ADD_MMIO_REG_ADS(eng_reg, R
[Intel-gfx] [PATCH v7 04/20] drm/i915: Skip reset request if there is one already
From: Mika Kuoppala

To perform engine reset we first disable engine to capture its state. This is
done by issuing a reset request. Because we are reusing existing
infrastructure, again when we actually reset an engine, reset function checks
engine mask and issues reset request again which is unnecessary. To avoid this
we check if the engine is already prepared, if so we just exit from that point.

Cc: Chris Wilson
Signed-off-by: Mika Kuoppala
Signed-off-by: Arun Siluvery
Signed-off-by: Michel Thierry
---
 drivers/gpu/drm/i915/intel_uncore.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 3ebba6b2dd74..120fb440bb8b 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1686,10 +1686,15 @@ int intel_wait_for_register(struct drm_i915_private *dev_priv,
 static int gen8_reset_engine_start(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
+	const i915_reg_t reset_ctrl = RING_RESET_CTL(engine->mmio_base);
+	const u32 ready = RESET_CTL_REQUEST_RESET | RESET_CTL_READY_TO_RESET;
 	int ret;
 
-	I915_WRITE_FW(RING_RESET_CTL(engine->mmio_base),
-		      _MASKED_BIT_ENABLE(RESET_CTL_REQUEST_RESET));
+	/* If engine has been already prepared, we can shortcut here */
+	if ((I915_READ_FW(reset_ctrl) & ready) == ready)
+		return 0;
+
+	I915_WRITE_FW(reset_ctrl, _MASKED_BIT_ENABLE(RESET_CTL_REQUEST_RESET));
 
 	ret = intel_wait_for_register_fw(dev_priv,
 					 RING_RESET_CTL(engine->mmio_base),
-- 
2.11.0
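The shortcut in this patch (skip the request if both the request and ready bits are already set) can be exercised against a fake register in userspace. This is a sketch under assumptions, not driver code: struct fake_engine, fake_write() and reset_engine_start() are invented, the bit positions for the two RESET_CTL flags are assumed to be bits 0 and 1, the masked-write helper assumes the usual "mask in the upper 16 bits" convention, and the fake hardware acknowledges a request immediately instead of being polled.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout for the sketch. */
#define RESET_CTL_REQUEST_RESET		(1u << 0)
#define RESET_CTL_READY_TO_RESET	(1u << 1)

/* Masked write: upper 16 bits select which lower bits are updated. */
#define MASKED_BIT_ENABLE(a)	(((a) << 16) | (a))

/* Fake engine: one register plus a counter of writes issued to it. */
struct fake_engine {
	uint32_t reset_ctl;
	unsigned int writes;
};

static void fake_write(struct fake_engine *e, uint32_t val)
{
	uint32_t mask = val >> 16;

	e->reset_ctl = (e->reset_ctl & ~mask) | (val & mask);

	/* Model only: the hardware acknowledges the request at once. */
	if (e->reset_ctl & RESET_CTL_REQUEST_RESET)
		e->reset_ctl |= RESET_CTL_READY_TO_RESET;

	e->writes++;
}

/* Returns 0 once the engine is ready to be reset, -1 otherwise. */
static int reset_engine_start(struct fake_engine *e)
{
	const uint32_t ready =
		RESET_CTL_REQUEST_RESET | RESET_CTL_READY_TO_RESET;

	/* If the engine has already been prepared, do not request again. */
	if ((e->reset_ctl & ready) == ready)
		return 0;

	fake_write(e, MASKED_BIT_ENABLE(RESET_CTL_REQUEST_RESET));

	return (e->reset_ctl & RESET_CTL_READY_TO_RESET) ? 0 : -1;
}
```

Calling reset_engine_start() twice on the same fake engine issues only one write, which is the behaviour the commit message describes for the disable-then-reset sequence.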
[Intel-gfx] [PATCH v7 02/20] drm/i915: Modify error handler for per engine hang recovery
From: Arun Siluvery

This is a preparatory patch which modifies error handler to do per engine
hang recovery. The actual patch which implements this sequence follows
later in the series. The aim is to prepare existing recovery function to
adapt to this new function where applicable (which fails at this point
because core implementation is lacking) and continue recovery using legacy
full gpu reset.

A helper function is also added to query the availability of engine reset.

The error events that are used to notify the user of a reset are adapted
to engine reset such that it doesn't break users listening to these
events. In legacy we report an error event, a reset event before resetting
the gpu and a reset done event marking the completion of reset. The same
behaviour is adapted but reset event is only dispatched once even when
multiple engines are hung. Finally once reset is complete we send reset
done event as usual.

Note that this implementation of engine reset is for i915 directly
submitting to the ELSP, where the driver manages the hang detection,
recovery and resubmission. With GuC submission these tasks are shared
between driver and firmware; i915 will still be responsible for detecting
a hang, and when it does it will have to request GuC to reset that Engine
and remind the firmware about the outstanding submissions. This will be
added in a different patch.

v2: rebase, advertise engine reset availability in platform definition,
add note about GuC submission.

v3: s/*engine_reset*/*reset_engine*/. (Chris)
Handle reset as 2 level resets, by first going to engine only and falling
back to full/chip reset as needed, i.e. reset_engine will need the
struct_mutex.

v4: Pass the engine mask to i915_reset. (Chris)

v5: Rebase, update selftests.

v6: Rebase, prepare for mutex-less reset engine.

v7: Pass reset_engine mask as a function parameter, and iterate over the
engine mask for reset_engine.
(Chris) Cc: Chris Wilson Cc: Mika Kuoppala Signed-off-by: Ian Lister Signed-off-by: Tomas Elf Signed-off-by: Arun Siluvery Signed-off-by: Michel Thierry --- drivers/gpu/drm/i915/i915_drv.c | 15 +++ drivers/gpu/drm/i915/i915_drv.h | 3 +++ drivers/gpu/drm/i915/i915_irq.c | 33 ++--- drivers/gpu/drm/i915/i915_pci.c | 5 - drivers/gpu/drm/i915/intel_uncore.c | 11 +++ 5 files changed, 63 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index c7d68e789642..48c8b69d9bde 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1800,6 +1800,8 @@ void i915_reset(struct drm_i915_private *dev_priv) if (!test_bit(I915_RESET_HANDOFF, &error->flags)) return; + DRM_DEBUG_DRIVER("resetting chip\n"); + /* Clear any previous failed attempts at recovery. Time to try again. */ if (!i915_gem_unset_wedged(dev_priv)) goto wakeup; @@ -1863,6 +1865,19 @@ void i915_reset(struct drm_i915_private *dev_priv) goto finish; } +/** + * i915_reset_engine - reset GPU engine to recover from a hang + * @engine: engine to reset + * + * Reset a specific GPU engine. Useful if a hang is detected. + * Returns zero on successful reset or otherwise an error code. 
+ */ +int i915_reset_engine(struct intel_engine_cs *engine) +{ + /* FIXME: replace me with engine reset sequence */ + return -ENODEV; +} + static int i915_pm_suspend(struct device *kdev) { struct pci_dev *pdev = to_pci_dev(kdev); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index e06af46f5a57..ab7e68626c49 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -814,6 +814,7 @@ struct intel_csr { func(has_ddi); \ func(has_decoupled_mmio); \ func(has_dp_mst); \ + func(has_reset_engine); \ func(has_fbc); \ func(has_fpga_dbg); \ func(has_full_ppgtt); \ @@ -3019,6 +3020,8 @@ extern void i915_driver_unload(struct drm_device *dev); extern int intel_gpu_reset(struct drm_i915_private *dev_priv, u32 engine_mask); extern bool intel_has_gpu_reset(struct drm_i915_private *dev_priv); extern void i915_reset(struct drm_i915_private *dev_priv); +extern int i915_reset_engine(struct intel_engine_cs *engine); +extern bool intel_has_reset_engine(struct drm_i915_private *dev_priv); extern int intel_guc_reset(struct drm_i915_private *dev_priv); extern void intel_engine_init_hangcheck(struct intel_engine_cs *engine); extern void intel_hangcheck_init(struct drm_i915_private *dev_priv); diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index fd97fe00cd0d..3a59ef1367ec 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -2635,11 +2635,13 @@ static irqreturn_t gen8_irq_handler(int irq, void *arg) /** * i915_reset_and_wakeup - do process context error handling work * @dev_priv: i915
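
As a reading aid for the commit message of this patch, here is a minimal user-space sketch of the two-level recovery it prepares for: try a per-engine reset for each hung engine, fall back to a full GPU reset if any of them fails, and send the reset event only once. It is not the driver code. `struct recovery_stats`, `recover_from_hang()`, `reset_engine_ok()` and the treatment of an empty mask are assumptions made for illustration; only the always-failing stub mirrors the `-ENODEV` placeholder that the patch adds.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical bookkeeping for this sketch; not i915 data structures. */
struct recovery_stats {
	unsigned int reset_events;	/* "reset" notifications sent */
	unsigned int done_events;	/* "reset done" notifications sent */
	unsigned int engine_resets;	/* engines recovered individually */
	unsigned int full_resets;	/* fallbacks to a full GPU reset */
};

typedef int (*reset_engine_fn)(unsigned int engine_id);

/* Same behaviour as the placeholder added by the patch: always fails. */
int reset_engine_stub(unsigned int engine_id)
{
	(void)engine_id;
	return -ENODEV;
}

/* Hypothetical engine reset that always succeeds, for comparison. */
int reset_engine_ok(unsigned int engine_id)
{
	(void)engine_id;
	return 0;
}

/*
 * Walk the mask of hung engines and try to reset each one. If any
 * engine reset fails, stop and fall back to a single full GPU reset.
 * The reset event is counted once, however many engines are hung.
 * An empty mask simply performs no reset in this sketch.
 */
void recover_from_hang(unsigned int engine_mask,
		       reset_engine_fn reset_engine,
		       struct recovery_stats *stats)
{
	unsigned int id;
	int need_full_reset = 0;

	assert(stats != NULL && reset_engine != NULL);

	stats->reset_events++;	/* one event even for several hung engines */

	for (id = 0; id < 8 * sizeof(engine_mask); id++) {
		if (!(engine_mask & (1u << id)))
			continue;
		if (reset_engine(id) != 0) {
			need_full_reset = 1;
			break;
		}
		stats->engine_resets++;
	}

	if (need_full_reset)
		stats->full_resets++;	/* legacy path recovers everything */

	stats->done_events++;	/* completion is reported as before */
}
```

With the stub, `recover_from_hang(0x5, reset_engine_stub, &stats)` on zeroed stats ends with one full reset, no engine resets, and one reset and one reset done event, which is the behaviour the commit message describes while the engine reset sequence is still missing.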
[Intel-gfx] [PATCH v7 07/20] drm/i915: Export per-engine reset count info to debugfs
From: Arun Siluvery

A new debugfs entry is added to export the reset counts; this includes
the full GPU reset count and the per-engine reset counts. This is useful
for tests that are expected to trigger a reset: the counts can be checked
before and after the test to confirm that the reset happened.

v2: Include reset engine count in i915_engine_info too (Chris).

Cc: Chris Wilson
Cc: Mika Kuoppala
Signed-off-by: Arun Siluvery
Signed-off-by: Michel Thierry
---
 drivers/gpu/drm/i915/i915_debugfs.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 870c470177b5..6444c1a9bd22 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1403,6 +1403,23 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
 	return 0;
 }

+static int i915_reset_info(struct seq_file *m, void *unused)
+{
+	struct drm_i915_private *dev_priv = node_to_i915(m->private);
+	struct i915_gpu_error *error = &dev_priv->gpu_error;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	seq_printf(m, "full gpu reset = %u\n", i915_reset_count(error));
+
+	for_each_engine(engine, dev_priv, id) {
+		seq_printf(m, "%s = %u\n", engine->name,
+			   i915_reset_engine_count(error, engine));
+	}
+
+	return 0;
+}
+
 static int ironlake_drpc_info(struct seq_file *m)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
@@ -3242,6 +3259,7 @@ static int i915_display_info(struct seq_file *m, void *unused)
 static int i915_engine_info(struct seq_file *m, void *unused)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
+	struct i915_gpu_error *error = &dev_priv->gpu_error;
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;

@@ -3265,6 +3283,8 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 			   engine->hangcheck.seqno,
 			   jiffies_to_msecs(jiffies - engine->hangcheck.action_timestamp),
 			   engine->timeline->inflight_seqnos);
+		seq_printf(m, "\tReset count: %d\n",
+			   i915_reset_engine_count(error, engine));

 		rcu_read_lock();

@@ -4777,6 +4797,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_huc_load_status", i915_huc_load_status_info, 0},
 	{"i915_frequency_info", i915_frequency_info, 0},
 	{"i915_hangcheck_info", i915_hangcheck_info, 0},
+	{"i915_reset_info", i915_reset_info, 0},
 	{"i915_drpc_info", i915_drpc_info, 0},
 	{"i915_emon_status", i915_emon_status, 0},
 	{"i915_ring_freq_table", i915_ring_freq_table, 0},
--
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
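
The commit message says tests can compare the counts before and after triggering a reset. A minimal sketch of such a check follows, assuming the test has already read two snapshots of the i915_reset_info output into strings. The helper names `reset_count_lookup()` and `reset_count_advanced()` are hypothetical and not part of any existing test suite; the only thing taken from the patch is the "<name> = <count>" line format produced by its `seq_printf()` calls.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up the counter for `name` in text formatted like the
 * i915_reset_info output of this patch: one "<name> = <count>" per line.
 * Returns 0 and stores the value in *count on success, -1 if not found.
 */
int reset_count_lookup(const char *text, const char *name,
		       unsigned int *count)
{
	size_t name_len;
	const char *line = text;

	assert(text != NULL && name != NULL && count != NULL);
	name_len = strlen(name);

	while (line != NULL && *line != '\0') {
		const char *eol = strchr(line, '\n');

		/* The name must start the line and be followed by " = N". */
		if (strncmp(line, name, name_len) == 0 &&
		    sscanf(line + name_len, " = %u", count) == 1)
			return 0;

		line = eol ? eol + 1 : NULL;
	}

	return -1;
}

/*
 * Returns 1 if the counter `name` grew by exactly `expected` between the
 * two snapshots, 0 otherwise (including when the name is missing).
 */
int reset_count_advanced(const char *before, const char *after,
			 const char *name, unsigned int expected)
{
	unsigned int b, a;

	if (reset_count_lookup(before, name, &b) != 0 ||
	    reset_count_lookup(after, name, &a) != 0)
		return 0;

	return a - b == expected;
}
```

A test would take one snapshot, trigger the hang, take a second snapshot, and then call `reset_count_advanced()` for the engine it targeted. The engine names that appear in the file come from `engine->name` in the driver, so any name used by a caller should be taken from the actual output rather than hard-coded.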
[Intel-gfx] [PATCH v7 00/20] Gen8+ engine-reset
These patches add the reset-engine feature from Gen8. This is also
referred to as Timeout Detection and Recovery (TDR). It complements the
full GPU reset feature available in i915, but resets only a particular
engine instead of all engines, providing a lightweight engine reset and
recovery mechanism.

Thanks to recently merged changes, this implementation now covers not
only execlists but GuC-based submission too; it is still limited to Gen8
onwards. I have also included the changes for watchdog timeout detection.
The GuC-related patches are functional, but can be seen as RFC.

Timeout detection relies on the existing hangcheck, which remains the
same; the main changes are to the recovery mechanism. Once we detect a
hang on a particular engine, we identify the request that caused the
hang, skip the request and adjust head pointers to allow execution to
proceed normally. After some cleanup, submissions are restarted to
process the remaining work queued to that engine. If engine reset fails
to recover the engine correctly, we fall back to full GPU reset.

We can argue about the effectiveness of reset-engine vs full reset when
more than one ring is hung, but the benefits of just resetting one engine
are reduced when the driver has to do it multiple times.

v2: ELSP queue request tracking and reset path changes to handle
    incomplete requests during reset. Thanks to Chris Wilson for
    providing these patches.
v3: Let the waiter keep handling the full GPU reset if it already has
    the lock; point out that GuC submission needs a different method to
    restart workloads after the engine reset completes.
v4: Handle reset as 2 level resets, by first going to engine only and
    falling back to full/chip reset as needed, i.e. reset_engine will
    need the struct_mutex.
v5: Rebased after reset flag split in 2, add GuC support, include
    watchdog detection patches, addressing comments from prev RFC.
v6: Mutex-less reset engine. Updates in watchdog abi and guc whitelist &
    register-restore fixes (including an old patch from Daniele).
v7: Removed leftovers from v5; review comments; ability to cancel the
    reset if there's no active request.

Cc: Chris Wilson
Cc: Mika Kuoppala
Cc: Daniele Ceraolo Spurio

Arun Siluvery (7):
  drm/i915: Update i915.reset to handle engine resets
  drm/i915: Modify error handler for per engine hang recovery
  drm/i915: Add support for per engine reset recovery
  drm/i915: Add engine reset count to error state
  drm/i915: Export per-engine reset count info to debugfs
  drm/i915: Enable Engine reset and recovery support
  drm/i915/guc: Provide register list to be saved/restored during engine reset

Daniele Ceraolo Spurio (1):
  drm/i915/guc: fix mmio whitelist mmio_start offset and add reminder

Michel Thierry (11):
  drm/i915: Cancel reset-engine if we couldn't find an active request
  drm/i915: Add engine reset count in get-reset-stats ioctl
  drm/i915/selftests: reset engine self tests
  drm/i915/guc: Rename the function that resets the GuC
  drm/i915/guc: Add support for reset engine using GuC commands
  drm/i915: Watchdog timeout: Pass GuC shared data structure during param load
  drm/i915: Watchdog timeout: IRQ handler for gen8+
  drm/i915: Watchdog timeout: Ringbuffer command emission for gen8+
  drm/i915: Watchdog timeout: DRM kernel interface to set the timeout
  drm/i915: Watchdog timeout: Include threshold value in error state
  drm/i915: Watchdog timeout: Export media reset count from GuC to debugfs

Mika Kuoppala (1):
  drm/i915: Skip reset request if there is one already

 drivers/gpu/drm/i915/i915_debugfs.c        |  43 +++
 drivers/gpu/drm/i915/i915_drv.c            | 109 +++-
 drivers/gpu/drm/i915/i915_drv.h            |  67 +-
 drivers/gpu/drm/i915/i915_gem.c            | 116 ++---
 drivers/gpu/drm/i915/i915_gem_context.c    | 109 +++-
 drivers/gpu/drm/i915/i915_gem_context.h    |   4 +
 drivers/gpu/drm/i915/i915_gem_request.c    |   2 +-
 drivers/gpu/drm/i915/i915_gpu_error.c      |  14 +-
 drivers/gpu/drm/i915/i915_guc_submission.c | 136 ++--
 drivers/gpu/drm/i915/i915_irq.c            |  45 ++-
 drivers/gpu/drm/i915/i915_params.c         |   6 +-
 drivers/gpu/drm/i915/i915_params.h         |   2 +-
 drivers/gpu/drm/i915/i915_pci.c            |   5 +-
 drivers/gpu/drm/i915/i915_reg.h            |   6 +
 drivers/gpu/drm/i915/intel_engine_cs.c     |  65 +++---
 drivers/gpu/drm/i915/intel_guc_fwif.h      |  27 +++-
 drivers/gpu/drm/i915/intel_guc_loader.c    |  11 ++
 drivers/gpu/drm/i915/intel_hangcheck.c     |  13 +-
 drivers/gpu/drm/i915/intel_lrc.c           | 155 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.h    |   8 ++
 drivers/gpu/drm/i915/intel_uc.c            |   4 +-
 drivers/gpu/drm/i915/intel_uc.h            |   3 +
 drivers/g