On Mon, Oct 30, 2017 at 05:24:10PM -0700, Nanley Chery wrote: > On Fri, Oct 06, 2017 at 04:30:47PM -0700, Anuj Phogat wrote: > > There are few other (duplicate) workarounds which have similar > > recommendations: > > WaFlushHangWhenNonPipelineStateAndMarkerStalled > > WaCSStallBefore3DSamplePattern > > WaPipeControlBefore3DStateSamplePattern > > > > WaPipeControlBefore3DStateSamplePattern has some extra recommendations if > > driver is using mid batch context restore. Ignoring it for now because We're > > not doing mid-batch context restore in Mesa. > > > > How do we know we've implemented this correctly? Does this fix or > improve any hangs we've been seeing? It would be helpful to have this > information in the commit message. > > > Cc: mesa-sta...@lists.freedesktop.org > > Cc: Jason Ekstrand <ja...@jlekstrand.net> > > Cc: Rafael Antognolli <rafael.antogno...@intel.com> > > Signed-off-by: Anuj Phogat <anuj.pho...@gmail.com> > > --- > > src/mesa/drivers/dri/i965/brw_context.h | 2 + > > src/mesa/drivers/dri/i965/brw_defines.h | 1 + > > src/mesa/drivers/dri/i965/brw_pipe_control.c | 50 > > ++++++++++++++++++++++ > > src/mesa/drivers/dri/i965/gen8_multisample_state.c | 8 ++++ > > 4 files changed, 61 insertions(+) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_context.h > > b/src/mesa/drivers/dri/i965/brw_context.h > > index 92fc16de13..f0e8d562e9 100644 > > --- a/src/mesa/drivers/dri/i965/brw_context.h > > +++ b/src/mesa/drivers/dri/i965/brw_context.h > > @@ -1647,6 +1647,8 @@ void brw_emit_post_sync_nonzero_flush(struct > > brw_context *brw); > > void brw_emit_depth_stall_flushes(struct brw_context *brw); > > void gen7_emit_vs_workaround_flush(struct brw_context *brw); > > void gen7_emit_cs_stall_flush(struct brw_context *brw); > > +void gen10_emit_wa_cs_stall_flush(struct brw_context *brw); > > +void gen10_emit_wa_lri_to_cache_mode_zero(struct brw_context *brw); > > > > /* brw_queryformat.c */ > > void brw_query_internal_format(struct gl_context *ctx, GLenum target, > > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > > b/src/mesa/drivers/dri/i965/brw_defines.h > > index 4abb790612..270cdf29db 100644 > > --- a/src/mesa/drivers/dri/i965/brw_defines.h > > +++ b/src/mesa/drivers/dri/i965/brw_defines.h > > @@ -1609,6 +1609,7 @@ enum brw_pixel_shader_coverage_mask_mode { > > #define GEN7_GPGPU_DISPATCHDIMY 0x2504 > > #define GEN7_GPGPU_DISPATCHDIMZ 0x2508 > > > > +#define GEN7_CACHE_MODE_0 0x7000 > > #define GEN7_CACHE_MODE_1 0x7004 > > > # define GEN9_FLOAT_BLEND_OPTIMIZATION_ENABLE (1 << 4) > > # define GEN8_HIZ_NP_PMA_FIX_ENABLE (1 << 11) > > diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c > > b/src/mesa/drivers/dri/i965/brw_pipe_control.c > > index 460b8f73b6..156f5c25ec 100644 > > --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c > > +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c > > @@ -278,6 +278,56 @@ gen7_emit_cs_stall_flush(struct brw_context *brw) > > brw->workaround_bo, 0, 0); > > } > > > > +static void > > +brw_flush_gpu_caches(struct brw_context *brw) { > > + brw_emit_pipe_control_flush(brw, > > + PIPE_CONTROL_CACHE_FLUSH_BITS | > > + PIPE_CONTROL_CACHE_INVALIDATE_BITS); > > +} > > + > > +/** > > + * From Gen10 Workarounds page in h/w specs: > > + * WaSampleOffsetIZ: > > + * Prior to the 3DSTATE_SAMPLE_PATTERN driver must ensure there are no > > + * markers in the pipeline by programming a PIPE_CONTROL with stall. > > + */ > > +void > > +gen10_emit_wa_cs_stall_flush(struct brw_context *brw) > > +{ > > + const struct gen_device_info *devinfo = &brw->screen->devinfo; > > + assert(devinfo->gen == 10); > > + brw_emit_pipe_control_flush(brw, > > + PIPE_CONTROL_CS_STALL | > > + PIPE_CONTROL_STALL_AT_SCOREBOARD); > > +} > > + > > +/** > > + * From Gen10 Workarounds page in h/w specs: > > + * WaSampleOffsetIZ: > > + * When 3DSTATE_SAMPLE_PATTERN is programmed, driver must then issue an > > + * MI_LOAD_REGISTER_IMM command to an offset between 0x7000 and 0x7FFF(SVL) > > + * after the command to ensure the state has been delivered prior to any > > + * command causing a marker in the pipeline. > > + */ > > +void > > +gen10_emit_wa_lri_to_cache_mode_zero(struct brw_context *brw) > > +{ > > + const struct gen_device_info *devinfo = &brw->screen->devinfo; > > + assert(devinfo->gen == 10); > > + > > + /* Before changing the value of CACHE_MODE_0 register, GFX pipeline must > > + * be idle; i.e., full flush is required. > > + */ > > + brw_flush_gpu_caches(brw); > > + > > I don't know how the flushing mechanism works completely, but I'm > guessing that doing the following would be better: > > brw_emit_mi_flush(); > brw_emit_pipe_control_flush(brw, PIPE_CONTROL_CS_STALL | > PIPE_CONTROL_STATE_CACHE_INVALIDATE); > > The first command seems to be the standard flush/invalidate/stall > everything command. It doesn't invalidate the state cache however, hence > the second command. Since the GPU must be idle, perhaps we want a CS > stall as well? > >
Sorry for not reading the discussion on this earlier. Rafael pointed out to me that what you have already does cause perform a CS stall. -Nanley > > + /* Write to CACHE_MODE_0 (0x7000) */ > > + BEGIN_BATCH(3); > > + OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2)); > > + OUT_BATCH(GEN7_CACHE_MODE_0); > > + OUT_BATCH(0); > > + ADVANCE_BATCH(); > > You could use brw_load_register_imm32() instead. > > > +} > > + > > /** > > * Emits a PIPE_CONTROL with a non-zero post-sync operation, for > > * implementing two workarounds on gen6. From section 1.4.7.1 > > diff --git a/src/mesa/drivers/dri/i965/gen8_multisample_state.c > > b/src/mesa/drivers/dri/i965/gen8_multisample_state.c > > index 7a31a5df4a..14043025b6 100644 > > --- a/src/mesa/drivers/dri/i965/gen8_multisample_state.c > > +++ b/src/mesa/drivers/dri/i965/gen8_multisample_state.c > > @@ -49,6 +49,11 @@ gen8_emit_3dstate_multisample(struct brw_context *brw, > > unsigned num_samples) > > void > > gen8_emit_3dstate_sample_pattern(struct brw_context *brw) > > { > > + const struct gen_device_info *devinfo = &brw->screen->devinfo; > > + > > + if (devinfo->gen == 10) > > + gen10_emit_wa_cs_stall_flush(brw); > > + > > BEGIN_BATCH(9); > > OUT_BATCH(_3DSTATE_SAMPLE_PATTERN << 16 | (9 - 2)); > > > > @@ -68,4 +73,7 @@ gen8_emit_3dstate_sample_pattern(struct brw_context *brw) > > /* 1x and 2x MSAA */ > > OUT_BATCH(brw_multisample_positions_1x_2x); > > ADVANCE_BATCH(); > > + > > + if (devinfo->gen == 10) > > + gen10_emit_wa_lri_to_cache_mode_zero(brw); > > } > > -- > > 2.13.5 > > > > _______________________________________________ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev