On 11/12/2014 10:39 AM, Daniel Vetter wrote: > On Wed, Nov 12, 2014 at 01:33:01AM -0800, Kenneth Graunke wrote: >> According to the documentation, we need to do a CS stall on every fourth >> PIPE_CONTROL command to avoid GPU hangs. The kernel does a CS stall >> between batches, so we only need to count the PIPE_CONTROLs in our batches. >> >> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org> > > Yeah, kernel adds the CS stall bit both to the flush right before/after > the batch so this works. The kernel also has a comment so people hopefully > check userspace assumptions when testing this. > > Reviewed-by: Daniel Vetter <daniel.vet...@ffwll.ch> > > Some useless bikesheds for you to ignore below ;-) > >> --- >> src/mesa/drivers/dri/i965/brw_context.h | 2 ++ >> src/mesa/drivers/dri/i965/intel_batchbuffer.c | 35 >> +++++++++++++++++++++++++++ >> 2 files changed, 37 insertions(+) >> >> This may help >> https://code.google.com/p/chromium/issues/detail?id=333130 >> My theory is that marcheu's patch removes PIPE_CONTROLs that don't have >> CS stalls, which may bring it under 4 in a row, or at least bring ones >> with CS stalls closer together. >> >> It also may help the couple of users who've reported IVB GPU hangs recently. >> >> Or it may be totally useless... >> >> diff --git a/src/mesa/drivers/dri/i965/brw_context.h >> b/src/mesa/drivers/dri/i965/brw_context.h >> index 656cbe8..27cf92c 100644 >> --- a/src/mesa/drivers/dri/i965/brw_context.h >> +++ b/src/mesa/drivers/dri/i965/brw_context.h >> @@ -854,6 +854,8 @@ struct intel_batchbuffer { >> enum brw_gpu_ring ring; >> bool needs_sol_reset; >> >> + uint8_t pipe_controls_since_last_cs_stall; > > Since the compile aligns this anyway I tend to not bother with smaller > types. And the fixed-width generally used for abi and hw registers, which > this isn't.
I think this will get stored in the padding after the bool. Making this an int will cause extra padding. I think that's why Ken chose to put it here. >> + >> struct { >> uint16_t used; >> int reloc_count; >> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c >> b/src/mesa/drivers/dri/i965/intel_batchbuffer.c >> index cd45af6..2255cee 100644 >> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c >> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c >> @@ -81,6 +81,7 @@ intel_batchbuffer_reset(struct brw_context *brw) >> brw->batch.state_batch_offset = brw->batch.bo->size; >> brw->batch.used = 0; >> brw->batch.needs_sol_reset = false; >> + brw->batch.pipe_controls_since_last_cs_stall = 0; >> >> /* We don't know what ring the new batch will be sent to until we see the >> * first BEGIN_BATCH or BEGIN_BATCH_BLT. Mark it as unknown. >> @@ -433,6 +434,36 @@ gen8_add_cs_stall_workaround_bits(uint32_t *flags) >> *flags |= PIPE_CONTROL_STALL_AT_SCOREBOARD; >> } >> >> +/* Implement the WaCsStallAtEveryFourthPipecontrol workaround on IVB, BYT: >> + * >> + * "Every 4th PIPE_CONTROL command, not counting the PIPE_CONTROL with >> + * only read-cache-invalidate bit(s) set, must have a CS_STALL bit set." >> + * >> + * Note that the kernel does CS stalls between batches, so we only need >> + * to count them within a batch. >> + */ >> +static uint32_t >> +gen7_cs_stall_every_four_pipe_controls(struct brw_context *brw, uint32_t >> flags) >> +{ >> + if (brw->gen == 7 && brw->is_haswell) { >> + if (flags & PIPE_CONTROL_CS_STALL) { >> + /* If we're doing a CS stall, reset the counter and carry on. */ >> + brw->batch.pipe_controls_since_last_cs_stall = 0; >> + return 0; >> + } else { >> + /* Otherwise, we need to count this PIPE_CONTROL. */ >> + ++brw->batch.pipe_controls_since_last_cs_stall; >> + } > > You can flatten the control flow here a bit by dropping the else and > moving the ++ into the check. Imo that's a fairly common patter to not > obfuscate the code. Yeah, I think that is slightly better... at least making it /* Count this PIPE_CONTROL. */ brw->batch.pipe_controls_since_last_cs_stall++; /* If this is the fourth pipe control without a CS stall, do one now. */ if (brw->batch.pipe_controls_since_last_cs_stall == 4) { ... } >> + >> + /* If this is the fourth pipe control without a CS stall, do one now. >> */ >> + if (brw->batch.pipe_controls_since_last_cs_stall == 4) { >> + brw->batch.pipe_controls_since_last_cs_stall = 0; >> + return PIPE_CONTROL_CS_STALL; >> + } >> + } >> + return 0; >> +} >> + >> /** >> * Emit a PIPE_CONTROL with various flushing flags. >> * >> @@ -454,6 +485,8 @@ brw_emit_pipe_control_flush(struct brw_context *brw, >> uint32_t flags) >> OUT_BATCH(0); >> ADVANCE_BATCH(); >> } else if (brw->gen >= 6) { >> + flags |= gen7_cs_stall_every_four_pipe_controls(brw, flags); >> + >> BEGIN_BATCH(5); >> OUT_BATCH(_3DSTATE_PIPE_CONTROL | (5 - 2)); >> OUT_BATCH(flags); >> @@ -496,6 +529,8 @@ brw_emit_pipe_control_write(struct brw_context *brw, >> uint32_t flags, >> OUT_BATCH(imm_upper); >> ADVANCE_BATCH(); >> } else if (brw->gen >= 6) { >> + flags |= gen7_cs_stall_every_four_pipe_controls(brw, flags); >> + >> /* PPGTT/GGTT is selected by DW2 bit 2 on Sandybridge, but DW1 bit 24 >> * on later platforms. We always use PPGTT on Gen7+. >> */ >> -- >> 2.1.3 >> >> _______________________________________________ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev