On 18/11/15 06:54, Jordan Justen wrote:
> From: Francisco Jerez <curroje...@riseup.net>
>
> Improves performance of the arb_shader_image_load_store-atomicity
> piglit test by over 25x (which isn't a real benchmark it's just heavy
> on atomics -- the improvement in a microbenchmark I wrote a while ago
> seemed to be even greater). The drawback is one needs to be
> extra-careful not to hang the GPU (in fact the whole system). A DC
> partition must have been allocated on L3, the "convert L3 cycle for DC
> to UC" bit may not be set, the MOCS L3 cacheability bit must be set
> for all surfaces accessed using DC atomics, and the SCRATCH1 and
> ROW_CHICKEN3 bits must be kept in sync.
>
> A fairly recent kernel is required for the command parser to allow
> writes to these registers.
> ---
>  src/mesa/drivers/dri/i965/gen7_l3_state.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> index 48bca29..c863b7f 100644
> --- a/src/mesa/drivers/dri/i965/gen7_l3_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
> @@ -254,5 +254,19 @@ setup_l3_config(struct brw_context *brw, const struct brw_l3_config *cfg)
>                  SET_FIELD(cfg->n[L3P_T], GEN7_L3CNTLREG3_T_ALLOC));
>
>        ADVANCE_BATCH();
> +
> +      if (brw->is_haswell && brw->intelScreen->cmd_parser_version >= 4) {
> +         /* Enable L3 atomics on HSW if we have a DC partition, otherwise keep
> +          * them disabled to avoid crashing the system hard.
> +          */
> +         BEGIN_BATCH(5);
> +         OUT_BATCH(MI_LOAD_REGISTER_IMM | (5 - 2));
> +         OUT_BATCH(HSW_SCRATCH1);
> +         OUT_BATCH(has_dc ? 0 : HSW_SCRATCH1_L3_ATOMIC_DISABLE);
> +         OUT_BATCH(HSW_ROW_CHICKEN3);
> +         OUT_BATCH(HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE << 16 |
> +                   (has_dc ? 0 : HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE));
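
To make the questions below concrete, here is a standalone sketch of how I read the ROW_CHICKEN3 dword that the hunk above emits. The bit position and the SKETCH_/sketch_ names are placeholders of mine, not taken from the patch or from the PRMs; only the shape of the expression is copied from the last OUT_BATCH() call.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit position, assumed for illustration only.  The real
 * HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE value comes from an earlier patch
 * in the series and may differ.
 */
#define SKETCH_ROW_CHICKEN3_L3_ATOMIC_DISABLE (1u << 6)

/* Hypothetical helper: same expression as the last OUT_BATCH() in the
 * hunk, with the placeholder bit substituted for the real define.
 */
static uint32_t
sketch_row_chicken3_value(bool has_dc)
{
   /* Copy in the upper 16 bits: always set. */
   const uint32_t upper = SKETCH_ROW_CHICKEN3_L3_ATOMIC_DISABLE << 16;

   /* Copy in the lower 16 bits: set only without a DC partition. */
   const uint32_t lower =
      has_dc ? 0 : SKETCH_ROW_CHICKEN3_L3_ATOMIC_DISABLE;

   return upper | lower;
}
```

With the placeholder bit 6, this gives 0x00400000 when a DC partition exists and 0x00400040 otherwise: the copy in the upper half is always written, the copy in the lower half only when there is no DC partition.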

I have not found any reference to ROW_CHICKEN3, or to a register at
address offset 0xe49c, in the HSW PRMs, so these may be stupid questions:

Why do you need to set the L3 atomic disable flag in two different places
in the ROW_CHICKEN3 register? Also, why is the first flag set
unconditionally, while the second one is set only if we don't have a DC
partition? Is this what you want?

Also, if "HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE << 16" is really needed, it
could be defined as a constant in the first patch of the series.

Sam

> +         ADVANCE_BATCH();
> +      }
>     }
>  }
>
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev