On Tue, Oct 15, 2013 at 3:38 PM, Kenneth Graunke <kenn...@whitecape.org> wrote: > On 10/14/2013 10:12 AM, Anuj Phogat wrote: >> - Enable GEN6_WM_MSDISPMODE_PERSAMPLE, GEN6_WM_POSOFFSET_SAMPLE, >> GEN6_WM_OMASK_TO_RENDER_TARGET as per extension's specification. >> - Don't enable GEN6_WM_16_DISPATCH_ENABLE when GEN6_WM_MSDISPMODE_PERSAMPLE >> is enabled. Refer SNB PRM Vol. 2, Part 1, Page 279 for details. >> >> Signed-off-by: Anuj Phogat <anuj.pho...@gmail.com> >> --- >> src/mesa/drivers/dri/i965/gen6_wm_state.c | 67 >> +++++++++++++++++++++++++++++-- >> 1 file changed, 64 insertions(+), 3 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/gen6_wm_state.c >> b/src/mesa/drivers/dri/i965/gen6_wm_state.c >> index c96a107..4bc25d6 100644 >> --- a/src/mesa/drivers/dri/i965/gen6_wm_state.c >> +++ b/src/mesa/drivers/dri/i965/gen6_wm_state.c >> @@ -103,6 +103,7 @@ upload_wm_state(struct brw_context *brw) >> >> /* _NEW_BUFFERS */ >> bool multisampled_fbo = ctx->DrawBuffer->Visual.samples > 1; >> + bool msdispmode_persample = false; >> >> /* CACHE_NEW_WM_PROG */ >> if (brw->wm.prog_data->nr_params == 0) { >> @@ -156,7 +157,17 @@ upload_wm_state(struct brw_context *brw) >> >> /* CACHE_NEW_WM_PROG */ >> dw5 |= GEN6_WM_8_DISPATCH_ENABLE; >> - if (brw->wm.prog_data->prog_offset_16) >> + msdispmode_persample = >> + ctx->Multisample.Enabled && >> + (ctx->Multisample.SampleShading || >> + brw->fragment_program->Base.InputsRead & VARYING_BIT_SAMPLE_ID || >> + brw->fragment_program->Base.InputsRead & VARYING_BIT_SAMPLE_POS); > > I misread this several times, but once I finally read it correctly, it > makes perfect sense. If multisample rendering is off, you get per-pixel > mode. Otherwise, you get it if the shader uses gl_Sample{In,Position} > or ctx->Multisample.SampleShading is on. > > Looks good. > >> + >> + /* In case of non 1x (i.e 4x, 8x) multisampling with MDISPMODE_PERSAMPLE, >> + * only one of SIMD8 and SIMD16 should be enabled. >> + */ >> + if (brw->wm.prog_data->prog_offset_16 && >> + !(multisampled_fbo && msdispmode_persample)) >> dw5 |= GEN6_WM_16_DISPATCH_ENABLE; > > This code doesn't match your comment. The comment says you should only > enable either 8 or 16 (but not both). Your code just always turns off > 16-wide. > > As I mentioned in the previous patch, I think 16-wide is allowed for 4x > MSAA on IVB, but I could be crazy. Otherwise I think you have to use > 8-wide only. > I'll put a more detailed comment here explaining which dispatch modes are allowed on SNB and IVB. I'll also add a 'TODO' item here in gen7_wm_state.c: "Enable 'SIMD16 only' dispatch in case of MSDISPMODE_PERSAMPLE with num_samples < 8" As discussed earlier on IRC, I'll enable "SIMD8 only" dispatch in this series and work on 'TODO' item later. >> >> /* CACHE_NEW_WM_PROG | _NEW_COLOR */ >> @@ -185,7 +196,10 @@ upload_wm_state(struct brw_context *brw) >> >> /* _NEW_COLOR, _NEW_MULTISAMPLE */ >> if (fp->program.UsesKill || ctx->Color.AlphaEnabled || >> - ctx->Multisample.SampleAlphaToCoverage) >> + ctx->Multisample.SampleAlphaToCoverage || >> + (ctx->Multisample.SampleShading && >> + (fp->program.Base.OutputsWritten & >> + BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK)))) >> dw5 |= GEN6_WM_KILL_ENABLE; > > This part looks good :) > >> if (brw_color_buffer_write_enabled(brw) || >> @@ -193,6 +207,19 @@ upload_wm_state(struct brw_context *brw) >> dw5 |= GEN6_WM_DISPATCH_ENABLE; >> } >> >> + /* From the SNB PRM, volume 2 part 1, page 278: >> + * "This bit is inserted in the PS payload header and made available to >> + * the DataPort (either via the message header or via header bypass) to >> + * indicate that oMask data (one or two phases) is included in Render >> + * Target Write messages. If present, the oMask data is used to mask off >> + * samples." >> + * TODO: [DevSNB:A0] This bit must be disabled in A Step. > > You can drop the TODO. SNB:A0 is preproduction hardware that never > shipped to the public. > ok. >> + */ >> + if (ctx->Extensions.ARB_sample_shading && >> + (brw->fragment_program->Base.OutputsWritten & >> + BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK))) >> + dw5 |= GEN6_WM_OMASK_TO_RENDER_TARGET; > > Is this necessary? Or is it only used for gl_SampleMaskIn from > ARB_gpu_shader5? > Yes, It is necessary to enable new builtin variable: out int glSampleMask[]; Look for oMask in IVB PRM vol 2, part 1, page 287.
I haven't implemented gl_SampleMaskIn because it's not the part of ARB_sample_shading. I can work on that after finishing this series. >> /* CACHE_NEW_WM_PROG */ >> dw6 |= brw->wm.prog_data->num_varying_inputs << >> GEN6_WM_NUM_SF_OUTPUTS_SHIFT; >> @@ -202,12 +229,46 @@ upload_wm_state(struct brw_context *brw) >> dw6 |= GEN6_WM_MSRAST_ON_PATTERN; >> else >> dw6 |= GEN6_WM_MSRAST_OFF_PIXEL; >> - dw6 |= GEN6_WM_MSDISPMODE_PERPIXEL; >> + >> + /* From arb_sample_shading specification: > > "From the ARB_sample_shading specification:" > >> + * "If MULTISAMPLE or SAMPLE_SHADING_ARB is disabled, sample shading >> + * has no effect." >> + * >> + * "Using gl_SampleID in a fragment shader causes the entire shader >> + * to be evaluated per-sample." >> + * "Using gl_SamplePosition in a fragment shader causes the entire >> + * shader to be evaluated per-sample." >> + * >> + * I interprate the above four lines as enable the sample shading >> + * if fragment shader uses gl_SampleID or gl_SamplePosition. >> + */ >> + if (msdispmode_persample) >> + dw6 |= GEN6_WM_MSDISPMODE_PERSAMPLE; >> + else >> + dw6 |= GEN6_WM_MSDISPMODE_PERPIXEL; >> } else { >> dw6 |= GEN6_WM_MSRAST_OFF_PIXEL; >> dw6 |= GEN6_WM_MSDISPMODE_PERSAMPLE; >> } >> >> + /* _NEW_MULTISAMPLE */ >> + /* From the SNB PRM, volume 2 part 1, page 281: >> + * "If the PS kernel does not need the Position XY Offsets >> + * to compute a Position XY value, then this field should be >> + * programmed to POSOFFSET_NONE." >> + * >> + * "SW Recommendation: If the PS kernel needs the Position Offsets >> + * to compute a Position XY value, this field should match Position >> + * ZW Interpolation Mode to ensure a consistent position.xyzw >> + * computation." >> + * We only require XY sample offsets. So, this recommendation doesn't >> + * look useful at the moment. We might need this in future. >> + */ >> + if (brw->fragment_program->Base.InputsRead & VARYING_BIT_SAMPLE_POS) >> + dw6 |= GEN6_WM_POSOFFSET_SAMPLE; >> + else >> + dw6 |= GEN6_WM_POSOFFSET_NONE; >> + >> BEGIN_BATCH(9); >> OUT_BATCH(_3DSTATE_WM << 16 | (9 - 2)); >> OUT_BATCH(brw->wm.base.prog_offset); >> > > Other than those nits, this looks good. Nice work. Thanks Ken. _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev