On jue, 2014-09-18 at 08:08 +0200, Iago Toral Quiroga wrote: > Hi Jason, > > On miƩ, 2014-09-17 at 11:39 -0700, Jason Ekstrand wrote: > > I haven't tested this yet, just looked it over. I've got a couple of > > inline comments below. One general comment though: I'm currently > > working on a bunch of compiler stuff that reworks the way we do FB > > writes. In particular, it reworks things to use GRF registers instead > > of the MRF. It probably wouldn't be too bad for me to rebase on top > > of this or to rebase your patch on top of what I'm doing. I just > > thought I'd warn you about the conflict. > > Thanks for the warning, I suppose it should be easy for me to rebase > this patch on top of your work if you happen to land yours sooner. > > > On Wed, Sep 17, 2014 at 4:36 AM, Iago Toral Quiroga > > <ito...@igalia.com> wrote: > > From the SNB PRM, volume 4, part 1, page 193: > > "The dual source render target messages only have SIMD8 forms > > due to > > maximum message length limitations. SIMD16 pixel shaders must > > send two of > > these messages to cover all of the pixels. Each message > > contains two colors > > (4 channels each) for each pixel in the message payload." > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82831 > > --- > > src/mesa/drivers/dri/i965/brw_eu.h | 1 + > > src/mesa/drivers/dri/i965/brw_eu_emit.c | 3 +- > > src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 14 +++++++-- > > src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 41 > > +++++++++++++++++++------- > > 4 files changed, 45 insertions(+), 14 deletions(-) > > > > I tested this on SandyBridge and IvyBridge. No piglit > > regressions in these > > platforms, but would be nice if someone could test this in > > later platforms too. > > > > I only noticed these two tests for dual source blending in > > piglit though: > > > > tests/spec/ext_framebuffer_multisample/alpha-to-one-dual-src-blend.cpp > > > > tests/spec/ext_framebuffer_multisample/alpha-to-coverage-dual-src-blend.cpp > > > > The first one fails, in both platforms with and without my > > patch. The second one > > passes in both platforms, with and without my patch. > > > > I also tested this with a seprate test program to verify that > > it worked, at > > least, in a simple case. > > > > diff --git a/src/mesa/drivers/dri/i965/brw_eu.h > > b/src/mesa/drivers/dri/i965/brw_eu.h > > index e6c26e3..5908ba5 100644 > > --- a/src/mesa/drivers/dri/i965/brw_eu.h > > +++ b/src/mesa/drivers/dri/i965/brw_eu.h > > @@ -266,6 +266,7 @@ void brw_fb_WRITE(struct brw_compile *p, > > unsigned msg_length, > > unsigned response_length, > > bool eot, > > + bool last_render_target, > > bool header_present); > > > > void brw_SAMPLE(struct brw_compile *p, > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c > > b/src/mesa/drivers/dri/i965/brw_eu_emit.c > > index 39f94e9..ffdbe6d 100644 > > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c > > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c > > @@ -2251,6 +2251,7 @@ void brw_fb_WRITE(struct brw_compile *p, > > unsigned msg_length, > > unsigned response_length, > > bool eot, > > + bool last_render_target, > > bool header_present) > > { > > struct brw_context *brw = p->brw; > > @@ -2290,7 +2291,7 @@ void brw_fb_WRITE(struct brw_compile *p, > > msg_type, > > msg_length, > > header_present, > > - eot, /* last render target write > > */ > > + last_render_target, > > response_length, > > eot, > > 0 /* send_commit_msg */); > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp > > b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp > > index 1bc10f5..a4b84aa 100644 > > --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp > > @@ -121,9 +121,12 @@ fs_generator::fire_fb_write(fs_inst > > *inst, > > > > if (inst->opcode == FS_OPCODE_REP_FB_WRITE) > > msg_control = > > BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD16_SINGLE_SOURCE_REPLICATED; > > - else if (prog_data->dual_src_blend) > > - msg_control = > > BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD8_DUAL_SOURCE_SUBSPAN01; > > - else if (dispatch_width == 16) > > + else if (prog_data->dual_src_blend) { > > + if (dispatch_width == 8 || !inst->eot) > > + msg_control = > > BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD8_DUAL_SOURCE_SUBSPAN01; > > + else > > + msg_control = > > BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD8_DUAL_SOURCE_SUBSPAN23; > > + } else if (dispatch_width == 16) > > msg_control = > > BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD16_SINGLE_SOURCE; > > else > > msg_control = > > BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD8_SINGLE_SOURCE_SUBSPAN01; > > @@ -131,6 +134,9 @@ fs_generator::fire_fb_write(fs_inst *inst, > > uint32_t surf_index = > > prog_data->binding_table.render_target_start + > > inst->target; > > > > + bool last_render_target = inst->eot || > > + (prog_data->dual_src_blend && > > dispatch_width == 16); > > + > > brw_fb_WRITE(p, > > dispatch_width, > > base_reg, > > @@ -140,6 +146,7 @@ fs_generator::fire_fb_write(fs_inst *inst, > > nr, > > 0, > > inst->eot, > > + last_render_target, > > inst->header_present); > > > > brw_mark_surface_used(&prog_data->base, surf_index); > > @@ -254,6 +261,7 @@ > > fs_generator::generate_blorp_fb_write(fs_inst *inst) > > inst->mlen, > > 0, > > true, > > + true, > > inst->header_present); > > } > > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > > b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > > index 5d2e7c8..21dc5bc 100644 > > --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > > @@ -3064,12 +3064,6 @@ fs_visitor::emit_fb_writes() > > int reg_width = dispatch_width / 8; > > bool src0_alpha_to_render_target = false; > > > > - if (do_dual_src) { > > - no16("GL_ARB_blend_func_extended not yet supported in > > SIMD16."); > > - if (dispatch_width == 16) > > - do_dual_src = false; > > - } > > - > > /* From the Sandy Bridge PRM, volume 4, page 198: > > * > > * "Dispatched Pixel Enables. One bit per pixel > > indicating > > @@ -3109,11 +3103,22 @@ fs_visitor::emit_fb_writes() > > nr += 1; > > } > > > > - /* Reserve space for color. It'll be filled in per MRT > > below. */ > > + /* Reserve space for color. It'll be filled in per MRT > > below. > > + * > > + * From the SNB PRM, volume 4, part 1, page 193: > > + * "The dual source render target messages only have SIMD8 > > forms due to > > + * maximum message length limitations. SIMD16 pixel > > shaders must send two of > > + * these messages to cover all of the pixels. Each message > > contains two > > + * colors (4 channels each) for each pixel in the message > > payload." > > + * > > + * So color data in dual source mode, whether this is a > > SIMD8 or SIMD16 > > + * program, always requires 8 MRF registers. > > + */ > > int color_mrf = nr; > > - nr += 4 * reg_width; > > if (do_dual_src) > > - nr += 4; > > + nr += 8; > > + else > > + nr += 4 * reg_width; > > if (src0_alpha_to_render_target) > > nr += reg_width; > > > > @@ -3173,13 +3178,29 @@ fs_visitor::emit_fb_writes() > > inst->target = 0; > > inst->base_mrf = base_mrf; > > inst->mlen = nr - base_mrf; > > - inst->eot = true; > > + inst->eot = dispatch_width == 8; > > inst->header_present = header_present; > > if ((brw->gen >= 8 || brw->is_haswell) && > > prog_data->uses_kill) { > > inst->predicate = BRW_PREDICATE_NORMAL; > > inst->flag_subreg = 1; > > } > > > > > > It appears as if you don't copy the second half of the colors into the > > message before doing the second send. Is there a reason for this? > > It's quite possible that the piglit tests are only using solid colors > > in which case they might not catch this. > > > > Yes, I noticed that after I sent the patch. It is a mistake, I'll send a > second version with this fixed. At least the standalone program I was > using to test this was using solid colors, and I suppose piglit tests do > the same.. maybe we should add one more test or something.
FYI, I sent this piglit test for review so we can test for this case specifically. It passes with the second version of this patch, and as expected, fails with this version: http://lists.freedesktop.org/archives/piglit/2014-September/012626.html > Iago > > > + /* SIMD16 dual source blending requires to send two > > SIMD8 dual source > > + * messages, so here goes the second one. > > + */ > > + if (dispatch_width == 16) { > > + fs_inst *inst = emit(FS_OPCODE_FB_WRITE); > > + inst->target = 0; > > + inst->base_mrf = base_mrf; > > + inst->mlen = nr - base_mrf; > > + inst->eot = true; > > + inst->header_present = header_present; > > + if ((brw->gen >= 8 || brw->is_haswell) && > > prog_data->uses_kill) { > > + inst->predicate = BRW_PREDICATE_NORMAL; > > + inst->flag_subreg = 1; > > + } > > + } > > + > > prog_data->dual_src_blend = true; > > this->current_annotation = NULL; > > return; > > -- > > 1.9.1 > > > > _______________________________________________ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > > > > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev