On 05/29/2014 01:21 AM, Iago Toral wrote: > Hi Kenneth, > > On Wed, 2014-05-28 at 12:42 -0700, Kenneth Graunke wrote: >> On 05/27/2014 03:50 AM, Iago Toral Quiroga wrote: >>> In Gen < 6 the hardware generates a runtime bit that indicates whether AA >>> data >>> has to be sent as part of the framebuffer write SEND message. This affects >>> the >>> specific case where we have setup antialiased line rendering and we render >>> polygons which have one face setup in GL_LINE mode (line antialiasing >>> will be used) and the other one in GL_FILL mode (no line antialiasing >>> needed). >>> >>> Currently we are not doing this runtime test and instead we always send AA >>> data, which produces incorrect rendering of the GL_FILL face of the polygon >>> in >>> in the aforementioned scenario (verified in ironlake and gm45). >>> >>> In Gen4 this is, likely, a regression introduced with commit 098acf6c843. In >>> Gen5 this has never worked properly. Gen > 5 are not affected by this. >>> >>> The patch fixes the problem by adding the appropriate runtime check and >>> adjusting the framebuffer write message accordingly in the conflictive >>> scenario (detected with fs_visitor::runtime_check_aads_emit == TRUE). >>> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78679 >>> --- >>> src/mesa/drivers/dri/i965/brw_fs.h | 4 ++ >>> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 86 >>> +++++++++++++++++----------- >>> 2 files changed, 58 insertions(+), 32 deletions(-) >>> >>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h >>> b/src/mesa/drivers/dri/i965/brw_fs.h >>> index 60a4906..ab8912f 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_fs.h >>> +++ b/src/mesa/drivers/dri/i965/brw_fs.h >>> @@ -452,6 +452,10 @@ public: >>> >>> void emit_color_write(int target, int index, int first_color_mrf); >>> void emit_alpha_test(); >>> + void do_emit_fb_write(int target, int base_mrf, int mlen, bool eot, >>> + bool header_present); >>> + void emit_fb_write(int target, int base_mrf, int mlen, bool eot, >>> + bool header_present); >>> void emit_fb_writes(); >>> >>> void emit_shader_time_begin(); >>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >>> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >>> index 171f063..4c3897b 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >>> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >>> @@ -2731,6 +2731,54 @@ fs_visitor::emit_alpha_test() >>> } >>> >>> void >>> +fs_visitor::do_emit_fb_write(int target, int base_mrf, int mlen, bool eot, >>> + bool header_present) >>> +{ >>> + fs_inst *inst = emit(FS_OPCODE_FB_WRITE); >>> + inst->target = target; >>> + inst->base_mrf = base_mrf; >>> + inst->mlen = mlen; >>> + inst->eot = eot; >>> + inst->header_present = header_present; >>> + if ((brw->gen >= 8 || brw->is_haswell) && fp->UsesKill) { >>> + inst->predicate = BRW_PREDICATE_NORMAL; >>> + inst->flag_subreg = 1; >>> + } >>> +} >>> + >>> +void >>> +fs_visitor::emit_fb_write(int target, int base_mrf, int mlen, bool eot, >>> + bool header_present) >>> +{ >>> + if (!runtime_check_aads_emit) { >>> + do_emit_fb_write(target, base_mrf, mlen, eot, header_present); >>> + } else { >>> + /* This can only happen in Gen < 6 >>> + */ >>> + fs_reg reg_tmp_ud = fs_reg(this, glsl_type::uint_type); >>> + emit(AND(reg_tmp_ud, >>> + fs_reg(get_element_ud(brw_vec8_grf(1,0), 6)), >> >> I think >> >> retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD) >> >> might be a little clearer than: >> >> get_element_ud(brw_vec8_grf(1,0), 6)) >> >> since it just refers to r1.6 right away, rather than r1.0 modified to >> have a suboffset of 6. > > Sure, that looks better. > >>> + fs_reg(brw_imm_ud(1<<26)))); >>> + emit(CMP(reg_null_ud, >>> + reg_tmp_ud, >>> + fs_reg(brw_imm_ud(0)), >>> + BRW_CONDITIONAL_Z)); >> >> You can actually generate a flag condition directly from the AND >> instruction, and eliminate the CMP: >> >> fs_inst *inst = >> emit(AND(reg_null_ud, >> fs_reg(retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD), >> fs_reg(0))); >> inst->conditional_mod = BRW_CONDITIONAL_Z; >> >> (you might have to use vec1(retype(brw_null_reg), BRW_REGISTER_TYPE_UD), >> rather than reg_null_ud.) > > Oh, that's much nicer! We also get rid of the temporary register with > this. > > I think you messed up the parameters to the AND though. I believe this > is what you meant: > > fs_inst *inst = > emit(AND(vec1(retype(brw_null_reg(), BRW_REGISTER_TYPE_UD)), > fs_reg(retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD)), > fs_reg(brw_imm_ud(1<<26)))); > inst->conditional_mod = BRW_CONDITIONAL_Z;
Whoops. Yeah, that's what I meant :) >>> + emit(IF(BRW_PREDICATE_NORMAL)); >>> + { >>> + /* Shift message header one register since we are not sending >>> + * AA data stored in base_mrf+2 >>> + */ >>> + do_emit_fb_write(target, base_mrf + 1, mlen - 1, eot, >>> header_present); >>> + } >>> + emit(BRW_OPCODE_ELSE); >>> + { >>> + do_emit_fb_write(target, base_mrf, mlen, eot, header_present); >>> + } >>> + emit(BRW_OPCODE_ENDIF); >> >> I suppose this probably works, but I've never seen a program end with an >> ENDIF. I'm not really comfortable with doing that, since the ENDIF >> should never be executed. With JMPI or ADD brw_ip_reg(), we'd just jump >> over one instruction and then execute one FB write or the other, which >> seems really straightforward. >> >> But, looking at the bug, I see you tried both of those suggestions, and >> it didn't work for some reason. Huh. It really should...maybe I can >> look into why... > > Right, I don't understand why that code did not work for me either. More > over, the code did not work as I expected in other ways, for example I > noticed that reverting the condition and the order of the fb_writes > (which should be completely the same thing) changed the behavior too... > I remember that it produced GPU hangs but I tried so many versions that > I can't tell... it felt as if the two fb_writes were being executed. I > tried all kinds of things to make it work as intended, even including a > manual JMPI at the end of the IF branch to force the jump over the > second fb_write... but nothing. > > I commented about this and included the C code and the generated > assembly for the JMPI version hoping that someone could tell if > something was off here: > > http://lists.freedesktop.org/archives/mesa-dev/2014-May/059985.html > > Maybe you can have a quick look at that see if there is something > obvious that I am missing... Yeah, I'll try and take a look... > Also, when I was trying the JMPI version I was developing the patches in > the generator rather than the visitor. I suppose that should not make > any difference though. > > One last thing, JMPI was recently made private to the sf module, so if > we want to do go with it we need to revert f3cb2e6ed705. It wasn't, actually - brw_JMPI is still available in brw_eu_emit.c. It's only brw_land_fwd_jump that moved to brw_sf_state.c, but all that does is set the jump distance on the JMPI instruction. It seems like you know the jump distance when emitting the JMPI (jump over 1 FB write and maybe some MOVs) so you could just set it straightaway, and skip that function. Or you could revert it - that'd be fine too.
signature.asc
Description: OpenPGP digital signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev