On Tuesday, July 01, 2014 12:43:54 PM Cody Northrop wrote: > Commit 17c7ead7 exposed a bug in how uniform loading happens in the > presence of discard. It manifested itself in an application as > randomly incorrect pixels on the borders of conditional areas. > > This is due to how discards jump to the end of the shader incorrectly > for some channels. The current implementation checks each 2x2 > subspan to preserve derivatives. When uniform loading via samplers > was turned on, it uses a full execution mask, as stated in > lower_uniform_pull_constant_loads(), and only populates four channels > of the destination (see generate_uniform_pull_constant_load_gen7()). > It happens incorrectly when the first subspan has been jumped over. > > The series that implemented this optimization was done before the > changes to use samplers for uniform loads. Uniform sampler loads > use special execution masks and only populate four channels, so we > can't jump over those or corruption ensues. > > This fix only jumps to the end of the shader if all relevant channels > are disabled, i.e. all 8 or 16, depending on dispatch. This > preserves the original GLbenchmark 2.7 speedup noted in commit > beafced2. > > It changes the shader assembly accordingly: > > before : (-f0.1.any4h) halt(8) 17 2 null { align1 WE_all 1Q }; > after(8) : (-f0.1.any8h) halt(8) 17 2 null { align1 WE_all 1Q }; > after(16): (-f0.1.any16h) halt(16) 17 2 null { align1 WE_all 1H }; > > v2: Cleaned up comments and conditional ordering. > > Signed-off-by: Cody Northrop <c...@lunarg.com> > Reviewed-by: Mike Stroyan <m...@lunarg.com> > Reviewed-by: Kenneth Graunke <kenn...@whitecape.org> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79948 > --- > src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 11 +++++------ > 1 file changed, 5 insertions(+), 6 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > index 654f5fe..3516aca 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > @@ -1929,15 +1929,14 @@ fs_visitor::visit(ir_discard *ir) > > if (brw->gen >= 6) { > /* For performance, after a discard, jump to the end of the shader. > - * However, many people will do foliage by discarding based on a > - * texture's alpha mask, and then continue on to texture with the > - * remaining pixels. To avoid trashing the derivatives for those > - * texture samples, we'll only jump if all of the pixels in the subspan > - * have been discarded. > + * Only jump if all relavant channels have been discarded. > */ > fs_inst *discard_jump = emit(FS_OPCODE_DISCARD_JUMP); > discard_jump->flag_subreg = 1; > - discard_jump->predicate = BRW_PREDICATE_ALIGN1_ANY4H; > + > + discard_jump->predicate = (dispatch_width == 8) > + ? BRW_PREDICATE_ALIGN1_ANY8H > + : BRW_PREDICATE_ALIGN1_ANY16H; > discard_jump->predicate_inverse = true; > } > } >
Pushed, thanks! --Ken
signature.asc
Description: This is a digitally signed message part.
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev