On Tue, Mar 31, 2015 at 2:11 PM, Rob Clark <robdcl...@gmail.com> wrote: > On Tue, Mar 31, 2015 at 2:03 PM, Kenneth Graunke <kenn...@whitecape.org> > wrote: >> On Tuesday, March 31, 2015 11:30:17 AM Rob Clark wrote: >>> From: Rob Clark <robcl...@freedesktop.org> >>> >>> In freedreno these get implemented as the matching f* instruction plus a >>> u2f to convert the result to float 1.0/0.0. But less lines of code to >>> just let nir_opt_algebraic handle this for us, plus opens up some small >>> window for other opt passes to improve (ie. if some shader ended up with >>> both a flt and slt with same src args, for example). >>> >>> Signed-off-by: Rob Clark <robcl...@freedesktop.org> >>> --- >>> src/glsl/nir/nir.h | 3 +++ >>> src/glsl/nir/nir_opt_algebraic.py | 5 +++++ >>> 2 files changed, 8 insertions(+) >>> >>> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h >>> index 669a26e..11505f9 100644 >>> --- a/src/glsl/nir/nir.h >>> +++ b/src/glsl/nir/nir.h >>> @@ -1371,6 +1371,9 @@ typedef struct nir_shader_compiler_options { >>> /** lowers fneg and ineg to fsub and isub. */ >>> bool lower_negate; >>> >>> + /* lower {slt,sge,seq,sne} to {flt,fge,feq,fne} + u2f: */ >>> + bool lower_scmp; >>> + >>> /** >>> * Does the driver support real 32-bit integers? (Otherwise, integers >>> * are simulated by floats.) >>> diff --git a/src/glsl/nir/nir_opt_algebraic.py >>> b/src/glsl/nir/nir_opt_algebraic.py >>> index ef855aa..6bd4187 100644 >>> --- a/src/glsl/nir/nir_opt_algebraic.py >>> +++ b/src/glsl/nir/nir_opt_algebraic.py >>> @@ -95,6 +95,11 @@ optimizations = [ >>> (('fsat', a), ('fmin', ('fmax', a, 0.0), 1.0), 'options->lower_fsat'), >>> (('fsat', ('fsat', a)), ('fsat', a)), >>> (('fmin', ('fmax', ('fmin', ('fmax', a, 0.0), 1.0), 0.0), 1.0), >>> ('fmin', ('fmax', a, 0.0), 1.0)), >>> + (('slt', a, b), ('u2f', ('flt', a, b)), 'options->lower_scmp'), >>> + (('sge', a, b), ('u2f', ('fge', a, b)), 'options->lower_scmp'), >>> + (('seq', a, b), ('u2f', ('feq', a, b)), 'options->lower_scmp'), >>> + (('sne', a, b), ('u2f', ('fne', a, b)), 'options->lower_scmp'), >>> + >>> # Comparison with the same args. Note that these are not done for >>> # the float versions because NaN always returns false on float >>> # inequalities. >>> >> >> Hi Rob! >> >> I'm pretty sure you want b2f here, not u2f...the slt/sge/seq/sne opcodes >> are defined to return either 0.0 or 1.0. flt and friends return 0 or >> 0xFFFFFFFF. u2f converts the numerical value of the unsigned source to >> float, so this would return 0.0 or 4294967295.0. >> > > hmm, that is a bit sad (since on the flt/etc cases I'd have to > multiply by 0xffffffff, which would in turn require a mov for the > 0xffffffff or perhaps emitting a driver uniform/const), and since it > makes the b2f more complicated..
Without commenting on nir, the way that freedreno currently implements this is by doing a neg on the cmp output, since TGSI also mandates the 0/~0 bools. And then b2f becomes bool & 0x3f8. (Which kinda sucks for later optimizations since they have to explicitly know about this pattern, or alternatively, not be able to see through it.) It would be nice if there were support for different acceptable values of "true" for bools, but this presents (other) issues for optimizations as well. -ilia _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev