On Tue, Mar 31, 2015 at 2:03 PM, Kenneth Graunke <kenn...@whitecape.org> wrote:
> On Tuesday, March 31, 2015 11:30:17 AM Rob Clark wrote:
>> From: Rob Clark <robcl...@freedesktop.org>
>>
>> In freedreno these get implemented as the matching f* instruction plus a
>> u2f to convert the result to float 1.0/0.0. But less lines of code to
>> just let nir_opt_algebraic handle this for us, plus opens up some small
>> window for other opt passes to improve (ie. if some shader ended up with
>> both a flt and slt with same src args, for example).
>>
>> Signed-off-by: Rob Clark <robcl...@freedesktop.org>
>> ---
>>  src/glsl/nir/nir.h                | 3 +++
>>  src/glsl/nir/nir_opt_algebraic.py | 5 +++++
>>  2 files changed, 8 insertions(+)
>>
>> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
>> index 669a26e..11505f9 100644
>> --- a/src/glsl/nir/nir.h
>> +++ b/src/glsl/nir/nir.h
>> @@ -1371,6 +1371,9 @@ typedef struct nir_shader_compiler_options {
>>     /** lowers fneg and ineg to fsub and isub. */
>>     bool lower_negate;
>>
>> +   /* lower {slt,sge,seq,sne} to {flt,fge,feq,fne} + u2f: */
>> +   bool lower_scmp;
>> +
>>     /**
>>      * Does the driver support real 32-bit integers? (Otherwise, integers
>>      * are simulated by floats.)
>> diff --git a/src/glsl/nir/nir_opt_algebraic.py b/src/glsl/nir/nir_opt_algebraic.py
>> index ef855aa..6bd4187 100644
>> --- a/src/glsl/nir/nir_opt_algebraic.py
>> +++ b/src/glsl/nir/nir_opt_algebraic.py
>> @@ -95,6 +95,11 @@ optimizations = [
>>     (('fsat', a), ('fmin', ('fmax', a, 0.0), 1.0), 'options->lower_fsat'),
>>     (('fsat', ('fsat', a)), ('fsat', a)),
>>     (('fmin', ('fmax', ('fmin', ('fmax', a, 0.0), 1.0), 0.0), 1.0), ('fmin', ('fmax', a, 0.0), 1.0)),
>> +   (('slt', a, b), ('u2f', ('flt', a, b)), 'options->lower_scmp'),
>> +   (('sge', a, b), ('u2f', ('fge', a, b)), 'options->lower_scmp'),
>> +   (('seq', a, b), ('u2f', ('feq', a, b)), 'options->lower_scmp'),
>> +   (('sne', a, b), ('u2f', ('fne', a, b)), 'options->lower_scmp'),
>> +
>>     # Comparison with the same args. Note that these are not done for
>>     # the float versions because NaN always returns false on float
>>     # inequalities.
>>
>
> Hi Rob!
>
> I'm pretty sure you want b2f here, not u2f...the slt/sge/seq/sne opcodes
> are defined to return either 0.0 or 1.0. flt and friends return 0 or
> 0xFFFFFFFF. u2f converts the numerical value of the unsigned source to
> float, so this would return 0.0 or 4294967295.0.
>

Hmm, that is a bit sad: for the flt/etc. cases I'd have to multiply by
0xffffffff, which would in turn require a mov for the 0xffffffff (or
perhaps emitting a driver uniform/const), and it makes b2f more
complicated. I guess I can just implement b2f the same as u2f in my
backend and hope for the best.

Can a bool be reinterpreted as an int and (for example) multiplied by
things? If so, can we maybe have reinterpret opcodes so I can fix
things up?

BR,
-R

> b2f on i965 is implemented as "AND src 0x3f800000" which would give you
> 0x0 or 0x3f800000 = 1.0. It sounds like vc4 does the same trick.
>
> With s/u2f/b2f/g, this patch would be:
> Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>
>
> Thanks for doing this! I'll want to use these patterns too.
>
> --Ken
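
For reference, the u2f/b2f difference discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Mesa code: the helper names (bits_to_float, u2f, b2f) and the NIR_TRUE/NIR_FALSE constants are hypothetical, and the sketch assumes a 32-bit boolean of 0 or 0xFFFFFFFF as described in the thread. The mask 0x3F800000 is the IEEE 754 single-precision bit pattern of 1.0.

```python
import struct

# Assumed boolean encoding for flt/fge/feq/fne results, as described in the thread.
NIR_TRUE = 0xFFFFFFFF
NIR_FALSE = 0x00000000


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def u2f(bits):
    """Numeric conversion: use the unsigned integer *value* of the bits.

    Python floats are doubles, so 0xFFFFFFFF converts exactly here; a real
    32-bit float conversion would round the result.
    """
    return float(bits & 0xFFFFFFFF)


def b2f(bits):
    """Boolean conversion using the AND trick: mask with the bits of 1.0f."""
    return bits_to_float(bits & 0x3F800000)


if __name__ == "__main__":
    for name, value in (("false", NIR_FALSE), ("true", NIR_TRUE)):
        print(name, "u2f ->", u2f(value), "b2f ->", b2f(value))
```

With this encoding, only the b2f-style mask maps "true" to 1.0, which is why the slt/sge/seq/sne lowering needs b2f rather than u2f. A backend whose native compare already yields 0 or 1 would behave differently, which is the case raised in the reply.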