On 02/23/2018 05:14 PM, Jason Ekstrand wrote:
> On Fri, Feb 23, 2018 at 3:55 PM, Ian Romanick <i...@freedesktop.org> wrote:
>
> From: Ian Romanick <ian.d.roman...@intel.com>
>
> shader-db results:
>
> Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
> total instructions in shared programs: 14514817 -> 14514808 (<.01%)
> instructions in affected programs: 229 -> 220 (-3.93%)
> helped: 3
> HURT: 0
> helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4
> helped stats (rel) min: 2.86% max: 4.12% x̄: 3.70% x̃: 4.12%
>
> total cycles in shared programs: 533145211 -> 533144939 (<.01%)
> cycles in affected programs: 37268 -> 36996 (-0.73%)
> helped: 8
> HURT: 0
> helped stats (abs) min: 2 max: 134 x̄: 34.00 x̃: 2
> helped stats (rel) min: 0.02% max: 14.22% x̄: 3.53% x̃: 0.05%
>
> Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown)
> total cycles in shared programs: 257618409 -> 257618403 (<.01%)
> cycles in affected programs: 12582 -> 12576 (-0.05%)
> helped: 3
> HURT: 0
> helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
> helped stats (rel) min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05%
>
> No changes on Iron Lake or GM45.
>
> Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
> ---
>  src/compiler/nir/nir_opt_algebraic.py | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
> index d40d59b..f5f9e94 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -170,6 +170,8 @@ optimizations = [
>     (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
>     (('bcsel', ('flt', b, a), b, a), ('fmin', a, b)),
>     (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
> +   (('bcsel', ('fge', b, a), a, b), ('fmin', a, b)),
> +   (('bcsel', ('fge', a, b), a, b), ('fmax', a, b)),
>
>
> Please flag as inexact.  As per the stupid GLSL definition, these are
> not the same as fmin/fmax when you throw in a NaN.
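
To make the NaN cases concrete, here is a minimal Python sketch (not Mesa code) that models the GLSL wording for min alongside the bcsel patterns under discussion. All function names are made up for illustration. It relies on Python floats following IEEE 754, so any ordered comparison involving NaN is false. The argument-swapped variant and the "never NaN" helper are only assumed models of the alternatives discussed in the reply that follows; the latter is not a description of any specific hardware instruction.

```python
import math

NAN = float("nan")


def bcsel(cond, if_true, if_false):
    """Model of a select: if_true when cond holds, otherwise if_false."""
    return if_true if cond else if_false


def glsl_min(x, y):
    """GLSL wording: 'Returns y if y < x; otherwise it returns x.'"""
    return y if y < x else x


def flt_pattern(a, b):
    """Existing pattern: ('bcsel', ('flt', b, a), b, a)."""
    return bcsel(b < a, b, a)


def fge_pattern_from_patch(a, b):
    """Pattern added by the patch: ('bcsel', ('fge', b, a), a, b)."""
    return bcsel(b >= a, a, b)


def fge_pattern_swapped(a, b):
    """Argument-swapped variant: ('bcsel', ('fge', a, b), b, a)."""
    return bcsel(a >= b, b, a)


def never_nan_min(a, b):
    """Assumed model of a 'never NaN' min: a NaN operand yields the other."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)


def same(x, y):
    """Equality that treats two NaNs as equal, for comparing results."""
    return (math.isnan(x) and math.isnan(y)) or x == y


if __name__ == "__main__":
    candidates = [flt_pattern, fge_pattern_from_patch,
                  fge_pattern_swapped, never_nan_min]
    for a, b in [(1.0, 2.0), (2.0, 1.0), (1.0, NAN), (NAN, 1.0)]:
        ref = glsl_min(a, b)
        for fn in candidates:
            verdict = "matches" if same(fn(a, b), ref) else "DIFFERS from"
            print(f"{fn.__name__}({a}, {b}) = {fn(a, b)} "
                  f"{verdict} glsl_min = {ref}")
```

Running it prints, for each input pair, whether a pattern agrees with the literal GLSL definition. Only the rows with a NaN operand can differ, since all the models agree on ordinary ordered inputs.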
I'm having some trouble reconciling this with the existing
transformations and the Intel hardware implementation.

The GLSL spec says min(x, y) "Returns y if y < x; otherwise it returns
x."  From that I infer min(x, NaN) == x and min(NaN, y) == NaN.  The
expression ('bcsel', ('flt', b, a), b, a) has the same behavior.  I think
if I rewrite the fmin transform as (swapping the argument order)

    (('bcsel', ('fge', a, b), b, a), ('fmin', a, b)),

it should be at least as valid as the existing transforms.  A similar
modification should work for fmax.

The description of the Intel SEL instruction says that with the .L or
.GE modifier, if one argument is NaN, the other value is always
returned.  This means that min(NaN, y) will be y.  This is valid for min
and max because section 4.7.1 (Range and Precision) says:

    Operations and built-in functions that operate on a NaN are not
    required to return a NaN as the result.

I don't think returning non-NaN for ('bcsel', ('flt', b, NaN), b, NaN) is
valid, so I think the existing transformations should also be marked
inexact for platforms that implement the "never NaN" behavior for fmin
or fmax.

>     (('bcsel', ('inot', a), b, c), ('bcsel', a, c, b)),
>     (('bcsel', a, ('bcsel', a, b, c), d), ('bcsel', a, b, d)),
>     (('bcsel', a, True, 'b@bool'), ('ior', a, b)),
> --
> 2.9.5
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev