On Mon, Feb 26, 2018 at 2:21 PM, Ian Romanick <i...@freedesktop.org> wrote:
> On 02/23/2018 05:14 PM, Jason Ekstrand wrote: > > On Fri, Feb 23, 2018 at 3:55 PM, Ian Romanick <i...@freedesktop.org > > <mailto:i...@freedesktop.org>> wrote: > > > > From: Ian Romanick <ian.d.roman...@intel.com > > <mailto:ian.d.roman...@intel.com>> > > > > shader-db results: > > > > Haswell, Broadwell, and Skylake had similar results. (Skylake shown) > > total instructions in shared programs: 14514817 -> 14514808 (<.01%) > > instructions in affected programs: 229 -> 220 (-3.93%) > > helped: 3 > > HURT: 0 > > helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4 > > helped stats (rel) min: 2.86% max: 4.12% x̄: 3.70% x̃: 4.12% > > > > total cycles in shared programs: 533145211 -> 533144939 (<.01%) > > cycles in affected programs: 37268 -> 36996 (-0.73%) > > helped: 8 > > HURT: 0 > > helped stats (abs) min: 2 max: 134 x̄: 34.00 x̃: 2 > > helped stats (rel) min: 0.02% max: 14.22% x̄: 3.53% x̃: 0.05% > > > > Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown) > > total cycles in shared programs: 257618409 -> 257618403 (<.01%) > > cycles in affected programs: 12582 -> 12576 (-0.05%) > > helped: 3 > > HURT: 0 > > helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 > > helped stats (rel) min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05% > > > > No changes on Iron Lake or GM45. > > > > Signed-off-by: Ian Romanick <ian.d.roman...@intel.com > > <mailto:ian.d.roman...@intel.com>> > > --- > > src/compiler/nir/nir_opt_algebraic.py | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/src/compiler/nir/nir_opt_algebraic.py > > b/src/compiler/nir/nir_opt_algebraic.py > > index d40d59b..f5f9e94 100644 > > --- a/src/compiler/nir/nir_opt_algebraic.py > > +++ b/src/compiler/nir/nir_opt_algebraic.py > > @@ -170,6 +170,8 @@ optimizations = [ > > (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)), > > (('bcsel', ('flt', b, a), b, a), ('fmin', a, b)), > > (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)), > > + (('bcsel', ('fge', b, a), a, b), ('fmin', a, b)), > > + (('bcsel', ('fge', a, b), a, b), ('fmax', a, b)), > > > > > > Please flag as inexact. As per the stupid GLSL definition, these are > > not the same as fmin/fmax when you throw in a NaN. > > I'm having some trouble rectifying this with the existing > transformations and the Intel hardware implementation. > Me too. :-) Really, I think D3D10 spec'd the right thing. With GL, it looks more like a case of someone wrote the obvious thing down when NaN wasn't a thing and it got extended to NaN without much thought. How would you feel about just defining the NIR fmin/fmax to have the D3D10 behavior and telling people that they can make new opcodes if they want something different? At the very least, that would make it constant-fold consistently for us. It also sounds like it's a correct transformation as per the spec comment below. GLSL spec says min(x, y) "Returns y if y < x; otherwise it returns x." > From that I infer min(x, NaN) == x, and min(NaN, y) == NaN. The > expression ('bcsel', ('flt', b, a), b, a) has the same behavior. > > I think if I rewrite the fmin transform as (swapping the argument order) > > (('bcsel', ('fge', a, b), b, a), ('fmin', a, b)), > > it should be at least as valid for as the existing transforms. A > similar modification should work for fmax. > Yes, it should. > The Intel SEL instruction which says that with the .L or .GE modifier, > if one argument is NaN, the other value is always returned. This means > that min(NaN, y) will be y. > > This is valid for min and max because section 4.7.1 (Range and > Precision) says: > > Operations and built-in functions that operate on a NaN are not > required to return a NaN as the result. > It's good to know that the DX behavior is considered a valid transformation. I didn't know exactly what the rules were there. > I don't think returning non-NaN for ('bcsel', ('flt', b, NaN), b, NaN) > is valid, so I think the existing transformations should also be marked > inexact for platforms that implement the "never NaN" behavior for fmin > or fmax. > I think I agree. I think you could probably argue the point and I doubt real apps would care but... Probably best to flag it as inexact and move on with life. If you're using precise and open-coding min/max in some horifically stupid way, you're doing it wrong. > > (('bcsel', ('inot', a), b, c), ('bcsel', a, c, b)), > > (('bcsel', a, ('bcsel', a, b, c), d), ('bcsel', a, b, d)), > > (('bcsel', a, True, 'b@bool'), ('ior', a, b)), > > -- > > 2.9.5 > > > > _______________________________________________ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org <mailto:mesa-dev@lists. > freedesktop.org> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > <https://lists.freedesktop.org/mailman/listinfo/mesa-dev> >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev