Re: [Mesa-dev] [PATCH 11/22] nir: Recognize some more open-coded fmin / fmax

Ian Romanick Mon, 26 Feb 2018 14:21:30 -0800

On 02/23/2018 05:14 PM, Jason Ekstrand wrote:
> On Fri, Feb 23, 2018 at 3:55 PM, Ian Romanick <i...@freedesktop.org> wrote:
> 
>     From: Ian Romanick <ian.d.roman...@intel.com>
> 
>     shader-db results:
> 
>     Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
>     total instructions in shared programs: 14514817 -> 14514808 (<.01%)
>     instructions in affected programs: 229 -> 220 (-3.93%)
>     helped: 3
>     HURT: 0
>     helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4
>     helped stats (rel) min: 2.86% max: 4.12% x̄: 3.70% x̃: 4.12%
> 
>     total cycles in shared programs: 533145211 -> 533144939 (<.01%)
>     cycles in affected programs: 37268 -> 36996 (-0.73%)
>     helped: 8
>     HURT: 0
>     helped stats (abs) min: 2 max: 134 x̄: 34.00 x̃: 2
>     helped stats (rel) min: 0.02% max: 14.22% x̄: 3.53% x̃: 0.05%
> 
>     Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown)
>     total cycles in shared programs: 257618409 -> 257618403 (<.01%)
>     cycles in affected programs: 12582 -> 12576 (-0.05%)
>     helped: 3
>     HURT: 0
>     helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
>     helped stats (rel) min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05%
> 
>     No changes on Iron Lake or GM45.
> 
>     Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
>     ---
>      src/compiler/nir/nir_opt_algebraic.py | 2 ++
>      1 file changed, 2 insertions(+)
> 
>     diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
>     index d40d59b..f5f9e94 100644
>     --- a/src/compiler/nir/nir_opt_algebraic.py
>     +++ b/src/compiler/nir/nir_opt_algebraic.py
>     @@ -170,6 +170,8 @@ optimizations = [
>         (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
>         (('bcsel', ('flt', b, a), b, a), ('fmin', a, b)),
>         (('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
>     +   (('bcsel', ('fge', b, a), a, b), ('fmin', a, b)),
>     +   (('bcsel', ('fge', a, b), a, b), ('fmax', a, b)),
> 
> 
> Please flag as inexact.  As per the stupid GLSL definition, these are
> not the same as fmin/fmax when you throw in a NaN.


I'm having some trouble reconciling this with the existing
transformations and the Intel hardware implementation.

GLSL spec says min(x, y) "Returns y if y < x; otherwise it returns x."
From that I infer min(x, NaN) == x, and min(NaN, y) == NaN.  The
expression ('bcsel', ('flt', b, a), b, a) has the same behavior.

I think if I rewrite the fmin transform as (swapping the argument order)

    (('bcsel', ('fge', a, b), b, a), ('fmin', a, b)),

it should be at least as valid as the existing transforms.  A
similar modification should work for fmax.
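
To sanity-check the argument-order reasoning above, here is a rough
Python sketch.  It models bcsel / flt / fge with ordinary Python float
comparisons (which are false whenever an operand is NaN) and compares
each open-coded pattern against the GLSL wording.  The helper names are
made up for illustration, and swapped_max is only my guess at what the
"similar modification" for fmax would look like.

```python
import math
from itertools import product

NAN = float("nan")
VALUES = [-1.0, 0.0, 2.5, NAN]


def same(x, y):
    """Equality that also treats NaN as equal to NaN."""
    return (math.isnan(x) and math.isnan(y)) or x == y


def glsl_min(x, y):
    # GLSL: "Returns y if y < x; otherwise it returns x."
    return y if y < x else x


def glsl_max(x, y):
    # GLSL: "Returns y if x < y; otherwise it returns x."
    return y if x < y else x


def patch_min(a, b):
    # ('bcsel', ('fge', b, a), a, b), as posted in the patch
    return a if b >= a else b


def swapped_min(a, b):
    # ('bcsel', ('fge', a, b), b, a), the rewrite proposed above
    return b if a >= b else a


def patch_max(a, b):
    # ('bcsel', ('fge', a, b), a, b), as posted in the patch
    return a if a >= b else b


def swapped_max(a, b):
    # ('bcsel', ('fge', b, a), b, a): ASSUMED analogue for fmax
    return b if b >= a else a


def mismatches(pattern, reference):
    """Return the (a, b) pairs where pattern and reference disagree."""
    return [(a, b) for a, b in product(VALUES, repeat=2)
            if not same(pattern(a, b), reference(a, b))]


if __name__ == "__main__":
    for name, pat, ref in [("patch_min", patch_min, glsl_min),
                           ("swapped_min", swapped_min, glsl_min),
                           ("patch_max", patch_max, glsl_max),
                           ("swapped_max", swapped_max, glsl_max)]:
        print(name, mismatches(pat, ref))
```

On this small value set the two patterns from the patch disagree with
the GLSL definition only on pairs where exactly one argument is NaN,
and the swapped forms agree everywhere.  This is a toy model of the
comparison semantics, not a test of NIR itself.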

The description of the Intel SEL instruction says that, with the .L or
.GE modifier, if one argument is NaN, the other value is always
returned.  This means that min(NaN, y) will be y.

This is valid for min and max because section 4.7.1 (Range and
Precision) says:

    Operations and built-in functions that operate on a NaN are not
    required to return a NaN as the result.

I don't think returning non-NaN for ('bcsel', ('flt', b, NaN), b, NaN)
is valid, so I think the existing transformations should also be marked
inexact for platforms that implement the "never NaN" behavior for fmin
or fmax.
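
The mismatch can be illustrated with another small Python sketch.
sel_min below is a hypothetical model of the "never NaN" behavior
described for SEL (return the other operand when one is NaN); it is not
taken from any hardware documentation beyond that one sentence.
open_coded is the existing ('bcsel', ('flt', b, a), b, a) pattern
evaluated with ordinary float comparisons.

```python
import math

NAN = float("nan")


def sel_min(a, b):
    # Hypothetical "never NaN" min: if one argument is NaN,
    # the other value is returned.
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return b if b < a else a


def open_coded(a, b):
    # ('bcsel', ('flt', b, a), b, a): any comparison with NaN is false,
    # so the bcsel falls through to 'a'.
    return b if b < a else a


if __name__ == "__main__":
    print(open_coded(NAN, 3.0), sel_min(NAN, 3.0))
    print(open_coded(3.0, NAN), sel_min(3.0, NAN))
```

With a = NaN the open-coded expression yields NaN while the "never NaN"
min yields b, which is the case that makes the replacement inexact
under this model.  With b = NaN both forms return a.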

>         (('bcsel', ('inot', a), b, c), ('bcsel', a, c, b)),
>         (('bcsel', a, ('bcsel', a, b, c), d), ('bcsel', a, b, d)),
>         (('bcsel', a, True, 'b@bool'), ('ior', a, b)),
>     --
>     2.9.5
> 
>     _______________________________________________
>     mesa-dev mailing list
>     mesa-dev@lists.freedesktop.org
>     https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 11/22] nir: Recognize some more open-coded fmin / fmax