On Mon, 2017-01-16 at 10:21 -0800, Jason Ekstrand wrote:
> +curro
> 
> I'm not sure what I think here.  TBH, I haven't actually read it in detail 
> yet, but here are some first impressions:
>  1) There are two implementations of atan2 (SPIR-V and GLSL) and they should 
> be kept in sync.  The same dEQP tests exist in both cases.

Yes. I focused in Vulkan, but I agree these changes (if accepted) could be 
ported to the GLSL version.
>  2) The atan2 tests are not in any mustpass list and I doubt we'll see them 
> show up soon (I could be wrong on that one) so I'm not sure fixing them is 
> all that high priority.

Agree. There were removed from mustpass precisely due the corner cases it has.
>  3) There's a good reason why they aren't on the mustpass: atan2 has lots of 
> hard-to-get-right corners and, unless you have a very good hardware atan2 
> instruction, any "correct" implementation is liable to emit a *lot* of shader 
> code.  A quick grep through shader-db for atan yields a number of shaders 
> that use the atan2 variant so this may have a minor perf impact.
> 

This was the motivation to submit a change here: current atan2 implementation, 
already  emit a lot of code. So adding some instructions more I think won't 
make a big change here. And atan2 doesn't seem a very used function. In 
shader-db (including private shaders), I only found 36 shaders using it, out of 
of 26k.

> The above does not necessarily sum to "we shouldn't fix it" but it probably 
> does mean it's low-priority at best and we need to be careful.
> 
> Looking a bit into the math of atan2, I'm not convinced flushing denorms to 
> zero is actually what we want to do.  For x, I think it's valid, but as y 
> approaches 0, you get +/- pi/2 depending on whether y positive or negative.  
> As both approach zero, it's undefined.  As you approach infinity, it 
> approaches something in the interval [-pi/2, pi/2] but it depends on the 
> direction of approach.  If you do "man atan2" you'll see this all layed out 
> in a horrifying amount of detail (somewhere around a dozen different cases).
> 
> 

Yes, originally I only flushed 'x'. But then also flushed 'y' to keep 
coherence. But didn't think about that corner case you mention. I think we can 
remove the flush to 0 in 'y'.
 
> On Mon, Jan 16, 2017 at 4:08 AM, Juan A. Suarez Romero <jasua...@igalia.com> 
> wrote:
> > Rewrite atan2(y,x) to cover (+/-)INF values.
> > 
> > 
> > 
> > Also, in case either 'y' or 'x' is a denorm value, flush it to 0 at the
> > 
> > very beginning.
> > 
> > 
> > 
> > The reason is that in other case, the hardware will do the flush in some
> > 
> > of the steps, but not in order. So we end up handling in some steps a
> > 
> > denorm value and in others a 0. This causes wrong results.
> > 
> > 
> > 
> > Doing it at the very beginning we ensure always the same value is used
> > 
> > (a 0) in all the steps.
> > 
> > 
> > 
> > This fixes several test cases in Vulkan CTS
> > 
> > (dEQP-VK.glsl.builtin.precision.atan2.*)
> > 
> > ---
> > 
> >  src/compiler/spirv/vtn_glsl450.c | 68 
> > ++++++++++++++++++++++++++++++++++------
> > 
> >  1 file changed, 58 insertions(+), 10 deletions(-)
> > 
> > 
> > 
> > diff --git a/src/compiler/spirv/vtn_glsl450.c 
> > b/src/compiler/spirv/vtn_glsl450.c
> > 
> > index 0d32fdd..d2743a8 100644
> > 
> > --- a/src/compiler/spirv/vtn_glsl450.c
> > 
> > +++ b/src/compiler/spirv/vtn_glsl450.c
> > 
> > @@ -299,31 +299,79 @@ build_atan(nir_builder *b, nir_ssa_def *y_over_x)
> > 
> >     return nir_fmul(b, tmp, nir_fsign(b, y_over_x));
> > 
> >  }
> > 
> > 
> > 
> > +/*
> > 
> > + * Computes atan2(y,x)
> > 
> > + *
> > 
> > + * If any of the parameters is a denorm value, it is flushed to 0 at the 
> > very
> > 
> > + * beginning to avoid precision errors
> > 
> > + */
> > 
> >  static nir_ssa_def *
> > 
> >  build_atan2(nir_builder *b, nir_ssa_def *y, nir_ssa_def *x)
> > 
> >  {
> > 
> >     nir_ssa_def *zero = nir_imm_float(b, 0.0f);
> > 
> > -
> > 
> > -   /* If |x| >= 1.0e-8 * |y|: */
> > 
> > +   nir_ssa_def *inf = nir_imm_float(b, INFINITY);
> > 
> > +   nir_ssa_def *minus_inf = nir_imm_float(b, -INFINITY);
> > 
> > +   nir_ssa_def *m_3_pi_4 = nir_fmul(b, nir_imm_float(b, 3.0f),
> > 
> > +                                       nir_imm_float(b, M_PI_4f));
> > 
> > +
> > 
> > +   nir_ssa_def *denorm_y = nir_bcsel(b, nir_feq(b, nir_fmov(b, nir_fabs(b, 
> > y)),
> > 
> > +                                                   zero),
> > 
> > +                                        zero,
> > 
> > +                                        y);
> > 
> > +   nir_ssa_def *denorm_x = nir_bcsel(b, nir_feq(b, nir_fmov(b, nir_fabs(b, 
> > x)),
> > 
> > +                                                   zero),
> > 
> > +                                        zero,
> > 
> > +                                        x);
> > 
> > +
> > 
> > +   /* if y == +-INF */
> > 
> > +   nir_ssa_def *y_is_inf = nir_feq(b, nir_fabs(b, y), inf);
> > 
> > +
> > 
> > +   /* if x == +-INF */
> > 
> > +   nir_ssa_def *x_is_inf = nir_feq(b, nir_fabs(b, x), inf);
> > 
> > +
> > 
> > +   /* Case: y is +-INF */
> > 
> > +   nir_ssa_def *y_is_inf_then =
> > 
> > +      nir_fmul(b, nir_fsign(b, y),
> > 
> > +                  nir_bcsel(b, nir_feq(b, x, inf),
> > 
> > +                               nir_imm_float(b, M_PI_4f),
> > 
> > +                               nir_bcsel(b, nir_feq(b, x, minus_inf),
> > 
> > +                                            m_3_pi_4,
> > 
> > +                                            nir_imm_float(b, M_PI_2f))));
> > 
> > +
> > 
> > +   /* Case: x is +-INF */
> > 
> > +   nir_ssa_def *x_is_inf_then =
> > 
> > +      nir_fmul(b, nir_fsign(b, y),
> > 
> > +                  nir_bcsel(b, nir_feq(b, x, inf),
> > 
> > +                               zero,
> > 
> > +                               nir_imm_float(b, M_PIf)));
> > 
> > +
> > 
> > +   /* If x > 0 */
> > 
> >     nir_ssa_def *condition =
> > 
> > -      nir_fge(b, nir_fabs(b, x),
> > 
> > -              nir_fmul(b, nir_imm_float(b, 1.0e-8f), nir_fabs(b, y)));
> > 
> > +      nir_fne(b, denorm_x, zero);
> > 
> > 
> > 
> >     /* Then...call atan(y/x) and fix it up: */
> > 
> > -   nir_ssa_def *atan1 = build_atan(b, nir_fdiv(b, y, x));
> > 
> > +   nir_ssa_def *atan1 = build_atan(b, nir_fdiv(b, denorm_y, denorm_x));
> > 
> > +
> > 
> >     nir_ssa_def *r_then =
> > 
> > -      nir_bcsel(b, nir_flt(b, x, zero),
> > 
> > +      nir_bcsel(b, nir_flt(b, denorm_x, zero),
> > 
> >                     nir_fadd(b, atan1,
> > 
> > -                               nir_bcsel(b, nir_fge(b, y, zero),
> > 
> > +                               nir_bcsel(b, nir_fge(b, denorm_y, zero),
> > 
> >                                              nir_imm_float(b, M_PIf),
> > 
> >                                              nir_imm_float(b, -M_PIf))),
> > 
> >                     atan1);
> > 
> > 
> > 
> >     /* Else... */
> > 
> >     nir_ssa_def *r_else =
> > 
> > -      nir_fmul(b, nir_fsign(b, y), nir_imm_float(b, M_PI_2f));
> > 
> > -
> > 
> > -   return nir_bcsel(b, condition, r_then, r_else);
> > 
> > +      nir_fmul(b, nir_fsign(b, denorm_y), nir_imm_float(b, M_PI_2f));
> > 
> > +
> > 
> > +   /* Everything together */
> > 
> > +   return nir_bcsel(b, y_is_inf,
> > 
> > +                       y_is_inf_then,
> > 
> > +                       nir_bcsel(b, x_is_inf,
> > 
> > +                                    x_is_inf_then,
> > 
> > +                                    nir_bcsel(b, condition,
> > 
> > +                                                 r_then,
> > 
> > +                                                 r_else)));
> > 
> >  }
> > 
> > 
> > 
> >  static nir_ssa_def *
> > 
> > --
> > 
> > 2.9.3
> > 
> > 
> > 
> > _______________________________________________
> > 
> > mesa-dev mailing list
> > 
> > mesa-dev@lists.freedesktop.org
> > 
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > 
> > 
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to