On Mon, 2017-01-16 at 10:21 -0800, Jason Ekstrand wrote: > +curro > > I'm not sure what I think here. TBH, I haven't actually read it in detail > yet, but here are some first impressions: > 1) There are two implementations of atan2 (SPIR-V and GLSL) and they should > be kept in sync. The same dEQP tests exist in both cases.
Yes. I focused in Vulkan, but I agree these changes (if accepted) could be ported to the GLSL version. > 2) The atan2 tests are not in any mustpass list and I doubt we'll see them > show up soon (I could be wrong on that one) so I'm not sure fixing them is > all that high priority. Agree. There were removed from mustpass precisely due the corner cases it has. > 3) There's a good reason why they aren't on the mustpass: atan2 has lots of > hard-to-get-right corners and, unless you have a very good hardware atan2 > instruction, any "correct" implementation is liable to emit a *lot* of shader > code. A quick grep through shader-db for atan yields a number of shaders > that use the atan2 variant so this may have a minor perf impact. > This was the motivation to submit a change here: current atan2 implementation, already emit a lot of code. So adding some instructions more I think won't make a big change here. And atan2 doesn't seem a very used function. In shader-db (including private shaders), I only found 36 shaders using it, out of of 26k. > The above does not necessarily sum to "we shouldn't fix it" but it probably > does mean it's low-priority at best and we need to be careful. > > Looking a bit into the math of atan2, I'm not convinced flushing denorms to > zero is actually what we want to do. For x, I think it's valid, but as y > approaches 0, you get +/- pi/2 depending on whether y positive or negative. > As both approach zero, it's undefined. As you approach infinity, it > approaches something in the interval [-pi/2, pi/2] but it depends on the > direction of approach. If you do "man atan2" you'll see this all layed out > in a horrifying amount of detail (somewhere around a dozen different cases). > > Yes, originally I only flushed 'x'. But then also flushed 'y' to keep coherence. But didn't think about that corner case you mention. I think we can remove the flush to 0 in 'y'. > On Mon, Jan 16, 2017 at 4:08 AM, Juan A. Suarez Romero <jasua...@igalia.com> > wrote: > > Rewrite atan2(y,x) to cover (+/-)INF values. > > > > > > > > Also, in case either 'y' or 'x' is a denorm value, flush it to 0 at the > > > > very beginning. > > > > > > > > The reason is that in other case, the hardware will do the flush in some > > > > of the steps, but not in order. So we end up handling in some steps a > > > > denorm value and in others a 0. This causes wrong results. > > > > > > > > Doing it at the very beginning we ensure always the same value is used > > > > (a 0) in all the steps. > > > > > > > > This fixes several test cases in Vulkan CTS > > > > (dEQP-VK.glsl.builtin.precision.atan2.*) > > > > --- > > > > src/compiler/spirv/vtn_glsl450.c | 68 > > ++++++++++++++++++++++++++++++++++------ > > > > 1 file changed, 58 insertions(+), 10 deletions(-) > > > > > > > > diff --git a/src/compiler/spirv/vtn_glsl450.c > > b/src/compiler/spirv/vtn_glsl450.c > > > > index 0d32fdd..d2743a8 100644 > > > > --- a/src/compiler/spirv/vtn_glsl450.c > > > > +++ b/src/compiler/spirv/vtn_glsl450.c > > > > @@ -299,31 +299,79 @@ build_atan(nir_builder *b, nir_ssa_def *y_over_x) > > > > return nir_fmul(b, tmp, nir_fsign(b, y_over_x)); > > > > } > > > > > > > > +/* > > > > + * Computes atan2(y,x) > > > > + * > > > > + * If any of the parameters is a denorm value, it is flushed to 0 at the > > very > > > > + * beginning to avoid precision errors > > > > + */ > > > > static nir_ssa_def * > > > > build_atan2(nir_builder *b, nir_ssa_def *y, nir_ssa_def *x) > > > > { > > > > nir_ssa_def *zero = nir_imm_float(b, 0.0f); > > > > - > > > > - /* If |x| >= 1.0e-8 * |y|: */ > > > > + nir_ssa_def *inf = nir_imm_float(b, INFINITY); > > > > + nir_ssa_def *minus_inf = nir_imm_float(b, -INFINITY); > > > > + nir_ssa_def *m_3_pi_4 = nir_fmul(b, nir_imm_float(b, 3.0f), > > > > + nir_imm_float(b, M_PI_4f)); > > > > + > > > > + nir_ssa_def *denorm_y = nir_bcsel(b, nir_feq(b, nir_fmov(b, nir_fabs(b, > > y)), > > > > + zero), > > > > + zero, > > > > + y); > > > > + nir_ssa_def *denorm_x = nir_bcsel(b, nir_feq(b, nir_fmov(b, nir_fabs(b, > > x)), > > > > + zero), > > > > + zero, > > > > + x); > > > > + > > > > + /* if y == +-INF */ > > > > + nir_ssa_def *y_is_inf = nir_feq(b, nir_fabs(b, y), inf); > > > > + > > > > + /* if x == +-INF */ > > > > + nir_ssa_def *x_is_inf = nir_feq(b, nir_fabs(b, x), inf); > > > > + > > > > + /* Case: y is +-INF */ > > > > + nir_ssa_def *y_is_inf_then = > > > > + nir_fmul(b, nir_fsign(b, y), > > > > + nir_bcsel(b, nir_feq(b, x, inf), > > > > + nir_imm_float(b, M_PI_4f), > > > > + nir_bcsel(b, nir_feq(b, x, minus_inf), > > > > + m_3_pi_4, > > > > + nir_imm_float(b, M_PI_2f)))); > > > > + > > > > + /* Case: x is +-INF */ > > > > + nir_ssa_def *x_is_inf_then = > > > > + nir_fmul(b, nir_fsign(b, y), > > > > + nir_bcsel(b, nir_feq(b, x, inf), > > > > + zero, > > > > + nir_imm_float(b, M_PIf))); > > > > + > > > > + /* If x > 0 */ > > > > nir_ssa_def *condition = > > > > - nir_fge(b, nir_fabs(b, x), > > > > - nir_fmul(b, nir_imm_float(b, 1.0e-8f), nir_fabs(b, y))); > > > > + nir_fne(b, denorm_x, zero); > > > > > > > > /* Then...call atan(y/x) and fix it up: */ > > > > - nir_ssa_def *atan1 = build_atan(b, nir_fdiv(b, y, x)); > > > > + nir_ssa_def *atan1 = build_atan(b, nir_fdiv(b, denorm_y, denorm_x)); > > > > + > > > > nir_ssa_def *r_then = > > > > - nir_bcsel(b, nir_flt(b, x, zero), > > > > + nir_bcsel(b, nir_flt(b, denorm_x, zero), > > > > nir_fadd(b, atan1, > > > > - nir_bcsel(b, nir_fge(b, y, zero), > > > > + nir_bcsel(b, nir_fge(b, denorm_y, zero), > > > > nir_imm_float(b, M_PIf), > > > > nir_imm_float(b, -M_PIf))), > > > > atan1); > > > > > > > > /* Else... */ > > > > nir_ssa_def *r_else = > > > > - nir_fmul(b, nir_fsign(b, y), nir_imm_float(b, M_PI_2f)); > > > > - > > > > - return nir_bcsel(b, condition, r_then, r_else); > > > > + nir_fmul(b, nir_fsign(b, denorm_y), nir_imm_float(b, M_PI_2f)); > > > > + > > > > + /* Everything together */ > > > > + return nir_bcsel(b, y_is_inf, > > > > + y_is_inf_then, > > > > + nir_bcsel(b, x_is_inf, > > > > + x_is_inf_then, > > > > + nir_bcsel(b, condition, > > > > + r_then, > > > > + r_else))); > > > > } > > > > > > > > static nir_ssa_def * > > > > -- > > > > 2.9.3 > > > > > > > > _______________________________________________ > > > > mesa-dev mailing list > > > > mesa-dev@lists.freedesktop.org > > > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev