"Juan A. Suarez Romero" <jasua...@igalia.com> writes: > Rewrite atan2(y,x) to cover (+/-)INF values. > > This fixes several test cases in Vulkan CTS > (dEQP-VK.glsl.builtin.precision.atan2.*) > > v2: do not flush denorms to 0 (jasuarez) > --- > src/compiler/spirv/vtn_glsl450.c | 48 > +++++++++++++++++++++++++++++++++++----- > 1 file changed, 42 insertions(+), 6 deletions(-) > > diff --git a/src/compiler/spirv/vtn_glsl450.c > b/src/compiler/spirv/vtn_glsl450.c > index 0d32fddbef..d52a22c0c3 100644 > --- a/src/compiler/spirv/vtn_glsl450.c > +++ b/src/compiler/spirv/vtn_glsl450.c > @@ -299,18 +299,47 @@ build_atan(nir_builder *b, nir_ssa_def *y_over_x) > return nir_fmul(b, tmp, nir_fsign(b, y_over_x)); > } > > +/* > + * Computes atan2(y,x) > + */ > static nir_ssa_def * > build_atan2(nir_builder *b, nir_ssa_def *y, nir_ssa_def *x) > { > nir_ssa_def *zero = nir_imm_float(b, 0.0f); > - > - /* If |x| >= 1.0e-8 * |y|: */ > - nir_ssa_def *condition = > - nir_fge(b, nir_fabs(b, x), > - nir_fmul(b, nir_imm_float(b, 1.0e-8f), nir_fabs(b, y))); > + nir_ssa_def *inf = nir_imm_float(b, INFINITY); > + nir_ssa_def *minus_inf = nir_imm_float(b, -INFINITY); > + nir_ssa_def *m_3_pi_4 = nir_fmul(b, nir_imm_float(b, 3.0f), > + nir_imm_float(b, M_PI_4f)); > + > + /* if y == +-INF */ > + nir_ssa_def *y_is_inf = nir_feq(b, nir_fabs(b, y), inf); > + > + /* if x == +-INF */ > + nir_ssa_def *x_is_inf = nir_feq(b, nir_fabs(b, x), inf); > + > + /* Case: y is +-INF */ > + nir_ssa_def *y_is_inf_then = > + nir_fmul(b, nir_fsign(b, y), > + nir_bcsel(b, nir_feq(b, x, inf), > + nir_imm_float(b, M_PI_4f), > + nir_bcsel(b, nir_feq(b, x, minus_inf), > + m_3_pi_4, > + nir_imm_float(b, M_PI_2f)))); > + > + /* Case: x is +-INF */ > + nir_ssa_def *x_is_inf_then = > + nir_fmul(b, nir_fsign(b, y), > + nir_bcsel(b, nir_feq(b, x, inf), > + zero, > + nir_imm_float(b, M_PIf))); > +
I don't think we need all these special cases. The majority of the infinity/zero handling rules required by IEEE are fairly natural and would be taken care of without any additional effort by the floating-point division operation and single-argument atan function below if they propagated infinities and zeroes according to IEEE rules. I had a look at the test results myself and noticed that the failures are for the most part due to a precision problem in the current implementation that doesn't only affect infinity -- Relative precision also explodes as x grows above certain point, infinities just make the problem catastrophic and cause it to return NaN instead of the expected finite value. The reason for the precision problem is that fdiv is later on lowered into an fmul+frcp sequence, and the latter may flush the result to zero if the denominator was so huge that its reciprocal would be denormalized. If the numerator happened to be infinite you may end up with ∞/huge = NaN for the same reason. On top of that there seem to be other issues with the current atan2 implementation: - It doesn't handle zeros correctly. This may be related to your observation that denorm arguments cause it to give bogus results, but the problem doesn't seem to be related to denorms in particular, but to the fact that denorms can get flushed to -0 which is in turn handled incorrectly. The reason is that the existing code uses 'y >= 0' to determine on which side of the branch cut we are, but that causes the discontinuity to end up along the y=-epsilon line instead of along the y=0 line as IEEE requires -- IOW, with the current implementation very small negative y values behave as if they were positive which causes the result to have a large absolute error of 2π. - It doesn't give IEEE-compliant results when both arguments are simultaneously infinite. This is not surprising given that IEEE defining atan2(∞, ∞) = π/4 is fairly artificial (as are the other rules for combinations of positive or negative infinity), strictly speaking taking the limit along any direction other than the diagonal would be as right or wrong. To make the matter worse IEEE disagrees with itself on the direction limits are taken when it goes on and defines e.g. atan2(+0, -0) = π taken along the horizontal. Luckily GLSL specifically allows implementations to deviate from IEEE rules in a neighborhood of zero ("Results are undefined if x and y are both 0."), so we don't need to care about that one. Except for the last point, these issues seem serious enough to be worth fixing -- I'll reply to this thread with an alternative implementation of atan2 that addresses the first two issues (and actually uses less instructions than the current implementation), plus another, optional patch that addresses the third issue in order to make the CTS tests happy (we can implement the whole oddball atan2(±∞, ±∞) corner cases as IEEE says with only three additional Gen instructions by being bit smart, but one could argue that Khronos should just make atan2(±∞, ±∞) undefined since they're already deviating from IEEE-compliant behavior at (±0, ±0)). P.S.: Been waiting for hours to get jenkins results and it looks like the CI is busted, will send the patches once I get positive results -- Likely not today ;). > + /* If x > 0 */ > + nir_ssa_def *x_not_zero = > + nir_fne(b, x, zero); > > /* Then...call atan(y/x) and fix it up: */ > nir_ssa_def *atan1 = build_atan(b, nir_fdiv(b, y, x)); > + > nir_ssa_def *r_then = > nir_bcsel(b, nir_flt(b, x, zero), > nir_fadd(b, atan1, > @@ -323,7 +352,14 @@ build_atan2(nir_builder *b, nir_ssa_def *y, nir_ssa_def > *x) > nir_ssa_def *r_else = > nir_fmul(b, nir_fsign(b, y), nir_imm_float(b, M_PI_2f)); > > - return nir_bcsel(b, condition, r_then, r_else); > + /* Everything together */ > + return nir_bcsel(b, y_is_inf, > + y_is_inf_then, > + nir_bcsel(b, x_is_inf, > + x_is_inf_then, > + nir_bcsel(b, x_not_zero, > + r_then, > + r_else))); > } > > static nir_ssa_def * > -- > 2.11.0 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev