Ian Romanick <i...@freedesktop.org> writes: > On 01/26/2017 12:20 PM, Francisco Jerez wrote: >> Ian Romanick <i...@freedesktop.org> writes: >> >>> On 01/25/2017 10:53 AM, Francisco Jerez wrote: >>>> Hi Ian, and thank you for your comments, >>>> >>>> Ian Romanick <i...@freedesktop.org> writes: >>>> >>>>> On 01/24/2017 03:26 PM, Francisco Jerez wrote: >>>>>> Will avoid a regression in a future commit that introduces some >>>>>> additional rcp operations. >>>>> >>>>> When I converted GLSL IR to ir_expression_operation.py, I was careful to >>>>> keep all the expressions the same. rcp and div had these weird guards. >>>>> GLSL doesn't require that NaN be generated, and quite a few old GPUs >>>>> don't. If the atan2 implementation depends on NaN being generated by >>>>> rcp, it may have problems on i915, r300, and similar GPUs. I don't know >>>>> what they generate, but it's not NaN and it's probably not 0.0. >>>> >>>> The atan2 implementation from patch 5 doesn't rely on NaNs being >>>> generated, but it does rely on the reciprocal operation handling zero >>>> and infinity correctly as specified by GLSL for the division operation. >>> >>> Okay. That is the problem on older GPUs that I was referring. >>> Specifically, all versions of the GLSL prior to 4.40 (!) say: >>> >>> Similarly, treatment of conditions such as divide by 0 may lead to >>> an unspecified result, but in no case should such a condition lead >>> to the interruption or termination of processing. >> >> I cannot find this paragraph in any of the GLSL 4.1+ specs. > > Somehow I botched my search. GLSL 4.10 was the first version to drop > this in favor of, "Dividing by 0 results in the appropriately signed > IEEE Inf." > >>> I believe that DX11 requires the GLSL 4.40+ behavior, so all even >>> somewhat modern devices should just work. It's all the pre-DX11 >>> hardware that's might be a problem. >> >> Actually I don't think PATCH 5 necessarily cares about infinities not >> getting generated -- The only requirement is for rcp(0) to return a >> fairly large value in absolute value (the larger the more accurate the >> result will be), even the sign of the result is pretty much irrelevant. > > I expect that's what non-Inf GPUs do / did... generate float MAXVAL. > Now that I'm understanding (and remembering) all the issues better, I'm > a bit less worried. As I mentioned in the previous message, I'd like to > see someone test this on r300... I think I still have one somewhere, but > I won't be able to test it for at least two weeks. > >> That said there's a relatively straightforward change we could apply to >> PATCH 5 and 6 in order to make the calculation more robust against >> division by zero (it will probably make this patch unnecessary to avoid >> piglit regressions but I think we want it anyway because of GLSL 4.1+): > > Yes... I believe we do need this patch for GLSL 4.10 correctness. > >> We could leverage the coordinate rotation to turn (y, 0) into (0, y), >> avoiding division by zero along the whole vertical line -- The only >> remaining case where we could potentially divide by zero is along the >> left y=0 half-line, but the function jumps from -π to π along that line >> so returning an undefined value within [-π, π] for y=0 and x < 0 is very >> unlikely to hurt, because AFAICT all GLSL versions lacking well-defined >> divide by zero didn't require the implementation to represent the sign >> of zero consistently either, so the shader is unlikely to be able to >> generate a signed zero value accurately enough to notice the problem... >> >>> Now... talking to Jason just now, he reminded me that the spec also says >>> the following about built-in functions: >>> >>> Function parameters specified as angle are assumed to be in units >>> of radians. In no case will any of these functions result in a >>> divide by zero error. If the divisor of a ratio is 0, then results >>> will be undefined. >>> >> Heh... That could be interpreted as if atan2(y, 0) is undefined which >> would make this discussion moot -- I'll send a v2 of PATCH 5 and 6 >> anyway since it should be easy enough to get right. > > I think it allows atan2(y, 0) to have an undefined result, but atan2(0, > -abs(x)) should still produce 0.
Nope, atan2(±0, -abs(x)) is right along the discontinuity, and the expected value would jump from -π to π depending on the sign of zero. > Based on my reading of patch 5, those inputs would lead to rcp(0). > Yeah, but then again hardware unable to give IEEE-compliant results for division by zero is unlikely to be able to generate signed zero accurately enough, so the result is going to have a maximum absolute error of 2π anyway. >>> We may be fine even on old, clunky hardware. Looking at the code in >>> patch 5, atan(0, -abs(x)) would still be a problem if rcp(0) produces >>> undefined results. It looks like >>> tests/shaders/glsl-fs-atan-2.shader_test should hit that case. Anyone >>> have r300 or r400 hardware to test that? >>> >>> This patch doesn't affect that, and, even with the "unspecified result" >>> rule, it's clearly correct. This patch is >>> >>> Reviewed-by: Ian Romanick <ian.d.roman...@intel.com> >>> >>>>> That said, this matches NIR, and it's probably fine.
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev