On 01/26/2017 12:20 PM, Francisco Jerez wrote: > Ian Romanick <i...@freedesktop.org> writes: > >> On 01/25/2017 10:53 AM, Francisco Jerez wrote: >>> Hi Ian, and thank you for your comments, >>> >>> Ian Romanick <i...@freedesktop.org> writes: >>> >>>> On 01/24/2017 03:26 PM, Francisco Jerez wrote: >>>>> Will avoid a regression in a future commit that introduces some >>>>> additional rcp operations. >>>> >>>> When I converted GLSL IR to ir_expression_operation.py, I was careful to >>>> keep all the expressions the same. rcp and div had these weird guards. >>>> GLSL doesn't require that NaN be generated, and quite a few old GPUs >>>> don't. If the atan2 implementation depends on NaN being generated by >>>> rcp, it may have problems on i915, r300, and similar GPUs. I don't know >>>> what they generate, but it's not NaN and it's probably not 0.0. >>> >>> The atan2 implementation from patch 5 doesn't rely on NaNs being >>> generated, but it does rely on the reciprocal operation handling zero >>> and infinity correctly as specified by GLSL for the division operation. >> >> Okay. That is the problem on older GPUs that I was referring. >> Specifically, all versions of the GLSL prior to 4.40 (!) say: >> >> Similarly, treatment of conditions such as divide by 0 may lead to >> an unspecified result, but in no case should such a condition lead >> to the interruption or termination of processing. > > I cannot find this paragraph in any of the GLSL 4.1+ specs.
Somehow I botched my search. GLSL 4.10 was the first version to drop this in favor of, "Dividing by 0 results in the appropriately signed IEEE Inf." >> I believe that DX11 requires the GLSL 4.40+ behavior, so all even >> somewhat modern devices should just work. It's all the pre-DX11 >> hardware that's might be a problem. > > Actually I don't think PATCH 5 necessarily cares about infinities not > getting generated -- The only requirement is for rcp(0) to return a > fairly large value in absolute value (the larger the more accurate the > result will be), even the sign of the result is pretty much irrelevant. I expect that's what non-Inf GPUs do / did... generate float MAXVAL. Now that I'm understanding (and remembering) all the issues better, I'm a bit less worried. As I mentioned in the previous message, I'd like to see someone test this on r300... I think I still have one somewhere, but I won't be able to test it for at least two weeks. > That said there's a relatively straightforward change we could apply to > PATCH 5 and 6 in order to make the calculation more robust against > division by zero (it will probably make this patch unnecessary to avoid > piglit regressions but I think we want it anyway because of GLSL 4.1+): Yes... I believe we do need this patch for GLSL 4.10 correctness. > We could leverage the coordinate rotation to turn (y, 0) into (0, y), > avoiding division by zero along the whole vertical line -- The only > remaining case where we could potentially divide by zero is along the > left y=0 half-line, but the function jumps from -π to π along that line > so returning an undefined value within [-π, π] for y=0 and x < 0 is very > unlikely to hurt, because AFAICT all GLSL versions lacking well-defined > divide by zero didn't require the implementation to represent the sign > of zero consistently either, so the shader is unlikely to be able to > generate a signed zero value accurately enough to notice the problem... > >> Now... talking to Jason just now, he reminded me that the spec also says >> the following about built-in functions: >> >> Function parameters specified as angle are assumed to be in units >> of radians. In no case will any of these functions result in a >> divide by zero error. If the divisor of a ratio is 0, then results >> will be undefined. >> > Heh... That could be interpreted as if atan2(y, 0) is undefined which > would make this discussion moot -- I'll send a v2 of PATCH 5 and 6 > anyway since it should be easy enough to get right. I think it allows atan2(y, 0) to have an undefined result, but atan2(0, -abs(x)) should still produce 0. Based on my reading of patch 5, those inputs would lead to rcp(0). >> We may be fine even on old, clunky hardware. Looking at the code in >> patch 5, atan(0, -abs(x)) would still be a problem if rcp(0) produces >> undefined results. It looks like >> tests/shaders/glsl-fs-atan-2.shader_test should hit that case. Anyone >> have r300 or r400 hardware to test that? >> >> This patch doesn't affect that, and, even with the "unspecified result" >> rule, it's clearly correct. This patch is >> >> Reviewed-by: Ian Romanick <ian.d.roman...@intel.com> >> >>>> That said, this matches NIR, and it's probably fine.
signature.asc
Description: OpenPGP digital signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev