On 01/25/2017 10:53 AM, Francisco Jerez wrote: > Hi Ian, and thank you for your comments, > > Ian Romanick <i...@freedesktop.org> writes: > >> On 01/24/2017 03:26 PM, Francisco Jerez wrote: >>> Will avoid a regression in a future commit that introduces some >>> additional rcp operations. >> >> When I converted GLSL IR to ir_expression_operation.py, I was careful to >> keep all the expressions the same. rcp and div had these weird guards. >> GLSL doesn't require that NaN be generated, and quite a few old GPUs >> don't. If the atan2 implementation depends on NaN being generated by >> rcp, it may have problems on i915, r300, and similar GPUs. I don't know >> what they generate, but it's not NaN and it's probably not 0.0. > > The atan2 implementation from patch 5 doesn't rely on NaNs being > generated, but it does rely on the reciprocal operation handling zero > and infinity correctly as specified by GLSL for the division operation.
Okay. That is the problem on older GPUs that I was referring. Specifically, all versions of the GLSL prior to 4.40 (!) say: Similarly, treatment of conditions such as divide by 0 may lead to an unspecified result, but in no case should such a condition lead to the interruption or termination of processing. I believe that DX11 requires the GLSL 4.40+ behavior, so all even somewhat modern devices should just work. It's all the pre-DX11 hardware that's might be a problem. Now... talking to Jason just now, he reminded me that the spec also says the following about built-in functions: Function parameters specified as angle are assumed to be in units of radians. In no case will any of these functions result in a divide by zero error. If the divisor of a ratio is 0, then results will be undefined. We may be fine even on old, clunky hardware. Looking at the code in patch 5, atan(0, -abs(x)) would still be a problem if rcp(0) produces undefined results. It looks like tests/shaders/glsl-fs-atan-2.shader_test should hit that case. Anyone have r300 or r400 hardware to test that? This patch doesn't affect that, and, even with the "unspecified result" rule, it's clearly correct. This patch is Reviewed-by: Ian Romanick <ian.d.roman...@intel.com> >> That said, this matches NIR, and it's probably fine. >> >>> --- >>> src/compiler/glsl/ir_expression_operation.py | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/src/compiler/glsl/ir_expression_operation.py >>> b/src/compiler/glsl/ir_expression_operation.py >>> index f91ac9b..4ac1ffb 100644 >>> --- a/src/compiler/glsl/ir_expression_operation.py >>> +++ b/src/compiler/glsl/ir_expression_operation.py >>> @@ -422,7 +422,7 @@ ir_expression_operation = [ >>> operation("neg", 1, source_types=numeric_types, c_expression={'u': >>> "-((int) {src0})", 'default': "-{src0}"}), >>> operation("abs", 1, source_types=signed_numeric_types, >>> c_expression={'i': "{src0} < 0 ? -{src0} : {src0}", 'f': "fabsf({src0})", >>> 'd': "fabs({src0})", 'i64': "{src0} < 0 ? -{src0} : {src0}"}), >>> operation("sign", 1, source_types=signed_numeric_types, >>> c_expression={'i': "({src0} > 0) - ({src0} < 0)", 'f': "float(({src0} > >>> 0.0F) - ({src0} < 0.0F))", 'd': "double(({src0} > 0.0) - ({src0} < 0.0))", >>> 'i64': "({src0} > 0) - ({src0} < 0)"}), >>> - operation("rcp", 1, source_types=real_types, c_expression={'f': "{src0} >>> != 0.0F ? 1.0F / {src0} : 0.0F", 'd': "{src0} != 0.0 ? 1.0 / {src0} : >>> 0.0"}), >>> + operation("rcp", 1, source_types=real_types, c_expression={'f': "1.0F / >>> {src0}", 'd': "1.0 / {src0}"}), >>> operation("rsq", 1, source_types=real_types, c_expression={'f': "1.0F / >>> sqrtf({src0})", 'd': "1.0 / sqrt({src0})"}), >>> operation("sqrt", 1, source_types=real_types, c_expression={'f': >>> "sqrtf({src0})", 'd': "sqrt({src0})"}), >>> operation("exp", 1, source_types=(float_type,), >>> c_expression="expf({src0})"), # Log base e on gentype >>>
signature.asc
Description: OpenPGP digital signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev