On Thu, Feb 25, 2016 at 6:16 PM, Francisco Jerez <curroje...@riseup.net> wrote:
> Ian Romanick <i...@freedesktop.org> writes:
>
>> On 02/25/2016 12:13 PM, Francisco Jerez wrote:
>>> Ian Romanick <i...@freedesktop.org> writes:
>>>
>>>> On 02/25/2016 08:46 AM, Roland Scheidegger wrote:
>>>>> On 25.02.2016 at 11:15, Iago Toral Quiroga wrote:
>>>>>> From the OpenGL 4.2 spec:
>>>>>>
>>>>>>    "When a constructor is used to convert any integer or floating-point
>>>>>>    type to a bool, 0 and 0.0 are converted to false, and non-zero
>>>>>>    values are converted to true."
>>>>>>
>>>>>> Thus, even the smallest non-zero floating-point value should be
>>>>>> converted to true. This behavior has also been verified with
>>>>>> proprietary NVIDIA drivers.
>>>>>>
>>>>>> Currently, we implement this conversion as a cmp.nz operation with
>>>>>> floats, subject to floating-point precision limitations, and as a
>>>>>> result, relatively small non-zero floating-point numbers return false
>>>>>> instead of true.
>>>>>>
>>>>>> This patch fixes the problem by getting rid of the sign bit (to cover
>>>>>> the case of -0.0) and testing the result against 0u using an integer
>>>>>> comparison instead.
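[Editor's note: the strategy the commit message describes can be sketched in portable C. This is only a CPU illustration of the bit manipulation, not the driver code, which operates on GPU registers via the fs_builder.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* CPU sketch of the patched f2b: reinterpret the float's bits as an
 * unsigned integer, clear the sign bit so that -0.0 compares equal to
 * 0.0, and test against zero with an integer comparison, which cannot
 * lose small non-zero values the way a float compare might. */
static bool f2b_by_bits(float f)
{
   uint32_t bits;
   memcpy(&bits, &f, sizeof bits); /* well-defined type pun in C */
   return (bits & 0x7FFFFFFFu) != 0u;
}
```

With this, both signed zeros map to false while any non-zero bit pattern, including subnormals, maps to true.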
>>>>>> ---
>>>>>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 ++++++++++++---
>>>>>>  1 file changed, 12 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>>>>>> index db20c71..7d62d7e 100644
>>>>>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>>>>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>>>>>> @@ -913,9 +913,18 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
>>>>>>        bld.MOV(result, negate(op[0]));
>>>>>>        break;
>>>>>>
>>>>>> -   case nir_op_f2b:
>>>>>> -      bld.CMP(result, op[0], brw_imm_f(0.0f), BRW_CONDITIONAL_NZ);
>>>>>> -      break;
>>>>>> +   case nir_op_f2b: {
>>>>>> +      /* Because comparing to 0.0f is subject to precision limitations,
>>>>>> +       * do the comparison using integers (we need to get rid of the
>>>>>> +       * sign bit for that)
>>>>>> +       */
>>>>>> +      if (devinfo->gen >= 8)
>>>>>> +         op[0] = resolve_source_modifiers(op[0]);
>>>>>> +      op[0] = retype(op[0], BRW_REGISTER_TYPE_UD);
>>>>>> +      bld.AND(op[0], op[0], brw_imm_ud(0x7FFFFFFFu));
>>>>>> +      bld.CMP(result, op[0], brw_imm_ud(0u), BRW_CONDITIONAL_NZ);
>>>>>> +      break;
>>>>>> +   }
>>>>>> +
>>>>>>    case nir_op_i2b:
>>>>>>       bld.CMP(result, op[0], brw_imm_d(0), BRW_CONDITIONAL_NZ);
>>>>>>       break;
>>>>>>
>>>>>
>>>>> Does that fix anything? I don't really see a problem with the existing
>>>>> logic. Yes, any "non-zero value" should be converted to true, but surely
>>>>> that definition cannot include denorms, which you are always allowed to
>>>>> flush to zero.
>>>>> (Albeit I can't tell what the result would be with NaNs with the float
>>>>> compare, nor what the result actually should be in this case, since GLSL
>>>>> doesn't require NaNs either.)
>>>>
>>>> Based on this and Jason's comments, I think we need a bunch of new tests.
>>>>
>>>> - smallest positive normal number
>>>> - abs of smallest positive normal number
>>>> - neg of " " " "
>>>> - largest positive subnormal number
>>>> - abs of largest positive subnormal number
>>>> - neg of " " " "
>>>> - all of the above with negative numbers
>>>> - NaN
>>>> - abs of NaN
>>>> - neg of "
>>>>
>>>> Perhaps others? +/-Inf just for kicks?
>>>
>>> What's the point? The result of most of the above (except possibly
>>> bool(NaN)) is undefined by the spec:
>>>
>>>    "Any denormalized value input into a shader or potentially generated
>>>    by any operation in a shader can be flushed to 0. [...] NaNs are not
>>>    required to be generated. [...] Operations and built-in functions
>>>    that operate on a NaN are not required to return a NaN as the result."
>>
>> Except that apparently one major OpenGL vendor does something well
>> defined that's different than what we do.
>
> I'm skeptical that nVidia would treat single-precision denorms
> inconsistently between datatype constructors and other floating-point
> arithmetic, but assuming that's the case, it would be an argument for
> proposing the spec change to Khronos rather than introducing a dubiously
> compliant change into the back-end. I think I would argue against making
> such a change in the spec in any case, because even though it's
> implementation-defined whether denorms are flushed or not, the following
> is guaranteed by the spec AFAIUI:
>
> | if (bool(f))
> |    random_arithmetic_on(f /* Guaranteed not to be zero here even if
> |                              denorms are flushed */);
>
> With this change in, bool(f) would evaluate to true even if f is a denorm
> which is flushed to zero for all subsequent arithmetic in the block. For
> that reason this seems more likely to break stuff than to fix stuff: it's
> not like applications can expect denorms to be representable at all
> regardless of what nVidia does, but they can expect the above GLSL source
> to work.
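[Editor's note: Francisco's concern can be simulated in portable C. On real hardware the flush-to-zero happens implicitly on arithmetic instructions; here a hypothetical `flush_denorm` helper makes it explicit so the inconsistency is visible.]

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* The patched f2b behaviour: any non-zero bit pattern (sign bit cleared)
 * is true, even a subnormal. */
static bool bool_by_integer_bits(float f)
{
   uint32_t bits;
   memcpy(&bits, &f, sizeof bits);
   return (bits & 0x7FFFFFFFu) != 0u;
}

/* Hypothetical stand-in for a hardware flush-to-zero on a subsequent
 * arithmetic instruction: subnormals (magnitude below FLT_MIN) become 0. */
static float flush_denorm(float f)
{
   return (f != 0.0f && f > -FLT_MIN && f < FLT_MIN) ? 0.0f : f;
}
```

For a subnormal `f`, `bool_by_integer_bits(f)` is true while `flush_denorm(f)` is exactly 0.0f, so code guarded by `if (bool(f))` could still end up operating on zero, which is the breakage the reply warns about.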
I'd be curious to see a shader test that triggers the relevant behaviour --
AFAIK nvidia always flushes denorms. The flush is a per-instruction setting
though [since Fermi], so it's technically feasible not to flush sometimes
and to flush other times.

  -ilia

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev