On Fri, Feb 26, 2016 at 10:39 AM, Iago Toral <ito...@igalia.com> wrote: > On Fri, 2016-02-26 at 16:28 +0100, Roland Scheidegger wrote: >> Am 26.02.2016 um 11:25 schrieb Iago Toral: >> > >> > On Thu, 2016-02-25 at 18:20 -0500, Ilia Mirkin wrote: >> >> On Thu, Feb 25, 2016 at 6:16 PM, Francisco Jerez <curroje...@riseup.net> >> >> wrote: >> >>> Ian Romanick <i...@freedesktop.org> writes: >> >>> >> >>>> On 02/25/2016 12:13 PM, Francisco Jerez wrote: >> >>>>> Ian Romanick <i...@freedesktop.org> writes: >> >>>>> >> >>>>>> On 02/25/2016 08:46 AM, Roland Scheidegger wrote: >> >>>>>>> Am 25.02.2016 um 11:15 schrieb Iago Toral Quiroga: >> >>>>>>>> From the OpenGL 4.2 spec: >> >>>>>>>> >> >>>>>>>> "When a constructor is used to convert any integer or >> >>>>>>>> floating-point type to a >> >>>>>>>> bool, 0 and 0.0 are converted to false, and non-zero values are >> >>>>>>>> converted to >> >>>>>>>> true." >> >>>>>>>> >> >>>>>>>> Thus, even the smallest non-zero floating value should be >> >>>>>>>> translated to true. >> >>>>>>>> This behavior has been verified also with proprietary NVIDIA >> >>>>>>>> drivers. >> >>>>>>>> >> >>>>>>>> Currently, we implement this conversion as a cmp.nz operation with >> >>>>>>>> floats, >> >>>>>>>> subject to floating-point precision limitations, and as a result, >> >>>>>>>> relatively >> >>>>>>>> small non-zero floating point numbers return false instead of true. >> >>>>>>>> >> >>>>>>>> This patch fixes the problem by getting rid of the sign bit (to >> >>>>>>>> cover the case >> >>>>>>>> of -0.0) and testing the result against 0u using an integer >> >>>>>>>> comparison instead. >> >>>>>>>> --- >> >>>>>>>> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 ++++++++++++--- >> >>>>>>>> 1 file changed, 12 insertions(+), 3 deletions(-) >> >>>>>>>> >> >>>>>>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> >>>>>>>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> >>>>>>>> index db20c71..7d62d7e 100644 >> >>>>>>>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> >>>>>>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> >>>>>>>> @@ -913,9 +913,18 @@ fs_visitor::nir_emit_alu(const fs_builder >> >>>>>>>> &bld, nir_alu_instr *instr) >> >>>>>>>> bld.MOV(result, negate(op[0])); >> >>>>>>>> break; >> >>>>>>>> >> >>>>>>>> - case nir_op_f2b: >> >>>>>>>> - bld.CMP(result, op[0], brw_imm_f(0.0f), BRW_CONDITIONAL_NZ); >> >>>>>>>> - break; >> >>>>>>>> + case nir_op_f2b: { >> >>>>>>>> + /* Because comparing to 0.0f is subject to precision >> >>>>>>>> limitations, do the >> >>>>>>>> + * comparison using integers (we need to get rid of the sign >> >>>>>>>> bit for that) >> >>>>>>>> + */ >> >>>>>>>> + if (devinfo->gen >= 8) >> >>>>>>>> + op[0] = resolve_source_modifiers(op[0]); >> >>>>>>>> + op[0] = retype(op[0], BRW_REGISTER_TYPE_UD); >> >>>>>>>> + bld.AND(op[0], op[0], brw_imm_ud(0x7FFFFFFFu)); >> >>>>>>>> + bld.CMP(result, op[0], brw_imm_ud(0u), BRW_CONDITIONAL_NZ); >> >>>>>>>> + break; >> >>>>>>>> + } >> >>>>>>>> + >> >>>>>>>> case nir_op_i2b: >> >>>>>>>> bld.CMP(result, op[0], brw_imm_d(0), BRW_CONDITIONAL_NZ); >> >>>>>>>> break; >> >>>>>>>> >> >>>>>>> >> >>>>>>> Does that fix anything? I don't really see a problem with the >> >>>>>>> existing >> >>>>>>> logic. Yes any "non-zero value" should be converted to true. But >> >>>>>>> surely >> >>>>>>> that definition cannot include denorms, which you are always allowed >> >>>>>>> to >> >>>>>>> flush to zero. >> >>>>>>> (Albeit I can't tell what the result would be with NaNs with the >> >>>>>>> float >> >>>>>>> compare, nor what the result actually should be in this case since >> >>>>>>> glsl >> >>>>>>> doesn't require NaNs neither.) >> >>>>>> >> >>>>>> Based on this and Jason's comments, I think we need a bunch of new >> >>>>>> tests. >> >>>>>> >> >>>>>> - smallest positive normal number >> >>>>>> - abs of smallest positive normal number >> >>>>>> - neg of " " " " >> >>>>>> - largest positive subnormal number >> >>>>>> - abs of largest positive subnormal number >> >>>>>> - neg of " " " " >> >>>>>> - all of the above with negative numbers >> >>>>>> - NaN >> >>>>>> - abs of NaN >> >>>>>> - neg of " >> >>>>>> >> >>>>>> Perhaps others? +/-Inf just for kicks? >> >>>>> >> >>>>> What's the point? The result of most of the above (except possibly >> >>>>> bool(NaN)) is undefined by the spec: >> >>>>> >> >>>>> "Any denormalized value input into a shader or potentially generated by >> >>>>> any operation in a shader can be flushed to 0. [...] NaNs are not >> >>>>> required to be generated. [...] Operations and built-in functions that >> >>>>> operate on a NaN are not required to return a NaN as the result." >> >>>> >> >>>> Except that apparently one major OpenGL vendor does something well >> >>>> defined that's different than what we do. >> >>> >> >>> I'm skeptical that nVidia would treat single-precision denorms >> >>> inconsistently between datatype constructors and other floating-point >> >>> arithmetic, but assuming that's the case it would be an argument for >> >>> proposing the spec change to Khronos rather than introducing a dubiously >> >>> compliant change into the back-end. I think I would argue against >> >>> making such a change in the spec in any case, because even though it's >> >>> implementation-defined whether denorms are flushed or not, the following >> >>> is guaranteed by the spec AFAIUI: >> >>> >> >>> | if (bool(f)) >> >>> | random_arithmetic_on(f /* Guaranteed not to be zero here even if >> >>> | denorms are flushed */); >> >>> >> >>> With this change in, bool(f) would evaluate to true even if f is a >> >>> denorm which is flushed to zero for all subsequent arithmetic in the >> >>> block. For that reason this seems more likely to break stuff than to >> >>> fix stuff, it's not like applications can expect denorms to be >> >>> representable at all regardless of what nVidia does, but they can expect >> >>> the above GLSL source to work. >> >> >> >> I'd be curious to see a shader test that triggers the relevant >> >> behaviour -- AFAIK nvidia always flushes denorms. The flush is a >> >> per-instruction setting though [since Fermi], so it's technically >> >> feasible not to flush sometimes and to flush other times. >> > >> > This one: >> > https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_Igalia_piglit_blob_arb-5Fgpu-5Fshader-5Ffp64_tests_spec_arb-5Fgpu-5Fshader-5Ffp64_execution_fs-2Dconversion-2Dexplicit-2Ddouble-2Dbool-2Dbad.shader-5Ftest&d=BQICaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=Vjtt0vs_iqoI31UfJxBl7yv9I2FeiaeAYgMTLKRBc_I&m=K5Q85JZWVAJuljMHdoEb9NzJVH29wma5HWxq4K5UNs0&s=knFllQfBVoab6tHDLvgEdZUWb7WW77hBt62DQT6Co3s&e= >> > >> > That test is for doubles. The author of the test told me that the >> > equivalent version for float also passed, but I have just verified that >> > he made a mistake. In summary: that test passes for double and fails for >> > float... >> > >> > The quote from the spec shared by Curro says: >> > >> > "Any denormalized value input into a shader or potentially generated by >> > any operation in a shader can be flushed to 0" >> > >> > It says _can_ instead of _must_, so I share Curro's point of view that >> > this is in fact undefined behavior by the spec and f2b of a denorm could >> > return true or false and both are valid implementations. I wonder if >> > nvidia's case of flushing floats but not doubles is also acceptable >> > though, sounds like a bug to me. >> >> GL is always "can" for such things, and never "must" :-). >> (d3d10 would require "must be flushed" for floats.) >> Flushing denorms for floats but not for doubles looks perfectly fine to >> me too, there's quite some hw which can't do denorms for floats (at >> least not without compromises) but can do it for doubles (because >> doubles had a low rate and were only useful for HPC, whereas floats need >> to favor performance over everything else, essentially). > > Ah, interesting, thanks for sharing this! > > In that case I wonder what we should do for doubles in the case of i965. > I have patches for both implementations, so which one do we want?
For GLSL it doesn't really matter, I'd go with whatever way is simpler to implement. > > Iago > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev