I think v_mad always flushes denorms. I would just ignore this failure. It's not required to fix every silly test on the planet. If you open-code v_max, you'll have the same problem, and then you'd have to fix v_cmp. It's just silly.
Marek

On Jan 12, 2017 11:59 AM, "Nicolai Hähnle" <nhaeh...@gmail.com> wrote:

> On 12.01.2017 09:24, Samuel Pitoiset wrote:
>> On 01/12/2017 02:12 AM, Marek Olšák wrote:
>>> On Thu, Jan 12, 2017 at 12:33 AM, Ilia Mirkin <imir...@alum.mit.edu>
>>> wrote:
>>>> On Wed, Jan 11, 2017 at 4:00 PM, Roland Scheidegger
>>>> <srol...@vmware.com> wrote:
>>>>> On 11.01.2017 at 21:08, Samuel Pitoiset wrote:
>>>>>> On 01/11/2017 07:00 PM, Roland Scheidegger wrote:
>>>>>>> I don't think there's any glsl, es or otherwise, specification which
>>>>>>> would require denorms (since obviously lots of hw can't do it, d3d10
>>>>>>> forbids them), with any precision qualifier. Hence these look like
>>>>>>> bugs of the test suite to me?
>>>>>>> (Irrespective if it's a good idea or not to enable denormals, which I
>>>>>>> don't really know.)
>>>>>>
>>>>>> That test works on NVIDIA hw (both with blob and nouveau) and IIRC it
>>>>>> also works on Intel hw. I don't think it's buggy there.
>>>>>
>>>>> The question then is why it needs denorms on radeons...
>>>>
>>>> I spent some time with Samuel looking at this. So, this is pretty
>>>> funny... (or at least feels that way after staring at floating point
>>>> for a while)
>>>>
>>>> dEQP is, in fact, feeding denorms to the min/max functions. But it's
>>>> smart enough to know that flushing denorms to 0 is OK, and so it
>>>> treats a 0 as a pass. (And obviously it treats the "right answer" as a
>>>> pass.) So that's why enabling denorm processing fixes it - that causes
>>>> the hw to return the proper correct answer and all is well.
>>>>
>>>> However the issue is that without denorm processing, the hw is
>>>> returning the *wrong* answer. At first I thought that max was being
>>>> lowered into something like
>>>>
>>>>     if (a > b) { x = a; } else { x = b; }
>>>>
>>>> which would end up with potentially wrong results if a and b are being
>>>> flushed as inputs into the comparison but not into the assignments.
>>>> But that's not (explicitly) what's happening - the v_max_f32_e32
>>>> instruction is being used. Perhaps that's what it does internally? If
>>>> so, that means that results of affected float functions in LLVM need
>>>> explicit flushing before being stored into results.
>>>>
>>>> FWIW the specific values triggering the issue are:
>>>>
>>>> in0=-0x0.000002p-126, in1=-0x0.fffffep-126, out0=-0x0.fffffep-126 -> FAIL
>>>>
>>>> With denorm processing, it correctly reports out0=-0x0.000002p-126,
>>>> while nouveau with denorm flushing enabled reports out0=0.0 which also
>>>> passes.
>>>
>>> The denorm configuration has 2 bits:
>>> - flush (0) or allow (1) input denorms
>>> - flush (0) or allow (1) output denorms
>>>
>>> In the case of v_max, it looks like output denorms are not flushed and
>>> it behaves almost like you said:
>>>
>>>     if (a >= b) { x = a; } else { x = b; }
>>
>> Should we adjust the denorm mode with s_setreg for v_max_f32/v_min_f32?
>
> That might eliminate some optimization opportunities, so let's first see
> if another fix is possible?
>
> I haven't run the test, but from the description the most plausible
> explanation is that v_max_f32/v_min_f32 flushes input denorms, but doesn't
> flush output denorms for some stupid reason. Perhaps we could change the
> fp_denorm setting to
>
> - allow input denorms
> - flush output denorms
>
> Then min/max will preserve the denorms, but other operations will flush
> denorms to zero.
>
> Do you know how that affects v_mad_f32? If we just change the register
> without telling LLVM about it, LLVM will still happily emit v_mad_f32, and
> perhaps that produces incorrect results when denorms are passed in from
> uniforms?
>
> If this register setting doesn't work, then yes, looks like s_setreg may
> be needed. Unless there's a cheap way to flush denorms from loads as well?
> But I don't think there is.
>
> Nicolai
>
>>> Marek
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
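
For reference, the hypothesis discussed in the quoted thread (compare on flushed inputs, then select the unflushed operand) can be sketched in a few lines of Python using the failing values from the dEQP log. This is only a model of the guess made in the thread, not verified hardware behaviour: the names max_model, flush_compare, flush_result and deqp_accepts are made up for illustration, the two flags are knobs of the model rather than the actual fp_denorm register bits, and the strict ">" follows Ilia's version of the pseudocode.

```python
import math

# Smallest normal float32 magnitude; anything nonzero below it is a denormal.
FLT_MIN_NORMAL = float.fromhex("0x1p-126")

# The two inputs reported in the thread for the failing case.
IN0 = float.fromhex("-0x0.000002p-126")
IN1 = float.fromhex("-0x0.fffffep-126")


def flush_denorm(x):
    """Flush a float32 denormal to a zero of the same sign."""
    if x != 0.0 and abs(x) < FLT_MIN_NORMAL:
        return math.copysign(0.0, x)
    return x


def max_model(a, b, flush_compare, flush_result):
    """Hypothetical max: compare (optionally flushed) inputs, select a raw operand."""
    ca = flush_denorm(a) if flush_compare else a
    cb = flush_denorm(b) if flush_compare else b
    x = a if ca > cb else b  # the selected value is the original, unflushed operand
    return flush_denorm(x) if flush_result else x


def deqp_accepts(result, a, b):
    """Acceptance rule as described in the thread: exact max or a flushed zero."""
    return result == max(a, b) or result == 0.0


if __name__ == "__main__":
    for flush_compare, flush_result in [(True, False), (False, False), (True, True)]:
        out = max_model(IN0, IN1, flush_compare, flush_result)
        print(flush_compare, flush_result, out.hex(), deqp_accepts(out, IN0, IN1))
```

In this model, flushing only for the comparison makes both inputs compare equal, so the select falls through to in1 and the result is rejected; with no flushing at all, or with flushing on both sides, the result is accepted, which matches the three outcomes reported above.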