Re: [Mesa-dev] [PATCH] radeonsi: enable 32-bit denormals on VI+

Marek Olšák Thu, 12 Jan 2017 03:55:42 -0800

I think v_mad always flushes denorms.

I would just ignore this failure. It's not required to fix every silly test
on the planet. If you open-code v_max, you'll have the same problem, and
then you'd have to fix v_cmp as well. It's just silly.


Marek

On Jan 12, 2017 11:59 AM, "Nicolai Hähnle" <nhaeh...@gmail.com> wrote:

> On 12.01.2017 09:24, Samuel Pitoiset wrote:
>
>>
>>
>> On 01/12/2017 02:12 AM, Marek Olšák wrote:
>>
>>> On Thu, Jan 12, 2017 at 12:33 AM, Ilia Mirkin <imir...@alum.mit.edu>
>>> wrote:
>>>
>>>> On Wed, Jan 11, 2017 at 4:00 PM, Roland Scheidegger
>>>> <srol...@vmware.com> wrote:
>>>>
>>>>> On 11.01.2017 at 21:08, Samuel Pitoiset wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On 01/11/2017 07:00 PM, Roland Scheidegger wrote:
>>>>>>
>>>>>>> I don't think there's any GLSL specification, ES or otherwise, that
>>>>>>> would require denorms with any precision qualifier (since obviously
>>>>>>> lots of hw can't do it, and d3d10 forbids them). Hence these look like
>>>>>>> bugs in the test suite to me?
>>>>>>> (Irrespective of whether it's a good idea to enable denormals, which I
>>>>>>> don't really know.)
>>>>>>>
>>>>>>
>>>>>> That test works on NVIDIA hw (both with blob and nouveau) and IIRC it
>>>>>> also works on Intel hw. I don't think it's buggy there.
>>>>>>
>>>>> The question then is why it needs denorms on radeons...
>>>>>
>>>>
>>>> I spent some time with Samuel looking at this. So, this is pretty
>>>> funny... (or at least feels that way after staring at floating point
>>>> for a while)
>>>>
>>>> dEQP is, in fact, feeding denorms to the min/max functions. But it's
>>>> smart enough to know that flushing denorms to 0 is OK, and so it
>>>> treats a 0 as a pass. (And obviously it treats the "right answer" as a
>>>> pass.) So that's why enabling denorm processing fixes it - that causes
>>>> the hw to return the correct answer and all is well.
>>>>
>>>> However the issue is that without denorm processing, the hw is
>>>> returning the *wrong* answer. At first I thought that max was being
>>>> lowered into something like
>>>>
>>>> if (a > b) { x = a; } else { x = b; }
>>>>
>>>> which would end up with potentially wrong results if a and b are being
>>>> flushed as inputs into the comparison but not into the assignments.
>>>> But that's not (explicitly) what's happening - the v_max_f32_e32
>>>> instruction is being used. Perhaps that's what it does internally? If
>>>> so, that means that results of affected float functions in LLVM need
>>>> explicit flushing before being stored into results.
>>>>
>>>> FWIW the specific values triggering the issue are:
>>>>
>>>> in0=-0x0.000002p-126, in1=-0x0.fffffep-126, out0=-0x0.fffffep-126 -> FAIL
>>>>
>>>> With denorm processing, it correctly reports out0=-0x0.000002p-126,
>>>> while nouveau with denorm flushing enabled reports out0=0.0 which also
>>>> passes.
>>>>
>>>
>>> The denorm configuration has 2 bits:
>>> - flush (0) or allow (1) input denorms
>>> - flush (0) or allow (1) output denorms
>>>
>>> In the case of v_max, it looks like output denorms are not flushed and
>>> it behaves almost like you said:
>>>
>>> if (a >= b) { x = a; } else { x = b; }
>>>
>>
>> Should we adjust the denorm mode with s_setreg for v_max_f32/v_min_f32?
>>
>
> That might eliminate some optimization opportunities, so let's first see
> if another fix is possible?
>
> I haven't run the test, but from the description the most plausible
> explanation is that v_max_f32/v_min_f32 flushes input denorms, but doesn't
> flush output denorms for some stupid reason. Perhaps we could change the
> fp_denorm setting to
>
> - allow input denorms
> - flush output denorms
>
> Then min/max will preserve the denorms, but other operations will flush
> denorms to zero.
>
> Do you know how that affects v_mad_f32? If we just change the register
> without telling LLVM about it, LLVM will still happily emit v_mad_f32, and
> perhaps that produces incorrect results when denorms are passed in from
> uniforms?
>
> If this register setting doesn't work, then yes, looks like s_setreg may
> be needed. Unless there's a cheap way to flush denorms from loads as well?
> But I don't think there is.
>
> Nicolai
>
>
>
>
>
>>> Marek
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>
>>
>
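For reference, the behaviour discussed in the quoted thread can be sketched as a small host-side model. This is only an illustration under stated assumptions, not a description of what the hardware or LLVM actually does: the struct denorm_mode type and the functions flush_denorm, model_max and flushing_max are hypothetical names, and the strict "a > b" comparison follows Ilia's guess rather than any documented v_max_f32 semantics. The model assumes the comparison sees flushed inputs whenever input flushing is on, while the selected operand is returned unmodified, i.e. the output-denorm bit is ignored by max.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical model of the two denorm-mode bits described in the thread. */
struct denorm_mode {
    bool allow_input;   /* false: input denorms are flushed to zero */
    bool allow_output;  /* false: output denorms are flushed to zero */
};

/* Flush a denormal to a zero of the same sign; other values pass through. */
static float flush_denorm(float x)
{
    return fpclassify(x) == FP_SUBNORMAL ? copysignf(0.0f, x) : x;
}

/*
 * Assumed model of max on the affected hardware: the comparison uses
 * flushed operands when input flushing is enabled, but the result is the
 * original, unflushed operand. m.allow_output is deliberately unused,
 * which models "output denorms are not flushed" for this instruction.
 */
static float model_max(float a, float b, struct denorm_mode m)
{
    float ca = m.allow_input ? a : flush_denorm(a);
    float cb = m.allow_input ? b : flush_denorm(b);

    return ca > cb ? a : b;
}

/* A max that flushes both its inputs and its output. */
static float flushing_max(float a, float b)
{
    float fa = flush_denorm(a);
    float fb = flush_denorm(b);

    return flush_denorm(fa > fb ? fa : fb);
}
```

With the inputs from the thread (in0=-0x0.000002p-126, in1=-0x0.fffffep-126), model_max returns in1 when input denorms are flushed, which is the value the test rejects, and returns in0 when input denorms are allowed, regardless of the output bit. flushing_max returns a zero, which the test also accepts.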
