Re: [Mesa-dev] [PATCH] radeonsi: enable 32-bit denormals on VI+

Samuel Pitoiset Thu, 12 Jan 2017 05:12:36 -0800


On 01/12/2017 12:55 PM, Marek Olšák wrote:

I think v_mad always flushes denorms.

I would just ignore this failure. It's not required to fix every silly
test on the planet. If you opencode v_max, you'll have the same problem,
and then you'd have to fix v_cmp. It's just silly.


Your call.

That test compares 60k values and only one is actually wrong (the one mentioned by Ilia).
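
To make that one failing case concrete, here is a minimal sketch in Python of the hypothesis Ilia describes further down: a max whose comparison sees flushed inputs while the selected operand is written back unflushed. This is only a model of the guess made in this thread, not a description of what v_max_f32 is documented to do. The helper names (flush, max_flushed_compare, max_flush_everything) are made up for illustration, and NaN handling is ignored.

```python
import math

# Smallest normal float32 magnitude; anything nonzero below it is a denormal.
FLT_MIN_NORMAL = 2.0 ** -126

def flush(x):
    """Flush a float32 denormal to a signed zero (hypothetical helper)."""
    if x != 0.0 and abs(x) < FLT_MIN_NORMAL:
        return math.copysign(0.0, x)
    return x

def max_flushed_compare(a, b):
    """Assumed model: compare flushed inputs, but return the raw operand."""
    return a if flush(a) > flush(b) else b

def max_flush_everything(a, b):
    """Consistent flush-to-zero: inputs and the result are all flushed."""
    return flush(max(flush(a), flush(b)))

# The two inputs reported for the failing dEQP case.
in0 = float.fromhex("-0x0.000002p-126")
in1 = float.fromhex("-0x0.fffffep-126")

exact = max(in0, in1)                     # full denormal support
modeled = max_flushed_compare(in0, in1)   # the suspected behaviour
flushed = max_flush_everything(in0, in1)  # consistent flushing

print(exact.hex(), modeled.hex(), flushed)
```

In this model both inputs compare as zero, so the second operand is returned unchanged. That is the larger-magnitude denormal, which is neither the exact maximum nor zero, matching the reported FAIL.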


Marek

On Jan 12, 2017 11:59 AM, "Nicolai Hähnle" <nhaeh...@gmail.com
<mailto:nhaeh...@gmail.com>> wrote:

    On 12.01.2017 09:24, Samuel Pitoiset wrote:



        On 01/12/2017 02:12 AM, Marek Olšák wrote:

            On Thu, Jan 12, 2017 at 12:33 AM, Ilia Mirkin
            <imir...@alum.mit.edu <mailto:imir...@alum.mit.edu>>
            wrote:

                On Wed, Jan 11, 2017 at 4:00 PM, Roland Scheidegger
                <srol...@vmware.com <mailto:srol...@vmware.com>> wrote:

                    Am 11.01.2017 um 21:08 schrieb Samuel Pitoiset:



                        On 01/11/2017 07:00 PM, Roland Scheidegger wrote:

                            I don't think there's any glsl, es or
                            otherwise, specification which
                            would require denorms (since obviously lots
                            of hw can't do it, d3d10
                            forbids them), with any precision qualifier.
                            Hence these look like
                            bugs
                            of the test suite to me?
                            (Irrespective of whether it's a good idea or not to
                            enable denormals, which I
                            don't really know.)


                        That test works on NVIDIA hw (both with blob and
                        nouveau) and IIRC it
                        also works on Intel hw. I don't think it's buggy
                        there.

                    The question then is why it needs denorms on radeons...


                I spent some time with Samuel looking at this. So, this
                is pretty
                funny... (or at least feels that way after staring at
                floating point
                for a while)

                dEQP is, in fact, feeding denorms to the min/max
                functions. But it's
                smart enough to know that flushing denorms to 0 is OK,
                and so it
                treats a 0 as a pass. (And obviously it treats the
                "right answer" as a
                pass.) So that's why enabling denorm processing fixes it
                - that causes
                the hw to return the correct answer and all is well.

                However the issue is that without denorm processing, the
                hw is
                returning the *wrong* answer. At first I thought that
                max was being
                lowered into something like

                if (a > b) { x = a; } else { x = b; }

                which would end up with potentially wrong results if a
                and b are being
                flushed as inputs into the comparison but not into the
                assignments.
                But that's not (explicitly) what's happening - the
                v_max_f32_e32
                instruction is being used. Perhaps that's what it does
                internally? If
                so, that means that results of affected float functions
                in LLVM need
                explicit flushing before being stored into results.

                FWIW the specific values triggering the issue are:

                in0=-0x0.000002p-126, in1=-0x0.fffffep-126,
                out0=-0x0.fffffep-126 ->
                FAIL

                With denorm processing, it correctly reports
                out0=-0x0.000002p-126,
                while nouveau with denorm flushing enabled reports
                out0=0.0 which also
                passes.


            The denorm configuration has 2 bits:
            - flush (0) or allow (1) input denorms
            - flush (0) or allow (1) output denorms

            In the case of v_max, it looks like output denorms are not
            flushed and
            it behaves almost like you said:

            if (a >= b) { x = a; } else { x = b; }


        Should we adjust the denorm mode with s_setreg for
        v_max_f32/v_min_f32?


    That might eliminate some optimization opportunities, so let's first
    see if another fix is possible?

    I haven't run the test, but from the description the most plausible
    explanation is that v_max_f32/v_min_f32 flushes input denorms, but
    doesn't flush output denorms for some stupid reason. Perhaps we
    could change the fp_denorm setting to

    - allow input denorms
    - flush output denorms

    Then min/max will preserve the denorms, but other operations will
    flush denorms to zero.
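
The proposed setting (allow input denorms, flush output denorms) can be sketched as a toy model in Python. Everything here is an assumption based on the guesses in this thread: model_max ignores the output bit because that is the suspected v_max_f32 behaviour, and model_add stands in for "other operations" that honour both bits. The function names and the allow_in/allow_out parameters are hypothetical and are not hardware register fields. Arithmetic is done in double precision, which is exact for the values used.

```python
import math

FLT_MIN_NORMAL = 2.0 ** -126  # smallest normal float32 magnitude

def flush(x):
    """Flush a float32 denormal to a signed zero (hypothetical helper)."""
    if x != 0.0 and abs(x) < FLT_MIN_NORMAL:
        return math.copysign(0.0, x)
    return x

def model_max(a, b, allow_in, allow_out):
    """Assumed max: inputs obey the input bit, the output bit is ignored."""
    ca = a if allow_in else flush(a)
    cb = b if allow_in else flush(b)
    return a if ca > cb else b

def model_add(a, b, allow_in, allow_out):
    """Assumed ordinary operation: obeys both the input and the output bit."""
    if not allow_in:
        a, b = flush(a), flush(b)
    r = a + b
    return r if allow_out else flush(r)

in0 = float.fromhex("-0x0.000002p-126")
in1 = float.fromhex("-0x0.fffffep-126")

# Proposed mode: allow input denorms, flush output denorms.
max_proposed = model_max(in0, in1, allow_in=True, allow_out=False)
add_proposed = model_add(in0, in0, allow_in=True, allow_out=False)

# Mode matching the reported failure: input denorms flushed.
max_current = model_max(in0, in1, allow_in=False, allow_out=False)

print(max_proposed.hex(), add_proposed, max_current.hex())
```

Under these assumptions max returns the exact denormal in the proposed mode, while the addition still flushes its denormal result to zero. Whether the hardware really behaves this way, and how v_mad_f32 reacts, is the open question raised below.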

    Do you know how that affects v_mad_f32? If we just change the
    register without telling LLVM about it, LLVM will still happily emit
    v_mad_f32, and perhaps that produces incorrect results when denorms
    are passed in from uniforms?

    If this register setting doesn't work, then yes, looks like s_setreg
    may be needed. Unless there's a cheap way to flush denorms from
    loads as well? But I don't think there is.

    Nicolai





            Marek
            _______________________________________________
            mesa-dev mailing list
            mesa-dev@lists.freedesktop.org
            <mailto:mesa-dev@lists.freedesktop.org>
            https://lists.freedesktop.org/mailman/listinfo/mesa-dev
            <https://lists.freedesktop.org/mailman/listinfo/mesa-dev>


