Matt Godbolt <m...@godbolt.org> wrote:

> I believe your example doesn't take into account that the values can be NaN
> which compares false in all situations.

That's a misbelief!
Please notice the first if-clause, which rules out NaNs for both arguments.
Also notice that GCC did NOT generate JP after the 4 COMISD instructions
in question, i.e. it knew that NaNs had been ruled out.
I included the 3 initial if-clauses just to give GCC enough rope to hang
itself.
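
To spell out why the guards matter, here is a minimal sketch (the function
names are made up for illustration, they are not part of the TOMS 722 code).
Once NaNs, argx == argy and argx == 0 are ruled out, (argx > Zero) is the
negation of (argx < Zero) and (argx > argy) is the negation of (argx < argy),
so the original condition collapses to a comparison of two booleans:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Condition as written in the snippet. */
bool cond_original(double x, double y)
{
    return ((x < 0.0) && (x < y)) || ((x > 0.0) && (x > y));
}

/* Proposed simplification; only valid once the guards have passed. */
bool cond_simplified(double x, double y)
{
    return (x < 0.0) == (x < y);
}

/* The three initial if-clauses: no NaNs, x != y, x != 0. */
bool guards_pass(double x, double y)
{
    return (x == x) && (y == y) && (x != y) && (x != 0.0);
}

/* Compare both conditions on a small grid of sample values. */
void check_equivalence(void)
{
    static const double samples[] = { -2.0, -1.0, -0.0, 0.0, 1.0, 2.0 };
    const size_t n = sizeof samples / sizeof samples[0];

    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            if (guards_pass(samples[i], samples[j]))
                assert(cond_original(samples[i], samples[j])
                    == cond_simplified(samples[i], samples[j]));
}
```

Without the guards the two conditions differ, for example for argx == argy,
which is why the simplification depends on the preceding if-clauses.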

> If you allow the compiler to
> optimize without supporting NaN (-ffast-math), I think it generates the
> code you want: https://godbolt.org/z/1ra7zcsnd

Replace
     if (isnan(argx) || isnan(argy)) return argx + argy;
with
     if ((argx != argx) || (argy != argy)) return argx + argy;
then feed the changed snippet to Compiler Explorer again, with and without
-ffast-math.
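
For convenience, the changed snippet in full, as a sketch. The name
repro_nextafter is made up here so that it cannot clash with nextafter() from
<math.h>, and I added braces around the nested if for readability; the logic
is otherwise that of the original repro.c:

```c
#define Zero 0.0

/* Hypothetical name; the original snippet calls this nextafter(). */
double repro_nextafter(double argx, double argy)
{
    double z = argx;

    /* x != x is true only when x is a NaN (IEEE 754 semantics). */
    if ((argx != argx) || (argy != argy)) return argx + argy;

    if (argx == argy) return argx;

    if (argx != Zero) {
        if (((argx < Zero) && (argx < argy))
         || ((argx > Zero) && (argx > argy)))
            z += 1.0;
        else
            z -= 1.0;
    }
    return z;
}
```

Note that -ffast-math implies -ffinite-math-only, which allows GCC to assume
that no argument is a NaN, so compare the code generated for both variants of
the NaN test.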

Stefan
 
> --matt
> 
> On Sat, Aug 21, 2021 at 1:59 PM Stefan Kanthak <stefan.kant...@nexgo.de>
> wrote:
> 
>> Hi,
>>
>> the following snippet is from the nextafter() function of
>> <http://www.netlib.no/netlib/toms/722>
>>
>> --- repro.c ---
>> #include <math.h>   /* isnan() */
>> #define Zero 0.0
>> double nextafter(double argx, double argy)
>> {
>>     double z = argx;
>>
>>     if (isnan(argx) || isnan(argy)) return argx + argy;
>>
>>     if (argx == argy) return argx;
>>
>>     if (argx != Zero)
>>         if (((argx < Zero) && (argx < argy))
>>          || ((argx > Zero) && (argx > argy)))
>>             z += 1.0;
>>         else
>>             z -= 1.0;
>>     return z;
>> }
>> --- EOF ---
>>
>> I expect that GCC knows DeMorgan's rules and is able to
>> simplify/optimize the last if-statement to
>>
>>         if ((argx < Zero) == (argx < argy))
>>
>> Unfortunately GCC fails to do so: see the lines from
>> label .L20: to label .L7:
>>
>> $ gcc -m64 -O3 -o- -S repro.c
>> ...
>> nextafter:
>>         ucomisd %xmm1, %xmm0
>>         jp      .L19
>>         pxor    %xmm2, %xmm2
>>         movl    $1, %edx
>>         ucomisd %xmm2, %xmm0
>>         setp    %al
>>         cmovne  %edx, %eax
>>         testb   %al, %al
>>         je      .L3
>>         ucomisd %xmm1, %xmm0
>>         setp    %al
>>         cmove   %eax, %edx
>>         testb   %dl, %dl
>>         jne     .L20
>> .L3:
>>         ret
>> .L20:
>>         comisd  %xmm0, %xmm2
>>         ja      .L21
>> .L4:
>>         comisd  %xmm2, %xmm0
>>         jbe     .L7
>>         comisd  %xmm1, %xmm0
>>         jbe     .L7
>> .L6:
>>         addsd   .LC1(%rip), %xmm0
>>         ret
>> .L21:
>>         comisd  %xmm0, %xmm1
>>         ja      .L6
>>         jmp     .L4
>> .L7:
>>         subsd   .LC1(%rip), %xmm0
>>         ret
>> .L19:
>>         addsd   %xmm1, %xmm0
>>         ret
>> .LC1:
>>         .long   0
>>         .long   1072693248
>>
>> Stefan
>>
> 
> 
> -- 
> Matt
> (he/him)
>
