"Jonathan Wakely" <jwakely....@gmail.com> wrote:

> On Fri, 26 May 2023, 08:01 Andrew Pinski via Gcc, <gcc@gcc.gnu.org> wrote:
>
>> On Thu, May 25, 2023 at 11:56 PM Stefan Kanthak <stefan.kant...@nexgo.de>
>> wrote:
>>>
>>> Hi,
>>>
>>> compile the following function on a system with Core2 processor
>>> (released January 2008) for the 32-bit execution environment:
>>>
>>> --- demo.c ---
>>> int ispowerof2(unsigned long long argument)
>>> {
>>>     return (argument & argument - 1) == 0;
>>> }
>>> --- EOF ---
>>>
>>> GCC 13.3: gcc -m32 -O3 demo.c
>>>
>>> NOTE: -mtune=native is the default!
>>
>> You need to use -march=native and not -mtune=native .... to turn on
>> the architecture features.

(Un)fortunately this changes nothing!

STOP: that's wrong, it makes it even WORSE!

# Compilation provided by Compiler Explorer at https://godbolt.org/
ispowerof2(unsigned long long):
        vmovq   xmm1, QWORD PTR [esp+4]
        vpcmpeqd        xmm0, xmm0, xmm0
        xor     eax, eax
        vpaddq  xmm0, xmm1, xmm0
        vpand   xmm0, xmm0, xmm1
        vpunpcklqdq     xmm0, xmm0, xmm0
        vptest  xmm0, xmm0
        sete    al
        ret

That's what I call a REALLY EPIC FAILURE!

Compare this inefficient BLOAT to the SSE4.1 code from my original post!

> Yes this is just user error. You didn't use the right options to say you
> want SSE2.

ARGH: please read CAREFULLY what I wrote!

1) I didn't tell GCC to use SSE at all (I DON'T want any compiler to use
   SSE by default, especially when the generated code is SLOWER and BIGGER
   than conventional code using the general purpose registers)!

2) GCC uses SSE2 on its own, but doesn't support it well: it FAILS to use
   PMOVMSKB here, despite -O3!

3) -march=core2 doesn't help either, GCC fails to use SSE4.1 at all!
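
To make point 1 concrete, here is a minimal C sketch of the same predicate
written with explicit 32-bit halves, roughly the shape of "conventional code
using the general purpose registers" on a 32-bit target. The name
ispowerof2_halves is hypothetical and not from the original post, and this is
only an illustration of the arithmetic, not a claim about what any compiler
emits for it.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original post): same result as
   (argument & argument - 1) == 0, computed on two 32-bit halves. */
int ispowerof2_halves(uint64_t argument)
{
    uint32_t lo = (uint32_t)argument;          /* low 32 bits  */
    uint32_t hi = (uint32_t)(argument >> 32);  /* high 32 bits */

    uint32_t lo1 = lo - 1u;                    /* low half of argument - 1 */
    uint32_t hi1 = hi - (lo == 0u);            /* borrow only when lo was 0 */

    /* argument & (argument - 1) is zero iff both half-wise ANDs are zero */
    return ((lo & lo1) | (hi & hi1)) == 0u;
}
```

Like demo.c, this returns 1 for an argument of 0, since 0 & (0 - 1) is 0.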

> GCC supports it fine already.

DREAM ON!
Again: view the two counterexamples from my original post CAREFULLY!

not amused
Stefan
