"Jonathan Wakely" <jwakely....@gmail.com> wrote: > On Fri, 26 May 2023, 08:01 Andrew Pinski via Gcc, <gcc@gcc.gnu.org> wrote: > >> On Thu, May 25, 2023 at 11:56?PM Stefan Kanthak <stefan.kant...@nexgo.de> >> wrote: >>> >>> Hi, >>> >>> compile the following function on a system with Core2 processor >>> (released January 2008) for the 32-bit execution environment: >>> >>> --- demo.c --- >>> int ispowerof2(unsigned long long argument) >>> { >>> return (argument & argument - 1) == 0; >>> } >>> --- EOF --- >>> >>> GCC 13.3: gcc -m32 -O3 demo.c >>> >>> NOTE: -mtune=native is the default! >> >> You need to use -march=native and not -mtune=native .... to turn on >> the architecture features.

(Un)fortunately this changes nothing!
STOP: that's wrong, it makes it even WORSE!

# Compilation provided by Compiler Explorer at https://godbolt.org/
ispowerof2(unsigned long long):
        vmovq   xmm1, QWORD PTR [esp+4]
        vpcmpeqd        xmm0, xmm0, xmm0
        xor     eax, eax
        vpaddq  xmm0, xmm1, xmm0
        vpand   xmm0, xmm0, xmm1
        vpunpcklqdq     xmm0, xmm0, xmm0
        vptest  xmm0, xmm0
        sete    al
        ret

That's what I call a REALLY EPIC FAILURE!
Compare this inefficient BLOAT to the SSE4.1 code from my original post!

> Yes this is just user error. You didn't use the right options to say you
> want SSE2.

ARGH: please read CAREFULLY what I wrote!

1) I didn't tell GCC to use SSE at all (I DON'T want any compiler to use SSE
   by default, especially when the generated code is SLOWER and BIGGER than
   conventional code using the general purpose registers)!
2) GCC uses SSE2 on its own, but doesn't support it well: it FAILS to use
   PMOVMSKB here, despite -O3!
3) -march=core2 doesn't help either: GCC fails to use SSE4.1 at all!

> GCC supports it fine already.

DREAM ON!
Again: view the 2 counter examples from my original post CAREFULLY!

not amused
Stefan
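
For illustration only, here is a minimal sketch of what "conventional code using the general purpose registers" amounts to at the C level on a 32-bit target: the 64-bit argument is split into two 32-bit halves and the borrow of the decrement is propagated by hand. The names ispowerof2_halves and check_agreement are hypothetical helpers introduced for this sketch; they are not from the original post and say nothing about what GCC actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* The function from the quoted demo.c, unchanged. */
int ispowerof2(unsigned long long argument)
{
    return (argument & argument - 1) == 0;
}

/* Hypothetical helper: the same test written with explicit 32-bit halves,
   i.e. the shape of code that needs only general purpose registers. */
int ispowerof2_halves(unsigned long long argument)
{
    uint32_t lo = (uint32_t)argument;           /* low 32 bits  */
    uint32_t hi = (uint32_t)(argument >> 32);   /* high 32 bits */

    uint32_t lo1 = lo - 1;                      /* low half of argument - 1 */
    uint32_t hi1 = hi - (lo == 0);              /* borrow only when lo was 0 */

    /* argument & (argument - 1) is zero iff both ANDed halves are zero. */
    return ((lo & lo1) | (hi & hi1)) == 0;
}

/* Hypothetical helper: spot-check that both versions agree. */
void check_agreement(void)
{
    const unsigned long long samples[] = {
        0ULL, 1ULL, 2ULL, 3ULL, 6ULL, 8ULL,
        0x80000000ULL, 0x100000000ULL, 0x100000001ULL,
        0x8000000000000000ULL, 0xFFFFFFFFFFFFFFFFULL
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(ispowerof2(samples[i]) == ispowerof2_halves(samples[i]));
}
```

Note that both versions report 0 as a "power of 2", because 0 & (0 - 1) is 0; that follows from demo.c as posted and is kept as-is here.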