On Fri, 26 May 2023 at 09:00, Stefan Kanthak <stefan.kant...@nexgo.de> wrote:
>
> "Jonathan Wakely" <jwakely....@gmail.com> wrote:
>
> > On Fri, 26 May 2023, 08:01 Andrew Pinski via Gcc, <gcc@gcc.gnu.org> wrote:
> >
> >> On Thu, May 25, 2023 at 11:56 PM Stefan Kanthak <stefan.kant...@nexgo.de>
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>> compile the following function on a system with Core2 processor
> >>> (released January 2008) for the 32-bit execution environment:
> >>>
> >>> --- demo.c ---
> >>> int ispowerof2(unsigned long long argument)
> >>> {
> >>>     return (argument & argument - 1) == 0;
> >>> }
> >>> --- EOF ---
> >>>
> >>> GCC 13.3: gcc -m32 -O3 demo.c
> >>>
> >>> NOTE: -mtune=native is the default!
> >>
> >> You need to use -march=native and not -mtune=native .... to turn on
> >> the architecture features.
>
> (Un)fortunately this changes nothing!
>
> STOP: that's wrong, it makes it even WORSE!
>
> # Compilation provided by Compiler Explorer at https://godbolt.org/
> ispowerof2(unsigned long long):
>         vmovq       xmm1, QWORD PTR [esp+4]
>         vpcmpeqd    xmm0, xmm0, xmm0
>         xor         eax, eax
>         vpaddq      xmm0, xmm1, xmm0
>         vpand       xmm0, xmm0, xmm1
>         vpunpcklqdq xmm0, xmm0, xmm0
>         vptest      xmm0, xmm0
>         sete        al
>         ret
>
> That's what I call a REALLY EPIC FAILURE!
>
> Compare this unefficient BLOAT to the SSE4.1 code from my original post!
>
> > Yes this is just user error. You didn't use the right options to say you
> > want SSE2.
>
> ARGH: please read CAREFULLY what I wrote!

You wrote "Now add the -mtune=core2 option to EXPLICITLY enable the NATIVE
SSE4.1 alias "Penryn New Instruction Set" of the Core2 processor" which is
wrong, that's not what -mtune does. Read the docs CAREFULLY:
https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html

>
> 1) I didn't tell GCC to use SSE at all (I DON'T want any compiler to use
>    SSE per default, especially when the generated code is SLOWER and BIGGER
>    than conventional code using the general purpose registers)!
>
> 2) GCC uses SSE2 on its own, but doesn't support it well: it FAILS to use
>    PMOVMSKB here, despite -O3!

So report a bug to bugzilla, not via an email to the wrong list.

>
> 3) -march=core2 doesn't help too, GCC fails to use SSE4.1 at all!

core2 doesn't enable SSE4.1, as clearly shown in the docs:
https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html

If you send emails full of confused mistakes, don't be surprised if
the replies aren't what you want. If you think GCC is generating bad
code, file a bug. But make sure you're actually using the right
options to enable the right instruction sets before complaining about
the instructions used.
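
One way to check which instruction sets a given set of options actually
enables is to look at the macros the compiler predefines. The following is
only a minimal sketch: it assumes GCC's predefined macros __SSE2__ and
__SSE4_1__, and the helper names sse2_enabled() and sse41_enabled() are
made up for illustration. ispowerof2() is the function from demo.c, with
parentheses added around the subtraction for readability.

```c
/* The function from demo.c; the extra parentheses do not change the
 * result, since binary '-' already binds tighter than '&'. */
int ispowerof2(unsigned long long argument)
{
    return (argument & (argument - 1)) == 0;
}

/* Hypothetical helper: 1 if this translation unit was compiled with
 * SSE2 code generation enabled (GCC predefines __SSE2__ then), else 0. */
int sse2_enabled(void)
{
#ifdef __SSE2__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if SSE4.1 code generation is enabled
 * (GCC predefines __SSE4_1__ then), else 0. */
int sse41_enabled(void)
{
#ifdef __SSE4_1__
    return 1;
#else
    return 0;
#endif
}
```

Built with "gcc -m32 -O3 -march=core2", sse41_enabled() should return 0,
because core2 stops at SSSE3 according to the x86 options page linked
above; adding -msse4.1 should turn it into 1. Changing only -mtune should
not affect either helper. The same macros can be listed without compiling
any code:

    gcc -m32 -march=core2 -dM -E - </dev/null | grep SSE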