https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117159

            Bug ID: 117159
           Summary: kmovw storing to memory is assumed to zero-extend
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zsojka at seznam dot cz
  Target Milestone: ---
              Host: x86_64-pc-linux-gnu
            Target: x86_64-pc-linux-gnu

Created attachment 59352
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59352&action=edit
reduced testcase

Output (using Intel SDE):
$ x86_64-pc-linux-gnu-gcc -mavx512bw -Os testcase.c
$ for I in `seq 1 10`; do sde -cnl -- ./a.out; done
Aborted
000000000000000000000000Aborted
0000000000000000Aborted
00000000Aborted

The output is random, depending on how the stack is initialized.

In the ASM code:
...
# testcase.c:10:   unsigned k = __builtin_ia32_pcmpgtd512_mask ((V) { }, v, m);
        kmovw   WORD PTR [rsp+44], k0   # %sfp, tmp120
# testcase.c:11:   W r = (W) k + w;
        vmovd   xmm1, DWORD PTR [rsp+44]        # tmp121, %sfp
...

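kmovw stores only a 16-bit word, but vmovd loads the whole dword, so the upper
16 bits of the loaded value are whatever was previously in that stack slot.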

According to the Intel SDM, KMOVW operates as:

KMOVW
IF *destination is a memory location*
DEST[15:0] := SRC[15:0]
IF *destination is a mask register or a GPR*
DEST := ZeroExtension(SRC[15:0])


i.e. the zero-extension is done only when the destination is a mask register
or a GPR, not when storing to memory.
I cannot verify this on real hardware.
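
For illustration only, here is a minimal plain-C model of the two cases in the
pseudocode above. It is not the attached testcase and uses no AVX-512; all
function names are made up for this sketch, and the "garbage" parameter stands
in for whatever the stack slot held before the spill.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of "kmovw m16, k": only DEST[15:0] is written,
   i.e. exactly two bytes of memory (little-endian, as on x86). */
static void model_kmovw_store(uint8_t *dest, uint64_t k)
{
    assert(dest != NULL);
    dest[0] = (uint8_t)(k & 0xff);
    dest[1] = (uint8_t)((k >> 8) & 0xff);
}

/* Hypothetical model of "kmovw r32, k": DEST := ZeroExtension(SRC[15:0]). */
static uint32_t model_kmovw_to_gpr(uint64_t k)
{
    return (uint32_t)(k & 0xffff);
}

/* Hypothetical model of a 32-bit load such as "vmovd xmm, m32":
   all four bytes of the slot are read. */
static uint32_t model_load_dword(const uint8_t *src)
{
    return ((uint32_t)src[0])
         | ((uint32_t)src[1] << 8)
         | ((uint32_t)src[2] << 16)
         | ((uint32_t)src[3] << 24);
}

/* Spill a mask through a 4-byte slot that previously held `garbage`,
   then reload it as a dword, mirroring the kmovw/vmovd pair above. */
static uint32_t model_spill_and_reload(uint64_t k, uint32_t garbage)
{
    uint8_t slot[4];
    slot[0] = (uint8_t)(garbage & 0xff);
    slot[1] = (uint8_t)((garbage >> 8) & 0xff);
    slot[2] = (uint8_t)((garbage >> 16) & 0xff);
    slot[3] = (uint8_t)((garbage >> 24) & 0xff);
    model_kmovw_store(slot, k);
    return model_load_dword(slot);
}
```

In this model, model_spill_and_reload (0x1234, 0) yields 0x1234, while
model_spill_and_reload (0x1234, 0xdeadbeef) yields 0xdead1234, which would
match the observation that the result depends on how the stack is initialized.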

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r15-4354-20241015190608-g97f98855d41-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/15.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --enable-libsanitizer
--disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r15-4354-20241015190608-g97f98855d41-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 15.0.0 20241014 (experimental) (GCC)
