https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117159
Bug ID: 117159
Summary: kmovw storing to memory is assumed to zero-extend
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: zsojka at seznam dot cz
Target Milestone: ---
Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu
Created attachment 59352
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59352&action=edit
reduced testcase
Output (using Intel SDE):
$ x86_64-pc-linux-gnu-gcc -mavx512bw -Os testcase.c
$ for I in `seq 1 10`; do sde -cnl -- ./a.out; done
Aborted
000000000000000000000000Aborted
0000000000000000Aborted
00000000Aborted
The output is random, depending on how the stack is initialized.
In the ASM code:
...
# testcase.c:10: unsigned k = __builtin_ia32_pcmpgtd512_mask ((V) { }, v, m);
kmovw WORD PTR [rsp+44], k0 # %sfp, tmp120
# testcase.c:11: W r = (W) k + w;
vmovd xmm1, DWORD PTR [rsp+44] # tmp121, %sfp
...
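kmovw stores only a word (16 bits), but vmovd loads a whole dword (32 bits), so the upper 16 bits of the loaded value are whatever was already in the stack slot.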
According to Intel SDM, KMOVW operates as:
KMOVW
IF *destination is a memory location*
    DEST[15:0] := SRC[15:0]
IF *destination is a mask register or a GPR*
    DEST := ZeroExtension(SRC[15:0])
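i.e. the zero-extension is done only when the destination is a mask register or a GPR; a store to memory writes just the low 16 bits and leaves the adjacent bytes unchanged.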
I cannot verify this on real hardware.
$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r15-4354-20241015190608-g97f98855d41-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/15.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --enable-libsanitizer
--disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r15-4354-20241015190608-g97f98855d41-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 15.0.0 20241014 (experimental) (GCC)