https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117159
Bug ID: 117159
Summary: kmovw storing to memory is assumed to zero-extend
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: zsojka at seznam dot cz
Target Milestone: ---
Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 59352
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59352&action=edit
reduced testcase

Output (using Intel SDE):
$ x86_64-pc-linux-gnu-gcc -mavx512bw -Os testcase.c
$ for I in `seq 1 10`; do sde -cnl -- ./a.out; done
Aborted
000000000000000000000000Aborted
0000000000000000Aborted
00000000Aborted

The output is random, depending on how the stack is initialized.

In the ASM code:
...
# testcase.c:10:   unsigned k = __builtin_ia32_pcmpgtd512_mask ((V) { }, v, m);
        kmovw   WORD PTR [rsp+44], k0   # %sfp, tmp120
# testcase.c:11:   W r = (W) k + w;
        vmovd   xmm1, DWORD PTR [rsp+44]        # tmp121, %sfp
...

kmovw stores only a word, but vmovd loads the whole dword.

According to the Intel SDM, KMOVW operates as:

KMOVW
IF *destination is a memory location*
    DEST[15:0] := SRC[15:0]
IF *destination is a mask register or a GPR *
    DEST := ZeroExtension(SRC[15:0])

i.e. the zero-extension is done only when the destination is a mask register or a GPR, not when storing to memory. The upper 16 bits of the dword at [rsp+44] therefore keep whatever was on the stack before the store.

I cannot verify this on real hardware.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r15-4354-20241015190608-g97f98855d41-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/15.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --enable-libsanitizer --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r15-4354-20241015190608-g97f98855d41-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 15.0.0 20241014 (experimental) (GCC)