https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91150
Bug ID: 91150
Summary: [10 Regression] wrong code with -O -mavx512vbmi due to wrong writemask
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: zsojka at seznam dot cz
Target Milestone: ---
Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu
Created attachment 46594
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46594&action=edit
reduced testcase
Output:
$ x86_64-pc-linux-gnu-gcc -O -mavx512vbmi testcase.c
$ sde64 -- ./a.out
Aborted
At the assembly level, the problem seems to be:
# testcase.c:12: {
vpxor xmm2, xmm2, xmm2 # tmp117
mov eax, 4294967295 # tmp119,
vmovdqa64 zmm4, ZMMWORD PTR [rsp+8] # tmp118, b
kmovq k1, rax # tmp119, tmp119
vmovdqu8 zmm4{k1}, zmm2 # tmp118, tmp119, tmp118, tmp117
# testcase.c:11: a <<= (v64u64) (v64u128)
vpsllvq zmm1, zmm1, zmm4 # a, tmp123, tmp118
vmovdqu8 uses the k1 writemask to load zeros into the selected bytes. Here k1
is 0xffffffff, which zeroes the low 32 bytes of zmm4. The mask should be
0xffffffffffff0000 instead, so that only the lowest-order 16-byte word is kept.
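To make the mask semantics concrete, here is a minimal scalar sketch in C of a merge-masked byte move. It is only a model of what vmovdqu8 with a writemask does, not code from the testcase or from GCC; the function names (masked_byte_move, mask_keeping_low_bytes, count_preserved_low_bytes) are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 64 /* size of one zmm register in bytes */

/* Scalar model of a merge-masked byte move (vmovdqu8 dst{k}, src):
   byte i of dst is overwritten from src only where bit i of k is set;
   all other bytes of dst are left untouched. */
void masked_byte_move(uint8_t dst[VEC_BYTES], const uint8_t src[VEC_BYTES],
                      uint64_t k)
{
    for (int i = 0; i < VEC_BYTES; i++)
        if ((k >> i) & 1)
            dst[i] = src[i];
}

/* Hypothetical helper: writemask that overwrites every byte except the
   lowest `keep` bytes. Shifting a 64-bit value by 64 is undefined in C,
   so keep == 64 is handled separately. */
uint64_t mask_keeping_low_bytes(unsigned keep)
{
    assert(keep <= VEC_BYTES);
    return keep == VEC_BYTES ? 0 : ~UINT64_C(0) << keep;
}

/* Apply mask k to a vector filled with a marker value, using an all-zero
   source, and count how many low-order bytes still hold the marker. */
int count_preserved_low_bytes(uint64_t k)
{
    uint8_t dst[VEC_BYTES], zero[VEC_BYTES];
    memset(dst, 0xAA, sizeof dst);
    memset(zero, 0, sizeof zero);
    masked_byte_move(dst, zero, k);

    int n = 0;
    while (n < VEC_BYTES && dst[n] == 0xAA)
        n++;
    return n;
}
```

In this model, the mask from the generated code (0xffffffff) preserves none of the low-order bytes, while mask_keeping_low_bytes(16) yields 0xffffffffffff0000 and preserves exactly the lowest 16 bytes.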
$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-273353-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/10.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-273353-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.0.0 20190710 (experimental) (GCC)