On Fri, Jun 28, 2024 at 3:15 AM liuhongt <hongtao....@intel.com> wrote: > > for the testcase in the PR115406, here is part of the dump. > > char D.4882; > vector(1) <signed-boolean:8> _1; > vector(1) signed char _2; > char _5; > > <bb 2> : > _1 = { -1 }; > > When assign { -1 } to vector(1} {signed-boolean:8}, > Since TYPE_PRECISION (itype) <= BITS_PER_UNIT, so it set each bit of dest > with each vector elemnet. But i think the bit setting should only apply for > TYPE_PRECISION (itype) < BITS_PER_UNIT. .i.e for vector(1). > <signed-boolean:16>, it will be assigned as -1, instead of 1. > Is there any specific reason vector(1) <signed-boolean:8> is handled > differently from vector<1> <signed-boolean:16>? > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk?
I agree that <= BITS_PER_UNIT is suspicious, but the bit-precision code should work for 8 bit entities as well, it seems we only set the LSB of each element in the "mask". ISTR that SVE masks can have up to 8 bit elements (for 8 byte data elements), so maybe that's why <= BITS_PER_UNIT. So maybe instead of just setting one bit in ptr[bit / BITS_PER_UNIT] |= 1 << (bit % BITS_PER_UNIT); we should set elt_bits bits, aka (without testing) ptr[bit / BITS_PER_UNIT] |= (1 << elt_bits - 1) << (bit % BITS_PER_UNIT); ? > gcc/ChangeLog: > > PR middle-end/115406 > * fold-const.cc (native_encode_vector_part): Don't set each > bit to the dest when TYPE_PRECISION (itype) == BITS_PER_UNIT. > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/pr115406.c: New test. > --- > gcc/fold-const.cc | 2 +- > gcc/testsuite/gcc.target/i386/pr115406.c | 23 +++++++++++++++++++++++ > 2 files changed, 24 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr115406.c > > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc > index 710d697c021..0f045f851d1 100644 > --- a/gcc/fold-const.cc > +++ b/gcc/fold-const.cc > @@ -8077,7 +8077,7 @@ native_encode_vector_part (const_tree expr, unsigned > char *ptr, int len, > { > tree itype = TREE_TYPE (TREE_TYPE (expr)); > if (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (expr)) > - && TYPE_PRECISION (itype) <= BITS_PER_UNIT) > + && TYPE_PRECISION (itype) < BITS_PER_UNIT) > { > /* This is the only case in which elements can be smaller than a byte. > Element 0 is always in the lsb of the containing byte. */ > diff --git a/gcc/testsuite/gcc.target/i386/pr115406.c > b/gcc/testsuite/gcc.target/i386/pr115406.c > new file mode 100644 > index 00000000000..623dff06fc3 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr115406.c > @@ -0,0 +1,23 @@ > +/* { dg-do run } */ > +/* { dg-options "-O0 -mavx512f" } */ > +/* { dg-require-effective-target avx512f } */ > + > +typedef __attribute__((__vector_size__ (1))) char V; > + > +char > +foo (V v) > +{ > + return ((V) v == v)[0]; > +} > + > +int > +main () > +{ > + if (!__builtin_cpu_supports ("avx512f")) > + return 0; > + > + char x = foo ((V) { }); > + if (x != -1) > + __builtin_abort (); > + return 0; > +} > -- > 2.31.1 >