On Fri, Jun 28, 2024 at 8:01 AM Richard Biener <richard.guent...@gmail.com> wrote: > > On Fri, Jun 28, 2024 at 3:15 AM liuhongt <hongtao....@intel.com> wrote: > > > > for the testcase in the PR115406, here is part of the dump. > > > > char D.4882; > > vector(1) <signed-boolean:8> _1; > > vector(1) signed char _2; > > char _5; > > > > <bb 2> : > > _1 = { -1 }; > > > > When assign { -1 } to vector(1} {signed-boolean:8}, > > Since TYPE_PRECISION (itype) <= BITS_PER_UNIT, so it set each bit of dest > > with each vector elemnet. But i think the bit setting should only apply for > > TYPE_PRECISION (itype) < BITS_PER_UNIT. .i.e for vector(1). > > <signed-boolean:16>, it will be assigned as -1, instead of 1. > > Is there any specific reason vector(1) <signed-boolean:8> is handled > > differently from vector<1> <signed-boolean:16>? > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > > Ok for trunk? > > I agree that <= BITS_PER_UNIT is suspicious, but the bit-precision > code should work for 8 bit > entities as well, it seems we only set the LSB of each element in the > "mask". ISTR that SVE > masks can have up to 8 bit elements (for 8 byte data elements), so > maybe that's why > <= BITS_PER_UNIT. So maybe instead of just setting one bit in > > ptr[bit / BITS_PER_UNIT] |= 1 << (bit % BITS_PER_UNIT); > > we should set elt_bits bits, aka (without testing) > > ptr[bit / BITS_PER_UNIT] |= (1 << elt_bits - 1) << (bit > % BITS_PER_UNIT); > > ?
Alternatively if (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (expr)) && TYPE_PRECISION (itype) <= BITS_PER_UNIT) should be amended with && GET_MODE_CLASS (TYPE_MODE (TREE_TYPE (expr))) != MODE_VECTOR_INT maybe. Still for the possibility of vector(n) <signed-boolean:16> mask for a int128 element vector we'd have 16bit mask elements, encoding that differently would be inconsistent as well (but of course 16bit elements are not handled by the code right now). Richard. > > gcc/ChangeLog: > > > > PR middle-end/115406 > > * fold-const.cc (native_encode_vector_part): Don't set each > > bit to the dest when TYPE_PRECISION (itype) == BITS_PER_UNIT. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/i386/pr115406.c: New test. > > --- > > gcc/fold-const.cc | 2 +- > > gcc/testsuite/gcc.target/i386/pr115406.c | 23 +++++++++++++++++++++++ > > 2 files changed, 24 insertions(+), 1 deletion(-) > > create mode 100644 gcc/testsuite/gcc.target/i386/pr115406.c > > > > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc > > index 710d697c021..0f045f851d1 100644 > > --- a/gcc/fold-const.cc > > +++ b/gcc/fold-const.cc > > @@ -8077,7 +8077,7 @@ native_encode_vector_part (const_tree expr, unsigned > > char *ptr, int len, > > { > > tree itype = TREE_TYPE (TREE_TYPE (expr)); > > if (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (expr)) > > - && TYPE_PRECISION (itype) <= BITS_PER_UNIT) > > + && TYPE_PRECISION (itype) < BITS_PER_UNIT) > > { > > /* This is the only case in which elements can be smaller than a > > byte. > > Element 0 is always in the lsb of the containing byte. */ > > diff --git a/gcc/testsuite/gcc.target/i386/pr115406.c > > b/gcc/testsuite/gcc.target/i386/pr115406.c > > new file mode 100644 > > index 00000000000..623dff06fc3 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/i386/pr115406.c > > @@ -0,0 +1,23 @@ > > +/* { dg-do run } */ > > +/* { dg-options "-O0 -mavx512f" } */ > > +/* { dg-require-effective-target avx512f } */ > > + > > +typedef __attribute__((__vector_size__ (1))) char V; > > + > > +char > > +foo (V v) > > +{ > > + return ((V) v == v)[0]; > > +} > > + > > +int > > +main () > > +{ > > + if (!__builtin_cpu_supports ("avx512f")) > > + return 0; > > + > > + char x = foo ((V) { }); > > + if (x != -1) > > + __builtin_abort (); > > + return 0; > > +} > > -- > > 2.31.1 > >