On Fri, Jun 28, 2024 at 8:01 AM Richard Biener
<richard.guent...@gmail.com> wrote:
>
> On Fri, Jun 28, 2024 at 3:15 AM liuhongt <hongtao....@intel.com> wrote:
> >
> > for the testcase in the PR115406, here is part of the dump.
> >
> >   char D.4882;
> >   vector(1) <signed-boolean:8> _1;
> >   vector(1) signed char _2;
> >   char _5;
> >
> >   <bb 2> :
> >   _1 = { -1 };
> >
> > When assign { -1 } to vector(1} {signed-boolean:8},
> > Since TYPE_PRECISION (itype) <= BITS_PER_UNIT, so it set each bit of dest
> > with each vector elemnet. But i think the bit setting should only apply for
> > TYPE_PRECISION (itype) < BITS_PER_UNIT. .i.e for vector(1).
> > <signed-boolean:16>, it will be assigned as -1, instead of 1.
> > Is there any specific reason vector(1) <signed-boolean:8> is handled
> > differently from vector<1> <signed-boolean:16>?
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ok for trunk?
>
> I agree that <= BITS_PER_UNIT is suspicious, but the bit-precision
> code should work for 8 bit
> entities as well, it seems we only set the LSB of each element in the
> "mask".  ISTR that SVE
> masks can have up to 8 bit elements (for 8 byte data elements), so
> maybe that's why
> <= BITS_PER_UNIT.  So maybe instead of just setting one bit in
>
>               ptr[bit / BITS_PER_UNIT] |= 1 << (bit % BITS_PER_UNIT);
>
> we should set elt_bits bits, aka (without testing)
>
>               ptr[bit / BITS_PER_UNIT] |= (1 << elt_bits - 1) << (bit
> % BITS_PER_UNIT);
>
> ?

Alternatively

  if (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (expr))
      && TYPE_PRECISION (itype) <= BITS_PER_UNIT)

should be amended with

   && GET_MODE_CLASS (TYPE_MODE (TREE_TYPE (expr))) != MODE_VECTOR_INT

maybe.  Still for the possibility of vector(n) <signed-boolean:16>
mask for a int128 element vector
we'd have 16bit mask elements, encoding that differently would be
inconsistent as well
(but of course 16bit elements are not handled by the code right now).

Richard.

> > gcc/ChangeLog:
> >
> >         PR middle-end/115406
> >         * fold-const.cc (native_encode_vector_part): Don't set each
> >         bit to the dest when TYPE_PRECISION (itype) == BITS_PER_UNIT.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/i386/pr115406.c: New test.
> > ---
> >  gcc/fold-const.cc                        |  2 +-
> >  gcc/testsuite/gcc.target/i386/pr115406.c | 23 +++++++++++++++++++++++
> >  2 files changed, 24 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr115406.c
> >
> > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> > index 710d697c021..0f045f851d1 100644
> > --- a/gcc/fold-const.cc
> > +++ b/gcc/fold-const.cc
> > @@ -8077,7 +8077,7 @@ native_encode_vector_part (const_tree expr, unsigned 
> > char *ptr, int len,
> >  {
> >    tree itype = TREE_TYPE (TREE_TYPE (expr));
> >    if (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (expr))
> > -      && TYPE_PRECISION (itype) <= BITS_PER_UNIT)
> > +      && TYPE_PRECISION (itype) < BITS_PER_UNIT)
> >      {
> >        /* This is the only case in which elements can be smaller than a 
> > byte.
> >          Element 0 is always in the lsb of the containing byte.  */
> > diff --git a/gcc/testsuite/gcc.target/i386/pr115406.c 
> > b/gcc/testsuite/gcc.target/i386/pr115406.c
> > new file mode 100644
> > index 00000000000..623dff06fc3
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr115406.c
> > @@ -0,0 +1,23 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-O0 -mavx512f" } */
> > +/* { dg-require-effective-target avx512f } */
> > +
> > +typedef __attribute__((__vector_size__ (1))) char V;
> > +
> > +char
> > +foo (V v)
> > +{
> > +  return ((V) v == v)[0];
> > +}
> > +
> > +int
> > +main ()
> > +{
> > +  if (!__builtin_cpu_supports ("avx512f"))
> > +    return 0;
> > +
> > +  char x = foo ((V) { });
> > +  if (x != -1)
> > +    __builtin_abort ();
> > +  return 0;
> > +}
> > --
> > 2.31.1
> >

Reply via email to