> Am 19.07.2024 um 19:44 schrieb Richard Sandiford <richard.sandif...@arm.com>:
> 
> Previously we built vector boolean constants using 1 for true
> elements and 0 for false elements.  This matches the predicates
> produced by SVE's PTRUE instruction, but leads to a miscompilation
> on AVX512, where all bits of a boolean element should be set.

Note it’s for AVX with the issue that QImode masks are handled differently than 
larger mode masks.  AVX512 has only single-bit element masks where this does 
not make a difference.

> One option for RTL would be to make this target-configurable.
> But that isn't really possible at the tree level, where vectors
> should work in a more target-independent way.  (There is currently
> no way to create a "generic" packed boolean vector, but never say
> never :))  And, if we were going to pick a generic behaviour,
> it would make sense to use 0/-1 rather than 0/1, for consistency
> with integer vectors.
> 
> Both behaviours should work with SVE on read, since SVE ignores
> the upper bits in each predicate element.  And the choice shouldn't
> make much difference for RTL, since all SVE predicate modes are
> expressed as vectors of BI, rather than of multi-bit booleans.
> 
> I suspect there might be some fallout from this change on SVE.
> But I think we should at least give it a go, and see whether any
> fallout provides a strong counterargument against the approach.
> 
> Tested on aarch64-linux-gnu (configured with --with-cpu=neoverse-v1
> for extra coverage) and x86_64-linux-gnu.  OK to install?

OK,

Richard 

> Richard
> 
> 
> gcc/
>    PR middle-end/115406
>    * fold-const.cc (native_encode_vector_part): For vector booleans,
>    check whether an element is nonzero and, if so, set all of the
>    correspending bits in the target image.
>    * simplify-rtx.cc (native_encode_rtx): Likewise.
> 
> gcc/testsuite/
>    PR middle-end/115406
>    * gcc.dg/torture/pr115406.c: New test.
> ---
> gcc/fold-const.cc                       |  5 +++--
> gcc/simplify-rtx.cc                     |  3 ++-
> gcc/testsuite/gcc.dg/torture/pr115406.c | 18 ++++++++++++++++++
> 3 files changed, 23 insertions(+), 3 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/torture/pr115406.c
> 
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index 710d697c021..a509b46b904 100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -8097,16 +8097,17 @@ native_encode_vector_part (const_tree expr, unsigned 
> char *ptr, int len,
>       unsigned int elts_per_byte = BITS_PER_UNIT / elt_bits;
>       unsigned int first_elt = off * elts_per_byte;
>       unsigned int extract_elts = extract_bytes * elts_per_byte;
> +      unsigned int elt_mask = (1 << elt_bits) - 1;
>       for (unsigned int i = 0; i < extract_elts; ++i)
>    {
>      tree elt = VECTOR_CST_ELT (expr, first_elt + i);
>      if (TREE_CODE (elt) != INTEGER_CST)
>        return 0;
> 
> -      if (ptr && wi::extract_uhwi (wi::to_wide (elt), 0, 1))
> +      if (ptr && integer_nonzerop (elt))
>        {
>          unsigned int bit = i * elt_bits;
> -          ptr[bit / BITS_PER_UNIT] |= 1 << (bit % BITS_PER_UNIT);
> +          ptr[bit / BITS_PER_UNIT] |= elt_mask << (bit % BITS_PER_UNIT);
>        }
>    }
>       return extract_bytes;
> diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
> index 35ba54c6292..a49eefb34d4 100644
> --- a/gcc/simplify-rtx.cc
> +++ b/gcc/simplify-rtx.cc
> @@ -7232,7 +7232,8 @@ native_encode_rtx (machine_mode mode, rtx x, 
> vec<target_unit> &bytes,
>          target_unit value = 0;
>          for (unsigned int j = 0; j < BITS_PER_UNIT; j += elt_bits)
>        {
> -          value |= (INTVAL (CONST_VECTOR_ELT (x, elt)) & mask) << j;
> +          if (INTVAL (CONST_VECTOR_ELT (x, elt)))
> +            value |= mask << j;
>          elt += 1;
>        }
>          bytes.quick_push (value);
> diff --git a/gcc/testsuite/gcc.dg/torture/pr115406.c 
> b/gcc/testsuite/gcc.dg/torture/pr115406.c
> new file mode 100644
> index 00000000000..800ef2f8317
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr115406.c
> @@ -0,0 +1,18 @@
> +// { dg-do run }
> +// { dg-additional-options "-mavx512f" { target avx512f_runtime } }
> +
> +typedef __attribute__((__vector_size__ (1))) signed char V;
> +
> +signed char
> +foo (V v)
> +{
> +  return ((V) v == v)[0];
> +}
> +
> +int
> +main ()
> +{
> +  signed char x = foo ((V) { });
> +  if (x != -1)
> +    __builtin_abort ();
> +}
> --
> 2.25.1
> 

Reply via email to