"Robin Dapp" <rdapp....@gmail.com> writes:
> Hi,
>
> this came up when testing even/odd permutes on riscv
> (https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669181.html).
> I didn't yet continue with the patch but it's clear it
> exposes an issue with encoding vector masks.
>
> When we encode a vector mask with a constant number of elements
> and fewer than BITS_PER_UNIT elements (i.e. 4 for a mask {1, 0, 1, 0})
> native_encode_rtx results in a value 85 = 0b10101010.

Do you mean fewer than BITS_PER_UNIT significant bits in total?
E.g. some SVE modes have fewer than BITS_PER_UNIT elements,
but each element occupies multiple bits, meaning that the whole
thing occupies full bytes.

If so...

> This is because CONST_VECTOR_ELT assumes it's operating on an encoded
> sequence and will happily return values beyond the actual number of
> elements.  Subsequently, when optimizing the constant pool, the
> hash values of {1, 0, 1, 0} and {1, 0, 1, 0, 1, 0, 1, 0} are identical.
> Then one of them is linked to the other one, resulting in wrong code.
>
> Therefore, this patch limits the number of elements to consider when
> building the value to GET_MODE_NUNITS in case the latter is constant
> and less than BITS_PER_UNIT.

...it's not clear to me that we should define the upper bits of the
byte to be zero.  What would rely on that?  Is it something that we'd
require for all loads and stores of such modes?

Thanks,
Richard
>
> Bootstrapped and regtested on x86, regtested on riscv.
>
> Regards
>  Robin
>
> gcc/ChangeLog:
>
>       * simplify-rtx.cc (native_encode_rtx): Limit number of elements
>       that are used for encoded value.
> ---
>  gcc/simplify-rtx.cc | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
> index 893c5f6e1ae..df4ad4db941 100644
> --- a/gcc/simplify-rtx.cc
> +++ b/gcc/simplify-rtx.cc
> @@ -7332,6 +7332,15 @@ native_encode_rtx (machine_mode mode, rtx x, 
> vec<target_unit> &bytes,
>        vectors can have several elements per byte.  */
>        unsigned int elt_bits = vector_element_size (GET_MODE_PRECISION (mode),
>                                                  GET_MODE_NUNITS (mode));
> +      poly_uint16 nunits = GET_MODE_NUNITS (mode);
> +
> +      /* For a const vector with a constant number of elements (and fewer
> +      than BITS_PER_UNIT elements) CONST_VECTOR_ELT will return values
> +      beyond the actual number of elements.*/
> +      unsigned int max_elt_bits = (num_bytes == 1 && nunits.is_constant ())
> +     ? MIN (BITS_PER_UNIT, nunits.to_constant ())
> +     : BITS_PER_UNIT;
> +
>        unsigned int elt = first_byte * BITS_PER_UNIT / elt_bits;
>        if (elt_bits < BITS_PER_UNIT)
>       {
> @@ -7342,7 +7351,7 @@ native_encode_rtx (machine_mode mode, rtx x, 
> vec<target_unit> &bytes,
>         for (unsigned int i = 0; i < num_bytes; ++i)
>           {
>             target_unit value = 0;
> -           for (unsigned int j = 0; j < BITS_PER_UNIT; j += elt_bits)
> +           for (unsigned int j = 0; j < max_elt_bits; j += elt_bits)
>               {
>                 if (INTVAL (CONST_VECTOR_ELT (x, elt)))
>                   value |= mask << j;

Reply via email to