On Sun, Jul 9, 2023 at 10:53 AM Uros Bizjak via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> As shown in the PR, the simplify_gen_subreg call in simplify_replace_fn_rtx:
>
> (gdb) list
> 469           if (code == SUBREG)
> 470             {
> 471               op0 = simplify_replace_fn_rtx (SUBREG_REG (x), old_rtx, fn, data);
> 472               if (op0 == SUBREG_REG (x))
> 473                 return x;
> 474               op0 = simplify_gen_subreg (GET_MODE (x), op0,
> 475                                          GET_MODE (SUBREG_REG (x)),
> 476                                          SUBREG_BYTE (x));
> 477               return op0 ? op0 : x;
> 478             }
>
> simplifies with the following arguments:
>
> (gdb) p debug_rtx (op0)
> (const_vector:V4QI [
>         (const_int -52 [0xffffffffffffffcc]) repeated x4
>     ])
> (gdb) p debug_rtx (x)
> (subreg:V16QI (reg:V4QI 98) 0)
>
> to:
>
> (gdb) p debug_rtx (op0)
> (const_vector:V16QI [
>         (const_int -52 [0xffffffffffffffcc]) repeated x16
>     ])
>
> This simplification is invalid: it is not possible to get a V16QImode vector
> from a V4QImode vector, even when all elements are duplicates.
>
> The simplification happens in simplify_context::simplify_subreg:
>
> (gdb) list
> 7558          if (VECTOR_MODE_P (outermode)
> 7559              && GET_MODE_INNER (outermode) == GET_MODE_INNER (innermode)
> 7560              && vec_duplicate_p (op, &elt))
> 7561            return gen_vec_duplicate (outermode, elt);
>
> but the above simplification is valid only for non-paradoxical subregs,
> where the size of outermode <= the size of innermode.  We should not assume
> that elements outside the original register are valid, let alone all
> duplicates.
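
To make the condition described above concrete, here is a minimal
standalone sketch in C++.  It is not the patch and does not use GCC's
internal types: VecMode, is_paradoxical and may_fold_to_duplicate are
hypothetical names invented for illustration, and a mode is modelled
only by its element width and lane count.

```cpp
#include <cassert>

// Hypothetical stand-in for a vector machine mode (not GCC's machine_mode):
// only the element width and the number of lanes are modelled.
struct VecMode {
    unsigned elem_bits;
    unsigned lanes;
    unsigned size_bits() const { return elem_bits * lanes; }
};

// A subreg is paradoxical when the outer mode is wider than the inner mode.
bool is_paradoxical(VecMode outer, VecMode inner) {
    return outer.size_bits() > inner.size_bits();
}

// Model of the guarded fold: a duplicated inner vector may be rewritten as
// a duplicate in the outer mode only when the element types match and the
// outer mode does not reach beyond the inner register.
bool may_fold_to_duplicate(VecMode outer, VecMode inner,
                           bool inner_is_duplicate) {
    return outer.elem_bits == inner.elem_bits
           && inner_is_duplicate
           && !is_paradoxical(outer, inner);
}
```

With these definitions the case from the gdb session, V16QI as the outer
mode over a V4QI inner value, is rejected, while the reverse direction
(V4QI taken from a V16QI duplicate) is still allowed.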

Hmm, but looking at the audit trail the x86 backend expects them to be zero?
Isn't that wrong as well?

That is, I think putting any random value into the upper lanes when
constant folding a paradoxical subreg sounds OK to me, no?
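
As a rough sketch of the two readings in play here (not GCC code; the
names UpperLanes and fold_is_acceptable are made up for illustration,
and lanes are assumed to be laid out so that the inner value occupies
the lowest indices), one can model when a folded constant is an
acceptable replacement for a paradoxical subreg:

```cpp
#include <cstddef>
#include <vector>

// Two possible contracts for the lanes beyond the inner register.
enum class UpperLanes { Undefined, Zero };

// Returns true when `folded` is an acceptable constant for a paradoxical
// subreg of `inner` under the given contract.  The low lanes must always
// match; the upper lanes are unconstrained under Undefined and must be
// zero under Zero.
bool fold_is_acceptable(const std::vector<int>& inner,
                        const std::vector<int>& folded,
                        UpperLanes policy) {
    if (folded.size() < inner.size())
        return false;
    for (std::size_t i = 0; i < inner.size(); ++i)
        if (folded[i] != inner[i])
            return false;
    if (policy == UpperLanes::Zero)
        for (std::size_t i = inner.size(); i < folded.size(); ++i)
            if (folded[i] != 0)
                return false;
    return true;
}
```

Under the Undefined contract the V16QI vector of sixteen -52 values is
an acceptable fold of the V4QI duplicate; under the Zero contract it is
not.  The sketch only shows that the answer depends on which contract
holds; it does not settle which one applies.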

Of course we might choose to not do such constant propagation for
efficiency reasons - at least when the resulting CONST_* would require
a larger constant pool entry or more costly construction.

Thanks,
Richard.

>     PR target/110206
>
> gcc/ChangeLog:
>
>     * simplify-rtx.cc (simplify_context::simplify_subreg):
>     Avoid returning a vector with duplicated value
>     outside the original register.
>
> gcc/testsuite/ChangeLog:
>
>     * gcc.dg/torture/pr110206.c: New test.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> OK for master and release branches?
>
> Uros.
