On Sun, Jul 9, 2023 at 10:53 AM Uros Bizjak via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
> As shown in the PR, simplify_gen_subreg call in simplify_replace_fn_rtx:
>
> (gdb) list
> 469           if (code == SUBREG)
> 470             {
> 471               op0 = simplify_replace_fn_rtx (SUBREG_REG (x), old_rtx, fn, data);
> 472               if (op0 == SUBREG_REG (x))
> 473                 return x;
> 474               op0 = simplify_gen_subreg (GET_MODE (x), op0,
> 475                                          GET_MODE (SUBREG_REG (x)),
> 476                                          SUBREG_BYTE (x));
> 477               return op0 ? op0 : x;
> 478             }
>
> simplifies with following arguments:
>
> (gdb) p debug_rtx (op0)
> (const_vector:V4QI [
>         (const_int -52 [0xffffffffffffffcc]) repeated x4
>     ])
> (gdb) p debug_rtx (x)
> (subreg:V16QI (reg:V4QI 98) 0)
>
> to:
>
> (gdb) p debug_rtx (op0)
> (const_vector:V16QI [
>         (const_int -52 [0xffffffffffffffcc]) repeated x16
>     ])
>
> This simplification is invalid, it is not possible to get V16QImode vector
> from V4QImode vector, even when all elements are duplicates.
>
> The simplification happens in simplify_context::simplify_subreg:
>
> (gdb) list
> 7558          if (VECTOR_MODE_P (outermode)
> 7559              && GET_MODE_INNER (outermode) == GET_MODE_INNER (innermode)
> 7560              && vec_duplicate_p (op, &elt))
> 7561            return gen_vec_duplicate (outermode, elt);
>
> but the above simplification is valid only for non-paradoxical registers,
> where outermode <= innermode.  We should not assume that elements outside
> the original register are valid, let alone all duplicates.

Hmm, but looking at the audit trail the x86 backend expects them to be
zero?  Isn't that wrong as well?

That is, I think putting any random value into the upper lanes when
constant folding a paradoxical subreg sounds OK to me, no?

Of course we might choose to not do such constant propagation for
efficiency reasons - at least when the resulting CONST_* would require
a larger constant pool entry or more costly construction.

Thanks,
Richard.

> PR target/110206
>
> gcc/ChangeLog:
>
>     * simplify-rtx.cc (simplify_context::simplify_subreg):
>     Avoid returning a vector with duplicated value
>     outside the original register.
>
> gcc/testsuite/ChangeLog:
>
>     * gcc.dg/torture/pr110206.c: New test.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> OK for master and release branches?
>
> Uros.
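
For illustration, the size check argued for in the patch description can be
modelled outside GCC.  The following is only a minimal sketch in plain C++17,
not GCC code and not the actual patch: ByteVec, all_duplicates and
fold_dup_subreg are hypothetical names, and element counts stand in for mode
sizes.  It folds a subreg of a duplicated constant vector only when the outer
vector is no wider than the inner one, and declines in the paradoxical case
rather than inventing values for the extra lanes.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Toy stand-in for a constant vector of QImode elements (not GCC's rtx).
using ByteVec = std::vector<std::int8_t>;

// True when every element equals the first one; loosely analogous to
// what vec_duplicate_p checks, but on the toy type above.
static bool all_duplicates(const ByteVec &v)
{
  return std::all_of(v.begin(), v.end(),
                     [&v](std::int8_t e) { return e == v.front(); });
}

// Hypothetical helper: try to fold a subreg at offset 0 of a duplicated
// constant vector INNER into a vector with OUTER_ELEMS elements.
// Returns std::nullopt when no folding is done.
static std::optional<ByteVec>
fold_dup_subreg(const ByteVec &inner, std::size_t outer_elems)
{
  if (inner.empty() || !all_duplicates(inner))
    return std::nullopt;

  // Paradoxical case: the outer vector is wider than the inner one, so the
  // extra lanes are not described by INNER.  Decline to fold.
  if (outer_elems > inner.size())
    return std::nullopt;

  // Non-paradoxical case: every requested lane lies inside INNER and is
  // therefore known to hold the duplicated element.
  return ByteVec(outer_elems, inner.front());
}
```

With this model, a 4-element vector of -52 viewed as 16 elements (the
V4QI to V16QI case from the PR) is left unfolded, while viewing it as 2 or 4
elements still folds to a duplicated vector.  Whether folding the paradoxical
case with arbitrary upper lanes would also be acceptable is the question
raised in the reply above; the sketch simply takes the conservative route.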