On Fri, 14 Jul 2023, Uros Bizjak wrote:

> cprop1 pass does not consider paradoxical subreg and for (insn 22) claims
> that it equals 8 elements of HImode by setting REG_EQUAL note:
>
> (insn 21 19 22 4 (set (reg:V4QI 98)
>         (mem/u/c:V4QI (symbol_ref/u:DI ("*.LC1") [flags 0x2]) [0 S4
> A32])) "pr110206.c":12:42 1530 {*movv4qi_internal}
>      (expr_list:REG_EQUAL (const_vector:V4QI [
>                 (const_int -52 [0xffffffffffffffcc]) repeated x4
>             ])
>         (nil)))
> (insn 22 21 23 4 (set (reg:V8HI 100)
>         (zero_extend:V8HI (vec_select:V8QI (subreg:V16QI (reg:V4QI 98) 0)
>                 (parallel [
>                         (const_int 0 [0])
>                         (const_int 1 [0x1])
>                         (const_int 2 [0x2])
>                         (const_int 3 [0x3])
>                         (const_int 4 [0x4])
>                         (const_int 5 [0x5])
>                         (const_int 6 [0x6])
>                         (const_int 7 [0x7])
>                     ])))) "pr110206.c":12:42 7471
> {sse4_1_zero_extendv8qiv8hi2}
>      (expr_list:REG_EQUAL (const_vector:V8HI [
>                 (const_int 204 [0xcc]) repeated x8
>             ])
>         (expr_list:REG_DEAD (reg:V4QI 98)
>             (nil))))
>
> We rely on the "undefined" vals to have a specific value (from the earlier
> REG_EQUAL note) but actual code generation doesn't ensure this (it doesn't
> need to). That said, the issue isn't the constant folding per-se but that
> we do not actually constant fold but register an equality that doesn't hold.
>
>         PR target/110206
>
> gcc/ChangeLog:
>
>         * fwprop.cc (contains_paradoxical_subreg_p): Move to ...
>         * rtlanal.cc (contains_paradoxical_subreg_p): ... here.
>         * rtlanal.h (contains_paradoxical_subreg_p): Add prototype.
>         * cprop.cc (try_replace_reg): Do not set REG_EQUAL note
>         when the original source contains a paradoxical subreg.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/torture/pr110206.c: New test.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> OK for mainline and backports?
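For anyone following along, the mismatch in the quoted dump can be modelled in plain C. This is only a sketch, not GCC code: the 16-byte array stands in for the V16QI view of reg 98, the 0xAA filler is an arbitrary assumption for the bytes the paradoxical subreg leaves undefined, and all function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: stand-in for the undefined upper bytes; real hardware may
   leave any value there.  */
#define UNDEFINED_FILL 0xAA

/* Model (subreg:V16QI (reg:V4QI 98) 0): only the low 4 bytes are defined.  */
static void paradoxical_subreg_v16qi(const uint8_t v4qi[4], uint8_t out[16])
{
    memset(out, UNDEFINED_FILL, 16);
    memcpy(out, v4qi, 4);
}

/* Model (zero_extend:V8HI (vec_select:V8QI ... [0..7])): widen bytes 0..7.  */
static void zero_extend_low8(const uint8_t in[16], uint16_t out[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = in[i];
}

/* Count the lanes that match the value claimed by the REG_EQUAL note on
   insn 22 (0xcc repeated x8).  */
static int lanes_matching_note(const uint16_t v8hi[8])
{
    int n = 0;
    for (int i = 0; i < 8; i++)
        if (v8hi[i] == 0xcc)
            n++;
    return n;
}
```

Feeding four 0xcc bytes through both helpers gives only four lanes equal to 0xcc under this model; lanes 4 to 7 hold whatever the filler was, so a note claiming eight equal lanes is not justified.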
OK.

I think the testcase can also run on other targets if you add
dg-additional-options "-w -Wno-psabi"; all generic vector ops should be
lowered if not supported.

Richard.
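A rough illustration of the suggested directive, assuming it goes at the top of the test next to whatever dg- lines the file already has. The vector type and function below are hypothetical placeholders written with GCC's generic vector extension; they are not the contents of pr110206.c.

```c
/* { dg-additional-options "-w -Wno-psabi" } */

/* Hypothetical 4-byte generic vector type (GCC vector extension).  */
typedef unsigned char u8x4 __attribute__((vector_size(4)));

/* Passing and returning vector types by value is what can trigger ABI
   warnings on some targets; -Wno-psabi silences those, -w the rest.  */
u8x4 add_one(u8x4 v)
{
    return v + (u8x4){1, 1, 1, 1};
}
```

On targets without a matching vector instruction, an operation like this is expected to be lowered to scalar code rather than rejected.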