on 2022/6/21 06:10, Segher Boessenkool wrote: > Hi! > > On Fri, Jun 17, 2022 at 07:13:37AM +0200, Roger Sayle wrote: >> This patch addresses PR target/105991 where a change to prefer representing >> shifts and adds at the tree-level as multiplications, causes problems for >> the rldimi patterns in the powerpc backend. > > Because it now is converted to different RTL at expand time. Which the > generic expand code does some premature optimisation on, which makes us > end up with the addition instead of data manipulation insns. Oh well. > >> The issue is that rs6000.md >> models this pattern using IOR, and some variants that have the equivalent >> PLUS or XOR in the RTL fail to match some *rotl<mode>4_insert patterns. >> This is fixed in this patch by adding a define_insn_and_split to locally >> canonicalize the PLUS and XOR forms to the backend's preferred IOR form. > > Okay. > >> An alternative fix might be for the RTL optimizers to define a canonical >> form for these plus_xor_ior equivalent expressions, but the logical >> choice might be plus (which may appear in an addressing mode), and such >> a change may require a number of tweaks to update various backends >> (i.e. a more intrusive change than the one proposed here). > > This does not make sense in an address at all, thankfully :-) > > The only sane canonicalisation for this is something like VEC_DUPLICATE > but for submodes of integer modes, instead of the component mode of a > vector mode. I don't feel this is worth trying to handle in general > though. > >> Many thanks for Marek Polacek for bootstrapping and regression testing >> this change without problems. > > You have an account on the cfarm, it is quick and easy to test there :-) > I recommend gcc135, a 32 core p9, with oodles of disk space :-) > >> +; Canonicalize the PLUS and XOR forms to IOR for rotl<mode>3_insert_3 >> +(define_code_iterator plus_xor [plus xor]) >> + >> +(define_insn_and_split "*rotl<mode>3_insert_3_<code>" >> + [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") >> + (plus_xor:GPR >> + (and:GPR (match_operand:GPR 3 "gpc_reg_operand" "0") >> + (match_operand:GPR 4 "const_int_operand" "n")) >> + (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand" "r") >> + (match_operand:SI 2 "const_int_operand" "n"))))] >> + "INTVAL (operands[2]) == exact_log2 (UINTVAL (operands[4]) + 1)" > > exact_log2 returns -1 if its argument is not a power of two. Please > test it is > 0 explicitly here: I don't think this splitter will work > correctly otherwise. There shouldn't really be a shift by 0 ever of > course, but it isn't invalid RTL. > >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/powerpc/pr105991.c >> @@ -0,0 +1,11 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-O2" } */ >> +unsigned long long >> +foo (unsigned long long value) >> +{ >> + value &= 0xffffffff; >> + value |= value << 32; >> + return value; >> +} >> +/* { dg-final { scan-assembler "rldimi" } } */ > > Write > /* { dg-final { scan-assembler {\mrldimi\M} } } */ > please. >
This case also needs effective-target keyword lp64, that is /* { dg-require-effective-target lp64 } */ since with -m32, it gets: mr 3,4 with -m32 -mpowerpc64, it gets: rldicl 3,4,0,32 BR, Kewen