Re: [rs6000 PATCH] PR target/105991: Recognize PLUS and XOR forms of rldimi.

Kewen.Lin via Gcc-patches Mon, 20 Jun 2022 19:03:42 -0700

on 2022/6/21 06:10, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Jun 17, 2022 at 07:13:37AM +0200, Roger Sayle wrote:
>> This patch addresses PR target/105991 where a change to prefer representing
>> shifts and adds at the tree-level as multiplications, causes problems for
>> the rldimi patterns in the powerpc backend.
> 
> Because it now is converted to different RTL at expand time.  Which the
> generic expand code does some premature optimisation on, which makes us
> end up with the addition instead of data manipulation insns.  Oh well.
> 
>> The issue is that rs6000.md
>> models this pattern using IOR, and some variants that have the equivalent
>> PLUS or XOR in the RTL fail to match some *rotl<mode>4_insert patterns.
>> This is fixed in this patch by adding a define_insn_and_split to locally
>> canonicalize the PLUS and XOR forms to the backend's preferred IOR form.
> 
> Okay.
> 
>> An alternative fix might be for the RTL optimizers to define a canonical
>> form for these plus_xor_ior equivalent expressions, but the logical
>> choice might be plus (which may appear in an addressing mode), and such
>> a change may require a number of tweaks to update various backends
>> (i.e.  a more intrusive change than the one proposed here).
> 
> This does not make sense in an address at all, thankfully :-)
> 
> The only sane canonicalisation for this is something like VEC_DUPLICATE
> but for submodes of integer modes, instead of the component mode of a
> vector mode.  I don't feel this is worth trying to handle in general
> though.
> 
>> Many thanks for Marek Polacek for bootstrapping and regression testing
>> this change without problems.
> 
> You have an account on the cfarm, it is quick and easy to test there :-)
> I recommend gcc135, a 32 core p9, with oodles of disk space :-)
> 
>> +; Canonicalize the PLUS and XOR forms to IOR for rotl<mode>3_insert_3
>> +(define_code_iterator plus_xor [plus xor])
>> +
>> +(define_insn_and_split "*rotl<mode>3_insert_3_<code>"
>> +  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
>> +    (plus_xor:GPR
>> +      (and:GPR (match_operand:GPR 3 "gpc_reg_operand" "0")
>> +               (match_operand:GPR 4 "const_int_operand" "n"))
>> +      (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
>> +                  (match_operand:SI 2 "const_int_operand" "n"))))]
>> +  "INTVAL (operands[2]) == exact_log2 (UINTVAL (operands[4]) + 1)"
> 
> exact_log2 returns -1 if its argument is not a power of two.  Please
> test it is > 0 explicitly here: I don't think this splitter will work
> correctly otherwise.  There shouldn't really be a shift by 0 ever of
> course, but it isn't invalid RTL.
> 
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr105991.c
>> @@ -0,0 +1,11 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +unsigned long long
>> +foo (unsigned long long value)
>> +{
>> +  value &= 0xffffffff;
>> +  value |= value << 32;
>> +  return value;
>> +}
>> +/* { dg-final { scan-assembler "rldimi" } } */
> 
> Write
> /* { dg-final { scan-assembler {\mrldimi\M} } } */
> please.
>


This case also needs effective-target keyword lp64,
that is /* { dg-require-effective-target lp64 } */

since with -m32, it gets:
  mr 3,4

with -m32 -mpowerpc64, it gets:
  rldicl 3,4,0,32


BR,
Kewen

Re: [rs6000 PATCH] PR target/105991: Recognize PLUS and XOR forms of rldimi.

Reply via email to