on 2022/6/21 06:10, Segher Boessenkool wrote:
> Hi!
>
> On Fri, Jun 17, 2022 at 07:13:37AM +0200, Roger Sayle wrote:
>> This patch addresses PR target/105991 where a change to prefer representing
>> shifts and adds at the tree-level as multiplications, causes problems for
>> the rldimi patterns in the powerpc backend.
>
> Because it now is converted to different RTL at expand time. Which the
> generic expand code does some premature optimisation on, which makes us
> end up with the addition instead of data manipulation insns. Oh well.
>
>> The issue is that rs6000.md
>> models this pattern using IOR, and some variants that have the equivalent
>> PLUS or XOR in the RTL fail to match some *rotl<mode>4_insert patterns.
>> This is fixed in this patch by adding a define_insn_and_split to locally
>> canonicalize the PLUS and XOR forms to the backend's preferred IOR form.
>
> Okay.
>
>> An alternative fix might be for the RTL optimizers to define a canonical
>> form for these plus_xor_ior equivalent expressions, but the logical
>> choice might be plus (which may appear in an addressing mode), and such
>> a change may require a number of tweaks to update various backends
>> (i.e. a more intrusive change than the one proposed here).
>
> This does not make sense in an address at all, thankfully :-)
>
> The only sane canonicalisation for this is something like VEC_DUPLICATE
> but for submodes of integer modes, instead of the component mode of a
> vector mode. I don't feel this is worth trying to handle in general
> though.
>
>> Many thanks for Marek Polacek for bootstrapping and regression testing
>> this change without problems.
>
> You have an account on the cfarm, it is quick and easy to test there :-)
> I recommend gcc135, a 32 core p9, with oodles of disk space :-)
>
>> +; Canonicalize the PLUS and XOR forms to IOR for rotl<mode>3_insert_3
>> +(define_code_iterator plus_xor [plus xor])
>> +
>> +(define_insn_and_split "*rotl<mode>3_insert_3_<code>"
>> + [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
>> + (plus_xor:GPR
>> + (and:GPR (match_operand:GPR 3 "gpc_reg_operand" "0")
>> + (match_operand:GPR 4 "const_int_operand" "n"))
>> + (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
>> + (match_operand:SI 2 "const_int_operand" "n"))))]
>> + "INTVAL (operands[2]) == exact_log2 (UINTVAL (operands[4]) + 1)"
>
> exact_log2 returns -1 if its argument is not a power of two. Please
> test it is > 0 explicitly here: I don't think this splitter will work
> correctly otherwise. There shouldn't really be a shift by 0 ever of
> course, but it isn't invalid RTL.
>
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr105991.c
>> @@ -0,0 +1,11 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +unsigned long long
>> +foo (unsigned long long value)
>> +{
>> + value &= 0xffffffff;
>> + value |= value << 32;
>> + return value;
>> +}
>> +/* { dg-final { scan-assembler "rldimi" } } */
>
> Write
> /* { dg-final { scan-assembler {\mrldimi\M} } } */
> please.
>
This case also needs effective-target keyword lp64,
that is /* { dg-require-effective-target lp64 } */
since with -m32, it gets:
mr 3,4
with -m32 -mpowerpc64, it gets:
rldicl 3,4,0,32
BR,
Kewen