Alex Coplan <alex.cop...@arm.com> writes:
>> -----Original Message-----
>> From: Richard Sandiford <richard.sandif...@arm.com>
>> Sent: 06 May 2020 11:28
>> To: Alex Coplan <alex.cop...@arm.com>
>> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw <richard.earns...@arm.com>;
>> Marcus Shawcroft <marcus.shawcr...@arm.com>; Kyrylo Tkachov
>> <kyrylo.tkac...@arm.com>; nd <n...@arm.com>
>> Subject: Re: [PATCH] aarch64: prefer using csinv, csneg in zero extend
>> contexts
>>
>> Alex Coplan <alex.cop...@arm.com> writes:
>> >> -----Original Message-----
>> >> From: Richard Sandiford <richard.sandif...@arm.com>
>> >> Sent: 30 April 2020 15:13
>> >> To: Alex Coplan <alex.cop...@arm.com>
>> >> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw
>> <richard.earns...@arm.com>;
>> >> Marcus Shawcroft <marcus.shawcr...@arm.com>; Kyrylo Tkachov
>> >> <kyrylo.tkac...@arm.com>; nd <n...@arm.com>
>> >> Subject: Re: [PATCH] aarch64: prefer using csinv, csneg in zero extend
>> contexts
>> >>
>> >> Yeah, I was hoping for something like...
>> >>
>> >> > Indeed, clang generates a MVN + CSEL sequence where the CSEL
>> operates on the
>> >> > 64-bit registers:
>> >> >
>> >> > f:
>> >> >         mvn     w8, w2
>> >> >         cmp     w0, #0
>> >> >         csel    x0, x8, x1, eq
>> >> >         ret
>> >>
>> >> ...this rather than the 4-insn (+ret) sequence that we currently
>> >> generate.  So it would have been a define_insn_and_split that handles
>> >> the zero case directly but splits into the "optimal" two-instruction
>> >> sequence for registers.
>> >>
>> >> But I guess the underlying problem is instead that we don't have
>> >> a pattern for (zero_extend:DI (not:SI ...)).  Adding that would
>> >> definitely be a better fix.
>> >
>> > Yes. I sent a patch for this very fix which Kyrill is going to commit
>> once stage
>> > 1 opens: https://gcc.gnu.org/pipermail/gcc-patches/2020-
>> April/544365.html
>>
>> Sorry, missed that.
>>
>> It looks like that patch hinders this one though.  Trying it with
>> current master (where that patch is applied), I get:
>>
>> FAIL: gcc.target/aarch64/csinv-neg.c check-function-bodies inv_zero1
>> FAIL: gcc.target/aarch64/csinv-neg.c check-function-bodies inv_zero2
>>
>> It looks like a costs issue:
>>
>> Trying 27 -> 18:
>>    27: r99:DI=zero_extend(~r101:SI)
>>       REG_DEAD r101:SI
>>    18: x0:DI={(cc:CC==0)?r99:DI:0}
>>       REG_DEAD cc:CC
>>       REG_DEAD r99:DI
>> Successfully matched this instruction:
>> (set (reg/i:DI 0 x0)
>>     (if_then_else:DI (eq (reg:CC 66 cc)
>>             (const_int 0 [0]))
>>         (zero_extend:DI (not:SI (reg:SI 101)))
>>         (const_int 0 [0])))
>> rejecting combination of insns 27 and 18
>> original costs 4 + 4 = 8
>> replacement cost 12
>>
>> I guess we'll need to teach aarch64_if_then_else_costs about the costs
>> of the new insns.
>
> Ah, thanks for catching this. I've attached an updated patch which fixes the
> costs issue here. With the new patch, all the test cases in csinv-neg.c now
> pass. In addition, I've done a bootstrap and regtest on aarch64-linux with no
> new failures.
>
> Do you think we need to add cases to aarch64_if_then_else_costs for the other
> new insns, or is the attached patch OK for master?

Looks good as-is, thanks.  Just a couple of very minor nits:

> 2020-05-07  Alex Coplan  <alex.cop...@arm.com>
>
>         * config/aarch64/aarch64.c (aarch64_if_then_else_costs): Add case to 
> correctly
>         calculate cost for new pattern (*csinv3_uxtw_insn3).

ChangeLog lines follow the 80-character limit.

> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index e92c7e69fcb..efb3da7a7fc 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -11695,6 +11695,15 @@ aarch64_if_then_else_costs (rtx op0, rtx op1, rtx 
> op2, int *cost, bool speed)
>         op1 = XEXP (op1, 0);
>         op2 = XEXP (op2, 0);
>       }
> +      else if (GET_CODE (op1) == ZERO_EXTEND && op2 == const0_rtx)
> +     {
> +       inner = XEXP (op1, 0);
> +       if (GET_CODE (inner) == NEG || GET_CODE (inner) == NOT)
> +       {
> +         /* CSINV/NEG with zero extend + const 0 (*csinv3_uxtw_insn3).  */
> +         op1 = XEXP (inner, 0);
> +       }

GCC style is to avoid { ... } around single statements, unless it's
needed to avoid an ambiguity.  If we did use { ... }, the block
should be indented by two further spaces, so that the "{" is
under the space after "if".

Pushed with those changes, thanks.

Richard

Reply via email to