Re: [PATCH] [Aarch64] Optimize subtract in shift counts

Richard Sandiford Tue, 22 Aug 2017 00:29:54 -0700

Richard Biener <richard.guent...@gmail.com> writes:
> On August 21, 2017 7:46:09 PM GMT+02:00, Richard Sandiford
> <richard.sandif...@linaro.org> wrote:
>>Richard Biener <richard.guent...@gmail.com> writes:
>>> On Tue, Aug 8, 2017 at 10:20 PM, Richard Kenner
>>> <ken...@vlsi1.ultra.nyu.edu> wrote:
>>>>> Correct. It is truncated for integer shift, but not simd shift
>>>>> instructions. We generate a pattern in the split that only
>>generates
>>>>> the integer shift instructions.
>>>>
>>>> That's unfortunate, because it would be nice to do this in
>>simplify_rtx,
>>>> since it's machine-independent, but that has to be conditioned on
>>>> SHIFT_COUNT_TRUNCATED, so you wouldn't get the benefit of it.
>>>
>>> SHIFT_COUNT_TRUNCATED should go ... you should express this in
>>> the patterns, like for example with
>>>
>>> (define_insn ashlSI3
>>>   [(set (match_operand 0 "")
>>>          (ashl:SI (match_operand ... )
>>>                      (subreg:QI (match_operand:SI ...)))]
>>>
>>> or an explicit and:SI and combine / simplify_rtx should apply the
>>magic
>>> optimization we expect.
>>
>>The problem with the explicit AND is that you'd end up with either
>>an AND of two constants for constant shifts, or with two separate
>>patterns,
>>one for constant shifts and one for variable shifts.  (And the problem
>>in
>>theory with two patterns is that it reduces the RA's freedom, although
>>in
>>practice I guess we'd always want a constant shift where possible for
>>cost reasons, and so the RA would never need to replace pseudos with
>>constants itself.)
>>
>>I think all useful instances of this optimisation will be exposed by
>>the gimple optimisers, so maybe expand could to do it based on
>>TARGET_SHIFT_TRUNCATION_MASK?  That describes the optab rather than
>>the rtx code and it does take the mode into account.
>
> Sure, that could work as well and also take into account range info. But
> we'd then need named expanders and the result would still have the
> explicit and or need to be an unspec or a different RTL operation.


Without SHIFT_COUNT_TRUNCATED, out-of-range rtl shifts have
target-dependent rather than undefined behaviour, so it's OK
for a target to use shift codes with out-of-range values.  And
TARGET_SHIFT_TRUNCATION_MASK is a guarantee from the target about
how the normal shift optabs behave, so I don't think we'd need new
optabs or new unspecs.

E.g. it already works this way when expanding double-word shifts,
which IIRC is why TARGET_SHIFT_TRUNCATION_MASK was added.  There it's
possible to use a shorter sequence if you know that the shift optab
truncates the count, so we can do that even if SHIFT_COUNT_TRUNCATED
isn't defined.

Thanks,
Richard

>
> Richard. 
>
>>Thanks,
>>Richard

Re: [PATCH] [Aarch64] Optimize subtract in shift counts

Reply via email to