Re: [PATCH] [Aarch64] Optimize subtract in shift counts

Richard Biener Tue, 22 Aug 2017 00:50:41 -0700

On Tue, Aug 22, 2017 at 9:29 AM, Richard Sandiford
<richard.sandif...@linaro.org> wrote:
> Richard Biener <richard.guent...@gmail.com> writes:
>> On August 21, 2017 7:46:09 PM GMT+02:00, Richard Sandiford
>> <richard.sandif...@linaro.org> wrote:
>>>Richard Biener <richard.guent...@gmail.com> writes:
>>>> On Tue, Aug 8, 2017 at 10:20 PM, Richard Kenner
>>>> <ken...@vlsi1.ultra.nyu.edu> wrote:
>>>>>> Correct. It is truncated for integer shift, but not simd shift
>>>>>> instructions. We generate a pattern in the split that only
>>>generates
>>>>>> the integer shift instructions.
>>>>>
>>>>> That's unfortunate, because it would be nice to do this in
>>>simplify_rtx,
>>>>> since it's machine-independent, but that has to be conditioned on
>>>>> SHIFT_COUNT_TRUNCATED, so you wouldn't get the benefit of it.
>>>>
>>>> SHIFT_COUNT_TRUNCATED should go ... you should express this in
>>>> the patterns, like for example with
>>>>
>>>> (define_insn ashlSI3
>>>>   [(set (match_operand 0 "")
>>>>          (ashl:SI (match_operand ... )
>>>>                      (subreg:QI (match_operand:SI ...)))]
>>>>
>>>> or an explicit and:SI and combine / simplify_rtx should apply the
>>>magic
>>>> optimization we expect.
>>>
>>>The problem with the explicit AND is that you'd end up with either
>>>an AND of two constants for constant shifts, or with two separate
>>>patterns,
>>>one for constant shifts and one for variable shifts.  (And the problem
>>>in
>>>theory with two patterns is that it reduces the RA's freedom, although
>>>in
>>>practice I guess we'd always want a constant shift where possible for
>>>cost reasons, and so the RA would never need to replace pseudos with
>>>constants itself.)
>>>
>>>I think all useful instances of this optimisation will be exposed by
>>>the gimple optimisers, so maybe expand could to do it based on
>>>TARGET_SHIFT_TRUNCATION_MASK?  That describes the optab rather than
>>>the rtx code and it does take the mode into account.
>>
>> Sure, that could work as well and also take into account range info. But
>> we'd then need named expanders and the result would still have the
>> explicit and or need to be an unspec or a different RTL operation.
>
> Without SHIFT_COUNT_TRUNCATED, out-of-range rtl shifts have
> target-dependent rather than undefined behaviour, so it's OK
> for a target to use shift codes with out-of-range values.


Hmm, but that means simplify-rtx can't do anything with them because
we need to preserve target dependent behavior.  I think the RTL IL should
be always well-defined and its semantics shouldn't have any target
dependences (ideally, and if, then they should be well specified via
extra target hooks/macros).

>  And
> TARGET_SHIFT_TRUNCATION_MASK is a guarantee from the target about
> how the normal shift optabs behave, so I don't think we'd need new
> optabs or new unspecs.
>
> E.g. it already works this way when expanding double-word shifts,
> which IIRC is why TARGET_SHIFT_TRUNCATION_MASK was added.  There it's
> possible to use a shorter sequence if you know that the shift optab
> truncates the count, so we can do that even if SHIFT_COUNT_TRUNCATED
> isn't defined.

I'm somewhat confused by docs saying TARGET_SHIFT_TRUNCATION_MASK
applies to the instructions generated by the named shift patterns but _not_
general shift RTXen.  But the generated pattern contains shift RTXen and how
can we figure whether they were generated by the named expanders or
by other means?  Don't define_expand also serve as define_insn for things
like combine?

That said, from looking at the docs and your description above it seems that
semantics are not fully reflected in the RTL instruction stream?

Richard.

> Thanks,
> Richard
>
>>
>> Richard.
>>
>>>Thanks,
>>>Richard

Re: [PATCH] [Aarch64] Optimize subtract in shift counts

Reply via email to