Re: [GCC][PATCH][Aarch64] Replace umov with cheaper fmov in popcount expansion

Richard Henderson Mon, 29 Oct 2018 05:16:55 -0700

On 10/29/18 10:31 AM, Sam Tebbs wrote:
> On 10/23/2018 02:50 PM, Richard Earnshaw (lists) wrote:
> 
>> On 22/10/2018 10:02, Sam Tebbs wrote:
>>> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
>>> index d7473418a8eb62b2757017cd1675493f86e41ef4..77e6f75cc15f06733df7b47906ee00580bea8d29 100644
>>> --- a/gcc/config/aarch64/aarch64.md
>>> +++ b/gcc/config/aarch64/aarch64.md
>>> @@ -4489,7 +4489,7 @@
>>>     emit_move_insn (v, gen_lowpart (V8QImode, in));
>>>     emit_insn (gen_popcountv8qi2 (v1, v));
>>>     emit_insn (gen_reduc_plus_scal_v8qi (r, v1));
>>> -  emit_insn (gen_zero_extendqi<mode>2 (out, r));
>>> +  emit_move_insn (out, gen_lowpart_SUBREG (GET_MODE (out), r));
>> I don't think this is right.  You're effectively creating a paradoxical
>> subreg here and relying on an unstated side effect of an earlier
>> instruction for correct behaviour.
>>
>> What you really need is a pattern that generates the zero-extend in
>> combination with the reduction operation.  So something like
>>
>> (set (reg:DI)
>>       (zero_extend:DI (unspec:VecMode [(reg:VecMode)] UNSPEC_ADDV)))
> 
> Hi Richard,
> 
> Thanks for the feedback. What assembly would you expect such a pattern 
> to produce?


The same assembly as you had.  It's just that the rtl that you were using to
represent it was incorrect.

> I'm a bit unclear on what you mean by "the reduction operation", but 
> I'm assuming you're referring to the fmov in this case.

The reduction operation in this case is the addv.
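
To make the distinction concrete, here is a minimal C sketch of what the
expansion computes, with the zero-extend written out as an explicit step
after the reduction. The function names (model_cnt_v8qi, model_addv_v8qi,
model_popcount_di) are hypothetical and exist only for illustration; this
is a model of the semantics, not GCC code. On the hardware, addv writes its
scalar result and clears the remaining bits of the vector register, which
is presumably the side effect the original paradoxical subreg relied on
without stating it in the RTL.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of CNT on a V8QI vector: per-byte population count. */
static void model_cnt_v8qi(const uint8_t in[8], uint8_t out[8])
{
    for (int i = 0; i < 8; i++) {
        uint8_t b = in[i];
        uint8_t n = 0;
        while (b) {
            n += b & 1u;
            b >>= 1;
        }
        out[i] = n;
    }
}

/* Hypothetical model of ADDV: sum all lanes into a single byte (QImode). */
static uint8_t model_addv_v8qi(const uint8_t v[8])
{
    uint8_t sum = 0;
    for (int i = 0; i < 8; i++)
        sum = (uint8_t)(sum + v[i]);
    return sum;
}

/* Model of the DImode popcount expansion discussed in this thread. */
uint64_t model_popcount_di(uint64_t x)
{
    uint8_t v[8], v1[8];

    memcpy(v, &x, sizeof v);        /* view the input as eight bytes */
    model_cnt_v8qi(v, v1);          /* per-byte bit counts */
    uint8_t r = model_addv_v8qi(v1); /* the reduction: at most 64 */

    /* The zero-extend is part of the stated semantics here, rather than
       something inferred from how the byte result happens to be stored. */
    return (uint64_t)r;
}
```

In the suggested pattern, the last two steps (reduction plus zero-extend)
would be described by a single insn, so the RTL says what the instruction
actually does instead of leaving the upper bits unspecified.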

r~
