"Yangfei (Felix)" <felix.y...@huawei.com> writes:
> Hi,
>
>> -----Original Message-----
>> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
>> Sent: Sunday, May 31, 2020 12:01 AM
>> To: Yangfei (Felix) <felix.y...@huawei.com>
>> Cc: gcc-patches@gcc.gnu.org; Uros Bizjak <ubiz...@gmail.com>; Jakub
>> Jelinek <ja...@redhat.com>; Hongtao Liu <crazy...@gmail.com>; H.J. Lu
>> <hjl.to...@gmail.com>
>> Subject: Re: [PATCH PR95254] aarch64: gcc generate inefficient code with
>> fixed sve vector length
>> 
>
> Snip...
>
>> >
>> > The v5 patch attached addressed this issue.
>> >
>> > There two added changes compared with the v4 patch:
>> > 1. In candidate_mem_p, mov_optab for innermode should be available.
>> >      In this case, mov_optab for SDmode is not there and subreg are added
>> back by emit_move_insn_1.  So we won't get the benefit with the patch.
>> 
>> I agree we should have this check.  I think the rule applies to all of the
>> transforms though, not just the mem one, so we should add the check to the
>> register and constant cases too.
>
> OK.  I changed to make this an extra condition for calculating x_inner & y 
> _inner.

Sounds good.  Maybe at this point the x_inner and y_inner code is
getting complicated enough to put into a lambda too:

  x_inner = ... (x);
  y_inner = ... (y);

Just a suggestion though.

>> > 2. Instead of using adjust_address, I changed to use adjust_address_nv to
>> avoid the emit of invalid insn 13.
>> >     The latter call to validize_mem() in emit_move_insn will take care of 
>> > the
>> address for us.
>> 
>> The validation performed by validize_mem is the same as that performed by
>> adjust_address, so the only case this should make a difference is for
>> push_operands:
>
> True.
>
>>   /* If X or Y are memory references, verify that their addresses are valid
>>      for the machine.  */
>>   if (MEM_P (x)
>>       && (! memory_address_addr_space_p (GET_MODE (x), XEXP (x, 0),
>>                                       MEM_ADDR_SPACE (x))
>>        && ! push_operand (x, GET_MODE (x))))
>>     x = validize_mem (x);
>> 
>>   if (MEM_P (y)
>>       && ! memory_address_addr_space_p (GET_MODE (y), XEXP (y, 0),
>>                                      MEM_ADDR_SPACE (y)))
>>     y = validize_mem (y);
>> 
>> So I think the fix is to punt on push_operands instead (and continue to use
>> adjust_address rather than adjust_address_nv).
>
> Not sure if I understand it correctly.
> Do you mean excluding push_operand in candidate_mem_p? Like:
>
>  3830   auto candidate_mem_p = [&](machine_mode innermode, rtx mem) {
>  3831     return !targetm.can_change_mode_class (innermode, GET_MODE (mem), 
> ALL_REGS)
>  3832            && !push_operand (mem, GET_MODE (mem))
>  3833            /* Not a candiate if innermode requires too much alignment.  
> */
>  3834            && (MEM_ALIGN (mem) >= GET_MODE_ALIGNMENT (innermode)
>  3835                || targetm.slow_unaligned_access (GET_MODE (mem),
>  3836                                                  MEM_ALIGN (mem))
>  3837                || !targetm.slow_unaligned_access (innermode, MEM_ALIGN 
> (mem)));
>  3838   };

Yeah, looks good.

Formatting nit though: multi-line conditions should be wrapped in (...),
i.e.:

    return (...
            && ...
            && ...);

Thanks,
Richard

Reply via email to