On Fri, Sep 15, 2017 at 07:22:39PM +0100, Steve Ellcey wrote:
> PR 81356 points out that doing a __builtin_strcpy of an empty string on
> aarch64 does a copy from memory instead of just writing out a zero byte.
> In looking at this I found that it was because of
> aarch64_use_by_pieces_infrastructure_p, which returns false for
> STORE_BY_PIECES.  The comment says:
> 
>   /* STORE_BY_PIECES can be used when copying a constant string, but
>      in that case each 64-bit chunk takes 5 insns instead of 2 (LDR/STR).
>      For now we always fail this and let the move_by_pieces code copy
>      the string from read-only memory.  */
> 
> But this doesn't seem to be the case anymore.  When I remove this function
> and the TARGET_USE_BY_PIECES_INFRASTRUCTURE_P macro that uses it the code
> for __builtin_strcpy of a constant string seems to be either better or the
> same.  The only time I got more instructions after removing this function
> was on an 8 byte __builtin_strcpy where we now generate a mov and 3 movk
> instructions to create the source followed by a store instead of doing a
> load/store of 8 bytes.  The comment may have been applicable for
> -mstrict-align at one time but it doesn't seem to be the case now.  I still
> get better code without this routine under that option as well.

I've been trying to remember why this is the way it is, It looks like it dates
back to the initial port, and the only note I have on it points at an
incorrect PR number. So, I think this is probably a safe and sensible
choice.

OK.

Reviewed-by: James Greenhalgh <james.greenha...@arm.com>

Thanks,
James

> 
> Bootstrapped and tested without regressions, OK to checkin?
> 
> Steve Ellcey
> sell...@cavium.com
> 
> 
> 
> 2017-09-15  Steve Ellcey  <sell...@cavium.com>
> 
>       PR target/81356
>       * config/aarch64/aarch64.c (aarch64_use_by_pieces_infrastructure_p):
>       Remove.
>       (TARGET_USE_BY_PIECES_INFRASTRUCTURE_P): Remove define.
> 
> 
> 2017-09-15  Steve Ellcey  <sell...@cavium.com>
> 
>       * gcc.target/aarch64/pr81356.c: New test.

> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 1c14008..fc72236 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -14118,22 +14118,6 @@ aarch64_asan_shadow_offset (void)
>    return (HOST_WIDE_INT_1 << 36);
>  }
>  
> -static bool
> -aarch64_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size,
> -                                     unsigned int align,
> -                                     enum by_pieces_operation op,
> -                                     bool speed_p)
> -{
> -  /* STORE_BY_PIECES can be used when copying a constant string, but
> -     in that case each 64-bit chunk takes 5 insns instead of 2 (LDR/STR).
> -     For now we always fail this and let the move_by_pieces code copy
> -     the string from read-only memory.  */
> -  if (op == STORE_BY_PIECES)
> -    return false;
> -
> -  return default_use_by_pieces_infrastructure_p (size, align, op, speed_p);
> -}
> -
>  static rtx
>  aarch64_gen_ccmp_first (rtx_insn **prep_seq, rtx_insn **gen_seq,
>                       int code, tree treeop0, tree treeop1)
> @@ -15631,10 +15615,6 @@ aarch64_libgcc_floating_mode_supported_p
>  #undef TARGET_LEGITIMIZE_ADDRESS
>  #define TARGET_LEGITIMIZE_ADDRESS aarch64_legitimize_address
>  
> -#undef TARGET_USE_BY_PIECES_INFRASTRUCTURE_P
> -#define TARGET_USE_BY_PIECES_INFRASTRUCTURE_P \
> -  aarch64_use_by_pieces_infrastructure_p
> -
>  #undef TARGET_SCHED_CAN_SPECULATE_INSN
>  #define TARGET_SCHED_CAN_SPECULATE_INSN aarch64_sched_can_speculate_insn
>  

> diff --git a/gcc/testsuite/gcc.target/aarch64/pr81356.c 
> b/gcc/testsuite/gcc.target/aarch64/pr81356.c
> index e69de29..9fd6baa 100644
> --- a/gcc/testsuite/gcc.target/aarch64/pr81356.c
> +++ b/gcc/testsuite/gcc.target/aarch64/pr81356.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +void f(char *a)
> +{
> +  __builtin_strcpy (a, "");
> +}
> +
> +/* { dg-final { scan-assembler-not "ldrb" } } */

Reply via email to