On 17 November 2017 at 22:41, James Greenhalgh <james.greenha...@arm.com> wrote: > On Fri, Sep 15, 2017 at 07:22:39PM +0100, Steve Ellcey wrote: >> PR 81356 points out that doing a __builtin_strcpy of an empty string on >> aarch64 does a copy from memory instead of just writing out a zero byte. >> In looking at this I found that it was because of >> aarch64_use_by_pieces_infrastructure_p, which returns false for >> STORE_BY_PIECES. The comment says: >> >> /* STORE_BY_PIECES can be used when copying a constant string, but >> in that case each 64-bit chunk takes 5 insns instead of 2 (LDR/STR). >> For now we always fail this and let the move_by_pieces code copy >> the string from read-only memory. */ >> >> But this doesn't seem to be the case anymore. When I remove this function >> and the TARGET_USE_BY_PIECES_INFRASTRUCTURE_P macro that uses it the code >> for __builtin_strcpy of a constant string seems to be either better or the >> same. The only time I got more instructions after removing this function >> was on an 8 byte __builtin_strcpy where we now generate a mov and 3 movk >> instructions to create the source followed by a store instead of doing a >> load/store of 8 bytes. The comment may have been applicable for >> -mstrict-align at one time but it doesn't seem to be the case now. I still >> get better code without this routine under that option as well. > > I've been trying to remember why this is the way it is, It looks like it dates > back to the initial port, and the only note I have on it points at an > incorrect PR number. So, I think this is probably a safe and sensible > choice. > > OK. >
As a result of this patch, we now have: XPASS: gfortran.dg/pr45636.f90 -O scan-tree-dump-times forwprop2 "memset" 0 instead of: XFAIL: gfortran.dg/pr45636.f90 -O scan-tree-dump-times forwprop2 "memset" 0 (found 2 times) Is it expected? Christophe > Reviewed-by: James Greenhalgh <james.greenha...@arm.com> > > Thanks, > James > >> >> Bootstrapped and tested without regressions, OK to checkin? >> >> Steve Ellcey >> sell...@cavium.com >> >> >> >> 2017-09-15 Steve Ellcey <sell...@cavium.com> >> >> PR target/81356 >> * config/aarch64/aarch64.c (aarch64_use_by_pieces_infrastructure_p): >> Remove. >> (TARGET_USE_BY_PIECES_INFRASTRUCTURE_P): Remove define. >> >> >> 2017-09-15 Steve Ellcey <sell...@cavium.com> >> >> * gcc.target/aarch64/pr81356.c: New test. > >> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c >> index 1c14008..fc72236 100644 >> --- a/gcc/config/aarch64/aarch64.c >> +++ b/gcc/config/aarch64/aarch64.c >> @@ -14118,22 +14118,6 @@ aarch64_asan_shadow_offset (void) >> return (HOST_WIDE_INT_1 << 36); >> } >> >> -static bool >> -aarch64_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size, >> - unsigned int align, >> - enum by_pieces_operation op, >> - bool speed_p) >> -{ >> - /* STORE_BY_PIECES can be used when copying a constant string, but >> - in that case each 64-bit chunk takes 5 insns instead of 2 (LDR/STR). >> - For now we always fail this and let the move_by_pieces code copy >> - the string from read-only memory. */ >> - if (op == STORE_BY_PIECES) >> - return false; >> - >> - return default_use_by_pieces_infrastructure_p (size, align, op, speed_p); >> -} >> - >> static rtx >> aarch64_gen_ccmp_first (rtx_insn **prep_seq, rtx_insn **gen_seq, >> int code, tree treeop0, tree treeop1) >> @@ -15631,10 +15615,6 @@ aarch64_libgcc_floating_mode_supported_p >> #undef TARGET_LEGITIMIZE_ADDRESS >> #define TARGET_LEGITIMIZE_ADDRESS aarch64_legitimize_address >> >> -#undef TARGET_USE_BY_PIECES_INFRASTRUCTURE_P >> -#define TARGET_USE_BY_PIECES_INFRASTRUCTURE_P \ >> - aarch64_use_by_pieces_infrastructure_p >> - >> #undef TARGET_SCHED_CAN_SPECULATE_INSN >> #define TARGET_SCHED_CAN_SPECULATE_INSN aarch64_sched_can_speculate_insn >> > >> diff --git a/gcc/testsuite/gcc.target/aarch64/pr81356.c >> b/gcc/testsuite/gcc.target/aarch64/pr81356.c >> index e69de29..9fd6baa 100644 >> --- a/gcc/testsuite/gcc.target/aarch64/pr81356.c >> +++ b/gcc/testsuite/gcc.target/aarch64/pr81356.c >> @@ -0,0 +1,9 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-O2" } */ >> + >> +void f(char *a) >> +{ >> + __builtin_strcpy (a, ""); >> +} >> + >> +/* { dg-final { scan-assembler-not "ldrb" } } */ >