On 9/5/19 5:55 PM, Richard Earnshaw (lists) wrote: > On 03/09/2019 21:45, Bernd Edlinger wrote: >> Hi, >> >> due to the introduction of unaligned_loaddi and unaligned_storedi, >> two test cases show some regression as PR 91614 points out. >> >> I would like to change the test expectations if these two test cases, >> since they seem to be bogus. >> >> That is the test case already failed for arm_prefer_ldrd_strd targets, >> since not a ldrd or strd is created but a ldm/stm instead, which >> is probably not what is intended. >> >> That is for arm_prefer_ldrd_strd the previously used instruction movdi: >> >> (insn 11 10 12 2 (set (mem/c:DI (reg:SI 112) [0 MEM <charD.4[1:15]> >> [(voidD.71 *)&destD.5932]+0 S8 A32]) >> (reg:DI 114)) "unaligned-memcpy-2.c":11:3 642 {*movdi_vfp} >> (nil)) >> >> is split up very early in subreg1 into: >> >> (insn 21 10 22 2 (set (mem/c:SI (reg:SI 112) [0 MEM <charD.4[1:15]> >> [(voidD.71 *)&destD.5932]+0 S4 A32]) >> (reg:SI 119)) "unaligned-memcpy-2.c":11:3 640 {*arm_movsi_vfp} >> (nil)) >> (insn 22 21 12 2 (set (mem/c:SI (plus:SI (reg:SI 112) >> (const_int 4 [0x4])) [0 MEM <charD.4[1:15]> [(voidD.71 >> *)&destD.5932]+4 S4 A32]) >> (reg:SI 120 [+4 ])) "unaligned-memcpy-2.c":11:3 640 {*arm_movsi_vfp} >> (nil)) >> >> which is finally replaced by: >> >> (insn 35 10 12 2 (parallel [ >> (set (mem/c:SI (reg/f:SI 3 r3 [111]) [0 MEM <charD.4[1:15]> >> [(voidD.71 *)&destD.5932]+0 S4 A32]) >> (reg:SI 1 r1)) >> (set (mem/c:SI (plus:SI (reg/f:SI 3 r3 [111]) >> (const_int 4 [0x4])) [0 MEM <charD.4[1:15]> >> [(voidD.71 *)&destD.5932]+4 S4 A32]) >> (reg:SI 2 r2)) >> ]) "unaligned-memcpy-2.c":11:3 378 {*stm2_} >> (expr_list:REG_DEAD (reg:SI 2 r2) >> (expr_list:REG_DEAD (reg:SI 1 r1) >> (nil)))) >> >> ... so stm instead of strd. >> >> Since the unalinged_storedi is an unspec, it is expanded as strd which is >> kind of okay, but there is register pair spilled which was not there >> previously: >> >> aligned_dest: >> @ args = 0, pretend = 0, frame = 0 >> @ frame_needed = 0, uses_anonymous_args = 0 >> @ link register save eliminated. >> strd r4, [sp, #-8]! >> ldr r4, [r0] @ unaligned >> movw r3, #:lower16:.LANCHOR0 >> ldr r5, [r0, #4] @ unaligned >> movt r3, #:upper16:.LANCHOR0 >> strd r4, [r3] >> ldr r2, [r0, #8] @ unaligned >> str r2, [r3, #8] >> ldrh r2, [r0, #12] @ unaligned >> strh r2, [r3, #12] @ movhi >> ldrb r2, [r0, #14] @ zero_extendqisi2 >> strb r2, [r3, #14] >> ldrd r4, [sp] >> add sp, sp, #8 >> bx lr >> >> The patch filters out ldrd/strd that sp-relative, and adds a few >> missing scan-assembler rules, in order to make the test results more stable. >> >> Alternative could be to use two movsi instead of the unaligned_load/storedi, >> but that would end up in ldm/stm which looks plain wrong to me. >> >> >> Tested using various arm-linux-gnueabihf cross-compilers. >> Is it OK for trunk? >> >> >> Thanks >> Bernd. >> > 2019-09-03 Bernd Edlinger <bernd.edlin...@hotmail.de> > > PR target/91614 > * gcc.target/arm/unaligned-memcpy-2.c: Adjust test expectations. > * gcc.target/arm/unaligned-memcpy-3.c: Likewise. > > Index: gcc/testsuite/gcc.target/arm/unaligned-memcpy-2.c > =================================================================== > --- gcc/testsuite/gcc.target/arm/unaligned-memcpy-2.c (Revision 275341) > +++ gcc/testsuite/gcc.target/arm/unaligned-memcpy-2.c (Arbeitskopie) > @@ -15,9 +15,11 @@ > loads/stores for the remainder. */ > > /* { dg-final { scan-assembler-times "ldmia" 0 } } */ > -/* { dg-final { scan-assembler-times "ldrd" 0 } } */ > +/* { dg-final { scan-assembler-times "ldrd\(?!\[^\\n\]*sp\)" 0 } } */ > +/* { dg-final { scan-assembler-times "ldm\(?!ia\)" 0 } } */ > /* { dg-final { scan-assembler-times "stmia" 1 { target { ! { > arm_prefer_ldrd_strd } } } } } */ > -/* { dg-final { scan-assembler-times "strd" 1 { target { > arm_prefer_ldrd_strd } } } } */ > +/* { dg-final { scan-assembler-times "strd\(?!\[^\\n\]*sp\)" 1 { target { > arm_prefer_ldrd_strd } } } } */ > +/* { dg-final { scan-assembler-times "stm\(?!ia\)" 0 } } */ > /* { dg-final { scan-assembler-times "ldrh" 1 } } */ > /* { dg-final { scan-assembler-times "strh" 1 } } */ > /* { dg-final { scan-assembler-times "ldrb" 1 } } */ > Index: gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c > =================================================================== > --- gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c (Revision 275341) > +++ gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c (Arbeitskopie) > @@ -15,10 +15,12 @@ > loads/stores for the remainder. */ > > /* { dg-final { scan-assembler-times "ldmia" 1 { target { ! { > arm_prefer_ldrd_strd } } } } } */ > -/* { dg-final { scan-assembler-times "ldrd" 1 { target { > arm_prefer_ldrd_strd } } } } */ > -/* { dg-final { scan-assembler-times "strd" 0 } } */ > -/* { dg-final { scan-assembler-times "stm" 0 } } */ > -/* { dg-final { scan-assembler-times "ldrh" 1 { target { ! { > arm_prefer_ldrd_strd } } } } } */ > +/* { dg-final { scan-assembler-times "ldrd\(?!\[^\\n\]*sp\)" 1 { target { > arm_prefer_ldrd_strd } } } } */ > +/* { dg-final { scan-assembler-times "ldm\(?!ia\)" 0 } } */ > +/* { dg-final { scan-assembler-times "stmia" 0 } } */ > +/* { dg-final { scan-assembler-times "strd\(?!\[^\\n\]*sp\)" 0 } } */ > +/* { dg-final { scan-assembler-times "stm\(?!ia\)" 0 } } */ > +/* { dg-final { scan-assembler-times "ldrh" 1 } } */ > /* { dg-final { scan-assembler-times "strh" 1 } } */ > -/* { dg-final { scan-assembler-times "ldrb" 1 { target { ! { > arm_prefer_ldrd_strd } } } } } */ > +/* { dg-final { scan-assembler-times "ldrb" 1 } } */ > /* { dg-final { scan-assembler-times "strb" 1 } } */ > > Ug! That's horrible! >
Yes, I know, this won't win a prize for it's beauty :-) But to catch all possible corner cases can't be easy. > Also, it's testing things we really shouldn't be concerned about. For > example, the ldrh+ldrb+strh+strb is probably better done as ldr [offset-1] + > str [offset-1] to handle the tail (ie re-copying the byte 4th from the end). > When someone does that, this test will then needlessly fail. > Okay, removed those then. And scan for ldrd _or_ ldm, respectively strd _or stm, which both are acceptable since the aligned side is effectively 4-byte aligned. > I'm not really convinced that this sort of thing can be properly tested with > scan-assembler tests, there's just too much that might change. Perhaps we > should just accept this and create a simple executable test that will fault > if run on a target that doesn't support the code we generate... > Yes, that would be good as well, but runtime tests can't test if a specific optimization like ldrd/strd is _not_ used when it should be. So how about this? Is it OK for trunk? Thanks Bernd.
2019-09-03 Bernd Edlinger <bernd.edlin...@hotmail.de> PR target/91614 * gcc.target/arm/unaligned-memcpy-2.c: Adjust test expectations. * gcc.target/arm/unaligned-memcpy-3.c: Likewise. Index: gcc/testsuite/gcc.target/arm/unaligned-memcpy-2.c =================================================================== --- gcc/testsuite/gcc.target/arm/unaligned-memcpy-2.c (revision 275409) +++ gcc/testsuite/gcc.target/arm/unaligned-memcpy-2.c (working copy) @@ -11,14 +11,12 @@ void aligned_dest (char *src) memcpy (dest, src, 15); } -/* Expect a multi-word store for the main part of the copy, but subword - loads/stores for the remainder. */ +/* Expect a multi-word store for the main part of the copy, + but no multi-word load for the misaligned input. + Ignore possible ldrd/strd from prologue/epilogue code. */ /* { dg-final { scan-assembler-times "ldmia" 0 } } */ -/* { dg-final { scan-assembler-times "ldrd" 0 } } */ +/* { dg-final { scan-assembler-times "ldrd\(?!\[^\\n\]*sp\)" 0 } } */ +/* { dg-final { scan-assembler-times "ldm\(?!ia\)" 0 } } */ /* { dg-final { scan-assembler-times "stmia" 1 { target { ! { arm_prefer_ldrd_strd } } } } } */ -/* { dg-final { scan-assembler-times "strd" 1 { target { arm_prefer_ldrd_strd } } } } */ -/* { dg-final { scan-assembler-times "ldrh" 1 } } */ -/* { dg-final { scan-assembler-times "strh" 1 } } */ -/* { dg-final { scan-assembler-times "ldrb" 1 } } */ -/* { dg-final { scan-assembler-times "strb" 1 } } */ +/* { dg-final { scan-assembler-times "strd\(?!\[^\\n\]*sp\)\|stm\(?!ia\)" 1 { target { arm_prefer_ldrd_strd } } } } */ Index: gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c =================================================================== --- gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c (revision 275409) +++ gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c (working copy) @@ -11,14 +11,12 @@ void aligned_src (char *dest) memcpy (dest, src, 15); } -/* Expect a multi-word load for the main part of the copy, but subword - loads/stores for the remainder. */ +/* Expect a multi-word load for the main part of the copy, + but no multi-word store for the misaligned output. + Ignore possible ldrd/strd from prologue/epilogue code. */ /* { dg-final { scan-assembler-times "ldmia" 1 { target { ! { arm_prefer_ldrd_strd } } } } } */ -/* { dg-final { scan-assembler-times "ldrd" 1 { target { arm_prefer_ldrd_strd } } } } */ -/* { dg-final { scan-assembler-times "strd" 0 } } */ -/* { dg-final { scan-assembler-times "stm" 0 } } */ -/* { dg-final { scan-assembler-times "ldrh" 1 { target { ! { arm_prefer_ldrd_strd } } } } } */ -/* { dg-final { scan-assembler-times "strh" 1 } } */ -/* { dg-final { scan-assembler-times "ldrb" 1 { target { ! { arm_prefer_ldrd_strd } } } } } */ -/* { dg-final { scan-assembler-times "strb" 1 } } */ +/* { dg-final { scan-assembler-times "ldrd\(?!\[^\\n\]*sp\)\|ldm\(?!ia\)" 1 { target { arm_prefer_ldrd_strd } } } } */ +/* { dg-final { scan-assembler-times "stmia" 0 } } */ +/* { dg-final { scan-assembler-times "strd\(?!\[^\\n\]*sp\)" 0 } } */ +/* { dg-final { scan-assembler-times "stm\(?!ia\)" 0 } } */