Re: [PATCH][ARM] FAIL: gcc.target/arm/pr58041.c scan-assembler ldrb

Richard Earnshaw Wed, 28 May 2014 01:31:25 -0700

Ah, light dawns (maybe).

I guess the problems stem from the attempts to combine Neon with ARMv5.
Neon shouldn't be used with anything prior to ARMv7, since that's the
earliest version of the architecture that can support it.


I guess that what is happening is that we see we have Neon, so start to
generate a Neon-based copy sequence, but then notice that we don't have
misaligned access (something that must exist if we have Neon) and
generate VLDR instructions in a mistaken attempt to work around the
first inconsistency.

Maybe we should tie -mfpu=neon to having at least ARMv7 (though ARMv6
also has misaligned access support).
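
As a rough illustration of the kind of check this would mean, here is a
minimal standalone sketch. It is not GCC code: the struct, its fields and
the function name are all hypothetical, and a real fix would live in the
ARM option-override code.

```c
#include <stdbool.h>

/* Hypothetical model of the relevant target state.  These fields are
   illustrative only and are not GCC's actual variables.  */
struct target_config
{
  int arch_version;   /* e.g. 5 for ARMv5TE, 7 for ARMv7-A */
  bool fpu_is_neon;   /* true when -mfpu=neon was requested */
};

/* Return true if the FPU choice is consistent with the architecture:
   Neon is accepted only from ARMv7 onwards.  */
bool
neon_config_ok (const struct target_config *cfg)
{
  return !cfg->fpu_is_neon || cfg->arch_version >= 7;
}
```

With a check of this shape, -march=armv5te -mfpu=neon would be rejected
(or Neon silently dropped) before any copy sequence is chosen.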

R.

On 28/05/14 00:03, Maciej W. Rozycki wrote:
> On Tue, 27 May 2014, Kyrill Tkachov wrote:
> 
>>>>   This change however has regressed gcc.dg/vect/vect-72.c on the
>>>> arm-linux-gnueabi target, -march=armv5te, in particular in 4.8.
>>> And what are all the configure flags you are using, in case someone
>>> has to reproduce this issue?
>>
>> Second that. My recently built 4.8 (gcc version 4.8.2 20130531) for vect-72
>> with options:
>>  -O2 -ftree-vectorize -march=armv5te -mfpu=neon -mfloat-abi=hard
>> -fno-vect-cost-model -fno-common
>>
>> gives code the same as your original one:
>> .L14:
>>         sub     r1, r3, #16
>>         add     r3, r3, #16
>>         vld1.8  {q8}, [r1]
>>         cmp     r3, r0
>>         vst1.64 {d16-d17}, [r2:64]!
>>         bne     .L14
>>         ldr     r3, .L22+12
>>         add     ip, r3, #128
>>         add     r2, r3, #129
> 
>  I have this:
> 
> .L14:
>       sub     r1, r3, #16     @ 130   *arm_addsi3/7   [length = 4]
>       add     r3, r3, #16     @ 135   *arm_addsi3/2   [length = 4]
>       vld1.8  {q8}, [r1]      @ 131   *movmisalignv16qi_neon_load     [length = 4]
>       cmp     r3, r0  @ 136   *arm_cmpsi_insn/3       [length = 4]
>       vst1.64 {d16-d17}, [r2:64]!     @ 133   *neon_movv16qi/2        [length = 8]
>       bne     .L14    @ 137   arm_cond_branch [length = 4]
> 
> without and this:
> 
> .L14:
>       vldr    d16, [r3, #-16] @ 130   *neon_movv16qi/4        [length = 8]
>       vldr    d17, [r3, #-8]
>       add     r3, r3, #16     @ 133   *arm_addsi3/2   [length = 4]
>       cmp     r3, r1  @ 134   *arm_cmpsi_insn/3       [length = 4]
>       vst1.64 {d16-d17}, [r2:64]!     @ 131   *neon_movv16qi/2        [length = 8]
>       bne     .L14    @ 135   arm_cond_branch [length = 4]
> 
> with your change applied, respectively, so clearly it's having its intended 
> effect of disabling the use of movmisalignv16qi_neon_load.
> 
>  However VLD1.8 can also be produced from other RTL patterns, or maybe you 
> don't have `unaligned_access' set to zero for some reason, which you 
> should (for ARMv5TE) according to this gcc/config/arm/arm.c piece:
> 
>   /* Enable -munaligned-access by default for
>      - all ARMv6 architecture-based processors
>      - ARMv7-A, ARMv7-R, and ARMv7-M architecture-based processors.
>      - ARMv8 architecture-base processors.
> 
>      Disable -munaligned-access by default for
>      - all pre-ARMv6 architecture-based processors
>      - ARMv6-M architecture-based processors.  */
> 
>   if (unaligned_access == 2)
>     {
>       if (arm_arch6 && (arm_arch_notm || arm_arch7))
>         unaligned_access = 1;
>       else
>         unaligned_access = 0;
>     }
>   else if (unaligned_access == 1
>            && !(arm_arch6 && (arm_arch_notm || arm_arch7)))
>     {
>       warning (0, "target CPU does not support unaligned accesses");
>       unaligned_access = 0;
>     }
> 
> -- can you build vect-72.c with -dp and see which pattern VLD1.8 is 
> produced from in your case?
> 
>  As to the GCC configure options, I have, as I say, nothing there beyond 
> making -march=armv5te the default (surely you're not interested in 
> --prefix, etc.).  For the record the test framework adds these options on 
> top of that for this particular case:
> 
> -fno-diagnostics-show-caret -mfpu=neon -mfloat-abi=softfp -ffast-math 
> -ftree-vectorize -fno-vect-cost-model -fno-common -O2 
> -fdump-tree-vect-details
> 
> but these should be no different in your case (except perhaps for 
> `-mfloat-abi=', but that shouldn't matter as this is no FP code).
> 
>  I have double-checked with current (r210984) 4.8 now and the issue is 
> still there.
> 
>   Maciej
>
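
For reference, the default-selection logic in the arm.c fragment quoted
above can be modelled as a standalone function. This is only a sketch: the
function name and the warn out-parameter are hypothetical, and the boolean
arguments merely stand in for GCC's arm_arch6, arm_arch_notm and arm_arch7
flags.

```c
#include <stdbool.h>

/* Model of the quoted logic.  REQUESTED is 2 when the user gave no
   -munaligned-access/-mno-unaligned-access option, otherwise 0 or 1.
   *WARN is set when an explicit request cannot be honoured.  */
int
resolve_unaligned_access (int requested, bool arch6, bool arch_notm,
                          bool arch7, bool *warn)
{
  bool supported = arch6 && (arch_notm || arch7);

  *warn = false;
  if (requested == 2)
    return supported ? 1 : 0;   /* pick the default for this target */
  if (requested == 1 && !supported)
    {
      *warn = true;             /* "target CPU does not support ..." */
      return 0;
    }
  return requested;             /* explicit choice is kept as given */
}
```

For ARMv5TE (arch6 false) this yields 0 whether or not unaligned access
was requested, which is the value the quoted message expects.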
