On Tue, 27 May 2014, Kyrill Tkachov wrote:

> > > This change however has regressed gcc.dg/vect/vect-72.c on the
> > > arm-linux-gnueabi target, -march=armv5te, in particular in 4.8.
> > And what are all the configure flags you are using in case some one
> > has to reproduce this issue ?
>
> Second that. My recently built 4.8 (gcc version 4.8.2 20130531) for vect-72
> with options:
> -O2 -ftree-vectorize -march=armv5te -mfpu=neon -mfloat-abi=hard
> -fno-vect-cost-model -fno-common
>
> gives code the same as your original one:
> .L14:
>         sub     r1, r3, #16
>         add     r3, r3, #16
>         vld1.8  {q8}, [r1]
>         cmp     r3, r0
>         vst1.64 {d16-d17}, [r2:64]!
>         bne     .L14
>         ldr     r3, .L22+12
>         add     ip, r3, #128
>         add     r2, r3, #129

 I have this:

.L14:
        sub     r1, r3, #16             @ 130   *arm_addsi3/7   [length = 4]
        add     r3, r3, #16             @ 135   *arm_addsi3/2   [length = 4]
        vld1.8  {q8}, [r1]              @ 131   *movmisalignv16qi_neon_load     [length = 4]
        cmp     r3, r0                  @ 136   *arm_cmpsi_insn/3       [length = 4]
        vst1.64 {d16-d17}, [r2:64]!     @ 133   *neon_movv16qi/2        [length = 8]
        bne     .L14                    @ 137   arm_cond_branch [length = 4]

without and this:

.L14:
        vldr    d16, [r3, #-16]         @ 130   *neon_movv16qi/4        [length = 8]
        vldr    d17, [r3, #-8]
        add     r3, r3, #16             @ 133   *arm_addsi3/2   [length = 4]
        cmp     r3, r1                  @ 134   *arm_cmpsi_insn/3       [length = 4]
        vst1.64 {d16-d17}, [r2:64]!     @ 131   *neon_movv16qi/2        [length = 8]
        bne     .L14                    @ 135   arm_cond_branch [length = 4]

with your change applied respectively, so clearly it's having its intended
effect of disabling the use of movmisalignv16qi_neon_load.  However VLD1.8
can also be produced from other RTL patterns, or maybe you don't have
`unaligned_access' set to zero for some reason, which you should (for
ARMv5TE) according to this gcc/config/arm/arm.c piece:

  /* Enable -munaligned-access by default for
     - all ARMv6 architecture-based processors
     - ARMv7-A, ARMv7-R, and ARMv7-M architecture-based processors.
     - ARMv8 architecture-base processors.

     Disable -munaligned-access by default for
     - all pre-ARMv6 architecture-based processors
     - ARMv6-M architecture-based processors.  */

  if (unaligned_access == 2)
    {
      if (arm_arch6 && (arm_arch_notm || arm_arch7))
        unaligned_access = 1;
      else
        unaligned_access = 0;
    }
  else if (unaligned_access == 1
           && !(arm_arch6 && (arm_arch_notm || arm_arch7)))
    {
      warning (0, "target CPU does not support unaligned accesses");
      unaligned_access = 0;
    }

-- can you build vect-72.c with -dp and see which pattern VLD1.8 is produced
from in your case?

 As to the GCC configure options, as I say I have nothing there beyond
making -march=armv5te the default (surely you're not interested in --prefix,
etc.).

 For the record the test framework adds these options on top of that for
this particular case:

  -fno-diagnostics-show-caret -mfpu=neon -mfloat-abi=softfp -ffast-math
  -ftree-vectorize -fno-vect-cost-model -fno-common -O2
  -fdump-tree-vect-details

but these should be no different in your case (except perhaps for
`-mfloat-abi=', but that shouldn't matter as this is no FP code).

 I have double-checked with current (r210984) 4.8 now and the issue is still
there.

  Maciej
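
 As an aside, the arm.c defaulting logic quoted above can be tried out in
isolation.  What follows is only a minimal sketch, not GCC code: the function
name `resolve_unaligned_access' and its parameters are hypothetical stand-ins
for the `unaligned_access', `arm_arch6', `arm_arch_notm' and `arm_arch7'
variables in the excerpt, and fprintf stands in for GCC's warning ().  It
assumes, as the excerpt implies, that 2 means "not set on the command line".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical standalone model of the quoted arm.c logic.
   requested: 0 = off, 1 = on, 2 = not specified (assumed default).
   arch6, notm, arch7: stand-ins for arm_arch6, arm_arch_notm, arm_arch7.
   Returns the effective setting: 1 = unaligned access used, 0 = not used.  */
static int
resolve_unaligned_access (int requested, int arch6, int notm, int arch7)
{
  /* Same condition as in the excerpt: ARMv6 or later, excluding ARMv6-M.  */
  int supported = arch6 && (notm || arch7);

  assert (requested >= 0 && requested <= 2);

  if (requested == 2)
    /* No explicit request: follow what the architecture supports.  */
    return supported ? 1 : 0;

  if (requested == 1 && !supported)
    {
      /* Explicit request on an unsupported target: warn and disable.  */
      fprintf (stderr, "target CPU does not support unaligned accesses\n");
      return 0;
    }

  /* Explicit request that can be honoured as given.  */
  return requested;
}
```

 Under those assumptions resolve_unaligned_access (2, 0, 1, 0), which is how
I would model -march=armv5te with no explicit option, yields 0, which is why
movmisalignv16qi_neon_load should not be available there.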