On 1 May 2018 at 05:05, Sameera Deshpande <sameera.deshpa...@linaro.org> wrote: > On 13 April 2018 at 20:21, James Greenhalgh <james.greenha...@arm.com> wrote: >> On Fri, Apr 13, 2018 at 03:39:32PM +0100, Sameera Deshpande wrote: >>> On Fri 13 Apr, 2018, 8:04 PM James Greenhalgh, >>> <james.greenha...@arm.com<mailto:james.greenha...@arm.com>> wrote: >>> On Fri, Apr 06, 2018 at 08:55:47PM +0100, Christophe Lyon wrote: >>> > Hi, >>> > >>> > 2018-04-06 12:15 GMT+02:00 Sameera Deshpande >>> > <sameera.deshpa...@linaro.org<mailto:sameera.deshpa...@linaro.org>>: >>> > > Hi Christophe, >>> > > >>> > > Please find attached the updated patch with testcases. >>> > > >>> > > Ok for trunk? >>> > >>> > Thanks for the update. >>> > >>> > Since the new intrinsics are only available on aarch64, you want to >>> > prevent the tests from running on arm. >>> > Indeed gcc.target/aarch64/advsimd-intrinsics/ is shared between the two >>> > targets. >>> > There are several examples on how to do that in that directory. >>> > >>> > I have also noticed that the tests fail at execution on aarch64_be. >>> >>> I think this is important to fix. We don't want the big-endian target to >>> have >>> failing implementations of the Neon intrinsics. What is the nature of the >>> failure? >>> >>> From what I can see, nothing in the patch prevents using these intrinsics >>> on big-endian, so either the intrinsics behaviour is wrong (we have a wrong >>> code bug), or the testcase expected behaviour is wrong. >>> >>> I don't think disabling the test for big-endian is the right fix. We should >>> either fix the intrinsics, or fix the testcase. >>> >>> Thanks, >>> James >>> >>> Hi James, >>> >>> As the tests assume the little endian order of elements while checking the >>> results, the tests are failing for big endian targets. So, the failures are >>> not because of intrinsic implementations, but because of the testcase. >> >> The testcase is a little hard to follow through the macros, but why would >> this be the case? >> >> ld1 is deterministic on big and little endian for which elements will be >> loaded from memory, as is st1. >> >> My expectation would be that: >> >> int __attribute__ ((noinline)) >> test_vld_u16_x3 () >> { >> uint16_t data[3 * 3]; >> uint16_t temp[3 * 3]; >> uint16x4x3_t vectors; >> int i,j; >> for (i = 0; i < 3 * 3; i++) >> data [i] = (uint16_t) 3*i; >> asm volatile ("" : : : "memory"); >> vectors = vld1_u16_x3 (data); >> vst1_u16 (temp, vectors.val[0]); >> vst1_u16 (&temp[3], vectors.val[1]); >> vst1_u16 (&temp[3 * 2], vectors.val[2]); >> asm volatile ("" : : : "memory"); >> for (j = 0; j < 3 * 3; j++) >> if (temp[j] != data[j]) >> return 1; >> return 0; >> } >> >> would work equally well for big- or little-endian. >> >> I think this is more likely to be an intrinsics implementation bug. >> >> Thanks, >> James >> > > Hi James, > > Please find attached the updated patch, which now passes for little as > well as big endian. > Ok for trunk? > > -- > - Thanks and regards, > Sameera D. > > gcc/Changelog: > > 2018-05-01 Sameera Deshpande <sameera.deshpa...@linaro.org> > > > * config/aarch64/aarch64-simd-builtins.def (ld1x3): New. > (st1x2): Likewise. > (st1x3): Likewise. > * config/aarch64/aarch64-simd.md > (aarch64_ld1x3<VALLDIF:mode>): New pattern. > (aarch64_ld1_x3_<mode>): Likewise > (aarch64_st1x2<VALLDIF:mode>): Likewise > (aarch64_st1_x2_<mode>): Likewise > (aarch64_st1x3<VALLDIF:mode>): Likewise > (aarch64_st1_x3_<mode>): Likewise > * config/aarch64/arm_neon.h (vld1_u8_x3): New function. > (vld1_s8_x3): Likewise. > (vld1_u16_x3): Likewise. > (vld1_s16_x3): Likewise. > (vld1_u32_x3): Likewise. > (vld1_s32_x3): Likewise. > (vld1_u64_x3): Likewise. > (vld1_s64_x3): Likewise. > (vld1_f16_x3): Likewise. > (vld1_f32_x3): Likewise. > (vld1_f64_x3): Likewise. > (vld1_p8_x3): Likewise. > (vld1_p16_x3): Likewise. > (vld1_p64_x3): Likewise. > (vld1q_u8_x3): Likewise. > (vld1q_s8_x3): Likewise. > (vld1q_u16_x3): Likewise. > (vld1q_s16_x3): Likewise. > (vld1q_u32_x3): Likewise. > (vld1q_s32_x3): Likewise. > (vld1q_u64_x3): Likewise. > (vld1q_s64_x3): Likewise. > (vld1q_f16_x3): Likewise. > (vld1q_f32_x3): Likewise. > (vld1q_f64_x3): Likewise. > (vld1q_p8_x3): Likewise. > (vld1q_p16_x3): Likewise. > (vld1q_p64_x3): Likewise. > (vst1_s64_x2): Likewise. > (vst1_u64_x2): Likewise. > (vst1_f64_x2): Likewise. > (vst1_s8_x2): Likewise. > (vst1_p8_x2): Likewise. > (vst1_s16_x2): Likewise. > (vst1_p16_x2): Likewise. > (vst1_s32_x2): Likewise. > (vst1_u8_x2): Likewise. > (vst1_u16_x2): Likewise. > (vst1_u32_x2): Likewise. > (vst1_f16_x2): Likewise. > (vst1_f32_x2): Likewise. > (vst1_p64_x2): Likewise. > (vst1q_s8_x2): Likewise. > (vst1q_p8_x2): Likewise. > (vst1q_s16_x2): Likewise. > (vst1q_p16_x2): Likewise. > (vst1q_s32_x2): Likewise. > (vst1q_s64_x2): Likewise. > (vst1q_u8_x2): Likewise. > (vst1q_u16_x2): Likewise. > (vst1q_u32_x2): Likewise. > (vst1q_u64_x2): Likewise. > (vst1q_f16_x2): Likewise. > (vst1q_f32_x2): Likewise. > (vst1q_f64_x2): Likewise. > (vst1q_p64_x2): Likewise. > (vst1_s64_x3): Likewise. > (vst1_u64_x3): Likewise. > (vst1_f64_x3): Likewise. > (vst1_s8_x3): Likewise. > (vst1_p8_x3): Likewise. > (vst1_s16_x3): Likewise. > (vst1_p16_x3): Likewise. > (vst1_s32_x3): Likewise. > (vst1_u8_x3): Likewise. > (vst1_u16_x3): Likewise. > (vst1_u32_x3): Likewise. > (vst1_f16_x3): Likewise. > (vst1_f32_x3): Likewise. > (vst1_p64_x3): Likewise. > (vst1q_s8_x3): Likewise. > (vst1q_p8_x3): Likewise. > (vst1q_s16_x3): Likewise. > (vst1q_p16_x3): Likewise. > (vst1q_s32_x3): Likewise. > (vst1q_s64_x3): Likewise. > (vst1q_u8_x3): Likewise. > (vst1q_u16_x3): Likewise. > (vst1q_u32_x3): Likewise. > (vst1q_u64_x3): Likewise. > (vst1q_f16_x3): Likewise. > (vst1q_f32_x3): Likewise. > (vst1q_f64_x3): Likewise. > (vst1q_p64_x3): Likewise. > > gcc/testsuite/Changelog: > > 2018-05-01 Sameera Deshpande <sameera.deshpa...@linaro.org> > > * gcc.target/aarch64/advsimd-intrinsics/vld1x3.c: New test for > vld1x3 intrinsics for aarch64. > * gcc.target/aarch64/advsimd-intrinsics/vst1x2.c: New test for > vst1x2 intrinsics for aarch64. > * gcc.target/aarch64/advsimd-intrinsics/vst1x3.c: New test for > vst1x3 intrinsics for aarch64.
Gentle reminder! -- - Thanks and regards, Sameera D.