On Wed, Jan 11, 2017 at 11:13:07AM +0100, Christophe Lyon wrote:
> Ping?
> 
> James, I'm not sure whether your comment was a request for a new
> version of my patch or just FYI?

Sorry that this was unclear. I was looking for a new version of the patch
covering this comment. Otherwise we just leave ourselves debt to go fix in
future.

With the suggested change, the AArch64 parts of this patch are OK - adding
missing intrinsics is very safe (even in Stage 4).
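
To be concrete, I'd expect both intrinsics in the pure C form I suggested
below; the non-q variant would then be something like this (untested
sketch, by analogy with the vtstq_p64 version I quoted):

  __extension__ extern __inline uint64x1_t
  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
  vtst_p64 (poly64x1_t __a, poly64x1_t __b)
  {
    /* Non-zero lanes of (a & b) become all-ones in the result.  */
    return (uint64x1_t) ((((uint64x1_t) __a) & ((uint64x1_t) __b))
                         != __AARCH64_INT64_C (0));
  }

Writing it this way keeps the operation visible to the compiler (for
scheduling, combining and constant folding) rather than hiding it behind
inline asm.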

Please post an updated patch, and give Richard and Marcus a reasonable
amount of time to object to taking the patch this late. (And you need an
AArch32 OK too.)

Thanks,
James

> 
> 
> On 3 January 2017 at 16:47, Christophe Lyon <christophe.l...@linaro.org> 
> wrote:
> > Ping?
> >
> >
> > On 14 December 2016 at 23:09, Christophe Lyon
> > <christophe.l...@linaro.org> wrote:
> >> On 14 December 2016 at 17:55, James Greenhalgh <james.greenha...@arm.com> 
> >> wrote:
> >>> On Mon, Dec 12, 2016 at 05:03:31PM +0100, Christophe Lyon wrote:
> >>>> Hi,
> >>>>
> >>>> After the recent update from Tamar, I noticed a few discrepancies
> >>>> between ARM and AArch64 regarding a few poly64 intrinsics.
> >>>>
> >>>> This patch:
> >>>> - adds vtst_p64 and vtstq_p64 to AArch64's arm_neon.h
> >>>> - adds vgetq_lane_p64, vset_lane_p64 and vsetq_lane_p64 to ARM's 
> >>>> arm_neon.h
> >>>> (vget_lane_p64 was already there)
> >>>> - adds the corresponding tests, and moves the vget_lane_p64 ones out
> >>>> of the #ifdef __aarch64__ zone.
> >>>>
> >>>> Cross-tested on arm* and aarch64* targets.
> >>>>
> >>>> OK?
> >>>
> >>> The AArch64 parts of this look fine to me, but I do have one question on
> >>> your inline assembly implementation for vtstq_p64:
> >>>
> >>>> +__extension__ extern __inline uint64x2_t
> >>>> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> >>>> +vtstq_p64 (poly64x2_t a, poly64x2_t b)
> >>>> +{
> >>>> +  uint64x2_t result;
> >>>> +  __asm__ ("cmtst %0.2d, %1.2d, %2.2d"
> >>>> +           : "=w"(result)
> >>>> +           : "w"(a), "w"(b)
> >>>> +           : /* No clobbers */);
> >>>> +  return result;
> >>>> +}
> >>>> +
> >>>
> >>> Why can this not be written as many of the other vtstq intrinsics are; 
> >>> e.g.:
> >>>
> >>>   __extension__ extern __inline uint64x2_t
> >>>   __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> >>>   vtstq_p64 (poly64x2_t __a, poly64x2_t __b)
> >>>   {
> >>>     return (uint64x2_t) ((((uint64x2_t) __a) & ((uint64x2_t) __b))
> >>>                           != __AARCH64_INT64_C (0));
> >>>   }
> >>>
> >>
> >> I don't know, I just followed the pattern used for vtstq_p8 and vtstq_p16
> >> just above...
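> >>
> >> (For reference, that existing pattern is the same inline-asm shape as
> >> my vtstq_p64, just with the byte element size; approximately:
> >>
> >>   __extension__ extern __inline uint8x16_t
> >>   __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> >>   vtstq_p8 (poly8x16_t a, poly8x16_t b)
> >>   {
> >>     uint8x16_t result;
> >>     /* CMTST sets each byte lane to all-ones if (a & b) is non-zero.  */
> >>     __asm__ ("cmtst %0.16b, %1.16b, %2.16b"
> >>              : "=w"(result)
> >>              : "w"(a), "w"(b)
> >>              : /* No clobbers */);
> >>     return result;
> >>   }
> >>
> >> so I kept that shape for the new poly64 ones.)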
> >>
> >>
> >>> Thanks,
> >>> James
> >>>
> >>>> gcc/ChangeLog:
> >>>>
> >>>> 2016-12-12  Christophe Lyon  <christophe.l...@linaro.org>
> >>>>
> >>>>       * config/aarch64/arm_neon.h (vtst_p64): New.
> >>>>       (vtstq_p64): New.
> >>>>       * config/arm/arm_neon.h (vgetq_lane_p64): New.
> >>>>       (vset_lane_p64): New.
> >>>>       (vsetq_lane_p64): New.
> >>>>
> >>>> gcc/testsuite/ChangeLog:
> >>>>
> >>>> 2016-12-12  Christophe Lyon  <christophe.l...@linaro.org>
> >>>>
> >>>>       * gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
> >>>>       (vget_lane_expected, vset_lane_expected, vtst_expected_poly64x1):
> >>>>       New.
> >>>>       (vmov_n_expected0, vmov_n_expected1, vmov_n_expected2)
> >>>>       (expected_vld_st2_0, expected_vld_st2_1, expected_vld_st3_0)
> >>>>       (expected_vld_st3_1, expected_vld_st3_2, expected_vld_st4_0)
> >>>>       (expected_vld_st4_1, expected_vld_st4_2, expected_vld_st4_3)
> >>>>       (vtst_expected_poly64x2): Move to aarch64-only section.
> >>>>       (vget_lane_p64, vgetq_lane_p64, vset_lane_p64, vsetq_lane_p64)
> >>>>       (vtst_p64, vtstq_p64): New tests.
> >>>>
> >>>
> >>>
