Re: [PATCH v2 1/2] aarch64: Use standard names for saturating arithmetic

Akram Ahmad Wed, 18 Dec 2024 10:03:38 -0800

Hi Kyrill,

On 17/12/2024 15:15, Kyrylo Tkachov wrote:

> We avoid using the __builtin_aarch64_* builtins in test cases as they are
> undocumented and we don’t make any guarantees about their stability to users.
> I’d prefer if the saturating operation was open-coded in C. I expect the midend
> machinery is smart enough to recognize the saturating logic for scalars by now?

Thanks for the detailed feedback. It's been really helpful, and I've
gone ahead and implemented almost all of it. I'm struggling to find a
pattern that's recognised for signed arithmetic though: the following
emits branching code:


int64_t  __attribute__((noipa))
sadd64 (int64_t __a, int64_t __b)
{
  if (__a > 0) {
    if (__b > INT64_MAX - __a)
      return INT64_MAX;
  } else if (__b < INT64_MIN - __a) {
    return INT64_MIN;
  }
  return __a + __b;
}

Resulting assembly:

sadd64:
.LFB6:
        .cfi_startproc
        mov     x3, x0
        cmp     x0, 0
        ble     .L9
        mov     x2, 9223372036854775807
        sub     x4, x2, x0
        mov     x0, x2
        cmp     x4, x1
        blt     .L8
.L11:
        add     x0, x3, x1
.L8:
        ret
        .p2align 2,,3
.L9:
        mov     x2, -9223372036854775808
        sub     x0, x2, x0
        cmp     x0, x1
        ble     .L11
        mov     x0, x2
        ret

Is there a way to force this not to use branches by any chance? I'll
keep looking and see if there are some patterns recently added to match
that will work here. If I don't find something, would it be sufficient
to use the scalar NEON intrinsics for this? And if so, would that mean
the test should move to the Adv. SIMD directory?

Many thanks once again,
Akram
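
For reference, one way the same signed saturating add could be written without any source-level branches is sketched below. It is not taken from the patch: the name sadd64_branchless is hypothetical, and the code assumes two's-complement conversion from uint64_t back to int64_t (which GCC provides). Whether the midend matches this form to a saturating-add instruction is not established here and would need checking against the generated assembly.

```c
#include <stdint.h>

/* Hypothetical helper for illustration only; not part of the patch.  */
int64_t
sadd64_branchless (int64_t a, int64_t b)
{
  uint64_t ua = (uint64_t) a;
  uint64_t ub = (uint64_t) b;

  /* Unsigned addition wraps, so there is no undefined behaviour here.  */
  uint64_t sum = ua + ub;

  /* Saturation value: INT64_MAX when a >= 0, INT64_MIN when a < 0.  */
  uint64_t sat = (ua >> 63) + (uint64_t) INT64_MAX;

  /* Signed overflow happens exactly when a and b have the same sign
     and the sign of the wrapped sum differs from it.  */
  uint64_t overflow = (~(ua ^ ub) & (ua ^ sum)) >> 63;

  /* All-ones mask on overflow, zero otherwise; select without a branch.  */
  uint64_t mask = -overflow;
  return (int64_t) ((sum & ~mask) | (sat & mask));
}
```

For in-range inputs this returns the same value as sadd64 above, and at the limits it clamps: sadd64_branchless (INT64_MAX, 1) gives INT64_MAX and sadd64_branchless (INT64_MIN, -1) gives INT64_MIN.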
