https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115974
Bug ID: 115974
Summary: sat_add vector patterns not done for aarch64
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: pinskia at gcc dot gnu.org
Target Milestone: ---
Target: aarch64-linux-gnu

Take:
```
void f0(unsigned *__restrict__ a, unsigned * __restrict__ b)
{
  for(int i = 0;i < 1024;i ++)
  {
    unsigned tt;
    if (__builtin_add_overflow (a[i], b[i], &tt))
      tt = -1u;
    a[i] = tt;
  }
}
```
This should be vectorizable. Like it is on riscv or with clang.

LLVM's output:
```
.LBB1_1: // =>This Inner Loop Header: Depth=1
        ldp     q0, q3, [x10, #-16]
        subs    x8, x8, #8
        ldp     q1, q2, [x9, #-16]
        add     x10, x10, #32
        uqadd   v0.4s, v1.4s, v0.4s
        uqadd   v1.4s, v2.4s, v3.4s
        stp     q0, q1, [x9, #-16]
        add     x9, x9, #32
        b.ne    .LBB1_1
```
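
For reference, here is a minimal scalar sketch of the per-element operation the loop performs, which is the unsigned saturating add that each `uqadd` lane computes. The helper names (`sat_add_branchy`, `sat_add_branchless`, `check_sat_add`) are hypothetical and not from the testcase; the branchless form is just one common equivalent spelling, shown here on the assumption that it is useful for comparing how different source forms get handled.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: same shape as the loop body in f0. */
static unsigned sat_add_branchy(unsigned x, unsigned y)
{
  unsigned sum;
  if (__builtin_add_overflow(x, y, &sum))
    sum = UINT_MAX;               /* clamp on wraparound, like tt = -1u */
  return sum;
}

/* Hypothetical helper: branchless spelling of the same operation. */
static unsigned sat_add_branchless(unsigned x, unsigned y)
{
  unsigned sum = x + y;           /* unsigned add wraps on overflow */
  /* (sum < x) is 1 exactly when the add wrapped; negating it as an
     unsigned value gives an all-ones mask, so the OR yields UINT_MAX. */
  return sum | -(unsigned)(sum < x);
}

/* Small self-check that both spellings agree on a few edge cases. */
static void check_sat_add(void)
{
  const unsigned cases[] = { 0u, 1u, 2u, UINT_MAX / 2u, UINT_MAX - 1u, UINT_MAX };
  const int n = (int)(sizeof cases / sizeof cases[0]);
  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      assert(sat_add_branchy(cases[i], cases[j])
             == sat_add_branchless(cases[i], cases[j]));
}
```

Calling `check_sat_add()` from a test driver exercises both forms; either one can be dropped into the loop body of `f0` to see whether the vectorizer treats them the same way.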