https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113882
Bug ID: 113882
Summary: V4SF->V4HI could be implemented using V4SF->V4SI and then truncation to V4HI
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: pinskia at gcc dot gnu.org
Target Milestone: ---
Target: aarch64

Take:
```
void f(short *a, float *b)
{
  a[0] = b[0];
  a[1] = b[1];
  a[2] = b[2];
  a[3] = b[3];
}

void f1(float *a, short *b)
{
  a[0] = b[0];
  a[1] = b[1];
  a[2] = b[2];
  a[3] = b[3];
}
```
GCC can SLP f1 (which does V4HI->V4SF) but not f (which does V4SF->V4HI). LLVM can SLP f though:
```
f:
        ldr     q0, [x1]
        fcvtzs  v0.4s, v0.4s
        xtn     v0.4h, v0.4s
        str     d0, [x0]
        ret
```
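
For illustration, the two-step lowering proposed in the summary can be written out by hand with the GNU vector extensions. This is only a sketch of the intended sequence, not what GCC does internally: the name `f_two_step` and the typedefs `v4sf`, `v4si`, `v4hi` are hypothetical, and it assumes a compiler that provides `__builtin_convertvector` (GCC 9 or later, or Clang). The result matches `f` only for inputs whose truncated value fits in `short`; outside that range the scalar conversion in `f` is undefined anyway.

```c
#include <string.h>

/* Hypothetical vector typedefs mirroring the mode names in the summary. */
typedef float v4sf __attribute__((vector_size(16)));  /* 4 x float */
typedef int   v4si __attribute__((vector_size(16)));  /* 4 x int   */
typedef short v4hi __attribute__((vector_size(8)));   /* 4 x short */

/* Hand-written equivalent of f: convert in two steps instead of one. */
void f_two_step(short *a, const float *b)
{
  v4sf in;
  v4si wide;
  v4hi narrow;

  memcpy(&in, b, sizeof in);                     /* load 4 floats, no alignment assumed */
  wide = __builtin_convertvector(in, v4si);      /* V4SF -> V4SI, truncates toward zero */
  narrow = __builtin_convertvector(wide, v4hi);  /* V4SI -> V4HI, keeps the low 16 bits */
  memcpy(a, &narrow, sizeof narrow);             /* store 4 shorts */
}
```

On aarch64 one would hope for this to map onto the same `fcvtzs` plus `xtn` pair that LLVM emits for `f`, though that depends on the compiler version and flags.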