https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71663

            Bug ID: 71663
           Summary: aarch64 Vector initialization can be improved slightly
           Product: gcc
           Version: 6.1.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64*-*-*

Take:
#define vector __attribute__((vector_size(16)))

vector float combine (float a, float b, float c, float d)
{
  return (vector float) { a, b, c, d };
}
--- CUT ---
Currently we produce:
        movi    v4.4s, 0
        ins     v4.s[0], v0.s[0]
        ins     v4.s[1], v1.s[0]
        ins     v4.s[2], v2.s[0]
        orr     v0.16b, v4.16b, v4.16b
        ins     v0.s[3], v3.s[0]
        ret

The movi is not needed, and if we did this correctly, the move (orr) would not
be needed either.  Even the first ins is not needed.

Right now we expand the first element as:
(insn 9 8 10 (set (reg:SF 80)
        (reg/v:SF 74 [ a ])) t8.c:5 -1
     (nil))
(insn 10 9 11 (set (reg:V4SF 79)
        (vec_merge:V4SF (vec_duplicate:V4SF (reg:SF 80))
            (reg:V4SF 79)
            (const_int 1 [0x1]))) t8.c:5 -1
     (nil))

But maybe if we do:
(set (reg:V4SF 79) (subreg:V4SF (reg:SF 74)))

we could remove the movi and the first ins.  The last move would remove itself
too because v0 is dead after the instruction.
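
For readers less familiar with the RTL above, here is a rough C model of what the vec_merge/vec_duplicate pattern in insn 10 computes. It is only a sketch: the names v4sf, model_vec_duplicate, model_vec_merge and combine_by_inserts are made up for illustration and are not GCC internals, and the zero start value is assumed to stand in for the movi. It relies on the GNU C vector extension (vector_size and lane subscripting), and it models the lane semantics only, not the register allocation that decides whether the orr can go away.

```c
/* Hypothetical model of the RTL in the report; not GCC source code. */
typedef float v4sf __attribute__((vector_size(16)));

/* Reference result: the initializer from the test case in the report. */
static v4sf combine(float a, float b, float c, float d)
{
    return (v4sf){ a, b, c, d };
}

/* Model of vec_duplicate: broadcast one scalar into all four lanes. */
static v4sf model_vec_duplicate(float x)
{
    return (v4sf){ x, x, x, x };
}

/* Model of vec_merge: lane i is taken from `a` when bit i of `mask`
   is set, otherwise from `b`. */
static v4sf model_vec_merge(v4sf a, v4sf b, unsigned mask)
{
    v4sf r;
    for (int i = 0; i < 4; i++)
        r[i] = (mask & (1u << i)) ? a[i] : b[i];
    return r;
}

/* Build the vector one lane at a time, the way the current expansion
   does.  The first step uses mask 1, as in insn 10 (const_int 1). */
static v4sf combine_by_inserts(float a, float b, float c, float d)
{
    v4sf v = { 0.0f, 0.0f, 0.0f, 0.0f };   /* assumed stand-in for the movi */
    v = model_vec_merge(model_vec_duplicate(a), v, 1u << 0);
    v = model_vec_merge(model_vec_duplicate(b), v, 1u << 1);
    v = model_vec_merge(model_vec_duplicate(c), v, 1u << 2);
    v = model_vec_merge(model_vec_duplicate(d), v, 1u << 3);
    return v;
}
```

Since every lane is overwritten by one of the four merges, the starting contents of v never reach the result, which is the sense in which the initial zeroing is redundant. Calling combine and combine_by_inserts with the same four floats from a small main should give vectors that compare equal lane by lane.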