https://gcc.gnu.org/bugzilla/show_bug.cgi?id=74585
--- Comment #9 from Bill Schmidt <wschmidt at gcc dot gnu.org> ---
We do optimize things well for the following:

typedef struct
{
  __vector double vx0;
  __vector double vx1;
  __vector double vx2;
  __vector double vx3;
} vdoublex8_t;

vdoublex8_t
test_vecd8_rotate_left (vdoublex8_t a)
{
  vdoublex8_t b;
  b.vx0 = a.vx0;
  b.vx1 = a.vx1;
  b.vx2 = a.vx2;
  b.vx3 = a.vx3;
  return b;
}

At expansion time we have similar code as above, with all the stack stores and reloads for the argument and the return value, but the optimizer is able to remove all of that. So the problem only occurs when referencing part of a vector.
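
To make the contrast concrete, here is a minimal sketch of the two access patterns side by side. It is an illustration only, not the test case from this report. It assumes the GCC generic vector extension (vector_size) as a portable stand-in for __vector double, and the names v2df, copy_whole and rotate_elements_left are hypothetical. The first function only moves whole vectors, as in the example above. The second reads and writes individual elements, which is the kind of partial reference the comment identifies as the trigger. Whether this exact variant reproduces the poor code has not been checked here.

```c
/* Assumption: GCC/Clang generic vector extension, used here as a
   portable stand-in for `__vector double` on PowerPC.  */
typedef double v2df __attribute__ ((vector_size (16)));

typedef struct
{
  v2df vx0;
  v2df vx1;
  v2df vx2;
  v2df vx3;
} vdoublex8_t;

/* Whole-vector copies only: the pattern reported as optimized well.  */
vdoublex8_t
copy_whole (vdoublex8_t a)
{
  vdoublex8_t b;
  b.vx0 = a.vx0;
  b.vx1 = a.vx1;
  b.vx2 = a.vx2;
  b.vx3 = a.vx3;
  return b;
}

/* Hypothetical variant that references parts of each vector: every
   element moves one slot to the left across the eight doubles, and
   the first element wraps around to the last slot.  */
vdoublex8_t
rotate_elements_left (vdoublex8_t a)
{
  vdoublex8_t b;
  b.vx0[0] = a.vx0[1];
  b.vx0[1] = a.vx1[0];
  b.vx1[0] = a.vx1[1];
  b.vx1[1] = a.vx2[0];
  b.vx2[0] = a.vx2[1];
  b.vx2[1] = a.vx3[0];
  b.vx3[0] = a.vx3[1];
  b.vx3[1] = a.vx0[0];
  return b;
}
```

Compiling both functions with something like gcc -O2 -S on a powerpc64le target and comparing the assembly for leftover stack stores and reloads would be one way to check whether the second pattern shows the behavior described.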