https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117562
--- Comment #10 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
>
> I do wonder about the usefulness of the memory alternative on the
> sse_movhlps pattern though, there's the sse_storehps pattern which
> also models the store part more precisely as V2SFmode.  Is
> sse_movhlps_exp ever invoked with a memory destination?
>

Like this?

typedef float v4sf __attribute__((vector_size(16)));

void
foo (v4sf a, v4sf* b)
{
  *b = __builtin_shufflevector (*b, a, 0, 1, 4, 5);
}

foo(float __vector(4), float __vector(4)*):
        movlps  QWORD PTR [rdi+8], xmm0     # 11    [c=4 l=3]  sse_movlhps/4
        ret             # 19    [c=0 l=1]  simple_return_internal
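
For readers less familiar with the shuffle indices, here is a rough scalar sketch of why a single 64-bit store is enough for the test case above. It is only an illustration in portable C, not GCC's implementation; the helper names (shuffle_0145, store_low_half_to_high, models_agree, self_check) are hypothetical. Indices 0 and 1 select the first two elements of *b and indices 4 and 5 select the first two elements of a, so the low half of *b is unchanged and only the high half is overwritten with the low half of a.

```c
#include <assert.h>
#include <string.h>

/* Scalar model of __builtin_shufflevector (b, a, 0, 1, 4, 5):
   indices 0-3 pick from the first operand, 4-7 from the second.  */
void
shuffle_0145 (const float b[4], const float a[4], float out[4])
{
  out[0] = b[0];
  out[1] = b[1];
  out[2] = a[0];
  out[3] = a[1];
}

/* Scalar model of the single 64-bit store to [rdi+8]:
   only b[2] and b[3] are written, with a[0] and a[1].  */
void
store_low_half_to_high (float b[4], const float a[4])
{
  memcpy (&b[2], &a[0], 2 * sizeof (float));
}

/* Returns 1 if the full shuffle-and-store and the partial store
   leave identical contents in b for one sample input.  */
int
models_agree (void)
{
  const float a[4] = { 10.0f, 11.0f, 12.0f, 13.0f };
  float b1[4] = { 0.0f, 1.0f, 2.0f, 3.0f };
  float b2[4] = { 0.0f, 1.0f, 2.0f, 3.0f };
  float tmp[4];

  shuffle_0145 (b1, a, tmp);
  memcpy (b1, tmp, sizeof tmp);      /* *b = shuffle (*b, a) */
  store_low_half_to_high (b2, a);    /* movlps-style partial store */

  return memcmp (b1, b2, sizeof b1) == 0;
}

/* Convenience wrapper that aborts if the two models disagree.  */
void
self_check (void)
{
  assert (models_agree ());
}
```

Calling self_check () from a test driver should pass silently, under the assumption that this scalar model matches the vector semantics of the test case.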