https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866
luoxhu at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |luoxhu at gcc dot gnu.org

--- Comment #2 from luoxhu at gcc dot gnu.org ---
But it only works for V8HImode. Is there no better code generation for the
other modes, like V4SI/V2DI/V1TI, that does the byte swap with only two
instructions, as vspltish+vrlh does?

unsigned int swap1[16] = {15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0};
unsigned int swap2[16] = {7,6,5,4,3,2,1,0,15,14,13,12,11,10,9,8};
unsigned int swap4[16] = {3,2,1,0,7,6,5,4,11,10,9,8,15,14,13,12};
unsigned int swap8[16] = {1,0,3,2,5,4,7,6,9,8,11,10,13,12,15,14};

For example, V4SI would need a short swap first and then a word swap, which
seems less straightforward than vperm?
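
To make the V4SI case concrete, here is a rough scalar sketch in plain C. It is not GCC code and not real AltiVec intrinsics. It models a single-input byte permute driven by swap4, and compares it against the two-step sequence: rotate each halfword by 8, then rotate each word by 16. The helper names (permute_bytes, rotate_halfwords_by_8, rotate_words_by_16, v4si_two_step_matches_permute) are made up for illustration, and permute_bytes is a simplification of vperm, which takes two source vectors.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Control vector for the per-word byte swap, copied from the comment. */
static const unsigned int swap4[16] = {3,2,1,0,7,6,5,4,11,10,9,8,15,14,13,12};

/* Hypothetical scalar model of a one-input byte permute:
   out[i] = in[ctl[i]].  Real vperm selects from two inputs. */
static void permute_bytes(const uint8_t in[16], const unsigned int ctl[16],
                          uint8_t out[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = in[ctl[i] & 15];
}

/* Step 1: rotating a 16-bit element by 8 exchanges its two bytes
   (the effect of splatting 8 and doing a halfword rotate). */
static void rotate_halfwords_by_8(uint8_t v[16])
{
    for (int i = 0; i < 16; i += 2) {
        uint8_t t = v[i];
        v[i] = v[i + 1];
        v[i + 1] = t;
    }
}

/* Step 2: rotating a 32-bit element by 16 exchanges its two halfwords. */
static void rotate_words_by_16(uint8_t v[16])
{
    for (int i = 0; i < 16; i += 4) {
        uint8_t t0 = v[i], t1 = v[i + 1];
        v[i] = v[i + 2];
        v[i + 1] = v[i + 3];
        v[i + 2] = t0;
        v[i + 3] = t1;
    }
}

/* Returns 1 if the two-step sequence gives the same bytes as the
   single permute with swap4, 0 otherwise. */
int v4si_two_step_matches_permute(void)
{
    uint8_t in[16], via_perm[16], via_rot[16];

    for (int i = 0; i < 16; i++)
        in[i] = (uint8_t)(0xA0 + i);      /* distinct test bytes */

    permute_bytes(in, swap4, via_perm);

    memcpy(via_rot, in, sizeof via_rot);
    rotate_halfwords_by_8(via_rot);
    rotate_words_by_16(via_rot);

    return memcmp(via_perm, via_rot, 16) == 0;
}

/* Convenience wrapper: aborts if the two approaches disagree. */
void self_check(void)
{
    assert(v4si_two_step_matches_permute());
}
```

Both paths produce the same bytes, but the rotate path needs two data operations plus whatever it takes to materialize the two shift counts, whereas the permute path is one operation once the control vector is loaded. That is the trade-off the question above is getting at.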