https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
--- Comment #8 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to luoxhu from comment #7)
> (In reply to Segher Boessenkool from comment #3)
> > The rotates in 6 and 7 are not merged, and neither are the vec_selects in
> > 8 and 9.  Both should be pretty easy to do, there is no unspec in sight,
> > etc.
>
> Should this be done in pass bswaps or combine or by peephole2? :)

It should be done by simplify-rtx.c at least (which will make it work in
combine and other places): two rotates that together do nothing should be
optimised to that, or generally, two rotates should be optimised to just
one (which then can be optimised to nothing).  Similar for vec_select.

Maybe something in bswaps can help as well, I don't know, I haven't looked
closely yet.
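
To illustrate the identity behind the suggestion, here is a minimal
stand-alone sketch in plain C.  It is not GCC code and does not touch
simplify-rtx.c; the helper names (rotl64, rotl64_merged,
rotate_merge_selfcheck) are hypothetical and a 64-bit width is assumed.
It only shows that two rotates by a and b behave like one rotate by
(a + b) modulo the width, and that the pair does nothing when the counts
sum to the width.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: rotate a 64-bit value left by n bits. */
static uint64_t rotl64(uint64_t x, unsigned n)
{
    n &= 63;                      /* keep the count in range */
    if (n == 0)
        return x;                 /* avoid the undefined shift by 64 */
    return (x << n) | (x >> (64 - n));
}

/* Two successive left rotates folded into a single rotate. */
static uint64_t rotl64_merged(uint64_t x, unsigned a, unsigned b)
{
    return rotl64(x, (a + b) & 63);
}

/* Self-check: folding agrees with doing both rotates for all count pairs. */
static void rotate_merge_selfcheck(void)
{
    const uint64_t x = 0x0123456789abcdefULL;

    for (unsigned a = 0; a < 64; a++)
        for (unsigned b = 0; b < 64; b++)
            assert(rotl64(rotl64(x, a), b) == rotl64_merged(x, a, b));

    /* Counts that sum to the width: the pair is a no-op. */
    assert(rotl64(rotl64(x, 32), 32) == x);
}
```

An RTL simplification would apply the same arithmetic to the rotate
counts rather than to run-time values; the analogous step for vec_select
would be composing the two selections into one.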