http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607
--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-03-19 11:13:52 UTC ---
Created attachment 26915
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26915
gcc48-pr52607.patch

AVX2 changes.  This improves:
1) for {x, x, x, x} V4DFmode permutations we no longer generate
   vpermpd+vperm2f128 and just emit vpermpd (for all other V4DFmode
   permutations we were already emitting just vpermpd)
2) for most V8SFmode permutations we now emit code like
     vmovdqa .LC0(%rip), %ymm1
     vpermps %ymm0, %ymm1, %ymm0
   instead of
     vperm2f128 $0, %ymm0, %ymm0, %ymm0
     vpermilps .LC0(%rip), %ymm0, %ymm0
3) broadcast permutation improvements for V8SFmode (using vbroadcastss)
   as well as for some integer modes
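
For readers who want to try the three cases in the comment above, here is a
minimal sketch using GCC's generic vector extensions.  It is not taken from the
patch or its testsuite: the function names, the mask values and the
check_examples helper are all made up for illustration.  Compiling it with
something like "gcc -O2 -mavx2 -S" should let you inspect which permutation
instructions are chosen, but the exact output depends on the GCC version and
is not guaranteed to match the sequences quoted above.

```c
#include <assert.h>
#include <string.h>

/* 256-bit vector types; the integer types serve as shuffle masks and must
   match the element width and count of the data vector. */
typedef double    v4df __attribute__((vector_size(32)));
typedef long long v4di __attribute__((vector_size(32)));
typedef float     v8sf __attribute__((vector_size(32)));
typedef int       v8si __attribute__((vector_size(32)));

/* Case 1 (hypothetical example): a {x, x, x, x} V4DFmode permutation that
   replicates element 2 into every lane. */
void splat_v4df(const double in[4], double out[4])
{
    const v4di mask = {2, 2, 2, 2};
    v4df v;
    memcpy(&v, in, sizeof v);
    v = __builtin_shuffle(v, mask);
    memcpy(out, &v, sizeof v);
}

/* Case 2 (hypothetical example): a general V8SFmode permutation that
   crosses the 128-bit lanes; here, a full reversal. */
void reverse_v8sf(const float in[8], float out[8])
{
    const v8si mask = {7, 6, 5, 4, 3, 2, 1, 0};
    v8sf v;
    memcpy(&v, in, sizeof v);
    v = __builtin_shuffle(v, mask);
    memcpy(out, &v, sizeof v);
}

/* Case 3 (hypothetical example): a V8SFmode broadcast of element 0. */
void broadcast_v8sf(const float in[8], float out[8])
{
    const v8si mask = {0, 0, 0, 0, 0, 0, 0, 0};
    v8sf v;
    memcpy(&v, in, sizeof v);
    v = __builtin_shuffle(v, mask);
    memcpy(out, &v, sizeof v);
}

/* Checks the results of the three shuffles; returns 1 on success.
   This verifies semantics only, not which instructions were emitted. */
int check_examples(void)
{
    const double d[4] = {10.0, 11.0, 12.0, 13.0};
    const float f[8] = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f};
    double dout[4];
    float fout[8];
    int i;

    splat_v4df(d, dout);
    for (i = 0; i < 4; i++)
        assert(dout[i] == d[2]);

    reverse_v8sf(f, fout);
    for (i = 0; i < 8; i++)
        assert(fout[i] == f[7 - i]);

    broadcast_v8sf(f, fout);
    for (i = 0; i < 8; i++)
        assert(fout[i] == f[0]);

    return 1;
}
```

The functions take plain arrays rather than vector arguments so that the file
also builds without -mavx2 and without ABI warnings; in that case GCC lowers
the shuffles to whatever the target supports.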