[Bug tree-optimization/115709] missed optimisation: vperms not reordered to eliminate

mjr19 at cam dot ac.uk via Gcc-bugs Tue, 02 Jul 2024 02:13:17 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115709


--- Comment #3 from mjr19 at cam dot ac.uk ---
Created attachment 58558
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58558&action=edit
Demo of effect of vperm rearrangement

I still believe that my code is correct. To make what I propose clearer, I
attach a runnable demo, which checks itself.

Whether the optimisation is easy enough to be worthwhile, or whether it would
generalise to other cases, is another matter. On a Kaby Lake the optimised
version is about 20% faster, but on a Haswell it is only about 7% faster.
