https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019
--- Comment #7 from Robin Dapp <rdapp at gcc dot gnu.org> ---
> The problem is GCC-15 has performance regression compare to GCC-14 on both
> strict align and we should fix it, we can't specify use no strict align in
> GCC-15 to pretend that we don't have such performance regression.

The problem is that in GCC 15 we're now vectorizing the first loop and don't
cost it properly, most likely the vec_init/vec_construct is too inexpensive.

A solution could be two-fold:
 - Increase those costs (but we can never get them really correct in case a
   vec_init is just a broadcast or so)
 - Enhance our vec_init expander to also allow construction from sub vectors
   so the initialization becomes cheaper and we don't need to clumsily load
   every uint8 element separately.

I can start with the second one which should also help other workloads.  There
are several follow-up items, though, including proper subreg handling for those
cases.
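
To make the second option concrete, here is a rough scalar model in plain C of the two ways a 16-lane uint8 vector could be put together.  This is only a sketch and not GCC's expander code: the 4x4 layout, the stride parameter and the function names build_elementwise and build_from_subvectors are assumptions made up for illustration and do not come from the test case in this PR.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: 4 rows of 4 contiguous bytes, rows `stride` apart. */
#define ROWS 4
#define ROW_BYTES 4

/* Element-wise construction: one uint8 load per lane, 16 loads in total.
   This models a vec_init that inserts every element separately.  */
void build_elementwise(uint8_t out[ROWS * ROW_BYTES],
                       const uint8_t *src, size_t stride)
{
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < ROW_BYTES; c++)
            out[r * ROW_BYTES + c] = src[r * stride + c];
}

/* Sub-vector construction: one 4-byte chunk per row, 4 wider loads in total.
   memcpy is used so the access is valid for any alignment of src.  */
void build_from_subvectors(uint8_t out[ROWS * ROW_BYTES],
                           const uint8_t *src, size_t stride)
{
    for (int r = 0; r < ROWS; r++) {
        uint32_t chunk;
        memcpy(&chunk, src + r * stride, sizeof chunk);
        memcpy(out + r * ROW_BYTES, &chunk, sizeof chunk);
    }
}
```

Both functions fill out with the same bytes; the second one just needs a quarter of the loads and inserts, which is the saving the sub-vector vec_init would be after.  Whether the wider loads are actually cheap depends on the target's alignment rules, which is presumably where the strict-align question comes back in.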