On Mon, 15 Jul 2024, Jan Hubicka wrote:

> > Currently unaligned YMM and ZMM load and store costs are cheaper than
> > aligned which causes the vectorizer to purposely mis-align accesses
> > by adding an alignment prologue. It looks like the unaligned costs
> > were simply left untouched from znver3 where they equate the aligned
> > costs when tweaking aligned costs for znver4. The following makes
> > the unaligned costs equal to the aligned costs.
> >
> > This avoids the miscompile seen in PR115843 but it's of course not
> > a real fix for the issue uncovered there. But it makes it qualify
> > as a regression fix.
> >
> > Bootstrap & regtest running on x86_64-unknown-linux-gnu.
> >
> > OK for trunk and affected branches? It also affects the gcc11 branch
>
> Looks good to me. I think it was my omission. I should remember that
> the costs are there multiple times.

Well, they are not the same but aligned vs. unaligned. I think we
decided that on x86 doing an alignment prologue isn't usually helpful
which means assigning the same cost to both aligned and unaligned
stores/loads. Usually 'unaligned' means unknown alignment but in
this particular instance it's also used for _known_ unaligned - the
cost hook gets that distinction passed and we could for example bias
that case by a +1 though I hardly believe that's going to make a
difference (but it would possibly discourage mis-aligning a known
aligned access like what happened with deepsjeng).

> Maybe wait for SPEC tester before backporting to branches?

I've pushed it to trunk now and am running local CPU 2017 to check
for obvious fallout on Zen4 so we can make 14.2 RC early next week.
There's still the question of GCC 11.5 which got the backport of
zen4 support with these "wrong" costs but RC1 was already last week
and we're set to release on Friday. I'd like to hear your opinion
on that (13.3 and 12.4 also got the bogus value so eventually 11.5
getting the bogus value isn't too bad).

Btw, I just see that znver5 tables have the same issue, I'll push the
obvious change there as well.

Richard.

> Honza
>
> > where znver4 support/costs are new for 11.5 and thus it affects the
> > release candidate. The alternative option is to revert the zen4
> > backports or leave the costs broken.
> >
> > Thanks,
> > Richard.
> >
> >         PR tree-optimization/115843
> >         * config/i386/x86-tune-costs.h (znver4_cost): Update unaligned
> >         load and store cost from the aligned costs.
> > ---
> >  gcc/config/i386/x86-tune-costs.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/gcc/config/i386/x86-tune-costs.h b/gcc/config/i386/x86-tune-costs.h
> > index a933794ed50..2ac75c35aee 100644
> > --- a/gcc/config/i386/x86-tune-costs.h
> > +++ b/gcc/config/i386/x86-tune-costs.h
> > @@ -1924,8 +1924,8 @@ struct processor_costs znver4_cost = {
> >                                            in 32bit, 64bit, 128bit, 256bit and 512bit */
> >    {8, 8, 8, 12, 12},                      /* cost of storing SSE register
> >                                            in 32bit, 64bit, 128bit, 256bit and 512bit */
> > -  {6, 6, 6, 6, 6},                        /* cost of unaligned loads. */
> > -  {8, 8, 8, 8, 8},                        /* cost of unaligned stores. */
> > +  {6, 6, 10, 10, 12},                     /* cost of unaligned loads. */
> > +  {8, 8, 8, 12, 12},                      /* cost of unaligned stores. */
> >    2, 2, 2,                                /* cost of moving XMM,YMM,ZMM
> >                                            register. */
> >    6,                                      /* cost of moving SSE register to integer. */
> > --
> > 2.35.3
>

--
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
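
To illustrate the cost reasoning discussed in this thread, here is a
minimal toy sketch in C. It is not GCC code. The names `toy_costs`,
`align_kind`, `load_cost` and `misaligning_looks_cheaper` are
hypothetical, and the aligned load row is assumed to be
{6, 6, 10, 10, 12}, inferred from the statement that the patch makes
the unaligned costs equal to the aligned ones. The `bias` parameter
models the "+1 for known-unaligned" idea mentioned above, which is only
a suggestion in the mail and not something the patch implements.

```c
#include <assert.h>
#include <stdbool.h>

/* Index into the per-width rows: 32, 64, 128, 256 and 512 bit. */
enum width { W32, W64, W128, W256, W512, NWIDTHS };

/* Hypothetical classification of what is known about an access. */
enum align_kind { KNOWN_ALIGNED, UNKNOWN_ALIGNMENT, KNOWN_MISALIGNED };

/* Toy stand-in for the two load rows of a tuning table. */
struct toy_costs {
  int aligned_load[NWIDTHS];
  int unaligned_load[NWIDTHS];
};

/* Unaligned row as it was before the patch (from the '-' diff line). */
static const struct toy_costs znver4_before = {
  {6, 6, 10, 10, 12},   /* aligned loads (assumed, see lead-in) */
  {6, 6, 6, 6, 6}       /* unaligned loads, old values */
};

/* Unaligned row after the patch (from the '+' diff line). */
static const struct toy_costs znver4_after = {
  {6, 6, 10, 10, 12},   /* aligned loads (assumed, see lead-in) */
  {6, 6, 10, 10, 12}    /* unaligned loads, new values */
};

/* Cost of one vector load.  BIAS is added only when the access is
   known to be misaligned; pass 0 to model the patch as posted. */
static int load_cost(const struct toy_costs *c, enum width w,
                     enum align_kind kind, int bias)
{
  assert(w < NWIDTHS);
  switch (kind) {
    case KNOWN_ALIGNED:    return c->aligned_load[w];
    case KNOWN_MISALIGNED: return c->unaligned_load[w] + bias;
    default:               return c->unaligned_load[w];
  }
}

/* True when the table makes a known-misaligned load look strictly
   cheaper than an aligned one, which is the situation that rewards
   deliberately mis-aligning an access. */
static bool misaligning_looks_cheaper(const struct toy_costs *c,
                                      enum width w, int bias)
{
  return load_cost(c, w, KNOWN_MISALIGNED, bias)
         < load_cost(c, w, KNOWN_ALIGNED, bias);
}
```

With the old row, `misaligning_looks_cheaper(&znver4_before, W256, 0)`
is true (6 < 10). With the new row it is false for every width, and a
bias of 1 would additionally make the known-misaligned case strictly
more expensive than the aligned one.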