On Tue, 16 Jul 2024, Richard Biener wrote:

> On Mon, 15 Jul 2024, Jan Hubicka wrote:
> 
> > > Currently unaligned YMM and ZMM load and store costs are cheaper than
> > > aligned which causes the vectorizer to purposely mis-align accesses
> > > by adding an alignment prologue.  It looks like the unaligned costs
> > > were simply left untouched from znver3 where they equate the aligned
> > > costs when tweaking aligned costs for znver4.  The following makes
> > > the unaligned costs equal to the aligned costs.
> > > 
> > > This avoids the miscompile seen in PR115843 but it's of course not
> > > a real fix for the issue uncovered there.  But it makes it qualify
> > > as a regression fix.
> > > 
> > > Bootstrap & regtest running on x86_64-unknown-linux-gnu.
> > > 
> > > OK for trunk and affected branches?  It also affects the gcc11 branch
> > 
> > Looks good to me.  I think it was my omission.  I should remember that
> > the costs are there multiple times.
> 
> Well, they are not the same but aligned vs. unaligned.  I think we decided
> that on x86 doing an alignment prologue isn't usually helpful which means
> assigning the same cost to both aligned and unaligned stores/loads.
> Usually 'unaligned' means unknown alignment but in this particular 
> instance it's also used for _known_ unaligned - the cost hook gets that
> distinction passed and we could for example bias that case by a +1 though
> I hardly believe that's going to make a difference (but it would possibly
> discourage mis-aligning a known aligned access like what happened with
> deepsjeng).
> 
> > Maybe wait for SPEC tester before backporting to branches?
> 
> I've pushed it to trunk now and am running local CPU 2017 to check for
> obvious fallout on Zen4 so we can make 14.2 RC early next week.  There's
> still the question of GCC 11.5 which got the backport of zen4 support
> with these "wrong" costs but RC1 was already last week and we're set
> to release on Friday.

There were no surprises with SPEC CPU 2017 on the 14 branch with this fix
so I've pushed it there, too, now.
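
To make the cost argument quoted above concrete, here is a toy sketch of
why a cheaper unaligned cost invites deliberate mis-alignment, and what
equal costs or a +1 bias for the known-unaligned case would change.  This
is not the GCC cost hook or its data structures: VecCosts, Misalign,
unaligned_load_cost and prefers_misaligned are hypothetical names, and the
cost numbers are invented for illustration only.

```cpp
// Toy model only: hypothetical names and made-up numbers, not GCC code.
struct VecCosts {
  int aligned_load;    // cost of an aligned vector load
  int unaligned_load;  // cost of an unaligned vector load
};

// What a cost query might know about the access.
enum class Misalign { Unknown, KnownUnaligned };

// Cost of one unaligned load; known_bias is added only when the access
// is known to be unaligned (the "+1" idea from the discussion).
constexpr int unaligned_load_cost(const VecCosts &c, Misalign m,
                                  int known_bias = 0) {
  return c.unaligned_load + (m == Misalign::KnownUnaligned ? known_bias : 0);
}

// True when a purely cost-driven choice would rather turn a known
// aligned access into a known unaligned one (strictly cheaper only).
constexpr bool prefers_misaligned(const VecCosts &c, int known_bias = 0) {
  return unaligned_load_cost(c, Misalign::KnownUnaligned, known_bias)
         < c.aligned_load;
}

// Assumed numbers: "stale" has unaligned cheaper than aligned,
// "fixed" makes both equal.
constexpr VecCosts stale{/*aligned_load=*/10, /*unaligned_load=*/8};
constexpr VecCosts fixed{/*aligned_load=*/10, /*unaligned_load=*/10};

static_assert(prefers_misaligned(stale), "cheaper unaligned wins");
static_assert(!prefers_misaligned(fixed), "equal costs: no incentive");
static_assert(!prefers_misaligned(fixed, 1), "+1 bias: still no incentive");
```

In this model equal costs already remove the incentive, because the
comparison is strict; the +1 bias only turns the tie into a strict
preference for the aligned form.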

Richard.
