Currently unaligned YMM and ZMM load and store costs are cheaper than
aligned which causes the vectorizer to purposely mis-align accesses
by adding an alignment prologue.  It looks like the unaligned costs
were simply left untouched from znver3 where they equate the aligned
costs when tweaking aligned costs for znver4.  The following makes
the unaligned costs equal to the aligned costs.

This avoids the miscompile seen in PR115843 but it's of course not
a real fix for the issue uncovered there.  But it makes it qualify
as a regression fix.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

OK for trunk and affected branches?  It also affects the gcc11 branch
where znver4 support/costs are new for 11.5 and thus it affects the
release candidate.  The alternative option is to revert the zen4
backports or leave the costs broken.

Thanks,
Richard.

        PR tree-optimization/115843
        * config/i386/x86-tune-costs.h (znver4_cost): Update unaligned
        load and store cost from the aligned costs.
---
 gcc/config/i386/x86-tune-costs.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/x86-tune-costs.h b/gcc/config/i386/x86-tune-costs.h
index a933794ed50..2ac75c35aee 100644
--- a/gcc/config/i386/x86-tune-costs.h
+++ b/gcc/config/i386/x86-tune-costs.h
@@ -1924,8 +1924,8 @@ struct processor_costs znver4_cost = {
                                           in 32bit, 64bit, 128bit, 256bit and 
512bit */
   {8, 8, 8, 12, 12},                   /* cost of storing SSE register
                                           in 32bit, 64bit, 128bit, 256bit and 
512bit */
-  {6, 6, 6, 6, 6},                     /* cost of unaligned loads.  */
-  {8, 8, 8, 8, 8},                     /* cost of unaligned stores.  */
+  {6, 6, 10, 10, 12},                  /* cost of unaligned loads.  */
+  {8, 8, 8, 12, 12},                   /* cost of unaligned stores.  */
   2, 2, 2,                             /* cost of moving XMM,YMM,ZMM
                                           register.  */
   6,                                   /* cost of moving SSE register to 
integer.  */
-- 
2.35.3

Reply via email to