https://gcc.gnu.org/g:f78eb9524bd97679c8baa47a62e82147272719ae
commit r12-10636-gf78eb9524bd97679c8baa47a62e82147272719ae
Author: Richard Biener <rguent...@suse.de>
Date:   Mon Jul 15 13:01:24 2024 +0200

    Fixup unaligned load/store cost for znver4

    Currently unaligned YMM and ZMM load and store costs are cheaper than
    the aligned ones, which causes the vectorizer to purposely mis-align
    accesses by adding an alignment prologue.  It looks like the unaligned
    costs were simply left untouched from znver3, where they equal the
    aligned costs, when the aligned costs were tweaked for znver4.  The
    following makes the unaligned costs equal to the aligned costs.

    This avoids the miscompile seen in PR115843, but it is of course not a
    real fix for the issue uncovered there.  It does make it qualify as a
    regression fix, though.

            PR tree-optimization/115843
            * config/i386/x86-tune-costs.h (znver4_cost): Update unaligned
            load and store cost from the aligned costs.

    (cherry picked from commit 1e3aa9c9278db69d4bdb661a750a7268789188d6)

Diff:
---
 gcc/config/i386/x86-tune-costs.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/x86-tune-costs.h b/gcc/config/i386/x86-tune-costs.h
index f105d57cae79..d58827888994 100644
--- a/gcc/config/i386/x86-tune-costs.h
+++ b/gcc/config/i386/x86-tune-costs.h
@@ -1894,8 +1894,8 @@ struct processor_costs znver4_cost = {
 					   in 32bit, 64bit, 128bit, 256bit and 512bit */
   {8, 8, 8, 12, 12},			/* cost of storing SSE register
 					   in 32bit, 64bit, 128bit, 256bit and 512bit */
-  {6, 6, 6, 6, 6},			/* cost of unaligned loads.  */
-  {8, 8, 8, 8, 8},			/* cost of unaligned stores.  */
+  {6, 6, 10, 10, 12},			/* cost of unaligned loads.  */
+  {8, 8, 8, 12, 12},			/* cost of unaligned stores.  */
   2, 2, 2,				/* cost of moving XMM,YMM,ZMM register.  */
   6,					/* cost of moving SSE register to integer.  */
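For context, the following is a minimal, self-contained C sketch of why a cost
table whose unaligned entries are cheaper than the aligned ones can push a
vectorizer-style cost model into deliberately mis-aligning accesses.  It is
not GCC's actual vectorizer code: the struct, the peel_to_misalign_p helper,
and the decision logic are invented for illustration.  The 256-bit (YMM) load
costs used below come from the diff above: unaligned was 6 before the fix, and
since the commit makes the unaligned costs equal the aligned ones, the aligned
(and new unaligned) cost is 10.

/* Sketch only: illustrates the cost comparison, not GCC's real heuristic.  */
#include <stdio.h>

struct vec_costs
{
  int aligned_load;	/* cost of an aligned 256-bit (YMM) load    */
  int unaligned_load;	/* cost of an unaligned 256-bit (YMM) load  */
};

/* A naive cost model prefers to mis-align the vector accesses (e.g. by
   adding an alignment prologue that shifts the main loop off alignment)
   whenever the unaligned load is costed cheaper than the aligned one.  */
static int
peel_to_misalign_p (const struct vec_costs *c)
{
  return c->unaligned_load < c->aligned_load;
}

int
main (void)
{
  struct vec_costs before = { 10, 6 };	/* znver4 YMM load costs pre-fix   */
  struct vec_costs after = { 10, 10 };	/* znver4 YMM load costs post-fix  */

  printf ("before fix: prefer mis-aligned accesses? %s\n",
	  peel_to_misalign_p (&before) ? "yes (bogus)" : "no");
  printf ("after fix:  prefer mis-aligned accesses? %s\n",
	  peel_to_misalign_p (&after) ? "yes (bogus)" : "no");
  return 0;
}

With the pre-fix costs the comparison wrongly favours mis-aligned accesses;
once unaligned and aligned costs are equal there is no incentive to add the
alignment prologue, which is the behaviour the commit restores.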