Hi!

Apparently estimated_stack_frame_size -> expand_one_var -> add_stack_var
-> align_local_variable changes DECL_ALIGN of stack variables from
LOCAL_DECL_ALIGNMENT.

That is unfortunately very undesirable for offloading, because this happens
early during IPA where we stream out LTO bytecode for the target offloading
after it, so on the PR90811 testcase x86_64 backend says a local variable
would be best 128-bit aligned, but in the PTX backend such alignment means
runtime (virtual) stack adjustments because only 64-bit alignment is
guaranteed.

The following patch makes sure we don't update DECL_ALIGN during stack frame
size estimation, just during actual expansion.

Bootstrapped/regtested on x86_64-linux and i686-linux and furthermore tested
with x86_64-linux to nvptx-none offloading with the nvptx.c PR90811 fix
reverted to verify the overaligned variables are gone.  Ok for trunk?

Or, is there something in GIMPLE passes that would benefit from the updated
DECL_ALIGN during post-IPA optimizations, rather than just in RTL passes?
If yes, we should call those somewhere post-IPA on top of this patch.
Nothing in the testsuite showed it though.

2019-06-12  Jakub Jelinek  <ja...@redhat.com>

        PR target/90811
        * cfgexpand.c (align_local_variable): Add really_expand argument,
        don't SET_DECL_ALIGN if it is false.
        (add_stack_var): Add really_expand argument, pass it through to
        align_local_variable.
        (expand_one_stack_var_1): Pass true as really_expand to
        align_local_variable.
        (expand_one_ssa_partition): Pass true as really_expand to
        add_stack_var.
        (expand_one_var): Pass really_expand through to add_stack_var.

--- gcc/cfgexpand.c.jj  2019-05-20 11:39:34.059816432 +0200
+++ gcc/cfgexpand.c     2019-06-11 11:28:08.720932421 +0200
@@ -361,7 +361,7 @@ static bool has_short_buffer;
    we can't do with expected alignment of the stack boundary.  */
 
 static unsigned int
-align_local_variable (tree decl)
+align_local_variable (tree decl, bool really_expand)
 {
   unsigned int align;
 
@@ -370,7 +370,8 @@ align_local_variable (tree decl)
   else
     {
       align = LOCAL_DECL_ALIGNMENT (decl);
-      SET_DECL_ALIGN (decl, align);
+      if (really_expand)
+       SET_DECL_ALIGN (decl, align);
     }
   return align / BITS_PER_UNIT;
 }
@@ -418,7 +419,7 @@ alloc_stack_frame_space (poly_int64 size
 /* Accumulate DECL into STACK_VARS.  */
 
 static void
-add_stack_var (tree decl)
+add_stack_var (tree decl, bool really_expand)
 {
   struct stack_var *v;
 
@@ -446,7 +447,7 @@ add_stack_var (tree decl)
      variables that are simultaneously live.  */
   if (known_eq (v->size, 0U))
     v->size = 1;
-  v->alignb = align_local_variable (decl);
+  v->alignb = align_local_variable (decl, really_expand);
   /* An alignment of zero can mightily confuse us later.  */
   gcc_assert (v->alignb != 0);
 
@@ -1323,7 +1324,7 @@ expand_one_stack_var_1 (tree var)
   else
     {
       size = tree_to_poly_uint64 (DECL_SIZE_UNIT (var));
-      byte_align = align_local_variable (var);
+      byte_align = align_local_variable (var, true);
     }
 
   /* We handle highly aligned variables in expand_stack_vars.  */
@@ -1413,7 +1414,7 @@ expand_one_ssa_partition (tree var)
   if (!use_register_for_decl (var))
     {
       if (defer_stack_allocation (var, true))
-       add_stack_var (var);
+       add_stack_var (var, true);
       else
        expand_one_stack_var_1 (var);
       return;
@@ -1695,7 +1696,7 @@ expand_one_var (tree var, bool toplevel,
        }
     }
   else if (defer_stack_allocation (var, toplevel))
-    add_stack_var (origvar);
+    add_stack_var (origvar, really_expand);
   else
     {
       if (really_expand)

        Jakub

Reply via email to