> Sorry for the late notice, but this regressed memcpy-1.c for n32 and n64
> on mips64-linux-gnu.  I realise memcpy-1.c isn't viewed as being a proper
> SRA test, but in this case I think it really is showing a genuine problem.
> We have:
>
>     struct a {int a,b,c;} a;
>     int test(struct a a)
>     {
>     struct a nasty_local;
>     __builtin_memcpy (&nasty_local,&a, sizeof(a));
>     return nasty_local.a;
>     }
>
> We apply LOCAL_ALIGNMENT to nasty_local during estimated_stack_frame_size,
> so we have a VAR_DECL (nasty_local) with 64-bit alignment and a PARM_DECL
> (a) with 32-bit alignment.  This fails the condition:
>
>       if (STRICT_ALIGNMENT
>         && tree_non_aligned_mem_p (rhs, get_object_alignment (lhs)))
>         lacc->grp_unscalarizable_region = 1;
>
> because LHS has 64-bit alignment but RHS has 32-bit alignment.

Do you mean that the patch pessimizes this case?  If so, yes, this is known, I 
ran into similar cases in Ada (we do this kind of local alignment promotion).

The patch is conservative but, given the number of problems we have with SRA on 
strict-alignment platforms, we need to draw a line somewhere.  But if you think 
that optimizing this kind of memcpy calls is worth the hassle, then let's try 
harder.  We have:

 MEM[(char * {ref-all})&nasty_local] = MEM[(char * {ref-all})&a];

and SRA now refuses to scalarize this on the ground that the LHS has 64-bit 
alignment so SRA could generate a 64-bit aligned load-store, which would break 
since the RHS only has 32-bit alignment.  This is meant for cases like:

struct a {int a,b,c;} a;
struct __attribute__((packed)) b { struct a f; }

int test(struct b b)
{
struct a nasty_local;
__builtin_memcpy (&nasty_local, &b.f, sizeof(a));
return nasty_local.a;
}

where you cannot scalarize (even without alignment promotion).  I can propose 
an alignment cap to that of the access type:

Index: tree-sra.c
===================================================================
--- tree-sra.c  (revision 182780)
+++ tree-sra.c  (working copy)
@@ -1124,7 +1124,9 @@ build_accesses_from_assign (gimple stmt)
     {
       lacc->grp_assignment_write = 1;
       if (STRICT_ALIGNMENT
-         && tree_non_aligned_mem_p (rhs, get_object_alignment (lhs)))
+         && tree_non_aligned_mem_p (rhs,
+                                    MIN (TYPE_ALIGN (lacc->type),
+                                         get_object_alignment (lhs))))
         lacc->grp_unscalarizable_region = 1;
     }

@@ -1135,7 +1137,9 @@ build_accesses_from_assign (gimple stmt)
          && !is_gimple_reg_type (racc->type))
        bitmap_set_bit (should_scalarize_away_bitmap, DECL_UID (racc->base));
       if (STRICT_ALIGNMENT
-         && tree_non_aligned_mem_p (lhs, get_object_alignment (rhs)))
+         && tree_non_aligned_mem_p (lhs,
+                                    MIN (TYPE_ALIGN (racc->type),
+                                         get_object_alignment (rhs))))
         racc->grp_unscalarizable_region = 1;
     }

on the grounds that sub-accesses shouldn't be more aligned.  I think this 
should be OK but, well...

-- 
Eric Botcazou

Reply via email to