> Sorry for the late notice, but this regressed memcpy-1.c for n32 and n64 > on mips64-linux-gnu. I realise memcpy-1.c isn't viewed as being a proper > SRA test, but in this case I think it really is showing a genuine problem. > We have: > > struct a {int a,b,c;} a; > int test(struct a a) > { > struct a nasty_local; > __builtin_memcpy (&nasty_local,&a, sizeof(a)); > return nasty_local.a; > } > > We apply LOCAL_ALIGNMENT to nasty_local during estimated_stack_frame_size, > so we have a VAR_DECL (nasty_local) with 64-bit alignment and a PARM_DECL > (a) with 32-bit alignment. This fails the condition: > > if (STRICT_ALIGNMENT > && tree_non_aligned_mem_p (rhs, get_object_alignment (lhs))) > lacc->grp_unscalarizable_region = 1; > > because LHS has 64-bit alignment but RHS has 32-bit alignment.
Do you mean that the patch pessimizes this case? If so, yes, this is known, I ran into similar cases in Ada (we do this kind of local alignment promotion). The patch is conservative but, given the number of problems we have with SRA on strict-alignment platforms, we need to draw a line somewhere. But if you think that optimizing this kind of memcpy calls is worth the hassle, then let's try harder. We have: MEM[(char * {ref-all})&nasty_local] = MEM[(char * {ref-all})&a]; and SRA now refuses to scalarize this on the ground that the LHS has 64-bit alignment so SRA could generate a 64-bit aligned load-store, which would break since the RHS only has 32-bit alignment. This is meant for cases like: struct a {int a,b,c;} a; struct __attribute__((packed)) b { struct a f; } int test(struct b b) { struct a nasty_local; __builtin_memcpy (&nasty_local, &b.f, sizeof(a)); return nasty_local.a; } where you cannot scalarize (even without alignment promotion). I can propose an alignment cap to that of the access type: Index: tree-sra.c =================================================================== --- tree-sra.c (revision 182780) +++ tree-sra.c (working copy) @@ -1124,7 +1124,9 @@ build_accesses_from_assign (gimple stmt) { lacc->grp_assignment_write = 1; if (STRICT_ALIGNMENT - && tree_non_aligned_mem_p (rhs, get_object_alignment (lhs))) + && tree_non_aligned_mem_p (rhs, + MIN (TYPE_ALIGN (lacc->type), + get_object_alignment (lhs)))) lacc->grp_unscalarizable_region = 1; } @@ -1135,7 +1137,9 @@ build_accesses_from_assign (gimple stmt) && !is_gimple_reg_type (racc->type)) bitmap_set_bit (should_scalarize_away_bitmap, DECL_UID (racc->base)); if (STRICT_ALIGNMENT - && tree_non_aligned_mem_p (lhs, get_object_alignment (rhs))) + && tree_non_aligned_mem_p (lhs, + MIN (TYPE_ALIGN (racc->type), + get_object_alignment (rhs)))) racc->grp_unscalarizable_region = 1; } on the grounds that sub-accesses shouldn't be more aligned. I think this should be OK but, well... -- Eric Botcazou