On Fri, Nov 24, 2017 at 11:57 AM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Fri, Nov 24, 2017 at 11:31 AM, Richard Biener
> <richard.guent...@gmail.com> wrote:
>> On Thu, Nov 23, 2017 at 4:32 PM, Martin Jambor <mjam...@suse.cz> wrote:
>>> Hi,
>>>
>>> On Mon, Nov 13 2017, Richard Biener wrote:
>>>> The main concern here is that GIMPLE is not very well defined for
>>>> aggregate copies and that gimple-fold.c happily optimizes
>>>> memcpy (&a, &b, sizeof (a)) into a = b;
>>>>
>>>> struct A { short s; long i; long j; };
>>>> struct A a, b;
>>>> void foo ()
>>>> {
>>>>   __builtin_memcpy (&a, &b, sizeof (struct A));
>>>> }
>>>>
>>>> gets folded to
>>>>
>>>>   MEM[(char * {ref-all})&a] = MEM[(char * {ref-all})&b];
>>>>   return;
>>>>
>>>> you see we're careful about TBAA but (don't see that above but
>>>> can be verified by for example debugging expand_assignment)
>>>> TREE_TYPE (MEM[...]) is actually 'struct A'.
>>>>
>>>> And yes, I've been worried about SRA as well here...  it _does_
>>>> have some early outs when seeing VIEW_CONVERT_EXPR but
>>>> appearantly not for the above.  Testcase that aborts with SRA but
>>>> not without:
>>>>
>>>> struct A { short s; long i; long j; };
>>>> struct A a, b;
>>>> void foo ()
>>>> {
>>>>   struct A c;
>>>>   __builtin_memcpy (&c, &b, sizeof (struct A));
>>>>   __builtin_memcpy (&a, &c, sizeof (struct A));
>>>> }
>>>> int main()
>>>> {
>>>>   __builtin_memset (&b, 0, sizeof (struct A));
>>>>   b.s = 1;
>>>>   __builtin_memcpy ((char *)&b+2, &b, 2);
>>>>   foo ();
>>>>   __builtin_memcpy (&a, (char *)&a+2, 2);
>>>>   if (a.s != 1)
>>>>     __builtin_abort ();
>>>>   return 0;
>>>> }
>>>
>>> Thanks for the testcase, I agree that is a fairly big problem.  Do you
>>> think that the following (untested) patch is an appropriate way of
>>> fixing it and generally of extending gimple to capture that a statement
>>> is a bit-copy?
>>
>> I think the place to fix is the memcpy folding.  That is, we'd say that
>> aggregate assignments are not bit-copies but do element-wise assignments.
>> For memcpy folding we'd then need to use a type that doesn't contain
>> padding.  Which effectively means char[].
>>
>> Of course we need to stop SRA from decomposing that copy to
>> individual characters then ;)
>>
>> So iff we decide that all aggregate copies are element copies,
>> maybe only those where TYPE_MAIN_VARIANT of lhs and rhs match
>> (currently we allow TYPE_CANONICAL compatibility and thus there
>> might be some mismatches), then we have to fix nothign but
>> the memcpy folding.
>>
>>> If so, I'll add the testcase, bootstrap it and formally propose it.
>>> Subsequently I will of course make sure that any element-wise copying
>>> patch would test the predicate.
>>
>> I don't think the alias-set should determine whether a copy is
>> bit-wise or not.
>
> Like the attached.  At least FAILs
>
> FAIL: gcc.dg/tree-ssa/ssa-ccp-27.c scan-tree-dump-times ccp1
> "memcpy[^\n]*123456" 2 (found 0 times)
>
> not sure why we have this test.

Hum.  And SRA still decomposes the copy to struct elements w/o padding
even though the access is done using char[].  So somehow it ignores
VIEW_CONVERT_EXPRs
(well, those implicitely present on MEM_REFs).

Looks like this is because total scalarization is done on the decls
and not at all
honoring how the variable is accessed?  The following seems to fix
that, otherwise untested.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c      (revision 255137)
+++ gcc/tree-sra.c      (working copy)
@@ -1338,7 +1338,9 @@ build_accesses_from_assign (gimple *stmt
     {
       racc->grp_assignment_read = 1;
       if (should_scalarize_away_bitmap && !gimple_has_volatile_ops (stmt)
-         && !is_gimple_reg_type (racc->type))
+         && !is_gimple_reg_type (racc->type)
+         && (TYPE_MAIN_VARIANT (racc->type)
+             == TYPE_MAIN_VARIANT (TREE_TYPE (racc->base))))
        bitmap_set_bit (should_scalarize_away_bitmap, DECL_UID (racc->base));
       if (storage_order_barrier_p (lhs))
        racc->grp_unscalarizable_region = 1;


I'm giving this full testing with the folding fix.

Richard.

> Richard.

Reply via email to