On Thu, 7 Dec 2017, Martin Jambor wrote: > Hi, > > On Wed, Dec 06 2017, Richard Biener wrote: > > On Wed, 6 Dec 2017, Martin Jambor wrote: > > >> ... > > >> Second is the testcase I described in my previous email. When I saw > >> > >> FAIL: gcc.dg/guality/pr54970.c -O1 line 31 a[0] == 4 > >> > >> At all optimization levels, I grumbled about Jakub being right again and > >> duly decided to bite the bullet and do what he asked me to because it > >> fixes the issue. But if you allow me to XFAIL the guality test, I will > >> happily remove the whole padding detection, I don't really like it > >> either. > >> > >> The debug information is apparently lost because a[0] is never used from > >> that point on, as opposed to a[1] and a[2] for which the guality test > >> still passes. > > > > XFAILing that is fine I think. > > > > Great, the updated and re-tested patch is below. The only problem is > that I did not figure out how to XFAIL a dg-final test only for > optimized runs and so it now XPASSes at -O0. Alternatively I can make > a[0] not dead in the test, but that would hide the new regression which > seems worse.
Works for me. One could duplicate the test and dg-skip-if one for -O0 and one for anything besides -O0 ... > >> ... > > >> But let me emphasize again that whenever correctness is the issue, the > >> question whether an SRA recorded access comes from total scalarization > >> or not is not important. Users accessing the data in some other part of > >> the function can create them too. Users equipped with placement new can > >> create all sorts of weird type accesses and because tree-sra is flow > >> insensitive, they can then force scalarization to such replacements even > >> at places where the data have wildly different types. > > > > Yes, but SRA will never create loads or stores for the non-total > > scalarization case it will only choose one (better non-FP if not > > all accesses are FP - I think compare_access_positions guarantees that) > > scalar register for each replacement, right? > > Yes. My point was just that with placement new, the same aggregate decl > can contain wildly different data in two different places of a function, > and SRA might pick some from the first place and use it in the other. > Thus, any testcase that miscompiles with total scalarization can be > extended to one that miscompiles without it. I doubt that - do you have something specific in mind? I think placement-new also emits a CLOBBER, how does SRA treat that at the moment? > > > > Basically it will replace _all_ accesses of a memory piece with > > a register instead, making that memory piece unused? > > Yes. By the way, given that we are about to consider assignments with > type-changing MEM_REFs fragile and will not delete them, the aggregate > will not go away and that is why I added back the bit setting to > cannot_scalarize_away bitmap. After all, that is exactly what the > bitmap is for, don't bother totally scalarizing, the aggregate will not > disappear. Good. > Below is the updated and quite a bit simpler patch, which has passed > bootstrap and testing on x86_64-linux (but suffers from the XFAILs and > XPASSes dewscribed above). Ok. Thanks, Richard. > Martin > > > 2017-12-06 Martin Jambor <mjam...@suse.cz> > > PR tree-optimization/83141 > * tree-sra.c (contains_vce_or_bfcref_p): Move up in the file, also > test for MEM_REFs implicitely changing types with padding. Remove > inline keyword. > (build_accesses_from_assign): Added contains_vce_or_bfcref_p checks. > > testsuite/ > * gcc.dg/tree-ssa/pr83141.c: New test. > * gcc.dg/guality/pr54970.c: XFAIL tests querying a[0]. > --- > gcc/testsuite/gcc.dg/guality/pr54970.c | 10 +++--- > gcc/testsuite/gcc.dg/tree-ssa/pr83141.c | 37 ++++++++++++++++++++++ > gcc/tree-sra.c | 54 > +++++++++++++++++++++------------ > 3 files changed, 77 insertions(+), 24 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr83141.c > > diff --git a/gcc/testsuite/gcc.dg/guality/pr54970.c > b/gcc/testsuite/gcc.dg/guality/pr54970.c > index a9b8c064d8b..1819d023e21 100644 > --- a/gcc/testsuite/gcc.dg/guality/pr54970.c > +++ b/gcc/testsuite/gcc.dg/guality/pr54970.c > @@ -24,23 +24,23 @@ main () > /* { dg-final { gdb-test 25 "*p" "13" } } */ > asm volatile (NOP); /* { dg-final { gdb-test 25 "*q" "12" } > } */ > __builtin_memcpy (&a, (int [3]) { 4, 5, 6 }, sizeof (a)); > - /* { dg-final { gdb-test 31 "a\[0\]" "4" } } */ > + /* { dg-final { gdb-test 31 "a\[0\]" "4" { > xfail { *-*-* } } } } */ > /* { dg-final { gdb-test 31 "a\[1\]" "5" } } */ > /* { dg-final { gdb-test 31 "a\[2\]" "6" } } */ > /* { dg-final { gdb-test 31 "*p" "6" } } */ > asm volatile (NOP); /* { dg-final { gdb-test 31 "*q" "5" } > } */ > - *p += 20; /* { dg-final { gdb-test 36 "a\[0\]" "4" } } */ > + *p += 20; /* { dg-final { gdb-test 36 "a\[0\]" "4" { > xfail { *-*-* } } } } */ > /* { dg-final { gdb-test 36 "a\[1\]" "5" } } */ > /* { dg-final { gdb-test 36 "a\[2\]" "26" } } */ > /* { dg-final { gdb-test 36 "*p" "26" } } */ > asm volatile (NOP); /* { dg-final { gdb-test 36 "*q" "5" } > } */ > - *q += 20; /* { dg-final { gdb-test 45 "a\[0\]" "4" } } */ > + *q += 20; /* { dg-final { gdb-test 45 "a\[0\]" "4" { > xfail { *-*-* } } } } */ > /* { dg-final { gdb-test 45 "a\[1\]" "25" } } */ > /* { dg-final { gdb-test 45 "a\[2\]" "26" } } */ > /* { dg-final { gdb-test 45 "*p" "26" } } */ > /* { dg-final { gdb-test 45 "p\[-1\]" "25" } } > */ > - /* { dg-final { gdb-test 45 "p\[-2\]" "4" } } */ > - /* { dg-final { gdb-test 45 "q\[-1\]" "4" } } */ > + /* { dg-final { gdb-test 45 "p\[-2\]" "4" { > xfail { *-*-* } } } } */ > + /* { dg-final { gdb-test 45 "q\[-1\]" "4" { > xfail { *-*-* } } } } */ > /* { dg-final { gdb-test 45 "q\[1\]" "26" } } */ > asm volatile (NOP); /* { dg-final { gdb-test 45 "*q" "25" } > } */ > return 0; > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83141.c > b/gcc/testsuite/gcc.dg/tree-ssa/pr83141.c > new file mode 100644 > index 00000000000..73ea45c613c > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83141.c > @@ -0,0 +1,37 @@ > +/* { dg-do run } */ > +/* { dg-options "-O -fdump-tree-esra-details" } */ > + > +volatile short vs; > +volatile long vl; > + > +struct A { short s; long i; long j; }; > +struct A a, b; > +void foo () > +{ > + struct A c; > + __builtin_memcpy (&c, &b, sizeof (struct A)); > + __builtin_memcpy (&a, &c, sizeof (struct A)); > + > + vs = c.s; > + vl = c.i; > + vl = c.j; > +} > + > + > +int main() > +{ > + if ((sizeof (short) != 2) > + || (__builtin_offsetof (struct A, i) < 4)) > + return 0; > + > + __builtin_memset (&b, 0, sizeof (struct A)); > + b.s = 1; > + __builtin_memcpy ((char *)&b+2, &b, 2); > + foo (); > + __builtin_memcpy (&a, (char *)&a+2, 2); > + if (a.s != 1) > + __builtin_abort (); > + return 0; > +} > + > +/* { dg-final { scan-tree-dump-not "Will attempt to totally scalarize" > "esra" } } */ > diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c > index 866cff0edb0..54f1c8d54d5 100644 > --- a/gcc/tree-sra.c > +++ b/gcc/tree-sra.c > @@ -1141,6 +1141,33 @@ contains_view_convert_expr_p (const_tree ref) > return false; > } > > +/* Return true if REF contains a VIEW_CONVERT_EXPR or a MEM_REF that performs > + type conversion or a COMPONENT_REF with a bit-field field declaration. */ > + > +static bool > +contains_vce_or_bfcref_p (const_tree ref) > +{ > + while (handled_component_p (ref)) > + { > + if (TREE_CODE (ref) == VIEW_CONVERT_EXPR > + || (TREE_CODE (ref) == COMPONENT_REF > + && DECL_BIT_FIELD (TREE_OPERAND (ref, 1)))) > + return true; > + ref = TREE_OPERAND (ref, 0); > + } > + > + if (TREE_CODE (ref) != MEM_REF > + || TREE_CODE (TREE_OPERAND (ref, 0)) != ADDR_EXPR) > + return false; > + > + tree mem = TREE_OPERAND (TREE_OPERAND (ref, 0), 0); > + if (TYPE_MAIN_VARIANT (TREE_TYPE (ref)) > + != TYPE_MAIN_VARIANT (TREE_TYPE (mem))) > + return true; > + > + return false; > +} > + > /* Search the given tree for a declaration by skipping handled components and > exclude it from the candidates. */ > > @@ -1339,7 +1366,14 @@ build_accesses_from_assign (gimple *stmt) > racc->grp_assignment_read = 1; > if (should_scalarize_away_bitmap && !gimple_has_volatile_ops (stmt) > && !is_gimple_reg_type (racc->type)) > - bitmap_set_bit (should_scalarize_away_bitmap, DECL_UID (racc->base)); > + { > + if (contains_vce_or_bfcref_p (rhs)) > + bitmap_set_bit (cannot_scalarize_away_bitmap, > + DECL_UID (racc->base)); > + else > + bitmap_set_bit (should_scalarize_away_bitmap, > + DECL_UID (racc->base)); > + } > if (storage_order_barrier_p (lhs)) > racc->grp_unscalarizable_region = 1; > } > @@ -3416,24 +3450,6 @@ get_repl_default_def_ssa_name (struct access *racc) > return get_or_create_ssa_default_def (cfun, racc->replacement_decl); > } > > -/* Return true if REF has an VIEW_CONVERT_EXPR or a COMPONENT_REF with a > - bit-field field declaration somewhere in it. */ > - > -static inline bool > -contains_vce_or_bfcref_p (const_tree ref) > -{ > - while (handled_component_p (ref)) > - { > - if (TREE_CODE (ref) == VIEW_CONVERT_EXPR > - || (TREE_CODE (ref) == COMPONENT_REF > - && DECL_BIT_FIELD (TREE_OPERAND (ref, 1)))) > - return true; > - ref = TREE_OPERAND (ref, 0); > - } > - > - return false; > -} > - > /* Examine both sides of the assignment statement pointed to by STMT, replace > them with a scalare replacement if there is one and generate copying of > replacements if scalarized aggregates have been used in the assignment. > GSI > -- Richard Biener <rguent...@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)