https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63483
--- Comment #7 from rguenther at suse dot de <rguenther at suse dot de> --- On Wed, 8 Oct 2014, ubizjak at gmail dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63483 > > --- Comment #6 from Uroš Bizjak <ubizjak at gmail dot com> --- > (In reply to Richard Biener from comment #5) > > The bug is clearly in > > > > " > > > Btw, if the mem is MEM_READONLY_P how can it be part of > > > a {un}aligned_store sequence? > > > > This flag is copied from the original memory operand of the store by > > alpha_set_memflags to all memory operands in the expanded sequence." > > > > and thus should be fixed there. What is the "original" memory reference > > here? It seems for the read-modify-write you should use the target MEM > > but somehow a source MEM gets used? Or rather you should not use either > > MEMs flags for _all_ MEMs in the sequence but properly distinguish between > > MEMs generated for the load of the source and MEMs generated for the store > > to the destination. > > No, this is wrong conclusion. I have new insight into this problem. > > Please consider following test: > > --cut here-- > static char *a; > static int *b; > > void foo (void) > { > a[1] = 1; > b[2] = 1; > } > > int bar (void) > { > return a && b; > } > --cut here-- > > (e.g. changing "static char *b" to "static int *b". This way, the problematic > insn is generated as "native" read: > > (insn 15 14 16 2 (set (reg/f:DI 78) > (mem/u/f/c:DI (lo_sum:DI (reg:DI 79) > (symbol_ref:DI ("b") [flags 0x6] <var_decl 0x2b1d67f5e090 > b>)) > [2 b+0 S8 A64])) rmw1.c:7 -1 > (nil)) > > and this is again moved over > > (insn 13 12 14 2 (set (mem:DI (and:DI (plus:DI (reg/f:DI 72) > (const_int 1 [0x1])) > (const_int -8 [0xfffffffffffffff8])) [0 S8 A64]) > (reg:DI 77)) rmw1.c:6 -1 > (nil)) > > (note that there is no "/u" suffix on the *store* of "a".) > > With this testcase, we have following sequence before sched1 pass: > > 5: r73:DI=high(`a') > 6: r72:DI=[r73:DI+low(`a')] > REG_DEAD r73:DI > 7: r74:QI=0x1 > 8: r76:DI=[r72:DI+0x1&0xfffffffffffffff8] > 9: r75:DI=r72:DI+0x1 > 10: r76:DI=!0xff<<r75:DI<<0x3&r76:DI > 11: r77:DI=zero_extend(r74:QI)<<r75:DI<<0x3 > REG_DEAD r75:DI > REG_DEAD r74:QI > REG_EQUAL 0x1<<r75:DI<<0x3 > 12: r77:DI=r77:DI|r76:DI > REG_DEAD r76:DI > 13: [r72:DI+0x1&0xfffffffffffffff8]=r77:DI > REG_DEAD r77:DI > REG_DEAD r72:DI > 14: r79:DI=high(`b') > 15: r78:DI=[r79:DI+low(`b')] > REG_DEAD r79:DI > 16: r80:SI=0x1 > 17: [r78:DI+0x8]=r80:SI > REG_DEAD r80:SI > REG_DEAD r78:DI > > and sched1 moves (insn 15) all the way up, past (insn 13): > > 5: r73:DI=high(`a') > 14: r79:DI=high(`b') > 7: r74:QI=0x1 > 6: r72:DI=[r73:DI+low(`a')] > REG_DEAD r73:DI > 16: r80:SI=0x1 > 15: r78:DI=[r79:DI+low(`b')] > REG_DEAD r79:DI > 9: r75:DI=r72:DI+0x1 > 8: r76:DI=[r72:DI+0x1&0xfffffffffffffff8] > 11: r77:DI=zero_extend(r74:QI)<<r75:DI<<0x3 > REG_DEAD r75:DI > REG_DEAD r74:QI > REG_EQUAL 0x1<<r75:DI<<0x3 > 10: r76:DI=!0xff<<r75:DI<<0x3&r76:DI > 12: r77:DI=r77:DI|r76:DI > REG_DEAD r76:DI > 13: [r72:DI+0x1&0xfffffffffffffff8]=r77:DI > REG_DEAD r77:DI > REG_DEAD r72:DI > 17: [r78:DI+0x8]=r80:SI > REG_DEAD r80:SI > REG_DEAD r78:DI > 20: NOTE_INSN_DELETED > > (this particular sequence won't end in a failure, but the insn shouldn't be > moved past possibly aliasing (insn 13) anyway. > > Patched compiler results in: > > 5: r73:DI=high(`a') > 7: r74:QI=0x1 > 14: r79:DI=high(`b') > 6: r72:DI=[r73:DI+low(`a')] > REG_DEAD r73:DI > 16: r80:SI=0x1 > 9: r75:DI=r72:DI+0x1 > 8: r76:DI=[r72:DI+0x1&0xfffffffffffffff8] > 11: r77:DI=zero_extend(r74:QI)<<r75:DI<<0x3 > REG_DEAD r75:DI > REG_DEAD r74:QI > REG_EQUAL 0x1<<r75:DI<<0x3 > 10: r76:DI=!0xff<<r75:DI<<0x3&r76:DI > 12: r77:DI=r77:DI|r76:DI > REG_DEAD r76:DI > 13: [r72:DI+0x1&0xfffffffffffffff8]=r77:DI > REG_DEAD r77:DI > REG_DEAD r72:DI > 15: r78:DI=[r79:DI+low(`b')] > REG_DEAD r79:DI > 17: [r78:DI+0x8]=r80:SI > REG_DEAD r80:SI > REG_DEAD r78:DI > > As shown, the code further down in true_dependence_1 function blocks the move, > since the read is detected as aliased with the preceding store involving AND > alignment. > > There is nothing that can be done in target-dependant code, since the "native" > read from "b" gets marked as "unchanging" by the generic middle-end code. Where is that done? It looks bogus to me. > Please reconsider the "component" setting. There is nothing that can be fixed > in target-dependent code. I'm not sure - copying MEM flags to all MEMs from a single source seems possibly wrong.