https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111231

--- Comment #32 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Earnshaw from comment #31)
> While that does seem to fix the bug, it's at the cost of 6 additional stores
> in the problematic test that are redundant other than changing the alias set
> view.

The alternative is to alter the MEM_ATTRs of the earlier store to use an
alias-set covering both, which usually means using alias-set zero.
This will pessimize follow-up optimizations around the store, but it
might be a good trade-off if done only late - I'd say after sched2,
but it doesn't look like there's CSE/DSE after it.
So maybe after sched1, which effectively means after reload, but there's
no regular CSE after reload either.  The latest CSE is pass_cse2.
IIRC a minor complication is that the earlier insn isn't readily
available - 'dest' is copied/mangled and not necessarily the
single original RTX of the earlier SET_DEST (it's been some time).

OTOH I think that correctness trumps optimization and if this is the
problematic transform then I don't see many options here.

In the place CSE applies the transform we'd have to set MEM_ALIAS_SET
to zero if the alias set condition doesn't hold and clear MEM_EXPR
if the MEM_EXPR condition doesn't hold.
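
To make the above concrete, here is a toy sketch in plain C, not GCC
code.  Everything in it is hypothetical: toy_mem_attrs stands in for the
real MEM attributes, toy_alias_set_subset_of for alias_set_subset_of
(modelled so that only set zero is a superset of other sets), and the
"MEM_EXPR condition" is assumed to be simple pointer equality, which is
almost certainly simpler than what cse.cc would really need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the attributes of a MEM; not GCC's mem_attrs.  */
struct toy_mem_attrs {
  int alias_set;     /* 0 models "conflicts with everything" */
  const void *expr;  /* stand-in for MEM_EXPR, NULL when unknown */
};

/* Hypothetical subset test: set 0 covers every set, otherwise only
   identical sets are related.  Real alias sets form a richer tree.  */
static bool
toy_alias_set_subset_of (int sub, int super)
{
  return super == 0 || sub == super;
}

/* When the later store is about to be removed as redundant, widen the
   attributes of the earlier store so that it still covers every access
   the later store would have conflicted with.  */
static void
toy_widen_earlier_store (struct toy_mem_attrs *earlier,
                         const struct toy_mem_attrs *later)
{
  assert (earlier != NULL && later != NULL);

  /* Alias set condition fails: fall back to alias-set zero.  */
  if (!toy_alias_set_subset_of (later->alias_set, earlier->alias_set))
    earlier->alias_set = 0;

  /* Assumed MEM_EXPR condition fails: forget the expression.  */
  if (earlier->expr != later->expr)
    earlier->expr = NULL;
}
```

When both conditions already hold the earlier store is left untouched,
so the pessimization only hits the cases that are currently wrong.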

Note I can't get the cse.cc code to trigger with the full preprocessed
source and a cross to arm, using -O2 -fno-exceptions -march=armv7-a
-mfpu=neon-vfpv4 -mfloat-abi=hard -mfp16-format=ieee -fmath-errno.

You mention at one point an insn removed by postreload, but that doesn't
use alias_set_subset_of.  I also don't remember postreload doing redundant
store removal.
