On 04/27/2015 06:46 AM, Richard Biener wrote:
On Mon, 27 Apr 2015, Richard Biener wrote:
On Fri, 24 Apr 2015, Jeff Law wrote:
On 02/17/2015 07:58 AM, Richard Biener wrote:
[ Restarting a old thread... ]
On a closer look the record_const_or_copy_1 hunk is redundant
(record_equality is really a bit obfuscated...).
Agreed. I'm not entirely sure how it got to this point.
And record_equality is where the SSA_NAME_VALUEs might end up
forming chains? At least because we might record a SSA_NAME_VALUE
for sth that might be the target of a SSA_NAME_VALUE, as in
a_1 = b_2; SSA_NAME_VALUE (a_1) == b_2;
if (b_2 == i_3(D))
... temporarily SSA_NAME_VALUE (b_2) == i_3(D)
thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)? I guess we
can't easily avoid that because we don't keep track of a
reverse mapping (nor would we want to update that).
Right. In general, the fact that we're in SSA form means that we ought not
get loops in the SSA_NAME_VALUE chain since there's a single static assignment
and we'll visit it once.
That nice model breaks down in two ways. First we derive equivalences from
equality conditions like you've shown above. Second, when threading we can
thread through a loop latch and start reprocessing a block. The interaction
between those two can result in loops in the value chain.
What I'm hoping to do in GCC6 is eliminate all this stuff with a threader that
is independent of the dominator walk (and optimizer). Instead of walking
forward through the dominator tree recording key nuggets, then removing those
nuggets as we pop back up the tree, we instead we start with the conditional
and walk up the use-def chains, simplifying as we go, with the goal of
simplifying to a constant, of course.
You can see the beginnings of that with the FSM bits from Sebastian.
Obviously until those bits are in place, we should try to clean up any warts
in the current implementation.
Btw, in record_equality, the == part of the <= check for the loop
depth will just randomly swap x and y while we should prefer
IL canonical order. If we can't rely on the input being in IL
canonical order we should prepend the whole function with
if (tree_swap_operands_p (x, y, false))
std::swap (x, y);
and change <= to < for the loop-depth check.
Agreed. Makes perfect sense.
I'm now re-bootstrapping and testing the following.
Committed as follows, with XFAILing and re-opening PR65217.
Richard.
2015-04-27 Richard Biener <rguent...@suse.de>
* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
(record_equivalences_from_stmt): Valueize rhs.
(record_equality): Canonicalize x and y order via
tree_swap_operands_p. Do not swap operands for same loop depth.
* gcc.target/i386/pr65217.c: XFAIL.
Looks good. I'd spun the same record_equality changes over the weekend
and have state on regressing 65217.
65217 is an interesting test.
if ((n & -n) != n)
__builtin_unreachable();
return n;
n & -n will (of course) get computed into an SSA_NAME. We then
propagate that name for the use of "n" in the return statement rather
than using the effectively zero cost "n".
If we put some smarts into tree_swap_operands_p to order sensibly in
this kind of case, we end up regressing a different case that I'll be
looking at today.
jeff