On Sat, Apr 14, 2012 at 9:26 AM, Tom de Vries <[email protected]> wrote:
> On 27/01/12 21:37, Tom de Vries wrote:
>> On 24/01/12 11:40, Richard Guenther wrote:
>>> On Mon, Jan 23, 2012 at 10:27 PM, Tom de Vries <[email protected]>
>>> wrote:
>>>> Richard,
>>>> Jakub,
>>>>
>>>> the following patch fixes PR51879.
>>>>
>>>> Consider the following test-case:
>>>> ...
>>>> int bar (int);
>>>> void baz (int);
>>>>
>>>> void
>>>> foo (int y)
>>>> {
>>>> int a;
>>>> if (y == 6)
>>>> a = bar (7);
>>>> else
>>>> a = bar (7);
>>>> baz (a);
>>>> }
>>>> ...
>>>>
>>>> after compiling at -02, the representation looks like this before
>>>> tail-merging:
>>>> ...
>>>> # BLOCK 3 freq:1991
>>>> # PRED: 2 [19.9%] (true,exec)
>>>> # .MEMD.1714_7 = VDEF <.MEMD.1714_6(D)>
>>>> # USE = nonlocal
>>>> # CLB = nonlocal
>>>> aD.1709_3 = barD.1703 (7);
>>>> goto <bb 5>;
>>>> # SUCC: 5 [100.0%] (fallthru,exec)
>>>>
>>>> # BLOCK 4 freq:8009
>>>> # PRED: 2 [80.1%] (false,exec)
>>>> # .MEMD.1714_8 = VDEF <.MEMD.1714_6(D)>
>>>> # USE = nonlocal
>>>> # CLB = nonlocal
>>>> aD.1709_4 = barD.1703 (7);
>>>> # SUCC: 5 [100.0%] (fallthru,exec)
>>>>
>>>> # BLOCK 5 freq:10000
>>>> # PRED: 3 [100.0%] (fallthru,exec) 4 [100.0%] (fallthru,exec)
>>>> # aD.1709_1 = PHI <aD.1709_3(3), aD.1709_4(4)>
>>>> # .MEMD.1714_5 = PHI <.MEMD.1714_7(3), .MEMD.1714_8(4)>
>>>> # .MEMD.1714_9 = VDEF <.MEMD.1714_5>
>>>> # USE = nonlocal
>>>> # CLB = nonlocal
>>>> bazD.1705 (aD.1709_1);
>>>> # VUSE <.MEMD.1714_9>
>>>> return;
>>>> ...
>>>>
>>>> the patch allows aD.1709_4 to be value numbered to aD.1709_3, and
>>>> .MEMD.1714_8
>>>> to .MEMD.1714_7, which enables tail-merging of blocks 4 and 5.
>>>>
>>>> The patch makes sure non-const/pure call results (gimple_vdef and
>>>> gimple_call_lhs) are properly value numbered.
>>>>
>>>> Bootstrapped and reg-tested on x86_64.
>>>>
>>>> ok for stage1?
>>>
>>> The following cannot really work:
>>>
>>> @@ -2600,7 +2601,11 @@ visit_reference_op_call (tree lhs, gimpl
>>> result = vn_reference_lookup_1 (&vr1, NULL);
>>> if (result)
>>> {
>>> - changed = set_ssa_val_to (lhs, result);
>>> + tree result_vdef = gimple_vdef (SSA_NAME_DEF_STMT (result));
>>> + if (vdef)
>>> + changed |= set_ssa_val_to (vdef, result_vdef);
>>> + changed |= set_ssa_val_to (lhs, result);
>>>
>>> because 'result' may be not an SSA name. It might also not have
>>> a proper definition statement (if VN_INFO (result)->needs_insertion
>>> is true). So you at least need to guard things properly.
>>>
>>
>> Right. And that also doesn't work if the function is without lhs, such as in
>> the
>> new test-case pr51879-6.c.
>>
>> I fixed this by storing both lhs and vdef, such that I don't have to derive
>> the vdef from the lhs.
>>
>>> (On a side-note - I _did_ want to remove value-numbering virtual operands
>>> at some point ...)
>>>
>>
>> Doing so willl hurt performance of tail-merging in its current form.
>> OTOH, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51964#c0 shows that
>> value numbering as used in tail-merging has its limitations too.
>> Do you have any ideas how to address that one?
>>
>>> @@ -3359,8 +3366,10 @@ visit_use (tree use)
>>> /* ??? We should handle stores from calls. */
>>> else if (TREE_CODE (lhs) == SSA_NAME)
>>> {
>>> + tree vuse = gimple_vuse (stmt);
>>> if (!gimple_call_internal_p (stmt)
>>> - && gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST))
>>> + && (gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST)
>>> + || (vuse && SSA_VAL (vuse) != VN_TOP)))
>>> changed = visit_reference_op_call (lhs, stmt);
>>> else
>>> changed = defs_to_varying (stmt);
>>>
>>> ... exactly because of the issue that a stmt has multiple defs. Btw,
>>> vuse should have been visited here or be part of our SCC, so, why do
>>> you need this check?
>>>
>>
>> Removed now, that was a workaround for a bug in an earlier version of the
>> patch,
>> that I didn't need anymore.
>>
>> Bootstrapped and reg-tested on x86_64.
>>
>> OK for stage1?
>>
>
> Richard,
>
> quoting you in http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00618.html:
> ...
> I think these fixes hint at that we should
> use "structural" equality as fallback if value-numbering doesn't equate
> two stmt effects. Thus, treat two stmts with exactly the same operands
> and flags as equal and using value-numbering to canonicalize operands
> (when they are SSA names) for that comparison, or use VN entirely
> if there are no side-effects on the stmt.
>
> Changing value-numbering of virtual operands, even if it looks correct in the
> simple cases you change, doesn't look like a general solution for the missed
> tail merging opportunities.
> ...
>
> The test-case pr51879-6.c shows a case where improving value numbering will
> help
> tail-merging, but structural equality comparison not:
> ...
> # BLOCK 3 freq:1991
> # PRED: 2 [19.9%] (true,exec)
> # .MEMD.1717_7 = VDEF <.MEMD.1717_6(D)>
> # USE = nonlocal
> # CLB = nonlocal
> blaD.1708 (5);
> # .MEMD.1717_8 = VDEF <.MEMD.1717_7>
> # USE = nonlocal
> # CLB = nonlocal
> aD.1712_3 = barD.1704 (7);
> goto <bb 5>;
> # SUCC: 5 [100.0%] (fallthru,exec)
>
> # BLOCK 4 freq:8009
> # PRED: 2 [80.1%] (false,exec)
> # .MEMD.1717_9 = VDEF <.MEMD.1717_6(D)>
> # USE = nonlocal
> # CLB = nonlocal
> blaD.1708 (5);
> # .MEMD.1717_10 = VDEF <.MEMD.1717_9>
> # USE = nonlocal
> # CLB = nonlocal
> aD.1712_4 = barD.1704 (7);
> # SUCC: 5 [100.0%] (fallthru,exec)
>
> # BLOCK 5 freq:10000
> # PRED: 3 [100.0%] (fallthru,exec) 4 [100.0%] (fallthru,exec)
> # aD.1712_1 = PHI <aD.1712_3(3), aD.1712_4(4)>
> # .MEMD.1717_5 = PHI <.MEMD.1717_8(3), .MEMD.1717_10(4)>
> # .MEMD.1717_11 = VDEF <.MEMD.1717_5>
> # USE = nonlocal
> # CLB = nonlocal
> bazD.1706 (aD.1712_1);
> # VUSE <.MEMD.1717_11>
> return;
> ...
> by structurally comparing the gimples in both blocks, we can see that these
> are
> equal.
> What we can't see, is that aD.1712_3 and aD.1712_4 have the same value. For
> that, we need value numbering to number the vuses the same. And if we get
> value
> numbering to number these values equal, we don't need the structural
> comparison.
>
> In this case structural comparison cannot be used as fallback for
> value-numbering. So, ok for trunk?
>
> And if not ok, do you have a suggestion of how to deal with this otherwise?
Hmm. I'm not sure we can conclude that they have the same value!
+int bar (int);
+void baz (int);
+void bla (int);
+
+void
+foo (int y)
+{
+ int a;
+ if (y == 6)
+ {
+ bla (5);
+ a = bar (7);
+ }
+ else
+ {
+ bla (5);
+ a = bar (7);
+ }
+ baz (a);
+}
I can implement bla as
void bla(int) { a = random (); }
thus a would not have the same value on both paths - but that is not what
matters - because obviously we can still merge the blocks. That is,
we cannot be sure that the side-effects on memory of bla (5) and bla (5)
are the same. But is that not what your value-numbering changes will assert?
(not sure how you achieve this, but it looks like you insert/lookup general
calls, thus not pure/const ones - that could easily lead to miscompiles(?)).
Richard.
Richard.