On Mon, Jan 16, 2017 at 11:36 PM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On January 16, 2017 7:27:53 PM GMT+01:00, Jeff Law <l...@redhat.com> wrote:
>>On 01/16/2017 01:51 AM, Richard Biener wrote:
>>> On Sun, Jan 15, 2017 at 10:34 AM, Jeff Law <l...@redhat.com> wrote:
>>>>
>>>> At one time I know I had the max_size == size test in
>>valid_ao_ref_for_dse.
>>>> But it got lost at some point.  This is what caused the Ada failure.
>>>>
>>>> Technically it'd be OK for the potentially dead store to have a
>>variable
>>>> size as long as the later stores covered the entire range of the
>>potentially
>>>> dead store.  I doubt this happens enough to be worth checking.
>>>>
>>>> The ppc64 big endian failures were more interesting.  We had this in
>>the IL:
>>>>
>>>> memmove (dst, src, 0)
>>>>
>>>> The trimming code assumes that there's at least one live byte in the
>>store,
>>>> which obviously isn't the case here.  The net result is we compute
>>an
>>>> incorrect trim and the copy goes wild with incorrect addresses and
>>lengths.
>>>> This is trivial to fix by validating that the store has a nonzero
>>length.
>>>>
>>>> I was a bit curious how often this happened in practice because such
>>a call
>>>> is trivially dead.  ~80 during a bootstrap and a few dozen in the
>>testsuite.
>>>> Given how trivial it is to detect and optimize, this patch includes
>>removal
>>>> of such calls.  This hunk makes the check for zero size in
>>>> valid_ao_ref_for_dse redundant, but I'd like to keep the check -- if
>>we add
>>>> more builtin support without filtering zero size we'd regress again.
>>>
>>> Interesting - we do fold memset (..., 0) away so this means we either
>>> have an unfolded memset stmt in the IL before DSE.
>>It's actually exposed by fre3, both in the original test and in the
>>reduced testcase.  In the reduced testcase we have this just prior to
>>FRE3:
>>
>>;;   basic block 3, loop depth 0, count 0, freq 7326, maybe hot
>>;;    prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED)
>>;;    pred:       2 [73.3%]  (TRUE_VALUE,EXECUTABLE)
>>   _3 = MEM[(const struct vec *)_4].m_num;
>>   if (_3 != 0)
>>     goto <bb 4>; [36.64%]
>>   else
>>     goto <bb 5>; [63.36%]
>>;;    succ:       4 [36.6%]  (TRUE_VALUE,EXECUTABLE)
>>;;                5 [63.4%]  (FALSE_VALUE,EXECUTABLE)
>>
>>;;   basic block 4, loop depth 0, count 0, freq 2684, maybe hot
>>;;    prev block 3, next block 5, flags: (NEW, REACHABLE, VISITED)
>>;;    pred:       3 [36.6%]  (TRUE_VALUE,EXECUTABLE)
>>   _6 = vec_av_set.m_vec;
>>   _7 = _6->m_num;
>>   _8 = _7 - _3;
>>   _6->m_num = _8;
>>   _9 = (long unsigned int) _8;
>>   _10 = _9 * 4;
>>   slot.2_11 = slot;
>>   dest.3_12 = dest;
>>   memmove (dest.3_12, slot.2_11, _10);
>>;;    succ:       5 [100.0%]  (FALLTHRU,EXECUTABLE)
>>
>>
>>_3 has the value _6->m_num.  Thus _8 will have the value 0, which in
>>turn makes _10 have the value zero as seen in the .fre3 dump:
>>
>>;;   basic block 4, loop depth 0, count 0, freq 2684, maybe hot
>>;;    prev block 3, next block 5, flags: (NEW, REACHABLE, VISITED)
>>;;    pred:       3 [36.6%]  (TRUE_VALUE,EXECUTABLE)
>>   _4->m_num = 0;
>>   slot.2_11 = slot;
>>   dest.3_12 = dest;
>>   memmove (dest.3_12, slot.2_11, 0);
>>
>>In the full test its similar.
>>
>>I don't know if you want to try and catch this in FRE though.
>
> Ah, I think I have patches for this since a long time in my tree...  We're 
> folding calls in a restricted way for some historical reason.
>
>>If we detect in DCE (where it makes reasonable sense) rather than DSE,
>>then we detect the dead mem* about 17 passes earlier and the dead
>>argument setup about 20 passes earlier.  In the testcase I looked at, I
>>
>>didn't see additional secondary optimizations enabled, but I could
>>imagine cases where it might.  Seems like a gcc-8 thing though.
>
> I'll give it a quick look tomorrow.

The comment before fold_stmt_inplace no longer applies (but it seems I
simply forgot to push
this change...).  It's better to not keep unfolded stmts around, so
I'll commit this as last bit
of stage3 if testing is fine.

Bootstrap / regtest on x86_64-unknown-linux-gnu in progress.

Richard.

2017-01-17  Richard Biener  <rguent...@suse.de>

        * tree-ssa-pre.c (eliminate_dom_walker::before_dom_children):
        Fold calls regularly.

        * gcc.dg/tree-ssa/ssa-fre-57.c: New testcase.

Attachment: p3
Description: Binary data

Reply via email to