Re: [PR63185][RFC] Improve DSE with branches

Jeff Law Wed, 16 May 2018 14:51:02 -0700

On 05/15/2018 08:42 AM, Richard Biener wrote:
> On Tue, 15 May 2018, Richard Biener wrote:
> 
>> On Tue, 15 May 2018, Richard Biener wrote:
>>
>>> On Tue, 15 May 2018, Richard Biener wrote:
>>>
>>>> On Tue, 15 May 2018, Richard Biener wrote:
>>>>
>>>>> On Mon, 14 May 2018, Kugan Vivekanandarajah wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Attached patch handles PR63185 when we reach PHI with temp != NULLL.
>>>>>> We could see the PHI and if there isn't any uses for PHI that is
>>>>>> interesting, we could ignore that ?
>>>>>>
>>>>>> Bootstrapped and regression tested on x86_64-linux-gnu.
>>>>>> Is this OK?
>>>>>
>>>>> No, as Jeff said we can't do it this way.
>>>>>
>>>>> If we end up with multiple VDEFs in the walk of defvar immediate
>>>>> uses we know we are dealing with a CFG fork.  We can't really
>>>>> ignore any of the paths but we have to
>>>>>
>>>>>  a) find the merge point (and the associated VDEF)
>>>>>  b) verify for each each chain of VDEFs with associated VUSEs
>>>>>     up to that merge VDEF that we have no uses of the to classify
>>>>>     store and collect (partial) kills
>>>>>  c) intersect kill info and continue walking from the merge point
>>>>>
>>>>> in b) there's the optional possibility to find sinking opportunities
>>>>> in case we have kills on some paths but uses on others.  This is why
>>>>> DSE should be really merged with (store) sinking.
>>>>>
>>>>> So if we want to enhance DSEs handling of branches then we need
>>>>> to refactor the simple dse_classify_store function.  Let me take
>>>>> an attempt at this today.
>>>>
>>>> First (baby) step is the following - it arranges to collect the
>>>> defs we need to continue walking from and implements trivial
>>>> reduction by stopping at (full) kills.  This allows us to handle
>>>> the new testcase (which was already handled in the very late DSE
>>>> pass with the help of sinking the store).
>>>>
>>>> I took the opportunity to kill the use_stmt parameter of
>>>> dse_classify_store as the only user is only looking for whether
>>>> the kills were all clobbers which I added a new parameter for.
>>>>
>>>> I didn't adjust the byte-tracking case fully (I'm not fully understanding
>>>> the code in the case of a use and I'm not sure whether it's worth
>>>> doing the def reduction with byte-tracking).
>>>>
>>>> Your testcase can be handled by reducing the PHI and the call def
>>>> by seeing that the only use of a candidate def is another def
>>>> we have already processed.  Not sure if worth special-casing though,
>>>> I'd rather have a go at "recursing".  That will be the next
>>>> patch.
>>>>
>>>> Bootstrap & regtest running on x86_64-unknown-linux-gnu.
>>>
>>> Applied.
>>>
>>> Another intermediate one below, fixing the byte-tracking for
>>> stmt with uses.  This also re-does the PHI handling by simply
>>> avoiding recursion by means of a visited bitmap and stopping
>>> at the DSE classify stmt when re-visiting it instead of failing.
>>> This covers Pratamesh loop case for which I added ssa-dse-33.c.
>>> For the do-while loop this still runs into the inability to
>>> handle two defs to walk from.
>>>
>>> Bootstrap & regtest running on x86_64-unknown-linux-gnu.
>>
>> Ok, loop handling doesn't work in general since we run into the
>> issue that SSA form across the backedge is not representing the
>> same values.  Consider
>>
>>  <bb 3>
>>  # .MEM_22 = PHI <.MEM_12(D)(2), .MEM_13(4)>
>>  # n_20 = PHI <0(2), n_7(4)>
>>  # .MEM_13 = VDEF <.MEM_22>
>>  bytes[n_20] = _4;
>>  if (n_20 > 7)
>>    goto <bb 5>;
>>
>>  <bb 4>
>>  n_7 = n_20 + 1;
>>  # .MEM_15 = VDEF <.MEM_13>
>>  bytes[n_20] = _5;
>>  goto <bb 3>;
>>
>> then when classifying the store in bb4, visiting the PHI node
>> gets us to the store in bb3 which appears to be killing.
>>
>>        if (gimple_code (temp) == GIMPLE_PHI)
>> -       defvar = PHI_RESULT (temp);
>> +       {
>> +         /* If we visit this PHI by following a backedge then reset
>> +            any info in ref that may refer to SSA names which we'd need
>> +            to PHI translate.  */
>> +         if (gimple_bb (temp) == gimple_bb (stmt)
>> +             || dominated_by_p (CDI_DOMINATORS, gimple_bb (stmt),
>> +                                gimple_bb (temp)))
>> +           /* ???  ref->ref may not refer to SSA names or it may only
>> +              refer to SSA names that are invariant with respect to the
>> +              loop represented by this PHI node.  */
>> +           ref->ref = NULL_TREE;
>> +         defvar = PHI_RESULT (temp);
>> +         bitmap_set_bit (visited, SSA_NAME_VERSION (defvar));
>> +       }
>>
>> should be a workable solution for that.  I'm checking that, but
>> eventually you can think of other things that might prevent us from
>> handling backedges.  Note the current code tries to allow
>> looking across loops but not handle backedges of loops the
>> original stmt belongs to.
> 
> Just to mention before I leave for the day I think I've identified
> a latent issue where I just fail to produce a testcase right now
> which is that non-return exits from a function are not reliably
> visited given they do not all have virtual operands, like
> for example resx.  Thus consider
> 
> void foo (int *p, int x)
> {
>   *p = 0;
>   if (x)
>     resx;
>   *p = 1;
>   return;
> }
> 
> where we will never visit any stmts on the resx path and thus think
> the *p = 0 store is dead (all with current trunk of course).
> 
> Will continue to dig tomorrow.
Yes.  I stumbled over this at some point as well.  I think there's a BZ
which touches on this issue, though I don't have the # and I don't
recall if handling a non-return exit would be sufficient to address the BZ.


Jeff

Re: [PR63185][RFC] Improve DSE with branches

Reply via email to