Re: [PATCH] Be careful about combined chain with length == 0 (PR, tree-optimization/70754).

Bin.Cheng Thu, 19 Jan 2017 04:08:22 -0800

On Thu, Jan 19, 2017 at 11:22 AM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Thu, Jan 19, 2017 at 11:25 AM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>> On Thu, Jan 19, 2017 at 9:42 AM, Richard Biener
>> <richard.guent...@gmail.com> wrote:
>>> On Wed, Jan 18, 2017 at 4:32 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>>>> On Wed, Jan 18, 2017 at 2:54 PM, Richard Biener
>>>> <richard.guent...@gmail.com> wrote:
>>>>> On Wed, Jan 18, 2017 at 11:10 AM, Martin Liška <mli...@suse.cz> wrote:
>>>>>> Hello.
>>>>>>
>>>>>>
>>>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression 
>>>>>> tests.
>>>>>>
>>>>>> Ready to be installed?
>>>>>
>>>>> I'm not sure.  If we have such zero distance refs in the IL at the
>>>>> time pcom runs then not handling
>>>>> them will pessimize code-gen for cases where they are part of a larger
>>>>> chain.  Esp. I don't like
>>>> Do you mean different chains distributed because of MAX_DISTANCE by
>>>> "larger chain"?  With the patch, such chain of refs would still be
>>>> pred-commoned, just the arithmetic operation not combined, which could
>>>> be handled by later DOM?
>>>>> another stmt_dominates_stmt_p call and thus rather not handle length
>>>>> == 0 at all...
>>>> Not handle length == 0 chains at all may be sub-optimal.  As you said,
>>>> such chain of refs at the point may simply because previous dom/cse
>>>> fail to analyze the references.
>>>>>
>>>>> We already seem to go great length in associating stuff when combining
>>>>> stuff thus isn't this
>>>>> maybe an artifact of this association?  Maybe we simply need to sort
>>>>> the new chain after
>>>>> combining it so the root stmt comes last?
>>>>>
>>>>> Note that there seems to be only a single length per chain but not all
>>>>> refs in a chain need to
>>>>> have the same distance.  This means your fix is likely incomplete?
>>>>> What prevents the situation
>>>>> to arise for distance != 0?
>>>> Yes, it's possible for two refs have the same distance in a chain with
>>>> length > 0.  But that should not be a problem, because existing uses
>>>> are replaced by newly generated PHI variables which always dominate
>>>> the uses, right?
>>>
>>> I must admit I don't know predcom in such detail but then can we handle
>>> distance == 0 by simply inserting a PHI for those as well (a degenerate
>>> one of course)?  Or can for distance == 0 the ref be not loop invariant?
>> Not sure if I understand the question correctly.  Distance is
>> difference of niter between one ref and the root ref of the chain, so
>> 0 distance/length doesn't mean a loop invariant, it's very likely two
>> (exactly the same) references in each loop iteration, the address of
>> reference is still a SCEV.  OTOH, invariant chain has invariant
>> address, and is handled separately.  For the first question, it's
>> length, rather than distance that decides how the chain is handled.
>> For length > 0 chain, we have to insert PHIs to pass carried result of
>> memory reference, even some refs may have 0 distance to the root ref.
>>>
>>> Note that for length == 0 all refs in the chain will have a dependence 
>>> distance
>>> of zero.  So my first argument probably doesn't hold and we could simply
>>> remove handling of length == 0 chains and rely on CSE?
>> I am not sure, that CSE opportunity of references exists at this point
>> means previous cse pass failed for some reason.
>
> Or a later pass introduced it (in this case, the vectorizer).
>
>> Predcom could be the
>> only pass that can handle such case as it understands data reference
>> better.  Note Martin's patch is not to skip handling of length == 0
>> chain, later ref will still be CSEed with result of root ref, only the
>> combination operation like chain1 + chain2 is skipped.  In this case,
>> following dom should be able to handle such (loop independent) cse
>> opportunities.
>
> I must admit I don't completely understand the consequences of this
> disabling but of course DOM should also be able to handle the CSE
> (ok, DOM is known to be quite weak with memory equivalence but
> it's not that data-dependence is really better in all cases).
>
> Can't we simply re-order refs in new_chain appropriately or handle
> this case in combinable_refs_p instead?
It's not refs need to be reordered, root ref always dominates others.
But yes, we need to find a dominator insertion place for combined
operation.  Looking at function reassociate_to_the_same_stmt, it
simply inserts new_stmt at root_stmt of root ref, which causes ICE in
this case.  The new_stmt needs to be inserted at a place also
dominating combination of later refs.  We can either compute the
information in place, or compute and pass the information from
previous combinable_refs_p.  This should be the real fix.


Thanks,
bin
>
> That is, I understand the patch as a hack as it should be always
> possible to find dominating refs?
>
> In fact the point of the assert tells us a simpler fix may be
>
> Index: tree-predcom.c
> ===================================================================
> --- tree-predcom.c      (revision 244519)
> +++ tree-predcom.c      (working copy)
> @@ -2330,6 +2334,12 @@ combine_chains (chain_p ch1, chain_p ch2
>           break;
>         }
>      }
> +  if (new_chain->length == 0
> +      && ! new_chain->has_max_use_after)
> +    {
> +      release_chain (new_chain);
> +      return NULL;
> +    }
>
>    ch1->combined = true;
>    ch2->combined = true;
>
> which obviously matches the assert we run into for the testcase?  I'd
> be ok with that
> (no stmt_dominates_stmt_p, heh) with a suitable comment before it.
>
> Richard.
>
>
>>
>> Thanks,
>> bin
>>>
>>> Richard.
>>>
>>>> Thanks,
>>>> bin
>>>>>
>>>>> Richard.
>>>>>
>>>>>> Martin
>>>>>>

Re: [PATCH] Be careful about combined chain with length == 0 (PR, tree-optimization/70754).

Reply via email to