On Thu, Jan 19, 2017 at 11:22 AM, Richard Biener <richard.guent...@gmail.com> wrote: > On Thu, Jan 19, 2017 at 11:25 AM, Bin.Cheng <amker.ch...@gmail.com> wrote: >> On Thu, Jan 19, 2017 at 9:42 AM, Richard Biener >> <richard.guent...@gmail.com> wrote: >>> On Wed, Jan 18, 2017 at 4:32 PM, Bin.Cheng <amker.ch...@gmail.com> wrote: >>>> On Wed, Jan 18, 2017 at 2:54 PM, Richard Biener >>>> <richard.guent...@gmail.com> wrote: >>>>> On Wed, Jan 18, 2017 at 11:10 AM, Martin Liška <mli...@suse.cz> wrote: >>>>>> Hello. >>>>>> >>>>>> >>>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression >>>>>> tests. >>>>>> >>>>>> Ready to be installed? >>>>> >>>>> I'm not sure. If we have such zero distance refs in the IL at the >>>>> time pcom runs then not handling >>>>> them will pessimize code-gen for cases where they are part of a larger >>>>> chain. Esp. I don't like >>>> Do you mean different chains distributed because of MAX_DISTANCE by >>>> "larger chain"? With the patch, such chain of refs would still be >>>> pred-commoned, just the arithmetic operation not combined, which could >>>> be handled by later DOM? >>>>> another stmt_dominates_stmt_p call and thus rather not handle length >>>>> == 0 at all... >>>> Not handle length == 0 chains at all may be sub-optimal. As you said, >>>> such chain of refs at the point may simply because previous dom/cse >>>> fail to analyze the references. >>>>> >>>>> We already seem to go great length in associating stuff when combining >>>>> stuff thus isn't this >>>>> maybe an artifact of this association? Maybe we simply need to sort >>>>> the new chain after >>>>> combining it so the root stmt comes last? >>>>> >>>>> Note that there seems to be only a single length per chain but not all >>>>> refs in a chain need to >>>>> have the same distance. This means your fix is likely incomplete? >>>>> What prevents the situation >>>>> to arise for distance != 0? >>>> Yes, it's possible for two refs have the same distance in a chain with >>>> length > 0. But that should not be a problem, because existing uses >>>> are replaced by newly generated PHI variables which always dominate >>>> the uses, right? >>> >>> I must admit I don't know predcom in such detail but then can we handle >>> distance == 0 by simply inserting a PHI for those as well (a degenerate >>> one of course)? Or can for distance == 0 the ref be not loop invariant? >> Not sure if I understand the question correctly. Distance is >> difference of niter between one ref and the root ref of the chain, so >> 0 distance/length doesn't mean a loop invariant, it's very likely two >> (exactly the same) references in each loop iteration, the address of >> reference is still a SCEV. OTOH, invariant chain has invariant >> address, and is handled separately. For the first question, it's >> length, rather than distance that decides how the chain is handled. >> For length > 0 chain, we have to insert PHIs to pass carried result of >> memory reference, even some refs may have 0 distance to the root ref. >>> >>> Note that for length == 0 all refs in the chain will have a dependence >>> distance >>> of zero. So my first argument probably doesn't hold and we could simply >>> remove handling of length == 0 chains and rely on CSE? >> I am not sure, that CSE opportunity of references exists at this point >> means previous cse pass failed for some reason. > > Or a later pass introduced it (in this case, the vectorizer). > >> Predcom could be the >> only pass that can handle such case as it understands data reference >> better. Note Martin's patch is not to skip handling of length == 0 >> chain, later ref will still be CSEed with result of root ref, only the >> combination operation like chain1 + chain2 is skipped. In this case, >> following dom should be able to handle such (loop independent) cse >> opportunities. > > I must admit I don't completely understand the consequences of this > disabling but of course DOM should also be able to handle the CSE > (ok, DOM is known to be quite weak with memory equivalence but > it's not that data-dependence is really better in all cases). > > Can't we simply re-order refs in new_chain appropriately or handle > this case in combinable_refs_p instead? It's not refs need to be reordered, root ref always dominates others. But yes, we need to find a dominator insertion place for combined operation. Looking at function reassociate_to_the_same_stmt, it simply inserts new_stmt at root_stmt of root ref, which causes ICE in this case. The new_stmt needs to be inserted at a place also dominating combination of later refs. We can either compute the information in place, or compute and pass the information from previous combinable_refs_p. This should be the real fix.
Thanks, bin > > That is, I understand the patch as a hack as it should be always > possible to find dominating refs? > > In fact the point of the assert tells us a simpler fix may be > > Index: tree-predcom.c > =================================================================== > --- tree-predcom.c (revision 244519) > +++ tree-predcom.c (working copy) > @@ -2330,6 +2334,12 @@ combine_chains (chain_p ch1, chain_p ch2 > break; > } > } > + if (new_chain->length == 0 > + && ! new_chain->has_max_use_after) > + { > + release_chain (new_chain); > + return NULL; > + } > > ch1->combined = true; > ch2->combined = true; > > which obviously matches the assert we run into for the testcase? I'd > be ok with that > (no stmt_dominates_stmt_p, heh) with a suitable comment before it. > > Richard. > > >> >> Thanks, >> bin >>> >>> Richard. >>> >>>> Thanks, >>>> bin >>>>> >>>>> Richard. >>>>> >>>>>> Martin >>>>>>