Jeff Law <l...@redhat.com> writes: > On 5/31/19 1:24 AM, Richard Biener wrote: >> On Thu, 30 May 2019, Jeff Law wrote: >> >>> On 5/30/19 12:41 AM, Richard Biener wrote: >>>> On May 29, 2019 10:18:01 PM GMT+02:00, Jeff Law <l...@redhat.com> wrote: >>>>> On 5/23/19 6:11 AM, Richard Biener wrote: >>>>>> On Thu, 23 May 2019, Jiufu Guo wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> Richard Biener <rguent...@suse.de> writes: >>>>>>> >>>>>>>> On Tue, 21 May 2019, Jiufu Guo wrote: >>>>> >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> + if (TREE_CODE_CLASS (gimple_assign_rhs_code (def)) != >>>>> tcc_comparison) >>>>>>>>> + return false; >>>>>>>>> + >>>>>>>>> + /* Check if phi's incoming value is defined in the incoming >>>>> basic_block. */ >>>>>>>>> + edge e = gimple_phi_arg_edge (phi, index); >>>>>>>>> + if (def->bb != e->src) >>>>>>>>> + return false; >>>>>>>> why does this matter? >>>>>>>> >>>>>>> Through preparing pathes and duplicating block, this transform can >>>>> also >>>>>>> help to combine a cmp in previous block and a gcond in current >>>>> block. >>>>>>> "if (def->bb != e->src)" make sure the cmp is define in the incoming >>>>>>> block of the current; and then combining "cmp with gcond" is safe. >>>>> If >>>>>>> the cmp is defined far from the incoming block, it would be hard to >>>>>>> achieve the combining, and the transform may not needed. >>>>>> We're in SSA form so the "combining" doesn't really care where the >>>>>> definition comes from. >>>>> Combining doesn't care, but we need to make sure the copy of the >>>>> conditional ends up in the right block since it wouldn't necessarily be >>>>> associated with def->bb anymore. But I'd expect the sinking pass to >>>>> make this a non-issue in practice anyway. >>>>> >>>>>> >>>>>>>>> + >>>>>>>>> + if (!single_succ_p (def->bb)) >>>>>>>>> + return false; >>>>>>>> Or this? The actual threading will ensure this will hold true. >>>>>>>> >>>>>>> Yes, other thread code check this and ensure it to be true, like >>>>>>> function thread_through_normal_block. Since this new function is >>>>> invoked >>>>>>> outside thread_through_normal_block, so, checking single_succ_p is >>>>> also >>>>>>> needed for this case. >>>>>> I mean threading will isolate the path making this trivially true. >>>>>> It's also no requirement for combining, in fact due to the single-use >>>>>> check the definition can be sinked across the edge already (if >>>>>> the edges dest didn't have multiple predecessors which this threading >>>>>> will fix as well). >>>>> I don't think so. The CMP source block could end with a call and have >>>>> an abnormal edge (for example). We can't put the copied conditional >>>>> before the call and putting it after the call essentially means >>>>> creating >>>>> a new block. >>>>> >>>>> The CMP source block could also end with a conditional. Where do we >>>>> put >>>>> the one we want to copy into the CMP source block in that case? :-) >>>>> >>>>> This is something else we'd want to check if we ever allowed the the >>>>> CMP >>>>> defining block to not be the immediate predecessor of the conditional >>>>> jump block. If we did that we'd need to validate that the block where >>>>> we're going to insert the copy of the jump has a single successor. >>>> >>>> But were just isolating a path here. The actual combine job is left to >>>> followup cleanups. >>> Absolutely agreed. My point was that there's some additional stuff we'd >>> have to verify does the right thing if we wanted to allow the CMP to be >>> somewhere other than in the immediate predecessor of the conditional >>> jump block. >> >> For correctness? No. For the CMP to be forwarded? No. For optimality >> maybe - forwarding a binary operation always incurs register pressure >> increase. > For correctness of the patch. Conceptually I have _no_ issues with > having the CMP in a different block than an immediate predecessor of the > conditional jump block. But the patch does certain code which would > need to be audited with that change in mind. Thanks for all your great comments! It is right, if immediate predecessor of conditional jump block has more than one successors, the conditional jump block can be duplicated to split the path; and the condtional jump will keep in the duplicate block instead inserting into predecessor. From functionality aspect, it is still correct. While it does not merge CMP with conditional jump in this pass; then it may not directly help to eliminate the CMP. While I also agree this path may provides other optimize opportunity in following passes.
I just have a check with gcc bootstrap, and find there are ~1800 edges as !single_succ_p (e->src). And similar number edges are single_succ_p (e->src). It would be valuable to take the opptunity for these edges of !single_succ_p (e->src). Jiufu Guo > >> >> Btw, as you already said sinking should have sinked the CMP to the >> predecessor (since we have a single use in the PHI). >> >> So I hardly see the point of making this difference. > :-) > > jeff