Richard Biener <rguent...@suse.de> writes:

> On Thu, 23 May 2019, Jiufu Guo wrote:
>
>> Hi,
>>
>> Richard Biener <rguent...@suse.de> writes:
>>
>> > On Tue, 21 May 2019, Jiufu Guo wrote:
>> >
>> >> Hi,
>> >>
>> >> This patch implements a new jump-threading opportunity for PR77820.
>> >> In this optimization, conditional jumps are merged with the
>> >> unconditional jumps of their predecessor blocks, and the move of the
>> >> CMP result into a GPR is then eliminated.
>> >>
>> >> It looks like below:
>> >>
>> >>   <P0>
>> >>   p0 = a CMP b
>> >>   goto <X>;
>> >>
>> >>   <P1>
>> >>   p1 = c CMP d
>> >>   goto <X>;
>> >>
>> >>   <X>
>> >>   # phi = PHI <p0 (P0), p1 (P1)>
>> >>   if (phi != 0) goto <Y>; else goto <Z>;
>> >>
>> >> Could be transformed to:
>> >>
>> >>   <P0>
>> >>   p0 = a CMP b
>> >>   if (p0 != 0) goto <Y>; else goto <Z>;
>> >>
>> >>   <P1>
>> >>   p1 = c CMP d
>> >>   if (p1 != 0) goto <Y>; else goto <Z>;
>> >>
>> >> This optimization eliminates:
>> >> 1. saving the CMP result: p0 = a CMP b.
>> >> 2. the additional CMP on the branch: if (phi != 0).
>> >> 3. converting the CMP result, when there is a phi = (INT_CONV) p0.
>> >>
>> >> Bootstrapped and tested on powerpc64le with no regressions (one case
>> >> is improved) and new testcases are added.  Is this ok for trunk?
>> >>
>> >> Thanks!
>> >> Jiufu Guo
>> >>
>> ...
>> >> diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
>> >> index c3ea2d6..23000f6 100644
>> >> --- a/gcc/tree-ssa-threadedge.c
>> >> +++ b/gcc/tree-ssa-threadedge.c
>> >> @@ -1157,6 +1157,90 @@ thread_through_normal_block (edge e,
>> >>    return 0;
>> >>  }
>> >>
>> >> +/* Return true if PHI's INDEX-th incoming value is a CMP, and the CMP is
>> >> +   defined in the incoming basic block.  Otherwise return false.
>> >> +   */
>> >> +static bool
>> >> +cmp_from_unconditional_block (gphi *phi, int index)
>> >> +{
>> >> +  tree value = gimple_phi_arg_def (phi, index);
>> >> +  if (!(TREE_CODE (value) == SSA_NAME && has_single_use (value)))
>> >> +    return false;
>> >
>> > Not sure why we should reject a constant here but I guess we
>> > expect it to find a simplified condition anyways ;)
>> >
>> A constant could be accepted here, like "# t_9 = PHI <5(3), t_17(4)>".
>> I found this case is already handled by other jump-threading code,
>> like the 'ethread' pass.
>>
>> >> +
>> >> +  gimple *def = SSA_NAME_DEF_STMT (value);
>> >> +
>> >> +  if (!is_gimple_assign (def))
>> >> +    return false;
>> >> +
>> >> +  if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def)))
>> >> +    {
>> >> +      value = gimple_assign_rhs1 (def);
>> >> +      if (!(TREE_CODE (value) == SSA_NAME && has_single_use (value)))
>> >> +        return false;
>> >> +
>> >> +      def = SSA_NAME_DEF_STMT (value);
>> >> +
>> >> +      if (!is_gimple_assign (def))
>> >> +        return false;
>> >
>> > too much vertical space.
>> >
>> Thanks, I will refine it.
>> >> +    }
>> >> +
>> >> +  if (TREE_CODE_CLASS (gimple_assign_rhs_code (def)) != tcc_comparison)
>> >> +    return false;
>> >> +
>> >> +  /* Check if the phi's incoming value is defined in the incoming
>> >> +     basic_block.  */
>> >> +  edge e = gimple_phi_arg_edge (phi, index);
>> >> +  if (def->bb != e->src)
>> >> +    return false;
>> >
>> > why does this matter?
>> >
>> By preparing paths and duplicating blocks, this transform can also help
>> to combine a cmp in a previous block with a gcond in the current block.
>> "if (def->bb != e->src)" makes sure the cmp is defined in the incoming
>> block of the current one, and then combining the cmp with the gcond is
>> safe.  If the cmp is defined far from the incoming block, it would be
>> hard to achieve the combining, and the transform may not be needed.
>
> We're in SSA form so the "combining" doesn't really care where the
> definition comes from.
>
>> >> +
>> >> +  if (!single_succ_p (def->bb))
>> >> +    return false;
>> >
>> > Or this?  The actual threading will ensure this will hold true.
>> >
>> Yes, other threading code checks this and ensures it to be true, like
>> the function thread_through_normal_block.  Since this new function is
>> invoked outside thread_through_normal_block, checking single_succ_p is
>> also needed for this case.
>
> I mean threading will isolate the path, making this trivially true.
> It's also no requirement for combining; in fact, due to the single-use
> check the definition can be sunk across the edge already (if the
> edge's dest didn't have multiple predecessors, which this threading
> will fix as well).
>
> I would relax these checks and have a test.
And I refactored the code a little as below.  Thanks for any comments!

bool
edge_forwards_cmp_to_conditional_jump_through_empty_bb_p (edge e)
{
  basic_block bb = e->dest;

  /* See if there is only one stmt which is a gcond.  */
  gimple *gs = last_and_only_stmt (bb);
  if (gs == NULL || gimple_code (gs) != GIMPLE_COND)
    return false;

  /* See if the gcond's condition is "(phi !=/== 0/1)".  */
  tree cond = gimple_cond_lhs (gs);
  if (TREE_CODE (cond) != SSA_NAME
      || gimple_code (SSA_NAME_DEF_STMT (cond)) != GIMPLE_PHI
      || gimple_bb (SSA_NAME_DEF_STMT (cond)) != bb)
    return false;

  enum tree_code code = gimple_cond_code (gs);
  tree rhs = gimple_cond_rhs (gs);
  if ((code != NE_EXPR && code != EQ_EXPR)
      || (!integer_onep (rhs) && !integer_zerop (rhs)))
    return false;

  gphi *phi = as_a<gphi *> (SSA_NAME_DEF_STMT (cond));
  edge_iterator ei;
  edge in_e;
  FOR_EACH_EDGE (in_e, ei, bb->preds)
    {
      /* Check if the phi's incoming value is a CMP.  */
      gimple *def;
      tree value = PHI_ARG_DEF_FROM_EDGE (phi, in_e);
      if (TREE_CODE (value) == SSA_NAME
          && has_single_use (value)
          && is_gimple_assign (SSA_NAME_DEF_STMT (value)))
        def = SSA_NAME_DEF_STMT (value);
      else
        return false;

      /* Or if it is (INTCONV) (a CMP b).  */
      if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def)))
        {
          value = gimple_assign_rhs1 (def);
          if (TREE_CODE (value) == SSA_NAME
              && has_single_use (value)
              && is_gimple_assign (SSA_NAME_DEF_STMT (value)))
            def = SSA_NAME_DEF_STMT (value);
          else
            return false;
        }

      if (TREE_CODE_CLASS (gimple_assign_rhs_code (def)) != tcc_comparison)
        return false;
    }

  return true;
}

Thanks,
Jiufu Guo

>> >> +  return true;
>> >> +}
>> >> +
>> >> +/* There are basic blocks that look like:
>> >> +   <P0>
>> >> +   p0 = a CMP b ; or p0 = (INT) (a CMP b)
>> >> +   goto <X>;
>> >> +
>> >> +   <P1>
>> >> +   p1 = c CMP d
>> >> +   goto <X>;
>> >> +
>> >> +   <X>
>> >> +   # phi = PHI <p0 (P0), p1 (P1)>
>> >> +   if (phi != 0) goto <Y>; else goto <Z>;
>> >> +
>> >> +   Then <X> is a trivial join block.
>> >> +
>> >> +   Check if BB is <X> as in the above.
>> >> +   */
>> >> +
>> >> +bool
>> >> +is_trivial_join_block (basic_block bb)
>> >
>> > I'd make this work on a specific edge.
>> >
>> > edge_forwards_conditional_to_conditional_jump_through_empty_bb_p (edge e)
>> > {
>> >   basic_block b = e->dest;
>> >
>> > maybe too elaborate a name ;)
>> >
>> Thanks for helping to name the function!  It is very valuable for me ;)
>> >> +{
>> >> +  gimple *gs = last_and_only_stmt (bb);
>> >> +  if (gs == NULL)
>> >> +    return false;
>> >> +
>> >> +  if (gimple_code (gs) != GIMPLE_COND)
>> >> +    return false;
>> >> +
>> >> +  tree cond = gimple_cond_lhs (gs);
>> >> +
>> >> +  if (TREE_CODE (cond) != SSA_NAME)
>> >> +    return false;
>> >
>> > space after if (, too much vertical space in this function
>> > for my taste btw.
>> Will update this.
>> >
>> > For the forwarding to work we want a NE_EXPR or EQ_EXPR
>> > as gimple_cond_code and integer_one_p or integer_zero_p
>> > gimple_cond_rhs.
>> Right, checking those would be safer.  Since no issue was found during
>> bootstrap and regression tests, I did not add these checks.  I will
>> add them.
>> >
>> >> +
>> >> +  if (gimple_code (SSA_NAME_DEF_STMT (cond)) != GIMPLE_PHI)
>> >> +    return false;
>> >> +
>> >> +  gphi *phi = as_a<gphi *> (SSA_NAME_DEF_STMT (cond));
>> >
>> > I think to match your pattern you want to check that
>> > gimple_bb (phi) == bb as well here.
>> Right, it should be checked.  I will update.
>> >
>> >> +  for (unsigned int i = 0; i < phi->nargs; i++)
>> >> +    if (!cmp_from_unconditional_block (phi, i))
>> >
>> > Just process the incoming edge argument and inline the
>> > helper.  You can use PHI_ARG_DEF_FROM_EDGE here.
>> I will refine the code and try to use it.
>> >
>> > Thanks for integrating this into jump-threading - it does look
>> > like a good fit.
>> >
>> > How often does this trigger during bootstrap?
>> Thanks for your suggestion; this could help to evaluate the patch.
>> During bootstrap (stage 2 or 3) of the gcc source code, 1300-1500
>> basic blocks fulfill the conditions of this transform.

> Thanks,
> Richard.