On Thu, 2021-09-23 at 13:55 +0200, Richard Biener wrote: > On Wed, 22 Sep 2021, Ilya Leoshkevich wrote: > > > PR tree-optimization/49749 introduced code that shortens dependency > > chains containing loop accumulators by placing them last on operand > > lists of associative operations. > > > > 456.hmmer benchmark on s390 could benefit from this, however, the > > code > > that needs it modifies loop accumulator before using it, and since > > only > > so-called loop-carried phis are are treated as loop accumulators, > > the > > code in the present form doesn't really help. According to Bill > > Schmidt - the original author - such a conservative approach was > > chosen > > so as to avoid unnecessarily swapping operands, which might cause > > unpredictable effects. However, giving special treatment to forms > > of > > loop accumulators is acceptable. > > > > The definition of loop-carried phi is: it's a single-use phi, which > > is > > used in the same innermost loop it's defined in, at least one > > argument > > of which is defined in the same innermost loop as the phi itself. > > Given this, it seems natural to treat single uses of such phis as > > phis > > themselves. > > > > gcc/ChangeLog: > > > > * tree-ssa-reassoc.c (biased_names): New global. > > (propagate_bias_p): New function. > > (loop_carried_phi): Remove. > > (propagate_rank): Propagate bias along single uses. > > (get_rank): Update biased_names when needed. > > --- > > gcc/tree-ssa-reassoc.c | 97 ++++++++++++++++++++++++++++---------- > > ---- > > 1 file changed, 64 insertions(+), 33 deletions(-) > > > > diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c > > index 420c14e8cf5..2f7a8882aac 100644 > > --- a/gcc/tree-ssa-reassoc.c > > +++ b/gcc/tree-ssa-reassoc.c > > @@ -211,6 +211,10 @@ static int64_t *bb_rank; > > /* Operand->rank hashtable. */ > > static hash_map<tree, int64_t> *operand_rank; > > > > +/* SSA_NAMEs that are forms of loop accumulators and whose ranks > > need to be > > + biased. */ > > +static auto_bitmap biased_names; > > + > > /* Vector of SSA_NAMEs on which after reassociate_bb is done with > > all basic blocks the CFG should be adjusted - basic blocks > > split right after that SSA_NAME's definition statement and > > before > > @@ -256,6 +260,50 @@ reassoc_remove_stmt (gimple_stmt_iterator > > *gsi) > > the rank difference between two blocks. */ > > #define PHI_LOOP_BIAS (1 << 15) > > > > +/* Return TRUE iff PHI_LOOP_BIAS should be propagated from one of > > the STMT's > > + operands to the STMT's left-hand side. The goal is to preserve > > bias in code > > + like this: > > + > > + x_1 = phi(x_0, x_2) > > + a = x_1 | 1 > > + b = a ^ 2 > > + .MEM = b > > + c = b + d > > + x_2 = c + e > > + > > + That is, we need to preserve bias along single-use chains > > originating from > > + loop-carried phis. Only GIMPLE_ASSIGNs to SSA_NAMEs are > > considered to be > > + uses, because only they participate in rank propagation. */ > > +static bool > > +propagate_bias_p (gimple *stmt) > > +{ > > + use_operand_p use; > > + imm_use_iterator use_iter; > > + gimple *single_use_stmt = NULL; > > + > > + FOR_EACH_IMM_USE_FAST (use, use_iter, gimple_assign_lhs (stmt)) > > + { > > + gimple *current_use_stmt = USE_STMT (use); > > + > > + if (is_gimple_assign (current_use_stmt) > > + && TREE_CODE (gimple_assign_lhs (current_use_stmt)) == > > SSA_NAME) > > + { > > + if (single_use_stmt != NULL) > > what if single_use_stmt == current_use_stmt? We might have two > uses on a stmt after all - should that still be biased? I guess not > and thus the check is correct?
Come to think of it, it should be ok to bias it. Things like x = x + x are fine (this particular case can be transformed into something else earlier, but I think the overall point still holds). > > > + return false; > > + single_use_stmt = current_use_stmt; > > + } > > + } > > + > > + if (single_use_stmt == NULL) > > + return false; > > + > > + if (gimple_bb (stmt)->loop_father > > + != gimple_bb (single_use_stmt)->loop_father) > > + return false; > > + > > + return true; > > +} > > + > > /* Rank assigned to a phi statement. If STMT is a loop-carried > > phi of > > an innermost loop, and the phi has only a single use which is > > inside > > the loop, then the rank is the block rank of the loop latch > > plus an > > @@ -313,46 +361,23 @@ phi_rank (gimple *stmt) > > return bb_rank[bb->index]; > > } > > > > -/* If EXP is an SSA_NAME defined by a PHI statement that > > represents a > > - loop-carried dependence of an innermost loop, return TRUE; else > > - return FALSE. */ > > -static bool > > -loop_carried_phi (tree exp) > > -{ > > - gimple *phi_stmt; > > - int64_t block_rank; > > - > > - if (TREE_CODE (exp) != SSA_NAME > > - || SSA_NAME_IS_DEFAULT_DEF (exp)) > > - return false; > > - > > - phi_stmt = SSA_NAME_DEF_STMT (exp); > > - > > - if (gimple_code (SSA_NAME_DEF_STMT (exp)) != GIMPLE_PHI) > > - return false; > > - > > - /* Non-loop-carried phis have block rank. Loop-carried phis > > have > > - an additional bias added in. If this phi doesn't have block > > rank, > > - it's biased and should not be propagated. */ > > - block_rank = bb_rank[gimple_bb (phi_stmt)->index]; > > - > > - if (phi_rank (phi_stmt) != block_rank) > > - return true; > > - > > - return false; > > -} > > - > > /* Return the maximum of RANK and the rank that should be > > propagated > > from expression OP. For most operands, this is just the rank > > of OP. > > For loop-carried phis, the value is zero to avoid undoing the > > bias > > in favor of the phi. */ > > static int64_t > > -propagate_rank (int64_t rank, tree op) > > +propagate_rank (int64_t rank, tree op, gimple *stmt, bool *bias_p) > > { > > int64_t op_rank; > > > > - if (loop_carried_phi (op)) > > - return rank; > > + if (TREE_CODE (op) == SSA_NAME > > + && bitmap_bit_p (biased_names, SSA_NAME_VERSION (op))) > > + { > > + if (propagate_bias_p (stmt)) > > + *bias_p = true; > > + else > > + return rank; > > + } > > > > op_rank = get_rank (op); > > > > @@ -440,6 +465,8 @@ get_rank (tree e) > > > > else > > { > > + bool bias_p = false; > > + > > /* Otherwise, find the maximum rank for the operands. As > > an > > exception, remove the bias from loop-carried phis when > > propagating > > the rank so that dependent operations are not also > > biased. */ > > @@ -448,9 +475,12 @@ get_rank (tree e) > > thus have rank 0. */ > > rank = 0; > > FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE) > > - rank = propagate_rank (rank, op); > > + rank = propagate_rank (rank, op, stmt, &bias_p); > > It looks like if (propagate_bias_p (stmt)) is loop invariant here > and so when we inline the head this should simplify, avoiding > the new parameters to propagate_rank? I managed to move propagate_bias_p (stmt) out of the loop, but couldn't find any worthy simplification opportunities. For example, if we move the biasing logic out of propagate_rank () into the loop, nothing gets cancelled out and the resulting code looks more cluttered. I'll post the v3 after regtest finishes. > > Otherwise looks good to me. > > Richard.