>> >> gcc/ >> >> PR tree-optimization/94234 >> >> * tree-ssa-forwprop.c (simplify_binary_with_convert): New >> >> function. >> >> * (fwprop_ssa_val): Move it before its new caller. >> >> > No * at this line. There's an entry for (pass_forwprop::execute) missing. >> OK. >> >> > I don't think the transform as implemented, ((T) X) OP ((T) Y) to >> > (T) (X OP Y) is useful to do in tree-ssa-forwprop.c. Instead what I >> > suggested was to do the original >> > >> > +/* (T)(A * C) +- (T)(B * C) -> (T)((A +- B) * C) and >> > + (T)(A * C) +- (T)(A) -> (T)(A * (C +- 1)). */ >> > >> > but realize we already do this for GENERIC in fold_plusminus_mult_expr, >> > just >> > without the conversions (also look at the conditions in the callers). This >> > function takes great care for handling overflow correctly and thus I >> > suggested >> > to take that over to GIMPLE in tree-ssa-forwprop.c and try extend it to >> > cover >> > the conversions you need for the specific cases. >> But this way would introduce duplicate handling. Is it more concise to reuse >> existing rule?
> Sure moving the GENERIC folding to match.pd so it covers both GENERIC > and GIMPLE would be nice to avoid duplication. >> And different from GENERIC, we might need to check whether operand is >> single-use >> or not, and have distinct actions accordingly. >> >> (T)(A * C) +- (T)(B * C) -> (T)((A +- B) * C) >> >> Suppose both A and B are multiple-used, in most situations, the transform >> is unprofitable and avoided. But if (A +- B) could be folded to a constant, >> we >> can still allow the transform. For this, we have to recursively fold (A >> +-B), either >> handle it manually or resort to gimple-matcher to tell result. The latter is >> a >> natural choice. If so, why not do it on the top. > > I don't understand. From the comments in your patch you are just > hoisting conversions in the transform. I don't really see the connection > to the originally desired transform here? A code sequence as: t1 = (T)(A * C) t2 = (T)(B * C) ... = use (t1) ... = use (t2) t3 = t1 - t2 Since t1 and t2 are not single-use, we do not expect the transform on t3 happens in that it incurs more (add/mul) operations in most situations. But if (A - B) * C can be folded to a constant or an existing SSA, the transform is OK. That is to say we need to try to fold (A - B) and (A - B) * C to peak the final result. To do this, it is natural to use gimple-matcher instead of manually pattern matching as fold_plusminus_mult_expr, which could not cover all cases as gimple rules. Some examples: A = n + 2, B = n + 1, C=m A = n - m, B = n, C = -1 A = 3 * n, B = 2 * n, C = 1 And this way can be easily generalized to handle ((T) X) OP ((T) Y). >> > Alternatively one could move the GENERIC bits to match.pd, leaving a >> > worker in fold-const.c. Then try to extend that there. >> This worker function is meant to be used by both GENERIC and GIMPLE? > Yes, for both. Thanks, Feng