On Tue, 30 Nov 2021, Jakub Jelinek wrote: > Hi! > > Seems simplify_associate_operation is quadratic, which isn't a big deal > for use during combine and other similar RTL passes, because those never > try to combine expressions from more than a few instructions and because > those instructions need to be recognized the machine description also bounds > how many expressions can appear in there.
Well ... > var-tracking has depth limits only for some cases and unlimited depth > for the vt_expand_loc though: > /* This is the value used during expansion of locations. We want it > to be unbounded, so that variables expanded deep in a recursion > nest are fully evaluated, so that their values are cached > correctly. We avoid recursion cycles through other means, and we > don't unshare RTL, so excess complexity is not a problem. */ > #define EXPR_DEPTH (INT_MAX) > /* We use this to keep too-complex expressions from being emitted as > location notes, and then to debug information. Users can trade > compile time for ridiculously complex expressions, although they're > seldom useful, and they may often have to be discarded as not > representable anyway. */ > #define EXPR_USE_DEPTH (param_max_vartrack_expr_depth) > > IMO for very large expressions it isn't worth trying to reassociate though, > in fact e.g. for the new testcase below keeping it as is has bigger chance > of generating smaller debug info which the dwarf2out.c part of the change > tries to achieve - if a binary operation has the same operands, we can > use DW_OP_dup and not bother computing the possibly large operand again. > > This patch punts if the associate operands contain together more than > 64 same operations, which can happen only during var-tracking. > During bootstrap/regtest on x86_64-linux and i686-linux, this triggers > only on the new testcase and on gcc.dg/torture/pr88597.c. > I think given the 16 element static buffer in subrtx_iterator::array_type > it shouldn't slow down the common case of small expressions, but have > been wondering whether we shouldn't have some in_vartrack global flag > or guard it with > (current_pass && strcmp (current_pass->name, "vartrack") == 0). > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > Another possibility to deal with the power expressions in debug info > would be to introduce some new RTL operation for the pow{,i} (x, n) > case, allow that solely in debug insns and expand those into DWARF > using a loop. But that seems like quite a lot of work for something rarely > used (especially when powi for larger n is only useful for 0 and 1 inputs > because anything else overflows). I wonder given we now have 'simplify_context' whether we can track a re-association budget we can eat from. At least your code to determine whether the expression is too large is quadratic as well (but bound to 64, so just a very large constant overhead for an outermost expression of size 63). We already have a mem_depth there, so just have reassoc_times and punt if that reaches --param max-simplify-reassoc-times, incrementing it each time simplify_associative_operation is entered? Thanks, Richard. > 2021-11-30 Jakub Jelinek <ja...@redhat.com> > > PR rtl-optimization/102356 > * simplify-rtx.c: Include rtl-iter.h. > (simplify_associative_operation): Don't reassociate very large > expressions with 64 or more CODE subrtxes in the operands. > * dwarf2out.c (mem_loc_descriptor): Optimize binary operation > with both operands the same using DW_OP_dup. > > * gcc.dg/pr102356.c: New test. > > --- gcc/simplify-rtx.c.jj 2021-11-05 00:43:22.576624649 +0100 > +++ gcc/simplify-rtx.c 2021-11-29 19:46:29.674750656 +0100 > @@ -37,6 +37,7 @@ along with GCC; see the file COPYING3. > #include "selftest-rtl.h" > #include "rtx-vector-builder.h" > #include "rtlanal.h" > +#include "rtl-iter.h" > > /* Simplification and canonicalization of RTL. */ > > @@ -2263,9 +2264,40 @@ simplify_context::simplify_associative_o > { > rtx tem; > > + if (GET_CODE (op0) == code || GET_CODE (op1) == code) > + { > + /* During vartrack, the expressions can grow arbitrarily large. > + Reassociation isn't really useful for larger expressions > + and can be very compile time expensive. */ > + unsigned count = 0; > + subrtx_iterator::array_type array; > + FOR_EACH_SUBRTX (iter, array, op0, NONCONST) > + { > + const_rtx x = *iter; > + if (GET_CODE (x) == code) > + { > + if (count++ >= 64) > + return NULL_RTX; > + } > + else > + iter.skip_subrtxes (); > + } > + FOR_EACH_SUBRTX (iter, array, op1, NONCONST) > + { > + const_rtx x = *iter; > + if (GET_CODE (x) == code) > + { > + if (count++ >= 64) > + return NULL_RTX; > + } > + else > + iter.skip_subrtxes (); > + } > + } > + > /* Linearize the operator to the left. */ > if (GET_CODE (op1) == code) > { > /* "(a op b) op (c op d)" becomes "((a op b) op c) op d)". */ > if (GET_CODE (op0) == code) > { > --- gcc/dwarf2out.c.jj 2021-11-29 14:24:14.053634713 +0100 > +++ gcc/dwarf2out.c 2021-11-29 19:44:54.192113401 +0100 > @@ -16363,6 +16363,15 @@ mem_loc_descriptor (rtx rtl, machine_mod > do_binop: > op0 = mem_loc_descriptor (XEXP (rtl, 0), mode, mem_mode, > VAR_INIT_STATUS_INITIALIZED); > + if (XEXP (rtl, 0) == XEXP (rtl, 1)) > + { > + if (op0 == 0) > + break; > + mem_loc_result = op0; > + add_loc_descr (&mem_loc_result, new_loc_descr (DW_OP_dup, 0, 0)); > + add_loc_descr (&mem_loc_result, new_loc_descr (op, 0, 0)); > + break; > + } > op1 = mem_loc_descriptor (XEXP (rtl, 1), mode, mem_mode, > VAR_INIT_STATUS_INITIALIZED); > > --- gcc/testsuite/gcc.dg/pr102356.c.jj 2021-11-29 19:57:47.061075856 > +0100 > +++ gcc/testsuite/gcc.dg/pr102356.c 2021-11-29 19:57:26.852364489 +0100 > @@ -0,0 +1,33 @@ > +/* PR rtl-optimization/102356 */ > +/* { dg-do compile { target int32plus } } */ > +/* { dg-options "-O3 -g" } */ > + > +signed char a = 0; > +unsigned char b = 9; > +unsigned long long c = 0xF1FBFC17225F7A57ULL; > +int d = 0x3A6667C6; > + > +unsigned char > +foo (unsigned int x) > +{ > + unsigned int *e = &x; > + if ((c /= ((0 * (*e *= b)) <= 0))) > + ; > + for (d = 9; d > 2; d -= 2) > + { > + c = -2; > + do > + if ((*e *= *e)) > + { > + a = 4; > + do > + { > + a -= 3; > + if ((*e *= *e)) > + b = 9; > + } > + while (a > 2); > + } > + while (c++); > + } > +} > > Jakub > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)