On Thu, Jan 17, 2019 at 03:06:57PM +0100, Richard Biener wrote: > > + tree new_var = *t; > > + if (VAR_P (new_var) > > + && !DECL_HARD_REGISTER (new_var) > > + && !TREE_THIS_VOLATILE (new_var) > > + && !DECL_HAS_VALUE_EXPR_P (new_var) > > + && !is_gimple_reg (new_var) > > + && auto_var_in_fn_p (new_var, id->dst_fn)) > > + { > > + gimple_stmt_iterator gsi = gsi_after_labels (bb); > > + tree clobber = build_clobber (TREE_TYPE (new_var)); > > + gimple *clobber_stmt = gimple_build_assign (new_var, clobber); > > + gsi_insert_before (&gsi, clobber_stmt, GSI_NEW_STMT); > > + } > > + } > > So we do not care to optimize this to only clobber the vars that > are appear live over the EH edge?
Wouldn't that be quite expensive (especially at that spot in the inliner)? I could surely defer that (at the spot in copy_edges_for_bb just remember which bb was it and handle it before destroying the decl_map in expand_call_inline, but: 1) are the extra CLOBBERs that big a deal? 2) if they are, shouldn't we have a pass that does it generically after IPA and removes all CLOBBERs for vars already known dead, whether they come from inlining or whatever else (we have many in the IL already that are dead for other reasons, e.g. a variable where the destructor is inlined and has one CLOBBER in itself, but have another CLOBBER in the caller too when the var goes out of scope); tree DSE is able to remove CLOBBERs for the same var if they are adjacent; in either case, what would be the algorithm for that? Something like add_scope_conflicts algorithm, just not build any conflicts, but just propagate the var is live bitmaps and when the propagation terminates, go through all bbs and if it sees a clobber on a var that isn't live at that point, remove the clobber? > > @@ -2317,8 +2351,16 @@ copy_edges_for_bb (basic_block bb, profi > > e->probability = old_edge->probability; > > > > FOR_EACH_EDGE (e, ei, copy_stmt_bb->succs) > > - if ((e->flags & EDGE_EH) && !e->probability.initialized_p ()) > > - e->probability = profile_probability::never (); > > + if (e->flags & EDGE_EH) > > + { > > + if (!e->probability.initialized_p ()) > > + e->probability = profile_probability::never (); > > + if (e->dest->index < id->add_clobbers_to_eh_landing_pads) > > + { > > Why don't we need to watch out for EDGE_COUNT (e->dest->preds) != 1? That is the usual case. > Is it that we just don't care? That was the intention, if say the EH pad has 4 predecessors, 3 of them from 3 different inline functions and the last one from something else, then the alternatives to what the patch does (i.e. have clobbers for variables from all those 3 inline functions, the EH pad is initially in the caller, so the vars from those inline functions can't be magically still live there) would be either do nothing and keep this PR open and have inlining increase stack sizes much more than it expects to in its estimations, or split those EH pads, have special EH pad for one inline function and have that set of clobber there, another one for another etc. and branch to the same spot. But that means worse code generation unless we optimize that away (and, either we optimize that by throwing the CLOBBERs out and we are back to this PR, or we optimize them by moving all the CLOBBERs into the same EH pad (essentially sinking the CLOBBERs to the common fallthru, but that means what this patch does)). Note, analysis of what variables are dead could be useful e.g. for tail-merge pass, e.g. on: int bar (int n); void baz (char *x, int n); int foo (int x, int y) { switch (x) { case 1: { char buf[8192]; baz (buf, 4); if (y != 5) return 4; bar (2); bar (3); bar (4); bar (5); bar (6); bar (7); bar (9); break; } case 2: { char buf[8192]; baz (buf, 124); if (y != 7) return 19; bar (2); break; } case 3: { char buf[8192]; baz (buf, 51); if (y != 19) return 26; bar (2); bar (3); bar (4); bar (5); bar (6); bar (7); bar (9); break; } default: { char buf[8192]; baz (buf, 18); if (y != 21) return 28; bar (2); bar (3); bar (4); bar (5); bar (6); bar (7); bar (9); break; } } bar (8); return 7; } we don't optimize this at -O2, but do with -O2 -fstack-reuse=none when we don't have the CLOBBERs around. If tail-merge pass could run this analysis and find out that while the basic blocks under consideration have different sets of clobbers, the vars from clobbers from the other sets are dead in the bb not having those, thus it should be possible to union the clobber sets and tail-merge because of that. Jakub