On Wed, Apr 29, 2015 at 3:23 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> Historically the pragma Inline_Always of GNAT had been implemented in the FE
>> because the RTL inliner and then the Tree inliner weren't invoked at -O0 or
>> powerful enough to inline some constructs.  But this approach had drawbacks,
>> especially wrt debug info.  These restrictions were gradually lifted and now
>> the pragma is entirely piggybacked on the Tree inliner.
>>
>> This went mostly OK, except for a few cases where intrinsic operations that
>> used to be reasonably handled at -O0 now generate awful code, the typical
>> example being a modulus or division intrinsic by a power of 2 generating a
>> fully-fledged modulus or division instruction instead of a simple shift.
>>
>> Therefore the attached patch implements anonymous constant propagation in
>> the inliner to fix the code quality regression.
>>
>> Tested on x86_64-suse-linux, OK for the mainline?
>>
>>    if (TREE_CODE (*tp) == SSA_NAME)
>>      {
>> -      *tp = remap_ssa_name (*tp, id);
>> +      tree t = remap_ssa_name (*tp, id);
>> +      /* Perform anonymous constant propagation, this makes it possible to
>> +         generate reasonable code even at -O0 for operators implemented as
>> +         inline functions.  */
>> +      if (TREE_CODE (t) == SSA_NAME
>> +          && SSA_NAME_DEF_STMT (t)
>> +          && (!SSA_NAME_VAR (t) || DECL_IGNORED_P (SSA_NAME_VAR (t)))
>> +          && gimple_assign_copy_p (SSA_NAME_DEF_STMT (t))
>> +          && is_gimple_min_invariant
>> +             (gimple_assign_rhs1 (SSA_NAME_DEF_STMT (t))))
>> +        *tp = gimple_assign_rhs1 (SSA_NAME_DEF_STMT (t));
>> +      else
>> +        *tp = t;
>
> This looks like a good idea to me (though I can't approve it).  We may want
> to lift the (!SSA_NAME_VAR (t) || DECL_IGNORED_P (SSA_NAME_VAR (t))) when
> optimize is set - the amount of garbage the inliner produces is large and
> killing it early is better than killing it later.  This has a chance to help
> early opts, where the ordering between ccp and einline is quite hard.
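
For context, here is a minimal C sketch of the case Eric describes above; the
wrapper name, the always_inline attribute and the exact -O0 behavior are
illustrative assumptions, not taken from the patch or from GNAT:

  /* Hypothetical Inline_Always-style intrinsic: the power-of-2 operand
     only becomes visible in the body after inlining.  */
  static inline __attribute__ ((always_inline)) unsigned int
  mod_intrinsic (unsigned int x, unsigned int y)
  {
    return x % y;
  }

  unsigned int
  caller (unsigned int x)
  {
    /* After inlining at -O0 the body roughly computes "x % t" where t is
       a copy of the constant 8.  Propagating the constant into the modulus
       lets the expander emit a simple mask instead of a division.  */
    return mod_intrinsic (x, 8);
  }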
Early opts run CCP as pretty much the first pass, so I don't see what you are
referring to here.

>>        *walk_subtrees = 0;
>>        return NULL;
>>      }
>> @@ -1965,7 +1977,7 @@ copy_bb (copy_body_data *id, basic_block
>>
>>        /* Statements produced by inlining can be unfolded, especially
>>           when we constant propagated some operands.  We can't fold
>> -         them right now for two reasons:
>> +         them right now in the general case for two reasons:
>>           1) folding require SSA_NAME_DEF_STMTs to be correct
>>           2) we can't change function calls to builtins.
>>           So we just mark statement for later folding.  We mark
>> @@ -1974,7 +1986,10 @@ copy_bb (copy_body_data *id, basic_block
>>           foldable indirectly are updated.  If this turns out to be
>>           expensive, copy_body can be told to watch for nontrivial
>>           changes.  */
>> -      if (id->statements_to_fold)
>> +      if (gimple_assign_cast_p (stmt)
>> +          && is_gimple_min_invariant (gimple_assign_rhs1 (stmt)))
>> +        fold_stmt (&copy_gsi);
>> +      else if (id->statements_to_fold)
>
> Why is this needed?  Is it just an optimization because we know that folding
> of casts will not do crazy stuff like SSA graph walking (that was the original
> reason for delaying the folding), or just an optimization to reduce the size
> of the list?

Because this folding turns the cast into something the above hunk handles...
It's basically a hack ;)

A better approach might be to simply fold the stmts we copy during the stmt
copy itself, by using gimple_build instead of gimple_copy + fold_stmt.  Note
that the match-and-simplify machinery does not perform constant propagation
but relies on a valueization hook.

Richard.

> Honza
>>         id->statements_to_fold->add (stmt);
>>
>>       /* We're duplicating a CALL_EXPR.  Find any corresponding
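
As a rough illustration of the gimple_build alternative Richard mentions,
here is a sketch for the binary-assignment case only; the exact gimple_build
overload and the integration with copy_bb (stmt, copy_gsi) are assumptions,
not code from the thread:

      /* Sketch: instead of gimple_copy + delayed fold_stmt, build the
	 copied binary assignment with gimple_build, which simplifies it
	 on the fly via the match-and-simplify machinery.  */
      if (is_gimple_assign (stmt)
	  && TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_binary)
	{
	  gimple_seq seq = NULL;
	  tree res = gimple_build (&seq, gimple_assign_rhs_code (stmt),
				   TREE_TYPE (gimple_assign_lhs (stmt)),
				   gimple_assign_rhs1 (stmt),
				   gimple_assign_rhs2 (stmt));
	  gimple_seq_add_stmt (&seq,
			       gimple_build_assign (gimple_assign_lhs (stmt),
						    res));
	  gsi_insert_seq_before (&copy_gsi, seq, GSI_SAME_STMT);
	}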