On Wed, Apr 29, 2015 at 3:23 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> Historically the pragma Inline_Always of GNAT had been implemented in the FE
>> because the RTL inliner and then the Tree inliner weren't invoked at -O0 or
>> powerful enough to inline some constructs.  But this approach had drawbacks,
>> especially wrt debug info.  These restrictions were gradually lifted and now
>> the pragma is entirely piggybacked on the Tree inliner.
>>
>> This went mostly OK, except for a few cases where intrinsic operations that
>> used to be reasonably handled at -O0 now generate awful code, the typical
>> example being a modulus or division intrinsic by a power of 2 generating a
>> fully-fledged modulus or division instruction instead of a simple shift.
>>
>> Therefore the attached patch implements anonymous constant propagation in the
>> inliner to fix the code quality regression.
>>
>> Tested on x86_64-suse-linux, OK for the mainline?
>>
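For reference, a rough C analogue of the problematic pattern (made-up names,
not the actual Ada testcase) is:

  /* Hypothetical stand-in for a GNAT Inline_Always intrinsic wrapper.  */
  static inline __attribute__ ((always_inline))
  unsigned mod_pow2 (unsigned x, unsigned n)
  {
    return x % n;   /* n is a power of 2 at every call site.  */
  }

  unsigned
  call_site (unsigned x)
  {
    /* At -O0 the inliner copies the body with n turned into an SSA name
       defined by "n = 8".  Unless that constant reaches the copied
       "x % n", nothing folds the modulus to "x & 7" and a full division
       instruction is emitted.  */
    return mod_pow2 (x, 8);
  }
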
>>    if (TREE_CODE (*tp) == SSA_NAME)
>>      {
>> -      *tp = remap_ssa_name (*tp, id);
>> +      tree t = remap_ssa_name (*tp, id);
>> +      /* Perform anonymous constant propagation; this makes it possible to
>> +         generate reasonable code even at -O0 for operators implemented as
>> +         inline functions.  */
>> +      if (TREE_CODE (t) == SSA_NAME
>> +          && SSA_NAME_DEF_STMT (t)
>> +          && (!SSA_NAME_VAR (t) || DECL_IGNORED_P (SSA_NAME_VAR (t)))
>> +          && gimple_assign_copy_p (SSA_NAME_DEF_STMT (t))
>> +          && is_gimple_min_invariant
>> +             (gimple_assign_rhs1 (SSA_NAME_DEF_STMT (t))))
>> +        *tp = gimple_assign_rhs1 (SSA_NAME_DEF_STMT (t));
>> +      else
>> +        *tp = t;
>
> This looks like a good idea to me (though I can't approve it).  We may want to
> lift the (!SSA_NAME_VAR (t) || DECL_IGNORED_P (SSA_NAME_VAR (t))) check when
> optimize is set - the amount of garbage the inliner produces is large and
> killing it early is better than killing it later.  This has a chance to help
> early opts, where the ordering between ccp and einline is quite hard.

Early opts run CCP as pretty much the first pass, so I don't see what
you are referring to here.

>>        *walk_subtrees = 0;
>>        return NULL;
>>      }
>> @@ -1965,7 +1977,7 @@ copy_bb (copy_body_data *id, basic_block
>>
>>         /* Statements produced by inlining can be unfolded, especially
>>            when we constant propagated some operands.  We can't fold
>> -          them right now for two reasons:
>> +          them right now in the general case for two reasons:
>>            1) folding require SSA_NAME_DEF_STMTs to be correct
>>            2) we can't change function calls to builtins.
>>            So we just mark statement for later folding.  We mark
>> @@ -1974,7 +1986,10 @@ copy_bb (copy_body_data *id, basic_block
>>            foldable indirectly are updated.  If this turns out to be
>>            expensive, copy_body can be told to watch for nontrivial
>>            changes.  */
>> -       if (id->statements_to_fold)
>> +       if (gimple_assign_cast_p (stmt)
>> +           && is_gimple_min_invariant (gimple_assign_rhs1 (stmt)))
>> +         fold_stmt (&copy_gsi);
>> +       else if (id->statements_to_fold)
>
> Why is this needed?  Is it just an optimization because we know that folding
> of casts will not do crazy stuff like SSA graph walking (that was the original
> reason for delaying the folding), or just an optimization to reduce the size
> of the list?

Because this folding turns the cast into something the above hunk handles:
once the cast of the invariant is folded, the copied statement becomes a plain
copy of a constant, which the SSA name remapping hunk above can then propagate
into the later statements of the inlined body.

It's basically a hack ;)

A better approach might be to simply fold stmts we copy during the
stmt copy itself by using gimple_build instead of gimple_copy + fold_stmt.
Note that the match-and-simplify machinery does not perform constant
propagation but relies on a valueization hook.
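
To make that concrete (an untested sketch only - the guard mirrors the patch,
and the exact valueization conventions are from memory, so treat the names and
details as assumptions rather than a proposal): a hook along these lines could
let fold_stmt pick up the constants through match-and-simplify instead of
rewriting the operands during remapping:

  /* Sketch of a valueization hook for folding statements copied by the
     inliner; not a tested patch.  Return the constant an anonymous SSA
     name is known to hold in the copy, or the name itself otherwise.  */

  static tree
  inline_valueize (tree name)
  {
    gimple def = SSA_NAME_DEF_STMT (name);
    if (def
        /* Same -O0 guard as in the hunk above: anonymous names only.  */
        && (!SSA_NAME_VAR (name) || DECL_IGNORED_P (SSA_NAME_VAR (name)))
        && gimple_assign_copy_p (def)
        && is_gimple_min_invariant (gimple_assign_rhs1 (def)))
      return gimple_assign_rhs1 (def);
    return name;
  }

Calling fold_stmt (&copy_gsi, inline_valueize) from copy_bb would then see
through the copies without a separate propagation step, assuming I remember
the fold_stmt overload correctly.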

Richard.

> Honza
>>           id->statements_to_fold->add (stmt);
>>
>>         /* We're duplicating a CALL_EXPR.  Find any corresponding
