On Thursday 24 November 2016 04:54 PM, Richard Biener wrote:
On Thu, 24 Nov 2016, Pitchumani Sivanupandi wrote:
GCC inlines small functions when the estimated code size after expansion
is not expected to grow. For the test case (inline.c, avr-gcc -Os -S inline.c)
the code size becomes larger if 'func2' is inlined. This happens because
CONVERT_EXPR/NOP_EXPR are considered zero-cost expressions.
Some conversions do cost additional instructions. For targets like AVR
the cost is considerable, as its registers are just one byte wide.
Attached is a tentative patch that changes the CONVERT_EXPR/NOP_EXPR cost
to 1 if the LHS is bigger than both the RHS and the target's word_mode.
Is this OK?
Would it be reasonable to evaluate the cost as below instead of using a
constant 1?
  if (LHS_PRECISION > RHS_PRECISION)
    cost = LHS_PRECISION / word_mode - 1
  else
    cost = 0
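
The proposed formula can be sketched as a standalone C function. This is
only an illustration: widening_convert_cost is a hypothetical helper, not
a GCC function, and all precisions are assumed to be given in bits. The
sketch also assumes, following the condition in the tentative patch, that
a conversion whose result fits in one word is free; without that guard
the integer division would give a negative cost on wide-word targets.

```c
/* Hypothetical helper for illustration only; not part of GCC.
   All precisions are in bits.  */
int
widening_convert_cost (unsigned lhs_prec, unsigned rhs_prec,
                       unsigned word_prec)
{
  /* Assumption: a conversion that does not widen, or whose result
     still fits in a single word, is treated as free.  This also
     keeps the result non-negative.  */
  if (lhs_prec <= rhs_prec || lhs_prec <= word_prec)
    return 0;

  /* One cost unit for each additional word the result occupies.  */
  return (int) (lhs_prec / word_prec) - 1;
}
```

With an 8-bit word, as on AVR, widening 8 bits to 32 bits costs 3 and
widening 8 bits to 16 bits costs 1. With a 64-bit word, widening 32 bits
to 64 bits costs 0.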
Built GCC natively with bootstrap enabled; no issues.
I believe a better check would be tree_nop_conversion_p ().  Thus

    CASE_CONVERT:
      return tree_nop_conversion_p (type, TREE_TYPE (op0)) ? 0 : 1;

Note that

+      rhs_code = gimple_assign_rhs_code (stmt);
+      if ((rhs_code == NOP_EXPR) || (rhs_code == CONVERT_EXPR))
+        {
+          cost += estimate_operator_cost (rhs_code, weights,
+                                          gimple_assign_lhs (stmt),
+                                          gimple_assign_rhs1 (stmt));
+        }

is super-ugly - please simply add the type of the expression as an
additional argument (see gimple_expr_type (), but the type of the
LHS would do as well).
Note that doing this change might require some re-tuning of
inliner params, but otherwise it's totally sensible.
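
The effect of the suggested check can be modelled outside GCC. The sketch
below is a simplification under stated assumptions: struct int_type,
model_nop_conversion_p and model_convert_cost are hypothetical names, and
the model covers integer types only, treating a conversion as a no-op
when both types have the same precision. The real tree_nop_conversion_p
operates on tree types and handles more cases.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for an integer type.  */
struct int_type
{
  unsigned precision;   /* width in bits */
  bool is_unsigned;     /* signedness; does not affect the result */
};

/* Simplified model: an integer conversion needs no instruction when
   the precision is unchanged, regardless of signedness.  */
bool
model_nop_conversion_p (struct int_type outer, struct int_type inner)
{
  return outer.precision == inner.precision;
}

/* Cost in the spirit of the suggestion: 0 for a no-op conversion,
   1 otherwise.  */
int
model_convert_cost (struct int_type outer, struct int_type inner)
{
  return model_nop_conversion_p (outer, inner) ? 0 : 1;
}
```

In this model a 32-bit signed to 32-bit unsigned conversion costs 0,
while a 32-bit to 64-bit widening costs 1 on every target, whatever its
word size.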
Thanks. The revised patch is attached.
Regression testing on x86_64 showed the following failures:
FAIL: gcc.dg/uninit-19.c
FAIL: gcc.dg/vect/vect-104.c
For uninit-19.c, the index used to dereference a float array is converted
to long unsigned int, and that conversion is not a tree_nop_conversion_p.
This increases the estimated function cost, so automatic inlining is rejected.
I think this may be a huge penalty for a target like x86_64, which has a rich ISA.
Any suggestions to avoid penalizing such targets?
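
The kind of code affected can be illustrated with a small example. This
is a hypothetical function, not the contents of uninit-19.c: load_elem
indexes a float array with an int, and index_widening_is_nop compares the
storage widths of int and unsigned long as an approximation of their
precisions (assuming no padding bits).

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical example: the int index is widened to the pointer
   offset type before the address is computed.  */
float
load_elem (const float *a, int i)
{
  return a[i];
}

/* True when int and unsigned long have the same width, so widening
   the index would not change precision.  On LP64 targets such as
   x86_64 Linux the widths differ and this returns false.  */
bool
index_widening_is_nop (void)
{
  return sizeof (int) * CHAR_BIT == sizeof (unsigned long) * CHAR_BIT;
}
```

Where the two widths differ, the index conversion is not a no-op and
would add 1 to the estimated cost under the revised patch.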
Regards,
Pitchumani
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 6899d2a..e9f45be 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3867,19 +3867,19 @@ estimate_move_cost (tree type, bool ARG_UNUSED (speed_p))
static int
estimate_operator_cost (enum tree_code code, eni_weights *weights,
- tree op1 ATTRIBUTE_UNUSED, tree op2)
+ tree op1, tree op2, tree op0)
{
switch (code)
{
/* These are "free" conversions, or their presumed cost
is folded into other operations. */
case RANGE_EXPR:
- CASE_CONVERT:
case COMPLEX_EXPR:
case PAREN_EXPR:
case VIEW_CONVERT_EXPR:
return 0;
-
+ CASE_CONVERT:
+ return tree_nop_conversion_p (op0, TREE_TYPE (op1)) ? 0 : 1;
/* Assign cost of 1 to usual operations.
??? We may consider mapping RTL costs to this. */
case COND_EXPR:
@@ -4068,13 +4068,14 @@ estimate_num_insns (gimple *stmt, eni_weights *weights)
gimple_assign_rhs1 (stmt),
get_gimple_rhs_class (gimple_assign_rhs_code (stmt))
== GIMPLE_BINARY_RHS
- ? gimple_assign_rhs2 (stmt) : NULL);
+ ? gimple_assign_rhs2 (stmt) : NULL,
+ gimple_expr_type (stmt));
break;
case GIMPLE_COND:
cost = 1 + estimate_operator_cost (gimple_cond_code (stmt), weights,
gimple_op (stmt, 0),
- gimple_op (stmt, 1));
+ gimple_op (stmt, 1), gimple_expr_type (stmt));
break;
case GIMPLE_SWITCH:
@@ -4129,7 +4130,7 @@ estimate_num_insns (gimple *stmt, eni_weights *weights)
&dconst2)))
return estimate_operator_cost
(MULT_EXPR, weights, gimple_call_arg (stmt, 0),
- gimple_call_arg (stmt, 0));
+ gimple_call_arg (stmt, 0), gimple_expr_type (stmt));
break;
default: