Hi, > > The following patch makes us use profile-guided size/speed metrics > for MOVE_RATIO in estimate_move_cost. The estimate_num_insn > user update is obvious but the rest not exactly - I've choosen > size metrics everywhere but in ipa-cp.c:estimate_local_effects > (because there the result is used in some time metric). Some > other places look like they look at size metrics but others > are not obvious at all. > > Do you think I should use !optimize_size in those places? Or > is it important that some of the uses are consistent? > > Thanks, > Richard. > > 2014-07-23 Richard Biener <rguent...@suse.de> > > * tree-inline.h (estimate_move_cost): Add speed_p parameter. > * tree-inline.c (estimate_move_cost): Add speed_p parameter > and adjust MOVE_RATIO query accordingly. > (estimate_num_insns): Adjust callers. > * ipa-prop.c (ipa_populate_param_decls): Likewise. > * ipa-cp.c (gather_context_independent_values, > estimate_local_effects): Likewise. > * ipa-split.c (consider_split): Likewise. > > Index: gcc/tree-inline.c > =================================================================== > *** gcc/tree-inline.c (revision 212927) > --- gcc/tree-inline.c (working copy) > *************** tree_inlinable_function_p (tree fn) > *** 3623,3633 **** > return inlinable; > } > > ! /* Estimate the cost of a memory move. Use machine dependent > ! word size and take possible memcpy call into account. */ > > int > ! estimate_move_cost (tree type) > { > HOST_WIDE_INT size; > > --- 3623,3634 ---- > return inlinable; > } > > ! /* Estimate the cost of a memory move of type TYPE. Use machine dependent > ! word size and take possible memcpy call into account and return > ! cost based on whether optimizing for size or speed according to SPEED_P. > */ > > int > ! estimate_move_cost (tree type, bool speed_p)
This looks obvious to me.. > { > HOST_WIDE_INT size; > > *************** estimate_move_cost (tree type) > *** 3645,3651 **** > > size = int_size_in_bytes (type); > > ! if (size < 0 || size > MOVE_MAX_PIECES * MOVE_RATIO (!optimize_size)) > /* Cost of a memcpy call, 3 arguments and the call. */ > return 4; > else > --- 3646,3652 ---- > > size = int_size_in_bytes (type); > > ! if (size < 0 || size > MOVE_MAX_PIECES * MOVE_RATIO (speed_p)) > /* Cost of a memcpy call, 3 arguments and the call. */ > return 4; > else > *************** estimate_num_insns (gimple stmt, eni_wei > *** 3847,3855 **** > > /* Account for the cost of moving to / from memory. */ > if (gimple_store_p (stmt)) > ! cost += estimate_move_cost (TREE_TYPE (lhs)); > if (gimple_assign_load_p (stmt)) > ! cost += estimate_move_cost (TREE_TYPE (rhs)); > > cost += estimate_operator_cost (gimple_assign_rhs_code (stmt), > weights, > gimple_assign_rhs1 (stmt), > --- 3848,3856 ---- > > /* Account for the cost of moving to / from memory. */ > if (gimple_store_p (stmt)) > ! cost += estimate_move_cost (TREE_TYPE (lhs), weights->time_based); > if (gimple_assign_load_p (stmt)) > ! cost += estimate_move_cost (TREE_TYPE (rhs), weights->time_based); So does this. > Index: gcc/ipa-prop.c > =================================================================== > *** gcc/ipa-prop.c (revision 212927) > --- gcc/ipa-prop.c (working copy) > *************** ipa_populate_param_decls (struct cgraph_ > *** 204,210 **** > for (parm = fnargs; parm; parm = DECL_CHAIN (parm)) > { > descriptors[param_num].decl = parm; > ! descriptors[param_num].move_cost = estimate_move_cost (TREE_TYPE > (parm)); > param_num++; > } > } > --- 204,211 ---- > for (parm = fnargs; parm; parm = DECL_CHAIN (parm)) > { > descriptors[param_num].decl = parm; > ! descriptors[param_num].move_cost = estimate_move_cost (TREE_TYPE > (parm), > ! false); I seems to me that ipa-prop does not make difference in between size/time at all for its "cost" metric. I guess Martin may want to make these two independent and used accordingly. move_cost seems to be used exclusively in the profitability cost metrics, so I think your change is OK, but I would make it true here - it is used only in gather_context_independent_values that is time based. > Index: gcc/ipa-cp.c > =================================================================== > *** gcc/ipa-cp.c (revision 212927) > --- gcc/ipa-cp.c (working copy) > *************** gather_context_independent_values (struc > *** 1845,1851 **** > (*known_csts)[i] = val->value; > if (removable_params_cost) > *removable_params_cost > ! += estimate_move_cost (TREE_TYPE (val->value)); > ret = true; > } > else if (plats->virt_call) > --- 1845,1851 ---- > (*known_csts)[i] = val->value; > if (removable_params_cost) > *removable_params_cost > ! += estimate_move_cost (TREE_TYPE (val->value), false); > ret = true; > } > else if (plats->virt_call) > *************** estimate_local_effects (struct cgraph_no > *** 1997,2003 **** > { > known_csts[i] = val->value; > known_binfos[i] = NULL_TREE; > ! emc = estimate_move_cost (TREE_TYPE (val->value)); > } > else if (plats->virt_call) > { > --- 1997,2003 ---- > { > known_csts[i] = val->value; > known_binfos[i] = NULL_TREE; > ! emc = estimate_move_cost (TREE_TYPE (val->value), true); > } > else if (plats->virt_call) > { > Index: gcc/ipa-split.c > =================================================================== > *** gcc/ipa-split.c (revision 212927) > --- gcc/ipa-split.c (working copy) > *************** consider_split (struct split_point *curr > *** 488,500 **** > SSA_NAME_VERSION (ddef))) > { > if (!VOID_TYPE_P (TREE_TYPE (parm))) > ! call_overhead += estimate_move_cost (TREE_TYPE (parm)); > num_args++; > } > } > } > if (!VOID_TYPE_P (TREE_TYPE (current_function_decl))) > ! call_overhead += estimate_move_cost (TREE_TYPE (current_function_decl)); > > if (current->split_size <= call_overhead) > { > --- 488,501 ---- > SSA_NAME_VERSION (ddef))) > { > if (!VOID_TYPE_P (TREE_TYPE (parm))) > ! call_overhead += estimate_move_cost (TREE_TYPE (parm), false); > num_args++; > } > } > } > if (!VOID_TYPE_P (TREE_TYPE (current_function_decl))) > ! call_overhead += estimate_move_cost (TREE_TYPE (current_function_decl), > ! false); call_overhead is size metric, so this seems OK to me. OK with the change suggested above (or reason why it should not be done :) Honza > > if (current->split_size <= call_overhead) > {