Re: [ada, build] host/target configuration

2013-06-02 Thread Eric Botcazou
> So, your case works because the manu/osys parsing wrongly detects/assigns
> a manufacturer »linux« and an operating system androideabi.  Then, the
> following case fails, which is expected to yield identical results, with
> "complete triplets" -- which I took for granted in my reasoning about the
> Makefile code:
> 
> $ make target_alias=arm-unknown-linux-androideabi
> target_alias = »arm-unknown-linux-androideabi«
> targ = »arm unknown linux androideabi«
> arch = »arm«
> manu = »unknown«
> osys = »linux«
> not matched
> 
> 
> My suggested change would make all these work -- however I have not yet
> had the time to fully digest your other emails with the reasoning that
> you need configure GCC with non-canonical target and target_alias set
> differently.
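
The field splitting quoted above can be reproduced with a small sketch.
This is only an illustration in C of taking the first three dash-separated
words of an alias, not the actual Makefile code; the names target_fields,
nth_word and split_target_alias are hypothetical, and the three-part alias
arm-linux-androideabi is an assumed example of the case that "works".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for the three fields printed in the trace above.  */
struct target_fields
{
  char arch[32];
  char manu[32];
  char osys[32];
};

/* Copy the N-th (0-based) dash-separated word of ALIAS into OUT,
   or the empty string if ALIAS has fewer words.  */
void
nth_word (const char *alias, int n, char *out, size_t outsz)
{
  const char *start = alias;
  size_t len;

  assert (alias != NULL && out != NULL && outsz > 0);
  out[0] = '\0';
  while (n > 0)
    {
      const char *dash = strchr (start, '-');
      if (dash == NULL)
        return;
      start = dash + 1;
      n--;
    }
  len = strcspn (start, "-");
  if (len >= outsz)
    len = outsz - 1;
  memcpy (out, start, len);
  out[len] = '\0';
}

/* Positional split: word 1 is arch, word 2 is manu, word 3 is osys.
   Nothing here knows whether the manufacturer field was omitted.  */
void
split_target_alias (const char *alias, struct target_fields *f)
{
  nth_word (alias, 0, f->arch, sizeof f->arch);
  nth_word (alias, 1, f->manu, sizeof f->manu);
  nth_word (alias, 2, f->osys, sizeof f->osys);
}
```

With the three-part alias the split yields manu = linux and osys =
androideabi, while the complete four-part name yields manu = unknown and
osys = linux, which is why a pattern written against one form does not
match the other.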

The whole discussion started from wrong premises since, contrary to what the 
ChangeLog says, neither Pascal nor I had anything to do with the original, 
problematic change (see PR ada/57188 for my take on it).  We all agree that 
the mess should be fixed somehow or other and Olivier is working on it.

-- 
Eric Botcazou


Re: Patch ping - Add a new option "-fstack-protector-strong"

2013-06-02 Thread Gerald Pfeifer
On Fri, 26 Apr 2013, Han Shen(沈涵) wrote:
> Hi, I'd like to ping the patch '-fstack-protector-strong':
> 
> - http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00945.html
>   Add a new option '-fstack-protector-strong' to protect only
> stack-smashing-vulnerable functions.

I see this is now in?

Can you please propose some wording (ideally a patch) for
http://gcc.gnu.org/gcc-4.9/changes.html ?

(http://gcc.gnu.org/projects/web.html has some more on our
web pages.)

Gerald

Re: [PATCH RX] Added target specific macros for macros for RX100, RX200, and RX600

2013-06-02 Thread Gerald Pfeifer
On Thu, 2 May 2013, Sandeep Kumar Singh wrote:
> 2013-05-02  Sandeep Kumar Singh  
> 
>   * rx/rx.h (TARGET_CPU_CPP_BUILTINS): Add macros for RX100, RX200, and 
> RX600. 
>   * rx/rx.opt: Add macro for rx100 with string rx100 and value RX100.
>   * rx/rx-opts.h (rx_cpu_types): Add new cpu type rx100.
>   * rx/t-rx: Add rx100 under multi library matches option for nofpu 
> option.

Mind also documenting this on http://gcc.gnu.org/gcc-4.9/changes.html ?

Let me know if you need help with the web pages.

Gerald


Re: [ada, build] host/target configuration

2013-06-02 Thread Alexandre Oliva
On May 31, 2013, Olivier Hainque  wrote:

>  - revert to our former computations, based on target and
>not target_alias. Revert the subsequent adjustments as
>well.

*nod*

>  - Use target_alias explicitly just at the points where
>we know that we need to depart from the canonical name

I suggest another approach: if there are significant differences between
the run-time systems, they ought to be preserved in the canonical target
names.  So, adjust config.sub so that it preserves them, and then we can
decide based on the canonical target name only.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


[PATCH] Basic support for MIPS r5900

2013-06-02 Thread Jürgen Urban
Hello,

after some months I reworked the patch for r5900. It would be nice if this 
could be accepted. The patch contains only changes to get basic support for 
MIPS r5900. It can be used to compile a working Linux kernel for the 
Playstation 2. It is also possible to get Linux programs working with software 
floating point and ABI o32. Other stuff like hardware floating point and ABI 
n32 is not fully supported yet.

How many other changes would currently be accepted here? There is other stuff 
that I want to prepare and submit here, e.g.:
1. disable use of dmult and ddiv (ABI n32).
2. use trunc.w.s instead of cvt.w.s (to get single float working for normal 
range calculations; i.e. calculating without inf or nan).
3. fix use of ll/sc in libgomp, either increase mips ISA level or use syscall 
(which is broken in Linux 2.6.35.4).
4. fix libgcc to build a real muldi3 function for ABI n32 (not the multi3 
function which is stored in muldi3.o file).
5. add support for configure parameters --float=single and --float=double in 
addition to --float=soft and --float=hard.
6. rework floating point to support single float with ABI n32 (either break the 
ABI or store floating point values in general purpose registers like soft 
float).
7. change libgcc or mips.md in a way so that the non-IEEE 754 compatible FPU of 
the r5900 becomes compatible.

Best regards
Jürgen

--- gcc/libgcc/config.host	(revision 199343)
+++ gcc/libgcc/config.host	(working copy)
@@ -739,7 +739,17 @@
 	;;
 mips*-*-linux*)# Linux MIPS, either endian.
 	extra_parts="$extra_parts crtfastmath.o"
-	tmake_file="${tmake_file} t-crtfm mips/t-mips16"
+	tmake_file="${tmake_file} t-crtfm"
+	# Check for MicroMIPS support.
+	case ${host} in
+		mips64r5900* | mipsr5900*)
+			# MicroMIPS uses floating point instructions
+			# which are not supported on r5900.
+			;;
+		*)
+		tmake_file="${tmake_file} mips/t-mips16"
+		;;
+	esac
 	md_unwind_header=mips/linux-unwind.h
 	if test "${ac_cv_sizeof_long_double}" = 16; then
 		tmake_file="${tmake_file} mips/t-tpbit"
@@ -777,10 +787,18 @@
 	tmake_file="$tmake_file mips/t-elf mips/t-crtstuff mips/t-mips16"
 	extra_parts="$extra_parts crti.o crtn.o"
 	;;
+mipsr5900-*-elf* | mipsr5900el-*-elf*)
+	tmake_file="$tmake_file mips/t-elf mips/t-crtstuff"
+	extra_parts="$extra_parts crti.o crtn.o"
+	;;
 mips64-*-elf* | mips64el-*-elf*)
 	tmake_file="$tmake_file mips/t-elf mips/t-crtstuff mips/t-mips16"
 	extra_parts="$extra_parts crti.o crtn.o"
 	;;
+mips64r5900-*-elf* | mips64r5900el-*-elf*)
+	tmake_file="$tmake_file mips/t-elf mips/t-crtstuff"
+	extra_parts="$extra_parts crti.o crtn.o"
+	;;
 mips64vr-*-elf* | mips64vrel-*-elf*)
 	tmake_file="$tmake_file mips/t-elf mips/t-vr mips/t-crtstuff"
 	extra_parts="$extra_parts crti.o crtn.o"
--- gcc/gcc/config.gcc	(revision 199343)
+++ gcc/gcc/config.gcc	(working copy)
@@ -1937,10 +1937,16 @@
 	target_cpu_default="MASK_64BIT|MASK_FLOAT64"
 	tm_defines="${tm_defines} MIPS_ISA_DEFAULT=64 MIPS_CPU_STRING_DEFAULT=\\\"sb1\\\" MIPS_ABI_DEFAULT=ABI_O64"
 	;;
-mips-*-elf* | mipsel-*-elf*)
+mips-*-elf* | mipsel-*-elf* | mipsr5900-*-elf* | mipsr5900el-*-elf*)
 	tm_file="elfos.h newlib-stdint.h ${tm_file} mips/elf.h"
 	tmake_file="mips/t-elf"
 	;;
+mips64r5900-*-elf* | mips64r5900el-*-elf*)
+	tm_file="elfos.h newlib-stdint.h ${tm_file} mips/elf.h"
+	tmake_file="mips/t-elf"
+	target_cpu_default="MASK_64BIT"
+	tm_defines="${tm_defines} MIPS_ISA_DEFAULT=3 MIPS_ABI_DEFAULT=ABI_N32"
+	;;
 mips64-*-elf* | mips64el-*-elf*)
 	tm_file="elfos.h newlib-stdint.h ${tm_file} mips/elf.h"
 	tmake_file="mips/t-elf"
@@ -2973,6 +2979,19 @@
 	  ;;
   esac
   ;;
+mips64r5900-*-*|mips64r5900el-*-*|mipsr5900-*-*|mipsr5900el-*-*)
+  with_arch=r5900
+  with_tune=r5900
+	if test x$with_llsc = x; then
+	  # r5900 doesn't support ll, sc, lld and scd instructions:
+	  with_llsc=no
+	fi
+	if test x$with_float = x; then
+	  # r5900 doesn't support 64 bit float:
+	  # 32 bit float doesn't comply with IEEE 754.
+	  with_float=soft
+	fi
+  ;;
 mips*-*-vxworks)
   with_arch=mips2
   ;;
--- gcc/gcc/config/mips/mips.c	(revision 199343)
+++ gcc/gcc/config/mips/mips.c	(working copy)
@@ -1029,6 +1029,19 @@
 		 1,   /* branch_cost */
 		 4/* memory_latency */
   },
+  { /* R5900 */
+COSTS_N_INSNS (4),/* fp_add */
+COSTS_N_INSNS (4),/* fp_mult_sf */
+COSTS_N_INSNS (256),  /* fp_mult_df */
+COSTS_N_INSNS (8),/* fp_div_sf */
+COSTS_N_INSNS (256),  /* fp_div_df */
+COSTS_N_INSNS (4),/* int_mult_si */
+COSTS_N_INSNS (256),  /* int_mult_di */
+COSTS_N_INSNS (37),   /* int_div_si */
+COSTS_N_INSNS (256),  /* int_div_di */
+		 1,   /* branch_cost */
+		 4/* memory_latency */
+  },
   { /* R7000 */
 /* The only costs that are changed here are
integer multiplication.  */
@@ -13005,6 +13018,7 @@
 case PROCESSOR_R4130:

Re: [GOOGLE] Unrestrict early inline restrictions for AutoFDO

2013-06-02 Thread Dehao Chen
The patch was committed to google-4_8, but it causes a problem because
einline sets PARAM_EARLY_INLINING_INSNS = 11. This will cause
recursive inlining at einline stage (e.g. main->foo, foo->bar,
bar->foo) when autofdo is enabled.
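
To make the failure mode concrete, here is a toy model in C of the loop
condition from the committed ipa-inline.c change.  It is only a sketch:
einline_iterations, rounds_with_progress and safety_cap are hypothetical
names, and rounds_with_progress merely stands in for how many consecutive
times early_inline_small_functions would report progress.

```c
#include <assert.h>
#include <stdbool.h>

/* Return how many times the modeled loop body runs.
   ROUNDS_WITH_PROGRESS models the number of consecutive rounds in which
   inlining still changes something; a recursive cycle such as
   foo->bar->foo keeps this very large.  SAFETY_CAP exists only so that
   this model terminates; the real loop has no such cap.  */
int
einline_iterations (bool flag_auto_profile, int max_iterations,
                    int rounds_with_progress, int safety_cap)
{
  int iterations = 0;

  assert (max_iterations >= 0 && rounds_with_progress >= 0
          && safety_cap >= 0);

  /* Same shape as the patched condition:
     (flag_auto_profile || iterations < max) && progress.  */
  while ((flag_auto_profile || iterations < max_iterations)
         && iterations < rounds_with_progress
         && iterations < safety_cap)
    iterations++;

  return iterations;
}
```

Without the flag the iteration limit bounds the loop; with the flag set,
only the lack of further progress stops it, so a call cycle that always
offers another small inline keeps it running.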

The following patch can fix the problem by doing more targeted early inlining:

Index: gcc/predict.c
===
--- gcc/predict.c (revision 199593)
+++ gcc/predict.c (working copy)
@@ -175,6 +175,8 @@ cgraph_maybe_hot_edge_p (struct cgraph_edge *edge)
   && !maybe_hot_count_p (NULL,
  edge->count))
 return false;
+  if (flag_auto_profile)
+return false;
   if (edge->caller->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED
   || (edge->callee
   && edge->callee->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED))

Performance testing on-going...

Dehao

On Wed, May 29, 2013 at 3:44 PM, Dehao Chen  wrote:
> OK, I'll commit the early inline part.
>
> Dehao
>
> On Wed, May 29, 2013 at 10:00 AM, Xinliang David Li  
> wrote:
>> The early inlining part is ok. The tracer optimization should be
>> revisited -- we should have more fine grain control on it (for
>> instance, based on FDO summary -- but that should be common to
>> FDO/LIPO).
>>
>> David
>>
>> On Wed, May 29, 2013 at 9:39 AM, Dehao Chen  wrote:
>>> In gcc4-8, the max einline iterations are restricted to 1. For
>>> AutoFDO, this is bad because early inline is not size restricted. This
>>> patch allows einline to do multiple iterations in AutoFDO. It also
>>> enables tracer optimization in AutoFDO.
>>>
>>> Bootstrapped and passed regression test.
>>>
>>> OK for googel-4_8?
>>>
>>> Thanks,
>>> Dehao
>>>
>>> Index: gcc/ipa-inline.c
>>> ===
>>> --- gcc/ipa-inline.c (revision 199416)
>>> +++ gcc/ipa-inline.c (working copy)
>>> @@ -2161,7 +2161,8 @@ early_inliner (void)
>>>  {
>>>/* We iterate incremental inlining to get trivial cases of indirect
>>>   inlining.  */
>>> -  while (iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS)
>>> +  while ((flag_auto_profile
>>> +  || iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS))
>>>   && early_inline_small_functions (node))
>>>   {
>>>timevar_push (TV_INTEGRATION);
>>> Index: gcc/opts.c
>>> ===
>>> --- gcc/opts.c (revision 199416)
>>> +++ gcc/opts.c (working copy)
>>> @@ -1644,6 +1644,8 @@ common_handle_option (struct gcc_options *opts,
>>>   opts->x_flag_peel_loops = value;
>>>if (!opts_set->x_flag_value_profile_transformations)
>>>   opts->x_flag_value_profile_transformations = value;
>>> +  if (!opts_set->x_flag_tracer)
>>> + opts->x_flag_tracer = value;
>>>if (!opts_set->x_flag_inline_functions)
>>>   opts->x_flag_inline_functions = value;
>>>if (!opts_set->x_flag_ipa_cp)


[PATCH][1 of 2] Add value range info to SSA_NAME for zero sign extension elimination in RTL

2013-06-02 Thread Kugan

Hi,

This patch adds value range information to tree SSA_NAME during the Value 
Range Propagation (VRP) pass, in preparation for removing some of the 
redundant sign/zero extensions during RTL expansion.


This is based on the original patch posted in 
http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00610.html and addresses 
the review comments of Richard Biener.


Tested on X86_64 and ARM.

I would like review comments on this.

Thanks,
Kugan


+2013-06-03  Kugan Vivekanandarajah  
+
+   * gcc/gcc/tree-flow.h: Declared structure range_info_def and function
+   definition for mark_range_info_unknown.
+   * gcc/tree-ssa-alias.c (dump_alias_info) : Check pointer type
+   * gcc/tree-ssanames.c (make_ssa_name_fn) : Check pointer type in
+   initialize.
+   * (mark_range_info_unknown) : New function.
+   * (duplicate_ssa_name_range_info) : Likewise.
+   * (duplicate_ssa_name_fn) : Check pointer type and call correct
+   duplicate function.
+   * gcc/tree-vrp.c (extract_exp_value_range): New function.
+   * (simplify_stmt_using_ranges): Call extract_exp_value_range and
+   tree_ssa_set_value_range.
+   * gcc/tree.c (tree_ssa_set_value_range): New function.
+   * gcc/tree.h (SSA_NAME_PTR_INFO) : changed to access via union
+   * gcc/tree.h (SSA_NAME_RANGE_INFO) : New macro
+



diff --git a/gcc/tree-flow.h b/gcc/tree-flow.h
index 24fcfbf..dd4e2f5 100644
--- a/gcc/tree-flow.h
+++ b/gcc/tree-flow.h
@@ -147,6 +147,19 @@ struct GTY(()) ptr_info_def
   unsigned int misalign;
 };
 
+/* Value range information for SSA_NAMEs representing non-pointer variables.  */
+
+struct GTY (()) range_info_def {
+  /* Set to true if VR_RANGE and false if VR_ANTI_RANGE.  */
+  bool vr_range;
+  /* Minimum for value range.  */
+  double_int min;
+  /* Maximum for value range.  */
+  double_int max;
+  /* Set to true if range is valid.  */
+  bool valid;
+};
+
 
 /* It is advantageous to avoid things like life analysis for variables which
do not need PHI nodes.  This enum describes whether or not a particular
@@ -532,6 +545,7 @@ extern void replace_ssa_name_symbol (tree, tree);
 extern bool get_ptr_info_alignment (struct ptr_info_def *, unsigned int *,
 unsigned int *);
 extern void mark_ptr_info_alignment_unknown (struct ptr_info_def *);
+extern void mark_range_info_unknown (struct range_info_def *);
 extern void set_ptr_info_alignment (struct ptr_info_def *, unsigned int,
 unsigned int);
 extern void adjust_ptr_info_misalignment (struct ptr_info_def *,
diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index 2ecd139..8ccecb5 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -404,6 +404,7 @@ dump_alias_info (FILE *file)
   struct ptr_info_def *pi;
 
   if (ptr == NULL_TREE
+  || !POINTER_TYPE_P (TREE_TYPE (ptr))
 	  || SSA_NAME_IN_FREE_LIST (ptr))
 	continue;
 
diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index 0a405ce..420ae00 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -151,7 +151,11 @@ make_ssa_name_fn (struct function *fn, tree var, gimple stmt)
   SET_SSA_NAME_VAR_OR_IDENTIFIER (t, var);
 }
   SSA_NAME_DEF_STMT (t) = stmt;
-  SSA_NAME_PTR_INFO (t) = NULL;
+  if (POINTER_TYPE_P (TREE_TYPE (t)))
+SSA_NAME_PTR_INFO (t) = NULL;
+  else
+SSA_NAME_RANGE_INFO (t) = NULL;
+
   SSA_NAME_IN_FREE_LIST (t) = 0;
   SSA_NAME_IS_DEFAULT_DEF (t) = 0;
   imm = &(SSA_NAME_IMM_USE_NODE (t));
@@ -266,6 +270,14 @@ mark_ptr_info_alignment_unknown (struct ptr_info_def *pi)
   pi->misalign = 0;
 }
 
+/* Mark the range described by RI as having invalid values.  */
+
+void
+mark_range_info_unknown (struct range_info_def *ri)
+{
+  ri->valid = false;
+}
+
 /* Store the the power-of-two byte alignment and the deviation from that
alignment of pointer described by PI to ALIOGN and MISALIGN
respectively.  */
@@ -359,6 +371,26 @@ duplicate_ssa_name_ptr_info (tree name, struct ptr_info_def *ptr_info)
   SSA_NAME_PTR_INFO (name) = new_ptr_info;
 }
 
+/* Creates a duplicate of the range_info_def at RANGE_INFO for use by
+   the SSA name NAME.  */
+void
+duplicate_ssa_name_range_info (tree name, struct range_info_def *range_info)
+{
+  struct range_info_def *new_range_info;
+
+  gcc_assert (!POINTER_TYPE_P (TREE_TYPE (name)));
+  gcc_assert (!SSA_NAME_RANGE_INFO (name));
+
+  if (!range_info)
+return;
+
+  new_range_info = ggc_alloc_range_info_def ();
+  *new_range_info = *range_info;
+
+  SSA_NAME_RANGE_INFO (name) = new_range_info;
+}
+
+
 
 /* Creates a duplicate of a ssa name NAME tobe defined by statement STMT
in function FN.  */
@@ -367,10 +399,20 @@ tree
 duplicate_ssa_name_fn (struct function *fn, tree name, gimple stmt)
 {
   tree new_name = copy_ssa_name_fn (fn, name, stmt);
-  struct ptr_info_def *old_ptr_info = SSA_NAME_PTR_INFO (name);
+  if (POINTER_TYPE_P (TREE_TYPE (name)))
+{
+  struct ptr_info_def *old_ptr_info = SSA_NAME_PTR_INFO (name);
+
+  if (old_ptr_info)
+duplica

Re: [GOOGLE] Unrestrict early inline restrictions for AutoFDO

2013-06-02 Thread Xinliang David Li
auto profile info is not available yet in early inlining, why would
this change make any difference? Can you just reset the max_iters to a
higher value for autoFDO?

David

On Sun, Jun 2, 2013 at 6:21 PM, Dehao Chen  wrote:
> The patch was committed to google-4_8, but it causes problem because
> einline sets PARAM_EARLY_INLINING_INSNS = 11. This will cause
> recursive inlining at einline stage (e.g. main->foo, foo->bar,
> bar->foo) when autofdo is enabled.
>
> The following patch can fix the problem by doing more targetted early 
> inlining:
>
> Index: gcc/predict.c
> ===
> --- gcc/predict.c (revision 199593)
> +++ gcc/predict.c (working copy)
> @@ -175,6 +175,8 @@ cgraph_maybe_hot_edge_p (struct cgraph_edge *edge)
>&& !maybe_hot_count_p (NULL,
>   edge->count))
>  return false;
> +  if (flag_auto_profile)
> +return false;
>if (edge->caller->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED
>|| (edge->callee
>&& edge->callee->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED))
>
> Performance testing on-going...
>
> Dehao
>
> On Wed, May 29, 2013 at 3:44 PM, Dehao Chen  wrote:
>> OK, I'll commit the early inline part.
>>
>> Dehao
>>
>> On Wed, May 29, 2013 at 10:00 AM, Xinliang David Li  
>> wrote:
>>> The early inlining part is ok. The tracer optimization should be
>>> revisited -- we should have more fine grain control on it (for
>>> instance, based on FDO summary -- but that should be common to
>>> FDO/LIPO).
>>>
>>> David
>>>
>>> On Wed, May 29, 2013 at 9:39 AM, Dehao Chen  wrote:
>>>> In gcc4-8, the max einline iterations are restricted to 1. For
>>>> AutoFDO, this is bad because early inline is not size restricted. This
>>>> patch allows einline to do multiple iterations in AutoFDO. It also
>>>> enables tracer optimization in AutoFDO.
>>>>
>>>> Bootstrapped and passed regression test.
>>>>
>>>> OK for googel-4_8?
>>>>
>>>> Thanks,
>>>> Dehao
>>>>
>>>> Index: gcc/ipa-inline.c
>>>> ===
>>>> --- gcc/ipa-inline.c (revision 199416)
>>>> +++ gcc/ipa-inline.c (working copy)
>>>> @@ -2161,7 +2161,8 @@ early_inliner (void)
>>>>  {
>>>>/* We iterate incremental inlining to get trivial cases of indirect
>>>>   inlining.  */
>>>> -  while (iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS)
>>>> +  while ((flag_auto_profile
>>>> +  || iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS))
>>>>   && early_inline_small_functions (node))
>>>>   {
>>>>timevar_push (TV_INTEGRATION);
>>>> Index: gcc/opts.c
>>>> ===
>>>> --- gcc/opts.c (revision 199416)
>>>> +++ gcc/opts.c (working copy)
>>>> @@ -1644,6 +1644,8 @@ common_handle_option (struct gcc_options *opts,
>>>>   opts->x_flag_peel_loops = value;
>>>>if (!opts_set->x_flag_value_profile_transformations)
>>>>   opts->x_flag_value_profile_transformations = value;
>>>> +  if (!opts_set->x_flag_tracer)
>>>> + opts->x_flag_tracer = value;
>>>>if (!opts_set->x_flag_inline_functions)
>>>>   opts->x_flag_inline_functions = value;
>>>>if (!opts_set->x_flag_ipa_cp)


[PATCH][2 of 2] RTL expansion for zero sign extension elimination with VRP

2013-06-02 Thread Kugan

Hi,

This patch removes some of the redundant sign/zero extensions using 
value range information during RTL expansion.


When a GIMPLE_ASSIGN stmt whose LHS type is smaller than a word is expanded 
to RTL, and we can prove that the value of the RHS expression always fits in 
the LHS type and there is no sign conversion, the truncation and extension 
to fit the type are redundant. For a SUBREG_PROMOTED_VAR_P, the subreg and 
zero/sign extensions are therefore redundant.


For example, when an expression is evaluated and its value is assigned
to a variable of type short, the generated RTL would look something like
the following.

(set (reg:SI 110)
 (zero_extend:SI (subreg:HI (reg:SI 117) 0)))

However, if value range propagation lets us say for certain
that the value of the expression in register 117 is
within the limits of short and there is no sign conversion, we do not
need to perform the subreg and zero_extend; instead we can generate the
following RTL.

(set (reg:SI 110)
 (reg:SI 117))

Same could be done for other assign statements.
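
As a rough illustration of the condition described above, the check "the
value range fits in the narrower type" can be sketched in C as follows.
This is not the code from the patch: zero_ext_redundant, sign_ext_redundant
and zero_extend_hi are hypothetical helpers, and the range is assumed to be
given as a plain [min, max] pair of 64-bit integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if every value in [MIN, MAX] is representable in an unsigned
   type of BITS bits, so zero-extending from that width changes nothing.  */
bool
zero_ext_redundant (int64_t min, int64_t max, unsigned bits)
{
  assert (bits >= 1 && bits < 63 && min <= max);
  return min >= 0 && max <= (((int64_t) 1 << bits) - 1);
}

/* True if every value in [MIN, MAX] is representable in a signed
   type of BITS bits, so sign-extending from that width changes nothing.  */
bool
sign_ext_redundant (int64_t min, int64_t max, unsigned bits)
{
  int64_t hi, lo;

  assert (bits >= 1 && bits < 63 && min <= max);
  hi = ((int64_t) 1 << (bits - 1)) - 1;
  lo = -hi - 1;
  return min >= lo && max <= hi;
}

/* What (zero_extend:SI (subreg:HI x 0)) computes for a 32-bit X:
   keep the low 16 bits and clear the rest.  */
uint32_t
zero_extend_hi (uint32_t x)
{
  return (uint32_t) (uint16_t) x;
}
```

When zero_ext_redundant holds for a 16-bit width, zero_extend_hi returns
its argument unchanged, which is the case where the plain register move
can replace the subreg/zero_extend pair.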

This patch is based on the earlier attempt posted in 
http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00610.html and addresses 
the review comments of Richard Biener. I am post-processing the 
expand_expr_real_2 output in expand_gimple_stmt, though. The reason for this 
is that I would like to process all possible assignment stmts, not 
just the CASE_CONVERT case and/or REDUCE_BITFIELD.


This change, along with expansion, improves the geomean of the spec2k int 
benchmark with ref by about 3.5% on an ARM Chromebook.


Tested on X86_64 and ARM.

I would like review comments on this.

Thanks,
Kugan


+2013-06-03  Kugan Vivekanandarajah  
+
+   * gcc/dojump.c (do_compare_and_jump): generates rtl without
+   zero/sign extension if redundant.
+   * gcc/cfgexpand.c (expand_gimple_stmt_1): Likewise.
+   * gcc/gimple.c (gimple_assign_is_zero_sign_ext_redundant) : New
+   function.
+   * gcc/gimple.h (gimple_assign_is_zero_sign_ext_redundant) : New
+   function definition.
+







diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index c187273..ce980bc 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -2311,6 +2311,17 @@ expand_gimple_stmt_1 (gimple stmt)
 
 	if (temp == target)
 	  ;
+/* If the value in SUBREG of temp fits that SUBREG (does not
+   overflow) and is assigned to target SUBREG of the same mode
+   without sign conversion, we can skip the SUBREG
+   and extension.  */
+else if (promoted
+ && gimple_assign_is_zero_sign_ext_redundant (stmt)
+ && (GET_CODE (temp) == SUBREG)
+ && (GET_MODE (target) == GET_MODE (temp))
+ && (GET_MODE (SUBREG_REG (target))
+ == GET_MODE (SUBREG_REG (temp))))
+	  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
 	else if (promoted)
 	  {
 		int unsignedp = SUBREG_PROMOTED_UNSIGNED_P (target);
diff --git a/gcc/dojump.c b/gcc/dojump.c
index 3f04eac..cb13f3a 100644
--- a/gcc/dojump.c
+++ b/gcc/dojump.c
@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ggc.h"
 #include "basic-block.h"
 #include "tm_p.h"
+#include "gimple.h"
 
 static bool prefer_and_bit_test (enum machine_mode, int);
 static void do_jump_by_parts_greater (tree, tree, int, rtx, rtx, int);
@@ -1108,6 +1109,60 @@ do_compare_and_jump (tree treeop0, tree treeop1, enum rtx_code signed_code,
 
   type = TREE_TYPE (treeop0);
   mode = TYPE_MODE (type);
+
+  /* Is zero/sign extension redundant as per VRP.  */
+  bool op0_ext_redundant = false;
+  bool op1_ext_redundant = false;
+
+  /* If promoted and the value in SUBREG of op0 fits (does not overflow),
+ it is a candidate for extension elimination.  */
+  if (GET_CODE (op0) == SUBREG && SUBREG_PROMOTED_VAR_P (op0))
+op0_ext_redundant =
+  gimple_assign_is_zero_sign_ext_redundant (SSA_NAME_DEF_STMT (treeop0));
+
+  /* If promoted and the value in SUBREG of op1 fits (does not overflow),
+ it is a candidate for extension elimination.  */
+  if (GET_CODE (op1) == SUBREG && SUBREG_PROMOTED_VAR_P (op1))
+op1_ext_redundant =
+  gimple_assign_is_zero_sign_ext_redundant (SSA_NAME_DEF_STMT (treeop1));
+
+  /* If zero/sign extension is redundant, generate RTL
+ for operands without zero/sign extension.  */
+  if ((op0_ext_redundant || TREE_CODE (treeop0) == INTEGER_CST)
+  && (op1_ext_redundant || TREE_CODE (treeop1) == INTEGER_CST))
+{
+  if (TREE_CODE (treeop1) == INTEGER_CST)
+{
+  /* First operand is constant.  */
+  rtx new_op0 = gen_reg_rtx (GET_MODE (SUBREG_REG (op0)));
+
+  emit_move_insn (new_op0, SUBREG_REG (op0));
+  op0 = new_op0;
+}
+  else if (TREE_CODE (treeop0) == INTEGER_CST)
+{
+  /* Other operand is constant.  */
+  rtx new_op1 = gen_reg_rtx 

Re: [GOOGLE] Unrestrict early inline restrictions for AutoFDO

2013-06-02 Thread Dehao Chen
On Sun, Jun 2, 2013 at 7:14 PM, Xinliang David Li  wrote:
>
> auto profile info is not available yet in early inlining, why would
> this change make any difference?

Because the check of PARAM_EARLY_INLINING_INSNS is after the check of
cgraph_maybe_hot_edge_p in early inline. If
cgraph_maybe_hot_edge_p fails, the early inline will not happen even
if growth is less than PARAM_EARLY_INLINING_INSNS.

>
> Can you just reset the max_iters to a
> higher value for autoFDO?

We could do that, but it could still lead to some code bloat because
recursive inlines can happen for at most, say 10, iterations.

Dehao

>
> David
>
> On Sun, Jun 2, 2013 at 6:21 PM, Dehao Chen  wrote:
> > The patch was committed to google-4_8, but it causes problem because
> > einline sets PARAM_EARLY_INLINING_INSNS = 11. This will cause
> > recursive inlining at einline stage (e.g. main->foo, foo->bar,
> > bar->foo) when autofdo is enabled.
> >
> > The following patch can fix the problem by doing more targetted early 
> > inlining:
> >
> > Index: gcc/predict.c
> > ===
> > --- gcc/predict.c (revision 199593)
> > +++ gcc/predict.c (working copy)
> > @@ -175,6 +175,8 @@ cgraph_maybe_hot_edge_p (struct cgraph_edge *edge)
> >&& !maybe_hot_count_p (NULL,
> >   edge->count))
> >  return false;
> > +  if (flag_auto_profile)
> > +return false;
> >if (edge->caller->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED
> >|| (edge->callee
> >&& edge->callee->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED))
> >
> > Performance testing on-going...
> >
> > Dehao
> >
> > On Wed, May 29, 2013 at 3:44 PM, Dehao Chen  wrote:
> >> OK, I'll commit the early inline part.
> >>
> >> Dehao
> >>
> >> On Wed, May 29, 2013 at 10:00 AM, Xinliang David Li  
> >> wrote:
> >>> The early inlining part is ok. The tracer optimization should be
> >>> revisited -- we should have more fine grain control on it (for
> >>> instance, based on FDO summary -- but that should be common to
> >>> FDO/LIPO).
> >>>
> >>> David
> >>>
> >>> On Wed, May 29, 2013 at 9:39 AM, Dehao Chen  wrote:
> >>>> In gcc4-8, the max einline iterations are restricted to 1. For
> >>>> AutoFDO, this is bad because early inline is not size restricted. This
> >>>> patch allows einline to do multiple iterations in AutoFDO. It also
> >>>> enables tracer optimization in AutoFDO.
> >>>>
> >>>> Bootstrapped and passed regression test.
> >>>>
> >>>> OK for googel-4_8?
> >>>>
> >>>> Thanks,
> >>>> Dehao
> >>>>
> >>>> Index: gcc/ipa-inline.c
> >>>> ===
> >>>> --- gcc/ipa-inline.c (revision 199416)
> >>>> +++ gcc/ipa-inline.c (working copy)
> >>>> @@ -2161,7 +2161,8 @@ early_inliner (void)
> >>>>  {
> >>>>/* We iterate incremental inlining to get trivial cases of 
> >>>> indirect
> >>>>   inlining.  */
> >>>> -  while (iterations < PARAM_VALUE 
> >>>> (PARAM_EARLY_INLINER_MAX_ITERATIONS)
> >>>> +  while ((flag_auto_profile
> >>>> +  || iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS))
> >>>>   && early_inline_small_functions (node))
> >>>>   {
> >>>>timevar_push (TV_INTEGRATION);
> >>>> Index: gcc/opts.c
> >>>> ===
> >>>> --- gcc/opts.c (revision 199416)
> >>>> +++ gcc/opts.c (working copy)
> >>>> @@ -1644,6 +1644,8 @@ common_handle_option (struct gcc_options *opts,
> >>>>   opts->x_flag_peel_loops = value;
> >>>>if (!opts_set->x_flag_value_profile_transformations)
> >>>>   opts->x_flag_value_profile_transformations = value;
> >>>> +  if (!opts_set->x_flag_tracer)
> >>>> + opts->x_flag_tracer = value;
> >>>>if (!opts_set->x_flag_inline_functions)
> >>>>   opts->x_flag_inline_functions = value;
> >>>>if (!opts_set->x_flag_ipa_cp)


[ACTIVITY] 10-14 May 2013

2013-06-02 Thread Kugan

== Progress ==
* VRP based zero/sign extension
  - Tested and posted the latest patch

* Better end of loop counter optimisation
  - Tree level optimization are optimized in mainline
  - Christophe noted a slight change in asm generated from earlier version
  - tracked down the patch causing this and communicated this.

* Generate a single call to divmod
  - Looked at expand_divmod to understand how __aeabi_idiv and 
__aeabi_idivmod are generated.



== Plan ==

* Better end of loop counter optimisation
  - Change the pattern to remove this additional instruction if necessary.

* Generate a single call to divmod
  - Come up with a solution


Re: [ACTIVITY] 27-31 May 2013

2013-06-02 Thread Kugan

Apologies for sending again. Corrected wrong dates in subject now.

On 03/06/13 12:19, Kugan wrote:

== Progress ==
* VRP based zero/sign extension
   - Tested and posted the latest patch

* Better end of loop counter optimisation
   - Tree level optimization are optimized in mainline
   - Christophe noted a slight change in asm generated from earlier version
   - tracked down the patch causing this and communicated this.

* Generate a single call to divmod
   - Looked at expand_divmod to understand how __aeabi_idiv and
__aeabi_idivmod are generated.


== Plan ==

* Better end of loop counter optimisation
   - Change the pattern to remove this additional instruction if necessary.

* Generate a single call to divmod
   - Come up with a solution




Re: [GOOGLE] Unrestrict early inline restrictions for AutoFDO

2013-06-02 Thread Xinliang David Li
If the purpose of the fix is to filter early inlinings with code
growth in autoFDO, the proposed fix is the wrong way to do it -- it
changes the meaning of cgraph_maybe_hot_edge_p.

David

On Sun, Jun 2, 2013 at 7:25 PM, Dehao Chen  wrote:
> On Sun, Jun 2, 2013 at 7:14 PM, Xinliang David Li  wrote:
>>
>> auto profile info is not available yet in early inlining, why would
>> this change make any difference?
>
> Because the check of PARAM_EARLY_INLINING_INSNS is after the check of
> cgraph_maybe_hot_edge_p in early inline. If
> cgraph_maybe_hot_edge_p fails, the early inline will not happen even
> if growth is less than PARAM_EARLY_INLINING_INSNS.
>
>>
>> Can you just reset the max_iters to a
>> higher value for autoFDO?
>
> We could do that, but it could still lead to some code bloat because
> recursive inlines can happen for at most, say 10, iterations.
>
> Dehao
>
>>
>> David
>>
>> On Sun, Jun 2, 2013 at 6:21 PM, Dehao Chen  wrote:
>> > The patch was committed to google-4_8, but it causes problem because
>> > einline sets PARAM_EARLY_INLINING_INSNS = 11. This will cause
>> > recursive inlining at einline stage (e.g. main->foo, foo->bar,
>> > bar->foo) when autofdo is enabled.
>> >
>> > The following patch can fix the problem by doing more targetted early 
>> > inlining:
>> >
>> > Index: gcc/predict.c
>> > ===
>> > --- gcc/predict.c (revision 199593)
>> > +++ gcc/predict.c (working copy)
>> > @@ -175,6 +175,8 @@ cgraph_maybe_hot_edge_p (struct cgraph_edge *edge)
>> >&& !maybe_hot_count_p (NULL,
>> >   edge->count))
>> >  return false;
>> > +  if (flag_auto_profile)
>> > +return false;
>> >if (edge->caller->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED
>> >|| (edge->callee
>> >&& edge->callee->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED))
>> >
>> > Performance testing on-going...
>> >
>> > Dehao
>> >
>> > On Wed, May 29, 2013 at 3:44 PM, Dehao Chen  wrote:
>> >> OK, I'll commit the early inline part.
>> >>
>> >> Dehao
>> >>
>> >> On Wed, May 29, 2013 at 10:00 AM, Xinliang David Li  
>> >> wrote:
>> >>> The early inlining part is ok. The tracer optimization should be
>> >>> revisited -- we should have more fine grain control on it (for
>> >>> instance, based on FDO summary -- but that should be common to
>> >>> FDO/LIPO).
>> >>>
>> >>> David
>> >>>
>> >>> On Wed, May 29, 2013 at 9:39 AM, Dehao Chen  wrote:
>> >>>> In gcc4-8, the max einline iterations are restricted to 1. For
>> >>>> AutoFDO, this is bad because early inline is not size restricted. This
>> >>>> patch allows einline to do multiple iterations in AutoFDO. It also
>> >>>> enables tracer optimization in AutoFDO.
>> >>>>
>> >>>> Bootstrapped and passed regression test.
>> >>>>
>> >>>> OK for googel-4_8?
>> >>>>
>> >>>> Thanks,
>> >>>> Dehao
>> >>>>
>> >>>> Index: gcc/ipa-inline.c
>> >>>> ===
>> >>>> --- gcc/ipa-inline.c (revision 199416)
>> >>>> +++ gcc/ipa-inline.c (working copy)
>> >>>> @@ -2161,7 +2161,8 @@ early_inliner (void)
>> >>>>  {
>> >>>>/* We iterate incremental inlining to get trivial cases of 
>> >>>> indirect
>> >>>>   inlining.  */
>> >>>> -  while (iterations < PARAM_VALUE 
>> >>>> (PARAM_EARLY_INLINER_MAX_ITERATIONS)
>> >>>> +  while ((flag_auto_profile
>> >>>> +  || iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS))
>> >>>>   && early_inline_small_functions (node))
>> >>>>   {
>> >>>>timevar_push (TV_INTEGRATION);
>> >>>> Index: gcc/opts.c
>> >>>> ===
>> >>>> --- gcc/opts.c (revision 199416)
>> >>>> +++ gcc/opts.c (working copy)
>> >>>> @@ -1644,6 +1644,8 @@ common_handle_option (struct gcc_options *opts,
>> >>>>   opts->x_flag_peel_loops = value;
>> >>>>if (!opts_set->x_flag_value_profile_transformations)
>> >>>>   opts->x_flag_value_profile_transformations = value;
>> >>>> +  if (!opts_set->x_flag_tracer)
>> >>>> + opts->x_flag_tracer = value;
>> >>>>if (!opts_set->x_flag_inline_functions)
>> >>>>   opts->x_flag_inline_functions = value;
>> >>>>if (!opts_set->x_flag_ipa_cp)


Re: [GOOGLE] Unrestrict early inline restrictions for AutoFDO

2013-06-02 Thread Dehao Chen
I've updated the patch to check it at ipa-inline:

Index: gcc/ipa-inline.c
===
--- gcc/ipa-inline.c (revision 199593)
+++ gcc/ipa-inline.c (working copy)
@@ -434,6 +434,16 @@ want_early_inline_function_p (struct cgraph_edge *

   if (growth <= PARAM_VALUE (PARAM_EARLY_INLINING_INSNS_ANY))
  ;
+  else if (flag_auto_profile)
+ {
+  if (dump_file)
+fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
+ "call is cold in profiling and code would grow by %i\n",
+ xstrdup (cgraph_node_name (e->caller)), e->caller->uid,
+ xstrdup (cgraph_node_name (callee)), callee->uid,
+ growth);
+want_inline = false;
+ }
   else if (!cgraph_maybe_hot_edge_p (e))
  {
   if (dump_file)

Thanks,
Dehao
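
As a rough illustration of the ordering the hunk above sets up, here is a
standalone C sketch of the first three branches of
want_early_inline_function_p.  Everything in it is hypothetical
scaffolding: the struct, the enum and sketch_want_early_inline are
invented names, the fields merely stand in for
PARAM_EARLY_INLINING_INSNS_ANY, flag_auto_profile and
cgraph_maybe_hot_edge_p, and the later checks of the real function are
omitted and assumed to pass.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC state; none of these names exist in GCC.  */
struct early_inline_query
{
  int growth;          /* estimated size growth from inlining the call */
  int insns_any;       /* stand-in for PARAM_EARLY_INLINING_INSNS_ANY */
  bool auto_profile;   /* stand-in for flag_auto_profile */
  bool edge_maybe_hot; /* stand-in for cgraph_maybe_hot_edge_p (e) */
};

enum early_inline_verdict
{
  INLINE_OK,             /* none of the modelled checks rejected the call */
  REJECT_AUTOFDO_GROWTH, /* AutoFDO on and growth above the "any" limit */
  REJECT_COLD_EDGE       /* no AutoFDO, edge not considered maybe-hot */
};

/* Model only the branch order from the hunk: the growth test comes first,
   then the new AutoFDO branch, then the existing hot-edge test.  */
static enum early_inline_verdict
sketch_want_early_inline (const struct early_inline_query *q)
{
  assert (q != 0);

  if (q->growth <= q->insns_any)
    return INLINE_OK;              /* small enough: accepted regardless */
  else if (q->auto_profile)
    return REJECT_AUTOFDO_GROWTH;  /* new branch added by the patch */
  else if (!q->edge_maybe_hot)
    return REJECT_COLD_EDGE;       /* pre-existing cold-edge rejection */

  return INLINE_OK;                /* remaining real checks not modelled */
}
```

With the AutoFDO stand-in set, the hot-edge test is never reached, so
cgraph_maybe_hot_edge_p itself keeps its meaning and only calls whose
growth exceeds the "any" limit are filtered.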

On Sun, Jun 2, 2013 at 9:08 PM, Xinliang David Li  wrote:
> If the purpose of the fix is to filter early inlinings with code
> growth in autoFDO, the proposed fix is the wrong way to do it -- it
> changes the meaning of cgraph_maybe_hot_edge_p.
>
> David
>
> On Sun, Jun 2, 2013 at 7:25 PM, Dehao Chen  wrote:
>> On Sun, Jun 2, 2013 at 7:14 PM, Xinliang David Li  wrote:
>>>
>>> auto profile info is not available yet in early inlining, why would
>>> this change make any difference?
>>
>> Because the check of PARAM_EARLY_INLINING_INSNS is after the check of
>> cgraph_maybe_hot_edge_p in early inline. If
>> cgraph_maybe_hot_edge_p fails, the early inline will not happen even
>> if growth is less than PARAM_EARLY_INLINING_INSNS.
>>
>>>
>>> Can you just reset the max_iters to a
>>> higher value for autoFDO?
>>
>> We could do that, but it could still lead to some code bloat because
>> recursive inlines can happen for at most, say 10, iterations.
>>
>> Dehao
>>
>>>
>>> David
>>>
>>> On Sun, Jun 2, 2013 at 6:21 PM, Dehao Chen  wrote:
>>> > The patch was committed to google-4_8, but it causes a problem because
>>> > einline sets PARAM_EARLY_INLINING_INSNS = 11. This will cause
>>> > recursive inlining at einline stage (e.g. main->foo, foo->bar,
>>> > bar->foo) when autofdo is enabled.
>>> >
>>> > The following patch can fix the problem by doing more targeted early
>>> > inlining:
>>> >
>>> > Index: gcc/predict.c
>>> > ===
>>> > --- gcc/predict.c (revision 199593)
>>> > +++ gcc/predict.c (working copy)
>>> > @@ -175,6 +175,8 @@ cgraph_maybe_hot_edge_p (struct cgraph_edge *edge)
>>> >&& !maybe_hot_count_p (NULL,
>>> >   edge->count))
>>> >  return false;
>>> > +  if (flag_auto_profile)
>>> > +return false;
>>> >if (edge->caller->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED
>>> >|| (edge->callee
>>> >&& edge->callee->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED))
>>> >
>>> > Performance testing on-going...
>>> >
>>> > Dehao
>>> >
>>> > On Wed, May 29, 2013 at 3:44 PM, Dehao Chen  wrote:
>>> >> OK, I'll commit the early inline part.
>>> >>
>>> >> Dehao
>>> >>
>>> >> On Wed, May 29, 2013 at 10:00 AM, Xinliang David Li  
>>> >> wrote:
>>> >>> The early inlining part is ok. The tracer optimization should be
>>> >>> revisited -- we should have more fine grain control on it (for
>>> >>> instance, based on FDO summary -- but that should be common to
>>> >>> FDO/LIPO).
>>> >>>
>>> >>> David
>>> >>>
>>> >>> On Wed, May 29, 2013 at 9:39 AM, Dehao Chen  wrote:
>>>  In gcc4-8, the max einline iterations are restricted to 1. For
>>>  AutoFDO, this is bad because early inline is not size restricted. This
>>>  patch allows einline to do multiple iterations in AutoFDO. It also
>>>  enables tracer optimization in AutoFDO.
>>> 
>>>  Bootstrapped and passed regression test.
>>> 
>>>  OK for google-4_8?
>>> 
>>>  Thanks,
>>>  Dehao
>>> 
>>>  Index: gcc/ipa-inline.c
>>>  ===
>>>  --- gcc/ipa-inline.c (revision 199416)
>>>  +++ gcc/ipa-inline.c (working copy)
>>>  @@ -2161,7 +2161,8 @@ early_inliner (void)
>>>   {
>>> /* We iterate incremental inlining to get trivial cases of 
>>>  indirect
>>>    inlining.  */
>>>  -  while (iterations < PARAM_VALUE 
>>>  (PARAM_EARLY_INLINER_MAX_ITERATIONS)
>>>  +  while ((flag_auto_profile
>>>  +  || iterations < PARAM_VALUE 
>>>  (PARAM_EARLY_INLINER_MAX_ITERATIONS))
>>>    && early_inline_small_functions (node))
>>>    {
>>> timevar_push (TV_INTEGRATION);
>>>  Index: gcc/opts.c
>>>  ===
>>>  --- gcc/opts.c (revision 199416)
>>>  +++ gcc/opts.c (working copy)
>>>  @@ -1644,6 +1644,8 @@ common_handle_option (struct gcc_options *opts,
>>>    opts->x_flag_peel_loops = value;
>>> if (!opts_set->x_flag_value_profile_transformations)
>>>    opts->x_flag_value_profile_transformations = value;
>>> >>>

Re: [GOOGLE] Unrestrict early inline restrictions for AutoFDO

2013-06-02 Thread Xinliang David Li
The patch is OK if the performance tests pass.  For a complete fix, would
it be better to lower PARAM_EARLY_INLINING_INSNS from 11 to a smaller
value for AutoFDO, or to use a different parameter?

David

On Sun, Jun 2, 2013 at 9:19 PM, Dehao Chen  wrote:
> I've updated the patch to check it at ipa-inline:
>
> Index: gcc/ipa-inline.c
> ===
> --- gcc/ipa-inline.c (revision 199593)
> +++ gcc/ipa-inline.c (working copy)
> @@ -434,6 +434,16 @@ want_early_inline_function_p (struct cgraph_edge *
>
>if (growth <= PARAM_VALUE (PARAM_EARLY_INLINING_INSNS_ANY))
>   ;
> +  else if (flag_auto_profile)
> + {
> +  if (dump_file)
> +fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
> + "call is cold in profiling and code would grow by %i\n",
> + xstrdup (cgraph_node_name (e->caller)), e->caller->uid,
> + xstrdup (cgraph_node_name (callee)), callee->uid,
> + growth);
> +want_inline = false;
> + }
>else if (!cgraph_maybe_hot_edge_p (e))
>   {
>if (dump_file)
>
> Thanks,
> Dehao
>
> On Sun, Jun 2, 2013 at 9:08 PM, Xinliang David Li  wrote:
>> If the purpose of the fix is to filter early inlinings with code
>> growth in autoFDO, the proposed fix is the wrong way to do it -- it
>> changes the meaning of cgraph_maybe_hot_edge_p.
>>
>> David
>>
>> On Sun, Jun 2, 2013 at 7:25 PM, Dehao Chen  wrote:
>>> On Sun, Jun 2, 2013 at 7:14 PM, Xinliang David Li  
>>> wrote:

>>>> auto profile info is not available yet in early inlining, why would
>>>> this change make any difference?
>>>
>>> Because the check of PARAM_EARLY_INLINING_INSNS is after the check of
>>> cgraph_maybe_hot_edge_p in early inline. If
>>> cgraph_maybe_hot_edge_p fails, the early inline will not happen even
>>> if growth is less than PARAM_EARLY_INLINING_INSNS.
>>>

>>>> Can you just reset the max_iters to a
>>>> higher value for autoFDO?
>>>
>>> We could do that, but it could still lead to some code bloat because
>>> recursive inlines can happen for at most, say 10, iterations.
>>>
>>> Dehao
>>>

>>>> David
>>>>
>>>> On Sun, Jun 2, 2013 at 6:21 PM, Dehao Chen  wrote:
>>>> > The patch was committed to google-4_8, but it causes a problem because
>>>> > einline sets PARAM_EARLY_INLINING_INSNS = 11. This will cause
>>>> > recursive inlining at einline stage (e.g. main->foo, foo->bar,
>>>> > bar->foo) when autofdo is enabled.
>>>> >
>>>> > The following patch can fix the problem by doing more targeted early
>>>> > inlining:
>>>> >
>>>> > Index: gcc/predict.c
>>>> > ===
>>>> > --- gcc/predict.c (revision 199593)
>>>> > +++ gcc/predict.c (working copy)
>>>> > @@ -175,6 +175,8 @@ cgraph_maybe_hot_edge_p (struct cgraph_edge *edge)
>>>> >&& !maybe_hot_count_p (NULL,
>>>> >   edge->count))
>>>> >  return false;
>>>> > +  if (flag_auto_profile)
>>>> > +return false;
>>>> >if (edge->caller->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED
>>>> >|| (edge->callee
>>>> >&& edge->callee->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED))
>>>> >
>>>> > Performance testing on-going...
>>>> >
>>>> > Dehao
>>>> >
>>>> > On Wed, May 29, 2013 at 3:44 PM, Dehao Chen  wrote:
>>>> >> OK, I'll commit the early inline part.
>>>> >>
>>>> >> Dehao
>>>> >>
>>>> >> On Wed, May 29, 2013 at 10:00 AM, Xinliang David Li 
>>>> >>  wrote:
>>>> >>> The early inlining part is ok. The tracer optimization should be
>>>> >>> revisited -- we should have more fine grain control on it (for
>>>> >>> instance, based on FDO summary -- but that should be common to
>>>> >>> FDO/LIPO).
>>>> >>>
>>>> >>> David
>>>> >>>
>>>> >>> On Wed, May 29, 2013 at 9:39 AM, Dehao Chen  wrote:
  In gcc4-8, the max einline iterations are restricted to 1. For
  AutoFDO, this is bad because early inline is not size restricted. This
  patch allows einline to do multiple iterations in AutoFDO. It also
  enables tracer optimization in AutoFDO.
 
  Bootstrapped and passed regression test.
 
  OK for google-4_8?
 
  Thanks,
  Dehao
 
  Index: gcc/ipa-inline.c
  ===
  --- gcc/ipa-inline.c (revision 199416)
  +++ gcc/ipa-inline.c (working copy)
  @@ -2161,7 +2161,8 @@ early_inliner (void)
   {
 /* We iterate incremental inlining to get trivial cases of 
  indirect
    inlining.  */
  -  while (iterations < PARAM_VALUE 
  (PARAM_EARLY_INLINER_MAX_ITERATIONS)
  +  while ((flag_auto_profile
  +  || iterations < PARAM_VALUE 
  (PARAM_EARLY_INLINER_MAX_ITERATIONS))
    && early_inline_small_functions (node))
    {
 timevar_push (TV_INTEGRATION);
  Index: gcc/opts.c
  ==