[PATCH][ARM] Discourage use of NEON on Cortex-A8

2011-03-13 Thread Andrew Stubbs
This patch discourages the use of NEON for integer operations on ARM 
Cortex-A8.


The problem is that transferring data from NEON/VFP registers to core 
registers is prohibitively expensive on A8. This should not affect 
Cortex-A9 in the same way.


This change gives a 6% increase in performance on SPEC2000 crafty, on an 
imx51 board.


An older version of the patch has been used for some time in the 
CodeSourcery and Linaro toolchains, so it's fairly well tested.


OK (for stage 1)?

Andrew
2011-03-13  Bernd Schmidt  
	Andrew Stubbs  

	gcc/
	* config/arm/vfp.md (arm_movdi_vfp): Enable only when not tuning
	for Cortex-A8.
	(arm_movdi_vfp_cortexa8): New pattern.
	* config/arm/neon.md (adddi3_neon, subdi3_neon, anddi3_neon,
	iordi3_neon, xordi3_neon): Add alternatives to discourage Neon
	instructions when tuning for Cortex-A8.  Set attribute "arch".
	* config/arm/arm.md: Move include arm-tune.md up a bit.
	(define_attr "arch"): Add "onlya8" and "nota8" values.
	(define_attr "arch_enabled"): Handle "onlya8" and "nota8".
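
For readers unfamiliar with the "arch"/"arch_enabled" machinery, the effect of the two new values can be modelled in plain C.  This is only an illustrative sketch, not GCC code: the enum and function names are made up, and only the onlya8/nota8 conditions are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the "tune" and "arch" attribute values.  */
enum tune_model { TUNE_CORTEXA8, TUNE_CORTEXA9, TUNE_OTHER };
enum arch_model { ARCH_ANY, ARCH_ONLYA8, ARCH_NOTA8 };

/* Models the two new arms of "arch_enabled": an alternative tagged
   onlya8 is usable only when tuning for Cortex-A8, and one tagged
   nota8 only when tuning for anything else.  */
static bool
arch_enabled_model (enum arch_model arch, enum tune_model tune)
{
  switch (arch)
    {
    case ARCH_ANY:
      return true;
    case ARCH_ONLYA8:
      return tune == TUNE_CORTEXA8;
    case ARCH_NOTA8:
      return tune != TUNE_CORTEXA8;
    }
  return false;
}

/* For any tuning, exactly one of the two NEON alternatives is live.  */
static void
check_exclusive (enum tune_model tune)
{
  assert (arch_enabled_model (ARCH_ONLYA8, tune)
	  != arch_enabled_model (ARCH_NOTA8, tune));
}
```

In adddi3_neon, alternatives 0 ("=w", nota8) and 1 ("?w", onlya8) emit the same vadd.i64, so for a given tuning only one of them is enabled; the "?" makes the NEON form slightly less attractive to the register allocator when tuning for Cortex-A8.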

--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -149,6 +149,9 @@
 ;;---
 ;; Attributes
 
+;; Processor type.  This is created automatically from arm-cores.def.
+(include "arm-tune.md")
+
 ; IS_THUMB is set to 'yes' when we are generating Thumb code, and 'no' when
 ; generating ARM code.  This is used to control the length of some insn
 ; patterns that share the same RTL in both ARM and Thumb code.
@@ -192,7 +195,7 @@
 ; for ARM or Thumb-2 with arm_arch6, and nov6 for ARM without
 ; arm_arch6.  This attribute is used to compute attribute "enabled",
 ; use type "any" to enable an alternative in all cases.
-(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6"
+(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,onlya8,nota8"
   (const_string "any"))
 
 (define_attr "arch_enabled" "no,yes"
@@ -225,6 +228,14 @@
 
 	 (and (eq_attr "arch" "nov6")
 	  (ne (symbol_ref "(TARGET_32BIT && !arm_arch6)") (const_int 0)))
+	 (const_string "yes")
+
+	 (and (eq_attr "arch" "onlya8")
+	  (eq_attr "tune" "cortexa8"))
+	 (const_string "yes")
+
+	 (and (eq_attr "arch" "nota8")
+	  (not (eq_attr "tune" "cortexa8")))
 	 (const_string "yes")]
 	(const_string "no")))
 
@@ -485,9 +496,6 @@
 ;;---
 ;; Pipeline descriptions
 
-;; Processor type.  This is created automatically from arm-cores.def.
-(include "arm-tune.md")
-
 (define_attr "tune_cortexr4" "yes,no"
   (const (if_then_else
 	  (eq_attr "tune" "cortexr4,cortexr4f")
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -583,23 +583,25 @@
 )
 
 (define_insn "adddi3_neon"
-  [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r")
-(plus:DI (match_operand:DI 1 "s_register_operand" "%w,0,0")
- (match_operand:DI 2 "s_register_operand" "w,r,0")))
+  [(set (match_operand:DI 0 "s_register_operand" "=w,?w,?&r,?&r")
+(plus:DI (match_operand:DI 1 "s_register_operand" "%w,w,0,0")
+ (match_operand:DI 2 "s_register_operand" "w,w,r,0")))
(clobber (reg:CC CC_REGNUM))]
   "TARGET_NEON"
 {
   switch (which_alternative)
 {
-case 0: return "vadd.i64\t%P0, %P1, %P2";
-case 1: return "#";
+case 0: /* fall through */
+case 1: return "vadd.i64\t%P0, %P1, %P2";
 case 2: return "#";
+case 3: return "#";
 default: gcc_unreachable ();
 }
 }
-  [(set_attr "neon_type" "neon_int_1,*,*")
-   (set_attr "conds" "*,clob,clob")
-   (set_attr "length" "*,8,8")]
+  [(set_attr "neon_type" "neon_int_1,neon_int_1,*,*")
+   (set_attr "conds" "*,*,clob,clob")
+   (set_attr "length" "*,*,8,8")
+   (set_attr "arch" "nota8,onlya8,*,*")]
 )
 
 (define_insn "*sub3_neon"
@@ -617,24 +619,26 @@
 )
 
 (define_insn "subdi3_neon"
-  [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r,?&r")
-(minus:DI (match_operand:DI 1 "s_register_operand" "w,0,r,0")
-  (match_operand:DI 2 "s_register_operand" "w,r,0,0")))
+  [(set (match_operand:DI 0 "s_register_operand" "=w,?w,?&r,?&r,?&r")
+(minus:DI (match_operand:DI 1 "s_register_operand" "w,w,0,r,0")
+  (match_operand:DI 2 "s_register_operand" "w,w,r,0,0")))
(clobber (reg:CC CC_REGNUM))]
   "TARGET_NEON"
 {
   switch (which_alternative)
 {
-case 0: return "vsub.i64\t%P0, %P1, %P2";
-case 1: /* fall through */ 
-case 2: /* fall through */
-case 3: return  "subs\\t%Q0, %Q1, %Q2\;sbc\\t%R0, %R1, %R2";
+case 0: /* fall through */
+case 1: return "vsub.i64\t%P0, %P1, %P2";
+case 2: /* fall through */ 
+case 3: /* fall through */
+case 4: return  "subs\\t%Q0, %Q1, %Q2\;sbc\\t%R0, %R1, %R2";
 default: gcc_unreachable ();
 }
 }
-  [(set_attr "neon_type" "neon_int_2,*,*,*")
-   (set_attr "conds" "*,clob,clob,clob")
-   (set_attr "length" "*,8,8,8")]
+  [(set_attr "neon_type" "neon_int_2,neon_int_2,*,*,*")
+   (set_attr "co

Re: ivopts improvement

2011-03-13 Thread Tom de Vries
On 03/04/2011 11:37 PM, Zdenek Dvorak wrote:
> Hi,
> 
>>/* Whether the loop body includes any function calls.  */
>>bool body_includes_call;
>> +
>> +  /* Whether the loop body includes any function calls that possibly have side
>> + effects.  */
>> +  bool body_includes_side_effect_call;
>>  };
>>  
>>  /* An assignment of iv candidates to uses.  */
>> @@ -456,6 +460,20 @@
>>return exit;
>>  }
>>  
>> +/* Returns true if single_exit (DATA->current_loop) is the only possible exit.
>> +   Uses the same logic as loop_only_exit_p.  */
> 
> why are you duplicating the functionality, instead of simply caching the
> result of loop_only_exit_p?
> 

I was trying to avoid iterating over the loop body twice, once for
body_includes_call and once for body_includes_side_effect_call (or
loop_only_exit_p). But indeed, duplicating functionality is not ideal
either. I additionally tried a version which both does not duplicate
functionality, and is runtime efficient, but the implementation became
very convoluted, so I settled for your suggestion of caching the result
of loop_only_exit_p.

>> +/* Tries to detect
>> + NIT == (use_iv_max < USE->iv->base)
>> +? 0
>> +: (use_iv_max - USE->iv->base)
>> +   where
>> + use_iv_real_base == (USE->iv->base - USE->iv->step)
>> + && CAND->iv->base == base_ptr + use_iv_real_base
>> +   and returns the exclusive upper bound for CAND->var_after:
>> + base_ptr + use_iv_max.  */
>> +
>> +static tree
>> +get_lt_bound (struct iv_use *use, struct iv_cand *cand, tree nit)
>> +{
> ...
>> +  /* use_iv_real_base == use->iv->base - use->iv->step.  */
>> +  use_iv_real_base = fold_build_plus (MINUS_EXPR, use->iv->base, use->iv->step);
>> +
>> +  /* cand_iv_base.  */
>> +
>> +  /* cand->iv->base == base_ptr + use_iv_real_base.  */
> ...
>> +  /* 0.  */
> ...
> 
> This function seriously needs better comments.  All that are currently
> present just give relations between variables that can be as easily seen
> from the code (but do not at all explain what the variables are supposed
> to mean), 

I see.

> or make no sense
> (what does the 0. comment mean?)

I was trying to repeat parts of the function header comment bit by bit,
but got too terse in the process.

> Otherwise the patch looks ok (but I would still like to see get_lt_bound with
> proper comments, currently I don't really understand what happens there),

changes compared to last submission:
iterator.6.3-ml.patch:
- split up fold_build_plus into fold_build_plus and robust_plus, in
  order to reuse robust_plus in fold_plus in iterator.6.6-ml.patch.
iterator.6.4-ml.patch:
- just cache result of loop_only_exit_p.
- make loop_only_exit_p robust against exit == NULL.
iterator.6.5-ml.patch:
- new patch. keep ssa_name field valid.
iterator.6.6-ml.patch:
- improved comments.
- factored out folding functionality into fold_plus and
  fold_walk_def_plus.
- detect use loop bound based on use->stmt rather than on the nit
  COND_EXPR.
- improved code to handle 'int' iterator.
- improved code to handle '<=' case.
- improved code to handle negative step.
- improved code to handle iv increments after loop exit.
iterator.6.6-ml.test.patch:
- duplicated test for int iterator.

reg-tested on x86_64.

I hope the comments are better now.

Thanks,
- Tom
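
As a rough source-level picture of what get_lt_bound is aiming at, going by its header comment: when the use is an integer iterator and the candidate is a pointer starting at base_ptr, the exit test can be rewritten as a '<' comparison against the exclusive upper bound base_ptr + use_iv_max.  The sketch below is illustrative only; the function names are invented, and the real transformation happens on GIMPLE inside ivopts, not on C source.

```c
#include <assert.h>
#include <stddef.h>

/* Original form: integer iterator I, exit test I < N.  */
static int
sum_by_index (const int *base_ptr, size_t n)
{
  int sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += base_ptr[i];
  return sum;
}

/* Form with a pointer candidate: the integer iterator is gone and the
   exit test compares the pointer with the exclusive upper bound
   BASE_PTR + N.  */
static int
sum_by_pointer (const int *base_ptr, size_t n)
{
  int sum = 0;
  const int *bound = base_ptr + n;
  for (const int *p = base_ptr; p < bound; p++)
    sum += *p;
  return sum;
}

/* Both forms visit the same elements, including for N == 0.  */
static void
check_same (const int *a, size_t n)
{
  assert (sum_by_index (a, n) == sum_by_pointer (a, n));
}
```

The N == 0 case corresponds to the "? 0" arm of the NIT expression in the header comment: the bound equals the start pointer and the loop body never runs.
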
diff -u gcc/tree-ssa-loop-ivopts.c gcc/tree-ssa-loop-ivopts.c
--- gcc/tree-ssa-loop-ivopts.c	(working copy)
+++ gcc/tree-ssa-loop-ivopts.c	(working copy)
@@ -340,6 +340,44 @@
 
 static VEC(tree,heap) *decl_rtl_to_reset;
 
+/* Detects whether A is of POINTER_TYPE, and modifies CODE and B to make
+   A CODE B type-safe.  */
+
+static inline void
+robust_plus (enum tree_code *code, tree a, tree *b)
+{
+  tree a_type = TREE_TYPE (a);
+  tree b_type = TREE_TYPE (*b);
+
+  if (POINTER_TYPE_P (a_type))
+{
+  switch (*code)
+{
+case MINUS_EXPR:
+  *b = fold_build1 (NEGATE_EXPR, b_type, *b);
+
+  /* Fall-through.  */
+case PLUS_EXPR:
+  *code = POINTER_PLUS_EXPR;
+  break;
+default:
+  gcc_unreachable ();
+}
+}
+  else
+*b = fold_convert (a_type, *b);
+}
+
+/* Returns (TREE_TYPE (A))(A CODE B), where CODE is either PLUS_EXPR or
+   MINUS_EXPR.  Handles the case that A is a pointer robustly.  */
+
+static inline tree
+fold_build_plus (enum tree_code code, tree a, tree b)
+{
+  robust_plus (&code, a, &b);
+  return fold_build2 (code, TREE_TYPE (a), a, b);
+}
+
 /* Number of uses recorded in DATA.  */
 
 static inline unsigned
@@ -2255,18 +2293,7 @@
   if ((HAVE_PRE_INCREMENT && GET_MODE_SIZE (mem_mode) == cstepi)
   || (HAVE_PRE_DECREMENT && GET_MODE_SIZE (mem_mode) == -cstepi))
 {
-  enum tree_code code = MINUS_EXPR;
-  tree new_base;
-  tree new_step = step;
-
-  if (POINTER_TYPE_P (TREE_TYPE (base)))
-	{
-	  new_step = fold_build1 (NEGATE_EXPR, TREE_TYPE (step), step);
-	  code = POINTER_PLUS_EXPR;
-	}
-  else
-	new_step = fold_

[4.7] Avoid global state in sparc_handle_option

2011-03-13 Thread Joseph S. Myers
This patch, for 4.7 and relative to a tree with
 applied,
stops the SPARC handle_option hook from using global state.
Everything the hook does can be replaced by use of .opt features, so
the patch removes the hook.

The hook did two things: setting fpu_option_set for certain options,
and setting elements of sparc_select.  The options for which
fpu_option_set was set are exactly those using MASK_FPU, meaning that
the use of this variable was equivalent to checking the relevant bit
of target_flags_explicit (global_options_set.x_target_flags).  The
sparc_select mechanism was overly complicated and mixed constant state
with variable state depending on options; this is replaced by using
the .opt Enum machinery to set two gcc_options fields directly to enum
processor_type values, simple logic to default those values (-mcpu
defaults to a default configure-time value, -mtune defaults to the
-mcpu value) and using those values as indices in a table showing the
effects on target_flags.
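
The defaulting described above can be sketched in a few lines of C.  This is a simplified model rather than the code in the patch: the shortened enum, the "unset" marker, the flag values and all names ending in _model are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical, shortened stand-in for enum processor_type, plus an
   assumed marker for "option not given".  */
enum cpu_model { CPU_UNSET = -1, CPU_V7, CPU_V8, CPU_V9, CPU_COUNT };

/* Hypothetical flag bits.  */
#define FLAG_V8_MODEL  0x1
#define FLAG_V9_MODEL  0x2
#define MASK_FPU_MODEL 0x4

/* The two option fields that -mcpu= and -mtune= would set directly.  */
struct opts_model { enum cpu_model cpu; enum cpu_model tune; };

/* Table indexed by processor type giving the effect on target flags.  */
static const int cpu_flags_model[CPU_COUNT] = { 0, FLAG_V8_MODEL, FLAG_V9_MODEL };

/* -mcpu falls back to the configure-time default, -mtune falls back to
   the resolved -mcpu value; the result is looked up in the table.  */
static int
resolve_model (struct opts_model *o, enum cpu_model configured_default)
{
  if (o->cpu == CPU_UNSET)
    o->cpu = configured_default;
  if (o->tune == CPU_UNSET)
    o->tune = o->cpu;
  return cpu_flags_model[o->cpu];
}

/* Replacement for a separate fpu_option_set variable: test the FPU bit
   in the mask of explicitly set target flags.  */
static int
fpu_set_explicitly_model (int target_flags_explicit)
{
  return (target_flags_explicit & MASK_FPU_MODEL) != 0;
}

static void
check_defaults_model (void)
{
  struct opts_model o = { CPU_UNSET, CPU_UNSET };
  assert (resolve_model (&o, CPU_V8) == FLAG_V8_MODEL);
  assert (o.tune == CPU_V8);
}
```

Keeping the processor values as plain enum fields means the option handling itself needs no hook; only the override step has to apply the defaults.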

Tested building cc1 and xgcc for cross to sparc-elf.  Will commit to
trunk for 4.7 in the absence of target maintainer objections.

2011-03-13  Joseph Myers  

* config/sparc/sparc-opts.h: New.
* config/sparc/sparc.c (sparc_handle_option, sparc_select,
sparc_cpu, fpu_option_set, TARGET_HANDLE_OPTION): Remove.
(sparc_option_override): Store processor_type enumeration rather
than string in cpu_default.  Remove name and enumeration from
cpu_table.  Directly default -mcpu then default -mtune from -mcpu
without using sparc_select.  Use target_flags_explicit instead of
fpu_option_set.
* config/sparc/sparc.h (enum processor_type): Move to
sparc-opts.h.
(sparc_cpu, struct sparc_cpu_select, sparc_select): Remove.
* config/sparc/sparc.opt (config/sparc/sparc-opts.h): New
HeaderInclude entry.
(mcpu=, mtune=): Use Var and Enum.
(sparc_processor_type): New Enum and EnumValue entries.

diff -rupN --exclude=.svn gcc-mainline-1/gcc/config/sparc/sparc-opts.h gcc-mainline/gcc/config/sparc/sparc-opts.h
--- gcc-mainline-1/gcc/config/sparc/sparc-opts.h	1969-12-31 16:00:00.0 -0800
+++ gcc-mainline/gcc/config/sparc/sparc-opts.h	2011-03-11 17:38:04.0 -0800
@@ -0,0 +1,48 @@
+/* Definitions for option handling for SPARC.
+   Copyright (C) 1987, 1988, 1989, 1992, 1994, 1995, 1996, 1997, 1998, 1999
+   2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011
+   Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#ifndef SPARC_OPTS_H
+#define SPARC_OPTS_H
+
+/* Processor type.
+   These must match the values for the cpu attribute in sparc.md and
+   the table in sparc_option_override.  */
+enum processor_type {
+  PROCESSOR_V7,
+  PROCESSOR_CYPRESS,
+  PROCESSOR_V8,
+  PROCESSOR_SUPERSPARC,
+  PROCESSOR_HYPERSPARC,
+  PROCESSOR_LEON,
+  PROCESSOR_SPARCLITE,
+  PROCESSOR_F930,
+  PROCESSOR_F934,
+  PROCESSOR_SPARCLITE86X,
+  PROCESSOR_SPARCLET,
+  PROCESSOR_TSC701,
+  PROCESSOR_V9,
+  PROCESSOR_ULTRASPARC,
+  PROCESSOR_ULTRASPARC3,
+  PROCESSOR_NIAGARA,
+  PROCESSOR_NIAGARA2
+};
+
+#endif
diff -rupN --exclude=.svn gcc-mainline-1/gcc/config/sparc/sparc.c gcc-mainline/gcc/config/sparc/sparc.c
--- gcc-mainline-1/gcc/config/sparc/sparc.c	2011-03-11 09:06:55.0 -0800
+++ gcc-mainline/gcc/config/sparc/sparc.c	2011-03-11 17:42:20.0 -0800
@@ -367,8 +367,6 @@ static HOST_WIDE_INT frame_base_offset;
 /* 1 if the next opcode is to be specially indented.  */
 int sparc_indent_opcode = 0;
 
-static bool sparc_handle_option (struct gcc_options *, struct gcc_options *,
-const struct cl_decoded_option *, location_t);
 static void sparc_option_override (void);
 static void sparc_init_modes (void);
 static void scan_record_type (const_tree, int *, int *, int *);
@@ -485,21 +483,6 @@ enum cmodel sparc_cmodel;
 
 char sparc_hard_reg_printed[8];
 
-struct sparc_cpu_select sparc_select[] =
-{
-  /* switchname,   tunearch */
-  { (char *)0, "default",  1,  1 },
-  { (char *)0, "-mcpu=",   1,  1 },
-  { (char *)0, "-mtune=",  1,  0 },
-  { 0, 0, 0, 0 }
-};
-
-/* CPU type.  This is set from TARGET_CPU_DEFAULT and -m{cpu,tune}=xxx.  */
-enum processor_type sparc_cpu

Re: [patch] ping1 unbreak bootstrap on FreeBSD ppc

2011-03-13 Thread Gerald Pfeifer

On Sat, 12 Mar 2011, Andreas Tobler wrote:

> I'd like to commit the below patch to gcc trunk and gcc-4.5.
>
> I have an ok from DJE, but I still await a comment from Loren.
>
> This is now pending for more than a month. And I'd like to push this out.


I know Loren has been busy on the private side of things, and since
this is (a) fine with David as ppc maintainer, (b) strictly confined
to FreeBSD/ppc, and (c) simply broken right now I suggest to go ahead
right away for this to make GCC 4.6.0.

Then I'll update the FreeBSD port and you can point interested parties
there and after two weeks, say, and some positive feedback, backport
this to the GCC 4.5 branch.

Gerald


[PATCH] 'Fix' PR48086 by disabling LTO on darwin

2011-03-13 Thread Jack Howarth
   Unfortunately, Apple's assembler programmers overzealously fixed
radar://7920267 by ignoring their own mach-o specifications and blindly
forcing 255 sections regardless of the presence of symbols. This causes
major breakage in the LTO testsuite...

http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg01124.html

and cripples the use of LTO on significantly large projects as well
as breaking the previously functional lto-bootstrap. Unfortunately
Apple's distribution method for Xcode hamstrings us as well. The
only available dmg's are for Xcode 3.2.2 and 3.2.6. The older Xcode
3.2.2 release is missing critical fixes that FSF gcc 4.6.0 requires
and the only version available via Software Update is now 3.2.6. This
means that new installations would be trapped on the buggy Xcode
3.2.2 release. Also users lucky enough to currently be on Xcode 
3.2.5 could easily accidentally upgrade to Xcode 3.2.6 in a 
Software Update session and be unable to get back to Xcode 3.2.5.
Considering the fragility this introduces in having functional
LTO support on darwin, LTO should be disabled by default again
until we can work around this breakage in the Apple assembler 
for gcc 4.7 and 4.6.1. Bootstrap tested on x86_64-apple-darwin10.

howarth% ./dist/bin/gcc -flto himenoBMTxpa.c
cc1: error: LTO support has not been enabled in this configuration

 Okay for gcc trunk?
Jack

2011-03-13  Jack Howarth  

PR lto/48086
* configure.ac: Disable LTO on darwin due to assembler bug
in Xcode 3.2.6/4.0.
* configure: Regenerate.

Index: configure.ac
===
--- configure.ac	(revision 170924)
+++ configure.ac	(working copy)
@@ -1743,7 +1743,7 @@ ACX_ELF_TARGET_IFELSE([# ELF platforms b
   build_lto_plugin=yes
 ],[if test x"$default_enable_lto" = x"yes" ; then
 case $target in
-  *-apple-darwin* | *-cygwin* | *-mingw*) ;;
+  *-cygwin* | *-mingw*) ;;
   # On other non-ELF platforms, LTO has yet to be validated.
   *) enable_lto=no ;;
 esac


Re: [PATCH] 'Fix' PR48086 by disabling LTO on darwin

2011-03-13 Thread Mike Stump
On Mar 13, 2011, at 5:57 PM, Jack Howarth wrote:
> Okay for gcc trunk?

Ok, applied.  I updated the wording of the checkin slightly, hope you don't 
mind.  Thanks.  Hate to find out late in the game, but, at least we figured it 
out before release.

2011-03-13  Jack Howarth  

PR lto/48086
* configure.ac: Disable LTO on darwin due to an assembler change in
Xcode 3.2.6/4.0 that limits the total number of sections/segments to
under 256.
* configure: Regenerate.

Index: configure
===
--- configure   (revision 170745)
+++ configure   (working copy)
@@ -6206,7 +6206,7 @@
 else
   if test x"$default_enable_lto" = x"yes" ; then
 case $target in
-  *-apple-darwin* | *-cygwin* | *-mingw*) ;;
+  *-cygwin* | *-mingw*) ;;
   # On other non-ELF platforms, LTO has yet to be validated.
   *) enable_lto=no ;;
 esac
Index: configure.ac
===
--- configure.ac	(revision 170745)
+++ configure.ac	(working copy)
@@ -1743,7 +1743,7 @@
   build_lto_plugin=yes
 ],[if test x"$default_enable_lto" = x"yes" ; then
 case $target in
-  *-apple-darwin* | *-cygwin* | *-mingw*) ;;
+  *-cygwin* | *-mingw*) ;;
   # On other non-ELF platforms, LTO has yet to be validated.
   *) enable_lto=no ;;
 esac



Re: [PATCH] 'Fix' PR48086 by disabling LTO on darwin

2011-03-13 Thread Jack Howarth
On Sun, Mar 13, 2011 at 07:53:23PM -0700, Mike Stump wrote:
> On Mar 13, 2011, at 5:57 PM, Jack Howarth wrote:
> > Okay for gcc trunk?
> 
> Ok, applied.  I updated the wording of the checkin slightly, hope you don't 
> mind.  Thanks.  Hate to find out late in the game, but, at least we figured 
> it out before release.
> 

Mike,
   I opened PR48108 for the development of containerized LTO for darwin.
I also asked Chris and Nick to mention to the darwin assembler developer that
some notification via the original radar of his proposed fix might have
prevented this snafu, or at least given us several months' notice to fix it
before the FSF gcc 4.6.0 release was upon us.
   Jack
ps Life would be so much easier if Apple would stop being so secretive with the
Xcode Previews and provide them to the general ADC accounts like in the old
days.

> 2011-03-13  Jack Howarth  
> 
>   PR lto/48086
>   * configure.ac: Disable LTO on darwin due to an assembler change in
>   Xcode 3.2.6/4.0 that limits the total number of sections/segments to
>   under 256.
>   * configure: Regenerate.
> 
> Index: configure
> ===
> --- configure (revision 170745)
> +++ configure (working copy)
> @@ -6206,7 +6206,7 @@
>  else
>if test x"$default_enable_lto" = x"yes" ; then
>  case $target in
> -  *-apple-darwin* | *-cygwin* | *-mingw*) ;;
> +  *-cygwin* | *-mingw*) ;;
># On other non-ELF platforms, LTO has yet to be validated.
>*) enable_lto=no ;;
>  esac
> Index: configure.ac
> ===
> --- configure.ac  (revision 170745)
> +++ configure.ac  (working copy)
> @@ -1743,7 +1743,7 @@
>build_lto_plugin=yes
>  ],[if test x"$default_enable_lto" = x"yes" ; then
>  case $target in
> -  *-apple-darwin* | *-cygwin* | *-mingw*) ;;
> +  *-cygwin* | *-mingw*) ;;
># On other non-ELF platforms, LTO has yet to be validated.
>*) enable_lto=no ;;
>  esac

> 



Re: [patch] ping1 unbreak bootstrap on FreeBSD ppc

2011-03-13 Thread Andreas Tobler

On 13.03.11 22:37, Gerald Pfeifer wrote:
> On Sat, 12 Mar 2011, Andreas Tobler wrote:
>> I'd like to commit the below patch to gcc trunk and gcc-4.5.
>>
>> I have an ok from DJE, but I still await a comment from Loren.
>>
>> This is now pending for more than a month. And I'd like to push this out.
>
> I know Loren has been busy on the private side of things, and since
> this is (a) fine with David as ppc maintainer, (b) strictly confined
> to FreeBSD/ppc, and (c) simply broken right now I suggest to go ahead
> right away for this to make GCC 4.6.0.
>
> Then I'll update the FreeBSD port and you can point interested parties
> there and after two weeks, say, and some positive feedback, backport
> this to the GCC 4.5 branch.


Thank you.

Committed as:

- gcc: 170930
- libgcc: 170931

Andreas


RE: [Patch][AVR]: Support tail calls

2011-03-13 Thread Boyapati, Anitha

Hi Georg,

>This is a patch to test/review/comment on. It adds tail call
>optimization to avr backend.
>
>The implementation uses struct machine_function to pass information
>around, i.e. from avr_function_arg_advance to avr_function_ok_for_sibcall.
>
>Tail call support is more general than avr-ld's replacement of
>call/ret sequences with --relax which are sometimes wrong, see
>http://sourceware.org/PR12494
>
>gcc can, e.g. tail-call bar1 in
>
>void bar0 (void);
>int bar1 (int);
>
>int foo (int x)
>{
>  bar0();
>  return bar1 (x);
>}

To be on the same page, can you explain how gcc optimizes the above case? As I
understand it, in a tail-call optimization, bar1 can return directly to the
caller of foo(). There can be different ways of handling this. How is it
handled in gcc once foo() has been recognized as a candidate for a tail call?

Also, I have applied the patch, and used it for a small test case as below:

int bar1(int x) {
  x++;
  return x;
}

int foo (int x)
{
  return bar1 (x);
}

int main() {
  volatile int i;
  return foo(i);
}

avr-gcc -S -foptimize-sibling-calls tail-call.c


I find no difference in the code generated with and without tail call
optimization. (I am assuming -foptimize-sibling-calls should turn this on.)
Let me know if I am doing something wrong.

Anitha