date:20120404

Ping [IA-64] Implement static stack checking

2012-04-04 Thread Tristan Gingold

Hi,

I'd like to ping this patch, as it hasn't been reviewed for 4 weeks.

Tristan.

On Mar 6, 2012, at 11:08 PM, Eric Botcazou wrote:

> This at last implements static stack checking for the IA-64, i.e. stack
> checking of the static part of the frame, and makes it possible to pass the
> entire ACATS testsuite.  The peculiarity is the second stack in memory,
> namely the Backing Store of the Register Stack Engine, that needs to be
> dealt with.
>
> This also introduces full support for "unknown" insns in the bundling code
> (the only other "unknown" insn, namely set_bsp, didn't need that because it
> comes always last in a function).
>
> Bootstrapped/regtested on IA-64/Linux (and also tested on IA-64/HP-UX and
> VMS), OK for the mainline?
> 
> 
> 2012-03-06  Eric Botcazou  
>Tristan Gingold  
> 
>   * doc/md.texi (Standard Names): Document probe_stack_address.
>   * explow.c (emit_stack_probe): Handle probe_stack_address.
>   * config/ia64/ia64.md (UNSPECV_PROBE_STACK_ADDRESS): New constant.
>   (UNSPECV_PROBE_STACK_RANGE): Likewise.
>   (probe_stack_address): New insn.
>   (probe_stack_range): Likewise.
>   * config/ia64/ia64.c: Include common/common-target.h.
>   (ia64_compute_frame_size): Mark r2 and r3 as used if static stack
>   checking is enabled.
>   (ia64_emit_probe_stack_range): New function.
>   (output_probe_stack_range): Likewise.
>   (ia64_expand_prologue): Invoke ia64_emit_probe_stack_range if static
>   builtin stack checking is enabled.
>   (rtx_needs_barrier) : Handle UNSPECV_PROBE_STACK_RANGE
>   and UNSPECV_PROBE_STACK_ADDRESS.
>   (unknown_for_bundling_p): New predicate.
>   (group_barrier_needed): Use important_for_bundling_p.
>   (ia64_dfa_new_cycle): Use unknown_for_bundling_p.
>   (issue_nops_and_insn): Likewise.
>   (bundling): Likewise.
>   (final_emit_insn_group_barriers): Likewise.
>   * config/ia64/ia64-protos.h (output_probe_stack_range): Declare.
>   * config/ia64/hpux.h (STACK_CHECK_STATIC_BUILTIN): Define.
>   (STACK_CHECK_PROTECT): Likewise.
>   * config/ia64/linux.h (STACK_CHECK_STATIC_BUILTIN): Likewise.
> 
> 
> -- 
> Eric Botcazou
>

Ping [IA-64] Work around thinko in 'x' constraint implementation

2012-04-04 Thread Tristan Gingold


I'd like to ping this patch as it fixed an ICE visible on both ia64 linux and 
ia64 openvms.

Tristan.

On Mar 6, 2012, at 11:07 PM, Eric Botcazou wrote:

> We have a regression on one of the testcases of our internal testsuite on
> IA-64 with a 4.7-based compiler, which is of the form:
> 
> test_vec_madd.adb: In function 'Test_Vec_Madd':
> test_vec_madd.adb:160:5: error: could not split insn
> (insn 887 4859 889 16 (set (reg:TI 158 f30 [orig:417 m ] [417])
>(mem/c:TI (reg/f:DI 14 r14 [1025]) [0 S16 A128])) /gnu/lib/gcc/ia64-hp-
> openvms/4_7_0/adainclude/g-altcon.adb:277 125 {movti_internal}
> (nil))
> +===GNAT BUG DETECTED==+
> | Pro 7.1.0w (20120221-head) (ia64-hp-openvms) GCC error:  |
> | in final_scan_insn, at final.c:2716  |
> | Error detected around test_vec_madd.adb:160:5|
> 
> The compiler aborts during the final pass because it couldn't split the insn.
> The pattern for movti_internal is:
> 
> (define_insn_and_split "movti_internal"
>  [(set (match_operand:TI 0 "destination_operand" "=r,   *fm,*x,*f,  Q")
>   (match_operand:TI 1 "general_operand" "r*fim,r,  Q, *fOQ,*f"))]
>  "ia64_move_ok (operands[0], operands[1])"
>  "@
>   #
>   #
>   ldfp8 %X0 = %1%P1
>   #
>   #"
>  "reload_completed && !ia64_load_pair_ok(operands[0], operands[1])"
>  [(const_int 0)]
> 
> The problem is that the operands satisfy ia64_load_pair_ok so the splitter
> cannot be invoked on them.  The root cause is a discrepancy between this
> predicate and how the 'x' constraint is interpreted.  The predicate uses
> FP_REGNO_P to check the destination and this returns true for %f30 (but would
> return false for the immediately following register %f31).  But recog 
> interprets the 'x' constraint as meaning that every hard register in the 
> destination must be in the FP_REGS class; now the mode is TImode so both %f30 
> and %f31 are taken into account and %f31 isn't in the FP_REGS class, so the 
> operand is rejected.
> 
> AFAICS the problem dates back to the introduction of the code (r102463), so
> I'm not sure that we want to rewrite it at this point.  That's why the
> attached patch is a simple workaround that just avoids ICEing.
>
> Bootstrapped/regtested on IA-64/Linux, OK for the mainline?  Do we also want
> it for 4.7.1 (I assume that some RA change makes the issue visible in 4.7.x)?
> 
> 
> 2012-03-06  Eric Botcazou  
> 
>   * config/ia64/ia64.c (ia64_load_pair_ok): Return 0 if the second member
>   of the destination isn't also a FP_REGS register.
> 
> 
> -- 
> Eric Botcazou
>
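
The mismatch described in the message above can be modelled in a few lines of
C.  This is only a sketch under stated assumptions, not the actual ia64.c
code: the register numbering (%f30 = 158) is read off the insn dump, the
helper names model_fp_regno_p, load_pair_ok_before and load_pair_ok_after are
hypothetical, and the only facts taken from the thread are that the predicate
accepts %f30 but rejects %f31 and that a TImode value occupies two consecutive
hard registers.

```c
#include <stdbool.h>

/* Hypothetical model: FP register %fN has hard register number 128 + N,
   as suggested by "reg:TI 158 f30" in the dump above.  */
#define MODEL_FR_REG(n) (128 + (n))

/* Stand-in for FP_REGNO_P.  Per the description above it accepts %f30 and
   rejects %f31; the rest of the accepted range is an assumption made only
   for the sake of this example.  */
static bool
model_fp_regno_p (int regno)
{
  return (regno >= MODEL_FR_REG (2)
          && regno <= MODEL_FR_REG (127)
          && regno != MODEL_FR_REG (31));
}

/* Before the workaround: only the first register of the pair is checked.  */
static bool
load_pair_ok_before (int dest_regno)
{
  return model_fp_regno_p (dest_regno);
}

/* After the workaround: the second member of the pair must qualify too,
   which matches how recog interprets 'x' for a two-register operand.  */
static bool
load_pair_ok_after (int dest_regno)
{
  return (model_fp_regno_p (dest_regno)
          && model_fp_regno_p (dest_regno + 1));
}
```

In this model %f30 passes the old check but fails the new one, so the split
condition (which negates the predicate) becomes true and the insn can be
split instead of reaching final unsplit.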

[4.5/Ada] Fix build with 4.6 compiler

2012-04-04 Thread Eric Botcazou

Arno, do you have objections to me applying the attached patch to the 4.5 
branch?  It makes it possible to build (and bootstrap) the Ada compiler on the 
4.5 branch (oldest supported branch) with the 4.6 compiler, which is now the 
system compiler in recent Linux distributions.

The patch backports minor fixes from the 4.6 branch.  The only potentially 
controversial thing is the preprocessor trick in init.c, which arranges for 
the new symbol __gl_main_cpu not to be added to the runtime.

Bootstrapped/regtested with 4.3 and 4.6 compilers on x86 and x86-64/Linux.


2012-04-04  Eric Botcazou  

Backport from 4.6 branch
* init.c (__gl_main_cpu): New global variable.
* par-ch3.adb: Remove a couple of blank lines.
* types.ads (Big_String_Ptr): Don't give it zero storage size.
(Source_Buffer_Ptr): Likewise.
* uintp.adb (Hash_Num): Use "mod" operator from Types.


-- 
Eric Botcazou
Index: init.c
===
--- init.c	(revision 186078)
+++ init.c	(working copy)
@@ -86,6 +86,9 @@
 
 /* Global values computed by the binder.  */
 int   __gl_main_priority = -1;
+#if (__GNUC__ * 10 + __GNUC_MINOR__ > 45)
+int   __gl_main_cpu  = -1;
+#endif
 int   __gl_time_slice_val= -1;
 char  __gl_wc_encoding   = 'n';
 char  __gl_locking_policy= ' ';
Index: types.ads
===
--- types.ads	(revision 186078)
+++ types.ads	(working copy)
@@ -125,8 +125,9 @@
 
subtype Big_String is String (Positive);
type Big_String_Ptr is access all Big_String;
-   for Big_String_Ptr'Storage_Size use 0;
-   --  Virtual type for handling imported big strings
+   --  Virtual type for handling imported big strings. Note that we should
+   --  never have any allocators for this type, but we don't give a storage
+   --  size of zero, since there are legitimate deallocations going on.
 
function To_Big_String_Ptr is
  new Unchecked_Conversion (System.Address, Big_String_Ptr);
@@ -200,13 +201,14 @@
--  Source_Buffer_Ptr, see Osint.Read_Source_File for details.
 
type Source_Buffer_Ptr is access all Big_Source_Buffer;
-   for Source_Buffer_Ptr'Storage_Size use 0;
--  Pointer to source buffer. We use virtual origin addressing for source
--  buffers, with thin pointers. The pointer points to a virtual instance
--  of type Big_Source_Buffer, where the actual type is in fact of type
--  Source_Buffer. The address is adjusted so that the virtual origin
--  addressing works correctly. See Osint.Read_Source_Buffer for further
-   --  details.
+   --  details. Again, as for Big_String_Ptr, we should never allocate using
+   --  this type, but we don't give a storage size clause of zero, since we
+   --  may end up doing deallocations of instances allocated manually.
 
subtype Source_Ptr is Text_Ptr;
--  Type used to represent a source location, which is a subscript of a
Index: par-ch3.adb
===
--- par-ch3.adb	(revision 186078)
+++ par-ch3.adb	(working copy)
@@ -111,7 +111,6 @@
--  current token, and if this is the first such message issued, saves
--  the message id in Missing_Begin_Msg, for possible later replacement.
 
-
-
-- Check_Restricted_Expression --
-
@@ -2107,7 +2106,6 @@
   Range_Node : Node_Id;
   Save_Loc   : Source_Ptr;
 
-
--  Start of processing for P_Range_Or_Subtype_Mark
 
begin
Index: uintp.adb
===
--- uintp.adb	(revision 186078)
+++ uintp.adb	(working copy)
@@ -239,7 +239,7 @@
 
function Hash_Num (F : Int) return Hnum is
begin
-  return Standard."mod" (F, Hnum'Range_Length);
+  return Types."mod" (F, Hnum'Range_Length);
end Hash_Num;
 
---

Re: [4.5/Ada] Fix build with 4.6 compiler

2012-04-04 Thread Jakub Jelinek

On Wed, Apr 04, 2012 at 09:36:52AM +0200, Eric Botcazou wrote:
> --- init.c(revision 186078)
> +++ init.c(working copy)
> @@ -86,6 +86,9 @@
>  
>  /* Global values computed by the binder.  */
>  int   __gl_main_priority = -1;
> +#if (__GNUC__ * 10 + __GNUC_MINOR__ > 45)

Shouldn't this be * 100 and > 405 ?  I mean, we already had GCC
2.95, 2.96, 2.97 and 20 + 95 is > 45...

Jakub

Re: [4.5/Ada] Fix build with 4.6 compiler

2012-04-04 Thread Arnaud Charlet

> Arno, do you have objections to me applying the attached patch to the 4.5
> branch?  It makes it possible to build (and bootstrap) the Ada compiler on the
> 4.5 branch (oldest supported branch) with the 4.6 compiler, which is now the
> system compiler in recent Linux distributions.

Well, we don't guarantee such compatibility in general,
so I'd like to make it clear that people shouldn't expect this combination
to work, and if more complex patches are submitted, we'll likely NOT
integrate them.

The init.c change is clearly already on the edge of kludgy changes and is
NOT ok as is, it requires a clear comment explaining why it's there.

OK with the comment changes.

Arno

Re: [Patch, Fortran]: Fix libgfortran.h error for VMS

2012-04-04 Thread Tristan Gingold

On Apr 3, 2012, at 5:53 PM, Tobias Burnus wrote:

> 
> On 04/03/2012 02:42 PM, Tristan Gingold wrote:
>> The simplest path is simply to reverse the include order in libgfortran.h.  
>> I know that this is somewhat VMS specific, and I welcome better ideas.
> 
> Well, changing the order is not so bad that one has to try hard to find a
> better solution. (Unless it fails on other systems with the new order.)

I cross my fingers!

>> Tested by building gfortran for x86_64-darwin and ia64-hp-openvms.
> 
> OK. Thanks for the patch.

You're welcome.  This was my latest patch to make gfortran build for VMS.  
AFAIK, this is now the only post F95 fortran compiler available for VMS, 
although I suppose it lacks some DEC extensions specific to VMS.

Tristan.

Re: [PATCH, RTL] Fix PR 51106

2012-04-04 Thread Andrey Belevantsev


On 03.04.2012 13:36, Jakub Jelinek wrote:
> On Mon, Apr 02, 2012 at 06:56:25PM +0400, Andrey Belevantsev wrote:
>> After Richi's RTL generation related cleanups went in, the extra
>> cleanup_cfg call was added so we are no longer lucky to have the
>> proper fallthru edge in this case.  The PR trail has the patch to
>> purge_dead_edges making it consider this situation, but I still
>> prefer the below patch that fixes only the invalid asm case.  The
>> reason is that I think it unlikely that after initial RTL expansion
>> (of which the instantiate virtual regs pass seems to be the part) we
>> will get the problematic situation.  However, I'm happy to test the
>> PR trail patch, too.
>
> I don't like blindly changing edge into FALLTHRU, generally the edge
> could be abnormal, or EH, etc. and making that fallthru is not a good idea.
> I'd prefer if wherever the fallthru edge is removed the other normal edge(s)
> are adjusted.


Well, as I mentioned in 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51106#c18 the removal happens 
in try_optimize_cfg around line 2617, that's the code that deals with 
removing trivially empty block (in the PR case, the succ of the asm goto 
block is trivially empty).  After that we have an asm goto acting as an 
unconditional jump with a single succ edge and no fallthru bit, which seems 
perfectly fine.  We get into trouble when we actually realize that the asm 
is bogus.  Thus I've tried to fix it up at this time.  The options that we 
briefly discussed on IRC with Richi are as follows:


- Fix up try_optimize_cfg in the case of asm gotos, but it is not clear to 
me how do we do this -- we don't yet distinguish between good and bad asm 
goto at this point;


- Fix up function.c as I did but make it more robust.  Either handle more 
cases with strange edges (EH seems also possible after introducing the 
throw attribute for asms), or just remove the asm and insert the 
unconditional jump pointing to the place where the asm was, retaining all 
the flags;


- Fix up purge_dead_edges as I did initially in 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51106#c16.  When we find a 
situation of no jump at the end of block and no fallthru edge, assert this 
happens only with single_succ_p and actually make this edge fallthru. 
(Probably also watch out whether we need to make it fake or whatever.)  Or 
as Richi tried, just accept the situation of having no successors here, as 
in -- no fallthru edge on entry to purge_dead_edges and no jump means no 
successors, period.


I think that just nobody deleted unconditional jumps with 
delete_insn_and_edges previously, otherwise I don't understand why this did 
not trigger.


Thoughts?

Andrey



> Jakub

Re: [4.5/Ada] Fix build with 4.6 compiler

2012-04-04 Thread Eric Botcazou

> Shouldn't this be * 100 and > 405 ?  I mean, we already had GCC
> 2.95, 2.96, 2.97 and 20 + 95 is > 45...

This idiom is the one already used in tracebak.c for example.  Would that 
really matter in practice?

-- 
Eric Botcazou

Re: [4.5/Ada] Fix build with 4.6 compiler

2012-04-04 Thread Eric Botcazou

> Well, we don't guarantee such compatibility in general,
> so I'd like to make it clear that people shouldn't expect this combination
> to work, and if more complex patches are submitted, we'll likely NOT
> integrate them.

That's mainly for GCC developers; without this, it will be a pain to keep 
testing the 4.5.x Ada compiler on modern Linux distributions.

> The init.c change is clearly already on the edge of kludgy changes and is
> NOT ok as is, it requires a clear comment explaining why it's there.

OK, will add a ??? comment, thanks.

-- 
Eric Botcazou

[Patch]: Fix ICE on VMS when using SImode pointers

2012-04-04 Thread Tristan Gingold

Hi,

this patch fixes a build time failure on VMS (while compiling Ada RTS file 
i-cstrin.adb) due to the use of short pointers:

i-cstrin.adb: In function 'Interfaces.C.Strings.To_Chars_Ptr':
i-cstrin.adb:236:8: error: unrecognizable insn:
(insn 80 79 81 13 (set (reg:SI 384)
(const_int 4294967288 [0xfffffff8])) i-cstrin.adb:234 -1
 (nil))
+===GNAT BUG DETECTED==+
| Pro 7.1.0w (20120403-47) (ia64-hp-openvms) GCC error:|
| in extract_insn, at recog.c:2123 |
| Error detected around i-cstrin.adb:236:8 |


Expansion of POINTER_PLUS_EXPR doesn't handle the case of PRECISION(sizetype) > 
PRECISION(type), leading to RTL expressions with different modes.

This patch fixes the build issue, tested on ia64-hp-openvms.
Also tested with our internal testsuite.
I haven't run the GCC testsuite on a regular platform, as the condition will 
never trigger.

Ok for trunk ?

Tristan.

2012-04-04  Tristan Gingold  

* expr.c (expand_expr_real_2): Handle larger sizetype in
POINTER_PLUS_EXPR.

--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -7957,6 +7957,9 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_m
treeop1 = fold_convert_loc (loc, type,
fold_convert_loc (loc, ssizetype,
  treeop1));
+  else if (TYPE_PRECISION (sizetype) > TYPE_PRECISION (type))
+   treeop1 = fold_convert_loc (loc, type, treeop1);
+
 case PLUS_EXPR:
   /* If we are adding a constant, a VAR_DECL that is sp, fp, or ap, and
 something else, make sure we add the register to the constant and
u

Re: [4.5/Ada] Fix build with 4.6 compiler

2012-04-04 Thread Jakub Jelinek

On Wed, Apr 04, 2012 at 10:08:50AM +0200, Eric Botcazou wrote:
> > Shouldn't this be * 100 and > 405 ?  I mean, we already had GCC
> > 2.95, 2.96, 2.97 and 20 + 95 is > 45...
> 
> This idiom is the one already used in tracebak.c for example.  Would that 
> really matter in practice?

It is a bad idiom, given that we already had >= 10 __GNUC_MINOR__ and it
is possible we'll have 4.10 as well.
E.g. __GNUC_PREREQ macro in glibc shifts left major by 16, but even
multiplying by 100 instead of 10 is better.

Jakub
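
The objection above is easy to check with plain arithmetic.  The following is
a minimal sketch, assuming nothing beyond the three encodings mentioned in
the thread; the function names are hypothetical and do not appear in init.c,
tracebak.c or glibc.

```c
/* Hypothetical helpers; none of these names come from the patch.  */

/* The idiom used in the patch: major * 10 + minor.  */
static int
version_times_10 (int major, int minor)
{
  return major * 10 + minor;
}

/* The suggested alternative: major * 100 + minor.  */
static int
version_times_100 (int major, int minor)
{
  return major * 100 + minor;
}

/* A glibc-style encoding: major shifted left by 16 bits.  */
static int
version_shifted (int major, int minor)
{
  return (major << 16) + minor;
}
```

For GCC 2.95 the first encoding yields 115, which compares greater than 45,
and a hypothetical 4.10 collides with 5.0 at 50.  The other two encodings
order these versions correctly as long as the minor number stays below 100
(respectively 65536).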

[PATCH] Fix PRs c/52283/37985

2012-04-04 Thread Christian Bruel


Hello,

Is it OK to push the cleaning of TREE_NO_WARNING to fix the constant 
expressions errors discrepancies, as discussed in bugzilla #52283, now 
that the trunk is open ?


Many thanks,



2012-03-29   Manuel López-Ibáñez  

	PR c/52283/37985
	* stmt.c (warn_if_unused_value): Skip NOP_EXPR.
	* convert.c (convert_to_integer): Don't set TREE_NO_WARNING.

2010-03-29  Christian Bruel  

	PR c/52283
	* gcc.dg/case-const-1.c: Test constant expression.
	* gcc.dg/case-const-2.c: Likewise.
	* gcc.dg/case-const-3.c: Likewise.

2012-03-29   Manuel López-Ibáñez  

	* gcc/testsuite/gcc.dg/pr37985.c: New test.

Index: gcc/testsuite/gcc.dg/pr37985.c
===
--- gcc/testsuite/gcc.dg/pr37985.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr37985.c	(revision 0)
@@ -0,0 +1,8 @@
+/* PR c/37985 */
+/* { dg-do compile } */
+/* { dg-options " -Wall -Wextra " } */
+unsigned char foo(unsigned char a)
+{
+  a >> 2; /* { dg-warning "no effect" } */
+  return a;
+}
Index: gcc/testsuite/gcc.dg/case-const-1.c
===
--- gcc/testsuite/gcc.dg/case-const-1.c	(revision 186082)
+++ gcc/testsuite/gcc.dg/case-const-1.c	(working copy)
@@ -1,9 +1,11 @@
 /* Test for case labels not integer constant expressions but folding
-   to integer constants (used in Linux kernel, PR 39613).  */
+   to integer constants (used in Linux kernel, PR 39613, 52283).  */
 /* { dg-do compile } */
 /* { dg-options "" } */
 
 extern int i;
+extern unsigned int u;
+
 void
 f (int c)
 {
@@ -13,3 +15,13 @@
   ;
 }
 }
+
+void
+b (int c)
+{
+  switch (c)
+{
+case (int) (2  | ((4 < 8) ? 8 : u)):
+  ;
+}
+}
Index: gcc/testsuite/gcc.dg/case-const-2.c
===
--- gcc/testsuite/gcc.dg/case-const-2.c	(revision 186082)
+++ gcc/testsuite/gcc.dg/case-const-2.c	(working copy)
@@ -1,9 +1,11 @@
 /* Test for case labels not integer constant expressions but folding
-   to integer constants (used in Linux kernel, PR 39613).  */
+   to integer constants (used in Linux kernel, PR 39613, 52283).  */
 /* { dg-do compile } */
 /* { dg-options "-pedantic" } */
 
 extern int i;
+extern unsigned int u;
+
 void
 f (int c)
 {
@@ -13,3 +15,14 @@
   ;
 }
 }
+
+void
+b (int c)
+{
+  switch (c)
+{
+case (int) (2  | ((4 < 8) ? 8 : u)): /* { dg-warning "case label is not an integer constant expression" } */
+  ;
+}
+}
+
Index: gcc/testsuite/gcc.dg/case-const-3.c
===
--- gcc/testsuite/gcc.dg/case-const-3.c	(revision 186082)
+++ gcc/testsuite/gcc.dg/case-const-3.c	(working copy)
@@ -1,9 +1,11 @@
 /* Test for case labels not integer constant expressions but folding
-   to integer constants (used in Linux kernel, PR 39613).  */
+   to integer constants (used in Linux kernel, PR 39613, 52283, ).  */
 /* { dg-do compile } */
 /* { dg-options "-pedantic-errors" } */
 
 extern int i;
+extern unsigned int u;
+
 void
 f (int c)
 {
@@ -13,3 +15,16 @@
   ;
 }
 }
+
+void
+b (int c)
+{
+  switch (c)
+{
+case (int) (2  | ((4 < 8) ? 8 : u)): /* { dg-error "case label is not an integer constant expression" } */
+  ;
+}
+}
+
+
+
Index: gcc/stmt.c
===
--- gcc/stmt.c	(revision 186082)
+++ gcc/stmt.c	(working copy)
@@ -1515,6 +1515,7 @@
 
 case SAVE_EXPR:
 case NON_LVALUE_EXPR:
+case NOP_EXPR:
   exp = TREE_OPERAND (exp, 0);
   goto restart;
 
Index: gcc/convert.c
===
--- gcc/convert.c	(revision 186082)
+++ gcc/convert.c	(working copy)
@@ -542,7 +542,6 @@
   else if (outprec >= inprec)
 	{
 	  enum tree_code code;
-	  tree tem;
 
 	  /* If the precision of the EXPR's type is K bits and the
 	 destination mode has more bits, and the sign is changing,
@@ -560,13 +559,7 @@
 	  else
 	code = NOP_EXPR;
 
-	  tem = fold_unary (code, type, expr);
-	  if (tem)
-	return tem;
-
-	  tem = build1 (code, type, expr);
-	  TREE_NO_WARNING (tem) = 1;
-	  return tem;
+	  return fold_build1 (code, type, expr);
 	}
 
   /* If TYPE is an enumeral type or a type with a precision less
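
For readers wondering what value the new case label in the tests above folds
to, here is a small sketch.  The function name label_value is hypothetical;
the expression is copied from the test cases.

```c
/* The case label from the new tests.  It is not an integer constant
   expression in the strict ISO C sense because it mentions the variable u,
   yet 4 < 8 is always true, so it always evaluates to 2 | 8, i.e. 10.  */
static int
label_value (unsigned int u)
{
  return (int) (2 | ((4 < 8) ? 8 : u));
}
```

Whatever u holds, the result is 10, which is why the compiler can fold the
label to a constant while -pedantic still has grounds to complain about it.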

Re: [Patch]: Fix ICE on VMS when using SImode pointers

2012-04-04 Thread Richard Guenther

On Wed, 4 Apr 2012, Tristan Gingold wrote:

> Hi,
> 
> this patch fixes a build time failure on VMS (while compiling Ada RTS file 
> i-cstrin.adb) due to the use of short pointers:
> 
> i-cstrin.adb: In function 'Interfaces.C.Strings.To_Chars_Ptr':
> i-cstrin.adb:236:8: error: unrecognizable insn:
> (insn 80 79 81 13 (set (reg:SI 384)
> (const_int 4294967288 [0xfffffff8])) i-cstrin.adb:234 -1
>  (nil))
> +===GNAT BUG DETECTED==+
> | Pro 7.1.0w (20120403-47) (ia64-hp-openvms) GCC error:|
> | in extract_insn, at recog.c:2123 |
> | Error detected around i-cstrin.adb:236:8 |
> 
> 
> Expansion of POINTER_PLUS_EXPR doesn't handle the case of
> PRECISION(sizetype) > PRECISION(type), leading to RTL expressions with
> different modes.
> 
> This patch fixes the build issue, tested on ia64-hp-openvms.
> Also tested with our internal testsuite.
> I haven't run the GCC testsuite on a regular platform, as the condition will 
> never trigger.
> 
> Ok for trunk ?

Ok if you add a comment why this is needed.

Richard.

> Tristan.
> 
> 2012-04-04  Tristan Gingold  
> 
>   * expr.c (expand_expr_real_2): Handle larger sizetype in
>   POINTER_PLUS_EXPR.
> 
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -7957,6 +7957,9 @@ expand_expr_real_2 (sepops ops, rtx target, enum 
> machine_m
> treeop1 = fold_convert_loc (loc, type,
> fold_convert_loc (loc, ssizetype,
>   treeop1));
> +  else if (TYPE_PRECISION (sizetype) > TYPE_PRECISION (type))
> +   treeop1 = fold_convert_loc (loc, type, treeop1);
> +
>  case PLUS_EXPR:
>/* If we are adding a constant, a VAR_DECL that is sp, fp, or ap, and
>  something else, make sure we add the register to the constant and
> u
> 
> 
> 

-- 
Richard Guenther 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer

Re: [SH] PR50751 - rework displacement calculations pt. 2

2012-04-04 Thread Kaz Kojima

Oleg Endo  wrote:
> The attached patch restructures the move insn displacement calculations
> a bit more.  The idea is to have the displacement addressing decision
> making logic in a few simple functions and then re-use those in other
> places, as opposed to having multiple special cases.
> 
> Tested against rev 185893 with...
> make -k check RUNTESTFLAGS="--target_board=sh-sim
> \{-m2/-ml,-m2/-mb,-m2a-single/-mb,
> -m4-single/-ml,-m4-single/-mb,
> -m4a-single/-ml,-m4a-single/-mb}"
> 
> ...and no new failures.
> 
> I've also checked the CSiBE size results for a bunch of variants.  There
> is a slight overall code size increase, with a maximum of +0.03% on the
> whole set for -m2 -ml/-mb -Os/-O2) due to some changes on how DImode
> index range decisions are made.
> 
> OK?

OK.

Regards,
kaz

Re: [4.5/Ada] Fix build with 4.6 compiler

2012-04-04 Thread Eric Botcazou

> It is a bad idiom, given that we already had >= 10 __GNUC_MINOR__ and it
> is possible we'll have 4.10 as well.
> E.g. __GNUC_PREREQ macro in glibc shifts left major by 16, but even
> multiplying by 100 instead of 10 is better.

OK, we'll change the idiom on mainline.

-- 
Eric Botcazou

Re: [Patch]: Fix ICE on VMS when using SImode pointers

2012-04-04 Thread Tristan Gingold


On Apr 4, 2012, at 10:18 AM, Richard Guenther wrote:

> On Wed, 4 Apr 2012, Tristan Gingold wrote:
> 
>> Hi,
>> 
>> this patch fixes a build time failure on VMS (while compiling Ada RTS file 
>> i-cstrin.adb) due to the use of short pointers:
>> 
>> i-cstrin.adb: In function 'Interfaces.C.Strings.To_Chars_Ptr':
>> i-cstrin.adb:236:8: error: unrecognizable insn:
>> (insn 80 79 81 13 (set (reg:SI 384)
>>(const_int 4294967288 [0xfffffff8])) i-cstrin.adb:234 -1
>> (nil))
>> +===GNAT BUG DETECTED==+
>> | Pro 7.1.0w (20120403-47) (ia64-hp-openvms) GCC error:|
>> | in extract_insn, at recog.c:2123 |
>> | Error detected around i-cstrin.adb:236:8 |
>> 
>> 
>> Expansion of POINTER_PLUS_EXPR doesn't handle the case of 
>> PRECISION(sizetype) > PRECISION(type), leading to RTL expressions with 
>> different modes.
>> 
>> This patch fixes the build issue, tested on ia64-hp-openvms.
>> Also tested with our internal testsuite.
>> I haven't run the GCC testsuite on a regular platform, as the condition will 
>> never trigger.
>> 
>> Ok for trunk ?
> 
> Ok if you add a comment why this is needed.

Thanks, committed with this comment:

  /* If sizetype precision is larger than pointer precision, truncate the
 offset to have matching modes.  */
  else if (TYPE_PRECISION (sizetype) > TYPE_PRECISION (type))
treeop1 = fold_convert_loc (loc, type, treeop1);


Tristan.
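
The effect of the committed truncation can be illustrated outside the
compiler.  This is a rough model only, assuming a 32-bit pointer and a 64-bit
sizetype as on VMS with short pointers; short_pointer_plus is a hypothetical
name and the code is plain C integer arithmetic, not GCC tree manipulation.

```c
#include <stdint.h>

/* Hypothetical model of a 32-bit (SImode) pointer plus a 64-bit sizetype
   offset.  The offset is truncated to the pointer width first, so both
   operands of the addition have the same width.  */
static uint32_t
short_pointer_plus (uint32_t ptr, uint64_t offset)
{
  return ptr + (uint32_t) offset;
}
```

The constant 4294967288 from the insn dump is 2^32 - 8, so once truncated it
acts as -8: adding it to 0x1000 gives 0xff8.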

[libiberty]: Adjust style in pex-unix.c(to_ptr32)

2012-04-04 Thread Tristan Gingold

Hi,

I am committing this patch (as obvious) to adjust the style of the VMS specific 
function to_ptr32.

Tested by building for ia64-hp-openvms.

Tristan.

libiberty/
2012-04-04  Tristan Gingold  

* pex-unix.c (to_ptr32): Fix style.

Index: pex-unix.c
===
--- pex-unix.c  (revision 186130)
+++ pex-unix.c  (working copy)
@@ -85,13 +85,15 @@
   int argc;
   __char_ptr_char_ptr32 short_argv;
 
-  for (argc=0; ptr64[argc]; argc++);
+  /* Count number of arguments.  */
+  for (argc = 0; ptr64[argc] != NULL; argc++)
+;
 
   /* Reallocate argv with 32 bit pointers.  */
   short_argv = (__char_ptr_char_ptr32) decc$malloc
 (sizeof (__char_ptr32) * (argc + 1));
 
-  for (argc=0; ptr64[argc]; argc++)
+  for (argc = 0; ptr64[argc] != NULL; argc++)
 short_argv[argc] = (__char_ptr32) decc$strdup (ptr64[argc]);
 
   short_argv[argc] = (__char_ptr32) 0;
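
The count-then-copy pattern in to_ptr32 can be sketched portably as follows.
This is an illustration under assumptions, not the libiberty code: copy_argv
is a hypothetical name, the VMS-specific decc$malloc and decc$strdup are
replaced by standard malloc and strcpy, and allocation failures are handled
here although the hunk above does not show any such handling.

```c
#include <stdlib.h>
#include <string.h>

/* Count the entries of a NULL-terminated vector, then build a freshly
   allocated copy.  Returns NULL on allocation failure.  */
static char **
copy_argv (char *const *argv)
{
  int argc;
  int i;
  char **copy;

  /* Count number of arguments.  */
  for (argc = 0; argv[argc] != NULL; argc++)
    ;

  /* One extra slot for the terminating NULL.  */
  copy = malloc (sizeof (char *) * (argc + 1));
  if (copy == NULL)
    return NULL;

  for (i = 0; i < argc; i++)
    {
      copy[i] = malloc (strlen (argv[i]) + 1);
      if (copy[i] == NULL)
        {
          /* Release what was allocated so far.  */
          while (i-- > 0)
            free (copy[i]);
          free (copy);
          return NULL;
        }
      strcpy (copy[i], argv[i]);
    }

  copy[argc] = NULL;
  return copy;
}
```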

Re: [C11-atomic] [patch] gimple atomic statements

2012-04-04 Thread Richard Guenther

On Tue, Apr 3, 2012 at 9:40 PM, Andrew MacLeod  wrote:
> Here is my first step in promoting the __atomic builtins into gimple
> statements.  I originally planned this as tree codes, but when prototyped it
> became obvious that a gimple statement was a far superior solution since we
> also need to deal with LHS and memory address issues.
>
> Motivations are many-fold, but primarily it makes manipulating atomics
> easier and exposes more of their side effects to the optimizers.  In
> particular we can now expose both return values of the compare and swap when
> implementing compare_exchange and get more efficient code generation. (soon
> :-)   It is item number 3 on my 4.8 task list:
> http://gcc.gnu.org/wiki/Atomic/GCCMM/gcc4.8.
>
> This first step adds a GIMPLE_ATOMIC statement class which handles all the
> __atomic built-in calls. Right after the cfg is built, all built-in __atomic
> calls are converted to gimple_atomic statements.  I considered doing the
> conversion right in the gimplifier, but elected to leave it as a pass for
> the time being.  This then  passes through all the optimizers to cfgexpand,
> where they are then converted directly into rtl.
>
> This currently  produces the same code that the builtins do in 4.7.  I
> expect that I missed a few places in the optimizers where they aren't
> properly treated as barriers yet,  but I'll get to tracking those down in a
> bit.
>
> I also have not implemented non-integral atomics yet, nor am I issuing
> library calls when inline expansion cannot be done.  That's next... I just
> want to get the basics checked into the branch.
>
> I expect to be able to wrap the __sync routines into this as well,
> eliminating all the atomic and sync builtin expansion code, keeping
> everything in one easy statement class.   Then I'll add the _Atomic type
> qualifier to the parser, and have that simply translate expressions
> involving those types into gimple_atomic statements at the same time calls
> are converted.
>
> This bootstraps on x86_64-unknown-linux-gnu, and the only testsuite
> regressions are one involving issuing library calls.  There is a toolchain
> build problem with libjava however... During libjava construction there ends
> up being files which can't be created due to permission problems in .svn
> directories ...   Pretty darn weird, but I'll look into it later when the
> atomic gimple support is complete, if the problem still exists then.
>
> Anyone see anything obviously flawed about the approach?

The fact that you need to touch every place that wants to look at memory
accesses shows that you are doing it wrong.  Instead my plan was to
force _all_ memory accesses to GIMPLE_ASSIGNs (yes, including those
we have now in calls).  You're making a backwards step in my eyes.

What do you think is "easier" when you use a GIMPLE_ATOMIC
(why do you need a fntype field?!  Should the type not be available
via the operand types?!)

Your tree-cfg.c parts need to be filled in.  They are the specification of
GIMPLE_ATOMIC - at the moment you allow any garbage.

Similar to how I dislike the choice of adding GIMPLE_TRANSACTION
instead of using builtin functions I dislike this.

I suppose you do not want to use builtins because for primitive types you
end up with multiple statements for something "atomic"?

So, please tell us why the current scheme does not work and how the
new scheme overcomes this (that's entirely missing in your proposal...)

Thanks,
Richard.

> Andrew
>
>
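
As background for the "both return values of the compare and swap" remark in
the quoted proposal, here is a small sketch using the existing
__atomic_compare_exchange_n built-in (available in GCC 4.7 and later).  The
wrapper try_increment is a hypothetical helper, and the sketch says nothing
about how a GIMPLE_ATOMIC statement would represent the operation.

```c
#include <stdbool.h>

/* Hypothetical helper: try to replace *counter with *expected + 1.
   The built-in yields two results: the boolean return value says whether
   the swap happened, and on failure *expected is overwritten with the
   value actually found in memory.  */
static bool
try_increment (int *counter, int *expected)
{
  return __atomic_compare_exchange_n (counter, expected, *expected + 1,
                                      false /* strong variant */,
                                      __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```

A caller typically loops on the boolean and reuses the updated *expected for
the next attempt, which is where having both results visible to the
optimizers could pay off.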

[Libiberty]: Handle VMS as a LLP64 platform in splay-tree.h

2012-04-04 Thread Tristan Gingold

Hi,

unfortunately VMS (when 64bit pointers are used - which is nice for gcc) is 
also an LLP64 platform.
So I need to follow the Win64 way in splay-tree.h.

Tested manually by building (and using) gcc on ia64-hp-openvms.

Ok for trunk ?

Tristan.

include/
2012-04-04  Tristan Gingold  

* splay-tree.h: Use LLP64 definitions of libi_shostptr_t and
libi_hostptr_t for VMS with 64bit pointers.

--- a/include/splay-tree.h
+++ b/include/splay-tree.h
@@ -37,7 +37,8 @@ extern "C" {
 
 #include "ansidecl.h"
 
-#ifndef _WIN64
+#if !(defined (_WIN64) \
+  || (defined (__VMS__) && __INITIAL_POINTER_SIZE == 64))
   typedef unsigned long int libi_uhostptr_t;
   typedef long int libi_shostptr_t;
 #else
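
The reason LLP64 hosts need a separate branch is that long is narrower than
a pointer there.  The sketch below mirrors the condition from the patch but
uses a hypothetical type name (sketch_uhostptr_t), and its long long branch
is an assumption about what such hosts need, since the hunk above is cut off
at #else.

```c
/* Hypothetical name; the real typedefs are libi_uhostptr_t and
   libi_shostptr_t.  The #else branch is an assumption, not a copy of
   splay-tree.h.  */
#if !(defined (_WIN64) \
      || (defined (__VMS__) && __INITIAL_POINTER_SIZE == 64))
  typedef unsigned long int sketch_uhostptr_t;
#else
  typedef unsigned long long sketch_uhostptr_t;
#endif

/* The point of the selection: the integer type must be able to hold a
   host pointer.  */
_Static_assert (sizeof (sketch_uhostptr_t) >= sizeof (void *),
                "host pointer integer type is too narrow");

/* Round-trip a pointer through the integer type.  */
static void *
through_integer (void *p)
{
  sketch_uhostptr_t bits = (sketch_uhostptr_t) p;
  return (void *) bits;
}
```

On an LLP64 host that took the first branch, the static assertion would fail
at compile time instead of silently truncating splay-tree keys.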

Re: [PATCH] ARM: Use different linker path for hardfloat ABI

2012-04-04 Thread Andrew Haley

On 04/03/2012 11:53 AM, Richard Earnshaw wrote:
>> Now, I wonder why the dynamic linker cannot figure out the ABI itself
>> > by means of using ELF flags or so?
>> > 
> There are no ELF flags for this in executables.  The attributes only
> apply to object files and anyway they are too expensive to decode at run
> time.

Isn't that the core problem, then?  We have incompatible libraries
and executables but they aren't marked as such.

Andrew.

Re: [4.5/Ada] Fix build with 4.6 compiler

2012-04-04 Thread Richard Guenther

On Wed, Apr 4, 2012 at 9:51 AM, Arnaud Charlet  wrote:
>> Arno, do you have objections to me applying the attached patch to the 4.5
>> branch?  It makes it possible to build (and bootstrap) the Ada compiler on 
>> the
>> 4.5 branch (oldest supported branch) with the 4.6 compiler, which is now the
>> system compiler in recent Linux distributions.
>
> Well, we don't guarantee such compatibility in general,
> so I'd like to make it clear that people shouldn't expect this combination
> to work, and if more complex patches are submitted, we'll likely NOT
> integrate them.

Maybe you should start to do that though.  Otherwise you'll simply lose
the easy (but not required) testing of Ada when I (or others) backport
changes to still maintained branches on machines where the system Ada
compiler is newer than the oldest maintained branch.

So - it's all for your own benefit ;)  [you can think of using a new
-we-are-building-gcc switch for the compile where you'd enable some backward
compatibility or so]

Thanks,
Richard.

Re: [PATCH] ARM: Use different linker path for hardfloat ABI

2012-04-04 Thread Joseph S. Myers

On Wed, 4 Apr 2012, Michael Hope wrote:

> The tricky one is new GCC with old GLIBC.  GCC may have to do a
> configure time test and fall back to /lib/ld-linux.so.3 if the hard
> float loader is missing.

I don't think that's appropriate for ABI issues.  If a different dynamic 
linker name is specified, GCC should use it unconditionally (and require 
new enough glibc or a glibc installation that was appropriately 
rearranged).

> > I have no idea whether shlib-versions files naming a file in a
> > subdirectory will work - but if not, you'd need to send a patch to
> > libc-alpha to support dynamic linkers in subdirectories, with appropriate
> > justification for why you are doing something different from all other
> > architectures.
> 
> Understood.  For now this is just a path.  There's more infrastructure
> work needed if the path includes a directory.

Formally it's just a path - but an important feature of GNU/Linux and the 
GNU toolchain is consistency between different architectures and existing 
upstream practice is that the dynamic linker is always in the same 
directory as the other associated libraries and that this has the form 
/lib.  In the absence of a compelling reason, which I have not 
seen stated, to do otherwise for a single case, I think that existing 
practice should be followed with the dynamic linker being in a directory 
such as /libhf.

The "more infrastructure work needed" makes clear that you need libc-alpha 
buy-in *before* putting any patches into GCC or ports.  But maybe if you 
don't try to put the dynamic linker in a different directory from the 
other libraries, it's easier to support via existing mechanisms (setting 
slibdir differently if --enable-multiarch-directories or similar)?

> Do the MIPS or PowerPC loaders detect the ABI and change the library
> path based on that?  I couldn't tell from the code.

No, they don't detect the ABI.  Both ABIs (and, for Power, the e500v1 and 
e500v2 variants - compatible with soft-float at the function-calling level 
but with some glibc ABI differences with soft-float and with each other) 
use the same directories.

> > (e) Existing practice for cases that do use different dynamic linkers is
> > to use a separate library directory, not just dynamic linker name, as in
> > lib32 and lib64 for MIPS or libx32 for x32; it's certainly a lot easier to
> > make two sets of libraries work in parallel if you have separate library
> > directories like that.
> 
> Is this required, or should it be left to the distro to choose?  Once
> the loader is in control then it can account for any distro specific
> features, which may be the standard /lib and /usr/lib for single ABI
> distros like Fedora or /usr/lib/$tuple for multiarch distros like
> Ubuntu and Debian.

I thought Fedora used the standard upstream /lib64 on x86_64 and so would 
naturally use a standard upstream /libhf where appropriate.

> > So it would seem more appropriate to define a directory libhf for ARM 
> > (meaning you need a binutils patch as well to
> > handle that directory, I think)
> 
> I'd like to leave that discussion for now.  The Debian goal is to
> support incompatible ABIs and, past that, incompatible architectures.
> libhf is ambiguous as you could have a MIPS hard float library
> installed on the same system as an ARM hard float library.

If you want both ARM and MIPS hard-float then I'd think you want both 
big-endian and little-endian ARM hard-float - but your patch defines the 
same dynamic linker name for both of those.

Standard upstream practice supports having multiple variants that 
plausibly run on the same system at the same time, such as /lib and 
/lib64, and it seems reasonable to support hard and soft float variants 
that way via a directory such as /libhf.  The Debian-style paths are not 
the default on any other architecture and I don't think it's appropriate 
to make them the default for this particular case only.

> > and these different Debian-style names
> > could be implemented separately in a multiarch patch if someone submits
> > one that properly accounts for my review comments on previous patch
> > versions (failure to produce such a fixed patch being why Debian multiarch
> > directory support has not got into GCC so far).
> 
> Agreed.  Note that this loader path discussion is unrelated to
> multiarch.  It came from the same people so there's a family
> resemblance.

I think it's directly related, and that such a path is inappropriate by 
default; that ARM should be consistent with other architectures, and that 
if you want to support paths in such subdirectories that would be a 
separate multiarch patch series for GCC, binutils and glibc (but the 
PT_INTERP would still use /lib/ without subdirectories in 
any case).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [Patch, fortran] PR 49010/24518 MOD/MODULO fixes

2012-04-04 Thread Tobias Burnus

Janne Blomqvist wrote:
> the attached patch implements a few fixes and cleanups for the MOD and
> MODULO intrinsics.

> The patch adds notes to the documentation about the usage of fmod, so
> users interested in corner-case behavior can look up how that function
> is supposed to behave on their target.

> +@item @emph{Note}:
> +The obvious algorithm as specified above is unstable for large real
> +inputs. Hence, for real inputs the result is calculated by using the
> +@code{fmod} function in the C math library.

I wonder whether one should extend the note, stating that using
"fmod" might lead to a `wrongly' signed 0. Something like: "; depending
on its implementation, this might lead to a different sign for 0 results
- compared to the result of the obvious algorithm."

Since you modify intrinsic.texi, could you also do:
- Add a cross @ref between mod and modulo.
- Show in the example (as comments) also the result of the mod/modulo
  operations. As many confuse mod (=remainder) and modulo, it makes
  sense to help them by showing the result of examples.*

Please also update the Copyright year for simplify.c.

Otherwise, the patch looks OK.

(Except for the mode change of libgcc/configure - which requires a build
maintainer approval. If you want to make it executable, do the same also
for libitm/configure - and do so in a separate patch.)

Tobias


* Regarding "mod" vs. "modulo" see also
https://en.wikipedia.org/wiki/Remainder#The_case_of_general_integers
and the right column at
https://en.wikipedia.org/wiki/Modulo_operation

Re: [PATCH] ARM: Use different linker path for hardfloat ABI

2012-04-04 Thread Joseph S. Myers

On Wed, 4 Apr 2012, Jakub Jelinek wrote:

> If the agreement is that arm 32-bit softfp really needs to be installable
> alongside 32-bit hardfp (and alongside aarch64), then IMHO it should do it
> like all other multilib ports (x86_64/i?86/x32, s390/s390x, ppc/ppc64, the
> various MIPS variants) and what FSB says, e.g. use
> /lib/ld-linux.so.3 and */lib dirs for softfp,
> /libhf/ld-linux.so.3 and */libhf dirs for hardfp and
> /lib64/ld-linux.so.3 and */lib64 dirs for aarch64, have 32-bit
> arm-linux-gnueabi gcc configured for softfp/hardfp multilib with
> MULTILIB_OSDIRNAMES, etc., have it configured in glibc, and for those that
> choose the Debian layout instead, if it is added somehow configurable into
> upstream gcc/glibc of course handle it similarly there.  I just wonder why
> that hasn't been done 10 years ago and only needs doing now (of course,
> aarch64 is going to be new, talking now about the 32-bit softfp vs. hardfp).

Exactly.  The default should follow the existing practice for other 
architectures.

> One needs to wonder also why arm hasn't switched to 128-bit long double when
> all other mainstream architectures did (I hope at least aarch64 will use it
> by default).

The AArch64 ABI (generic, not GNU/Linux, and draft, still subject to 
incompatible change) is public and used 128-bit long double the last time 
I checked.

My presumption is that there has been no demand for long double wider than 
double among 32-bit ARM users.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: rs6000 toc reference rtl again

2012-04-04 Thread Richard Sandiford

Alan Modra  writes:
> On Tue, Apr 03, 2012 at 07:49:04PM +0100, Richard Sandiford wrote:
>> Alan Modra  writes:
>> > Now that we are back in stage1, I'd like to apply
>> > http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00304.html, a change to
>> > toc reference rtl in order to properly specify r2 dependencies.  More
>> > commentary in that url.  I'm reposting the patch here since the old
>> > one no longer applies cleanly, and I've added some ENABLE_CHECKING
>> > code in rs6000_delegitimize_address.
>> 
>> Sorry to be a pain, but I don't think HIGH is supposed to contain
>> regs either.  Both HIGH and CONST are supposed to be true constants.
>
> Eh, so the existing use of CONST is wrong then.  ;-)

Right :-)  Sorry, I meant: although you're fixing the CONST,
the same problem really applies to the HIGH too.

> I'm proposing
>   (unspec [(symbol_ref sym) (reg r2)] UNSPEC_TOCREL)
> for the small model, and
>   (high (unspec [(symbol_ref sym) (reg r2)] UNSPEC_TOCREL))
>   (lo_sum (reg hi) (unspec [(symbol_ref sym) (reg r2)] UNSPEC_TOCREL))
> for medium/large model.
>
> You can see why I'd like to keep it this way;  The medium/large rtl is
> a natural split of the small rtl.  (I'm going to experiment with
> splitting the small rtl after reload for medium/large to see whether
> that helps our usage of call-saved regs in loops.)

Yeah.  FWIW, MIPS keeps both the small and large forms as constants
before reload, then splits them into (non-constant) forms that
reference the global pointer after reload.  So the small version
is simply:

(set (match_operand 0 "register_operand" "=...")
 (match_operand 1 "symbolic_constant" ""))

before reload, while the large form is a normal HIGH/LO_SUM pair.
That HIGH then gets split after reload too.

It seems to work well, although these days it does require the split
to happen before prologue/epilogue generation.  (Hmm, the *lea_high64
patterns are probably wrong now that we have shrink-wrapping...)

With the loop thing, do you mean that you're seeing too many HIGHs
being hoisted?  Would be nice to fix that in the loop optimisers
if so.

> I'm not wedded to the representation, *but* we do want gcc to treat
> the high part as a constant.  That's important because we don't ever
> want reload saving the high part to a stack slot!  Which is what does
> happen if you don't somehow tell gcc it is a constant.

Out of curiosity, does that still happen if you have a HIGH REG_EQUAL
note attached to the addition?  I'd have expected reload to convert
the note into a REG_EQUIV and treat the source as a function invariant.

> Besides, the high part *is* a constant within any given function.  So
> is the low part for that matter.  The only reason I want r2 mentioned
> in this rtl is for register liveness, eg. so that a load of a function
> pointer (which loads r2) for an indirect call doesn't get scheduled
> before any uses of the old r2.

Right.  But that's also true of, say, a constant that needs to be split
into a load-high and add.  The result of the add is a function constant,
but the add itself is still not a constant from an rtl perspective.

I think it'd get too confusing if constants were allowed to reference
registers.  That sort of thing is usually handled with REG_EQUAL notes
instead.  But in the specific case of GOT references, where the GOT
register isn't really around until prologue/epilogue generation anyway,
there's less point exposing it before reload.

> The alternative of removing r2 from the unspec and attaching a
> (use (reg r2)) to all instructions that have this addressing form
> might be clean but will require major duplication of patterns in
> rs6000.md, won't it?

Yeah.  It'd probably also not be as effective as splitting after reload.
Passes that want to optimise the load high are likely to be put off by
a (use ...).

Richard

Re: [Libiberty]: Handle VMS as a LLP64 platform in splay-tree.h

2012-04-04 Thread Pedro Alves

On 04/04/2012 09:55 AM, Tristan Gingold wrote:

> Hi,
> 
> unfortunately VMS (when 64bit pointers are used - which is nice for gcc) is
> also an LLP64 platform.
> So I need to follow the Win64 way in splay-tree.h.


Doesn't VMS gcc define __LP64__/__LLP64__?  Then we could for example:

#if !(defined (_WIN64) || defined (__LLP64__))

and thus avoid sprinkling __VMS__ checks around.
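
A minimal sketch of what is at stake, assuming the `#else` branch selects
`long long` types (the `sketch_*` names are hypothetical and exist only for
this illustration): on an LLP64 target `long` stays 32 bits while pointers
are 64 bits, so the integer type that carries a pointer has to be chosen by
the preprocessor.

```c
#include <assert.h>

/* Same shape as the test in the patch.  On targets that define neither
   macro, the first branch is taken.  */
#if !(defined (_WIN64) \
      || (defined (__VMS__) && __INITIAL_POINTER_SIZE == 64))
typedef unsigned long int sketch_uhostptr_t;
typedef long int sketch_shostptr_t;
#else
/* Assumption: LLP64 needs a 64-bit type here, since long is 32 bits.  */
typedef unsigned long long sketch_uhostptr_t;
typedef long long sketch_shostptr_t;
#endif

/* Compile-time check (C89 style): the array size becomes negative, and
   the build fails, if the chosen type cannot hold a pointer.  */
typedef char sketch_ptr_fits[sizeof (sketch_uhostptr_t) >= sizeof (void *)
                             ? 1 : -1];

/* Converts a pointer to the integer type and checks that converting it
   back yields the same pointer.  */
static sketch_uhostptr_t
sketch_ptr_to_int (const void *p)
{
  sketch_uhostptr_t u = (sketch_uhostptr_t) p;
  assert ((const void *) u == p);
  return u;
}
```

A predefined `__LLP64__` macro would only change the condition on the first
`#if`; the typedefs themselves would stay as they are.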

> 
> Tested manually by building (and using) gcc on ia64-hp-openvms.
> 
> Ok for trunk ?
> 
> Tristan.
> 
> include/
> 2012-04-04  Tristan Gingold  
> 
>   * splay-tree.h: Use LLP64 definitions of libi_shostptr_t and
>   libi_uhostptr_t for VMS with 64bit pointers.
> 
> --- a/include/splay-tree.h
> +++ b/include/splay-tree.h
> @@ -37,7 +37,8 @@ extern "C" {
>  
>  #include "ansidecl.h"
>  
> -#ifndef _WIN64
> +#if !(defined (_WIN64) \
> +  || (defined (__VMS__) && __INITIAL_POINTER_SIZE == 64))
>typedef unsigned long int libi_uhostptr_t;
>typedef long int libi_shostptr_t;
>  #else
> 



-- 
Pedro Alves

Re: [Patch, Fortran] PRs 52751/40973 - don't set TREE_PUBLIC for PRIVATE module procs/vars

2012-04-04 Thread Paul Richard Thomas

Dear Tobias,

This is OK for trunk - thanks for the patch.

Cheers

Paul

On Tue, Apr 3, 2012 at 11:18 PM, Tobias Burnus  wrote:
> Dear all,
>
> the attached patch only sets TREE_PUBLIC for module variables and module
> procedures which have neither the PRIVATE attribute nor a C-binding label.
> Seemingly, only NAG f95 does this for module variables and none of my
> compilers does this for module procedures.
>
> The main effect of this patch is a code size reduction as the compiler might
> (even without -fwhole-program -flto) optimize unused variables/procedures
> away. Additionally, the compiler might inline code which it otherwise
> wouldn't do (due to the code size increase) or do optimizations based on the
> value of the module variables (though, GCC has room for improvement for
> optimizing static variables.)
>
> Note: For C-binding variables without binding label ("bind(C, name='')"), I
> don't use DECL_COMMON. That should be okay as DECL_COMMON is only used to
> make sure that a variable can be initialized from either C or Fortran. But
> without binding name, that's not possible from C, hence, that's fine.
>
> Built and regtested on x86-64-linux. (And currently regtesting again - after
> a minor modification.)
> OK for the trunk?
>
> Tobias
>



-- 
The knack of flying is learning how to throw yourself at the ground and miss.
       --Hitchhikers Guide to the Galaxy

Re: [PATCH] Fix PRs c/52283/37985

2012-04-04 Thread Manuel López-Ibáñez

Hi Christian,

You have to add the testcases from both PR52283 and PR37985, and an
appropriate Changelog, and bootstrap+regression test everything and
double-check that the new testcases don't fail and no old testcases
fail with the patch (by comparing with the testcases that fail without
the patch).

Cheers,

Manuel.

On 4 April 2012 10:17, Christian Bruel  wrote:
> Hello,
>
> Is it OK to push the cleaning of TREE_NO_WARNING to fix the constant
> expressions errors discrepancies, as discussed in bugzilla #52283, now that
> the trunk is open ?
>
> Many thanks,
>
>
>

Re: [Libiberty]: Handle VMS as a LLP64 platform in splay-tree.h

2012-04-04 Thread Tristan Gingold


On Apr 4, 2012, at 11:26 AM, Pedro Alves wrote:

> On 04/04/2012 09:55 AM, Tristan Gingold wrote:
> 
>> Hi,
>> 
>> unfortunately VMS (when 64bit pointers are used - which is nice for gcc) is
>> also an LLP64 platform.
>> So I need to follow the Win64 way in splay-tree.h.
> 
> 
> Doesn't VMS gcc define __LP64__/__LLP64__?

Unfortunately no (like mingw if I read config files correctly).

This makes even less sense on VMS where 32bit and 64bit can be used at the same 
time.

>  Then we could for example:
> 
> #if !(defined (_WIN64) || defined (__LLP64__))
> 
> and thus avoid sprinkling __VMS__ checks around.

In fact this is the only place where we need it, so this is under control :-)

Tristan.

> 
>> 
>> Tested manually by building (and using) gcc on ia64-hp-openvms.
>> 
>> Ok for trunk ?
>> 
>> Tristan.
>> 
>> include/
>> 2012-04-04  Tristan Gingold  
>> 
>>  * splay-tree.h: Use LLP64 definitions of libi_shostptr_t and
>>  libi_uhostptr_t for VMS with 64bit pointers.
>> 
>> --- a/include/splay-tree.h
>> +++ b/include/splay-tree.h
>> @@ -37,7 +37,8 @@ extern "C" {
>> 
>> #include "ansidecl.h"
>> 
>> -#ifndef _WIN64
>> +#if !(defined (_WIN64) \
>> +  || (defined (__VMS__) && __INITIAL_POINTER_SIZE == 64))
>>   typedef unsigned long int libi_uhostptr_t;
>>   typedef long int libi_shostptr_t;
>> #else
>> 
> 
> 
> 
> -- 
> Pedro Alves

RFA: RL78: Add -fstack-usage support

2012-04-04 Thread Nick Clifton

Hi DJ,

  Please could I apply the patch below to the RL78 port ?  It adds
  support for generating stack use info with the -fstack-usage option.

  Also - may I apply this patch to the 4.7 branch as well please ?

Cheers
  Nick

gcc/ChangeLog
2012-04-04  Nick Clifton  

* config/rl78/rl78.c (rl78_expand_prologue): Set stack use
information, if requested.

Index: gcc/config/rl78/rl78.c
===
--- gcc/config/rl78/rl78.c  (revision 186130)
+++ gcc/config/rl78/rl78.c  (working copy)
@@ -827,6 +827,9 @@
   if (!cfun->machine->computed)
 rl78_compute_frame_info ();
 
+  if (flag_stack_usage)
+current_function_static_stack_size = cfun->machine->framesize;
+
   for (i = 0; i < 16; i++)
 if (cfun->machine->need_to_push [i])
   {

[PATCH] Fix PR52808

2012-04-04 Thread Richard Guenther


This fixes LTO profiledbootstrap.  tracer tail-duplicates loop
headers; that is not profitable and it makes loops have multiple
entries which inhibits further optimization.  The following
patch cures that.
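
The shape of the new test can be sketched outside GCC as follows. This is a
toy model only: `toy_bb`, `toy_loop` and `toy_should_tail_duplicate` are
hypothetical stand-ins, not GCC's data structures, and the
can_duplicate_block_p check is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for a loop and a basic block.  */
struct toy_bb;
struct toy_loop
{
  struct toy_bb *header;
};
struct toy_bb
{
  int n_preds;                  /* number of incoming edges */
  struct toy_loop *loop_father; /* innermost enclosing loop, or NULL */
};

/* A block with several predecessors is a candidate for tail
   duplication unless loop information is available and the block is
   the header of its innermost loop.  */
static int
toy_should_tail_duplicate (const struct toy_bb *bb, int have_loops)
{
  assert (bb != NULL);
  if (bb->n_preds <= 1)
    return 0;
  if (have_loops && bb->loop_father != NULL
      && bb->loop_father->header == bb)
    return 0;
  return 1;
}
```

In this model a do { } while loop header has two predecessors (loop entry and
back edge), so it was a candidate before and is rejected once loop information
is consulted.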

LTO profiledbootstrapped on x86_64-unknown-linux-gnu, regular
testing in progress.

Richard.

2012-04-04  Richard Guenther  

PR tree-optimization/52808
* tracer.c (tail_duplicate): Do not tail-duplicate loop header
blocks.
* Makefile.in (tracer.o): Depend on $(CFGLOOP_H).

Index: gcc/tracer.c
===
*** gcc/tracer.c(revision 186134)
--- gcc/tracer.c(working copy)
***
*** 52,57 
--- 52,58 
  #include "tree-pass.h"
  #include "tree-flow.h"
  #include "tree-inline.h"
+ #include "cfgloop.h"
  
  static int count_insns (basic_block);
  static bool ignore_bb_p (const_basic_block);
*** tail_duplicate (void)
*** 307,313 
}
  traced_insns += bb2->frequency * counts [bb2->index];
  if (EDGE_COUNT (bb2->preds) > 1
! && can_duplicate_block_p (bb2))
{
  edge e;
  basic_block copy;
--- 308,320 
}
  traced_insns += bb2->frequency * counts [bb2->index];
  if (EDGE_COUNT (bb2->preds) > 1
! && can_duplicate_block_p (bb2)
! /* We have the tendency to duplicate the loop header
!of all do { } while loops.  Do not do that - it is
!not profitable and it might create a loop with multiple
!entries or at least rotate the loop.  */
! && (!current_loops
! || bb2->loop_father->header != bb2))
{
  edge e;
  basic_block copy;
Index: gcc/Makefile.in
===
--- gcc/Makefile.in (revision 186134)
+++ gcc/Makefile.in (working copy)
@@ -3391,7 +3391,7 @@ bb-reorder.o : bb-reorder.c $(CONFIG_H)
 tracer.o : tracer.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
$(TREE_H) $(BASIC_BLOCK_H) hard-reg-set.h output.h $(CFGLAYOUT_H) \
$(FLAGS_H) $(TIMEVAR_H) $(PARAMS_H) $(COVERAGE_H) $(FIBHEAP_H) \
-   $(TREE_PASS_H) $(TREE_FLOW_H) $(TREE_INLINE_H)
+   $(TREE_PASS_H) $(TREE_FLOW_H) $(TREE_INLINE_H) $(CFGLOOP_H)
 cfglayout.o : cfglayout.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
$(RTL_H) $(TREE_H) insn-config.h $(BASIC_BLOCK_H) hard-reg-set.h output.h \
$(FUNCTION_H) $(CFGLAYOUT_H) $(CFGLOOP_H) $(TARGET_H) gt-cfglayout.h \

Re: [PATCH] Fix PRs c/52283/37985

2012-04-04 Thread Christian Bruel



On 04/04/2012 11:38 AM, Manuel López-Ibáñez wrote:

> Hi Christian,
>
> You have to add the testcases from both PR52283 and PR37985, and an
> appropriate Changelog, and bootstrap+regression test everything and
> double-check that the new testcases don't fail and no old testcases
> fail with the patch (by comparing with the testcases that fail without
> the patch).



The testcase was part of the attached patch, along with the ChangeLog
entries.


It was bootstrapped and regtested for C and C++ on x86 (that was in
bugzilla comment #22); sorry, I should have mentioned it here too. Done
now :)


Cheers

Christian




> Cheers,
>
> Manuel.
>
> On 4 April 2012 10:17, Christian Bruel  wrote:
>> Hello,
>>
>> Is it OK to push the cleaning of TREE_NO_WARNING to fix the constant
>> expressions errors discrepancies, as discussed in bugzilla #52283, now that
>> the trunk is open ?
>>
>> Many thanks,

Re: rs6000 toc reference rtl again

2012-04-04 Thread Alan Modra

On Wed, Apr 04, 2012 at 10:25:39AM +0100, Richard Sandiford wrote:
> With the loop thing, do you mean that you're seeing too many HIGHs
> being hoisted?

No, nothing as complicated as that.  In a lot of cases, any hoisting
of the high part is bad, because the linker nops out the high part and
edits the low part insn when the r2 offset can be done in 16 bits.  If
the high part insn was the only use of a call-saved reg, then the
function saves and restores a reg to no purpose.
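
The arithmetic behind that can be sketched as below, assuming the usual
PowerPC convention in which the low part is a sign-extended 16-bit value and
the high part is adjusted to compensate. The function names are hypothetical.
When the r2 offset fits in 16 bits the high part is zero, which is the case
where the high-part insn does no useful work.

```c
#include <assert.h>
#include <stdint.h>

/* Low 16 bits of OFF, interpreted as a signed 16-bit value.  */
static int32_t
toc_lo_sketch (int32_t off)
{
  int32_t lo = (int32_t) ((uint32_t) off & 0xffffu);
  if (lo >= 0x8000)
    lo -= 0x10000;
  return lo;
}

/* High part, adjusted so that hi * 65536 + lo == off.  */
static int32_t
toc_ha_sketch (int32_t off)
{
  int64_t diff = (int64_t) off - toc_lo_sketch (off);
  assert (diff % 65536 == 0);
  return (int32_t) (diff / 65536);
}

/* Nonzero if the offset needs no high part at all.  */
static int
toc_fits_16_sketch (int32_t off)
{
  return toc_ha_sketch (off) == 0;
}
```

For example, an offset of 0x1234 needs no high part, while 0x18000 splits
into a high part of 2 and a low part of -32768.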

> > I'm not wedded to the representation, *but* we do want gcc to treat
> > the high part as a constant.  That's important because we don't ever
> > want reload saving the high part to a stack slot!  Which is what does
> > happen if you don't somehow tell gcc it is a constant.
> 
> Out of curiosity, does that still happen if you have a HIGH REG_EQUAL
> note attached to the addition?  I'd have expected reload to convert
> the note into a REG_EQUIV and treat the source as a function invariant.

I saw that possibility in reload, and did try a REG_EQUIV note before
I wrapped the ADD in a CONST as we have currently.  Some early pass
deleted the notes for me.  I forget which one.

Oh well, I was going to try splitting after reload anyway.

-- 
Alan Modra
Australia Development Lab, IBM

Re: [Patch, fortran] PR 49010/24518 MOD/MODULO fixes

2012-04-04 Thread Janne Blomqvist

On Wed, Apr 4, 2012 at 12:11, Tobias Burnus
 wrote:
> Janne Blomqvist wrote:
>> the attached patch implements a few fixes and cleanups for the MOD and
>> MODULO intrinsics.
>
>> The patch adds notes to the documentation about the usage of fmod, so
>> users interested in corner-case behavior can look up how that function
>> is supposed to behave on their target.
>
>> +@item @emph{Note}:
>> +The obvious algorithm as specified above is unstable for large real
>> +inputs. Hence, for real inputs the result is calculated by using the
>> +@code{fmod} function in the C math library.
>
> I wonder whether one should extend the note, stating that using
> "fmod" might lead to a `wrongly' signed 0. Something like: "; depending
> on its implementation, this might lead to a different sign for 0 results
> - compared to the result of the obvious algorithm."

Yes, but if one describes the behavior for one special case, IMHO one
should also specify it for other special cases. E.g. from the glibc
fmod manpage:

...the returned value has the same sign as x and a magnitude less than
the magnitude of y.

   If x or y is a NaN, a NaN is returned.

   If x is an infinity, a domain error occurs, and a NaN is returned.

   If y is zero, a domain error occurs, and a NaN is returned.

   If x is +0 (-0), and y is not zero, +0 (-0) is returned.

For compile-time evaluation, MPFR fmod conforms to the specification
for fmod in C99 Annex F, and for runtime so does glibc fmod (see
above). But for other targets, runtime might be different. But I
couldn't come up with a formulation that I was happy with; not that
I'm happy about my text in the patch either. Suggestions welcome.

> Since you modify intrinsic.texi, could you also do:
> - Add a cross @ref between mod and modulo.

Ok.

> - Show in the example (as comments) also the result of the mod/modulo
>  operations. As many confuse mod (=remainder) and modulo, it makes
>  sense to help them by showing the result of examples.*

Well, Fortran mod != C99 remainder().

So for a general remainder operation the result is "x-n*y" (evaluated
with infinite precision), where n is x/y rounded to an integer in some
way. Then we have at least the following cases

- C fmod and Fortran MOD: n is rounded towards zero

- Fortran MODULO: n is rounded towards -Infinity

- C99 and IEEE754 remainder: n is rounded to the nearest integer.
Incidentally, this has the nice property that abs(result) <= abs(y/2).
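
The three rounding choices can be sketched as follows. This is a minimal
illustration with hypothetical helper names, using the obvious x - n*y
formula, so it is only meaningful for small, exactly representable inputs; it
is not what the patch does for real arguments (which is to call fmod).

```c
#include <assert.h>

/* n = x/y rounded toward zero (C fmod, Fortran MOD).  */
static long
sketch_div_trunc (double x, double y)
{
  assert (y != 0.0);
  return (long) (x / y);        /* a double-to-long cast truncates */
}

/* n = x/y rounded toward -Infinity (Fortran MODULO).  */
static long
sketch_div_floor (double x, double y)
{
  long n = sketch_div_trunc (x, y);
  if ((double) n > x / y)
    n--;
  return n;
}

/* n = x/y rounded to nearest, ties to even (C99/IEEE 754 remainder).  */
static long
sketch_div_nearest (double x, double y)
{
  long n = sketch_div_floor (x, y);
  double frac = x / y - (double) n;
  if (frac > 0.5 || (frac == 0.5 && n % 2 != 0))
    n++;
  return n;
}

static double
sketch_mod (double x, double y)
{
  return x - (double) sketch_div_trunc (x, y) * y;
}

static double
sketch_modulo (double x, double y)
{
  return x - (double) sketch_div_floor (x, y) * y;
}

static double
sketch_remainder (double x, double y)
{
  return x - (double) sketch_div_nearest (x, y) * y;
}
```

With x = -7 and y = 3, these give -1, 2 and -1 respectively; with x = 8 and
y = 3 they give 2, 2 and -1.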

> Please also update the Copyright year for simplify.c.

Ok.

> Otherwise, the patch looks OK.
>
> (Except for the mode change of libgcc/configure - which requires a build
> maintainer approval. If you want to make it executable, do the same also
> for libitm/configure - and do so in a separate patch.)

Ah, that was some autogenerated stuff that was accidentally included
in the patch. Please disregard it.

-- 
Janne Blomqvist

Re: RFA: RL78: Add -fstack-usage support

2012-04-04 Thread Eric Botcazou

>   Please could I apply the patch below to the RL78 port ?  It adds
>   support for generating stack use info with the -fstack-usage option.

You probably need to adjust gcc.dg/stack-usage-1.c too.

> Index: gcc/config/rl78/rl78.c
> ===
> --- gcc/config/rl78/rl78.c(revision 186130)
> +++ gcc/config/rl78/rl78.c(working copy)
> @@ -827,6 +827,9 @@
>if (!cfun->machine->computed)
>  rl78_compute_frame_info ();
>
> +  if (flag_stack_usage)
> +current_function_static_stack_size = cfun->machine->framesize;
> +
>for (i = 0; i < 16; i++)
>  if (cfun->machine->need_to_push [i])
>{

s/flag_stack_usage/flag_stack_usage_info/

-- 
Eric Botcazou

Re: [wwwdocs] Buildstat update for 4.4

2012-04-04 Thread Gerald Pfeifer

On Tue, 3 Apr 2012, Tom G. Christensen wrote:
> Testresults for 4.4.7:
>   powerpc-apple-darwin8.11.0

Thanks, Tom!

Gerald

Re: [PATCH] Fix PRs c/52283/37985

2012-04-04 Thread Manuel López-Ibáñez

On 4 April 2012 13:05, Christian Bruel  wrote:
>
>
> The testcase was part of the attached patch, along with the ChangeLog
> entries

You are right! Sorry, I may have been looking at the wrong place.

> It was bootstrapped and regtested for C and C++ on x86 (that was in bugzilla
> comment #22), sorry I should have mentioned it here too. done now :)

Thanks for taking care of this.

Cheers,

Manuel.

[libjava] Restore HAVE_INET6 tests (PR libgcj/52645)

2012-04-04 Thread Rainer Orth

It turns out I've been over-eager removing Tru64 UNIX support from
libjava, breaking at least the HP-UX 11.00 build.  The following patch
fixes this, tested by Dave Anglin on hppa2.0w-hp-hpux11.00 and
bootstrapped on i386-pc-solaris2.11.
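
The pattern being restored can be sketched in isolation like this. The enum
and function names are hypothetical, and the real setOption code throws a
SocketException where this sketch returns an "invalid" value.

```c
#include <assert.h>

/* Hypothetical stand-ins for the IPPROTO_IP / IPPROTO_IPV6 levels.  */
enum sketch_level
{
  SKETCH_LEVEL_INVALID,
  SKETCH_LEVEL_IP,
  SKETCH_LEVEL_IPV6
};

/* Picks the option level from the address length.  The 16-byte branch
   is compiled only when HAVE_INET6 is defined, so targets without IPv6
   support never reference IPv6-only declarations.  */
static enum sketch_level
sketch_level_for_len (int len)
{
  assert (len >= 0);
  if (len == 4)
    return SKETCH_LEVEL_IP;
#ifdef HAVE_INET6
  else if (len == 16)
    return SKETCH_LEVEL_IPV6;
#endif
  else
    return SKETCH_LEVEL_INVALID;
}
```

Without the guard, a target lacking IPv6 declarations fails to compile the
16-byte branch, which appears to be the kind of breakage reported for HP-UX
11.00.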

Ok for mainline?

Thanks.
Rainer


2012-03-21  Rainer Orth  

PR libgcj/52645
* gnu/java/net/natPlainDatagramSocketImplPosix.cc (setOption):
Restore HAVE_INET6 check.
* gnu/java/net/natPlainDatagramSocketImplWin32.cc (setOption):
Likewise.

# HG changeset patch
# Parent e817b51d075737a1652e0b5630c8823a4b074cec
Restore HAVE_INET6 tests (PR libgcj/52645)

diff --git a/libjava/gnu/java/net/natPlainDatagramSocketImplPosix.cc b/libjava/gnu/java/net/natPlainDatagramSocketImplPosix.cc
--- a/libjava/gnu/java/net/natPlainDatagramSocketImplPosix.cc
+++ b/libjava/gnu/java/net/natPlainDatagramSocketImplPosix.cc
@@ -655,6 +655,7 @@ gnu::java::net::PlainDatagramSocketImpl:
 	len = sizeof (struct in_addr);
 	ptr = (const char *) &u.addr;
 	  }
+#ifdef HAVE_INET6
 	else if (len == 16)
 	  {
 	level = IPPROTO_IPV6;
@@ -663,6 +664,7 @@ gnu::java::net::PlainDatagramSocketImpl:
 	len = sizeof (struct in6_addr);
 	ptr = (const char *) &u.addr6;
 	  }
+#endif
 	else
 	  throw
 	new ::java::net::SocketException (JvNewStringUTF ("invalid length"));
diff --git a/libjava/gnu/java/net/natPlainDatagramSocketImplWin32.cc b/libjava/gnu/java/net/natPlainDatagramSocketImplWin32.cc
--- a/libjava/gnu/java/net/natPlainDatagramSocketImplWin32.cc
+++ b/libjava/gnu/java/net/natPlainDatagramSocketImplWin32.cc
@@ -540,6 +540,7 @@ gnu::java::net::PlainDatagramSocketImpl:
   len = sizeof (struct in_addr);
   ptr = (const char *) &u.addr;
 }
+#ifdef HAVE_INET6
   else if (len == 16)
 {
   level = IPPROTO_IPV6;
@@ -548,6 +549,7 @@ gnu::java::net::PlainDatagramSocketImpl:
   len = sizeof (struct in6_addr);
   ptr = (const char *) &u.addr6;
 }
+#endif
   else
 throw
   new ::java::net::SocketException (JvNewStringUTF ("invalid length"));
@@ -635,14 +637,14 @@ gnu::java::net::PlainDatagramSocketImpl:
 goto error;
   if (u.address.sin_family == AF_INET)
 {
-laddr = JvNewByteArray (4);
-memcpy (elements (laddr), &u.address.sin_addr, 4);
+	  laddr = JvNewByteArray (4);
+	  memcpy (elements (laddr), &u.address.sin_addr, 4);
 }
 #ifdef HAVE_INET6
-else if (u.address.sin_family == AF_INET6)
+  else if (u.address.sin_family == AF_INET6)
 {
-laddr = JvNewByteArray (16);
-memcpy (elements (laddr), &u.address6.sin6_addr, 16);
+	  laddr = JvNewByteArray (16);
+	  memcpy (elements (laddr), &u.address6.sin6_addr, 16);
 }
 #endif
   else

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH] Fix PR18589

2012-04-04 Thread Richard Guenther

On Tue, Apr 3, 2012 at 10:25 PM, William J. Schmidt
 wrote:
>
>
> On Wed, 2012-03-28 at 15:57 +0200, Richard Guenther wrote:
>> On Tue, Mar 6, 2012 at 9:49 PM, William J. Schmidt
>>  wrote:
>> > Hi,
>> >
>> > This is a re-post of the patch I posted for comments in January to
>> > address http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589.  The patch
>> > modifies reassociation to expose repeated factors from __builtin_pow*
>> > calls, optimally reassociate repeated factors, and possibly reconstitute
>> > __builtin_powi calls from the results of reassociation.
>> >
>> > Bootstrapped and passes regression tests for powerpc64-linux-gnu.  I
>> > expect there may need to be some small changes, but I am targeting this
>> > for trunk approval.
>> >
>> > Thanks very much for the review,
>>
>> Hmm.  How much work would it be to extend the reassoc 'IL' to allow
>> a repeat factor per op?  I realize what you do is all within what reassoc
>> already does though ideally we would not require any GIMPLE IL changes
>> for building up / optimizing the reassoc IL but only do so when we commit
>> changes.
>>
>> Thanks,
>> Richard.
>
> Hi Richard,
>
> I've revised my patch along these lines; see the new version below.
> While testing it I realized I could do a better job of reducing the
> number of multiplies, so there are some changes to that logic as well,
> and a couple of additional test cases.  Regstrapped successfully on
> powerpc64-linux.
>
> Hope this looks better!

Yes indeed.  A few observations though.  You didn't integrate
attempt_builtin_powi with optimize_ops_list - presumably because its result
does not really fit the single-operation assumption?  But note that
undistribute_ops_list and optimize_range_tests have the same issue.  Thus,
I'd have preferred if attempt_builtin_powi worked in the same way: remove
the parts of the ops list it consumed and stick an operand for its result
there instead.  That should simplify things (not having that special
powi_result) and allow for multiple "powi results" in a single op list?
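
To make the "repeat factor per op" idea concrete, here is a hedged sketch; it
is not the patch's implementation and all names are hypothetical. A product
such as x*x*y*x*y is recorded as the operand list {x: 3, y: 2}, and each entry
is evaluated with the square-and-multiply scheme that a powi-style expansion
typically uses.

```c
#include <assert.h>
#include <stddef.h>

/* One operand with its repeat count, loosely modelled on the count
   field the patch adds to operand_entry.  */
struct sketch_factor
{
  double value;
  unsigned count;
};

/* x**n by repeated squaring: O(log n) multiplies instead of n - 1.  */
static double
sketch_powi (double x, unsigned n)
{
  double result = 1.0;
  while (n != 0)
    {
      if (n & 1)
        result *= x;
      x *= x;
      n >>= 1;
    }
  return result;
}

/* Multiplies together all factors, each raised to its repeat count.  */
static double
sketch_eval_product (const struct sketch_factor *ops, size_t n_ops)
{
  double product = 1.0;
  size_t i;
  assert (ops != NULL || n_ops == 0);
  for (i = 0; i < n_ops; i++)
    product *= sketch_powi (ops[i].value, ops[i].count);
  return product;
}
```

Keeping the counts in the operand list is what would let several powi-style
results coexist in one list, as suggested above.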

Thanks,
Richard.

> Thanks,
> Bill
>
>
> gcc:
>
> 2012-04-03  Bill Schmidt  
>
>        PR tree-optimization/18589
>        * tree-pass.h: Replace pass_reassoc with pass_early_reassoc and
>        pass_late_reassoc.
>        * passes.c (init_optimization_passes): Change pass_reassoc calls to
>        pass_early_reassoc and pass_late_reassoc.
>        * tree-ssa-reassoc.c (reassociate_stats): Add two fields.
>        (operand_entry): Add count field.
>        (early_reassoc): New static var.
>        (add_repeat_to_ops_vec): New function.
>        (completely_remove_stmt): Likewise.
>        (remove_def_if_absorbed_call): Likewise.
>        (remove_visited_stmt_chain): Remove feeding builtin pow/powi calls.
>        (acceptable_pow_call): New function.
>        (linearize_expr_tree): Look for builtin pow/powi calls and add operand
>        entries with repeat counts when found.
>        (repeat_factor_d): New struct and associated typedefs.
>        (repeat_factor_vec): New static vector variable.
>        (compare_repeat_factors): New function.
>        (get_reassoc_pow_ssa_name): Likewise.
>        (attempt_builtin_powi): Likewise.
>        (reassociate_bb): Attempt to create __builtin_powi calls, and multiply
>        their results by any leftover reassociated factors; remove builtin
>        pow/powi calls that were absorbed by reassociation.
>        (fini_reassoc): Two new calls to statistics_counter_event.
>        (execute_early_reassoc): New function.
>        (execute_late_reassoc): Likewise.
>        (pass_early_reassoc): Replace pass_reassoc, renamed to reassoc1,
>        call execute_early_reassoc.
>        (pass_late_reassoc): New gimple_opt_pass named reassoc2 that calls
>        execute_late_reassoc.
>
> gcc/testsuite:
>
> 2012-04-03  Bill Schmidt  
>
>        PR tree-optimization/18589
>        * gcc.dg/pr46309.c: Change -fdump-tree-reassoc-details to
>        -fdump-tree-reassoc[12]-details.
>        * gcc.dg/tree-ssa/pr18589-1.c: New test.
>        * gcc.dg/tree-ssa/pr18589-2.c: Likewise.
>        * gcc.dg/tree-ssa/pr18589-3.c: Likewise.
>        * gcc.dg/tree-ssa/pr18589-4.c: Likewise.
>        * gcc.dg/tree-ssa/pr18589-5.c: Likewise.
>        * gcc.dg/tree-ssa/pr18589-6.c: Likewise.
>        * gcc.dg/tree-ssa/pr18589-7.c: Likewise.
>        * gcc.dg/tree-ssa/pr18589-8.c: Likewise.
>        * gcc.dg/tree-ssa/pr18589-9.c: Likewise.
>        * gcc.dg/tree-ssa/pr18589-10.c: Likewise.
>
>
> Index: gcc/tree-pass.h
> ===
> --- gcc/tree-pass.h     (revision 186108)
> +++ gcc/tree-pass.h     (working copy)
> @@ -441,7 +441,8 @@ extern struct gimple_opt_pass pass_copy_prop;
>  extern struct gimple_opt_pass pass_vrp;
>  extern struct gimple_opt_pass pass_uncprop;
>  extern struct gimple_opt_pass pass_return_slot;
> -extern struct gimple_opt_pass pass_reassoc;
> +extern struct gimple_opt_pass pass_early_reassoc;
> +extern struc

Re: [PATCH] ARM: Use different linker path for hardfloat ABI

2012-04-04 Thread Dennis Gilmore

On Wed, 4 Apr 2012 09:06:05 + (UTC)
"Joseph S. Myers"  wrote:

> On Wed, 4 Apr 2012, Michael Hope wrote:
> 
> > The tricky one is new GCC with old GLIBC.  GCC may have to do a
> > configure time test and fall back to /lib/ld-linux.so.3 if the hard
> > float loader is missing.
> 
> I don't think that's appropriate for ABI issues.  If a different
> dynamic linker name is specified, GCC should use it unconditionally
> (and require new enough glibc or a glibc installation that was
> appropriately rearranged).
> 
> > > I have no idea whether shlib-versions files naming a file in a
> > > subdirectory will work - but if not, you'd need to send a patch to
> > > libc-alpha to support dynamic linkers in subdirectories, with
> > > appropriate justification for why you are doing something
> > > different from all other architectures.
> > 
> > Understood.  For now this is just a path.  There's more
> > infrastructure work needed if the path includes a directory.
> 
> Formally it's just a path - but an important feature of GNU/Linux and
> the GNU toolchain is consistency between different architectures and
> existing upstream practice is that the dynamic linker is always in
> the same directory as the other associated libraries and that this
> has the form /lib.  In the absence of a compelling reason,
> which I have not seen stated, to do otherwise for a single case, I
> think that existing practice should be followed with the dynamic
> linker being in a directory such as /libhf.

Consistency across architectures is why Fedora does many of the things
the way it does.  It really doesn't make much sense to me to diverge
from that.

> The "more infrastructure work needed" makes clear that you need
> libc-alpha buy-in *before* putting any patches into GCC or ports.
> But maybe if you don't try to put the dynamic linker in a different
> directory from the other libraries, it's easier to support via
> existing mechanisms (setting slibdir differently if
> --enable-multiarch-directories or similar)?
> 
> > Do the MIPS or PowerPC loaders detect the ABI and change the library
> > path based on that?  I couldn't tell from the code.
> 
> No, they don't detect the ABI.  Both ABIs (and, for Power, the e500v1
> and e500v2 variants - compatible with soft-float at the
> function-calling level but with some glibc ABI differences with
> soft-float and with each other) use the same directories.
> 
> > > (e) Existing practice for cases that do use different dynamic
> > > linkers is to use a separate library directory, not just dynamic
> > > linker name, as in lib32 and lib64 for MIPS or libx32 for x32;
> > > it's certainly a lot easier to make two sets of libraries work in
> > > parallel if you have separate library directories like that.
> > 
> > Is this required, or should it be left to the distro to choose?
> > Once the loader is in control then it can account for any distro
> > specific features, which may be the standard /lib and /usr/lib for
> > single ABI distros like Fedora or /usr/lib/$tuple for multiarch
> > distros like Ubuntu and Debian.
> 
> I thought Fedora used the standard upstream /lib64 on x86_64 and so
> would naturally use a standard upstream /libhf where appropriate.

Fedora does use /lib64 on x86_64.  I would personally prefer /libhfp but
wouldn't object to /libhf, though today we have f17 about to go beta
and all of rawhide built using /lib.

Fedora also has software floating point installed into /lib.

> > > So it would seem more appropriate to define a directory libhf for
> > > ARM (meaning you need a binutils patch as well to handle that
> > > directory, I think)


Dennis

Re: [PATCH] Fix PR18589

2012-04-04 Thread William J. Schmidt

On Wed, 2012-04-04 at 13:35 +0200, Richard Guenther wrote:
> On Tue, Apr 3, 2012 at 10:25 PM, William J. Schmidt
>  wrote:
> >
> >
> > On Wed, 2012-03-28 at 15:57 +0200, Richard Guenther wrote:
> >> On Tue, Mar 6, 2012 at 9:49 PM, William J. Schmidt
> >>  wrote:
> >> > Hi,
> >> >
> >> > This is a re-post of the patch I posted for comments in January to
> >> > address http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589.  The patch
> >> > modifies reassociation to expose repeated factors from __builtin_pow*
> >> > calls, optimally reassociate repeated factors, and possibly reconstitute
> >> > __builtin_powi calls from the results of reassociation.
> >> >
> >> > Bootstrapped and passes regression tests for powerpc64-linux-gnu.  I
> >> > expect there may need to be some small changes, but I am targeting this
> >> > for trunk approval.
> >> >
> >> > Thanks very much for the review,
> >>
> >> Hmm.  How much work would it be to extend the reassoc 'IL' to allow
> >> a repeat factor per op?  I realize what you do is all within what reassoc
> >> already does though ideally we would not require any GIMPLE IL changes
> >> for building up / optimizing the reassoc IL but only do so when we commit
> >> changes.
> >>
> >> Thanks,
> >> Richard.
> >
> > Hi Richard,
> >
> > I've revised my patch along these lines; see the new version below.
> > While testing it I realized I could do a better job of reducing the
> > number of multiplies, so there are some changes to that logic as well,
> > and a couple of additional test cases.  Regstrapped successfully on
> > powerpc64-linux.
> >
> > Hope this looks better!
> 
> Yes indeed.  A few observations though.  You didn't integrate
> attempt_builtin_powi with optimize_ops_list - presumably because its
> result does not really fit the single-operation assumption?  But note
> that undistribute_ops_list and optimize_range_tests have the same issue.
> Thus, I'd have preferred it if attempt_builtin_powi worked in the same
> way: remove the parts of the ops list it consumed and stick an operand
> for its result there instead.  That should simplify things (not having
> that special powi_result) and allow for multiple "powi results" in a
> single op list?

Multiple powi results are already handled, but yes, what you're
suggesting would simplify things by eliminating the need to create
explicit multiplies to join them and the cached-multiply results
together.  Sounds reasonable on the surface; it just hadn't occurred to
me to do it this way.  I'll have a look.
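
To make the "repeated factors" idea concrete, here is a minimal sketch in
plain C.  It is not the code from the patch: it only counts how often each
operand occurs in a flat multiply chain and then evaluates each distinct
operand by repeated squaring.  The names struct factor, collect_repeats,
powi_sketch and eval_product are made up for illustration, and the patch's
own logic for sharing multiplies between factors is not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an operand entry carrying a repeat count.  */
struct factor
{
  double value;
  unsigned count;
};

/* Raise X to the non-negative integer power N by repeated squaring.  */
double
powi_sketch (double x, unsigned n)
{
  double result = 1.0;

  while (n != 0)
    {
      if (n & 1)
        result *= x;
      x *= x;
      n >>= 1;
    }
  return result;
}

/* Fold the flat multiply chain OPS[0..LEN) into FACTORS, which must have
   room for LEN entries.  Return the number of distinct operands found.  */
size_t
collect_repeats (const double *ops, size_t len, struct factor *factors)
{
  size_t n = 0;

  assert (ops != NULL && factors != NULL);
  for (size_t i = 0; i < len; i++)
    {
      size_t j;

      /* Look for an existing entry with the same operand.  */
      for (j = 0; j < n; j++)
        if (factors[j].value == ops[i])
          break;

      /* Not seen before: open a new entry at index J == N.  */
      if (j == n)
        {
          factors[n].value = ops[i];
          factors[n].count = 0;
          n++;
        }
      factors[j].count++;
    }
  return n;
}

/* Multiply together VALUE**COUNT for each of the N collected factors.  */
double
eval_product (const struct factor *factors, size_t n)
{
  double result = 1.0;

  for (size_t i = 0; i < n; i++)
    result *= powi_sketch (factors[i].value, factors[i].count);
  return result;
}
```

For the chain 2 * 3 * 2 * 2 * 3 this collects two entries, (2, count 3) and
(3, count 2), so the product is formed from two powers instead of four
separate multiplies of the original operands.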

Any other major concerns while I'm reworking this?

Thanks,
Bill
> 
> Thanks,
> Richard.
>

Re: [PATCH] Fix PR18589

2012-04-04 Thread Richard Guenther

On Wed, Apr 4, 2012 at 2:35 PM, William J. Schmidt
 wrote:
> On Wed, 2012-04-04 at 13:35 +0200, Richard Guenther wrote:
>> On Tue, Apr 3, 2012 at 10:25 PM, William J. Schmidt
>>  wrote:
>> >
>> >
>> > On Wed, 2012-03-28 at 15:57 +0200, Richard Guenther wrote:
>> >> On Tue, Mar 6, 2012 at 9:49 PM, William J. Schmidt
>> >>  wrote:
>> >> > Hi,
>> >> >
>> >> > This is a re-post of the patch I posted for comments in January to
>> >> > address http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589.  The patch
>> >> > modifies reassociation to expose repeated factors from __builtin_pow*
>> >> > calls, optimally reassociate repeated factors, and possibly reconstitute
>> >> > __builtin_powi calls from the results of reassociation.
>> >> >
>> >> > Bootstrapped and passes regression tests for powerpc64-linux-gnu.  I
>> >> > expect there may need to be some small changes, but I am targeting this
>> >> > for trunk approval.
>> >> >
>> >> > Thanks very much for the review,
>> >>
>> >> Hmm.  How much work would it be to extend the reassoc 'IL' to allow
>> >> a repeat factor per op?  I realize what you do is all within what reassoc
>> >> already does though ideally we would not require any GIMPLE IL changes
>> >> for building up / optimizing the reassoc IL but only do so when we commit
>> >> changes.
>> >>
>> >> Thanks,
>> >> Richard.
>> >
>> > Hi Richard,
>> >
>> > I've revised my patch along these lines; see the new version below.
>> > While testing it I realized I could do a better job of reducing the
>> > number of multiplies, so there are some changes to that logic as well,
>> > and a couple of additional test cases.  Regstrapped successfully on
>> > powerpc64-linux.
>> >
>> > Hope this looks better!
>>
>> Yes indeed.  A few observations though.  You didn't integrate
>> attempt_builtin_powi with optimize_ops_list - presumably because its
>> result does not really fit the single-operation assumption?  But note
>> that undistribute_ops_list and optimize_range_tests have the same issue.
>> Thus, I'd have preferred it if attempt_builtin_powi worked in the same
>> way: remove the parts of the ops list it consumed and stick an operand
>> for its result there instead.  That should simplify things (not having
>> that special powi_result) and allow for multiple "powi results" in a
>> single op list?
>
> Multiple powi results are already handled, but yes, what you're
> suggesting would simplify things by eliminating the need to create
> explicit multiplies to join them and the cached-multiply results
> together.  Sounds reasonable on the surface; it just hadn't occurred to
> me to do it this way.  I'll have a look.
>
> Any other major concerns while I'm reworking this?

No, the rest looks fine (you should not need to replace
-fdump-tree-reassoc-details with -fdump-tree-reassoc1-details
-fdump-tree-reassoc2-details in the first testcase).

Thanks,
Richard.

> Thanks,
> Bill
>>
>> Thanks,
>> Richard.
>>
>
>

Re: [PATCH] Dissociate store_expr's temp from exp so that it is not marked as addressable

2012-04-04 Thread Martin Jambor

Hi,

On Tue, Apr 03, 2012 at 11:02:11AM +0200, Eric Botcazou wrote:
> > Yeah, that sounds reasonable.
> 
> There is a further subtlety in the second temp allocation when the expression 
> doesn't use the alias set of its type.  In that case, we cannot pass the type 
> to set_mem_attributes.  In fact, since assign_stack_temp_for_type already 
> calls the appropriate set_mem_* routines, the best thing to do might be to 
> remove the call to set_mem_attributes altogether in that case.
> 

So, something like this?  Bootstrapped and tested on x86_64-linux and
ia64-linux.  I'm currently having problems bootstrapping sparc64, which
is what I mainly need this for, but those are unrelated and this should
help.

Thanks,

Martin



2012-04-03  Martin Jambor  

* expr.c (expand_expr_real_1): Pass type, not the expression, to
set_mem_attributes for a memory temporary.  Do not call the
function for temporaries with a different alias set.

Index: src/gcc/expr.c
===
--- src.orig/gcc/expr.c
+++ src/gcc/expr.c
@@ -9572,6 +9572,7 @@ expand_expr_real_1 (tree exp, rtx target
tree tem = get_inner_reference (exp, &bitsize, &bitpos, &offset,
&mode1, &unsignedp, &volatilep, true);
rtx orig_op0, memloc;
+   bool mem_attrs_from_type = false;
 
/* If we got back the original object, something is wrong.  Perhaps
   we are evaluating an expression too early.  In any event, don't
@@ -9677,6 +9678,7 @@ expand_expr_real_1 (tree exp, rtx target
memloc = assign_temp (nt, 1, 1, 1);
emit_move_insn (memloc, op0);
op0 = memloc;
+   mem_attrs_from_type = true;
  }
 
if (offset)
@@ -9849,7 +9851,6 @@ expand_expr_real_1 (tree exp, rtx target
emit_move_insn (new_rtx, op0);
op0 = copy_rtx (new_rtx);
PUT_MODE (op0, BLKmode);
-   set_mem_attributes (op0, exp, 1);
  }
 
return op0;
@@ -9870,7 +9871,14 @@ expand_expr_real_1 (tree exp, rtx target
if (op0 == orig_op0)
  op0 = copy_rtx (op0);
 
-   set_mem_attributes (op0, exp, 0);
+   /* If op0 is a temporary because of forcing to memory, pass only the
+  type to set_mem_attributes so that the original expression is never
+  marked as ADDRESSABLE through MEM_EXPR of the temporary.  */
+   if (mem_attrs_from_type)
+ set_mem_attributes (op0, TREE_TYPE (exp), 0);
+   else
+ set_mem_attributes (op0, exp, 0);
+
if (REG_P (XEXP (op0, 0)))
  mark_reg_pointer (XEXP (op0, 0), MEM_ALIGN (op0));

Re: [C11-atomic] [patch] gimple atomic statements

2012-04-04 Thread Richard Henderson

On 04/04/2012 04:45 AM, Richard Guenther wrote:
> I suppose you do not want to use builtins because for primitive types you
> end up with multiple statements for something "atomic"?

The primary motivation is that builtins cannot return two values.

Our current builtin returns one of the two values by reference, as
one would do from plain old C.  Even though we drop the reference
when we convert from gimple to rtl, this is not good enough to clean
up the variable we forced to a stack slot.

I suggested a specialized GIMPLE_ATOMIC opcode instead of doing a
totally generalized GIMPLE_ASSIGN_N, returning N values.

r~

Re: Ping [IA-64] Work around thinko in 'x' constraint implementation

2012-04-04 Thread Richard Henderson

Ok.


r~


On 04/04/2012 03:34 AM, Tristan Gingold wrote:
> 
> I'd like to ping this patch as it fixed an ICE visible on both ia64 linux and 
> ia64 openvms.
> 
> Tristan.
> 
> On Mar 6, 2012, at 11:07 PM, Eric Botcazou wrote:
> 
>> We have a regression on one of the testcases of our internal testsuite on
>> IA-64 with a 4.7-based compiler, which is of the form:
>>
>> test_vec_madd.adb: In function 'Test_Vec_Madd':
>> test_vec_madd.adb:160:5: error: could not split insn
>> (insn 887 4859 889 16 (set (reg:TI 158 f30 [orig:417 m ] [417])
>>(mem/c:TI (reg/f:DI 14 r14 [1025]) [0 S16 A128])) 
>> /gnu/lib/gcc/ia64-hp-openvms/4_7_0/adainclude/g-altcon.adb:277 125 {movti_internal}
>> (nil))
>> +===GNAT BUG DETECTED==+
>> | Pro 7.1.0w (20120221-head) (ia64-hp-openvms) GCC error:  |
>> | in final_scan_insn, at final.c:2716  |
>> | Error detected around test_vec_madd.adb:160:5|
>>
>> The compiler aborts during the final pass because it couldn't split the insn.
>> The pattern for movti_internal is:
>>
>> (define_insn_and_split "movti_internal"
>>  [(set (match_operand:TI 0 "destination_operand" "=r,   *fm,*x,*f,  Q")
>>  (match_operand:TI 1 "general_operand" "r*fim,r,  Q, *fOQ,*f"))]
>>  "ia64_move_ok (operands[0], operands[1])"
>>  "@
>>   #
>>   #
>>   ldfp8 %X0 = %1%P1
>>   #
>>   #"
>>  "reload_completed && !ia64_load_pair_ok(operands[0], operands[1])"
>>  [(const_int 0)]
>>
>> The problem is that the operands satisfy ia64_load_pair_ok so the splitter
>> cannot be invoked on them.  The root cause is a discrepancy between this
>> predicate and how the 'x' constraint is interpreted.  The predicate uses
>> FP_REGNO_P to check the destination and this returns true for %f30 (but would
>> return false for the immediately following register %f31).  But recog
>> interprets the 'x' constraint as meaning that every hard register in the
>> destination must be in the FP_REGS class; now the mode is TImode so both
>> %f30 and %f31 are taken into account and %f31 isn't in the FP_REGS class,
>> so the operand is rejected.
>>
>> AFAICS the problem dates back to the introduction of the code (r102463), so
>> I'm not sure that we want to rewrite it at this point.  That's why the
>> attached patch is a simple workaround that just avoids ICEing.
>>
>> Bootstrapped/regtested on IA-64/Linux, OK for the mainline?  Do we also want
>> it for 4.7.1 (I assume that some RA change makes the issue visible in 4.7.x)?
>>
>>
>> 2012-03-06  Eric Botcazou  
>>
>>  * config/ia64/ia64.c (ia64_load_pair_ok): Return 0 if the second member
>>  of the destination isn't also a FP_REGS register.
>>
>>
>> -- 
>> Eric Botcazou
>> 
>
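
The mismatch described in the quoted report can be modelled in a few lines
of plain C.  This is only a sketch under loud assumptions: the register
numbering and the class boundary (HYP_LAST_FP) are invented, and
hyp_in_fp_class, first_reg_ok and all_regs_ok are hypothetical helpers, not
functions from ia64.c.  It shows why a check on the first register alone
can accept a multi-register value that a check on every spanned register
rejects.

```c
#include <assert.h>
#include <stdbool.h>

/* Invented numbering: registers 0..30 are in the class, 31 is not.  */
enum { HYP_FIRST_FP = 0, HYP_LAST_FP = 30 };

/* Is the single register REGNO in the hypothetical class?  */
bool
hyp_in_fp_class (int regno)
{
  return regno >= HYP_FIRST_FP && regno <= HYP_LAST_FP;
}

/* Predicate-style check: only the first register is examined.  */
bool
first_reg_ok (int regno)
{
  return hyp_in_fp_class (regno);
}

/* Constraint-style check: every one of the NREGS registers that the
   value occupies, starting at REGNO, must be in the class.  */
bool
all_regs_ok (int regno, int nregs)
{
  assert (nregs > 0);
  for (int i = 0; i < nregs; i++)
    if (!hyp_in_fp_class (regno + i))
      return false;
  return true;
}
```

With a two-register value starting at register 30, first_reg_ok accepts it
while all_regs_ok rejects it, which is the kind of disagreement that leaves
no alternative matching.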

Re: [C11-atomic] [patch] gimple atomic statements

2012-04-04 Thread Richard Guenther

On Wed, Apr 4, 2012 at 3:26 PM, Richard Henderson  wrote:
> On 04/04/2012 04:45 AM, Richard Guenther wrote:
>> I suppose you do not want to use builtins because for primitive types you
>> end up with multiple statements for something "atomic"?
>
> The primary motivation is that builtins cannot return two values.
>
> Our current builtin returns one of the two values by reference, as
> one would do from plain old C.  Even though we drop the reference
> when we convert from gimple to rtl, this is not good enough to clean
> up the variable we forced to a stack slot.

If that is the only reason, you can return two values by using a complex
or vector type (that would be only an IL implementation detail as far
as I can see).  We use that trick to get sincos () "sane" in our IL as
well.

Are there other reasons to go with a new GIMPLE code?

> I suggested a specialized GIMPLE_ATOMIC opcode instead of doing a
> totally generalized GIMPLE_ASSIGN_N, returning N values.

We already support multiple SSA defs btw, there is just no operand slot for
it that is properly named or handled by the operand scanner.  Thus, a
new GIMPLE_ASSIGN sub-code class would do, too (of course nobody
expects multiple DEFs here thus it would not be a very good idea to do
that, IMHO).

Richard.

>
> r~

Re: RFA: RL78: Add -fstack-usage support

2012-04-04 Thread nick clifton


Hi Eric,

On 04/04/12 12:24, Eric Botcazou wrote:

> You probably need to adjust gcc.dg/stack-usage-1.c too.
> s/flag_stack_usage/flag_stack_usage_info/


Thanks for the corrections.  Revised patch attached.

OK for mainline/4.7 branch ?

Cheers
  Nick

gcc/ChangeLog
2012-04-04  Nick Clifton  

* config/rl78/rl78.c (rl78_expand_prologue): Set stack use
information, if requested.

gcc/testsuite/ChangeLog
2012-04-04  Nick Clifton  

* gcc.dg/stack-usage-1.c (SIZE): Define for the RL78.
Index: gcc/config/rl78/rl78.c
===
--- gcc/config/rl78/rl78.c	(revision 186130)
+++ gcc/config/rl78/rl78.c	(working copy)
@@ -827,6 +827,9 @@
   if (!cfun->machine->computed)
 rl78_compute_frame_info ();
 
+  if (flag_stack_usage_info)
+current_function_static_stack_size = cfun->machine->framesize;
+
   for (i = 0; i < 16; i++)
 if (cfun->machine->need_to_push [i])
   {
Index: gcc/testsuite/gcc.dg/stack-usage-1.c
===
--- gcc/testsuite/gcc.dg/stack-usage-1.c	(revision 186130)
+++ gcc/testsuite/gcc.dg/stack-usage-1.c	(working copy)
@@ -58,6 +58,8 @@
 #  define SIZE 224
 #elif defined (__epiphany__)
 #  define SIZE (256 - __EPIPHANY_STACK_OFFSET__)
+#elif defined (__RL78__)
+#  define SIZE 254
 #else
 #  define SIZE 256
 #endif

[PATCH] Make gsi_remove return whether EH cleanup is required

2012-04-04 Thread Richard Guenther


Several passes needlessly clean up EH after gsi_remove because they do
not know whether the stmt was removed from EH regions.  The following
patch returns this information from gsi_remove and adjusts all users
I could find appropriately.
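
The caller-side pattern this enables can be sketched with a toy model in
plain C.  Nothing below is GCC code: struct hyp_stmt, hyp_remove and
hyp_remove_all are hypothetical names, and the "cleanup" is reduced to a
return value.  The point is only that the removal routine reports whether
cleanup is needed, so the caller can skip it when nothing was in an EH
region and run it at most once otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy statement: just enough state to model EH region membership.  */
struct hyp_stmt
{
  int id;
  bool in_eh_region;
  bool removed;
};

/* Remove S and report whether it had been part of an EH region, i.e.
   whether the caller needs to schedule EH cleanup.  */
bool
hyp_remove (struct hyp_stmt *s)
{
  assert (s != NULL);
  bool was_in_eh = s->in_eh_region;

  s->in_eh_region = false;
  s->removed = true;
  return was_in_eh;
}

/* Remove all N statements.  Return how many times the (expensive)
   cleanup step would run: 0 if no removal required it, otherwise 1.  */
int
hyp_remove_all (struct hyp_stmt *stmts, size_t n)
{
  bool need_cleanup = false;

  for (size_t i = 0; i < n; i++)
    need_cleanup |= hyp_remove (&stmts[i]);

  /* The cleanup itself is not modelled; only the decision is.  */
  return need_cleanup ? 1 : 0;
}
```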

Bootstrapped and tested on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2012-04-04  Richard Guenther  

* gimple-iterator.c (gsi_remove): Return whether EH edges need to be
cleaned up.
* gimple.h (gsi_remove): Adjust.
* tree-ssa-operands.c (unlink_stmt_vdef): Optimize.
* tree-ssa-dom.c (optimize_stmt): Use gsi_remove result.
* tree-ssa-dse.c (dse_optimize_stmt): Likewise.
* tree-ssa-forwprop.c (remove_prop_source_from_use): Likewise.
* tree-ssa-math-opts.c (execute_optimize_widening_mul): Likewise.
* tree-ssa-pre.c (eliminate): Likewise.

Index: gcc/gimple.h
===
*** gcc/gimple.h.orig   2012-04-04 14:57:38.0 +0200
--- gcc/gimple.h2012-04-04 14:58:20.633570347 +0200
*** void gsi_insert_seq_after (gimple_stmt_i
*** 5095,5101 
   enum gsi_iterator_update);
  void gsi_insert_seq_after_without_update (gimple_stmt_iterator *, gimple_seq,
enum gsi_iterator_update);
! void gsi_remove (gimple_stmt_iterator *, bool);
  gimple_stmt_iterator gsi_for_stmt (gimple);
  void gsi_move_after (gimple_stmt_iterator *, gimple_stmt_iterator *);
  void gsi_move_before (gimple_stmt_iterator *, gimple_stmt_iterator *);
--- 5095,5101 
   enum gsi_iterator_update);
  void gsi_insert_seq_after_without_update (gimple_stmt_iterator *, gimple_seq,
enum gsi_iterator_update);
! bool gsi_remove (gimple_stmt_iterator *, bool);
  gimple_stmt_iterator gsi_for_stmt (gimple);
  void gsi_move_after (gimple_stmt_iterator *, gimple_stmt_iterator *);
  void gsi_move_before (gimple_stmt_iterator *, gimple_stmt_iterator *);
Index: gcc/gimple-iterator.c
===
*** gcc/gimple-iterator.c.orig  2012-04-04 14:57:38.0 +0200
--- gcc/gimple-iterator.c   2012-04-04 14:58:56.661952844 +0200
*** gsi_insert_after (gimple_stmt_iterator *
*** 499,511 
 REMOVE_PERMANENTLY is true when the statement is going to be removed
 from the IL and not reinserted elsewhere.  In that case we remove the
 statement pointed to by iterator I from the EH tables, and free its
!operand caches.  Otherwise we do not modify this information.  */
  
! void
  gsi_remove (gimple_stmt_iterator *i, bool remove_permanently)
  {
gimple_seq_node cur, next, prev;
gimple stmt = gsi_stmt (*i);
  
if (gimple_code (stmt) != GIMPLE_PHI)
  insert_debug_temps_for_defs (i);
--- 499,513 
 REMOVE_PERMANENTLY is true when the statement is going to be removed
 from the IL and not reinserted elsewhere.  In that case we remove the
 statement pointed to by iterator I from the EH tables, and free its
!operand caches.  Otherwise we do not modify this information.  Returns
!true whether EH edge cleanup is required.  */
  
! bool
  gsi_remove (gimple_stmt_iterator *i, bool remove_permanently)
  {
gimple_seq_node cur, next, prev;
gimple stmt = gsi_stmt (*i);
+   bool require_eh_edge_purge = false;
  
if (gimple_code (stmt) != GIMPLE_PHI)
  insert_debug_temps_for_defs (i);
*** gsi_remove (gimple_stmt_iterator *i, boo
*** 517,523 
  
if (remove_permanently)
  {
!   remove_stmt_from_eh_lp (stmt);
gimple_remove_stmt_histograms (cfun, stmt);
  }
  
--- 519,525 
  
if (remove_permanently)
  {
!   require_eh_edge_purge = remove_stmt_from_eh_lp (stmt);
gimple_remove_stmt_histograms (cfun, stmt);
  }
  
*** gsi_remove (gimple_stmt_iterator *i, boo
*** 537,542 
--- 539,546 
  gimple_seq_set_last (i->seq, prev);
  
i->ptr = next;
+ 
+   return require_eh_edge_purge;
  }
  
  
Index: gcc/tree-ssa-operands.c
===
*** gcc/tree-ssa-operands.c.orig2012-04-04 14:57:38.0 +0200
--- gcc/tree-ssa-operands.c 2012-04-04 14:58:20.634570358 +0200
*** unlink_stmt_vdef (gimple stmt)
*** 1475,1492 
imm_use_iterator iter;
gimple use_stmt;
tree vdef = gimple_vdef (stmt);
  
if (!vdef
|| TREE_CODE (vdef) != SSA_NAME)
  return;
  
!   FOR_EACH_IMM_USE_STMT (use_stmt, iter, gimple_vdef (stmt))
  {
FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
!   SET_USE (use_p, gimple_vuse (stmt));
  }
  
!   if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_vdef (stmt)))
! SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_vuse (stmt)) = 1;
  }
  
--- 1475,1493 
imm_use_iterator iter;
gimple use_stmt;
tree vdef = gimple_vdef

Re: [C11-atomic] [patch] gimple atomic statements

2012-04-04 Thread Andrew MacLeod


On 04/04/2012 09:28 AM, Andrew MacLeod wrote:



I wasn't excited about creating a new gimple statement, but it seemed
the best solution to my issues. In the end, I think this works very
cleanly.  I'm certainly open to better solutions. If there is a plan to
change gimple in some way that this doesn't work with, then it would be
good to know what that plan is.


btw, I did start my prototyping of this by creating atomic tree codes
for each of the atomic builtins rather than a gimple atomic, but found
that did not integrate very well (I forget exactly what the issue was
now... something to do with when I was trying to translate them from
builtins to treecodes), so I evolved to a gimple statement which gave me
more control over things.


If gimple is going to change somehow that will make this work better,
I'm also fine with doing that.  I still have some of that code lying
around.  Or I can go back and revisit it to remember exactly what the
issue was.


Andrew

Re: [Libiberty]: Handle VMS as a LLP64 platform in splay-tree.h

2012-04-04 Thread Ian Lance Taylor

Tristan Gingold  writes:

> include/
> 2012-04-04  Tristan Gingold  
>
>   * splay-tree.h: Use LLP64 definitions of libi_shostptr_t and
>   libi_hostptr_t for VMS with 64bit pointers.

I was strongly opposed to adding a _WIN64 define here and this is just
making it worse.

Ian


> --- a/include/splay-tree.h
> +++ b/include/splay-tree.h
> @@ -37,7 +37,8 @@ extern "C" {
>  
>  #include "ansidecl.h"
>  
> -#ifndef _WIN64
> +#if !(defined (_WIN64) \
> +  || (defined (__VMS__) && __INITIAL_POINTER_SIZE == 64))
>typedef unsigned long int libi_uhostptr_t;
>typedef long int libi_shostptr_t;
>  #else
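
For context on why the integer type matters here: on an LLP64 platform
`unsigned long` is narrower than a pointer, so a pointer stored in it loses
bits.  The sketch below is not what the patch or libiberty does; it shows a
size check plus the C99 `uintptr_t` type as one commonly used alternative
to per-platform conditionals.  The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when unsigned long is wide enough to hold a pointer (as on LP64
   and ILP32); false on LLP64, where long stays 32 bits wide.  */
bool
ulong_holds_pointer (void)
{
  return sizeof (unsigned long) >= sizeof (void *);
}

/* Round-trip a pointer through uintptr_t, an integer type intended to
   hold a converted object pointer where the implementation provides it.  */
void *
roundtrip_uintptr (void *p)
{
  uintptr_t bits = (uintptr_t) p;

  assert ((void *) bits == p);
  return (void *) bits;
}
```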

Re: [C11-atomic] [patch] gimple atomic statements

2012-04-04 Thread Andrew MacLeod


I'm not sure what happened to my original reply, so I'll resend it.


On 04/04/2012 09:28 AM, Andrew MacLeod wrote:
On 04/04/2012 04:45 AM, Richard Guenther wrote:


> The fact that you need to touch every place that wants to look at memory
> accesses shows that you are doing it wrong.  Instead my plan was to
> force _all_ memory accesses to GIMPLE_ASSIGNs (yes, including those
> we have now in calls).  You're making a backwards step in my eyes.

I'm not sure I understand what you are saying, or at least I don't know
what this plan you are talking about is...   Are you saying that you are
planning to change gimple so that none of the gimple statement types
other than GIMPLE_ASSIGN ever see an ADDR_EXPR or memory reference?
Seems like that change, when it happens, would simply affect
GIMPLE_ATOMIC like all the other gimple classes.  And if it was done
before I tried to merge the branch, would fall on me to fix.  Right now,
I'm just handling what the compiler sends my way...  A bunch of places
need to understand a new gimple_statement kind...

> What do you think is "easier" when you use a GIMPLE_ATOMIC
> (why do you need a fntype field?!  Should the type not be available
> via the operand types?!)

This is a WIP... that fntype field is there for simplicity, and
no... you can do a 1 byte atomic operation on a full word object if you
want by using __atomic_store_1 ()... so you can't just look at the
object. We might be able to sort that type out eventually if all the
casts are correct, but until everything is finished, this is safer.  I'm
actually hoping eventually to not have a bunch of casts on the params,
they are just there to get around the builtin's type-checking system.
We should be able to just take care of required promotions at expansion
time and do type-checking during verification.




> Your tree-cfg.c parts need to be filled in.  They are the specification of
> GIMPLE_ATOMIC - at the moment you allow any garbage.

Well, of course this isn't supposed to be a final patch; it's to get the
core changes into a branch while I continue working on it.  There are a
number of parts that aren't filled in or fleshed out yet.  Once it's all
working and what is expected is well defined, then I'll fill in the
verification stuff.



> Similar to how I dislike the choice of adding GIMPLE_TRANSACTION
> instead of using builtin functions I dislike this.
>
> I suppose you do not want to use builtins because for primitive types you
> end up with multiple statements for something "atomic"?

Builtins are just more awkward to work with, and don't support more than
1 result.
compare_and_swap was the worst case: it has 2 results and that does not
map to a built-in function very well.  We struggled all last fall with
how to do it efficiently, and eventually gave up.  Given:


  int p = 1;
  bool ret;
  ret = __atomic_compare_exchange_n (&flag2, &p, 0, 0,
                                     __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);

  return ret;

with GCC 4.7 we currently end up generating

  p = 1;
  ret_1 = __atomic_compare_exchange_4 (&flag2, &p, 0, 0, 5, 5);
  return ret_1;

Note this actually requires leaving a local (p) on the stack, and 
reduces the optimizations that can be performed on it, even though there 
isn't really a need.


By going to a gimple statement, we can expose both results properly, and 
this ends up generating


  (ret_3, cmpxchg.2_4) = atomic_compare_exchange_strong <&flag2, 1, 0,
                                                          SEQ_CST, SEQ_CST>

  return ret_3;

and during expansion to RTL, we can trivially see that cmpxchg.2_4 is not
used, and generate the really efficient compare and swap pattern which
only produces a boolean result.  If only cmpxchg.2_4 were used, we can
generate the C&S pattern which only returns the result.  Only if we see
that both are actually used do we have to fall back to the much uglier
pattern we have that produces both results.  Currently we always
generate this pattern.


Next, we have the C11 atomic type qualifier which needs to be 
implemented.  Every reference to this variable is going to have to be 
expanded into one or more atomic operations of some sort.  Yes, I 
probably could do that by emitting built-in functions, but they are a
bit more unwieldy; it's far simpler to just create gimple_statements.


I discovered last fall that when I tried to translate one built-in 
function into another form that dealing with parameters and all the call 
bits was a pain.  Especially when the library call I need to emit had a 
different number of parameters than the built-in did.   A GIMPLE_ATOMIC 
statement makes all of this trivial.


I also hope that when done, I can also remove all the ugly built-in 
overload code that was created for __sync and continues to be used by 
__atomic.  This would clean up where we have to take func_n and turn it 
into a func_1 or func_2 or whatever.  We also had to bend over and issue 
a crap load of different casts early to squeeze all the parameters into 
the 'proper' form for the builtins. This made it more awkward

[RFC] Should SRA stop producing COMPONENT_REF for non-bit-fields (again)?

2012-04-04 Thread Martin Jambor

Hi everyone, especially Richi and Eric,

I'd like to know what your attitude is to changing SRA's
build_ref_for_model to what it once looked like, so that it produces
COMPONENT_REFs only for bit-fields.  The non-bit-field handling was
added in order to work around problems when expanding non-aligned
MEM_REFs on strict-alignment targets but that should not be a problem
now and my experiments confirm that.  Last week I successfully
bootstrapped and tested this patch on sparc64-linux (with the
temporary MEM_EXPR patch, not including Java), ia64-linux (without
Ada), x86_64-linux, i686-linux and tested it on hppa-linux (only C and
C++).

The main downside of this change I see is that dumps would be a bit
more difficult to read and understand when the fields disappear from
them.

The upsides are:

  - the expr field of SRA access was originally intended only for
debugging (meaning both for compiler-produced debug info and
debugging SRA).  It was never intended to influence the memory
accesses produced by SRA and when we create them artificially, the
effects of the particular form are hard to reason about.  If we
ever lower bit-field accesses on gimple level, build_ref_for_model
could go away completely (yeah, I know I'm getting carried away
here).

  - If something like PR 51528 creeps up again and we need to create
replacements of type returned by lang_hooks.types.type_for_mode,
the produced MEM_REFs could simply have this type.  OTOH, the
current COMPONENT_REFs would require to be encapsulated in V_C_Es
and that is quite a nightmare.  I tried it in December, even made
it work, but it was particularly ugly and needed some quite
questionable uses of V_C_Es.

  - Well, it does the same thing and is much simpler, is it not?

The patch fulfills the criteria to be committed and I can do it soon.
OTOH, keeping it so on a number of platforms takes quite a lot of time
(and has uncovered some non-related bugs) so I'd like to know whether
it's worth it.

Thanks,

Martin



2012-03-20 Martin Jambor 

* tree-sra.c (build_ref_for_model): Create COMPONENT_REFs only for
bit-fields.

Index: src/gcc/tree-sra.c
===
*** src.orig/gcc/tree-sra.c
--- src/gcc/tree-sra.c
*** build_ref_for_offset (location_t loc, tr
*** 1489,1558 
return fold_build2_loc (loc, MEM_REF, exp_type, base, off);
  }
  
- DEF_VEC_ALLOC_P_STACK (tree);
- #define VEC_tree_stack_alloc(alloc) VEC_stack_alloc (tree, alloc)
- 
  /* Construct a memory reference to a part of an aggregate BASE at the given
!OFFSET and of the type of MODEL.  In case this is a chain of references
!to component, the function will replicate the chain of COMPONENT_REFs of
!the expression of MODEL to access it.  GSI and INSERT_AFTER have the same
!meaning as in build_ref_for_offset.  */
  
  static tree
  build_ref_for_model (location_t loc, tree base, HOST_WIDE_INT offset,
 struct access *model, gimple_stmt_iterator *gsi,
 bool insert_after)
  {
!   tree type = model->type, t;
!   VEC(tree,stack) *cr_stack = NULL;
! 
!   if (TREE_CODE (model->expr) == COMPONENT_REF)
  {
!   tree expr = model->expr;
! 
!   /* Create a stack of the COMPONENT_REFs so later we can walk them in
!order from inner to outer.  */
!   cr_stack = VEC_alloc (tree, stack, 6);
! 
!   do {
!   tree field = TREE_OPERAND (expr, 1);
!   tree cr_offset = component_ref_field_offset (expr);
!   HOST_WIDE_INT bit_pos
! = tree_low_cst (cr_offset, 1) * BITS_PER_UNIT
! + TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field));
  
!   /* We can be called with a model different from the one associated
!  with BASE so we need to avoid going up the chain too far.  */
!   if (offset - bit_pos < 0)
! break;
! 
!   offset -= bit_pos;
!   VEC_safe_push (tree, stack, cr_stack, expr);
! 
!   expr = TREE_OPERAND (expr, 0);
!   type = TREE_TYPE (expr);
!   } while (TREE_CODE (expr) == COMPONENT_REF);
  }
! 
!   t = build_ref_for_offset (loc, base, offset, type, gsi, insert_after);
! 
!   if (TREE_CODE (model->expr) == COMPONENT_REF)
! {
!   unsigned i;
!   tree expr;
! 
!   /* Now replicate the chain of COMPONENT_REFs from inner to outer.  */
!   FOR_EACH_VEC_ELT_REVERSE (tree, cr_stack, i, expr)
!   {
! tree field = TREE_OPERAND (expr, 1);
! t = fold_build3_loc (loc, COMPONENT_REF, TREE_TYPE (field), t, field,
!  TREE_OPERAND (expr, 2));
!   }
! 
!   VEC_free (tree, stack, cr_stack);
! }
! 
!   return t;
  }
  
  /* Construct a memory reference consisting of component_refs and array_refs to
--- 1489,1520 
return fold_build2_loc (loc, MEM_REF, exp_type, base, off);
  }
  
  /* Construct a memory reference to a part of an aggregate BASE at the given
!

Re: [C11-atomic] [patch] gimple atomic statements

2012-04-04 Thread Andrew MacLeod


On 04/04/2012 04:45 AM, Richard Guenther wrote:


> The fact that you need to touch every place that wants to look at memory
> accesses shows that you are doing it wrong.  Instead my plan was to
> force _all_ memory accesses to GIMPLE_ASSIGNs (yes, including those
> we have now in calls).  You're making a backwards step in my eyes.

I'm not sure I understand what you are saying, or at least I don't know 
what this plan you are talking about is...   Are you saying that you are 
planning to change gimple so that none of the gimple statement types 
other than GIMPLE_ASSIGN ever see an ADDR_EXPR or memory reference? 
Seems like that change, when it happens, would simply affect 
GIMPLE_ATOMIC like all the other gimple classes.  And if it was done 
before I tried to merge the branch, would fall on me to fix.  Right now, 
I'm just handling what the compiler sends my way...  A bunch of places 
need to understand a new gimple_statement kind...

> What do you think is "easier" when you use a GIMPLE_ATOMIC
> (why do you need a fntype field?!  Should the type not be available
> via the operand types?!)


This is a WIP... that fntype field is there for simplicity..   and 
no... you can do a 1 byte atomic operation on a full word object if you 
want by using __atomic_store_1 ()... so you can't just look at the 
object. We might be able to sort that type out eventually if all the 
casts are correct, but until everything is finished, this is safer.  I'm 
actually hoping eventually to not have a bunch of casts on the params, 
they are just there to get around the builtin's type-checking system.. 
we should be able to  just take care of required promotions at expansion 
time and do type-checking during verification.




> Your tree-cfg.c parts need to be filled in.  They are the specification of
> GIMPLE_ATOMIC - at the moment you allow any garbage.


Well, of course this isn't supposed to be a final patch; it's to get the 
core changes into a branch while I continue working on it.  There are a 
number of parts that aren't filled in or fleshed out yet.   Once it's all 
working and what is expected is well defined, then I'll fill in the 
verification stuff.



> Similar to how I dislike the choice of adding GIMPLE_TRANSACTION
> instead of using builtin functions I dislike this.
>
> I suppose you do not want to use builtins because for primitive types you
> end up with multiple statements for something "atomic"?

Builtins are just more awkward to work with, and don't support more than 
1 result.
compare_and_swap was the worst case: it has 2 results, and that does not 
map to a built-in function very well.  We struggled all last fall with 
how to do it efficiently, and eventually gave up.  Given:


  int p = 1;
  bool ret;
  ret = __atomic_compare_exchange_n (&flag2, &p, 0, 0, 
__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);

  return ret;

with GCC 4.7 we currently end up generating

  p = 1;
  ret_1 = __atomic_compare_exchange_4 (&flag2, &p, 0, 0, 5, 5);
  return ret_1;

Note this actually requires leaving a local (p) on the stack, and 
reduces the optimizations that can be performed on it, even though there 
isn't really a need.


By going to a gimple statement, we can expose both results properly, and 
this ends up generating


  (ret_3, cmpxchg.2_4) = atomic_compare_exchange_strong <&flag2, 1, 0, 
SEQ_CST, SEQ_CST>

  return ret_3;

and during expansion to RTL, we can trivially see that cmpxchg.2_4 is not 
used, and generate the really efficient compare and swap pattern which 
only produces a boolean result.   If only cmpxchg.2_4 were used, we could 
generate the C&S pattern which only returns the result.  Only if we see 
both are actually used do we have to fall back to the much uglier 
pattern we have that produces both results.  Currently we always 
generate this pattern.
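
To make the two-result problem concrete, here is a minimal C sketch (not taken from the patch) built on the __atomic_compare_exchange_n builtin shown above.  The names flag2_example, try_clear and try_clear_report are hypothetical.  The first helper uses only the boolean result; the second also reads back the value left in the "expected" slot, which is the second result that forces the local to live in memory when the operation is a builtin call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared flag, standing in for flag2 in the example above.  */
static int flag2_example = 1;

/* Uses only the boolean result of the compare-and-swap.  */
static bool
try_clear (void)
{
  int expected = 1;
  return __atomic_compare_exchange_n (&flag2_example, &expected, 0, 0,
                                      __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* Uses both results: the success flag and the value observed in memory.
   On failure the builtin stores the current contents into `expected`.  */
static bool
try_clear_report (int *seen)
{
  int expected = 1;
  bool ok;

  assert (seen != NULL);
  ok = __atomic_compare_exchange_n (&flag2_example, &expected, 0, 0,
                                    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  *seen = expected;
  return ok;
}
```

Calling try_clear first swaps 1 for 0 and succeeds; a following try_clear_report fails and reports the 0 it found.  Only the second helper needs the "both results" form described above.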


Next, we have the C11 atomic type qualifier which needs to be 
implemented.  Every reference to this variable is going to have to be 
expanded into one or more atomic operations of some sort.  Yes, I 
probably could do that by emitting built-in functions, but they are a 
bit more unwieldy; it's far simpler to just create gimple_statements.


I discovered last fall that when I tried to translate one built-in 
function into another form that dealing with parameters and all the call 
bits was a pain.  Especially when the library call I need to emit had a 
different number of parameters than the built-in did.   A GIMPLE_ATOMIC 
statement makes all of this trivial.


I also hope that when done, I can also remove all the ugly built-in 
overload code that was created for __sync and continues to be used by 
__atomic.  This would clean up where we have to take func_n and turn it 
into a func_1 or func_2 or whatever.  We also had to bend over and issue 
a crap load of different casts early to squeeze all the parameters into 
the 'proper' form for the builtins. This made it more awkward to dig 
down and find the things being operated on and manipulate them. The 
type-checking code is not a thing of bea

Re: [C11-atomic] [patch] gimple atomic statements

2012-04-04 Thread Richard Henderson

On 04/04/2012 09:46 AM, Richard Guenther wrote:
> If that is the only reason you can return two values by using a complex
> or vector type (that would be only an IL implementation detail as far
> as I can see).
> We use that trick to get sincos () "sane" in our IL as well.

That would work if the two values were of the same type, as they
are with sincos.  In the case of compare_exchange, they are not.

r~

Re: [C11-atomic] [patch] gimple atomic statements

2012-04-04 Thread Richard Guenther

On Wed, Apr 4, 2012 at 3:28 PM, Andrew MacLeod  wrote:
> On 04/04/2012 04:45 AM, Richard Guenther wrote:
>>
>>
>> The fact that you need to touch every place that wants to look at memory
>> accesses shows that you are doing it wrong.  Instead my plan was to
>> force _all_ memory accesses to GIMPLE_ASSIGNs (yes, including those
>> we have now in calls).  You're making a backwards step in my eyes.
>
> I'm not sure I understand what you are saying, or at least I don't know what
> this plan you are talking about is...   Are you saying that you are planning
> to change gimple so that none of the gimple statement types other than
> GIMPLE_ASSIGN ever see an ADDR_EXPR or memory reference?

A memory reference, yes.  And at most one, thus no aggregate copies
anymore.

>     Seems like that
> change, when it happens, would simply affect GIMPLE_ATOMIC like all the
> other gimple classes.  And if it was done before I tried to merge the
> branch, would fall on me to fix.  Right now, I'm just handling what the
> compiler sends my way...  A bunch of places need to understand a new
> gimple_statement kind...

I'm not sure if I ever will end up finishing the above I just wanted
to mention it.

>> What do you think is "easier" when you use a GIMPLE_ATOMIC
>> (why do you need a fntype field?!  Should the type not be available
>> via the operand types?!)
>
>
> This is a WIP... that fntype fields is there for simplicity..   and no...
> you can do a 1 byte atomic operation on a full word object if you want by
> using __atomic_store_1 ()... so you can't just look at the object. We might
> be able to sort that type out eventually if all the casts are correct, but
> until everything is finished, this is safer.  I'm actually hoping eventually
> to not have a bunch of casts on the params, they are just there to get
> around the builtin's type-checking system.. we should be able to  just take
> care of required promotions at expansion time and do type-checking during
> verification.

Oh, so you rather need a size or a mode specified, not a "fntype"?

>
>
>>
>> Your tree-cfg.c parts need to be filled in.  They are the specification of
>> GIMPLE_ATOMIC - at the moment you allow any garbage.
>
>
> well of course this isnt suppose to be a final patch, its to get the
> core changes into a branch while I continue working on it.  There are a
> number of parts that aren't filled in or flushed out yet.   Once its all
> working and what is expected is well defined, then I'll fill in the
> verification stuff.
>
>
>> Similar to how I dislike the choice of adding GIMPLE_TRANSACTION
>> instead of using builtin functions I dislike this.
>>
>> I suppose you do not want to use builtins because for primitive types you
>> end up with multiple statements for something "atomic"?
>
> builtins are just more awkward to work with, and don't support more than 1
> result.
> compare_and swap was the worst case.. it has 2 results and that does not map
> to a built in function very well. we struggled all last fall with how to do
> it efficiently, and eventually gave up. given:
>
>  int p = 1;
>  bool ret;
>  ret = __atomic_compare_exchange_n (&flag2, &p, 0, 0, __ATOMIC_SEQ_CST,
> __ATOMIC_SEQ_CST);
>  return ret;
>
> with GCC 4.7 we currently end up generating
>
>  p = 1;
>  ret_1 = __atomic_compare_exchange_4 (&flag2, &p, 0, 0, 5, 5);
>  return ret_1;
>
> Note this actually requires leaving a local (p) on the stack, and reduces
> the optimizations that can be performed on it, even though there isn't
> really a need.

You could use a vector, complex or aggregate return.

> By going to a gimple statement, we can expose both results properly, and
> this ends up generating
>
>  (ret_3, cmpxchg.2_4) = atomic_compare_exchange_strong <&flag2, 1, 0,
> SEQ_CST, SEQ_CST>
>  return ret_3;

In the example you only ever use address operands (not memory operands)
to the GIMPLE_ATOMIC - is that true in all cases?  Is the result always
non-memory?

I suppose the GIMPLE_ATOMICs are still optimization barriers for all
memory, not just that possibly referenced by them?

> and during expansion to RTL, can trivially see that cmpxchg.2_4 is not used,
> and generate the really efficient compare and swap pattern which only
> produces a boolean result.

I suppose gimple stmt folding could transform it as well?

>   if only cmpxchg.2_4 were used, we can generate
> the C&S pattern which only returns the result.  Only if we see both are
> actually used do we have to fall back to the much uglier pattern we have
> that produces both results.  Currently we always generate this pattern.
>
> Next, we have the C11 atomic type qualifier which needs to be implemented.
>  Every reference to this variable is going to have to be expanded into one
> or more atomic operations of some sort.  Yes, I probably could do that by
> emitting built-in functions, but they are a bit more unwieldy, its far
> simpler to just create gimple_statements.

As I understand you first generate builtins anyway and then

Re: [C11-atomic] [patch] gimple atomic statements

2012-04-04 Thread Richard Guenther

On Wed, Apr 4, 2012 at 4:32 PM, Richard Henderson  wrote:
> On 04/04/2012 09:46 AM, Richard Guenther wrote:
>> If that is the only reason you can return two values by using a complex
>> or vector type (that would be only an IL implementation detail as far
>> as I can see).
>> We use that trick to get sincos () "sane" in our IL as well.
>
> That would work if the two values were of the same type, as they
> are with sincos.  In the case of compare_exchange, they are not.

You can return an aggregate then (or adjust the IL so that they do have the
same type and only fix that up during expansion).

Richard.

>
>
> r~

C++ PATCH for c++/52845 (bogus warning with empty lambda)

2012-04-04 Thread Jason Merrill

My patch for return type deduction forgot to update the fntype local 
variable in finish_function, leading to a bogus warning about a missing 
return statement.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 12a282edca78579074f5f4180cd2dce1edebd2bf
Author: Jason Merrill 
Date:   Wed Apr 4 10:14:46 2012 -0400

	PR c++/52845
	* decl.c (finish_function): Update fntype after deducing return type.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index d210f19..e2f01d5 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -13518,6 +13518,7 @@ finish_function (int flags)
 		  "deduced to %");
 	}
   apply_deduced_return_type (fndecl, void_type_node);
+  fntype = TREE_TYPE (fndecl);
 }
 
   /* Save constexpr function body before it gets munged by
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-warn4.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-warn4.C
new file mode 100644
index 000..2afeccf
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-warn4.C
@@ -0,0 +1,7 @@
+// PR c++/52845
+// { dg-options "-std=c++11 -Wall" }
+
+void f()
+{
+  [](){};
+}

Re: [RFC] Should SRA stop producing COMPONENT_REF for non-bit-fields (again)?

2012-04-04 Thread Richard Guenther

On Wed, 4 Apr 2012, Martin Jambor wrote:

> Hi everyone, especially Richi and Eric,
> 
> I'd like to know what is your attitude to changing SRA's
> build_ref_for_model to what it once looked like, so that it produces
> COMPONENT_REFs only for bit-fields.  The non-bit field handling was
> added in order to work-around problems when expanding non-aligned
> MEM_REFs on strict-alignment targets but that should not be a problem
> now and my experiments confirm that.  Last week I successfully
> bootstrapped and tested this patch on sparc64-linux (with the
> temporary MEM_EXPR patch, not including Java), ia64-linux (without
> Ada), x86_64-linux, i686-linux and tested it on hppa-linux (only C and
> C++).
> 
> The main downside of this change I see is that dumps would be a bit
> more difficult to read and understand when the fields disappear from
> them.
> 
> The upsides are:
> 
>   - the expr field of SRA access was originally intended only for
> debugging (meaning both for compiler-produced debug info and
> debugging SRA).  It was never intended to influence the memory
> accesses produced by SRA and when we create them artificially, the
> effects of the particular form are hard to reason about.  If we
> ever lower bit-field accesses on gimple level, build_ref_for_model
> could go away completely (yeah, I know I'm getting carried way
> here).
> 
>   - If something like PR 51528 creeps up again and we need to create
> replacements of type returned by lang_hooks.types.type_for_mode,
> the produced MEM_REFs could simply have this type.  OTOH, the
> current COMPONENT_REFs would require to be encapsulated in V_C_Es
> and that is quite a nightmare.  I tried it in December, even made
> it work, but it was particularly ugly and needed some quite
> questionable uses of V_C_Es.
> 
>   - Well, it does the same thing and is much simpler, is it not?
> 
> The patch fulfills the criteria to be committed and I can do it soon.
> OTOH, keeping it so on a number of platforms takes quite a lot of time
> (and has uncovered some non-related bugs) so I'd like to know whether
> it's worth it.
> 
> Thanks,
> 
> Martin
> 
> 
> 
> 2012-03-20 Martin Jambor 
> 
>   * tree-sra.c (build_ref_for_model): Create COMPONENT_REFs only for
>   bit-fields.
> 
> Index: src/gcc/tree-sra.c
> ===
> *** src.orig/gcc/tree-sra.c
> --- src/gcc/tree-sra.c
> *** build_ref_for_offset (location_t loc, tr
> *** 1489,1558 
> return fold_build2_loc (loc, MEM_REF, exp_type, base, off);
>   }
>   
> - DEF_VEC_ALLOC_P_STACK (tree);
> - #define VEC_tree_stack_alloc(alloc) VEC_stack_alloc (tree, alloc)
> - 
>   /* Construct a memory reference to a part of an aggregate BASE at the given
> !OFFSET and of the type of MODEL.  In case this is a chain of references
> !to component, the function will replicate the chain of COMPONENT_REFs of
> !the expression of MODEL to access it.  GSI and INSERT_AFTER have the same
> !meaning as in build_ref_for_offset.  */
>   
>   static tree
>   build_ref_for_model (location_t loc, tree base, HOST_WIDE_INT offset,
>struct access *model, gimple_stmt_iterator *gsi,
>bool insert_after)
>   {
> !   tree type = model->type, t;
> !   VEC(tree,stack) *cr_stack = NULL;
> ! 
> !   if (TREE_CODE (model->expr) == COMPONENT_REF)
>   {
> !   tree expr = model->expr;
> ! 
> !   /* Create a stack of the COMPONENT_REFs so later we can walk them in
> !  order from inner to outer.  */
> !   cr_stack = VEC_alloc (tree, stack, 6);
> ! 
> !   do {
> ! tree field = TREE_OPERAND (expr, 1);
> ! tree cr_offset = component_ref_field_offset (expr);
> ! HOST_WIDE_INT bit_pos
> !   = tree_low_cst (cr_offset, 1) * BITS_PER_UNIT
> !   + TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field));
>   
> ! /* We can be called with a model different from the one associated
> !with BASE so we need to avoid going up the chain too far.  */
> ! if (offset - bit_pos < 0)
> !   break;
> ! 
> ! offset -= bit_pos;
> ! VEC_safe_push (tree, stack, cr_stack, expr);
> ! 
> ! expr = TREE_OPERAND (expr, 0);
> ! type = TREE_TYPE (expr);
> !   } while (TREE_CODE (expr) == COMPONENT_REF);
>   }
> ! 
> !   t = build_ref_for_offset (loc, base, offset, type, gsi, insert_after);
> ! 
> !   if (TREE_CODE (model->expr) == COMPONENT_REF)
> ! {
> !   unsigned i;
> !   tree expr;
> ! 
> !   /* Now replicate the chain of COMPONENT_REFs from inner to outer.  */
> !   FOR_EACH_VEC_ELT_REVERSE (tree, cr_stack, i, expr)
> ! {
> !   tree field = TREE_OPERAND (expr, 1);
> !   t = fold_build3_loc (loc, COMPONENT_REF, TREE_TYPE (field), t, field,
> !TREE_OPERAND (expr, 2));
> ! }
> ! 
> !   VEC_free (tree, stack, cr_stack);
> ! }
> ! 
> !   return t;
>   }
>

Re: [Libiberty]: Handle VMS as a LLP64 platform in splay-tree.h

2012-04-04 Thread Tristan Gingold


On Apr 4, 2012, at 3:58 PM, Ian Lance Taylor wrote:

> Tristan Gingold  writes:
> 
>> include/
>> 2012-04-04  Tristan Gingold  
>> 
>>  * splay-tree.h: Use LLP64 definitions of libi_shostptr_t and
>>  libi_hostptr_t for VMS with 64bit pointers.
> 
> I was strongly opposed to adding a _WIN64 define here and this is just
> making it worse.

Understood.

Would something like this be acceptable?
I have just checked that I can still build gcc with that patch.  If you like 
this approach I will properly submit a patch.

Tristan.

--- a/include/splay-tree.h
+++ b/include/splay-tree.h
@@ -37,18 +37,11 @@ extern "C" {
 
 #include "ansidecl.h"
 
-#ifndef _WIN64
-  typedef unsigned long int libi_uhostptr_t;
-  typedef long int libi_shostptr_t;
-#else
-#ifdef __GNUC__
-  __extension__
+#ifdef HAVE_STDINT_H
+#include <stdint.h>
 #endif
-  typedef unsigned long long libi_uhostptr_t;
-#ifdef __GNUC__
-  __extension__
-#endif
-  typedef long long libi_shostptr_t;
+#ifdef HAVE_INTTYPES_H
+#include <inttypes.h>
 #endif
 
 #ifndef GTY
@@ -59,8 +52,8 @@ extern "C" {
these types, if necessary.  These types should be sufficiently wide
that any pointer or scalar can be cast to these types, and then
cast back, without loss of precision.  */
-typedef libi_uhostptr_t splay_tree_key;
-typedef libi_uhostptr_t splay_tree_value;
+typedef uintptr_t splay_tree_key;
+typedef uintptr_t splay_tree_value;
 
 /* Forward declaration for a node in the tree.  */
 typedef struct splay_tree_node_s *splay_tree_node;


index 7450eeb..fa45392 100644
--- a/gcc/gengtype.c
+++ b/gcc/gengtype.c
@@ -4976,6 +4976,7 @@ main (int argc, char **argv)
   POS_HERE (do_scalar_typedef ("double_int", &pos));
   POS_HERE (do_scalar_typedef ("uint64_t", &pos));
   POS_HERE (do_scalar_typedef ("uint8", &pos));
+  POS_HERE (do_scalar_typedef ("uintptr_t", &pos));
   POS_HERE (do_scalar_typedef ("jword", &pos));
   POS_HERE (do_scalar_typedef ("JCF_u2", &pos));
   POS_HERE (do_scalar_typedef ("void", &pos));
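
The splay-tree.h change above relies on uintptr_t being wide enough to hold any object pointer and give it back unchanged, regardless of whether the platform is LP64 or LLP64.  A minimal sketch of that round trip follows; example_key, key_from_pointer and pointer_from_key are hypothetical names used only for illustration and are not part of libiberty.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for splay_tree_key after the patch.  */
typedef uintptr_t example_key;

/* Store an object pointer in a key.  */
static example_key
key_from_pointer (const void *p)
{
  /* uintptr_t is specified to hold a converted void pointer.  */
  assert (sizeof (example_key) >= sizeof (void *));
  return (example_key) p;
}

/* Recover the pointer that was stored in a key.  */
static void *
pointer_from_key (example_key k)
{
  return (void *) k;
}
```

With unsigned long as the key type this round trip would truncate 64-bit pointers on an LLP64 host, which is what the _WIN64 special case (and the proposed VMS one) worked around.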

Re: [Libiberty]: Handle VMS as a LLP64 platform in splay-tree.h

2012-04-04 Thread Ian Lance Taylor

Tristan Gingold  writes:

> Would something like that be acceptable ?
> I have just checked that I can still build gcc with that patch.  If you like 
> this approach I will properly submit a patch.

Thanks.

You should also test that gdb continues to build with this patch.

I guess the question here is portability.  Do we still care about
portability to systems that have neither inttypes.h nor stdint.h?  I'm
willing to try this and see if anybody complains.

Ian


> --- a/include/splay-tree.h
> +++ b/include/splay-tree.h
> @@ -37,18 +37,11 @@ extern "C" {
>  
>  #include "ansidecl.h"
>  
> -#ifndef _WIN64
> -  typedef unsigned long int libi_uhostptr_t;
> -  typedef long int libi_shostptr_t;
> -#else
> -#ifdef __GNUC__
> -  __extension__
> +#ifdef HAVE_STDINT_H
> +#include <stdint.h>
>  #endif
> -  typedef unsigned long long libi_uhostptr_t;
> -#ifdef __GNUC__
> -  __extension__
> -#endif
> -  typedef long long libi_shostptr_t;
> +#ifdef HAVE_INTTYPES_H
> +#include <inttypes.h>
>  #endif
>  
>  #ifndef GTY
> @@ -59,8 +52,8 @@ extern "C" {
> these types, if necessary.  These types should be sufficiently wide
> that any pointer or scalar can be cast to these types, and then
> cast back, without loss of precision.  */
> -typedef libi_uhostptr_t splay_tree_key;
> -typedef libi_uhostptr_t splay_tree_value;
> +typedef uintptr_t splay_tree_key;
> +typedef uintptr_t splay_tree_value;
>  
>  /* Forward declaration for a node in the tree.  */
>  typedef struct splay_tree_node_s *splay_tree_node;
>
>
> index 7450eeb..fa45392 100644
> --- a/gcc/gengtype.c
> +++ b/gcc/gengtype.c
> @@ -4976,6 +4976,7 @@ main (int argc, char **argv)
>POS_HERE (do_scalar_typedef ("double_int", &pos));
>POS_HERE (do_scalar_typedef ("uint64_t", &pos));
>POS_HERE (do_scalar_typedef ("uint8", &pos));
> +  POS_HERE (do_scalar_typedef ("uintptr_t", &pos));
>POS_HERE (do_scalar_typedef ("jword", &pos));
>POS_HERE (do_scalar_typedef ("JCF_u2", &pos));
>POS_HERE (do_scalar_typedef ("void", &pos));

Re: [Libiberty]: Handle VMS as a LLP64 platform in splay-tree.h

2012-04-04 Thread Tristan Gingold


On Apr 4, 2012, at 5:07 PM, Ian Lance Taylor wrote:

> Tristan Gingold  writes:
> 
>> Would something like that be acceptable ?
>> I have just checked that I can still build gcc with that patch.  If you like 
>> this approach I will properly submit a patch.
> 
> Thanks.
> 
> You should also test that gdb continues to build with this patch.

Argh, good point.

> I guess the question here is portability.  Do we still care about
> portability to systems that have neither inttypes.h nor stdint.h?  I'm
> willing to try this and see if anybody complains.

In this case, autoconf defines uintptr_t so I don't think there is an issue 
here.

Tristan.

> 
> Ian
> 
> 
>> --- a/include/splay-tree.h
>> +++ b/include/splay-tree.h
>> @@ -37,18 +37,11 @@ extern "C" {
>> 
>> #include "ansidecl.h"
>> 
>> -#ifndef _WIN64
>> -  typedef unsigned long int libi_uhostptr_t;
>> -  typedef long int libi_shostptr_t;
>> -#else
>> -#ifdef __GNUC__
>> -  __extension__
>> +#ifdef HAVE_STDINT_H
>> +#include <stdint.h>
>> #endif
>> -  typedef unsigned long long libi_uhostptr_t;
>> -#ifdef __GNUC__
>> -  __extension__
>> -#endif
>> -  typedef long long libi_shostptr_t;
>> +#ifdef HAVE_INTTYPES_H
>> +#include <inttypes.h>
>> #endif
>> 
>> #ifndef GTY
>> @@ -59,8 +52,8 @@ extern "C" {
>>these types, if necessary.  These types should be sufficiently wide
>>that any pointer or scalar can be cast to these types, and then
>>cast back, without loss of precision.  */
>> -typedef libi_uhostptr_t splay_tree_key;
>> -typedef libi_uhostptr_t splay_tree_value;
>> +typedef uintptr_t splay_tree_key;
>> +typedef uintptr_t splay_tree_value;
>> 
>> /* Forward declaration for a node in the tree.  */
>> typedef struct splay_tree_node_s *splay_tree_node;
>> 
>> 
>> index 7450eeb..fa45392 100644
>> --- a/gcc/gengtype.c
>> +++ b/gcc/gengtype.c
>> @@ -4976,6 +4976,7 @@ main (int argc, char **argv)
>>   POS_HERE (do_scalar_typedef ("double_int", &pos));
>>   POS_HERE (do_scalar_typedef ("uint64_t", &pos));
>>   POS_HERE (do_scalar_typedef ("uint8", &pos));
>> +  POS_HERE (do_scalar_typedef ("uintptr_t", &pos));
>>   POS_HERE (do_scalar_typedef ("jword", &pos));
>>   POS_HERE (do_scalar_typedef ("JCF_u2", &pos));
>>   POS_HERE (do_scalar_typedef ("void", &pos));

Re: [Libiberty]: Handle VMS as a LLP64 platform in splay-tree.h

2012-04-04 Thread Pedro Alves

On 04/04/2012 04:07 PM, Ian Lance Taylor wrote:

> Tristan Gingold  writes:
> 
>> > Would something like that be acceptable ?
>> > I have just checked that I can still build gcc with that patch.  If you 
>> > like this approach I will properly submit a patch.
> Thanks.
> 
> You should also test that gdb continues to build with this patch.


GDB pulls stdint.h from gnulib (though not inttypes.h), so it should
be fine in principle.

> I guess the question here is portability.  Do we still care about
> portability to systems that have neither inttypes.h nor stdint.h?  I'm
> willing to try this and see if anybody complains.


-- 
Pedro Alves

Re: [C11-atomic] [patch] gimple atomic statements

2012-04-04 Thread Andrew MacLeod


On 04/04/2012 10:33 AM, Richard Guenther wrote:

> On Wed, Apr 4, 2012 at 3:28 PM, Andrew MacLeod  wrote:
>> This is a WIP... that fntype fields is there for simplicity..   and no...
>> you can do a 1 byte atomic operation on a full word object if you want by
>
> Oh, so you rather need a size or a mode specified, not a "fntype"?


Yes, poorly named perhaps as I created things... it's just a type node at 
the moment that indicates the size being operated on that I collected 
from the built-in function.




> In the example you only ever use address operands (not memory operands)
> to the GIMPLE_ATOMIC - is that true in all cases?  Is the result always
> non-memory?

The atomic address can be any arbitrary memory location... I haven't 
gotten to that yet.  It's commonly just an address so I'm working with 
that first as proof of concept. When it gets something else it'll trap 
and I'll know :-)


Results are always non-memory, other than the side effects of the atomic 
contents changing and having to update the second parameter to the 
compare_exchange routine.  The generic routines for arbitrary structures 
(not added in yet) actually just work with blocks of memory, but they 
are all handled by addresses and the functions themselves are typically 
void.  I was planning on folding them right into the existing 
atomic_kinds as well... I can recognize from the type that it won't map 
to an integral type.  I needed separate builtins in 4.7 for them since 
the parameter list was different.

> I suppose the GIMPLE_ATOMICs are still optimization barriers for all
> memory, not just that possibly referenced by them?


yes, depending on the memory model used.  It can force synchronization 
with other CPUs/threads which will have the appearance of changing any 
shared memory location.  Various guarantees are made about whether those 
changes are visible to this thread after an atomic operation so we can't 
reuse shared values in those cases.  Various guarantees are made about 
what changes this thread has made are visible to other CPUs/threads at 
an atomic call as well, so that precludes moving stores downward in some 
models.
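
As a rough illustration of why the memory model matters to the optimizers, here is a small publish/consume sketch using the __atomic builtins.  The names payload_example, ready_example, publish and try_consume are hypothetical.  The plain store to the payload must not be moved below the release store, and the plain load must not be hoisted above the acquire load; that ordering is the kind of guarantee described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical shared data: a plain payload guarded by an atomic flag.  */
static int payload_example;
static int ready_example;

/* Writer: the payload store must stay above the release store.  */
static void
publish (int value)
{
  payload_example = value;
  __atomic_store_n (&ready_example, 1, __ATOMIC_RELEASE);
}

/* Reader: the payload load must stay below the acquire load.
   Returns 1 and fills *out if the payload has been published.  */
static int
try_consume (int *out)
{
  assert (out != NULL);
  if (__atomic_load_n (&ready_example, __ATOMIC_ACQUIRE))
    {
      *out = payload_example;
      return 1;
    }
  return 0;
}
```

With __ATOMIC_RELAXED on both operations neither restriction would apply, which is why the barrier behaviour depends on the model argument rather than on the statement kind alone.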



and during expansion to RTL, can trivially see that cmpxchg.2_4 is not used,
and generate the really efficient compare and swap pattern which only
produces a boolean result.

I suppose gimple stmt folding could transform it as well?
it could if I provided gimple statements for the 3 different forms of 
C&S. I was planning to just leave it this way since its the interface 
being forced by C++11 as well as C11... and then just emit the 
appropriate RTL for this one C&S type.  The RTL patterns are already 
defined for the 2 easy cases for the __sync routines. the third one was 
added for __atomic.  Its possible that the process of integrating the 
__sync routines with GIMPLE_ATOMIC will indicate its better to add those 
forms as atomic_kinds and then gimple_fold_stmt could take care of it as 
well.   Maybe that is just a good idea anyway...  I'll keep it in mind.





   if only cmpxchg.2_4 were used, we can generate
the C&S pattern which only returns the result.  Only if we see both are
actually used do we have to fall back to the much uglier pattern we have
that produces both results.  Currently we always generate this pattern.

Next, we have the C11 atomic type qualifier which needs to be implemented.
  Every reference to this variable is going to have to be expanded into one
or more atomic operations of some sort.  Yes, I probably could do that by
emitting built-in functions, but they are a bit more unwieldy, its far
simpler to just create gimple_statements.

As I understand you first generate builtins anyway and then lower them?
Or are you planning on emitting those for GENERIC as well?  Remember
GENERIC is not GIMPLE, so you'd need new tree codes anyway ;)
Or do you plan to make __atomic integral part of GENERIC and thus
do this lowering during gimplification?
I was actually thinking about doing it during gimplification... I hadn't 
gotten as far as figuring out what to do with the functions from the 
front end yet.  I don't know that code well, but I was in fact hoping 
there was a way to 'recognize' the function names easily and avoid 
built-in functions completely...


The C parser is going to have to understand the set of C11 routine names 
for all these anyway.. I figured there was something in there that could 
be done.




I also hope that when done, I can also remove all the ugly built-in overload
code that was created for __sync and continues to be used by __atomic.

But the builtins will stay for our users consumption and libstdc++ use, no?


Well, the names must remain exposed and recognizable since they are 'out 
there'.  Maybe under the covers I can just leave them as normal calls 
and then during gimplification simply recognize the names and generate 
GIMPLE_ATOMIC statements directly from the CALL_EXPR.  That would be 
ideal.  That way there are no builtins any more.




So bottom line, a GIMPLE

Re: PATCH: [x32] libitm failures on x32

2012-04-04 Thread Uros Bizjak

Hello!

> We need to use long long instead of long in gtm_jmpbuf for x86_64 since
> long in x32 is 32bits.  OK for trunk and 4.7 branch?
>
> 2012-04-03  H.J. Lu  
>
>   PR libitm/52854
>   * config/x86/target.h (gtm_jmpbuf): Replace long with long long
>   for x86-64.

OK.

Thanks,
Uros.
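
For readers unfamiliar with x32: it runs on x86-64 hardware with 64-bit
registers, but long and pointers are 32 bits wide, so a save slot typed
as long is too narrow.  A minimal sketch of the idea follows; the struct
below is a hypothetical stand-in, not libitm's actual gtm_jmpbuf layout.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical register save area, NOT the real gtm_jmpbuf.  Each slot
   is long long so that it is 64 bits wide on both LP64 and x32.  */
struct saved_regs
{
  unsigned long long sp;
  unsigned long long pc;
};

/* long long is at least 64 bits on any conforming C99/C11 compiler.  */
_Static_assert (sizeof (unsigned long long) * CHAR_BIT >= 64,
                "save slot must hold a 64-bit register");

/* Nonzero if plain long could also hold a 64-bit register on this ABI
   (expected to be true for x86-64 LP64 and false for x32 and i386).  */
int
long_is_wide_enough (void)
{
  return sizeof (long) * CHAR_BIT >= 64;
}
```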

Re: [wwwdocs] Buildstat update for 4.5

2012-04-04 Thread Gerald Pfeifer

On Tue, 3 Apr 2012, Tom G. Christensen wrote:
> Latest results for 4.5.x
> 
> -tgc
> 
> Testresults for 4.5.3:
>   i386-pc-solaris2.8 (2)

Thanks, Tom, this is live.

Gerald

[PATCH, libstdc++]: Fix static linking failure on alphaev68-pc-linux-gnu

2012-04-04 Thread Uros Bizjak

Hello!

The fix for PR52689 caused the following testsuite failure on
alphaev68-pc-linux-gnu:

Running target unix
FAIL: libmudflap.c++/pass41-frag.cxx (-static) (test for excess errors)
WARNING: libmudflap.c++/pass41-frag.cxx (-static) compilation failed
to produce executable

From the testsuite log:

/home/uros/gcc-build/alphaev68-unknown-linux-gnu/libstdc++-v3/src/.libs/libstdc++.a(locale.o):
In function `std::locale::id::_M_id() const':
/home/uros/gcc-build/alphaev68-unknown-linux-gnu/libstdc++-v3/src/c++98/../../../../../gcc-svn/trunk/libstdc++-v3/src/c++98/locale.cc:423:
undefined reference to `std::num_get > >::id'
/home/uros/gcc-build/alphaev68-unknown-linux-gnu/libstdc++-v3/src/c++98/../../../../../gcc-svn/trunk/libstdc++-v3/src/c++98/locale.cc:424:
undefined reference to `std::num_put > >::id'
/home/uros/gcc-build/alphaev68-unknown-linux-gnu/libstdc++-v3/src/c++98/../../../../../gcc-svn/trunk/libstdc++-v3/src/c++98/locale.cc:425:
undefined reference to `std::money_get > >::id'
/home/uros/gcc-build/alphaev68-unknown-linux-gnu/libstdc++-v3/src/c++98/../../../../../gcc-svn/trunk/libstdc++-v3/src/c++98/locale.cc:426:
undefined reference to `std::money_put > >::id'
/home/uros/gcc-build/alphaev68-unknown-linux-gnu/libstdc++-v3/src/c++98/../../../../../gcc-svn/trunk/libstdc++-v3/src/c++98/locale.cc:428:
undefined reference to `std::num_get > >::id'
/home/uros/gcc-build/alphaev68-unknown-linux-gnu/libstdc++-v3/src/c++98/../../../../../gcc-svn/trunk/libstdc++-v3/src/c++98/locale.cc:429:
undefined reference to `std::num_put > >::id'
/home/uros/gcc-build/alphaev68-unknown-linux-gnu/libstdc++-v3/src/c++98/../../../../../gcc-svn/trunk/libstdc++-v3/src/c++98/locale.cc:430:
undefined reference to `std::money_get > >::id'

This happens in #ifdef _GLIBCXX_LONG_DOUBLE_COMPAT protected code.

The following partial revert fixes the failure:

2012-04-04  Uros Bizjak  

Partially revert:
2012-03-28  Benjamin Kosnik  

PR libstdc++/52689
* src/c++98/compatibility-ldbl.cc: Guard with PIC

Tested on alphaev68-pc-linux-gnu, approved in the PR audit trail by
Benjamin, committed to SVN.

Uros.

Index: src/c++98/compatibility-ldbl.cc
===
--- src/c++98/compatibility-ldbl.cc (revision 186092)
+++ src/c++98/compatibility-ldbl.cc (working copy)
@@ -27,8 +27,6 @@
 #include 
 #include 

-#ifdef PIC
-
 #ifdef _GLIBCXX_LONG_DOUBLE_COMPAT

 #ifdef __LONG_DOUBLE_128__
@@ -80,5 +78,3 @@
   __attribute__((alias ("_ZNKSt3tr14hashIeEclEe")));

 #endif
-
-#endif

Re: PATCH: Define TRY_EMPTY_VM_SPACE for Linux/x32

2012-04-04 Thread Uros Bizjak

Hello!

> This patch defines TRY_EMPTY_VM_SPACE for Linux/x32.  Tested on Linux/x32.
> OK for trunk?
>
> 2012-04-03  H.J. Lu  
>
>   * config/host-linux.c (TRY_EMPTY_VM_SPACE): Defined to
>   0x6000 for x32.

I think we can simply check for __LP64__, without version check, as is
the case with SPARC and MIPS targets.

Uros.

Index: host-linux.c
===
--- host-linux.c(revision 186141)
+++ host-linux.c(working copy)
@@ -68,8 +68,10 @@
 # define TRY_EMPTY_VM_SPACE0x100
 #elif defined(__ia64)
 # define TRY_EMPTY_VM_SPACE0x2001
+#elif defined(__x86_64) && defined(__LP64__)
+# define TRY_EMPTY_VM_SPACE0x10
 #elif defined(__x86_64)
-# define TRY_EMPTY_VM_SPACE0x10
+# define TRY_EMPTY_VM_SPACE0x6000
 #elif defined(__i386)
 # define TRY_EMPTY_VM_SPACE0x6000
 #elif defined(__powerpc__)

[PATCH 1/2] doc: Correct __builtin_arm_tinsr prototype documentation

2012-04-04 Thread Matt Turner

2012-04-04  Matt Turner  

gcc/
* doc/extend.texi (__builtin_arm_tinsrb): Add missing second
parameter.
(__builtin_arm_tinsrh): Likewise.
(__builtin_arm_tinsrw): Likewise.
---
This patch and 2/2 are tie-ons to
http://gcc.gnu.org/ml/gcc-patches/2012-02/msg01269.html

Still waiting on copyright assignment, but I think this doc patch
is trivial enough to be committed without it.

 gcc/doc/extend.texi |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index bb43825..966175d 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -8676,9 +8676,9 @@ int __builtin_arm_textrmsw (v2si, int)
 int __builtin_arm_textrmub (v8qi, int)
 int __builtin_arm_textrmuh (v4hi, int)
 int __builtin_arm_textrmuw (v2si, int)
-v8qi __builtin_arm_tinsrb (v8qi, int)
-v4hi __builtin_arm_tinsrh (v4hi, int)
-v2si __builtin_arm_tinsrw (v2si, int)
+v8qi __builtin_arm_tinsrb (v8qi, int, int)
+v4hi __builtin_arm_tinsrh (v4hi, int, int)
+v2si __builtin_arm_tinsrw (v2si, int, int)
 long long __builtin_arm_tmia (long long, int, int)
 long long __builtin_arm_tmiabb (long long, int, int)
 long long __builtin_arm_tmiabt (long long, int, int)
-- 
1.7.3.4

[PATCH] doc: Fix typo: mno-lsc -> mno-llsc

2012-04-04 Thread Matt Turner

2012-04-04  Matt Turner  

gcc/
* doc/install.texi: Correct typo "-mno-lsc" -> "-mno-llsc".
---
Still waiting on copyright assignment, but I think this doc patch
is trivial enough to be committed without it.

 gcc/doc/install.texi |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 41dbf44..6da6c09 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -1238,7 +1238,7 @@ Division by zero checks use the break instruction.
 
 @item --with-llsc
 On MIPS targets, make @option{-mllsc} the default when no
-@option{-mno-lsc} option is passed.  This is the default for
+@option{-mno-llsc} option is passed.  This is the default for
 Linux-based targets, as the kernel will emulate them if the ISA does
 not provide them.
 
-- 
1.7.3.4

[PATCH 2/2] arm: add iwMMXt mmx-2.c test

2012-04-04 Thread Matt Turner

2012-04-04  Matt Turner  

PR target/35294
* gcc.target/arm/mmx-2.c: New.
---
This patch and 1/2 are tie-ons to
http://gcc.gnu.org/ml/gcc-patches/2012-02/msg01269.html

Still waiting on copyright assignment, but please review in the meantime.

Is there anything else I need to do to wire this into the test suite
other than putting it in the testsuite/gcc.target/arm/ folder?

 gcc/testsuite/gcc.target/arm/mmx-2.c |  158 ++
 1 files changed, 158 insertions(+), 0 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/mmx-2.c

diff --git a/gcc/testsuite/gcc.target/arm/mmx-2.c 
b/gcc/testsuite/gcc.target/arm/mmx-2.c
new file mode 100644
index 000..603a63b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mmx-2.c
@@ -0,0 +1,158 @@
+/* { dg-do compile } */
+/* { dg-skip-if "Test is specific to the iWMMXt" { arm*-*-* } { "-mcpu=*" } { 
"-mcpu=iwmmxt" } } */
+/* { dg-skip-if "Test is specific to the iWMMXt" { arm*-*-* } { "-mabi=*" } { 
"-mabi=iwmmxt" } } */
+/* { dg-skip-if "Test is specific to the iWMMXt" { arm*-*-* } { "-march=*" } { 
"-march=iwmmxt" } } */
+/* { dg-skip-if "Test is specific to ARM mode" { arm*-*-* } { "-mthumb" } { "" 
} } */
+/* { dg-require-effective-target arm32 } */
+/* { dg-require-effective-target arm_iwmmxt_ok } */
+
+/* Internal data types for implementing the intrinsics.  */
+typedef int __v2si __attribute__ ((vector_size (8)));
+typedef short __v4hi __attribute__ ((vector_size (8)));
+typedef signed char __v8qi __attribute__ ((vector_size (8)));
+
+void
+foo(void)
+{
+  volatile int isink;
+  volatile long long llsink;
+  volatile __v8qi v8sink;
+  volatile __v4hi v4sink;
+  volatile __v2si v2sink;
+
+  isink = __builtin_arm_getwcx (0);
+  __builtin_arm_setwcx (isink, 0);
+  isink = __builtin_arm_textrmsb (v8sink, 0);
+  isink = __builtin_arm_textrmsh (v4sink, 0);
+  isink = __builtin_arm_textrmsw (v2sink, 0);
+  isink = __builtin_arm_textrmub (v8sink, 0);
+  isink = __builtin_arm_textrmuh (v4sink, 0);
+  isink = __builtin_arm_textrmuw (v2sink, 0);
+  v8sink = __builtin_arm_tinsrb (v8sink, isink, 0);
+  v4sink = __builtin_arm_tinsrh (v4sink, isink, 0);
+  v2sink = __builtin_arm_tinsrw (v2sink, isink, 0);
+  llsink = __builtin_arm_tmia (llsink, isink, isink);
+  llsink = __builtin_arm_tmiabb (llsink, isink, isink);
+  llsink = __builtin_arm_tmiabt (llsink, isink, isink);
+  llsink = __builtin_arm_tmiaph (llsink, isink, isink);
+  llsink = __builtin_arm_tmiatb (llsink, isink, isink);
+  llsink = __builtin_arm_tmiatt (llsink, isink, isink);
+  isink = __builtin_arm_tmovmskb (v8sink);
+  isink = __builtin_arm_tmovmskh (v4sink);
+  isink = __builtin_arm_tmovmskw (v2sink);
+  llsink = __builtin_arm_waccb (v8sink);
+  llsink = __builtin_arm_wacch (v4sink);
+  llsink = __builtin_arm_waccw (v2sink);
+  v8sink = __builtin_arm_waddb (v8sink, v8sink);
+  v8sink = __builtin_arm_waddbss (v8sink, v8sink);
+  v8sink = __builtin_arm_waddbus (v8sink, v8sink);
+  v4sink = __builtin_arm_waddh (v4sink, v4sink);
+  v4sink = __builtin_arm_waddhss (v4sink, v4sink);
+  v4sink = __builtin_arm_waddhus (v4sink, v4sink);
+  v2sink = __builtin_arm_waddw (v2sink, v2sink);
+  v2sink = __builtin_arm_waddwss (v2sink, v2sink);
+  v2sink = __builtin_arm_waddwus (v2sink, v2sink);
+  v8sink = __builtin_arm_walign (v8sink, v8sink, 0);  /* waligni: 3-bit 
immediate.  */
+  v8sink = __builtin_arm_walign (v8sink, v8sink, isink); /* walignr: GP 
register.  */
+  llsink = __builtin_arm_wand(llsink, llsink);
+  llsink = __builtin_arm_wandn (llsink, llsink);
+  v8sink = __builtin_arm_wavg2b (v8sink, v8sink);
+  v8sink = __builtin_arm_wavg2br (v8sink, v8sink);
+  v4sink = __builtin_arm_wavg2h (v4sink, v4sink);
+  v4sink = __builtin_arm_wavg2hr (v4sink, v4sink);
+  v8sink = __builtin_arm_wcmpeqb (v8sink, v8sink);
+  v4sink = __builtin_arm_wcmpeqh (v4sink, v4sink);
+  v2sink = __builtin_arm_wcmpeqw (v2sink, v2sink);
+  v8sink = __builtin_arm_wcmpgtsb (v8sink, v8sink);
+  v4sink = __builtin_arm_wcmpgtsh (v4sink, v4sink);
+  v2sink = __builtin_arm_wcmpgtsw (v2sink, v2sink);
+  v8sink = __builtin_arm_wcmpgtub (v8sink, v8sink);
+  v4sink = __builtin_arm_wcmpgtuh (v4sink, v4sink);
+  v2sink = __builtin_arm_wcmpgtuw (v2sink, v2sink);
+  llsink = __builtin_arm_wmacs (llsink, v4sink, v4sink);
+  llsink = __builtin_arm_wmacsz (v4sink, v4sink);
+  llsink = __builtin_arm_wmacu (llsink, v4sink, v4sink);
+  llsink = __builtin_arm_wmacuz (v4sink, v4sink);
+  v4sink = __builtin_arm_wmadds (v4sink, v4sink);
+  v4sink = __builtin_arm_wmaddu (v4sink, v4sink);
+  v8sink = __builtin_arm_wmaxsb (v8sink, v8sink);
+  v4sink = __builtin_arm_wmaxsh (v4sink, v4sink);
+  v2sink = __builtin_arm_wmaxsw (v2sink, v2sink);
+  v8sink = __builtin_arm_wmaxub (v8sink, v8sink);
+  v4sink = __builtin_arm_wmaxuh (v4sink, v4sink);
+  v2sink = __builtin_arm_wmaxuw (v2sink, v2sink);
+  v8sink = __builtin_arm_wminsb (v8sink, v8sink);
+  v4sink = __builtin_arm_wminsh (v4sink, v4sink);
+  v2sink = __builtin_arm_wminsw (v

Re: PATCH: Define TRY_EMPTY_VM_SPACE for Linux/x32

2012-04-04 Thread H.J. Lu

On Wed, Apr 4, 2012 at 11:08 AM, Uros Bizjak  wrote:
> Hello!
>
>> This patch defines TRY_EMPTY_VM_SPACE for Linux/x32.  Tested on Linux/x32.
>> OK for trunk?
>>
>> 2012-04-03  H.J. Lu  
>>
>>       * config/host-linux.c (TRY_EMPTY_VM_SPACE): Defined to
>>       0x6000 for x32.
>
> I think we can simply check for __LP64__, without version check, as is
> the case with SPARC and MIPS targets.
>
> Uros.
>
> Index: host-linux.c
> ===
> --- host-linux.c        (revision 186141)
> +++ host-linux.c        (working copy)
> @@ -68,8 +68,10 @@
>  # define TRY_EMPTY_VM_SPACE    0x100
>  #elif defined(__ia64)
>  # define TRY_EMPTY_VM_SPACE    0x2001
> +#elif defined(__x86_64) && defined(__LP64__)
> +# define TRY_EMPTY_VM_SPACE    0x10
>  #elif defined(__x86_64)
> -# define TRY_EMPTY_VM_SPACE    0x10
> +# define TRY_EMPTY_VM_SPACE    0x6000
>  #elif defined(__i386)
>  # define TRY_EMPTY_VM_SPACE    0x6000
>  #elif defined(__powerpc__)

When you compile GCC 4.8 with GCC 3.2 on Linux/x86-64,
__LP64__ won't be defined and TRY_EMPTY_VM_SPACE
will be 0x6000 instead of 0x10.
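
To illustrate the concern, here is a small sketch that classifies the
build with the same macro tests as the proposed #elif chain.  It reports
a label instead of an address, since the actual TRY_EMPTY_VM_SPACE values
are not the point here; the function names are invented for the example.

```c
#include <limits.h>

/* Hypothetical helper mirroring the proposed preprocessor tests.  The
   middle branch is taken by x32, but also by any x86-64 compiler that
   does not predefine __LP64__, which is the case described above.  */
const char *
x86_64_abi_by_lp64 (void)
{
#if defined(__x86_64) && defined(__LP64__)
  return "x86-64, LP64";
#elif defined(__x86_64)
  return "x86-64 without __LP64__ (x32, or a compiler lacking the macro)";
#else
  return "not x86-64";
#endif
}

/* A check that does not depend on predefined macros: look at the actual
   pointer width of the compiler building this file.  */
int
pointers_are_64_bit (void)
{
  return sizeof (void *) * CHAR_BIT == 64;
}
```

If the first function takes the middle branch while the second returns
nonzero, the build compiler is an LP64 one that simply lacks the macro.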


-- 
H.J.

libgo patch committed: More syscall improvements

2012-04-04 Thread Ian Lance Taylor

This patch to libgo adds more constants to the syscall package,
continuing the process of making the gccgo version of syscall more like
the one in the master library.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline and 4.7 branch.

Ian

diff -r 34124478458a libgo/configure.ac
--- a/libgo/configure.ac	Tue Apr 03 16:44:24 2012 -0700
+++ b/libgo/configure.ac	Wed Apr 04 11:44:28 2012 -0700
@@ -459,7 +459,7 @@
   ;;
 esac
 
-AC_CHECK_HEADERS(sys/mman.h syscall.h sys/epoll.h sys/ptrace.h sys/syscall.h sys/user.h sys/utsname.h sys/select.h sys/socket.h net/if.h net/if_arp.h sys/prctl.h sys/mount.h sys/vfs.h sys/statfs.h sys/timex.h sys/sysinfo.h utime.h linux/reboot.h)
+AC_CHECK_HEADERS(sys/mman.h syscall.h sys/epoll.h sys/ptrace.h sys/syscall.h sys/user.h sys/utsname.h sys/select.h sys/socket.h net/if.h net/if_arp.h sys/prctl.h sys/mount.h sys/vfs.h sys/statfs.h sys/timex.h sys/sysinfo.h utime.h linux/ether.h linux/reboot.h)
 
 AC_CHECK_HEADERS([linux/filter.h linux/netlink.h linux/rtnetlink.h], [], [],
 [#ifdef HAVE_SYS_SOCKET_H
diff -r 34124478458a libgo/mksysinfo.sh
--- a/libgo/mksysinfo.sh	Tue Apr 03 16:44:24 2012 -0700
+++ b/libgo/mksysinfo.sh	Wed Apr 04 11:44:28 2012 -0700
@@ -118,6 +118,9 @@
 #if defined(HAVE_UTIME_H)
 #include 
 #endif
+#if defined(HAVE_LINUX_ETHER_H)
+#include 
+#endif
 #if defined(HAVE_LINUX_REBOOT_H)
 #include 
 #endif
@@ -214,7 +217,7 @@
 fi
 
 # Networking constants.
-egrep '^const _(AF|ARPHRD|SOCK|SOL|SO|IPPROTO|TCP|IP|IPV6)_' gen-sysinfo.go |
+egrep '^const _(AF|ARPHRD|ETH|IN|SOCK|SOL|SO|IPPROTO|TCP|IP|IPV6)_' gen-sysinfo.go |
   sed -e 's/^\(const \)_\([^= ]*\)\(.*\)$/\1\2 = _\2/' >> ${OUT}
 grep '^const _SOMAXCONN' gen-sysinfo.go |
   sed -e 's/^\(const \)_\(SOMAXCONN[^= ]*\)\(.*\)$/\1\2 = _\2/' \
@@ -461,6 +464,10 @@
   >> ${OUT}
 echo "type DIR _DIR" >> ${OUT}
 
+# Values for d_type field in dirent.
+grep '^const _DT_' gen-sysinfo.go |
+  sed -e 's/^\(const \)_\(DT_[^= ]*\)\(.*\)$/\1\2 = _\2/' >> ${OUT}
+
 # The rusage struct.
 rusage=`grep '^type _rusage struct' gen-sysinfo.go`
 if test "$rusage" != ""; then
@@ -795,15 +802,16 @@
 # The termios constants.
 for n in IGNBRK BRKINT IGNPAR PARMRK INPCK ISTRIP INLCR IGNCR ICRNL IUCLC \
 IXON IXANY IXOFF IMAXBEL IUTF8 OPOST OLCUC ONLCR OCRNL ONOCR ONLRET \
-OFILL OFDEL NLDLY NL0 NL1 CRDLY CR0 CR1 CR2 CR3 TABDLY BSDLY VTDLY \
-FFDLY CBAUD CBAUDEX CSIZE CSTOPB CREAD PARENB PARODD HUPCL CLOCAL \
-LOBLK CIBAUD CMSPAR CRTSCTS ISIG ICANON XCASE ECHO ECHOE ECHOK ECHONL \
-ECHOCTL ECHOPRT ECHOKE DEFECHO FLUSHO NOFLSH TOSTOP PENDIN IEXTEN VINTR \
-VQUIT VERASE VKILL VEOF VMIN VEOL VTIME VEOL2 VSWTCH VSTART VSTOP VSUSP \
-VDSUSP VLNEXT VWERASE VREPRINT VDISCARD VSTATUS TCSANOW TCSADRAIN \
+OFILL OFDEL NLDLY NL0 NL1 CRDLY CR0 CR1 CR2 CR3 CS5 CS6 CS7 CS8 TABDLY \
+BSDLY VTDLY FFDLY CBAUD CBAUDEX CSIZE CSTOPB CREAD PARENB PARODD HUPCL \
+CLOCAL LOBLK CIBAUD CMSPAR CRTSCTS ISIG ICANON XCASE ECHO ECHOE ECHOK \
+ECHONL ECHOCTL ECHOPRT ECHOKE DEFECHO FLUSHO NOFLSH TOSTOP PENDIN IEXTEN \
+VINTR VQUIT VERASE VKILL VEOF VMIN VEOL VTIME VEOL2 VSWTCH VSTART VSTOP \
+VSUSP VDSUSP VLNEXT VWERASE VREPRINT VDISCARD VSTATUS TCSANOW TCSADRAIN \
 TCSAFLUSH TCIFLUSH TCOFLUSH TCIOFLUSH TCOOFF TCOON TCIOFF TCION B0 B50 \
 B75 B110 B134 B150 B200 B300 B600 B1200 B1800 B2400 B4800 B9600 B19200 \
-B38400 B57600 B115200 B230400; do
+B38400 B57600 B115200 B230400 B460800 B50 B576000 B921600 B100 \
+B1152000 B150 B200 B250 B300 B400; do
 
 grep "^const _$n " gen-sysinfo.go | \
 	sed -e 's/^\(const \)_\([^=]*\)\(.*\)$/\1\2 = _\2/' >> ${OUT}

Re: PATCH: Define TRY_EMPTY_VM_SPACE for Linux/x32

2012-04-04 Thread Uros Bizjak

On Wed, Apr 4, 2012 at 8:47 PM, H.J. Lu  wrote:
> On Wed, Apr 4, 2012 at 11:08 AM, Uros Bizjak  wrote:
>> Hello!
>>
>>> This patch defines TRY_EMPTY_VM_SPACE for Linux/x32.  Tested on Linux/x32.
>>> OK for trunk?
>>>
>>> 2012-04-03  H.J. Lu  
>>>
>>>       * config/host-linux.c (TRY_EMPTY_VM_SPACE): Defined to
>>>       0x6000 for x32.
>>
>> I think we can simply check for __LP64__, without version check, as is
>> the case with SPARC and MIPS targets.
>
> When you compile GCC 4.8 with GCC 3.2 on Linux/x86-64,
> __LP64__ won't be defined and TRY_EMPTY_VM_SPACE
> will be 0x6000 instead of 0x10.

Looking at how other targets implement this check, I don't think that
this is a problem at all. This issue only shows up on a non-bootstrapped
build. A full bootstrap will use the correct address.

Uros.

Re: PATCH: Define TRY_EMPTY_VM_SPACE for Linux/x32

2012-04-04 Thread Andrew Pinski

On Wed, Apr 4, 2012 at 11:50 AM, Uros Bizjak  wrote:
> Looking at how other targets implement this check, I don't think that
> this is a problem at all. This issue only shows on a non-bootstrapped
> build. A full bootstrap will use correct address.

The other place where it shows up is cross compilers but who is going
to use a 3.2 compiler with GCC 4.8 anyways?

-- Pinski

Re: PATCH: Define TRY_EMPTY_VM_SPACE for Linux/x32

2012-04-04 Thread H.J. Lu

On Wed, Apr 4, 2012 at 11:53 AM, Andrew Pinski  wrote:
> On Wed, Apr 4, 2012 at 11:50 AM, Uros Bizjak  wrote:
>> Looking at how other targets implement this check, I don't think that
>> this is a problem at all. This issue only shows on a non-bootstrapped
>> build. A full bootstrap will use correct address.
>
> The other place where it shows up is cross compilers but who is going
> to use a 3.2 compiler with GCC 4.8 anyways?
>

FWIW, I have no problem with checking __LP64__ only. I just want to
mention this potential issue.

Thanks.

-- 
H.J.

Re: [PATCH] Fix PR18589

2012-04-04 Thread William J. Schmidt

On Wed, 2012-04-04 at 15:08 +0200, Richard Guenther wrote:
> On Wed, Apr 4, 2012 at 2:35 PM, William J. Schmidt
>  wrote:
> > On Wed, 2012-04-04 at 13:35 +0200, Richard Guenther wrote:
> >> On Tue, Apr 3, 2012 at 10:25 PM, William J. Schmidt
> >>  wrote:
> >> >
> >> >
> >> > On Wed, 2012-03-28 at 15:57 +0200, Richard Guenther wrote:
> >> >> On Tue, Mar 6, 2012 at 9:49 PM, William J. Schmidt
> >> >>  wrote:
> >> >> > Hi,
> >> >> >
> >> >> > This is a re-post of the patch I posted for comments in January to
> >> >> > address http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589.  The patch
> >> >> > modifies reassociation to expose repeated factors from __builtin_pow*
> >> >> > calls, optimally reassociate repeated factors, and possibly 
> >> >> > reconstitute
> >> >> > __builtin_powi calls from the results of reassociation.
> >> >> >
> >> >> > Bootstrapped and passes regression tests for powerpc64-linux-gnu.  I
> >> >> > expect there may need to be some small changes, but I am targeting 
> >> >> > this
> >> >> > for trunk approval.
> >> >> >
> >> >> > Thanks very much for the review,
> >> >>
> >> >> Hmm.  How much work would it be to extend the reassoc 'IL' to allow
> >> >> a repeat factor per op?  I realize what you do is all within what 
> >> >> reassoc
> >> >> already does though ideally we would not require any GIMPLE IL changes
> >> >> for building up / optimizing the reassoc IL but only do so when we 
> >> >> commit
> >> >> changes.
> >> >>
> >> >> Thanks,
> >> >> Richard.
> >> >
> >> > Hi Richard,
> >> >
> >> > I've revised my patch along these lines; see the new version below.
> >> > While testing it I realized I could do a better job of reducing the
> >> > number of multiplies, so there are some changes to that logic as well,
> >> > and a couple of additional test cases.  Regstrapped successfully on
> >> > powerpc64-linux.
> >> >
> >> > Hope this looks better!
> >>
> >> Yes indeed.  A few observations though.  You didn't integrate
> >> attempt_builtin_powi
> >> with optimize_ops_list - presumably because its result does not really fit
> >> the single-operation assumption?  But note that undistribute_ops_list and
> >> optimize_range_tests have the same issue.  Thus, I'd have preferred if
> >> attempt_builtin_powi worked in the same way, remove the parts of the
> >> ops list it consumed and stick an operand for its result there instead.
> >> That should simplify things (not having that special powi_result) and
> >> allow for multiple "powi results" in a single op list?
> >
> > Multiple powi results are already handled, but yes, what you're
> > suggesting would simplify things by eliminating the need to create
> > explicit multiplies to join them and the cached-multiply results
> > together.  Sounds reasonable on the surface; it just hadn't occurred to
> > me to do it this way.  I'll have a look.
> >
> > Any other major concerns while I'm reworking this?
> 
> No, the rest looks fine (you should not need to replace
> -fdump-tree-reassoc-details
> with -fdump-tree-reassoc1-details -fdump-tree-reassoc2-details in the first
> testcase).

Unfortunately this seems to be necessary if I name the two passes
"reassoc1" and "reassoc2".  If I try to name both of them "reassoc" I
get failures in other tests like gfortran.dg/reassoc_4, where
-fdump-tree-reassoc1 doesn't work.  Unless I'm missing something
obvious, I think I need to keep that change.

Frankly I was surprised and relieved that there weren't more tests that
used the generic -fdump-tree-reassoc.

Thanks,
Bill
> 
> Thanks,
> Richard.
> 
> > Thanks,
> > Bill
> >>
> >> Thanks,
> >> Richard.
> >>
> >
> >
>

Re: [wwwdocs] Buildstat update for 4.6

2012-04-04 Thread Gerald Pfeifer

On Tue, 3 Apr 2012, Tom G. Christensen wrote:
> Latest results for 4.6.x
> 
> -tgc
> 
> Testresults for 4.6.3:
>   i386-pc-solaris2.8 (2)
>   i386-pc-solaris2.10

Thanks, online now.

Gerald

Re: [wwwdocs] Buildstat update for 4.7

2012-04-04 Thread Gerald Pfeifer

On Tue, 3 Apr 2012, Tom G. Christensen wrote:
> First round of results for 4.7.x

Quite some.  On top of your patch, I applied the following to
fix two markup issues.

Thanks,
Gerald

Index: buildstat.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/buildstat.html,v
retrieving revision 1.2
diff -u -3 -p -r1.2 buildstat.html
--- buildstat.html  4 Apr 2012 20:04:20 -   1.2
+++ buildstat.html  4 Apr 2012 20:08:29 -
@@ -30,6 +30,7 @@ Installing GCC: Final Installation.<
 
 
 
+
 hppa2.0w-hp-hpux11.11
  
 Test results:
@@ -78,6 +79,7 @@ Installing GCC: Final Installation.<
 
 
 
+
 powerpc-apple-darwin8.11.0
  
 Test results:

[patch, committed, backport] Backport IA64 patch to 4.6 branch

2012-04-04 Thread Steve Ellcey


I had a request to backport this patch to the 4.6 branch and since it
is an obvious fix and hasn't caused any problems on the main line I have
gone ahead and checked it in.  I tested the patch on the 4.6 branch with
IA64 HP-UX and had no regressions.

FYI:  Friday will be my last day at HP but I will be starting at MIPS
  next week and working on GCC.  I will change my email address in
  the MAINTAINERS file after I start at MIPS.

Steve Ellcey
s...@cup.hp.com



2012-04-04  Steve Ellcey 

Backported from mainline.
* decl.c (cxx_init_decl_processing): Use ptr_mode instead of Pmode.


Index: decl.c
===
--- decl.c  (revision 186141)
+++ decl.c  (working copy)
@@ -3636,7 +3636,7 @@ cxx_init_decl_processing (void)
 TYPE_SIZE_UNIT (nullptr_type_node) = size_int (GET_MODE_SIZE (ptr_mode));
 TYPE_UNSIGNED (nullptr_type_node) = 1;
 TYPE_PRECISION (nullptr_type_node) = GET_MODE_BITSIZE (ptr_mode);
-SET_TYPE_MODE (nullptr_type_node, Pmode);
+SET_TYPE_MODE (nullptr_type_node, ptr_mode);
 record_builtin_type (RID_MAX, "decltype(nullptr)", nullptr_type_node);
 nullptr_node = build_int_cst (nullptr_type_node, 0);
   }

Re: remove wrong code in immed_double_const

2012-04-04 Thread Mike Stump

On Mar 26, 2012, at 4:57 PM, Mike Stump wrote:
> On Mar 26, 2012, at 1:03 PM, Richard Sandiford wrote:
>> I think:
>> 
>> ...copies of the top bit.  Note however that values are neither inherently
>> signed nor inherently unsigned; where necessary, signedness is determined
>> by the rtl operation instead.
> 
> Sounds good to me, changed.

Oh, review caught one last problem:

> +, however values are neither signed nor unsigned.
> +All operations defined on such constants define the signededness.

This was edit cruft from the last rewording for the doc; the cruft has been 
removed.

* doc/rtl.texi (const_double): Document as sign-extending.
* expmed.c (expand_mult): Ensure we don't use shift
incorrectly.
* emit-rtl.c (immed_double_int_const): Refine to state the
value is signed.
* simplify-rtx.c (mode_signbit_p): Add a fixme for wider than
CONST_DOUBLE integers.
(simplify_const_unary_operation, UNSIGNED_FLOAT): Ensure no
negative values are converted.  Fix conversions bigger than
HOST_BITS_PER_WIDE_INT.
(simplify_binary_operation_1): Ensure we don't use shift
incorrectly.
(simplify_immed_subreg): Sign-extend CONST_DOUBLEs.
* explow.c (plus_constant_mode): Add.
(plus_constant): Implement with plus_constant_mode.
* rtl.h (plus_constant_mode): Add.
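
The sign-extension convention referred to in the ChangeLog can be
sketched in plain C as follows.  This is only an illustration: int64_t
stands in for HOST_WIDE_INT, and the function names are invented, not
GCC internals.

```c
#include <stdint.h>

/* Sign-extend the low `bits' bits of VALUE to the full 64-bit width,
   the way a constant for a narrow mode is stored.  BITS must be in the
   range 1..64.  */
int64_t
sign_extend (uint64_t value, unsigned bits)
{
  if (bits >= 64)
    return (int64_t) value;

  uint64_t mask = (UINT64_C (1) << bits) - 1;
  uint64_t sign = UINT64_C (1) << (bits - 1);

  value &= mask;
  /* Flipping and then subtracting the sign bit propagates it upward.  */
  return (int64_t) ((value ^ sign) - sign);
}

/* For a two-word constant, the words above the stored high word are
   implied: all ones if the top bit of HIGH is set, all zeros otherwise.  */
int64_t
implied_upper_word (int64_t high)
{
  return high < 0 ? -1 : 0;
}
```

Whether the resulting bit pattern is then treated as signed or unsigned
is up to the operation that consumes it.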

Index: doc/rtl.texi
===
--- doc/rtl.texi(revision 186111)
+++ doc/rtl.texi(working copy)
@@ -1479,8 +1479,13 @@ This type of expression represents the i
 is customarily accessed with the macro @code{INTVAL} as in
 @code{INTVAL (@var{exp})}, which is equivalent to @code{XWINT (@var{exp}, 0)}.
 
-Constants generated for modes with fewer bits than @code{HOST_WIDE_INT}
-must be sign extended to full width (e.g., with @code{gen_int_mode}).
+Constants generated for modes with fewer bits than in
+@code{HOST_WIDE_INT} must be sign extended to full width (e.g., with
+@code{gen_int_mode}).  For constants for modes with more bits than in
+@code{HOST_WIDE_INT} the implied high order bits of that constant are
+copies of the top bit.  Note however that values are neither
+inherently signed nor inherently unsigned; where necessary, signedness
+is determined by the rtl operation instead.
 
 @findex const0_rtx
 @findex const1_rtx
@@ -1510,7 +1515,13 @@ Represents either a floating-point const
 integer constant too large to fit into @code{HOST_BITS_PER_WIDE_INT}
 bits but small enough to fit within twice that number of bits (GCC
 does not provide a mechanism to represent even larger constants).  In
-the latter case, @var{m} will be @code{VOIDmode}.
+the latter case, @var{m} will be @code{VOIDmode}.  For integral values
+constants for modes with more bits than twice the number in
+@code{HOST_WIDE_INT} the implied high order bits of that constant are
+copies of the top bit of @code{CONST_DOUBLE_HIGH}.  Note however that
+integral values are neither inherently signed nor inherently unsigned;
+where necessary, signedness is determined by the rtl operation
+instead.
 
 @findex CONST_DOUBLE_LOW
 If @var{m} is @code{VOIDmode}, the bits of the value are stored in
Index: expmed.c
===
--- expmed.c(revision 186111)
+++ expmed.c(working copy)
@@ -3139,8 +3139,10 @@ expand_mult (enum machine_mode mode, rtx
{
  int shift = floor_log2 (CONST_DOUBLE_HIGH (op1))
  + HOST_BITS_PER_WIDE_INT;
- return expand_shift (LSHIFT_EXPR, mode, op0,
-  shift, target, unsignedp);
+ if (shift < 2 * HOST_BITS_PER_WIDE_INT - 1
+ || GET_MODE_BITSIZE (mode) <= 2 * HOST_BITS_PER_WIDE_INT)
+   return expand_shift (LSHIFT_EXPR, mode, op0,
+shift, target, unsignedp);
}
}
 
Index: emit-rtl.c
===
--- emit-rtl.c  (revision 186111)
+++ emit-rtl.c  (working copy)
@@ -517,8 +517,11 @@ immed_double_int_const (double_int i, en
 
 /* Return a CONST_DOUBLE or CONST_INT for a value specified as a pair
of ints: I0 is the low-order word and I1 is the high-order word.
-   Do not use this routine for non-integer modes; convert to
-   REAL_VALUE_TYPE and use CONST_DOUBLE_FROM_REAL_VALUE.  */
+   For values that are larger than 2*HOST_BITS_PER_WIDE_INT, the
+   implied upper bits are copies of the high bit of i1.  The value
+   itself is neither signed nor unsigned.  Do not use this routine for
+   non-integer modes; convert to REAL_VALUE_TYPE and use
+   CONST_DOUBLE_FROM_REAL_VALUE.  */
 
 rtx
 immed_double_const (HOST_WIDE_INT i0, HOST_WIDE_INT i1, enum machine_mode mode)
@@ -531,10 +534,9 @@ immed_double_const (HOST_WIDE_INT i0, HO
 
  1) If GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_WIDE_INT, then we use

Re: [PATCH] Fix PR18589

2012-04-04 Thread William J. Schmidt

On Wed, 2012-04-04 at 13:35 +0200, Richard Guenther wrote:
> On Tue, Apr 3, 2012 at 10:25 PM, William J. Schmidt
>  wrote:
> >
> > Hi Richard,
> >
> > I've revised my patch along these lines; see the new version below.
> > While testing it I realized I could do a better job of reducing the
> > number of multiplies, so there are some changes to that logic as well,
> > and a couple of additional test cases.  Regstrapped successfully on
> > powerpc64-linux.
> >
> > Hope this looks better!
> 
> Yes indeed.  A few observations though.  You didn't integrate
> attempt_builtin_powi
> with optimize_ops_list - presumably because its result does not really fit
> the single-operation assumption?  But note that undistribute_ops_list and
> optimize_range_tests have the same issue.  Thus, I'd have preferred if
> attempt_builtin_powi worked in the same way, remove the parts of the
> ops list it consumed and stick an operand for its result there instead.
> That should simplify things (not having that special powi_result) and
> allow for multiple "powi results" in a single op list?

An excellent suggestion.  I've implemented it below and it is indeed
much cleaner this way.

Bootstrapped/regression tested with no new failures on powerpc64-linux.
Is this incarnation OK for trunk?

Thanks,
Bill
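
For readers new to the PR, a hand-written illustration of what sharing
repeated factors buys.  This is just one example expression; it does not
reproduce the algorithm in the patch, and the function names are made up.
Note that rewriting floating-point multiplies like this can change
rounding, so a compiler may only do it under relaxed math options.

```c
/* x*x*x*x*y*y evaluated left to right: five multiplies.  */
double
product_naive (double x, double y)
{
  return x * x * x * x * y * y;
}

/* The same value written as (x*x*y) squared: three multiplies, because
   the repeated factors x*x and y are shared.  */
double
product_shared (double x, double y)
{
  double t = x * x * y;
  return t * t;
}
```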

> 
> Thanks,
> Richard.
> 
> > Thanks,
> > Bill

gcc:

2012-04-04  Bill Schmidt  

PR tree-optimization/18589
* tree-pass.h: Replace pass_reassoc with pass_early_reassoc and
pass_late_reassoc.
* passes.c (init_optimization_passes): Change pass_reassoc calls to
pass_early_reassoc and pass_late_reassoc.
* tree-ssa-reassoc.c (reassociate_stats): Add two fields.
(operand_entry): Add count field.
(early_reassoc): New static var.
(add_repeat_to_ops_vec): New function.
(completely_remove_stmt): Likewise.
(remove_def_if_absorbed_call): Likewise.
(remove_visited_stmt_chain): Remove feeding builtin pow/powi calls.
(acceptable_pow_call): New function.
(linearize_expr_tree): Look for builtin pow/powi calls and add operand
entries with repeat counts when found.
(repeat_factor_d): New struct and associated typedefs.
(repeat_factor_vec): New static vector variable.
(compare_repeat_factors): New function.
(get_reassoc_pow_ssa_name): Likewise.
(attempt_builtin_powi): Likewise.
(reassociate_bb): Call attempt_builtin_powi.
(fini_reassoc): Two new calls to statistics_counter_event.
(execute_early_reassoc): New function.
(execute_late_reassoc): Likewise.
(pass_early_reassoc): Rename from pass_reassoc, call
execute_early_reassoc.
(pass_late_reassoc): New gimple_opt_pass that calls
execute_late_reassoc.

gcc/testsuite:

2012-04-04  Bill Schmidt  

PR tree-optimization/18589
* gcc.dg/pr46309.c: Change -fdump-tree-reassoc-details to
-fdump-tree-reassoc1-details and -fdump-tree-reassoc2-details.
* gcc.dg/tree-ssa/pr18589-1.c: New test.
* gcc.dg/tree-ssa/pr18589-2.c: Likewise.
* gcc.dg/tree-ssa/pr18589-3.c: Likewise.
* gcc.dg/tree-ssa/pr18589-4.c: Likewise.
* gcc.dg/tree-ssa/pr18589-5.c: Likewise.
* gcc.dg/tree-ssa/pr18589-6.c: Likewise.
* gcc.dg/tree-ssa/pr18589-7.c: Likewise.
* gcc.dg/tree-ssa/pr18589-8.c: Likewise.
* gcc.dg/tree-ssa/pr18589-9.c: Likewise.
* gcc.dg/tree-ssa/pr18589-10.c: Likewise.

Index: gcc/tree-pass.h
===
--- gcc/tree-pass.h (revision 186108)
+++ gcc/tree-pass.h (working copy)
@@ -441,7 +441,8 @@ extern struct gimple_opt_pass pass_copy_prop;
 extern struct gimple_opt_pass pass_vrp;
 extern struct gimple_opt_pass pass_uncprop;
 extern struct gimple_opt_pass pass_return_slot;
-extern struct gimple_opt_pass pass_reassoc;
+extern struct gimple_opt_pass pass_early_reassoc;
+extern struct gimple_opt_pass pass_late_reassoc;
 extern struct gimple_opt_pass pass_rebuild_cgraph_edges;
 extern struct gimple_opt_pass pass_remove_cgraph_callee_edges;
 extern struct gimple_opt_pass pass_build_cgraph_edges;
Index: gcc/testsuite/gcc.dg/pr46309.c
===
--- gcc/testsuite/gcc.dg/pr46309.c  (revision 186108)
+++ gcc/testsuite/gcc.dg/pr46309.c  (working copy)
@@ -1,6 +1,6 @@
 /* PR tree-optimization/46309 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-reassoc-details" } */
+/* { dg-options "-O2 -fdump-tree-reassoc1-details -fdump-tree-reassoc2-details" } */
 /* The transformation depends on BRANCH_COST being greater than 1
(see the notes in the PR), so try to force that.  */
 /* { dg-additional-options "-mtune=octeon2" { target mips*-*-* } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr18589-4.c
===
--- gcc/testsuite/gc

[PATCH, Android] MIPS support

2012-04-04 Thread Maxim Kuvyrkov

Chao,

Let's take discussion of MIPS changes to gcc-patches@.  Please follow up here.

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics



On 5/04/2012, at 10:10 AM, Fu, Chao-Ying wrote:

> Maxim Kuvyrkov wrote:
> 
>> I encourage you to submit the MIPS Android patches to 
>> gcc-patches@.  And, as long as your changes preserve the 
>> status quo of mips-*-* being big-endian by default and 
>> mipsel-*-* being little-endian by default, there should be no 
>> major obstacles to merge those in.
>> 
> 
>  For now, two MIPS changes in gnu-user.h and unwind-dw2-fde-dip.c can be
> posted for comment.
> (I haven't tested this patch, though.)
> After starting to build toolchains for Android with Bionic, we may find new
> files to patch.  Ex: Comment out getpagesize() for bionic.
> 
>  Any comment?  Thanks a lot!
> 
> Regards,
> Chao-ying
> 
> Index: gcc/gcc/config/mips/gnu-user.h
> ===
> --- gcc.orig/gcc/config/mips/gnu-user.h   2012-04-03 17:39:50.0 -0700
> +++ gcc/gcc/config/mips/gnu-user.h   2012-04-04 14:31:50.804236000 -0700
> @@ -45,8 +45,8 @@ along with GCC; see the file COPYING3.  
> /* A standard GNU/Linux mapping.  On most targets, it is included in
>CC1_SPEC itself by config/linux.h, but mips.h overrides CC1_SPEC
>and provides this hook instead.  */
> -#undef SUBTARGET_CC1_SPEC
> -#define SUBTARGET_CC1_SPEC "%{profile:-p}"
> +#undef GNU_USER_SUBTARGET_CC1_SPEC
> +#define GNU_USER_SUBTARGET_CC1_SPEC "%{profile:-p}"
> 
> /* -G is incompatible with -KPIC which is the default, so only allow objects
>in the small data section if the user explicitly asks for it.  */
> @@ -54,8 +54,8 @@ along with GCC; see the file COPYING3.  
> #define MIPS_DEFAULT_GVALUE 0
> 
> /* Borrowed from sparc/linux.h */
> -#undef LINK_SPEC
> -#define LINK_SPEC \
> +#undef GNU_USER_TARGET_LINK_SPEC
> +#define GNU_USER_TARGET_LINK_SPEC \
>  "%(endian_spec) \
>   %{shared:-shared} \
>   %{!shared: \
> @@ -89,8 +89,8 @@ along with GCC; see the file COPYING3.  
> #undef ASM_OUTPUT_REG_PUSH
> #undef ASM_OUTPUT_REG_POP
> 
> -#undef LIB_SPEC
> -#define LIB_SPEC "\
> +#undef GNU_USER_TARGET_LIB_SPEC
> +#define GNU_USER_TARGET_LIB_SPEC "\
> %{pthread:-lpthread} \
> %{shared:-lc} \
> %{!shared: \
> @@ -133,7 +133,34 @@ extern const char *host_detect_local_cpu
>   LINUX_DRIVER_SELF_SPECS
> 
> /* Similar to standard Linux, but adding -ffast-math support.  */
> -#undef  ENDFILE_SPEC
> -#define ENDFILE_SPEC \
> +#undef  GNU_USER_TARGET_ENDFILE_SPEC
> +#define GNU_USER_TARGET_ENDFILE_SPEC \
>   "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s} \
>%{shared|pie:crtendS.o%s;:crtend.o%s} crtn.o%s"
> +
> +#undef  LINK_SPEC
> +#define LINK_SPEC\
> +  LINUX_OR_ANDROID_LD (GNU_USER_TARGET_LINK_SPEC,\
> +GNU_USER_TARGET_LINK_SPEC " " ANDROID_LINK_SPEC)
> +
> +#undef  SUBTARGET_CC1_SPEC
> +#define SUBTARGET_CC1_SPEC   \
> +  LINUX_OR_ANDROID_CC (GNU_USER_SUBTARGET_CC1_SPEC,  \
> +GNU_USER_SUBTARGET_CC1_SPEC " " ANDROID_CC1_SPEC)
> +
> +#undef  CC1PLUS_SPEC
> +#define CC1PLUS_SPEC \
> +  LINUX_OR_ANDROID_CC ("", ANDROID_CC1PLUS_SPEC)
> +
> +#undef  LIB_SPEC
> +#define LIB_SPEC \
> +  LINUX_OR_ANDROID_LD (GNU_USER_TARGET_LIB_SPEC, \
> +GNU_USER_TARGET_LIB_SPEC " " ANDROID_LIB_SPEC)
> +
> +#undef  STARTFILE_SPEC
> +#define STARTFILE_SPEC \
> +  LINUX_OR_ANDROID_LD (GNU_USER_TARGET_STARTFILE_SPEC, ANDROID_STARTFILE_SPEC)
> +
> +#undef  ENDFILE_SPEC
> +#define ENDFILE_SPEC \
> +  LINUX_OR_ANDROID_LD (GNU_USER_TARGET_ENDFILE_SPEC, ANDROID_ENDFILE_SPEC)
> Index: gcc/libgcc/unwind-dw2-fde-dip.c
> ===
> --- gcc.orig/libgcc/unwind-dw2-fde-dip.c  2012-04-03 17:07:28.0 -0700
> +++ gcc/libgcc/unwind-dw2-fde-dip.c   2012-04-04 14:51:01.338074000 -0700
> @@ -48,8 +48,9 @@
> #include "gthr.h"
> 
> #if !defined(inhibit_libc) && defined(HAVE_LD_EH_FRAME_HDR) \
> -&& (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ > 2) \
> - || (__GLIBC__ == 2 && __GLIBC_MINOR__ == 2 && defined(DT_CONFIG)))
> +&& ((defined(__BIONIC__) && (defined(mips) || defined(__mips__))) \
> +|| (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ > 2) \
> + || (__GLIBC__ == 2 && __GLIBC_MINOR__ == 2 && defined(DT_CONFIG))))
> # define USE_PT_GNU_EH_FRAME
> #endif
>
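
As an illustration of how the quoted gnu-user.h changes compose spec strings,
here is a minimal C sketch of the string-literal concatenation involved.  All
SKETCH_* names and the spec contents are hypothetical stand-ins, and the
selector shown is an assumption about the general shape, not the definition of
LINUX_OR_ANDROID_LD.  It relies only on the documented "%{S:X; :D}" spec form.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the GNU/Linux and Android library specs.  */
#define SKETCH_LINUX_LIB_SPEC   "%{pthread:-lpthread} -lc"
#define SKETCH_ANDROID_LIB_SPEC "%{!static:-ldl}"

/* Hypothetical selector: use ANDROID when -mandroid is given, LINUX
   otherwise.  Adjacent string literals are concatenated by the compiler.  */
#define SKETCH_LINUX_OR_ANDROID(LINUX, ANDROID) \
  "%{mandroid:" ANDROID ";:" LINUX "}"

/* Mirrors the shape of the LIB_SPEC definition in the patch: the Android
   variant is the Linux spec plus the Android additions.  */
static const char sketch_lib_spec[] =
  SKETCH_LINUX_OR_ANDROID (SKETCH_LINUX_LIB_SPEC,
                           SKETCH_LINUX_LIB_SPEC " " SKETCH_ANDROID_LIB_SPEC);

/* Return nonzero if the composed spec contains NEEDLE.  */
static int
sketch_lib_spec_has (const char *needle)
{
  assert (needle != NULL);
  return strstr (sketch_lib_spec, needle) != NULL;
}
```

The point of the renaming in the patch is visible here: the plain GNU/Linux
spec has to exist under its own name so it can appear in both arms of the
selector.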

Re: [PATCH] ARM: Use different linker path for hardfloat ABI

2012-04-04 Thread Michael Hope

On 4 April 2012 18:54, Jakub Jelinek  wrote:
> On Wed, Apr 04, 2012 at 01:34:30PM +1200, Michael Hope wrote:
>> >  I did two ports of Mandriva to armv7. One of my choice to use softfp,
>> > and another hardfp port to be compatible with other distros. But other
>> > than a previous armv5 port, there is not much else of Mandriva arm,
>> > so, it would be "good to have" to be able to run binaries for either
>> > without resorting to a chroot, and only testing purposes.
>> >
>> >  Bumping major or calling it ld-linux-foo.so.3 is out of question?
>>
>> I suspect /lib/ld-linux-$foo.so.3 would be fine.  There's two
>> questions here though: can the hard float loader have a different path
>> and, if so, what should it be?  We're still working on the first part.
>
> If the agreement is that arm 32-bit softfp really needs to be installable
> alongside 32-bit hardfp (and alongside aarch64), then IMHO it should do it
> like all other multilib ports (x86_64/i?86/x32, s390/s390x, ppc/ppc64, the
> various MIPS variants) and what FSB says, e.g. use
> /lib/ld-linux.so.3 and */lib dirs for softfp,
> /libhf/ld-linux.so.3 and */libhf dirs for hardfp and
> /lib64/ld-linux.so.3 and */lib64 dirs for aarch64, have 32-bit
> arm-linux-gnueabi gcc configured for softfp/hardfp multilib with
> MULTILIB_OSDIRNAMES, etc., have it configured in glibc

OK.  This gives a different path for the hard float loader and lets
the Debian guys add on top of that.  I'll ping them and see what they
think.

> and for those that
> choose the Debian layout instead, if it is added somehow configurable into
> upstream gcc/glibc of course handle it similarly there.

Agreed.

> I just wonder why that hasn't been done 10 years ago and only needs doing now

FPUs have only become common on ARM in the last few years.  softfp was
a good interim workaround but performance is significantly better
with hard float.

> (of course, aarch64 is going to be new, talking now about the 32-bit softfp 
> vs. hardfp).

Yip.  I assume something like /lib64 to stay consistent with other
architectures.  aarch64 is hard float only.

-- Michael

Re: [PATCH] ARM: Use different linker path for hardfloat ABI

2012-04-04 Thread Michael Hope

On 4 April 2012 21:06, Joseph S. Myers  wrote:
> On Wed, 4 Apr 2012, Michael Hope wrote:
>
>> The tricky one is new GCC with old GLIBC.  GCC may have to do a
>> configure time test and fall back to /lib/ld-linux.so.3 if the hard
>> float loader is missing.
>
> I don't think that's appropriate for ABI issues.  If a different dynamic
> linker name is specified, GCC should use it unconditionally (and require
> new enough glibc or a glibc installation that was appropriately
> rearranged).

OK.  I want GCC 4.7.1 to use the new path.  Does this mean that
released versions of GLIBC and GCC 4.7.1 will be incompatible, or does
GLIBC pick the path up from GCC?

>> > I have no idea whether shlib-versions files naming a file in a
>> > subdirectory will work - but if not, you'd need to send a patch to
>> > libc-alpha to support dynamic linkers in subdirectories, with appropriate
>> > justification for why you are doing something different from all other
>> > architectures.
>>
>> Understood.  For now this is just a path.  There's more infrastructure
>> work needed if the path includes a directory.
>
> Formally it's just a path - but an important feature of GNU/Linux and the
> GNU toolchain is consistency between different architectures and existing
> upstream practice is that the dynamic linker is always in the same
> directory as the other associated libraries and that this has the form
> /lib.  In the absence of a compelling reason, which I have not
> seen stated, to do otherwise for a single case, I think that existing
> practice should be followed with the dynamic linker being in a directory
> such as /libhf.

OK.  This matches Jakub's email.

> The "more infrastructure work needed" makes clear that you need libc-alpha
> buy-in *before* putting any patches into GCC or ports.

OK.  I'm glad we had this discussion as it had to start somewhere.
I'll do a follow up across gcc-patches, libc-alpha, and binutils.

> But maybe if you
> don't try to put the dynamic linker in a different directory from the
> other libraries, it's easier to support via existing mechanisms (setting
> slibdir differently if --enable-multiarch-directories or similar)?

OK.  /libhf may fit that better.

>> Do the MIPS or PowerPC loaders detect the ABI and change the library
>> path based on that?  I couldn't tell from the code.
>
> No, they don't detect the ABI.  Both ABIs (and, for Power, the e500v1 and
> e500v2 variants - compatible with soft-float at the function-calling level
> but with some glibc ABI differences with soft-float and with each other)
> use the same directories.

Sorry, I'm confused.  I had a poke about with MIPS and it uses
different argument registers for soft and hard float.  Soft float uses
$4 and hard float $f0.  Are there shims or similar installed by the
loader?

>> > (e) Existing practice for cases that do use different dynamic linkers is
>> > to use a separate library directory, not just dynamic linker name, as in
>> > lib32 and lib64 for MIPS or libx32 for x32; it's certainly a lot easier to
>> > make two sets of libraries work in parallel if you have separate library
>> > directories like that.
>>
>> Is this required, or should it be left to the distro to choose?  Once
>> the loader is in control then it can account for any distro specific
>> features, which may be the standard /lib and /usr/lib for single ABI
>> distros like Fedora or /usr/lib/$tuple for multiarch distros like
>> Ubuntu and Debian.
>
> I thought Fedora used the standard upstream /lib64 on x86_64 and so would
> naturally use a standard upstream /libhf where appropriate.

Good.  Dennis said the same.

>> > So it would seem more appropriate to define a directory libhf for ARM 
>> > (meaning you need a binutils patch as well to
>> > handle that directory, I think)
>>
>> I'd like to leave that discussion for now.  The Debian goal is to
>> support incompatible ABIs and, past that, incompatible architectures.
>> libhf is ambiguous as you could have a MIPS hard float library
>> installed on the same system as an ARM hard float library.
>
> If you want both ARM and MIPS hard-float then I'd think you want both
> big-endian and little-endian ARM hard-float - but your patch defines the
> same dynamic linker name for both of those.

Big endian is extremely uncommon on ARM and I'd rather define it when
needed.  For strictness' sake I'll change the patch to use the new path
for hard float little endian only.

So:
 * Big endian: undefined, defaults to /lib/ld-linux.so.3
 * Little endian, soft float: /lib/ld-linux.so.3
 * Little endian, hard float: /libhf/ld-linux.so.3
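
A minimal C sketch of that mapping, for illustration only.  The enum, the
SKETCH_* macros and sketch_dynamic_linker are hypothetical names.  The paths
are the ones proposed at this point in the thread, not a settled decision, and
treating softfp like soft float is an assumption based on the two sharing a
calling convention.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_LOADER_SOFT "/lib/ld-linux.so.3"
#define SKETCH_LOADER_HARD "/libhf/ld-linux.so.3"

enum sketch_float_abi { SKETCH_SOFT, SKETCH_SOFTFP, SKETCH_HARD };

/* Pick the dynamic linker path for an ABI, following the list above:
   only little-endian hard float gets the new path.  */
static const char *
sketch_dynamic_linker (enum sketch_float_abi abi, int big_endian)
{
  assert (abi == SKETCH_SOFT || abi == SKETCH_SOFTFP || abi == SKETCH_HARD);
  if (abi == SKETCH_HARD && !big_endian)
    return SKETCH_LOADER_HARD;
  return SKETCH_LOADER_SOFT;
}

/* Two little-endian ABIs can be installed side by side only if they do
   not share a loader path.  */
static int
sketch_can_coexist (enum sketch_float_abi a, enum sketch_float_abi b)
{
  return strcmp (sketch_dynamic_linker (a, 0),
                 sketch_dynamic_linker (b, 0)) != 0;
}
```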

> Standard upstream practice supports having multiple variants that
> plausibly run on the same system at the same time, such as /lib and
> /lib64, and it seems reasonable to support hard and soft float variants
> that way via a directory such as /libhf.  The Debian-style paths are not
> the default on any other architecture and I don't think it's appropriate
> to make them the default for this particu

[Patch, i386] Avoid LCP stalls (issue5975045)

2012-04-04 Thread Teresa Johnson

New patch to avoid LCP stalls based on feedback from earlier patch. I modified
H.J.'s old patch to perform the peephole2 to split immediate moves to HImode
memory. This is now enabled for Core2, Corei7 and Generic.

I verified that this enables the splitting to occur in the case that originally
motivated the optimization. If we subsequently find situations where LCP stalls
are hurting performance but an extra register is required to perform the
splitting, then we can revisit whether this should be performed earlier.

I also measured SPEC 2000/2006 performance using Generic64 on an AMD Opteron
and the results were neutral.

Bootstrapped and tested on x86_64-unknown-linux-gnu. Is this ok for trunk?

Thanks,
Teresa

2012-04-04   Teresa Johnson  

* config/i386/i386.h (ix86_tune_indices): Add
X86_TUNE_LCP_STALL.
* config/i386/i386.md (move immediate to memory peephole2):
Add cases for HImode move when LCP stall avoidance is needed.
* config/i386/i386.c (initial_ix86_tune_features): Initialize
X86_TUNE_LCP_STALL entry.

Index: config/i386/i386.h
===
--- config/i386/i386.h  (revision 185920)
+++ config/i386/i386.h  (working copy)
@@ -262,6 +262,7 @@ enum ix86_tune_indices {
   X86_TUNE_MOVX,
   X86_TUNE_PARTIAL_REG_STALL,
   X86_TUNE_PARTIAL_FLAG_REG_STALL,
+  X86_TUNE_LCP_STALL,
   X86_TUNE_USE_HIMODE_FIOP,
   X86_TUNE_USE_SIMODE_FIOP,
   X86_TUNE_USE_MOV0,
@@ -340,6 +341,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_L
 #define TARGET_PARTIAL_REG_STALL ix86_tune_features[X86_TUNE_PARTIAL_REG_STALL]
 #define TARGET_PARTIAL_FLAG_REG_STALL \
ix86_tune_features[X86_TUNE_PARTIAL_FLAG_REG_STALL]
+#define TARGET_LCP_STALL \
+   ix86_tune_features[X86_TUNE_LCP_STALL]
 #define TARGET_USE_HIMODE_FIOP ix86_tune_features[X86_TUNE_USE_HIMODE_FIOP]
 #define TARGET_USE_SIMODE_FIOP ix86_tune_features[X86_TUNE_USE_SIMODE_FIOP]
 #define TARGET_USE_MOV0ix86_tune_features[X86_TUNE_USE_MOV0]
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 185920)
+++ config/i386/i386.md (working copy)
@@ -16977,9 +16977,11 @@
(set (match_operand:SWI124 0 "memory_operand")
 (const_int 0))]
   "optimize_insn_for_speed_p ()
-   && !TARGET_USE_MOV0
-   && TARGET_SPLIT_LONG_MOVES
-   && get_attr_length (insn) >= ix86_cur_cost ()->large_insn
+   && ((TARGET_LCP_STALL
+   && GET_MODE (operands[0]) == HImode)
+   || (!TARGET_USE_MOV0
+  && TARGET_SPLIT_LONG_MOVES
+  && get_attr_length (insn) >= ix86_cur_cost ()->large_insn))
&& peep2_regno_dead_p (0, FLAGS_REG)"
   [(parallel [(set (match_dup 2) (const_int 0))
  (clobber (reg:CC FLAGS_REG))])
@@ -16991,8 +16993,10 @@
(set (match_operand:SWI124 0 "memory_operand")
 (match_operand:SWI124 1 "immediate_operand"))]
   "optimize_insn_for_speed_p ()
-   && TARGET_SPLIT_LONG_MOVES
-   && get_attr_length (insn) >= ix86_cur_cost ()->large_insn"
+   && ((TARGET_LCP_STALL
+   && GET_MODE (operands[0]) == HImode)
+   || (TARGET_SPLIT_LONG_MOVES
+  && get_attr_length (insn) >= ix86_cur_cost ()->large_insn))"
   [(set (match_dup 2) (match_dup 1))
(set (match_dup 0) (match_dup 2))])
 
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 185920)
+++ config/i386/i386.c  (working copy)
@@ -1964,6 +1964,10 @@ static unsigned int initial_ix86_tune_features[X86
   /* X86_TUNE_PARTIAL_FLAG_REG_STALL */
   m_CORE2I7 | m_GENERIC,
 
+  /* X86_TUNE_LCP_STALL: Avoid an expensive length-changing prefix stall
+   * on 16-bit immediate moves into memory on Core2 and Corei7.  */
+  m_CORE2I7 | m_GENERIC,
+
   /* X86_TUNE_USE_HIMODE_FIOP */
   m_386 | m_486 | m_K6_GEODE,
 

--
This patch is available for review at http://codereview.appspot.com/5975045
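
To make the new peephole2 conditions easier to read, here is a minimal C
sketch of the two predicates as plain boolean functions.  The struct, enum and
function names are hypothetical.  The sketch deliberately leaves out the
optimize_insn_for_speed_p () test and the FLAGS_REG liveness check that the
real patterns also require.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_mode { SKETCH_QI, SKETCH_HI, SKETCH_SI };

/* Hypothetical bundle of the tuning flags the conditions consult.  */
struct sketch_tuning
{
  bool lcp_stall;         /* stands in for TARGET_LCP_STALL */
  bool split_long_moves;  /* stands in for TARGET_SPLIT_LONG_MOVES */
  bool use_mov0;          /* stands in for TARGET_USE_MOV0 */
  int large_insn;         /* stands in for ix86_cur_cost ()->large_insn */
};

/* Store of constant zero to memory: split if it is a 16-bit store on an
   LCP-sensitive tuning, or if the old long-move rule applies.  */
static bool
sketch_split_store_zero (const struct sketch_tuning *t,
                         enum sketch_mode mode, int insn_length)
{
  assert (t != 0);
  return (t->lcp_stall && mode == SKETCH_HI)
         || (!t->use_mov0
             && t->split_long_moves
             && insn_length >= t->large_insn);
}

/* Store of an arbitrary immediate to memory: same shape, without the
   MOV0 test.  */
static bool
sketch_split_store_imm (const struct sketch_tuning *t,
                        enum sketch_mode mode, int insn_length)
{
  assert (t != 0);
  return (t->lcp_stall && mode == SKETCH_HI)
         || (t->split_long_moves && insn_length >= t->large_insn);
}
```

The first arm is what the patch adds: HImode stores are split regardless of
instruction length, while other modes still go through the length test.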

Re: [PATCH] ARM: Use different linker path for hardfloat ABI

2012-04-04 Thread Joseph S. Myers

On Thu, 5 Apr 2012, Michael Hope wrote:

> > I don't think that's appropriate for ABI issues.  If a different dynamic
> > linker name is specified, GCC should use it unconditionally (and require
> > new enough glibc or a glibc installation that was appropriately
> > rearranged).
> 
> OK.  I want GCC 4.7.1 to use the new path.  Does this mean that
> released versions of GLIBC and GCC 4.7.1 will be incompatible, or does
> GLIBC pick the path up from GCC?

Released versions would be incompatible (you could make GCC check at 
configure time for too-old glibc if --with-float=hard); the path needs 
hardcoding in both places.

> >> Do the MIPS or PowerPC loaders detect the ABI and change the library
> >> path based on that?  I couldn't tell from the code.
> >
> > No, they don't detect the ABI.  Both ABIs (and, for Power, the e500v1 and
> > e500v2 variants - compatible with soft-float at the function-calling level
> > but with some glibc ABI differences with soft-float and with each other)
> > use the same directories.
> 
> Sorry, I'm confused.  I had a poke about with MIPS and it uses
> different argument registers for soft and hard float.  Soft float uses
> $4 and hard float $f0.  Are there shims or similar installed by the
> loader?

No.  A system is either purely hard-float or purely soft-float, and the 
same paths are used for both so they can't coexist.  (Mismatches at 
*static* link time are detected through object attributes.)

> Big endian is extremely uncommon on ARM and I'd rather define it when
> needed.  For strictness' sake I'll change the patch to use the new path
> for hard float little endian only.

I don't think that's correct; the new path should be used independent of 
endian, just as the existing path is.  But any multiarch support patch 
should presumably define separate multiarch paths for each endianness.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [Patch, i386] Avoid LCP stalls (issue5975045)

2012-04-04 Thread H.J. Lu

On Wed, Apr 4, 2012 at 5:07 PM, Teresa Johnson  wrote:
> New patch to avoid LCP stalls based on feedback from earlier patch. I modified
> H.J.'s old patch to perform the peephole2 to split immediate moves to HImode
> memory. This is now enabled for Core2, Corei7 and Generic.
>
> I verified that this enables the splitting to occur in the case that 
> originally
> motivated the optimization. If we subsequently find situations where LCP 
> stalls
> are hurting performance but an extra register is required to perform the
> splitting, then we can revisit whether this should be performed earlier.
>
> I also measured SPEC 2000/2006 performance using Generic64 on an AMD Opteron
> and the results were neutral.
>

What are the performance impacts on Core i7? I didn't notice any significant
changes when I worked on it for Core 2.

Thanks.

-- 
H.J.

Re: [PATCH] ARM: Use different linker path for hardfloat ABI

2012-04-04 Thread Michael Hope

On 5 April 2012 12:07, Joseph S. Myers  wrote:
> On Thu, 5 Apr 2012, Michael Hope wrote:
>
>> > I don't think that's appropriate for ABI issues.  If a different dynamic
>> > linker name is specified, GCC should use it unconditionally (and require
>> > new enough glibc or a glibc installation that was appropriately
>> > rearranged).
>>
>> OK.  I want GCC 4.7.1 to use the new path.  Does this mean that
>> released versions of GLIBC and GCC 4.7.1 will be incompatible, or does
>> GLIBC pick the path up from GCC?
>
> Released versions would be incompatible (you could make GCC check at
> configure time for too-old glibc if --with-float=hard); the path needs
> hardcoding in both places.
>
>> >> Do the MIPS or PowerPC loaders detect the ABI and change the library
>> >> path based on that?  I couldn't tell from the code.
>> >
>> > No, they don't detect the ABI.  Both ABIs (and, for Power, the e500v1 and
>> > e500v2 variants - compatible with soft-float at the function-calling level
>> > but with some glibc ABI differences with soft-float and with each other)
>> > use the same directories.
>>
>> Sorry, I'm confused.  I had a poke about with MIPS and it uses
>> different argument registers for soft and hard float.  Soft float uses
>> $4 and hard float $f0.  Are there shims or similar installed by the
>> loader?
>
> No.  A system is either purely hard-float or purely soft-float, and the
> same paths are used for both so they can't coexist.  (Mismatches at
> *static* link time are detected through object attributes.)

Ah, the same as ARM then.  The MIPS community would need something
similar to this patch if they wanted to support soft and hard float
side by side.

>> Big endian is extremely uncommon on ARM and I'd rather define it when
>> needed.  For strictness' sake I'll change the patch to use the new path
>> for hard float little endian only.
>
> I don't think that's correct; the new path should be used independent of
> endian, just as the existing path is.

OK.

> But any multiarch support patch
> should presumably define separate multiarch paths for each endianness.

That's up to Debian.  I've asked for documentation on the final tuples
and what they mean as the one at:
 http://wiki.debian.org/Multiarch/Tuples

is out of date.  I prefer defining what is needed now and doing others
as needed.

-- Michael

Re: [PATCH] ARM: Use different linker path for hardfloat ABI

2012-04-04 Thread dann frazier

On Wed, Apr 04, 2012 at 02:39:58PM +1200, Michael Hope wrote:
> On 4 April 2012 10:56, Joseph S. Myers  wrote:
> > On Tue, 3 Apr 2012, Michael Hope wrote:
> >
> >> +#define GLIBC_DYNAMIC_LINKER \
> >> +   "%{mhard-float:" GLIBC_DYNAMIC_LINKER_HARD_FLOAT "} \
> >> +    %{mfloat-abi=hard:" GLIBC_DYNAMIC_LINKER_HARD_FLOAT "} \
> >> +    %{!mfloat-abi=hard:%{!mhard-float:" GLIBC_DYNAMIC_LINKER_SOFT_FLOAT "}}"
> >
> > (a) -mhard-float is a .opt Alias for -mfloat-abi=hard so does not need to
> > be handled in specs.
> 
> Fixed.
> 
> > (b) You need to handle compilers configured with --with-float=hard, so
> > make the specs depend on the default ABI the compiler was configured with.
> 
> GCC seems to take configure time options into account when evaluating
> a spec file.
> 
> Tested by building a default, --with-float=hard, and
> --with-float=softfp compiler then checking the loader path for all
> combinations of {,-mglibc,-mbionic,-muclibc} x
> {,-mhard-float,-msoft-float,-mfloat-abi=hard,-mfloat-abi=softfp}.
> 
> > (c) Please include libc-ports on future submissions and provide both the
> > GCC patch and the glibc ports patch that have been tested to work together
> > to build and install the library in the given path; a patch to one
> > component like this cannot sensibly be considered in isolation.  I imagine
> > you'll need appropriate ARM preconfigure support to detect what ABI the
> > compiler is using, much like the support for MIPS, so that the right
> > shlib-versions files are used.
> 
> Agreed.
> 
> >  I try to follow all ARM glibc discussions
> > on libc-ports closely, as the ARM glibc maintainer; was there a previous
> > discussion of the dynamic linker naming issue there that I missed?
> 
> Steve McIntyre is driving this inside Debian.  I'll ping him on the
> GLIBC support.
> 
> The tricky one is new GCC with old GLIBC.  GCC may have to do a
> configure time test and fall back to /lib/ld-linux.so.3 if the hard
> float loader is missing.
> 
> >  (The only previous relevant discussion that I recall is one on
> > patc...@eglibc.org starting at
> > , regarding how the
> > dynamic linker should check that a library has the right ABI, and there
> > was no real followup on that after I indicated what would seem to be the
> > appropriate implementation approaches and places for subsequent
> > discussion.)
> 
> The patch above changes the loader to catch a mixed installation and
> reject mixing incompatible libraries.  The static linker does this
> currently but it's not essential.
> 
> > I have no idea whether shlib-versions files naming a file in a
> > subdirectory will work - but if not, you'd need to send a patch to
> > libc-alpha to support dynamic linkers in subdirectories, with appropriate
> > justification for why you are doing something different from all other
> > architectures.
> 
> Understood.  For now this is just a path.  There's more infrastructure
> work needed if the path includes a directory.
> 
> > (d) Existing practice for Power Architecture and MIPS at least is that
> > hard-float and soft-float *don't* use different library directories /
> > dynamic linkers.
> 
> The goal is to have a standard loader path for all hard float distros
> and, similar to how you can have a mixed 32/64 bit installation, allow
> mixed softfp/hard float installations for distros that want it.  This
> is a new requirement and ARM is the first one exposed to it.  I assume
> Debian would push for similar changes on MIPS and PowerPC.
> 
> Do the MIPS or PowerPC loaders detect the ABI and change the library
> path based on that?  I couldn't tell from the code.
> 
> > (e) Existing practice for cases that do use different dynamic linkers is
> > to use a separate library directory, not just dynamic linker name, as in
> > lib32 and lib64 for MIPS or libx32 for x32; it's certainly a lot easier to
> > make two sets of libraries work in parallel if you have separate library
> > directories like that.
> 
> Is this required, or should it be left to the distro to choose?  Once
> the loader is in control then it can account for any distro specific
> features, which may be the standard /lib and /usr/lib for single ABI
> distros like Fedora or /usr/lib/$tuple for multiarch distros like
> Ubuntu and Debian.
> 
> > So it would seem more appropriate to define a directory libhf for ARM 
> > (meaning you need a binutils patch as well to
> > handle that directory, I think)
> 
> I'd like to leave that discussion for now.  The Debian goal is to
> support incompatible ABIs and, past that, incompatible architectures.
> libhf is ambiguous as you could have a MIPS hard float library
> installed on the same system as an ARM hard float library.
> 
> > and these different Debian-style names
> > could be implemented separately in a multiarch patch if someone submits
> > one that properly accounts for my review comments on previous patch
> > versions (failure to produce such a fixed pa

[PATCH] Fix PR52614

2012-04-04 Thread William J. Schmidt

There seems to be tacit agreement that the vector tests should use
-fno-common on all targets to avoid the recent spate of failures (see
discussion in 52571 and 52603).  This patch (proposed by Dominique
D'Humieures) does just that.  I agreed to shepherd the patch through.
I've verified that it removes the failures for powerpc64-linux.  Various
others have verified for arm, sparc, and darwin.  OK for trunk?

Thanks,
Bill
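
For context, a minimal C sketch of the kind of code these tests contain.  It
is not taken from the testsuite, and all sketch_* names are hypothetical.  The
two arrays are file-scope definitions without initializers.  Under -fcommon
such variables may be emitted as common symbols, while -fno-common places them
in the object's own data section.  That difference can plausibly affect what
the vectorizer may assume about their alignment; the exact failure mode is
described in the PRs referenced above.

```c
#include <assert.h>

#define SKETCH_N 64

/* Tentative definitions: candidates for common symbols under -fcommon.  */
int sketch_a[SKETCH_N];
int sketch_b[SKETCH_N];

/* Fill sketch_b, then run a simple loop of the kind a vectorizer
   targets; return the sum of the results.  */
static int
sketch_fill_and_sum (int n)
{
  int sum = 0;

  assert (n >= 0 && n <= SKETCH_N);
  for (int i = 0; i < n; i++)
    sketch_b[i] = i;
  for (int i = 0; i < n; i++)
    {
      sketch_a[i] = sketch_b[i] * 2;
      sum += sketch_a[i];
    }
  return sum;
}
```

The program computes the same values with either flag; only the symbol
placement, and with it the vectorizer's view of the arrays, is expected to
change.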


gcc/testsuite:

2012-04-04  Bill Schmidt  
Dominique D'Humieures 

PR testsuite/52614
* gcc.dg/vect/vect.exp: Use -fno-common on all targets.
* gcc.dg/vect/costmodel/ppc/ppc-costmodel-vect.exp: Likewise.


Index: gcc/testsuite/gcc.dg/vect/costmodel/ppc/ppc-costmodel-vect.exp
===
--- gcc/testsuite/gcc.dg/vect/costmodel/ppc/ppc-costmodel-vect.exp  (revision 186108)
+++ gcc/testsuite/gcc.dg/vect/costmodel/ppc/ppc-costmodel-vect.exp  (working copy)
@@ -34,7 +34,7 @@ if ![is-effective-target powerpc_altivec_ok] {
 set DEFAULT_VECTCFLAGS ""
 
 # These flags are used for all targets.
-lappend DEFAULT_VECTCFLAGS "-O2" "-ftree-vectorize" "-fvect-cost-model"
+lappend DEFAULT_VECTCFLAGS "-O2" "-ftree-vectorize" "-fvect-cost-model" "-fno-common"
 
 # If the target system supports vector instructions, the default action
 # for a test is 'run', otherwise it's 'compile'.  Save current default.
Index: gcc/testsuite/gcc.dg/vect/vect.exp
===
--- gcc/testsuite/gcc.dg/vect/vect.exp  (revision 186108)
+++ gcc/testsuite/gcc.dg/vect/vect.exp  (working copy)
@@ -40,7 +40,7 @@ if ![check_vect_support_and_set_flags] {
 }
 
 # These flags are used for all targets.
-lappend DEFAULT_VECTCFLAGS "-ftree-vectorize" "-fno-vect-cost-model"
+lappend DEFAULT_VECTCFLAGS "-ftree-vectorize" "-fno-vect-cost-model" "-fno-common"
 
 # Initialize `dg'.
 dg-init

[branches/google/gcc-4_6] Backported r179661 and 179662 from mainline. (issue 5989043)

2012-04-04 Thread asharif


Reviewers: Diego Novillo, jingyu, davidxl,

Message:
Please take a look at this patch and tell me if it's OK for
branches/google/gcc-4_6.

Description:
Backported the following patch from trunk:

2011-10-07  Andrew Stubbs  

gcc/
* config/arm/predicates.md (shift_amount_operand): Remove constant
range check.
(shift_operator): Check range of constants for all shift operators.

gcc/testsuite/
* gcc.dg/pr50193-1.c: New file.
* gcc.target/arm/shiftable.c: New file.


Please review this at http://codereview.appspot.com/5989043/

Affected files:
   M.
  M gcc/ChangeLog
  M gcc/ChangeLog.google-4_6
  M gcc/config/arm/predicates.md
  M gcc/testsuite/ChangeLog
  A  +  gcc/testsuite/gcc.dg/pr50193-1.c
  A  +  gcc/testsuite/gcc.target/arm/shiftable.c

Re: [PATCH] Fix PR52614

2012-04-04 Thread Mike Stump

On Apr 4, 2012, at 7:56 PM, William J. Schmidt wrote:
> There seems to be tacit agreement that the vector tests should use
> -fno-common on all targets to avoid the recent spate of failures (see
> discussion in 52571 and 52603).

> OK for trunk?

Ok.  Any other solution, I think, will be real work, and we shouldn't lose the
testing between now and then by not having the test cases working.

Re: [Patch, i386] Avoid LCP stalls (issue 5975045)

2012-04-04 Thread davidxl



http://codereview.appspot.com/5975045/diff/6001/config/i386/i386.md
File config/i386/i386.md (right):

http://codereview.appspot.com/5975045/diff/6001/config/i386/i386.md#newcode16974
config/i386/i386.md:16974: ;; gets too big.
The comments may need to be updated.

http://codereview.appspot.com/5975045/

Re: [branches/google/gcc-4_6] Backported r179661 and 179662 from mainline. (issue 5989043)

2012-04-04 Thread Michael Hope

On 5 April 2012 15:56,   wrote:
> Reviewers: Diego Novillo, jingyu, davidxl,
>
> Message:
> Please take a look at this patch and tell me if it's OK for
> branches/google/gcc-4_6.
>
> Description:
> Backported the following patch from trunk:
>
> 2011-10-07  Andrew Stubbs  
>
>    gcc/
>    * config/arm/predicates.md (shift_amount_operand): Remove constant
>    range check.
>    (shift_operator): Check range of constants for all shift operators.
>
>    gcc/testsuite/
>    * gcc.dg/pr50193-1.c: New file.
>    * gcc.target/arm/shiftable.c: New file.
>
>
> Please review this at http://codereview.appspot.com/5989043/
>
> Affected files:
>   M    .
>  M     gcc/ChangeLog
>  M     gcc/ChangeLog.google-4_6
>  M     gcc/config/arm/predicates.md
>  M     gcc/testsuite/ChangeLog
>  A  +  gcc/testsuite/gcc.dg/pr50193-1.c
>  A  +  gcc/testsuite/gcc.target/arm/shiftable.c

Andrew, could you check that the Google guys have the final version of
your patch?

Could you backport the fix to the 4.6 release branch if valid?  Better
than having the same patch in three places.

-- Michael

Re: [Patch, i386] Avoid LCP stalls (issue5975045)

2012-04-04 Thread Teresa Johnson

On Wed, Apr 4, 2012 at 5:39 PM, H.J. Lu  wrote:
> On Wed, Apr 4, 2012 at 5:07 PM, Teresa Johnson  wrote:
>> New patch to avoid LCP stalls based on feedback from earlier patch. I modified
>> H.J.'s old patch to perform the peephole2 to split immediate moves to HImode
>> memory. This is now enabled for Core2, Corei7 and Generic.
>>
>> I verified that this enables the splitting to occur in the case that originally
>> motivated the optimization. If we subsequently find situations where LCP stalls
>> are hurting performance but an extra register is required to perform the
>> splitting, then we can revisit whether this should be performed earlier.
>>
>> I also measured SPEC 2000/2006 performance using Generic64 on an AMD Opteron
>> and the results were neutral.
>>
>
> What are the performance impacts on Core i7? I didn't notice any significant
> changes when I worked on it for Core 2.

One of our street map applications speeds up by almost 5% on Corei7
and almost 2.5% on Core2 from this optimization.  It contains a hot
inner loop with some conditional writes of zero into a short array.
The loop is unrolled so that it does not fit into the LSD which would
have avoided many of the LCP stalls.
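
As a rough illustration of the kind of loop described above (a sketch only, not the actual application code; the function name, types and threshold test are assumptions), conditional writes of zero into a short array look like this in C++. On x86, each such store can be emitted as a 16-bit immediate move to memory, which is the pattern the peephole2 splits:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the hot loop: zero every element above a limit.
// The store `values[i] = 0` is a candidate for a 16-bit immediate move to
// memory, the instruction form associated with LCP stalls.
std::size_t clear_above(std::int16_t *values, std::size_t n, std::int16_t limit)
{
    assert(values != nullptr || n == 0);
    std::size_t cleared = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (values[i] > limit) {
            values[i] = 0;   // conditional write of zero into a short array
            ++cleared;
        }
    }
    return cleared;
}
```

Whether a given compiler actually emits the immediate form here depends on the target and options, so the generated assembly would need to be checked.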

Thanks,
Teresa

>
> Thanks.
>
> --
> H.J.



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

RE: [PATCH H8300] Add function_vector attribute

2012-04-04 Thread Ajinkya Dhobale

Hello Jeff,

>> So the fundamental difference is your version of function_vector is 
>> programmer directed.

Yes, the programmer will need to assign the function vector number manually 
when declaring the attribute.

>> I think you need to investigate further since functions marked with 
>> the attribute should be called through the function vector.
>>
>> When the linker assigns the slot, the proper syntax is jsr :8

I was expecting the same. However, when I checked the object dump of the
generated binary, it had the 'jsr' instruction in absolute addressing format.

The generated assembly was:
mov.w   r7,r6  
jsr @_foo  
mov.w   @r7+,r6

And the object dump of the output file was:
0d 76   mov.w   r7,r6
5e 00 02 30 jsr @0x230:24
6d 76   mov.w   @r7+,r6
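
To make the dump easier to read, here is a small sketch that splits the 4-byte 'jsr' encoding shown above into its opcode byte and 24-bit address. The struct and function names are made up for illustration, and the big-endian byte order is inferred from the disassembly in the dump rather than taken from a manual:

```cpp
#include <array>
#include <cstdint>

// Hypothetical holder for the two fields of the 4-byte sequence.
struct AbsJsr {
    std::uint8_t opcode;     // first byte, 0x5e in the dump above
    std::uint32_t address;   // remaining three bytes, most significant first
};

// Reassembles the 24-bit absolute address from the three bytes that follow
// the opcode byte.
AbsJsr decode_abs24(const std::array<std::uint8_t, 4> &bytes)
{
    AbsJsr insn;
    insn.opcode = bytes[0];
    insn.address = (std::uint32_t(bytes[1]) << 16)
                 | (std::uint32_t(bytes[2]) << 8)
                 |  std::uint32_t(bytes[3]);
    return insn;
}
```

For the bytes 5e 00 02 30 this gives opcode 0x5e and address 0x230, matching the 'jsr @0x230:24' line in the dump.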

Do I need to pass any additional command line options during compilation 
to generate 'jsr' in the format 'jsr :8'?

>> ISTM that you either need to use a different attribute name or find a 
>> way to make the argument optional.

Because the 'jsr' was generated in absolute addressing format, 
I thought the current implementation might be broken, and hence I modified it.
However, if the current implementation is to be kept as is, then I would prefer
making the argument optional. In that case, I will repost the modified 
patch.

Meanwhile, I have modified the previous patch for the following two things:
1. Initially I was generating 'jsr' as 'jsr @@vect-number:8'.
However, I found it was not working on hardware. 

The H8 programming manuals state that the 'jsr' instruction in memory indirect 
format expects an 8-bit absolute address to be encoded in the instruction code. 
Hence I modified the patch to generate it as: 
'jsr @@:8'.

Now it is working appropriately on hardware.

2. On H8, the pointer size is 2 bytes in 'normal' CPU mode and 4 bytes in 
other modes. I modified the patch to vary the 'jsr' destination generation 
accordingly.
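
A minimal sketch of how the destination could vary with pointer size. The helper names and the table layout (slots laid out contiguously from address 0, so slot N sits at N times the pointer size) are assumptions made for illustration, not taken from the patch:

```cpp
#include <cassert>

// Pointer size as described above: 2 bytes in 'normal' mode, 4 otherwise.
unsigned pointer_size(bool normal_mode)
{
    return normal_mode ? 2u : 4u;
}

// ASSUMPTION: slot N lives at N * pointer_size, starting from address 0.
// Returns -1 when the whole slot would not fit below address 256, i.e.
// when it cannot be named by an 8-bit absolute address.
int vector_slot_address(unsigned vector_number, bool normal_mode)
{
    if (vector_number > 255u)
        return -1;
    unsigned size = pointer_size(normal_mode);
    assert(size == 2u || size == 4u);
    unsigned address = vector_number * size;
    if (address + size > 256u)
        return -1;
    return static_cast<int>(address);
}
```

Under this assumed layout, vector number 3 maps to address 6 in 'normal' mode and to address 12 in the other modes.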

Thanks and Regards,
Ajinkya



modified-h8300-func-vect.patch

Re: [patch] Fix PR52822 (stable_partition move-assigns object to itself) in trunk, 4.7, and 4.6

2012-04-04 Thread Paolo Carlini


> Hi,
>
> The attached patches fix http://gcc.gnu.org/PR52822, and have been
> tested with `make check-c++` on linux-x86_64. The trunk patch applies
> and tests cleanly on gcc-4_7-branch. The gcc-4_6-branch patch is
> significantly simpler, as Paolo suggested on the bug.

A few small issues.

For the 4.6 version of the patch, you want to use std::__addressof, 
instead of simply &, which may be overloaded.
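
To illustrate the concern, here is a sketch using the public std::addressof from <memory> as a stand-in for the internal std::__addressof. The type Odd and the helper same_object are hypothetical and are not part of the patch:

```cpp
#include <memory>

// Hypothetical type whose unary & is overloaded and does not return the
// object's real address.
struct Odd {
    int value;
    Odd *operator&() { return nullptr; }
};

// Identity check of the kind a self-move-assignment guard needs.
// std::addressof bypasses the overloaded operator& above.
bool same_object(Odd &a, Odd &b)
{
    return std::addressof(a) == std::addressof(b);
}
```

With plain &, both operands of the comparison would be null pointers here, so any two Odd objects would appear to be the same object.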


For the mainline/4.7 version, you want to cast the result of __pred to bool: 
!bool(__pred(*__first)). Also, it isn't clear to me why you have 
__local_len in __find_if_not_n instead of just working on __len (and in 
any case please prefer just __local_len as the condition instead of the 
redundant __local_len != 0; likewise in many other similar situations).
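
A sketch of why the explicit conversion matters. Flag, IsEven and first_not are hypothetical names used only for illustration; the point is that a predicate may return a type that converts to bool but overloads operator! in some unexpected way:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical predicate result: convertible to bool, but with an
// operator! that does not yield the logical negation.
struct Flag {
    bool set;
    operator bool() const { return set; }
    Flag operator!() const { return *this; }   // deliberately unhelpful
};

struct IsEven {
    Flag operator()(int x) const { return Flag{x % 2 == 0}; }
};

// Returns the index of the first element for which pred is false, or n.
template <typename Pred>
std::size_t first_not(const int *first, std::size_t n, Pred pred)
{
    assert(first != nullptr || n == 0);
    for (std::size_t i = 0; i < n; ++i)
        if (!bool(pred(first[i])))   // convert to bool first, then negate
            return i;
    return n;
}
```

Writing !pred(first[i]) instead would pick up Flag::operator! and the loop would test the wrong condition.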


Also, a very minor nit: you will be touching the libstdc++-v3 ChangeLog, 
so there should be no libstdc++-v3 prefixes in the ChangeLog entry.


Paolo.

Re: [PING][SCORE] Hookize PREFERRED_RELOAD_CLASS

2012-04-04 Thread Liqin Chen

On April 3, 2012 at 2:51 AM, Anatoly Sokolov wrote:
> Hi.
>
> Ping patch: http://gcc.gnu.org/ml/gcc-patches/2012-01/msg00261.html
>>
>>  This patch removes obsolete PREFERRED_RELOAD_CLASS macro from the SCORE
>> back end in the GCC and introduces equivalent TARGET_PREFERRED_RELOAD_CLASS
>> target hook.
>>
>>  Untested.
>>
>>  OK to install?
>>
>> * config/score/score.h (PREFERRED_RELOAD_CLASS): Remove.
>> * config/score/score-protos.h (score_preferred_reload_class): Remove.
>> * config/score/score.c (TARGET_PREFERRED_RELOAD_CLASS): Define.
>> (score_preferred_reload_class): Make static. Change return and
>> 'rclass' argument type to reg_class_t.
>>
> Anatoly.

It's OK to install, Thanks.

--liqin

Re: [patch] Fix PR52822 (stable_partition move-assigns object to itself) in trunk, 4.7, and 4.6

2012-04-04 Thread Paolo Carlini


On 4/4/12 11:09 PM, Paolo Carlini wrote:

>> Hi,
>>
>> The attached patches fix http://gcc.gnu.org/PR52822, and have been
>> tested with `make check-c++` on linux-x86_64. The trunk patch applies
>> and tests cleanly on gcc-4_7-branch. The gcc-4_6-branch patch is
>> significantly simpler, as Paolo suggested on the bug.
>
> A few small issues.
>
> For the 4.6 version of the patch, you want to use std::__addressof, 
> instead of simply &, which may be overloaded.

And by the way what's wrong with just comparing the iterators?
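
Something along these lines, as a sketch only (move_unless_same is a hypothetical helper, not code from the patch):

```cpp
#include <utility>

// Move *from into *to unless both iterators designate the same element.
// Comparing the iterators avoids taking addresses at all, so an overloaded
// operator& on the element type never comes into play.
template <typename Iter>
bool move_unless_same(Iter to, Iter from)
{
    if (to == from)
        return false;           // skip the self-move-assignment
    *to = std::move(*from);
    return true;
}
```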

Paolo.
